Scientists at Columbia University have devised a clever way of converting thoughts into speech using a nifty combination of speech synthesizers and artificial intelligence (AI).
The technology effectively connects to and "listens" to the brain, detecting patterns of activity it can then "translate" into words. Right now, its abilities are relatively basic but, as the researchers note in Scientific Reports, the possibilities are huge. Not only could it give us a new way to communicate with computers, it may one day provide potentially life-changing solutions for people with speech-limiting conditions – for example, those who have had a stroke or are living with amyotrophic lateral sclerosis (ALS), like the late great Stephen Hawking.
The process hinges on the tell-tale patterns of activity that light up our brains when we speak or even just think about speaking. Similarly, when we listen to someone else speak (or imagine doing so), there are various other patterns that present in the brain.
But while previous attempts to "read" brain activity relied on simpler computer models that analyze spectrograms and proved unsuccessful, this new technique uses the same kind of technology found in Apple's Siri and Amazon's Alexa – an AI-enabled vocoder.
Vocoders are a type of computer algorithm that can synthesize speech, but first they have to be trained on recordings of people talking. For this particular study, led by Nima Mesgarani, a principal investigator at Columbia University's Mortimer B. Zuckerman Mind Brain Behavior Institute, the vocoder was trained with the help of five epilepsy patients, chosen because they were already undergoing brain surgery. While the patients listened to the speech of several different speakers, the researchers monitored their brain activity.
Then, the experiment really began. To test whether the algorithm could now "read" the participants' brainwaves, the researchers played recordings of the same speakers reeling off sequences of digits between 0 and 9. The patients' brain signals were recorded and run through the vocoder. The vocoder's output was then analyzed and "cleaned up" by neural networks. Finally, a robotic-sounding voice repeated the sequence of numbers.
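The data flow described above – record brain signals, synthesize rough audio with a vocoder, then clean it up with a neural network – can be sketched in code. This is purely an illustrative toy, not the study's actual method: the real system used intracranial recordings and trained deep networks, whereas every function here is a hypothetical stand-in using random data and simple smoothing.

```python
import numpy as np

def record_brain_signals(n_channels=8, n_samples=100, seed=0):
    """Stand-in for intracranial recordings (hypothetical random data)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_channels, n_samples))

def vocoder_synthesize(signals):
    """Stand-in for the vocoder: collapse neural features into a crude waveform."""
    return signals.mean(axis=0)

def neural_network_clean(waveform):
    """Stand-in for the neural-network 'clean-up' step: simple moving-average smoothing."""
    kernel = np.ones(5) / 5
    return np.convolve(waveform, kernel, mode="same")

# The pipeline, step by step, mirroring the article's description:
signals = record_brain_signals()          # 1. monitor brain activity
raw_audio = vocoder_synthesize(signals)   # 2. run signals through the vocoder
clean_audio = neural_network_clean(raw_audio)  # 3. clean up the output
print(clean_audio.shape)
```

The point of the sketch is only the ordering of the stages: the vocoder produces an intermediate audio signal, and a separate model refines it before playback.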