At Ann Johnson’s wedding reception 20 years ago, her gift for speech was vividly evident. In an ebullient 15-minute toast, she joked that she had run down the aisle, wondered if the ceremony program should have said “flutist” or “flautist” and acknowledged that she was “hogging the mic.”
Just two years later, Mrs. Johnson — then a 30-year-old teacher, volleyball coach and mother of an infant — had a cataclysmic stroke that paralyzed her and left her unable to talk.
On Wednesday, scientists reported a remarkable advance toward helping her, and other patients, speak again. In a milestone of neuroscience and artificial intelligence, implanted electrodes decoded Mrs. Johnson’s brain signals as she silently tried to say sentences. Technology converted her brain signals into written and vocalized language, and enabled an avatar on a computer screen to speak the words and display smiles, pursed lips and other expressions.
The research, published in the journal Nature, demonstrates the first time spoken words and facial expressions have been directly synthesized from brain signals, experts say. Mrs. Johnson chose the avatar, a face resembling hers, and researchers used her wedding toast to develop the avatar’s voice.
“We’re just trying to restore who people are,” said the team’s leader, Dr. Edward Chang, the chairman of neurological surgery at the University of California, San Francisco.
“It let me feel like I was a whole person again,” Mrs. Johnson, now 48, wrote me.
The goal is to help people who cannot speak because of strokes or conditions like cerebral palsy and amyotrophic lateral sclerosis. To work, Mrs. Johnson’s implant must be connected by cable from her head to a computer, but her team and others are developing wireless versions. Eventually, researchers hope, people who have lost speech may converse in real time through computerized pictures of themselves that convey tone, inflection and emotions like joy and anger.
“What’s quite exciting is that just from the surface of the brain, the investigators were able to get out pretty good information about these different features of communication,” said Dr. Parag Patil, a neurosurgeon and biomedical engineer at the University of Michigan, who was asked by Nature to review the study before publication.
Video by Metzger et al., Weill Institute for Neurosciences/University of California, San Francisco
Mrs. Johnson’s experience reflects the field’s fast-paced progress. Just two years ago, the same team published research in which a paralyzed man, who went by the nickname Pancho, used a simpler implant and algorithm to produce 50 basic words like “hello” and “hungry” that were displayed as text on a computer after he tried to say them.
Mrs. Johnson’s implant has nearly twice as many electrodes, increasing its ability to detect brain signals from speech-related sensory and motor processes linked to the mouth, lips, jaw, tongue and larynx. Researchers trained the sophisticated artificial intelligence to recognize not individual words, but phonemes, or sound units like “ow” and “ah” that can ultimately form any word.
“It’s like an alphabet of speech sounds,” David Moses, the project manager, said.
While Pancho’s system produced 15 to 18 words per minute, Mrs. Johnson’s rate was 78 words per minute, using a much larger vocabulary list. Typical conversational speech is about 160 words per minute.
When researchers began working with her, they didn’t expect to try the avatar or audio. But the promising results were “a huge green light to say, ‘OK, let’s try the harder stuff, let’s just go for it,’” Dr. Moses said.
They programmed an algorithm to decode brain activity into audio waveforms, producing vocalized speech, said Kaylo Littlejohn, a graduate student at the University of California, Berkeley, and one of the study’s lead authors, along with Dr. Moses, Sean Metzger, Alex Silva and Margaret Seaton.
“Speech has a lot of information that is not well preserved by just text, like intonation, pitch, expression,” Mr. Littlejohn said.
Working with a company that produces facial animation, researchers programmed the avatar with data on muscle movements. Mrs. Johnson then tried to make facial expressions for happy, sad and surprised, each at high, medium and low intensity. She also tried to make various jaw, tongue and lip movements. Her decoded brain signals were conveyed on the avatar’s face.
Through the avatar, she said, “I think you are wonderful” and “What do you think of my artificial voice?”
“Hearing a voice similar to your own is emotional,” Mrs. Johnson told the researchers.
She and her husband, William, a postal worker, even engaged in conversation. She said through the avatar: “Do not make me laugh.” He asked how she was feeling about the Toronto Blue Jays’ chances. “Anything is possible,” she replied.
The field is moving so quickly that experts believe federally approved wireless versions might be available within the next decade. Different methods might be optimal for certain patients.
On Wednesday, Nature also published another team’s study, which involved electrodes implanted deeper in the brain to detect the activity of individual neurons, said Dr. Jaimie Henderson, a professor of neurosurgery at Stanford and the team’s leader, who was motivated by his childhood experience of watching his father lose speech after an accident. He said their method might be more precise but less stable because specific neurons’ firing patterns can shift.
Their system decoded, at 62 words per minute, sentences that the participant, Pat Bennett, 68, who has A.L.S., tried to say from a large vocabulary. That study didn’t include an avatar or sound decoding.
Both studies used predictive language models to help guess words in sentences. The systems don’t just match words but are “figuring out new language patterns” as they improve their recognition of participants’ neural activity, said Melanie Fried-Oken, an expert in speech-language assistive technology at Oregon Health & Science University, who consulted on the Stanford study.
Neither approach was completely accurate. When using large vocabulary sets, they incorrectly decoded individual words about a quarter of the time.
For example, when Mrs. Johnson tried to say, “Maybe we lost them,” the system decoded, “Maybe we that name.” But in nearly half of her sentences, it correctly deciphered every word.
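One common way to score this kind of decoding is word error rate: the number of word substitutions, insertions and deletions needed to turn the decoded sentence into the intended one, divided by the length of the intended sentence. The sketch below applies that standard measure to the example above. It is an illustration only; the function name is hypothetical, and the article does not say exactly how the researchers computed their figures.

```python
def word_error_rate(intended: str, decoded: str) -> float:
    """Word-level edit distance divided by the number of intended words.

    Illustrative only: assumes a simple lowercase, whitespace-split
    comparison, which may differ from how the studies scored sentences.
    """
    ref = intended.lower().split()
    hyp = decoded.lower().split()
    if not ref:
        raise ValueError("intended sentence must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j decoded words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # an intended word is missing
            insertion = curr[j - 1] + 1  # an extra decoded word appears
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


rate = word_error_rate("Maybe we lost them", "Maybe we that name")
print(f"{rate:.0%}")  # two of the four words are wrong
```

By this measure, the sentence above has two wrong words out of four, an error rate of 50 percent. That is worse than the roughly one-in-four average, which is spread across many sentences, including those decoded perfectly.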
Researchers found that people on a crowdsourcing platform could correctly interpret the avatar’s facial expressions most of the time. Interpreting what the voice said was harder, so the team is developing a prediction algorithm to improve that. “Our speaking avatar is just at the starting point,” Dr. Chang said.
Experts emphasize that these systems aren’t reading people’s minds or thoughts. Rather, Dr. Patil said, they resemble baseball batters who “are not reading the mind of the pitcher but are kind of interpreting what they see the pitcher doing” to predict pitches.
Still, mind reading may ultimately be possible, raising ethical and privacy issues, Dr. Fried-Oken said.
Mrs. Johnson contacted Dr. Chang in 2021, the day after her husband showed her my article about Pancho, the paralyzed man the researchers had helped. Dr. Chang said he initially discouraged her because she lived in Saskatchewan, Canada, far from his lab in San Francisco, but “she was persistent.”
Mr. Johnson, 48, arranged to work part time. “Ann’s always supported me to do what I’ve wanted,” including leading his postal union local, he said. “So I just thought it was important to be able to support her in this.”
She started participating last September. Traveling to California takes them three days in a van packed with equipment, including a lift to transfer her between wheelchair and bed. They rent an apartment there, where researchers conduct their experiments to make it easier for her. The Johnsons, who raise money online and in their community to pay for travel and rent for the multiyear study, spend weeks in California, returning home between research phases.
“If she could have done it for 10 hours a day, seven days a week, she would have,” Mr. Johnson said.
Determination has always been part of her nature. When they began dating, Mrs. Johnson gave Mr. Johnson 18 months to propose, which he said he did “on the exact day of the 18th month,” after she had “already gone and picked out her engagement ring.”
Mrs. Johnson communicated with me in emails composed with the more rudimentary assistive system she uses at home. She wears eyeglasses affixed with a reflective dot that she aims at letters and words on a computer screen.
It’s slow, allowing her to generate only 14 words per minute. But it’s faster than the only other way she can communicate at home: using a plastic letter board, a method Mr. Johnson described as “her just trying to show me which letter she’s trying to try to look at and then me trying to figure out what she’s trying to say.”
The inability to have free-flowing conversations frustrates them. When discussing detailed matters, Mr. Johnson sometimes says something and receives her response by email the next day.
“Ann’s always been a big talker in life, an outgoing, social individual who loves talking, and I don’t,” he said, but her stroke “made the roles reverse, and now I’m supposed to be the talker.”
Mrs. Johnson was teaching high school math, health and physical education, and coaching volleyball and basketball when she had her brainstem stroke while warming up to play volleyball. After a year in a hospital and a rehabilitation facility, she came home to her 10-year-old stepson and her 23-month-old daughter, who has now grown up without any memory of hearing her mother speak, Mr. Johnson said.
“Not being able to hug and kiss my children hurt so bad, but it was my reality,” Mrs. Johnson wrote. “The real nail in the coffin was being told I couldn’t have more children.”
For five years after the stroke, she was terrified. “I thought I would die at any moment,” she wrote, adding, “The part of my brain that wasn’t frozen knew I needed help, but how would I communicate?”
Gradually, her doggedness resurfaced. Initially, “my face muscles didn’t work at all,” she wrote, but after about five years, she could smile at will.
She was entirely tube-fed for about a decade, but decided she wanted to taste solid food. “If I die, so be it,” she told herself. “I started sucking on chocolate.” She took swallowing therapy and now eats finely chopped or soft foods. “My daughter and I love cupcakes,” she wrote.
When Mrs. Johnson learned that trauma counselors were needed after a fatal bus crash in Saskatchewan in 2018, she decided to take a university counseling course online.
“I had minimal computer skills and, being a math and science person, the thought of writing papers scared me,” she wrote in a class report. “At the same time, my daughter was in grade 9 and being diagnosed with a processing disability. I decided to push through my fears and show her that disabilities don’t need to stop us or slow us down.”
Helping trauma survivors remains her goal. “My shot at the moon was that I would become a counselor and use this technology to talk to my clients,” she told Dr. Chang’s team.
At first when she started making emotional expressions with the avatar, “I felt silly, but I like feeling like I have an expressive face again,” she wrote, adding that the exercises also enabled her to move the left side of her forehead for the first time.
She has gained something else, too. After the stroke, “it hurt so bad when I lost everything,” she wrote. “I told myself that I was never again going to put myself in line for that disappointment again.”
Now, “I feel like I have a job again,” she wrote.
Besides, the technology makes her imagine being in “Star Wars”: “I have kind of gotten used to having my mind blown.”