EXCLUSIVE: A Voice Tech Startup Is Gathering Data to Build a Coronavirus Speech Test
Identifying people infected with COVID-19 by the sound of their voice may seem far-fetched, but enterprise voice assistant developer Voca.ai has started collecting data that could lead to such a test. The startup partnered with Carnegie Mellon University to launch Corona Voice Detect this week, asking people to record their voices for an eventual open-source dataset and a potential voice test for the disease.
For now, Corona Voice Detect consists mainly of a website where people can record themselves speaking a few sentences. Users fill in a few details about their location, age, and how they are feeling, and say whether they have been diagnosed with the coronavirus. The information is then anonymized and added to a growing dataset for analysis.
“We ask people to use the platform and record themselves every day. They say if they have the virus and how they are feeling,” Voca.ai co-founder Alan Bekker told Voicebot in an interview. “In viruses like the coronavirus that harm the respiratory system, there’s a high probability we might find a pattern in the way a person speaks using voice biomarkers research. We only launched a few days ago and are getting thousands of recordings an hour from Italy, the U.S., Asia, Israel, and all over. There are 20,000 to 30,000 people who have recorded so far.”
Bekker noted that this is not a standard project for Voca.ai, which normally focuses on building virtual call center voice assistants for companies. It is a side project the company is running for free, with plans to release the full dataset and accompanying notes to the public for others to use and improve. Researchers could use the data to design automatic vocal coronavirus tests, and testing is a crucial part of limiting the spread of the disease. The more data that is collected, the more accurate any resulting algorithm is likely to be.
“It’s too early to make predictions but we are seeing some patterns,” Bekker said. “The goal is a few weeks from now to use this algorithm to say there’s this probability you have it so you should go get checked by a doctor.”
Power of Data
“They seem to be going about it the right way. To train [an AI] to detect a disease you’d need to get recordings of people coming down with the disease and monitor them from there,” Canary Speech founder Jeff Adams told Voicebot in an interview. Canary Speech is working on ways to detect neurological disease through voice. “But, I get nervous about this because if they don’t do it quite right people might not take the idea of diagnosing illness by voice seriously.”
It is important to know whether, and how, the coronavirus’s effect on the respiratory system changes the voice, Adams added. That could determine how much early warning a change in voice could give a patient. Identifying the coronavirus by voice isn’t helpful if many other signs of it are already present.
“I will be interested to look at the data,” Adams said. “That they say they are releasing it to the public is very magnanimous of them.”
Skepticism is not surprising, Bekker said, especially because the project is barely underway. As more data is gathered and analyzed, however, he expects that to change.
“Of course, people have questions if it’s possible or not,” Bekker said. “We’ve spoken with many doctors who said there are patterns in speech that could be a sign of the virus, but we know we are maybe optimistic in that way. We know that this data we are getting is super critical and might help with other diseases as well.”