What if it were possible to detect whether someone has COVID-19 just from the sound of their coughing or talking? It sounds like science fiction, but it may soon come true. This is the goal of the project “Detecção de COVID-19 a partir de tosse e fala” (“COVID-19 detection from coughs and speech”), developed by a team of researchers from Instituto Superior Técnico and INESC-ID.
Using Artificial Intelligence (AI) technologies, the project aims to develop a robust system that helps identify who is infected with the SARS-CoV-2 virus from recordings of their voice and cough. “The main purpose of this project is to be one more clue that can indicate the disease or even be combined with other biomarkers”, highlights the project coordinator, Isabel Trancoso, a professor at Técnico (Department of Electrical and Computer Engineering – DEEC) and a researcher at INESC-ID.
Although not yet conclusive, research on this topic is already producing some exciting results. Several published articles suggest that even asymptomatic patients show changes in their voice, due to the impact of the virus on the lungs and vocal cords, with slight differences compared with a healthy person. Although this difference is not perceptible to the human ear, an AI model may be able to detect it.
RT-PCR testing is the mainstay of COVID-19 diagnosis, joined more recently by antigen tests. This testing protocol has several disadvantages, notably delayed results caused by the increased workload in laboratories and the huge demand. Consequently, there is growing interest in developing a cheap, immediate and easy-to-use system that can streamline the testing process. This project was created to meet this need and to take advantage of the solid knowledge that already exists about the potential of speech as a biomarker for health, strongly based on AI methods.
Analyzing speech patterns can help diagnose diseases
Speaking requires the coordination of numerous anatomical structures and systems. The lungs send air through the vocal cords, which produce sounds that are shaped by the tongue, lips and nasal cavities, among other structures. The brain, along with other parts of the nervous system, helps to regulate all these processes and determine the words someone is saying. A disease that affects any one of these systems might leave diagnostic clues in a patient’s speech.
The Técnico professor explains that “the potential of speech as a biomarker for health has already been identified for diseases that affect respiratory organs, such as the common cold or sleep apnea; for mental disorders, such as depression, bipolar disorder and autism spectrum disorder; and for neurodegenerative diseases, such as Parkinson’s disease, Alzheimer’s disease, Huntington’s disease or amyotrophic lateral sclerosis, among many other diseases”. Over the past decade, scientists have used machine learning systems to identify potential vocal biomarkers for a wide variety of these clinical conditions.
The idea for this project came up right at the beginning of the first lockdown. “Our experience with these diseases clearly pointed to the need to make a great effort to collect extensive sound data related to COVID-19”, says professor Isabel Trancoso.
A similar project, carried out by a team of researchers at the University of Cambridge, explored the use of traditional acoustic features (cepstral coefficients, energy, fundamental frequency, etc.) and features obtained through transfer learning with neural networks, along with different classifiers for COVID-19 detection. The models developed show a performance close to 80%, even for users who tested negative for COVID-19 but had a cough due to a cold or asthma.
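To give a rough idea of what the traditional acoustic features mentioned above look like in practice, here is a minimal Python sketch using only NumPy. It is not the Cambridge team's code nor the Técnico project's: the function names, the frame sizes and the pitch search range are illustrative assumptions, and a plain real cepstrum stands in for the mel-frequency cepstral coefficients that speech systems more commonly use.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (hypothetical helper)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)


def estimate_f0(frame, sr, fmin=60.0, fmax=400.0):
    """Fundamental frequency from the autocorrelation peak.

    The 60-400 Hz search range is an assumption, not a project setting.
    """
    frame = frame - frame.mean()
    # Keep only non-negative lags of the autocorrelation.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo = int(sr / fmax)
    hi = min(int(sr / fmin), len(ac) - 1)
    lag = lo + np.argmax(ac[lo:hi + 1])
    return sr / lag


def real_cepstrum(frame, n_coeffs=13):
    """First coefficients of the real cepstrum (simplified stand-in for MFCCs)."""
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed)) + 1e-10  # avoid log(0)
    return np.fft.irfft(np.log(spectrum))[:n_coeffs]


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr // 2) / sr
    tone = np.sin(2 * np.pi * 200.0 * t)  # synthetic 200 Hz tone, not real speech
    frames = frame_signal(tone, frame_len=640, hop=160)
    print("energy per frame:", short_time_energy(frames)[:3])
    print("estimated F0 (Hz):", estimate_f0(frames[0], sr))
    print("cepstrum shape:", real_cepstrum(frames[0]).shape)
```

Each recording would be turned into a vector of such values, per frame or averaged over the recording, before being passed to a classifier.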
According to the INESC-ID researcher, “the results of the various research works on this topic are very promising, but there are still many areas left unexplored”.
The importance of the Técnico community in this project
The first phase of the project is to collect an extensive dataset with representative examples of speech and simulated coughs and snores from both COVID-19 positive (symptomatic and asymptomatic) and negative individuals (ideally also including participants with respiratory conditions other than COVID-19, such as flu, cold, asthma, etc.). These data will be crucial for the development and success of the project, and for this reason the participation of the Técnico community is essential and warmly appreciated. The invitation to take part in this study extends to society as a whole.
Participants will have to supply an audio recording of their cough and snoring, as well as of their speech (a sustained vowel, a reading of a short text and a free description of an image). In addition, participants only need to provide some personal data: demographic data (age, sex, mother tongue) and health data (date and result of the COVID-19 test, for those who have already been tested; symptoms in the last 15 days; chronic diseases or chronic medical conditions; voice disorders). All necessary measures will be taken to ensure the security and anonymity of the data collected.
After the necessary data is collected, the research team will use signal processing and machine learning techniques to assess the presence of biomarkers indicative of COVID-19 in coughs and speech, and to develop robust systems for the detection of COVID-19. Once properly tested, these systems can be easily deployed as a web tool and/or a mobile application.
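The machine learning step described above can be pictured as training a classifier on feature vectors extracted from the recordings. The sketch below is only an illustration under stated assumptions: the source does not say which model the team will use, so a simple logistic regression written in NumPy is swapped in, and the data are random synthetic numbers, not measurements from any participant.

```python
import numpy as np


def train_logistic_regression(X, y, lr=0.1, n_iter=500):
    """Fit a logistic regression by gradient descent (illustrative only)."""
    mean, std = X.mean(axis=0), X.std(axis=0) + 1e-8
    Xn = (X - mean) / std  # standardize each feature
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xn @ w + b)))
        w -= lr * (Xn.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return {"w": w, "b": b, "mean": mean, "std": std}


def predict_proba(model, X):
    """Probability of the positive class for each row of X."""
    Xn = (X - model["mean"]) / model["std"]
    return 1.0 / (1.0 + np.exp(-(Xn @ model["w"] + model["b"])))


if __name__ == "__main__":
    # Synthetic stand-in for acoustic feature vectors: two well-separated
    # Gaussian clouds. Real recordings would be far harder to separate.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (100, 5)),
                   rng.normal(2.0, 1.0, (100, 5))])
    y = np.concatenate([np.zeros(100), np.ones(100)])
    model = train_logistic_regression(X, y)
    accuracy = np.mean((predict_proba(model, X) >= 0.5) == (y == 1))
    print("training accuracy on synthetic data:", accuracy)
```

In a real study the model would be evaluated on recordings from participants not seen during training, since accuracy measured on the training data overstates how well a screening tool would perform.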
An important screening tool
The research team does not intend to develop a clinical diagnostic test, but rather a complementary, low-cost test: a simple screening tool that uses non-intrusive techniques and does not depend on health professionals. In the future, the effective implementation of this screening tool may be essential to curb the spread of COVID-19 if, for example, it is used at the entrance of schools, companies or institutions.
The data collected in this study will also make it possible to continue studying other diseases that affect the respiratory system. “It is extremely important to have a volume of data that allows us to carry out this study”, stresses professor Isabel Trancoso.
“My vision is that collecting speech samples will become as common as a blood test”, says the INESC-ID researcher. “It is a ubiquitous signal and can be collected in a non-invasive way, both in person and by teleconsultations”, she stresses.