DigiVoice: Voice Biomarker Featurization and Analysis Pipeline

17 Jun 2019  ·  Larry Zhang, Xiaotong Chen, Abbad Vakil, Ali Byott, Reza Hosseini Ghomi

In recent years, data-driven models have enabled significant advances in medicine. At the same time, voice has shown potential as a biomarker for screening illnesses in precision medicine, and there is a growing trend toward using voice data to understand neuropsychiatric diseases. In this paper, we present DigiVoice, a comprehensive feature extraction and analysis pipeline for voice data. DigiVoice accepts raw .WAV files and text transcriptions in order to analyze the entire content of voice. It extracts acoustic, natural language, linguistic complexity, and semantic coherence features, and it provides machine learning capabilities including data visualization, feature selection, feature transformation, and modeling. To our knowledge, DigiVoice provides the most comprehensive voice feature set for data analysis to date. With DigiVoice, we plan to accelerate research that correlates voice biomarkers with illness to enable data-driven treatment. We have worked closely with our industry partner, NeuroLex Laboratories, to make voice computing open source and accessible. DigiVoice enables researchers to apply our technology across voice computing and precision medicine without domain-specific expertise, so that any researcher can use voice as a biomarker in past, current, or future studies.
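
To make the two input types concrete, here is a minimal sketch of what featurizing a .WAV file and its transcription might look like. It is not DigiVoice's actual API: the function names (`acoustic_features`, `text_features`, `extract_features`, `synth_wav`) are hypothetical, the features shown (RMS energy, zero-crossing rate, type-token ratio) are simple stand-ins for the much larger acoustic and linguistic feature sets the abstract describes, and the audio is a synthetic tone generated in memory so the example needs no external files or libraries.

```python
import io
import math
import struct
import wave


def acoustic_features(wav_bytes: bytes) -> dict:
    """Two simple acoustic descriptors from 16-bit mono PCM WAV data.

    Illustrative only; assumes 16-bit mono input with at least two samples.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono WAV data")
        n = wf.getnframes()
        rate = wf.getframerate()
        samples = struct.unpack(f"<{n}h", wf.readframes(n))
    # Root-mean-square energy, normalized to the 16-bit full-scale range.
    rms = math.sqrt(sum(s * s for s in samples) / n) / 32768.0
    # Fraction of adjacent sample pairs whose sign differs.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return {"duration_s": n / rate, "rms": rms, "zcr": crossings / (n - 1)}


def text_features(transcript: str) -> dict:
    """A crude lexical-diversity measure from a whitespace-tokenized transcript."""
    tokens = transcript.lower().split()
    ttr = len(set(tokens)) / len(tokens) if tokens else 0.0
    return {"n_tokens": len(tokens), "type_token_ratio": ttr}


def extract_features(wav_bytes: bytes, transcript: str) -> dict:
    """Merge acoustic and text features into one flat feature row."""
    return {**acoustic_features(wav_bytes), **text_features(transcript)}


def synth_wav(freq_hz: float = 220.0, rate: int = 16000, seconds: float = 1.0) -> bytes:
    """Build an in-memory 16-bit mono WAV containing a sine tone (test input)."""
    frames = [
        int(16000 * math.sin(2 * math.pi * freq_hz * i / rate))
        for i in range(int(rate * seconds))
    ]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(struct.pack(f"<{len(frames)}h", *frames))
    return buf.getvalue()


features = extract_features(synth_wav(), "the quick brown fox jumps over the lazy dog")
print(features)
```

Each recording yields one flat dictionary, which is a convenient shape for the downstream steps the abstract lists (feature selection, transformation, and modeling), since rows like this can be stacked into a table with one column per feature.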
