no code implementations • 17 Feb 2024 • Xiangyu Zhang, Hexin Liu, Kaishuai Xu, Qiquan Zhang, Daijiao Liu, Beena Ahmed, Julien Epps
In addition, this approach is valuable not only for the detection of depression but also as a new perspective on enhancing the ability of LLMs to comprehend and process speech signals.
no code implementations • 13 Nov 2023 • Mostafa Shahin, Julien Epps, Beena Ahmed
We further propose a multi-label variant of the Connectionist Temporal Classification (CTC) approach to jointly model the non-mutually exclusive speech attributes using a single model.
no code implementations • 17 Oct 2023 • Antoni Dimitriadis, Siqi Pan, Vidhyasaharan Sethu, Beena Ahmed
Spatial HuBERT learns representations that outperform state-of-the-art single-channel speech representations on a variety of spatial downstream tasks, particularly in reverberant and noisy environments.
no code implementations • 21 Sep 2023 • Zheng Nan, Ting Dang, Vidhyasaharan Sethu, Beena Ahmed
Connectionist temporal classification (CTC) is commonly adopted for sequence modeling tasks like speech recognition, where it is necessary to preserve order between the input and target sequences.
1 code implementation • 14 Nov 2022 • Renee Lu, Mostafa Shahin, Beena Ahmed
We assess the performance of fine-tuning on both native and non-native children's speech, examine the effect of cross-domain child corpora, and investigate the minimum amount of child speech required to fine-tune a model that outperforms a state-of-the-art adult model.
no code implementations • 19 Oct 2022 • Mostafa Shahin, Beena Ahmed, Julien Epps
These high acoustic variations, along with the scarcity of child speech corpora, have impeded the development of a reliable speech recognition system for children.