1 code implementation • 9 Jan 2024 • Soumya Dutta, Sriram Ganapathy
The problem of audio-to-audio (A2A) style transfer involves replacing the style features of the source audio with those from the target audio while preserving the content-related attributes of the source audio.
no code implementations • 2 Oct 2023 • Humayra Tasnim, Soumya Dutta, Melanie Moses
In the era of burgeoning data generation, managing and storing large-scale time-varying datasets poses significant challenges.
no code implementations • 14 Apr 2023 • Soumya Dutta, Sriram Ganapathy
The audio and text representations are processed using a set of bi-directional recurrent neural network layers with self-attention that converts each utterance in a given conversation to a fixed-dimensional embedding.
Ranked #1 on Multimodal Emotion Recognition on MELD
Emotion Classification, Emotion Recognition in Conversation +1
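The utterance-embedding step described in the abstract above (bi-directional recurrent layers followed by attention, producing one fixed-dimensional vector per utterance) can be illustrated with a minimal sketch. This is not the authors' implementation: the plain tanh recurrence, the single learned query vector used for attention pooling (a simplified stand-in for self-attention), the random untrained weights, and all names (`embed_utterance`, `init_params`, `rnn_pass`) are assumptions made for illustration only.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weights; values are illustrative, not trained."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_pass(frames, w_in, w_rec):
    """Run a plain tanh RNN over the frames and return every hidden state."""
    hidden = [0.0] * len(w_rec)
    states = []
    for frame in frames:
        a = matvec(w_in, frame)
        b = matvec(w_rec, hidden)
        hidden = [math.tanh(x + y) for x, y in zip(a, b)]
        states.append(hidden)
    return states


def softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def embed_utterance(frames, params):
    """Map a variable-length list of feature frames to one fixed-size vector."""
    fwd = rnn_pass(frames, params["w_in_f"], params["w_rec_f"])
    # The backward pass reads the frames in reverse, then is re-aligned in time.
    bwd = rnn_pass(frames[::-1], params["w_in_b"], params["w_rec_b"])[::-1]
    # Concatenate forward and backward states at each time step.
    states = [f + b for f, b in zip(fwd, bwd)]
    # Attention pooling: score each step, normalise, take the weighted sum.
    scores = [sum(q * s for q, s in zip(params["query"], st)) for st in states]
    weights = softmax(scores)
    dim = len(states[0])
    return [sum(w * st[i] for w, st in zip(weights, states)) for i in range(dim)]


def init_params(feat_dim, hidden_dim, seed=0):
    """Hypothetical parameter container with seeded random weights."""
    rng = random.Random(seed)
    return {
        "w_in_f": make_matrix(hidden_dim, feat_dim, rng),
        "w_rec_f": make_matrix(hidden_dim, hidden_dim, rng),
        "w_in_b": make_matrix(hidden_dim, feat_dim, rng),
        "w_rec_b": make_matrix(hidden_dim, hidden_dim, rng),
        "query": [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden_dim)],
    }
```

Whatever the number of input frames, the output length is twice the hidden size, which is the sense in which the embedding is fixed-dimensional. In practice the recurrence would typically be a trained GRU or LSTM from a deep learning framework rather than this hand-written loop.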