no code implementations • 4 Jun 2024 • Alex Agranovich, Eliya Nachmani, Oleg Rybakov, Yifan Ding, Ye Jia, Nadav Bar, Heiga Zen, Michelle Tadmor Ramanovich
Simultaneous speech-to-speech translation (S2ST) holds the promise of breaking down communication barriers and enabling fluid conversations across languages.
no code implementations • 27 May 2023 • Eliya Nachmani, Alon Levkovitch, Yifan Ding, Chulayuth Asawaroengchai, Heiga Zen, Michelle Tadmor Ramanovich
This paper presents Translatotron 3, a novel approach to unsupervised direct speech-to-speech translation from monolingual speech-text datasets by combining masked autoencoder, unsupervised embedding mapping, and back-translation.
no code implementations • 24 May 2023 • Eliya Nachmani, Alon Levkovitch, Roy Hirsch, Julian Salazar, Chulayuth Asawaroengchai, Soroosh Mariooryad, Ehud Rivlin, RJ Skerry-Ryan, Michelle Tadmor Ramanovich
Key to our approach is a training objective that jointly supervises speech recognition, text continuation, and speech synthesis using only paired speech-text data, enabling a "cross-modal" chain-of-thought within a single decoding pass.
1 code implementation • LREC 2022 • Ye Jia, Michelle Tadmor Ramanovich, Quan Wang, Heiga Zen
In addition, CVSS provides normalized translation text that matches the pronunciation in the translation speech.
no code implementations • CVPR 2022 • Michael Hassid, Michelle Tadmor Ramanovich, Brendan Shillingford, Miaosen Wang, Ye Jia, Tal Remez
In this paper, we present VDTTS, a Visually-Driven Text-to-Speech model.
no code implementations • 19 Jul 2021 • Ye Jia, Michelle Tadmor Ramanovich, Tal Remez, Roi Pomerantz
We present Translatotron 2, a neural direct speech-to-speech translation model that can be trained end-to-end.