no code implementations • 25 Jan 2024 • Sunghee Jung, Won Jang, Jaesam Yoon, BongWan Kim
Zero-shot TTS requires extra effort to ensure clear pronunciation and speech quality, because a core parameter (the speaker embedding or acoustic prompt) must be replaced with a new one at inference time.
6 code implementations • 15 Jun 2021 • Won Jang, Dan Lim, Jaesam Yoon, BongWan Kim, Juntae Kim
Using full-band mel-spectrograms as input, we expect to generate high-resolution signals by adding a discriminator that takes spectrograms of multiple resolutions as its input.
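To make "spectrograms of multiple resolutions" concrete, the following is a minimal NumPy sketch that computes linear magnitude spectrograms of one waveform under several STFT settings. It only illustrates the kind of input such a discriminator could receive and is not the authors' implementation. The function names, the Hann window, and the (n_fft, hop_length, win_length) triples in `RESOLUTIONS` are assumptions chosen for illustration, not values stated in the text above.

```python
import numpy as np

# Hypothetical (n_fft, hop_length, win_length) settings, chosen for illustration.
RESOLUTIONS = [(1024, 120, 600), (2048, 240, 1200), (512, 50, 240)]


def magnitude_spectrogram(signal, n_fft, hop_length, win_length):
    """Return |STFT| with shape (n_fft // 2 + 1, n_frames).

    Assumes len(signal) >= win_length and n_fft >= win_length.
    """
    window = np.hanning(win_length)
    n_frames = 1 + (len(signal) - win_length) // hop_length
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop_length: i * hop_length + win_length] * window
        for i in range(n_frames)
    ])
    # rfft with n=n_fft zero-pads every frame up to n_fft samples.
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1)).T


def multi_resolution_spectrograms(signal, resolutions=RESOLUTIONS):
    """Compute one magnitude spectrogram per STFT setting."""
    return [magnitude_spectrogram(signal, *res) for res in resolutions]
```

A longer window with a larger FFT gives finer frequency resolution but fewer frames, while a shorter window gives the opposite trade-off. A discriminator that sees several such views can therefore judge both spectral and temporal detail of the generated waveform.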