15 May 2020 • Dan Lim, Won Jang, Gyeonghwan O, Heayoung Park, Bong-Wan Kim, Jaesam Yoon
We propose the Jointly trained Duration Informed Transformer (JDI-T), a feed-forward Transformer whose duration predictor is trained jointly with the rest of the model, without explicit alignments, to generate an acoustic feature sequence from input text.
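
As a rough illustration of what "duration informed" can mean in a feed-forward model, the sketch below expands per-token encodings to frame length by repeating each one according to a duration. This is an assumption about the general technique (duration-based upsampling), not the JDI-T implementation; the function name `expand_by_duration` and the toy values are hypothetical.

```python
import numpy as np


def expand_by_duration(encodings: np.ndarray, durations: np.ndarray) -> np.ndarray:
    """Repeat the i-th token encoding durations[i] times along the time axis.

    encodings: array of shape (num_tokens, dim)
    durations: integer array of shape (num_tokens,), frames per token
    returns:   array of shape (sum(durations), dim)
    """
    durations = np.asarray(durations, dtype=np.int64)
    if encodings.shape[0] != durations.shape[0]:
        raise ValueError("one duration per input token is required")
    if np.any(durations < 0):
        raise ValueError("durations must be non-negative")
    return np.repeat(encodings, durations, axis=0)


# Toy input: 3 tokens with 2-dimensional encodings (made-up values).
token_encodings = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
# Hypothetical durations: token 0 spans 2 frames, token 1 none, token 2 spans 3.
frame_counts = np.array([2, 0, 3])

frames = expand_by_duration(token_encodings, frame_counts)
print(frames.shape)
```

In a model of this kind, the durations would come from the duration predictor at inference time, and the expanded sequence would be passed to the layers that produce acoustic features.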