1 code implementation • 5 Apr 2022 • Venkatesh S. Kadandale, Juan F. Montesinos, Gloria Haro
Finally, we use the frozen visual features learned by our lip synchronisation model in the singing voice separation task to outperform a baseline audio-visual model which was trained end-to-end.
1 code implementation • 8 Mar 2022 • Juan F. Montesinos, Venkatesh S. Kadandale, Gloria Haro
In a second stage, the predominant voice is enhanced with an audio-only network.
2 code implementations • 20 Apr 2021 • Juan F. Montesinos, Venkatesh S. Kadandale, Gloria Haro
The task of isolating a target singing voice in music videos has useful applications.
2 code implementations • 23 Mar 2020 • Venkatesh S. Kadandale, Juan F. Montesinos, Gloria Haro, Emilia Gómez
However, Conditioned U-Net (C-U-Net) uses a control mechanism to train a single model for multi-source separation, aiming for performance comparable to that of the dedicated models.
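The idea of steering a single separation model with a control input can be illustrated with a small sketch. The excerpt does not say which control mechanism is used, so the feature-wise scale-and-shift (FiLM-style) conditioning below is an assumption. The function name `film`, the weight matrices, and the four-source setup are all hypothetical, and random arrays stand in for learned parameters and real spectrogram features.

```python
import numpy as np

def film(features, condition, w_gamma, w_beta):
    """Scale and shift each channel of `features` according to `condition`.

    features:  (channels, freq, time) feature maps from one network layer
    condition: (n_sources,) one-hot vector selecting the target source
    w_gamma, w_beta: (n_sources, channels) hypothetical learned weights
    """
    gamma = condition @ w_gamma  # per-channel scale, shape (channels,)
    beta = condition @ w_beta    # per-channel shift, shape (channels,)
    # Broadcast the per-channel parameters over the freq and time axes.
    return gamma[:, None, None] * features + beta[:, None, None]

rng = np.random.default_rng(0)
n_sources, channels = 4, 8  # assumed sizes, for illustration only

features = rng.standard_normal((channels, 16, 32))
w_gamma = rng.standard_normal((n_sources, channels))
w_beta = rng.standard_normal((n_sources, channels))

# The same layer and weights serve every source; only the one-hot
# control vector changes between requests.
select_first = np.eye(n_sources)[0]
select_second = np.eye(n_sources)[1]
out_first = film(features, select_first, w_gamma, w_beta)
out_second = film(features, select_second, w_gamma, w_beta)
```

Because only the control vector differs between the two calls, one set of network weights can be shared across all target sources, which is the contrast with training a dedicated model per source.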