Non-Parallel Training Approach for Emotional Voice Conversion Using CycleGAN

This work proposes a non-parallel emotional voice conversion method for Egyptian Arabic speech. The method changes the emotion-related features of a speech signal without changing its lexical content or speaker identity. We rely on the assumption that any speech signal can be decomposed into a content code and a style code; conversion between emotion domains is then performed by combining the style code of the target emotion with the content code of the input speech signal. We evaluated the model on an Egyptian Arabic dataset covering two emotion domains, and a survey of randomly selected listeners judged the conversions successful. Our aim is to produce a state-of-the-art pre-trained model which, to the best of our knowledge, would be the first of its kind for Egyptian Arabic.
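
The content/style assumption can be illustrated with a toy sketch. This is not the model described in the paper: the learned encoders and the CycleGAN generator are replaced here by simple per-dimension mean and standard deviation statistics, a stand-in chosen only to make the idea concrete. The function names (`encode`, `decode`, `convert`) and the random feature matrices are hypothetical.

```python
import numpy as np


def encode(features, eps=1e-8):
    """Split a (frames, dims) feature matrix into a content code and a style code.

    Assumption for illustration only: the style code is the per-dimension
    mean and standard deviation over time, and the content code is what
    remains after normalising by those statistics.
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0) + eps
    content = (features - mean) / std
    return content, (mean, std)


def decode(content, style):
    """Recombine a content code with a style code into a feature matrix."""
    mean, std = style
    return content * std + mean


def convert(source, target_reference):
    """Keep the content of `source`, apply the style of `target_reference`."""
    content, _ = encode(source)
    _, target_style = encode(target_reference)
    return decode(content, target_style)


# Hypothetical stand-ins for acoustic features from two emotion domains.
rng = np.random.default_rng(0)
source_emotion = rng.normal(loc=0.0, scale=1.0, size=(200, 4))
target_emotion = rng.normal(loc=2.0, scale=3.0, size=(150, 4))

converted = convert(source_emotion, target_emotion)
```

In this sketch the converted features take on the statistics of the target domain while their normalised trajectory, the content code, stays the same as that of the source. The two inputs do not need matching lengths or aligned frames, which mirrors the non-parallel setting.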
