Towards Building State-of-the-art Text-to-Speech Systems for Indian Languages
Date: 18th Jan 2024
Time: 03:00 PM
Venue: CSD 308 / Google Meet
Details
India is linguistically diverse, with 23 official languages. Its population exceeds one billion, and the literacy rate is 74.04%. Good-quality Indic text-to-speech (TTS) synthesis systems are therefore needed to engage the general public, including those who cannot read. Building them is challenging because most Indian languages have limited or no resources available.
With the advent of neural network-based end-to-end (E2E) approaches, training TTS systems has become easier. However, E2E systems are still prone to errors in the generated audio. In this work, we explore two approaches to addressing these errors: (1) an inter-pausal unit (IPU) based approach to address word skips and repetitions, and (2) a signal processing-directed alignment approach that improves duration modelling to reduce mispronunciations in the generated audio. The advantage of these methods is that they are architecture- and language-agnostic. We show that our proposed approaches outperform state-of-the-art TTS systems available for 13 Indian languages.
Speakers
Ms. Anusha Prakash (EE17D039)
Electrical Engineering