End-to-End Speech Synthesis

Tacotron

At Google, I am now a member of the team that brought you Tacotron, an end-to-end speech synthesis system that uses neural networks to convert text directly to audio. Check out the audio samples from the recently released Tacotron 2 system, which combines Tacotron with a WaveNet-based vocoder.

Publications I contributed to are listed below.

Publications

Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron. arXiv, 2018.

Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. arXiv, 2018.

Uncovering Latent Style Factors for Expressive Speech Synthesis. NIPS ML4Audio Workshop, 2017.
