Research Areas and Projects
Speech Synthesis with Deep Learning
- Artificial neural networks as a powerful modelling tool
- Barriers to depth: difficult convergence and high computational cost
The critical path to depth
- Pre-training with auto-encoders or restricted Boltzmann machines
- Advances in hardware: multi-CPU systems and GPUs
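As an illustration of the pre-training item above, the following is a minimal sketch of greedy layer-wise pre-training with stacked auto-encoders. It assumes sigmoid units, tied encoder/decoder weights, a squared-error reconstruction loss, and full-batch gradient descent. These choices, and the function names `pretrain_autoencoder` and `greedy_pretrain`, are illustrative assumptions and are not taken from the references. The restricted Boltzmann machine variant is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_autoencoder(x, n_hidden, epochs=300, lr=0.2, seed=0):
    """Train one tied-weight auto-encoder layer (assumed setup, see lead-in).

    Returns the encoder weights, the hidden bias, and the per-epoch
    mean squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_visible = x.shape
    W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
    b_h = np.zeros(n_hidden)
    b_v = np.zeros(n_visible)
    losses = []
    for _ in range(epochs):
        h = sigmoid(x @ W + b_h)        # encode
        r = sigmoid(h @ W.T + b_v)      # decode with the transposed weights
        err = r - x
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the squared error through both sigmoid layers.
        d_r = err * r * (1.0 - r)
        d_h = (d_r @ W) * h * (1.0 - h)
        # W is shared, so its gradient has an encoder and a decoder term.
        grad_W = (x.T @ d_h + d_r.T @ h) / n_samples
        W -= lr * grad_W
        b_h -= lr * d_h.mean(axis=0)
        b_v -= lr * d_r.mean(axis=0)
    return W, b_h, losses

def greedy_pretrain(x, layer_sizes, **kwargs):
    """Pre-train one layer at a time; each layer is trained on the
    hidden activations of the layer below it."""
    params = []
    layer_input = x
    for n_hidden in layer_sizes:
        W, b, _ = pretrain_autoencoder(layer_input, n_hidden, **kwargs)
        params.append((W, b))
        layer_input = sigmoid(layer_input @ W + b)
    return params

# Toy data standing in for real feature vectors (hypothetical sizes).
data = np.random.default_rng(1).random((64, 8))
stack = greedy_pretrain(data, [4, 2])
```

The returned `(W, b)` pairs would typically initialise the hidden layers of a deep network before supervised fine-tuning with backpropagation.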
Applying Deep Learning to Speech Synthesis
The new power of the deep network
Replacing components of existing systems with a deep network
- Decision tree/clustering
- Prosody model
- Direct parameter model
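To make the direct parameter model concrete, here is a minimal sketch of a feed-forward network that maps a frame-level linguistic feature vector straight to a vector of acoustic parameters, the role that decision-tree clustering plays in a conventional statistical parametric system. The layer sizes, the tanh hidden units, the linear output layer, and the names `init_mlp` and `predict_acoustic` are assumptions chosen for illustration. The network is untrained, so only the data flow is shown.

```python
import numpy as np

def init_mlp(layer_sizes, seed=0):
    """Randomly initialise (weights, bias) pairs for consecutive layers."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def predict_acoustic(params, linguistic):
    """Forward pass: tanh hidden layers, linear output for regression."""
    h = linguistic
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b

# Hypothetical sizes: 300 linguistic features in, three hidden layers,
# 40 acoustic parameters out (for example spectral coefficients and F0).
params = init_mlp([300, 256, 256, 256, 40])

# Five frames that share the same (dummy) linguistic context.
frames = np.zeros((5, 300))
frames[:, :3] = 1.0
acoustic = predict_acoustic(params, frames)  # one acoustic vector per frame
```

Unlike a decision tree, which assigns every context in a leaf the same cluster statistics, the network output varies continuously with the input features once the weights are trained.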
Key References and Links
- Y. Bengio, "Learning Deep Architectures for AI," Foundations and Trends in Machine Learning, 2(1): 1-127, 2009.
- A. Fischer & C. Igel, "An Introduction to Restricted Boltzmann Machines," 2012.
- Deep Learning Tutorial, LISA lab, University of Montreal, 2014.
- H. Zen, "Deep Learning in Speech Synthesis," Google, 2013.
- Z. Ling et al., "Deep Learning for Acoustic Modeling in Parametric Speech Generation," IEEE Signal Processing Magazine, 32(3): 35-52, May 2015.