This research focuses on making speech recognition more accurate through end-to-end models and scale. In the past I also worked on hybrid HMM-based speech recognition.
- Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions, Awni Hannun, Ann Lee, Qiantong Xu, Ronan Collobert. Interspeech 2019. (paper, code)
- Wav2Letter++: A Fast Open-source Speech Recognition System, Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert. ICASSP 2019. (paper, code, blog)
- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, SVAIL. ICML 2016.
Mentions: MIT Tech Review, MIT Tech Review
- Deep Speech: Scaling up end-to-end speech recognition, Awni Y. Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, Andrew Y. Ng. arXiv:1412.5567, 2014.
- Rectifier Nonlinearities Improve Neural Network Acoustic Models, Andrew L. Maas, Awni Y. Hannun, and Andrew Y. Ng. ICML Workshop on Deep Learning for Audio, Speech, and Language Processing (WDLASL 2013). (pdf)