The Data Exchange Podcast: Yishay Carmiel on recent progress in speech technologies.
In this episode of the Data Exchange, I speak with Yishay Carmiel, an AI Leader at Avaya, a company focused on digital communications. He has long been immersed in speech technologies and conversational applications, and I have frequently turned to him to understand the latest in speech systems. We previously co-wrote an article listing recommendations for teams building speech applications, and we had an earlier conversation on the impact of deep learning and big data on speech technologies.
We focused on recent developments in speech technologies. We recorded this podcast right after the NLP Summit, where one of the keynotes was presented by Bo Li, a noted researcher from Google. In his keynote, Bo gave an overview of end-to-end speech models for ASR (automatic speech recognition). An end-to-end model folds functions that traditionally lived in separate components (such as the acoustic, pronunciation, and language models) into a single neural network and optimizes them jointly.
Given that he works with both researchers and application builders, I devoted this episode to understanding Yishay’s perspective on the rise of end-to-end ASR models, text-to-speech systems, and responsible AI in the context of speech technologies. According to Yishay, real-time, end-to-end deep learning models are not yet widely available in open source:
Usually when I’m thinking about cutting-edge speech recognition systems, I’m thinking of ‘real-time’ speech recognition: this means that while I’m talking the system is already doing some speech recognition. For end-to-end systems there isn’t some open source solution or open architecture that you can use to build a ‘real-time’ speech recognition system. … End-to-end, ‘real-time’ systems represent the future.
We also publish a popular newsletter where we share highlights from recent episodes, trends in AI / machine learning / data, and a collection of recommendations.
- A video version of this conversation is available on our YouTube channel.
- Download the 2020 NLP Survey Report and learn how companies are using and implementing natural language technologies.
- Got speech? These guidelines will help you get started building voice applications
- Yishay Carmiel: “Commercial speech recognition systems in the age of big data and deep learning”
- Alan Nichol: “Best practices for building conversational AI applications”
- Matthew Honnibal: “Building open source developer tools for language applications”
- Marco Ribeiro: “Testing Natural Language Models”