The Data Exchange Podcast: David Talby on Spark NLP and turning NLP research into enterprise solutions.
In this episode of the Data Exchange I speak with David Talby, co-creator of Spark NLP, an open source, highly scalable, production-grade natural language processing (NLP) library. Spark NLP has become one of the more popular NLP libraries and is available on PyPI, Conda, Maven, and Spark Packages. With recent research advances in large-scale natural language models, there is strong interest in domain-specific natural language applications. Besides their work on Spark NLP, David and his collaborators are building natural language models tuned specifically for healthcare applications.
Our conversation spanned many topics, including:
- Spark NLP: its current status and some common and surprising use cases.
- Recent developments in NLP research and their implications for companies.
- Spark NLP for Healthcare.
Our goal in this podcast is to build a community of people interested in Data, Machine Learning and AI. If you have suggestions on what we should recommend (books, conferences, links) or on guests to book, please visit TheDataExchange.media and fill out the “contact” form.
Related content:
- David Talby on “Building a natural language processing library for Apache Spark”
- One simple chart: Who is interested in Spark NLP?
- Reza Zadeh on “Building large-scale, real-time computer vision applications”
- David Talby on “Lessons learned building natural language processing systems in health care”
- Rajat Monga on “The evolution of TensorFlow and of machine learning infrastructure”
- “Managing machine learning in the enterprise: Lessons from banking and health care”
[Photo from Free Stock Photos.biz: A woman sitting in a library.]