The Data Exchange Podcast: Yoav Shoham on lessons learned building the largest language model available to developers.
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
This week’s guest is Yoav Shoham, co-founder of AI21 Labs, creators of the largest language model available to developers. Yoav is also a Professor Emeritus of Computer Science at Stanford University and a serial entrepreneur who has co-founded numerous data and AI startups. This episode offers a broad overview of NLP and language models, both where they stand today and where they are headed.
Yoav Shoham:
One of the buzzwords is neuro-symbolic programming, combining symbolic reasoning with neural network machinery. Different people have different interpretations; we have our approach to this, and others have theirs. Some people do believe that you need to embody the machine in the real world, so it gets reinforced by actual performance. Certainly, there’s an emphasis on multimodal learning. … The jury’s still out. My own bet is that neural backprop and reinforcement learning are necessary components, but not sufficient. You want to inject structured prior knowledge. … We believe in brains and brawn. So we think that large language models will continue to be important, but we think they can be much, much smarter than they are today.
… We have a large model with 178 billion parameters, but we also have a smaller model, with 7.5 billion parameters. We actually encourage people to use the small model. … And the truth is, you can get incredibly good performance if you use the large model to generate training data and then use it to fine-tune the small model. So we think that different size models will have different roles.
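The recipe Shoham sketches (generate training data with the large model, then fine-tune the small one on it) can be prototyped in a few lines. AI21 serves its Jurassic models through its own API, so the sketch below stands in with open GPT-2 checkpoints from Hugging Face transformers; the checkpoint names, prompts, and hyperparameters are illustrative assumptions, not AI21’s setup.

```python
# Minimal sketch of the "large model generates data, small model gets fine-tuned"
# idea. The GPT-2 checkpoints, prompts, and hyperparameters are stand-in assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1. Use a large "teacher" model to generate synthetic training examples.
teacher_tok = AutoTokenizer.from_pretrained("gpt2-xl")
teacher = AutoModelForCausalLM.from_pretrained("gpt2-xl").to(device).eval()

prompts = [
    "Summarize: Large language models are",
    "Summarize: Neuro-symbolic AI combines",
]
synthetic_texts = []
for prompt in prompts:
    inputs = teacher_tok(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        out = teacher.generate(
            **inputs,
            max_new_tokens=64,
            do_sample=True,
            top_p=0.9,
            pad_token_id=teacher_tok.eos_token_id,
        )
    synthetic_texts.append(teacher_tok.decode(out[0], skip_special_tokens=True))

# 2. Fine-tune a much smaller "student" model on the generated text.
student_tok = AutoTokenizer.from_pretrained("distilgpt2")
student = AutoModelForCausalLM.from_pretrained("distilgpt2").to(device).train()
optimizer = torch.optim.AdamW(student.parameters(), lr=5e-5)

for epoch in range(3):
    for text in synthetic_texts:
        batch = student_tok(text, return_tensors="pt", truncation=True).to(device)
        # Standard causal-LM objective: the labels are the input ids themselves.
        loss = student(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=batch["input_ids"],
        ).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

student.save_pretrained("student-finetuned")
```

In practice you would generate far more than a handful of examples and filter them for quality before fine-tuning, but the division of labor is the same: the expensive model runs once to produce data, and the cheap model serves production traffic.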
Highlights in the video version:
- AI21 Labs, Training Billion-Parameter Models, and Large Language Models Available to Developers
- State of AI in the 1980s, 1990s, and Today
- Impact on Language, Object Recognition, Transformers, and Academic Benchmarks
- Natural Language Understanding and Prompt Engineering
- Injecting Structured Knowledge, Semantics, and Priors into the Networks
- Will language models become more manageable over time?
- WordTune.com
- LLMs: Size Will Matter, AI21 Labs’ Jurassic Language Model, Using Large and Small Models
- Tuning Language Models: Training Data and Tuning Setup
- NLP: Benchmarks and Performance
- Academia vs. Industry, Resources, and Engineering Talent
- NeurIPS and Deep Learning
- Neural Networks, Reinforcement Learning, and Neuro-Symbolic Programming
- Theoretical Computer Scientists, Understanding How Things Work, What Else Do We Know?
- Is creativity hard to measure? Can computers think, have feelings, have free will?
- Summarization, Benchmarks, and Evaluation Methods
- NLP, Tuning, and Thoughts on Transfer Learning
- Multimodal Systems, Copilot, and Other Coding Assistants
- What will AI21 Labs focus on moving forward?
- What’s your advice to someone who wants to get a PhD in NLP?
Related content:
- A video version of this conversation is available on our YouTube channel.
- Connor Leahy: “Training and Sharing Large Language Models”
- “Resurgence of Conversational AI”
- Matthew Honnibal: “Building open source developer tools for language applications”
- Alan Nichol: “Best practices for building conversational AI applications”
- Lauren Kunze: “How to build state-of-the-art chatbots”
- Charles Martin: “An oscilloscope for deep learning”
- Rumman Chowdhury: “The State of Responsible AI”
Subscribe to our Newsletter:
We also publish a popular newsletter where we share highlights from recent episodes, trends in AI / machine learning / data, and a collection of recommendations.
[Photo by Amador Loureiro on Unsplash.]