Making Large Language Models Smarter

The Data Exchange Podcast: Yoav Shoham on lessons learned building the largest language model available to developers.


Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.

This week’s guest is Yoav Shoham, co-founder of AI21 Labs, creators of the largest language model available to developers. Yoav is also a Professor Emeritus of Computer Science at Stanford University and a serial entrepreneur who has co-founded numerous data and AI startups. This episode offers a great overview of where NLP and language models stand today and where they are headed.

Download the 2021 NLP Survey Report and learn how companies are using and implementing natural language technologies.

Yoav Shoham:

One of the buzzwords is neuro-symbolic programming, combining symbolic reasoning with neural network machinery. Different people have different interpretations; we have our approach to this, and others have theirs. Some people do believe that you need to embody the machine in the real world, so it gets reinforced by actual performance. Certainly, there’s an emphasis on multimodal learning. … the jury’s still out. My own bet is that neural backprop and reinforcement learning are necessary components, but not sufficient. You want to inject structured prior knowledge.

… We believe in brains and brawn. So we think that large language models will continue to be important, but we think they can be much, much smarter than they are today.

… We have a large model with 178 billion parameters, but we also have a smaller model, with 7.5 billion parameters. We actually encourage people to use the small model. … And the truth is, you can get incredibly good performance if you use the large model to generate training data and then use that data to fine-tune the small model. So we think that different size models will have different roles.
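To make that workflow concrete, here is a minimal, hypothetical sketch of the "generate with the large model, fine-tune the small one" idea. It uses open-source Hugging Face models (gpt2-xl and distilgpt2) purely as stand-ins; the model names, prompt, and hyperparameters are illustrative assumptions, not AI21's actual models or setup.

```python
# Hedged sketch: distill a task into a small model by (1) generating synthetic
# training text with a larger model and (2) fine-tuning a smaller model on it.
# All model names, prompts, and hyperparameters below are placeholders.
from transformers import (AutoTokenizer, AutoModelForCausalLM,
                          Trainer, TrainingArguments, pipeline)
from datasets import Dataset

# 1. Use a larger model (stand-in: gpt2-xl) to generate synthetic training examples.
generator = pipeline("text-generation", model="gpt2-xl")
prompt = "Write a short, upbeat product description for a coffee grinder:\n"
outputs = generator(prompt, num_return_sequences=8, max_new_tokens=60, do_sample=True)
synthetic = [o["generated_text"] for o in outputs]

# 2. Fine-tune a much smaller model (stand-in: distilgpt2) on the generated data.
small_name = "distilgpt2"
tokenizer = AutoTokenizer.from_pretrained(small_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(small_name)

def tokenize(batch):
    # Causal LM fine-tuning: the labels are the input tokens themselves.
    enc = tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)
    enc["labels"] = enc["input_ids"].copy()
    return enc

train_ds = Dataset.from_dict({"text": synthetic}).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="distilled-small-model",
                           per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=train_ds,
)
trainer.train()
```

The point of the sketch is the division of labor Shoham describes: the expensive "brains" model is only queried to produce data, while the cheap "brawn" model is the one deployed and fine-tuned.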

Subscribe to our Newsletter:
We also publish a popular newsletter where we share highlights from recent episodes, trends in AI / machine learning / data, and a collection of recommendations.


[Photo by Amador Loureiro on Unsplash.]