The Data Exchange

Evaluating Language Models

Percy Liang on new tools for the Holistic Evaluation of Language Models.


Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.

Percy Liang is Associate Professor of Computer Science and Statistics, and Director of the new Center for Research on Foundation Models at Stanford University. We discussed a new suite of tools (HELM) designed to help users and researchers understand language models in their totality. We also discussed recent trends in AI, including the rise of Generative AI and Foundation Models.

Subscribe to the Gradient Flow Newsletter

If there’s only API access, that’s not enough. It depends on how technical a company is, how much it wants to invest, how much data it has, and whether it is comfortable shipping data to an API or to someone else. I don’t think there will be just one GPT-like model that rules them all. It will come down to the dynamics of how organizations are structured, and to considerations like trust, cost, and other factors.

– Percy Liang on the likely rise of decentralized custom models.

Interview highlights – key sections from the video version:

  1. What are language models?
  2. What is HELM (Holistic Evaluation of Language Models)?
  3. Using HELM
  4. Metrics they plan to add to HELM
  5. The impact of Model Size – key findings from HELM
  6. “Private” vs “Public” models
  7. Are we going to run out of data?
  8. Fine-tuning and pre-training
  9. Foundation Models
  10. HELM roadmap and schedule
  11. The Center for Research on Foundation Models at Stanford University


If you enjoyed this episode, please support our work by encouraging your friends and colleagues to subscribe to our newsletter:



[Image: Evaluating a Language Model, by Ben Lorica, with images generated using DALL-E 2.]
