Waleed Kadous on open source LLMs, fine tuning, RAG, and productionizing LLM applications.
Subscribe: Apple • Spotify • Overcast • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Waleed Kadous, Chief Scientist at Anyscale[1], is one of my go-to experts for best practices on building applications leveraging large language models. He has authored pivotal articles that I regularly reference, including:
- Fine tuning is for form, not facts
- Llama 2 is about as factually accurate as GPT-4 for summaries and is 30X cheaper
- Reproducible Performance Metrics for LLM inference
- LLM-based summarization: A case study of human, Llama 2 70b and GPT-4 summarization quality
- LLMs In Production: Learning From Experience (video of Waleed’s talk at the AI Conference)
Interview highlights – key sections from the video version:
- Open Source LLMs: when and how to use them
- Code Llama vs. GitHub Copilot
- Deploying open source LLMs
- Fine tuning LLMs
- Using GPT to create fine tuning datasets
- Retrieval augmented generation
- RAG at scale and the role of LLMs in RAG
- Evaluating RAG and experimenting with different RAG configurations and settings
- Reimagining “AutoML” in the age of LLMs
- Mixture of experts
- AMD and other hardware options for LLM inference
- Supply of open source LLMs
Related content:
- A video version of this conversation is available on our YouTube channel.
- Philipp Moritz and Goku Mohandas: Navigating the Nuances of Retrieval Augmented Generation
- Ivy: Streamlining AI Model Deployment and Development
- Best Practices in Retrieval Augmented Generation
- OpenAI Developer Conference: Customizable AI Sparks Excitement and Concern
- Expanding access to Frontier Models with software and hardware optimizations
- Open Source Principles in Foundation Models
- Michele Catasta: Software Development with AI and LLMs
If you enjoyed this episode, please support our work by encouraging your friends and colleagues to subscribe to our newsletter.
[1] Ben Lorica is an advisor to Anyscale and other startups.