Archive

Past episodes are listed in reverse chronological order. We have high-quality transcripts for a few episodes (see here).

2024

231. Gunther Hagleitner → LLMs for Data Access – Unlocking Insights with Text-to-SQL

230. Nestor Maslej → 2024 Artificial Intelligence Index

229. Hagay Lupesko → DBRX and the Future of Open LLMs

228. Ben Lorica and Paco Nathan → Monthly Roundup – New LLMs, GTC 2024, Constraint-Driven Innovation, Model Safety, and GraphRAG

227. Steve Pike → Automating Software Upgrades – How to Combine AI and Expert Developers

226. Chetan Gupta → Generative AI in the Industrial Sphere

225. Semih Salihoglu → The Intersection of LLMs, Knowledge Graphs, and Query Generation

224. Sadegh Riazi → Unlocking the Potential of Private Data Collaboration

223. Ben Lorica and Paco Nathan → Frontiers of AI – From Text-to-Video Models to Knowledge Graphs

222. Jerry Kaplan → Where AI Systems Are Heading Next

221. 2024 Themes and Trends in AI

220. Bryan Cantrill → The AI Infrastructure Revolution: From Cloud Computing to Data Center Design

219. Evangelos Simoudis → AI in Depth: Transforming Transportation, Enterprise, and Policy

218. Sharon Zhou and Greg Diamos → Software Meets Hardware – Enabling AMD for Large Language Models

217. Uri Gneezy → Incentives are Superpowers – Mastering Motivation in the AI Era

216. Dmitriy Ryaboy → The Convergence of Biology and AI

215. Jian Zhang → AI Co-Pilots in Action – Transforming Function Calling in Cybersecurity

214. Sarmad Qadri → Tools and Techniques to Make AI Development More Accessible

213. Nir Shavit → LLMs on CPUs, Period

2023

212. Chirag Yagnik → Democratizing Wealth Management With AI

211. Juan Sequeda and Dean Allemang → Knowledge Graphs: Contextualizing Enterprise Data for More Accurate LLMs

210. Max Mergenthaler and Azul Garza Ramirez → TimeGPT: Machine Learning for Time Series, Made Accessible

209. Waleed Kadous → Best Practices for Building LLM-Backed Applications

208. Kieren James-Lubin → The Evolution of Crypto, Blockchain, and Web3

207. Ben Lorica on the Open||Source||Data podcast → Open Source Data and AI: Past, Present, Future

206. Malte Pietsch → Orchestration for LLM and RAG applications

205. Paco Nathan and Ben Lorica → Reflections from the First AI Conference in San Francisco

204. Semih Salihoglu → Kùzu – A simple, extremely fast, and embeddable graph database

203. Philipp Moritz and Goku Mohandas → Navigating the Nuances of Retrieval Augmented Generation

202. Bill Marcellino and Nathan Beauchamp-Mustafaga → The Rise of Generative AI-Powered Social Media Manipulation

201. Yucheng Low → Versioning and MLOps for Generative AI

200. Christopher Nguyen → Navigating the Generative AI Landscape

199. Sudhir Hasbe → Trends in Data Management: From Source to BI and Generative AI

198. Yishay Carmiel → AI and the Future of Speech Technologies

197. Casey Ellis → The Future of Cybersecurity – Generative AI and its Implications

196. Daniel Lenton → Ivy – The One-Stop Interface for AI Model Deployment and Development

195. Andrew Burt → Navigating the Risk Landscape – A Deep Dive into Generative AI

194. Michele Catasta → Software Development with AI and LLMs

193. Alex Chao → A Lightweight SDK for Integrating AI Models and Plugins

192. Steve Hsu → Using LLMs to Build AI Co-pilots for Knowledge Workers

191. Brian Raymond → ETL for LLMs

190. Emil Eifrem → The Future of Graph Databases

189. David Talby → Delivering Safe and Effective LLM and NLP Applications with LangTest

188. Jeff Jonas → Using Data and AI to Democratize Entity Resolution and Master Data Management

187. Jerry Liu → An Open Source Data Framework for LLMs

186. Tim Davis → Redefining AI Infrastructure

185. Andrew Feldman → The Rise of Custom Foundation Models

184. Louis Brandy → The Future of Vector Databases and the Rise of Instant Updates

183. Amin Ahmad → LLMs Are the Key to Unlocking the Next Generation of Search

182. Jonas Andrulis → Building and Deploying Foundation Models for Enterprises

181. Alex Remedios → Building Robust AI Infrastructure for Critical Solutions

180. Patrick Hall and Agus Sudjianto → Machine Learning for High-Risk Applications

179. Omar Maher → Boosting Perception With Synthetic Data

178. Simon Chan → Revolutionizing B2B: Unleashing the Power of AI and Data

177. Gev Sogomonian → AI Metadata

176. Raymond Perrault → 2023 AI Index

175. Hagay Lupesko → Custom Foundation Models

174. Jakub Zavrel → Uncovering and Highlighting AI Trends

173. Chris Wiggins → How Data and AI Happened

172. Paras Jain and Sarah Wooders → Blazing fast bulk data transfers between any cloud

171. Pablo Villalobos → Exhaustion of High-Quality Data Could Slow Down AI Progress in Coming Decades

170. Jinsung Yoon and Sercan Arik → Generating high-fidelity and privacy-preserving synthetic data

169. Brandon Jenkins → How technology is disrupting the venture capital industry

168. Zongheng Yang → Running Machine Learning Workloads On Any Cloud

167. Jesse Anderson, Evan Chan, and Ben Lorica → 2023 Trends in Data Engineering and Infrastructure

166. Gabriela Zanfir-Fortuna and Andrew Burt → Preparing for the Implementation of the EU AI Act and Other AI Regulations

165. Dylan Patel → The Open Source Stack Unleashing a Game-Changing AI Hardware Shift

164. Peter Norvig and Alfred Spector → Data Science and AI in Context

163. Percy Liang → Evaluating Language Models

162. Ben Lorica, Mikio Braun, and Jenn Webb → 2023 Opportunities and Trends – Data, Machine Learning, and AI

161. Mark Chen → Exploring DALL·E 2

2022

160. Wendy Foster and Olivia Liao → Data Science at Shopify and Stitch Fix

159. Shayan Mohanty → Building a data management system for unstructured data

158. Frank Liu → A Cloud Native Vector Database Management System

157. Ira Cohen → What’s Next for Machine Learning in Time Series

156. Roy Schwartz → Efficient Methods for Natural Language Processing

155. Andrew Burt and Bob Friday → Responsible and Trustworthy AI (Thanksgiving holiday episode)

154. Hung Bui → Building a premier industrial AI research and product group

153. Bob van Luijt → An open source, production grade vector search engine

152. Federico Garza and Max Mergenthaler Canseco → A comprehensive suite of open source tools for time series modeling

151. Christopher Nguyen → Building Safe and Reliable AI applications

150. Ram Sriharsha → A new storage engine for vectors

149. Karthik Ramasamy → Project Lightspeed: Next-generation Spark Streaming

148. Piotr Żelasko → The Unreasonable Effectiveness of Speech Data

147. Yaron Singer → Machine Learning Integrity

146. Yashar Behzadi → Synthetic data technologies can enable more capable and ethical AI

145. Sadegh Riazi → Confidential Computing for Machine Learning

144. John Bohannon → Applied NLP Research at Primer

143. Jon Udell → Using SQL to Retrieve Data from APIs and Web Services

142. Aadyot Bhatnagar → Machine Learning for Time Series Intelligence

141. Maarten Grootendorst → Unleashing the power of large language models

140. Hamza Tahir and Adam Probst → Building production-ready machine learning pipelines

139. Omri Allouche → Machine Learning at Gong

138. Danny Bickson and Amir Alush → Data Infrastructure for Computer Vision

137. Mark Chen → How DALL·E works

136. Jules Damji and Richard Liaw → Scalable, end-to-end machine learning, for everyone

135. Rick Lamers → Orchestration and Pipelines for Data Scientists

134. Devin Petersohn → Dataframes at scale

133. Nick Schrock → Software-defined Assets

132. Edmon Begoli → Adversarial Machine Learning

131. Haytham Abuelfutuh → Orchestrating Machine Learning Applications

130. Hilary Mason → Narrative AI

129. Oren Razon → Machine Learning Model Observability

128. Jeremiah Lowin → Dataflow Automation

127. Sebastian Raschka → Practical Machine Learning and Deep learning

126. Ade Fajemisin and Donato Maragno → Machine Learning for Optimization

125. Barret Zoph and Liam Fedus → Efficient Scaling of Language Models

124. Olivia Liao → Data Science at Stitch Fix

123. Jack Clark → The 2022 AI Index

122. Ajay Kulkarni and Mike Freedman → Why You Need A Time-Series Database

121. Wendy Foster → Data Science at Shopify

120. Elham Tabassi and Andrew Burt → An AI Risk Management Framework

119. Amit Sharma and Emre Kiciman → An open source and end-to-end library for causal inference

118. Leo Meyerovich → The Graph Intelligence Stack

117. Dia Trambitas-Miron and David Talby → NLP and Language Models in Healthcare and the Life Sciences

116. Simon Crosby → Delivering Continuous Intelligence at Scale

115. Nicholas Boucher → Imperceptible NLP Attacks

114. Anjali Samani → Evolving Data Science Training Programs

113. Savin Goyal → Building Machine Learning Infrastructure at Netflix and beyond

112. Moshe Wasserblat → Democratizing NLP

111. Gaurav Chakravorty → Machine Learning at Discord

110. Mike Tung → Applications of Knowledge Graphs

109. Ben Lorica and Mikio Braun in conversation with Jenn Webb → Key AI and Data Trends for 2022

2021

108. Connor Leahy and Yoav Shoham → Large Language Models

107. Azeem Ahmed → Data and Machine Learning Platforms at Shopify

106. Christopher Nguyen → What is AI Engineering?

105. Anshul Pandey → NLP and AI in Financial Services

104. Che Sharma → Modern Experimentation Platforms

103. Nic Hohn and Max Pumperla → Reinforcement Learning in Real-World Applications

102. Nikhil Muralidhar → MLOps Anti-Patterns

101. Pardhu Gunnam and Mars Lan → Why You Need a Modern Metadata Platform

100. Yoav Shoham → Making Large Language Models Smarter

99. Jeremy Stanley → AI Begins With Data Quality

98. Michel Tricot → Modernizing Data Integration

97. Hamel Husain → Deploying Machine Learning Models Safely and Systematically

96. Bob Friday → Large-scale machine learning and AI on multi-modal data

95. Viviana Acquaviva → Machine Learning in Astronomy and Physics

94. Viral Shah → The Unreasonable Effectiveness of Multiple Dispatch

93. Jike Chong and Yue Cathy Chang in conversation with Jenn Webb and Ben Lorica → How To Lead In Data Science

92. Paco Nathan in conversation with Jenn Webb and Ben Lorica → Why interest in graph databases and graph analytics are growing

91. Tara Kelly in conversation with Jenn Webb and Ben Lorica → The State of Data Journalism

90. Rayid Ghani and Andrew Burt → Auditing machine learning models for discrimination, bias, and other risks

89. Charles Martin → An oscilloscope for deep learning

88. Jesse Anderson in conversation with Jenn Webb and Ben Lorica → What’s new in data engineering

87. Sean Taylor in conversation with Jenn Webb and Ben Lorica → Changes to the data science role and to data science tools

86. Steven Feng and Eduard Hovy → Data Augmentation in Natural Language Processing

85. Brad King → Storage Technologies for a Multi-cloud World

84. Chris White in conversation with Jenn Webb and Ben Lorica → Towards a next-generation dataflow orchestration and automation system

83. Reza Hosseini and Albert Chen → Building a flexible, intuitive, and fast forecasting library

82. Sercan Arik → Neural Models for Tabular Data

81. Connor Leahy → Training and Sharing Large Language Models

80. Paolo Cremonesi and Maurizio Ferrari Dacrema → Questioning the Efficacy of Neural Recommendation Systems

79. Hyun Kim → Automation in Data Management and Data Labeling

78. Nicolas Hohn → Reinforcement Learning For the Win

77. Andrew Burt → How Companies Are Investing in AI Risk and Liability Minimization

76. Travis Addair → The Future of Machine Learning Lies in Better Abstractions

75. Yonatan Geifman and Ran El-Yaniv → Why You Should Optimize Your Deep Learning Inference Platform

74. Jerry Overton in conversation with Jenn Webb and Ben Lorica → AI Beyond Automation

73. Steve Touw → Injecting Software Engineering Practices and Rigor into Data Governance

72. Davit Buniatyan → Building a data store for unstructured data and deep learning applications

71. Zhe Zhang → How Technology Companies Are Using Ray

70. Abe Gong → Data quality is key to great AI products and services

69. Parisa Rashidi → Machine Learning in Healthcare

68. Simon Rodriguez in conversation with Jenn Webb and Ben Lorica → Measuring the Impact of AI and Machine Learning Research

67. Ryan Wisnesky → The Mathematics of Data Integration and Data Quality

66. Jian Pei → Pricing Data Products

65. Sharon Zhou in conversation with Jenn Webb and Ben Lorica → Challenges, Opportunities, and Trends in EdTech

64. Alex Wong and Sheldon Fernandez → Towards Simple, Interpretable, and Trustworthy AI

63. Assaf Araki and Ben Lorica in conversation with Jenn Webb → The Rise of Metadata Management Systems

62. Michael Mahoney → Tools for building robust, state-of-the-art machine learning models

61. Sonal Goyal and Ben Lorica in conversation with Jenn Webb → Creating Master Data at Scale with AI

60. Bruno Fernandez-Ruiz → Bringing AI and computing closer to data sources

59. Bharath Ramsundar → Deep Learning in the Sciences

58. Ira Cohen → Taking business intelligence and analyst tools to the next level

57. Omer Dror → Data exchanges and their applications in healthcare and the life sciences

2020

56. Ben Lorica and Mikio Braun in conversation with Jenn Webb → Key AI and Data Trends for 2021

55. Jesse Anderson and Ben Lorica in conversation with Jenn Webb → A Unified Management Model for Successful Data-Focused Teams

54. Dan Geer and Andrew Burt → Security and privacy for the disoriented

53. Rumman Chowdhury → The State of Responsible AI

52. Jack Morris → Improving the robustness of natural language applications

51. Yishay Carmiel → End-to-end deep learning models for speech applications

50. Ram Shankar → Securing machine learning applications

49. Marco Ribeiro → Testing Natural Language Models

48. Xinyi Zhou → Detecting Fake News

47. Neil Thompson → The Computational Limits of Deep Learning

46. Piero Molino → Making deep learning accessible

45. Mayank Kejriwal → Building and deploying knowledge graphs

44. Murat Özbayoğlu → Financial Time Series Forecasting with Deep Learning

43. Viral Shah → A programming language for scientific machine learning and differentiable programming

42. Kira Radinsky → Using machine learning to modernize medical triage and monitoring systems

41. Max Pumperla → Connecting Reinforcement Learning to Simulation Software

40. Weifeng Zhong → Using machine learning to detect shifts in government policy

39. Ofer Razon → What is AI Assurance?

38. Alan Nichol → Best practices for building conversational AI applications

37. Paco Nathan and Ben Lorica in conversation with Jenn Webb → Tools for scaling machine learning

36. Joel Grus → From Python beginner to seasoned software engineer

35. Bruno Gonçalves → Assessing Models and Simulations of Epidemic Infectious Diseases

34. Karthik Ramasamy and Arun Kejariwal → Improving the hiring pipeline for software engineers

33. Lauren Kunze → How to build state-of-the-art chatbots

32. Ameet Talwalkar → Democratizing Machine Learning

31. Denise Gosnell → How graph technologies are being used to solve complex business problems

30. Amy Heineike → Machines for unlocking the deluge of COVID-19 papers, articles, and conversations

29. Christopher Nguyen → Designing machine learning models for both consumer and industrial applications

28. Matthew Honnibal → Building open source developer tools for language applications

27. Chris Wiggins → Viewing machine learning and data science applications as sociotechnical systems

26. Andrew Burt → Identifying and mitigating liabilities and risks associated with AI

25. Arun Verma (in conversation with Jenn Webb) → How machine learning is being used in quantitative finance

24. Harish Doddi → Understanding machine learning model governance

23. Wes McKinney → Improving performance and scalability of data science libraries

22. Pete Warden → Why TinyML will be huge

21. Evan Sparks → An open source platform for training deep learning models

20. Kenneth Stanley → Algorithms that continually invent both problems and solutions

19. Bruno Gonçalves → Computational Models and Simulations of Epidemic Infectious Diseases

18. Robert Munro → Human-in-the-loop machine learning

17. Chris Nicholson → Next-generation simulation software will incorporate deep reinforcement learning

16. Solmaz Shahalizadeh → Business at the speed of AI: Lessons from Shopify

15. Edo Liberty → How deep learning is being used in search and information retrieval

14. Alejandro Saucedo → The responsible development, deployment and operation of machine learning systems

13. Edmon Begoli → Hyperscaling natural language processing

12. Krishna Gade → What businesses need to know about model explainability

11. Dean Wampler → Scalable Machine Learning, Scalable Python, For Everyone

10. Dafna Shahaf → Computational humanness, analogy and innovation, and soft concepts

9. David Talby → Building domain specific natural language applications

8. Morten Dahl → The state of privacy-preserving machine learning

7. Sijie Guo → Taking messaging and data ingestion systems to the next level

6. Bahman Bahmani → Business at the speed of AI: Lessons from Rakuten

5. Nir Shavit → The combination of the right software and commodity hardware will prove capable of handling most machine learning tasks

2019

4. Ben Lorica and Mikio Braun → Key AI and Data Trends for 2020

3. Rajat Monga → The evolution of TensorFlow and of machine learning infrastructure

2. Reza Zadeh → Building large-scale, real-time computer vision applications

1. Paco Nathan → Taking stock of foundational tools for analytics and machine learning