Dylan Patel on the open source software stack poised to enable more AI hardware accelerators.
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Dylan Patel is the Chief Analyst at SemiAnalysis, a boutique semiconductor research and consulting firm focused on the semiconductor supply chain, from chemical inputs to fabs to design IP and strategy. In this episode, we discuss the emerging open source software stack for PyTorch that makes it easier and more accessible to implement non-Nvidia backends (see his recent post). Many people have long surmised that there will be successful accelerators besides Nvidia GPUs and Google TPUs. Unfortunately, the companies behind new hardware accelerators lack the resources to build a software stack that rivals CUDA or XLA. A natural solution is for other players to build an open source software stack that reaches all the way down to the accelerator instruction set. The hope is that such a stack matures and eventually becomes a viable alternative to CUDA.
We also covered the market share of deep learning frameworks (PyTorch, TensorFlow, JAX), the latest on RISC-V, and the Biden administration’s export controls for “Certain Advanced Computing and Semiconductor Manufacturing Items; Supercomputer and Semiconductor End Use”.
Interview highlights – key sections from the video version:
- Market share of deep learning frameworks: PyTorch, TensorFlow, JAX
- Trends in hardware: FLOPS, Memory, and Memory Bandwidth
- Bifurcating trends: comparing FLOPS and Memory
- Importance of Inference vs Training
- The emerging open source software stack built on Triton and PyTorch 2
- Prediction time: how successful will this OSS stack be one year from today?
- Will TensorFlow and JAX be part of this shift?
- RISC-V
- Geopolitics: the Biden administration’s export controls for Certain Advanced Computing and Semiconductor Manufacturing Items
- A video version of this conversation is available on our YouTube channel.
- Specialized Hardware for AI: Rethinking Assumptions and Implications for the Future
- Percy Liang: Evaluating Language Models
- Roy Schwartz: Efficient Methods for Natural Language Processing
- Barret Zoph and Liam Fedus: Efficient Scaling of Language Models
- Connor Leahy and Yoav Shoham: Large Language Models
- Foundation Models: A Primer for Investors and Builders
- Machine Learning Trends You Need To Know
- Mark Chen of OpenAI: How DALL·E works
- Andrew Feldman of Cerebras (from 2018): Specialized hardware for deep learning will unleash innovation
If you enjoyed this episode, please support our work by encouraging your friends and colleagues to subscribe to our newsletter:
[Image: Open Source Software Stack for AI Hardware, by Ben Lorica.]