The Data Exchange Podcast: Travis Addair on how higher levels of abstraction enable non-experts to build efficient machine learning models.
This week’s guest is Travis Addair, who previously led the team responsible for building Uber’s deep learning infrastructure. Travis is deeply involved with two popular open source projects related to deep learning:
- He is a maintainer of Horovod, a distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
- He is a co-maintainer of Ludwig, a toolbox that allows users to train and test deep learning models without writing code.
We discussed both projects in detail, along with trends in MLOps and machine learning platforms. We also covered how Uber has been using Ray to simplify and optimize its machine learning infrastructure and workloads:
When I was the tech lead at the deep learning training team at Uber, one of the things that I did was to shift a lot of our platform off of this bifurcated Spark/Horovod model into a single Ray-based model. The goal was to get to a point where feature processing, distributed training, all of this happens through a single Ray-based pipeline. So you can get rid of these separate stages (some stages involve Spark, others involve Horovod) and this lets you do more optimization between stages, such as feature transformation and model training. Fusing these steps together into a single graph definition results in a single entity that we can then serve for real-time serving as well.
… The state of the world as I was departing Uber was that we had Ray working for a lot of Horovod distributed training. And we had proof of concepts in place for using Ray for doing some data processing as well as for doing hyperparameter search. The long-term vision being to consolidate all these things together into a single piece of infrastructure.
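To make the idea of fusing feature transformation and model training into a single servable entity more concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: it does not use Ray, Spark, or Horovod, it is not Uber's implementation, and every name in it (`Standardizer`, `LinearModel`, `FusedPipeline`) is hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Standardizer:
    """Hypothetical feature-processing stage: learns a mean and scale."""
    mean: float = 0.0
    scale: float = 1.0

    def fit(self, xs: List[float]) -> None:
        self.mean = sum(xs) / len(xs)
        variance = sum((x - self.mean) ** 2 for x in xs) / len(xs)
        # Fall back to 1.0 so constant input does not cause division by zero.
        self.scale = variance ** 0.5 or 1.0

    def transform(self, xs: List[float]) -> List[float]:
        return [(x - self.mean) / self.scale for x in xs]


@dataclass
class LinearModel:
    """Hypothetical training stage: one-feature least-squares fit."""
    weight: float = 0.0
    bias: float = 0.0

    def fit(self, xs: List[float], ys: List[float]) -> None:
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.weight = sxy / sxx if sxx else 0.0
        self.bias = mean_y - self.weight * mean_x

    def predict(self, xs: List[float]) -> List[float]:
        return [self.weight * x + self.bias for x in xs]


@dataclass
class FusedPipeline:
    """Both stages in one object, so training and serving share one definition."""
    features: Standardizer
    model: LinearModel

    def fit(self, raw_xs: List[float], ys: List[float]) -> None:
        self.features.fit(raw_xs)
        self.model.fit(self.features.transform(raw_xs), ys)

    def predict(self, raw_xs: List[float]) -> List[float]:
        # Serving applies the same learned transformation before the model.
        return self.model.predict(self.features.transform(raw_xs))


pipeline = FusedPipeline(Standardizer(), LinearModel())
pipeline.fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])  # toy data, y = 2x + 1
prediction = pipeline.predict([5.0])[0]
print(prediction)
```

Because the fitted transformation travels with the model, the serving path cannot drift from the training path, which is one benefit of treating the stages as a single definition rather than as separate systems.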
Related content and resources:
- A video version of this conversation is available on our YouTube channel.
- Zhe Zhang: “How Technology Companies Are Using Ray”
- Piero Molino: “Making deep learning accessible”
- Max Pumperla: “Connecting Reinforcement Learning to Simulation Software”
- Dean Wampler: “Scalable Machine Learning, Scalable Python, For Everyone”
[Image from pxhere.]