The Data Exchange Podcast: Davit Buniatyan on tensorial data stores optimized for deep learning.
In this episode of the Data Exchange, I speak with Davit Buniatyan, founder and CEO of ActiveLoop, a startup building data management tools for unstructured data types commonly associated with deep learning.
Davit and his team have worked in many areas pertaining to machine learning and MLOps, and over time they realized that existing data management solutions are not well suited to unstructured data types such as video, images, and audio. To that end, they created Hub, an open source data management solution, and have been working on a Ray integration for their data store:
- ❛ Video, audio, text: if you store these data types using data management tools for structured or semi-structured data, the tools that are there are already very optimized. When you talk about images, video, audio, what people do is that they store it in files, or in an object store in blob files.
In fact, if you do a Google search like “Can you give me a database for images?”, you will find responses on Stack Overflow that recommend that you store your metadata in a SQL database. People typically store the location of their images in a file system. This is inefficient when you’re trying to do training or any computation.
So we thought that there is an opportunity where we can come up with what we call a tensor database. It’s really more like a data store to represent any unstructured data set in tensorial form, where you can natively stream to a deep learning or machine learning process.
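To make the idea in the quote concrete, here is a minimal sketch of the general technique: packing equally shaped samples into one contiguous, tensor-like file and streaming them back in batches, instead of opening one file per image. It is not Hub's API or implementation. The file layout and the names `write_tensor_file`, `stream_batches`, and `demo` are hypothetical, chosen only for illustration, and the example uses nothing beyond the Python standard library.

```python
import os
import struct
import tempfile

# Hypothetical header layout: num_samples, height, width, channels (uint32 each).
HEADER = struct.Struct("<4I")


def write_tensor_file(path, samples, height, width, channels):
    """Pack equally shaped uint8 samples back to back in a single file."""
    sample_size = height * width * channels
    with open(path, "wb") as f:
        f.write(HEADER.pack(len(samples), height, width, channels))
        for sample in samples:
            if len(sample) != sample_size:
                raise ValueError("every sample must match the declared shape")
            f.write(sample)


def stream_batches(path, batch_size):
    """Yield (shape, raw_bytes) pairs, using one sequential read per batch."""
    with open(path, "rb") as f:
        n, h, w, c = HEADER.unpack(f.read(HEADER.size))
        sample_size = h * w * c
        for start in range(0, n, batch_size):
            count = min(batch_size, n - start)
            yield (count, h, w, c), f.read(count * sample_size)


def demo():
    # Five fake 2x2 RGB "images"; image i is filled with the byte value i.
    samples = [bytes([i]) * (2 * 2 * 3) for i in range(5)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "images.tensor")
        write_tensor_file(path, samples, 2, 2, 3)
        return [shape for shape, _ in stream_batches(path, batch_size=2)]
```

Calling `demo()` returns the shape of each streamed batch. A training loop would consume the raw bytes of each batch rather than looking up and opening a separate image file for every sample.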
- A video version of this conversation is available on our YouTube channel.
- FREE Report: 2021 Trends in Data, Machine Learning, and AI
- Edo Liberty: “How deep learning is being used in search and information retrieval”
- Ameet Talwalkar: “Democratizing Machine Learning”
- Neil Thompson: “The Computational Limits of Deep Learning”
- Jian Pei: “Pricing Data Products”
- Assaf Araki and Ben Lorica: The Growing Importance of Metadata Management Systems
- Piero Molino: “Making deep learning accessible”
[Image: Forklift Warehouse Machine Worker Industry Pallet from Pixabay.]