Building a data store for unstructured data and deep learning applications

The Data Exchange Podcast: Davit Buniatyan on tensorial data stores optimized for deep learning.



In this episode of the Data Exchange, I speak with Davit Buniatyan, founder and CEO of ActiveLoop, a startup building data management tools for unstructured data types commonly associated with deep learning.


Davit and his team have worked in many areas of machine learning and MLOps, and over time they realized that existing data management solutions are not well suited to unstructured data types such as video, images, and audio. To that end, they created Hub, an open source data management solution, and have been working on a Ray integration for their data store:

    ❛ Video, audio, text: if you store these data types using data management tools for structured or semi-structured data, the tools that are there are already very optimized. When you talk about images, video, audio, what people do is that they store them in files, or in an object store as blob files.

    In fact, if you do a Google search like “Can you give me a database for images?”, you will find responses on Stack Overflow that recommend that you store your metadata in a SQL database. People typically store the location of their images in a file system. This is inefficient when you’re trying to do training or any computation.

    So we thought that there is an opportunity where we can come up with what we call a tensor database. It’s really more like a data store to represent any unstructured data set in tensorial form where you can natively stream to a deep learning or machine learning process.
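
To make the idea in the quote concrete, here is a minimal sketch of the general technique, not Hub's actual API or storage format. It assumes every sample is a `uint8` image of the same shape, packs all samples back to back in one file behind a small header, and streams fixed-size batches with sequential reads. The file layout and the names `write_tensor_file`, `stream_batches`, and `demo` are hypothetical, chosen only for illustration.

```python
import os
import struct
import tempfile
from array import array
from typing import Iterator, List, Sequence, Tuple

# Hypothetical header layout: num_samples, height, width, channels (little-endian uint32).
HEADER = struct.Struct("<4I")


def write_tensor_file(path: str, samples: Sequence[bytes], shape: Tuple[int, int, int]) -> None:
    """Pack equally shaped uint8 samples back to back into a single file."""
    h, w, c = shape
    sample_size = h * w * c
    with open(path, "wb") as f:
        f.write(HEADER.pack(len(samples), h, w, c))
        for s in samples:
            if len(s) != sample_size:
                raise ValueError("every sample must match the declared shape")
            f.write(s)


def stream_batches(path: str, batch_size: int) -> Iterator[List[array]]:
    """Yield batches of samples, using one sequential read per batch."""
    with open(path, "rb") as f:
        n, h, w, c = HEADER.unpack(f.read(HEADER.size))
        sample_size = h * w * c
        for start in range(0, n, batch_size):
            count = min(batch_size, n - start)
            buf = f.read(count * sample_size)
            yield [
                array("B", buf[i * sample_size:(i + 1) * sample_size])
                for i in range(count)
            ]


def demo() -> List[int]:
    """Write five tiny 2x2x3 'images' and return the sizes of the streamed batches."""
    shape = (2, 2, 3)
    samples = [bytes([i] * 12) for i in range(5)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "images.tensor")
        write_tensor_file(path, samples, shape)
        return [len(batch) for batch in stream_batches(path, batch_size=2)]
```

Compared with one file per image plus a SQL table of paths, a training loop over this layout opens one file and reads contiguous byte ranges. A production store would also need to handle variable shapes, compression, chunking, and remote object storage, none of which this sketch attempts.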



[Image: Forklift Warehouse Machine Worker Industry Pallet from Pixabay.]