ETL for LLMs

Ben Lorica

3 years ago

Brian Raymond on how data can be made AI-friendly with open-source building blocks that connect unstructured enterprise data to LLMs.


Brian Raymond is the founder of Unstructured, a startup building open-source data pre-processing and ingestion tools for Large Language Models (LLMs). Unstructured focuses on transforming unstructured data, particularly from large organizations, into a format that NLP solutions and LLMs can process effectively. The work is complex and time-consuming: it often involves transforming and curating varied document formats and layouts while maintaining a clean, high-quality data feed. Solving this problem matters more than ever because it lets teams fully exploit the potential of LLMs, leading to more cost-effective, efficient, and high-performing AI systems.


Brian Raymond will be speaking at the AI Conference in San Francisco (Sep 26-27). Use the discount code FriendsofBen18 to save 18% on your registration.


Related Content:

A video version of this conversation is available on our YouTube channel.
The Data Integration Market
The Vector Database Primer
Building LLM-powered Apps: What You Need to Know
Navigating the Future of Search
Jerry Liu: An Open Source Data Framework for LLMs
Michel Tricot: Modernizing Data Integration
Louis Brandy: The Future of Vector Databases and the Rise of Instant Updates
Amin Ahmad: LLMs Are the Key to Unlocking the Next Generation of Search
Gev Sogomonian: AI Metadata


