Jeff Jonas on how Senzing makes entity resolution easier and more effective.
Jeff Jonas is Founder and CEO of Senzing, a startup focused on democratizing entity resolution – making this deceptively complicated capability easy for programmers to use and deploy. Entity resolution (ER) is the process of connecting disparate data records that represent the same real-world entity, such as a customer, a product, or a company. ER is important for safeguarding data quality, because poor data can degrade downstream analytics and AI applications.
Jeff explains that while ER may seem straightforward at first, requirements like accuracy, scale, latency, real-time updates, and privacy make it a deceptively complex problem. Its applications include customer data management, fraud detection, data quality enhancement, data integration, data governance, and business intelligence. The devil is in the details when it comes to ER, making it an intriguing yet formidable challenge to tackle on your own. Jeff describes interesting concepts such as sequence neutrality and principle-based ER, and he explains the role that emerging technologies (such as LLMs, vector databases and vector search, and graph databases) might play in large-scale, real-time entity resolution systems.
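To make the basic idea concrete, here is a minimal toy sketch of entity resolution in Python. It is not Senzing's method or API. It assumes records are plain dictionaries, that two records refer to the same entity whenever they share a normalized email address or phone number, and that the sample people and contact details are made up for illustration. The function name `resolve` and the normalization helpers are hypothetical.

```python
from collections import defaultdict


def normalize_email(email):
    # Assumption: emails match if they are equal after trimming and lowercasing.
    return email.strip().lower()


def normalize_phone(phone):
    # Assumption: phone numbers match if their digits are identical.
    return "".join(ch for ch in phone if ch.isdigit())


def resolve(records):
    """Group record ids that share a normalized email or phone number."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # matching key -> id of the first record that carried it
    for r in records:
        keys = []
        if r.get("email"):
            keys.append(("email", normalize_email(r["email"])))
        if r.get("phone"):
            keys.append(("phone", normalize_phone(r["phone"])))
        for key in keys:
            if key in first_seen:
                union(first_seen[key], r["id"])
            else:
                first_seen[key] = r["id"]

    clusters = defaultdict(list)
    for r in records:
        clusters[find(r["id"])].append(r["id"])
    return sorted(sorted(ids) for ids in clusters.values())


# Made-up sample data for illustration only.
records = [
    {"id": 1, "name": "Robert Smith", "email": "bob@example.com", "phone": "555-0100"},
    {"id": 2, "name": "Bob Smith", "email": "BOB@example.com ", "phone": None},
    {"id": 3, "name": "R. Smith", "email": None, "phone": "(555) 0100"},
    {"id": 4, "name": "Alice Jones", "email": "alice@example.com", "phone": "555-0199"},
]

print(resolve(records))  # [[1, 2, 3], [4]]
```

Records 1, 2, and 3 end up in one cluster even though records 2 and 3 share no field directly, because each is linked to record 1. The sketch ignores everything that makes ER hard in practice: misspelled names, family members who share a phone number, scale, latency, and privacy constraints.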
Interview highlights – key sections from the video version:
- Why accuracy is challenging for entity resolution systems
- Vector Databases, Vector Search, and entity resolution systems
- Streaming, real-time and entity resolution systems
- The cold start problem and entity resolution systems
- Principle-based entity resolution
- Privacy: hashing and encryption
- Graph databases and graphs
- Deep learning, neural networks, LLMs
- More on vector databases
- Strict latency requirements
- Large Language Models and ER – looking ahead
Related content:
- A video version of this conversation is available on our YouTube channel.
- Entity Resolution: Insights and Implications for AI Applications
- Building LLM-powered Apps: What You Need to Know
- Navigating the Future of Search
- Vector Database Primer
- What is Graph Intelligence?
- Amin Ahmad: LLMs Are the Key to Unlocking the Next Generation of Search
- Bob van Luijt: An open source, production grade vector search engine
- Frank Liu: A Cloud Native Vector Database Management System