The Data Exchange

Using Data and AI to Democratize Entity Resolution and Master Data Management

Jeff Jonas on how Senzing makes entity resolution easier and more effective.



Jeff Jonas is Founder and CEO of Senzing, a startup focused on democratizing entity resolution – making this deceptively complicated capability easy for programmers to use and deploy. Entity resolution (ER) is the process of connecting disparate data records that represent the same real-world entity. It can be applied to records about customers, products, or companies, among others. ER is important for safeguarding data quality, because poor data can degrade downstream analytics and AI applications.
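To make the idea of connecting records concrete, here is a minimal sketch in Python. It is a toy illustration only, not Senzing's method or API: the record fields, the `normalize` and `resolve` helpers, and the matching rule (two records belong to the same entity if they share a normalized email or phone number) are all assumptions made for this example. Real ER systems have to deal with fuzzy names, conflicting attributes, scale, and latency, which is where the difficulty discussed in the episode comes from.

```python
from collections import defaultdict


def normalize(value: str) -> str:
    """Lowercase the value and keep only letters and digits."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def resolve(records):
    """Group record ids that share a normalized email or phone (toy rule)."""
    parent = list(range(len(records)))  # union-find forest over record indexes

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    seen = {}  # (field, normalized value) -> index of first record with it
    for idx, rec in enumerate(records):
        for field in ("email", "phone"):
            value = rec.get(field)
            if not value:
                continue
            key = (field, normalize(value))
            if key in seen:
                union(idx, seen[key])
            else:
                seen[key] = idx

    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[find(idx)].append(rec["id"])
    return sorted(groups.values())


# Hypothetical records from three source systems.
records = [
    {"id": "crm-1", "name": "Robert Smith", "email": "bob.smith@example.com", "phone": "555-0100"},
    {"id": "web-7", "name": "Bob Smith", "email": "Bob.Smith@Example.com", "phone": None},
    {"id": "erp-3", "name": "R. Smith", "email": None, "phone": "(555) 0100"},
    {"id": "crm-2", "name": "Alice Jones", "email": "alice@example.com", "phone": "555-0199"},
]

entities = resolve(records)
print(entities)
```

In this sketch the three "Smith" records end up in one group even though no single pair of fields matches across all of them: `web-7` links to `crm-1` by email, and `erp-3` links to `crm-1` by phone. An exact-match rule like this would miss typos and wrongly merge people who share a phone number, which hints at why accuracy is harder than it first appears.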
 


 
Jeff explains that while ER may seem straightforward at first, requirements like accuracy, scale, latency, real-time updates, and privacy make ER a deceptively complex problem with many applications. These include customer data management, fraud detection, data quality enhancement, data integration, data governance, and business intelligence. The devil is in the details when it comes to ER, making it an intriguing yet formidable challenge to tackle on your own. Jeff describes interesting concepts such as sequence neutrality and principle-based ER, and he explains the role emerging technologies (such as LLMs, vector databases & vector search, graph databases) might play in large-scale, real-time entity resolution systems.

Interview highlights – key sections from the video version:

  1. Why accuracy is challenging for entity resolution systems
  2. Vector Databases, Vector Search, and entity resolution systems
  3. Streaming, real-time and entity resolution systems
  4. The cold start problem and entity resolution systems
  5. Explainability
  6. Principle-based entity resolution
  7. Privacy: hashing and encryption
  8. Graph databases and graphs
  9. Deep learning, neural networks, LLMs
  10. More on vector databases
  11. Strict latency requirements
  12. Large Language Models and ER – looking ahead



