The Data Exchange Podcast: Denise Gosnell on tools for unlocking the interconnectedness of your data.
In this episode of the Data Exchange I speak with Denise Gosnell, Chief Data Officer at DataStax. Denise is also the co-author of the new book, The Practitioner’s Guide to Graph Data, which covers foundational tools and techniques needed to utilize graph technologies in production applications. This conversation is a great introduction to what has become an important class of technologies and tools. Graph technologies are used to power a wide array of applications, including recommendation engines, fraud detection systems, identity and access management, search, and many other use cases.
Denise offers practical recommendations for developers who are interested in unlocking the power of large graphs. She also explains how architects should think about graph technologies in the context of modern data platforms. I admit that I have not been following graph technologies closely of late, even though graphs power many of the applications we rely on. There is no better guide to the world of graphs than Denise Gosnell, so make sure you listen to the entire episode!
Our conversation covered a range of topics including:
- The state of tools and technologies for building and deploying graph applications. We also discussed challenges facing data scientists and data engineers who want to incorporate graphs into their applications.
- Graph thinking, a concept that Denise and her co-author introduced in their book on graphs.
- Real-world applications of graph technologies.
- Technical challenges in dealing with large-scale and continuously evolving graphs.
Ben: A typical data scientist tends to work on structured data, text and, increasingly, images. As for graphs, the people you described who work at social media companies work on graphs, but it’s not a typical dataset that an average data scientist works on day to day. Correct?
Denise: Correct. I would call it an emerging shape. When you’re a data scientist and you’re thinking about all the skills you need to sharpen in order to be a modern data scientist, absolutely—nested data, JSON, NLP, those types of processes are all required for your modern skillset. And, we’re starting to see an emergence of a new shape of data—namely, connected data, graph-shaped data—being the new tool you need to have in your arsenal for being able to work across a myriad of applications.
Ben: So, for our data scientists and data engineers listening to this, make the case for why they should add graphs to their toolbox. What’s compelling about graphs? Can I get jobs if I sharpen my skills on graphs?
Denise: You definitely can get jobs, but the case for learning graph data is that it’s the layer underneath that provides deep insight into the analytics we’ve all become used to understanding. So, when we have the traditional evolution from business intelligence to data science, we’re traditionally still working in feature tables, primarily flat sets of statistics and information. But, when you really want to dive in and understand the underlying behaviors driving those changes or maybe driving slight adjustments in your feature table, it’s usually coming from connected data or connected behaviors that are existing across your system. It’s a natural evolution that we’re seeing a lot of modern companies start to take from their research departments and then put into production, for being able to derive more value and explain the reason behind an insight or a feature.
- A video version of this conversation is available on our YouTube channel.
- Edo Liberty: “How deep learning is being used for search and information retrieval”
- Wes McKinney: “Improving performance and scalability of data science libraries”
- Chris Nicholson: “Next-generation simulation software will incorporate deep reinforcement learning”
- Evan Sparks: “An open source platform for training deep learning models”
- Solmaz Shahalizadeh: “Business at the speed of AI: Lessons from Shopify”
- Edmon Begoli: “Hyperscaling natural language processing”
[Image: Network of People – Brain Wired to Be Social from Pikrepo]