The Data Exchange Podcast: Edo Liberty on building tools for deploying deep learning models in search and information retrieval.
In this episode of the Data Exchange I speak with Edo Liberty, founder of Hypercube, a startup building tools for deploying deep learning models in search and information retrieval involving large collections. When I spoke at AI Week in Tel Aviv last November several friends encouraged me to learn more about Hypercube – I’m glad I took their advice!
Our conversation covered several topics including:
- Edo’s experience applying machine learning and building tools for ML at places like Yale, Yahoo’s Research Lab in New York, and Amazon’s AI Lab.
- How deep learning is being used in search and information retrieval.
- Challenges one faces in building search and information retrieval applications when collections are large.
- Deep learning-based search and information retrieval and what Edo describes as “enterprise end-to-end deep search platforms”.
(Full transcript of our conversation is below.)
Ben: First off, you have such a great background, and you’re one of these luminaries in machine learning who has flown under the radar a little bit. We’ll make sure that people know about all the interesting things you’re working on. Let’s start off by doing a quick tour of things you’ve done in the past. You have a computer science Ph.D., but you’ve also led research groups at both Yahoo and Amazon. I noticed, Edo, at some point you also ran a startup. So, you went from being in academia to a startup, and then went to two research groups. Why didn’t you just continue on the startup track?
Edo: Good question. I love everything that has to do with machine learning and engineering, and data, and applications. I just keep wanting to do more things. So, in academia, I started in physics, went to computer science for my Ph.D., did my postdoc work in applied math, and became more detached from the world. Then with a startup, you go immediately to the nitty-gritty, and there’s very little theory and very few algorithms.
We ended up building a pretty exciting system there, but I was missing some of the research, so I moved and found a really good spot for me in a research industry setting where I could do a lot of engineering and still do a lot of deep science and have an impact on the product. Even that seems like it’s one position, but it isn’t. You start as a young scientist, you do a lot of the data pipelining, and a lot of the data cleaning stuff, and the model training, then you start managing teams, and it’s more about planning and future work and so on. It’s a whole career. Every year things change. For me, it just keeps on changing. I see those as one and the same. It’s all different parts of the same thing.
Ben: Speaking of change, you’ve been in machine learning for over a decade. You were in machine learning before this renaissance of deep learning. What are your thoughts, generally, on trends, the cycles of techniques and schools in machine learning? Is deep learning here to stay, or are we going to see something new that people will rush to embrace? At this point, it seems like a lot of companies have made major investments in deep learning in particular, and you see that not just in industry output but also in academic conferences, which are heavily dominated by deep learning.
Edo: It’s definitely here to stay. I’m a bit of a snob when it comes to theory and math: things that aren’t proven, that aren’t ironclad, I tend to give less weight, and I’m willing to rely on them a little bit less. Still, the experiments are conclusive. Deep learning is a great tool, and people use it to build great things: projects that we would have seen as research projects back in 2005 or the early 2000s, where you’d need a postdoc and a professor and three Ph.D.s. Today, an undergraduate can do them with one open source platform because the tools have gotten so much better. As for whether we fully understand the science underpinning it, the answer is absolutely not. But the concepts, the tools, and the value to the ecosystem are here to stay.
Related content:
- Dean Wampler: “Scalable Machine Learning, Scalable Python, For Everyone”
- Edmon Begoli: “Hyperscaling natural language processing”
- Dafna Shahaf: “Computational humanness, analogy and innovation, and soft concepts”
- Rajat Monga: “The evolution of TensorFlow and of machine learning infrastructure”
- “Key AI and Data Trends for 2020”
- David Talby: “Building domain specific natural language applications”
[Image: “Green Roof, California Academy of Sciences” by Dean Wampler; used with permission.]