Hyperscaling natural language processing

The Data Exchange Podcast: Edmon Begoli on distributed online learning and its applications in public health.


Subscribe: iTunes, Android, Spotify, Stitcher, Google, and RSS.

In this episode of the Data Exchange I speak with Edmon Begoli, Chief Data Architect at Oak Ridge National Laboratory (ORNL). Edmon has developed and implemented large-scale data applications on systems like Open MPI, Hadoop/MapReduce, Apache Calcite, Apache Spark, and Akka. Most recently he has been building large-scale machine learning and natural language applications with Ray, a distributed execution framework that makes it easy to scale machine learning and Python applications.

Join Michael Jordan, Manuela Veloso, Azalia Mirhoseini, Zoubin Ghahramani, Wes McKinney, Ion Stoica, Gaël Varoquaux, and many other speakers at the first Ray Summit in San Francisco, May 27-28. Tickets start at $200.

Our conversation included a range of topics, including:

  • Edmon’s role at ORNL and his experience building applications with Hadoop and Spark.
  • What is distributed online learning?
  • Why they started using Ray to build distributed online learning applications.
  • Two important use cases: suicide prevention among US veterans and infectious disease surveillance.

Our goal in this podcast is to build a community of people interested in Data, Machine Learning and AI. If you have suggestions for what we should recommend (books, conferences, links) or for guests to book, please visit the TheDataExchange.media site and fill out the “contact” form.

Subscribe to our Newsletter:
We have an occasional newsletter where we share highlights from recent episodes, trends in AI / machine learning / data, and a collection of recommendations.

[Image library-university-books-students by Tamás Mészáros.]