Hyperscaling natural language processing

The Data Exchange Podcast: Edmon Begoli on distributed online learning and its applications in public health.

Subscribe: iTunes, Android, Spotify, Stitcher, Google, and RSS.

In this episode of the Data Exchange I speak with Edmon Begoli, Chief Data Architect at Oak Ridge National Laboratory (ORNL). Edmon has developed and implemented large-scale data applications on systems like Open MPI, Hadoop/MapReduce, Apache Calcite, Apache Spark, and Akka. Most recently he has been building large-scale machine learning and natural language applications with Ray, a distributed execution framework that makes it easy to scale machine learning and Python applications.

Ray Summit has been postponed until the fall. In the meantime, enjoy an amazing series of virtual conferences beginning in mid-May on the theme “Scalable machine learning, scalable Python, for everyone”. Go to anyscale.com/events for details.


Our conversation included a range of topics, including:

  • Edmon’s role at ORNL and his experience building applications with Hadoop and Spark.
  • What is distributed online learning?
  • Why Edmon’s team started using Ray to build distributed online learning applications.
  • Two important use cases: suicide prevention among US veterans and infectious disease surveillance.

Subscribe to our Newsletter:
We also publish a popular newsletter where we share highlights from recent episodes, trends in AI / machine learning / data, and a collection of recommendations.

[Image library-university-books-students by Tamás Mészáros.]