Hyperscaling natural language processing

The Data Exchange Podcast: Edmon Begoli on distributed online learning and its applications in public health.


Subscribe: iTunes, Android, Spotify, Stitcher, Google, and RSS.

In this episode of the Data Exchange, I speak with Edmon Begoli, Chief Data Architect at Oak Ridge National Laboratory (ORNL). Edmon has developed and implemented large-scale data applications on systems like Open MPI, Hadoop/MapReduce, Apache Calcite, Apache Spark, and Akka. Most recently, he has been building large-scale machine learning and natural language applications with Ray, a distributed execution framework that makes it easy to scale machine learning and Python applications.

Scalable machine learning, scalable Python, for everyone: Join David Patterson, Michael Jordan, Oriol Vinyals, Manuela Veloso, Azalia Mirhoseini, Zoubin Ghahramani, Wes McKinney, Ion Stoica, Gaël Varoquaux, Raluca Popa and many other speakers at the first Ray Summit, a FREE virtual conference which takes place Sep 30th and Oct 1st.

Our conversation covered a range of topics, including:

  • Edmon’s role at ORNL and his experience building applications with Hadoop and Spark.
  • What is distributed online learning?
  • Why they started using Ray to build distributed online learning applications.
  • Two important use cases: suicide prevention among US veterans and infectious disease surveillance.

Subscribe to our Newsletter:
We also publish a popular newsletter where we share highlights from recent episodes, trends in AI / machine learning / data, and a collection of recommendations.



[Image library-university-books-students by Tamás Mészáros.]