Building a flexible, intuitive, and fast forecasting library

The Data Exchange Podcast: Reza Hosseini and Albert Chen on the impressive new forecasting library Greykite and its flagship algorithm, Silverkite.

Subscribe: Apple • Android • Spotify • Stitcher • Google • RSS.

This week’s guests are Reza Hosseini, Staff Software Engineer, and Albert Chen, Staff Data Scientist, both at LinkedIn. Reza and Albert are part of the team behind the new open source library Greykite, a flexible and fast library for time-series forecasting. I’ve been experimenting with Greykite and I’m really impressed with its speed, its flexibility, and the accuracy of its default model (Silverkite). It’s been particularly fun to hook it up with my favorite hyperparameter tuning library, Ray Tune!

Take the 2021 Data Engineering Survey and get a free copy of the results and be entered into a drawing for a free Data Teams book & other prizes.

Forecasting is a fundamental task at most companies. For a long time there weren’t any industrial-grade libraries, particularly ones that nonspecialists could pick up and use. That’s why many data scientists, myself included, were excited when Facebook decided to open source Prophet in 2017. Suddenly, depending on their data and domain, many more practitioners could build decent forecasting models.

Think of Greykite as a similar set of tools: it’s as easy to use as Prophet, but much faster and potentially more accurate. This episode is devoted to Greykite as it stands today, as well as its near-term roadmap and longer-term plans.

    Reza Hosseini:

We had a few reasons for starting work on a new library. One was that we have a very diverse set of problems we want to solve: short-term forecasts, long-term forecasts, and also different frequencies. We also have problems related to the objective of the forecast. For example, some teams may be more interested in capturing the peaks (capacity planning) than the means. A lot of models basically try to fit to the mean, and some teams are interested in the mean, so there is a very diverse set of problems.

One of the ideas with Silverkite, which is the core algorithm, is basically putting together a model that’s really, really flexible, in the sense that you can change different components of the model. So that’s one design decision: you could swap in ridge regression as the engine, or you could do quantile regression, and so on. And you can expand that. So that’s one source of flexibility. The other flexibility is that it’s really easy to tell the model that you want some patterns to be captured. So for example, let’s assume you have growth: maybe you have a business metric, say for a restaurant business, that’s growing over the weekends but not over the weekdays. It’s really easy to specify that for Silverkite if you want to add that feature.
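To make the two kinds of flexibility concrete, here is a minimal sketch in plain Python. It is not Greykite's API: every name below (make_features, fit_ridge, fit_model, predict) is hypothetical, the data is synthetic and noise-free, and the ridge fit penalizes the intercept for simplicity. It shows a "weekend-only growth" pattern expressed as an interaction feature, and a fitting engine passed in as a swappable argument.

```python
from datetime import date, timedelta


def make_features(t, is_weekend):
    """Design row: intercept, weekend level shift, weekend-only growth."""
    w = 1.0 if is_weekend else 0.0
    return [1.0, w, w * t]  # w * t is the weekend-by-time interaction


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ridge(rows, y, alpha=0.0):
    """Ridge via normal equations (X'X + alpha*I) beta = X'y; alpha=0 is OLS."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (alpha if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


def fit_model(rows, y, engine=fit_ridge):
    """The engine is just a callable, so another fitter can be dropped in."""
    return engine(rows, y)


def predict(beta, row):
    return sum(b * x for b, x in zip(beta, row))


# Synthetic metric: flat at 100 on weekdays, growing by 0.5 per day on weekends.
start = date(2021, 1, 4)  # a Monday
days = [start + timedelta(days=t) for t in range(56)]
weekend = [d.weekday() >= 5 for d in days]
rows = [make_features(t, w) for t, w in enumerate(weekend)]
y = [120.0 + 0.5 * t if w else 100.0 for t, w in enumerate(weekend)]

beta = fit_model(rows, y)  # default engine, alpha=0
saturday_forecast = predict(beta, make_features(61, True))
monday_forecast = predict(beta, make_features(63, False))
```

Because the growth term is multiplied by the weekend indicator, the fitted slope applies only to weekend rows, while weekday forecasts stay flat. A quantile-regression fitter with the same (rows, y) signature could be passed as engine to target peaks rather than the mean.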

… And the volatility part is also fairly flexible, even though it’s a very simple volatility model. It allows you to specify features: maybe weekends and holidays are more volatile. It’s easy to tell Silverkite to capture that type of thing.
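One simple way to picture a feature-conditional volatility model is to estimate the spread of past forecast errors separately for each feature value, then widen the prediction interval accordingly. The sketch below is an illustration under stated assumptions, not Silverkite's implementation: the function names are hypothetical, the residuals are made up, and the interval assumes roughly normal errors within each group.

```python
from collections import defaultdict
from statistics import NormalDist, pstdev


def conditional_volatility(residuals, groups):
    """Standard deviation of residuals, estimated separately per group."""
    by_group = defaultdict(list)
    for r, g in zip(residuals, groups):
        by_group[g].append(r)
    return {g: pstdev(values) for g, values in by_group.items()}


def prediction_interval(point, group, vol, coverage=0.95):
    """Symmetric interval around a point forecast (assumes normal errors)."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    half_width = z * vol[group]
    return point - half_width, point + half_width


# Made-up residuals: weekend errors are four times larger than weekday errors.
residuals = [1.0, -1.0, 1.0, -1.0, 4.0, -4.0, 4.0, -4.0]
groups = ["weekday"] * 4 + ["weekend"] * 4

vol = conditional_volatility(residuals, groups)
weekday_interval = prediction_interval(100.0, "weekday", vol)
weekend_interval = prediction_interval(100.0, "weekend", vol)
```

With the same point forecast, the weekend interval comes out wider than the weekday one, which is the behavior described above. Holidays could be handled the same way by adding another group label.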

Subscribe to our Newsletter:
We also publish a popular newsletter where we share highlights from recent episodes, trends in AI / machine learning / data, and a collection of recommendations.

Related content:

2021 Data Engineering Survey

The 2021 Data Engineering Survey is now open and we need your help. The survey takes about 5 minutes to fill out and we’ll share the report of the survey findings with you. You’ll also be entered in a drawing for free copies of the Data Teams book and other prizes.

[Image by Gerd Altmann from Pixabay.]