This is the third in a series of posts about using Prophet to forecast time series data. The other parts can be found here:

- Forecasting Time Series data with Prophet – Part 1
- Forecasting Time Series data with Prophet – Part 2

In those previous posts, I looked at forecasting monthly sales data 24 months into the future. In this post, I want to look at using the ‘holiday’ construct found within the Prophet library to better forecast around specific events. If we look at our sales data (you can find it here), there’s an obvious pattern each December. That pattern could exist for a variety of reasons, but let’s assume that it’s due to a promotion that is run every December. You can see the pattern in the chart below.

Prophet allows you to build a ‘holiday’ dataframe and use that data in your modeling. For the purposes of this example, I’ll build my Prophet holiday dataframe in the following manner:

    import pandas as pd

    # One row per promotion date; all rows share the same holiday name
    promotions = pd.DataFrame({
        'holiday': 'december_promotion',
        'ds': pd.to_datetime(['2009-12-01', '2010-12-01', '2011-12-01', '2012-12-01',
                              '2013-12-01', '2014-12-01', '2015-12-01']),
        'lower_window': 0,
        'upper_window': 0,
    })

This promotions dataframe consists of promotion dates for December in 2009 through 2015. The *lower_window* and *upper_window* values are given in days and are set to zero to indicate that we don’t want Prophet to extend the promotion effect to any dates other than the ones listed.
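To make the window settings a little more concrete, here is a minimal sketch of how a holiday row maps to calendar days. The `expand_holiday_windows` helper is hypothetical and written only for illustration (it is not part of Prophet); it assumes the windows are day offsets counted from `ds`, with `lower_window` zero or negative and `upper_window` zero or positive.

```python
import pandas as pd


def expand_holiday_windows(holidays: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: list every calendar day covered by each holiday row."""
    rows = []
    for row in holidays.itertuples(index=False):
        # Offsets run from lower_window to upper_window inclusive, in days.
        for offset in range(int(row.lower_window), int(row.upper_window) + 1):
            rows.append({
                "holiday": row.holiday,
                "ds": row.ds + pd.Timedelta(days=offset),
                "offset_days": offset,
            })
    return pd.DataFrame(rows)


# Two of the promotion dates from the post, with both windows set to zero.
example = pd.DataFrame({
    "holiday": "december_promotion",
    "ds": pd.to_datetime(["2009-12-01", "2010-12-01"]),
    "lower_window": 0,
    "upper_window": 0,
})
same_day_only = expand_holiday_windows(example)

# Illustrative alternative (not used in the post): one day before, two days after.
wider = example.assign(lower_window=-1, upper_window=2)
with_neighbours = expand_holiday_windows(wider)

print(same_day_only)
print(with_neighbours)
```

With both windows at zero, each promotion covers only its own date, which is the setup used in this post.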

Now that I have my promotions dataframe ready to go, I’ll run through the modeling quickly (you can check out the jupyter notebook for more details):

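    import numpy as np
    from prophet import Prophet  # assumption: Prophet >= 1.0; earlier releases import from fbprophet

    # Load the monthly sales data and rename the columns to the ds / y names Prophet expects
    sales_df = pd.read_csv('../examples/retail_sales.csv', index_col='date', parse_dates=True)
    df = sales_df.reset_index()
    df = df.rename(columns={'date': 'ds', 'sales': 'y'})
    df['y'] = np.log(df['y'])  # model the log of sales

    # Fit the model with the promotions passed in as holidays
    model = Prophet(holidays=promotions)
    model.fit(df);

    # Forecast 24 months ahead and plot the result
    future = model.make_future_dataframe(periods=24, freq = 'm')
    forecast = model.predict(future)
    model.plot(forecast);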

With these steps, we’ve loaded the data, set it up the way Prophet expects, run our model with the promotions data and plotted the forecast, which looks like the following:

Given that we have so little data, I doubt the use of holidays will make much difference in the forecasts, but it’s a good example to use. We can check the difference by re-running the Prophet forecast without holidays and comparing the two models: the average difference between them is ~0.06%, which isn’t terribly large, but still worth investigating. The Jupyter notebook that accompanies this post goes into much more detail on this aspect (as well as the overall analysis).
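As a rough sketch of how such a comparison could be computed, the snippet below takes two forecast dataframes (with the `ds` and `yhat` columns that Prophet's `predict` returns) and reports the mean absolute percentage difference between them. The exact definition behind the ~0.06% figure lives in the notebook, so treat this as one possible approach rather than the one used there. The `mean_pct_difference` helper and the toy numbers are made up for illustration, and the code assumes `yhat` is on the log scale, as it is in this post.

```python
import numpy as np
import pandas as pd


def mean_pct_difference(with_holidays: pd.DataFrame, without_holidays: pd.DataFrame) -> float:
    """Hypothetical helper: mean absolute % difference between two forecasts."""
    merged = with_holidays[["ds", "yhat"]].merge(
        without_holidays[["ds", "yhat"]], on="ds", suffixes=("_hol", "_base")
    )
    # The models were fit on log(sales), so undo the log before comparing.
    hol = np.exp(merged["yhat_hol"])
    base = np.exp(merged["yhat_base"])
    return float((100 * (hol - base).abs() / base).mean())


# Toy stand-ins for the two forecast frames (values are invented, not real results).
dates = pd.to_datetime(["2015-10-31", "2015-11-30", "2015-12-31"])
toy_with = pd.DataFrame({"ds": dates, "yhat": np.log([100.0, 110.0, 150.0])})
toy_without = pd.DataFrame({"ds": dates, "yhat": np.log([100.0, 110.0, 147.0])})

pct = mean_pct_difference(toy_with, toy_without)
print(f"average difference: {pct:.2f}%")
```

In practice, the two inputs would be the `forecast` dataframe from above and a second one produced by a `Prophet()` model fit on the same data without the `holidays` argument.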

Note: You can find the full code for this post in a Jupyter notebook here:

hi Eric, good tutorial, do you have similar thing for Random Forest forecast using Jupyter Notebook etc?

Alex- I am working on a Random Forest tutorial.