
Forecasting Time-Series data with Prophet – Part 2

Note: There have been some questions about (and some issues with) my original code. I’ve uploaded a jupyter notebook with corrected code for Part 1 and Part 2. The notebook can be found here.

In Forecasting Time-Series data with Prophet – Part 1, I introduced Facebook’s Prophet library for time-series forecasting.   In this article, I wanted to take some time to share how I work with the data after the forecasts. Specifically, I wanted to share some tips on how I visualize the Prophet forecasts using matplotlib rather than relying on the default prophet charts (which I’m not a fan of).

Just like Part 1, I’m going to be using this retail sales example csv file found on github.

For this work, we’ll need to import matplotlib and set up some basic parameters to format our plots in a nice way (unlike the hideous default matplotlib format).
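A minimal sketch of that setup is below. The figure size of 16 x 10 inches is my assumption (pick whatever suits your screen), and the %matplotlib inline magic is left as a comment because it only works inside a notebook cell.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

try:
    from fbprophet import Prophet  # package name used in this series
except ImportError:
    Prophet = None  # not installed here; newer releases are published as `prophet`

# In a jupyter notebook, also run the following magic in a cell:
# %matplotlib inline

# Switch to the 'ggplot' style, then set a larger default figure size.
# The (16, 10) size is an assumed value, not a requirement.
plt.style.use("ggplot")
plt.rcParams["figure.figsize"] = (16, 10)
```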

With this chunk of code, we import fbprophet, numpy, pandas and matplotlib. Additionally, since I’m working in a jupyter notebook, I add the %matplotlib inline instruction so that the charts created during the session are displayed inline. Lastly, I set my figure size and switch to the ‘ggplot’ style.

Since I’ve already described the analysis phase with Prophet, I’m not going to provide commentary on it here. You can jump back to Part 1 for a walk-through.
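For reference, here is a rough sketch of that phase rather than a copy of the Part 1 code. It assumes the csv already has the ‘ds’ and ‘y’ columns that Prophet expects, that the data is monthly with dates on the first of the month (hence freq="MS"), and that we want 12 future periods. The helper names and the file name in the usage comment are placeholders of mine.

```python
import pandas as pd


def prepare_sales(sales_df):
    """Parse the dates and keep an untouched copy of the target in 'y_orig'."""
    out = sales_df.copy()
    out["ds"] = pd.to_datetime(out["ds"])
    out["y_orig"] = out["y"]
    return out


def fit_and_forecast(sales_df, periods=12, freq="MS"):
    """Fit Prophet on 'ds'/'y' and predict `periods` steps past the data."""
    # Imported lazily so the helpers can be defined without Prophet installed.
    from fbprophet import Prophet  # published as `prophet` in newer releases

    model = Prophet()
    model.fit(sales_df[["ds", "y"]])
    future_data = model.make_future_dataframe(periods=periods, freq=freq)
    forecast_data = model.predict(future_data)
    return model, forecast_data


# Usage (the file name is a placeholder for the example csv):
# sales_df = prepare_sales(pd.read_csv("retail_sales.csv"))
# model, forecast_data = fit_and_forecast(sales_df)
```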

At this point, your data should look like this:

sample output of sales forecast

 

Now, let’s plot the output using Prophet’s built-in plotting capabilities.
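The built-in chart comes from the fitted model’s plot method. A small wrapper like the one below would do it; the function name and the axis labels are my own choices, and it assumes you already have a fitted model and its forecast dataframe.

```python
def show_default_plot(model, forecast_data):
    """Draw Prophet's standard forecast chart and return the figure."""
    # `plot` is Prophet's built-in method; xlabel/ylabel override the
    # default 'ds' and 'y' axis labels.
    return model.plot(forecast_data, xlabel="Date", ylabel="Sales")


# Usage, given a fitted model and its forecast:
# fig = show_default_plot(model, forecast_data)
```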

 

 

Plot from fbprophet

While this is a nice chart, it is kind of ‘busy’ for me.  Additionally, I like to view my forecasts with original data first and forecasts appended to the end (this ‘might’ make sense in a minute).

First, we need to get our data combined and indexed appropriately to start plotting. We are only interested (at least for the purposes of this article) in the ‘yhat’, ‘yhat_lower’ and ‘yhat_upper’ columns from the Prophet forecasted dataset.  Note: There are much more pythonic ways to do these steps, but I’m breaking them out for ease of understanding.

You don’t need to delete the ‘y’ and ‘index’ columns, but it makes for a cleaner dataframe.
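The combining steps described above could be sketched like this. It assumes both dataframes carry their dates in a ‘ds’ column and that the original values were saved in ‘y_orig’; the function name is a placeholder of mine.

```python
import pandas as pd


def build_viz_df(sales_df, forecast_data):
    """Join the original sales with the forecast columns on the date."""
    sales = sales_df.set_index("ds")
    forecast = forecast_data.set_index("ds")

    # An outer join keeps the future dates that exist only in the forecast.
    forecast_cols = ["yhat", "yhat_lower", "yhat_upper"]
    viz_df = sales.join(forecast[forecast_cols], how="outer")

    # Optional clean-up; errors="ignore" skips a column that is not present.
    return viz_df.drop(columns=["y", "index"], errors="ignore")


# Usage:
# viz_df = build_viz_df(sales_df, forecast_data)
# viz_df.tail()
```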

If you ‘tail’ your dataframe, your data should look something like this:

final dataframe for visualization

You’ll notice that the ‘y_orig’ column is full of “NaN” here. This is due to the fact that there is no original data for the ‘future date’ rows.

Now, let’s take a look at how to visualize this data a bit better than the Prophet library does by default.

First, we need to get the last date in the original sales data. This will be used to split the data for plotting.
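One way to get that date, assuming the combined dataframe is indexed by date and ‘y_orig’ is NaN for every future row, is to ask pandas for the last index label that still has an original value. The function name is hypothetical.

```python
import pandas as pd


def last_observed_date(viz_df):
    """Return the index label of the last row that has original sales data."""
    return viz_df["y_orig"].last_valid_index()


# Usage:
# last_date = last_observed_date(viz_df)
```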

To plot our forecasted data, we’ll set up a function (for re-usability of course). The code first imports timedelta, which we need for subtracting dates, and then defines the function.

This function does a few simple things. It finds the 2nd to last row of original data and then creates a new set of data (predict_df) with only the ‘future data’ included. It then creates a plot with confidence bands along the predicted data.
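A sketch of such a function follows. It assumes monthly data in a date-indexed dataframe with ‘y_orig’, ‘yhat’, ‘yhat_lower’ and ‘yhat_upper’ columns, so stepping back four weeks from the last date lets the forecast line start at the final actual point. The colors and labels are my own choices. If you log-transformed ‘y’ before fitting, wrap the three yhat columns in np.exp first so they are on the same scale as ‘y_orig’.

```python
from datetime import timedelta

import matplotlib.pyplot as plt
import pandas as pd


def plot_data(viz_df, end_date):
    """Plot the actual sales, then the forecast with its confidence band."""
    # Step back four weeks so the forecast line connects to the actuals.
    start = pd.Timestamp(end_date) - timedelta(weeks=4)
    predict_df = viz_df.loc[viz_df.index > start]

    fig, ax = plt.subplots()
    ax.plot(viz_df.index, viz_df["y_orig"], label="Actual Sales")
    ax.plot(predict_df.index, predict_df["yhat"],
            color="black", linestyle=":", label="Forecasted Sales")
    ax.fill_between(predict_df.index,
                    predict_df["yhat_lower"], predict_df["yhat_upper"],
                    color="darkgray", alpha=0.5, label="Confidence band")

    ax.set_title("Actual Sales vs Forecasted Sales")
    ax.set_xlabel("Date")
    ax.set_ylabel("Sales")
    ax.legend()
    return fig, ax


# Usage:
# fig, ax = plot_data(viz_df, last_date)
```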

The plot should look something like this:

Actual Sales vs Forecasted Sales


Hopefully you’ve found some useful information here. Check back soon for Part 3 of my Forecasting Time-Series data with Prophet.

Eric D. Brown, D.Sc. has a doctorate in Information Systems with a specialization in Data Sciences, Decision Support and Knowledge Management. He writes about utilizing python for data analytics at pythondata.com and the crossroads of technology and strategy at ericbrown.com.