Predicting Trends with Time Series Analysis

By Jeff Sauro

A natural extension of regression analysis is time series analysis, which uses past customer data collected over regular intervals to predict future customer data on the same intervals. Time series analysis can be used to predict things like

  • Subscription rates

  • Train ridership

  • Product sales

  • Web page views

For example, requiring customers to register for updates with a website is a way to nurture lead generation. With customers providing their email addresses, they are also giving permission for an organization to directly communicate, market, and (attempt to) convert them into paying customers.

The figure shows the total number of subscribers from January 2012 through February 2014 from a B2B services company website. With this data, you can use the past pattern of subscribers to predict what the future number of subscribers will be.

image0.jpg

To estimate the cumulative number of subscribers in the future, follow these steps to use time series analysis in Excel:

  1. Create a line graph from the data by month and year in Excel. Insert a line graph into an Excel sheet with the data.

  2. Add the cumulative column as the series values in the graph in the Edit Series dialog box.

  3. To create x-axis date labels, select both the month and year columns in the Axis Labels dialog box.

    The following figure shows the cumulative number of subscribers by month and year.

    image1.jpg

    You can see that pattern of cumulative subscribers is generally linear (forming a line going up). By adding a regression equation, you can predict the future number of subscribers (assuming subscriber growth continues to exhibit this linear pattern).

  4. Add a regression equation:

    • Click on the data line and right-click “Add Trendline.”

    • In the Format Trendline dialog box, select the Display Equation on Chart” and Display R-Squared Value on Chart boxes.

In a linear regression equation the best fitting line does a good job of describing the relationship. This r2 value is .988, meaning this line explains 98.8% of the variation in subscriber rates, which is excellent.

image2.jpg

The only independent variable used here is the sequence of time over 26 months (from 1 to 26). The regression equation for subscribers for the 26 months is:

Subscribers = 81.109(x) +1896.8

You can now predict the number of subscribers for a specific month — say, May 2014, which would be the 29th data point (3 into the future).

The estimated total number of subscribers for May is:

May Subscribers = 81.109(29) +1896.8 = 4249

Any judgment about the future is susceptible to errors. It’s important to understand the limitations of using past data to predict the future.