Time series forecasting in Python for beginners

This semester I am teaching Engineering Management Methods here at Cornell University. The course is aimed at introducing engineering students to systems thinking and a variety of tools and analyses they can use to analyze data. The first chapter has been on time series forecasting, where we discussed some of the simpler models one can use and apply for forecasting purposes, including Simple and Weighted Moving Average, Single and Double Exponential Smoothing, Additive and Multiplicative Seasonal Models, and Holt Winter’s Method.

The class applications as well as the homework are primarily performed in Excel, but I have been trying, with limited success, to encourage the use of programming languages for the assignments. One comment I’ve received by a student has been that it takes significantly more time to perform the calculations by coding; they feel that it’s a waste of time. I initially attributed the comment to the fact that the student was new to coding and it takes time in the beginning, but on later reflection I realized that, in fact, the student was probably simply manually repeating the same Excel operations by using code: take a set of 30 observations, create an array to store forecasts, loop through every value and calculate forecast using model formula, calculate error metrics, print results, repeat steps for next set of data. It occurred to me that of course they think it’s a waste of time, because doing it that way completely negates what programming is all about: designing and building an executable program or function to accomplish a specific computing task. In this instance, the task is to forecast using each of the models we learn in class and the advantage of coding comes with the development of some sort of program or function that performs these operations for us, given a set of data as input. Simply going through the steps of performing a set of calculations for a problem using code is not much different than doing so manually or in Excel. What is different (and beneficial) is designing a code so that it can then be effortlessly applied to all similar problems without having to re-perform all calculations. I realize this is obvious to the coding virtuosos frequenting this blog, but it’s not immediately obvious to the uninitiated who are rather confused on why Dr. Hadjimichael is asking them to waste so much time for a meager bonus on the homework.

So this blog post, is aimed at demonstrating to coding beginners how one can transition from one way of thinking to the other, and providing a small time-series-forecasting toolkit for users that simply want to apply the models to their data.

The code and data for this example can be found on my GitHub page and I will discuss it below. I will be using a wine sales dataset that lists Australian wine sales (in kiloliters) from January 1980 to October 1991. The data looks like this:

This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
Date Sales 1/1/80 464 2/1/80 675 3/1/80 703 4/1/80 887 5/1/80 1139 6/1/80 1077 7/1/80 1318 8/1/80 1260 9/1/80 1120 10/1/80 963 11/1/80 996 12/1/80 960
view raw wine_sales.csv hosted with ❤ by GitHub

And this is what the time series looks like:

We first need to import the packages we’ll be using and load the data. I will be using Pandas in this example (but there’s other ways). I’m also defining the number of seasonal periods in a cycle, in this case 12.

import numpy as np #Package we'll use for numerical calculations
import matplotlib.pyplot as plt #From matplotlib package we import pyplot for plots
import pandas #Package to data manipulation
import scipy.optimize #Package we'll use to optimize
plt.style.use('seaborn-colorblind') #This is a pyplot style (optional)

'''Load the data into a pandas series with the name wine_sales'''

P=12 #number of seasonal periods in a cycle

In class, I’ve always mentioned that one should use a training and a validation set for model development, primarily to avoid overfitting our model to the specific training set. In this example, the functions are written as they apply to the training set. Should you choose to apply the functions listed here, you should apply the functions for the training set, extract forecasts and then use those to initialize your validation period. To divide the observations, you would do something like this:
training = time_series[0:108] # Up to December '88
validation = time_series[108:] # From January '89 until end


Now, if say, we wanted to apply the Naive model of the next steps forecast being equal to the current observation, i.e., $\hat{y}_{t+1}=y_t$, we’d do something like:


y_hat=pandas.Series().reindex_like(time_series) # Create an array to store forecasts
y_hat[0]= time_series[0] # Initialize forecasting array with first observation
''' Loop through every month using the model to forecast y_hat'''
for t in range(len(y_hat)-1): # Set a range for the index to loop through
y_hat[t+1]= time_series[t] # Apply model to forecast time i+1


Now if we’d like to use this for any time series, so we don’t have to perform our calculations every time, we need to reformat this a bit so it’s a function:


def naive(time_series):
y_hat=pandas.Series().reindex_like(time_series)
y_hat[0]= time_series[0] # Initialize forecasting array with first observation
''' Loop through every month using the model to forecast y'''
#This sets a range for the index to loop through
for t in range(len(y_hat)-1):
y_hat[t+1]= time_series[t] # Apply model to forecast time i+1
return y_hat


Now we can just call define this function at the top of our code and just call it with any time series as an input. The function as I’ve defined it returns a pandas.Series with all our forecasts. We can then do the same for all the other modeling methods (below). Some things to note:

• The data we read in the top, outside the functions, as well as any parameters defined (P in this case) are global variables and do not need to be defined as an input to the function. The functions below only need a list of parameter values as inputs.
• For the models with seasonality and/or trend we need to create separate series to store those estimates for E, S, and T.
• Each model has its own initialization formulas and if we wanted to apply them to the validation set that follows our training set, we’d need to initialize with the last values of our training.
'''SIMPLE MOVING AVERAGE
Using this model, y_hat(t+1)=(y(t)+y(t-1)...+y(t-k+1))/k (i.e., the predicted
next value is equal to the average of the last k observed values).'''
def SMA(params):
k=int(np.array(params))
y_hat=pandas.Series().reindex_like(time_series)
y_hat[0:k]=time_series[0:k]
''' Loop through every month using the model to forecast y.
Be careful with Python indexing!'''
for t in range(k-1,len(y_hat)-1): #This sets a range for the index to loop through
y_hat[t+1]= np.sum(time_series[t-k+1:t+1])/k # Apply model to forecast time i+1
return y_hat

'''WEIGHTED MOVING AVERAGE
Using this model, y_hat(t+1)=w(1)*y(t)+w(2)*y(t-1)...+w(k)*y(t-k+1) (i.e., the
predicted next value is equal to the weighted average of the last k observed
values).'''
def WMA(params):
weights = np.array(params)
k=len(weights)
y_hat=pandas.Series().reindex_like(time_series)
y_hat[0:k]=time_series[0:k] # Initialize values
''' Loop through every month using the model to forecast y.
Be careful with Python indexing!'''
for t in range(k-1,len(y_hat)-1): #This sets a range for the index to loop through
y_hat[t+1]= np.sum(time_series[t-k+1:t+1].multiply(weights)) # Apply model to forecast time i+1
return y_hat
'''This model includes the constraint that all our weights should sum to one.
To include this in our optimization later, we need to define it as a function of our
weights.'''
def WMAcon(params):
weights = np.array(params)
return np.sum(weights)-1

'''SINGLE EXPONENTIAL SMOOTHING
Using this model, y_hat(t+1)=y_hat(t)+a*(y(t)-y_hat(t))(i.e., the
predicted next value is equal to the weighted average of the last forecasted value and its
difference from the observed).'''
def SES(params):
a = np.array(params)
y_hat=pandas.Series().reindex_like(time_series)
y_hat[0]=time_series[0] # Initialize values
''' Loop through every month using the model to forecast y.
Be careful with Python indexing!'''
for t in range(len(y_hat)-1): #This sets a range for the index to loop through
y_hat[t+1]=  y_hat[t]+a*(time_series[t]-y_hat[t])# Apply model to forecast time i+1
return y_hat

'''DOUBLE EXPONENTIAL SMOOTHING (Holts Method)
Using this model, y_hat(t+1)=E(t)+T(t) (i.e., the
predicted next value is equal to the expected level of the time series plus the
trend).'''
def DES(params):
a,b = np.array(params)
y_hat=pandas.Series().reindex_like(time_series)
'''We need to create series to store our E and T values.'''
E = pandas.Series().reindex_like(time_series)
T = pandas.Series().reindex_like(time_series)
y_hat[0]=E[0]=time_series[0] # Initialize values
T[0]=0
''' Loop through every month using the model to forecast y.
Be careful with Python indexing!'''
for t in range(len(y_hat)-1): #This sets a range for the index to loop through
E[t+1] = a*time_series[t]+(1-a)*(E[t]+T[t])
T[t+1] = b*(E[t+1]-E[t])+(1-b)*T[t]
y_hat[t+1] = E[t] + T[t] # Apply model to forecast time i+1
return y_hat

Using this model, y_hat(t+1)=E(t)+S(t-p) (i.e., the
predicted next value is equal to the expected level of the time series plus the
appropriate seasonal factor). We first need to create an array to store our
forecast values.'''
def ASM(params):
a,b = np.array(params)
p = P
y_hat=pandas.Series().reindex_like(time_series)
'''We need to create series to store our E and S values.'''
E = pandas.Series().reindex_like(time_series)
S = pandas.Series().reindex_like(time_series)
y_hat[:p]=time_series[0] # Initialize values
'''We need to initialize the first p number of E and S values'''
E[:p] = np.sum(time_series[:p])/p
S[:p] = time_series[:p]-E[:p]
''' Loop through every month using the model to forecast y.
Be careful with Python indexing!'''
for t in range(p-1, len(y_hat)-1): #This sets a range for the index to loop through
E[t+1] = a*(time_series[t]-S[t+1-p])+(1-a)*E[t]
S[t+1] = b*(time_series[t]-E[t])+(1-b)*S[t+1-p]
y_hat[t+1] = E[t] + S[t+1-p] # Apply model to forecast time i+1
return y_hat

'''MULTIPLICATIVE SEASONAL
Using this model, y_hat(t+1)=E(t)*S(t-p) (i.e., the
predicted next value is equal to the expected level of the time series times
the appropriate seasonal factor). We first need to create an array to store our
forecast values.'''
def MSM(params):
a,b = np.array(params)
p = P
y_hat=pandas.Series().reindex_like(time_series)
'''We need to create series to store our E and S values.'''
E = pandas.Series().reindex_like(time_series)
S = pandas.Series().reindex_like(time_series)
y_hat[:p]=time_series[0] # Initialize values
'''We need to initialize the first p number of E and S values'''
E[:p] = np.sum(time_series[:p])/p
S[:p] = time_series[:p]/E[:p]
''' Loop through every month using the model to forecast y.
Be careful with Python indexing!'''
for t in range(p-1, len(y_hat)-1): #This sets a range for the index to loop through
E[t+1] = a*(time_series[t]/S[t+1-p])+(1-a)*E[t]
S[t+1] = b*(time_series[t]/E[t])+(1-b)*S[t+1-p]
y_hat[t+1] = E[t]*S[t+1-p] # Apply model to forecast time i+1
return y_hat

Using this model, y_hat(t+1)=(E(t)+T(t))*S(t-p) (i.e., the
predicted next value is equal to the expected level of the time series plus the
trend, times the appropriate seasonal factor). We first need to create an array
to store our forecast values.'''
def AHW(params):
a, b, g = np.array(params)
p = P
y_hat=pandas.Series().reindex_like(time_series)
'''We need to create series to store our E and S values.'''
E = pandas.Series().reindex_like(time_series)
S = pandas.Series().reindex_like(time_series)
T = pandas.Series().reindex_like(time_series)
y_hat[:p]=time_series[0] # Initialize values
'''We need to initialize the first p number of E and S values'''
E[:p] = np.sum(time_series[:p])/p
S[:p] = time_series[:p]-E[:p]
T[:p] = 0
''' Loop through every month using the model to forecast y.
Be careful with Python indexing!'''
for t in range(p-1, len(y_hat)-1): #This sets a range for the index to loop through
E[t+1] = a*(time_series[t]-S[t+1-p])+(1-a)*(E[t]+T[t])
T[t+1] = b*(E[t+1]-E[t])+(1-b)*T[t]
S[t+1] = g*(time_series[t]-E[t])+(1-g)*S[t+1-p]
y_hat[t+1] = E[t]+T[t]+S[t+1-p] # Apply model to forecast time i+1
return y_hat

'''MUTLIPLICATIVE HOLT-WINTERS METHOD
Using this model, y_hat(t+1)=(E(t)+T(t))*S(t-p) (i.e., the
predicted next value is equal to the expected level of the time series plus the
trend, times the appropriate seasonal factor). We first need to create an array
to store our forecast values.'''
def MHW(params):
a, b, g = np.array(params)
p = P
y_hat=pandas.Series().reindex_like(time_series)
'''We need to create series to store our E and S values.'''
E = pandas.Series().reindex_like(time_series)
S = pandas.Series().reindex_like(time_series)
T = pandas.Series().reindex_like(time_series)
y_hat[:p]=time_series[0] # Initialize values
'''We need to initialize the first p number of E and S values'''
S[:p] = time_series[:p]/(np.sum(time_series[:p])/p)
E[:p] = time_series[:p]/S[:p]
T[:p] = 0
''' Loop through every month using the model to forecast y.
Be careful with Python indexing!'''
for t in range(p-1, len(y_hat)-1): #This sets a range for the index to loop through
E[t+1] = a*(time_series[t]/S[t+1-p])+(1-a)*(E[t]+T[t])
T[t+1] = b*(E[t+1]-E[t])+(1-b)*T[t]
S[t+1] = g*(time_series[t]/E[t])+(1-g)*S[t+1-p]
y_hat[t+1] = (E[t]+T[t])*S[t+1-p] # Apply model to forecast time i+1
return y_hat


Having defined this, I can then, for example, call the Multiplicative Holt Winters method by simply typing:

MHW([0.5,0.5,0.5])


This will produce a forecast using the Multiplicative Holt Winters method with those default parameters, but we would like to calibrate them to get the “best” forecasts from our model. To do so, we need to define what we mean by “best”, and in this example I’m choosing to use Mean Square Error as my performance metric. I define it below as a function that receives the parameters and some additional arguments as inputs. I only need to set it up this way because my optimization function is trying to minimize the MSE function by use of those parameters. I’m using the “args” array to simply tell the function which model it’s using to forecast.

def MSE(params, args):
model, = args
t_error = np.zeros(len(time_series))
forecast = model(params)
for t in range(len(time_series)):
t_error[t] = time_series[t]-forecast[t]
MSE = np.mean(np.square(t_error))
return MSE


To perform the optimization in Excel, we’d use Solver, but in Python we have other options. SciPy is a Python package that allows us, among many other things, to optimize such single-objective problems. What I’m doing here is that I define a list of all the models I want to optimize, their default parameters, and the parameters’ bounds. I then use a loop to go through my list of models and run the optimization. To store the minimized MSE values as well as the parameter values that produce them, we can create an array to store the MSEs and a list to store the parameter values for each model. The optimization function produces a “dictionary” item that contains the minimized MSE value (under ‘fun’), the parameters that produce it (under ‘x’) and other information.

''' List of all the models we will be optimizing'''
models = [SES, DES, ASM, MSM, AHW, MHW]
''' This is a list of all the default parameters for the models we will be
optimizing. '''
#SES,  DES,     ASM
default_parameters = [[0.5],[0.5,0.5],[0.5,0.5],
#MSM,        AHW,            MHW
[0.5,0.5],[0.5,0.5,0.5],[0.5,0.5,0.5]]
''' This is a list of all the bounds for the default parameters we will be
optimizing. All the a,b,g's are weights between 0 and 1. '''
bounds = [[(0,1)],[(0,1)]*2, [(0,1)]*2,
[(0,1)]*2,[(0,1)]*3,[(0,1)]*3]
min_MSEs = np.zeros(len(models)) # Array to store minimized MSEs
opt_params = [None]*len(models) # Empty list to store optim. parameters
for i in range(len(models)):
res = scipy.optimize.minimize(MSE, # Function we're minimizing (MSE in this case)
default_parameters[i], # Default parameters to use
# Additional arguments that the optimizer
# won't be changing (model in this case)
args=[models[i]],
method='L-BFGS-B', # Optimization method to use
bounds=bounds[i]) # Parameter bounds
min_MSEs[i] = res['fun'] #Store minimized MSE value
opt_params[i] = res['x'] #Store parameter values identified by optimizer


Note: For the WMA model, the weights should sum to 1 and this should be input to our optimization as a constraint. To do so, we need to define the constraint function as a dictionary and include the following in our minimization call: constraints=[{‘type’:’eq’,’fun’: WMAcon}]. The number of periods to consider cannot be optimized by this type of optimizer.

Finally, we’d like to present our results. I’ll do so by plotting the observations and all my models as well as their minimized MSE values:

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1) # Create figure
ax.set_title("Australian wine sales (kilolitres)") # Set figure title
l1 = ax.plot(time_series, color='black', linewidth=3.0, label='Observations') # Plot observations
for i in range(len(models)):
ax.plot(time_series.index,models[i](opt_params[i]), label = models[i].__name__)
ax.legend() # Activate figure legend
plt.show()
print('The estimated MSEs for all the models are:')
for i in range(len(models)):
print(models[i].__name__ +': '+str(min_MSEs[i]))


This snippet of code should produce this figure of all our forecasts, as well as a report of all MSEs:

The estimated MSEs for all the models are:
SES: 133348.78
DES: 245436.67
ASM: 80684.00
MSM: 64084.48
AHW: 72422.34
MHW: 64031.19

The Multiplicative Holt Winters method appears to give the smallest MSE when applied to these data.

2 thoughts on “Time series forecasting in Python for beginners”

1. David Gold |

This is beyond the scope of your post, but many of the simple functions you have here can very easily be vectorized which would dramatically increase their speed. This isn’t an issue for data sets the size of this example, but is very important as the size of your training data set increases.

2. How to predict the forecasted values further