Time Series Modeling: ARMA Notation


A quick note!

If you are looking for more exhaustive resources on time series modeling, check out Forecasting: Principles and Practice and Penn State 510: Applied Time Series Analysis. These have time series theory plus examples of how to implement it in R. (for a more detailed description of these resources, see the ‘References’ section)


Hydrological, meteorological, and ecological observations are often a special type of data: a time series. A time series consists of observations (say streamflow) at equally-spaced intervals over some period of time. Many of us on this blog are interested in running simulation-optimization models which receive time series data as an input. But the time series data from the historical record may be insufficient for our work, we also want to create synthetic time series data to explore a wider range of scenarios. To do so, we need to fit a time series model. If you are uncertain why we would want to generate synthetic data, check out Jon L.’s post “Synethic streamflow generation” for some background. If you are interested in some applications, read up on this 2-part post from Julie.

A common time series model is the autoregressive moving average (ARMA) model. This model has many variations including the autoregressive integrated moving average (ARIMA), seasonal ARIMA (SARIMA) models, and ARIMA models with external covariates (ARIMAX and SARIMAX). This class of models is useful but it has its own special notation which can be hard to unpack. Take the SARIMA model for example:

Confused yet? Me too. What are those functions? What does the B stand for? To help figure that out, I’m going to break down some time series notation into bite-sized pieces. In this post, I will unpack the ARMA model (eq. 2). If you are interested in understanding (eq. 1)  check out Penn State 510: Applied Time Series Analysis – Lessons 4: Seasonal Models.

Autoregressive (AR) and moving average (MA) models

An ARMA model is generalized form of two different models: the autoregressive (AR) and moving average (MA). Both the AR (eq. 3) and MA (eq. 4) models have a single parameter, p and q, respectively, which represent the order of the model.

The c and μ are constants, x’s are the time series observations, θ’s and Φ’s are weighting parameters for the different lagged terms, and ε represents a random error term (i.e. it has a normal distribution with mean zero). You can see already how these equations might get a bit tedious to write out. Using what is known as a backshift operator and defining specific polynomials for each model, we can use less ink to get the same point across.

Backshift operator

The backshift (also known as the lag) operator, B, is used to designate different lags on a particular time series observation. By applying the backshift operator to the observation at the current timestep, xt, it yields the one from the previous timestep xt-1 (also known as lag 1).

It doesn’t save much ink in this simple example, but with more model terms the backshift operator comes in handy. Using this operator, we can represent any lagged term by raising B to the power of the desired lag. Let’s say we want to represent the lag 2 of xt.

Or possibly the lag 12 term.

Example 1: AR(2) – order two autoregressive model

Let’s apply the backshift operator to the AR(2) model as an example. First, let’s specify the model in our familiar notation.

Now, let’s apply the backshift operator.

Notice that xt. shows up a few times in this equation, so let’s rearrange the model and simplify.

Once we’ve gotten to this point, we can define a backshift polynomial to further distill this equation down. For order two autoregressive models, this polynomial is defined as

Combine this with the above equation to get the final form of the AR(2) equation.

Example 2: MA(1) – order one moving average model

Starting to get the hand of it? Now we’re going to apply the same approach to a MA(1) model.

Now let’s apply the backshift operator.

Rearrange and simplify by grouping εt terms together.

Define a backshift polynomial to substitute for the terms in the parentheses.

Substitute polynomial to reach the compact notation.

Autoregressive moving average (ARMA) models

Now that we’ve had some practice with the AR and MA models, we can move onto ARMA models. As the name implies, the ARMA model is simply a hybrid between the AR and MA models. As a shorthand, AR(p) is equivalent to ARMA(p,0) and MA(q) is the same as ARMA(0,q). The full ARMA(p,q) model is as follows:

Example 3: ARMA(2,2)

For the grand finale, let’s take the ARMA model from it’s familiar (but really long) form and put in it more compact notation. As an example we’ll look at the ARMA(1,2) model.

First, apply the backshift operator.

Rearrange and simplify by grouping the terms from the current timestep, t. (If you are confused by this step check out “Clarifying Notes #2”)

Substitute the polynomials defined for AR and MA to reach the compact notation.

And that’s it! Hopefully that clears up ARMA model notation for you.

Clarifying Notes

  1. There are many different conventions for the symbols used in these equations. For example, the backshift operator (B) is also known as the lag operator (L). Furthermore, sometimes the constants used in AR, MA, and ARMA models are omitted with the assumption that they are centered around 0. I’ve decided to use the form which corresponds to agreement between a few sources with which I’m familiar and is consistent with their Wikipedia pages.
  2. What does it mean for a backshift operator to be applied to a constant? For example, like for μ in equation 2. Based on my understanding, a backshift operator has no effect on constants: Bμ = μ. This makes sense because a backshift operator is time-dependent but a constant is not. I don’t know why some of these equations have constants multiplied by the backshift operator but it appears to be the convention. It seems to be more confusing to me at least.
  3. One question you may be asking is “why don’t we just use summation terms to shorten these equations?” For example, why don’t we represent the AR(p) model like this?

We can definitely represent these equations with a summation, and for simple models (like the ones we’ve discussed) that might make more sense. However, as these models get more complicated, the backshift operators and polynomials will make things more efficient.


Applied Time Series Anlysis, The Pennsylvania State University: https://onlinecourses.science.psu.edu/stat510/
Note: This is a nice resource for anyone looking for a more extensive resource on time series analysis. This blogpost was inspired largely by my own attempt to understand Lessons 1 – 4 and apply it to my own research.
Chatfield, Chris. The Analysis of Time Series: An Introduction. CRC press, 2016.
Hyndman, Rob J., and George Athanasopoulos. Forecasting: Principles and Practice. Accessed October 27, 2017. http://Otexts.org/fpp2/.
Note: This is an AWESOME resource for everything time series. It is a bit more modern than the Penn State course and is nice because it is based around the R package ‘forecast’ and has a companion package ‘fpp2’ for access to data. Since it is written by the author of ‘forecast’ (who has a nice blog and is a consistent contributor to Cross Validated and Stack Overflow), it is consistent in its approach throughout the book which is a nice bonus.

Wikipedia: https://en.wikipedia.org/wiki/Autoregressive%E2%80%93moving-average_model


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s