Time Series Forecasting Techniques: From ARIMA to VAR

A time series is a sequence of data points recorded in chronological order, such as stock market closing values or sunspot counts. Time series analysis extracts meaningful statistics and other characteristics from such data, while time series forecasting uses a model to predict future values from previously observed ones. In this article, we will explore several regression techniques commonly used for time series forecasting, with examples built on Python's statsmodels library.

Autoregressive and Moving Average (AR and MA)

One of the fundamental regression techniques used in time series forecasting is autoregressive (AR) modeling. An AR model forecasts the variable of interest as a linear combination of its own past values. Essentially, it is a regression of the variable against itself. An autoregressive model of order p can be formulated as follows:

  Yt = C + Ø1·Yt-1 + Ø2·Yt-2 + … + Øp·Yt-p + εt

In this formula:

  • Yt represents the value of the time series at time t.
  • C is the intercept.
  • Ø1, …, Øp are the coefficients on the lagged values.
  • Yt-1, …, Yt-p denote the lagged values of the time series.
  • εt represents the error term at time t.

The autoregressive model is suitable for univariate time series without trend and seasonal components.
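
To make "a regression of the variable against itself" concrete, here is a minimal sketch for the simplest case, AR(1), using only the standard library. The helper name fit_ar1 and the simulated parameters (intercept 2.0, coefficient 0.5) are illustrative assumptions, not part of statsmodels:

```python
import random


def fit_ar1(series):
    """Estimate c and phi in y_t = c + phi * y_{t-1} + e_t by least squares."""
    x = series[:-1]  # lagged values y_{t-1}
    y = series[1:]   # current values y_t
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


# Simulate an AR(1) series with known (illustrative) parameters.
rng = random.Random(0)
true_c, true_phi = 2.0, 0.5
series = [true_c / (1 - true_phi)]  # start at the process mean
for _ in range(2000):
    series.append(true_c + true_phi * series[-1] + rng.gauss(0, 1))

c_hat, phi_hat = fit_ar1(series)
next_value = c_hat + phi_hat * series[-1]  # one-step-ahead forecast
print(f"c = {c_hat:.2f}, phi = {phi_hat:.2f}, forecast = {next_value:.2f}")
```

The estimates should land close to the simulated values. A library such as statsmodels generalizes the same idea to p lags.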

Code Implementation

Here is an example of implementing the autoregressive model using Python’s statsmodels library:

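from statsmodels.tsa.ar_model import AutoReg
import matplotlib.pyplot as plt

# `data` is assumed to be a pandas DataFrame with a `humidity` column,
# loaded beforehand

# Split into train and test sets, then fit the model
train, test = data[0:1000], data[1000:]
model = AutoReg(train.humidity, lags=350)
model_fit = model.fit()

# Forecast over the test period
pred = model_fit.predict(start=len(train), end=len(train) + len(test) - 1, dynamic=False)

# Plot the actual values against the forecast
plt.plot(test.humidity, label='actual')
plt.plot(pred, color='red', label='forecast')
plt.legend()
plt.show()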

Moving on, let’s explore the moving average (MA) model, which complements the autoregressive model by using past forecast errors instead of past values of the series. In other words, the moving average model predicts the next value as a linear function of the residual errors from the mean process at previous time steps. Combining the two approaches gives the autoregressive moving average (ARMA) model.

The moving average model is also suitable for univariate time series without trend and seasonal components.
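
As a rough illustration of what "a linear function of past errors" implies, the sketch below simulates an MA(1) process, Yt = mu + εt + theta·εt-1, and checks its autocorrelation. The values mu = 10 and theta = 0.6 and the helper autocorr are assumptions chosen for the example:

```python
import random


def autocorr(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return cov / var


# Simulate Y_t = mu + e_t + theta * e_{t-1} with illustrative parameters.
rng = random.Random(1)
mu, theta = 10.0, 0.6
errors = [rng.gauss(0, 1) for _ in range(5001)]
series = [mu + errors[t] + theta * errors[t - 1] for t in range(1, len(errors))]

r1 = autocorr(series, 1)  # theory: theta / (1 + theta**2)
r2 = autocorr(series, 2)  # theory: 0, since an MA(1) only remembers one step
print(f"lag 1: {r1:.3f}, lag 2: {r2:.3f}")
```

The autocorrelation of an MA(q) process cuts off after lag q, which is one common hint for choosing q.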

Code Implementation

Here’s an example of implementing the moving average model using Python’s statsmodels library:

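from statsmodels.tsa.arima.model import ARIMA
import matplotlib.pyplot as plt

# Reuses the train/test split from the autoregressive example.
# order=(p, d, q): a pure MA(q) model sets p=0 and d=0.
# q=2 is an illustrative choice; tune it for your data.
model = ARIMA(train.humidity, order=(0, 0, 2))
model_fit = model.fit()

# Forecast over the test period
pred = model_fit.predict(start=len(train), end=len(train) + len(test) - 1)

# Plot the actual values against the forecast
plt.plot(test.humidity, label='actual')
plt.plot(pred, color='red', label='forecast')
plt.legend()
plt.show()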

Autoregressive Integrated Moving Average (ARIMA)

The Autoregressive Integrated Moving Average (ARIMA) model extends the combined autoregressive and moving average (ARMA) model with a differencing step that makes the time series stationary. Differencing the sequence removes trends before the autoregressive and moving average terms are fitted. The model is specified by three orders (p, d, q): the number of autoregressive lags, the degree of differencing, and the number of moving average terms.

ARIMA models are particularly useful for univariate time series with trends but without seasonal components.
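
The differencing step can be sketched in a few lines. The helpers difference and undifference below are hypothetical names written for this illustration; ARIMA performs the equivalent transformation internally when d is greater than zero:

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every valid t."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def undifference(first_value, diffs):
    """Invert first-order differencing by cumulative summation."""
    values = [first_value]
    for d in diffs:
        values.append(values[-1] + d)
    return values


# A series with a pure linear trend: 3, 5, 7, ...
trend = [3.0 + 2.0 * t for t in range(8)]
diffs = difference(trend)
restored = undifference(trend[0], diffs)

print(diffs)  # every entry is 2.0: the trend has become a constant
```

Forecasts made on the differenced scale are mapped back to the original scale by the same cumulative sum.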

Code Implementation

Here is an example of implementing the ARIMA model using Python’s statsmodels library:

from statsmodels.tsa.arima.model import ARIMA
import matplotlib.pyplot as plt

# Use the first 1000 humidity observations as a plain array,
# so that positional indexing (test[i]) works
X = data.humidity.values[0:1000]
size = int(len(X) * 0.66)
train, test = X[0:size], X[size:]
history = list(train)
predictions = []

# Perform walk-forward validation: refit on everything seen so far,
# forecast one step ahead, then add the true observation to the history
for i in range(len(test)):
    model = ARIMA(history, order=(5, 1, 0))
    model_fit = model.fit()
    pred = model_fit.forecast()[0]
    predictions.append(pred)
    true = test[i]
    history.append(true)
    print('predicted=%f, expected=%f' % (pred, true))

# Plot the actual values against the forecast
plt.plot(test, label='actual')
plt.plot(predictions, color='red', label='forecast')
plt.legend()
plt.show()
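
Walk-forward validation produces one prediction per test observation, so the run can be summarized with a single error figure. Here is a minimal sketch using root mean squared error; the function rmse and the sample numbers are made up for illustration:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up values standing in for `test` and `predictions`.
actual = [50.0, 52.0, 49.0, 51.0]
predicted = [51.0, 50.0, 49.0, 54.0]
score = rmse(actual, predicted)
print(f"RMSE: {score:.3f}")
```

With real data, calling rmse(test, predictions) after the loop gives a number that can be compared across model orders.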

Seasonal Autoregressive Integrated Moving Average (SARIMA)

The Seasonal Autoregressive Integrated Moving Average (SARIMA) model is an extension of the ARIMA model that can handle seasonal time series data. Unlike ARIMA, SARIMA explicitly models the seasonal component of the series. It adds seasonal autoregressive, differencing, and moving average orders (P, D, Q), plus the length of the seasonal period m. In statsmodels these are passed together as seasonal_order=(P, D, Q, m).

SARIMA models are suitable for univariate time series with trends and seasonal components.
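
Seasonal differencing, controlled by D and the period m, subtracts the value from one full season earlier rather than from the previous step. A small sketch with an invented period-4 pattern (the helper seasonal_difference and all numbers are assumptions for illustration):

```python
def seasonal_difference(series, period):
    """Return series[t] - series[t - period] for every valid t."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


# Invented data: a repeating 4-step pattern on top of a linear trend.
pattern = [5, -2, 0, 7]
series = [0.5 * t + pattern[t % 4] for t in range(12)]

deseasonalized = seasonal_difference(series, period=4)
print(deseasonalized)  # the pattern cancels; only trend slope * period remains
```

Here every entry equals 0.5 * 4 = 2.0, so one seasonal difference removes both the repeating pattern and the linear trend.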

Code Implementation

Here is an example of implementing the SARIMA model using Python’s statsmodels library:

from statsmodels.tsa.statespace.sarimax import SARIMAX
import matplotlib.pyplot as plt

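# `data` is assumed to be a pandas DataFrame with a `humidity` column.
# Use a plain array so that positional indexing (test[t]) works.
X = data.humidity.values[0:1000]

# Split the data into train and test sets
size = int(len(X) * 0.66)
train, test = X[0:size], X[size:]
history = list(train)
predictions = []

# Perform walk-forward validation
for t in range(len(test)):
    # seasonal_order=(P, D, Q, m); the non-seasonal order keeps its default.
    # Set m to the seasonal period of your data.
    model = SARIMAX(history, seasonal_order=(3, 1, 0, 2))
    model_fit = model.fit(disp=False)  # disp=False silences optimizer output
    pred = model_fit.forecast()[0]
    predictions.append(pred)
    true = test[t]
    history.append(true)
    print('predicted=%f, expected=%f' % (pred, true))

# Plot the actual values against the forecast
plt.plot(test, label='actual')
plt.plot(predictions, color='red', label='forecast')
plt.legend()
plt.show()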

Vector Autoregression (VAR)

The Vector Autoregression (VAR) model is used when two or more time series variables influence each other bidirectionally. Unlike the previous models, which describe a single series in terms of its own history, VAR models each variable as a function of its own past values and the past values of every other variable in the system. VAR is suitable for multivariate time series without trends and seasonal components.
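
The recursion behind a VAR forecast can be sketched for the one-lag case, where the next vector is an intercept plus a coefficient matrix times the current vector. The function var1_forecast and the coefficient values are hypothetical, chosen only to show how each variable feeds into the other:

```python
def var1_forecast(last_obs, intercept, coef, steps):
    """Iterate y_{t+1} = intercept + coef @ y_t for a VAR(1) model."""
    forecasts = []
    current = list(last_obs)
    k = len(current)
    for _ in range(steps):
        current = [
            intercept[i] + sum(coef[i][j] * current[j] for j in range(k))
            for i in range(k)
        ]
        forecasts.append(current)
    return forecasts


# Hypothetical coefficients: the off-diagonal entries are the cross effects.
intercept = [1.0, 0.0]
coef = [[0.5, 0.25],
        [0.25, 0.5]]

path = var1_forecast([2.0, 4.0], intercept, coef, steps=2)
print(path)  # [[3.0, 2.5], [3.125, 2.0]]
```

Setting the off-diagonal entries to zero would reduce this to two independent AR(1) models.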

Code Implementation

Here’s an example of implementing the VAR model using Python’s statsmodels library:

from statsmodels.tsa.vector_ar.var_model import VAR

# Stack the two variables into an array of shape (n_observations, 2);
# `data` is assumed to be a pandas DataFrame with these columns
values = data[['humidity', 'meantemp']].values

# Fit the VAR model (a single lag unless maxlags is given)
model = VAR(values)
model_fit = model.fit()

# Make predictions for the next 5 steps, starting from the last
# k_ar observations (k_ar is the fitted lag order)
forecast = model_fit.forecast(values[-model_fit.k_ar:], steps=5)
print(forecast)

Conclusion

In this article, we have explored several regression techniques commonly used for time series forecasting. We covered autoregressive (AR) and moving average (MA) models, autoregressive integrated moving average (ARIMA) models, seasonal autoregressive integrated moving average (SARIMA) models, and vector autoregression (VAR) models. Each technique has its own strengths and suitability for different types of time series data.

Remember that the lag order strongly affects forecast quality: too few lags miss structure in the data, while too many fit noise and slow down estimation. Try several lag values and compare the resulting forecasts on held-out data. With the knowledge gained from this article, you are well-equipped to apply these regression techniques to your own time series forecasting tasks.
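
One common way to compare lag values is an information criterion such as AIC, which rewards fit and penalizes extra parameters. The sketch below fits AR(p) models by least squares with NumPy on a simulated AR(2) series. The helper ar_aic, the simulated coefficients, and the simplified AIC formula (constant terms dropped) are assumptions for illustration:

```python
import numpy as np


def ar_aic(series, p, max_p):
    """Fit AR(p) by least squares on a common sample and return its AIC."""
    y = series[max_p:]
    # Design matrix columns: intercept, y_{t-1}, ..., y_{t-p}
    lags = [series[max_p - k:len(series) - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(len(y))] + lags)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n = len(y)
    sigma2 = (resid @ resid) / n
    return n * np.log(sigma2) + 2 * (p + 1)


# Simulate an AR(2) series with illustrative coefficients 0.6 and -0.3.
rng = np.random.default_rng(0)
n = 3000
series = np.zeros(n)
for t in range(2, n):
    series[t] = 0.6 * series[t - 1] - 0.3 * series[t - 2] + rng.normal()

max_p = 6
scores = {p: ar_aic(series, p, max_p) for p in range(1, max_p + 1)}
best_p = min(scores, key=scores.get)
print(f"lag order with lowest AIC: {best_p}")
```

All candidates are scored on the same observations (those after the first max_p) so that the AIC values are comparable. AIC can occasionally pick an order above the true one, so treat its choice as a starting point.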