In the previous post of this series on volatility forecasting, I described the simple and the exponentially weighted moving average volatility forecasting models.

In particular, I showed that these two models belong to the generic family of weighted moving average volatility forecasting models^{1}, whose members represent the volatility of an asset as a weighted moving average of its past squared returns^{2}.

Another member of this family is the *Generalized AutoRegressive Conditional Heteroscedasticity (GARCH) model*, *widely used in financial time series modelling and implemented in most statistics and econometric software packages*^{3}.

In this blog post, I will detail the *simplest but often very useful*^{4} GARCH(1,1) volatility forecasting model and I will illustrate its practical performance in the context of monthly volatility forecasting for various ETFs.

## Mathematical preliminaries (reminders)

This section contains reminders from a previous blog post.

### Volatility modelling and volatility proxies

Let $r_t$ be the (logarithmic) return of an asset over a time period $t$ (a day, a week, a month...).

Then:

The asset (conditional) variance is defined as $ \sigma_t^2 = \mathbb{E} \left[ r_t^2 \right] $

From this definition, the squared return $r_t^2$ of an asset is a (noisy^{5}) *variance estimator*, or *variance proxy*^{5}, for that asset variance over the considered time period.

Another example of an asset variance proxy is the Parkinson range of an asset.

The generic notation for an asset variance proxy in this blog post is $\tilde{\sigma}_t^2$.

The asset (conditional) volatility is defined as $ \sigma_t = \sqrt { \sigma_t^2 } $

The generic notation for an asset volatility proxy in this blog post is $\tilde{\sigma}_t$.
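To make the notion of variance proxy concrete, here is a minimal sketch computing the two proxies mentioned above for a single period. The function names and the prices are hypothetical, and the Parkinson range is assumed to be the usual estimator $\frac{1}{4 \ln 2} \left( \ln \frac{H_t}{L_t} \right)^2$ based on the period high and low prices.

```python
import math

def squared_return_proxy(close_prev, close):
    """Squared logarithmic close-to-close return over one period."""
    r = math.log(close / close_prev)
    return r * r

def parkinson_proxy(high, low):
    """Parkinson range-based variance proxy over one period."""
    return math.log(high / low) ** 2 / (4.0 * math.log(2.0))

# Hypothetical prices for a single period (illustration only)
var_cc = squared_return_proxy(100.0, 101.0)
var_pk = parkinson_proxy(102.0, 99.5)

# The corresponding volatility proxies are the square roots
vol_cc = math.sqrt(var_cc)
vol_pk = math.sqrt(var_pk)
```

Both functions return a variance proxy $\tilde{\sigma}_t^2$; they only differ in the price information they use.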

### Weighted moving average volatility forecasting model

Boudoukh et al.^{1} show that many seemingly different methods of volatility forecasting actually share the same underlying representation of the estimate of an asset next period’s variance $\hat{\sigma}_{T+1}^2$ as a weighted moving average of that asset past periods’ variance proxies $\tilde{\sigma}^2_t$, $t=1..T$, with

\[\hat{\sigma}_{T+1}^2 = w_0 + \sum_{i=1}^{k} w_i \tilde{\sigma}^2_{T+1-i}\]

, where:

- $1 \leq k \leq T$ is the size of the moving average, possibly time-dependent
- $w_i, i=0..k$ are the weights of the moving average, possibly time-dependent as well
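The generic formula above can be sketched in a few lines. The function name, the sample proxies and the decay factor below are hypothetical and only serve to show how the simple and the exponentially weighted moving averages fit the same template through their weights.

```python
def wma_variance_forecast(proxies, weights, w0=0.0):
    """Return w0 + sum_{i=1..k} w_i * proxy_{T+1-i}.

    proxies: variance proxies ordered from oldest to newest (length T).
    weights: [w_1, ..., w_k], w_1 being applied to the most recent proxy.
    """
    k = len(weights)
    if not 1 <= k <= len(proxies):
        raise ValueError("the moving average size must satisfy 1 <= k <= T")
    recent_first = proxies[::-1][:k]
    return w0 + sum(w * p for w, p in zip(weights, recent_first))

# Hypothetical daily variance proxies, oldest first
proxies = [0.0004, 0.0001, 0.0009, 0.0002]

# Simple moving average over k = 2 periods: equal weights 1/k
sma = wma_variance_forecast(proxies, [0.5, 0.5])

# Exponentially weighted moving average over all T periods
# (weights left un-normalized in this sketch)
lam = 0.94
ewma_weights = [(1 - lam) * lam ** i for i in range(len(proxies))]
ewma = wma_variance_forecast(proxies, ewma_weights)
```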

## GARCH(1,1) volatility forecasting model

### The GARCH(p,q) model

#### Definition

Bollerslev^{4}’s GARCH model is a generalization of Engle’s ARCH econometric model

which captures the time-varying nature of the (conditional) variance of certain time series like asset returns.

Under a GARCH(p,q) model, an asset next period’s conditional variance $\sigma_{T+1}^2$ is modeled as a recursive linear function of its own $p$ lagged conditional variances $\sigma_{T}^2, \sigma_{T-1}^2, \dots$ and of its $q$ lagged squared returns $r_{T}^2, r_{T-1}^2, \dots$, which leads to the formula

\[\hat{\sigma}_{T+1}^2 = \omega + \sum_{i=1}^p \beta_i \hat{\sigma}_{T+1-i}^2+ \sum_{j=1}^q \alpha_j r_{T+1-j}^2\]

, where:

- The parameters $\omega$, $\alpha_j$, $j=1..q$ and $\beta_i$, $i=1..p$ are non-negative and subject to various inequality constraints depending on working assumptions^{6}
- The initial conditional variance $\hat{\sigma}_1^2$ is usually taken equal to $r_1^2$, but cf. Pelagatti and Lisi^{7} for a thorough discussion about this subject

#### Squared returns vs. generic variance proxy

Molnar^{8} notes that *in GARCH type of models, demeaned squared returns serve as a way to calculate innovations to the volatility*^{8} so that *replacing the squared returns by more precise volatility estimates will produce better GARCH models, regarding both in-sample fit and out-of-sample forecasting performance*^{8}.

Molnar^{8} then proposes to modify the GARCH(p,q) model for the estimation of an asset next period’s conditional variance $\sigma_{T+1}^2$ as follows

\[\hat{\sigma}_{T+1}^2 = \omega + \sum_{i=1}^p \beta_i \hat{\sigma}_{T+1-i}^2+ \sum_{j=1}^q \alpha_j \tilde{\sigma}_{T+1-j}^2\]

, where $\tilde{\sigma}^2_t$, $t=1..T$ are the asset past periods’ variance proxies.

Note that replacing squared returns by less noisy variance proxies is already discussed at length in the previous blog post in the case of the simple and the exponentially weighted moving average volatility forecasting models.

### The GARCH(1,1) model

#### Definition

Because the GARCH(1,1) model *works surprisingly well in comparison with much more complex [GARCH] models*^{8}, it is usually the main GARCH model used in practice.

Under this model, the generic GARCH formula for the estimate of an asset next period’s conditional variance can be re-parametrized as follows

\[\hat{\sigma}_{T+1}^2 = \gamma \tilde{\sigma}^2 + \alpha \tilde{\sigma}^2_{T} + \beta \hat{\sigma}_{T}^2\]

, where:

- $\alpha$, $\beta$ and $\gamma$ are positive parameters summing to one
- $\tilde{\sigma}^2$ is a strictly positive parameter, corresponding to the asset unconditional variance^{9}

The GARCH(1,1) model thus estimates an asset next period’s conditional variance $\hat{\sigma}_{T+1}^2$ as a weighted average^{10} of three different variance estimators:

- A long-term variance estimator $\tilde{\sigma}^2$
- A short-term variance estimator $\tilde{\sigma}^2_{T}$
- The current GARCH(1,1) variance estimator $\hat{\sigma}_{T}^2$

The weights $\alpha$, $\beta$ and $\gamma$ determine the speed with which the model adapts to short-term variance vs. reverts to its long-term variance.
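The recursion behind the GARCH(1,1) formula can be sketched as follows. This is only an illustration: the function name, the parameter values and the sample proxies are hypothetical, and the initial conditional variance is assumed equal to the first variance proxy unless provided.

```python
def garch11_filter(proxies, alpha, beta, long_run_var, initial_var=None):
    """Run the GARCH(1,1) recursion over past variance proxies.

    Returns [sigma2_hat_1, ..., sigma2_hat_{T+1}], the last element being
    the forecast of next period's conditional variance.
    """
    gamma = 1.0 - alpha - beta
    if min(alpha, beta, gamma) <= 0.0:
        raise ValueError("alpha, beta and gamma must be positive and sum to one")
    sigma2 = proxies[0] if initial_var is None else initial_var
    path = [sigma2]
    for proxy in proxies:
        # Convex combination of the long-term, short-term and current estimators
        sigma2 = gamma * long_run_var + alpha * proxy + beta * sigma2
        path.append(sigma2)
    return path

# Hypothetical inputs (illustration only)
proxies = [0.0004, 0.0001, 0.0009, 0.0002]
path = garch11_filter(proxies, alpha=0.10, beta=0.85, long_run_var=0.0004)
next_variance = path[-1]
```

A larger $\alpha$ makes the forecast react faster to the latest variance proxy, while a larger $\gamma$ pulls it more strongly toward the long-term variance.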

#### Relationship with the generic weighted moving average model

By developing the recursive definition of the GARCH(1,1) model, it is possible to see that it is a specific kind of weighted moving average volatility forecasting model (up to a term $\beta^T \hat{\sigma}_1^2$ coming from the initial conditional variance, which vanishes as $T$ grows), with:

- $k = T$
- $w_0 = \gamma \tilde{\sigma}^2 \sum_{j=0}^{T-1} \beta^j$
- $w_1 = \alpha$, $w_2 = \alpha \beta$, …, $w_{T-1} = \alpha \beta^{T-2}$, $w_T = \alpha \beta^{T-1}$, that is, exponentially decreasing weights emphasizing recent past variance proxies vs. more distant ones in the model, exactly like in the exponentially weighted moving average volatility forecasting model^{11}
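As a sanity check of this relationship, the sketch below compares the recursive GARCH(1,1) forecast with its unrolled weighted moving average form, using $w_0 = \gamma \tilde{\sigma}^2 \sum_{j=0}^{T-1} \beta^j$, $w_i = \alpha \beta^{i-1}$ and the leftover term $\beta^T \hat{\sigma}_1^2$ due to the initial conditional variance. All names and numerical values are hypothetical.

```python
def garch11_recursive(proxies, alpha, beta, long_run_var, initial_var):
    """Next period's variance forecast from the recursive definition."""
    gamma = 1.0 - alpha - beta
    sigma2 = initial_var
    for proxy in proxies:
        sigma2 = gamma * long_run_var + alpha * proxy + beta * sigma2
    return sigma2

def garch11_unrolled(proxies, alpha, beta, long_run_var, initial_var):
    """Same forecast written as a weighted moving average of the proxies."""
    T = len(proxies)
    gamma = 1.0 - alpha - beta
    w0 = gamma * long_run_var * sum(beta ** j for j in range(T))
    weights = [alpha * beta ** (i - 1) for i in range(1, T + 1)]
    # w_i multiplies the proxy of period T+1-i, stored at index T-i
    weighted = sum(w * proxies[T - i] for i, w in enumerate(weights, start=1))
    return w0 + weighted + beta ** T * initial_var

# Hypothetical inputs (illustration only)
proxies = [0.0004, 0.0001, 0.0009, 0.0002, 0.0003]
args = (proxies, 0.10, 0.85, 0.0004, proxies[0])
difference = abs(garch11_recursive(*args) - garch11_unrolled(*args))
```

The two computations agree up to floating-point rounding, which confirms the exponentially decreasing weights.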

#### Volatility forecasting formulas

Under a GARCH(1,1) volatility forecasting model, the generic weighted moving average volatility forecasting formula becomes:

To estimate an asset next period’s volatility:

\[\hat{\sigma}_{T+1} = \sqrt{ \gamma \tilde{\sigma}^2 + \alpha \tilde{\sigma}^2_{T} + \beta \hat{\sigma}_{T}^2 }\]

To estimate an asset $h$-period-ahead volatility^{12}, $h \geq 2$:

\[\hat{\sigma}_{T+h} = \sqrt{ \tilde{\sigma}^2 + \left( \alpha + \beta \right)^{h-1} \left( \hat{\sigma}_{T+1}^2 - \tilde{\sigma}^2 \right) }\]

To estimate an asset aggregated volatility^{12} over the next $h$ periods:

\[\hat{\sigma}_{T+1:T+h} = \sqrt{ \sum_{i=1}^{h} \hat{\sigma}_{T+i}^2 }\]
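A minimal sketch of the multi-period forecasts, assuming that each future period's variance forecast reverts geometrically to the long-term variance at rate $\alpha + \beta$ and that the aggregated variance is the sum of the per-period variance forecasts. The function names and the numerical values are hypothetical.

```python
import math

def garch11_variance_path(next_var, alpha, beta, long_run_var, horizon):
    """Variance forecasts for periods T+1, ..., T+horizon.

    next_var is the one-period-ahead variance forecast sigma2_hat_{T+1}.
    """
    persistence = alpha + beta
    return [
        long_run_var + persistence ** (h - 1) * (next_var - long_run_var)
        for h in range(1, horizon + 1)
    ]

def garch11_aggregated_volatility(next_var, alpha, beta, long_run_var, horizon):
    """Volatility forecast over the next `horizon` periods taken together."""
    path = garch11_variance_path(next_var, alpha, beta, long_run_var, horizon)
    return math.sqrt(sum(path))

# Hypothetical inputs: current variance above its long-term level, 21 trading days ahead
path = garch11_variance_path(0.0009, 0.10, 0.85, 0.0004, horizon=21)
monthly_vol = garch11_aggregated_volatility(0.0009, 0.10, 0.85, 0.0004, horizon=21)
```

When the one-period-ahead variance is above the long-term variance, as in this example, the per-period forecasts decrease with the horizon, so the aggregated volatility is lower than what a flat $\sqrt{h}$ scaling of the one-period-ahead volatility would give.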

#### How to determine the parameters of a GARCH(1,1) model?

The parameters of a GARCH(1,1) model – either $\omega$, $\alpha$ and $\beta$ or $\alpha$, $\beta$, $\gamma$ and $\tilde{\sigma}^2$ – are typically determined by maximum likelihood estimation (MLE) with a Gaussian^{13} or Student’s $t$ assumption for the distribution of the innovations.

A note of caution, though.

There are plenty of software packages able to do this estimation, but the underlying optimization problem *has been documented to be numerically difficult and prone to error*^{14} due to *a one dimensional manifold in the parameter space where the likelihood function is large and almost constant*^{14}, which tends to “trap” numerical algorithms.

Possible remediations have been suggested in Zumbach^{14} and in Kristensen and Linton^{15}, like reformulating the optimization problem in an alternative parameter space or using a closed-form estimator for the GARCH(1,1) parameters that does not rely on any numerical optimization procedure, but unfortunately, these remediations are not sufficient due to the problematic^{16} finite sample behavior of the maximum likelihood estimates…
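For illustration purposes only, the sketch below fits a GARCH(1,1) model on squared returns by Gaussian (quasi-)maximum likelihood, using a coarse grid search over $\alpha$ and $\beta$ rather than a numerical optimizer. This is neither the procedure of any particular software package nor one of the remediations cited above. It assumes zero-mean returns, a long-term variance fixed to the sample mean of the squared returns and a recursion started at that long-term variance; all function names are hypothetical.

```python
import math
import random

def gaussian_neg_loglik(returns, alpha, beta, long_run_var):
    """Gaussian negative log-likelihood of zero-mean returns under GARCH(1,1)."""
    gamma = 1.0 - alpha - beta
    sigma2 = long_run_var  # assumption: initial conditional variance
    nll = 0.0
    for r in returns:
        nll += 0.5 * (math.log(2.0 * math.pi * sigma2) + r * r / sigma2)
        sigma2 = gamma * long_run_var + alpha * r * r + beta * sigma2
    return nll

def fit_garch11_grid(returns, step=0.02):
    """Grid search over (alpha, beta) with alpha, beta > 0 and alpha + beta < 1."""
    long_run_var = sum(r * r for r in returns) / len(returns)
    n = int(round(1.0 / step))
    best = None
    for i in range(1, n):
        for j in range(1, n - i):
            alpha, beta = i * step, j * step
            nll = gaussian_neg_loglik(returns, alpha, beta, long_run_var)
            if best is None or nll < best[0]:
                best = (nll, alpha, beta)
    return best[1], best[2], long_run_var

def simulate_garch11(n, alpha, beta, long_run_var, seed=0):
    """Simulate n Gaussian GARCH(1,1) returns (used here as test data only)."""
    rng = random.Random(seed)
    gamma = 1.0 - alpha - beta
    sigma2 = long_run_var
    returns = []
    for _ in range(n):
        r = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        returns.append(r)
        sigma2 = gamma * long_run_var + alpha * r * r + beta * sigma2
    return returns

returns = simulate_garch11(500, alpha=0.10, beta=0.80, long_run_var=0.0001)
alpha_hat, beta_hat, long_run_hat = fit_garch11_grid(returns)
```

Printing the likelihood along the grid for a fixed value of $\alpha + \beta$ is one simple way to observe how flat it can be in some directions, which is the difficulty described above.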

## Implementation in Portfolio Optimizer

**Portfolio Optimizer** implements the GARCH(1,1) volatility forecasting model through the endpoint `/assets/volatility/forecast/garch`.

This endpoint supports the 4 variance proxies below:

- Squared close-to-close returns
- Demeaned squared close-to-close returns
- The Parkinson range
- The jump-adjusted Parkinson range

Internally, this endpoint:

- Assumes that the asset unconditional variance $\tilde{\sigma}^2$ is equal to its long-term average value $\frac{1}{T} \sum_{t=1}^{T} \tilde{\sigma}^2_t$
- Automatically determines the optimal value of the GARCH(1,1) parameters $\alpha$, $\beta$ and $\gamma$ using a proprietary numerical optimization procedure

## Example of usage – Volatility forecasting at monthly level for various ETFs

As an example of usage, I propose to enrich the results of the previous blog post, in which monthly forecasts produced by different volatility models are compared – using Mincer-Zarnowitz^{17} regressions – to the next month’s close-to-close observed volatility for 10 ETFs representative^{18} of miscellaneous asset classes:

- U.S. stocks (SPY ETF)
- European stocks (EZU ETF)
- Japanese stocks (EWJ ETF)
- Emerging markets stocks (EEM ETF)
- U.S. REITs (VNQ ETF)
- International REITs (RWX ETF)
- U.S. 7-10 year Treasuries (IEF ETF)
- U.S. 20+ year Treasuries (TLT ETF)
- Commodities (DBC ETF)
- Gold (GLD ETF)

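Averaged results for all ETFs/regression models over each ETF price history^{19} are the following^{20}, where $\bar{\alpha}$, $\bar{\beta}$ and $\bar{R^2}$ denote the average intercept, slope and r-squared of the Mincer-Zarnowitz regressions (not the GARCH(1,1) parameters):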

| # | Volatility model | Variance proxy | $\bar{\alpha}$ | $\bar{\beta}$ | $\bar{R^2}$ |
|---|---|---|---|---|---|
| 1 | Random walk | Squared close-to-close returns | 5.8% | 0.66 | 44% |
| 2 | SMA, optimal $k \in \left[ 1, 5, 10, 15, 20 \right]$ days | Squared close-to-close returns | 5.8% | 0.68 | 46% |
| 3 | EWMA, optimal $\lambda$ | Squared close-to-close returns | 4.7% | 0.73 | 45% |
| 4 | GARCH(1,1) | Squared close-to-close returns | -1.3% | 0.98 | 43% |
| 5 | Random walk | Parkinson range | 5.6% | 0.94 | 44% |
| 6 | SMA, optimal $k \in \left[ 1, 5, 10, 15, 20 \right]$ days | Parkinson range | 5.1% | 1.00 | 47% |
| 7 | EWMA, optimal $\lambda$ | Parkinson range | 4.3% | 1.06 | 48% |
| 8 | GARCH(1,1) | Parkinson range | 2.7% | 1.18 | 47% |
| 9 | Random walk | Jump-adjusted Parkinson range | 4.9% | 0.70 | 45% |
| 10 | SMA, optimal $k \in \left[ 1, 5, 10, 15, 20 \right]$ days | Jump-adjusted Parkinson range | 5.1% | 0.71 | 47% |
| 11 | EWMA, optimal $\lambda$ | Jump-adjusted Parkinson range | 4.0% | 0.76 | 45% |
| 12 | GARCH(1,1) | Jump-adjusted Parkinson range | -1.0% | 1.00 | 45% |

From these, it is possible to conclude the following:

- The two GARCH(1,1) models using improved variance proxies produce volatility forecasts with better r-squared than the GARCH(1,1) model using squared returns (lines #8 and #12 vs. line #4), which is in agreement with Molnar^{8}
- The two GARCH(1,1) models using variance proxies that integrate close prices produce nearly unbiased forecasts (lines #4 and #12), which, together with their relatively high r-squared, makes them volatility forecasting models to recommend in these cases
- The GARCH(1,1) using the Parkinson range as variance proxy produces the most biased forecasts (line #8), which makes it a volatility forecasting model to avoid in this case
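For readers unfamiliar with Mincer-Zarnowitz regressions, the bias and r-squared figures discussed above can be obtained along the lines of the sketch below. It assumes an ordinary least squares regression of the realized volatility on the forecasted volatility, an unbiased forecast corresponding to an intercept of 0 and a slope of 1. The function name and the sample data are hypothetical.

```python
def mincer_zarnowitz(forecasts, realized):
    """OLS regression realized = intercept + slope * forecast + error.

    Returns (intercept, slope, r_squared).
    """
    n = len(forecasts)
    mean_f = sum(forecasts) / n
    mean_y = sum(realized) / n
    sxx = sum((f - mean_f) ** 2 for f in forecasts)
    sxy = sum((f - mean_f) * (y - mean_y) for f, y in zip(forecasts, realized))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_f
    ss_res = sum((y - intercept - slope * f) ** 2 for f, y in zip(forecasts, realized))
    ss_tot = sum((y - mean_y) ** 2 for y in realized)
    return intercept, slope, 1.0 - ss_res / ss_tot

# Hypothetical monthly volatility forecasts and a synthetic, exactly linear "realized" series
forecasts = [0.03, 0.05, 0.04, 0.08, 0.06]
realized = [0.01 + 0.9 * f for f in forecasts]
intercept, slope, r_squared = mincer_zarnowitz(forecasts, realized)
```

On this synthetic example, the regression recovers the intercept 0.01 and the slope 0.9 used to build the data, with an r-squared of one up to rounding; on real data, the r-squared measures how much of the variation in realized volatility the forecasts explain.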

## Conclusion

The GARCH(1,1) volatility forecasting model exhibits good practical performance for a wide range of assets, as empirically demonstrated in the previous section.

Nevertheless, because it is *unable to describe certain aspects often found in financial data*^{3}, many variations have been proposed in the literature^{3} (AGARCH, EGARCH, QGARCH, TGARCH…).

Next in this series, I will detail such a variation – very recent^{21} – whose main characteristic is its capability to adapt to time-varying GARCH parameters.

Meanwhile, feel free to connect with me on LinkedIn or to follow me on Twitter.

–

^{1} See Boudoukh, J., Richardson, M., & Whitelaw, R.F. (1997). Investigation of a class of volatility estimators, Journal of Derivatives, 4 Spring, 63-71.

^{2} Or more generally, of a weighted moving average of one of its past variance proxies.

^{3} See Brandon Williams, GARCH(1,1) models, B. Sc. Thesis, 15 July 2011.

^{4} See Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3), 307–327.

^{5} See Andrew J. Patton, Volatility forecast comparison using imperfect volatility proxies, Journal of Econometrics, Volume 160, Issue 1, 2011, Pages 246-256.

^{6} See Daniel B. Nelson and Charles Q. Cao, Inequality Constraints in the Univariate GARCH Model, Journal of Business & Economic Statistics, Vol. 10, No. 2 (Apr., 1992), pp. 229-235.

^{7} See Pelagatti, M., Lisi, F. (2009). Variance initialisation in GARCH estimation. In Paganoni, A.M., Sangalli, L.M., Secchi, P., Vantini, S. (eds.), S.Co. 2009 Sixth Conference Complex Data Modeling and Computationally Intensive Statistical Methods for Estimation and Prediction, Maggioli Editore, Milan.

^{8} See Peter Molnar (2016): High-low range in GARCH models of stock return volatility, Applied Economics.

^{9} Also called the asset long-term variance.

^{10} More precisely, a convex combination.

^{11} Which is not surprising since *in fact, exponential smoothing is a constrained version of GARCH (1,1)*^{1}, without mean-reversion.

^{12} See Brooks, Chris and Persand, Gitanjali (2003) Volatility forecasting for risk management. Journal of Forecasting, 22(1). pp. 1-22.

^{13} In which case, the Gaussian MLE is usually considered as a quasi-maximum likelihood estimate.

^{14} See Zumbach, G. (2000). The Pitfalls in Fitting Garch(1,1) Processes. In: Dunis, C.L. (eds) Advances in Quantitative Asset Management. Studies in Computational Finance, vol 1. Springer, Boston, MA.

^{15} See Dennis Kristensen and Oliver Linton, A Closed-Form Estimator for the GARCH(1,1) Model, Econometric Theory, Vol. 22, No. 2 (Apr., 2006), pp. 323-337.

^{17} See Mincer, J. and V. Zarnowitz (1969). The evaluation of economic forecasts. In J. Mincer (Ed.), Economic Forecasts and Expectations.

^{18} These ETFs are used in the *Adaptive Asset Allocation* strategy from ReSolve Asset Management, described in the paper *Adaptive Asset Allocation: A Primer*^{22}.

^{19} The common ending price history of all the ETFs is 31 August 2023, but there is no common starting price history, as all ETFs started trading on different dates.

^{20} For all models, I used an expanding window for the volatility forecast computation.

^{21} At the date of publication of this blog post.

^{22} See Butler, Adam and Philbrick, Mike and Gordillo, Rodrigo and Varadi, David, Adaptive Asset Allocation: A Primer.

This post was originally published at https://portfoliooptimizer.io/blog/volatility-forecasting-garch11-model/.