27  Estimating VAR

Consider a more general VAR specification with \(n\) endogenous variables and \(m\) exogenous variables:

\[ y_t = A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + C x_t + u_t \tag{27.1}\]

where \(t = 1, 2, ..., T\). \(y_t\) is a \(n\times 1\) vector of endogenous data, \(A_1, A_2,..., A_p\) are \(p\) matrices of dimension \(n\times n\); \(C\) is an \(n\times m\) matrix, and \(x_t\) is an \(m\times 1\) vector of exogenous regressors which can be e.g. constant terms, time trends, or exogenous data series. \(u_t\) is the vector of residuals.

We assume: (1) all variables are stationary; (2) the \(p\) lags are sufficient to summarize all the dynamic correlations among elements of \(y_t\); and (3) \(u_t\) is uncorrelated with \(y_{t-1},...,y_{t-p}\) and is free of serial correlation. Thus, \(\mathbb E(u_tu_t') = \Omega\), while \(\mathbb E(u_t u_s') = 0\) for \(t \neq s\). \(\Omega\) is an \(n\times n\) symmetric positive definite variance-covariance matrix, with variance terms on the diagonal and covariance terms off diagonal.

\(T\) is the size of the sample. There are \(k = np + m\) coefficients to estimate for each equation, and a total of \(q = nk = n(np + m)\) coefficients to estimate for the full model.

27.1 Stacked Form

For computation, it is more convenient to rewrite the VAR system in transpose:

\[ y_t' = y_{t-1}' A_1' + y_{t-2}' A_2' + \cdots + y_{t-p}' A_p' + x_t' C' + u_t' \]

We can stack observations to represent the whole data set:

\[ \begin{bmatrix} y_1' \\ y_2' \\ \vdots \\ y_T' \end{bmatrix}_{T\times n} = \sum_{j=1}^{p} \begin{bmatrix} y_{1-j}' \\ y_{2-j}' \\ \vdots \\ y_{T-j}' \end{bmatrix}_{T\times n} \underset{n\times n}{A_j'} + \begin{bmatrix} x_1' \\ x_2' \\ \vdots \\ x_T' \end{bmatrix}_{T\times m} \underset{m\times n}{C'} + \begin{bmatrix} u_1' \\ u_2' \\ \vdots \\ u_T' \end{bmatrix}_{T\times n} \]

Gathering the regressors into a single matrix, one obtains:

\[ \begin{bmatrix} y_1' \\ y_2' \\ \vdots \\ y_T' \end{bmatrix}_{T\times n} = \underbrace{\begin{bmatrix} y_0' & y_{-1}' & \dots & y_{1-p}' & x_1' \\ y_1' & y_0' & \dots & y_{2-p}' & x_2' \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ y_{T-1}' & y_{T-2}' & \dots & y_{T-p}' & x_T' \end{bmatrix}}_{T\times(np+m)} \underbrace{\begin{bmatrix} A_1' \\ A_2' \\ \vdots \\ A_p' \\ C' \end{bmatrix}}_{(np+m)\times n} + \begin{bmatrix} u_1' \\ u_2' \\ \vdots \\ u_T' \end{bmatrix}_{T\times n} \]

Or, more compactly:

\[ \underset{T\times n}{Y} = \underset{T\times k}{X}\ \underset{k\times n}{B} + \underset{T\times n}{U}. \tag{27.2}\]

Once the model is stacked this way, obtaining OLS estimates of the VAR is straightforward. An estimate \(\hat B\) is obtained from:

\[ \hat B = (X'X)^{-1} X'Y \]

This is equivalent to applying OLS on each column variable:

\[ \begin{bmatrix}\hat B^{(1)} & \hat B^{(2)} & \dots & \hat B^{(n)}\end{bmatrix} = (X'X)^{-1}X' \begin{bmatrix}Y^{(1)} & Y^{(2)} & \dots & Y^{(n)}\end{bmatrix} \]

where \(Y^{(i)}\) denotes \(i\)-th column of matrix \(Y\). An estimate of the covariance matrix \(\Omega\) can be obtained from:

\[ \hat\Omega= \frac{1}{T-k-1}\hat{U}'\hat{U} \]

Under the assumptions (1)-(3), the parameters are consistently estimated by OLS regressions. Standard asymptotic results apply.

27.2 Vectorized Form

Alternatively, one can vectorize the VAR system as:

\[ \begin{bmatrix} Y^{(1)} \\ Y^{(2)} \\ \vdots \\ Y^{(n)} \end{bmatrix}_{nT\times 1} = \begin{bmatrix} X & & & \\ & X & & \\ & & \ddots & \\ & & & X \end{bmatrix}_{nT \times nk} \begin{bmatrix} B^{(1)} \\ B^{(2)} \\ \vdots \\ B^{(n)} \end{bmatrix}_{nk \times 1} + \begin{bmatrix} U^{(1)} \\ U^{(2)} \\ \vdots \\ U^{(n)} \end{bmatrix}_{nT\times 1} \]

where \(Y^{(i)}\) denotes \(i\)-th column of matrix \(Y\). The vectorized system can be compactly written as

\[ y = \bar{X}\beta + u \tag{27.3}\]

where \(y = vec(Y)\), \(\bar{X} = I_n \otimes X\), \(\beta = vec(B)\), \(u = vec(U)\). Also, one has \(\mathbb E(uu') = \bar\Omega\), where \(\bar\Omega = \Omega\otimes I_T\). An OLS estimate of the vectorised \(\beta\) can be obtained as:

\[ \hat\beta = (\bar{X}' \bar{X})^{-1} \bar{X}'y \]

It should be noted that Equation 27.2 and Equation 27.3 are just alternative but equivalent representations of the same VAR model Equation 27.1. We opt to use one representation or the other according to which one is most convenient for our purposes. Equation 27.2 is typically faster to compute (because smaller matrices produce more accurate estimates), while statistical inference works more naturally with the vectorized form.