4 Advanced concepts
4.1 Trace
The trace of a \(k \times k\) square matrix \(\boldsymbol A\) is the sum of the diagonal entries: \[\mathop{\mathrm{tr}}(\boldsymbol A) = \sum_{i=1}^k a_{ii}\] Example: \[
\boldsymbol A = \begin{pmatrix}
1 & 2 & 3 \\ 3 & 9 & 1 \\ 0 & 11 & 5
\end{pmatrix} \quad \Rightarrow \quad \mathop{\mathrm{tr}}(\boldsymbol A) = 1+9+5 = 15
\] In R, the trace can be computed by summing the diagonal entries with diag() and sum(), for instance:
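A <- matrix(c(1, 3, 0, 2, 9, 11, 3, 1, 5), nrow = 3) #the matrix A, filled column-wise
sum(diag(A)) #sum of the diagonal entries of A
[1] 15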
The following properties hold for square matrices \(\boldsymbol A\) and \(\boldsymbol B\) and scalars \(\lambda\):
- \(\mathop{\mathrm{tr}}(\lambda \boldsymbol A) = \lambda \mathop{\mathrm{tr}}(\boldsymbol A)\)
- \(\mathop{\mathrm{tr}}(\boldsymbol A + \boldsymbol B) = \mathop{\mathrm{tr}}(\boldsymbol A) + \mathop{\mathrm{tr}}(\boldsymbol B)\)
- \(\mathop{\mathrm{tr}}(\boldsymbol A') = \mathop{\mathrm{tr}}(\boldsymbol A)\)
- \(\mathop{\mathrm{tr}}(\boldsymbol I_k) = k\)
For \(\boldsymbol A\in \mathbb R^{k \times m}\) and \(\boldsymbol B \in \mathbb R^{m \times k}\) we have \[ \mathop{\mathrm{tr}}(\boldsymbol A \boldsymbol B) = \mathop{\mathrm{tr}}(\boldsymbol B \boldsymbol A). \]
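A quick numerical check of this property in R (the matrices C and D below are purely illustrative and not part of the running example):
C <- matrix(1:6, nrow = 2) #a 2 x 3 matrix
D <- matrix(1:6, nrow = 3) #a 3 x 2 matrix
sum(diag(C %*% D)) #tr(CD), equals 86
sum(diag(D %*% C)) #tr(DC), also equals 86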
4.2 Idempotent matrix
The square matrix \(\boldsymbol A\) is called idempotent if \(\boldsymbol A \boldsymbol A = \boldsymbol A\). The identity matrix is idempotent: \(\boldsymbol I_n \boldsymbol I_n = \boldsymbol I_n\). Another example is the matrix \[ \boldsymbol A = \begin{pmatrix} 4 & -1 \\ 12 & -3 \end{pmatrix}. \] We have \[\begin{align*} \boldsymbol A \boldsymbol A &= \begin{pmatrix} 4 & -1 \\ 12 & -3 \end{pmatrix} \begin{pmatrix} 4 & -1 \\ 12 & -3 \end{pmatrix} \\ &= \begin{pmatrix} 16-12 & -4+3 \\ 48-36 & -12+9 \end{pmatrix} \\ &= \begin{pmatrix} 4 & -1 \\ 12 & -3 \end{pmatrix} = \boldsymbol A. \end{align*}\]
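This can be checked numerically in R; a minimal sketch (the name M is used here only to avoid overwriting the matrix A from Section 4.1):
M <- matrix(c(4, 12, -1, -3), nrow = 2) #the idempotent matrix above
M %*% M #returns the same matrix M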
4.3 Eigendecomposition
4.3.1 Eigenvalues
An eigenvalue \(\lambda\) of a \(k \times k\) square matrix \(\boldsymbol A\) is a solution to the equation \[ \det(\lambda \boldsymbol I_k - \boldsymbol A) = 0. \] The function \(f(\lambda) = \det(\lambda \boldsymbol I_k - \boldsymbol A)\) is a polynomial of degree \(k\) in \(\lambda\) (the characteristic polynomial), so the equation \(\det(\lambda \boldsymbol I_k - \boldsymbol A) = 0\) has exactly \(k\) solutions when counted with multiplicity. The solutions \(\lambda_1, \ldots, \lambda_k\) are the \(k\) eigenvalues of \(\boldsymbol A\).
Most applications of eigenvalues in econometrics concern symmetric matrices. In this case, all eigenvalues are real-valued. In the case of non-symmetric matrices, some eigenvalues may be complex-valued.
Useful properties of the eigenvalues of a symmetric \(k \times k\) matrix are:
- \(\det(\boldsymbol A) = \lambda_1 \cdot \ldots \cdot \lambda_k\)
- \(\mathop{\mathrm{tr}}(\boldsymbol A) = \lambda_1 + \ldots + \lambda_k\)
- \(\boldsymbol A\) is nonsingular if and only if all eigenvalues are nonzero
- for any \(k \times k\) matrix \(\boldsymbol B\), the products \(\boldsymbol A \boldsymbol B\) and \(\boldsymbol B \boldsymbol A\) have the same eigenvalues.
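As a quick numerical illustration of the first two properties in this list (the symmetric matrix S below is purely illustrative):
S <- matrix(c(2, 1, 1, 2), nrow = 2) #symmetric 2 x 2 matrix with eigenvalues 3 and 1
eigen(S)$values #returns 3 and 1
det(S) #equals 3 * 1 = 3
sum(diag(S)) #equals 3 + 1 = 4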
4.3.2 Eigenvectors
If \(\lambda_i\) is an eigenvalue of \(\boldsymbol A\), then \(\lambda_i \boldsymbol I_k - \boldsymbol A\) is singular, which implies that there exists a nonzero vector \(\boldsymbol v_i\) with \((\lambda_i \boldsymbol I_k - \boldsymbol A) \boldsymbol v_i = \boldsymbol 0\). Equivalently, \[ \boldsymbol A \boldsymbol v_i = \lambda_i \boldsymbol v_i, \]
which can be solved by Gaussian elimination. It is convenient to normalize any solution such that \(\boldsymbol v_i'\boldsymbol v_i = 1\). The solutions \(\boldsymbol v_1, \ldots, \boldsymbol v_k\) are called eigenvectors of \(\boldsymbol A\) corresponding to the eigenvalues \(\lambda_1, \ldots, \lambda_k\).
4.3.3 Spectral decomposition
If \(\boldsymbol A\) is symmetric, then \(\boldsymbol v_1, \ldots, \boldsymbol v_k\) are pairwise orthogonal (i.e., \(\boldsymbol v_i' \boldsymbol v_j = 0\) for \(i \neq j\)). Let \(\boldsymbol V = \begin{pmatrix} \boldsymbol v_1 & \ldots & \boldsymbol v_k \end{pmatrix}\) be the \(k \times k\) matrix of eigenvectors and let \(\boldsymbol \Lambda = \mathop{\mathrm{diag}}(\lambda_1, \ldots, \lambda_k)\) be the \(k \times k\) diagonal matrix with the eigenvalues on the main diagonal. Then, we can write \[ \boldsymbol A = \boldsymbol V \boldsymbol \Lambda \boldsymbol V', \]
which is called the spectral decomposition of \(\boldsymbol A\). Because the eigenvectors are orthonormal, \(\boldsymbol V\) satisfies \(\boldsymbol V' \boldsymbol V = \boldsymbol I_k\), so the matrix of eigenvalues can be written as \(\boldsymbol \Lambda = \boldsymbol V' \boldsymbol A \boldsymbol V\).
4.3.4 Eigendecomposition in R
The function eigen() computes the eigenvalues and corresponding eigenvectors. As an example, consider the symmetric matrix \(\boldsymbol B = \boldsymbol A' \boldsymbol A\), where \(\boldsymbol A\) is the matrix from Section 4.1; in R it can be constructed as follows:
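B <- t(A) %*% A #the symmetric matrix A'A (matches the printed values below)
B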
[,1] [,2] [,3]
[1,] 10 29 6
[2,] 29 206 70
[3,] 6 70 35
eigen(B) #eigenvalues and eigenvector matrix
eigen() decomposition
$values
[1] 234.827160 12.582227 3.590613
$vectors
[,1] [,2] [,3]
[1,] -0.1293953 -0.5312592 0.8372697
[2,] -0.9346164 -0.2167553 -0.2819739
[3,] -0.3312839 0.8190121 0.4684764
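The spectral decomposition of Section 4.3.3 can be verified from this output: multiplying \(\boldsymbol V \boldsymbol \Lambda \boldsymbol V'\) recovers \(\boldsymbol B\). A minimal sketch:
eB <- eigen(B) #store the decomposition
V <- eB$vectors #matrix of eigenvectors V
Lambda <- diag(eB$values) #diagonal matrix of eigenvalues
V %*% Lambda %*% t(V) #recovers B up to rounding error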
4.4 Definite matrix
The \(k \times k\) square matrix \(\boldsymbol{A}\) is called positive definite if \[\boldsymbol{c}'\boldsymbol{Ac}>0\] holds for all nonzero vectors \(\boldsymbol{c}\in \mathbb{R}^k\). If \[\boldsymbol{c}'\boldsymbol{Ac}\geq 0\]
for all vectors \(\boldsymbol{c}\in \mathbb{R}^k\), the matrix is called positive semi-definite. Analogously, \(\boldsymbol A\) is called negative definite if \(\boldsymbol{c}'\boldsymbol{Ac}<0\) for all nonzero vectors \(\boldsymbol c \in \mathbb R^k\), and negative semi-definite if \(\boldsymbol{c}'\boldsymbol{Ac}\leq 0\) for all vectors \(\boldsymbol c \in \mathbb R^k\). A matrix that is neither positive semi-definite nor negative semi-definite is called indefinite.
The definiteness property of a symmetric matrix \(\boldsymbol A\) can be determined using its eigenvalues:
- \(\boldsymbol A\) is positive definite \(\Leftrightarrow\) all eigenvalues of \(\boldsymbol A\) are strictly positive
- \(\boldsymbol A\) is negative definite \(\Leftrightarrow\) all eigenvalues of \(\boldsymbol A\) are strictly negative
- \(\boldsymbol A\) is positive semi-definite \(\Leftrightarrow\) all eigenvalues of \(\boldsymbol A\) are non-negative
- \(\boldsymbol A\) is negative semi-definite \(\Leftrightarrow\) all eigenvalues of \(\boldsymbol A\) are non-positive
eigen(B)$values #B is positive definite (all eigenvalues positive)
[1] 234.827160 12.582227 3.590613
The matrix analog of a positive or negative number (scalar) is a positive definite or negative definite matrix. Therefore, we use the notation
- \(\boldsymbol A > 0\) if \(\boldsymbol A\) is positive definite
- \(\boldsymbol A < 0\) if \(\boldsymbol A\) is negative definite
- \(\boldsymbol A \geq 0\) if \(\boldsymbol A\) is positive semi-definite
- \(\boldsymbol A \leq 0\) if \(\boldsymbol A\) is negative semi-definite
The notation \(\boldsymbol A > \boldsymbol B\) means that the matrix \(\boldsymbol A - \boldsymbol B\) is positive definite.
4.5 Cholesky decomposition
Any positive definite and symmetric matrix \(\boldsymbol B\) can be written as \[ \boldsymbol B = \boldsymbol P \boldsymbol P', \] where \(\boldsymbol P\) is a lower triangular matrix with strictly positive diagonal entries \(p_{jj} > 0\). This representation is called the Cholesky decomposition. The matrix \(\boldsymbol P\) is unique. For a \(2 \times 2\) matrix \(\boldsymbol B\) we have \[\begin{align*} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} &= \begin{pmatrix} p_{11} & 0 \\ p_{21} & p_{22} \end{pmatrix} \begin{pmatrix} p_{11} & p_{21} \\ 0 & p_{22} \end{pmatrix} \\ &= \begin{pmatrix} p_{11}^2 & p_{11} p_{21} \\ p_{11} p_{21} & p_{21}^2 + p_{22}^2 \end{pmatrix}, \end{align*}\] which implies \(p_{11} = \sqrt{b_{11}}\), \(p_{21} = b_{21}/p_{11}\), and \(p_{22} = \sqrt{b_{22} - p_{21}^2}\). For a \(3 \times 3\) matrix we obtain
\[\begin{align*} \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix} &= \begin{pmatrix} p_{11} & 0 & 0 \\ p_{21} & p_{22} & 0 \\ p_{31} & p_{32} & p_{33} \end{pmatrix} \begin{pmatrix} p_{11} & p_{21} & p_{31} \\ 0 & p_{22} & p_{32} \\ 0 & 0 & p_{33}\end{pmatrix} \\ &= \begin{pmatrix} p_{11}^2 & p_{11} p_{21} & p_{11} p_{31} \\ p_{11} p_{21} & p_{21}^2 + p_{22}^2 & p_{21} p_{31} + p_{22} p_{32} \\ p_{11}p_{31} & p_{21}p_{31} + p_{22}p_{32} & p_{31}^2 + p_{32}^2 + p_{33}^2\end{pmatrix}, \end{align*}\] which implies
\[\begin{gather*} p_{11}=\sqrt{b_{11}}, \ \ p_{21} = \frac{b_{21}}{p_{11}}, \ \ p_{31} = \frac{b_{31}}{p_{11}}, \ \ p_{22} = \sqrt{b_{22}-p_{21}^2}, \\ p_{32}= \frac{b_{32}-p_{21}p_{31}}{p_{22}}, \ \ p_{33} = \sqrt{b_{33} - p_{31}^2 - p_{32}^2}. \end{gather*}\]Let’s compute the Cholesky decomposition of \[
\boldsymbol B = \begin{pmatrix} 1 & -0.5 & 0.6 \\ -0.5 & 1 & 0.25 \\ 0.6 & 0.25 & 1 \end{pmatrix}
\] using the R function chol(). Note that chol() returns the upper triangular factor \(\boldsymbol P'\), so the lower triangular factor \(\boldsymbol P\) is obtained by transposing its output; a minimal version of the computation looks as follows:
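B <- matrix(c(1, -0.5, 0.6, -0.5, 1, 0.25, 0.6, 0.25, 1), nrow = 3) #the matrix B above
P <- t(chol(B)) #chol() returns P', so transpose to get the lower triangular P
P #diagonal entries approximately 1, 0.866, and 0.486
P %*% t(P) #recovers B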
4.6 Vectorization
The vectorization operator \(\mathop{\mathrm{vec}}()\) stacks the matrix entries column-wise into a large vector. The vectorized \(k \times m\) matrix \(\boldsymbol A\) is the \(km \times 1\) vector \[ \mathop{\mathrm{vec}}(\boldsymbol A) = (a_{11}, \ldots, a_{k1}, a_{12}, \ldots, a_{k2}, \ldots, a_{1m}, \ldots, a_{km})'. \]
c(A) #vectorize the matrix A
[1] 1 3 0 2 9 11 3 1 5
4.7 Kronecker product
The Kronecker product \(\otimes\) multiplies each element of the left-hand side matrix with the entire matrix on the right-hand side. For a \(k \times m\) matrix \(\boldsymbol A\) and an \(r \times s\) matrix \(\boldsymbol B\), we get the \(kr\times ms\) matrix \[ \boldsymbol A \otimes \boldsymbol B = \begin{pmatrix} a_{11}\boldsymbol B & \ldots & a_{1m}\boldsymbol B \\ \vdots & & \vdots \\ a_{k1}\boldsymbol B & \ldots & a_{km}\boldsymbol B \end{pmatrix}, \] where each block \(a_{ij} \boldsymbol B\) is an \(r \times s\) matrix.
A %x% B #Kronecker product in R
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1.0 -0.50 0.60 2.0 -1.00 1.20 3.0 -1.50 1.80
[2,] -0.5 1.00 0.25 -1.0 2.00 0.50 -1.5 3.00 0.75
[3,] 0.6 0.25 1.00 1.2 0.50 2.00 1.8 0.75 3.00
[4,] 3.0 -1.50 1.80 9.0 -4.50 5.40 1.0 -0.50 0.60
[5,] -1.5 3.00 0.75 -4.5 9.00 2.25 -0.5 1.00 0.25
[6,] 1.8 0.75 3.00 5.4 2.25 9.00 0.6 0.25 1.00
[7,] 0.0 0.00 0.00 11.0 -5.50 6.60 5.0 -2.50 3.00
[8,] 0.0 0.00 0.00 -5.5 11.00 2.75 -2.5 5.00 1.25
[9,] 0.0 0.00 0.00 6.6 2.75 11.00 3.0 1.25 5.00
4.8 Vector and matrix norm
A norm \(\|\cdot\|\) of a vector or a matrix is a measure of distance from the origin. The most commonly used norms are the Euclidean vector norm \[ \|\boldsymbol a\| = \sqrt{\boldsymbol a' \boldsymbol a} = \sqrt{\sum_{i=1}^k a_i^2} \] for \(\boldsymbol a \in \mathbb R^k\), and the Frobenius matrix norm \[ \|\boldsymbol A \| = \sqrt{\sum_{i=1}^k \sum_{j=1}^m a_{ij}^2} \] for \(\boldsymbol A \in \mathbb R^{k \times m}\).
A norm satisfies the following properties:
- \(\|\lambda \boldsymbol A\| = |\lambda| \|\boldsymbol A\|\) for any scalar \(\lambda\) (absolute homogeneity)
- \(\|\boldsymbol A + \boldsymbol B\| \leq \|\boldsymbol A\| + \|\boldsymbol B\|\) (triangle inequality)
- \(\|\boldsymbol A\| = 0\) implies \(\boldsymbol A = \boldsymbol 0\) (definiteness)
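In R, both norms can be computed directly; a short sketch using the matrix \(\boldsymbol A\) from Section 4.1 and a purely illustrative vector a:
a <- c(3, 4) #illustrative vector
sqrt(sum(a^2)) #Euclidean norm of a, equals 5
norm(A, type = "F") #Frobenius norm of A
sqrt(sum(A^2)) #same value from the definition, approximately 15.84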