Derivation of the Least Squares Estimator for Beta in Matrix Notation – Proof Nr. 1

In the post that derives the least squares estimator, we make use of the following statement:

$\frac{\partial b'X'Xb}{\partial b} =2X'Xb$

This post shows how one can prove this statement. Let’s start from the statement that we want to prove:

$\frac{\partial \hat{\beta}'X'X\hat{\beta}}{\partial \hat{\beta}}=2 X'X \hat{\beta}$

Note that $X'X$ is symmetric, since $(X'X)' = X'(X')' = X'X$. To simplify the notation, we label $X'X$ as $A$, i.e. $X'X :=A$, and write its elements as $a_{ij}$, where $a_{ij}=a_{ji}$ by symmetry.

$\hat{\beta}'A\hat{\beta}= \begin{bmatrix} \hat{\beta}_{1} & \hat{\beta}_{2} & \hdots & \hat{\beta}_{k}\end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \hdots & a_{1k}\\ a_{21} & a_{22} & \hdots & a_{2k}\\ \vdots & \vdots & \ddots & \vdots \\ a_{k1} & a_{k2} & \hdots & a_{kk} \end{bmatrix} \begin{bmatrix} \hat{\beta}_{1}\\ \hat{\beta}_{2} \\ \vdots \\ \hat{\beta}_{k} \end{bmatrix}$

$\hat{\beta}'A\hat{\beta}= \begin{bmatrix} \sum\limits_{i=1}^k \hat{\beta}_{i}a_{i1} & \sum\limits_{i=1}^k \hat{\beta}_{i}a_{i2} & \hdots & \sum\limits_{i=1}^k \hat{\beta}_{i}a_{ik}\end{bmatrix} \begin{bmatrix} \hat{\beta}_{1}\\ \hat{\beta}_{2} \\ \vdots \\ \hat{\beta}_{k} \end{bmatrix}$

$\hat{\beta}'A\hat{\beta}= \begin{matrix} \hat{\beta}^{2}_{1}a_{11}+\hat{\beta}_{1}\hat{\beta}_{2}a_{21}+\hdots+\hat{\beta}_{1}\hat{\beta}_{k}a_{k1}+\\ \hat{\beta}_{2}\hat{\beta}_{1}a_{12}+\hat{\beta}_{2}^{2}a_{22}+\hdots+\hat{\beta}_{2}\hat{\beta}_{k}a_{k2}+\\ \vdots \\ \hat{\beta}_{k}\hat{\beta}_{1}a_{1k}+\hat{\beta}_{k}\hat{\beta}_{2}a_{2k}+\hdots+\hat{\beta}_{k}^{2}a_{kk}\\ \end{matrix}$
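
To make the pattern concrete, here is the same expansion for the smallest non-trivial case, $k=2$. This is only an illustration of the general formula, not an extra step of the proof:

$\hat{\beta}'A\hat{\beta}= \hat{\beta}_{1}^{2}a_{11}+\hat{\beta}_{1}\hat{\beta}_{2}a_{21}+\hat{\beta}_{2}\hat{\beta}_{1}a_{12}+\hat{\beta}_{2}^{2}a_{22}$

Since $A$ is symmetric, $a_{12}=a_{21}$, and the two cross terms merge:

$\hat{\beta}'A\hat{\beta}= a_{11}\hat{\beta}_{1}^{2}+2a_{12}\hat{\beta}_{1}\hat{\beta}_{2}+a_{22}\hat{\beta}_{2}^{2}$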

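Let’s compute the partial derivative of $\hat{\beta}'A\hat{\beta}$ with respect to each element of $\hat{\beta}$, starting with $\hat{\beta}_{1}$. Only the terms that contain $\hat{\beta}_{1}$ contribute.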

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}_{1}}=2\hat{\beta}_{1}a_{11}+\hat{\beta}_{2}a_{21}+\hdots+\hat{\beta}_{k}a_{k1}+\hat{\beta}_{2}a_{12}+\hdots+\hat{\beta}_{k}a_{1k}$

Because $A$ is symmetric, $a_{i1}=a_{1i}$, so the two groups of cross terms are identical and can be combined:

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}_{1}}= 2(\hat{\beta}_{1}a_{11}+\hat{\beta}_{2}a_{12}+\hdots+\hat{\beta}_{k}a_{1k})$

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}_{2}}= 2(\hat{\beta}_{1}a_{21}+\hat{\beta}_{2}a_{22}+\hdots+\hat{\beta}_{k}a_{2k})$

$\vdots$

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}_{k}}= 2(\hat{\beta}_{1}a_{k1}+\hat{\beta}_{2}a_{k2}+\hdots+\hat{\beta}_{k}a_{kk})$
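
The pattern in these equations can also be written once for a generic index $j$. The following is a compact restatement in summation notation; the indices $j$ and $l$ are introduced here only for this purpose. Writing the quadratic form as a double sum,

$\hat{\beta}'A\hat{\beta}=\sum\limits_{i=1}^k \sum\limits_{l=1}^k \hat{\beta}_{i}a_{il}\hat{\beta}_{l}$

and differentiating with respect to $\hat{\beta}_{j}$, which appears once as $\hat{\beta}_{i}$ (when $i=j$) and once as $\hat{\beta}_{l}$ (when $l=j$), gives

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}_{j}}=\sum\limits_{l=1}^k a_{jl}\hat{\beta}_{l}+\sum\limits_{i=1}^k \hat{\beta}_{i}a_{ij}=2\sum\limits_{i=1}^k a_{ji}\hat{\beta}_{i}$

where the last equality uses $a_{ij}=a_{ji}$. The right-hand side is the $j$-th element of $2A\hat{\beta}$.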

Instead of stating every single equation, one can stack the $k$ partial derivatives into a column vector and state the same result in the more compact matrix notation:

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}}=2A\hat{\beta}$

Plugging in $X'X$ for $A$ gives the statement we wanted to prove:

$\frac{\partial \hat{\beta}'X'X\hat{\beta}}{\partial \hat{\beta}}=2X'X\hat{\beta}$
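
As a sanity check, the result can also be verified numerically. The sketch below is not part of the original derivation: it uses plain Python, the matrix `X` and the vector `b` are made-up illustrative values, and the helper names (`quad_form`, `numerical_gradient`, and so on) are chosen here for the example only. It compares $2X'Xb$ with a central finite-difference approximation of the gradient of $b'X'Xb$.

```python
# Numerical check of d(b'X'Xb)/db = 2 X'X b using central finite differences.
# X and b are arbitrary illustrative values, not data from the post.

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(P, Q):
    """Matrix product P Q for matrices stored as lists of rows."""
    Qt = transpose(Q)
    return [[sum(p * q for p, q in zip(row, col)) for col in Qt] for row in P]

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def quad_form(A, b):
    """The scalar b'Ab."""
    return sum(bi * yi for bi, yi in zip(b, matvec(A, b)))

def numerical_gradient(A, b, h=1e-6):
    """Approximate each partial derivative of b'Ab by a central difference."""
    grad = []
    for j in range(len(b)):
        up = b[:]
        up[j] += h
        down = b[:]
        down[j] -= h
        grad.append((quad_form(A, up) - quad_form(A, down)) / (2 * h))
    return grad

X = [[1.0, 2.0],
     [1.0, 3.0],
     [1.0, 5.0]]
b = [0.5, -1.5]

A = matmul(transpose(X), X)               # A = X'X, symmetric by construction
analytic = [2 * g for g in matvec(A, b)]  # the claimed gradient 2 X'X b
numeric = numerical_gradient(A, b)        # finite-difference approximation

print(analytic)  # [-27.0, -104.0]
print(numeric)   # agrees with the analytic gradient up to rounding error
```

A check like this does not replace the proof, but it is a quick way to catch index or sign mistakes in a derivation of this kind.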

6 thoughts on “Derivation of the Least Squares Estimator for Beta in Matrix Notation – Proof Nr. 1”

1. Maciej Bitner says:

I think there is a tiny error in the pre-last line. The right hand side should be 2A_1 times B as we use only the first row of the matrix A. Then it is generalized to a vector of partial derivatives on the left and matrix A times B on the right. Nevertheless the proof was very helpful, thank you for posting it!

Hi, thanks for the proof, I appreciate it. I just want to point out a typo. When writing $\hat{\beta}'A\hat{\beta}$ as a number (i.e. a sum of sums), two errors occur in the last row:

1) The index of very first beta in the row should be k, not 1.
2) The plus sign at the end of the row is redundant