# Derivation of the Least Squares Estimator for Beta in Matrix Notation – Proof Nr. 1

In the post that derives the least squares estimator, we make use of the following statement: $\frac{\partial b'X'Xb}{\partial b} =2X'Xb$

This post shows how one can prove this statement. Writing $\hat{\beta}$ for $b$, the statement that we want to prove is: $\frac{\partial \hat{\beta}'X'X\hat{\beta}}{\partial \hat{\beta}}=2 X'X \hat{\beta}$

Note that $X'X$ is symmetric. To simplify the notation, we label $X'X$ as $A$, i.e. $A := X'X$, so that $a_{ij}=a_{ji}$. Writing out the quadratic form:

$\hat{\beta}'A\hat{\beta}= \begin{bmatrix} \hat{\beta}_{1} & \hat{\beta}_{2} & \hdots & \hat{\beta}_{k}\end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \hdots & a_{1k}\\ a_{21} & a_{22} & \hdots & a_{2k}\\ \vdots & \vdots & \ddots & \vdots \\ a_{k1} & a_{k2} & \hdots & a_{kk} \end{bmatrix} \begin{bmatrix} \hat{\beta}_{1}\\ \hat{\beta}_{2} \\ \vdots \\ \hat{\beta}_{k} \end{bmatrix}$

Multiplying the row vector by $A$ first:

$\hat{\beta}'A\hat{\beta}= \begin{bmatrix} \sum\limits_{i=1}^k \hat{\beta}_{i}a_{i1} & \sum\limits_{i=1}^k \hat{\beta}_{i}a_{i2} & \hdots & \sum\limits_{i=1}^k \hat{\beta}_{i}a_{ik}\end{bmatrix} \begin{bmatrix} \hat{\beta}_{1}\\ \hat{\beta}_{2} \\ \vdots \\ \hat{\beta}_{k} \end{bmatrix}$

Multiplying out the remaining product gives a scalar, a sum of $k$ rows with $k$ terms each:

$\hat{\beta}'A\hat{\beta}= \begin{matrix} \hat{\beta}_{1}^{2}a_{11}+\hat{\beta}_{1}\hat{\beta}_{2}a_{21}+\hdots+\hat{\beta}_{1}\hat{\beta}_{k}a_{k1}+\\ \hat{\beta}_{2}\hat{\beta}_{1}a_{12}+\hat{\beta}_{2}^{2}a_{22}+\hdots+\hat{\beta}_{2}\hat{\beta}_{k}a_{k2}+\\ \vdots \\ \hat{\beta}_{k}\hat{\beta}_{1}a_{1k}+\hat{\beta}_{k}\hat{\beta}_{2}a_{2k}+\hdots+\hat{\beta}_{k}^{2}a_{kk}\\ \end{matrix}$

Let’s compute the partial derivative of $\hat{\beta}'A\hat{\beta}$ with respect to each element of $\hat{\beta}$. For $\hat{\beta}_{1}$, collecting every term that contains $\hat{\beta}_{1}$ gives:

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}_{1}}=2\hat{\beta}_{1}a_{11}+\hat{\beta}_{2}a_{21}+\hdots+\hat{\beta}_{k}a_{k1}+\hat{\beta}_{2}a_{12}+\hdots+\hat{\beta}_{k}a_{1k}$

Because $A$ is symmetric, $a_{j1}=a_{1j}$, so the off-diagonal terms pair up:

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}_{1}}= 2(\hat{\beta}_{1}a_{11}+\hat{\beta}_{2}a_{12}+\hdots+\hat{\beta}_{k}a_{1k})$

The same argument applies to the remaining elements:

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}_{2}}= 2(\hat{\beta}_{1}a_{21}+\hat{\beta}_{2}a_{22}+\hdots+\hat{\beta}_{k}a_{2k})$

$\vdots$

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}_{k}}= 2(\hat{\beta}_{1}a_{k1}+\hat{\beta}_{2}a_{k2}+\hdots+\hat{\beta}_{k}a_{kk})$
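To make the pattern concrete, here is the case $k=2$ written out in full. This is only an illustrative special case and not part of the general argument; it uses nothing beyond the symmetry $a_{12}=a_{21}$.

$\hat{\beta}'A\hat{\beta}= a_{11}\hat{\beta}_{1}^{2}+a_{21}\hat{\beta}_{1}\hat{\beta}_{2}+a_{12}\hat{\beta}_{2}\hat{\beta}_{1}+a_{22}\hat{\beta}_{2}^{2}$

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}_{1}}= 2a_{11}\hat{\beta}_{1}+(a_{21}+a_{12})\hat{\beta}_{2}=2(a_{11}\hat{\beta}_{1}+a_{12}\hat{\beta}_{2})$

$\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}_{2}}= (a_{21}+a_{12})\hat{\beta}_{1}+2a_{22}\hat{\beta}_{2}=2(a_{21}\hat{\beta}_{1}+a_{22}\hat{\beta}_{2})$

The two right-hand sides are exactly the two rows of $2A\hat{\beta}$.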

Instead of stating every single equation, one can stack the $k$ partial derivatives into a column vector and state the same result in the more compact matrix notation: $\frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}}=2A\hat{\beta}$
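The step from the element-wise equations to the matrix form can be illustrated with a short NumPy sketch. The dimension `k = 4`, the random seed, and the variable names are assumptions made purely for illustration; the sketch only shows that the $i$-th partial derivative $2\sum_j a_{ij}\hat{\beta}_j$ is the $i$-th entry of $2A\hat{\beta}$.

```python
import numpy as np

# Illustrative setup (all values are assumptions, not from the post):
rng = np.random.default_rng(0)
k = 4
M = rng.normal(size=(k, k))
A = M.T @ M                # symmetric by construction, like X'X
b = rng.normal(size=k)     # plays the role of beta-hat

# Element-wise form: the i-th partial derivative is 2 * sum_j a_ij * b_j
grad_elementwise = np.array(
    [2 * sum(A[i, j] * b[j] for j in range(k)) for i in range(k)]
)

# Compact matrix form: 2 A b
grad_matrix = 2 * A @ b

# Both forms should agree up to floating-point rounding
print(np.allclose(grad_elementwise, grad_matrix))
```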

Plugging $X'X$ back in for $A$ gives: $\frac{\partial \hat{\beta}'X'X\hat{\beta}}{\partial \hat{\beta}}=2X'X\hat{\beta}$
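As a sanity check, the result can be compared against a numerical gradient. The sketch below is a minimal illustration, assuming a randomly generated $X$ with 50 rows and 3 columns; the helper names `quadratic_form` and `numerical_gradient` are hypothetical and introduced only for this example.

```python
import numpy as np

def quadratic_form(X, b):
    """Return b'X'Xb, computed as the squared norm of Xb."""
    Xb = X @ b
    return float(Xb @ Xb)

def numerical_gradient(f, b, h=1e-6):
    """Central finite-difference approximation of the gradient of f at b."""
    grad = np.zeros_like(b)
    for i in range(b.size):
        step = np.zeros_like(b)
        step[i] = h
        grad[i] = (f(b + step) - f(b - step)) / (2 * h)
    return grad

# Illustrative data (assumed, not from the post)
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 3))
b = rng.normal(size=3)

analytic = 2 * X.T @ X @ b    # the formula proved above
numeric = numerical_gradient(lambda v: quadratic_form(X, v), b)

# The two gradients should agree up to finite-difference and rounding error
print(np.allclose(analytic, numeric, rtol=1e-5, atol=1e-6))
```

A finite-difference check like this does not prove the identity, but it is a quick way to catch index or transpose mistakes in a derivation.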

Now let’s return to the derivation of the least squares estimator.

## 4 thoughts on “Derivation of the Least Squares Estimator for Beta in Matrix Notation – Proof Nr. 1”

1. Maciej Bitner says:

I think there is a tiny error in the pre-last line. The right hand side should be 2A_1 times B as we use only the first row of the matrix A. Then it is generalized to a vector of partial derivatives on the left and matrix A times B on the right. Nevertheless the proof was very helpful, thank you for posting it!

2. Adam Janovský says:

Hi, thanks for the proof, I appreciate it. I just want to point out a typo. When writing $\hat{\beta}'A\hat{\beta}$ as a number (i.e. a sum of sums), two errors occur in the last row:

1) The index of very first beta in the row should be k, not 1.
2) The plus sign at the end of the row is redundant

1. ad says:

Thanks Adam, you are right! I corrected the mistakes. Thanks a lot! Cheers, ad.
