In the post that derives the least squares estimator, we make use of the following statement:

$$\frac{\partial\, b'X'Xb}{\partial b} = 2X'Xb$$

This post shows how one can prove this statement.
Note that $X'X$ is symmetric. Hence, in order to simplify the math, we are going to label $X'X$ as $A$, i.e. $A = X'X$.
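Let’s compute the partial derivative of $b'Ab$ with respect to a single element $b_j$ of the $k \times 1$ vector $b$. Writing the quadratic form as a double sum, $b'Ab = \sum_{i=1}^{k}\sum_{l=1}^{k} a_{il}\, b_i b_l$, and using the symmetry $a_{ij} = a_{ji}$, we get:

$$\frac{\partial\, b'Ab}{\partial b_j} = \sum_{l=1}^{k} a_{jl}\, b_l + \sum_{i=1}^{k} a_{ij}\, b_i = 2\sum_{i=1}^{k} a_{ji}\, b_i, \qquad j = 1, \dots, k$$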
Instead of stating every single equation, one can state the same using the more compact matrix notation:

$$\frac{\partial\, b'Ab}{\partial b} = 2Ab$$
Plugging in $X'X$ for $A$ gives:

$$\frac{\partial\, b'X'Xb}{\partial b} = 2X'Xb$$
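The identity $\partial\, b'X'Xb / \partial b = 2X'Xb$ can also be checked numerically. The sketch below is only a sanity check, not part of the proof: it compares the analytic gradient with a central finite-difference approximation. The matrix `X`, the vector `b`, the step size `eps` and all function names are illustrative assumptions, not taken from the original post.

```python
# Numerical sanity check of d(b'X'Xb)/db = 2X'Xb using only the standard library.
# X and b are arbitrary illustrative values, not data from the post.

def matvec(X, b):
    """Return the product Xb as a list."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, b)) for row in X]

def quadratic_form(X, b):
    """Return the scalar b'X'Xb, computed as the squared length of Xb."""
    return sum(v * v for v in matvec(X, b))

def analytic_gradient(X, b):
    """Return 2X'Xb as a list, one entry per element of b."""
    Xb = matvec(X, b)
    return [2.0 * sum(X[i][j] * Xb[i] for i in range(len(X)))
            for j in range(len(b))]

def numerical_gradient(X, b, eps=1e-6):
    """Approximate each partial derivative with a central difference."""
    grad = []
    for j in range(len(b)):
        up = list(b)
        down = list(b)
        up[j] += eps
        down[j] -= eps
        grad.append((quadratic_form(X, up) - quadratic_form(X, down)) / (2 * eps))
    return grad

X = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0]]
b = [0.7, -1.2]

analytic = analytic_gradient(X, b)
numeric = numerical_gradient(X, b)

# The two gradients should agree up to floating-point rounding.
gradients_match = all(abs(a - n) < 1e-6 for a, n in zip(analytic, numeric))
print(analytic, numeric, gradients_match)
```

Because $b'X'Xb$ is quadratic in $b$, the central difference has no truncation error, so any remaining gap between the two gradients comes from floating-point rounding alone.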
Now let’s return to the derivation of the least squares estimator.