The coefficient of determination, $R^2$, is probably the best-known key figure for evaluating the fit of an ordinary least squares (OLS) estimation. It can be derived from a simple variance decomposition. Although this sounds complicated, it can be explained in a few simple words. The first thing to remember is what OLS is all about. When using OLS, you try to explain a certain dependent variable ($y$) through independent variables ($X$). Our model is thereby able to explain some of the variation of the dependent variable ($y$). We can summarize this as follows:

$$y = \hat{y} + \hat{u}$$
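where $y$ is the dependent variable, which consists of a part we can explain, $\hat{y}$ (also known as the fitted value), and a part we cannot explain, $\hat{u}$ (also known as the residual, the estimated error term). From this decomposition it follows that the variance of $y$ ($Var(y)$) can be decomposed in a similar manner:

$$Var(y) = Var(\hat{y}) + Var(\hat{u}) + 2\,Cov(\hat{y}, \hat{u})$$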
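For readers who want the intermediate step, the variance decomposition can be written out by expanding the square. This is only a sketch: it assumes the $1/N$ convention for the variance and writes $\hat{y}_i$ for the fitted values and $\hat{u}_i$ for the residuals, with bars denoting sample means (notation assumed here).

```latex
\begin{aligned}
Var(y) &= \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2
        = \frac{1}{N}\sum_{i=1}^{N}\left[(\hat{y}_i - \bar{\hat{y}}) + (\hat{u}_i - \bar{\hat{u}})\right]^2 \\
       &= \frac{1}{N}\sum_{i=1}^{N}(\hat{y}_i - \bar{\hat{y}})^2
        + \frac{1}{N}\sum_{i=1}^{N}(\hat{u}_i - \bar{\hat{u}})^2
        + \frac{2}{N}\sum_{i=1}^{N}(\hat{y}_i - \bar{\hat{y}})(\hat{u}_i - \bar{\hat{u}}) \\
       &= Var(\hat{y}) + Var(\hat{u}) + 2\,Cov(\hat{y}, \hat{u})
\end{aligned}
```

The first line uses $y_i = \hat{y}_i + \hat{u}_i$ and, consequently, $\bar{y} = \bar{\hat{y}} + \bar{\hat{u}}$.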
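If our regression model contains a constant (which it usually does, typically denoted $\alpha$ or $\beta_0$), we know that the residuals average to zero, $\bar{\hat{u}} = 0$, and that they are uncorrelated with the fitted values. Given that $Cov(\hat{y}, \hat{u}) = 0$, the equation above boils down to:

$$Var(y) = Var(\hat{y}) + Var(\hat{u})$$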
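The role of the constant can be illustrated numerically. The following is a minimal sketch, assuming NumPy and purely synthetic data; the helper names `fit_ols` and `sample_cov`, the seed, and the coefficients are made up for illustration. It fits the same data once with and once without a constant and compares the mean of the residuals and their covariance with the fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
n = 200
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)  # synthetic data, illustrative only


def fit_ols(X, y):
    """Return fitted values and residuals of an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return fitted, y - fitted


def sample_cov(a, b):
    """Sample covariance using the 1/N convention."""
    return np.mean((a - a.mean()) * (b - b.mean()))


# Model with a constant: the design matrix has a column of ones.
X_const = np.column_stack([np.ones(n), x])
fitted_c, resid_c = fit_ols(X_const, y)

# Model without a constant: only the regressor itself.
X_noconst = x.reshape(-1, 1)
fitted_n, resid_n = fit_ols(X_noconst, y)

print("with constant:    mean(u) =", resid_c.mean(),
      " cov(y_hat, u) =", sample_cov(fitted_c, resid_c))
print("without constant: mean(u) =", resid_n.mean(),
      " cov(y_hat, u) =", sample_cov(fitted_n, resid_n))
```

With the constant, both quantities should be zero up to floating-point error. Without it, the residual mean is generally non-zero, so the simplified decomposition is no longer guaranteed.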
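Plugging in the formula for the variance, we get:

$$\frac{1}{N}\sum_{i=1}^{N}(y_i - \bar{y})^2 = \frac{1}{N}\sum_{i=1}^{N}(\hat{y}_i - \bar{\hat{y}})^2 + \frac{1}{N}\sum_{i=1}^{N}(\hat{u}_i - \bar{\hat{u}})^2$$

And as $\bar{\hat{u}} = 0$, which also implies $\bar{\hat{y}} = \bar{y}$, it follows that

$$\sum_{i=1}^{N}(y_i - \bar{y})^2 = \sum_{i=1}^{N}(\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{N}\hat{u}_i^2$$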
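where
$TSS = \sum_{i=1}^{N}(y_i - \bar{y})^2$ is the Total Sum of Squares,
$ESS = \sum_{i=1}^{N}(\hat{y}_i - \bar{y})^2$ is the Explained Sum of Squares,
$SSR = \sum_{i=1}^{N}\hat{u}_i^2$ is the Sum of Squared Residuals.
The coefficient of determination is consequently calculated as the ratio of the Explained Sum of Squares (ESS) to the Total Sum of Squares (TSS):

$$R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS}$$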
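The ratio of ESS to TSS can be computed by hand in a few lines. The following is a minimal sketch, assuming NumPy and synthetic data; the variable names, seed, and coefficients are illustrative and not taken from any particular data set. It fits an OLS model with a constant, builds the three sums of squares, and computes the coefficient of determination from them.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, only for reproducibility
n = 100
x = rng.normal(size=n)
y = 1.0 + 0.8 * x + rng.normal(size=n)  # synthetic data, illustrative only

# Design matrix with a constant (column of ones) and one regressor.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ beta   # fitted values
u_hat = y - y_hat  # residuals

tss = np.sum((y - y.mean()) ** 2)      # Total Sum of Squares
ess = np.sum((y_hat - y.mean()) ** 2)  # Explained Sum of Squares
ssr = np.sum(u_hat ** 2)               # Sum of Squared Residuals

r_squared = ess / tss

print("TSS       =", tss)
print("ESS + SSR =", ess + ssr)
print("R^2       =", r_squared)
print("1-SSR/TSS =", 1.0 - ssr / tss)
```

Because the model contains a constant, TSS and ESS + SSR should agree up to floating-point error, and so should the two ways of writing the ratio.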
Having seen how the coefficient of determination is calculated, what remains is to understand what it actually expresses. The equation above should already give some idea of what the coefficient of determination $R^2$ says: it shows what share of the variation of the dependent variable $y$ ($Var(y)$) can be explained by our model.

Another way of interpreting the coefficient of determination is to look at it as the squared Pearson correlation coefficient between the observed values $y_i$ and the fitted values $\hat{y}_i$. Why exactly this is the case is explained in another post.
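The correlation interpretation is easy to check numerically, even without the proof. The following is a minimal sketch, assuming NumPy, synthetic data, and an OLS model that includes a constant; all names and numbers are illustrative. It compares the coefficient of determination with the squared Pearson correlation between observed and fitted values.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed, only for reproducibility
n = 150

# Design matrix: a constant plus two synthetic regressors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta  # fitted values

# Coefficient of determination from the sums of squares.
r_squared = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Pearson correlation between observed and fitted values.
corr = np.corrcoef(y, y_hat)[0, 1]

print("R^2               =", r_squared)
print("corr(y, y_hat)^2  =", corr ** 2)
```

The two printed numbers should agree up to floating-point error. The equality relies on the constant being part of the model.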
Quick note. You dropped out the squared terms in the summations for ESS and TSS.
Thanks a lot, I'll fix it tomorrow! Cheers.
^2’s missing in the ‘Plugging in the formula of the Variance we get’-part
cheers
Thanks a lot for the comment. I corrected it! Cheers.
Absolutely lovely! Thank you for your post 🙂
I am glad you like it 🙂