Relationship between Coefficient of Determination & Squared Pearson Correlation Coefficient

The usual way of interpreting the coefficient of determination $R^{2}$ is to see it as the percentage of the variation of the dependent variable $y$ ($Var(y)$) that can be explained by our model. The exact interpretation and derivation of the coefficient of determination $R^{2}$ can be found here.

Another way of interpreting the coefficient of determination $R^{2}$ is to look at it as the Squared Pearson Correlation Coefficient between the observed values $y_{i}$ and the fitted values  $\hat{y}_{i}$. In this post we are going to prove that this is actually the case. For the proof we have to know the following (taken from OLS theory and general statistics):

• $y = \hat{y} + e$
• $Cov[\hat{y},e]=0$ (this holds for OLS with an intercept)
• $Cov[x,(y+Z)]=Cov(x,y)+Cov(x,Z)$
• $Var(x) = Cov(x,x)$
• $r_{y,\hat{y}}=\frac{Cov(y,\hat{y})}{\sqrt{Var(y)Var(\hat{y})}}$
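These identities can be checked numerically on a small OLS fit. The following is a minimal sketch with NumPy on simulated data (the variable names and the data-generating process are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(size=n)

# OLS fit with an intercept: y_hat = X @ beta_hat
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
e = y - y_hat                 # y = y_hat + e by construction

# Cov(y_hat, e) should be zero (up to floating-point error)
cov_yhat_e = np.cov(y_hat, e)[0, 1]
print(cov_yhat_e)
```

With an intercept in the model, the residuals are orthogonal to the fitted values, so the printed covariance is zero up to numerical precision.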

In the following we are going to see how to derive the coefficient of determination $R^{2}$ from the Squared Pearson Correlation Coefficient between the observed values $y_{i}$ and the fitted values $\hat{y}_{i}$.

$r^{2}_{y,\hat{y}}=\left(\frac{Cov(y,\hat{y})}{\sqrt{Var(y)Var(\hat{y})}}\right)^{2}$

$r^{2}_{y,\hat{y}}=\frac{Cov(y,\hat{y})}{\sqrt{Var(y)Var(\hat{y})}} \cdot \frac{Cov(y,\hat{y})}{\sqrt{Var(y)Var(\hat{y})}}$

$r^{2}_{y,\hat{y}}=\frac{Cov(y,\hat{y}) Cov(y,\hat{y})}{Var(y)Var(\hat{y}) }$

$r^{2}_{y,\hat{y}}=\frac{Cov(\hat{y}+e,\hat{y}) Cov(\hat{y}+e,\hat{y})}{Var(y)Var(\hat{y}) }$

$r^{2}_{y,\hat{y}}=\frac{\left(Cov(\hat{y},\hat{y})+ Cov(\hat{y},e) \right) \left(Cov(\hat{y},\hat{y})+ Cov(\hat{y},e) \right) }{Var(y)Var(\hat{y}) }$

$r^{2}_{y,\hat{y}}=\frac{Cov(\hat{y},\hat{y})Cov(\hat{y},\hat{y})}{Var(y)Var(\hat{y}) }$

$r^{2}_{y,\hat{y}}=\frac{Var(\hat{y}) Var(\hat{y})}{Var(y)Var(\hat{y}) }$

$r^{2}_{y,\hat{y}}=\frac{Var(\hat{y}) }{Var(y) }= \frac{ESS}{TSS} = R^{2}$

where $ESS$ denotes the explained sum of squares and $TSS$ the total sum of squares; the last step uses the fact that, for OLS with an intercept, the fitted values have the same mean as the observed values.

$r^{2}_{y,\hat{y}}= R^{2}$
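The equality can also be verified numerically. Below is a minimal sketch with NumPy on simulated data (variable names and the data-generating process are illustrative assumptions; the model includes an intercept, which the derivation relies on):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
y = 1.0 - 2.0 * x + rng.normal(size=n)

# OLS fit with an intercept
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat

# R^2 as ESS/TSS, i.e. Var(y_hat)/Var(y)
r2_def = np.var(y_hat) / np.var(y)

# squared Pearson correlation between observed and fitted values
r2_corr = np.corrcoef(y, y_hat)[0, 1] ** 2

print(r2_def, r2_corr)
```

The two printed numbers agree up to floating-point precision. For a model without an intercept the decomposition, and hence the equality, can fail.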

This entry was posted in Econometrics, Statistics.

10 Responses to Relationship between Coefficient of Determination & Squared Pearson Correlation Coefficient

1. Immanuel says:

Thank you for this!

2. mark leeds says:

Hi Isidore: Do you know if the relation between $R^{2}$ and $r^{2}$ holds for the regression model with MA(1) errors? Empirically I seem to find that it doesn't hold, but I wanted to make sure that my code didn't have a bug. Thanks for any wisdom and your blog.

• Hi Mark! That is actually a very good question which unfortunately I cannot answer out of the box. However, once I have some time I will look into it. So far my feeling is that the second of the five bullet points (listed in the post), i.e. the covariance between the fitted values and the error term being equal to zero, is most likely violated. Generally I think that if you are able to show that all five bullet points hold for an MA(1) process, the relationship between $r^{2}$ and $R^{2}$ should hold as well.

Let me know if you find an answer to the question.
Cheers!

3. mark leeds says:

Thanks Isidore: What you point out is equivalent to the sum-of-squares decomposition, SSTOT = SSREG + SSE, being true. So I think I should look for info on when that decomposition holds in general. $Cov(\hat{y}, e)$ not being zero makes it not true, so your point is a good one. Thanks, and I'll let you know if I find anything out about it.
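The decomposition mentioned here can be checked directly for a plain OLS fit with an intercept. A minimal sketch with NumPy on simulated data (variable names and data are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 150
x = rng.normal(size=n)
y = 0.5 + x + rng.normal(size=n)

# OLS fit with an intercept
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
e = y - y_hat

ss_tot = np.sum((y - y.mean()) ** 2)      # total sum of squares
ss_reg = np.sum((y_hat - y.mean()) ** 2)  # regression (explained) sum of squares
ss_err = np.sum(e ** 2)                   # residual sum of squares

print(ss_tot, ss_reg + ss_err)
```

The two printed totals match because $Cov(\hat{y}, e)=0$ here; for a model where that covariance is nonzero, the decomposition breaks down.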

4. student says:

How did you get from covar(x,y) to covar(y’,y)?