Relationship between Coefficient of Determination & Squared Pearson Correlation Coefficient

The usual way of interpreting the coefficient of determination R^{2} is to see it as the percentage of the variation of the dependent variable y (Var(y)) that can be explained by our model. The exact interpretation and derivation of the coefficient of determination R^{2} can be found here.

Another way of interpreting the coefficient of determination R^{2} is to look at it as the squared Pearson correlation coefficient between the observed values y_{i} and the fitted values \hat{y}_{i}. In this post we are going to prove that this is actually the case. For the proof we need the following facts (taken from OLS theory and general statistics):

  • y = \hat{y} + e
  • Cov[\hat{y},e]=0
  • Cov[x,(y+Z)]=Cov(x,y)+Cov(x,Z)
  • Var(x) = Cov(x,x)
  • Var(x) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2
  • r_{y,\hat{y}}=\frac{Cov(y,\hat{y})}{\sqrt[2]{Var(y)Var(\hat{y}) }}
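The facts listed above can be checked numerically. The following is a minimal sketch, assuming a simple OLS regression with an intercept fitted to made-up data; the helper names (mean, cov), the seed and the data-generating values are illustrative assumptions and not part of the original post.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    # Population covariance (1/n convention, matching Var(x) in the list above).
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

random.seed(0)  # arbitrary seed, only for reproducibility
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
z = [random.gauss(0, 1) for _ in range(n)]
y = [1.5 + 2.0 * xi + random.gauss(0, 1) for xi in x]  # made-up data

# Simple OLS with an intercept (assumption: one regressor).
slope = cov(x, y) / cov(x, x)
intercept = mean(y) - slope * mean(x)
y_hat = [intercept + slope * xi for xi in x]
e = [yi - fi for yi, fi in zip(y, y_hat)]  # residuals, so y = y_hat + e

# Cov[y_hat, e] should be zero up to floating-point rounding.
cov_fitted_resid = cov(y_hat, e)

# Cov[x, y + z] - (Cov(x, y) + Cov(x, z)) should be zero up to rounding.
y_plus_z = [yi + zi for yi, zi in zip(y, z)]
additivity_gap = cov(x, y_plus_z) - (cov(x, y) + cov(x, z))

# Cov(x, x) should match the 1/n variance formula.
var_gap = cov(x, x) - sum((xi - mean(x)) ** 2 for xi in x) / n

print(cov_fitted_resid, additivity_gap, var_gap)
```

All three printed values should be zero up to floating-point rounding. Note that Cov[\hat{y},e]=0 is a property of the OLS fit itself, so it holds here by construction.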

In the following we are going to see how to derive the coefficient of determination R^{2} from the squared Pearson correlation coefficient between the observed values y_{i} and the fitted values \hat{y}_{i}.

r^{2}_{y,\hat{y}}=\left(\frac{Cov(y,\hat{y})}{\sqrt[2]{Var(y)Var(\hat{y}) }}\right)^{2}

r^{2}_{y,\hat{y}}=\frac{Cov(y,\hat{y})}{\sqrt[2]{Var(y)Var(\hat{y}) }} \frac{Cov(y,\hat{y})}{\sqrt[2]{Var(y)Var(\hat{y}) }}

r^{2}_{y,\hat{y}}=\frac{Cov(y,\hat{y}) Cov(y,\hat{y})}{Var(y)Var(\hat{y}) }

r^{2}_{y,\hat{y}}=\frac{Cov(\hat{y}+e,\hat{y}) Cov(\hat{y}+e,\hat{y})}{Var(y)Var(\hat{y}) }

r^{2}_{y,\hat{y}}=\frac{\left(Cov(\hat{y},\hat{y})+ Cov(\hat{y},e) \right) \left(Cov(\hat{y},\hat{y})+ Cov(\hat{y},e) \right) }{Var(y)Var(\hat{y}) }

r^{2}_{y,\hat{y}}=\frac{Cov(\hat{y},\hat{y})Cov(\hat{y},\hat{y})}{Var(y)Var(\hat{y}) }

r^{2}_{y,\hat{y}}=\frac{Var(\hat{y}) Var(\hat{y})}{Var(y)Var(\hat{y}) }

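r^{2}_{y,\hat{y}}=\frac{Var(\hat{y}) }{Var(y) }= \frac{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}{\frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2} = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = \frac{ESS}{TSS} = R^{2}

The step to ESS uses \bar{\hat{y}} = \bar{y}, which holds because the OLS residuals of a model with an intercept have mean zero. The numerator is therefore the explained sum of squares \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2.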

r^{2}_{y,\hat{y}}= R^{2}
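The result can also be checked numerically. The sketch below is only an illustration under stated assumptions: a simple OLS regression with an intercept, made-up data, and hypothetical helper and variable names (mean, cov, r_squared, R2, R2_alt) that do not appear in the original post.

```python
import math
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    # Population covariance with the 1/n convention used in the post.
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

random.seed(1)  # arbitrary seed, only for reproducibility
n = 500
x = [random.uniform(0, 10) for _ in range(n)]
y = [3.0 - 0.7 * xi + random.gauss(0, 2) for xi in x]  # made-up data

# Simple OLS with an intercept (assumption: one regressor).
slope = cov(x, y) / cov(x, x)
intercept = mean(y) - slope * mean(x)
y_hat = [intercept + slope * xi for xi in x]

# Squared Pearson correlation between observed and fitted values.
r = cov(y, y_hat) / math.sqrt(cov(y, y) * cov(y_hat, y_hat))
r_squared = r ** 2

# Coefficient of determination from the sums of squares.
y_bar = mean(y)
ess = sum((fi - y_bar) ** 2 for fi in y_hat)             # explained sum of squares
tss = sum((yi - y_bar) ** 2 for yi in y)                 # total sum of squares
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))    # residual sum of squares
R2 = ess / tss
R2_alt = 1 - rss / tss  # equivalent form when TSS = ESS + RSS

print(r_squared, R2, R2_alt)
```

The three printed numbers should agree up to floating-point rounding. The agreement relies on the fit being OLS with an intercept; for a fit without an intercept, or one obtained by a different estimator, Cov[\hat{y},e]=0 is not guaranteed and the values can differ.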


20 thoughts on “Relationship between Coefficient of Determination & Squared Pearson Correlation Coefficient”

  1. Hi Isidore: Do you know if the relation between the correlation coefficient R and r holds for the regression model with MA(1) errors? Empirically I seem to find that it doesn’t hold, but I wanted to make sure that my code didn’t have a bug. Thanks for any wisdom and your blog.

    1. Hi Mark! That is actually a very good question which unfortunately I cannot answer out of the box. However, once I have some time I will look into it. So far my feeling is that the second of the bullet points listed in the post, i.e. the covariance between the fitted values and the error term being equal to zero, is most likely violated. Generally I think if you are able to show that all the bullet points hold for an MA(1) process, the relationship between R^{2} and the correlation coefficient should hold as well.

      Let me know if you find an answer to the question.

  2. Thanks Isidore: What you pointed out is equivalent to the sums of squares decomposition relation, SSTOT = SSREG + SSE, being true. So I think I should look for info on when that decomposition holds in general. cov(y hat, e) not being zero makes it not true, so your point is a good one. Thanks and I’ll let you know if I find anything out about it.

    1. Thank you for your comment. I am sorry, but I cannot really help you as I do not understand which equation you are referring to. If you could be more specific I might be able to help.

  3. Thanks a lot for this. Maybe, one possible small typo is: ESS/TSS should be RSS/TSS? This is true as the mean of y^hat is equal to the mean of y, as the mean of e_i is zero.

    1. Thank you for your comment. ESS stands for “Explained Sum of Squares”, whereas RSS stands for the “Residual Sum of Squares”. Hence ESS/TSS is correct.

      Best, ad

  4. Very interesting content – thanks!

    For some reason I fail to see the intuition behind the second last line. Why can var(y^) be said to be equal to ESS?
    Does this have to do with the assumptions behind the y^ line? Couldn’t there be variance in this regression line which doesn’t contribute to explaining the variance in y?

    I hope my question makes sense

    1. Hi, this is a very good question. I was too brief on this point. I will adjust the post so that it becomes clearer. The short answer is: plug in the variance equation twice and the 1/n cancels out. What remains is the ratio of two sums of squared deviations, ESS over TSS.

      Thanks for this comment.

  5. Thank you. I think there are some missing squares on the second last line, no? var(y) = sum((yi-mean(y))^2)/n, and the same thing for the estimates.
