The Coefficient Of Determination or R2

The coefficient of determination R^{2} shows how much of the variation of the dependent variable y (Var(y)) can be explained by our model.

The coefficient of determination R^{2} is probably the best-known summary statistic for evaluating the fit of an ordinary least squares (OLS) estimation. It can be derived from a simple variance decomposition. Although that sounds complicated, it can be explained in a few simple words. The first thing to remember is what OLS is actually about: we try to explain a dependent variable (y) through independent variables (x). Our model is thereby able to explain some of the variation of the dependent variable (y). We can summarize this as follows:

y_{i} = \hat{y}_{i} + e_{i}

where y_{i} is the dependent variable, which consists of a part we can explain, \hat{y}_{i} (also known as the fitted value), and a part we cannot explain, e_{i} (also known as the error term or residual). From this decomposition it follows that the variance of y (Var(y)) can be decomposed in a similar manner:

Var(y) =Var( \hat{y}) + Var(e) + 2Cov(\hat{y},e)

If our regression model contains a constant (usually denoted \beta_{0} or \alpha), as it usually does, we know that Cov(\hat{y},e)=0. Given that Cov(\hat{y},e)=0, the equation above boils down to:

Var(y) =Var( \hat{y}) + Var(e)
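This decomposition is easy to verify numerically. The sketch below (using made-up data, not from the post) fits an OLS regression with a constant via NumPy's least-squares solver and checks that the variances add up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: y depends linearly on x plus noise.
x = rng.normal(size=200)
y = 2.0 + 3.0 * x + rng.normal(size=200)

# OLS with a constant: design matrix [1, x].
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta      # fitted values
e = y - y_hat         # residuals

# With a constant in the model, Cov(y_hat, e) = 0,
# so Var(y) = Var(y_hat) + Var(e) up to floating-point error.
print(np.var(y))
print(np.var(y_hat) + np.var(e))
```

Including the constant is what guarantees Cov(\hat{y},e)=0; dropping it breaks the decomposition.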

Plugging in the formula of the Variance we get:

\frac{1}{N}\sum ( y_{i} - \bar{y})^{2}=\frac{1}{N} \sum ( \hat{y}_{i} - \bar{\hat{y}})^{2} + \frac{1}{N} \sum (e_{i} - \bar{e})^{2}

Multiplying both sides by N, and as \bar{e}=0, it follows that

\sum ( y_{i} - \bar{y})^{2}=\sum ( \hat{y}_{i} - \bar{\hat{y}})^{2} + \sum (e^{2}_{i})

Where

\sum ( y_{i} - \bar{y})^{2} = TSS = Total Sum of Squares

\sum ( \hat{y}_{i} - \bar{\hat{y}})^{2} = ESS = Explained Sum of Squares

\sum (e^{2}_{i}) = SSR = Sum of Squared Residuals

The coefficient of determination R^{2} is consequently calculated as the ratio of the Explained Sum of Squares (ESS) to the Total Sum of Squares (TSS):

R^{2} = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS} = 1 - \frac{\sum (e^{2}_{i}) }{\sum ( y_{i} - \bar{y})^{2} }
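Both formulas give the same number when the model contains a constant. A minimal sketch, again on made-up data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 1.0 + 0.5 * x + rng.normal(size=100)

# OLS with a constant.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta
e = y - y_hat

tss = np.sum((y - y.mean()) ** 2)          # Total Sum of Squares
ess = np.sum((y_hat - y_hat.mean()) ** 2)  # Explained Sum of Squares
ssr = np.sum(e ** 2)                       # Sum of Squared Residuals

r2_ratio = ess / tss       # R^2 = ESS / TSS
r2_resid = 1 - ssr / tss   # R^2 = 1 - SSR / TSS
print(r2_ratio, r2_resid)  # both formulas agree
```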

Having seen how the coefficient of determination R^{2} is calculated, what remains is to understand what it actually expresses. The equation above should already give some idea: R^{2} shows how much of the variation of the dependent variable y (Var(y)) can be explained by our model.

Another way of interpreting the coefficient of determination R^{2} is to look at it as the squared Pearson correlation coefficient between the observed values y_{i} and the fitted values \hat{y}_{i}. Why this is the case exactly can be found in another post.
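The equivalence is also easy to check numerically. The sketch below (illustrative data, assuming a model with a constant) compares R^{2} with the squared correlation between y and \hat{y}:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=150)
y = -1.0 + 2.0 * x + rng.normal(size=150)

# OLS with a constant.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

# R^2 via 1 - SSR/TSS.
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Squared Pearson correlation between observed and fitted values.
corr = np.corrcoef(y, y_hat)[0, 1]
print(r2, corr ** 2)  # equal up to floating-point error
```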


