Omitted Variable Bias: Conclusion

This post recaps the previous posts on the omitted variable bias (Introduction, Explanation, In-depth discussion of the bias, Consequences of the omitted variable bias) and concludes with some general advice. If you haven’t read the previous posts, you might want to start from the beginning in order to fully understand the issues related to the omitted variable bias.

All in all, the omitted variable bias is a severe problem. Neglecting a relevant variable that is correlated with the included explanatory variables leads to biased and inconsistent estimates. Hence, as general advice, when you are working with linear regression models, you should pay close attention to potentially omitted variables. In particular, you should ask yourself the following questions:

1. What variables might potentially impact the dependent variable but are not (yet) included in the model?
2. Out of the variables identified in question one, what variables are likely to be correlated with other explanatory variables included in the model?
3. For those omitted variables that are likely to be correlated with the dependent variable and at least one other explanatory variable, what is the expected sign of each correlation (positive or negative)?
4. Based on the signs of these correlations, in which direction (upward or downward) are the estimates biased?
5. Finally, how large is the bias likely to be? Could it be strong enough to change the conclusions you draw from your regression?
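
Questions 3 to 5 can be explored on simulated data. The following is a minimal sketch rather than a definitive implementation: the function names `ols` and `simulate_ovb`, the parameter values and the data-generating process (two regressors, normal errors) are all assumptions chosen for illustration. It relies on the standard result that the coefficient on x1 in the regression without x2 equals the coefficient from the full regression plus the coefficient on x2 times the slope of an auxiliary regression of x2 on x1.

```python
import numpy as np


def ols(X, y):
    """Return OLS coefficients of y on X (X must already contain a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate_ovb(beta1=1.0, beta2=2.0, delta=0.5, n=20_000, seed=0):
    """Compare the estimate of beta1 with and without the variable x2.

    All parameter names and defaults are hypothetical, for illustration only.
    delta controls how strongly the omitted variable x2 moves with x1.
    """
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = delta * x1 + rng.normal(size=n)  # omitted variable, correlated with x1
    y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)

    const = np.ones(n)
    long_coef = ols(np.column_stack([const, x1, x2]), y)  # correctly specified model
    short_coef = ols(np.column_stack([const, x1]), y)     # x2 omitted
    aux_coef = ols(np.column_stack([const, x1]), x2)      # auxiliary regression of x2 on x1

    return {
        "long_b1": long_coef[1],
        "short_b1": short_coef[1],
        # coefficient on x2 times the slope of x2 on x1
        "predicted_bias": long_coef[2] * aux_coef[1],
    }


if __name__ == "__main__":
    for d in (0.5, -0.5):
        result = simulate_ovb(delta=d)
        print(d, result)
```

With a positive `beta2` and a positive `delta`, the estimate from the short regression should lie above the one from the long regression (upward bias); flipping the sign of either one should flip the direction of the bias. The size of `predicted_bias` relative to `long_b1` gives a rough feel for question 5.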

Generally, with time (and experience) it will become easier to determine which variables are relevant and which are not.

Finally, from the previous posts, you should know that omitting a variable biases the estimates but also decreases their variance. In certain cases, one might want to weigh one against the other, i.e. the increase in bias versus the decrease in variance. That is, sometimes it can be better to have biased but precise estimates rather than unbiased but imprecise ones. In short, leaving a variable out lowers the variance at the cost of more bias, while including it removes the bias at the cost of a higher variance.
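
To make this trade-off concrete, here is a small Monte Carlo sketch. It is only an illustration under assumed settings: the function name `compare_estimators`, the sample size, the correlation `rho` between the two regressors and the coefficient values are hypothetical choices, not taken from the previous posts. It repeatedly draws a small sample, estimates the coefficient on x1 with and without x2, and summarizes bias, variance and mean squared error of both estimators.

```python
import numpy as np


def compare_estimators(beta1=1.0, beta2=0.2, rho=0.9, n=30, reps=2000, seed=0):
    """Monte Carlo comparison of the short (x2 omitted) and long (x2 included) model.

    All names and default values are assumptions made for this illustration.
    """
    rng = np.random.default_rng(seed)
    short_est = np.empty(reps)
    long_est = np.empty(reps)

    for r in range(reps):
        x1 = rng.normal(size=n)
        # x2 has unit variance and correlation rho with x1
        x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
        y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)

        const = np.ones(n)
        X_long = np.column_stack([const, x1, x2])
        X_short = np.column_stack([const, x1])
        long_est[r] = np.linalg.lstsq(X_long, y, rcond=None)[0][1]
        short_est[r] = np.linalg.lstsq(X_short, y, rcond=None)[0][1]

    def summarize(est):
        bias = est.mean() - beta1
        variance = est.var()
        # mean squared error decomposes into squared bias plus variance
        return {"bias": bias, "variance": variance, "mse": bias**2 + variance}

    return {"short": summarize(short_est), "long": summarize(long_est)}


if __name__ == "__main__":
    for name, stats in compare_estimators().items():
        print(name, stats)
```

Under these assumed settings (small sample, highly correlated regressors, small `beta2`), the short model should show a clearly nonzero bias but a smaller variance, and its mean squared error can come out lower than that of the long model. With a larger `beta2` or a larger sample, for example `compare_estimators(beta2=2.0, n=200)`, the bias should dominate and the correctly specified model should be preferable.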