# Omitted Variable Bias: What can we do about it?

Dealing with omitted variable bias is not easy, but there are several things you can try.

First, if the required data is available, you can include additional explanatory variables in the regression model. This has implications that you need to consider carefully. For one, you need enough data points to support the additional explanatory variables, or else you will not be able to estimate your model. For another, depending on how many extra variables you include, the problems that come with including unnecessary variables may arise and start to seriously influence your estimates. Still, additional relevant explanatory variables can help to mitigate omitted variable bias.

But what do we mean by relevant explanatory variables? You should include only those explanatory variables that control for confounding, not every variable that explains the dependent variable in some way. More precisely, if the objective is to identify the total effect of an explanatory variable, you need to include the variables that block confounding paths and avoid including variables that open additional confounding paths or that mediate the effect you are trying to measure. Technically, you should include a set of variables that satisfies the backdoor criterion. In other words, do not simply add all possible predictors of your dependent variable to your regression model. Doing so can itself bias your results.
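
The difference between controlling for a confounder and controlling for a mediator can be illustrated with a small simulation. The following is only a sketch on synthetic data: the variable names (`z`, `x`, `m`, `y`), the coefficients, and the helper `ols` are assumptions made for this illustration and do not come from a real dataset.

```python
import numpy as np

def ols(y, *regressors):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (all coefficients are assumptions).
z = rng.normal(size=n)                # confounder: affects both x and y
x = 0.8 * z + rng.normal(size=n)      # explanatory variable of interest
m = 0.5 * x + rng.normal(size=n)      # mediator: lies on the path x -> m -> y
y = 1.0 * x + 2.0 * m + 1.5 * z + rng.normal(size=n)

# Total effect of x on y: direct 1.0 plus indirect 0.5 * 2.0, i.e. 2.0.
b_omit = ols(y, x)[1]        # omits the confounder: biased upward here
b_conf = ols(y, x, z)[1]     # controls for the confounder: close to 2.0
b_med = ols(y, x, z, m)[1]   # also controls for the mediator: close to 1.0

print(f"omitting z:        {b_omit:.2f}")
print(f"controlling z:     {b_conf:.2f}")
print(f"controlling z, m:  {b_med:.2f}")
```

In this setup, adding the confounder `z` recovers the total effect, whereas additionally adding the mediator `m` leaves only the direct effect, which is not the quantity of interest when the total effect is the target.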

Second, if you think that a variable is important and that leaving it out of your regression model could cause omitted variable bias, but you do not have data for it, you can look for a proxy or an instrumental variable. For instance, in the car price example that we discussed earlier, the omitted variable was the age of the car. Suppose you do not have data on the age of the car, but you do know how long the last owner possessed it. In that case, the time the car spent with its last owner can serve as a proxy for its age. Note, however, that proxies and instrumental variables come with a whole set of additional assumptions and problems. Most of these assumptions are quite demanding and not easily met.
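
To get a feeling for what a proxy can and cannot do, here is a minimal simulation in the spirit of the car price example. It is a sketch under strong assumptions: the data are synthetic, mileage is assumed to be the explanatory variable of interest (the example above does not say so), and all numbers as well as the names `years_owned` and `ols` are hypothetical.

```python
import numpy as np

def ols(y, *regressors):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical data-generating process (all numbers are assumptions).
age = rng.uniform(0, 10, size=n)                          # omitted variable
mileage = 15 * age + rng.normal(scale=20, size=n)         # correlated with age
years_owned = 0.6 * age + rng.normal(scale=0.5, size=n)   # noisy proxy for age
price = 40 - 0.05 * mileage - 1.5 * age + rng.normal(scale=2, size=n)

true_effect = -0.05  # assumed effect of mileage on price

b_omit = ols(price, mileage)[1]                # age omitted
b_proxy = ols(price, mileage, years_owned)[1]  # proxy used instead of age
b_full = ols(price, mileage, age)[1]           # age observed (benchmark)

for label, b in [("omitted", b_omit), ("proxy", b_proxy), ("full", b_full)]:
    print(f"{label:8s} estimate = {b:.3f}, bias = {b - true_effect:+.3f}")
```

In this sketch the proxy reduces the bias but does not remove it, because `years_owned` measures age only with noise. How well a proxy works in practice depends on assumptions that a simulation like this simply builds in.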

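Third, if you cannot resolve the omitted variable bias, you can at least try to predict the direction in which your estimates are biased. With a single omitted variable, the direction depends on two things: the sign of the omitted variable's effect on the dependent variable, and the sign of its correlation with the included explanatory variable.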
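
For the simplest case, a regression with one included and one omitted explanatory variable, the bias of the slope equals the omitted variable's effect on the dependent variable multiplied by cov(x, z) / var(x). The sketch below turns this into a sign rule and checks it against simulated data. The function names and all numbers are assumptions made for this illustration, and the rule does not carry over unchanged to models with several included regressors.

```python
import numpy as np

def expected_bias_sign(effect_of_omitted_on_y, corr_omitted_with_x):
    """Sign of the bias on x's slope when a single variable z is omitted."""
    return int(np.sign(effect_of_omitted_on_y) * np.sign(corr_omitted_with_x))

def simulated_bias(beta_z, rho, n=100_000, seed=0):
    """Bias of the slope from regressing y on x alone (true slope is 1.0)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                                  # omitted variable
    x = rho * z + np.sqrt(1 - rho**2) * rng.normal(size=n)  # corr(x, z) = rho
    y = 1.0 * x + beta_z * z + rng.normal(size=n)
    slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    return slope - 1.0

# Check all four sign combinations (hypothetical values).
for beta_z, rho in [(2.0, 0.5), (2.0, -0.5), (-2.0, 0.5), (-2.0, -0.5)]:
    predicted = expected_bias_sign(beta_z, rho)
    observed = simulated_bias(beta_z, rho)
    print(f"beta_z={beta_z:+.1f}, rho={rho:+.1f}: "
          f"predicted sign {predicted:+d}, simulated bias {observed:+.2f}")
```

If the omitted variable's effect and its correlation with the included variable have the same sign, the estimate is biased upward; if the signs differ, it is biased downward.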