# Violation of CLRM – Assumption 4.2: Consequences of Heteroscedasticity

Violating assumption 4.2, i.e. $\sigma_{i}^{2} \neq \sigma_{j}^{2} \text{ for } i \neq j$, leads to heteroscedasticity. Recall that under heteroscedasticity the OLS estimator still delivers unbiased and consistent coefficient estimates, but the conventional estimator of their standard errors is biased, so t-tests and confidence intervals based on it are unreliable. Increasing the number of observations does not solve this problem.
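The reason can be sketched in matrix notation. The symbols here are assumptions introduced for illustration: the model is $y = X\beta + u$ with $\operatorname{Var}(u \mid X) = \Omega = \operatorname{diag}(\sigma_{1}^{2}, \dots, \sigma_{n}^{2})$. The variance of the OLS estimator is then

$$
\operatorname{Var}(\hat{\beta} \mid X) = (X'X)^{-1} X' \Omega X (X'X)^{-1},
$$

which reduces to the conventional formula $\sigma^{2} (X'X)^{-1}$ only if $\Omega = \sigma^{2} I$, i.e. under homoscedasticity. Conventional standard errors are based on the second expression, so under heteroscedasticity they estimate the wrong quantity regardless of the sample size.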

Fortunately, several ways exist to deal with heteroscedasticity:

Heteroscedasticity may result from improper model specification. Typical sources of this kind include subgroup differences, non-linear effects of variables, and omitted variables. These issues need to be dealt with before applying other techniques.

“Robust” standard errors are a technique for obtaining consistent standard errors of OLS coefficients under heteroscedasticity. In the literature, “robust” standard errors are also referred to as White’s standard errors, Huber–White standard errors, Eicker–White standard errors, Eicker–Huber–White standard errors, or the sandwich estimator of variance. “Robust” standard errors are usually larger than conventional standard errors, although this need not always be the case. You can find more information on robust standard errors, including how they are implemented in Stata and R, here.
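As an illustration, the following is a minimal sketch of the simplest robust variant (often labelled HC0) for a regression with one regressor and an intercept, written in plain Python. The function names and the toy data are hypothetical and only serve the example; statistical packages typically apply additional small-sample corrections, so their output can differ slightly.

```python
import math

def ols_simple(x, y):
    """Fit y = a + b*x by OLS and return (a, b, residuals)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    return a, b, resid

def slope_standard_errors(x, y):
    """Return (conventional SE, HC0 robust SE) of the OLS slope."""
    n = len(x)
    _, _, resid = ols_simple(x, y)
    x_bar = sum(x) / n
    dev = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dev)
    # Conventional: one common error variance s^2 for all observations.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_conv = math.sqrt(s2 / sxx)
    # HC0: each squared residual stands in for its own error variance.
    se_hc0 = math.sqrt(sum(d * d * e * e for d, e in zip(dev, resid))) / sxx
    return se_conv, se_hc0

# Hypothetical toy data in which the noise grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.3, 7.6, 10.8, 11.1, 15.2, 14.6]
se_conv, se_hc0 = slope_standard_errors(x, y)
print("conventional SE:", se_conv)
print("robust (HC0) SE:", se_hc0)
```

The coefficient estimates are the same in both cases; only the estimated standard error of the slope changes.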

Clustered standard errors are a further method for dealing with heteroscedastic data. You should use clustered standard errors if the error covariance structure differs across groups of observations in your data. For clustered standard errors to make sense, these groups need to be defined by an identifiable characteristic, the cluster. The standard cluster-robust estimator allows errors to be heteroscedastic and correlated within a cluster, but it assumes that errors are uncorrelated across clusters.
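A minimal sketch of a cluster-robust standard error for the slope of a regression with one regressor and an intercept is given below. The function name, the toy data, and the cluster labels are hypothetical, and no small-sample correction is applied, so statistical packages may report slightly different values.

```python
import math
from collections import defaultdict

def cluster_robust_slope_se(x, y, cluster):
    """Cluster-robust SE of the OLS slope in y = a + b*x (no small-sample correction)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dev = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dev)
    b = sum(d * (yi - y_bar) for d, yi in zip(dev, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Sum (x_i - x_bar) * e_i within each cluster first, then square the sums.
    scores = defaultdict(float)
    for d, e, g in zip(dev, resid, cluster):
        scores[g] += d * e
    return math.sqrt(sum(s * s for s in scores.values())) / sxx

# Hypothetical toy data: eight observations in three clusters.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.3, 7.6, 10.8, 11.1, 15.2, 14.6]
cluster = ["A", "A", "A", "B", "B", "B", "C", "C"]
print("cluster-robust SE:", cluster_robust_slope_se(x, y, cluster))
```

If every observation forms its own cluster, this expression coincides with the heteroscedasticity-robust (HC0) standard error.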

4. Weighted Least Squares

Another option for dealing with heteroscedasticity is weighted least squares (WLS). Although weighted least squares appears more difficult to use, it can be superior to OLS when applied correctly. More generally, Generalized Least Squares (GLS), of which WLS is a special case, yields estimators that are BLUE when heteroscedasticity or serial correlation is present, provided the error covariance structure is known.
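The idea can be sketched for a regression with one regressor and an intercept: each observation is weighted by the inverse of its error variance, so noisier observations count less. The function name and the toy data are hypothetical, and the variance pattern (error variance proportional to $x^{2}$) is an assumption made only for this example; in practice the weights have to be known or estimated.

```python
def wls_simple(x, y, w):
    """Fit y = a + b*x by minimising sum_i w_i * (y_i - a - b*x_i)^2."""
    sw = sum(w)
    x_w = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    y_w = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    num = sum(wi * (xi - x_w) * (yi - y_w) for wi, xi, yi in zip(w, x, y))
    den = sum(wi * (xi - x_w) ** 2 for wi, xi in zip(w, x))
    b = num / den
    a = y_w - b * x_w
    return a, b

# Hypothetical toy data; assumed error variance proportional to x^2,
# so each observation gets the weight 1 / x^2.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.3, 7.6, 10.8, 11.1, 15.2, 14.6]
w = [1 / xi ** 2 for xi in x]
a_wls, b_wls = wls_simple(x, y, w)
print("WLS intercept:", a_wls)
print("WLS slope:", b_wls)
```

With equal weights the function reproduces the OLS estimates, which shows that OLS is the special case of WLS in which all error variances are the same.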