Clustered standard errors are a way to obtain consistent standard errors of OLS coefficients under a specific kind of heteroscedasticity. Recall that the presence of heteroscedasticity violates the Gauss-Markov assumptions that are necessary to render OLS the best linear unbiased estimator (BLUE).
The estimation of clustered standard errors is justified if there are several different covariance structures within your data sample that vary by a certain characteristic, a "cluster". The example in this article further assumes that the errors are homoskedastic within each cluster, although the estimator itself does not require this. In this case clustered standard errors provide consistent standard error estimates.
Intuition of Clustered Standard Errors
The classic example used in the literature to explain clustered standard errors involves student test scores from classes in different schools around a country. Let's say you have a panel data set with test scores from different classes in different schools around your country, and you are interested in the influence of class size on the test score. For instance, you want to test the hypothesis that smaller classes improve test scores. In your data, the test score varies at the student level, whereas class size varies only at the class level. In this case student test scores within a class are not independent: a class might have a better teacher or a better classroom community that provides a better learning environment.
Regressing student test scores on class size leaves you with errors that are heteroscedastic, as their variance depends on the class. However, within each class (a class represents the cluster in this example) the errors are homoscedastic.
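To make the dependence within a class concrete, here is a minimal simulation sketch. It assumes that each student's error is the sum of a shock shared by the whole class (the "better teacher" effect) and an individual shock. The sample dimensions, the standard deviations, and the function name intraclass_correlation are hypothetical choices for illustration, not values from any real data set.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

N_CLASSES, CLASS_SIZE = 200, 20   # hypothetical sample dimensions
CLASS_SD, STUDENT_SD = 5.0, 5.0   # hypothetical spread of class and student shocks

# Each student's error = shock shared by the whole class + individual shock.
classes = []
for _ in range(N_CLASSES):
    class_shock = random.gauss(0.0, CLASS_SD)
    classes.append([class_shock + random.gauss(0.0, STUDENT_SD)
                    for _ in range(CLASS_SIZE)])


def intraclass_correlation(groups):
    """One-way ANOVA estimate of the intraclass correlation (equal group sizes)."""
    n = len(groups[0])                      # students per class
    j = len(groups)                         # number of classes
    grand = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]
    # Mean square between classes and mean square within classes.
    msb = n * sum((m - grand) ** 2 for m in group_means) / (j - 1)
    msw = sum((v - m) ** 2
              for g, m in zip(groups, group_means) for v in g) / (j * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)


icc = intraclass_correlation(classes)
print(f"estimated within-class correlation of errors: {icc:.2f}")
```

With equal class and student standard deviations, the population correlation between two students of the same class is 0.5, so the estimate should land in that neighbourhood. Errors of students in different classes remain uncorrelated by construction.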
Implementation of Clustered Standard Errors
In order to account for different covariance structures within your data that vary by a cluster, you want to relax the Gauss-Markov homoskedasticity assumption. Similar to heteroskedasticity-robust standard errors, you want to allow more flexibility in your variance-covariance (VCV) matrix. Recall that the diagonal elements of the VCV matrix are the squared standard errors of your estimated coefficients. The way to accomplish this is by using clustered standard errors. The formulation is as follows:
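$$\hat{V}_{cluster}(\hat{\beta}) = \frac{M}{M-1}\,\frac{N-1}{N-K}\,(X'X)^{-1}\left(\sum_{j=1}^{M} X_j'\,\hat{e}_j\,\hat{e}_j'\,X_j\right)(X'X)^{-1}$$
where $M$ is the number of unique clusters (e.g. number of classes), $N$ the number of observations, and $K$ the number of regressors (including the intercept); $X_j$ and $\hat{e}_j$ collect the regressors and the OLS residuals of the observations in cluster $j$. See chapter 8.2.1 Clustering and the Moulton Factor in Angrist and Pischke's Mostly Harmless Econometrics (Princeton University Press, 2009) for a more detailed elaboration on clustered standard errors.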
This estimator returns the variance-covariance (VCV) matrix, whose diagonal elements are the estimated cluster-robust coefficient variances. We obtain clustered standard errors by taking the square root of the diagonal elements.
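The computation can be sketched in a few lines of plain Python. The sketch assumes the common sandwich form of the estimator, (X'X)^-1 times the sum over clusters of u_j u_j' times (X'X)^-1, where u_j is the sum of residual times regressors within cluster j, scaled by the finite-sample factor M/(M-1) * (N-1)/(N-K). The toy data, the function names, and the restriction to one regressor plus an intercept (K = 2) are assumptions made for illustration only.

```python
# Hypothetical toy data: (class_id, class_size, test_score). Not real data.
data = [
    ("A", 20, 78.0), ("A", 20, 82.0), ("A", 20, 80.0),
    ("B", 25, 74.0), ("B", 25, 71.0), ("B", 25, 73.0),
    ("C", 30, 69.0), ("C", 30, 72.0), ("C", 30, 66.0),
    ("D", 35, 70.0), ("D", 35, 64.0), ("D", 35, 67.0),
]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def cluster_robust_ols(rows):
    """OLS of score on (1, class_size) with a cluster-robust VCV matrix."""
    X = [(1.0, float(size)) for _, size, _ in rows]   # intercept + regressor
    y = [score for _, _, score in rows]
    N, K = len(rows), 2

    # OLS coefficients: beta = (X'X)^-1 X'y
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(2)] for i in range(2)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(2)]
    bread = inv2(xtx)
    beta = [sum(bread[i][j] * xty[j] for j in range(2)) for i in range(2)]
    resid = [yi - (beta[0] * x[0] + beta[1] * x[1]) for x, yi in zip(X, y)]

    # Cluster scores u_j = sum over observations in cluster j of e_i * x_i
    scores = {}
    for (cid, _, _), x, e in zip(rows, X, resid):
        u = scores.setdefault(cid, [0.0, 0.0])
        u[0] += e * x[0]
        u[1] += e * x[1]
    M = len(scores)

    # "Meat" of the sandwich: sum over clusters of u_j u_j'
    meat = [[sum(u[i] * u[j] for u in scores.values()) for j in range(2)]
            for i in range(2)]

    # Finite-sample correction, then bread * meat * bread
    q = (M / (M - 1)) * ((N - 1) / (N - K))
    sandwich = matmul2(matmul2(bread, meat), bread)
    vcv = [[q * v for v in row] for row in sandwich]
    se = [vcv[i][i] ** 0.5 for i in range(2)]   # sqrt of the diagonal
    return beta, vcv, se


beta, vcv, se = cluster_robust_ols(data)
print("coefficients (intercept, class size):", beta)
print("clustered standard errors:", se)
```

The hand-written 2x2 helpers keep the sketch free of dependencies; for more regressors a linear algebra library would be the natural choice.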
In Stata you can obtain clustered standard errors simply by adding cluster(cluster) to your regression command, where cluster is the variable that identifies the clusters. For instance:
reg dependent_var independent_var, cluster(cluster)
You can find a tutorial on how to calculate clustered standard errors in STATA here.
It is also possible to estimate clustered standard errors in R using an extended summary() function. I extended summary() in order to simplify the computation of clustered standard errors in R. My intention was to create a function that lets you compute clustered standard errors in a similar fashion as in Stata. If you are interested in calculating clustered standard errors in R click here. However, if you are more interested in the code and the exact extension of summary() click here.