There are various ways to calculate robust standard errors in R. However, it is easy to hit a wall when doing so, especially if you are new to R. It always bothered me that you can calculate robust standard errors so easily in STATA, while you need ten lines of code to compute them in R. I decided to solve the problem myself and modified the `summary()` function in R so that it replicates the simple STATA approach. I added a `robust` parameter to the `summary()` function that calculates robust standard errors when it is set to true. With the new `summary()` function you get robust standard errors in your usual `summary()` output. All you need to do is set the `robust` parameter to true:

`summary(lm.object, robust=T)`

I also uploaded the function to a GitHub repository, which makes it easy to load into your R session. The following lines of code import the function. You can also download the function directly from this post.

```r
# load necessary packages for importing the function
library(RCurl)
# import the function from repository
url_robust <- "https://raw.githubusercontent.com/IsidoreBeautrelet/economictheoryblog/master/robust_summary.R"
eval(parse(text = getURL(url_robust, ssl.verifypeer = FALSE)),
     envir = .GlobalEnv)
```

I prepared a working example that tests the function and shows how it works. You will find the code below.

```r
# start with an empty workspace
rm(list = ls())
# load necessary packages for importing the function
library(RCurl)
# load necessary packages for the example
library(gdata)
library(zoo)
# import the function
url_robust <- "https://raw.githubusercontent.com/IsidoreBeautrelet/economictheoryblog/master/robust_summary.R"
eval(parse(text = getURL(url_robust, ssl.verifypeer = FALSE)),
     envir = .GlobalEnv)
# download data set for example
url_data <- "https://economictheoryblog.files.wordpress.com/2016/08/data.xlsx"
data <- read.xls(gsub("s:", ":", url_data))
# estimate simple linear model
reg <- lm(weight ~ lag_calories + lag_cycling +
            I(lag_calories * lag_cycling),
          data = data)
# use new summary function
summary(reg)
summary(reg, robust = TRUE)
# strangely enough, we get a case in which
# robust standard errors are smaller than
# conventional standard errors
```
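
For reference, here is a minimal sketch of what a heteroscedasticity-robust summary has to compute, written in base R only. It assumes the HC1 (STATA-style) small-sample correction. The simulated data and all variable names are made up for illustration and are not taken from `robust_summary.R`.

```r
# Sketch only: HC1 robust standard errors by hand (simulated data, hypothetical names)
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = exp(0.5 * x))  # error variance depends on x
fit <- lm(y ~ x)

X <- model.matrix(fit)        # n x k design matrix
u <- residuals(fit)           # OLS residuals
k <- ncol(X)

bread <- solve(crossprod(X))  # (X'X)^{-1}
meat  <- crossprod(X * u)     # X' diag(u^2) X: each row of X is scaled by its residual
vcov_hc1 <- (n / (n - k)) * bread %*% meat %*% bread

robust_se <- sqrt(diag(vcov_hc1))

# conventional and robust standard errors side by side
cbind(conventional = sqrt(diag(vcov(fit))), robust = robust_se)
```

Depending on the data, the robust standard errors can come out larger or smaller than the conventional ones, as in the example above.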

The lack of the “robust” option was among my biggest disappointments in moving our courses (and students) from STATA to R. We will all be eternally grateful to you for rectifying this problem.

Thank you for your kind words of appreciation. I’m glad I was able to help.

Have you come across a heteroscedasticity-robust F-test for multiple linear restrictions in a model?

I can’t say I have.

What I know is that, once you start using heteroscedasticity-consistent standard errors, you should not use the sums of squares to calculate the F-statistic. The sums of squares themselves do not change, but their meaning is no longer relevant. To my understanding, one can still use the sums of squares to calculate the statistic that maintains its goodness-of-fit interpretation. However, you cannot use them to obtain F-statistics because those formulas no longer apply. Instead of an F-statistic based on the sums of squares, one uses a Wald test based on the robustly estimated variance matrix.

So, if you use my function to obtain robust standard errors, it actually returns an F-statistic that is based on a Wald test instead of sums of squares.

I suppose that if you want to test multiple linear restrictions you should use heteroscedasticity-robust Wald statistics. I don’t know whether there is actually an R implementation of the heteroscedasticity-robust Wald test.
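
A heteroscedasticity-robust Wald test of q linear restrictions of the form Rβ = r can be sketched in base R as follows. This is only an illustration under stated assumptions: it uses an HC1 covariance matrix, simulated data, and made-up variable names, and it reports the F form W/q with (q, n − k) degrees of freedom.

```r
# Sketch only: robust Wald test of H0: beta_x1 = 0 and beta_x2 = 0
set.seed(42)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 + rnorm(n, sd = 1 + abs(x1))  # heteroscedastic errors
fit <- lm(y ~ x1 + x2)

X <- model.matrix(fit)
u <- residuals(fit)
k <- ncol(X)
bread <- solve(crossprod(X))
V <- (n / (n - k)) * bread %*% crossprod(X * u) %*% bread  # HC1 covariance

# restriction matrix: one row per restriction, one column per coefficient
Rmat <- rbind(c(0, 1, 0),
              c(0, 0, 1))
r <- c(0, 0)
q <- nrow(Rmat)

d <- Rmat %*% coef(fit) - r                                 # deviation from H0
wald <- as.numeric(t(d) %*% solve(Rmat %*% V %*% t(Rmat)) %*% d)
f_stat  <- wald / q
p_value <- pf(f_stat, df1 = q, df2 = n - k, lower.tail = FALSE)

c(F = f_stat, p = p_value)
```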

For now I am working on an implementation of clustered standard errors, but once I am done with it I might look into it myself.

HTH

I assumed that, if you went to all the hard work to calculate the robust standard errors, the F-statistic you produced would use them and took it on faith that I had the robust F. Stock and Watson report a value for the heteroscedasticity-robust F stat with q linear restrictions but only give instructions to students for calculating the F stat under the assumption of homoscedasticity, via the SSR/R-squared (although they do describe the process for coming up with the robust F in an appendix).

Having the robust option in R is a great leap forward for my teaching. The rest can wait.

Thanks again,

Matt

Awesome! Extremely useful! Thank you..

Thanks, I am glad to hear that!

Thanks for this. I was playing with R a couple years back thinking I’d make the switch and was baffled by how difficult it was to do this simple procedure. Does this only work for lm models? I tried it with a logit and it didn’t change the standard errors.

Hi. Unfortunately, the function only covers lm models so far. However, I will extend the function to logit and plm once I can free up some time. The “sandwich” package, created and maintained by Achim Zeileis, provides some useful functionality with respect to robust standard errors. Cheers.
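
In the meantime, the sandwich estimator for a logit model can be sketched by hand in base R. The snippet below is an illustration under assumptions (simulated data, made-up names, the HC0 form without any small-sample correction) and is not part of the `summary()` function from this post. The sandwich package mentioned above is likely the better-tested route.

```r
# Sketch only: robust (sandwich) standard errors for a logit model, base R
set.seed(7)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))
fit <- glm(y ~ x, family = binomial(link = "logit"))

X <- model.matrix(fit)
score_resid <- y - fitted(fit)       # y_i - p_i, the score residual of the logit model

bread <- vcov(fit)                   # model-based covariance, (X'WX)^{-1}
meat  <- crossprod(X * score_resid)  # sum of outer products of the individual scores
vcov_robust <- bread %*% meat %*% bread

robust_se <- sqrt(diag(vcov_robust))
cbind(model_based = sqrt(diag(bread)), robust = robust_se)
```

Because the simulated model is correctly specified, the two columns should be close to each other here.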

Hi all, interesting function. Previously, I have been using the sandwich package to report robust S.E.s. When I installed this extension and used the `summary(, robust=T)` option, the reported S.E.s were slightly different from the ones I observed in STATA. I trimmed some of my results and posted them below. Do you know why the robust standard errors on Family_Inc don’t match?

```
> coeftest(mod1, vcov = vcovHC(mod1, "HC1")) # Robust SE (match those reported by STATA)
             Estimate Std. Error t value  Pr(>|t|)
(Intercept) 2.3460131  0.0974894  24.064 < 2.2e-16 ***
Family_Inc  0.5551564  0.0086837  63.931

> summary(mod1, robust = T) # Different S.E.s reported by robust=T
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.346013   0.088341   26.56   <2e-16 ***
Family_Inc  0.555156   0.007878   70.47   <2e-16 ***
```

Hi! Thank you for your interest in my function. I am surprised that the standard errors do not match. On my blog I provide a reproducible example of a linear regression with robust standard errors both in R and STATA. Both programs deliver the same robust standard errors. See the following two links if you want to check it yourself:

https://economictheoryblog.com/2016/08/08/robust-standard-errors-in-r/

https://economictheoryblog.com/2016/08/20/robust-standard-errors-in-stata/

Furthermore, I also checked `coeftest(reg, vcov = vcovHC(reg, "HC1"))` for my example, and the sandwich version of computing robust standard errors gives the same values.

Unfortunately, I cannot tell you more right now. Could you provide a reproducible example? I am very keen to know what drives the differences in your case, especially if they are a result of my function.

Cheers!