Lately I received some criticism saying that my proof (link to proof) of the unbiasedness of the estimator for the sample variance stands out for its unnecessary length. Well, as I am an economist and love proofs that read like a book, I never really saw the benefit of boiling a proof down to a couple of lines. Actually, I hate it when I have to brood over a proof for an hour before I clearly understand what's going on. However, in order to satisfy the need for mathematical beauty, I looked around and found the following proof, which is way shorter than my original version.
In order to prove that the sample variance is an unbiased estimator of the population variance, we have to show the following:
(1) $E(s^2) = \sigma^2$
However, before really getting to it, let's start with the usual definition of notation. For this proof it is important to know that
(2) $X_1, X_2, \dots, X_n$ are independent observations from a population with mean $\mu$ and variance $\sigma^2$
(3)
(4)
(5)
(6)
(7)
Let’s try to show that
(8)
To make my life easier, I will omit the limits of summation from now on, but let it be known that we are always summing from $i = 1$ to $n$.
(9)
(10)
(11)
(12)
Combining equation 1 with equation 12 brings us to:
(13)
(14)
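For reference, the standard short argument can be sketched in a few lines. This is written in my own notation and is not guaranteed to match the numbered steps above one for one. It assumes that $s^2 = \frac{1}{n-1}\sum (X_i - \bar{X})^2$ with $\bar{X} = \frac{1}{n}\sum X_i$, and it uses the two variance properties $E(X_i^2) = \sigma^2 + \mu^2$ and $E(\bar{X}^2) = \frac{\sigma^2}{n} + \mu^2$.

```latex
\begin{align*}
\sum (X_i - \bar{X})^2
  &= \sum X_i^2 - 2\bar{X}\sum X_i + n\bar{X}^2
   = \sum X_i^2 - n\bar{X}^2 \\
E\Big[\sum (X_i - \bar{X})^2\Big]
  &= \sum E(X_i^2) - n\,E(\bar{X}^2) \\
  &= n(\sigma^2 + \mu^2) - n\Big(\frac{\sigma^2}{n} + \mu^2\Big) \\
  &= (n - 1)\,\sigma^2 \\
E(s^2)
  &= \frac{1}{n-1}\,E\Big[\sum (X_i - \bar{X})^2\Big] = \sigma^2
\end{align*}
```

The second property relies on independence: $Var(\bar{X}) = \frac{1}{n^2}\sum Var(X_i) = \frac{\sigma^2}{n}$, and then $E(\bar{X}^2) = Var(\bar{X}) + \mu^2$.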
Finally, we showed that the estimator for the population variance is indeed unbiased. If you are mathematically adept, you probably had no problem following every single step of this proof. However, if you are like me and want to be taken by the hand through every single step, you can find the exhaustive proof here.
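If you prefer to see the result rather than prove it, a small simulation can serve as a sanity check. The sketch below is not part of the proof. It assumes normally distributed observations with $\mu = 0$ and $\sigma = 2$ (so $\sigma^2 = 4$), and the function name `mean_of_variance_estimates` and all parameter values are made up for illustration. It averages the estimator with divisor $n - 1$ and the one with divisor $n$ over many samples.

```python
import random
import statistics


def mean_of_variance_estimates(n, trials, mu=0.0, sigma=2.0, seed=0):
    """Average two variance estimators over many simulated samples.

    Returns (average with divisor n - 1, average with divisor n).
    The normal population and the default parameters are illustrative
    assumptions, not something the proof depends on.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    unbiased_total = 0.0
    biased_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        unbiased_total += statistics.variance(sample)   # divides by n - 1
        biased_total += statistics.pvariance(sample)    # divides by n
    return unbiased_total / trials, biased_total / trials


unbiased, biased = mean_of_variance_estimates(n=5, trials=20000)
print("divisor n - 1:", unbiased)
print("divisor n:    ", biased)
```

With $n = 5$, the first average should land close to $\sigma^2 = 4$, while the second should land close to $\frac{n-1}{n}\sigma^2 = 3.2$, which is exactly the bias that the divisor $n - 1$ removes.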
steps 8 and 10 are the same terms.
You are right! I am going to fix it!
It's done 🙂
“Finally, we showed that the estimator for the sample variance is indeed unbiased.”
we are trying to estimate an unknown population parameter namely ‘sigma^2’: population variance, with a known quantity that is ‘s^2’: sample variance
therefore, ‘s^2’ is an estimator for ‘sigma^2’
the conclusion should be:
“the estimator for population variance is indeed unbiased”
Hello and thank you for your very useful comment. I will definitely consider it. Cheers!
Hi! I don’t get how the assumptions (5) and (7) are justified. Can someone help?
Hi. These are just some basic variance properties. You can find them on Wikipedia under https://en.wikipedia.org/wiki/Variance. Hope it helps.
(8) follows Var(X) = 0. I’m confused about this.
Thank you for pointing that out. There was a typo in the equation. I corrected it. Thanks a lot.