Unravelling the secrets of the UMVUE

Ananyapam De
6 min read · Apr 21, 2021

UMVUE, short for Uniformly Minimum Variance Unbiased Estimator, is a tricky concept in estimation theory. Let us explore it and its related concepts, like the Cramér-Rao bound and the Rao-Blackwell theorem, in depth, and by the end of this article, I promise, it will all make sense. Without further ado, let us dive right in! (I assume that you’re familiar with the concepts of a random variable, its expectation, variance, joint distributions etc. — the basic probability stuff!)

So let us first start by formulating our problem. We are given some data from a known probability distribution with unknown parameters which we would like to determine. Situations like these are really common: we already have some idea about the underlying distribution, but we need to estimate its parameters from the data at hand. For instance, we know that heights generally follow a Normal distribution, incomes follow a Log-Normal or a Pareto distribution, and so on.

Formally speaking, we have taken a random sample from the population, where each sample unit is a random variable from the underlying distribution. We will denote this sample as X whenever necessary.

Notations

We represent the estimator as g(X), the data as X, and the parameter as θ. An estimator g(X) is a function of the data X, which will supposedly help us in estimating the unknown parameter θ of the distribution.

Unbiasedness

To address this problem, we need to cook up an estimator. For example, if we need to estimate the mean µ of a Normal distribution, we can use the estimator g(X) which averages the data values, i.e. the sample mean.

In the language of statistics, the expectation of g turns out to be the mean µ of the Normal distribution. However, any single data value in the sample is also a reasonable estimator of µ, because each data value is itself a random variable whose expectation is µ. Estimators whose expectation equals the quantity they are trying to estimate are called unbiased estimators.
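
As a compact restatement in the notation above (nothing beyond what the text already says):

```latex
% An estimator g(X) is unbiased for \theta if its expectation equals \theta
% for every possible value of the parameter:
\mathbb{E}_{\theta}\big[\,g(X)\,\big] = \theta \quad \text{for all } \theta .
% For the Normal example: the sample mean \bar{X} satisfies
% \mathbb{E}_{\mu}[\bar{X}] = \mu, and so does a single observation X_1.
```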

This clearly tells us that we can have several unbiased estimators. Now, the question arises, how do we choose between them?

Minimum Variance!

The variance of an estimator is an important measure for assessing the quality of that estimator. Think about it intuitively: if we have two estimators, both unbiased, choosing the one with the smaller variance would give us better estimates than the other. We would obviously want our estimator to deviate as little as possible from the quantity it is trying to estimate.
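
To make this concrete, here is a minimal simulation sketch (Python with NumPy, hypothetical numbers, not part of the original article) comparing two unbiased estimators of a Normal mean: the sample mean and the first observation alone.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
mu, sigma, n, trials = 5.0, 2.0, 30, 100_000

# Draw `trials` independent samples, each of size n, from N(mu, sigma^2).
samples = rng.normal(loc=mu, scale=sigma, size=(trials, n))

sample_mean = samples.mean(axis=1)  # estimator 1: average of all n observations
first_obs = samples[:, 0]           # estimator 2: the first observation alone

# Both estimators are unbiased: averaged over many trials, both are close to mu.
print(sample_mean.mean(), first_obs.mean())   # both approximately 5.0

# But the sample mean has far smaller variance (sigma^2 / n vs sigma^2).
print(sample_mean.var(), first_obs.var())     # approximately 0.13 vs 4.0
```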

Suppose two people are throwing darts at a red bullseye. The darts thrown by both land with approximately the same average position, but which person threw the better shots? The one whose darts are tightly clustered around the bullseye, of course — the one with the smaller spread.

Now comes the interesting stuff! We can always choose between two unbiased estimators by comparing their variances. But can we find an unbiased estimator whose variance is the smallest among all unbiased estimators? Sadly, we cannot always find such an estimator, but when we can, it is called the Uniformly Minimum Variance Unbiased Estimator, or UMVUE. It is essentially unique and is considered the best estimator to work with.

A Journey to finding the UMVUE

We now start exploring the secrets of the UMVUE! But before we get there, we need to cross several bridges.

Cramér-Rao Inequality

This famous inequality, due to the Indian statistician C. R. Rao (a doctoral student of the legendary Ronald Fisher) and Harald Cramér, is the first bridge to our destination. It asserts that, under certain regularity conditions, there is a lower bound on the variance of any unbiased estimator. This gives us an important hint: suppose we are comparing a number of unbiased estimators by assessing their variances. If one of them attains the Cramér-Rao bound, we can immediately say that it is the best among all the others, since the Cramér-Rao inequality tells us that this is the minimum achievable variance.
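
For reference, here is the inequality in its standard textbook form, stated for an i.i.d. sample with the regularity conditions omitted:

```latex
% Cramér-Rao lower bound for an unbiased estimator g(X) of \theta,
% based on an i.i.d. sample X_1,\dots,X_n with density f(x;\theta):
\operatorname{Var}_{\theta}\big(g(X)\big) \;\ge\; \frac{1}{n\,I(\theta)},
\qquad
I(\theta) = \mathbb{E}_{\theta}\!\left[\left(\frac{\partial}{\partial\theta}\log f(X_1;\theta)\right)^{\!2}\right],
% where I(\theta) is the Fisher information of a single observation.
```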

So we have arrived at a point where, if we have a number of estimators in front of us to compare and one of them attains the bound, we can recognise and pick out that bad boy. (One caveat: the UMVUE need not always attain the Cramér-Rao bound, so this check only certifies an estimator when the bound is actually achieved.) However, we are still left with the question of how to find the UMVUE if it is not served to us on a platter.

Sufficiency and Complete Sufficiency

To answer our previous question, we need to understand another concept: sufficiency. This is a property of a statistic that helps us a lot with data reduction. We call a statistic sufficient for a parameter θ if it contains all the information in the sample about θ. If we know a sufficient statistic, we can discard the original data and still not have lost any information about θ.

Another way to understand this: if the joint density of the random sample conditioned on the statistic does not depend on the parameter θ, we call that statistic sufficient for θ. In simple words, the joint density of the sample, given that you already know the statistic, does not change with respect to θ. A classic example is coin tossing, where the parameter of interest is the probability of heads, p, which we are trying to estimate. Suppose we have tossed the coin 100 times and noted the observations as H, T, T, T, H, … A sufficient statistic for estimating p is the total number of heads in our 100 tosses! Dividing that by 100 gives the proportion of heads, which is an estimator of p. In other words, the ordering of the outcomes is irrelevant.
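
To see why the total number of heads carries all the information about p, here is the standard conditioning calculation, sketched for the coin-tossing example:

```latex
% X_1,\dots,X_n i.i.d. Bernoulli(p), T = \sum_i X_i (the number of heads).
% Conditional on T = t, every ordering of t heads and n-t tails is equally likely:
P\big(X_1 = x_1,\dots,X_n = x_n \,\big|\, T = t\big)
= \frac{p^{t}(1-p)^{n-t}}{\binom{n}{t}\, p^{t}(1-p)^{n-t}}
= \frac{1}{\binom{n}{t}},
% which does not depend on p, so T is sufficient for p.
```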

Rao-Blackwell theorem

This is the second bridge that we need to cross. It states that if g(X) is any estimator of a parameter θ, then the conditional expectation of g(X) given T(X), where T is a sufficient statistic, is typically a better estimator of θ, and is never worse. Intuitively, conditioning g(X) on T(X) means we are using more information, so the variance of the estimator can only shrink. The reason sufficiency comes into the picture is that conditioning g(X) on a statistic which is not sufficient can leave the conditional expectation depending on the parameter θ, so it fails to be a statistic at all. This theorem gives us insight into why sufficiency is linked with finding the UMVUE.
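
In symbols, the result reads as follows (stated here for an unbiased g; a similar statement in terms of mean squared error holds for general estimators):

```latex
% Let g(X) be an unbiased estimator of \theta and T(X) a sufficient statistic.
% Define the Rao-Blackwellised estimator
g^{*}(X) = \mathbb{E}\big[\,g(X)\,\big|\,T(X)\,\big].
% Thanks to sufficiency, g^{*} does not depend on \theta, so it is a genuine statistic.
% It remains unbiased and its variance is never larger:
\mathbb{E}_{\theta}\big[g^{*}(X)\big] = \theta,
\qquad
\operatorname{Var}_{\theta}\big(g^{*}(X)\big) \;\le\; \operatorname{Var}_{\theta}\big(g(X)\big).
```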

Complete Statistic

Now, a complete statistic T is one with the property that the only function of T which is an unbiased estimator of 0 is the zero function, almost surely. If the density of the random variable belongs to an exponential family, we have an easy trick for finding a complete sufficient statistic: express the density as e raised to a power in which the dependence on the parameter and the dependence on the data separate into a product. Then the corresponding sum over the data is a complete sufficient statistic (in many textbook cases this is simply the sum of the data values).
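
The “trick” amounts to writing the density in one-parameter exponential family form; a sketch, with the technical conditions on the parameter space glossed over:

```latex
% One-parameter exponential family:
f(x;\theta) = h(x)\, c(\theta)\, \exp\!\big(w(\theta)\, t(x)\big).
% If X_1,\dots,X_n are i.i.d. from such a family (and the natural parameter
% space contains an open interval), then
T(X) = \sum_{i=1}^{n} t(X_i)
% is a complete sufficient statistic. For Bernoulli(p), t(x) = x, so the
% number of heads \sum_i X_i is complete sufficient for p.
```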

Lehmann–Scheffé theorem

We have now finally arrived at our destination: the last and final theorem, which will help us find our UMVUE. All the trouble that we’ve been through will be worth it.

This theorem states that if an estimator is a function of a complete sufficient statistic and is unbiased for some function g(θ) of the parameter, then it is the (essentially unique) UMVUE for g(θ). Elegant, isn't it?

Conclusion

So, how do we figure out the UMVUE for some function g(θ) of the parameter? First find a sufficient statistic and check that it is complete; such a statistic is called a complete sufficient statistic. Then play around and find a function of that statistic which is unbiased for g(θ)!
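
Returning to the coin-tossing example from earlier, the whole recipe worked through in one place:

```latex
% X_1,\dots,X_n i.i.d. Bernoulli(p).
% Step 1: T = \sum_i X_i is sufficient (the conditioning argument above) and
%         complete (exponential family), hence a complete sufficient statistic.
% Step 2: \hat{p} = T/n is a function of T with \mathbb{E}_p[T/n] = p,
%         i.e. it is unbiased for p.
% Lehmann-Scheffé: \hat{p} = T/n is therefore the UMVUE of p.
```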

Hope this article helped you understand the concepts and the intuition behind the UMVUE.


Ananyapam De

Undergrad at IISER Kolkata, Majoring in Statistics and Mathematics.