Lehmann–Scheffé theorem
In statistics, the Lehmann–Scheffé theorem ties together completeness, sufficiency, uniqueness, and best unbiased estimation. The theorem states that any estimator that is unbiased for a given unknown quantity and that depends on the data only through a complete, sufficient statistic is the unique best unbiased estimator of that quantity. The theorem is named after Erich Leo Lehmann and Henry Scheffé, who established it in two early joint papers.
If $T$ is a complete sufficient statistic for $\theta$ and $\operatorname{E}[g(T)] = \tau(\theta)$, then $g(T)$ is the uniformly minimum-variance unbiased estimator (UMVUE) of $\tau(\theta)$.
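For instance (a standard illustration, stated here in the notation of the theorem), if $X_1, \dots, X_n$ are i.i.d. Poisson with mean $\theta$, then $T = \sum_{i=1}^{n} X_i$ is a complete sufficient statistic and $g(T) = T/n$ is unbiased for $\theta$, so the theorem identifies the sample mean as the UMVUE:

$$T = \sum_{i=1}^{n} X_i, \qquad \operatorname{E}\!\left[\frac{T}{n}\right] = \theta \;\Longrightarrow\; \bar{X} = \frac{T}{n} \ \text{is the UMVUE of } \theta.$$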
Statement
Let $\vec{X} = X_1, X_2, \dots, X_n$ be a random sample from a distribution that has p.d.f. (or p.m.f. in the discrete case) $f(x;\theta)$, where $\theta \in \Omega$ is a parameter in the parameter space. Suppose $Y = u(\vec{X})$ is a sufficient statistic for $\theta$, and let $\{ f_Y(y;\theta) : \theta \in \Omega \}$ be a complete family. If $\varphi$ satisfies $\operatorname{E}[\varphi(Y)] = \theta$, then $\varphi(Y)$ is the unique MVUE of $\theta$.
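As a quick numerical check of this statement, the following minimal Monte Carlo sketch uses the Poisson illustration above with illustrative values of $\theta$ and $n$ (not prescribed by the theorem): the sample mean, being an unbiased function of the complete sufficient statistic, should exhibit a variance no larger than that of any other unbiased estimator.

```python
import numpy as np

# Monte Carlo sketch of the statement (illustrative values of theta, n).
# For X_1,...,X_n i.i.d. Poisson(theta), T = sum(X_i) is complete and sufficient,
# and the sample mean T/n is unbiased, hence it is the UMVUE by Lehmann-Scheffe.
rng = np.random.default_rng(0)
theta, n, reps = 3.0, 10, 200_000

x = rng.poisson(theta, size=(reps, n))
umvue = x.mean(axis=1)                          # unbiased, a function of T

w = np.linspace(1.0, 2.0, n)
w /= w.sum()                                    # unequal weights summing to 1
other = x @ w                                   # unbiased, but not a function of T alone

print("bias     :", umvue.mean() - theta, other.mean() - theta)
print("variance :", umvue.var(), other.var())   # expect var(umvue) <= var(other)
```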
Proof
By the Rao–Blackwell theorem, if $Z$ is an unbiased estimator of $\theta$, then $\varphi(Y) := \operatorname{E}[Z \mid Y]$ defines an unbiased estimator of $\theta$ with the property that its variance is not greater than that of $Z$.
Now we show that this function is unique. Suppose $W$ is another candidate MVUE of $\theta$. Then again $\psi(Y) := \operatorname{E}[W \mid Y]$ defines an unbiased estimator of $\theta$ with the property that its variance is not greater than that of $W$. Then
$$\operatorname{E}[\varphi(Y) - \psi(Y)] = 0, \qquad \theta \in \Omega.$$
Since $\{ f_Y(y;\theta) : \theta \in \Omega \}$ is a complete family,
$$\operatorname{E}[\varphi(Y) - \psi(Y)] = 0 \implies \varphi(y) - \psi(y) = 0 \ \text{for almost all } y, \qquad \theta \in \Omega,$$
and therefore $\varphi(Y)$ is the unique unbiased estimator that is a function of $Y$. Combined with the Rao–Blackwell step above, its variance is not greater than that of any other unbiased estimator. We conclude that $\varphi(Y)$ is the MVUE.
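To see concretely what completeness contributes to this uniqueness step, consider again the Poisson illustration, where $T = \sum_i X_i \sim \operatorname{Poisson}(n\theta)$:

$$\operatorname{E}_\theta[h(T)] = \sum_{t=0}^{\infty} h(t)\, e^{-n\theta} \frac{(n\theta)^t}{t!} = 0 \ \text{for all } \theta > 0 \;\Longrightarrow\; h(t) = 0 \ \text{for all } t,$$

because a power series in $\theta$ that vanishes for every $\theta > 0$ must have all coefficients equal to zero. Hence any two unbiased estimators that are functions of $T$ must coincide, which is exactly the uniqueness used above.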
Example when using a non-complete minimal sufficient statistic
An example of an improvable Rao–Blackwell improvement, when using a minimal sufficient statistic that is not complete, was provided by Galili and Meilijson in 2016. Let $X_1, \ldots, X_n$ be a random sample from a scale-uniform distribution $X \sim U((1-k)\theta, (1+k)\theta)$ with unknown mean $\operatorname{E}[X] = \theta$ and known design parameter $k \in (0,1)$. In the search for "best" possible unbiased estimators of $\theta$, it is natural to consider $X_1$ as an initial (crude) unbiased estimator of $\theta$ and then try to improve it. Since $X_1$ is not a function of $T = \left(X_{(1)}, X_{(n)}\right)$, the minimal sufficient statistic for $\theta$ (where $X_{(1)} = \min_i X_i$ and $X_{(n)} = \max_i X_i$), it may be improved using the Rao–Blackwell theorem as follows:
$$\hat{\theta}_{RB} = \operatorname{E}_\theta[X_1 \mid X_{(1)}, X_{(n)}] = \frac{X_{(1)} + X_{(n)}}{2}.$$
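A minimal Monte Carlo sketch (with illustrative values of $\theta$, $k$, and $n$ chosen for the example, not taken from the source) can be used to check that this Rao–Blackwellized estimator is indeed unbiased:

```python
import numpy as np

# Monte Carlo sketch: check unbiasedness of theta_RB = (X_(1) + X_(n)) / 2
# when X_i ~ U((1-k)*theta, (1+k)*theta).  theta, k, n are illustrative choices.
rng = np.random.default_rng(1)
theta, k, n, reps = 5.0, 0.5, 10, 200_000

x = rng.uniform((1 - k) * theta, (1 + k) * theta, size=(reps, n))
theta_rb = (x.min(axis=1) + x.max(axis=1)) / 2

print("empirical mean of theta_RB:", theta_rb.mean(), "  target theta:", theta)
```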
However, the following unbiased estimator can be shown to have lower variance:
$$\hat{\theta}_{LV} = \frac{1}{k^2 \frac{n-1}{n+1} + 1} \cdot \frac{(1-k) X_{(1)} + (1+k) X_{(n)}}{2}.$$
In fact, it can be improved even further when using the following estimator:
$$\hat{\theta}_{\text{BAYES}} = \frac{n+1}{n} \left[ 1 - \frac{\frac{X_{(1)}(1+k)}{X_{(n)}(1-k)} - 1}{\left( \frac{X_{(1)}(1+k)}{X_{(n)}(1-k)} \right)^{n+1} - 1} \right] \frac{X_{(n)}}{1+k}$$
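The variance ordering claimed above can be examined with a Monte Carlo sketch along the following lines (parameter values are again illustrative choices, not from the source); the empirical variances should typically satisfy $\operatorname{Var}(\hat{\theta}_{\text{BAYES}}) \le \operatorname{Var}(\hat{\theta}_{LV}) \le \operatorname{Var}(\hat{\theta}_{RB})$ while all three remain unbiased.

```python
import numpy as np

# Monte Carlo sketch comparing the three unbiased estimators above
# (theta, k, n are illustrative choices).
rng = np.random.default_rng(2)
theta, k, n, reps = 5.0, 0.5, 10, 200_000

x = rng.uniform((1 - k) * theta, (1 + k) * theta, size=(reps, n))
x1, xn = x.min(axis=1), x.max(axis=1)            # X_(1) and X_(n)

theta_rb = (x1 + xn) / 2
theta_lv = ((1 - k) * x1 + (1 + k) * xn) / 2 / (k**2 * (n - 1) / (n + 1) + 1)
r = (x1 * (1 + k)) / (xn * (1 - k))
theta_bayes = (n + 1) / n * (1 - (r - 1) / (r**(n + 1) - 1)) * xn / (1 + k)

for name, est in [("RB", theta_rb), ("LV", theta_lv), ("BAYES", theta_bayes)]:
    print(f"{name:5s}  bias: {est.mean() - theta:+.5f}   variance: {est.var():.6f}")
# Expected ordering of the empirical variances: BAYES <= LV <= RB
```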
The model is a scale model, so optimal equivariant estimators can then be derived for loss functions that are invariant.