ST5215: Advanced Statistical Theory
Chen Zehua
Department of Statistics & Applied Probability
Tuesday, October 7, 2014
Lecture 15: Minimal sufficiency (cont.) and completeness
Theorem 2.3 (useful tools for checking minimal sufficiency)
Let P be a family of distributions on Rk .
(i) Suppose that P0 ⊂ P and a.s. P0 implies a.s. P. If T is
sufficient for P ∈ P and minimal sufficient for P ∈ P0 , then T
is minimal sufficient for P ∈ P.
(ii) Suppose that P contains p.d.f.’s f0 , f1 , f2 , ..., w.r.t. a σ-finite
measure. Let f∞(x) = Σ_{i=0}^∞ ci fi(x), where ci > 0 for all i and
Σ_{i=0}^∞ ci = 1, and let Ti(x) = fi(x)/f∞(x) when f∞(x) > 0,
i = 0, 1, 2, .... Then T(X) = (T0 , T1 , T2 , ...) is minimal
sufficient for P ∈ P. Furthermore, if
{x : fi(x) > 0} ⊂ {x : f0(x) > 0} for all i, then we may
replace f∞(x) by f0(x), in which case T(X) = (T1 , T2 , ...) is
minimal sufficient for P ∈ P.
(iii) Suppose that P contains p.d.f.’s fp w.r.t. a σ-finite measure
and that there exists a sufficient statistic T (X ) such that, for
any possible values x and y of X , fp (x) = fp (y )φ(x, y ) for all
P implies T (x) = T (y ), where φ is a measurable function.
Then T (X ) is minimal sufficient for P ∈ P.
Proof
(i) If S is sufficient for P ∈ P, then it is also sufficient for P ∈ P0
and, therefore, T = ψ(S) a.s. P0 . The result follows from
the fact that a.s. P0 implies a.s. P.
(ii) Note that f∞ > 0 a.s. P. Let gi (T ) = Ti , i = 0, 1, 2, . . . .
Then fi (x) = gi (T (x))f∞ (x) a.s. P. By Theorem 2.2, T is
sufficient for P ∈ P. Suppose S(X ) is another sufficient
statistic, and fi (x) = g˜i (S(x))h(x), i = 0, 1, 2, . . . . Hence
Ti(x) = g̃i(S(x)) / Σ_{j=0}^∞ cj g̃j(S(x))
for x’s satisfying f∞ (x) > 0. By Definition 2.5, T is minimal
sufficient for P ∈ P. The proof is the same when f∞ is
replaced by f0 .
(iii) From Bahadur (1957), there is a minimal sufficient statistic
S(X ). The result follows if we can show that
T (X ) = ψ(S(X )) a.s. P for a measurable function ψ.
By Theorem 2.2, there are Borel functions h and gP such that
fP (x) = gP (S(x))h(x) for all P. Let A = {x : h(x) = 0}.
Then P(A) = 0 for all P. For x and y such that
S(x) = S(y), x ∉ A and y ∉ A,
fP (x) = gP (S(x))h(x) = gP (S(y ))h(x) = fP (y )h(x)/h(y )
for all P. Hence T (x) = T (y ). This shows that there is a
function ψ such that T (x) = ψ(S(x)) except for x ∈ A.
It remains to show that ψ is measurable. Since S is minimal
sufficient, g (T (X )) = S(X ) a.s. P for a measurable function
g . Hence g is one-to-one and ψ = g −1 . By Theorem 3.9 in
Parthasarathy (1967), ψ is measurable.
Example 2.14
Let P = {fθ : θ ∈ Θ} be an exponential family with p.d.f.’s
fθ (x) = exp{[η(θ)]τ T (x) − ξ(θ)}h(x).
By the factorization theorem, T (X ) is sufficient for θ ∈ Θ. Suppose
that there exists Θ0 = {θ0 , θ1 , . . . , θp } ⊂ Θ such that the vectors
ηi = η(θi ) − η(θ0 ), i = 1, . . . , p, are linearly independent in Rp .
(This is true if the exponential family is of full rank). Then T is
also minimal sufficient.
Solution A: Let P0 = {fθ : θ ∈ Θ0 }. Note that the set
{x : fθ (x) > 0} does not depend on θ. It follows from Theorem
2.3(ii) with f∞ = fθ0 that
S(X) = (exp{η1τ T(x) − ξ1}, . . . , exp{ηpτ T(x) − ξp})
is minimal sufficient for θ ∈ Θ0 .
Example 2.14 (cont.)
Since ηi ’s are linearly independent, there is a one-to-one
measurable function ψ such that T (X ) = ψ(S(X )) a.s. P0 . Hence,
T is minimal sufficient for θ ∈ Θ0 . It is easy to see that a.s. P0
implies a.s. P. Thus, by Theorem 2.3(i), T is minimal sufficient
for θ ∈ Θ.
Solution B: Let φ(x, y) = h(x)/h(y). Then, for all θ,
fθ(x) = fθ(y)φ(x, y)
⇒ exp{[η(θ)]τ [T(x) − T(y)]} = 1
⇒ T(x) = T(y).
Since T is sufficient, by Theorem 2.3(iii), T is also minimal
sufficient.
Example 2.13 (revisited)
Let X1 , . . . , Xn be i.i.d. random variables from Pθ , the uniform
distribution U(θ, θ + 1), θ ∈ R, n > 1.
The joint Lebesgue p.d.f. of (X1 , . . . , Xn ) is
fθ(x) = ∏_{i=1}^n I(θ,θ+1)(xi) = I(x(n)−1, x(1))(θ),   x = (x1 , . . . , xn ) ∈ Rn ,
where x(i) denotes the ith smallest value of x1 , . . . , xn .
Here is another way to show that T = (X(1) , X(n) ) is minimal
sufficient.
Let φ(x, y ) = 1. Then
fθ (x) = fθ (y ), for all θ
⇒ I(x(n) −1,x(1) ) (θ) = I(y(n) −1,y(1) ) (θ) for all θ
⇒ (x(1) , x(n) ) = (y(1) , y(n) ).
By Theorem 2.3 (iii), T = (X(1) , X(n) ) is minimal sufficient.
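As a numerical aside (my own sketch, not part of the lecture notes), the following Python check confirms that for a fixed sample the U(θ, θ + 1) likelihood equals the indicator I(x(n)−1, x(1))(θ), so it depends on the data only through (x(1), x(n)):

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = 1.3
x = rng.uniform(theta_true, theta_true + 1, size=5)

def joint_pdf(x, theta):
    # f_theta(x) = prod_i I_(theta, theta+1)(x_i)
    return float(np.all((x > theta) & (x < theta + 1)))

# As a function of theta, the likelihood is I_(x_(n)-1, x_(1))(theta).
lo, hi = x.max() - 1.0, x.min()
for t in np.linspace(0.0, 3.0, 61):
    assert joint_pdf(x, t) == float(lo < t < hi)
```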
Example
Let (X1 , . . . , Xn ), n ≥ 2, be a random sample from a distribution
with discrete probability density fθ,j , where 0 < θ < 1, j = 1, 2, fθ,1
is the Poisson distribution with mean θ, and fθ,2 is the binomial
distribution with size 1 and probability θ. Find a two-dimensional
minimal sufficient statistic for (θ, j).
Let gθ,j be the joint probability density of X1 , . . . , Xn . Let
P0 = {g1/4,1 , g1/2,1 , g1/2,2 }. Note that
gθ,1 = e^{−nθ} θ^T / (∏_{i=1}^n xi!),   gθ,2 = θ^T (1 − θ)^{n−T} ∏_{i=1}^n I{0,1}(xi),
where T = Σ_{i=1}^n xi.
Then, a.s. P0 implies a.s. P.
By Theorem 2.3(ii), the two-dimensional statistic
S = (g1/2,1 /g1/4,1 , g1/2,2 /g1/4,1) = (e^{−n/4} 2^T , e^{n/4} 2^{−n} 4^T ∏_{i=1}^n I{0,1}(xi))
is minimal sufficient for the family P0 .
Let W = ∏_{i=1}^n I{0,1}(xi). Since there is a one-to-one
transformation between S and (T , W ), we conclude that
(T , W ) is minimal sufficient for the family P0 . The joint
density of X1 , . . . , Xn can be expressed as
θ^T exp{−nθ I{1}(j)} (1 − θ)^{(n−T) I{2}(j)} W^{I{2}(j)} ∏_{i=1}^n (1/xi!).
Hence, by the factorization theorem, (T , W ) is sufficient for
(θ, j). By Theorem 2.3(i), (T , W ) is minimal sufficient for
(θ, j).
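A minimal sketch (my own illustration, assuming the setup above): T = Σ xi and W = ∏ I{0,1}(xi) are computed directly from a sample, with W recording whether the sample is consistent with the binomial support.

```python
import numpy as np

def minimal_sufficient(x):
    # T = sum of the observations; W = prod_i I_{0,1}(x_i)
    T = int(np.sum(x))
    W = int(np.all((x == 0) | (x == 1)))
    return T, W

# A binomial(1, theta) sample always yields W = 1; a Poisson sample need not.
assert minimal_sufficient(np.array([0, 1, 1, 0])) == (2, 1)
assert minimal_sufficient(np.array([0, 3, 1, 0])) == (4, 0)
```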
Completeness
Discussion
A minimal sufficient statistic is not always the “simplest sufficient
statistic”. It might still contain something which is ancillary. For
example, if X¯ is minimal sufficient, then so is (X¯ , exp{X¯ }).
Ancillary statistic. A statistic V (X ) is ancillary if its distribution
does not depend on the population P. V (X ) is first-order ancillary
if E [V (X )] is independent of P.
A trivial ancillary statistic is the constant statistic V (X ) ≡ c. If
V (X ) is a nontrivial ancillary statistic, then σ(V (X )) is a nontrivial
σ-field but it does not contain any information about P.
If T (X ) is a statistic and V (T (X )) is a nontrivial ancillary statistic,
then σ(T (X )) contains σ(V (T (X ))), a nontrivial σ-field that does
not contain any information about P. Hence, the “data” T (X ) may
be further reduced.
If no nonconstant function of T (X ) is ancillary or even first-order
ancillary, we can say that T (X ) contains “completely” useful
information about P.
Definition 2.6 (Completeness)
A statistic T (X ) is said to be complete for P ∈ P iff, for any Borel
f , E [f (T )] = 0 for all P ∈ P implies f = 0 a.s. P.
T is said to be boundedly complete iff the previous statement
holds for any bounded Borel f .
Remarks
A complete statistic is boundedly complete.
If T is complete (or boundedly complete) and S = ψ(T ) for a
measurable ψ, then S is complete (or boundedly complete).
Intuitively, a complete and sufficient statistic should be
minimal sufficient (Exercise 48).
A minimal sufficient statistic is not necessarily complete; for
example, the minimal sufficient statistic (X(1) , X(n) ) in
Example 2.13 is not complete (Exercise 47).
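The last remark can be made concrete with a simulation sketch (my own, assuming X1 , ..., Xn i.i.d. U(θ, θ + 1)): the range X(n) − X(1) is ancillary with mean (n − 1)/(n + 1), so g(T) = X(n) − X(1) − (n − 1)/(n + 1) has E[g(T)] = 0 for every θ without being 0 a.s., which is exactly the failure of completeness.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 5, 200_000
for theta in (0.0, 2.0, -3.0):
    x = rng.uniform(theta, theta + 1, size=(reps, n))
    # g(T) = X_(n) - X_(1) - (n-1)/(n+1) has mean zero for every theta
    g = x.max(axis=1) - x.min(axis=1) - (n - 1) / (n + 1)
    assert abs(g.mean()) < 0.01   # E[g(T)] = 0 up to Monte Carlo error
    assert g.std() > 0.1          # but g(T) is not 0 a.s.
```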
Proposition 2.1
If P is in an exponential family of full rank with p.d.f.’s given by
fη(x) = exp{ητ T(x) − ζ(η)}h(x),
then T (X ) is complete and sufficient for η ∈ Ξ.
Proof
We have shown that T is sufficient.
We now show that T is complete.
Suppose that there is a function f such that E [f (T )] = 0 for all
η ∈ Ξ.
By Theorem 2.1(i),
∫ f(t) exp{ητ t − ζ(η)} dλ = 0   for all η ∈ Ξ,
where λ is a measure on (Rp , B p ).
Proof (continued)
Let η0 be an interior point of Ξ. Then
∫ f+(t) e^{ητ t} dλ = ∫ f−(t) e^{ητ t} dλ   for all η ∈ N(η0),   (1)
where N(η0) = {η ∈ Rp : ‖η − η0‖ < ε} for some ε > 0.
In particular,
∫ f+(t) e^{η0τ t} dλ = ∫ f−(t) e^{η0τ t} dλ = c.
If c = 0, then f = 0 a.e. λ.
If c > 0, then c^{−1} f+(t) e^{η0τ t} and c^{−1} f−(t) e^{η0τ t} are p.d.f.’s w.r.t. λ
and result (1) implies that their m.g.f.’s are the same in a
neighborhood of 0.
By Theorem 1.6(ii), c^{−1} f+(t) e^{η0τ t} = c^{−1} f−(t) e^{η0τ t}, i.e.,
f = f+ − f− = 0 a.e. λ.
Hence T is complete.
Example 2.15
Suppose that X1 , ..., Xn are i.i.d. random variables having the
N(µ, σ 2 ) distribution, µ ∈ R, σ > 0.
From Example 2.6, the joint p.d.f. of X1 , ..., Xn is
(2π)−n/2 exp {η1 T1 + η2 T2 − nζ(η)} ,
where T1 = Σ_{i=1}^n Xi , T2 = −Σ_{i=1}^n Xi^2 , and
η = (η1 , η2 ) = (µ/σ^2 , 1/(2σ^2)).
Hence, the family of distributions for X = (X1 , ..., Xn ) is a natural
exponential family of full rank (Ξ = R × (0, ∞)). By Proposition
2.1, T (X ) = (T1 , T2 ) is complete and sufficient for η.
Since there is a one-to-one correspondence between η and
θ = (µ, σ 2 ), T is also complete and sufficient for θ. It can be
shown that any one-to-one measurable function of a complete and
sufficient statistic is also complete and sufficient (exercise).
Thus, (X¯ , S 2 ) is complete and sufficient for θ, where X¯ and S 2 are
the sample mean and sample variance, respectively.
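The one-to-one map between (T1 , T2) and (X¯ , S^2) can be checked numerically; this sketch (my own) inverts it:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(2.0, 1.5, size=20)
n = len(x)
T1, T2 = x.sum(), -(x**2).sum()

# Invert (T1, T2) -> (xbar, s2): S^2 = (sum x_i^2 - n*xbar^2)/(n-1)
xbar = T1 / n
s2 = (-T2 - T1**2 / n) / (n - 1)
assert np.isclose(xbar, x.mean())
assert np.isclose(s2, x.var(ddof=1))
```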
Example 2.16
Let X1 , ..., Xn be i.i.d. random variables from Pθ , the uniform
distribution U(0, θ), θ > 0. The largest order statistic, X(n) , is
complete and sufficient for θ ∈ (0, ∞).
The sufficiency of X(n) follows from the fact that the joint
Lebesgue p.d.f. of X1 , ..., Xn is θ−n I(0,θ) (x(n) ).
From Example 2.9, X(n) has the Lebesgue p.d.f.
(nx n−1 /θn )I(0,θ) (x). Let f be a Borel function on [0, ∞) such that
E [f (X(n) )] = 0 for all θ > 0. Then
∫_0^θ f(x) x^{n−1} dx = 0   for all θ > 0.
Let G (θ) be the left-hand side of the previous equation.
Applying the result of differentiation of an integral (see, e.g.,
Royden (1968, §5.3)), we obtain that G′(θ) = f(θ)θ^{n−1} a.e. m+ ,
where m+ is the Lebesgue measure on ([0, ∞), B[0,∞) ). Since
G(θ) = 0 for all θ > 0, G′(θ) = 0 and, hence, f(θ)θ^{n−1} = 0 a.e. m+ ,
so f(x) = 0 a.e. m+ . Therefore, X(n) is complete and sufficient for θ ∈ (0, ∞).
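As a sanity check (a simulation sketch of my own), the density n x^{n−1}/θ^n of X(n) gives E[X(n)] = nθ/(n + 1), which a Monte Carlo sample reproduces:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, reps = 3.0, 5, 200_000
xmax = rng.uniform(0, theta, size=(reps, n)).max(axis=1)

# E[X_(n)] = integral_0^theta x * n x^{n-1}/theta^n dx = n*theta/(n+1)
assert abs(xmax.mean() - n * theta / (n + 1)) < 0.01
```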
Example 2.17
Let P be the family of distributions on R having Lebesgue p.d.f.’s.
The order statistics T (X ) = (X(1) , ..., X(n) ) of i.i.d. random
variables X1 , ..., Xn is complete for P ∈ P.
Let P0 be the family of Lebesgue p.d.f.’s of the form
f (x) = C (θ1 , ..., θn ) exp{−x 2n + θ1 x + θ2 x 2 + · · · + θn x n },
where θj ∈ R and C (θ1 , ..., θn ) is a normalizing constant such
that ∫ f(x)dx = 1. Then P0 ⊂ P and P0 is an exponential
family of full rank.
Note that the joint distribution of X = (X1 , ..., Xn ) is also in
an exponential family of full rank. Thus, by Proposition 2.1,
U = (U1 , ..., Un ) is a complete statistic for P ∈ P0 , where
Uj = Σ_{i=1}^n Xi^j .
Since a.s. P0 implies a.s. P, U(X ) is also complete for P ∈ P.
Example 2.17 (continued)
The result follows from a one-to-one correspondence between
T (X ) and U(X ).
Let V1 = Σ_{i=1}^n Xi , V2 = Σ_{i<j} Xi Xj , V3 = Σ_{i<j<k} Xi Xj Xk , ...,
Vn = X1 · · · Xn . From the relationship (Newton’s identities)
Uk −V1 Uk−1 +V2 Uk−2 −· · ·+(−1)k−1 Vk−1 U1 +(−1)k kVk = 0,
k = 1, ..., n, there is a one-to-one correspondence between
U(X ) and V (X ) = (V1 , ..., Vn ).
From the expansion
(t − X1 ) · · · (t − Xn ) = t n − V1 t n−1 + V2 t n−2 − · · · + (−1)n Vn ,
there is a one-to-one correspondence between V (X ) and
T (X ), since the roots of a polynomial and its coefficients
uniquely determine each other.
Thus T (X ) and U(X ) are one-to-one, and hence T (X ) is
complete.
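Both correspondences can be traced in code. This sketch (my own, with a hypothetical helper `elementary_from_power_sums`) recovers the elementary symmetric functions V from the power sums U via Newton's identities, and then the values themselves as polynomial roots:

```python
import numpy as np

def elementary_from_power_sums(U):
    # Newton's identities: U_k - V_1 U_{k-1} + ... + (-1)^k k V_k = 0
    n = len(U)
    V = [1.0]  # V_0 = 1 by convention
    for k in range(1, n + 1):
        s = sum((-1) ** (j - 1) * V[k - j] * U[j - 1] for j in range(1, k + 1))
        V.append(s / k)
    return V[1:]

x = np.array([0.5, -1.2, 2.0, 0.7])
U = [float(np.sum(x**k)) for k in range(1, 5)]   # U_1, ..., U_4
V = elementary_from_power_sums(U)

# (t - x_1)...(t - x_n) = t^n - V_1 t^{n-1} + V_2 t^{n-2} - ... + (-1)^n V_n
coeffs = [1.0] + [(-1) ** k * V[k - 1] for k in range(1, 5)]
roots = np.roots(coeffs)
assert np.allclose(np.sort(roots.real), np.sort(x), atol=1e-6)
```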
The relationship between an ancillary statistic and a complete and
sufficient statistic is characterized in the following result.
Theorem 2.4 (Basu’s theorem)
Let V and T be two statistics of X from a population P ∈ P.
If V is ancillary and T is boundedly complete and sufficient for
P ∈ P, then V and T are independent w.r.t. any P ∈ P.
Proof
Let B be an event on the range of V .
Since V is ancillary, P(V −1 (B)) is a constant.
As T is sufficient, E [IB (V )|T ] is a function of T (not dependent
on P).
Because
E {E [IB (V )|T ] − P(V −1 (B))} = 0
for all P ∈ P,
by the bounded completeness of T ,
P(V −1 (B)|T ) = E [IB (V )|T ] = P(V −1 (B))   a.s. P.
Proof (continued)
Let A be an event on the range of T .
Then
P(T −1 (A)∩V −1 (B)) = E {E [IA (T )IB (V )|T ]} = E {IA (T )E [IB (V )|T ]}
= E {IA (T )P(V −1 (B))} = P(T −1 (A))P(V −1 (B)).
Hence T and V are independent w.r.t. any P ∈ P.
Remark
Basu’s theorem is useful in proving the independence of two
statistics.
Example 2.18
Suppose that X1 , ..., Xn are i.i.d. random variables having the
N(µ, σ 2 ) distribution, with µ ∈ R and a known σ > 0.
It can be easily shown that the family {N(µ, σ 2 ) : µ ∈ R} is an
exponential family of full rank with natural parameter η = µ/σ 2 .
By Proposition 2.1, the sample mean X¯ is complete and sufficient
for η (and µ).
Example 2.18 (continued)
Let S 2 be the sample variance.
Since S^2 = (n − 1)^{−1} Σ_{i=1}^n (Zi − Z¯)^2 , where Zi = Xi − µ is
N(0, σ^2) and Z¯ = n^{−1} Σ_{i=1}^n Zi , S^2 is an ancillary statistic (σ^2 is
known).
By Basu’s theorem, X¯ and S 2 are independent w.r.t. N(µ, σ 2 ) with
µ ∈ R.
Since σ 2 is arbitrary, X¯ and S 2 are independent w.r.t. N(µ, σ 2 ) for
any µ ∈ R and σ 2 > 0.
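A simulation sketch (my own) illustrates the conclusion numerically: across many N(µ, σ^2) samples, X¯ and S^2 are independent, so in particular uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(4)
samples = rng.normal(1.0, 2.0, size=(100_000, 10))
xbar = samples.mean(axis=1)           # sample means
s2 = samples.var(axis=1, ddof=1)      # sample variances

# Basu's theorem: xbar and s2 are independent, so their correlation is ~0
corr = np.corrcoef(xbar, s2)[0, 1]
assert abs(corr) < 0.02
```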
Using the independence of X¯ and S 2 , we now show that
(n − 1)S 2 /σ 2 has the chi-square distribution χ2n−1 .
Note that
n((X¯ − µ)/σ)^2 + (n − 1)S^2/σ^2 = Σ_{i=1}^n ((Xi − µ)/σ)^2 .
Example 2.18 (continued)
From the properties of the normal distributions, n(X¯ − µ)^2/σ^2 has
the chi-square distribution χ^2_1 with the m.g.f. (1 − 2t)^{−1/2} and
Σ_{i=1}^n (Xi − µ)^2/σ^2 has the chi-square distribution χ^2_n with the
m.g.f. (1 − 2t)^{−n/2} , t < 1/2.
By the independence of X¯ and S 2 , the m.g.f. of (n − 1)S 2 /σ 2 is
(1 − 2t)^{−n/2} / (1 − 2t)^{−1/2} = (1 − 2t)^{−(n−1)/2}   for t < 1/2.
This is the m.g.f. of the chi-square distribution χ2n−1 and,
therefore, the result follows.
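The conclusion can be checked by simulation (my own sketch): for n = 8, (n − 1)S^2/σ^2 should match the χ^2_7 mean (7) and variance (14).

```python
import numpy as np

rng = np.random.default_rng(5)
n, sigma, reps = 8, 1.5, 200_000
s2 = rng.normal(0.0, sigma, size=(reps, n)).var(axis=1, ddof=1)
q = (n - 1) * s2 / sigma**2

# chi-square with n-1 degrees of freedom: mean n-1, variance 2(n-1)
assert abs(q.mean() - (n - 1)) < 0.1
assert abs(q.var() - 2 * (n - 1)) < 0.5
```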
Cochran’s Theorem
Suppose that X ∼ N(0, In ) and
X^τ X = X^τ A1 X + · · · + X^τ Ak X,
where In is the n × n identity matrix and Ai is an n × n symmetric
matrix, i = 1, . . . , k. A necessary and sufficient condition that
X^τ Ai X has the chi-square distribution χ^2_{ni} , i = 1, . . . , k, and the
X^τ Ai X’s are independent is n = n1 + · · · + nk .
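A simulation sketch (my own, assuming the standard decomposition A1 = Jn/n and A2 = In − Jn/n, with ranks 1 and n − 1): since the ranks add to n, Cochran's theorem gives independent χ^2_1 and χ^2_{n−1} quadratic forms.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 6, 100_000
X = rng.normal(0.0, 1.0, size=(reps, n))

J = np.full((n, n), 1.0 / n)       # rank-1 projection onto the mean direction
A1, A2 = J, np.eye(n) - J          # symmetric, A1 + A2 = I_n, ranks 1 and n-1
q1 = np.einsum('ri,ij,rj->r', X, A1, X)
q2 = np.einsum('ri,ij,rj->r', X, A2, X)

# q1 ~ chi^2_1, q2 ~ chi^2_{n-1}, and they are independent
assert abs(q1.mean() - 1) < 0.05
assert abs(q2.mean() - (n - 1)) < 0.05
assert abs(np.corrcoef(q1, q2)[0, 1]) < 0.02
```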