Solution - The Department of Statistics and Applied Probability, NUS

ST5215: Advanced Statistical Theory
2014/2015: Semester I
Tutorial 7
1. Show that {Pθ : θ ∈ Θ} is an exponential family and find its canonical form and natural
parameter space, when
(i) Pθ is the Poisson distribution P (θ) : θ ∈ Θ = (0, ∞);
(ii) Pθ is the negative binomial distribution N B(θ, r) with a fixed r, θ ∈ Θ = (0, 1);
(iii) Pθ is the exponential distribution E(a, θ) with a fixed a, θ ∈ Θ = (0, ∞);
(iv) Pθ is the gamma distribution Γ(α, γ), θ = (α, γ) ∈ Θ = (0, ∞) ⊗ (0, ∞);
(v) Pθ is the beta distribution B(α, β), θ = (α, β) ∈ Θ = (0, 1) ⊗ (0, 1);
(vi) Pθ is the Weibull distribution W (α, θ) with a fixed α > 0, θ ∈ Θ = (0, ∞).
Solution:
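(i) The p.d.f. of the Poisson distribution can be expressed as
$$\frac{\theta^x e^{-\theta}}{x!} = \exp\{x \ln\theta - \theta\}\,(x!)^{-1} I_{\{0,1,2,\ldots\}}(x),$$
which has the form of an exponential family. The natural parameter is η = ln θ, the canonical form is
$$\exp\{\eta x - e^{\eta}\}\,(x!)^{-1} I_{\{0,1,2,\ldots\}}(x),$$
and the natural parameter space is (−∞, ∞).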
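The identity in (i) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the test values of θ are illustrative choices, not part of the tutorial.

```python
import math


def poisson_pmf(x, theta):
    """Poisson probability theta^x e^{-theta} / x! for x = 0, 1, 2, ..."""
    return theta ** x * math.exp(-theta) / math.factorial(x)


def poisson_canonical(x, eta):
    """Canonical form exp{eta * x - exp(eta)} / x! with eta = ln(theta)."""
    return math.exp(eta * x - math.exp(eta)) / math.factorial(x)


def poisson_gap(theta, xs=range(20)):
    """Largest absolute difference between the two forms over xs."""
    eta = math.log(theta)
    return max(abs(poisson_pmf(x, theta) - poisson_canonical(x, eta)) for x in xs)


# Illustrative parameter values (assumed for this check only).
for theta in (0.3, 2.0, 7.5):
    print(theta, poisson_gap(theta))
```

Any real η corresponds to some θ = e^η > 0, which is consistent with the natural parameter space being the whole real line.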
(iii) The exponential distribution with fixed a has the p.d.f.
$$\theta^{-1} e^{-(x-a)/\theta} I_{(a,\infty)}(x) = \exp\{-\theta^{-1}(x-a) - \ln\theta\}\, I_{(a,\infty)}(x) = \exp\{\eta(x-a) + \ln(-\eta)\}\, I_{(a,\infty)}(x),$$
which is in the form of an exponential family. The natural parameter is η = −θ^{-1}.
The natural parameter space is Ξ = (−∞, 0).
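As a quick sanity check of (iii), the sketch below compares the original density with its canonical form at a few points. The function names and the values of a and θ are assumptions made for illustration.

```python
import math


def shifted_exp_pdf(x, a, theta):
    """Density of E(a, theta): exp{-(x - a)/theta} / theta on (a, infinity)."""
    return math.exp(-(x - a) / theta) / theta if x > a else 0.0


def shifted_exp_canonical(x, a, eta):
    """Canonical form exp{eta (x - a) + ln(-eta)} on (a, infinity); needs eta < 0."""
    return math.exp(eta * (x - a) + math.log(-eta)) if x > a else 0.0


def shifted_exp_gap(a, theta, xs):
    """Largest absolute difference between the two forms over xs."""
    eta = -1.0 / theta  # natural parameter, always negative
    return max(
        abs(shifted_exp_pdf(x, a, theta) - shifted_exp_canonical(x, a, eta))
        for x in xs
    )


# Illustrative values: a = 1, theta in {0.5, 3}, points on both sides of a.
points = [0.0, 1.0, 1.2, 2.5, 10.0]
print(shifted_exp_gap(1.0, 0.5, points), shifted_exp_gap(1.0, 3.0, points))
```

The term ln(−η) is defined only for η < 0, which matches the natural parameter space (−∞, 0).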
(v) The beta distribution has the p.d.f.
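$$\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1}(1-x)^{\beta-1} I_{(0,1)}(x) = \exp\left\{\alpha \ln x + \beta \ln(1-x) + \ln\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\right\}[x(1-x)]^{-1} I_{(0,1)}(x),$$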
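which is in the canonical form of an exponential family. The natural parameter is
η = (α, β). The natural parameter space consists of all (α, β) for which x^{α−1}(1 − x)^{β−1} is integrable over (0, 1), that is, (0, ∞) ⊗ (0, ∞), which contains the given Θ = (0, 1) ⊗ (0, 1).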
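The factorization in (v) can be verified numerically. This is a minimal sketch with the standard library only; the function names and parameter pairs are illustrative. The last pair lies outside Θ = (0, 1) ⊗ (0, 1), where the density is still well defined.

```python
import math


def beta_pdf(x, alpha, beta):
    """Beta density Gamma(a+b)/(Gamma(a)Gamma(b)) x^(a-1) (1-x)^(b-1) on (0, 1)."""
    const = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return const * x ** (alpha - 1) * (1 - x) ** (beta - 1)


def beta_canonical(x, alpha, beta):
    """exp{alpha ln x + beta ln(1-x) + ln const} * [x(1-x)]^(-1) on (0, 1)."""
    log_const = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    exponent = alpha * math.log(x) + beta * math.log(1 - x) + log_const
    return math.exp(exponent) / (x * (1 - x))


def beta_gap(alpha, beta, xs=(0.1, 0.25, 0.5, 0.9)):
    """Largest absolute difference between the two forms over xs."""
    return max(abs(beta_pdf(x, alpha, beta) - beta_canonical(x, alpha, beta)) for x in xs)


# Illustrative parameter pairs (assumed for this check only).
for alpha, beta in ((0.5, 0.5), (0.2, 0.9), (2.5, 3.0)):
    print(alpha, beta, beta_gap(alpha, beta))
```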
2. Show that {Pθ : θ ∈ Θ} is not an exponential family, when
(i) Pθ is the exponential distribution E(a, θ) with two unknown parameters a, and θ;
(ii) Pθ is the negative binomial distribution N B(θ, r) with two unknown parameters r and θ.
Solution:
(i) If E(a, θ) is an exponential family, then E(a, θ) has a positive density
$$\exp\{\eta(a,\theta)^\tau T(x) - \xi(a,\theta)\} \tag{1}$$
with respect to a non-zero measure ν.
Consider the interval (−∞, t) for any t ∈ R. There is an a ∈ R such that a > t,
hence P(a,θ) [(−∞, t)] = 0. This together with (1) implies that ν[(−∞, t)] = 0. Since t
is arbitrary, ν must be a zero measure, which is a contradiction.
(ii) The proof is the same, because the support {r, r + 1, . . .} of NB(θ, r) depends on r.
The method above is a general method for proving that any family of distributions whose support (the set on which the density is nonzero) depends on unknown
parameters is not an exponential family.
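The support argument can be illustrated with a small sketch: on a fixed grid, the set of points where the E(a, θ) density is positive changes with a, whereas a density of the form (1) is positive at the same points for every parameter value. The grid and parameter values below are assumptions chosen for illustration.

```python
import math


def shifted_exp_pdf(x, a, theta):
    """Density of E(a, theta): exp{-(x - a)/theta} / theta on (a, infinity)."""
    return math.exp(-(x - a) / theta) / theta if x > a else 0.0


def support_on_grid(a, theta, grid):
    """Grid points at which the density is strictly positive."""
    return [x for x in grid if shifted_exp_pdf(x, a, theta) > 0]


# Illustrative grid -2.0, -1.5, ..., 4.0 and two values of a.
grid = [i / 2 for i in range(-4, 9)]
support_a0 = support_on_grid(0.0, 1.0, grid)
support_a1 = support_on_grid(1.0, 1.0, grid)

print(support_a0)
print(support_a1)
print(support_a0 == support_a1)
```

The point 0.5 has positive density under a = 0 but zero density under a = 1, so no single dominating measure ν can give a strictly positive density for both.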
3. Show that the family of Cauchy distributions C(µ, σ) is not an exponential family. The
Cauchy distribution C(µ, σ) has p.d.f. given by
$$f(x;\mu,\sigma) = \frac{1}{\pi\sigma}\left[1 + \left(\frac{x-\mu}{\sigma}\right)^2\right]^{-1}, \quad x \in \mathbb{R},\ \mu \in \mathbb{R},\ \sigma > 0.$$
Solution:
Without loss of generality, assume σ = 1 (a subfamily of an exponential family is again an exponential family, so it suffices to show that C(µ, 1) is not one). Arguing by contradiction, suppose that
C(µ, 1) is an exponential family. Then there exist p-dimensional Borel functions T(x) and
η(µ) (p ≥ 1) and one-dimensional Borel functions h(x) and ξ(µ) such that
$$\frac{1}{\pi}\left[1 + (x-\mu)^2\right]^{-1} = e^{\eta(\mu)^\tau T(x) - \xi(\mu)}\, h(x)$$
for any x and µ. Let X = (X1, . . . , Xn) be a random sample from C(µ, 1), where n > p.
Let $T_n(X) = \sum_{i=1}^n T(X_i)$ and $h_n(X) = \prod_{i=1}^n h(X_i)$. Then the joint Lebesgue density of
X is
$$\frac{1}{\pi^n}\prod_{i=1}^n \left[1 + (x_i-\mu)^2\right]^{-1} = e^{\eta(\mu)^\tau T_n(x) - n\xi(\mu)}\, h_n(x)$$
for any x = (x1 , . . . , xn ) and µ, which implies that
$$\ln\left(\prod_{i=1}^n \frac{1 + x_i^2}{1 + (x_i-\mu)^2}\right) = \tilde\eta(\mu)^\tau T_n(x) - n\tilde\xi(\mu)$$
for any x and µ, where $\tilde\eta(\mu) = \eta(\mu) - \eta(0)$ and $\tilde\xi(\mu) = \xi(\mu) - \xi(0)$. Define
$$\psi_\mu(x) = \prod_{i=1}^n \frac{1 + (x_i-\mu)^2}{1 + x_i^2}.$$
As a function of µ, ψµ(x) is a polynomial of degree 2n which has 2n complex roots $x_{(i)} \pm \sqrt{-1}$, i =
1, . . . , n, where $x_{(1)} \le \cdots \le x_{(n)}$ are the ordered values of x1, . . . , xn. If x = (x1, . . . , xn) and y = (y1, . . . , yn) are such that Tn(x) = Tn(y), then ψµ(x) =
ψµ(y) for all µ, which implies $(x_{(1)}, \ldots, x_{(n)}) = (y_{(1)}, \ldots, y_{(n)})$.
On the other hand, we may choose real numbers µ1, . . . , µp such that $\tilde\eta(\mu_i)$, i =
1, ..., p, are linearly independent vectors. Since
$$-\ln\psi_{\mu_i}(x) = \tilde\eta(\mu_i)^\tau T_n(x) - n\tilde\xi(\mu_i)$$
for any x, Tn(x) is then a function of the p functions $\psi_{\mu_i}(x)$, i = 1, ..., p. Since n > p,
it can be shown that there exist x and y in R^n such that $\psi_{\mu_i}(x) = \psi_{\mu_i}(y)$, i = 1, ..., p
(which implies Tn(x) = Tn(y)), but the vector of ordered xi's is not the same as the
vector of ordered yi's. This contradicts the previous conclusion. Hence, the family of Cauchy distributions is not
an exponential family.
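Two properties of ψµ(x) used in the argument can be checked numerically: it vanishes at µ = xi ± √−1, and it depends on x only through the ordered values. The sketch below uses Python complex arithmetic; the sample points are arbitrary illustrative values.

```python
def psi(mu, xs):
    """psi_mu(x) = prod_i (1 + (x_i - mu)^2) / (1 + x_i^2); mu may be complex."""
    value = 1.0
    for x in xs:
        value *= (1 + (x - mu) ** 2) / (1 + x ** 2)
    return value


# Illustrative sample (assumed values).
xs = [0.3, -1.2, 2.0]

# The 2n candidate roots x_i + i and x_i - i.
roots = [x + sign * 1j for x in xs for sign in (1, -1)]
root_values = [abs(psi(r, xs)) for r in roots]

# Permuting the sample leaves psi unchanged.
perm_gap = abs(psi(0.7, xs) - psi(0.7, [2.0, 0.3, -1.2]))

print(root_values)
print(perm_gap)
```

Because a polynomial is determined up to a constant by its roots, equality of ψµ(x) and ψµ(y) for all µ forces the two samples to share the same ordered values.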
4. Consider the multinomial distribution with p.d.f. given by
$$f(x_1, x_2, \ldots, x_k) = \frac{n!}{x_1! \cdots x_k!}\, \theta_1^{x_1} \cdots \theta_{k-1}^{x_{k-1}} \theta_k^{x_k},$$
where the xj's are nonnegative integers satisfying $\sum_{j=1}^k x_j = n$, and θj > 0, $\sum_{j=1}^k \theta_j = 1$.
(i) Show that the multinomial distribution is an exponential family with θ = (θ1 , . . . , θk ), but
it does not have a full rank.
(ii) Provide a re-parameterization of the family such that, with the re-parameterized parameter
space, the multinomial distribution is a full rank exponential family.
Solution:
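(i) The p.d.f. can be expressed as
$$f(x_1, x_2, \ldots, x_k) = \exp\left\{\sum_{j=1}^k \ln(\theta_j)\, x_j\right\} \frac{n!}{x_1! \cdots x_k!},$$
which is of the form of an exponential family with parameter space $\{(\theta_1, \ldots, \theta_k) : \theta_j > 0,\ j = 1, \ldots, k,\ \sum_{j=1}^k \theta_j = 1\}$. However, because of the constraint $\sum_{j=1}^k \theta_j = 1$, the parameter space, and hence the set of natural parameters (ln θ1, . . . , ln θk), does not contain an
open set in the space R^k. Hence the family does not have full rank.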
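(ii) Re-parameterize the family by ϑ = (θ1, . . . , θk−1), where θj > 0, j = 1, . . . , k − 1, and $\sum_{j=1}^{k-1} \theta_j < 1$. With the new parameterization, the p.d.f. becomes
$$\begin{aligned}
f(x_1, x_2, \ldots, x_k) &= \exp\left\{\sum_{j=1}^{k-1} \ln(\theta_j)\, x_j + \ln\Big(1 - \sum_{j=1}^{k-1} \theta_j\Big)\Big(n - \sum_{j=1}^{k-1} x_j\Big)\right\} \frac{n!}{x_1! \cdots x_k!} \\
&= \exp\left\{\sum_{j=1}^{k-1} x_j \ln\frac{\theta_j}{1 - \sum_{i=1}^{k-1} \theta_i} + n \ln\Big(1 - \sum_{j=1}^{k-1} \theta_j\Big)\right\} \frac{n!}{x_1! \cdots x_k!}.
\end{aligned}$$
The new parameter space itself is an open set in R^{k−1}, and the natural parameter with components $\eta_j = \ln[\theta_j / (1 - \sum_{i=1}^{k-1} \theta_i)]$, j = 1, . . . , k − 1, ranges over all of R^{k−1}. Hence the family is of full
rank k − 1.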
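The re-parameterized form in (ii) can be checked against the original p.d.f. The following is a minimal sketch with the standard library; the function names and the example cell probabilities and counts are illustrative assumptions.

```python
import math


def multinomial_coef(xs):
    """Multinomial coefficient n! / (x_1! ... x_k!) with n = sum(xs)."""
    denom = math.prod(math.factorial(x) for x in xs)
    return math.factorial(sum(xs)) // denom


def multinomial_pmf(xs, thetas):
    """Original form: coefficient times prod_j theta_j^{x_j}."""
    return multinomial_coef(xs) * math.prod(t ** x for x, t in zip(xs, thetas))


def multinomial_full_rank(xs, free_thetas):
    """Full rank form using only theta_1, ..., theta_{k-1}."""
    n = sum(xs)
    rest = 1.0 - sum(free_thetas)                    # equals theta_k
    etas = [math.log(t / rest) for t in free_thetas]  # natural parameters
    exponent = sum(e * x for e, x in zip(etas, xs[:-1])) + n * math.log(rest)
    return multinomial_coef(xs) * math.exp(exponent)


# Illustrative example with k = 3 cells and n = 6 trials.
thetas = (0.2, 0.3, 0.5)
xs = (1, 2, 3)
original = multinomial_pmf(xs, thetas)
full_rank = multinomial_full_rank(xs, thetas[:-1])
print(original, full_rank)
```

Only k − 1 free natural parameters appear in the second function, which mirrors the drop from k to k − 1 dimensions in the solution.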
5. Let X1 , . . . , Xn be i.i.d. samples from a population P ∈ P where P is any family of
distributions. Show that T (X) = (X(1) , . . . , X(n) ) is sufficient for P, where X(k) is the k-th
smallest order statistic.
Solution:
Given (X(1), . . . , X(n)), the sample (X1, . . . , Xn) is equally likely to be any permutation of (X(1), . . . , X(n)). Hence, the conditional probability of (X1, . . . , Xn) given T(X) is
$$P(X_1 = x_1, \ldots, X_n = x_n \mid T(X)) = \frac{1}{n!},$$
where (x1, . . . , xn) is any permutation of (X(1), . . . , X(n)) (if some values are tied, the probability is instead spread equally over the distinct arrangements). No matter what family is
assumed, the conditional distribution does not depend on any unknowns of the
family. Hence T(X) is sufficient by definition.
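For a discrete population the claim can be verified by exact enumeration. The sketch below computes the conditional distribution of the sample given its order statistics under two different p.m.f.s and compares the results; the p.m.f.s and the helper name are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import prod


def conditional_given_order_stats(sorted_values, pmf):
    """Conditional probability of each arrangement of an i.i.d. sample,
    given that its order statistics equal sorted_values."""
    arrangements = set(permutations(sorted_values))
    # Joint probability of each arrangement under independence.
    joint = {arr: prod(pmf[v] for v in arr) for arr in arrangements}
    total = sum(joint.values())
    return {arr: p / total for arr, p in joint.items()}


# Two illustrative populations on {1, 2, 3} (assumed for this check).
pmf_a = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}
pmf_b = {1: Fraction(1, 10), 2: Fraction(2, 10), 3: Fraction(7, 10)}

cond_a = conditional_given_order_stats((1, 2, 3), pmf_a)
cond_b = conditional_given_order_stats((1, 2, 3), pmf_b)
tied = conditional_given_order_stats((1, 1, 2), pmf_a)

print(cond_a == cond_b)
print(set(cond_a.values()), set(tied.values()))
```

Every arrangement has the same joint probability because the product of the p.m.f. values does not depend on their order, so the population cancels from the conditional distribution.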