2WS30 - Mathematical Statistics

Homework 2 - Due 8th of October
Exercise 1. Consider a setting where the observed data (x1 , . . . , xn ) is modeled as a realization
of the random vector (X1 , . . . , Xn ), which is a random sample from a normal distribution with
known mean µ and unknown variance σ 2 .
(a) Show that the maximum likelihood estimator of σ² is given by

    σ̂²_n = (1/n) ∑_{i=1}^n (Xi − µ)².
(b) Check that σ̂²_n is an unbiased estimator of σ².
(c) Compute the variance of σ̂²_n.
(d) Compute the Cramér-Rao lower bound for the variance of any unbiased estimator of σ².
Is σ̂²_n the UMVUE?
(e) Consider now the class of estimators of σ² given by

    a_n ∑_{i=1}^n (Xi − µ)²,

where a_n is an arbitrary sequence that can be a function of n and µ, but not a function of
σ. Find the choice of a_n that yields the estimator with the minimal MSE.
(f) Show that the MSE of the estimator in the previous question is given by 2σ⁴/(n + 2). This
value is lower than the Cramér-Rao lower bound, which shows that biased estimators can
outperform the best possible unbiased estimator of a parameter. So bias is not necessarily
a bad thing...
Hint: For (c) it is useful to note that, if Z1 , . . . , Zn are i.i.d. standard normal random
variables then Y = ∑_{i=1}^n Zi² is a χ² random variable with n degrees of freedom. In particular
E(Y) = n and V(Y) = 2n.
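The trade-off in parts (e)-(f) is easy to see numerically. The sketch below (not part of the assignment) compares, by Monte Carlo, the MSE of the unbiased estimator (a_n = 1/n) with that of the shrunken estimator with a_n = 1/(n + 2); the values of µ, σ and n are illustrative choices, not values given in the exercise.

```python
# Monte Carlo sketch: MSE of the unbiased (a_n = 1/n) vs. the MSE-optimal
# shrunken (a_n = 1/(n + 2)) estimator of sigma^2, with mu known.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 0.0, 2.0, 10, 200_000   # illustrative parameter choices

x = rng.normal(mu, sigma, size=(reps, n))
s = ((x - mu) ** 2).sum(axis=1)              # sum of squared deviations from the known mean

mse_unbiased = np.mean((s / n - sigma**2) ** 2)        # a_n = 1/n
mse_shrunken = np.mean((s / (n + 2) - sigma**2) ** 2)  # a_n = 1/(n + 2)

print(mse_unbiased)   # should be close to 2*sigma^4 / n       = 3.2
print(mse_shrunken)   # should be close to 2*sigma^4 / (n + 2) ≈ 2.67
```

Both estimates should match the theoretical values 2σ⁴/n and 2σ⁴/(n + 2), with the shrunken estimator winning despite its bias.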
Exercise 2. (Based on exercise 2.17 of the (AR) textbook) Let Y1 , . . . , Yn ∼ Geom(p).
(a) Find the MLE of p.
(b) Use the Fisher-Neyman factorization theorem to check that W = ∑_{i=1}^n Yi is a sufficient
statistic for p.
(c) Find an unbiased estimator of p based on W , using Rao-Blackwellization.
Hint: Recall that ∑_{i=1}^n Yi follows a negative binomial distribution.
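As a sanity check for (c), the sketch below assumes the "number of trials" convention for Geom(p), i.e. support {1, 2, . . .}; under that convention the Rao-Blackwell argument leads to the estimator (n − 1)/(W − 1), and the simulation checks that its expectation is approximately p. The values of p and n are illustrative choices.

```python
# Simulation sketch: unbiasedness of (n - 1)/(W - 1) for Geom(p) samples,
# where W = sum of the Y_i (numpy's geometric uses the trials convention,
# so each Y_i >= 1 and W >= n).
import numpy as np

rng = np.random.default_rng(1)
p, n, reps = 0.3, 5, 400_000     # illustrative parameter choices

w = rng.geometric(p, size=(reps, n)).sum(axis=1)   # W >= n, so W - 1 >= n - 1 > 0
estimates = (n - 1) / (w - 1)

print(estimates.mean())   # should be close to p = 0.3
```

If your course uses the "number of failures" convention instead, the estimator takes a slightly different form, so treat this as a check of the method rather than of a specific formula.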
Exercise 3. Suppose you observe data x which is modeled as a realization of a Poisson random
variable X with expected value λ > 0. For this question we would like to estimate
θ ≡ (P(X = 0))2 = e−2λ .
(a) Use the invariance principle to compute the Maximum Likelihood Estimator (MLE) of θ.
(b) Show that the MLE is biased, and the bias is given by

    e^{−λ(1−e^{−2})} − e^{−2λ}.
(c) Show that the MSE of the MLE is given by

    e^{−λ(1−e^{−4})} + e^{−4λ} − 2e^{−λ(3−e^{−2})}.
Now suppose we want to restrict our attention to unbiased estimators of θ. Consider the
following proposal
θ̂ = (−1)^X.
(d) Show that θ̂ is an unbiased estimator of θ.
(e) Show that the MSE of θ̂ is given by 1 − e^{−4λ}.
(f) Plot the MSEs of the MLE and of θ̂ as a function of λ, and observe that the MLE is always,
by far, the better estimator (in terms of MSE).
(g) Suppose now your data is modeled by a realization of X1 , . . . , Xn , which are i.i.d. Poisson
random variables with mean λ. Use the Rao-Blackwell theorem to show that
    θ̂_n = (1 − 2/n)^{∑_{i=1}^n Xi},

is the UMVUE of θ (in the above use the convention 0^0 = 1).
Remarks: Note that θ̂ is actually the UMVUE, despite the fact that it is a very silly estimator:
it only takes the values 1 and −1. However, the parameter θ is always positive, and therefore
the estimator is completely absurd when X is odd (as it then takes the value −1). This shows that
pursuing unbiased estimation might lead to absurd estimation procedures.
Hint: To answer the above questions it is helpful to know (or derive) the moment generating
function of a Poisson random variable.
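For (f), instead of plotting, the sketch below simply evaluates the two closed-form MSE expressions from (c) and (e) on a grid of λ values and checks that the (biased) MLE has the smaller MSE everywhere on the grid; the grid endpoints are arbitrary.

```python
# Evaluate the MSE of the MLE (from (c)) and of (-1)^X (from (e)) on a grid.
import numpy as np

lam = np.linspace(0.01, 10, 1000)   # arbitrary grid of lambda values

mse_mle = (np.exp(-lam * (1 - np.exp(-4)))
           + np.exp(-4 * lam)
           - 2 * np.exp(-lam * (3 - np.exp(-2))))
mse_unbiased = 1 - np.exp(-4 * lam)   # MSE of the unbiased estimator (-1)^X

print(bool(np.all(mse_mle < mse_unbiased)))   # True on this grid
```

For a plot, `matplotlib.pyplot.plot(lam, mse_mle)` and the same call with `mse_unbiased` reproduce what (f) asks for.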
Exercise 4. Let X1 , . . . , Xn be an i.i.d. random sample from a continuous distribution with
density given by
    f(x; θ) = 2(θ − x)/θ²  if 0 ≤ x ≤ θ,  and  f(x; θ) = 0  otherwise,
where θ > 0 is an unknown parameter.
(a) Show that these random variables belong to a scale family of distributions.
(b) From the course you know that for scale families there is a simple pivotal quantity involving
the maximum likelihood estimator. In this case, however, the maximum likelihood estimator
is quite difficult to compute explicitly. Nevertheless one can propose a different pivotal
quantity:
Show that Q = (1/√n) ∑_{i=1}^n (Xi/θ − 1/3) is a pivotal quantity for θ (Hint: begin by showing
that Xi/θ is a pivotal quantity).
(c) The exact distribution of Q is a bit difficult to derive, but it converges to a known
distribution as n increases. Use the central limit theorem to characterize the asymptotic
distribution of Q.
(d) Suppose you have a realization of the sample with n = 100, for which the sample mean is
x̄ = 13.7. Use the answer from (c) to construct an approximate 95% confidence interval
for θ. You can make use of the following facts about the quantiles zα = Φ−1 (α) of the
standard normal distribution: z0.95 ≈ 1.645 and z0.975 ≈ 1.96.
Note about question (d): If you have not answered question (c) you can assume Q follows
approximately a normal distribution with mean 0 and variance 1/18.
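The numbers in (d) can be checked with a short computation. Since Q = √n (x̄/θ − 1/3) and, by the CLT, Q is approximately N(0, v) with v = Var(Xi/θ) = E[(X/θ)²] − (1/3)² = 1/6 − 1/9 = 1/18 (computed directly from the density 2(θ − x)/θ²), inverting the pivot gives the interval below.

```python
# Invert the approximate pivot Q = sqrt(n) * (xbar/theta - 1/3) ~ N(0, 1/18)
# to obtain an approximate 95% confidence interval for theta.
import math

n, xbar, z = 100, 13.7, 1.96       # values from part (d)
v = 1/6 - 1/9                      # Var(X_i / theta) = 1/18, from the density
half = z * math.sqrt(v / n)        # half-width on the scale of xbar/theta

theta_lo = xbar / (1/3 + half)
theta_hi = xbar / (1/3 - half)
print(theta_lo, theta_hi)          # roughly (36.1, 47.7)
```

Note the inversion flips the endpoints: the upper bound on x̄/θ gives the lower bound on θ, and vice versa.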
Exercise 5. Let X1 , . . . , Xn be an i.i.d. random sample from a continuous uniform distribution
over [0, θ], where θ > 0 is an unknown parameter.
(a) Do these random variables belong to a family of distributions which is either a location,
scale, or location-scale family?
(b) Based on your answer to the previous question derive a pivotal quantity for θ.
(c) Denote the pivotal quantity you derived in question (b) by Q. Prove that Q is a continuous
random variable with density g(x) = nxn−1 1{0 ≤ x ≤ 1}.
(d) Suppose you have a realization of the sample with n = 10, namely you observed 0.78, 3.78,
2.42, 5.00, 2.10, 3.57, 3.96, 4.79, 2.56, 2.87. Use your answer of (c) to construct an exact
95% confidence interval for θ (exact means that the coverage probability of the interval is
exactly 0.95).
(e) Can you use Wald’s approach to construct an approximate confidence interval in this case?
Justify your answer.
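The interval in (d) can be computed directly once (c) is in hand: the pivot Q = max(Xi)/θ has density nx^{n−1} on [0, 1], hence CDF x^n, and one valid choice (among several) is the equal-tailed interval built from the 0.025 and 0.975 quantiles of Q, as sketched below.

```python
# Exact equal-tailed 95% CI for theta from the pivot Q = max(X_i)/theta,
# whose CDF is x^n on [0, 1]:  q_a <= M/theta <= q_b  inverts to
# M/q_b <= theta <= M/q_a.
data = [0.78, 3.78, 2.42, 5.00, 2.10, 3.57, 3.96, 4.79, 2.56, 2.87]
n = len(data)
m = max(data)

q_a = 0.025 ** (1 / n)             # 0.025-quantile of Q
q_b = 0.975 ** (1 / n)             # 0.975-quantile of Q
theta_lo, theta_hi = m / q_b, m / q_a
print(theta_lo, theta_hi)          # roughly (5.01, 7.23)
```

A one-sided choice such as [M, M/0.05^{1/n}] also has exact 95% coverage; the exercise only fixes the coverage probability, not the tail split.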