2WS30 - Mathematical Statistics

Homework 2 - Due 8th of October
Exercise 1. Consider a setting where the observed data (x1 , . . . , xn ) is modeled as a realization
of the random vector (X1 , . . . , Xn ), which is a random sample from a normal distribution with
known mean µ and unknown variance σ 2 .
(a) Show that the maximum likelihood estimator of σ² is given by

    σ̂²_n = (1/n) ∑_{i=1}^n (Xi − µ)².
(b) Check that σ̂²_n is an unbiased estimator of σ².
(c) Compute the variance of σ̂²_n.
(d) Compute the Cramér-Rao lower bound for the variance of any unbiased estimator of σ².
Is σ̂²_n the UMVUE?
(e) Consider now the class of estimators of σ² given by

    a_n ∑_{i=1}^n (Xi − µ)²,

where a_n is an arbitrary sequence that can be a function of n and µ, but not a function of
σ. Find the choice of a_n that yields the estimator with the minimal MSE.
(f) Show that the MSE of the estimator in the previous question is given by 2σ⁴/(n + 2). This
value is lower than the Cramér-Rao lower bound, which shows that biased estimators can
outperform the best possible unbiased estimator of a parameter. So bias is not necessarily
a bad thing...
Hint: For (c) it is useful to note that, if Z1 , . . . , Zn are i.i.d. standard normal random
variables then Y = ∑_{i=1}^n Zi² is a χ² random variable with n degrees of freedom. In particular
E(Y) = n and V(Y) = 2n.
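The trade-off in parts (e)-(f) is easy to see numerically. The sketch below (not part of the assignment) compares, by Monte Carlo, the MSE of the unbiased estimator (a_n = 1/n) with that of the shrunken estimator with a_n = 1/(n + 2); the values of µ, σ and n are illustrative choices, not values given in the exercise.

```python
# Monte Carlo sketch: MSE of the unbiased (a_n = 1/n) vs. the MSE-optimal
# shrunken (a_n = 1/(n + 2)) estimator of sigma^2, with mu known.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 0.0, 2.0, 10, 200_000   # illustrative parameter choices

x = rng.normal(mu, sigma, size=(reps, n))
s = ((x - mu) ** 2).sum(axis=1)              # sum of squared deviations from the known mean

mse_unbiased = np.mean((s / n - sigma**2) ** 2)        # a_n = 1/n
mse_shrunken = np.mean((s / (n + 2) - sigma**2) ** 2)  # a_n = 1/(n + 2)

print(mse_unbiased)   # should be close to 2*sigma^4 / n       = 3.2
print(mse_shrunken)   # should be close to 2*sigma^4 / (n + 2) ≈ 2.67
```

Both estimates should match the theoretical values 2σ⁴/n and 2σ⁴/(n + 2), with the shrunken estimator winning despite its bias.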
Exercise 2. (Based on exercise 2.17 of the (AR) textbook) Let Y1 , . . . , Yn ∼ Geom(p).
(a) Find the MLE of p.
(b) Use the Fisher-Neyman factorization theorem to check that W = ∑_{i=1}^n Yi is a sufficient
statistic for p.
(c) Find an unbiased estimator of p based on W , using Rao-Blackwellization.
Hint: Recall that ∑_{i=1}^n Yi follows a negative binomial distribution.
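As a sanity check for (c), the sketch below assumes the "number of trials" convention for Geom(p), i.e. support {1, 2, . . .}; under that convention the Rao-Blackwell argument leads to the estimator (n − 1)/(W − 1), and the simulation checks that its expectation is approximately p. The values of p and n are illustrative choices.

```python
# Simulation sketch: unbiasedness of (n - 1)/(W - 1) for Geom(p) samples,
# where W = sum of the Y_i (numpy's geometric uses the trials convention,
# so each Y_i >= 1 and W >= n).
import numpy as np

rng = np.random.default_rng(1)
p, n, reps = 0.3, 5, 400_000     # illustrative parameter choices

w = rng.geometric(p, size=(reps, n)).sum(axis=1)   # W >= n, so W - 1 >= n - 1 > 0
estimates = (n - 1) / (w - 1)

print(estimates.mean())   # should be close to p = 0.3
```

If your course uses the "number of failures" convention instead, the estimator takes a slightly different form, so treat this as a check of the method rather than of a specific formula.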
Exercise 3. Suppose you observe data x which is modeled as a realization of a Poisson random
variable X with expected value λ > 0. For this question we would like to estimate
θ ≡ (P(X = 0))2 = e−2λ .
(a) Use the invariance principle to compute the Maximum Likelihood Estimator (MLE) of θ.
(b) Show that the MLE is biased, and the bias is given by

    e^{−λ(1−e^{−2})} − e^{−2λ}.
(c) Show that the MSE of the MLE is given by

    e^{−λ(1−e^{−4})} + e^{−4λ} − 2e^{−λ(3−e^{−2})}.
Now suppose we want to restrict our attention to unbiased estimators of θ. Consider the
following proposal
θ̂ = (−1)^X.
(d) Show that θ̂ is an unbiased estimator of θ.
(e) Show that the MSE of θ̂ is given by 1 − e^{−4λ}.
(f) Plot the MSEs of the MLE and of θ̂ as a function of λ, and observe that the MLE is always,
by far, the better estimator (in terms of MSE).
(g) Suppose now your data is modeled by a realization of X1 , . . . , Xn , which are i.i.d. Poisson
random variables with mean λ. Use the Rao-Blackwell theorem to show that
    θ̂_n = (1 − 2/n)^{∑_{i=1}^n Xi},

is the UMVUE of θ (in the above use the convention 0^0 = 1).
Remarks: Note that θ̂ is actually the UMVUE, despite the fact that it is a very silly estimator:
it only takes the values 1 and −1. However, the parameter θ is always positive, and therefore
the estimator is completely absurd when X is odd (as it then takes the value −1). This shows that
pursuing unbiased estimation might lead to absurd estimation procedures.
Hint: To answer the above questions it is helpful to know (or derive) the moment generating
function of a Poisson random variable.
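For (f), instead of plotting, the sketch below simply evaluates the two closed-form MSE expressions from (c) and (e) on a grid of λ values and checks that the (biased) MLE has the smaller MSE everywhere on the grid; the grid endpoints are arbitrary.

```python
# Evaluate the MSE of the MLE (from (c)) and of (-1)^X (from (e)) on a grid.
import numpy as np

lam = np.linspace(0.01, 10, 1000)   # arbitrary grid of lambda values

mse_mle = (np.exp(-lam * (1 - np.exp(-4)))
           + np.exp(-4 * lam)
           - 2 * np.exp(-lam * (3 - np.exp(-2))))
mse_unbiased = 1 - np.exp(-4 * lam)   # MSE of the unbiased estimator (-1)^X

print(bool(np.all(mse_mle < mse_unbiased)))   # True on this grid
```

For a plot, `matplotlib.pyplot.plot(lam, mse_mle)` and the same call with `mse_unbiased` reproduce what (f) asks for.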
Exercise 4. Let X1 , . . . , Xn be an i.i.d. random sample from a continuous distribution with
density given by
    f(x; θ) = 2(θ − x)/θ²  if 0 ≤ x ≤ θ,  and  f(x; θ) = 0  otherwise,
where θ > 0 is an unknown parameter.
(a) Show that these random variables belong to a scale family of distributions.
(b) From the course you know that for scale families there is a simple pivotal quantity involving
the maximum likelihood estimator. In this case, however, the maximum likelihood estimator
is quite difficult to compute explicitly. Nevertheless one can propose a different pivotal
quantity:
Show that Q = (1/√n) ∑_{i=1}^n (Xi/θ − 1/3) is a pivotal quantity for θ (Hint: begin by showing
that Xi/θ is a pivotal quantity).
(c) The exact distribution of Q is a bit difficult to derive, but it converges to a known
distribution as n increases. Use the central limit theorem to characterize the asymptotic
distribution of Q.
(d) Suppose you have a realization of the sample with n = 100, for which the sample mean is
x̄ = 13.7. Use the answer from (c) to construct an approximate 95% confidence interval
for θ. You can make use of the following facts about the quantiles zα = Φ−1 (α) of the
standard normal distribution: z0.95 ≈ 1.645 and z0.975 ≈ 1.96.
Note about question (d): If you have not answered question (c) you can assume Q follows
approximately a normal distribution with mean 0 and variance 1/18.
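The numbers in (d) can be checked with a short computation. Since Q = √n (x̄/θ − 1/3) and, by the CLT, Q is approximately N(0, v) with v = Var(Xi/θ) = E[(X/θ)²] − (1/3)² = 1/6 − 1/9 = 1/18 (computed directly from the density 2(θ − x)/θ²), inverting the pivot gives the interval below.

```python
# Invert the approximate pivot Q = sqrt(n) * (xbar/theta - 1/3) ~ N(0, 1/18)
# to obtain an approximate 95% confidence interval for theta.
import math

n, xbar, z = 100, 13.7, 1.96       # values from part (d)
v = 1/6 - 1/9                      # Var(X_i / theta) = 1/18, from the density
half = z * math.sqrt(v / n)        # half-width on the scale of xbar/theta

theta_lo = xbar / (1/3 + half)
theta_hi = xbar / (1/3 - half)
print(theta_lo, theta_hi)          # roughly (36.1, 47.7)
```

Note the inversion flips the endpoints: the upper bound on x̄/θ gives the lower bound on θ, and vice versa.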
Exercise 5. Let X1 , . . . , Xn be an i.i.d. random sample from a continuous uniform distribution
over [0, θ], where θ > 0 is an unknown parameter.
(a) Do these random variables belong to a family of distributions which is either a location,
scale, or location-scale family?
(b) Based on your answer to the previous question derive a pivotal quantity for θ.
(c) Denote the pivotal quantity you derived in question (b) by Q. Prove that Q is a continuous
random variable with density g(x) = nxn−1 1{0 ≤ x ≤ 1}.
(d) Suppose you have a realization of the sample with n = 10, namely you observed 0.78, 3.78,
2.42, 5.00, 2.10, 3.57, 3.96, 4.79, 2.56, 2.87. Use your answer of (c) to construct an exact
95% confidence interval for θ (exact means that the coverage probability of the interval is
exactly 0.95).
(e) Can you use Wald’s approach to construct an approximate confidence interval in this case?
Justify your answer.
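The interval in (d) can be computed directly once (c) is in hand: the pivot Q = max(Xi)/θ has density nx^{n−1} on [0, 1], hence CDF x^n, and one valid choice (among several) is the equal-tailed interval built from the 0.025 and 0.975 quantiles of Q, as sketched below.

```python
# Exact equal-tailed 95% CI for theta from the pivot Q = max(X_i)/theta,
# whose CDF is x^n on [0, 1]:  q_a <= M/theta <= q_b  inverts to
# M/q_b <= theta <= M/q_a.
data = [0.78, 3.78, 2.42, 5.00, 2.10, 3.57, 3.96, 4.79, 2.56, 2.87]
n = len(data)
m = max(data)

q_a = 0.025 ** (1 / n)             # 0.025-quantile of Q
q_b = 0.975 ** (1 / n)             # 0.975-quantile of Q
theta_lo, theta_hi = m / q_b, m / q_a
print(theta_lo, theta_hi)          # roughly (5.01, 7.23)
```

A one-sided choice such as [M, M/0.05^{1/n}] also has exact 95% coverage; the exercise only fixes the coverage probability, not the tail split.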