NONPARAMETRIC STATISTICAL THEORY
Part III. Example Sheet 1 (of 3)
RJS & AKHK/Lent 2014
Comments and corrections to [email protected]

[Notation: For a square-integrable function $g : \mathbb{R} \to \mathbb{R}$, define $R(g) = \int_{-\infty}^{\infty} g(x)^2 \, dx$; for a kernel $K$, define $\mu_2(K) = \int_{-\infty}^{\infty} x^2 K(x) \, dx$.]

1. Let $U_1, \ldots, U_n \stackrel{\mathrm{iid}}{\sim} U(0, 1)$, and let $Y_1, \ldots, Y_{n+1} \stackrel{\mathrm{iid}}{\sim} \mathrm{Exp}(1)$. Writing $S_j = \sum_{i=1}^{j} Y_i$ for $j = 1, \ldots, n + 1$, show that
\[ U_{(j)} \stackrel{d}{=} \frac{S_j}{S_{n+1}} \sim \mathrm{Beta}(j, n - j + 1) \]
for $j = 1, \ldots, n$.

2. (Hoeffding's inequality)
(a) Let $Y$ be a random variable with mean zero and $a \le Y \le b$. Use convexity to show that for every $t \in \mathbb{R}$, we have
\[ \log E(e^{tY}) \le -\alpha u + \log(\beta + \alpha e^{u}), \]
where $u = t(b - a)$ and $\alpha = 1 - \beta = -a/(b - a)$. Using a second-order Taylor expansion about the origin, deduce that $\log E(e^{tY}) \le t^2 (b - a)^2 / 8$.
(b) Now let $Y_1, \ldots, Y_n$ be independent with $E(Y_i) = 0$ and $a_i \le Y_i \le b_i$ for $i = 1, \ldots, n$. Use Markov's inequality to show that, for every $\varepsilon > 0$, we have
\[ P\biggl( \biggl| \sum_{i=1}^{n} Y_i \biggr| > \varepsilon \biggr) \le 2 \exp\biggl( - \frac{2\varepsilon^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \biggr). \]

3. Let $X_1, \ldots, X_n$ be independent with distribution $P$ on a measurable space $(\mathcal{X}, \mathcal{A})$, and let $\hat{P}_n$ be the empirical measure of $X_1, \ldots, X_n$; thus $\hat{P}_n(A) = n^{-1} \sum_{i=1}^{n} 1_{\{X_i \in A\}}$ for $A \in \mathcal{A}$. Show that, for all $\varepsilon > 0$ and $A \in \mathcal{A}$, we have
\[ P\bigl( |\hat{P}_n(A) - P(A)| > \varepsilon \bigr) \le 2 e^{-2n\varepsilon^2}. \]

4. (a) Let $X_1, \ldots, X_n \stackrel{\mathrm{iid}}{\sim} F$, and let $\hat{F}_n$ denote their empirical distribution function. For $t_1 < \ldots < t_k$, write down the distribution of
\[ n \bigl( \hat{F}_n(t_1), \hat{F}_n(t_2) - \hat{F}_n(t_1), \ldots, \hat{F}_n(t_k) - \hat{F}_n(t_{k-1}), 1 - \hat{F}_n(t_k) \bigr). \]
(b) Find the asymptotic distribution of
\[ n^{1/2} \bigl( \hat{F}_n(t_1) - F(t_1), \ldots, \hat{F}_n(t_k) - F(t_k) \bigr). \]

5. (Continuation) We say a continuous process $(B_t)_{t \in [0,1]}$ is a standard Brownian motion on $[0, 1]$ if $B_0 = 0$, and if, for $0 \le s_1 \le t_1 \le \ldots \le s_k \le t_k \le 1$, we have
\[ (B_{t_1} - B_{s_1}, \ldots, B_{t_k} - B_{s_k}) \sim N_k(0, \Sigma), \]
where $\Sigma := \mathrm{diag}(t_1 - s_1, \ldots, t_k - s_k)$. The process $(W_t)_{t \in [0,1]}$ defined by $W_t = B_t - t B_1$ is called a Brownian bridge, or tied-down Brownian motion, because $W_0 = W_1 = 0$.
Compute the distribution of $(W_{t_1}, \ldots, W_{t_k})$.
[These last two questions suggest that "$n^{1/2} \bigl( \hat{F}_n(t) - F(t) \bigr) \stackrel{d}{\to} W_{F(t)}$ as $n \to \infty$". Care is required to make this statement and its proof precise.]

6. (a) Verify the algebraic identity
\[ \phi_{\sigma}(x - \mu) \, \phi_{\sigma'}(x - \mu') = \phi_{\sigma\sigma'/(\sigma^2 + \sigma'^2)^{1/2}}(x - \mu^*) \, \phi_{(\sigma^2 + \sigma'^2)^{1/2}}(\mu - \mu'), \]
where $\mu^* = (\sigma'^2 \mu + \sigma^2 \mu')/(\sigma^2 + \sigma'^2)$, and $\phi_{\sigma}(x)$ is the $N(0, \sigma^2)$ density.
(b) Let $X_1, \ldots, X_n$ be independent $N(0, \sigma^2)$ random variables. Taking $K$ to be the $N(0, 1)$ density, show that the mean integrated squared error of the kernel density estimate $\hat{f}_h$ with kernel $K$ and bandwidth $h$ can be expressed exactly as
\[ \mathrm{MISE}(\hat{f}_h) = \frac{1}{2\pi^{1/2}} \biggl\{ \frac{1}{nh} + \Bigl( 1 - \frac{1}{n} \Bigr) \frac{1}{(h^2 + \sigma^2)^{1/2}} - \frac{2^{3/2}}{(h^2 + 2\sigma^2)^{1/2}} + \frac{1}{\sigma} \biggr\}. \]

7. (Continuation) Now suppose that $h = h_n$ satisfies $h \to 0$ and $nh \to \infty$ as $n \to \infty$. Derive an appropriate asymptotic expansion of the MISE computed above, and deduce that the asymptotically optimal bandwidth with respect to the MISE criterion is given by
\[ h_{\mathrm{AMISE}} = \Bigl( \frac{4}{3n} \Bigr)^{1/5} \sigma. \]
Check that the same expression is obtained from the general formula for the asymptotically optimal bandwidth for a second-order kernel.

8. Let $X_1, \ldots, X_n \stackrel{\mathrm{iid}}{\sim} f$, where $f''$ is bounded. Write $\tilde{f}_b$ for the histogram estimator of $f$ with binwidth $b$. Assume $b = b_n \to 0$ and $nb \to \infty$ as $n \to \infty$. For $x \in \mathbb{R}$, let $I_b(x)$ denote the bin containing $x$ and $p_b(x) = P\bigl( X_1 \in I_b(x) \bigr)$ denote the bin probability. Show that
\[ p_b(x) = b f(x) + \frac{1}{2} f'(x) \bigl[ b^2 - 2b \{ x - t_b(x) \} \bigr] + O(b^3) \]
as $n \to \infty$, where $t_b(x)$ is the left-hand endpoint of $I_b(x)$. Deduce that
\[ \mathrm{MSE}\{\tilde{f}_b(x)\} = \frac{f(x)}{nb} + \frac{1}{4} b^2 f'(x)^2 + f'(x)^2 \{ x - t_b(x) \}^2 - b f'(x)^2 \{ x - t_b(x) \} + O\Bigl( \frac{1}{n} + b^3 \Bigr). \]

9. (Continuation) Assuming in addition that $R(f') < \infty$, argue informally that
\[ \mathrm{MISE}(\tilde{f}_b) = \frac{1}{nb} + \frac{1}{12} b^2 R(f') + o\Bigl( \frac{1}{nb} + b^2 \Bigr). \]
Hence derive the AMISE optimal binwidth $b_{\mathrm{AMISE}}$ and find $\mathrm{AMISE}(\tilde{f}_{b_{\mathrm{AMISE}}})$.

10. (Scheffé's theorem) Let $(f_n)$ be a sequence of densities and $f$ be another density such that $f_n \to f$ almost everywhere.
By integrating $g_n = f - f_n$ separately over $\{x : g_n(x) > 0\}$ and $\{x : g_n(x) \le 0\}$ and using dominated convergence, show that
\[ \int_{-\infty}^{\infty} |f_n(x) - f(x)| \, dx \to 0. \]

11. Assume the standard conditions on $f$, $h$ and $K$ from lectures, and also that $f''$ is continuous with $R(f'') < \infty$. Use Fubini's theorem to show that $h \int_{-\infty}^{\infty} (K_h^2 * f)(x) \, dx = R(K)$. Use the dominated convergence theorem to show that $(K_h * f)(x) \to f(x)$ for each $x \in \mathbb{R}$, and show that $\sup_{n \in \mathbb{N}} \sup_{x \in \mathbb{R}} (K_h * f)(x) < \infty$. Apply Scheffé's theorem to deduce that $\int_{-\infty}^{\infty} (K_h * f)^2(x) \, dx \to \int_{-\infty}^{\infty} f(x)^2 \, dx$. Finally, deduce that
\[ \int_{-\infty}^{\infty} \mathrm{Var}\{\hat{f}_h(x)\} \, dx = \frac{1}{nh} R(K) + O(n^{-1}). \]

12. (Continuation) Show that $\int_{-\infty}^{\infty} \bigl[ E\{\hat{f}_h(x)\} - f(x) \bigr]^2 \, dx = h^4 \int_{-\infty}^{\infty} A_n^2(x) \, dx$, where
\[ A_n(x) = \int_{-\infty}^{\infty} \int_{0}^{1} (1 - t) f''(x - thz) z^2 K(z) \, dt \, dz. \]
Apply Cauchy-Schwarz twice, firstly to the innermost integral with $(1 - t)^{1/2} |z| K^{1/2}(z)$ as one term of the product, and secondly to the middle integral, and then use Fubini's theorem to evaluate the $x$-integral first, to show that
\[ \int_{-\infty}^{\infty} A_n^2(x) \, dx \le \frac{1}{4} R(f'') \mu_2^2(K) \]
for all $n$. Use dominated convergence to show that $A_n(x) \to \frac{1}{2} f''(x) \mu_2(K)$ for each $x \in \mathbb{R}$. Apply Fatou's lemma and combine the previous results to conclude that
\[ \mathrm{MISE}(\hat{f}_h) = \frac{1}{nh} R(K) + \frac{1}{4} h^4 R(f'') \mu_2^2(K) + o\Bigl( \frac{1}{nh} + h^4 \Bigr). \]
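
[Numerical sanity check, not part of the sheet. The sketch below compares the exact MISE of Question 6(b) with the two leading terms of the expansion in Question 12, evaluated at the bandwidth of Question 7, for a Gaussian kernel and $N(0, \sigma^2)$ data. The function names `exact_mise`, `amise` and `h_amise` are illustrative, and the value $R(f'') = 3/(8\pi^{1/2}\sigma^5)$ for the normal density is an assumption taken from a standard calculation rather than from the sheet. It does not replace the derivations asked for above.]

```python
import math


def exact_mise(h, n, sigma=1.0):
    """Exact MISE from Question 6(b): N(0,1) kernel, N(0, sigma^2) data."""
    c = 1.0 / (2.0 * math.sqrt(math.pi))
    return c * (
        1.0 / (n * h)
        + (1.0 - 1.0 / n) / math.sqrt(h**2 + sigma**2)
        - 2.0**1.5 / math.sqrt(h**2 + 2.0 * sigma**2)
        + 1.0 / sigma
    )


def amise(h, n, sigma=1.0):
    """Leading terms R(K)/(nh) + h^4 R(f'') mu_2(K)^2 / 4 from Question 12."""
    r_k = 1.0 / (2.0 * math.sqrt(math.pi))  # R(K) for the N(0,1) kernel
    # Assumed value of R(f'') for the N(0, sigma^2) density (not in the sheet).
    r_f2 = 3.0 / (8.0 * math.sqrt(math.pi) * sigma**5)
    mu2 = 1.0  # mu_2(K) for the N(0,1) kernel
    return r_k / (n * h) + 0.25 * h**4 * r_f2 * mu2**2


def h_amise(n, sigma=1.0):
    """Bandwidth from Question 7: (4 / (3n))^{1/5} * sigma."""
    return (4.0 / (3.0 * n)) ** 0.2 * sigma


# The ratio exact/leading-terms should drift towards 1 as n grows.
for n in (10**2, 10**4, 10**6):
    h = h_amise(n)
    print(n, h, exact_mise(h, n), amise(h, n), exact_mise(h, n) / amise(h, n))
```

[The ratio approaches 1 only slowly, because the neglected variance term is of order $n^{-1}$, which is a factor of order $h$ smaller than the leading $1/(nh)$ term.]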
