Level crossing problem and hypothesis testing: techniques and applications Student Conference | April 26th, 2014 Xiaoou Li Department of Statistics, Columbia University , Level crossing problem and hypothesis testing, April 26th, 2014 1 Motivation I Level crossing problem: P( (f ) > b), I I I where f is a random field/process, and is a functional. R For example, P(supt2T f (t) > b), P( T e f (t) dt > b). Error rate of hypothesis testing is also a level crossing problem: P0 (likelihoodratio > C ), P0 (generalized likelihood ratio > C ). Using method for random field level crossing problem to analyze error rate of hypothesis testing. , Level crossing problem and hypothesis testing, April 26th, 2014 2 Level crossing problem I I Gaussian random field (and functionals of Gaussian random field). Random walk (and function of random walk) I I This is related to hypothesis testing for exponential family of distributions. Non-Gaussian random field , Level crossing problem and hypothesis testing, April 26th, 2014 3 Gaussian random field level crossing I I I I I R f (t) P(sup f (t) > b), P( e dt > b), t2T T R f (t) P T (e 1)dt > const. ,etc, as b ! 1 and ! 0. Change of measure: put more weight on the most likely event; analyze the event of interest conditional on the most likely event. For P(sup f (t) > b), first sample ⌧ around t ⇤ = arg supt P(f (t) > b), then sample other points conditional on f (⌧ ). The behavior of f (t) (t is around ⌧ ) is depending on the smoothness of the random field. Roughly speaking, P(supt2T f (t) > b) ⇡ supt2T P(f (t) > b) , Level crossing problem and hypothesis testing, April 26th, 2014 4 Likelihood ratio of a simple versus composite problem Suppose that Xi , i = 1, .., n are i.i.d. samples. We are interested in testing H0 : f = g and HA : f 2 {h : 2 }, where inf KL(g kh ) = inf Eg (log(g (X )/h(X )) > 0. The likelihood ratio statistic is Qn sup h (Xi ) Qn i=1 Rn = i=1 g (Xi ) and the log-likelihood is ln = sup ln ( ) = sup n ⇣X log h (Xi ) i=1 n X i=1 log g (Xi ) ⌘ , Level crossing problem and hypothesis testing, April 26th, 2014 5 Simple versus composite problem Test statistics: sup ln ( ) = sup n ⇣X log h (Xi ) i=1 I I I I I n X i=1 log g (Xi ) ⌘ Under null hypothesis, Eg (ln ( )) = n ⇥ KL(g kh ) ! 1 Under the alternative, Eh (ln ( )) = n ⇥ KL(h kg ) ! 1 If sup ln ( ) > 0, then reject the null. What’s the decay rate of type I and type II error? We view it as a random field indexed by . , Level crossing problem and hypothesis testing, April 26th, 2014 6 Type I error rate How to compute the type I error? Pg (sup ln ( ) > 0) This is a level crossing problem! With some assumptions on r ln ( ), we have Pg (sup ln ( ) > 0) ⇡ sup Pg (ln ( ) > 0). I I I Change of measure according to Pg (ln ( ) > 0). The property of ln ( ) for around ⇤ = arg sup Pg (ln ( ) > 0) is determined by r ln ( ). The large deviation probability for the random walk Pg (ln ( ) > 0). , Level crossing problem and hypothesis testing, April 26th, 2014 7 Type II error Type II error of this test sup Ph (sup ln ( 0 ) < 0) 0 I I An upper bound can be obtained by sup Ph (ln ( ) < 0). A lower bound can be obtained with Neyman-Pearson lemma. , Level crossing problem and hypothesis testing, April 26th, 2014 8 Composite null against composite alternative H0 : f 2 {g✓ : ✓ 2 ⇥} against H1 : f 2 {h : 2 }, where inf ✓, KL(g✓ kh ) > 0 and inf ✓, KL(h kg✓ ) > 0. Generalized log-likelihood ratio statistics is ˆln = sup log h (Xi ) sup log g✓ (Xi ). ✓ The rejection region is C1 = {ˆln > 0} Under some regularity condition, we have sup log prg✓ (C1 ) ⇠ sup log prh (C1c ) ⇠ ✓2⇥ 2 n ⇥ min ⇢✓ . ✓2⇥, 2 (1) , Level crossing problem and hypothesis testing, April 26th, 2014 9 Sequential Test Still the problem of testing H0 : f = g and HA : f 2 {h : 2 }, and ln (ˆ ) = sup ln ( ) = sup n ⇣X log h (Xi ) i=1 n X ⌘ log g (Xi ) . i=1 The Generalized Sequential Probability Ratio Test (GSPRT) stops at ⌧ = inf{n : ln (ˆ ) > A or ln (ˆ ) < B}. If l⌧ (ˆ ) > A, then reject H0 ; if l⌧ (ˆ ) < B, then accept H0 . , Level crossing problem and hypothesis testing, April 26th, 2014 10 Error rate of GSPRT The type I error is Pg (l⌧ (ˆ ) > A) ⇡ Pg (sup sup ln ( ) > A). n I I I I I It is still a level crossing problem but it is more complicated. n is also an index for the random field. Same idea, sample (n, ) together. Under some regularity condition, we could compute the error rate as A ! 1. Pg (l⌧ (ˆ ) > A) = e (1+o(1))A . , Level crossing problem and hypothesis testing, April 26th, 2014 11 Asymptotically optimality Under H0 , GSPRT is approximating Wald SPRT for the simple versus simple testing problem H0 : f = g and HA : f = h ⇤ , where ⇤ = arg inf KL(g kh ). I Wald’s SPRT has smallest sample size under the density g . I GSPRT has approximately the same sample size as Wald’s SPRT B ! 1. (1+o(1))B I GSPRT has approximately smallest sample size KL(g kh ⇤ ) among all the sequential test that have the same error rate as B ! 1. I It is also true for the composite versus composite testing problem. , Level crossing problem and hypothesis testing, April 26th, 2014 12 Summary I I I View hypothesis testing as a level crossing problem. Chernoff index for hypothesis testing for separate family of distributions. Generalized sequential probability test has approximately same decay rate as Wald test, and has asymptotically smallest sample size. , Level crossing problem and hypothesis testing, April 26th, 2014 13 Thank you! , Level crossing problem and hypothesis testing, April 26th, 2014 14
© Copyright 2024 ExpyDoc