Level crossing problem and hypothesis testing

Level crossing problem and hypothesis testing: techniques and applications
Student Conference | April 26th, 2014
Xiaoou Li
Department of Statistics, Columbia University
,
Level crossing problem and hypothesis testing, April 26th, 2014
1
Motivation
I
Level crossing problem:
P( (f ) > b),
I
I
I
where f is a random field/process, and is a functional.
R
For example, P(supt2T f (t) > b), P( T e f (t) dt > b).
Error rate of hypothesis testing is also a level crossing problem:
P0 (likelihoodratio > C ), P0 (generalized likelihood ratio > C ).
Using method for random field level crossing problem to
analyze error rate of hypothesis testing.
,
Level crossing problem and hypothesis testing, April 26th, 2014
2
Level crossing problem
I
I
Gaussian random field (and functionals of Gaussian random
field).
Random walk (and function of random walk)
I
I
This is related to hypothesis testing for exponential family of
distributions.
Non-Gaussian random field
,
Level crossing problem and hypothesis testing, April 26th, 2014
3
Gaussian random field level crossing
I
I
I
I
I
R f (t)
P(sup
f
(t)
>
b),
P(
e dt > b),
t2T
T
R
f (t)
P T (e
1)dt > const. ,etc, as b ! 1 and ! 0.
Change of measure: put more weight on the most likely event;
analyze the event of interest conditional on the most likely
event.
For P(sup f (t) > b), first sample ⌧ around
t ⇤ = arg supt P(f (t) > b), then sample other points
conditional on f (⌧ ).
The behavior of f (t) (t is around ⌧ ) is depending on the
smoothness of the random field.
Roughly speaking, P(supt2T f (t) > b) ⇡ supt2T P(f (t) > b)
,
Level crossing problem and hypothesis testing, April 26th, 2014
4
Likelihood ratio of a simple versus
composite problem
Suppose that Xi , i = 1, .., n are i.i.d. samples. We are interested in
testing
H0 : f = g and HA : f 2 {h : 2 },
where inf KL(g kh ) = inf Eg (log(g (X )/h(X )) > 0.
The likelihood ratio statistic is
Qn
sup
h (Xi )
Qn i=1
Rn =
i=1 g (Xi )
and the log-likelihood is
ln = sup ln ( ) = sup
n
⇣X
log h (Xi )
i=1
n
X
i=1
log g (Xi )
⌘
,
Level crossing problem and hypothesis testing, April 26th, 2014
5
Simple versus composite problem
Test statistics:
sup ln ( ) = sup
n
⇣X
log h (Xi )
i=1
I
I
I
I
I
n
X
i=1
log g (Xi )
⌘
Under null hypothesis, Eg (ln ( )) = n ⇥ KL(g kh ) ! 1
Under the alternative, Eh (ln ( )) = n ⇥ KL(h kg ) ! 1
If sup ln ( ) > 0, then reject the null.
What’s the decay rate of type I and type II error?
We view it as a random field indexed by .
,
Level crossing problem and hypothesis testing, April 26th, 2014
6
Type I error rate
How to compute the type I error?
Pg (sup ln ( ) > 0)
This is a level crossing problem!
With some assumptions on r ln ( ), we have
Pg (sup ln ( ) > 0) ⇡ sup Pg (ln ( ) > 0).
I
I
I
Change of measure according to Pg (ln ( ) > 0).
The property of ln ( ) for around
⇤
= arg sup Pg (ln ( ) > 0) is determined by r ln ( ).
The large deviation probability for the random walk
Pg (ln ( ) > 0).
,
Level crossing problem and hypothesis testing, April 26th, 2014
7
Type II error
Type II error of this test
sup Ph (sup ln ( 0 ) < 0)
0
I
I
An upper bound can be obtained by sup Ph (ln ( ) < 0).
A lower bound can be obtained with Neyman-Pearson lemma.
,
Level crossing problem and hypothesis testing, April 26th, 2014
8
Composite null against composite
alternative
H0 : f 2 {g✓ : ✓ 2 ⇥} against H1 : f 2 {h :
2 },
where inf ✓, KL(g✓ kh ) > 0 and inf ✓, KL(h kg✓ ) > 0. Generalized
log-likelihood ratio statistics is
ˆln = sup log h (Xi )
sup log g✓ (Xi ).
✓
The rejection region is
C1 = {ˆln > 0}
Under some regularity condition, we have
sup log prg✓ (C1 ) ⇠ sup log prh (C1c ) ⇠
✓2⇥
2
n ⇥ min ⇢✓ .
✓2⇥, 2
(1)
,
Level crossing problem and hypothesis testing, April 26th, 2014
9
Sequential Test
Still the problem of testing
H0 : f = g and HA : f 2 {h :
2 },
and
ln (ˆ ) = sup ln ( ) = sup
n
⇣X
log h (Xi )
i=1
n
X
⌘
log g (Xi ) .
i=1
The Generalized Sequential Probability Ratio Test (GSPRT) stops
at
⌧ = inf{n : ln (ˆ ) > A or ln (ˆ ) < B}.
If l⌧ (ˆ ) > A, then reject H0 ; if l⌧ (ˆ ) <
B, then accept H0 .
,
Level crossing problem and hypothesis testing, April 26th, 2014
10
Error rate of GSPRT
The type I error is
Pg (l⌧ (ˆ ) > A) ⇡ Pg (sup sup ln ( ) > A).
n
I
I
I
I
I
It is still a level crossing problem but it is more complicated.
n is also an index for the random field.
Same idea, sample (n, ) together.
Under some regularity condition, we could compute the error
rate as A ! 1.
Pg (l⌧ (ˆ ) > A) = e (1+o(1))A .
,
Level crossing problem and hypothesis testing, April 26th, 2014
11
Asymptotically optimality
Under H0 , GSPRT is approximating Wald SPRT for the simple
versus simple testing problem
H0 : f = g and HA : f = h ⇤ ,
where ⇤ = arg inf KL(g kh ).
I Wald’s SPRT has smallest sample size under the density g .
I GSPRT has approximately the same sample size as Wald’s
SPRT B ! 1.
(1+o(1))B
I GSPRT has approximately smallest sample size
KL(g kh ⇤ )
among all the sequential test that have the same error rate as
B ! 1.
I It is also true for the composite versus composite testing
problem.
,
Level crossing problem and hypothesis testing, April 26th, 2014
12
Summary
I
I
I
View hypothesis testing as a level crossing problem.
Chernoff index for hypothesis testing for separate family of
distributions.
Generalized sequential probability test has approximately same
decay rate as Wald test, and has asymptotically smallest
sample size.
,
Level crossing problem and hypothesis testing, April 26th, 2014
13
Thank you!
,
Level crossing problem and hypothesis testing, April 26th, 2014
14