Chapter 5: Stratified Sampling

Jae-Kwang Kim
Fall, 2014
Outline
1. Stratified sampling
2. Sample Size allocation
3. The construction of Strata
4. Mathematical Programming
Stratified sampling
Stratified sampling:
1. The finite population is stratified into $H$ subpopulations (strata):
   \[ U = U_1 \cup \cdots \cup U_H . \]
2. Within each subpopulation (stratum) a sample is drawn, independently across the strata:
   \[ \Pr(i \in A_h,\, j \in A_g) = \Pr(i \in A_h)\,\Pr(j \in A_g), \quad h \neq g, \]
   where $A_h$ is the index set of the sample in stratum $h$, $h = 1, 2, \ldots, H$.
Example: Stratified SRS
1. Stratify the population. Let $N_h$ be the population size of $U_h$.
2. Sample size allocation: determine $n_h$.
3. Perform SRS independently in each stratum (select $n_h$ sample elements from the $N_h$ elements of $U_h$).
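The three steps of stratified SRS can be sketched in a few lines of Python. This is only an illustration: the function name `stratified_srs`, the dictionary layout, and the toy population are assumptions made for the example, not part of the slides.

```python
import random

def stratified_srs(strata, n_h, seed=None):
    """Draw an SRS without replacement separately within each stratum.

    strata: dict mapping stratum label -> list of unit ids (U_h)
    n_h:    dict mapping stratum label -> sample size for that stratum
    Returns a dict mapping stratum label -> list of sampled unit ids (A_h).
    """
    rng = random.Random(seed)
    sample = {}
    for h, units in strata.items():
        if n_h[h] > len(units):
            raise ValueError(f"n_h exceeds N_h in stratum {h!r}")
        # random.sample draws without replacement; the draw in one stratum
        # does not depend on which units were selected in another stratum.
        sample[h] = rng.sample(units, n_h[h])
    return sample

# Hypothetical population of 10 units split into two strata.
strata = {"A": [0, 1, 2, 3, 4, 5], "B": [6, 7, 8, 9]}
sample = stratified_srs(strata, {"A": 3, "B": 2}, seed=1)
```

Because each stratum is sampled on its own, the sample size in every stratum is fixed by design rather than left to chance.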
Why stratification?
1. Control for domains of study
2. Flexibility in design and estimation
3. Convenience
4. Efficiency
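Estimation
HT estimation for $Y = \sum_{h=1}^{H} Y_h$, where $Y_h = \sum_{i \in U_h} y_i$.
1. HT estimator:
   \[ \hat{Y}_{HT} = \sum_{h=1}^{H} \hat{Y}_{h,HT}, \]
   where $\hat{Y}_{h,HT}$ is unbiased for $Y_h$.
2. Variance:
   \[ \mathrm{Var}(\hat{Y}_{HT}) = \sum_{h=1}^{H} \mathrm{Var}(\hat{Y}_{h,HT}) \]
   by independence.
3. Variance estimation:
   \[ \hat{V}(\hat{Y}_{HT}) = \sum_{h=1}^{H} \hat{V}_h(\hat{Y}_{h,HT}), \]
   where $\hat{V}_h(\hat{Y}_{h,HT})$ is unbiased for $\mathrm{Var}(\hat{Y}_{h,HT})$.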
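Example: Stratified SRS
1. HT estimator:
   \[ \hat{Y}_{HT} = \sum_{h=1}^{H} N_h \bar{y}_h, \]
   where $\bar{y}_h = n_h^{-1} \sum_{i \in A_h} y_i$.
2. Variance:
   \[ \mathrm{Var}(\hat{Y}_{HT}) = \sum_{h=1}^{H} \frac{N_h^2}{n_h} \left(1 - \frac{n_h}{N_h}\right) S_h^2, \]
   where $S_h^2 = (N_h - 1)^{-1} \sum_{i \in U_h} (y_i - \bar{Y}_h)^2$ and $\bar{Y}_h = Y_h / N_h$ is the stratum mean.
3. Variance estimation:
   \[ \hat{V}(\hat{Y}_{HT}) = \sum_{h=1}^{H} \frac{N_h^2}{n_h} \left(1 - \frac{n_h}{N_h}\right) s_h^2, \]
   where $s_h^2 = (n_h - 1)^{-1} \sum_{i \in A_h} (y_i - \bar{y}_h)^2$.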
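As a small numerical illustration of the estimator and its variance estimator under stratified SRS, the following sketch computes both from per-stratum samples. The function name `stratified_total` and the data are made up for the example; each stratum is assumed to have at least two sampled values.

```python
from statistics import mean, variance

def stratified_total(samples, N):
    """Estimated total and estimated variance under stratified SRS.

    samples: dict stratum -> list of observed y values (n_h >= 2 each)
    N:       dict stratum -> population size N_h
    """
    y_hat = 0.0
    v_hat = 0.0
    for h, y in samples.items():
        n_h, N_h = len(y), N[h]
        y_hat += N_h * mean(y)                      # N_h * ybar_h
        s2_h = variance(y)                          # divisor n_h - 1
        v_hat += N_h**2 / n_h * (1 - n_h / N_h) * s2_h
    return y_hat, v_hat

# Hypothetical data: two strata with N_A = 30 and N_B = 10.
samples = {"A": [2.0, 4.0, 6.0], "B": [10.0, 14.0]}
N = {"A": 30, "B": 10}
y_hat, v_hat = stratified_total(samples, N)
```

Here stratum A contributes 30 x 4 = 120 to the total and stratum B contributes 10 x 12 = 120, so the estimated total is 240.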
Sample Size allocation
Sample allocation: given $n = \sum_{h=1}^{H} n_h$, how to choose $n_h$?
1. Proportional allocation: choose $n_h \propto N_h$.
2. Optimal allocation: choose $n_h$ to
   \[ \text{minimize } \mathrm{Var}(\hat{Y}_{HT}) \quad \text{subject to} \quad c_0 + \sum_{h=1}^{H} c_h n_h = C, \]
   where $c_h$ is the cost of observing an element in stratum $h$ and $C$ is a given total cost. The solution (Neyman, 1934) is
   \[ n_h \propto N_h S_h / \sqrt{c_h}. \]
3. For $c_h = c$, the lower bound of the variance is
   \[ V(\hat{Y}_{HT}) \ge \frac{1}{n} \left( \sum_{h} N_h S_h \right)^2 - \sum_{h} N_h S_h^2 . \]
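The optimal allocation rule and the variance bound can be checked numerically. The sketch below is a simplification: it scales the allocation to a given total sample size $n$ rather than to a budget $C$, it returns real-valued (unrounded) $n_h$, and the stratum sizes and standard deviations are hypothetical.

```python
import math

def optimal_allocation(N, S, c, n):
    """Real-valued n_h proportional to N_h * S_h / sqrt(c_h), summing to n."""
    score = [Nh * Sh / math.sqrt(ch) for Nh, Sh, ch in zip(N, S, c)]
    total = sum(score)
    return [n * s / total for s in score]

def stratified_variance(N, S, n_h):
    """Variance of the estimated total under stratified SRS."""
    return sum(Nh**2 / nh * (1 - nh / Nh) * Sh**2
               for Nh, Sh, nh in zip(N, S, n_h))

# Hypothetical strata with equal unit costs (c_h = c).
N = [600, 300, 100]
S = [2.0, 5.0, 20.0]
c = [1.0, 1.0, 1.0]
n = 100

n_h = optimal_allocation(N, S, c, n)
v_opt = stratified_variance(N, S, n_h)

# Lower bound: (sum N_h S_h)^2 / n - sum N_h S_h^2
bound = (sum(Nh * Sh for Nh, Sh in zip(N, S)) ** 2 / n
         - sum(Nh * Sh**2 for Nh, Sh in zip(N, S)))
```

With equal costs the variance under this allocation coincides with the bound, which is the sense in which the bound is attained. In practice the $n_h$ would still need rounding and a check that $n_h \le N_h$.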
Properties
Under proportional allocation, the sampling weights $N_h / n_h$ are all equal.
In general,
\[ V_{opt}(\hat{Y}_{HT}) \le V_{prop}(\hat{Y}_{HT}) \le V_{SRS}(\hat{Y}_{HT}), \]
where $V_{opt}(\hat{Y}_{HT})$ is the variance of the stratified sampling estimator under optimal allocation, $V_{prop}(\hat{Y}_{HT})$ is the variance of the stratified sampling estimator under proportional allocation, and $V_{SRS}(\hat{Y}_{HT})$ is the variance of the SRS estimator.
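The ordering of the three variances can be illustrated on a small made-up population. The stratum sizes, means, and standard deviations below are hypothetical, and equal unit costs are assumed. The second inequality is usually stated with terms of order $1/N_h$ ignored, so this is a numerical illustration rather than a proof.

```python
def var_stratified(N, S, n_h):
    """Variance of the estimated total under stratified SRS."""
    return sum(Nh**2 / nh * (1 - nh / Nh) * Sh**2
               for Nh, Sh, nh in zip(N, S, n_h))

# Hypothetical stratum sizes, means and standard deviations.
N = [600, 300, 100]
mean = [10.0, 30.0, 100.0]
S = [2.0, 5.0, 20.0]
n = 100
N_tot = sum(N)

# Overall population variance S^2 from within- and between-stratum parts.
Y_bar = sum(Nh * m for Nh, m in zip(N, mean)) / N_tot
S2 = (sum((Nh - 1) * Sh**2 for Nh, Sh in zip(N, S))
      + sum(Nh * (m - Y_bar) ** 2 for Nh, m in zip(N, mean))) / (N_tot - 1)

# Proportional and optimal (equal-cost) allocations, real-valued.
n_prop = [n * Nh / N_tot for Nh in N]
NS = sum(Nh * Sh for Nh, Sh in zip(N, S))
n_opt = [n * Nh * Sh / NS for Nh, Sh in zip(N, S)]

v_opt = var_stratified(N, S, n_opt)
v_prop = var_stratified(N, S, n_prop)
v_srs = N_tot**2 / n * (1 - n / N_tot) * S2
```

In this example most of the gain over SRS comes from removing the between-stratum variation; optimal allocation adds a further gain because the $S_h$ differ strongly across strata.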
Remark
Neyman allocation is optimal for estimating the population total. However, if the goal is to compare the stratum means, equal allocation $n_h = n/H$ is a better rule.
Power allocation is also popular:
\[ n_h \propto N_h^{\alpha}, \]
where $\alpha > 0$ is a constant. Often $\alpha = 1/2$ is used.
For multivariate $y$, the optimal allocation for one variable is not necessarily optimal for another. Mathematical programming can be used (Section 4).
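Power allocation is easy to compute directly. The sketch below uses a hypothetical helper `power_allocation` and made-up stratum sizes; it returns real-valued $n_h$ and shows how $\alpha = 1/2$ shifts sample toward the small strata compared with proportional allocation ($\alpha = 1$).

```python
def power_allocation(N, n, alpha=0.5):
    """Real-valued n_h proportional to N_h ** alpha, summing to n."""
    weights = [Nh**alpha for Nh in N]
    total = sum(weights)
    return [n * w / total for w in weights]

# Hypothetical stratum sizes.
N = [100, 400, 2500]
n_half = power_allocation(N, 120)             # alpha = 1/2
n_prop = power_allocation(N, 120, alpha=1.0)  # proportional allocation
```

With these sizes the smallest stratum receives 15 units under $\alpha = 1/2$ but only 4 under proportional allocation, which is why power allocation is often seen as a compromise between proportional and equal allocation.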
The construction of Strata
Construction of stratum boundaries
Let $y_0$ and $y_H$ be the smallest and largest values of $y$ in the finite population. The problem is to find intermediate stratum boundaries $y_1, \ldots, y_{H-1}$ such that
\[ V(\hat{\bar{Y}}_{HT}) = \sum_{h=1}^{H} W_h^2 \left( \frac{1}{n_h} - \frac{1}{N_h} \right) S_h^2 \]
is a minimum, where $W_h = N_h / N$.
Under Neyman allocation, the above variance reduces to
\[ V(\hat{\bar{Y}}_{HT}) = \frac{1}{n} \left( \sum_{h=1}^{H} W_h S_h \right)^2 - \frac{1}{N} \sum_{h=1}^{H} W_h S_h^2 . \]
Thus, if the sampling fractions $n_h / N_h$ are ignored, it is sufficient to minimize $\sum_h W_h S_h$.
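The reduction under Neyman allocation follows from a short substitution. As a sketch, assume equal costs so that $n_h = n W_h S_h / \sum_g W_g S_g$ (the index $g$ is introduced here only to keep the two sums apart):

```latex
\begin{align*}
\sum_{h=1}^{H} \frac{W_h^2 S_h^2}{n_h}
  &= \frac{1}{n} \Big( \sum_{g=1}^{H} W_g S_g \Big) \sum_{h=1}^{H} W_h S_h
   = \frac{1}{n} \Big( \sum_{h=1}^{H} W_h S_h \Big)^2, \\
\sum_{h=1}^{H} \frac{W_h^2 S_h^2}{N_h}
  &= \frac{1}{N} \sum_{h=1}^{H} W_h S_h^2
  \qquad \text{since } W_h / N_h = 1/N .
\end{align*}
```

Subtracting the second line from the first gives the variance under Neyman allocation; only the first term depends on $n$, and it is the one that dominates when the sampling fractions are small.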
Idea of Dalenius and Hodges (1959)
Let $f(y)$ be the frequency function of $y$. If the strata are numerous and narrow, $f(y)$ should be approximately constant (rectangular) within a given stratum. Hence,
\[ W_h = \int_{y_{h-1}}^{y_h} f(t)\,dt \doteq f_h (y_h - y_{h-1}), \qquad S_h \doteq (y_h - y_{h-1}) / \sqrt{12}, \]
where $f_h$ is the constant value of $f(y)$ in stratum $h$.
Thus, writing $Z(y) = \int_{y_0}^{y} \sqrt{f(t)}\,dt$ and $Z_h = Z(y_h)$, we have
\[ \sum_{h=1}^{H} W_h S_h \propto \sum_{h=1}^{H} f_h (y_h - y_{h-1})^2 \doteq \sum_{h=1}^{H} (Z_h - Z_{h-1})^2 . \]
Dalenius and Hodges (1959) method
Since $Z_H - Z_0$ is fixed, $\sum_{h=1}^{H} (Z_h - Z_{h-1})^2$ is minimized by making $Z_h - Z_{h-1}$ constant.
To achieve this goal, the rule is to form the cumulative of $\sqrt{f(y)}$ and choose the $y_h$ so that they create equal intervals on the cumulative $\sqrt{f(y)}$ scale.
1. Partition the range of $y$ into $L(>2)H$ intervals of equal length.
2. For each interval $l$, compute $\sqrt{f_l}$, the square root of the frequency, and its cumulative sum.
3. Choose the stratum boundaries such that the sums of the $\sqrt{f_l}$ are about the same in each stratum.
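The three steps above can be sketched as follows. This is an illustrative implementation, not the authors' code: the function name, the `n_bins` argument (the number of equal-length intervals), and the rule of cutting at the first interval where the cumulative $\sqrt{f}$ reaches $k/H$ of its total are all assumptions. It also assumes the values of $y$ are not all equal, and with very coarse bins two boundaries could coincide.

```python
import math

def cum_sqrt_f_boundaries(y, H, n_bins):
    """Cumulative sqrt(f) rule on n_bins equal-width intervals.

    Returns the H - 1 interior stratum boundaries (upper edges of bins).
    """
    lo, hi = min(y), max(y)
    width = (hi - lo) / n_bins
    # Step 1-2: frequency f_l of each equal-width interval.
    freq = [0] * n_bins
    for v in y:
        l = min(int((v - lo) / width), n_bins - 1)  # maximum goes in last bin
        freq[l] += 1
    # Step 2: cumulative sum of sqrt(f_l).
    cum, running = [], 0.0
    for f in freq:
        running += math.sqrt(f)
        cum.append(running)
    # Step 3: cut where the cumulative sum first reaches k/H of its total.
    boundaries = []
    l = 0
    for k in range(1, H):
        target = running * k / H
        while cum[l] < target:
            l += 1
        boundaries.append(lo + (l + 1) * width)
    return boundaries

# Hypothetical skewed data: frequencies 16, 9, 4, 1 over four intervals.
y = [0.5] * 16 + [1.5] * 9 + [2.5] * 4 + [3.5]
boundaries = cum_sqrt_f_boundaries(y, H=3, n_bins=4)
```

In this toy example the $\sqrt{f_l}$ values are 4, 3, 2, 1, so the three strata carry cumulative $\sqrt{f}$ of 4, 3, and 3, which is as balanced as four intervals allow.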
Further thoughts
Since $\sum_{i \in U_h} \sum_{j \in U_h} (y_i - y_j)^2 = 2 N_h (N_h - 1) S_h^2$, we can write, approximately for large $N_h$,
\[ \sum_{h=1}^{H} W_h S_h \propto N^{-1} \sum_{h=1}^{H} \left\{ \sum_{i \in U_h} \sum_{j \in U_h} (y_i - y_j)^2 \right\}^{1/2} = N^{-1} \sum_{h=1}^{H} Q_h , \]
where the proportionality constant is $2^{-1/2}$.
Thus, we have only to choose the stratum boundaries such that the $Q_h$ are about the same, which means that the $Q_h^2$ are about the same.
Idea
1. Apply a hierarchical clustering method (or another clustering method) to minimize $Q_t = \sum_{h=1}^{H} Q_h$.
2. Identify the stratum $h^*$ with the highest value of $Q_h$.
3. In stratum $h^*$, identify the element $i^*$ with the highest value of $d_i = \sum_{j \in U_{h^*}} (y_i - y_j)^2$.
4. Move $i^*$ from stratum $h^*$ to another (neighboring) stratum and compute $Q_t$ again. If the move reduces $Q_t$, accept the change. Otherwise, go to the stratum with the second largest value of $Q_h$. Continue the process until no further move is accepted.
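The quantities $Q_h$ and $d_i$ are simple to compute for one stratum. The sketch below uses made-up values and also checks the identity $Q_h^2 = 2 N_h (N_h - 1) S_h^2$, which links the pairwise form to the stratum variance. It covers only the bookkeeping for steps 2 and 3, not the full reassignment loop.

```python
import math
from statistics import variance

def Q(y):
    """Square root of the sum of (y_i - y_j)^2 over all ordered pairs."""
    return math.sqrt(sum((a - b) ** 2 for a in y for b in y))

# Hypothetical y values in a single stratum.
y_h = [1.0, 2.0, 4.0, 7.0]
N_h = len(y_h)
S2_h = variance(y_h)                      # divisor N_h - 1

lhs = Q(y_h) ** 2
rhs = 2 * N_h * (N_h - 1) * S2_h

# d_i for each element; the candidate to move is the one with the largest d_i.
d = [sum((a - b) ** 2 for b in y_h) for a in y_h]
i_star = max(range(N_h), key=lambda i: d[i])
```

Here the element with value 7 has the largest $d_i$, so it would be the first candidate to move to a neighboring stratum.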
Number of Strata
First, consider an extreme case in which $y$ is generated from Uniform$(a, a + d)$. In this case,
\[ V_{SRS}(\bar{y}) = \frac{d^2}{12 n}. \]
From the same population, we can create $H$ strata with equal stratum sizes so that
\[ V_{ST}(\bar{y}_{st}) = \frac{1}{n} \left( \sum_{h=1}^{H} W_h S_h \right)^2 = \frac{1}{n} \left( \sum_{h=1}^{H} \frac{1}{H} \, \frac{d}{\sqrt{12}\, H} \right)^2 = \frac{d^2}{12 n H^2} = \frac{V_{SRS}(\bar{y})}{H^2}. \]
Thus, the variance decreases inversely as the square of $H$ when the optimal boundaries are chosen directly from the population.
Number of Strata
Now, suppose that the finite population is a realization of a superpopulation model
\[ \zeta : \; y_i = \alpha + \beta x_i + e_i, \qquad e_i \sim (0, \sigma_e^2). \]
Suppose that the optimum choice of stratum boundaries is determined by means of $x$, with samples of equal size $n/H$ in each stratum.
The model expectation of $V(\bar{y}_{st})$ is equal to
\[ E_\zeta \{ V(\bar{y}_{st}) \} = \frac{H}{n} E_\zeta \left\{ \sum_{h=1}^{H} W_h^2 S_{yh}^2 \right\} \ge \frac{1}{n} \left( \frac{\beta^2 \sigma_x^2}{H^2} + \sigma_e^2 \right) = \frac{S_y^2}{n} \left\{ \frac{\rho^2}{H^2} + (1 - \rho^2) \right\}. \]
$V(\bar{y}_{st}) / V(\bar{y})$ as a function of $H$ for the linear regression model

Number of            ρ
Strata      0.95    0.90    0.85
2           0.323   0.392   0.458
3           0.198   0.280   0.358
4           0.154   0.241   0.323
5           0.134   0.222   0.306
6           0.123   0.212   0.298
∞           0.098   0.190   0.277

The table is taken from Cochran (1977).
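The table entries appear to agree, to about three decimals, with the bound $\rho^2 / H^2 + (1 - \rho^2)$ from the linear regression model. The short sketch below recomputes that expression; the function name `variance_ratio` is a label chosen for the example.

```python
def variance_ratio(rho, H):
    """rho^2 / H^2 + (1 - rho^2), the bound on V(ybar_st) / V(ybar)."""
    return rho**2 / H**2 + (1 - rho**2)

rhos = [0.95, 0.90, 0.85]
# One row per number of strata, one entry per rho.
ratios = {H: [variance_ratio(r, H) for r in rhos] for H in (2, 3, 4, 5, 6)}
# Limit as H grows without bound: only the residual part 1 - rho^2 remains.
limit = [1 - r**2 for r in rhos]
```

The limit row makes the practical point of the table: once $H$ reaches 5 or 6 the ratio is already close to $1 - \rho^2$, so further strata bring little additional gain.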
Mathematical Programming
Components of the problem
1. Objective function: a function of one or several variables to be optimized;
2. Decision variables: the quantities that are adjusted in order to find a solution, e.g., sample sizes;
3. Parameters: fixed inputs that are treated as constants, e.g., stratum population counts and variances; and
4. Constraints: restrictions on the decision variables or combinations of the decision variables, e.g., domain sizes and cost.
Formal statement of an optimization problem
Problem: Find the set of sample sizes $\{n_h; h = 1, \ldots, H\}$ to minimize the weighted sum of relative variances
\[ \Psi = \sum_{j=1}^{J} \omega_j \, \mathrm{relvar}(\hat{\bar{y}}_j), \]
where $\omega_j$ is the weight for the importance of item $j$ and
\[ \hat{\bar{y}}_j = \sum_{h=1}^{H} W_h \bar{y}_{jh} = \sum_{h=1}^{H} W_h \left( \frac{1}{n_h} \sum_{i \in A_h} y_{i,j} \right). \]
Subject to the constraints:
1. $n_h \le N_h$ for all $h$;
2. $n_h \ge n_{min}$, a minimum sample size in every stratum;
3. $\{CV(\bar{y}_{j,sh})\}^2 \le (CV_{0jh})^2$ for certain strata and variables;
4. Budget: $C = C_0 + \sum_{h=1}^{H} c_h n_h$.
Software
Solver
SAS: Proc NLP, Proc Optmodel
R package: alabama
Proc NLP
\[ \min_x f(x), \quad x = (x_1, \ldots, x_p) \]
subject to
\[ c_i(x) = 0, \quad i = 1, \ldots, m_1, \]
\[ c_i(x) \ge 0, \quad i = m_1 + 1, \ldots, m_1 + m_2, \]
\[ l_j \le x_j \le u_j, \quad j = 1, \ldots, p. \]
Example: Artificial population of business establishments
Stratum population means, standard deviations, and proportions for an artificial population of business establishments

                        Population   Cost     Pop'n Proportion
h   Business Sector     Size (N_h)   (c_h)    Claimed Research Credit   Had Offshore Affiliates
1   Manufacturing          600        120     0.8                       0.06
2   Retail               1,200         80     0.2                       0.03
3   Wholesale              400         50     0.5                       0.03
4   Service              2,300         90     0.3                       0.21
5   Finance                500        150     0.9                       0.77
    Total                5,000                2,060                     952
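To connect the example data with the objective function $\Psi$, the sketch below evaluates the weighted sum of relative variances for one candidate allocation. The stratum sizes, costs, and proportions come from the table; the equal importance weights, the 10% proportional allocation, and the function names are assumptions made for illustration. The relative variance of an estimated total equals that of the corresponding estimated mean, so totals are used here.

```python
# Data from the table of business establishments.
N = [600, 1200, 400, 2300, 500]
cost = [120, 80, 50, 90, 150]
p_credit = [0.8, 0.2, 0.5, 0.3, 0.9]
p_offshore = [0.06, 0.03, 0.03, 0.21, 0.77]

def relvar_total(N, p, n_h):
    """Relative variance of the estimated total of a 0/1 item, stratified SRS."""
    total = sum(Nh * ph for Nh, ph in zip(N, p))
    var = 0.0
    for Nh, ph, nh in zip(N, p, n_h):
        S2 = Nh / (Nh - 1) * ph * (1 - ph)   # population variance of a 0/1 item
        var += Nh**2 / nh * (1 - nh / Nh) * S2
    return var / total**2

def objective(n_h, weights=(0.5, 0.5)):
    """Psi for the two items; the equal weights are a hypothetical choice."""
    return (weights[0] * relvar_total(N, p_credit, n_h)
            + weights[1] * relvar_total(N, p_offshore, n_h))

# Hypothetical candidate: sample 10% of every stratum.
n_cand = [Nh / 10 for Nh in N]
psi = objective(n_cand)
variable_cost = sum(c * nh for c, nh in zip(cost, n_cand))

# Population totals implied by the proportions (compare with the Total row).
total_credit = sum(Nh * ph for Nh, ph in zip(N, p_credit))
total_offshore = sum(Nh * ph for Nh, ph in zip(N, p_offshore))
```

A solver would search over the $n_h$ to make `objective` as small as possible while keeping the cost and the other constraints satisfied; this sketch only evaluates the pieces for a single candidate.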
Example (Cont’d): Excel spreadsheet set-up
[Figure: Excel spreadsheet set-up]