Quality Management
Lecture IV: Statistical Methods in QM (2)
Undergraduate Course
Instructors: Wei Jiang, Wenhui Zhao
Antai College of Economics and Management
Shanghai Jiao Tong University

Outline
• Point estimators, Confidence interval, Hypothesis test (P-value)
• Type I error and Type II error (and sample size)
• Statistical inference of a single sample
– The mean of a normal distribution, variance known
– The mean of a normal distribution, variance unknown
– The population proportion
• Statistical inference of two samples
– Difference of mean of two normal distributions, variance known
– Difference of mean of two normal distributions, variance unknown
– Difference of two population proportions
• Statistical inference of more than two samples (ANOVA)
• Read Chapter 3 of reference 1
Point Estimator
• What is the flight time of the first paper helicopter of your group?
• You would probably calculate the mean of the five flight times of your first paper helicopter.
• What will happen if you are asked to fly the paper helicopter another five times?
• Will you get the same mean?
• So, the mean you report depends on the sample you use: it is a sample statistic.
Point Estimator
• A point estimator of an unknown parameter is a statistic (a random variable) that corresponds to the parameter.
• Since the point estimator changes from sample to sample, what do we want from our point estimators?
• We want them to be concentrated around the true value.
• Properties of good point estimators:
‒ Unbiased (the expected value equals the true value)
‒ Small variance
• The sample mean $\bar{x}$ and sample variance $s^2$ are point estimators of the population mean μ and population variance σ².
• Since $E(x_i) = \mu$ and $E(x_i - \mu)^2 = \sigma^2 \Rightarrow E(x_i^2) = \sigma^2 + \mu^2$, we have
$$E(\bar{x}) = E\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \mu, \qquad \mathrm{Var}(\bar{x}) = \frac{1}{n}\sigma^2$$
$$E(s^2) = \frac{1}{n-1}E\left(\sum_{i=1}^{n}(x_i - \bar{x})^2\right) = \frac{1}{n-1}E\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right) = \frac{1}{n-1}\left[n\sigma^2 + n\mu^2 - n\left(\frac{\sigma^2}{n} + \mu^2\right)\right] = \sigma^2$$
• So, they are unbiased point estimators.
Point Estimator
• However, the sample standard deviation s is not an unbiased estimator of the population standard deviation σ:
$$E(s) = \left(\frac{2}{n-1}\right)^{1/2}\frac{\Gamma(n/2)}{\Gamma[(n-1)/2]}\,\sigma = c_4\sigma$$
• Then, an unbiased estimate of the standard deviation is obtained from
$$\hat{\sigma} = \frac{s}{c_4}$$
• As n gets large the bias goes to zero, i.e., c4 → 1. (We have a table of values of c4 for sample sizes 2 ≤ n ≤ 25.)
• Is it easy to calculate the sample standard deviation s?
• Apparently, it is not an issue nowadays.
• It was a problem in the old days, when computers were not so common.
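As a small illustration of the c4 formula, the constant can be evaluated directly from the gamma function instead of being read from a table. This is only a sketch: the function name `c4` and the chosen sample sizes are ours, not part of the lecture.

```python
import math


def c4(n: int) -> float:
    """Bias-correction constant c4 for s, assuming normally distributed data."""
    if n < 2:
        raise ValueError("c4 is defined for n >= 2")
    # Work with log-gamma so the ratio stays stable for large n.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


# c4 approaches 1 as n grows, so the bias of s vanishes.
c4_values = {n: round(c4(n), 4) for n in (2, 5, 10, 25, 100)}
print(c4_values)
```

For n = 5 this gives c4 of about 0.9400, i.e. 1/c4 of about 1.0638.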
Estimate Standard Deviation
• In many applications, it is quick and easy to estimate the standard deviation from the range.
• Let x1, x2, ···, xn be a random sample from a normal distribution with mean μ and variance σ². The range is
R = max(xi) – min(xi) = xmax – xmin
• The mean of W = R/σ is a constant d2 (which depends on n): E(W) = d2.
• Then, an unbiased estimator of the standard deviation σ is
$$\hat{\sigma} = \frac{R}{d_2}$$
• d2 for sample sizes 2 ≤ n ≤ 25 can be found from tables.
EXAMPLE:
• The flight times of the first paper helicopter of group 1 (G1-D1) are:
R1    R2    R3    R4    R5
1.9   2.03  2.44  2.25  1.72
• Using the quadratic estimator s, we have s = 0.284.
• Using the range, we have R = 2.44 – 1.72 = 0.72.
• For sample size n = 5, 1/d2 = 0.4299:
$$\hat{\sigma}_1 = \frac{R}{d_2} = 0.72 \times 0.4299 = 0.310$$
• For sample size n = 5, 1/c4 = 1.0638:
$$\hat{\sigma}_2 = \frac{s}{c_4} = 0.284 \times 1.0638 = 0.302$$
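The two estimates in this example can be reproduced with a short sketch. The value d2 = 2.326 for n = 5 is the tabulated constant (its reciprocal is the 0.4299 quoted in the example); the variable names are illustrative only.

```python
import math
import statistics

# Flight times of G1-D1-H1 from the example.
flight_times = [1.90, 2.03, 2.44, 2.25, 1.72]
n = len(flight_times)

D2_N5 = 2.326  # tabulated d2 for n = 5 (assumed table value, 1/d2 = 0.4299)
c4 = math.sqrt(2 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

s = statistics.stdev(flight_times)                   # quadratic estimator
sample_range = max(flight_times) - min(flight_times)

sigma_from_range = sample_range / D2_N5              # R / d2
sigma_from_s = s / c4                                # s / c4

print(f"R/d2 = {sigma_from_range:.3f}, s/c4 = {sigma_from_s:.3f}")
```

Both estimators are unbiased for σ under normality, so on a single sample of five they are expected to be close but not identical.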
Estimate Standard Deviation
Helicopter   R1    R2    R3    R4    R5    R/d2   s/c4
G1-D1-H1     1.9   2.03  2.44  2.25  1.72  0.310  0.302
G1-D1-H2     2.15  1.9   2.25  2.09  1.84  0.176  0.183
G1-D2-H1     2.19  1.66  2.09  1.91  1.82  0.228  0.225
G1-D2-H2     2.07  2     2     2.31  2.72  0.310  0.327
G2-D1-H1     1.7   1.67  1.6   1.88  1.56  0.138  0.132
G2-D1-H2     1.88  1.91  1.94  2     1.75  0.107  0.099
G2-D2-H1     1.53  1.75  1.69  1.69  1.65  0.095  0.087
G2-D2-H2     1.98  1.96  2     2.08  2.1   0.060  0.066
G3-D1-H1     1.75  1.91  1.75  1.82  1.75  0.069  0.075
G3-D1-H2     1.84  1.79  1.66  1.63  1.66  0.090  0.099
G3-D2-H1     2.45  1.75  1.57  1.44  1.66  0.434  0.420
G3-D2-H2     1.62  1.31  1.75  2     1.62  0.297  0.266
G4-D1-H1     1.78  1.75  1.85  1.94  1.94  0.082  0.094
G4-D1-H2     2.09  2.62  2.32  2.5   2.85  0.327  0.308
G4-D2-H1     2     2.28  2.28  2.34  2.19  0.146  0.142
G4-D2-H2     2.09  1.78  1.82  1.72  1.84  0.159  0.151

Estimates after pooling the two helicopters of each G-D combination (n = 10) and all four helicopters of each group (n = 20):
Sample   n    R/d2   s/c4
G1-D1    10   0.234  0.228
G1-D2    10   0.344  0.299
G2-D1    10   0.143  0.157
G2-D2    10   0.185  0.208
G3-D1    10   0.091  0.091
G3-D2    10   0.370  0.326
G4-D1    10   0.357  0.397
G4-D2    10   0.201  0.240
G1       20   0.284  0.255
G2       20   0.153  0.179
G3       20   0.305  0.231
G4       20   0.303  0.322
Point Estimator
• Generally, the "quadratic estimator" s is preferable.
• However, if the sample size n is relatively small, the range method actually works very well.
• For small sample sizes (say, n < 6), R works very well and is entirely satisfactory.
• For moderate n (say, n > 10), R loses efficiency rapidly, as it ignores all information in the sample between the extremes. It is suggested to use s.
• If we estimate the standard deviation of one paper helicopter of a group (a sample of five data points), should we use s or R?
• If we want to estimate the standard deviation of a group (a sample of twenty data points), should we use s or R?
Confidence Intervals
• Again, what is the flight time of the first paper helicopter of your group?
• Since the point estimators will change from sample to sample, we might want to know: if we repeat the sampling 100 times, what interval will contain the majority of those point estimators?
• The majority is pre-determined and is called the confidence level.
• Generally, we use 95%.
Confidence Intervals
• An interval estimate of a parameter is an interval that includes the true value of the parameter with a given probability.
• For example, to construct an interval estimator of the mean μ, we must find two statistics L and U such that
P{L ≤ μ ≤ U} = 1 − α.
• The resulting interval [L, U] is called a two-sided 100(1 − α)% confidence interval for the unknown mean μ.
• 1 − α is called the confidence level; L and U are called the lower and upper confidence limits.
Confidence Intervals
• Sometimes a one-sided confidence interval might be more appropriate.
• A one-sided lower 100(1 − α)% confidence interval on μ is given by
L ≤ μ
where L, the lower confidence bound, is chosen so that
P{L ≤ μ} = 1 − α
• A one-sided upper 100(1 − α)% confidence interval on μ is given by
μ ≤ U
where U, the upper confidence bound, is chosen so that
P{μ ≤ U} = 1 − α
Hypothesis Test
• Suppose that, in general, paper helicopters made under the same conditions (as in our tests) have a flight time of 2.30 seconds with standard deviation 0.30 seconds.
• Does the first paper helicopter of group 1 meet the standard?
• How confident are we in the above judgment?
Hypothesis Test
• Statistical inference can be classified into two broad categories: parameter estimation (point estimator and confidence interval) and hypothesis testing.
• A statistical hypothesis is a statement about the values of the parameters of a probability distribution.
H0: μ = μ0 (Null Hypothesis)
H1: μ ≠ μ0 (Alternative Hypothesis)
• The above is a two-sided alternative hypothesis.
• One-sided hypothesis tests:
H0: μ = μ0, H1: μ < μ0
H0: μ = μ0, H1: μ > μ0
• The choice of μ0 in H0 depends heavily on knowledge and past experience.
Hypothesis Test
• We reject H0 only if we have strong evidence from the test statistic. Otherwise, we have to accept it!
• The set of values of the test statistic leading to rejection of H0 is called the critical region or rejection region for the test.
– One-sided / two-sided test (when computing the rejection region, pay attention to whether the test is one-sided or two-sided)
• If the test statistic is within the rejection region, then reject H0. Otherwise accept H0.
Stat Inference for a Single Sample
Inference on the Mean of a Normal Distribution, Variance Known
Null Hypothesis: H0: μ = μ0
Test Statistic:
$$Z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$$
Alternative Hypothesis         Rejection Region
H1: μ ≠ μ0 (two-sided)         |Z0| > Zα/2
H1: μ > μ0 (one-sided)         Z0 > Zα
H1: μ < μ0 (one-sided)         Z0 < −Zα
Confidence Interval on the Mean
Two-sided: $\bar{x} - Z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{x} + Z_{\alpha/2}\,\sigma/\sqrt{n}$
One-sided: $\mu \le \bar{x} + Z_{\alpha}\,\sigma/\sqrt{n}$, or $\bar{x} - Z_{\alpha}\,\sigma/\sqrt{n} \le \mu$

Inference on the Mean of a Normal Distribution, Variance Known
EXAMPLE:
• Suppose paper helicopters made under the same conditions (as in our tests) have a flight time of 2.3 sec with standard deviation 0.3 sec.
• Does the first paper helicopter of group 1 meet the standard?
G1-D1:  R1    R2    R3    R4    R5
        1.9   2.03  2.44  2.25  1.72
• The appropriate hypotheses are H0: μ = 2.3, H1: μ ≠ 2.3
• Solution: The sample size is n = 5. The sample average flight time is $\bar{x}$ = 2.07 seconds. The test statistic is
$$Z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{2.07 - 2.3}{0.3/\sqrt{5}} = -1.71$$
• The level of significance is α = 0.05. It is a two-sided hypothesis test. Then, Zα/2 = Z0.025 = 1.96, and |Z0| < Zα/2.
Inference on the Mean of a Normal Distribution, Variance Known
• We can see that our test result cannot reject H0.
• That means we do not have strong evidence that the flight time of the first paper helicopter of group 1 is significantly different from 2.3 seconds.
• However, note that it does NOT mean that our test confirms H0.
Inference on the Mean of a Normal Distribution, Variance Known
EXAMPLE - CONT:
• Other people believe paper helicopters made under the same conditions will have a flight time of 1.9 seconds with standard deviation 0.3 seconds.
• Does the first paper helicopter of group 1 meet the standard?
G1-D1:  R1    R2    R3    R4    R5
        1.9   2.03  2.44  2.25  1.72
• The appropriate hypotheses are H0: μ = 1.9, H1: μ ≠ 1.9
• Solution: The sample size is n = 5. The sample average flight time is $\bar{x}$ = 2.07 seconds. The test statistic is
$$Z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{2.07 - 1.9}{0.3/\sqrt{5}} = 1.27$$
• The level of significance is α = 0.05. It is a two-sided hypothesis test. Then, Zα/2 = Z0.025 = 1.96, and |Z0| < Zα/2, so H0 is again not rejected.
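The two z-tests above (μ0 = 2.3 and μ0 = 1.9) can be checked with a few lines of code. This is a minimal sketch using only the Python standard library; the helper name `z_statistic` is ours.

```python
import math
from statistics import NormalDist


def z_statistic(xbar: float, mu0: float, sigma: float, n: int) -> float:
    """Z0 = (xbar - mu0) / (sigma / sqrt(n)) for a known sigma."""
    return (xbar - mu0) / (sigma / math.sqrt(n))


xbar, sigma, n, alpha = 2.07, 0.3, 5, 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value, about 1.96

results = {}
for mu0 in (2.3, 1.9):
    z0 = z_statistic(xbar, mu0, sigma, n)
    results[mu0] = (z0, abs(z0) > z_crit)     # (statistic, reject H0?)
    print(f"mu0 = {mu0}: Z0 = {z0:.2f}, reject H0: {abs(z0) > z_crit}")
```

Neither null hypothesis is rejected, which illustrates that failing to reject H0 does not confirm it: two different values of μ0 both survive the same data.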
Inference on the Mean of a Normal Distribution, Variance Known
• In fact, there will be a range of values of μ0 in the null hypothesis H0 that the sample cannot reject.
• More interestingly, from
$$Z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}},$$
will the range depend on n? What happens as n → ∞?
• We also need to test the normality of the data, which we have done in Lecture 03.
• What are the real world applications?
Alternative Hypothesis         Rejection Region
H1: μ ≠ μ0 (two-sided)         |Z0| > Zα/2
H1: μ > μ0 (one-sided)         Z0 > Zα
H1: μ < μ0 (one-sided)         Z0 < −Zα
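To explore the question about n, the set of μ0 values that a two-sided test cannot reject can be computed directly: it is the interval where |Z0| ≤ Zα/2. The sketch below assumes, purely for illustration, that the sample mean stays at 2.07 while n grows; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def non_rejection_range(xbar: float, sigma: float, n: int, alpha: float = 0.05):
    """Values of mu0 with |Z0| <= Z_{alpha/2}, i.e. not rejected by the two-sided test."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Assumption: xbar = 2.07 is held fixed for every n to isolate the effect of n.
ranges = {n: non_rejection_range(2.07, 0.3, n) for n in (5, 50, 500)}
for n, (low, high) in ranges.items():
    print(f"n = {n:3d}: H0 not rejected for {low:.3f} <= mu0 <= {high:.3f}")
```

With n = 5 the range covers both 1.9 and 2.3, and it shrinks in proportion to 1/√n as more data are collected.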
Inference on the Mean of a Normal Distribution, Variance Known
P-Values
Definition
The P-value is the smallest level of significance (either two-sided or one-sided) that would lead to rejection of the null hypothesis H0. (Why do we need it?)
• With the P-value, a decision maker can determine how significant the data are without imposing a preselected level of significance.
• For the normal distribution tests, the P-value is
$$P = \begin{cases} 2[1 - \Phi(|Z_0|)] & \text{for a two-tailed test: } H_0: \mu = \mu_0,\ H_1: \mu \ne \mu_0 \\ 1 - \Phi(Z_0) & \text{for an upper-tailed test: } H_0: \mu = \mu_0,\ H_1: \mu > \mu_0 \\ \Phi(Z_0) & \text{for a lower-tailed test: } H_0: \mu = \mu_0,\ H_1: \mu < \mu_0 \end{cases}$$
• Consider the paper helicopter example with μ0 = 2.3 seconds. The computed value of the test statistic is Z0 = −1.71 and, since the alternative hypothesis is two-tailed, the P-value is
P = 2[1 − Φ(1.71)] = 0.087
• Thus, H0 would be rejected at any level of significance α ≥ P = 0.087. E.g., H0 would be rejected if α = 0.1 but not if α = 0.05.
[Figures: the P-value shown as the tail area of the distribution centered at μ0 beyond the observed $\bar{x}$, for the different alternative hypotheses.]
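The three P-value cases can be written as one small function. This is a sketch; the function name and the `alternative` labels are our own choices, not notation from the lecture.

```python
from statistics import NormalDist


def p_value(z0: float, alternative: str = "two-sided") -> float:
    """P-value of a z-test for H1: mu != mu0, mu > mu0 ("greater") or mu < mu0 ("less")."""
    phi = NormalDist().cdf
    if alternative == "two-sided":
        return 2 * (1 - phi(abs(z0)))
    if alternative == "greater":
        return 1 - phi(z0)
    if alternative == "less":
        return phi(z0)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


p_two_sided = p_value(-1.71)  # helicopter example with mu0 = 2.3
print(f"P = {p_two_sided:.3f}")
```

Comparing the returned value with α = 0.1 and α = 0.05 reproduces the decisions stated above.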
Stat Inference for a Single Sample
Inference on the Mean of a Normal Distribution, Variance Unknown
• As σ² is unknown, it may be estimated by s². We use the t-test.
Null Hypothesis: H0: μ = μ0
Test Statistic:
$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}$$
Alternative Hypothesis         Rejection Region
H1: μ ≠ μ0 (two-sided)         |t0| > tα/2,n−1
H1: μ > μ0 (one-sided)         t0 > tα,n−1
H1: μ < μ0 (one-sided)         t0 < −tα,n−1
• One could also compute the P-value for a t-test.
Confidence Interval on the Mean
Two-sided: $\bar{x} - t_{\alpha/2,n-1}\,s/\sqrt{n} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\,s/\sqrt{n}$
One-sided: $\mu \le \bar{x} + t_{\alpha,n-1}\,s/\sqrt{n}$, or $\bar{x} - t_{\alpha,n-1}\,s/\sqrt{n} \le \mu$

Inference on the Mean of a Normal Distribution, Variance Unknown
EXAMPLE:
• People believe that paper helicopters made under the same conditions will have a flight time of at least 2.4 seconds.
• Does the first paper helicopter of group 1 meet the standard?
G1-D1:  R1    R2    R3    R4    R5
        1.9   2.03  2.44  2.25  1.72
• The appropriate hypotheses are H0: μ ≥ 2.4, H1: μ < 2.4
• Solution: Since there is not enough information about the standard deviation of the flight times, we use the t-test. The sample size is n = 5. The sample mean is $\bar{x}$ = 2.07 sec, and the sample standard deviation is s = 0.28 sec. The test statistic is
$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{2.07 - 2.4}{0.28/\sqrt{5}} = -2.6$$
• The level of significance is α = 0.05. It is a one-sided test. Then, tα,n−1 = t0.05,4 = 2.13, and t0 < −t0.05,4, so H0 is rejected.
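The t-test example can be reproduced from the raw flight times. The sketch below uses only the standard library, so the critical value t0.05,4 = 2.132 is entered as a tabulated constant rather than computed; variable names are illustrative.

```python
import math
import statistics

flight_times = [1.90, 2.03, 2.44, 2.25, 1.72]  # G1-D1-H1
mu0 = 2.4
T_CRIT = 2.132  # tabulated t_{0.05,4} (assumed table value)

n = len(flight_times)
xbar = statistics.mean(flight_times)
s = statistics.stdev(flight_times)

t0 = (xbar - mu0) / (s / math.sqrt(n))
reject_h0 = t0 < -T_CRIT                        # lower-tailed test, H1: mu < 2.4

# Matching one-sided upper confidence bound on mu.
upper_bound = xbar + T_CRIT * s / math.sqrt(n)

print(f"t0 = {t0:.2f}, reject H0: {reject_h0}, mu <= {upper_bound:.3f}")
```

The upper confidence bound falls below 2.4, which is the same conclusion as the test expressed as an interval.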
Two Types of Errors in Hypothesis Test
• Note that even if our sample is in fact from a distribution following the null hypothesis H0, the probability that the test statistic falls in the rejection region is not zero.
• So, we might reject H0 by mistake.
• Also, even if our sample is in fact from a distribution not following the null hypothesis H0, the probability that the test statistic falls in the acceptance region is not zero.
• So, we might accept H0 by mistake.
Two Types of Errors in Hypothesis Test
• Type I error: the null hypothesis is rejected when it is actually true.
• Type II error: the null hypothesis is not rejected when it is actually false.
• The probabilities of the two types of errors are denoted as
α = P{type I error} = P{reject H0 | H0 is true}
β = P{type II error} = P{fail to reject H0 | H0 is false}
• α is just the level of significance of the test.
• Power of the test (the probability of correctly rejecting H0):
Power = P{reject H0 | H0 is false} = 1 − β
Two Types of Errors in Hypothesis Test
[Figure: distributions centered at μ0 and μ1 (normal here, but can be other distributions); the rejection region lies outside the limits L and U with area α/2 in each tail, and β is the area of the μ1 distribution between L and U. A second panel shows the effect of increasing the sample size n.]
Two Types of Errors in Hypothesis Test
• Type I error α is sometimes called the producer's risk: the probability that a good lot/process is rejected.
• Type II error β is sometimes called the consumer's risk: the probability that a bad lot/process is accepted.
• The β risk is generally a function of the sample size n: the larger the n used in the test, the smaller the β risk.
• Type I and II errors in the real world (e.g., hospital)?
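The dependence of β on n can be made concrete for the two-sided z-test. If the true mean is μ1 = μ0 + δ, then Z0 is normal with mean δ√n/σ and unit variance, so β = Φ(Zα/2 − δ√n/σ) − Φ(−Zα/2 − δ√n/σ). The sketch below uses the helicopter numbers with a hypothetical true shift δ = −0.23 (an assumption for illustration, not a value claimed in the lecture).

```python
import math
from statistics import NormalDist


def beta_two_sided(delta: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Type II error of a two-sided z-test when the true mean is mu0 + delta."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * math.sqrt(n) / sigma      # mean of Z0 under the true mean
    return std_normal.cdf(z - shift) - std_normal.cdf(-z - shift)


# Hypothetical shift of -0.23 s with sigma = 0.3 s, for several sample sizes.
betas = {n: beta_two_sided(delta=-0.23, sigma=0.3, n=n) for n in (5, 10, 20, 40)}
for n, beta in betas.items():
    print(f"n = {n:2d}: beta = {beta:.3f}, power = {1 - beta:.3f}")
```

With α fixed, β falls steadily as n grows, which is why sample size is the usual lever for controlling the consumer's risk.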
Stat Inference for a Single Sample
Inference on a Population Proportion
Null Hypothesis: H0: p = p0
Test Statistic (the standard deviation uses p0):
$$Z_0 = \frac{x - np_0}{\sqrt{np_0(1-p_0)}} = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$
Alternative Hypothesis         Rejection Region
H1: p ≠ p0 (two-sided)         |Z0| > Zα/2
H1: p > p0 (one-sided)         Z0 > Zα
H1: p < p0 (one-sided)         Z0 < −Zα
Confidence Interval on the Proportion (use $\hat{p}$)
Two-sided: $\hat{p} - Z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n} \le p \le \hat{p} + Z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
One-sided: $\hat{p} - Z_{\alpha}\sqrt{\hat{p}(1-\hat{p})/n} \le p$, or $p \le \hat{p} + Z_{\alpha}\sqrt{\hat{p}(1-\hat{p})/n}$
• If n is large and 0.1 ≤ p ≤ 0.9, the normal approximation can be used.
• But if n is small, the binomial distribution should be used.
• If n is large but p is small (or large), the Poisson approximation can be used.

Inference on a Population Proportion
EXAMPLE
Suppose we define a non-defective paper helicopter as one whose average flight time is at least 1.75 seconds. We test the hypothesis that the fraction non-conforming is 10% among the 8 paper helicopters of groups 1 and 3.
           R1    R2    R3    R4    R5    Average  Defective?
Group 1    1.9   2.03  2.44  2.25  1.72  2.068    0
           2.15  1.9   2.25  2.09  1.84  2.046    0
           2.19  1.66  2.09  1.91  1.82  1.934    0
           2.07  2     2     2.31  2.72  2.22     0
Group 3    1.75  1.91  1.75  1.82  1.75  1.796    0
           1.84  1.79  1.66  1.63  1.66  1.716    1
           2.45  1.75  1.57  1.44  1.66  1.774    0
           1.62  1.31  1.75  2     1.62  1.66     1
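The proportion test for this example (2 defectives among 8 helicopters, p0 = 0.1) can be sketched as follows. Because n is small, the sketch also computes an exact binomial upper-tail probability for comparison; that extra check is our addition, not part of the slide.

```python
import math
from statistics import NormalDist

x, n, p0 = 2, 8, 0.10          # 2 defective helicopters out of 8, H0: p = 0.1
p_hat = x / n

# Normal-approximation test statistic; the standard deviation uses p0.
z0 = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_normal = 2 * (1 - NormalDist().cdf(abs(z0)))   # two-sided P-value

# Exact binomial probability of observing x or more defectives under H0.
p_upper_exact = sum(
    math.comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x, n + 1)
)

print(f"Z0 = {z0:.2f}, two-sided P = {p_normal:.2f}, exact P(X >= {x}) = {p_upper_exact:.3f}")
```

With only 8 units the standard error of the estimated proportion is large, so even an observed fraction of 0.25 is compatible with p = 0.1.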
Inference on a Population Proportion
Solution: The hypothesis test: H0: p = 0.1, H1: p ≠ 0.1
$$Z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{2/8 - 0.1}{\sqrt{0.1 \times 0.9/8}} = 1.41$$
Using α = 0.05 we find Z0.025 = 1.96. Therefore, H0 is not rejected (the P-value is P = 0.16).
Discussion: why do we fail to reject H0, although the sample proportion 2/8 = 0.25 appears to be quite different from 0.1?
Inference on a Population Proportion
We still need to check the normality:
Seems to be OK.
Statistical Inference for Two Samples
Difference in Mean, Variance Known
Testing Hypotheses on μ1 – μ2, Variance Known
Null hypothesis: H0: μ1 − μ2 = Δ0
Test statistic:
$$Z_0 = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}, \qquad \mathrm{Var}(\bar{x}_1 - \bar{x}_2) = \mathrm{Var}(\bar{x}_1) + \mathrm{Var}(\bar{x}_2)$$
Alternative Hypotheses         Rejection Criterion
H1: μ1 − μ2 ≠ Δ0               |Z0| > Zα/2
H1: μ1 − μ2 > Δ0               Z0 > Zα
H1: μ1 − μ2 < Δ0               Z0 < −Zα
Assumptions:
• x11, x12, ..., x1n1 is a random sample from population 1.
• x21, x22, ..., x2n2 is a random sample from population 2.
• The two populations represented by x1 and x2 are independent.
• Both populations are normal, or if they are not normal, the conditions of the central limit theorem apply.
Difference in Mean, Variance Known
Confidence Interval on a Difference in Means
Two-sided:
$$\bar{x}_1 - \bar{x}_2 - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
One-sided:
$$\bar{x}_1 - \bar{x}_2 - Z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \quad\text{or}\quad \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + Z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Difference in Mean, Variance Unknown
The Two-Sample Pooled t-Test
• Only for σ1² = σ2² = σ² (for σ1² ≠ σ2², study it yourself)
Null hypothesis: H0: μ1 − μ2 = Δ0
Test statistic:
$$s_p^2 = \frac{n_1 - 1}{(n_1 - 1) + (n_2 - 1)}\,s_1^2 + \frac{n_2 - 1}{(n_1 - 1) + (n_2 - 1)}\,s_2^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
$$t_0 = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
Alternative Hypotheses         Rejection Criterion
H1: μ1 − μ2 ≠ Δ0               |t0| > tα/2,n1+n2−2
H1: μ1 − μ2 > Δ0               t0 > tα,n1+n2−2
H1: μ1 − μ2 < Δ0               t0 < −tα,n1+n2−2
Confidence Interval on a Difference in Means
Two-sided:
$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
One-sided:
$$\bar{x}_1 - \bar{x}_2 - t_{\alpha,\,n_1+n_2-2}\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \quad\text{or}\quad \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha,\,n_1+n_2-2}\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Difference in Mean, Variance Unknown
EXAMPLE
• We would like to see whether the performance of group 1 is significantly better than that of group 3.
• We take out the clear outlier 2.72 from group 1.
• We use the sample standard deviations of the two groups.
• Use α = 0.01.
           R1    R2    R3    R4    R5
Group 1    1.9   2.03  2.44  2.25  1.72
           2.15  1.9   2.25  2.09  1.84
           2.19  1.66  2.09  1.91  1.82
           2.07  2     2     2.31  2.72
Group 3    1.75  1.91  1.75  1.82  1.75
           1.84  1.79  1.66  1.63  1.66
           2.45  1.75  1.57  1.44  1.66
           1.62  1.31  1.75  2     1.62

           Average  Stdev
Group 1    2.03     0.20
Group 3    1.74     0.23
Difference in Mean, Variance Unknown
      Avg   Stdev
G1    2.03  0.20
G3    1.74  0.23
• The hypotheses of interest are H0: μ1 − μ2 = 0, H1: μ1 − μ2 > 0
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(19 - 1)\,0.20^2 + (20 - 1)\,0.23^2}{19 + 20 - 2} = 0.0466$$
$$t_0 = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{2.03 - 1.74}{\sqrt{\dfrac{0.0466}{19} + \dfrac{0.0466}{20}}} = 4.19$$
• Because the test statistic t0 = 4.19 > t0.01,19+20−2 = t0.01,37 = 2.431, we reject H0 at the 0.01 level.
• The P-value for this test is 1 − T(4.19, 37), approximately 8×10⁻⁵.
• Therefore, H0 would be rejected at any significance level α above roughly 8×10⁻⁵.
Difference in Mean, Variance Unknown
• Box plots: indicate a difference in the median of the two samples.
• Probability plots: both approximately along straight lines, with similar slopes. The slope of the line is proportional to the standard deviation. (Group 3 does have outliers.)
• So, the normality and equal variances assumptions are reasonable.
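The pooled t-test arithmetic can be checked from the summary statistics of the two groups. This is a sketch using the rounded averages and standard deviations from the slide; the critical value 2.431 for 37 degrees of freedom is entered as a tabulated constant.

```python
import math

# Summary statistics (group 1 without the outlier 2.72, so n1 = 19).
n1, xbar1, s1 = 19, 2.03, 0.20
n2, xbar2, s2 = 20, 1.74, 0.23
delta0 = 0.0
T_CRIT = 2.431  # tabulated t_{0.01,37} (assumed table value)

df = n1 + n2 - 2
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df           # pooled variance
t0 = (xbar1 - xbar2 - delta0) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
reject_h0 = t0 > T_CRIT                                    # H1: mu1 - mu2 > 0

print(f"df = {df}, sp^2 = {sp2:.4f}, t0 = {t0:.2f}, reject H0: {reject_h0}")
```

The pooled variance is a weighted average of the two sample variances, with weights proportional to each sample's degrees of freedom.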
Inference on Two Population Proportions
• x11, x12, ..., x1n1 is a random sample from population 1.
• x21, x22, ..., x2n2 is a random sample from population 2.
• The two populations represented by x1 and x2 are independent.
Null hypothesis: H0: p1 = p2
Test statistic:
$$Z_0 = \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
Alternative Hypotheses         Rejection Criterion
H1: p1 ≠ p2                    |Z0| > Zα/2
H1: p1 > p2                    Z0 > Zα
H1: p1 < p2                    Z0 < −Zα
Confidence interval on the difference:
$$\hat{p}_1 - \hat{p}_2 - Z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \le p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + Z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

Inference on Two Population Proportions
EXAMPLE
Suppose we define a non-defective paper helicopter as one whose average flight time is at least 1.70 seconds. We test the hypothesis that the fraction non-conforming of group 1 is better (lower) than that of group 3.
           R1    R2    R3    R4    R5    Average  Defective?
Group 1    1.9   2.03  2.44  2.25  1.72  2.068    0
           2.15  1.9   2.25  2.09  1.84  2.046    0
           2.19  1.66  2.09  1.91  1.82  1.934    0
           2.07  2     2     2.31  2.72  2.22     0
Group 3    1.75  1.91  1.75  1.82  1.75  1.796    0
           1.84  1.79  1.66  1.63  1.66  1.716    0
           2.45  1.75  1.57  1.44  1.66  1.774    0
           1.62  1.31  1.75  2     1.62  1.66     1
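The two-proportion test for this example (0 of 4 defective in group 1, 1 of 4 in group 3) can be sketched as below, following the pooled-proportion statistic given above. Variable names are illustrative; with samples this small the normal approximation is rough, so the numbers are best read as a worked calculation rather than a reliable inference.

```python
import math
from statistics import NormalDist

x1, n1 = 0, 4   # defectives in group 1
x2, n2 = 1, 4   # defectives in group 3
alpha = 0.05

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)                      # pooled estimate under H0: p1 = p2

std_err = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z0 = (p1_hat - p2_hat) / std_err

z_crit = NormalDist().inv_cdf(1 - alpha)              # one-sided, about 1.645
reject_h0 = z0 < -z_crit                              # H1: p1 < p2

print(f"p_pooled = {p_pooled}, Z0 = {z0:.2f}, reject H0: {reject_h0}")
```

A single defective unit out of eight carries very little information, which is why the test cannot separate the two groups.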
Inference on Two Population Proportions
Solution: The hypothesis test: H0: p1 − p2 = 0, H1: p1 − p2 < 0
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{0 + 1}{4 + 4} = 0.125$$
$$Z_0 = \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0 - 1/4}{\sqrt{0.125(1 - 0.125)\left(\dfrac{1}{4} + \dfrac{1}{4}\right)}} = -1.07$$
Using α = 0.05 we find Z0.05 = 1.645. Since Z0 = −1.07 > −1.645, H0 is not rejected.
Still think about: why do we fail to reject H0, although it seems that the two samples have different p?

Statistical Inference of More Than Two Samples (ANOVA)
• Let the number of samples be a > 2. If we conduct a t-test for each combination of two samples:
– The number of t-tests required: $C_a^2 = a(a-1)/2$
• Let a = 4, so 6 t-tests are needed. Each t-test has an α = 0.05 chance of being (falsely) significant.
• The overall Type I error rate is 1 – (0.95)^6 ≈ 0.26. The overall Type I error rate increases dramatically. So, we do not want to have too many t-tests.
• ANOVA only has one test for the multiple samples. We will discuss it in the design of experiment part.
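The growth of the overall Type I error rate with the number of samples can be tabulated quickly. The sketch below follows the slide's calculation, which treats the pairwise t-tests as independent (a simplifying assumption); the function name is ours.

```python
import math


def familywise_error(num_samples: int, alpha: float = 0.05):
    """Number of pairwise t-tests and the overall Type I error rate,
    treating the tests as independent (simplifying assumption)."""
    num_tests = math.comb(num_samples, 2)
    return num_tests, 1 - (1 - alpha) ** num_tests


for a in (3, 4, 6, 10):
    tests, overall = familywise_error(a)
    print(f"a = {a:2d}: {tests:2d} t-tests, overall Type I error about {overall:.2f}")
```

For a = 4 this gives 6 tests and an overall rate of about 0.26, matching the figure above.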
Lecture 04 Summary
• Point estimators, Confidence interval, Hypothesis test (P-value)
• Type I error and Type II error (and sample size)
• Statistical inference of a single sample
– The mean of a normal distribution, variance known
– The mean of a normal distribution, variance unknown
– The population proportion
• Statistical inference of two samples
– Difference of mean of two normal distributions, variance known
– Difference of mean of two normal distributions, variance unknown
– Inference of two population proportions
• Statistical inference of more than two samples (ANOVA)