Step 2: Specify a criterion for selecting a model

Step 1: Specify the maximum model to be considered.
- Dummy variables, quadratic terms, cubic terms, interaction terms, etc.
From Kleinbaum, Kupper, Muller, Nizam: Applied Regression Analysis and Multivariable Methods. Duxbury, CA, USA
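
As a rough illustration of what a maximum model can contain, the sketch below builds a design matrix with dummy variables, quadratic and cubic terms, and interaction terms. The predictors `age` and `group`, the data values, and the helper `maximum_model_matrix` are hypothetical and are not taken from KKM.

```python
import numpy as np

def maximum_model_matrix(age, group, levels):
    """Design matrix for a hypothetical maximum model (illustrative only)."""
    age = np.asarray(age, dtype=float)
    group = np.asarray(group)
    # Dummy variables: one 0/1 column per level, skipping the reference level.
    dummies = [(group == lvl).astype(float) for lvl in levels[1:]]
    # Intercept, linear, quadratic and cubic terms of the continuous predictor.
    columns = [np.ones_like(age), age, age ** 2, age ** 3]
    columns += dummies
    # Interaction terms: continuous predictor times each dummy variable.
    columns += [age * d for d in dummies]
    return np.column_stack(columns)

X = maximum_model_matrix([20, 30, 40, 50], ["a", "b", "c", "a"], levels=["a", "b", "c"])
print(X.shape)
```

Every later selection step can only drop columns from this matrix, which is why the maximum model has to be specified first.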
Step 2: Specify a criterion for selecting a model
- What counts as important? Decide by p-value or R².
R-square
r1 < r2
β1 > β2
P-value
Step 3: Specify a strategy for selecting variables.
Selection of Significant Factors
- Use p-value to select:
- forward
- backward
- stepwise
- Modified methods: chunkwise, hierarchical
- Use R² to select:
- R-square
- Mallows' Cp (KKM, pages 391-392, 411)
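
A minimal sketch of selecting by R-square and Mallows' Cp, assuming ordinary least squares and an all-subsets search. The helper names `sse` and `all_subsets` and the simulated data are assumptions for illustration. Cp is computed here as SSE(p)/MSE(full) - [n - 2(p + 1)], where p is the number of predictors in the subset.

```python
from itertools import combinations
import numpy as np

def sse(A, y):
    """Residual sum of squares of the least-squares fit of y on the columns of A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def all_subsets(X, y):
    """Return (subset, R2, Cp) for every non-empty subset of the columns of X."""
    n, k = X.shape
    ones = np.ones((n, 1))
    sst = float(np.sum((y - y.mean()) ** 2))
    # Error mean square of the maximum model (all k predictors plus intercept).
    mse_full = sse(np.hstack([ones, X]), y) / (n - k - 1)
    rows = []
    for p in range(1, k + 1):
        for subset in combinations(range(k), p):
            sse_p = sse(np.hstack([ones, X[:, list(subset)]]), y)
            r2 = 1 - sse_p / sst
            cp = sse_p / mse_full - (n - 2 * (p + 1))
            rows.append((subset, r2, cp))
    return rows

# Simulated data (hypothetical): only the first two predictors matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = 1 + 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=40)
table = all_subsets(X, y)
for subset, r2, cp in table:
    print(subset, round(r2, 3), round(cp, 2))
```

R-square never decreases when a variable is added, so it is usually read together with Cp: a subset whose Cp is close to p + 1 is a reasonable candidate.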
Step 3: Specify a strategy for selecting variables.
Use p-value to select significant independent variables:
- forward
- backward
- stepwise
Forward selection
- Compute the p-value of a simple linear regression for each X.
- One by one, include each variable with p-value < .05 in the model.
- Entry prob = 0.05 or less
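
Forward selection can be sketched as follows, assuming ordinary least squares with two-sided t-tests. This is the common variant in which each candidate is tested after adjusting for the variables already entered; the helper names `ols_pvalues` and `forward_select` and the simulated data are assumptions for illustration.

```python
import numpy as np
from scipy import stats

def ols_pvalues(X, y):
    """Two-sided t-test p-values for each coefficient (intercept first)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    df = n - A.shape[1]
    cov = (resid @ resid / df) * np.linalg.inv(A.T @ A)
    t = beta / np.sqrt(np.diag(cov))
    return 2 * stats.t.sf(np.abs(t), df)

def forward_select(X, y, entry=0.05):
    """Add, one at a time, the candidate with the smallest p-value below `entry`."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        # p-value of each candidate when added to the current model.
        pvals = {j: ols_pvalues(X[:, selected + [j]], y)[-1] for j in remaining}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= entry:
            break  # no remaining variable meets the entry probability
        selected.append(best)
        remaining.remove(best)
    return selected

# Simulated data (hypothetical): columns 0 and 2 carry the signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = 3 * X[:, 0] + 2 * X[:, 2] + rng.normal(size=100)
chosen = forward_select(X, y)
print(chosen)
```

In the first pass the current model is empty, so each candidate's p-value is exactly the simple linear regression p-value.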
Backward selection
- Put all of the X's into the model.
- One by one, delete any X with p-value > .05.
- Removal prob = 0.05 or less
Stepwise selection
- Same as forward selection.
- Only double-check the p-values once again whenever an X enters the model.
- Entry prob = 0.05 or more
- Removal prob = 0.05 or less
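
Backward and stepwise selection can be sketched in the same way, again assuming ordinary least squares with two-sided t-tests. The helper names, the default thresholds of 0.05, and the simulated data are assumptions for illustration. The stepwise sketch simplifies the re-check by dropping every variable above the removal probability at once.

```python
import numpy as np
from scipy import stats

def ols_pvalues(X, y):
    """Two-sided t-test p-values for each coefficient (intercept first)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    df = n - A.shape[1]
    cov = (resid @ resid / df) * np.linalg.inv(A.T @ A)
    return 2 * stats.t.sf(np.abs(beta / np.sqrt(np.diag(cov))), df)

def backward_eliminate(X, y, removal=0.05):
    """Start with all X's; repeatedly delete the X with the largest p-value."""
    kept = list(range(X.shape[1]))
    while kept:
        pvals = ols_pvalues(X[:, kept], y)[1:]  # skip the intercept
        worst = int(np.argmax(pvals))
        if pvals[worst] <= removal:
            break  # every remaining X is significant
        kept.pop(worst)
    return kept

def stepwise_select(X, y, entry=0.05, removal=0.05):
    """Forward selection with a re-check of p-values after each entry."""
    k = X.shape[1]
    selected = []
    for _ in range(2 * k):  # iteration cap as a guard against cycling
        remaining = [j for j in range(k) if j not in selected]
        if not remaining:
            break
        pvals = {j: ols_pvalues(X[:, selected + [j]], y)[-1] for j in remaining}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= entry:
            break
        selected.append(best)
        # Double-check: drop any variable that is no longer significant.
        current = ols_pvalues(X[:, selected], y)[1:]
        selected = [j for j, p in zip(selected, current) if p <= removal]
    return selected

# Simulated data (hypothetical): columns 0 and 2 carry the signal.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))
y = 3 * X[:, 0] + 2 * X[:, 2] + rng.normal(size=100)
kept = backward_eliminate(X, y)
stepped = stepwise_select(X, y)
print(kept, stepped)
```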
Related factors for TMD
Use this strategy hierarchically
Odds ratio (95% CI), with p-value flag, by model

Variable              Model 1                  Model 2                  Model 3                  Model 4
age group
  ≤25 yrs             1.00                     1.00                     1.00                     1.00
  26-34               1.17 (0.64, 2.12)        1.16 (0.62, 2.17)        1.18 (0.64, 2.16)        1.18 (0.63, 2.23)
  ≥35 yrs             2.83 (1.36, 6.04) *      3.02 (1.42, 6.67) *      2.85 (1.36, 6.15) *      3.09 (1.43, 6.90) *
level
  junior              1.00                     1.00                     1.00                     1.00
  senior              1.15 (0.65, 2.03)        1.19 (0.66, 2.16)        1.27 (0.71, 2.29)        1.31 (0.71, 2.42)
stress
  No                  1.00                     1.00                     1.00                     1.00
  Yes                 4.48 (1.75, 13.25) *     4.02 (1.52, 12.24) *     4.56 (1.77, 13.56) *     3.97 (1.49, 12.23) *
CHQ
  ≤2                  -                        1.00                     -                        1.00
  ≥3                  -                        3.24 (1.90, 5.61) **     -                        3.31 (1.93, 5.78) **
smoking
  No                  -                        -                        1.00                     1.00
  Yes                 -                        -                        1.10 (0.60, 2.01)        1.22 (0.65, 2.30)
drinking
  No                  -                        -                        1.00                     1.00
  Yes                 -                        -                        1.01 (0.57, 1.80)        0.89 (0.49, 1.62)
betel quid chewing
  No                  -                        -                        1.00                     1.00
  Yes                 -                        -                        1.54 (0.77, 3.14)        1.53 (0.74, 3.21)

* p<0.05   ** p<.0001
Limitation: ORs did not dramatically change!
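
One way to put a number on that limitation is the relative change of each odds ratio against Model 1. The sketch below uses the odds ratios listed above for stress (Yes) and for age group ≥35 yrs, and assumes the four values of each correspond to Models 1 to 4 in that order. The helper `relative_change` is hypothetical.

```python
def relative_change(reference, adjusted):
    """Relative change of an adjusted odds ratio against the reference model."""
    return (adjusted - reference) / reference

# Odds ratios as listed in the table, assumed to be in the order Model 1..4.
odds_ratios = {
    "stress (Yes)": [4.48, 4.02, 4.56, 3.97],
    "age group >=35 yrs": [2.83, 3.02, 2.85, 3.09],
}

changes = {
    name: [relative_change(ors[0], value) for value in ors[1:]]
    for name, ors in odds_ratios.items()
}
for name, values in changes.items():
    print(name, [f"{c:+.1%}" for c in values])

# Largest absolute relative change across both factors and all models.
largest = max(abs(c) for values in changes.values() for c in values)
```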
Step 4: Conduct the specified analysis.
Step 5: Evaluate the reliability of the model chosen
Lee, et al., Journal of Oral Rehabilitation (2007) 34, 79-87
Regression Diagnostics (KKM 212-253)
Criteria for goodness of fit
(Rosner pages 487-491, 519-530)
$e_i = Y_i - \hat{Y}_i$, residuals, $i = 1, 2, \ldots, n$
Standardized Residuals
Studentized Residuals (STUDENT)
Jackknife Residuals (RSTUDENT)
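
The three kinds of residuals can be sketched with the usual least-squares formulas: the standardized residual is e_i / s, the studentized residual is e_i / (s·sqrt(1 - h_i)), and the jackknife residual rescales the studentized one so that observation i is left out of the variance estimate. Here h_i is the leverage and k the number of predictors. The function name and the simulated data are assumptions for illustration.

```python
import numpy as np

def residual_diagnostics(X, y):
    """Ordinary, standardized, studentized and jackknife residuals for an OLS fit."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    H = A @ np.linalg.inv(A.T @ A) @ A.T      # hat matrix
    h = np.diag(H)                            # leverage values
    e = y - H @ y                             # e_i = Y_i - Yhat_i
    s2 = e @ e / (n - k - 1)                  # residual mean square
    standardized = e / np.sqrt(s2)
    studentized = e / np.sqrt(s2 * (1 - h))
    jackknife = studentized * np.sqrt((n - k - 2) / (n - k - 1 - studentized ** 2))
    return {
        "residual": e,
        "leverage": h,
        "standardized": standardized,
        "studentized": studentized,
        "jackknife": jackknife,
    }

# Simulated data (hypothetical).
rng = np.random.default_rng(3)
X = rng.normal(size=(30, 2))
y = 1 + X[:, 0] - 2 * X[:, 1] + rng.normal(size=30)
out = residual_diagnostics(X, y)
print(np.round(out["jackknife"][:5], 3))
```

The labels STUDENT and RSTUDENT on the slide appear to be the names statistical software gives to the studentized and jackknife residuals.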
KKM Page 225
First, graphical presentation
(Rosner 520-529 or KKM 225)
Second, check normal distribution of residuals (KKM 227)
- Check the normality of the Studentized Residuals or Jackknife Residuals.
- n < 50: use the Shapiro-Wilk test
- n >= 50: use the Kolmogorov-Smirnov test
- p-value >= 0.05 indicates no significant departure from a normal distribution
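
The sample-size rule above can be sketched with SciPy. The helper `normality_check` is hypothetical. The Kolmogorov-Smirnov branch compares against a standard normal, which assumes the residuals passed in are already on a standardized scale (studentized or jackknife residuals).

```python
import numpy as np
from scipy import stats

def normality_check(residuals, alpha=0.05):
    """Pick the test by sample size and report whether normality is not rejected."""
    r = np.asarray(residuals, dtype=float)
    if len(r) < 50:
        name, result = "Shapiro-Wilk", stats.shapiro(r)
    else:
        # Assumes residuals are already standardized (mean 0, variance about 1).
        name, result = "Kolmogorov-Smirnov", stats.kstest(r, "norm")
    statistic, p_value = float(result[0]), float(result[1])
    return name, statistic, p_value, bool(p_value >= alpha)

# Stand-in for residuals: 30 evenly spaced normal quantiles (hypothetical data).
probs = (np.arange(1, 31) - 0.5) / 30
demo = stats.norm.ppf(probs)
print(normality_check(demo))
```

A non-significant test is only the absence of evidence against normality, so it is best read together with the graphical presentation from the first check.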
Third, check independence (KKM 227)
- Compute the Durbin-Watson statistic and the first-order autocorrelation of the residuals.
- An autocorrelation close to 0 (equivalently, a Durbin-Watson statistic close to 2) with p-value >= 0.05 indicates independence.
Fourth, check homogeneity (KKM 227)
- Compute the absolute value of the Studentized Residuals or Jackknife Residuals, and check its Spearman rank correlation with each X.
- Correlations close to zero and non-significant indicate homogeneity.
Fifth, check outliers (KKM 229-232)
- |Jackknife residual| or |studentized residual| > $t_{n-k-2,\ \alpha/(2n)}$
- Leverage > 2×(k+1)/n
- Cook's D > 1
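
A minimal sketch of two of these checks, assuming an ordinary least-squares fit with k predictors: the Durbin-Watson statistic for independence, and the leverage and Cook's D cut-offs for outliers. The function names and the small data set are hypothetical, and the Bonferroni t cut-off for jackknife residuals is left out to keep the example short.

```python
import numpy as np

def durbin_watson(e):
    """Sum of squared successive differences divided by the sum of squares."""
    e = np.asarray(e, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

def influence_flags(X, y):
    """Flag observations with leverage > 2(k+1)/n or Cook's D > 1."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    H = A @ np.linalg.inv(A.T @ A) @ A.T
    h = np.diag(H)                                   # leverage
    e = y - H @ y                                    # residuals
    s2 = e @ e / (n - k - 1)
    r = e / np.sqrt(s2 * (1 - h))                    # studentized residuals
    cooks_d = r ** 2 * h / ((k + 1) * (1 - h))
    return h > 2 * (k + 1) / n, cooks_d > 1, e

# Hypothetical data: nine points on a line plus one far-away, off-line point.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30], dtype=float)
y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 0], dtype=float)
high_leverage, high_cooks, resid = influence_flags(x.reshape(-1, 1), y)
dw = durbin_watson(resid)
print(high_leverage, high_cooks, round(dw, 2))
```

The Durbin-Watson statistic is roughly 2(1 - r), where r is the first-order autocorrelation, which is why an autocorrelation near 0 corresponds to a statistic near 2.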
Violation of assumptions
Assumption      Estimates affected
collinearity    β's, CIs, p-values
independence    CIs, p-values
linearity       β's, CIs, p-values
homogeneity     CIs, p-values
normality       CIs, p-values
outlier         β's, CIs, p-values
Transformation
KKM, pages 251-252
Rosner, pages 489-490
Any questions?
Sources of quoted figures and text:
Rosner: Fundamentals of Biostatistics, 6th ed. Wadsworth Publishing Company.
KKM: Kleinbaum, Kupper, Muller, Nizam: Applied Regression Analysis and Multivariable Methods. Duxbury, CA, USA