Solutions to mid-term examination

Introductory econometrics, spring 2014
Name:
Uni ID:
You have 90 minutes (8:35 AM - 10:05 AM) to complete the examination paper.
No one may write anything on the paper after 10:05 AM. Otherwise, 30
points will be subtracted from your mid-term exam score.
1. Multiple choice problems. Exactly one of the four choices in each problem
is correct. Please write the answer in the parentheses. (21 points, 3 for
each)
(1) The CLM includes six assumptions: MLR.1. Linearity; MLR.2. Random sampling; MLR.3. Full rank; MLR.4. Zero conditional mean;
MLR.5. Homoskedasticity; MLR.6. Normality. Which of these assumptions must be
satisfied to ensure that the OLS estimator is unbiased? (C)
A. MLR.1-MLR.6
B. MLR.1-MLR.5
C. MLR.1-MLR.4
D. MLR.1-MLR.3
(2) H0 stands for the null hypothesis of a statistical test. The significance
level of a hypothesis test is best defined as: (B)
A. The probability of retaining H0 when H0 is true.
B. The probability of rejecting H0 when H0 is true.
C. The probability of retaining H0 when H0 is false.
D. The probability of rejecting H0 when H0 is false.
(3) Among the following statements, which is wrong? (B)
A. The sample average of the residuals is zero, and so $\bar{y} = \bar{\hat{y}}$.
B. The sample covariance between the OLS fitted values and the OLS
residuals is zero, while the sample covariance between each independent variable and the OLS residuals can be non-zero.
C. The point $(\bar{x}_1, \bar{x}_2, \dots, \bar{x}_k, \bar{y})$ is always on the OLS regression line: $\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}_1 + \dots + \hat{\beta}_k \bar{x}_k$.
D. $R^2$ always increases when a new independent variable is added to a
regression.
(4) $\hat{\beta}_1 = 0.7$ and $\beta_1 = 0.9$, $\hat{\beta}_2 = -0.5$ and $\beta_2 = -0.6$, where $\hat{\beta}_j$ is the OLS
estimate and $\beta_j$ is the true parameter. Among the following statements
on estimation bias, which is wrong? (B)
A. $\hat{\beta}_1$ has a downward bias.
B. $\hat{\beta}_2$ has a downward bias.
C. $\hat{\beta}_2$ has an upward bias.
D. $\hat{\beta}_1$ and $\hat{\beta}_2$ both have a bias towards zero.
(5) Which of the following factors makes $Var(\hat{\beta}_1)$ smaller (with certainty)? (A)
A. A larger value of N, the sample size.
B. Smaller values of $x_i^2 = (X_i - \bar{X})^2$, $i = 1, \dots, N$.
C. A larger value of $\sigma^2$, the error variance.
D. A larger value of k, the number of regressors.
(6) Suppose the dependent variable is multiplied by 10 and the regression is run
again. What will happen? (A)
A. The intercept becomes 10 times as large.
B. The coefficients of the regressors do not change.
C. The t statistics increase.
D. The $R^2$ increases.
(7) Given the estimation results $\widehat{\log(price)} = 9.23 - .718 \log(nox) + .306\, rooms$,
which of the following interpretations is right? (B)
A. When there is one additional room, the housing price increases by .306%.
B. When there is one additional room, the housing price increases by 30.6%.
C. When nox increases by 1%, the housing price falls by 7.18%.
D. When nox increases by 1%, the housing price falls by 71.8%.
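The reasoning behind question (6) can be checked numerically. The following is a minimal sketch, assuming a simple regression with one regressor and a small made-up data set (the numbers and the helper `simple_ols` are illustrative only and do not come from the exam). It fits the regression once with y and once with 10y and compares the results.

```python
import math

def simple_ols(x, y):
    """Return (intercept, slope, slope t statistic, R^2) for y = b0 + b1*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sigma2_hat = ssr / (n - 2)                  # error variance estimate
    t_slope = b1 / math.sqrt(sigma2_hat / sxx)  # t statistic for H0: slope = 0
    r2 = 1 - ssr / sst
    return b0, b1, t_slope, r2

# Made-up illustrative data (an assumption, not exam data).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8]

original = simple_ols(x, y)
rescaled = simple_ols(x, [10 * yi for yi in y])
print("original:", original)
print("rescaled:", rescaled)
```

In this sketch the intercept and the slope are both multiplied by 10, while the t statistic and the $R^2$ are unchanged, which is why only choice A holds.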
2. Consider the saving function
$sav = \beta_0 + \beta_1 inc + u, \quad u = \sqrt{inc} \cdot e,$
where e is a random variable with $E(e) = 0$ and $Var(e) = \sigma_e^2$. Assume that e
is independent of inc. (21 points in total, 7 for each part)
(1) Show that E(u|inc) = 0, so that the key zero conditional mean assumption is satisfied. (7 points)
Solution: When we condition on inc in computing an expectation, $\sqrt{inc}$ becomes
a constant. So $E(u|inc) = E(\sqrt{inc} \cdot e|inc) = \sqrt{inc} \cdot E(e|inc) = \sqrt{inc} \cdot 0 = 0$, because
e is independent of inc and hence $E(e|inc) = E(e) = 0$.
(2) Show that $Var(u|inc) = \sigma_e^2 inc$, so that the homoskedasticity assumption
is violated. In particular, the variance of sav increases with inc. (7
points)
Solution: Again, when we condition on inc in computing a variance, $\sqrt{inc}$ becomes
a constant. So $Var(u|inc) = Var(\sqrt{inc} \cdot e|inc) = (\sqrt{inc})^2 Var(e|inc) = \sigma_e^2 inc$,
because $Var(e|inc) = \sigma_e^2$.
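The two results above can be illustrated with a small simulation. This is only a sketch under added assumptions: e is drawn from a normal distribution and its standard deviation is set to 2, neither of which is specified in the problem (which only fixes the mean and variance of e). The helper `simulate_u` and the constant `SIGMA_E` are hypothetical names.

```python
import random
import statistics

random.seed(0)
SIGMA_E = 2.0  # assumed standard deviation of e, chosen only for illustration

def simulate_u(inc, draws=20000):
    """Draw u = sqrt(inc) * e at a fixed income level, with e ~ N(0, SIGMA_E^2)."""
    return [inc ** 0.5 * random.gauss(0.0, SIGMA_E) for _ in range(draws)]

for inc in (1.0, 4.0, 9.0):
    u = simulate_u(inc)
    # Columns: income, sample mean of u, sample variance of u, sigma_e^2 * inc
    print(inc, statistics.fmean(u), statistics.pvariance(u), SIGMA_E ** 2 * inc)
```

At each income level the sample mean of u should be close to zero and the sample variance close to $\sigma_e^2 inc$, so the spread of u grows with inc.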
(3) Provide a discussion that supports the assumption that the variance of
savings increases with family income. (7 points)
Solution: Families with low incomes do not have much discretion about spending;
typically, a low-income family must spend on food, clothing, housing, and other
necessities. Higher income people have more discretion, and some might choose
more consumption while others more saving. This discretion suggests wider variability in saving among higher income families.
3. A researcher is using data from a sample of 274 male employees to investigate the relationship between hourly wage rates $Y_i$ (measured in dollars per
hour) and firm tenure $X_i$ (measured in years). Preliminary analysis of the
sample data produces the following sample information:
N = 274
$\sum_{i=1}^{N} Y_i = 1945.26$, $\sum_{i=1}^{N} X_i = 1774.00$
$\sum_{i=1}^{N} Y_i^2 = 18536.73$, $\sum_{i=1}^{N} X_i^2 = 30608.00$
$\sum_{i=1}^{N} X_i Y_i = 16040.72$
$\sum_{i=1}^{N} y_i^2 = 4726.377$, $\sum_{i=1}^{N} x_i y_i = 3446.226$
$\sum_{i=1}^{N} x_i^2 = 19122.32$, $\sum_{i=1}^{N} \hat{u}_i^2 = 4105.297$
where $x_i \equiv X_i - \bar{X}$, $y_i \equiv Y_i - \bar{Y}$, and $\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i$ for $i = 1, \dots, N$. Use
the above sample information to answer all the following questions. Show
explicitly all formulas and calculations. (58 points in total)
(1) Use the above information to compute OLS estimates of the intercept
coefficient $\beta_0$ and the slope coefficient $\beta_1$. (10 points)
Solution:
$\hat{\beta}_1 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}.$
Thus, given the above sample information, we have
$\hat{\beta}_1 = \frac{3446.226}{19122.32} = 0.1802201,$
and
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = \frac{1}{274}(1945.26 - 0.1802201 \times 1774.00) = 5.9326626.$
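The arithmetic in part (1) can be reproduced with a short script. This is a minimal sketch that uses only the sample sums given in the question; the variable names are illustrative. It also checks that the deviation-form cross product agrees with the raw sums through the identity $\sum x_i y_i = \sum X_i Y_i - (\sum X_i)(\sum Y_i)/N$.

```python
N = 274
sum_X, sum_Y = 1774.00, 1945.26
sum_XY = 16040.72   # raw cross product, sum of X_i * Y_i
sum_xy = 3446.226   # sum of (X_i - X_bar) * (Y_i - Y_bar)
sum_xx = 19122.32   # sum of (X_i - X_bar)^2

# Cross-check the deviation-form cross product against the raw sums.
sum_xy_check = sum_XY - sum_X * sum_Y / N

slope = sum_xy / sum_xx                      # OLS slope estimate
intercept = sum_Y / N - slope * sum_X / N    # OLS intercept: Y_bar - slope * X_bar
print(slope, intercept, sum_xy_check)
```

Because the script carries the slope at full precision instead of the rounded value 0.1802201, the intercept may differ from the hand calculation in the last reported digit.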
(2) Interpret the slope coefficient estimate you calculated in part (1), i.e., explain
what the numeric value you calculated for $\hat{\beta}_1$ means. (7 points)
Solution: The estimate 0.1802201 of $\beta_1$ means that an increase (decrease) in firm
tenure $X_i$ of 1 year is associated on average with an increase (decrease) in a male
employee's hourly wage rate of about 0.18 dollars per hour, or 18 cents per hour.
(3) Compute the value of $R^2$, the coefficient of determination for the estimated OLS sample regression equation. Briefly explain what the
calculated value of $R^2$ means. (7 points)
Solution: Given the formula
$R^2 = 1 - SSR/SST,$
where $SSR = \sum_{i=1}^{N} \hat{u}_i^2$ and $SST = \sum_{i=1}^{N} (Y_i - \bar{Y})^2$, we have $R^2 =
1 - 4105.297/4726.377 = .13140721$. This means that about 13% of the total
sample variation in $Y_i$ (employees' hourly wage rates) can be explained by the
sample regression function or the regressor $X_i$ (firm tenure).
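As a quick check on part (3), the sketch below computes $R^2$ from SSR and SST and compares it with the equivalent simple-regression expression $(\sum x_i y_i)^2 / (\sum x_i^2 \sum y_i^2)$, which is the squared sample correlation between X and Y. Only the sample sums from the question are used; the variable names are illustrative.

```python
ssr = 4105.297     # sum of squared OLS residuals
sst = 4726.377     # total sum of squares, sum of (Y_i - Y_bar)^2
sum_xy = 3446.226  # sum of (X_i - X_bar) * (Y_i - Y_bar)
sum_xx = 19122.32  # sum of (X_i - X_bar)^2

r2 = 1 - ssr / sst
# In a simple regression, R^2 equals the squared sample correlation of X and Y.
r2_alt = sum_xy ** 2 / (sum_xx * sst)
print(r2, r2_alt)
```

The two numbers agree up to the rounding in the reported sums, which supports the internal consistency of the sample information.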
(4) Calculate the estimate for $\sigma^2$, the error variance. (7 points)
Solution:
$\hat{\sigma}^2 = \frac{\sum_{i=1}^{N} \hat{u}_i^2}{N - 2} = \frac{4105.297}{272} = 15.0930.$
(5) Calculate the estimated variance of $\hat{\beta}_1$. (7 points)
Solution:
$\widehat{Var}(\hat{\beta}_1) = \hat{\sigma}^2 / SST_x.$
We have $\hat{\sigma}^2 = 15.0930$ and $SST_x = \sum_{i=1}^{N} x_i^2 = 19122.32$. Thus, $\widehat{Var}(\hat{\beta}_1) =
15.0930/19122.32 = .00078929$.
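Parts (4) and (5) chain together, so they can be sketched in one short script. The code uses only the sample sums from the question; the variable names are illustrative, and the standard error in the last step is an extra quantity that the later parts rely on.

```python
import math

N = 274
ssr = 4105.297      # sum of squared OLS residuals
sum_xx = 19122.32   # SST_x, sum of (X_i - X_bar)^2

sigma2_hat = ssr / (N - 2)        # N - 2 because intercept and slope are estimated
var_slope = sigma2_hat / sum_xx   # estimated variance of the slope estimator
se_slope = math.sqrt(var_slope)   # standard error of the slope estimator
print(sigma2_hat, var_slope, se_slope)
```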
(6) Perform a test of the null hypothesis $H_0: \beta_1 = 0$ against the alternative
hypothesis $H_1: \beta_1 \neq 0$ at the 5% significance level (i.e., for significance
level $\alpha = 0.05$). State the decision rule you use, and the inference you
would draw from the test. Would you draw the same inference if you
performed the test at the 1% significance level (i.e., for significance
level $\alpha = 0.01$)? (10 points)
Solution: Under $H_0$, the t statistic $t = \hat{\beta}_1 / se(\hat{\beta}_1)$ follows a Student's t distribution with
$N - 2 = 272$ degrees of freedom. The decision rule is to reject $H_0$ if $|t|$ exceeds the two-tailed critical value.
Here, the t statistic is equal to $0.1802201/\sqrt{.00078929} = 6.4148267$, which is
much larger than the critical value for the 5% significance level (the 0.975 quantile, approximately 1.96),
thus we can reject the null hypothesis that tenure has no relation with employees'
hourly wage rate. The t statistic is also much larger than the critical value for the 1%
significance level (2.576), thus we can still reject the null hypothesis at this level.
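The decision rule can be written out as a short script. This is a sketch that takes the slope estimate from part (1), the estimated variance from part (5), and the large-df two-tailed critical values from Table G.2 as given; the dictionary `critical` is an illustrative name.

```python
import math

slope = 0.1802201        # OLS slope estimate from part (1)
var_slope = 0.00078929   # estimated variance of the slope from part (5)
t_stat = slope / math.sqrt(var_slope)

# Two-tailed critical values for large df, read from Table G.2.
critical = {0.05: 1.960, 0.01: 2.576}
for alpha, c in critical.items():
    # Decision rule: reject H0 when |t| exceeds the critical value.
    decision = "reject H0" if abs(t_stat) > c else "fail to reject H0"
    print(f"alpha={alpha}: |t|={abs(t_stat):.4f}, critical={c}, {decision}")
```

With 272 degrees of freedom the exact t critical values are slightly larger than the large-df values, but the t statistic is far above both, so the inference does not depend on that approximation.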
(7) Compute the two-sided 95% confidence interval for the slope coefficient
$\beta_1$. (10 points)
Solution: The 0.975 quantile of the $t_{272}$ distribution is approximately 1.96 (the large-df value in Table G.2). Therefore the upper and lower bounds of the 95% confidence interval are $0.1802201 + 1.96 \times
\sqrt{.00078929} = .23528494$ and $0.1802201 - 1.96 \times \sqrt{.00078929} = .12515526$, respectively. Namely, the 95% CI is [.12515526, .23528494].
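The interval bounds can be reproduced as follows. This is a minimal sketch with a hypothetical helper `confidence_interval`; the inputs are the slope estimate, its estimated variance, and the critical value 1.96 used in the solution.

```python
import math

def confidence_interval(estimate, variance, critical_value):
    """Return (lower, upper) bounds: estimate -/+ critical_value * standard error."""
    half_width = critical_value * math.sqrt(variance)
    return estimate - half_width, estimate + half_width

lower, upper = confidence_interval(0.1802201, 0.00078929, 1.96)
print(lower, upper)
```

The interval lies entirely above zero, which is consistent with rejecting $H_0: \beta_1 = 0$ at the 5% level in part (6).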
Appendix G
Statistical Tables
TABLE G.2
Critical Values of the t Distribution

                         Significance Level
1-Tailed:         .10      .05     .025      .01     .005
2-Tailed:         .20      .10     .050      .02     .010
Degrees of Freedom
    1           3.078    6.314   12.706   31.821   63.657
    2           1.886    2.920    4.303    6.965    9.925
    3           1.638    2.353    3.182    4.541    5.841
    4           1.533    2.132    2.776    3.747    4.604
    5           1.476    2.015    2.571    3.365    4.032
    6           1.440    1.943    2.447    3.143    3.707
    7           1.415    1.895    2.365    2.998    3.499
    8           1.397    1.860    2.306    2.896    3.355
    9           1.383    1.833    2.262    2.821    3.250
   10           1.372    1.812    2.228    2.764    3.169
   11           1.363    1.796    2.201    2.718    3.106
   12           1.356    1.782    2.179    2.681    3.055
   13           1.350    1.771    2.160    2.650    3.012
   14           1.345    1.761    2.145    2.624    2.977
   15           1.341    1.753    2.131    2.602    2.947
   16           1.337    1.746    2.120    2.583    2.921
   17           1.333    1.740    2.110    2.567    2.898
   18           1.330    1.734    2.101    2.552    2.878
   19           1.328    1.729    2.093    2.539    2.861
   20           1.325    1.725    2.086    2.528    2.845
   21           1.323    1.721    2.080    2.518    2.831
   22           1.321    1.717    2.074    2.508    2.819
   23           1.319    1.714    2.069    2.500    2.807
   24           1.318    1.711    2.064    2.492    2.797
   25           1.316    1.708    2.060    2.485    2.787
   26           1.315    1.706    2.056    2.479    2.779
   27           1.314    1.703    2.052    2.473    2.771
   28           1.313    1.701    2.048    2.467    2.763
   29           1.311    1.699    2.045    2.462    2.756
   30           1.310    1.697    2.042    2.457    2.750
   40           1.303    1.684    2.021    2.423    2.704
   60           1.296    1.671    2.000    2.390    2.660
   90           1.291    1.662    1.987    2.368    2.632
  120           1.289    1.658    1.980    2.358    2.617
    ∞           1.282    1.645    1.960    2.326    2.576
Examples: The 1% critical value for a one-tailed test with 25 df is 2.485. The 5% critical value for a two-tailed test
with large (> 120) df is 1.96.
Source: This table was generated using the Stata® function invt.