Econ. 771: Problem Set I

David Guilkey
Spring 2014
Question 1
Given the following model:
Y = Xβ + ε,
where the Gauss-Markov assumptions are satisfied, Y is N x 1, X is N x K, β is K x 1,
ε is N x 1, and Xi,1 = 1 for all i.
a. Derive the OLS estimator for β in this model and derive its covariance matrix.
b. Define ε̂ = Y − Xβ̂ and show that E(ε̂) = 0.
c. Derive the N x N covariance matrix of ε̂.
d. We run an OLS regression on the following model: ε̂ = Xδ + µ. Show that the OLS
estimator for δ is equal to zero and that the R² for this model is also zero.
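The claim in part d can be checked numerically before attempting the proof. The sketch below is only an illustration on simulated data, not a derivation: the sample size, the coefficient values, and the random seed are all made-up assumptions, and NumPy's least-squares routine stands in for the OLS formula.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
N, K = 200, 3                   # hypothetical dimensions

# Simulated design matrix whose first column is a constant (Xi,1 = 1).
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
beta = np.array([1.0, 2.0, -0.5])  # made-up true coefficients
Y = X @ beta + rng.normal(size=N)

# OLS fit of Y on X and the resulting residuals.
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta_hat

# Auxiliary regression of the residuals on the same X.
delta_hat, *_ = np.linalg.lstsq(X, resid, rcond=None)
fitted = X @ delta_hat
sst = np.sum((resid - resid.mean()) ** 2)
r_squared = 1.0 - np.sum((resid - fitted) ** 2) / sst

print("delta_hat:", delta_hat)
print("R^2:", r_squared)
```

Both printed quantities should be zero up to floating-point rounding, which is what the algebra in part d is meant to establish exactly.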
Question 2
Given the following model:
yi = βxi + εi,
where all terms are scalars and the Gauss-Markov assumptions are satisfied, except that
εi = zi µi, where the zi's are known and non-stochastic and the µi's are iid N(0, σµ²).
a. Show that the OLS estimator of β is unbiased.
b. Derive the N x N covariance matrix of the εi's.
c. Derive the variance of the OLS estimator for β.
d. Does the OLS estimator for β follow a normal distribution? Explain.
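A small Monte Carlo experiment can serve as a sanity check on the answers to parts a and c. The sketch below is an illustration under assumed values: the sample size, the number of replications, β, σµ, and the ranges for x and z are all hypothetical choices. It holds x and z fixed across replications, as the question specifies, and draws only the µi's anew.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
N, reps = 50, 20000              # hypothetical sample size and replications
beta, sigma_mu = 2.0, 1.5        # made-up true values

# Fixed regressor and known, non-stochastic scale factors.
x = rng.uniform(1.0, 3.0, size=N)
z = rng.uniform(0.5, 2.0, size=N)

# Each row is one replication of the error vector, eps_i = z_i * mu_i.
mu = rng.normal(0.0, sigma_mu, size=(reps, N))
eps = z * mu
y = beta * x + eps

# OLS estimator for the no-intercept model, one value per replication.
beta_hat = (y @ x) / (x @ x)

print("mean of beta_hat:", beta_hat.mean())
print("empirical variance of beta_hat:", beta_hat.var(ddof=1))
```

The mean should sit close to the assumed β, and the empirical variance can be compared with whatever expression is derived in part c, evaluated at the same x, z, and σµ.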
Question 3
Given the following model:
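$$Y_i = \beta_1 + X_i \beta_2 + \varepsilon_i,$$
define:
$$S_{xx} = \sum_{i=1}^{N} (X_i - \bar{X})^2, \quad SST = S_{yy} = \sum_{i=1}^{N} (Y_i - \bar{Y})^2, \quad \text{and} \quad S_{xy} = \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})$$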
a. Show that the OLS estimators can be written as:
$$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} \quad \text{and} \quad \hat{\beta}_2 = \frac{S_{xy}}{S_{xx}}$$
b. Show that
$$SSE = SST - \hat{\beta}_2 S_{xy}$$
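The two identities in this question can be checked numerically on any data set. The sketch below uses simulated data (the sample size, the coefficients, and the seed are made-up assumptions) and compares the sums-of-squares formulas with a generic least-squares fit. It confirms the identities for one sample and is no substitute for the algebraic proof.

```python
import numpy as np

rng = np.random.default_rng(2)   # arbitrary seed
N = 100                          # hypothetical sample size
X = rng.normal(5.0, 2.0, size=N)
Y = 1.0 + 0.8 * X + rng.normal(size=N)  # made-up intercept and slope

# Sums of squares and cross products as defined in the question.
Sxx = np.sum((X - X.mean()) ** 2)
Syy = np.sum((Y - Y.mean()) ** 2)        # this is SST
Sxy = np.sum((X - X.mean()) * (Y - Y.mean()))

# Estimators written in terms of Sxy and Sxx (part a).
b2 = Sxy / Sxx
b1 = Y.mean() - b2 * X.mean()

# Reference fit from a generic least-squares solver.
A = np.column_stack([np.ones(N), X])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
sse = np.sum((Y - A @ coef) ** 2)

print("part a holds:", np.allclose(coef, [b1, b2]))
print("part b holds:", np.isclose(sse, Syy - b2 * Sxy))
```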
STATA Question 1
Consider the simple linear regression model:
Y = β1 + β2 X + ε
The data file “HW1.dta” contains four different dependent variables, Y1, Y2, Y3, Y4, and four
different independent variables (or covariates), X1, X2, X3, X4. Please answer the following
questions about the simple linear model using the data.
1) Which linear model fits best for Y1? Which linear model fits best for Y2, Y3, and Y4?
Try multiple models for each dependent variable. Do not report the regression output.
Please just report the final models and the criteria you are using to determine the model
that fits the data best. Note that each variable X1, X2, X3, X4 has a different total
number of observations, but STATA automatically “deals” with this issue when running
the necessary regressions. For instance, X2 has 500 observations but the command “reg
Y1 X1 X2” automatically drops the last 300 observations from X2.
2) Run and report the following regressions:
Y1 on X1 X2 X3 X4
Y1 on X1 X2
Y1 on X1
Comment on which model fits the data best.
3) Run the following regressions (do not report the regression results):
Y1 on X1
Y2 on X2
Y3 on X3
Y4 on X4
What is happening to the estimated coefficients for β2? What asymptotic property
might this be illustrating?
4) Given the various regression models you have analyzed, is there a common data generating process? That is, what is the correct linear model generating the data?
5) Run the same regressions as in question three using the nocons option. Comment on
the results.
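The fit criteria that these questions ask about can be computed by hand, which helps when deciding what to report. The sketch below is a minimal illustration on simulated data, not on HW1.dta: the helper `ols_fit`, the variable names, the coefficients, and the seed are all hypothetical. It returns R² and adjusted R² and allows the constant to be suppressed, in the spirit of the nocons option.

```python
import numpy as np

def ols_fit(y, regressors, constant=True):
    """Return (coefficients, R^2, adjusted R^2) for an OLS fit (hypothetical helper)."""
    n = len(y)
    cols = [np.ones(n)] if constant else []
    cols += [np.asarray(r, dtype=float) for r in regressors]
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sse = np.sum((y - A @ coef) ** 2)
    # Centered total sum of squares. This sketch uses it even when
    # constant=False, which may differ from what a given package reports.
    sst = np.sum((y - y.mean()) ** 2)
    k = A.shape[1]
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (sse / (n - k)) / (sst / (n - 1))
    return coef, r2, adj_r2

rng = np.random.default_rng(3)   # arbitrary seed
n = 300                          # hypothetical sample size
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)          # irrelevant regressor by construction
y = 2.0 + 3.0 * x1 + rng.normal(size=n)

coef_small, r2_small, adj_small = ols_fit(y, [x1])
coef_big, r2_big, adj_big = ols_fit(y, [x1, x2])
coef_nc, r2_nc, adj_nc = ols_fit(y, [x1], constant=False)

print("y on x1:           ", r2_small, adj_small)
print("y on x1 x2:        ", r2_big, adj_big)
print("y on x1, no const: ", r2_nc, adj_nc)
```

With a constant included, R² cannot fall when a regressor is added, so adjusted R² (or another penalized criterion) is the more informative comparison across nested models. The no-constant fit forces the line through the origin, which matters here because the simulated intercept is not zero.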
STATA Question 2
This question uses the “National Data Set” (available on the website as national old.dta).
a. Generate new variables that represent the annual percentage change in employment
(‘eea’), CPI, and the S&P 500. Generate a variable that is the difference between the
10-year bond yield and the 90-day rate.
b. Regress the difference between the 10-year bond yield and the 90-day rate on the other
variables created above and a constant. Discuss the result of the F-test.
c. Regress the 10-year bond yield on the same variables. Discuss the result of the F-test.
d. Test the following null hypotheses and interpret the results:
1) The coefficient on percentage change in employment is equal to 1/4 of the coefficient
on percentage change in CPI.
2) The coefficient on the percentage change in CPI is equal to 75.
3) The coefficients on the percentage change in S&P 500 and employment are jointly
zero.
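Each of the three hypotheses in part d is a set of linear restrictions of the form Rβ = r, so all three can be tested with the same F statistic. The sketch below shows how R and r might be set up for each case. It runs on simulated data rather than the National Data Set, and the function `wald_f`, the column order (constant, employment, CPI, S&P 500), the coefficient values, and the seed are all assumptions made for illustration. It also assumes homoskedastic errors.

```python
import numpy as np

def wald_f(X, y, R, r):
    """F statistic for H0: R @ beta = r, assuming homoskedastic errors.

    Returns (F, q, n - k), where q is the number of restrictions.
    """
    n, k = X.shape
    R = np.atleast_2d(R).astype(float)
    r = np.atleast_1d(r).astype(float)
    q = R.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    s2 = resid @ resid / (n - k)          # unbiased error variance estimate
    diff = R @ b - r
    F = diff @ np.linalg.solve(R @ XtX_inv @ R.T, diff) / (q * s2)
    return F, q, n - k

rng = np.random.default_rng(4)            # arbitrary seed
n = 120                                   # hypothetical sample size
# Hypothetical column order: constant, d_emp, d_cpi, d_sp.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 0.5, 2.0, 0.0]) + rng.normal(size=n)

# 1) Coefficient on d_emp equals 1/4 of the coefficient on d_cpi.
F1, q1, df1 = wald_f(X, y, [0, 1, -0.25, 0], 0.0)
# 2) Coefficient on d_cpi equals 75.
F2, q2, df2 = wald_f(X, y, [0, 0, 1, 0], 75.0)
# 3) Coefficients on d_sp and d_emp are jointly zero.
F3, q3, df3 = wald_f(X, y, [[0, 0, 0, 1], [0, 1, 0, 0]], [0.0, 0.0])

for name, F, q, df in [("1", F1, q1, df1), ("2", F2, q2, df2), ("3", F3, q3, df3)]:
    print(f"hypothesis {name}: F({q}, {df}) = {F:.3f}")
```

Each statistic is compared with the F(q, n − k) distribution. For a single restriction, F equals the square of the corresponding t statistic.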