Math 70
Homework 5
Michael Downs
1. Let $\mathbf{y}_{n\times 1}$ be the column vector containing each $y_i \sim N(\mu, \sigma^2)$ and suppose that $\mu = 0$. Then $\bar{y} = \frac{1}{n}\mathbf{1}'\mathbf{y}$ and
\[
\sum_{i=1}^{n}(y_i - \bar{y})^2 = \|\mathbf{y} - \mathbf{1}\bar{y}\|^2 = \|\mathbf{y} - \tfrac{1}{n}\mathbf{1}\mathbf{1}'\mathbf{y}\|^2 = \|(I_{n\times n} - \tfrac{1}{n}\mathbf{1}\mathbf{1}')\mathbf{y}\|^2.
\]
We have proven in class that $(I - \frac{1}{n}\mathbf{1}\mathbf{1}')$ is symmetric, idempotent, and has trace $n-1$. Thus
\[
\|(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}')\mathbf{y}\|^2 = \mathbf{y}'(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}')'(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}')\mathbf{y} = \mathbf{y}'(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}')\mathbf{y}.
\]
So:
\[
\frac{\sqrt{n}\,\bar{y}}{\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2}}
= \frac{\frac{1}{\sqrt{n}}\mathbf{1}'\mathbf{y}}{\sqrt{\frac{\mathbf{y}'(I - \frac{1}{n}\mathbf{1}\mathbf{1}')\mathbf{y}}{n-1}}}
= \frac{\frac{1}{\sigma}\frac{1}{\sqrt{n}}\mathbf{1}'\mathbf{y}}{\sqrt{\frac{1}{\sigma^2}\frac{\mathbf{y}'(I - \frac{1}{n}\mathbf{1}\mathbf{1}')\mathbf{y}}{n-1}}}
= \frac{\frac{1}{\sqrt{n}}\mathbf{1}'\left(\frac{\mathbf{y}}{\sigma}\right)}{\sqrt{\frac{\left(\frac{\mathbf{y}}{\sigma}\right)'(I - \frac{1}{n}\mathbf{1}\mathbf{1}')\left(\frac{\mathbf{y}}{\sigma}\right)}{n-1}}}.
\]
Now, $\frac{\mathbf{y}}{\sigma} \sim N(0, I)$, $\frac{1}{\sqrt{n}}\|\mathbf{1}\| = 1$, $(I - \frac{1}{n}\mathbf{1}\mathbf{1}')$ is a symmetric idempotent matrix with trace $n-1$, and $\frac{1}{\sqrt{n}}(I - \frac{1}{n}\mathbf{1}\mathbf{1}')\mathbf{1} = \frac{1}{\sqrt{n}}(\mathbf{1} - \frac{1}{n}\mathbf{1}\mathbf{1}'\mathbf{1}) = \frac{1}{\sqrt{n}}(\mathbf{1} - \frac{1}{n}\mathbf{1}n) = \mathbf{0}$, so by the big theorem the t-statistic is $\sim t(n-1)$.
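As a sanity check (not part of the derivation), here is a small R simulation sketch: it draws repeated samples with $\mu = 0$, forms the statistic above, and compares its empirical quantiles with those of $t(n-1)$. The sample size, $\sigma$, and seed are arbitrary choices.

# Simulation sketch: empirical quantiles of sqrt(n)*ybar/s vs. t(n-1).
# n, sigma, and the replication count are arbitrary choices.
set.seed(1)
n = 10
tstat = replicate(100000, {
  y = rnorm(n, mean = 0, sd = 2)  # sigma cancels in the statistic
  sqrt(n) * mean(y) / sd(y)       # sd(y) uses the n-1 denominator
})
print(quantile(tstat, c(.025, .5, .975)))  # empirical quantiles
print(qt(c(.025, .5, .975), df = n - 1))   # theoretical t(n-1) quantiles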
2. Let $\mathbf{y}_{n\times 1}$ be the usual column vector of observations of the dependent variable in a linear model, $X_{n\times p}$ be the usual matrix with columns containing observations of independent variables, and $\hat{\beta}_{p\times 1}$ the OLS estimator of $\beta$. Then $\mathbf{y} = X\beta + \epsilon_{n\times 1}$ with $\epsilon \sim N(0, \sigma^2 I)$ and $\hat{\beta} = (X'X)^{-1}X'\mathbf{y} = \beta + (X'X)^{-1}X'\epsilon$. Let $\hat{\sigma}^2 = \frac{1}{n-p}\|\mathbf{y} - X\hat{\beta}\|^2$. Proposition:
\[
\frac{(n-p)\hat{\sigma}^2}{\sigma^2} \sim \chi^2(n-p)
\]
\begin{align*}
\frac{(n-p)\hat{\sigma}^2}{\sigma^2} &= \frac{1}{\sigma^2}\|\mathbf{y} - X\hat{\beta}\|^2 \\
&= \frac{1}{\sigma^2}\|X\beta + \epsilon - X(\beta + (X'X)^{-1}X'\epsilon)\|^2 \\
&= \frac{1}{\sigma^2}\|X\beta + \epsilon - X\beta - X(X'X)^{-1}X'\epsilon\|^2 \\
&= \frac{1}{\sigma^2}\|\epsilon - X(X'X)^{-1}X'\epsilon\|^2 \\
&= \frac{1}{\sigma^2}\|(I_{n\times n} - X(X'X)^{-1}X')\epsilon\|^2 \\
&= \frac{1}{\sigma^2}\epsilon'(I - X(X'X)^{-1}X')'(I - X(X'X)^{-1}X')\epsilon
\end{align*}
Here it is necessary to point out that $(I - X(X'X)^{-1}X')$ is symmetric:
\begin{align*}
(I - X(X'X)^{-1}X')' &= I' - (X(X'X)^{-1}X')' \\
&= I - (X')'(X(X'X)^{-1})' \\
&= I - X((X'X)^{-1})'X' \\
&= I - X((X'X)')^{-1}X' \\
&= I - X(X'X)^{-1}X'
\end{align*}
and idempotent:
\begin{align*}
(I - X(X'X)^{-1}X')(I - X(X'X)^{-1}X') &= (I - X(X'X)^{-1}X') - (X(X'X)^{-1}X')(I - X(X'X)^{-1}X') \\
&= I - X(X'X)^{-1}X' - X(X'X)^{-1}X' + X(X'X)^{-1}X'X(X'X)^{-1}X' \\
&= I - X(X'X)^{-1}X' - X(X'X)^{-1}X' + X(X'X)^{-1}X' \\
&= I - X(X'X)^{-1}X'
\end{align*}
and has trace $n - p$:
\begin{align*}
\operatorname{tr}(I - X(X'X)^{-1}X') &= \operatorname{tr}(I) - \operatorname{tr}(X(X'X)^{-1}X') \\
&= \operatorname{tr}(I) - \operatorname{tr}(X'X(X'X)^{-1}) \\
&= \operatorname{tr}(I_{n\times n}) - \operatorname{tr}(I_{p\times p}) \\
&= n - p
\end{align*}
Returning to the original chain of equalities:
\begin{align*}
\frac{1}{\sigma^2}\epsilon'(I - X(X'X)^{-1}X')'(I - X(X'X)^{-1}X')\epsilon &= \frac{1}{\sigma^2}\epsilon'(I - X(X'X)^{-1}X')\epsilon \\
&= \frac{\epsilon'}{\sigma}(I - X(X'X)^{-1}X')\frac{\epsilon}{\sigma}
\end{align*}
Now, $\frac{\epsilon}{\sigma} \sim N(0, I)$ and $(I - X(X'X)^{-1}X')$ is a symmetric idempotent matrix with trace $n - p$. By the big theorem, $\frac{(n-p)\hat{\sigma}^2}{\sigma^2} = \frac{\epsilon'}{\sigma}(I - X(X'X)^{-1}X')\frac{\epsilon}{\sigma} \sim \chi^2(n-p)$.
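A quick simulation sketch of this proposition (the design matrix, coefficients, and $\sigma$ below are made-up values for illustration): the scaled residual sum of squares should have mean $n-p$ and variance $2(n-p)$, as a $\chi^2(n-p)$ variable does.

# Simulation sketch: (n-p)*sigmahat^2/sigma^2 behaves like chi^2(n-p).
# X, beta, sigma, and the replication count are made-up values.
set.seed(1)
n = 50; p = 3; sigma = 2
X = cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))
beta = c(1, 2, 3)
stat = replicate(10000, {
  y = X %*% beta + rnorm(n, 0, sigma)
  fit = lm.fit(X, y)
  sum(fit$residuals^2) / sigma^2  # equals (n-p)*sigmahat^2/sigma^2
})
print(c(mean(stat), var(stat)))   # compare with n-p and 2*(n-p)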
3. Consider a linear model $\mathbf{y}_{n\times 1} = X\beta_{p\times 1} + \epsilon$. Under the constraint that $\beta_1 = \beta_2 = \cdots = \beta_{p-1} = 0$ the model collapses to $\mathbf{y} = \bar{y}\mathbf{1} + \epsilon$ with $RSS_0 = \sum_{i=1}^{n}\hat{\epsilon}_i^2 = \sum_{i=1}^{n}(y_i - \bar{y})^2$. Denote $\sum_{i=1}^{n}(y_i - \bar{y})^2$ by $\alpha$. The unconstrained model has $RSS_A = \alpha(1 - R_A^2)$, where $R_A^2$ is the coefficient of determination. Using the fact that there are $p-1$ constraint equations ($C$ is a $(p-1)\times p$ matrix), the F test can be rewritten:
\[
\frac{R_A^2/(p-1)}{(1 - R_A^2)/(n-p)} \sim F(p-1,\, n-p) \tag{1}
\]
Setting the statistic in (1) equal to the critical value $F_{.95}(p-1, n-p)$ and solving for $R_A^2$ gives the minimum coefficient of determination for a statistically significant regression:
\[
R_A^2 = \frac{(p-1)F_{.95}(p-1, n-p)}{(p-1)F_{.95}(p-1, n-p) + (n-p)} \tag{2}
\]
Plotting the minimum $R^2$:

n = 20
p = c(2, 3, 4, 5)
Rsquared = ((p - 1) * qf(.95, p - 1, n - p)) / ((p - 1) * qf(.95, p - 1, n - p) + (n - p))
plot(p, Rsquared)
title("min R^2 to have stat. sig. regression for n = 20, alpha = .05, vs p")
print(Rsquared)
0.1969260 0.2970286 0.3778341 0.4489806
The above four numbers are the minimum $R^2$ required to have a statistically significant regression for $n = 20$ and $p = 2, 3, 4, 5$ at the $\alpha = .05$ significance level.
[Figure: plot of the minimum $R^2$ (y-axis: Rsquared) against $p$ (x-axis); title "min R^2 to have stat. sig. regression for n = 20, alpha = .05, vs p".]
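As a side check (the simulated data below are made up for illustration), the overall F statistic reported by summary.lm agrees with the $R_A^2$ form in (1):

# Sketch: the R^2 form of the overall F statistic in (1) agrees with lm's.
# X and y are made-up illustration data.
set.seed(1)
n = 20; p = 4
X = matrix(rnorm(n * (p - 1)), n, p - 1)
y = rnorm(n)
o = lm(y ~ X)
r2 = summary(o)$r.squared
print((r2 / (p - 1)) / ((1 - r2) / (n - p)))  # F from equation (1)
print(summary(o)$fstatistic[1])               # F reported by lm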
4. I used the regression in boston(4) with V2,V3,V7 omitted.
bh$V14 = log(bh$V14)
o = lm(V14 ~ V1 + V4 + V5 + V6 + V8 + V9 + V10 + V11 + V12 + V13, data = bh)
print(summary(o))
yp = bh$V14 - o$residuals
n = nrow(bh)
plot(1:n, 100 * o$residuals, type = "h", ylab = "% overpaid/underpaid", xlab = "Census district code")
title("Where to buy a house in Boston?")
text(0, -60, "Underpriced houses", adj = 0, col = 3, cex = 2)
text(0, 60, "Overpriced houses", adj = 0, col = 2, cex = 2)
pp = 100 * o$residuals
imove = (1:n)[pp == min(pp)]
points(imove, min(pp), pch = 16, col = 3)
text(imove, min(pp), "Move here", adj = 0, col = 3)
# obtain the confidence interval
X = as.matrix(bh[, -c(2, 3, 7, 14)])
X = cbind(rep(1, n), X)
# first get the desired x value
x = as.matrix(X[imove, ])
beta = as.matrix(o$coefficients)
p = length(beta)
t95 = qt(1 - .05 / 2, n - p)
sighat = sqrt(1 / (n - p) * sum(o$residuals^2))
s = sqrt(1 + t(x) %*% solve(t(X) %*% X) %*% x)
u = t(beta) %*% x + t95 * sighat * s
l = t(beta) %*% x - t95 * sighat * s
print("predicted value:")
print(t(beta) %*% x)
print("95 percent confidence interval for best house:")
cat("(", l, ",", u, ")\n")  # cat() prints directly; wrapping it in print() also emits a stray NULL
[1] "predicted value:"
[,1]
[1,] 2.704989
[1] "95 percent confidence interval for best house:"
( 2.328414 , 3.081564 )NULL
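As a cross-check (assuming bh and the fitted model o are still in the workspace), R's built-in predict() computes the same interval; note the $\sqrt{1 + \mathbf{x}'(X'X)^{-1}\mathbf{x}}$ factor above makes this a prediction-type interval, which is what interval = "prediction" requests.

# Sketch: same interval via predict(); assumes bh, o, and imove from above.
pi = predict(o, newdata = bh[imove, ], interval = "prediction", level = 0.95)
print(pi)  # fit, lwr, upr should match the hand-computed values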