STAT 362 Assignment 2, 2014

Due Date: 6pm, Wed 6 August
Answer all questions in a separate document. Please use complete sentences
and type up your answers for all questions. Handwritten work will NOT
be accepted.
Q1. In an experiment to investigate the speed of learning of rats, rats were
placed in a maze and the time each took to complete the maze was
recorded. If a rat took longer than 5 seconds to complete the maze,
it was subjected to an electric shock for the duration of its next
attempt.
The data are the average completion times for rats after a given number
of shocks [1]:

Number of shocks    Time (seconds)
 0                  11.4
 1                  11.9
 2                   7.1
 3                  14.2
 4                   5.9
 5                   6.1
 6                   5.4
 7                   3.1
 8                   5.7
 9                   4.4
10                   4.0
11                   2.8
12                   2.6
13                   2.5
14                   5.2
15                   2.0

(a) Plot the time (y) against the number of shocks (x) using R.
(b) Fit the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
and find $\hat{\beta}$.
(c) Find a 95% confidence interval for the expected maze completion
time for rats that are shocked 6 times.
[1] Data from Bond, N.W. (1979) Impairment of shuttlebox avoidance-learning following
repeated alcohol withdrawal episodes, Pharmacology, Biochemistry and Behaviour.
(d) Suppose a new rat is shocked 8 times. Construct a 95% prediction
interval for the rat’s time through the maze.
(e) Fit the quadratic model $y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \varepsilon_i$
and then test whether this model provides a significant improvement
over the simple linear regression model.
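
The least-squares fit in part (b) can be sanity-checked outside R. The following is a minimal Python sketch, not the R workflow the question asks for: it applies the closed-form estimates $\hat{\beta}_1 = S_{xy}/S_{xx}$ and $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$ to the maze data given above. The function name `fit_line` is an illustrative assumption.

```python
# Closed-form simple linear regression on the maze data (illustrative sketch).
shocks = list(range(16))
times = [11.4, 11.9, 7.1, 14.2, 5.9, 6.1, 5.4, 3.1,
         5.7, 4.4, 4.0, 2.8, 2.6, 2.5, 5.2, 2.0]


def fit_line(x, y):
    """Return (intercept, slope) minimising the residual sum of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(shocks, times)
print(f"intercept = {intercept:.4f}, slope = {slope:.4f}")
```

The printed values should agree with the coefficients reported by R for the same model; the intervals in parts (c) and (d) additionally need a t quantile, which this sketch does not compute.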
Q2. For the daily output of an industrial operation, let Y1 and Y2 be the
amount of sales and costs respectively (in thousands of dollars). The
density functions for Y1 and Y2 are given by:
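$$f_1(y_1) = \begin{cases} \frac{1}{6} y_1^3 e^{-y_1}, & y_1 > 0 \\ 0, & \text{otherwise} \end{cases}$$
and
$$f_2(y_2) = \begin{cases} \frac{1}{2} e^{-y_2/2}, & y_2 > 0 \\ 0, & \text{otherwise.} \end{cases}$$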
The daily profit is given by U = Y1 − Y2 .
(a) Find E(Y1 ) and E(Y2 ).
(b) Find E(U ).
(c) Assuming that Y1 and Y2 are independent, find Var(U ).
(d) Would you expect the average daily profit over a month to drop
below zero very often? Please explain.
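
The expectations in part (a) are integrals of $y f(y)$ over $y > 0$, so an analytical answer can be checked numerically. The following is a minimal Python sketch, assuming a midpoint rule on a truncated range is accurate enough for this purpose; the helper `expectation`, the upper limit 60 and the step count are illustrative choices, not part of the assignment.

```python
import math


def f1(y):
    """Density of sales Y1: (1/6) y^3 exp(-y) for y > 0, else 0."""
    return y ** 3 * math.exp(-y) / 6 if y > 0 else 0.0


def f2(y):
    """Density of costs Y2: (1/2) exp(-y/2) for y > 0, else 0."""
    return 0.5 * math.exp(-y / 2) if y > 0 else 0.0


def expectation(density, upper=60.0, steps=200_000):
    """Approximate the integral of y * density(y) over (0, upper) by the midpoint rule."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        mid = (i + 0.5) * h
        total += mid * density(mid)
    return total * h


e_y1 = expectation(f1)
e_y2 = expectation(f2)
print(f"E(Y1) ~ {e_y1:.4f}, E(Y2) ~ {e_y2:.4f}")
```

The same helper can be reused for second moments by integrating $y^2 f(y)$ instead, which is one way to check the variance in part (c).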
Q3. Show that Total SS = SST + SSE by:
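(a) First adding and subtracting $\bar{y}_{.j} = \frac{1}{n_j} \sum_{i=1}^{n_j} y_{ij}$
within the expression for Total SS to show that:
$$\text{Total SS} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_{..})^2
= \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left[ (y_{ij} - \bar{y}_{.j})^2 + 2 (y_{ij} - \bar{y}_{.j})(\bar{y}_{.j} - \bar{y}_{..}) + (\bar{y}_{.j} - \bar{y}_{..})^2 \right]$$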
(b) Then showing that $\sum_{i=1}^{n_j} (y_{ij} - \bar{y}_{.j}) = 0$.
(c) And finally, using (a) and (b), showing that
$$\text{Total SS} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_{.j})^2 + \sum_{j=1}^{k} n_j (\bar{y}_{.j} - \bar{y}_{..})^2 = \text{SSE} + \text{SST}.$$
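
The identity can be checked numerically alongside the proof. The following is a minimal Python sketch on a small unbalanced data set; the values in `groups` are hypothetical and chosen only for illustration.

```python
# Hypothetical data: k = 3 groups with unequal sizes n_j.
groups = [
    [4.0, 5.5, 6.1],
    [7.2, 6.8, 8.0, 7.5],
    [3.9, 4.4],
]

all_values = [y for g in groups for y in g]
grand_mean = sum(all_values) / len(all_values)
group_means = [sum(g) / len(g) for g in groups]

# Total SS: squared deviations of every observation from the grand mean.
total_ss = sum((y - grand_mean) ** 2 for y in all_values)

# SSE: squared deviations of observations from their own group mean.
sse = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)

# SST: group size times squared deviation of the group mean from the grand mean.
sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

# Cross term from the expansion; it should vanish up to rounding error.
cross = sum(2 * (y - m) * (m - grand_mean)
            for g, m in zip(groups, group_means) for y in g)

print(total_ss, sse + sst, cross)
```

A numerical match on one data set is not a proof, but a mismatch would point to an algebra slip in one of the three steps.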
Q4. If X and Y are correlated random variables and a and b are constants,
show that
(a) Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
(b) Cov(aX, bY) = ab Cov(X, Y)
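
Both identities also hold exactly for sample variances and covariances computed with a common divisor, which gives a quick numerical check of the algebra. The following is a minimal Python sketch; the paired values and the constants `a` and `b` are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Sample covariance with divisor n - 1; cov(u, u) is the sample variance."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v)) / (len(u) - 1)


# Hypothetical paired observations of two correlated variables.
x = [1.2, 2.9, 3.1, 4.8, 5.0, 6.7]
y = [2.0, 2.5, 3.9, 4.1, 5.8, 6.0]
a, b = 3.0, -2.0

# Identity (a): variance of the sum against the three-term expansion.
s = [xi + yi for xi, yi in zip(x, y)]
lhs_var = cov(s, s)
rhs_var = cov(x, x) + cov(y, y) + 2 * cov(x, y)

# Identity (b): covariance of the scaled variables against ab Cov(X, Y).
lhs_cov = cov([a * xi for xi in x], [b * yi for yi in y])
rhs_cov = a * b * cov(x, y)

print(lhs_var, rhs_var)
print(lhs_cov, rhs_cov)
```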
Q5. To evaluate the effect of aerobic training on aerobic performance, ten
male athletes from each of four sports were tested on a treadmill and
were subjected to a graded workload. The athletes were tested until the
point of exhaustion was reached, and their maximum oxygen outputs
during the tests were recorded. The data (in litres/minute) are given
in the following table:

Squash    Marathon    Soccer    Rowing
5.1       4.3         4.5       4.6
5.7       4.5         4.4       5.0
4.6       5.0         3.9       5.3
4.3       3.8         5.2       5.8
5.9       5.9         4.1       5.5
5.3       4.7         5.0       6.0
4.8       4.3         5.4       5.1
5.4       5.2         4.1       4.8
4.4       4.8         4.2       5.6
5.1       4.1         4.9       5.3

(a) Write the X-matrix for fitting the model $y_{ij} = \mu_j + \varepsilon_{ij}$, where $y_{ij}$
is the maximum oxygen output for athlete i from sport j (j =
1, 2, 3, 4).
(b) Write the X-matrix for fitting the model
$$y_{ij} = \begin{cases} \mu + \varepsilon_{ij}, & j = 1 \\ \mu + \tau_j + \varepsilon_{ij}, & j = 2, 3, 4 \end{cases}$$
where $y_{ij}$ is again the maximum oxygen output for athlete i from
sport j. What is the correct interpretation of $\mu$ and $\tau_j$ in this
model?
(c) (By hand) perform a one-way ANOVA to determine whether there
is evidence of differences in the maximum oxygen output for athletes from the different sports. What is a drawback of using this
approach?
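
The X-matrices in parts (a) and (b) differ only in how sport membership is coded. The following is a minimal Python sketch, assuming the observations are stacked sport by sport (Squash, Marathon, Soccer, Rowing) with ten athletes each; the function names `cell_means_matrix` and `baseline_matrix` are illustrative and not part of the assignment.

```python
SPORTS = ["Squash", "Marathon", "Soccer", "Rowing"]
N_PER_SPORT = 10


def cell_means_matrix(k, n):
    """One indicator column per group: a row from group j has a 1 in column j."""
    return [[1 if col == j else 0 for col in range(k)]
            for j in range(k) for _ in range(n)]


def baseline_matrix(k, n):
    """Intercept column plus indicators for groups 2..k (group 1 is the baseline)."""
    return [[1] + [1 if col == j else 0 for col in range(1, k)]
            for j in range(k) for _ in range(n)]


X_a = cell_means_matrix(len(SPORTS), N_PER_SPORT)
X_b = baseline_matrix(len(SPORTS), N_PER_SPORT)

# Dimensions, then one representative row from the first, second and last sport.
print(len(X_a), len(X_a[0]))
print(X_a[0], X_a[10], X_a[39])
print(X_b[0], X_b[10], X_b[39])
```

Comparing the rows printed for the two codings is a useful starting point for the interpretation asked for in part (b).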