Stats 201 Homework 8: Due Friday, Dec. 12 by 5pm

For problems that require the use of R, print all relevant R code, output, and plots. Your R code and output should be clear and concise; our grader will not wade through pages of R output looking for an answer. If there are handwritten portions as well as printed R code and output, be sure the location of your answers is clear.

1. Generations of athletes have been cautioned that cigarette smoking hinders performance. One measure of the truth of that warning is the effect of smoking on heart rate. In one study examining that impact, six each of non-smokers, light smokers, moderate smokers, and heavy smokers undertook sustained physical exercise. Their heart rates were measured after resting for three minutes. The results appear in the following table:

    Non-Smokers        69  52  71  58  59  65
    Light Smokers      55  60  78  58  62  66
    Moderate Smokers   66  81  70  77  57  79
    Heavy Smokers      91  72  81  67  95  84

Note that this data set is not posted on the course webpage; to get the data into R, you will need to manually enter the data into a data.frame.

(a) Create the one-way ANOVA table "by hand." You may use R to calculate sample means and sample variances (using tapply with the mean and var functions), but do not use the aov or lm functions (though you may check your answers using those functions). Be sure to show how you calculated each component of the ANOVA table.

(b) Carry out the ANOVA F-test for this scenario. Report the null and alternative hypotheses, defining any symbols used in the context of the problem; the F test statistic; the p-value; and your conclusion in terms of the problem. Use the pf function in R to calculate the p-value rather than the aov or lm functions.

(c) Use R to compute the Tukey multiple comparison confidence intervals for all pairwise mean differences in the smoker study using a familywise error rate of $\alpha = 0.05$ (now you can use the R function aov, and then TukeyHSD). Which pairs of groups have significantly different mean heart rates?

(d) Write a few sentences summarizing your conclusions from this study.

2. Consider the two-way ANOVA model

    $Y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \epsilon_{ijk}$

for $k = 1, \dots, n_{ij}$, $i = 1, \dots, a$, and $j = 1, \dots, b$, where $\epsilon_{ijk} \stackrel{iid}{\sim} N(0, \sigma^2)$. Suppose $a = 2$ and $b = 3$, and the sample sizes within each combination of the two factors are $n_{11} = n_{13} = n_{21} = n_{22} = 1$ and $n_{12} = n_{23} = 2$. This model can be expressed as a linear model in matrix form: $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$. Write out the entries of the vector $\mathbf{Y}$, the matrix $\mathbf{X}$, and the vector $\boldsymbol{\beta}$ for this model.

3. Read in the 1985 Current Population Survey data we have used previously:

    cps = read.csv("http://www.ics.uci.edu/~staceyah/201/data/cps.csv")

The response is hourly wage in dollars (wage), Factor A is sex (M or F), and Factor B is married (Married or Single).

(a) Use R to calculate the sample sizes for each of the four factor combinations.

(b) Use R to calculate the sample standard deviations for each of the four factor combinations.

(c) Use R to calculate the four cell sample mean wages ($\hat{\mu}_{ij}$ for $i = 1, 2$ and $j = 1, 2$), as well as the four marginal sample mean wages ($\hat{\mu}_{i.}$ for $i = 1, 2$ and $\hat{\mu}_{.j}$ for $j = 1, 2$), and the overall sample mean wage.

(d) Find estimates of the following parameters: (i) $\alpha_2$, (ii) $\beta_1$, (iii) $\alpha\beta_{21}$.

(e) Use R to produce an interaction plot for these data. From the interaction plot, does it seem like an interaction is present between sex and marital status? Explain.

(f) Produce the two-way ANOVA table (including interaction). Report the p-value and the conclusion of the F-test for interaction.

(g) Explain why it would not make sense to interpret the main effects for these data.

(h) Assess the assumptions of the two-way ANOVA model for these data. If the assumptions are not met, suggest a transformation of the response that might reduce the assumption violations.
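As a hint for the mechanics of Problem 3, parts (a) through (c) and (e), here is a minimal sketch of how tapply and interaction.plot can be applied to two factors at once. It runs on a small made-up data frame called toy (hypothetical values, not the CPS data), and it assumes the columns are named wage, sex, and married as described above; the object names n_ij, sd_ij, and so on are illustrative only.

```r
# Toy stand-in for the CPS data: made-up values, NOT the real survey.
# Column names (wage, sex, married) are assumed from the problem statement.
toy <- data.frame(
  wage    = c(10.0, 12.0, 8.0, 9.0, 7.5, 8.5, 6.0, 7.0),
  sex     = rep(c("M", "M", "F", "F"), times = 2),
  married = rep(c("Married", "Single"), each = 4)
)

# Passing a list of two factors to tapply gives one value per cell,
# arranged as a sex-by-married table.
cells   <- list(sex = toy$sex, married = toy$married)
n_ij    <- tapply(toy$wage, cells, length)  # cell sample sizes
sd_ij   <- tapply(toy$wage, cells, sd)      # cell standard deviations
mean_ij <- tapply(toy$wage, cells, mean)    # cell means

# Marginal means computed from the raw observations, and the overall mean.
mean_i   <- tapply(toy$wage, toy$sex, mean)
mean_j   <- tapply(toy$wage, toy$married, mean)
mean_all <- mean(toy$wage)

print(n_ij)
print(sd_ij)
print(mean_ij)
print(mean_i)
print(mean_j)
print(mean_all)

# Interaction plot: one trace per marital status, sex on the x-axis.
with(toy, interaction.plot(x.factor = sex, trace.factor = married,
                           response = wage))
```

To use this on the assignment data, replace toy with cps. Note that when cell sizes are unequal, marginal means computed from the raw observations in this way are weighted by the cell counts.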
The remaining problems are exercises from the Utts and Heckard supplemental chapter on Two-Way Analysis of Variance (Chapter S4), posted in our EEE Dropbox:

4. Exercise S4.4.

5. Exercise S4.22.