# Assignment 2

This is the second quiz for the cogmaster-stats course. There are 10 multiple-choice questions in total, each with exactly one correct response. Rather than answering at random, please check the "Don't know" option. When you are done with the quiz, you can validate your responses by clicking the Validate button at the bottom of the page.
Responses are not saved as you type them, so be sure to check all questions before submitting your answers.
If you have any questions, you can ask on Twitter (@cogmasterstats) or by email.
Due date: December 1.

Name, Email, Date (e.g., 21/08/2008; the date format is also checked internally)

1. The next output shows the structure of a data frame `d` and the results of applying a one-way ANOVA model to the data.

   ```
   > summary(d)
         resp        grp
    Min.   : 6.847   A:10
    1st Qu.: 8.905   B:10
    Median :10.063   C:10
    Mean   :10.095
    3rd Qu.:11.000
    Max.   :13.033
   > summary(aov(resp ~ grp, data=d))
               Df Sum Sq Mean Sq F value   Pr(>F)
   grp          ?  61.42  30.708    ???? 1.33e-07
   Residuals   27  27.52   1.019
   ```

   What are the values for the missing degrees of freedom (DF=) and F-statistic (F=)?

   - DF=3 and F=30.13
   - DF=2 and F=30.13
   - DF=2 and F=2.23
   - Don't know.

2. Below are results from a two-way ANOVA with factors `x1` and `x2`, and responses collected on 100 subjects.

   ```
               Df Sum Sq Mean Sq F value   Pr(>F)
   x1           1   1077    1077   4.893 0.029385 *
   x2           1   3255    3255  14.788 0.000219 ***
   x1:x2        1   1338    1338   6.081 0.015480 *
   Residuals   94  20688     220
   ---
   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
   ```

   How many missing observations are present in the data frame?

   - None.
   - 2.
   - 1.
   - Don't know.

3. From the preceding ANOVA table, how would you compute the partial effect-size measure for `x2`?

   - 3255/20688.
   - 3255/(3255+20688).
   - 3255/(3255+1338+20688).
   - Don't know.

4. Using the data set described in Question 1, but restricted to levels A and B for factor `grp`, we fit a regression line to the data (20 observations). Some results are provided below:

   ```
   > with(d, tapply(resp, grp, mean))
           A         B         C
   10.247899 11.765453  8.270759
   > lm(resp ~ grp, data=d, subset= grp != "C")

   Call:
   lm(formula = resp ~ grp, data = d, subset = grp != "C")

   Coefficients:
   (Intercept)         grpB
        10.248        ?????
   ```

   What is the value of the estimate for the slope parameter?

   - 11.765.
   - 1.518.
   - 0.759.
   - Don't know.

5. Would you expect to observe the same results (value of the test statistic, and its corresponding p-value) when using a two-tailed Student t-test vs. a simple linear regression to assess differences between the two groups in the preceding case?

   - Yes.
   - No.
   - Don't know.

6.
   Here are some data from an experiment in plant physiology, which record the length in coded units of pea sections grown in tissue culture with auxin present [R. R. Sokal and F. J. Rohlf, Biometry, 3rd ed., W. H. Freeman and Company, 1995]. The purpose of the experiment was to test the effects of various sugars on growth as measured by length (pea diameter measured in ocular units, × 0.114 = mm). Four experimental groups, representing three different sugars (`X2G`, 2% glucose; `X2F`, 2% fructose; `X2S`, 2% sucrose) and one mixture of sugars (`X1G1F`, 1% glucose + 1% fructose), were used, plus one control (`C`) without sugar. The null hypothesis is that there is no added component due to treatment effects among the five groups. The data were altered for the purpose of the exercise.

   ```
       C X2G X2F X1G1F X2S
   1  75  57  58    58  62
   2  67  58  61    59  66
   3  70  60  NA    58  65
   4  75  59  58    61  63
   5  65  62  57    57  64
   6  71  60  56    NA  62
   7  67  60  61    58  NA
   8  67  57  60    57  NA
   9  76  NA  57    57  62
   10 68  61  58    59  67
   ```

   Assuming the data frame `peas` has been converted to the long format, where the explanatory variable is now `tx` and the response variable is `value`, the ANOVA table is shown below:

   ```
               Df Sum Sq Mean Sq F value   Pr(>F)
   tx           4  989.6  247.41   42.37 7.13e-14 ***
   Residuals   40  233.6    5.84
   ---
   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
   ```

   What command can be used to find the degrees of freedom for the residual sum of squares?

   - `nrow(peas)-nlevels(peas$tx)`
   - `length(peas$value)-nlevels(peas$tx)`
   - `sum(!is.na(peas$value))-nlevels(peas$tx)`
   - Don't know.

7. What command could we use to compute a 95% confidence interval for the Pearson correlation coefficient estimated from the following series of observations?

   ```
   x1 11 12 14 11 13 15 14 15 10 13 14 11 13  8  9
   x2 12 13 14 11 13 16 15 16 11 14 15 12 14  8 10
   ```

   - `confint(cor(x1, x2))`
   - `cor(x1, x2, conf.level=0.95)`
   - `cor.test(x1, x2, conf.level=0.95)`
   - Don't know.

8.
   In a study on the cognitive performance of twenty children from four different age groups (5, 6, 7 and 8 years), we observed the following results with a linear regression model, considering age group as a numerical variable:

   ```
   Coefficients:
               Estimate Std. Error t value Pr(>|t|)
   (Intercept)   0.7521     0.7398   1.017 0.323572
   age           0.5053     0.1117   4.525 0.000299

   Residual standard error: 0.5554 on 17 degrees of freedom
     (1 observation deleted due to missingness)
   Multiple R-squared:  0.5464, Adjusted R-squared:  0.5197
   F-statistic: 20.48 on 1 and 17 DF,  p-value: 0.0002991
   ```

   Without taking into account the missing observation, the estimated variances are Var(x)=1.374 and Var(y)=0.642. What is the value of the Pearson correlation coefficient between `x` and `y`?

   - 0.739
   - 0.345
   - 0.592
   - Don't know.

9. If we were to use an ANOVA model, treating age group as a factor, would we get the same p-value for the F-test assessing the whole model?

   - Yes.
   - No.
   - Don't know.

10. Is the hypothesis of normality required to compute the slope of a regression line by ordinary least squares?

    - Yes, but only that of the x-variable.
    - Yes, but only that of the y-variable.
    - Yes, both the x- and y-variable should be normally distributed.
    - No.
    - Don't know.
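
Question 6 assumes that `peas` has already been converted to the long format. The quiz does not say how this was done. One possible way in base R is sketched below, assuming the wide table is typed in exactly as printed in Question 6; the name `peas_long` is introduced here for illustration only.

```r
# Wide-format data as printed in Question 6 (NA = missing measurement).
peas <- data.frame(
  C     = c(75, 67, 70, 75, 65, 71, 67, 67, 76, 68),
  X2G   = c(57, 58, 60, 59, 62, 60, 60, 57, NA, 61),
  X2F   = c(58, 61, NA, 58, 57, 56, 61, 60, 57, 58),
  X1G1F = c(58, 59, 58, 61, 57, NA, 58, 57, 57, 59),
  X2S   = c(62, 66, 65, 63, 64, 62, NA, NA, 62, 67)
)

# stack() puts all measurements in one column and the original
# column names in a factor; `peas_long` is a hypothetical name.
peas_long <- stack(peas)

# Rename the default columns (`values`, `ind`) to the names used in the quiz.
names(peas_long) <- c("value", "tx")

# Inspect the structure of the long-format data frame.
str(peas_long)
```

`stack()` names its output columns `values` and `ind` by default, hence the renaming step before the result can be used with a formula such as `value ~ tx`.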