Here are brief solutions for Assignment 1. OpenIntroStats refers to: Diez, DM, Barr, CD, and Çetinkaya-Rundel, M (2012). OpenIntro Statistics (2nd edition).

1. There is a header line (the variable names are on the first line of the file), missing data are coded as '.', and values are separated by a single space. This last point rules out every option that uses read.csv() without changing its default delimiter (a comma), and the missing-value code must be passed through the na.strings= option. So only option D is valid.
2. Option C is the correct one. Option A does not refer to the variable "id" correctly (it should be d$id), while option B does not specify that single imputation should be done on variable "time3".
3. Option A is incorrect because there's no such option (levels=). Option B is correct even if it uses repeated labels in the first assignment.
4. The quantity sd(x)/sqrt(n) represents the standard error of the mean, so the correct answer is B, see OpenIntroStats, 2nd ed., p. 170.
5. The correct answer is A, because the density function of a t-distribution with 29 degrees of freedom has slightly heavier tails than that of a standard normal distribution, hence quantiles that are larger in absolute value, e.g.
   > qnorm(0.025)
   [1] -1.959964
   > qt(0.025, 29)
   [1] -2.04523
6. Options A and B are incorrect interpretations of Type I (rejecting the null when we shouldn't) and Type II (not rejecting the null when we should) errors. The correct interpretation is C: there is a 100-80 = 20% risk that a true difference exists but we fail to demonstrate it using this sample.
7. If there are 15 individuals, the degrees of freedom of a t-test for paired samples will be 15-1 = 14, while for independent samples it would be 15*2-2 = 28. Only option C takes into account that the observations are paired.
8. In a frequentist framework, the parameter of interest is fixed, hence it has no probability distribution, so option A is incorrect.
Instead, option B correctly reflects the sampling process.
9. The 95% CI doesn't cover the value 0, so the observed effect can be regarded as significant at the 5% level. However, this CI is rather wide, which suggests that the estimate is rather imprecise (low sample size?). As the observed difference is far from 0 (and twice the expected decrease in mortality), the effect might be considered large. So the correct answer is B.
10. Working with the difference of scores (d) amounts to testing H0 : mu_d = 0, which is equivalent to H0 : mu1 = mu2 when the two samples are paired, see OpenIntroStats, 2nd ed., p. 214.
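
The degrees of freedom in solution 7 and the equivalence stated in solution 10 can be checked numerically. The following is only a minimal sketch on simulated data: the vectors before and after, the seed, and the chosen means and standard deviations are hypothetical and are not part of the assignment.

```r
# Hypothetical paired scores for n = 15 individuals (simulated, not assignment data)
set.seed(101)
n <- 15
before <- rnorm(n, mean = 50, sd = 10)
after <- before + rnorm(n, mean = 2, sd = 5)
d <- after - before

# Paired t-test versus one-sample t-test on the differences (H0: mu_d = 0)
tp <- t.test(after, before, paired = TRUE)
td <- t.test(d, mu = 0)

# Both tests share the same statistic and the same n - 1 degrees of freedom
c(paired = unname(tp$statistic), differences = unname(td$statistic))
c(paired = unname(tp$parameter), differences = unname(td$parameter))

# Wrongly treating the samples as independent (equal variances) gives 2n - 2 df
ti <- t.test(after, before, var.equal = TRUE)
unname(ti$parameter)

# The test statistic is the mean difference divided by its standard error
se <- sd(d) / sqrt(n)
all.equal(mean(d) / se, unname(td$statistic))
```

With 15 pairs, the paired and one-sample tests both report 14 degrees of freedom, while the independent-samples version reports 28.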