< a quantity that can be divided into another a whole number of time />

Design of experiments in R

May 28, 2011

When I started writing my companion textbook for Montgomery’s Design and Analysis of Experiments, there were not many dedicated packages available on CRAN. I now realize that there are a lot of very handy ones. Most of them were released in 2010 and are listed in the corresponding Task View, ExperimentalDesign.

In the so-called White Book (Statistical Models in S, Chambers & Hastie, 1992), section 5.2.3 (pp. 169-175) is dedicated to full and fractional factorial designs, with two dedicated functions. However, those two functions are not available in R, and we only have expand.grid (see Venables and Ripley, MASS 4th ed., pp. 167-169), which is not very useful for generating fractional designs.
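
To make the limitation concrete, here is a minimal base R sketch. It builds a full 2^3 factorial with expand.grid, then adds two generated columns by hand, assuming for illustration that D=AB and E=AC with the plus sign. The object names `base` and `frac` are mine.

```r
# Full 2^3 factorial in coded units (-1/+1): every combination of A, B, C
base <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

# expand.grid stops here; the generated columns have to be added by hand
# (assumed generators for this illustration: D = AB, E = AC)
frac <- transform(base, D = A * B, E = A * C)

nrow(frac)                                  # 8 runs for five factors
nrow(expand.grid(rep(list(c(-1, 1)), 5)))   # 32 runs for the full 2^5 design
```

This works for a small case, but nothing in expand.grid helps with choosing generators or reading off the resulting aliases, which is what the dedicated packages provide.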

Let’s consider a 2^(5-2) design, with the following generators: D=±AB and E=±AC. The corresponding design matrix can easily be obtained with the BHH2 package, which provides R functions and datasets from Box, Hunter and Hunter’s book, Statistics for Experimenters II (Wiley, 2005):

library(BHH2)
d52 <- ffDesMatrix(5, gen=list(c(4,1,2), c(5,1,3)))

Or we could use the FrF2 package.
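
One way to request this design from FrF2 might look as follows. This is a sketch, not necessarily the call that produced the listing below. It assumes that the `generators` argument accepts letter strings naming the interaction that defines each added factor, and that `design.info()` (from DoE.base, which FrF2 loads) exposes the alias information. The name `d52f` is mine.

```r
library(FrF2)   # assumed to be installed; attaches DoE.base as well

# 8 runs, 5 factors; D and E are generated from the base factors A, B, C.
# randomize = FALSE keeps the runs in standard order (the default shuffles them).
d52f <- FrF2(nruns = 8, nfactors = 5, generators = c("AB", "AC"),
             randomize = FALSE)
d52f

# Alias information stored with the design object
design.info(d52f)$aliased
```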


Note that the FrF2 package has an Rcmdr plugin that facilitates its use. In both cases, we get:

   A  B  C  D  E
1 -1  1 -1 -1  1
2 -1 -1 -1  1  1
3 -1 -1  1  1 -1
4  1  1 -1  1 -1
5 -1  1  1 -1 -1
6  1 -1  1 -1  1
7  1 -1 -1 -1 -1
8  1  1  1  1  1

Now, we want to find the aliases that this structure defines. We already know that for this kind of 2^(5-2) design, every main effect is aliased with at least one two-factor interaction. Let’s check it:

[1] "A=A" "B=B" "C=C" "D=D" "E=E"

[1] "A=BD=CE" "B=AD"    "C=AE"    "D=AB"    "E=AC"   

[1] "BC=DE" "BE=CD"
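
The same alias pattern can be cross-checked by hand in base R. This is a minimal sketch that assumes the plus sign for both generators (D=AB, E=AC): two effects are aliased when their columns in the design matrix are identical. The names `d`, `prs`, `fi2` and `aliased` are mine.

```r
# Rebuild the 8-run design in coded units
d <- transform(expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1)),
               D = A * B, E = A * C)

# All 10 two-factor interaction columns, named "AB", "AC", ...
prs <- combn(names(d), 2)
fi2 <- apply(prs, 2, function(p) d[[p[1]]] * d[[p[2]]])
colnames(fi2) <- apply(prs, 2, paste, collapse = "")

# For each main effect, list the interactions whose column is identical to it
aliased <- sapply(names(d), function(f)
  paste(colnames(fi2)[colSums(fi2 == d[[f]]) == nrow(d)], collapse = "="))
aliased

# Two-factor interactions that share a column with each other
all(fi2[, "BC"] == fi2[, "DE"])
all(fi2[, "BE"] == fi2[, "CD"])
```

Comparing columns only detects aliases with a positive sign, which is enough here because both generators were taken with the plus sign.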

There’s a lot more to see in the FrF2 package, including plots of main effects in 2^k designs, Daniel’s plot, the “cube plot”, the alias structure for a standard lm object, and the construction of split-plot designs.

