* I always create small data frames for illustrations with GLMs. Basically, I do something like

        X <- replicate(2, rnorm(100))
        y <- X[,1] + X[,2] + rnorm(100)
        df <- data.frame(y=y, X=X)

    Here is a one-line solution:

        df <- transform(X <- as.data.frame(replicate(2, rnorm(100))), y = V1+V2+rnorm(100))

    This is also a nice way to generate two uncorrelated predictors (they are independent draws, so their sample correlation is close to zero), while allowing a strong association between the outcome and each of them.

* A quick and dirty way to simulate two-way ANOVA data:

        n <- 100
        A <- gl(2, n/2, n, labels=paste("a", 1:2, sep=""))
        B <- gl(2, n/4, n, labels=paste("b", 1:2, sep=""))
        df <- data.frame(y=rnorm(n), A, B)

* A good replacement for sink() for capturing the output of R commands is capture.output(). From the on-line help, it can even be combined with enscript like so:

        ps <- pipe("enscript -o tempout.ps", "w")
        capture.output(example(glm), file=ps)
        close(ps)

* conflicts(detail=TRUE) gives details about masked functions.

* Export a plot with a shaded border (like the OS X screencapture utility):

        R> png("grv.png"); plot(replicate(2, rnorm(100))); dev.off()
        $ convert grv.png \( +clone -background black -shadow 55x15+0+5 \) \
            +swap -background none -layers merge +repage grv2.png

* Instead of replicate(), we can use r_ply() (from the plyr package) to run something when no return value is expected, e.g. for an animation:

        r_ply(10, plot(runif(50)))

* To get the column number of a column given its name, better than which(colnames(d)=="a") or grep("^a$", colnames(d)), use

        match("a", names(d))

    With a huge data frame, it's even better to use the fastmatch package (use fmatch instead of match). h/t Matthew Dowle, http://bit.ly/yFeDTT

* To update to a new version of R while keeping the older installed version, we can use pkgutil to look for available versions on a Mac system:

        $ pkgutil --packages / | grep org.r-project

    Mac versions usually include Leopard.fw.* in the installed receipts.
    Then we just have to tell the system to forget the previous version, using e.g.

        $ sudo pkgutil --forget org.r-project.R.Leopard.fw.pkg

* Use page(d, method="print") to view a long data frame d through a pager.
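
As a quick sanity check of the first tip and of the match() lookup, here is a minimal sketch. The seed and the names n, r12, fit and idx are my own illustrative choices, not part of the tips above. It builds the data frame with the one-liner, looks at the sample correlation between the two predictors, fits a linear model, and confirms that match() and which() return the same column index.

```r
# Sketch only: the seed and the names n, r12, fit, idx are illustrative.
set.seed(101)
n <- 100

# One-line construction from the first tip; as.data.frame() names the
# two columns V1 and V2, which transform() can then refer to.
df <- transform(as.data.frame(replicate(2, rnorm(n))),
                y = V1 + V2 + rnorm(n))

# V1 and V2 are independent draws, so their sample correlation should be
# close to zero (though not exactly zero in any given sample).
r12 <- cor(df$V1, df$V2)
print(r12)

# Both predictors enter the outcome with a true slope of 1.
fit <- lm(y ~ V1 + V2, data = df)
print(coef(fit))

# Column lookup by name: match() and which() agree.
idx <- match("y", names(df))
stopifnot(idx == which(colnames(df) == "y"))
```

With a different seed the estimates will move around, but the slopes should stay near 1 and the predictor correlation near 0 for samples of this size.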