< a quantity that can be divided into another a whole number of time />

Some random geeky notes

September 3, 2013

Here are some random geeky notes that have accumulated over the past few months on my desk.

There are too many articles that I have read, or still have to read, to provide even a semblance of a summary here. My reading list of books is growing as well. Nevertheless, I’ve been happy with Serious Stats, by Thom Baguley, and Statistics Applied to Clinical Studies, by Cleophas and Zwinderman (bought as a complement to Statistics Applied to Clinical Trials by the same authors). I also enjoyed Introduction to Psychometric Theory, by Raykov and Marcoulides, since I was looking for a book relying on the Mplus software.

R 3.0 was released earlier this year. See also David Smith’s post on the Revolutions blog. I’m still using version 2.15.2, partly because I have a lot of work in progress and also because I haven’t found any decent way to manage my old packages directory with both versions. I guess I will have to update everything at some point, but that will have to wait for now.

I just updated to TeX 2013, bought Stata 13, and I’m playing more and more with Julia. For literate programming, I will probably be happy with dexy, and I will also try the Stata filter.

I authored most (about 80%) of a new course on the use of statistical software (R and Stata, as far as I was concerned) in medical research. It took me more than 150 hours to produce about 450 pages of slides, exercises and solutions, errata and handouts. What I’ve learned is that it is not writing code, designing a $\LaTeX$ template, or even learning some Stata that takes the most time: it is all about finding good data sets!

I was supposed to attend the JSM meeting this year. Unfortunately, I couldn’t make it, so I followed some of the #jsm2013 tweets. I heard about Nate Silver’s talk, and of course I have followed the “Data Science” trend in recent months.

The long-awaited Applied Predictive Modeling by Max Kuhn and Kjell Johnson is now out. There’s also an R package. I haven’t had time to buy and read the book yet, but this is just a matter of time.

There was a nice article about Bioconductor in PLoS Comp Bio: Software for Computing and Annotating Genomic Ranges. There was also a great tutorial at UseR! 2013 on the Analysis and Comprehension of High-Throughput Genomic Data.1

I started several sessions on Coursera, but I had to stop early because I was too busy at some point this summer. I like those courses, though, especially Introduction to Data Science (IDS), by Bill Howe, and High Performance Scientific Computing, with Randall J. LeVeque. In the former, I learnt a lot about big data, databases and Hadoop, while the latter is not so much about HPC as about scientific computing with open-source software: Python and Fortran 90/95, and cloud facilities (Amazon Web Services). As with IDS, the instructor offers the possibility to download a virtual machine (XUbuntu 12.04 with all software pre-installed), which is quite a good idea. Download time and size on disk are reasonable, so I may use a similar approach for future courses.

Generally speaking, I think it is a great idea to offer so much high-quality material for free. I really don’t care about grading policy or getting a certificate at the end, and other takers probably shouldn’t either. I just wish we could get some PDF handouts, because 6 hours of lectures per week is really exhausting, especially when you try to follow several courses at once, and I generally prefer to be able to refer to paper-based material. I’m involved myself in a French MOOC on biostatistical analysis with the R software. Let’s wait and see how it will be received next year…

There are now really great tools to build interactive HTML slides, namely Slidify and RStudio presentations, available in the development preview of RStudio. I know Chris Fonnesbeck used landslide for his great Bios301 course (Introduction to Statistical Computing) in the Department of Biostatistics at VU.2 I just wish such slides didn’t clutter my web browser history, but that’s probably something I can manage in the future.

Other miscellanies: I noticed that Apple just updated Java to version 7 (from Oracle), in what they called Java for OS X 2013. It probably occurred during an update that I allowed, although I haven’t updated anything for a year or so. I believe I hadn’t noticed this change before because my Clojure install just works fine and it’s been a long time since I needed Java Web Start or the Java compiler. It is still possible to revert those changes. Another funny thing is that I can still use my registered QuickTime Pro 7 software, despite QuickTime X being the default on OS X 10.7 and higher. Well, enough of my uphill complaints.

  1. Their annual report (PDF) looks great too! ↩︎

  2. And now there’s Bios366 (Advanced Statistical Computing) using Python. ↩︎

