# Analysing discrete correlated data using R

Geert Molenberghs & Geert Verbeke, **Models for discrete longitudinal data**. (Springer, 2005) [www]

This book provides a comprehensive treatment of modeling approaches for non-Gaussian repeated measures, possibly subject to incompleteness. The authors begin with models for the full marginal distribution of the outcome vector. This allows model fitting to be based on maximum likelihood principles, immediately implying inferential tools for all parameters in the models. At the same time, they formulate computationally less complex alternatives, including generalized estimating equations and pseudo-likelihood methods. They then briefly introduce conditional models and move on to the random-effects family, encompassing the beta-binomial model, the probit model and, in particular, the generalized linear mixed model. Several frequently used procedures for model fitting are discussed, and differences between marginal models and random-effects models are given attention.

The authors consider a variety of extensions, such as models for multivariate longitudinal measurements, random-effects models with serial correlation, and mixed models with non-Gaussian random effects. They sketch the general principles for how to deal with the commonly encountered issue of incomplete longitudinal data. The authors critique frequently used methods and propose flexible and broadly valid methods instead, and they conclude with key concepts of sensitivity analysis.

Without putting too much emphasis on software, the book shows how the different approaches can be implemented within the SAS software package. The text is organized so the reader can skip the software-oriented chapters and sections without breaking the logical flow.

**Note to the interested reader.**

Partly because of my current position and the other projects I'm involved in, I cannot work on this handbook on a regular basis. I plan to release a beta version in January 2008, and I hope the work will be finished by the end of 2008.

- Handbook: MDLD.pdf (source: MDLD.Rnw), a very incomplete draft compiled at the end of 2007
- Dataset: dataset.tar.gz (README)
- R code: MDLD.R
- Original SAS code: SAS_code.tar.gz (HTML version)

The following R packages are needed to run all of the examples used in the textbook: `gee`, `geepack`, `yags`, `VGAM`, `aod`, `lme4`.
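As a convenience, here is a minimal sketch for checking which of these packages are already installed before running the examples. It uses only base R; the variable names are mine, `VGAM` is written in upper case as it is spelled on CRAN, and some packages (`yags`, for instance) may not be available from CRAN and may have to be installed from another source.

```r
# Packages named on this page (VGAM in its CRAN spelling).
pkgs <- c("gee", "geepack", "yags", "VGAM", "aod", "lme4")

# TRUE for each package that can be found in the current R library.
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(available)

# Report anything that is missing, without installing it automatically.
missing_pkgs <- names(available)[!available]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to try installing from CRAN
}
```

The install step is left commented out on purpose, so that running the snippet never changes the local library.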

Below is some additional material that I collected while writing this handbook. These are mainly tutorials and articles about GEE and GLMM, which I found very useful for gaining additional insight into both the theoretical and practical aspects of this statistical field.

- Régression logistique conditionnelle pour données corrélées (in French: conditional logistic regression for correlated data; taken from archimede.mat.ulaval.ca/pages/duchesne/) [pdf]
- The Generalized Estimating Equations in the Past Ten Years: An Overview and A Biomedical Application (taken from citeseer.ist.psu.edu) [pdf]
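For readers new to the difference between the marginal (GEE) and random-effects (GLMM) views, the following base-R simulation may help. It is only an illustrative sketch, not an example from the textbook: all parameter values and variable names are assumptions of mine. Binary data are generated from a random-intercept logistic model, then fitted with an ordinary logistic regression that ignores the clustering. The point estimates of that fit are the ones a GEE with an independence working correlation would return (only the standard errors would differ), and the marginal slope comes out smaller than the subject-specific slope used to generate the data.

```r
set.seed(2008)

# Hypothetical simulation settings (not taken from the textbook).
n_subj  <- 2000   # number of subjects
n_rep   <- 5      # repeated binary measurements per subject
beta0   <- -0.5   # subject-specific intercept
beta1   <- 1.0    # subject-specific slope
sigma_b <- 2.0    # standard deviation of the random intercept

# One row per measurement; each subject shares one random intercept.
id <- rep(seq_len(n_subj), each = n_rep)
x  <- rnorm(n_subj * n_rep)
b  <- rnorm(n_subj, sd = sigma_b)[id]
y  <- rbinom(length(x), size = 1, prob = plogis(beta0 + beta1 * x + b))

# Logistic regression ignoring the clustering: same point estimates as a
# GEE with independence working correlation (naive standard errors, though).
fit_marg   <- glm(y ~ x, family = binomial)
marg_slope <- unname(coef(fit_marg)["x"])

# Well-known approximation for the attenuation of the marginal slope
# under a logistic model with a normal random intercept.
c2           <- (16 * sqrt(3) / (15 * pi))^2
approx_slope <- beta1 / sqrt(1 + c2 * sigma_b^2)

print(c(subject_specific = beta1,
        marginal_fit     = marg_slope,
        approximation    = approx_slope))
```

Fitting the same data with `glmer()` from `lme4` (with a random intercept per `id`) should instead recover a slope close to the subject-specific value, which is the usual way of seeing that the two families of models answer different questions.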

I'm also working on another handbook, about IRT analysis using mixed models with R. It can be found on the following page. Note that the same caveat applies to this work: it's just homework that I do when I have some time.