Clinically significant change

[1] RK Henson. Effect-size measures and meta-analytic thinking in counseling psychology research. The Counseling Psychologist, 34(5):601-629, 2006.
Effect sizes are critical to result interpretation and synthesis across studies. Although statistical significance testing has historically dominated the determination of result importance, modern views emphasize the role of effect sizes and confidence intervals. This article accessibly discusses how to calculate and interpret the effect sizes that counseling psychologists use most frequently. To provide context, the author presents a brief history of statistical significance tests. Second, the author discusses the difference between statistical, practical, and clinical significance. Third, the author reviews and graphically demonstrates two common types of effect sizes, commenting on multivariate and corrected effect sizes. Fourth, the author emphasizes meta-analytic thinking and the potential role of confidence intervals around effect sizes. Finally, the author gives a hypothetical example of how to report and potentially interpret some effect sizes.

[2] NS Jacobson, WC Follette, and D Revenstorf. Psychotherapy outcome research: Methods for reporting variability and evaluating clinical significance. Behavior Therapy, 15:336-352, 1984.
[3] GH Maassen. The unreliable change of reliable change indices. Behaviour Research and Therapy, 39(4):495-498, 2001.
The classic method for assessment of reliable change, in 1991 re-introduced as Jacobson's RC, can be characterized as a confidence interval method. In recent years, several RC indices have been proposed using Kelley's (1947) (Kelley, T. L. (1947). Fundamentals of statistics. Cambridge: Harvard University Press) formula for estimating true change. In these proposals, interval estimation and confidence intervals are mixed up, which leads to unjustified probability statements. When Kelley's estimate is correctly expanded into a normally distributed statistic, the classic approach reveals itself as a large sample approximation of a properly constructed RCI based on Kelley's formula. Researchers should continue using the classic approach for the determination of reliable change.

[4] WJ Hageman and WA Arrindell. A further refinement of the reliable change (RC) index by improving the pre-post difference score: introducing RCID. Behaviour Research and Therapy, 31(7):693-700, 1993.
[5] SV Eisen, G Ranganathan, P Seal, and A Spiro 3rd. Measuring clinically meaningful change following mental health treatment. Journal of Behavioral Health Services & Research, 34(3):272-289, 2007.
Assessment of clinically meaningful change is useful for treatment planning, monitoring progress, and evaluating treatment response. Outcome studies often assess statistically significant change, which may not be clinically meaningful. Study objectives are to: (1) evaluate responsiveness of the BASIS-24 using three methods for determining clinically meaningful change: reliable change index (RCI), effect size (ES), and standard error of measurement (SEM); and (2) determine which method provides an estimate of clinically meaningful change most concordant with other change measures. BASIS-24 assessments were obtained at two time points for 1,397 inpatients and 850 outpatients. The proportion showing clinically meaningful change using each method was compared to the proportion showing change in global mental health, retrospectively reported change, and clinician-assessed change. BASIS-24 demonstrated responsiveness at both aggregate and individual levels. Regarding clinically meaningful improvement and decline, SEM was most concordant with all three outcome measures; regarding no change, RCI was most concordant with all three measures.

[6] RJ Ferguson, AB Robinson, and M Splaine. Use of the reliable change index to evaluate clinical significance in SF-36 outcomes. Quality of Life Research, 11(6):509-516, 2002.
The SF-36 Health Survey is the most widely used self-report measure of functional health. It is commonly used in both randomized controlled trials (RCT) and non-controlled evaluation of medical or other health services. However, determining a clinically significant change in SF-36 outcomes from pre-to-post-intervention, in contrast to statistically significant differences, is often not a focus of medical outcomes research. We propose use of the Reliable Change Index (RCI) in combination with SF-36 norms as one method for researchers, provider groups, and health care policy makers to determine clinically significant healthcare outcomes when the SF-36 is used as a primary measure. The RCI is a statistic that determines the magnitude of change score necessary of a given self-report measure to be considered statistically reliable. The RCI has been used to determine clinically significant change in mental health and behavioral medicine outcomes research, but is not widely applied to medical outcomes research. A usable table of RCIs for the SF-36 has been calculated and is presented. Instruction and a case illustration of how to use the RCI table is also provided. Finally, limitations and cautionary guidelines on using SF-36 norms and the RCI to determine clinically significant outcome are discussed.
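
The abstract above describes the RCI as the magnitude of change a score must show to count as statistically reliable. A minimal sketch of that threshold in raw score units follows, assuming the commonly stated form (z times the standard error of the difference). The function name, the 1.96 criterion, and the example values are illustrative assumptions and are not taken from the article's table.

```python
import math


def minimal_reliable_change(sd, reliability, z=1.96):
    """Smallest raw-score change that exceeds measurement error.

    Assumes S_diff = sqrt(2) * SEM with SEM = sd * sqrt(1 - reliability).
    The z value of 1.96 (two-sided 95%) is an assumed default.
    """
    sem = sd * math.sqrt(1.0 - reliability)  # standard error of measurement
    s_diff = math.sqrt(2.0) * sem            # standard error of a difference score
    return z * s_diff


# Hypothetical scale: normative SD of 10 and reliability of 0.90.
threshold = minimal_reliable_change(sd=10.0, reliability=0.90)
print(round(threshold, 2))
```

A lower reliability or a larger normative SD raises the threshold, so the same raw change can be reliable on one scale and not on another.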

[7] BM Ogles, MJ Lambert, and KS Masters. Assessing outcome in clinical practice. Boston: Allyn and Bacon, 1996.
[8] PC Kendall and WM Grove. Normative comparisons in therapy outcome. Behavioral Assessment, 10:147-158, 1988.
Normative comparisons are a procedure for evaluating the clinical significance of therapeutic interventions. Although a step-by-step statistical methodology for conducting normative comparisons has been reported elsewhere (P. C. Kendall, A. Marrs-Garcia, S. R. Nath, & R. C. Sheldrick, 1999), questions regarding the collecting of normative data remain. For this study, all treatment outcome studies published in the Journal of Consulting and Clinical Psychology from 1988 to 1997 were examined and reviewed, and the 5 most commonly used outcome measures were identified. For these outcome measures, multiple sources of normative data were located. Although we identified a dearth of normative data on measures used for treatment outcome, results discussed here nevertheless provide information that may be of use to therapy outcome evaluators when conducting normative comparisons. In addition, equations to determine the minimum sample size needed in a normative sample for a given treatment outcome study are provided.

[9] Z Martinovich, S Saunders, and KI Howard. Some comments on "assessing clinical significance". Psychotherapy Research, 6:124-132, 1996.
The strategies for extending clinical significance (CS) methodology, suggested by Tingey et al., have considerable merit. They also serve to highlight the difficulties encountered with CS methodology in general. Problems encountered with the original methodology may be compounded, not solved, by such extensions. For example, problems around lack of agreement about the appropriateness of certain measures, and the questionable psychometric properties of measures, are likely to be exacerbated, not lessened, when attempting to measure social impact. Similarly, the proposal that multiple normative groups be identified to provide the impact factor does not resolve the original difficulty of identifying and discriminating more obviously diverse groups, such as functional and dysfunctional. Other problems with the proposed extensions, such as using criterion c (Jacobson and Truax, 1991) with non-normal distributions, are discussed. Some recommendations regarding these problems are made.

[10] PD Raymond, AD Hinton-Bayre, M Radel, MJ Ray, and NA Marsh. Assessment of statistical change criteria used to define significant change in neuropsychological test performance following cardiac surgery. European Journal of Cardio-Thoracic Surgery, 29:82-88, 2006.
Objective: This paper compares four techniques used to assess change in neuropsychological test scores before and after coronary artery bypass graft surgery (CABG), and includes a rationale for the classification of a patient as overall impaired. Methods: A total of 55 patients were tested before and after surgery on the MicroCog neuropsychological test battery. A matched control group underwent the same testing regime to generate test-retest reliabilities and practice effects. Two techniques designed to assess statistical change were used: the Reliable Change Index (RCI), modified for practice, and the Standardised Regression-based (SRB) technique. These were compared against two fixed cutoff techniques (standard deviation and 20% change methods). Results: The incidence of decline across test scores varied markedly depending on which technique was used to describe change. The SRB method identified more patients as declined on most measures. In comparison, the two fixed cutoff techniques displayed relatively reduced sensitivity in the detection of change. Conclusions: Overall change in an individual can be described provided the investigators choose a rational cutoff based on likely spread of scores due to chance. A cutoff value of 20% of test scores used provided acceptable probability based on the number of tests commonly encountered. Investigators must also choose a test battery that minimises shared variance among test scores.
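
The two statistical techniques named in the abstract above can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' procedure: the practice-adjusted RCI is assumed to subtract the control group's mean retest gain before dividing by the standard error of the difference, and the SRB score is assumed to be the standardised residual from a simple linear regression of control retest scores on control baseline scores. All function names and data values are hypothetical.

```python
import math


def practice_adjusted_rci(pre, post, practice_effect, sd_pre, reliability):
    """(observed change - mean control practice effect) / S_diff."""
    sem = sd_pre * math.sqrt(1.0 - reliability)  # standard error of measurement
    s_diff = math.sqrt(2.0 * sem ** 2)           # standard error of a difference
    return ((post - pre) - practice_effect) / s_diff


def fit_srb(control_pre, control_post):
    """Least-squares fit of control retest on baseline scores.

    Returns (intercept, slope, standard error of the estimate).
    """
    n = len(control_pre)
    mean_x = sum(control_pre) / n
    mean_y = sum(control_post) / n
    sxx = sum((x - mean_x) ** 2 for x in control_pre)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(control_pre, control_post))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(control_pre, control_post))
    see = math.sqrt(ss_res / (n - 2))  # residual SD with n - 2 degrees of freedom
    return intercept, slope, see


def srb_z(pre, post, intercept, slope, see):
    """Standardised gap between observed and predicted retest score."""
    predicted = intercept + slope * pre
    return (post - predicted) / see


# Hypothetical control data (baseline, retest) and one hypothetical patient.
control_pre = [10, 12, 14, 16, 18]
control_post = [12, 13, 17, 17, 21]
intercept, slope, see = fit_srb(control_pre, control_post)
patient_z = srb_z(pre=14, post=12, intercept=intercept, slope=slope, see=see)
print(round(patient_z, 2))
```

A negative score on either index indicates a retest result below what practice alone would predict. The cutoff for calling it a decline (for example -1.645 one-sided) is a separate choice left to the investigator.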

[11] WJ Hageman and WA Arrindell. Establishing clinically significant change: increment of precision and the distinction between individual and group level of analysis. Behaviour Research and Therapy, 37(12):1169-1193, 1999.
Some essential adaptations to the method for determining clinically significant change originally introduced by Jacobson, Follette and Revenstorf [Jacobson, N. S., Follette, W. C. & Revenstorf, D. (1984a). Psychotherapy outcome research: methods for reporting variability and evaluating clinical significance. Behavior Therapy, 15, 336-352.] are presented. One adaptation deals with the failure in the original method to distinguish between analysis at the individual versus analysis at the group level. A second adaptation entails the provision of a closer approximation of the underlying true scores. This refinement represents an enhancement in precision. Specific aspects of this refinement may be understood in terms of a correction for error-based regression to the mean. Taking into account these adaptations, new procedures are described for determining (clinically significant) change. Some guidelines for the publication of outcome findings are also presented.

[12] C Evans, F Margison, and M Barkham. The contribution of reliable and clinically significant change methods to evidence-based mental health. Evidence Based Mental Health, 1:70-72, 1998.
Where outcomes are unequivocal (life or death; being able to walk v being paralysed) clinicians, researchers, and patients find it easy to speak the same language in evaluating results. However, in much of mental health work initial states and outcomes of treatments are measured on continuous scales and the distribution of the "normal" often overlaps with the range of the "abnormal." In this situation, clinicians and researchers often talk different languages about change data, and both are probably poor at conveying their thoughts to patients.

Researchers traditionally compare means between groups. Their statistical methods, using distributions of the scores before and after treatment to suggest whether change is a sampling artefact or a chance finding, have been known for many years. By contrast, clinicians are more often concerned with changes in particular individuals they are treating and often dichotomise outcome as "success" or "failure." The number needed to treat (NNT)...

[13] NS Jacobson and P Truax. Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. Journal of Consulting and Clinical Psychology, 59(1):12-19, 1991.
In 1984, Jacobson, Follette, and Revenstorf defined clinically significant change as the extent to which therapy moves someone outside the range of the dysfunctional population or within the range of the functional population. In the present article, ways of operationalizing this definition are described, and examples are used to show how clients can be categorized on the basis of this definition. A reliable change index (RC) is also proposed to determine whether the magnitude of change for a given client is statistically reliable. The inclusion of the RC leads to a twofold criterion for clinically significant change.
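
The twofold criterion described above can be sketched in a few lines. This is a minimal illustration using the formulas as they are commonly stated for this approach, not a transcription of the article: RC is the change score divided by the standard error of the difference, and the cutoff (often labelled criterion c) lies between the dysfunctional and functional means, weighted by the standard deviations. The function names, the category labels, the 1.96 threshold, and all numeric values are assumptions chosen for illustration.

```python
import math


def reliable_change(pre, post, sd_pre, reliability):
    """RC = (post - pre) / S_diff, with S_diff = sqrt(2 * SE**2)."""
    se = sd_pre * math.sqrt(1.0 - reliability)  # standard error of measurement
    s_diff = math.sqrt(2.0 * se ** 2)           # standard error of a difference
    return (post - pre) / s_diff


def cutoff_c(mean_dys, sd_dys, mean_fun, sd_fun):
    """Cutoff between populations; each mean is weighted by the other group's SD."""
    return (sd_fun * mean_dys + sd_dys * mean_fun) / (sd_fun + sd_dys)


def classify(pre, post, sd_pre, reliability, cutoff,
             lower_is_better=True, z=1.96):
    """Combine reliable change with crossing the cutoff (labels are assumed)."""
    rc = reliable_change(pre, post, sd_pre, reliability)
    if abs(rc) <= z:
        return "unchanged"          # change is within measurement error
    improved = rc < 0 if lower_is_better else rc > 0
    if not improved:
        return "deteriorated"       # reliable change in the wrong direction
    crossed = post < cutoff if lower_is_better else post > cutoff
    return "recovered" if crossed else "improved"


# Hypothetical symptom scale where lower scores are better.
c = cutoff_c(mean_dys=40.0, sd_dys=8.0, mean_fun=20.0, sd_fun=4.0)
print(round(c, 2))
print(classify(pre=45.0, post=24.0, sd_pre=8.0, reliability=0.80, cutoff=c))
```

Keeping the two tests separate makes it possible to report clients who changed reliably without reaching the functional range, which a single pass/fail rule would hide.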

