Correcting a Significance Test for Clustering (WP-06-11)
Larry V. Hedges
A common mistake in the analysis of cluster-randomized trials is to ignore the effect of clustering and analyze the data as if each treatment group were a simple random sample. This typically overstates the precision of the results and leads to anticonservative conclusions about the statistical significance of treatment effects. This working paper gives a simple correction to the t-statistic that would be computed if clustering were (incorrectly) ignored. The correction is a multiplicative factor depending on the total sample size, the cluster size, and the intraclass correlation ρ. The corrected t-statistic has a Student’s t-distribution with reduced degrees of freedom. When ρ = 0, the corrected statistic reduces to the t-statistic computed by ignoring clustering; when ρ = 1, it reduces to the t-statistic computed using cluster means. If 0 < ρ < 1, it lies between these two, and the degrees of freedom are between those corresponding to these two extremes.
Larry V. Hedges, Board of Trustees Professor of Statistics and Social Policy, Northwestern University
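The correction described in the abstract can be sketched in a few lines of Python. The abstract does not state the formulas, so the expressions below are an assumption: they follow the equal-cluster-size form as recalled from the published version of this paper and should be checked against the full text before use. The function name `clustering_correction` and its parameter names are hypothetical, and `total_n` is taken to be the combined sample size of both treatment groups. The sketch does reproduce the limiting cases stated above: no change when the intraclass correlation is 0, and the cluster-means degrees of freedom when it is 1.

```python
import math


def clustering_correction(t_naive, total_n, cluster_size, icc):
    """Adjust a t-statistic that was computed ignoring clustering.

    ASSUMPTION: equal cluster sizes, and formulas recalled from the
    published version of the paper (not given in the abstract):

        c  = sqrt( [(N-2) - 2(n-1)rho] / [(N-2)(1 + (n-1)rho)] )
        df = [(N-2) - 2(n-1)rho]^2
             / [(N-2)(1-rho)^2 + n(N-2n)rho^2 + 2(N-2n)rho(1-rho)]

    where N = total_n, n = cluster_size, rho = icc.
    Returns (corrected t-statistic, reduced degrees of freedom).
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    if total_n <= 2 * cluster_size:
        raise ValueError("need more than two clusters in total")

    N, n, rho = float(total_n), float(cluster_size), float(icc)

    # Term shared by the multiplicative factor and the degrees of freedom.
    shrink = (N - 2) - 2 * (n - 1) * rho

    # Multiplicative factor applied to the naive t-statistic.
    factor = math.sqrt(shrink / ((N - 2) * (1 + (n - 1) * rho)))

    # Reduced (generally non-integer) degrees of freedom.
    df = shrink ** 2 / (
        (N - 2) * (1 - rho) ** 2
        + n * (N - 2 * n) * rho ** 2
        + 2 * (N - 2 * n) * rho * (1 - rho)
    )
    return factor * t_naive, df


if __name__ == "__main__":
    # Illustrative inputs only: 100 individuals in clusters of 10.
    for rho in (0.0, 0.2, 1.0):
        t_adj, df = clustering_correction(2.5, 100, 10, rho)
        print(f"icc={rho:.1f}  corrected t={t_adj:.3f}  df={df:.2f}")
```

Because the degrees of freedom are generally fractional, a p-value would be taken from a t-distribution that accepts non-integer degrees of freedom (for example `scipy.stats.t.sf`), rather than from a printed table.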