Correcting a Significance Test for Clustering (WP-06-11)
Larry V. Hedges
A common mistake in the analysis of cluster-randomized trials is to ignore the effect of clustering and analyze the data as if each treatment group were a simple random sample. This typically overstates the precision of the results and leads to anticonservative conclusions about the statistical significance of treatment effects. This working paper gives a simple correction to the t-statistic that would be computed if clustering were (incorrectly) ignored. The correction is a multiplicative factor that depends on the total sample size, the cluster size, and the intraclass correlation ρ. The corrected t-statistic has a Student’s t-distribution with reduced degrees of freedom. When ρ = 0, the corrected statistic reduces to the t-statistic computed by ignoring clustering; when ρ = 1, it reduces to the t-statistic computed using cluster means. If 0 < ρ < 1, it lies between these two, and its degrees of freedom lie between those corresponding to the two extremes.
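The abstract describes the correction only in words. The Python sketch below shows how such a correction might be applied in practice. The formulas for the multiplicative factor and the reduced degrees of freedom are not stated in the abstract; they are an assumption here, written in the form commonly cited for this correction under a balanced design (equal cluster size n, the same number of clusters in each arm, total sample size N). The function names `clustering_correction` and `corrected_t` and the argument `rho` (the intraclass correlation) are hypothetical names chosen for illustration.

```python
import math


def clustering_correction(N, n, rho):
    """Return (c, df): the multiplicative factor and reduced degrees of freedom.

    ASSUMPTION: balanced design with total sample size N, common cluster
    size n, and intraclass correlation rho. The formulas are the commonly
    cited form of the correction, not quoted from the abstract.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    if N <= 2 * n:
        raise ValueError("need more than one cluster per treatment group")

    # Term shared by the factor and the degrees of freedom.
    num = (N - 2) - 2 * (n - 1) * rho

    # Factor applied to the t-statistic computed while ignoring clustering.
    c = math.sqrt(num / ((N - 2) * (1 + (n - 1) * rho)))

    # Reduced degrees of freedom for the corrected statistic.
    df = num ** 2 / (
        (N - 2) * (1 - rho) ** 2
        + n * (N - 2 * n) * rho ** 2
        + 2 * (N - 2 * n) * rho * (1 - rho)
    )
    return c, df


def corrected_t(t_naive, N, n, rho):
    """Scale a naive t-statistic and return it with its degrees of freedom."""
    c, df = clustering_correction(N, n, rho)
    return c * t_naive, df
```

Under these assumed formulas, the factor is 1 and the degrees of freedom are N - 2 when rho = 0, and the degrees of freedom fall to the number of clusters minus 2 when rho = 1, which agrees with the two limiting cases described above. A call such as `corrected_t(t_naive, N=200, n=20, rho=0.1)` returns the adjusted statistic together with the degrees of freedom to use when looking up its p-value.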