Correcting a Significance Test for Clustering in Designs With Two Levels of Nesting (WP-07-14)
Larry V. Hedges
A common mistake in the analysis of cluster randomized experiments is to ignore the effect of clustering and analyze the data as if each treatment group were a simple random sample. This typically overstates the precision of results and leads to anti-conservative conclusions about the statistical significance of treatment effects. This paper gives a simple correction to the t-statistic that would be computed if clustering were (incorrectly) ignored in an experiment with two levels of nesting (e.g., classrooms and schools). The correction is a multiplicative factor depending on the number of clusters and subclusters, the subcluster sample size, the subcluster size, and the cluster and subcluster intraclass correlations, ρS and ρC, respectively. The corrected t-statistic has Student’s t-distribution with reduced degrees of freedom. The corrected statistic reduces to the t-statistic computed by ignoring clustering when ρS = ρC = 0, and to the t-statistic computed using cluster means when ρS = 1. If ρS and ρC are between 0 and 1, the adjusted t-statistic lies between these two, and its degrees of freedom lie between those corresponding to the two extremes.
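To give a rough sense of why ignoring clustering overstates significance, the sketch below computes the standard variance inflation (design effect) for a balanced design with two levels of nesting and uses it to deflate a naive t-statistic. It is not the paper's correction factor, which also adjusts for the number of clusters and is paired with reduced degrees of freedom; it only approximates the large-sample behavior. The function names and the parameters `n` (students per classroom) and `p` (classrooms per school) are hypothetical, and the sketch assumes ρS is the school-level (cluster) and ρC the classroom-level (subcluster) intraclass correlation.

```python
import math


def design_effect(n, p, rho_s, rho_c):
    """Variance inflation of a group mean in a balanced design with two
    levels of nesting: n units per subcluster, p subclusters per cluster.

    Assumes rho_s and rho_c are the cluster- and subcluster-level shares
    of the total variance (hypothetical parameter names).
    """
    if rho_s < 0 or rho_c < 0 or rho_s + rho_c > 1:
        raise ValueError("need rho_s, rho_c >= 0 and rho_s + rho_c <= 1")
    return 1.0 + (p * n - 1) * rho_s + (n - 1) * rho_c


def deflate_naive_t(t_naive, n, p, rho_s, rho_c):
    """Large-sample approximation only: shrink a t-statistic computed while
    ignoring clustering by the square root of the design effect.
    This is NOT the paper's exact factor and ignores the df adjustment."""
    return t_naive / math.sqrt(design_effect(n, p, rho_s, rho_c))


if __name__ == "__main__":
    # Illustrative values, not taken from the paper.
    print(design_effect(n=20, p=3, rho_s=0.10, rho_c=0.05))
    print(deflate_naive_t(2.5, n=20, p=3, rho_s=0.10, rho_c=0.05))
```

With these illustrative values the design effect is about 7.85, so a naive t-statistic is shrunk by a factor of roughly 2.8. Setting both correlations to zero leaves the naive statistic unchanged, matching the first limiting case described above.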