Design, Implementation, and Analysis of Within-Study Comparisons

The 2014 Summer Research Training Institute is sponsored by the Institute of Education Sciences, U.S. Department of Education, and the Institute for Policy Research at Northwestern University.

Workshop Dates
August 11–22, 2014

Workshop Location
Northwestern University, Evanston, IL 

Why Within-Study Comparison Designs?

The randomized experiment is the gold standard for answering causal questions, but randomized experiments are not always feasible. Can observational methods reproduce the treatment effect estimates that a randomized experiment on the same target population would have produced? The within-study comparison (WSC) design is a methodology for assessing how well observational methods perform in “real world” contexts.
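The core comparison in a WSC can be illustrated with a minimal sketch. The code below is only an illustration under simplifying assumptions: the function names (diff_in_means, wsc_comparison) and the outcome values are hypothetical, and a simple difference in means stands in for whatever observational method a real WSC would evaluate.

```python
from statistics import mean


def diff_in_means(treated, comparison):
    """Unadjusted effect estimate: mean treated outcome minus mean comparison outcome."""
    return mean(treated) - mean(comparison)


def wsc_comparison(treated, randomized_control, nonrandom_comparison):
    """Contrast an observational estimate with its experimental benchmark.

    The same treated group is compared once against a randomized control
    group (the benchmark) and once against a non-randomized comparison
    group (the observational estimate). Their difference is the estimated
    bias of the observational approach.
    """
    benchmark = diff_in_means(treated, randomized_control)
    observational = diff_in_means(treated, nonrandom_comparison)
    return benchmark, observational, observational - benchmark


# Hypothetical outcome data, invented purely for illustration.
treated = [12.0, 14.0, 13.0, 15.0]
randomized_control = [10.0, 11.0, 12.0, 11.0]
nonrandom_comparison = [8.0, 9.0, 10.0, 9.0]

benchmark, observational, bias = wsc_comparison(
    treated, randomized_control, nonrandom_comparison
)
print(f"benchmark={benchmark}, observational={observational}, bias={bias}")
```

In this toy example the non-randomized comparison group has lower outcomes than the randomized controls, so the unadjusted observational estimate overstates the benchmark. An actual WSC would ask whether an adjustment method such as matching or regression removes that gap.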

Early WSCs

Introduced by Robert LaLonde (1986), the earliest WSCs used data from job-training evaluations and compared results from an observational study with those from an experimental benchmark. The goal was to determine whether observational methods such as propensity score matching, regression, or instrumental variable approaches could replicate results from an experiment. The early conclusion from these studies was that observational methods failed to produce results comparable to the experimental benchmark.

However, early WSCs were flawed in several ways. The randomized and non-randomized groups differed in more than mode of assignment. Comparison units were drawn from remote locations, were measured at different times, and in some cases did not share the same outcome measures as treatment cases. These differences between the studies may have confounded the initial conclusion that observational methods could not replicate experimental benchmarks. Later WSCs attempted to address these concerns by using internal comparisons, in which randomized and non-randomized respondents were drawn from similar populations in the same geographic location and were measured with the same measures at the same time.

Successful Observational Approaches

More within-study comparisons followed, many in the fields of education, medicine, and early childhood development. Cook, Shadish, and Wong (2008) reviewed these studies and argued that observational approaches could replicate experimental results under at least three conditions:

  • When assignment was based on a covariate score and cutoff, as in regression-discontinuity designs
  • When treatment and comparison units were from the same geographic location and rich, theoretically relevant covariates were available 
  • When multiple pretests were available 

Over the last decade, the work on WSCs has begun to provide researchers with important clues for designing empirically validated quasi-experiments when random assignment is not feasible. Still more WSCs are needed to answer the many remaining questions about the contexts and conditions in which observational approaches can produce unbiased results.