2014 IES/NCER Summer Research Training Institute: Design, Implementation, and Analysis of Within-Study Comparisons

Sponsored by the Institute of Education Sciences, U.S. Department of Education, and the Institute for Policy Research at Northwestern University

August 11 – 22, 2014
Northwestern University
Evanston, IL

Why Within-Study Comparison Designs?

The randomized experiment has long been established as the gold standard for answering causal questions. However, randomized experiments are not always feasible, so observational methods are also needed for answering causal questions. The issue then is whether observational methods can generate the same treatment effects that would have been obtained from a randomized experiment on the same target population.

Fortunately, we have a methodology for assessing the performance of observational study methods in “real world” contexts: the within-study comparison (WSC) design. Introduced by LaLonde (1986), the earliest WSCs used data from job training evaluations and compared results from an observational study with those from an experimental benchmark. The goal was to determine whether observational methods such as propensity score matching, regression, or instrumental variable approaches could replicate results from an experiment. The early conclusion from these studies was that observational methods failed to produce results comparable to their experimental benchmarks.
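The core WSC logic — estimate the same effect two ways and compare the observational estimate to the randomized benchmark — can be illustrated with a minimal simulation sketch. Everything here (the covariate, the selection model, the effect size) is illustrative and not drawn from any particular study:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
tau = 2.0  # true treatment effect (known here because we simulate)

x = rng.normal(size=n)           # pretest covariate that drives selection
y0 = x + rng.normal(size=n)      # potential outcome without treatment
y1 = y0 + tau                    # potential outcome with treatment

# Experimental benchmark: random assignment, simple difference in means
z_rct = rng.random(n) < 0.5
y_rct = np.where(z_rct, y1, y0)
rct_est = y_rct[z_rct].mean() - y_rct[~z_rct].mean()

# Observational study: units with higher x self-select into treatment
z_obs = rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * x))
y_obs = np.where(z_obs, y1, y0)
naive_est = y_obs[z_obs].mean() - y_obs[~z_obs].mean()  # biased by selection

# Covariate adjustment: OLS of outcome on treatment and x removes the
# bias in this simulation, because x fully captures selection
X = np.column_stack([np.ones(n), z_obs.astype(float), x])
beta = np.linalg.lstsq(X, y_obs, rcond=None)[0]
adj_est = beta[1]

print(f"true effect:          {tau:.2f}")
print(f"RCT benchmark:        {rct_est:.2f}")
print(f"naive observational:  {naive_est:.2f}")
print(f"adjusted estimate:    {adj_est:.2f}")
```

In this toy setup the adjusted estimate recovers the benchmark because selection depends only on the measured covariate; the substantive question a real WSC asks is whether that holds in field data, where selection may depend on covariates the researcher never observes.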

However, early WSCs were flawed in several ways. The randomized and non-randomized groups differed in more ways than mode of assignment: comparison units were drawn from remote locations, measured at different times, and in some cases did not share the same outcome measures as treatment cases. These study differences may have confounded the initial conclusion that observational methods could not replicate experimental benchmarks. Later WSCs addressed these concerns by using internal comparisons, in which randomized and non-randomized respondents were drawn from similar populations in the same geographic location and were measured with the same instruments at the same time.

More within-study comparisons were conducted, many in the fields of education, medicine, and early childhood development. Cook, Shadish, and Wong (2008) reviewed these studies and argued that observational approaches could replicate experimental results in at least three instances: when assignment was based on a covariate score and cutoff, as in regression discontinuity; when treatment and comparison units came from the same geographic location and rich, theoretically relevant covariates were available; and when multiple pretests were available.

Over the last decade, the work on WSCs has begun to provide researchers with important clues for designing empirically validated quasi-experiments when random assignment is not feasible. Still, many questions remain about the contexts and conditions in which observational approaches can produce unbiased results, so more WSCs are needed.

Workshop Goals and Overview:

The goals of the WSC workshop are to provide researchers with state-of-the-art tools for designing, implementing, and analyzing their own within-study comparisons, to help researchers plan higher quality (and more internally valid) WSCs, and to provide general education and background knowledge on WSCs for researchers interested in methodology.

Workshop materials will be presented through a series of lectures and discussions of past within-study comparisons (see attached reference list for some of the WSCs that will be discussed). In addition, attendees will work in small groups (of 2 – 5 people) to design their own WSCs on topics that are of mutual substantive and methodological interest. At the end of the workshop, groups will present their WSC protocols for feedback and comments. The hope is that some participants will go home and implement the WSCs that they designed at the workshop.

Target Audience:

The workshop is for researchers who 1) have interests in learning more about WSCs; 2) have access to randomized control trial data (though we will have limited capacity to provide such data); and/or 3) intend to design, implement, and analyze a WSC study prospectively.

Minimum Prior Knowledge:

We will assume that participants have upper-level knowledge of statistics or econometrics (e.g., multiple regression, including OLS, logit, and probit) and experience with causal research designs such as regression-discontinuity, interrupted time series, and non-equivalent comparison group matching designs.

This course is suitable for most empirical researchers with PhD level training or advanced doctoral students.

Workshop Leaders:

Thomas D. Cook (IPR/Northwestern University) Tom Cook is a professor of sociology, psychology, education and social policy. He is the Joan and Sarepta Harrison Chair in Ethics and Justice and an Institute for Policy Research faculty fellow at Northwestern University. He is best known for his work on the theory and practice of the design and analysis of various forms of quasi-experiment. He has published extensively on threats to validity, particularly on enumerating threats to internal and external validity, on regression discontinuity studies, on interrupted time series work, and on various forms of individual and group-level matching. He has authored or co-authored ten books and about one hundred articles on these topics, including Cook & Campbell, Quasi-Experimentation: Design and Analysis Issues for Field Settings (1979) and Shadish, Cook & Campbell, Experimental and Quasi-Experimental Designs for Generalized Causal Inference (2002).

William Shadish (University of California, Merced) is Professor and Founding Faculty, University of California, Merced, where he is also Chair of Psychological Sciences. He received his bachelor’s degree in sociology from Santa Clara University in 1972, and his M.S. (1975) and Ph.D. (1978) degrees from Purdue University in clinical psychology, with minor areas in statistics and measurement. He completed a postdoctoral fellowship in methodology and program evaluation at Northwestern University from 1978-1981. His current research interests include experimental and quasi-experimental design, the empirical study of methodological issues, and the methodology and practice of meta-analysis. He is author (with T.D. Cook & D.T. Campbell, 2002) of Experimental and Quasi-Experimental Designs for Generalized Causal Inference, (with T.D. Cook & L.C. Leviton, 1991) of Foundations of Program Evaluation, (with L. Robinson & C. Lu, 1997) of ES: A Computer Program and Manual for Effect Size Calculation, co-editor of five other volumes, and the author of over 140 articles and chapters. He was the founding Secretary-Treasurer of the Society for Research Synthesis Methodology (2005-2010). He was the 1997 President of the American Evaluation Association, winner of the 1994 Paul F. Lazarsfeld Award for Evaluation Theory from the American Evaluation Association, the 2000 Robert Ingle Award for service to the American Evaluation Association, the 1994 and 1996 Outstanding Research Publication Awards from the American Association for Marriage and Family Therapy, the 2002 Donald T. Campbell Award for Innovations in Methodology from the Policy Studies Organization, the 2009 Frederick Mosteller Award for Lifetime Contributions to Systematic Reviews from the Campbell Collaboration, and the 2011 Ingram Olkin Award for Lifetime Contributions to Systematic Reviews from the Society for Research Synthesis Methodology.
He is a Fellow of the American Psychological Association, Associate Editor of American Psychologist, past Associate Editor of Multivariate Behavioral Research, and past editor of New Directions for Evaluation.

Peter Steiner (University of Wisconsin-Madison) is an assistant professor in the Department of Educational Psychology at the University of Wisconsin-Madison. His research focuses on methodological questions about causal inference with experimental and quasi-experimental designs, including propensity score matching designs, regression-discontinuity designs, and interrupted-time-series designs.

Vivian Wong (University of Virginia) is an assistant professor of Research, Statistics, and Evaluation in the Curry School of Education at the University of Virginia. Her interests are in research designs for causal inference, regression-discontinuity evaluations, and the design of within-study comparisons.

Application Process:

Applications are no longer being accepted.

Admission to the workshop occurs through a competitive application procedure; registration is limited to 25 participants.


All applications must be received no later than ?, 2014, at 5:00 p.m. CST.

Applicants will be notified no later than ?, 2014, via e-mail. Selected applicants will receive a full workshop schedule and information about travel and lodging.

For questions about the content of the workshop, please email Thomas D. Cook at t-cook@northwestern.edu or any of the other workshop leaders. 

Workshop Logistics:

The workshop will be held from August 11th to 22nd, 2014, at Northwestern University in beautiful Evanston, IL. Successful applicants will receive financial support for costs related to instruction and meals (while in Evanston, IL). Depending on need, scholarships will be available to some participants to cover hotel and travel costs, and to others to cover hotel costs only.

Hotel rooms have been reserved at the Hilton Garden Inn in Evanston. The hotel is a 15-minute walk from the Norris University Center, where the workshops will be held, and offers a free shuttle service for guests to travel in and around Evanston. You will find more information about the hotel on its website. Please confirm your reservation with the hotel at least one month before the workshop. The negotiated room rate is $129/night, plus tax. You will be responsible for any charges incurred due to late cancellation.

For logistical questions regarding the workshop, please email Valerie Lyne at: v-lyne@northwestern.edu.

Workshop Schedule with Presentations:

Day 1: Introduction: Theory of Within-Study Comparison Design (pdf)

  • An empirical research program on quasi-experimentation
  • History of WSCs
  • Examples of different WSC designs
  • Lessons learned from past WSCs
  • Criteria for a causally interpretable WSC design

Day 2: Non-equivalent Control Group Designs (PS Matching) (pdf)

  • Review of matching designs (propensity score matching, stratification, and weighting)
  • Design and analytic assumptions for WSCs of matching designs
  • Design options for WSCs of matching designs
  • Demonstration of a matching WSC: Design and analysis

Day 3: Non-equivalent Control Group Designs (Selected Topics & Discussion) (pdf)

  • Role of covariates in matching
  • Matching with multilevel data

Day 4: Regression Discontinuity Designs (RDD) & Interrupted Time Series Designs (ITS) (pdf)

  • Review of RDD
    • Design and analytic assumptions for WSCs of RDDs
    • Demonstration of an RDD WSC: Design and analysis
    • Examples
  • Review of ITS designs
    • Design and analytic assumptions for WSCs of ITS designs
    • Demonstration of an ITS WSC: Design and analysis
    • Examples
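As background for the Day 4 material, the basic sharp-RDD estimate can be sketched in a few lines of simulated data. This is an illustrative toy, not the demonstration used in the workshop; the cutoff, bandwidth, and data-generating model below are all made up:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
tau = 1.5     # true effect at the cutoff
cut = 0.0     # assignment cutoff on the score

s = rng.normal(size=n)                # assignment score (e.g., a pretest)
z = (s >= cut).astype(float)          # sharp RDD: treatment fully determined by cutoff
y = 0.8 * s + tau * z + rng.normal(scale=0.5, size=n)

# Local linear regression within a bandwidth around the cutoff,
# allowing a separate slope on each side
h = 0.5
w = np.abs(s - cut) < h
X = np.column_stack([
    np.ones(w.sum()),       # intercept
    z[w],                   # treatment indicator: its coefficient is the jump
    s[w] - cut,             # centered score
    z[w] * (s[w] - cut),    # slope change above the cutoff
])
beta = np.linalg.lstsq(X, y[w], rcond=None)[0]
rdd_est = beta[1]           # estimated discontinuity at the cutoff

print(f"true effect at cutoff: {tau:.2f}")
print(f"RDD estimate:          {rdd_est:.2f}")
```

A WSC of an RDD compares this discontinuity estimate to an experimental benchmark estimated at (or near) the same cutoff, since the RDD identifies the effect only for units at the cutoff.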

Day 5: Conclusion and Future Work

  • Presentation of group projects
  • Brainstorming session: Discussion of what future WSCs should look like, which substantive areas should be the focus, and which methods should be tested