Recent Research: Quantitative Methods for Policy Research
- Improving the Design and Quality of Experiments
- Developing New Methods for Research in Education
- Data Use, Quality, and Cost in Policy Research
- Framing Methods and Pretreatment Effects
- Interdisciplinary Methodological Innovation
IES-Sponsored Research Training
The Institute of Education Sciences (IES), the research wing of the U.S. Department of Education, and its National Center for Education Research supported two workshops at IPR in summer 2011 to improve methodological practices in education research, each co-organized by an IPR fellow.
Complementing the current interest in randomized experiments in education, the Workshop on Quasi-Experimental Design and Analysis in Education seeks to improve the quality of quasi-experiments in education, which recent analyses have shown to be generally below the state of the art. Two weeklong sessions were held between August 8 and 19, led by IPR social psychologist Thomas D. Cook and William Shadish of the University of California, Merced. More than 120 participating researchers learned three distinct quasi-experimental techniques (regression discontinuity, interrupted time series, and nonequivalent group design using propensity scores and matching) and the advantages and disadvantages of each.
The fifth summer institute on cluster-randomized trials (CRTs) in education was led by IPR statistician and education researcher Larry Hedges and professors Mark Lipsey and David Cordray of Vanderbilt University from June 19 to 30 at Northwestern. Thirty researchers from around the country participated in the two-week training, which focuses on the use of cluster randomization, a methodological tool that helps account for the group effects of teachers and classrooms when measuring an intervention’s effects on individual student achievement. Sessions cover a range of specific topics in the design, implementation, and analysis of education CRTs, from conceptual and operational models to sample size and statistical power. Participants also learn to use software such as Stata and HLM to conduct hierarchical data modeling and work in groups to create mock funding applications for an education experiment.
The effect of unreliability of measurement on propensity score adjusted treatment effects is a topic that has largely been unexamined. A study published in the Journal of Educational and Behavioral Statistics by IPR social psychologist Thomas D. Cook and his colleagues Peter Steiner of the University of Wisconsin–Madison and William Shadish of the University of California, Merced presents results from their work simulating different degrees of unreliability in the multiple covariates that were used to estimate a propensity score. The simulation uses the same data as two prior studies where the researchers showed that a propensity score formed from many covariates demonstrably reduced selection bias. They also identified the subsets of covariates from the larger set that were most effective for bias reduction. Adding different degrees of random error to these covariates in a simulation, the researchers demonstrate that unreliability of measurement can degrade the ability of propensity scores to reduce bias. Specifically, increases in reliability only promote bias reduction if the covariates are effective in reducing bias to begin with. They found that increasing or decreasing the reliability of covariates that do not effectively reduce selection bias makes no difference at all.
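The mechanism described above can be illustrated with a small simulation. This is only a minimal sketch, not the authors' procedure: it swaps in ordinary regression adjustment on a single confounder as a simplified stand-in for propensity score adjustment, and the function name, sample size, and effect sizes are all hypothetical choices made for illustration. Reliability is defined here as the share of the observed covariate's variance that comes from the true covariate.

```python
import numpy as np

def adjusted_effect_bias(reliability, n=50000, true_effect=1.0, seed=0):
    """Bias of a covariate-adjusted treatment effect when the covariate is noisy.

    Illustrative assumption: one confounder drives both selection and outcome.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                               # true confounder, variance 1
    treat = (x + rng.normal(size=n) > 0).astype(float)   # selection depends on x
    y = true_effect * treat + x + rng.normal(size=n)     # outcome depends on x too

    # Add random error so that var(x) / var(observed) equals the reliability.
    error_sd = np.sqrt((1.0 - reliability) / reliability)
    observed = x + error_sd * rng.normal(size=n)

    # Regress the outcome on treatment and the (possibly noisy) covariate.
    design = np.column_stack([np.ones(n), treat, observed])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1] - true_effect

for r in (1.0, 0.8, 0.5):
    print(f"reliability={r:.1f}  bias={adjusted_effect_bias(r):+.3f}")
```

In this toy setup the bias is close to zero when the confounder is measured perfectly and grows as reliability falls, because the noisy covariate removes less of the selection bias.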
Random and Cutoff-Based Assignment
Cook and former IPR postdoctoral fellow Vivian Wong co-authored a study in Psychological Methods reviewing past studies comparing randomized experiments to regression discontinuity designs, which mostly found similar results—but with some significant exceptions. The authors argue that these exceptions might be due to potential confounds of study characteristics with assignment method or with failure to estimate the same parameter over methods. In their study, they correct the problems by randomly assigning 588 participants to be in a randomized experiment or a regression discontinuity design in which they are otherwise treated identically, comparing results estimating both the same and different parameters. Analysis includes parametric, semiparametric, and nonparametric methods of modeling nonlinearities. Results suggest that estimates from regression discontinuity designs approximate the results of randomized experiments reasonably well but also raise the issue of what constitutes agreement between the two estimates.
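To make the comparison concrete, the following sketch simulates one cutoff-based assignment and one random assignment with the same constant treatment effect, then estimates each. It is an assumption-laden illustration rather than the study's analysis: the data-generating model is linear, the bandwidth and sample size are arbitrary, and `local_linear_rd` is a hypothetical helper, not a function from any library.

```python
import numpy as np

def local_linear_rd(score, outcome, cutoff=0.0, bandwidth=0.5):
    """Estimate the jump at the cutoff from separate linear fits on each side."""
    def fit_at_cutoff(mask):
        # np.polyfit returns coefficients from highest degree to lowest.
        slope, intercept = np.polyfit(score[mask] - cutoff, outcome[mask], 1)
        return intercept  # fitted value exactly at the cutoff
    below = (score < cutoff) & (score >= cutoff - bandwidth)
    above = (score >= cutoff) & (score <= cutoff + bandwidth)
    return fit_at_cutoff(above) - fit_at_cutoff(below)

rng = np.random.default_rng(1)
n, true_effect = 20000, 2.0          # hypothetical values for illustration
score = rng.normal(size=n)

# Cutoff-based assignment: treated if and only if the score is at or above 0.
rd_treat = (score >= 0).astype(float)
rd_outcome = 1.5 * score + true_effect * rd_treat + rng.normal(size=n)
rd_estimate = local_linear_rd(score, rd_outcome)

# Random assignment: a coin flip decides treatment, independent of the score.
rct_treat = rng.integers(0, 2, size=n).astype(float)
rct_outcome = 1.5 * score + true_effect * rct_treat + rng.normal(size=n)
rct_estimate = (rct_outcome[rct_treat == 1].mean()
                - rct_outcome[rct_treat == 0].mean())

print(f"RD estimate at cutoff: {rd_estimate:.2f}")
print(f"RCT difference in means: {rct_estimate:.2f}")
```

Because the simulated effect is the same for everyone, the regression discontinuity estimate (the effect at the cutoff) and the experimental estimate (the average effect) target the same number here. With effects that vary along the score, the two designs would estimate different parameters, which is one of the confounds the authors discuss.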
Accounting for Missing Survey Data
Missing data are prevalent in social science and health studies, both in the form of attrition, in which respondents “drop out” of the data set after a certain point, and in nonmonotone patterns of intermittently missing values. Yet even within these patterns, not all missing data can be treated equally; certain trends in missing data might indicate wider trends that should be taken into account when forming conclusions about the data set as a whole. In an article published in Biometrics, marketing professor and IPR associate Yi Qian, with Hua Yun Chen and Hui Xie of the University of Illinois at Chicago, investigates the use of a generalized additive missing data model that, contrary to the existing literature, does not assume a restricted linear relationship between missing data and the potentially missing outcome. Using a bone fracture data set, they conduct an extensive simulation study. Their simulation shows that the proposed method helps reduce bias that might arise from the misspecification of the functional forms of predictors in the missing data model.
Improving Generalizability in Research
If an education intervention proves to be successful in the study sample, will it actually work in schools outside of the study too? The results of a well-designed experiment can also apply to the relevant population. With the support of the National Science Foundation and the Institute of Education Sciences, Hedges is investigating new methods to improve the generalizability of findings from education research so that results from one study can be used to make statistical claims in another population or place. His work builds on propensity score methods and a database of national covariates to create a statistical approach that uses study samples to estimate parameters of the distribution of treatment effects in an inference population. Hedges will conduct training workshops on using new methods at four national education conferences for other researchers.
In a project supported by IES, Hedges continues his development of improved statistical methods for analyzing and reporting the results of multilevel experiments in education. Many education evaluations employ complex, multilevel designs to account for the effects of clustering—or the fact that students are situated within certain classrooms in certain schools. The project also looks at how to represent and combine the results of several experiments, which can sometimes yield multiple measures of the same outcome construct. The results will improve the precision of estimates and suggest new ways to use the results of randomized experiments in education.
IES is also sponsoring a project co-led by Hedges that seeks to design new parameters for educational experiments at the state, local, school, and classroom levels. Many current educational experiments use designs that involve the random assignment of entire pre-existing groups (e.g., classrooms and schools) to treatments, but these groups are not themselves composed at random. As a result, individuals in the same group tend to be more alike than individuals in different groups, so results obtained from single district data might be too imprecise to provide useful guidance. This project will decompose the total variation of state achievement test scores to estimate experiment design parameters for students in particular grades in each state. The new parameters will take into account achievement status and year-to-year improvement for a particular grade, as well as demographic covariates. Designs will also differ across different school contexts with a focus on low-performing schools, schools serving low-income populations, or schools with large minority populations.
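The reason within-group similarity matters for design can be shown with the standard design effect for equal-sized clusters, 1 + (m - 1) × ICC, where m is the cluster size and ICC is the intraclass correlation (the share of outcome variance lying between groups). The sketch below is a generic illustration of that formula; the function names and the numbers plugged in are hypothetical and are not design parameters from the project.

```python
import math

def design_effect(cluster_size, icc):
    """Variance inflation from assigning whole clusters of equal size."""
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent students the clustered sample is 'worth'."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)

# Hypothetical example: 40 classrooms of 25 students, ICC of 0.20.
deff = design_effect(cluster_size=25, icc=0.20)
n_eff = effective_sample_size(n_clusters=40, cluster_size=25, icc=0.20)

print(f"design effect: {deff:.2f}")                # 5.80
print(f"effective sample size: {n_eff:.1f}")       # 172.4 out of 1,000 students
print(f"standard error inflation: {math.sqrt(deff):.2f}")
```

Under these illustrative values, 1,000 students in 40 classrooms carry roughly the information of 172 independently sampled students, which is why accurate estimates of variance components for each state and grade are needed before an experiment is designed.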
Human Capital Forecasting
Spencer is working on estimates and forecasts for selected areas of human capital, such as those that categorize U.S. workers employed in science and technology jobs according to skill. Past studies of U.S. educational attainment have tended to focus on differences in averages across groups. This is consistent with most demographic research, which has focused on rates rather than totals. Total numbers of people with certain types of human capital are important for U.S. competitiveness, however. Thus, Spencer is developing a new model that allows for aging and retirement, international movement, and potential policy effects of improved incentives for attracting and training students. Having a framework for systematically organizing information about human capital could help U.S. policymakers both in tracking progress and in developing strategies to increase particular kinds of human capital. Spencer also hopes the statistics will be useful in discussions about the future of U.S. higher education, and, by extension, K–12 and even preschool education.
Learning Interventions Institute
Hedges co-led the 2011 American Society for Microbiology (ASM)/National Institute of General Medical Sciences Learning Interventions Institute on “Understanding Research Techniques to Study Student Interventions,” held January 10–13 in Washington, D.C. The institute aims to introduce new behavioral and social science research methods that can be used to understand factors affecting student interest, motivation, and preparedness for research careers in science and medicine. The program used an intensive “learn, apply, and share” process of lectures and discussions, followed by small group work. All elements of the process are focused on how research can be used to learn what efforts drive and support academic success and commitment by students who are studying in science, technology, engineering, and mathematics (STEM) fields.
Fostering a Methodological Network
The Society for Research on Educational Effectiveness (SREE) gathered hundreds of researchers and educators from all over the nation in 2011 to participate in its first September conference, in addition to the organization’s annual conference in the spring. Several IPR members presented their research at the three-day meetings, both held in Washington, D.C., including education economist David Figlio, Cook, and Hedges, who serves as SREE’s president. Founded in 2005, SREE is a professional society that brings together scholars from a diversity of disciplines in the social sciences, behavioral sciences, and statistics who seek to advance and disseminate research on the causal effects of education interventions, practices, programs, and policies. It also publishes the Journal of Research on Educational Effectiveness, a peer-reviewed publication of research articles focused on cause-and-effect relations important for education, which Hedges co-edits. SREE is supported by a grant from the Institute of Education Sciences.
Methodologists Stimulate Q-Center Series
Designed to showcase and promote discussion of methodological innovation across disciplines, the Q-Center continued its colloquia series in 2011 with several renowned experts, including MDRC Chief Social Scientist Howard Bloom, who spoke about the design and analysis of a recent, large-scale MDRC study of New York City’s small schools initiative. Stephen Raudenbush, who is Lewis-Sebring Distinguished Service Professor at the University of Chicago, presented his research on the impact of a math curricular reform program launched in 2004 by Chicago Public Schools on course-taking, classroom composition, and achievement. Andrew Gelman of Columbia University and Kosuke Imai of Princeton University were among other invited speakers.
Statistical Theories for Census Data
A new project led by IPR statistician Bruce Spencer with economist Charles F. Manski will address fundamental problems for government statistical agencies: how to understand the value of the statistics they produce, how to compare value to cost to guide rational setting of statistical priorities, and how to increase value for a given cost. Because data use is so complicated and difficult to study, Spencer argues that new theory is needed so that case studies for use of data in policymaking and research in the social, behavioral, and economic sciences can be developed, analyzed, and interpreted. The practical implications of research findings are important for statistical agencies, both in the long term and the short term, to understand and communicate the value of data programs the agencies might seek to carry out. Supported by a grant from the U.S. Census Bureau, the researchers propose to extend and apply statistical decision theory, including cost-benefit analysis, to attack such basic questions of the statistical agencies. The research will focus on data use, data quality, data cost, and optimization, and the findings will be applied to problems of the U.S. Census Bureau with the goal of carrying out a cost-benefit analysis of the 2020 census, which is facing severe cost constraints.
Sampling Theory and Methodology for Networks
Government data collections are tempting targets for budget cutters—not because the budgets are so large, but because ignorance about data use makes the effects of data reductions hard to see. There is a reason that so little is known about data use, however. Inferring the impacts of data use is a problem of assessing the causal effect of an intervention—people either observe what happened when the data program was conducted, or what happened when it was not conducted, but not both. Spencer is currently reviewing the state of knowledge about whether and how data are used, including both theoretical and empirical research. His work pays particular attention to the effects of data quality.
Uncertainty in Policy Analysis
Douglas Elmendorf, director of the Congressional Budget Office (CBO), reported to Congress in March 2010 that enactment of the proposed healthcare legislation would lower the deficit by $138 billion between 2010 and 2019. In his 25-page letter, Elmendorf expressed no uncertainty about the figure, which the media subsequently reported without question. Yet Manski underscores that given the complicated nature of the legislation, the CBO figure of $138 billion is at best a very rough estimate. In contrast, he cited a former CBO director who predicted that the same bill would instead increase the deficit by $562 billion. In an article for The Economic Journal, Manski used this $700 billion difference to underscore why policy analysts should be upfront about the amount of uncertainty in their predictions. Transparency is key in creating more credible policy analysis, he said. Manski points to the United Kingdom, where agencies must state upper and lower bounds in assessing the budgetary impact of a legislative proposal. He also suggests that policy analysts and researchers use a layered analysis, to move from weak, highly credible assumptions to stronger, less credible ones, determining the conclusions that follow in each case. This would help resolve the tension between the credibility and power of assumptions and also improve the transparency of policy discussions. The research project received funding from the National Science Foundation.
Pretreatment Effects in Political Communication Experiments
Research on political communication effects has seen great progress over the past 25 years. A key ingredient underlying these advances is the increased usage of experiments that demonstrate how communications influence opinions and behaviors. But virtually none of these studies pay attention to events that occur before the experiment, or “pretreatment events.” Given that many, if not most, researchers design experiments aimed at capturing “real world” political communications, the likelihood of pretreatment contamination is substantial. In an article in the American Journal of Political Science, IPR political scientist James Druckman and IPR graduate research assistant Thomas Leeper explore how and when the pretreatment environment affects experimental outcomes. They present two studies—one where they controlled the pretreatment environment and one where it naturally occurred—to show how pretreatment effects influence experimental outcomes, presenting the first conclusive evidence of a pretreatment dynamic. More importantly, they identify the conditions under which these effects occur. When accounting for the pretreatment context, they found that average experimental treatment effects might miss important variations among subgroups. Furthermore, the non-existence of experimental effects might stem from a large number of individuals forming strong attitudes in response to earlier communications prior to the experiment, making them more likely to reject subsequent contrary arguments. They argue that, under certain conditions, attending to pretreatment dynamics leads to a more accurate portrait of the mass public and its political flexibility.
Framing and Obesity-Related Behaviors
A variety of persuasive communications and interventions have been explored as possible means to prevent or reduce obesity. One persuasive message variation that has been of interest to researchers in this domain is the contrast between gain-framed and loss-framed appeals. A gain-framed appeal emphasizes the advantages of compliance with the advocated action, whereas a loss-framed appeal emphasizes the disadvantages of noncompliance. Work by communication studies researcher and IPR associate Daniel O’Keefe and his colleague provides a meta-analytic review of the accumulated experimental research concerning the relative persuasiveness of gain-framed and loss-framed appeals for influencing various obesity-related behaviors. The results showed that gain-framed appeals were significantly more persuasive than their loss-framed counterparts for messages encouraging physical activity. But the researchers found no evidence that either gain- or loss-framed appeals held any persuasive advantage in influencing healthy eating behaviors. They advised designers of messages aimed specifically at obesity-relevant eating practices not to spend much time worrying about whether those messages are gain- or loss-framed.
Party Heterogeneity in Candidates
IPR political scientist Georgia Kernell is examining the conditions under which parties benefit from fielding more or less heterogeneous candidate teams. Most spatial voting models assume or imply that homogeneous candidate teams offer parties the best prospect for winning elections, yet in reality, candidates from the same political party often adopt divergent policy positions. She reconciles theory and reality by identifying a strategic rationale for political parties to recruit a diverse pool of candidates. Kernell develops a spatial model in which two parties each select a distribution of candidates to compete in an upcoming election. The model demonstrates that parties positioned close to the median voter should field a more homogeneous set of candidates than parties with platforms that are more distant. Kernell tests this prediction using data on the policy positions of Democratic and Republican candidates for congressional and state legislative elections since 1990. In line with the model’s predictions, she finds that minority parties, presumably more distant from the median voter, are more heterogeneous than majority parties.
Experiments in Political Science
Druckman is co-editor of the first comprehensive overview of how experimental research is transforming political science. Published by Cambridge University Press, the Cambridge Handbook of Experimental Political Science (2011) offers methodological insights and groundbreaking research from 30 of the discipline’s leading experimentalists, including Druckman, Shanto Iyengar and Paul Sniderman of Stanford University, Alan Gerber and Donald Green of Columbia University, and Diana Mutz of the University of Pennsylvania. The handbook aims to ensure that political science experiments are conducted with the highest level of intellectual rigor, thereby enabling political scientists to provide policymakers with significant data and conclusions. The volume came together after a May 2009 conference at Northwestern University and also features contributions from IPR faculty Daniel Diermeier and Dennis Chong.
IPR sociologist Jeremy Freese and Penny Visser of the University of Chicago continue to expand the research capacity of Time-Sharing Experiments for the Social Sciences (TESS), a website that facilitates original experiments on nationally representative samples at no cost to investigators. Recently, TESS joined forces with the Human Factors and Behavioral Sciences Division of the Department of Homeland Security’s Science and Technology Directorate to encourage survey research related to terrorism and government countermeasures. Specifically, the partnership will allow social and behavioral scientists to investigate the factors contributing to terrorism-related attitudes, beliefs, judgments, and behaviors with a field study larger than normally permitted in a standard TESS proposal. TESS was launched in 2008 as an infrastructure project of the National Science Foundation. Faculty, graduate students, and postdoctoral researchers can simply submit their proposals for peer review, and if successful, TESS then fields the Internet-based survey or experiment on a random sample of the U.S. population.
Evaluating Fellowship Programs
Evaluating the quality of researchers is a key component of any strategy to improve the overall quality of research output. Hedges and IPR research associate Evelyn Asch were part of the Spencer Foundation’s full-scale review of its two highly prestigious fellowship programs, designed to determine the programs’ effectiveness in helping fellows become stronger researchers than they would be otherwise. Hedges and Asch, along with graduate student Jennifer Hanis, completed evaluations of the Spencer Postdoctoral Fellowship program and the Spencer Dissertation Fellowship program. Their principal question in the evaluations was not whether those who received the fellowships had more successful careers than those who did not, but rather whether they had more successful careers because they received a fellowship. They also examined whether the Dissertation Fellowship program has helped to build a community of scholars related to the Spencer Foundation and whether there is any evidence that it has attracted scholars into education research. Using a regression discontinuity analysis, they examined the fellowships’ impact on the total number of publications, citations, editorial positions, and grants received by fellows versus finalists who were not selected as fellows. Their findings indicated that both programs have a significant causal impact on several outcomes, especially in the fellows’ success garnering research support through both federal and nonfederal grants.