Data Use, Quality & Cost in Policy Research
How Schools Are Chosen for Research
As education policymakers increasingly rely on evidence, researchers have conducted more cluster-randomized trials. But much work remains to be done in reviewing those trials’ methodology, specifically how researchers choose which schools to include, since what works for one school or district may not work for another. In a recent working paper, IPR statistician Beth Tipton and her co-authors examine 34 such trials to determine whether their samples are truly representative of particular populations of schools and students. They compare sample data from those studies, which were funded by the Institute of Education Sciences and selected and evaluated by Tipton and her co-authors, with general population data from the U.S. Department of Education. They find that recruitment for studies depends heavily on pre-existing local relationships between researchers and the schools that are ultimately selected. As a result, the selected schools, like the universities conducting the research, tend to be larger and more urban than the population being studied. This poses major challenges to any generalizations drawn from such studies. The researchers recommend three major changes to recruitment: including the sample-collection methodology in the grant proposal, increasing training for sample collection, and establishing best practices for school recruitment.
Predicting Homeowners' Gender
Homeownership is an essential tool for building wealth, but who owns a home? In Population Research and Policy Review, IPR sociologist Julia Behrman and Doron Shiffer-Sebba note that most surveys assume married couples share ownership, which inflates joint ownership estimates and produces misleading patterns in wealth. To get a clearer picture of who actually holds wealth in the form of homeownership, the researchers apply the algorithms of the R package gender, which infer gender from names based on data from the Social Security Administration. They examined 257,764 properties from tax assessor data in Detroit and Philadelphia and combined the data with Census ZIP code-level data from the 2013–2017 American Community Survey 5-year estimates to examine race, marriage status, education, and income. The researchers sorted properties into four ownership categories: female-sole owners, male-sole owners, mixed male-female ownership, and other, which meant more than two owners or owners of the same gender. Female-sole ownership was the largest category in both Detroit (49%) and Philadelphia (36%). Mixed male-female ownership was the lowest in Detroit (12%) and somewhat lower than female-sole ownership in Philadelphia (33%). Even though the researchers found more female owners, women were more likely than men to own smaller and lower-value homes. In Philadelphia, women also tended to own homes in ZIP codes with lower education levels and fewer high earners. The results provide new insights into wealth and homeownership by highlighting gender differences, specifically that more women, not couples, own homes.
What Is a Realistic Estimate of COVID-19 Infection Rates?
Epidemiologists and policymakers accept that, due to imperfections in testing and data collection, the actual rate of COVID-19 infection is likely higher than reported. Given those uncertainties, how can researchers more accurately estimate the true infection rate? In a working paper, IPR economist Charles F. Manski and Cornell University’s Francesca Molinari (PhD 2003) explore new upper and lower limits for those rates by combining existing data with assumptions about the infection rate in the untested population and about the accuracy of current tests. To explain the difficulty of setting those bounds, they examine three key hotspots, Italy, Illinois, and New York, using data on the numbers of individuals tested and the numbers of positive test results. They find that, due to the lack of information about the infection rate in the large untested population and issues with the accuracy of current testing methods, the infection rate might be substantially higher than reported. They also find the infection fatality rate in Italy is substantially lower than reported. The bounds can be narrowed by imposing stronger assumptions about the infection rate, but random testing, along with a better understanding of how testing predicts the infection rate, would narrow them significantly. Manski is Board of Trustees Professor in Economics.
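The bounding logic described above can be illustrated with a minimal sketch. It assumes, for simplicity, that tests are perfectly accurate (the working paper also deals with test inaccuracy, which is left out here), and it uses the law of total probability to combine the tested and untested groups. The function name, its parameters, and all input numbers are hypothetical and are not taken from the paper.

```python
def infection_rate_bounds(tested_share, positive_rate_tested, max_rate_untested=1.0):
    """Return (lower, upper) bounds on the population infection rate.

    Illustrative only: assumes perfectly accurate tests.
    P(infected) = P(infected | tested) * P(tested)
                + P(infected | untested) * P(untested),
    where P(infected | untested) is unknown but lies in [0, max_rate_untested].
    """
    inputs = {
        "tested_share": tested_share,
        "positive_rate_tested": positive_rate_tested,
        "max_rate_untested": max_rate_untested,
    }
    for name, value in inputs.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    untested_share = 1.0 - tested_share
    known_part = positive_rate_tested * tested_share

    lower = known_part  # nobody in the untested group is infected
    upper = known_part + max_rate_untested * untested_share
    return lower, upper


# Hypothetical inputs: 2% of the population tested, 20% of tests positive.
no_assumption = infection_rate_bounds(0.02, 0.20)

# Stronger (assumed) restriction: the untested are infected
# at most as often as the tested.
with_assumption = infection_rate_bounds(0.02, 0.20, max_rate_untested=0.20)

print(no_assumption)
print(with_assumption)
```

With these made-up inputs, the no-assumption bounds run from about 0.4% to about 98.4%, while the added restriction caps the upper bound at about 20%. The gap shows how much of the uncertainty comes from the untested population.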
Education Mobility Across Generations
Several researchers worldwide find a “grandparent effect” when studying socioeconomic mobility, which means older generations transmit their wealth and education to the next generation. In the Journal of Labor Economics, economist and IPR associate Joseph Ferrie and his colleagues test whether this effect applies to the United States in the 20th century by examining whether grandparents’ and parents’ education leads to more education for younger generations. Using data from the 1940 U.S. Census of Population, the Annual Social and Economic Supplement (ASEC) of the Current Population Survey (CPS), and the American Community Survey (ACS), the researchers create two two-generation samples and one three-generation sample. First, Ferrie and his colleagues investigate the relationship, or correlation, between parents’ education levels and their children’s to determine whether children gain more education because they have a more educated parent. They find that education persists across generations by about 20%, meaning younger generations benefit from the older generation. However, the researchers suspect that measurement error (in this case, whether people accurately report their education in surveys and whether researchers can connect relatives across generations) incorrectly produces this large effect. Ferrie and his colleagues correct the measurement error and find the correlation between generations declines by 18.2%, suggesting older generations do not play a significant role in later generations’ educational outcomes. As inequality increases in the United States, this research has important implications for scholars and policymakers trying to identify the origins of inequality because it shows educational advantages do not transfer across generations.
Applying Partial Identification to Public Health
Most preventive medicine and public health researchers report exact values when assessing risks of illness or predicting treatment response. However, researchers must deal with missing data and may make unjustifiable assumptions about these gaps, leading to misleading conclusions. Manski and his colleagues recommend in the American Journal of Preventive Medicine that researchers use partial discovery, or partial identification, to obtain more informative results, and they apply it to three real-world scenarios. Partial identification strategies provide a range of values and make more realistic assumptions about missing data. The first example looks at child vaccination data, which can be missing for several reasons, such as parents refusing to share information. Here, a partial identification approach helps identify the lowest and highest possible vaccination rates. The second examines a trial investigating treatment for hypertension. Here, only a small amount of outcome data is missing, which means researchers do not need to make assumptions about the missing data. A partial identification strategy determines a narrow range of options to help clinicians propose treatments to different patients. In the last example, they assess COVID-19 infection rates, which can be inaccurate due to missing data from untested people who are asymptomatic carriers of the virus, as well as tests that are not entirely accurate. A partial identification strategy gives more practical information about COVID-19 infection rates and underscores the need for better data and testing strategies. The researchers conclude that partial identification can help address public health and preventive medicine issues, such as responding to the ongoing COVID-19 pandemic, estimating how common and severe infections are, and making decisions on potential vaccines and treatments based on preliminary clinical trials.
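The idea of reporting a range rather than an exact value can be sketched with the simplest kind of bound for a yes/no outcome with missing records. This is a minimal illustration, not the authors' code: it assumes nothing about why records are missing, so the lowest possible rate treats every missing case as a "no" and the highest treats every missing case as a "yes". The function name and all counts are hypothetical.

```python
def worst_case_bounds(n_yes, n_no, n_missing):
    """Worst-case bounds on a proportion when some outcomes are missing.

    Lower bound: every missing case is counted as a "no".
    Upper bound: every missing case is counted as a "yes".
    """
    if min(n_yes, n_no, n_missing) < 0:
        raise ValueError("counts must be non-negative")
    total = n_yes + n_no + n_missing
    if total == 0:
        raise ValueError("at least one observation is required")

    lower = n_yes / total
    upper = (n_yes + n_missing) / total
    return lower, upper


# Hypothetical survey: 700 children vaccinated, 200 not, 100 with no record.
print(worst_case_bounds(700, 200, 100))

# Hypothetical trial with very little missing outcome data.
print(worst_case_bounds(480, 500, 20))
```

In this sketch the width of the range equals the share of missing records: 10 percentage points in the first made-up case and 2 in the second. That is why the range stays narrow, and remains useful without extra assumptions, when little data is missing.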