Research News

Faculty Spotlight: Bruce Spencer

IPR statistician shows how accurate figures can be a force for good



IPR statistician Bruce Spencer is one of the world’s leading experts in statistical accuracy.

A discarded pamphlet on a New York subway train is what pushed IPR fellow Bruce Spencer to become a leading statistician.

That brochure, which described how a foundation was using statistics to improve rice yields and nutrition for developing countries, found its way into the hands of the then-sophomore math major at Swarthmore College. It was the late 1960s, and he had been feeling a disconnect between his studies and the social turmoil of the times.

“I couldn’t see how [math] could be applied to make the world a better place,” Spencer said.

But the brochure set off a light bulb. “I thought, ‘Wow, I can use math and statistics and do something good for the world,’ ” Spencer recalled.

That cast-off pamphlet is what led him to become one of the world’s leading experts in statistical accuracy, empowering policymakers and governmental agencies around the world to make better policy decisions.

Improving Census Accuracy

While working on his doctorate in statistics at Yale University, Spencer became interested in how governments collect statistics and how accurate those statistics are.

“Does it matter if [statistics] are accurate?” Spencer asked. “And if it does matter, how much is it worth spending to make them more accurate?”

These questions have focused his work on the decennial U.S. Census, as well as on the census in South Africa. In a recent study with IPR graduate research assistant Zachary Seeskin, Spencer built models of census accuracy profiles, using a variety of error distributions. A seemingly small 4 percent average error in state population estimates, they found, would be expected to result in 9–14 of the 435 seats in the U.S. House of Representatives going to the wrong states, and in $60–80 billion of federal grants potentially being misallocated.
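
To make the mechanism concrete, the sketch below shows how errors in population counts can move House seats from one state to another. It is an illustration only, not the model used in the study: the four state names and populations are hypothetical, the chamber is shrunk to 20 seats, and the errors are assumed to be independent normal draws with a 4 percent standard deviation. Seats are assigned with the method of equal proportions (Huntington-Hill), which the United States uses for apportionment.

```python
import heapq
import math
import random


def apportion(populations, seats):
    """Method of equal proportions: one seat per state, then the rest by priority value."""
    alloc = {state: 1 for state in populations}
    # heapq is a min-heap, so priorities are negated to pop the largest first.
    heap = [(-pop / math.sqrt(2), state) for state, pop in populations.items()]
    heapq.heapify(heap)
    for _ in range(seats - len(populations)):
        _, state = heapq.heappop(heap)
        alloc[state] += 1
        n = alloc[state]
        heapq.heappush(heap, (-populations[state] / math.sqrt(n * (n + 1)), state))
    return alloc


def misallocated_seats(true_pops, estimated_pops, seats):
    """Count seats that land in a different state than they would with true counts."""
    true_alloc = apportion(true_pops, seats)
    est_alloc = apportion(estimated_pops, seats)
    # A misplaced seat is gained by one state and lost by another; count gains only.
    return sum(max(0, est_alloc[s] - true_alloc[s]) for s in true_pops)


def simulate(true_pops, seats, rel_sd, trials, seed=0):
    """Average number of misplaced seats under assumed independent normal errors."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        est = {s: p * (1 + rng.gauss(0, rel_sd)) for s, p in true_pops.items()}
        total += misallocated_seats(true_pops, est, seats)
    return total / trials


# Hypothetical populations, chosen only for illustration.
HYPOTHETICAL_POPS = {"A": 9_100_000, "B": 5_300_000, "C": 3_450_000, "D": 1_150_000}

if __name__ == "__main__":
    print(simulate(HYPOTHETICAL_POPS, seats=20, rel_sd=0.04, trials=2000))
```

With no error in the counts the function reports zero misplaced seats. Any positive average comes entirely from the assumed error model, so the numbers it prints should not be compared with the study's figures.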

Spencer’s work on costs and benefits of censuses attracted the attention of South African statisticians, who were considering the benefits of conducting a constitutionally optional mid-decade census.

By modeling the accuracy of two alternative sets of midyear population estimates, Spencer and Seeskin helped researchers at Statistics South Africa, the country’s official statistics agency, advise the prime minister on whether to forgo the mid-decade 2016 census, which would save 3 billion South African rand (approximately $208 million at the time), and use less costly methods instead. The decision of whether the improved accuracy of allocations arising from the census outweighed the additional cost was ultimately a political one, involving the importance of equitable allocations. The government decided to forgo the census, instead expanding a population survey to a sample size of 1 million persons, collecting more accurate official birth and death statistics, and doubling the number of researchers working in noncensal years.

Measuring the Accuracy of Criminal Trial Verdicts

Extending his inquiry on the accuracy of information, Spencer considered how to assess the accuracy of verdicts in criminal cases when the truth is unknown.

In statistics, one can estimate sampling error even when the value for the whole population is unknown. The key is to have replication. For trials by jury, you can replicate verdicts by asking the presiding judges what they would rule if they issued a verdict from the bench.

“What happens when a judge and a jury disagree in a criminal trial? They can’t both be right,” Spencer said.

In the Journal of Empirical Legal Studies, Spencer described how it is possible to study the average accuracy of jury verdicts, even when the correct verdict cannot be known, by obtaining a second rating of the verdict, such as the judge’s. His widely reported study concluded that jury verdicts in a nonrepresentative sample of courtrooms were incorrect 15 percent of the time, with an estimated standard error of 4 percentage points. He noted that his simple estimator, based on the rate of agreement between judge and jury, might have a tendency to overestimate, but not underestimate, verdicts’ accuracy. Using a complex class of statistical models known as latent class models, he estimated the type I and type II error rates for judges and juries. A type I error occurs when an innocent person is convicted, and a type II error occurs when a guilty person is acquitted. The estimates he obtained were not based on a large or random sample of cases, however, and should not be taken out of context.
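
One simple way to see how an agreement rate can say something about accuracy, and why such an estimate can err on the high side, is sketched below. The assumptions are ours and are not necessarily those of the published estimator: verdicts are binary, judge and jury are equally accurate and right more often than not, and their errors are independent. Under those assumptions the agreement rate equals p² + (1 − p)², which can be solved for the accuracy p. The function names and the example rates are hypothetical.

```python
import math


def accuracy_from_agreement(agreement_rate):
    """Solve a = p**2 + (1 - p)**2 for p >= 0.5 (assumes independent, equal accuracy)."""
    if not 0.5 <= agreement_rate <= 1.0:
        raise ValueError("agreement rate must lie between 0.5 and 1 under these assumptions")
    return 0.5 + 0.5 * math.sqrt(2.0 * agreement_rate - 1.0)


def agreement_rate(accuracy, p_both_wrong):
    """Agreement = P(both right) + P(both wrong) when each rater has the same accuracy."""
    p_both_right = 1.0 - 2.0 * (1.0 - accuracy) + p_both_wrong
    return p_both_right + p_both_wrong


# Hypothetical case 1: both 80% accurate, errors independent (both wrong 0.2 * 0.2 = 0.04).
independent = accuracy_from_agreement(agreement_rate(0.80, 0.04))

# Hypothetical case 2: both 80% accurate, but they tend to err on the same cases.
correlated = accuracy_from_agreement(agreement_rate(0.80, 0.10))

if __name__ == "__main__":
    print(round(independent, 3), round(correlated, 3))
```

In the first case the formula recovers the true 80 percent accuracy. In the second, shared errors raise the agreement rate, so the same formula returns a figure above 80 percent, which is the direction of bias described above.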

A methodological issue for the study of verdict accuracy is that the methods tend to underestimate type I and type II error rates, but not erroneous conviction rates or the ratio of type II errors to type I errors, which the Constitution requires to exceed 1. Additionally, latent class models will not be perfectly correct, since they are complex and subject to multiple sources of error, not all of which are directly quantifiable.

Spencer is continuing work in this area by returning to a fundamental question: Are we better off not trying to know the answer, or should we seek answers whose accuracy is in doubt due to uncertainty about the models?

Improving Earthquake Prediction Models

Another area where accurate forecasting can have an earth-shattering impact, literally, is that of earthquake hazard maps. For example, Japan’s 9.0-magnitude Tohoku earthquake in 2011 and its resulting tsunami killed more than 15,000 people and caused nearly $300 billion in damage. The shaking from the earthquake was significantly larger than Japan’s national hazard map had predicted, devastating areas forecast to be relatively safe.

Such hazard-mapping failures prompted Spencer, geophysicist and IPR associate Seth Stein, and IPR graduate research assistant Edward Brooks to search for better ways to construct, evaluate, and communicate earthquake hazard forecasts. In two related IPR working papers, both of which have since been published in seismological journals, the three researchers point out several critical problems with current hazard maps and offer improvements for statistical models, including the use of Bayesian modeling to update hazard maps as additional events (and nonevents) are observed.
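
The idea of updating a forecast as events and nonevents accumulate can be illustrated with a textbook Bayesian calculation. The sketch below is a generic Beta-Binomial update for a single site, not the researchers' model: the prior parameters, the observation window, and the function names are all hypothetical. The map's prior belief about the yearly chance of strong shaking is written as a Beta distribution, and each year with or without such shaking shifts that belief.

```python
def update_exceedance(prior_a, prior_b, years_observed, exceedances):
    """Conjugate update: Beta(a, b) prior plus yearly event/nonevent counts."""
    if not 0 <= exceedances <= years_observed:
        raise ValueError("exceedances must be between 0 and years_observed")
    nonevents = years_observed - exceedances
    return prior_a + exceedances, prior_b + nonevents


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior: strong shaking expected in about 1 year out of 100.
prior = (1.0, 99.0)

# Hypothetical record: 50 years observed, strong shaking in 2 of them.
posterior = update_exceedance(*prior, years_observed=50, exceedances=2)

if __name__ == "__main__":
    print(beta_mean(*prior), beta_mean(*posterior))
```

Quiet years count as evidence too: a long run of nonevents pulls the estimated yearly probability down, just as unexpected shaking pushes it up.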

Better seismological mapping can help a government prepare its earthquake-prone zones for such disasters, protect its citizens in those zones, and decide whether costly earthquake-resistant construction projects are justified.

Spencer’s research on earthquake prediction models is just one example of how he and other IPR colleagues apply a methodological lens to areas previously understudied by statisticians.

“I like to work in areas that are relatively unexplored,” Spencer concluded. “I’m trying to take statistical methods and statistical theory and expand their boundaries to problems that were previously out of reach.”

Bruce Spencer is professor of statistics and an IPR fellow.