# Bruce D. Spencer

## Professor of Statistics

## Biography

Bruce Spencer is a statistician whose interests span the disciplines of statistics and public policy, with a special focus on the design and evaluation of large-scale statistical data programs. He is currently conducting a cost-benefit analysis to determine how much accuracy is needed for the 2020 census, and at what cost. In 2013 he designed and conducted a cost-benefit analysis for South Africa’s statistical agency, Statistics South Africa, to help it decide whether to conduct a population census in 2016 following the previous census in 2011.

Spencer has developed statistical methods for assessing the accuracy of verdicts in criminal cases when the truth is unknown. The methods are informative but not perfect, and he is currently investigating whether society should invest in large-scale studies to assess verdict accuracy, given that the measures of accuracy will themselves be imperfect.

He also works on questions related to statistical sampling, including how to estimate the structure of a network when its data come from a conventional sample of nodes on the network—for example, if a large survey records which survey members are linked to other survey members, how much can one learn about the network as a whole? Another strand of Spencer’s current research assesses the accuracy of earthquake hazard forecasts, which affect building codes throughout the U.S.

A member of the Northwestern faculty since 1980, Spencer chaired its statistics department from 1988 to 1999, 2000 to 2001, and 2007 to 2010. He designed and teaches an advanced undergraduate seminar on Human Rights Statistics. He directed the Methodology Research Center of the National Opinion Research Center (NORC) at the University of Chicago from 1985 to 1992 and was Senior Research Statistician there from 1992 to 1994. Spencer has served on a variety of panels for the National Academy of Sciences. He received the Palmer O. Johnson Memorial Award from the American Educational Research Association in 1983 and is an elected Fellow of the American Statistical Association and an elected member of the International Statistical Institute.

Spencer has participated in evaluations of major statistical programs, including population estimates by the Census Bureau, population forecasts by the Social Security Administration, test score statistics by the Department of Education, and drug abuse estimates by state and local agencies. He has also conducted research into the effects of data error on the allocations of public funding and representation. He has published numerous articles and three books.

## Current Projects

**Statistical Decision Theory for Statistical Agencies**. One strand of Spencer’s research is cost-benefit analysis for statistical activities; he is currently carrying out a cost-benefit analysis for the 2020 census. Spencer and his former student Zachary Seeskin assembled a list of federal programs that allocate funds or representation based in whole or in part on census population data. For a stratified random sample of 20 such programs, they measured the benefits that arise (or are lost) as the census improves (or deteriorates) in accuracy. The results show the impact of alternative levels of accuracy on allocative uses of the census. Current work carried out jointly with the Census Bureau explores what accuracy can be attained at what cost, and is being used by the Bureau in planning the 2020 census. The project is funded by a National Science Foundation grant entitled “NCRN-SN: Census Bureau Data Programs as Statistical Decision Problems.” The “NCRN-SN” in the title stands for “NSF-Census Research Network, Small Node,” because Northwestern is one of eight such research nodes.
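The logic of the allocative analysis can be illustrated with a toy simulation. All numbers, the proportional-allocation rule, and the error model below are hypothetical, not those of the actual study: a fixed budget is allocated in proportion to estimated populations, and we track how many dollars land in the wrong place as estimation error grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setting: 50 "states" with log-normal populations and a
# $1B budget allocated proportionally to estimated population counts.
true_pop = rng.lognormal(mean=13, sigma=1, size=50)
budget = 1e9

def misallocation(cv, n_sims=200):
    """Average dollars misallocated when population estimates have
    multiplicative error with coefficient of variation `cv`."""
    totals = []
    for _ in range(n_sims):
        est = true_pop * rng.normal(1.0, cv, size=true_pop.size)
        alloc_est = budget * est / est.sum()
        alloc_true = budget * true_pop / true_pop.sum()
        # Half the sum of absolute differences = total dollars moved
        # from rightful recipients to others.
        totals.append(0.5 * np.abs(alloc_est - alloc_true).sum())
    return np.mean(totals)

for cv in (0.005, 0.01, 0.02, 0.04):
    print(f"CV = {cv:.3f}: ~${misallocation(cv):,.0f} misallocated")
```

Even small coefficients of variation move substantial sums when the budget is large, which is the kind of tradeoff between accuracy and cost that the cost-benefit analysis quantifies.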

**Statistical Sampling in Networks.** Part of Spencer’s work addresses the following problem: when one takes a sample of individuals using the usual kinds of sampling methods, and the individuals are related to each other in a way that can be described by a graph (or statistical network), how can we make inferences about the graph (network)? In a 2015 *Annals of Applied Statistics* article, “Estimating network degree distributions under sampling: An inverse problem, with applications to monitoring social media networks,” Yaonan Zhang, Eric Kolaczyk, and Spencer address the problem of estimating the degree distribution (the fraction of nodes with 0 links, 1 link, 2 links, etc.) based on simple random samples and Bernoulli samples. Work by Spencer’s doctoral student Maxim Vasylkiv extends this to complex samples, and Vasylkiv is developing new methods for the harder case of inference for 2-degree distributions (e.g., how many people in a social network have *j* friends of friends).
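The flavor of the inverse problem can be sketched with a toy simulation (this is an illustration of the thinning structure, not the estimator of the *Annals* article): under Bernoulli node sampling with inclusion probability *p*, a node of true degree *d* shows an observed degree that is Binomial(*d*, *p*) in the induced subgraph, so the observed degree distribution is a binomial thinning of the true one, and estimation amounts to inverting that thinning.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(1)

# Toy Erdos-Renyi-style graph (all parameters hypothetical).
n, p_edge, p_sample = 2000, 0.004, 0.5
A = rng.random((n, n)) < p_edge
A = np.triu(A, 1)
A = A | A.T                                   # symmetric adjacency, no loops
deg = A.sum(axis=1)

keep = rng.random(n) < p_sample               # Bernoulli node sample
obs_deg = A[np.ix_(keep, keep)].sum(axis=1)   # degrees in induced subgraph

d_max = int(deg.max())
true_dist = np.bincount(deg, minlength=d_max + 1) / n
obs_dist = np.bincount(obs_deg, minlength=d_max + 1)[: d_max + 1] / keep.sum()

# Thinning matrix: P[k, d] = P(observed degree k | true degree d).
P = np.array([[comb(d, k) * p_sample**k * (1 - p_sample)**(d - k) if k <= d else 0.0
               for d in range(d_max + 1)] for k in range(d_max + 1)])

# Naive inversion via least squares.
est, *_ = np.linalg.lstsq(P, obs_dist, rcond=None)
print("true mean degree:     ", deg.mean())
print("observed mean degree: ", obs_deg.mean())   # roughly p_sample * true mean
print("estimated mean degree:", est @ np.arange(d_max + 1))
```

The plain least-squares inversion becomes ill-posed as the degree range grows, which is why the article frames estimation as an inverse problem requiring more careful methods.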

**Human Rights Statistics.** Statistics, as numbers and as methodology, are increasingly used in the context of human rights. Statistical research in this area is in its infancy, both in terms of individual research projects and as a study of how statistics are used and misused. Spencer is collaborating with Galya Ruffer (Department of Political Science and Buffett Center for the Humanities at Northwestern University) to look at forced migration. Spencer and Ruffer are developing a research project to statistically assess the performance of agencies in granting asylum status. The first step in the research is to understand which agencies are engaged in such processes, how they operate, and what their various goals are. To this end, a National Science Foundation grant supported an international workshop on the topic, held at Northwestern in 2014.

**Statistical Assessment of Earthquake Hazard Predictions.** Earthquake hazard predictions are important for their impact on building codes in general and on decisions where to site nuclear power plants in particular. Spencer, Seth Stein (Department of Earth and Planetary Sciences at Northwestern University), and graduate student Edward Brooks are developing methods for assessing the accuracy of these predictions. Their recent papers include:

Stein, S., M. Liu, **B.D. Spencer**, and E. Brooks. 2016. Promise and paradox: Why improved knowledge of plate tectonics hasn’t yielded correspondingly better earthquake hazard maps. In *Plate Boundaries and Natural Hazards*, eds. J.C. Duarte and W.P. Schellart, 123–148. Hoboken, NJ: Wiley/AGU.

Brooks, E. M., S. Stein, and **B.D. Spencer**. 2016. Comparing the performance of Japan’s earthquake hazard maps to uniform and randomized maps. *Seismological Research Letters* 87: 90–102.

Stein, S., **B.D. Spencer**, and E. Brooks. 2015. Bayes and BOGSAT: Issues in when and how to revise earthquake hazard maps. *Seismological Research Letters* 86: 6–10.

Stein, S., **B.D. Spencer**, and E. Brooks. 2015. Metrics for assessing earthquake hazard map performance. *Bulletin of the Seismological Society of America* 105(4): 2160–2173.
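The benchmarking idea behind comparing hazard maps to uniform and randomized maps can be illustrated with a toy score (a generic squared-misfit comparison with invented numbers, not the exact metrics of the papers above): score a map by the mean squared difference between predicted and observed shaking at each site, and see whether it beats a map that predicts the same hazard everywhere.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical hazard levels at 500 sites, a noisy observed realization,
# an imperfect hazard map, and a "uniform" benchmark map.
n_sites = 500
true_hazard = rng.gamma(shape=2.0, scale=0.1, size=n_sites)
observed = true_hazard * rng.lognormal(0.0, 0.5, size=n_sites)
map_pred = true_hazard * rng.lognormal(0.0, 0.3, size=n_sites)
uniform_pred = np.full(n_sites, observed.mean())

def misfit(pred, obs):
    """Mean squared misfit between predicted and observed shaking."""
    return np.mean((pred - obs) ** 2)

print("map misfit:    ", misfit(map_pred, observed))
print("uniform misfit:", misfit(uniform_pred, observed))
```

A map that fails to outperform such naive benchmarks carries little predictive skill, which is the kind of comparison the Japan study makes with real hazard maps.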

**Should Society Try to Measure the Accuracy of Verdicts in Criminal Trials?** Spencer has published papers in the law and statistics literature on how to estimate the accuracy of verdicts when the truth is unknown. The key idea is replication—for example, having multiple “raters” (e.g., a judge and a jury) issue verdicts, and then using the pattern of agreement, supplemented with other information (e.g., ratings of the quality of evidence), to fit statistical models known as latent class models. The models are complex and subject to multiple sources of error, not all of which are directly quantifiable. Spencer is therefore exploring the question, “Should society try to measure the accuracy of verdicts in criminal trials?” That question ties into a broader social question: are we better off not trying hard to know the answer, or better off with answers whose accuracy is in doubt? In July 2014, Spencer presented a paper on this topic, “Should society try to measure the accuracy of verdicts in criminal trials? A total survey error (TSE) perspective,” at the VI European Congress of Methodology in Utrecht, the Netherlands.
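The replication idea can be illustrated with a generic two-class latent class model (a standard textbook setup with invented accuracy numbers, not Spencer’s verdict model): three conditionally independent binary “raters” classify cases whose true state is hidden, and an EM algorithm recovers the base rate and each rater’s accuracy from the pattern of agreement alone.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate hidden truth and three raters' votes (all parameters hypothetical).
n = 20000
pi = 0.7                                   # P(truly guilty)
sens = np.array([0.90, 0.85, 0.80])        # P(vote guilty | guilty)
spec = np.array([0.88, 0.92, 0.75])        # P(vote not guilty | not guilty)

truth = rng.random(n) < pi
votes = np.where(truth[:, None],
                 rng.random((n, 3)) < sens,       # true positives
                 rng.random((n, 3)) >= spec)      # false positives at rate 1-spec

# EM for the two-class latent class model under conditional independence.
p, s, t = 0.5, np.full(3, 0.6), np.full(3, 0.6)
for _ in range(300):
    like1 = p * np.prod(np.where(votes, s, 1 - s), axis=1)
    like0 = (1 - p) * np.prod(np.where(votes, 1 - t, t), axis=1)
    w = like1 / (like1 + like0)            # E-step: P(guilty | votes)
    p = w.mean()                           # M-step: base rate
    s = (w[:, None] * votes).sum(axis=0) / w.sum()          # sensitivities
    t = ((1 - w)[:, None] * ~votes).sum(axis=0) / (1 - w).sum()  # specificities

print("estimated base rate:    ", round(p, 3))
print("estimated sensitivities:", s.round(3))
print("estimated specificities:", t.round(3))
```

With only two raters (a judge and a jury), the model is not identified from the 2×2 agreement table alone, which is one reason the supplementary information mentioned above, such as ratings of evidence quality, matters.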

## Selected Publications

**Books**

Alho, J., and **B. D. Spencer**. 2005. *Statistical Demography and Forecasting.* New York: Springer.

**Spencer, B. D.** (ed.). 1997. *Statistics and Public Policy.* Oxford: Oxford University Press.

**Spencer, B. D.** 1980. *Benefit-Cost Analysis of Data Used to Allocate Funds.* New York: Springer.

**Accuracy of Population Estimates**

Mulry, M., and **B. D. Spencer**. 2001. Accuracy and coverage evaluation: Overview of total error modeling and loss function analysis. *DSSD Census 2000 Procedures and Operations Memorandum, Series B-19*. Washington, D.C.: U.S. Census Bureau.

Anderson, M., B. Daponte, S. Fienberg, J. Kadane, **B. D. Spencer**, and D. Steffey. 2000. Sampling-based adjustment of the 2000 Census—A balanced perspective. *Jurimetrics* 40(3): 341–56.

Mulry, M., and **B. D. Spencer**. 1993. Accuracy of the 1990 Census and undercount adjustments. *Journal of the American Statistical Association* 88(423): 1080–91.

Mulry, M., and **B. D. Spencer**. 1991. Total error in PES estimates of population: The dress rehearsal Census of 1988. *Journal of the American Statistical Association* 86: 839–54, with discussion 855–63.

**Accuracy of Population Forecasts**

Alho, J., and **B. D. Spencer**. 1997. The practical specification of the expected error of population forecasts. *Journal of Official Statistics* 13: 203–26.

Alho, J., and **B. D. Spencer**. 1991. Population forecasts as a database. *Journal of Official Statistics* 7: 295–310.

Alho, J., and **B. D. Spencer**. 1990. Effects of targets and aggregation on the propagation of error in mortality forecasts. *Journal of Mathematical Population Studies* 2: 209–27.

Alho, J., and **B. D. Spencer**. 1990. Error models for official mortality forecasts. *Journal of the American Statistical Association* 85: 609–16.

Alho, J., and **B. D. Spencer**. 1985. Uncertain population forecasting. *Journal of the American Statistical Association* 80: 306–14.

**Education Statistics**

**Spencer, B. D.** 1993. Education statistics—A study of eligibility exclusions and sampling: 1992 trial state assessment. In *The Trial State Assessment: Prospects and Realities.* The Third Report of the National Academy of Education Panel on the Evaluation of the NAEP Trial State Assessment: 1992 Trial State Assessment, ed. R. Glaser, R. Linn, and G. Bohrnstedt, 1–68. Stanford: National Academy of Education.

**Spencer, B. D.** 1992. A critique of sampling in the 1990 trial state assessment. In *Assessing Student Achievement in the States: Background Studies.* Studies for the Evaluation of the NAEP Trial State Assessment Commissioned for the National Academy of Education Panel Report on the 1990 Trial, ed. R. Glaser, R. Linn, and G. Bohrnstedt, 1–18. Stanford: National Academy of Education.

**Spencer, B. D.** 1992. Eligibility/exclusion issues in the 1990 trial state assessment. In *Assessing Student Achievement in the States: Background Studies.* Studies for the Evaluation of the NAEP Trial State Assessment Commissioned for the National Academy of Education Panel Report on the 1990 Trial, ed. R. Glaser, R. Linn, and G. Bohrnstedt, 19–49. Stanford: National Academy of Education.

**Spencer, B. D.**, and W. Foran. 1991. Sampling probabilities for aggregations, with applications to NELS:88 and other educational longitudinal surveys. *Journal of Educational Statistics* 16: 21–34.

**Spencer, B. D.** 1983. On interpreting test scores as social indicators: Statistical considerations. *Journal of Educational Measurement* 20: 317–34.

**Spencer, B. D.** 1983. Test scores as social statistics: Comparing distributions. *Journal of Educational Statistics* 8: 249–70.

**Cost-Benefit Analysis of Statistical Data Programs**

**Spencer, B. D.**, J. May, S. Kenyon, and Z. Seeskin. 2015. Cost-benefit analysis for a quinquennial census: The 2016 population census of South Africa. Institute for Policy Research Working Paper WP-15-06. Evanston, IL: Northwestern University.

**Spencer, B. D.** 1994. Sensitivity of benefit-cost analysis of data programs to monotone misspecification. *Journal of Statistical Planning and Inference* 39(1): 19–31.

**Spencer, B. D.**, and L. Moses. 1990. Needed data expenditure for an ambiguous decision problem. *Journal of the American Statistical Association* 85: 1099–104.

**Spencer, B. D.** 1985. Optimal data quality. *Journal of the American Statistical Association* 80: 564–73.

**Spencer, B. D.** 1982. Feasibility of benefit-cost analysis of data programs. *Evaluation Review* 6: 649–72.

**Data Error and the Allocation of Public Funds and Representation**

Seeskin, Z., and **B. D. Spencer**. 2015. Effects of census accuracy on apportionment of Congress and allocations of federal funds. Institute for Policy Research Working Paper WP-15-05. Evanston, IL: Northwestern University.

Seeskin, Z., and **B. D. Spencer**. 2015. Effects of census accuracy on apportionment of Congress and allocations of federal funds. In *JSM Proceedings, Government Statistics Section*, 3061–3075. Alexandria, VA: American Statistical Association.

**Spencer, B. D.**, J. May, S. Kenyon, and Z. Seeskin. 2015. Cost-benefit analysis for a quinquennial census: The 2016 population census of South Africa. Institute for Policy Research Working Paper WP-15-06. Evanston, IL: Northwestern University.

**Spencer, B. D.** 1985. Statistical aspects of equitable apportionment. *Journal of the American Statistical Association* 80: 815–22.

**Spencer, B. D.** 1985. Avoiding bias in estimates of the effect of data error on allocations of public funds. *Evaluation Review* 9: 511–18.

**Spencer, B. D.** 1982. Technical issues in allocation formula design. *Public Administration Review* 42: 524–29.

**Spencer, B. D.** 1982. Concerning dubious estimates of the effects of census undercount adjustment of federal aid to cities. *Urban Affairs Quarterly* 18: 145–48.

**Legal Statistics**

**Spencer, B. D.** 2012. When do latent class models overstate accuracy for diagnostic and other classifiers in the absence of a gold standard? *Biometrics* 68(2): 559–66.

**Spencer, B. D.** 2007. Estimating the accuracy of jury verdicts. *Journal of Empirical Legal Studies* 4(2): 305–29.

Anderson, M., B. Daponte, S. Fienberg, J. Kadane, **B. D. Spencer**, and D. Steffey. 2000. Sampling-based adjustment of the 2000 Census—A balanced perspective. *Jurimetrics* 40(3): 341–56.

**Network Sampling**

Zhang, Y., E. D. Kolaczyk, and **B. D. Spencer**. 2015. Estimating network degree distributions under sampling: An inverse problem, with applications to monitoring social media networks. *The Annals of Applied Statistics* 9: 166–199.

Scholtens, D., and **B. D. Spencer**. 2015. Node sampling for protein complex estimation in bait-prey graphs. *Statistical Applications in Genetics and Molecular Biology* 14: 391–411.

**Statistical Assessment of Earthquake Hazard Predictions**

Stein, S., M. Liu, **B. D. Spencer**, and E. Brooks. 2016. Promise and paradox: Why improved knowledge of plate tectonics hasn’t yielded correspondingly better earthquake hazard maps. In *Plate Boundaries and Natural Hazards*, eds. J.C. Duarte and W.P. Schellart, 123–148. Hoboken, NJ: Wiley/AGU.

Brooks, E. M., S. Stein, and **B. D. Spencer**. 2016. Comparing the performance of Japan’s earthquake hazard maps to uniform and randomized maps. *Seismological Research Letters* 87: 90–102.

Stein, S., **B. D. Spencer**, and E. M. Brooks. 2015. Bayes and BOGSAT: Issues in when and how to revise earthquake hazard maps. *Seismological Research Letters* 86: 6–10.

Stein, S., **B. D. Spencer**, and E. M. Brooks. 2015. Metrics for assessing earthquake hazard map performance. *Bulletin of the Seismological Society of America* 105(4): 2160–2173.