Spring 2021 Seminars

The 2020-2021 Seminar Series will be offered virtually using Zoom instead of in person. Registration is required to receive the Zoom link for each event. Please see the registration link associated with each talk to sign up. Links are specific to individual talks, so you will need to register for every talk you are interested in attending. Please contact Kisa Kowal if you have questions.

Seminar Series talks are free and open to faculty, graduate students, and advanced undergraduate students.

Detection and Estimation Problems in Combinatorial Statistics

Wednesday, March 31, 2021

Time: 11:00 a.m. - registration required

Speaker: Miklós Z. Rácz, Assistant Professor, ORFE Department, and Affiliated Faculty Member at the Center for Statistics and Machine Learning (CSML), Princeton University

Abstract: From networks to genomics, large amounts of data are abundant and play critical roles in helping us understand complex systems. In many such settings, these data take the form of large discrete structures with important combinatorial properties. The interplay between structure and randomness in these systems presents unique statistical challenges. In this talk I will highlight these via fundamental detection and estimation problems in two settings.

First, I will discuss various problems which probe the underlying structure of networks. These include uncovering geometric structure, as well as understanding correlation in growing networks, where I will highlight connections to the influence of the seed graph in the underlying growth model.

I will also discuss trace reconstruction, a fundamental problem where the goal is to reconstruct a sequence from noisy copies. I will explain our applied work on this problem in the context of DNA data storage, an exciting emerging technology, as well as our theoretical work including approximate versions and generalizations to trees.

Zoom Registration link

Assumption-Lean Analysis of Cluster Randomized Trials in Infectious Diseases

Wednesday, April 14, 2021

Time: 11:00 a.m. - registration required

Speaker: Hyunseung Kang, Assistant Professor, Department of Statistics, University of Wisconsin-Madison

Abstract: Cluster randomized trials (CRTs) are a popular design to study the effect of interventions in infectious disease settings. However, standard analysis of CRTs primarily relies on strong parametric methods, usually Normal mixed-effects models to account for the clustering structure, and focuses on the overall intent-to-treat (ITT) effect to evaluate effectiveness. The paper presents two methods to analyze two types of effects in CRTs: the overall and heterogeneous ITT effects, and the spillover effect among never-takers who cannot or refuse to take the intervention. For the ITT effects, we make a modest extension of an existing method where we do not impose parametric models or asymptotic restrictions on cluster size. For the spillover effect among never-takers, we propose a new bound-based method that uses pre-treatment covariates, classification algorithms, and a linear program to obtain sharp bounds. A key feature of our method is that the bounds can become dramatically narrower as the classification algorithm improves, and the method may also be useful for studies of partial identification with instrumental variables. We conclude by reanalyzing a CRT studying the effect of face masks and hand sanitizers on transmission of 2008 interpandemic influenza in Hong Kong. This is joint work with Chan Park (UW-Madison).

Zoom Registration link

Testing an Elaborate Theory of a Causal Hypothesis

Wednesday, April 21, 2021

Time: 11:00 a.m. - registration required

Speaker: Dylan Small, Class of 1965 Wharton Professor of Statistics, Department of Statistics, The Wharton School, University of Pennsylvania

Abstract: When R.A. Fisher was asked what can be done in observational studies to clarify the step from association to causation, he replied, “Make your theories elaborate” -- when constructing a causal hypothesis, envisage as many different consequences of its truth as possible and plan observational studies to discover whether each of these consequences is found to hold. William Cochran called “this multi-phasic attack…one of the most potent weapons in observational studies.” Statistical tests for the various pieces of the elaborate theory help to clarify how much the causal hypothesis is corroborated. In practice, the degree of corroboration of the causal hypothesis has been assessed by a verbal description of which of the several tests provides evidence for which of the several predictions. This verbal approach can miss quantitative patterns. We develop a quantitative approach to making statistical inference about the amount of the elaborate theory that is supported by evidence. This is joint work with Bikram Karmakar.

Zoom Registration link

A Flexible Regression Model for Dispersed Count Data

Wednesday, April 28, 2021

Time: 11:00 a.m. - registration required

Speaker: Kimberly F. Sellers, Professor, Department of Mathematics and Statistics, Georgetown University

Abstract: While Poisson regression serves as a standard tool for modeling the association between a count response variable and explanatory variables, its underlying equi-dispersion assumption and the implications of that assumption are well documented. The Conway-Maxwell-Poisson (COM-Poisson) distribution is a flexible count data alternative that allows for data over- or under-dispersion; thus, COM-Poisson regression can flexibly model associations involving a discrete count response variable and covariates. This talk introduces the resulting regression along with its zero-inflated analog, and the associated COMPoissonReg package in R, which has become a popular resource for statistical computing.

Zoom Registration link


Wednesday, May 12, 2021

Time: 11:00 a.m.

Speaker: Jiangtao Gou, Assistant Professor, Mathematics and Statistics, Villanova University

Abstract: TBA

Zoom Registration link: registration will be required to attend; link coming


A statistical method for estimating (specific) causation in the law

Wednesday, May 19, 2021

Time: 11:00 a.m. - registration required

Speaker: Maria Cuellar, Assistant Professor, Criminology Department, University of Pennsylvania

Abstract: Researchers often need to determine whether a specific exposure, or something else, caused an individual's outcome. To answer questions of causality in which the exposure and outcome have already been observed, researchers have suggested estimating the probability of causation (PC). PC is especially important in court, for example in class action lawsuits, and in public and health policy, for example in determining who has benefited most from a program. However, the current estimation methods for PC make strong parametric assumptions, or are inefficient and do not easily yield inferential tools. In this talk, I will describe an influence-function-based nonparametric estimator for a projection of PC, which allows for simple interpretation and valid inference by making only weak structural assumptions. I compare my proposed estimator to the current plug-in methods, both parametric and nonparametric, by simulation. Finally, I present an application of the proposed estimator by using data from a randomized controlled trial in Western Kenya.

Zoom Registration link

Perfect is the enemy of good: new shrinkage estimators for genomics

Wednesday, May 26, 2021

Time: 11:00 a.m. - registration required

Speaker: Sihai Dave Zhao, Assistant Professor, Department of Statistics, University of Illinois at Urbana-Champaign

Abstract: Simultaneous estimation problems have a long history in statistics and have become especially common and important in genomics research: modern technologies can simultaneously assay tens of thousands to even millions of genomic features that can each introduce an unknown parameter of interest. These applications reveal some conceptual and methodological gaps in the standard empirical Bayes approach to simultaneous estimation. This talk reviews some standard approaches, illustrates some difficulties, introduces an alternative approach based on regression modeling, and illustrates some new estimators that can be applied to gene expression denoising, coexpression network reconstruction, and large-scale gene expression imputation.

Zoom Registration link



