Fall 2022 Seminar Series

Department of Statistics and Data Science 2022-2023 Seminar Series - Fall 2022

The 2022-2023 Seminar Series will primarily be in person, but some talks will be offered virtually using Zoom. Virtual talks will be clearly designated, and registration is required to receive the Zoom link for the event. Please see the registration link associated with each talk to sign up. Links are specific to individual talks, so you will need to register for every talk you are interested in attending. Please email Kisa Kowal at k-kowal@northwestern.edu if you have questions.

Seminar Series talks are free and open to faculty, graduate students, and advanced undergraduate students.

New 2-parameter families of advanced forecasting functions: seasonal/nonseasonal models, comparison to the exponential smoothing and ARIMA models, and application to stock market data

Friday, September 30, 2022

Time: 11:00 a.m. to 12:00 p.m. central time

Location: Ruan Conference Room – lower level (Chambers Hall 600 Foster Street)

Speaker: Nabil Kahouadji, Associate Professor of Mathematics, Northeastern Illinois University

Abstract: We introduce twenty-four new two-parameter families of advanced time series forecasting functions, using three forecast estimate methods along with eight optimization criteria. We also introduce the concept of powering and derive non-seasonal and seasonal time series models with examples in education, sales, economics, industry and finance. We compare the performance of our twenty-four functions/models to both exponential smoothing and ARIMA models using non-seasonal and seasonal time series. We show in particular that our models not only do not require a decomposition of a seasonal time series into trend, seasonal and random components, but also lead to a substantially lower sum of absolute errors and a higher number of closer forecasts than both Holt-Winters and ARIMA models. Finally, we apply and compare the performance of our twenty-four models using five-year stock market data of 467 companies in the S&P 500.

This talk will be given in person on Northwestern's Evanston campus at the location listed above.

From Experimentation to Causal Learning: Applied Research in Tech Industry

Friday, October 7, 2022

Time: 11:00 a.m. to 12:00 p.m. central time

Location: Ruan Conference Room – lower level (Chambers Hall 600 Foster Street)

Speaker: Zhenyu Zhao, Data Science Director at Tencent, Ph.D. Department of Statistics, Northwestern University

Abstract: Experimentation is now an essential method in the tech industry for making decisions and improving user experience based on causal effects. Meanwhile, causal learning has been gaining momentum in recent years.

This talk will discuss two applied research topics, 1) sequential testing in experimentation and 2) feature selection methods for uplift modeling (a causal learning model), to illustrate how applied research is carried out to address real business problems in practice.

This talk will be given in person on Northwestern's Evanston campus.

It’s Not What We Said, It’s Not What They Heard, It’s What They Say They Heard

Friday, October 14, 2022

Time: 11:00 a.m. to 12:00 p.m. central time

Location: Ruan Conference Room – lower level (Chambers Hall 600 Foster Street)

Speaker: Barry D. Nussbaum, Adjunct Professor, University of Maryland Baltimore County, Chief Statistician, US Environmental Protection Agency (Retired), 2017 President, American Statistical Association

Abstract: Statisticians have long known that success in our profession frequently depends on our ability to succinctly explain our results so decision makers may correctly integrate our efforts into their actions. However, this is no longer enough. While we still must make sure that we carefully present results and conclusions, the real difficulty is what the recipient thinks we just said. The situation becomes more challenging in the age of “big data”. This presentation will discuss what to do, and what not to do. Examples, including those used in court cases, executive documents, and material presented for the President of the United States, will illustrate the principles.

This talk will be given in person on Northwestern's Evanston campus at the location listed above.

Ultrahigh Dimensional Variable Selection for Bayesian Mixed‐type Multivariate Generalized Linear Models

Friday, October 28, 2022

Time: 11:00 a.m. to 12:00 p.m. central time

Location: Ruan Conference Room – lower level (Chambers Hall 600 Foster Street)

Speaker: Hsin-Hsiung Bill Huang, Associate Professor, Department of Statistics and Data Science, University of Central Florida

Abstract: Inspired by our recent work on the NSF ATD challenge and medical imaging research, we investigate whether Bayesian methods can consistently estimate the model parameters. To this end, shrinkage priors are useful for identifying relevant signals in high-dimensional data. We extend the multivariate Bayesian model with shrinkage priors (MBSP) to mixed-type response generalized linear models (MRGLMs), and we consider a latent multivariate linear regression model associated with the observable mixed-type response vector through its link function. Under our proposed model (MBSP-GLM), multiple responses belonging to the exponential family are simultaneously modeled and mixed-type responses are allowed. We show that the MBSP-GLM model achieves strong posterior consistency when $p$ grows at a subexponential rate with $n$. Furthermore, we quantify the posterior contraction rate at which the posterior shrinks around the true regression coefficients and allow the dimension of the responses $q$ to grow as $n$ grows. This greatly expands the scope of the MBSP model to include response variables of many data types, including binary and count data. To address the non-conjugacy concern, we propose an adaptive sampling algorithm via a Pólya-gamma data augmentation scheme for the MRGLM estimation. We provide simulation studies and real data examples.

This talk will be given in person on Northwestern's Evanston campus.

Conformal prediction beyond exchangeability

Friday, November 4, 2022

Time: 11:00 a.m. to 12:00 p.m. central time

Location: Ruan Conference Room – lower level (Chambers Hall 600 Foster Street)

Speaker: Rina Foygel Barber, Louis Block Professor, Department of Statistics, University of Chicago

Abstract: Conformal prediction is a popular, modern technique for providing valid predictive inference for arbitrary machine learning models. Its validity relies on the assumptions of exchangeability of the data, and symmetry of the given model fitting algorithm as a function of the data. However, exchangeability is often violated when predictive models are deployed in practice. For example, if the data distribution drifts over time, then the data points are no longer exchangeable; moreover, in such settings, we might want to use an algorithm that treats recent observations as more relevant, which would violate the assumption that data points are treated symmetrically. This paper proposes new methodology to deal with both aspects: we use weighted quantiles to introduce robustness against distribution drift, and design a new technique to allow for algorithms that do not treat data points symmetrically, with theoretical results verifying coverage guarantees that are robust to violations of exchangeability.

This work is joint with Emmanuel Candes, Aaditya Ramdas, and Ryan Tibshirani.
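
For readers new to the topic, the weighted-quantile idea mentioned in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speaker's implementation: the function name `weighted_conformal_quantile`, the choice of absolute residuals as scores, and the example weights are all hypothetical. Each past calibration residual receives a weight (for example, larger for more recent points), an extra point mass at infinity stands in for the unseen test point, and the interval half-width is the weighted (1 - alpha) quantile of that distribution.

```python
import numpy as np

def weighted_conformal_quantile(residuals, weights, alpha=0.1):
    """Weighted (1 - alpha) quantile of calibration residuals.

    Hypothetical helper for illustration. `residuals` are absolute errors
    |y_i - prediction_i| on held-out data; `weights` are nonnegative
    weights, e.g. larger for more recent observations.
    """
    residuals = np.asarray(residuals, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Add a point mass at +infinity with weight 1 for the test point.
    scores = np.append(residuals, np.inf)
    w = np.append(weights, 1.0)
    w = w / w.sum()
    # Walk through the sorted scores until the cumulative weight
    # reaches 1 - alpha.
    order = np.argsort(scores)
    cumulative = np.cumsum(w[order])
    idx = int(np.searchsorted(cumulative, 1.0 - alpha))
    idx = min(idx, len(scores) - 1)
    return scores[order][idx]

# Equal weights reduce to the usual split-conformal quantile.
q_equal = weighted_conformal_quantile([1.0, 2.0, 3.0, 4.0], [1, 1, 1, 1], alpha=0.5)
# Up-weighting the most recent residual (4.0) widens the interval here.
q_recent = weighted_conformal_quantile([1.0, 2.0, 3.0, 4.0], [0.1, 0.1, 0.1, 1.0], alpha=0.5)
```

A prediction interval would then be the point prediction plus or minus the returned value. When the calibration set is too small for the requested coverage, the function returns infinity, which corresponds to an uninformative interval.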

This talk will be given in person on Northwestern's Evanston campus.

Leveraging "partial" smoothness for faster convergence in nonsmooth optimization

Friday, November 11, 2022

Time: 11:00 a.m. to 12:00 p.m. central time

Location: Ruan Conference Room – lower level (Chambers Hall 600 Foster Street)

Speaker: Damek Shea Davis, Associate Professor, Operations Research and Information Engineering, Cornell University

Abstract: First-order methods in nonsmooth optimization are often described as "slow." I will present two (locally) accelerated first-order methods that violate this perception: a superlinearly convergent method for solving nonsmooth equations, and a linearly convergent method for solving "generic" nonsmooth optimization problems. The key insight in both cases is that nonsmooth functions are often "partially" smooth in useful ways.

This talk will be given in person on Northwestern's Evanston campus.

A modern take on Huber regression

Friday, November 18, 2022

Time: 11:00 a.m. central time - Virtual talk, registration required

Speaker: Po-Ling Loh, Lecturer, Department of Pure Mathematics and Mathematical Statistics, University of Cambridge

Abstract: In the first part of the talk, we discuss the use of a penalized Huber M-estimator for high-dimensional linear regression. We explain how a fairly straightforward analysis yields high-probability error bounds that hold even when the additive errors are heavy-tailed. However, the parameter governing the shape of the Huber loss must be chosen in relation to the scale of the error distribution. We discuss how to use an adaptive technique, based on Lepski's method, to overcome the difficulties traditionally faced by applying Huber M-estimation in a context where both location and scale are unknown.

In the second part of the talk, we turn to a more complicated setting where both the covariates and responses may be heavy-tailed and/or adversarially contaminated. We show how to modify the Huber regression estimator by first applying an appropriate "filtering" procedure to the data based on the covariates. We prove that in low-dimensional settings, this filtered Huber regression estimator achieves near-optimal error rates. We further show that the commonly used least trimmed squares and least absolute deviation estimators may similarly be made robust to contaminated covariates via the same covariate filtering step. This is based on joint work with Ankit Pensia (UW-Madison) and Varun Jog (Cambridge).
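
As background for the abstract above, the following Python sketch shows the Huber loss and a simple Huber location estimate. It is an illustration under stated assumptions, not the estimator analyzed in the talk: the names `huber_loss` and `huber_location` are hypothetical, the shape parameter `delta` is fixed by hand rather than chosen adaptively, and the example covers only the one-dimensional location problem instead of penalized high-dimensional regression.

```python
import numpy as np

def huber_loss(residuals, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    r = np.abs(np.asarray(residuals, dtype=float))
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))

def huber_location(x, delta=1.0, n_iter=100, tol=1e-12):
    """Location estimate minimizing the summed Huber loss.

    Uses iteratively reweighted averaging, started at the median.
    """
    x = np.asarray(x, dtype=float)
    mu = float(np.median(x))
    for _ in range(n_iter):
        r = np.abs(x - mu)
        # Points within delta get weight 1; distant points are down-weighted.
        w = np.where(r <= delta, 1.0, delta / np.maximum(r, 1e-300))
        new_mu = float(np.sum(w * x) / np.sum(w))
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu

data = [0.0, 0.0, 0.0, 0.0, 100.0]   # one gross outlier
mean_estimate = float(np.mean(data))  # pulled far toward the outlier
huber_estimate = huber_location(data, delta=1.0)  # stays near the bulk of the data
```

The need to pick `delta` relative to the scale of the errors is visible here: a very large `delta` makes the estimate behave like the mean, while a very small one makes it behave like the median.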

Register here

This talk will be given virtually by Zoom. Registration is required to receive the Zoom link for the talk.

 
