Winter 2026 Seminar Series
Department of Statistics and Data Science, 2025-2026 Seminar Series
The 2025-2026 Seminar Series will be held primarily in person, but some talks will be offered virtually via Zoom. Virtual talks will be clearly designated, and registration is required to receive the Zoom link for those events. Please email Kisa Kowal at k-kowal@northwestern.edu if you have questions.
Seminar Series talks are free and open to faculty, graduate students, and advanced undergraduate students.
TBA
Friday, January 23, 2026
Time: 11:00 a.m. to 12:00 p.m. Central Time
Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)
Speaker: Dawei Zhou, Assistant Professor, Department of Computer Science, Virginia Tech
Abstract: TBA
This talk will be given in person on Northwestern's Evanston campus.
TBA
Friday, January 30, 2026
Time: 11:00 a.m. to 12:00 p.m. Central Time
Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)
Speaker: TBA
Abstract: TBA
This talk will be given in person on Northwestern's Evanston campus.
TBA
Friday, February 6, 2026
Time: 11:00 a.m. to 12:00 p.m. Central Time
Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)
Speaker: TBA
Abstract: TBA
This talk will be given in person on Northwestern's Evanston campus.
TBA
Friday, February 13, 2026
Time: 11:00 a.m. to 12:00 p.m. Central Time
Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)
Speaker: TBA
Abstract: TBA
This talk will be given in person on Northwestern's Evanston campus.
Deep Survival Learning for Kidney Transplantation: Knowledge Distillation and Data Integration
Friday, February 20, 2026
Time: 11:00 a.m. to 12:00 p.m. Central Time
Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)
Speaker: Kevin He, Associate Professor of Biostatistics and Associate Director of the Kidney Epidemiology and Cost Center (KECC), University of Michigan
Abstract: Prognostic prediction using survival analysis faces challenges due to the complex relationships between risk factors and time-to-event outcomes. Deep learning methods have shown promise in addressing these challenges, but their effectiveness often relies on large datasets. When applied to moderate- or small-sized datasets, deep models frequently encounter limitations such as insufficient training data, overfitting, and difficulty in hyperparameter optimization. To mitigate these issues and enhance prognostic performance, this talk presents a flexible deep learning framework that integrates external risk scores with internal time-to-event data through a generalized Kullback–Leibler divergence regularization term. Applied to national kidney transplant data, the proposed method improves prediction of short-term mortality and graft failure following kidney transplantation by distilling prior knowledge from pre-policy-change teacher models and transferring it to newly arrived post-policy-change cohorts.
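For readers curious how such a distillation term might look, here is a minimal, hypothetical sketch, not the speaker's implementation: a Breslow-form Cox negative log partial likelihood fits the internal time-to-event data, and a plain Kullback–Leibler penalty pulls the student model's risk scores toward the external teacher's. All function names and the softmax normalization of the risk scores are illustrative assumptions.

    import numpy as np

    def cox_neg_log_partial_likelihood(eta, time, event):
        # Breslow-form Cox negative log partial likelihood (ties ignored for brevity).
        # eta: (n,) risk scores; time: (n,) follow-up times; event: 1 = event, 0 = censored.
        order = np.argsort(-time)                      # descending time: running sums are risk sets
        eta_s, ev_s = eta[order], event[order]
        log_risk_set = np.logaddexp.accumulate(eta_s)  # log of sum over {j : t_j >= t_i} of exp(eta_j)
        return -np.sum(ev_s * (eta_s - log_risk_set))

    def distilled_survival_loss(student_eta, teacher_eta, time, event, lam=0.1):
        # Internal survival fit plus a KL penalty toward external teacher risk scores.
        # Softmax-normalizing scores into distributions is an assumption made here; the
        # "generalized" KL term from the abstract is defined in the speaker's own work.
        log_p = teacher_eta - np.logaddexp.reduce(teacher_eta)  # log-softmax, teacher
        log_q = student_eta - np.logaddexp.reduce(student_eta)  # log-softmax, student
        kl = np.sum(np.exp(log_p) * (log_p - log_q))            # KL(teacher || student)
        return cox_neg_log_partial_likelihood(student_eta, time, event) + lam * kl

    # Toy usage with synthetic data:
    rng = np.random.default_rng(0)
    n = 200
    time = rng.exponential(size=n)
    event = rng.integers(0, 2, size=n).astype(float)
    teacher_eta = rng.normal(size=n)
    student_eta = rng.normal(size=n)
    print(distilled_survival_loss(student_eta, teacher_eta, time, event))

In this toy version, lam trades off fit to the internal cohort against agreement with the teacher, mirroring the abstract's transfer from pre-policy-change models to post-policy-change cohorts.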
This talk will be given in person on Northwestern's Evanston campus.
Event page: planitpurple.northwestern.edu/event/636129
What functions does XGBoost learn?
Friday, February 27, 2026
Time: 11:00 a.m. to 12:00 p.m. Central Time
Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)
Speaker: Aditya Guntuboyina, Associate Professor, Department of Statistics, University of California, Berkeley
Abstract: We develop a theoretical framework that explains what kinds of functions XGBoost is able to learn. We introduce an infinite-dimensional function class that extends ensembles of shallow decision trees, along with a natural measure of complexity that generalizes the regularization penalty built into XGBoost. We show that this complexity measure aligns with classical notions of variation—in one dimension it corresponds to total variation, and in higher dimensions it is closely tied to a well-known concept called Hardy–Krause variation. We prove that the best least-squares estimator within this class can always be represented using a finite number of trees, and that it achieves a nearly optimal statistical rate of convergence, avoiding the usual curse of dimensionality. Our work provides the first rigorous description of the function space that underlies XGBoost, clarifies its relationship to classical ideas in nonparametric estimation, and highlights an open question: does the actual XGBoost algorithm itself achieve these optimal guarantees? This is joint work with Dohyeong Ki at UC Berkeley.
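As a rough guide to the objects in this abstract, the estimator studied has the generic penalized least-squares shape sketched below in LaTeX. This display is schematic only, with the precise function class $\mathcal{F}$ and complexity measure $V$ defined in the speakers' work (a constrained rather than penalized formulation is also possible):

    \hat{f} \;\in\; \operatorname*{arg\,min}_{f \in \mathcal{F}}
      \left\{ \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2 \;+\; \lambda \, V(f) \right\}

Here $\mathcal{F}$ extends ensembles of shallow decision trees, $V(f)$ generalizes XGBoost's built-in regularization penalty (reducing to total variation when $d = 1$ and relating to Hardy–Krause variation in higher dimensions), and $\lambda > 0$ is a tuning parameter.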
This talk will be given in person on Northwestern's Evanston campus.