2007: Department of Statistics and Data Science

2007

Spring 2007

Tuesday, March 27 at 11 am

Speaker: Wei Biao Wu, Assistant Professor, Department of Statistics, The University of Chicago

Title: New Perspectives in the Theory of Time Series

Abstract: I will present a unified framework for a large-sample theory of time series. I will revisit topics in classical time series analysis, including linear prediction and the estimation of covariances, spectral densities, and long-run variances. I will also talk about high-dimensional covariance matrix estimation and inference for the mean and quantiles of non-stationary processes. In the second part I will discuss dependence, a fundamental concept in statistics. Our viewpoint provides new insights into the study of complicated random systems. I will also discuss relations with nonlinear system theory, experimental design, information theory, risk-metrics theory and high-dimensional covariance matrix estimation.

Tuesday, April 9 at 1 pm

Speaker: Vadim Linetsky, Professor, Department of Industrial Engineering and Management Sciences, Northwestern University

Title: Time-Changed Markov Processes in Asset Pricing

Abstract: The procedure of time changing a stochastic process, going back to S. Bochner, allows one to construct new processes from a given process by running it on a new clock that can itself be a non-decreasing stochastic process (random time). When the process to be time changed is a Markov process and the Laplace transform of the time change is known, there is an explicit representation of the expectation operator of the time-changed process in terms of the resolvent of the original Markov process and the Laplace transform of the time change. We use this result to build a rich toolbox of analytically tractable asset pricing models in finance that incorporate stochastic volatility, state-dependent jumps, and state-dependent killing rates (or default intensities). Among the resulting models is a new credit-equity model that is an extension of the constant elasticity of variance (CEV) model with stochastic volatility, jumps, and default, as well as extensions of the Cox-Ingersoll-Ross and the Ornstein-Uhlenbeck models with mean-reverting jumps.
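
The role of the Laplace transform can be checked numerically in the simplest case. The sketch below is not one of the speaker's models: it assumes a standard Brownian motion W run on an independent gamma clock T_t. Conditioning on the clock gives E[exp(iuW_{T_t})] = E[exp(-u^2 T_t / 2)], which is the Laplace transform of the time change evaluated at u^2 / 2. The gamma clock, the parameter names (t, nu, u), and the sample size are all assumptions chosen for illustration.

```python
import numpy as np

def gamma_clock_laplace(lam, t, nu):
    """Laplace transform E[exp(-lam * T_t)] of a gamma clock.

    Assumption: T_t ~ Gamma(shape=t/nu, scale=nu), so E[T_t] = t.
    """
    return (1.0 + nu * lam) ** (-t / nu)

def sample_time_changed_bm(t, nu, n, rng):
    """Draw n samples of W_{T_t}: Brownian motion evaluated at a random gamma time."""
    clock = rng.gamma(shape=t / nu, scale=nu, size=n)
    # Given the clock value T, W_T is normal with mean 0 and variance T.
    return np.sqrt(clock) * rng.standard_normal(n)

rng = np.random.default_rng(0)
t, nu, u = 1.0, 0.5, 1.3  # hypothetical illustration values

samples = sample_time_changed_bm(t, nu, 200_000, rng)

# Real part of the characteristic function, estimated by Monte Carlo.
mc_estimate = np.mean(np.cos(u * samples))

# Same quantity from the Laplace transform of the clock at u^2 / 2.
closed_form = gamma_clock_laplace(0.5 * u**2, t, nu)

print(f"Monte Carlo: {mc_estimate:.4f}, Laplace transform: {closed_form:.4f}")
```

The two numbers should agree up to Monte Carlo error, which is why knowing the Laplace transform of the clock makes expectations of the time-changed process tractable.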

Tuesday, May 8 at 11 am

Speaker: Hui Xie, Assistant Professor, School of Public Health, University of Illinois at Chicago

Title: A Local Sensitivity Analysis Approach to Longitudinal Non-Gaussian Data with Nonignorable Dropout

Abstract: Analyzing longitudinal non-Gaussian data subject to potentially nonignorable dropout is a challenging problem. Very often the data contain little information about the dropout mechanism. As a result, an analysis frequently has to rely on strong but unverifiable assumptions, among which ignorability is a key one. Sensitivity analysis has been advocated to assess the likely effect of alternative assumptions about the dropout mechanism on such an analysis. Previously, Ma et al. (2005) applied a general index of local sensitivity to nonignorability (ISNI) (Troxel et al. 2004) to measure the sensitivity of MAR estimates to small departures from ignorability for multivariate normal outcomes. In this paper, we extend the ISNI methodology to handle longitudinal non-Gaussian data subject to nonignorable dropout. Specifically, we propose to quantify the sensitivity of inferences in the neighborhood of an MAR generalized linear mixed model (GLMM) for longitudinal data. Through a simulation study, we evaluate the performance of the proposed methodology. We then illustrate the methodology with one real example: the Smoking Cessation Data.

Tuesday, May 22 at 11 am

Speaker: Hira L. Koul, Professor, Department of Statistics and Probability, Michigan State University

Title: Model Diagnostics via Martingale Transforms

Abstract: Two classical problems in statistics are to fit a distribution up to unknown location-scale parameters and to fit a parametric model to the regression-autoregressive function. The first problem is generic to many other statistical models, including the celebrated regression, autoregressive, and generalized autoregressive conditionally heteroscedastic (ARCH-GARCH) models, where one is testing that the innovations are from a given distribution. It will be argued that Khmaladze's martingale transformation of the residual empirical process, which yields asymptotically distribution-free tests for the one-sample location-scale model, does the same thing for a parametric heteroscedastic regression model and for ARCH-GARCH models. Analogous tests for the second problem will also be discussed.

Friday, May 25 at 3 pm

Speaker: Cliff Spiegelman, Professor, Department of Statistics, Texas A&M University

Title: Statistical considerations on the process of discovering and validating biomarker candidates using MS platforms
(Joint with Lorenzo J. Vega Montoto and Asokan Mulayath Variyath)

Abstract: Claims have been made that supervised pattern recognition methodology can be used with MS proteomic data to achieve near perfect sensitivity and specificity for detecting early stage cancer. So far those claims have not been verified, in part due to the use of less than optimal experimental design, but in the interim significant effort has been spent on proteomic biomarker discovery research (without significant positive results), largely using tandem MS platforms. Underpinning the proteomics studies are several key components, including standardization of materials, bioinformatics, reagent development, MS improvements, and statistics. This presentation discusses the NCI CPTAC program generally and a related mouse studies project. Several areas where statistical design of experiments provides input will be discussed.

Thursday, June 7 at 11 am

Speaker: Archana Singh, PhD student, Department of Computer Science, University of Tsukuba, Japan and National Food Research Institute, Tsukuba, Japan

Title: Robustness of FDR Method in Brain Mapping Studies using Functional near Infra-Red Spectroscopy
(Joint with Ippeita Dan)

Abstract: Near infrared spectroscopy (NIRS) is an emerging non-invasive technique for monitoring brain activity in infants, patients, and healthy subjects. It is easier to apply than other techniques because it is portable, is more permissive of subjects' movements, and allows brain monitoring in a more natural setting. It allows simultaneous measurements through many channels, ranging from fewer than ten to around two hundred, which escalates the issue of multiple testing. To date, only a few studies have addressed this issue, using the Bonferroni correction, which tends to be conservative in spatially correlated fNIRS data. In addition, its power is inversely proportional to the number of channels, which varies among fNIRS experiments depending on the selected region of interest (ROI), thereby leading to a subjective inference. This problem may be circumvented by a more contemporary approach based on the false discovery rate (FDR). In this session, I will illustrate with real data how FDR procedures can provide a more objective and more powerful inference than the Bonferroni method in neuroimaging analysis. In addition, I will present the results of a simulation analysis showing that FDR provides greater sensitivity while maintaining the conventional specificity control.
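
The contrast between the two corrections can be sketched on a handful of per-channel p-values. The abstract does not say which FDR procedure was used; the sketch below assumes the Benjamini-Hochberg step-up procedure, which controls the FDR for independent or positively dependent tests. The p-values and function names are hypothetical and are not data from the talk.

```python
import numpy as np

def bonferroni(pvals, alpha=0.05):
    """Reject channel i when p_i <= alpha / m (controls the family-wise error rate)."""
    p = np.asarray(pvals, dtype=float)
    return p <= alpha / p.size

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed here as the FDR method)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the k-th smallest p-value with alpha * k / m.
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Reject every hypothesis up to the largest k that passes.
        k = np.nonzero(below)[0].max()
        reject[order[: k + 1]] = True
    return reject

# Hypothetical p-values for eight channels (illustration only).
channel_pvals = [0.001, 0.008, 0.012, 0.020, 0.030, 0.200, 0.450, 0.700]

n_bonferroni = int(bonferroni(channel_pvals).sum())
n_bh = int(benjamini_hochberg(channel_pvals).sum())
print(f"Bonferroni rejections: {n_bonferroni}, BH rejections: {n_bh}")
```

On these made-up values Bonferroni flags one channel and Benjamini-Hochberg flags five, which illustrates the gain in sensitivity that the abstract describes. The Bonferroni threshold alpha / m shrinks as channels are added, while the Benjamini-Hochberg thresholds adapt to the ordered p-values.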

Winter 2007

Tuesday, January 23 at 2 pm

Speaker: Joel L. Horowitz, Charles E. and Emma H. Morrison Professor of Market Economics, Department of Economics, Northwestern University

Title: Nonparametric Instrumental Variables Estimation of a Quantile Regression Model
(Joint with Sokbae Lee)

Abstract: We consider nonparametric estimation of a regression function that is identified by requiring a specified quantile of the regression "error" conditional on an instrumental variable to be zero. The resulting estimating equation is a nonlinear integral equation of the first kind, which generates an ill-posed inverse problem. The integral operator and the distribution of the instrumental variable are unknown and must be estimated nonparametrically. We show that the estimator is mean-square consistent, derive its rate of convergence in probability, and give conditions under which this rate is optimal in a minimax sense. The results of Monte Carlo experiments show that the estimator behaves well in finite samples.

Tuesday, February 6 at 2 pm

Speaker: Leah J. Welty, Assistant Professor, Department of Preventive Medicine, Northwestern University

Title: Bayesian Distributed Lag Models: Estimating Effects of Particulate Matter Air Pollution on Daily Mortality

Abstract: A distributed lag model (DLM) is a regression model that includes lagged exposure variables as covariates; its corresponding distributed lag (DL) function describes the relationship between the lag and the coefficient of the lagged exposure variable. DLMs have recently been used in environmental epidemiology for quantifying the cumulative effects of weather and air pollution on mortality and morbidity. Standard methods for formulating DLMs include unconstrained, polynomial, and penalized spline DLMs. These methods may fail to take full advantage of prior information about the shape of the DL function for environmental exposures, or for any other exposure with effects that are believed to smoothly approach zero as lag increases, and are therefore at risk of producing sub-optimal estimates.

We propose a Bayesian DLM (BDLM) that incorporates prior knowledge about the shape of the DL function and also allows the degree of smoothness of the DL function to be estimated from the data. In a simulation study, we compare our Bayesian approach with alternative methods that use unconstrained, polynomial and penalized spline DLMs. We also show that BDLMs encompass penalized spline DLMs: under certain assumptions, imposing a prior on the DL coefficients is analogous to smoothing the DL coefficients with a penalty specified by the prior. We apply our BDLM to data from the National Morbidity, Mortality, and Air Pollution Study (NMMAPS) to estimate the short term health effects of particulate matter air pollution on mortality from 1987-2000 for Chicago, Illinois.
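
As a point of reference for the models compared above, the unconstrained DLM can be sketched as ordinary least squares on lagged copies of the exposure series. This is not the Bayesian DLM proposed in the talk. The synthetic exposure series, the lag coefficients, and the function names are assumptions made up for illustration.

```python
import numpy as np

def lag_matrix(x, max_lag):
    """Columns are x_t, x_{t-1}, ..., x_{t-max_lag} for t = max_lag, ..., n-1."""
    x = np.asarray(x, dtype=float)
    n = x.size
    return np.column_stack([x[max_lag - lag : n - lag] for lag in range(max_lag + 1)])

def fit_unconstrained_dlm(y, x, max_lag):
    """Least-squares fit of y_t = a + sum_l beta_l * x_{t-l} + error."""
    lags = lag_matrix(x, max_lag)
    design = np.column_stack([np.ones(lags.shape[0]), lags])
    # Drop the first max_lag outcomes, which lack a full exposure history.
    target = np.asarray(y, dtype=float)[max_lag:]
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[0], coef[1:]

# Synthetic data: a DL function that decays smoothly toward zero with lag.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
true_beta = np.array([0.5, 0.3, 0.15, 0.05])
y = np.full(x.size, 2.0)
for lag, b in enumerate(true_beta):
    y[lag:] += b * x[: x.size - lag]
y += 0.05 * rng.normal(size=x.size)

intercept, beta_hat = fit_unconstrained_dlm(y, x, max_lag=3)
print("estimated DL coefficients:", np.round(beta_hat, 3))
```

The unconstrained fit estimates each lag coefficient separately and imposes no shape on the DL function. The polynomial, penalized spline, and Bayesian variants named in the abstract constrain or smooth these coefficients.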

Tuesday, February 20 at 2 pm

Speaker: Ruey S. Tsay, H.G.B. Alexander Professor of Econometrics and Statistics, Graduate School of Business, The University of Chicago

Title: The Dynamics of Threshold Interest Rate Models
(Joint with Shang C. Chiou)

Abstract: We propose a two-factor arbitrage-free term structure model for interest rates, where the short-term interest rate follows a threshold model with stochastic volatility. Under the proposed model, the number of thresholds is unknown and must be endogenously determined by a model selection procedure. To estimate the proposed model, we develop an efficient Bayesian method by transforming the threshold problem into a structural-break problem. A simulation study shows that the proposed Bayesian method provides accurate estimates of the thresholds and the associated parameters of the model. In applications, the U.S. data strongly favor the newly proposed model over other models with constant volatility. We further compare the threshold model to its affine counterpart and the Markov-switching model, demonstrating the significant difference made by using the thresholds. We find that the fitted threshold model implies a kinked yield function and can generate an inverted yield curve. In addition, for U.S. monthly bond yields with 11 maturities (1 to 6 months and 1 to 5 years), the threshold model has smaller out-of-sample pricing errors than other models, especially for the long-term yields.

Tuesday, March 6 at 2 pm

Speaker: Ying Wei, Assistant Professor, Department of Biostatistics, Columbia University

Title: A Dynamic Quantile Regression Transformation Model for Longitudinal Data

Abstract: This paper describes a flexible nonparametric quantile regression model for longitudinal data. The basic elements of the model are a time-dependent power transformation of the longitudinal dependent variable and a varying-coefficient model for conditional quantile functions. A two-step estimation procedure is proposed to fit the model, and its consistency is established. Tuning parameters are chosen with generalized cross validation in conjunction with a Schwarz-type information criterion. The proposed method is illustrated with data on the time evolution of CD4 cell counts in HIV-1 infected patients under three different treatments. The quantile regression approach for longitudinal data enables construction of pointwise prediction bands for individual trajectories without requiring parametric distributional assumptions. This is joint work with Prof. Yunming Mu at Texas A&M University.