[Lecture] Donald B. Rubin: Models for Imputing Missing Data

Mar. 16, 2018

Topic: Models for Imputing Missing Data, Including Methods for Assessing Sensitivity of Conclusions to Them

Speaker: Prof. Donald B. Rubin (Harvard University)

Language: English

Time: March 16, 2018 (Friday), 14:00-15:30

Venue: Lecture Hall, Jiayibing Building, Jingchunyuan 82, BICMR (数学中心甲乙丙楼报告厅)

Introduction: There are two relatively standard approaches for dealing with missing data in statistics, one based on “selection models” and the other based on “pattern-mixture” models. The former focuses on formulating a model for the complete data and then effectively imputing the missing data so that the combined observed and missing data fit the assumed complete-data model. In contrast, the latter effectively fits a different model for each pattern of observed and missing data, thereby directly revealing the sensitivity of conclusions to assumptions about distributions for which no observed data are available for estimation. A third class of models, which has remained mostly recondite, is based on “Gibbs” factorizations; although these may not imply a valid joint distribution, they have enjoyed success in applications because of their ease of use when implemented in MCMC software for multiple imputation, such as in SAS, Stata, and MICE. Considering the sensitivity of conclusions to assumptions that cannot be checked against observed data, whether those assumptions are implicit, as with selection models, or explicit, as with pattern-mixture models, is a critical ingredient of satisfactory analyses of data sets with missing values. Graphical displays, such as “enhanced tipping point analyses” implemented using modern computing, are critical ingredients for this enterprise.
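
For readers unfamiliar with the terminology, the three model classes named above can be sketched as factorizations. The notation is an assumption introduced here for illustration and is not taken from the lecture: Y = (Y_1, ..., Y_k) denotes the complete data, R the missingness indicators, and the Greek letters stand for generic parameters.

```latex
% Selection model: a complete-data model times a model for missingness given the data
p(Y, R \mid \theta, \psi) = p(Y \mid \theta)\, p(R \mid Y, \psi)

% Pattern-mixture model: a separate data model for each missingness pattern
p(Y, R \mid \phi, \pi) = p(Y \mid R, \phi)\, p(R \mid \pi)

% "Gibbs" (conditionally specified) model: one conditional per variable,
% with no guarantee that a joint distribution p(Y) consistent with all of them exists
p(Y_j \mid Y_{-j}, \theta_j), \qquad j = 1, \dots, k
```

In the pattern-mixture form, the distribution of the unobserved components of Y within each pattern R is not identified by the observed data, which is why the assumptions about it are explicit there and implicit in the selection form.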