Bayesian Nonparametric Statistical Methods – Theory and Applications

Prof. Peter Mueller – Department of Biostatistics – Division of Quantitative Sciences – University of Texas M D Anderson Cancer Center, Houston – USA
FCUL (DEIO) – Campo Grande – Building C6 – Floor 4 – Room 6.4.31 – 11, 12, 13 and 15 January 2010 – 9:00-10:30 / 11:00-12:30; and 14 January 2010 – Room 6.4.35 – 14:00-15:30 / 16:00-17:30
Monday, 11 January 2010 to Friday, 15 January 2010

Course on Nonparametric Bayesian Statistics

This is a reminder that the course Bayesian Nonparametric Statistical Methods – Theory and Applications, taught by Prof. Peter Mueller of the Department of Biostatistics, Division of Quantitative Sciences, The University of Texas M D Anderson Cancer Center, Houston, USA, will take place during the week of 11 to 15 January 2010, with two 90-minute sessions per day, at FCUL, Building C6, Floor 4, in rooms 6.4.31 and 6.4.35.

Those interested in attending should register informally by sending an e-mail to that effect, stating their name and affiliation, to ceaul@fc.ul.pt. For participants external to the body organizing the course (CEAUL), registration becomes effective only upon payment of a fee of 250 euros.

This short course will run on the following schedule: 9:00-10:30 and 11:00-12:30 on 11, 12, 13 and 15 January, and 14:00-15:30 and 16:00-17:30 on 14 January. For further details, please contact the CEAUL secretariat.

The general plan of the course is as follows:

January, 11th

CLASS 1: Overview of models for NP Bayes.

The course will focus on NP Bayes models for random probability measures. We will start with a brief comparative discussion of alternative models, including how some models arise as generalizations or special cases of others. This discussion will provide a summary of nonparametric Bayesian approaches, focusing on the structure of the underlying probability models. In the second part of this introductory class we will take a complementary and equally important perspective and review nonparametric Bayesian inference from the point of view of data analysis. For several traditional statistical inference problems we will discuss how nonparametric Bayesian generalizations would be approached, and which models would be appropriate.

The goal of this initial class is to provide a common thread for the rest of the course. In the remaining lectures we will fill in the details for many of the models, discuss the implementation of posterior simulation, and add a few more variations of the basic models introduced here.

CLASS 2: Dirichlet Process (DP) – the most beautiful of all

The DP has featured prominently as a special case of several more general models in Class 1. We will review more properties of the DP and discuss extensions to DP mixture (DPM) models. We will discuss at length posterior MCMC schemes for DP mixture models, covering both conjugate and non-conjugate models. We will also discuss in detail the use of DP priors to induce random partition models, and the properties and limitations of these random partition models.

The goal of Class 2 is to understand the flexibility and, most importantly, the limitations of the DP and DPM models. In particular, we will highlight when the DP is used as a convenient computational tool to induce random clustering vs. when it is used for statistical inference about an unknown probability distribution. Usually it is just the former (and I fully include my own work in this statement).
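
As a small illustration of how a DP prior induces a random partition, the following sketch simulates cluster labels from the Polya urn scheme: item i+1 joins an existing cluster with probability proportional to that cluster's size and opens a new cluster with probability proportional to the total mass parameter. This is not taken from the course materials; the function name `polya_urn_partition` and the values n = 50 and alpha = 1.0 are assumptions chosen for illustration only.

```python
import random

def polya_urn_partition(n, alpha, rng):
    """Sample cluster labels for n items from the Polya urn with total mass alpha.

    Hypothetical helper for illustration; not part of any library.
    """
    labels = []
    counts = []  # counts[k] = current size of cluster k
    for i in range(n):
        # With i items already seated, the next item joins cluster k with
        # probability counts[k] / (i + alpha) and opens a new cluster with
        # probability alpha / (i + alpha).
        u = rng.random() * (i + alpha)
        cumulative = 0.0
        chosen = len(counts)  # default: open a new cluster
        for k, size in enumerate(counts):
            cumulative += size
            if u < cumulative:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(0)
        counts[chosen] += 1
        labels.append(chosen)
    return labels, counts

rng = random.Random(0)  # fixed seed only so the run is reproducible
labels, counts = polya_urn_partition(50, alpha=1.0, rng=rng)
print(len(counts), "clusters with sizes", counts)
```

Rerunning with larger alpha tends to produce more clusters. The partition law has no other tuning parameter, which is one way to see the limited nature of the clustering implied by the DP.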

January, 12th

CLASS 3: Polya trees (PT)

We will review the definition of PT models and statistical inference with them. We will highlight important limitations of the PT model for data analysis and discuss variations of the model to mitigate some of these shortcomings. We will discuss applications of PT models to nonparametric Bayesian survival analysis and to inference for ROC curves.

No big goals for Class 3. We just want to introduce in detail a reasonably popular alternative to the sometimes over-used DP model and make a convincing argument that implementation is not all that difficult.

CLASS 4: Dependent DP (DDP)

Construction of DDP models with common weights versus common locations. Posterior inference and applications to inference for related studies. Linear DDP, spatial DDP and other variations. We will discuss the special case of the nested DP and the related hierarchical DP.

A goal of Class 4 is to understand the reduction of the DDP to a simple DP when the DDP is restricted to finitely many random distributions.

January, 13th

CLASS 5: Species sampling models (SSM) and random partition models

Definition, alternative constructions and various defining properties. Data analysis motivations for choosing SSMs different from the DP prior. Random partitions implied by SSMs. Properties of predictive probability functions (PPF), and the limited nature of the PPF implied by the DP.

CLASS 6: Product partition models (PPM)

We will discuss definitions and key concepts for posterior inference. We will consider extensions of the basic PPM to random clustering with covariates. We will review applications to inference for clinical trials with related subpopulations.

The goal of Classes 5 and 6 is a more in-depth discussion of the random clustering that is implied by the DP, and an appreciation of the limitations of this random partition.

January, 14th

LAB 1: MCMC for DP mixtures.

Implementations in R, using the Polya urn and finite DP approximations.
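
To give a flavour of what a finite DP approximation looks like, here is a minimal sketch of truncated stick-breaking. The lab itself uses R; this sketch is in Python and is not the lab code. The helper name `truncated_stick_breaking`, the truncation level of 20, the total mass alpha = 1.0 and the standard normal base measure are all assumptions made for illustration.

```python
import random

def truncated_stick_breaking(alpha, truncation, base_sampler, rng):
    """Draw weights and atoms of a DP truncated to `truncation` point masses.

    Hypothetical helper for illustration; not part of any library.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last weight absorbs the leftover mass
    atoms = [base_sampler(rng) for _ in range(truncation)]
    return weights, atoms

rng = random.Random(0)  # fixed seed only so the run is reproducible
weights, atoms = truncated_stick_breaking(
    alpha=1.0,
    truncation=20,
    base_sampler=lambda r: r.gauss(0.0, 1.0),  # assumed base measure N(0, 1)
    rng=rng,
)
print(round(sum(weights), 6))
```

Assigning all leftover mass to the final atom keeps the weights summing to one, so the result is a proper discrete probability measure that can be plugged into a mixture model sampler.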

LAB 2: DPpackage

We will use DPpackage to carry out NP Bayesian inference.

January, 15th

CLASS 7a (part I): Catch up and review

We will complete unfinished discussions from earlier classes, and review models introduced in Classes 1-7.

CLASS 7b (part II): Completely random measures and NRMIs

We will review the definition of the DP model as a normalized gamma process and the generalizations that can be proposed based on this characterization.
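
The finite-dimensional version of this characterization can be sketched in a few lines: on a finite partition of the sample space, independent gamma variables with shapes given by the total mass times the base-measure probabilities, divided by their sum, give the Dirichlet-distributed probabilities that a DP assigns to the partition cells. The sketch below is an illustration only, not course material; the helper name, the three-cell partition and the value alpha = 2.0 are assumptions.

```python
import random

def dirichlet_via_normalized_gammas(shape_params, rng):
    """Draw a Dirichlet vector by normalizing independent gamma variables.

    Hypothetical helper for illustration; not part of any library.
    """
    gammas = [rng.gammavariate(a, 1.0) for a in shape_params]  # common scale 1
    total = sum(gammas)
    return [g / total for g in gammas]

alpha = 2.0                   # assumed total mass parameter
base_probs = [0.2, 0.3, 0.5]  # assumed base-measure probabilities of 3 cells
rng = random.Random(1)        # fixed seed only so the run is reproducible
probs = dirichlet_via_normalized_gammas([alpha * p for p in base_probs], rng)
print([round(p, 3) for p in probs])
```

Replacing the gamma variables by increments of another completely random measure before normalizing is the idea behind the NRMI generalizations mentioned above.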

CLASS 8: Model comparison and validation

We will discuss the use of NP Bayes models to robustify and validate parametric models and general strategies for model validation, including the (tricky) evaluation of Bayes factors for NP Bayes models.

On behalf of the organizers,

Carlos Daniel Paulino