- Dr. Vitor Sousa – Instituto Gulbenkian de Ciência; Prof. Isabel Natário – DM/FCT, Universidade Nova de Lisboa; Prof. Paulo Soares – DM/FCT, Universidade Técnica de Lisboa
- FCUL (DEIO) – Campo Grande – Building C6, 4th Floor, Room 6.4.31, from 14:30 to 16:30
- Friday, 19 March 2010
In recent years, approximate Bayesian computation (ABC) methods have become popular in population genetics as an alternative to full-likelihood methods for making inferences under complex models. The principle of ABC methods is to obtain an approximate sample from the posterior distribution. This is done by running simulations across a wide range of parameter values within a model, in order to find the parameter values that generate data sets most closely matching the observed data. ABC algorithms therefore do not require an explicit likelihood function and can be applied to any problem where it is possible to simulate from the model of interest (Beaumont et al., 2002; Marjoram et al., 2003). The first ABC approaches were based on a rejection scheme that involves four steps:
(i) simulation of datasets with different parameter values drawn from the prior distributions; (ii) computation of a set of sufficient summary statistics for each dataset; (iii) comparison of the observed and simulated summary statistics using a distance metric, e.g.
Euclidean distance; and (iv) rejection of the parameters that generated distant datasets. The resulting sample approximates the posterior distribution P(theta | d(S_s, S_o) < t), where theta denotes the parameters, d(S_s, S_o) is the distance between the observed (S_o) and simulated (S_s) summary statistics, and t is an arbitrary threshold. The choice of t (and of the number of simulations) reflects a trade-off between computational cost and accuracy. If the summary statistics are sufficient for the parameter theta, the approximate sample converges to the correct posterior distribution as t goes to 0. The quality of the ABC inference is therefore expected to depend on the chosen summary statistics, the distance metric and the tolerance level t.
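The four-step rejection scheme above can be sketched in a few lines of code. The toy model below (a normal with unknown mean, the sample mean as summary statistic, and a uniform prior) is purely illustrative and not from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: infer the mean theta of a Normal(theta, 1)
# sample of size 100, using the sample mean as the summary statistic.
obs = rng.normal(2.0, 1.0, size=100)    # the "observed" data
s_obs = obs.mean()                      # observed summary statistic S_o

n_sims = 50_000
tolerance = 0.05                        # the threshold t

# (i) draw parameter values from the prior, here Uniform(-5, 5)
theta = rng.uniform(-5, 5, size=n_sims)

# (ii) simulate one dataset per draw and compute its summary statistic S_s
sims = rng.normal(theta[:, None], 1.0, size=(n_sims, 100))
s_sim = sims.mean(axis=1)

# (iii) distance between observed and simulated summaries
dist = np.abs(s_sim - s_obs)

# (iv) reject parameters whose simulated data fell outside the tolerance
posterior_sample = theta[dist < tolerance]
```

With a flat prior, the mean of `posterior_sample` lands close to `s_obs`, and shrinking `tolerance` (at the price of more simulations) tightens the approximation.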
Several approaches to improve the accuracy and precision of these methods have been proposed in recent years, including ABC within Markov chain Monte Carlo (Marjoram et al., 2003), sequential Monte Carlo approaches (Sisson et al., 2007; Beaumont et al., 2010) and post-sampling regression adjustments (Beaumont et al., 2002; Blum and François, 2008). In this talk I will present the principles of these methods and discuss some results from their application to population genetics.
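Of the refinements mentioned, the regression adjustment of Beaumont et al. (2002) is easy to illustrate: the accepted parameters are regressed on the discrepancy between simulated and observed summaries, and each draw is shifted to where it "would have been" had the summaries matched exactly. The sketch below uses a made-up one-statistic toy simulator, so all names and values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in rejection step: theta from a flat prior, a noisy summary
# statistic, and a deliberately generous tolerance.
s_obs = 2.0
theta = rng.uniform(-5, 5, 20_000)
s_sim = theta + rng.normal(0, 0.1, theta.size)   # toy simulator's summary

keep = np.abs(s_sim - s_obs) < 0.5
theta_acc, s_acc = theta[keep], s_sim[keep]

# Linear regression of accepted theta on the summary discrepancy
X = np.column_stack([np.ones(s_acc.size), s_acc - s_obs])
beta, *_ = np.linalg.lstsq(X, theta_acc, rcond=None)

# Adjust each accepted draw towards the point where S_s = S_o
theta_adj = theta_acc - beta[1] * (s_acc - s_obs)
```

The adjusted sample `theta_adj` is noticeably tighter than the raw accepted draws, which is what lets the regression version tolerate a larger threshold t.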
When data arrive sequentially in time we are often interested in making inferences each time a new observation becomes available. In a Bayesian setting this means that we need to sequentially update the model posterior distributions as the data come in. For most real-world situations it is not possible to obtain the analytical expressions needed to compute the evolving sequence of posterior distributions. Sequential Monte Carlo methods are a set of very flexible, easy-to-implement simulation-based methods that overcome this difficulty, and they can be applied in very general settings. In this talk we will introduce sequential Monte Carlo methods and describe some of the most commonly used filters, Importance Sampling and Sequential Importance Sampling. Some examples are given.
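As a minimal sketch of Sequential Importance Sampling, consider a toy linear-Gaussian state-space model (model and values are illustrative, not from the talk): particles are propagated through the state transition and reweighted by the likelihood of each new observation, with no resampling, which is what distinguishes plain SIS:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy model:  x_t = 0.9 x_{t-1} + N(0,1),   y_t = x_t + N(0, 0.5^2)
T, N = 50, 5_000
x_true = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x_true[t] = 0.9 * x_true[t - 1] + rng.normal()
    y[t] = x_true[t] + rng.normal(0, 0.5)

# SIS with the prior (transition) as proposal distribution
particles = rng.normal(0, 1, N)
logw = np.zeros(N)                 # log-weights, updated sequentially
estimates = []
for t in range(1, T):
    particles = 0.9 * particles + rng.normal(0, 1, N)   # propagate
    logw += -0.5 * ((y[t] - particles) / 0.5) ** 2      # reweight
    w = np.exp(logw - logw.max())                       # stable normalisation
    w /= w.sum()
    estimates.append(np.sum(w * particles))             # filtered mean of x_t
```

Running this also exhibits the well-known weakness of plain SIS: the weights degenerate over time, which motivates the resampling step of full particle filters.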
Slice sampling is a class of Markov chain sampling methods, proposed by Neal (2003), that require little knowledge about the distribution being sampled and can adapt to the local characteristics of that distribution. In this talk we will review the univariate slice sampler that is used in WinBUGS and discuss some of the implementation difficulties and potential benefits of a multivariate version.
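The univariate slice sampler with the stepping-out and shrinkage procedures of Neal (2003) fits in a short function. The sketch below targets a standard normal for illustration; the function name and tuning value `w` are my own choices, not from the talk:

```python
import random

random.seed(0)

def slice_sample(log_f, x0, w=1.0, n=10_000):
    """Univariate slice sampler with stepping-out and shrinkage (Neal 2003).

    log_f : log of an unnormalised target density
    x0    : starting point
    w     : initial bracket width (the main tuning parameter)
    """
    xs, x = [], x0
    for _ in range(n):
        # Draw the auxiliary height defining the horizontal slice
        logy = log_f(x) - random.expovariate(1.0)
        # Step out: grow the bracket until both ends leave the slice
        left = x - w * random.random()
        right = left + w
        while log_f(left) > logy:
            left -= w
        while log_f(right) > logy:
            right += w
        # Shrinkage: sample uniformly in the bracket, shrinking on rejection
        while True:
            x1 = random.uniform(left, right)
            if log_f(x1) > logy:
                x = x1
                break
            if x1 < x:
                left = x1
            else:
                right = x1
        xs.append(x)
    return xs

# Standard normal target (log density up to a constant)
draws = slice_sample(lambda x: -0.5 * x * x, 0.0)
```

Note that only the log density is required, up to an additive constant, which is what makes the method attractive as a generic sampler inside systems such as WinBUGS.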