Soraia Pereira (CEAUL-FCUL);
Tiago André Marques (CEAUL/DBA/University of St. Andrews/FCUL);
Type of scholarship
PhD scholarship
Project status:
The increasing availability of data with complex spatial and temporal correlation structures creates a need for methodologies that can handle such features. In particular, when the data are point-referenced and we are interested in their spatial distribution, spatial point processes are the natural approach.
Traditional models for such processes fall into three main classes, depending on the spatial pattern: Poisson models; Cox and cluster models; and Gibbs models (Baddeley et al., 2015; Moller and Waagepetersen, 2003). Poisson models are adequate for completely random patterns, in which the points are independent of each other; Cox and cluster models are better suited to aggregated patterns; and Gibbs models to regular patterns. Although Gibbs models are more commonly used for regular patterns, they are flexible enough to model aggregated patterns as well.
However, this classification into three types of pattern can be a rough approximation in some real-life scenarios. Many patterns in nature are aggregated at one scale and regular at another, and a single pattern can also show different levels of aggregation. For this reason, models have recently been proposed in the spatial statistics literature that handle different types of interaction between points at different scales. Andersen and Hahn (2016) proposed Matérn thinned Cox processes to model aggregation at medium scales together with regularity at small scales. The general idea is to apply a small-scale thinning to a Cox process.
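The thinning idea can be illustrated with a small simulation. The sketch below is not the construction of Andersen and Hahn (2016); it assumes a Thomas-type cluster process as the underlying Cox process and a Matérn type II thinning rule, and the function names and parameter values (`kappa`, `mu`, `sigma`, `radius`) are hypothetical choices made only for illustration.

```python
import math
import random


def simulate_poisson(mean, rng):
    """Draw a Poisson count by counting unit-rate exponential arrivals before `mean`."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def simulate_clustered(kappa, mu, sigma, rng):
    """Thomas-type cluster process on the unit square (one example of a Cox process).

    Parents are Poisson with intensity kappa; each parent has a Poisson(mu)
    number of offspring displaced by isotropic Gaussian noise with sd sigma.
    Edge effects are ignored: offspring falling outside the square are dropped.
    """
    points = []
    for _ in range(simulate_poisson(kappa, rng)):
        px, py = rng.random(), rng.random()
        for _ in range(simulate_poisson(mu, rng)):
            x, y = rng.gauss(px, sigma), rng.gauss(py, sigma)
            if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
                points.append((x, y))
    return points


def matern_ii_thin(points, radius, rng):
    """Matérn type II thinning: give each point a random mark and keep a point
    only if no other point within `radius` has a smaller mark."""
    marks = [rng.random() for _ in points]
    kept = []
    for i, p in enumerate(points):
        if all(
            marks[i] < marks[j] or math.dist(p, q) >= radius
            for j, q in enumerate(points)
            if j != i
        ):
            kept.append(p)
    return kept


rng = random.Random(1)
clustered = simulate_clustered(kappa=20, mu=15, sigma=0.03, rng=rng)  # aggregation
thinned = matern_ii_thin(clustered, radius=0.02, rng=rng)  # small-scale regularity
```

After thinning, no two retained points are closer than `radius`, while the cluster structure inherited from the parent process remains visible at larger distances.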
A very different approach that also accommodates this kind of pattern is the model proposed by Raeisi et al. (2021). The authors propose hybrids of Gibbs models, in particular hybrids of Geyer saturation models, in a spatio-temporal framework. The hybrid approach brings flexibility because it combines spatial point process models with different structures. Baddeley et al. (2013) explain the concept of these models and their implementation. Raeisi et al. (2021) illustrate their model with an application to forest fire occurrences. Some clustering is expected in the locations of forest fires because of the characteristics of the ground, but some regularity is also expected, since the same area is unlikely to burn again within a short period of time. This is natural because the recorded locations are usually the centroids of the burned areas.
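To make the hybrid idea concrete, the following sketch evaluates an unnormalised log-density of a purely spatial hybrid of Geyer saturation components, in the spirit of Baddeley et al. (2013). It is a simplified illustration, not the spatio-temporal model of Raeisi et al. (2021): the intensity parameter `beta` is assumed constant, and the function names and the `(r, s, gamma)` triples are hypothetical.

```python
import math


def neighbour_count(i, points, r):
    """Number of other points within distance r of points[i]."""
    return sum(
        1 for j, q in enumerate(points) if j != i and math.dist(points[i], q) <= r
    )


def hybrid_geyer_log_density(points, beta, components):
    """Unnormalised log-density of a hybrid of Geyer saturation components.

    `components` is a list of (r, s, gamma) triples: interaction radius,
    saturation threshold and interaction parameter. gamma < 1 favours
    inhibition at scale r, gamma > 1 favours clustering.
    """
    log_f = len(points) * math.log(beta)
    for r, s, gamma in components:
        # Each point contributes its neighbour count, capped at the saturation s.
        saturated = sum(
            min(s, neighbour_count(i, points, r)) for i in range(len(points))
        )
        log_f += saturated * math.log(gamma)
    return log_f


pattern = [(0.10, 0.10), (0.11, 0.10), (0.50, 0.50)]
# Inhibition within distance 0.02 combined with clustering within distance 0.2.
value = hybrid_geyer_log_density(pattern, beta=100.0,
                                 components=[(0.02, 1, 0.5), (0.2, 2, 1.5)])
```

Because each component enters the density as a separate factor, interactions of opposite sign can be assigned to different radii, which is what allows regularity and clustering to coexist in one model.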
Although these models represent a major step forward in the spatial point process literature, we believe there is still room for interesting extensions in this direction.
- The general objective is to develop spatial point process models that can handle different types of interaction at different scales. In particular, we are interested in models that identify the most significant drivers of a phenomenon of interest and that can make spatio-temporal predictions with good precision.
- We intend to use a Bayesian approach, since it has shown several advantages in terms of model flexibility for such complex structures.
Summary of the Work Plan
We propose the following steps for this PhD program:
- A Bayesian framework for the methods proposed by Andersen and Hahn (2016) and Raeisi et al. (2021). The authors proposed frequentist approaches and implemented their methods using the spatstat R package (Baddeley and Turner, 2005). Bayesian methods are known to bring flexibility and good computational performance to models with complex spatial structures. Here we propose a hierarchical Bayesian version of the cited methods.
- A spatio-temporal extension of the method proposed by Andersen and Hahn (2016), to understand the effect of potential covariates on the spatial distribution of the points while taking the spatio-temporal dependence into account and allowing spatio-temporal predictions. For that purpose we suggest modelling the intensity of the thinned process through a linear predictor that includes spatio-temporally structured effects. The idea has some points of contact with the work of Brix and Diggle (2001), who propose a spatio-temporal prediction method for log-Gaussian Cox processes. Inference for the proposed model could be carried out with the integrated nested Laplace approximation (INLA) methodology, following Simpson et al. (2016).
- Comparison of the two approaches in a simulation study and on a real data problem. As far as we know, no study has compared these two approaches on the same practical problem.
- Incorporation of regularization methods to improve the accuracy and interpretability of the parameter estimates. Regularization techniques have gained importance in recent years because of the large number of predictors that are often available to explain the variability of a variable of interest. These methods reduce the number of predictors during the inference process, which increases precision and eases interpretation. Here we propose to use a LASSO-type version of the cited models, based on Tibshirani (1996) and Park and Casella (2008).
- Development of methods to deal with regularity at large scales together with clustering at small scales. Some processes in nature are also characterized by this behaviour. A natural model for it might be a thinned version of a Gibbs model. Another possibility is a hybrid model that combines a Cox model with a Geyer saturation model. These options arise naturally as slight modifications of the methods proposed by Andersen and Hahn (2016) and Raeisi et al. (2021).
- Comparison of the computational performance of these models for Bayesian inference using JAGS, Stan and NIMBLE. We also plan to use INLA to implement some of these models. This is an important step, since the computational performance of complex models is usually a challenge. Moreover, the relative performance of the software packages depends on the model. See, for instance, Taylor and Diggle (2014) for a comparison between MCMC and INLA for log-Gaussian Cox processes.
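The way a LASSO-type penalty removes predictors, as mentioned in the regularization step above, can be seen in the simplest textbook setting. The sketch below assumes a linear model with an orthonormal design, for which the lasso estimate of each coefficient is the soft-thresholded least-squares estimate (Tibshirani, 1996); it is not the point process model proposed here, and the coefficient values and the threshold are hypothetical.

```python
def soft_threshold(z, penalty):
    """Lasso solution for one coefficient under an orthonormal design:
    shrink z towards zero by `penalty` and set it to zero if |z| <= penalty."""
    if z > penalty:
        return z - penalty
    if z < -penalty:
        return z + penalty
    return 0.0


unpenalised = [2.5, -0.3, 0.1, -1.7]  # hypothetical least-squares estimates
shrunk = [soft_threshold(b, 0.5) for b in unpenalised]
```

Coefficients whose absolute value is below the threshold are set exactly to zero, so the corresponding predictors drop out of the model, while the remaining ones are shrunk towards zero. In the Bayesian formulation of Park and Casella (2008), the lasso estimate corresponds to the posterior mode under independent Laplace priors on the coefficients.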
We expect to implement the proposed methods successfully and to illustrate them with real problems such as forest fires and fisheries applications, although other application areas might also be explored. One of the outputs of these analyses will be the spatial and temporal prediction of the burned area for a given period of time, together with the corresponding measures of variability. We expect to collaborate with teams from the application areas to identify the relevant questions and answers. The methodologies and the results of the applications will be published in national and international scientific journals.
Andersen, I. T. and Hahn, U. (2016), “Matérn thinned Cox processes”, Spatial Statistics, 15, 1-21.
Baddeley, A., Rubak, E., and Turner, R. (2015), Spatial Point Patterns: Methodology and Applications with R, London: Chapman and Hall/CRC Press.
Baddeley, A. and Turner, R. (2005), “spatstat: An R package for analyzing spatial point patterns”, Journal of Statistical Software, 12, 1-42.
Baddeley, A., Turner, R., Mateu, J., and Bevan, A. (2013), “Hybrids of Gibbs point process models and their implementation”, Journal of Statistical Software, 55, 1-43.
Brix, A. and Diggle, P. J. (2001), “Spatiotemporal prediction for log-Gaussian Cox processes”, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63, 823-841.
Moller, J. and Waagepetersen, R. (2003), Statistical inference and simulation for spatial point processes, Chapman & Hall.
Park, T. and Casella, G. (2008), “The Bayesian Lasso”, Journal of the American Statistical Association, 103, 681-686.
Raeisi, M., Bonneu, F., and Gabriel, E. (2021), “A spatio-temporal multi-scale model for Geyer saturation point process: Application to forest fire occurrences”, Spatial Statistics, 41, 100492.
Simpson, D., Illian, J. B., Lindgren, F., Sørbye, S. H., and Rue, H. (2016), “Going off grid: computationally efficient inference for log-Gaussian Cox processes”, Biometrika, 103, 49-70.
Taylor, B. M. and Diggle, P. J. (2014), “INLA or MCMC? A tutorial and comparative evaluation for spatial prediction in log-Gaussian Cox processes”, Journal of Statistical Computation and Simulation, 84, 2266-2284.
Tibshirani, R. (1996), “Regression shrinkage and selection via the Lasso”, Journal of the Royal Statistical Society: Series B, 58, 267-288.