Machine learning beyond the data range: extreme quantile regression

Sebastian Engelke

Research Center for Statistics, University of Geneva

Location: Zoom – Link

12 January 2023 (Thursday), 17:00

Abstract:

Machine learning methods perform well in prediction tasks within the range of the training data. When interest is in quantiles of the response that go beyond the observed records, these methods typically break down. Extreme value theory provides the mathematical foundation for estimation of such extreme quantiles. A common approach is to approximate the exceedances over a high threshold by the generalized Pareto distribution. For conditional extreme quantiles, one may model the parameters of this distribution as functions of the predictors. Existing methods are either not flexible enough or do not generalize well to higher dimensions. We develop new approaches for extreme quantile regression that estimate the parameters of the generalized Pareto distribution with tree-based methods and recurrent neural networks. Our estimators outperform classical machine learning methods and methods from extreme value theory in simulation studies. We illustrate how the recurrent neural network model can be used for effective forecasting of flood risk.
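As a rough illustration of the threshold approach mentioned in the abstract, the sketch below fits a generalized Pareto distribution to exceedances over an empirical threshold and extrapolates to a quantile beyond it. It is only the classical unconditional version with no predictors, not the tree-based or neural network estimators of the talk. The function name `extreme_quantile` and the default threshold level of 0.9 are assumptions made for the example, and SciPy's `genpareto` is assumed to be available.

```python
import numpy as np
from scipy.stats import genpareto


def extreme_quantile(y, tau, threshold_level=0.9):
    """Peaks-over-threshold estimate of the tau-quantile of y (hypothetical helper)."""
    y = np.asarray(y, dtype=float)

    # Intermediate threshold u: an empirical quantile inside the data range.
    u = np.quantile(y, threshold_level)
    exceedances = y[y > u] - u

    # Empirical probability of exceeding the threshold.
    p_exceed = exceedances.size / y.size
    if 1.0 - tau >= p_exceed:
        raise ValueError("tau must lie beyond the threshold level")

    # Fit the generalized Pareto distribution with location fixed at 0.
    # SciPy returns (shape, loc, scale).
    xi, _, sigma = genpareto.fit(exceedances, floc=0)

    # P(Y > q) = p_exceed * P(Y - u > q - u | Y > u) = 1 - tau, so q - u is the
    # GPD quantile at level 1 - (1 - tau) / p_exceed.
    gpd_level = 1.0 - (1.0 - tau) / p_exceed
    return u + genpareto.ppf(gpd_level, xi, loc=0, scale=sigma)
```

For a heavy-tailed sample `y`, a call such as `extreme_quantile(y, 0.999)` extrapolates past the threshold using the fitted tail. In the conditional setting described in the abstract, the shape and scale parameters would instead depend on the predictors.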

Joint seminar CEMAT and CEAUL