Impact of OVL Variation on AUC Bias Estimated by Non-parametric Methods

Carina Silva
Escola Superior de Tecnologia da Saúde de Lisboa, Instituto Politécnico de
Lisboa & Centro de Estatística e Aplicações, Universidade de Lisboa
Venue: ZOOM – 13:00 – Link Password: 860066
Thursday, 22 October 2020
Joint CEAUL and CEMAT Seminar and Seminar of the Master's in Biostatistics
Project reference: UIDB/00006/2020 and UIDB/04621/2020

The area under the ROC curve (AUC) is the most commonly used index in ROC methodology for evaluating the performance of a classifier that discriminates between two mutually exclusive conditions. The AUC takes values between 0.5 and 1, where values close to 1 indicate that the classification model has high discriminative power. The overlap coefficient (OVL) between two density functions is defined as the area common to both functions. It is used as a measure of agreement between two distributions and takes values between 0 and 1, where values close to 1 indicate almost completely overlapping densities. These two measures were used to construct the arrow plot for selecting differentially expressed genes. A simulation study using the bootstrap method is presented to estimate the bias and standard error of the AUC under empirical and kernel methods. To assess the impact of OVL variation on the AUC bias, samples were simulated from various distributions, considering different parameter values and fixed OVL values between 0 and 1. Each scenario used sample sizes of 15, 30, 50 and 100 and 1000 bootstrap replicates.
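
As an illustration only, one scenario of such a simulation could be sketched as follows. This is not the speaker's code: it assumes two normal populations with equal variance (for which OVL = 2Φ(−δ/2σ) and AUC = Φ(δ/σ√2), with δ the difference in means), it covers only the empirical (Mann-Whitney) AUC estimator and not the kernel one, and all function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for Phi and its inverse


def delta_for_ovl(ovl, sigma=1.0):
    """Mean shift that yields a target OVL in (0, 1].

    Assumption: two normal densities with common sd `sigma`,
    for which OVL = 2 * Phi(-delta / (2 * sigma)).
    """
    return -2.0 * sigma * STD_NORMAL.inv_cdf(ovl / 2.0)


def true_auc(delta, sigma=1.0):
    """Population AUC under the same equal-variance normal assumption."""
    return STD_NORMAL.cdf(delta / (sigma * np.sqrt(2.0)))


def empirical_auc(x, y):
    """Empirical AUC (Mann-Whitney): P(Y > X) + 0.5 * P(Y = X)."""
    diff = y[:, None] - x[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (x.size * y.size)


def bootstrap_auc(x, y, n_boot=1000, rng=None):
    """Bootstrap estimates of the bias and standard error of the empirical AUC."""
    rng = np.random.default_rng(rng)
    auc_hat = empirical_auc(x, y)
    reps = np.empty(n_boot)
    for b in range(n_boot):
        # Resample each group separately, with replacement
        xb = rng.choice(x, size=x.size, replace=True)
        yb = rng.choice(y, size=y.size, replace=True)
        reps[b] = empirical_auc(xb, yb)
    bias = reps.mean() - auc_hat   # bootstrap bias estimate
    se = reps.std(ddof=1)          # bootstrap standard error
    return auc_hat, bias, se


if __name__ == "__main__":
    rng = np.random.default_rng(2020)
    for ovl in (0.2, 0.5, 0.8):            # illustrative OVL values
        delta = delta_for_ovl(ovl)
        for n in (15, 30, 50, 100):        # sample sizes from the abstract
            x = rng.normal(0.0, 1.0, n)    # first condition
            y = rng.normal(delta, 1.0, n)  # second condition
            auc_hat, bias, se = bootstrap_auc(x, y, n_boot=1000, rng=rng)
            print(f"OVL={ovl} n={n} AUC={true_auc(delta):.3f} "
                  f"est={auc_hat:.3f} bias={bias:.4f} se={se:.4f}")
```

Fixing the OVL first and deriving the mean shift from it mirrors the design described above, in which the OVL is held at chosen values while the distribution parameters vary. Other distribution families would need their own OVL formula or a numerical integration of the pointwise minimum of the two densities.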