Addressing data scarcity in classification of vertebrate footprints using transfer learning and procedurally simulated footprints

Carolina Marques, CEAUL, FCUL

Ciências ULisboa, C6, SASLab (Room 6.4.29)

18 November 2024 (Monday) – 15:30

Abstract:

Studying vertebrate footprints provides useful insights into the distribution and movements of both past (e.g. dinosaurs) and present (e.g. mammals) fauna. Automatically classifying vertebrate footprints from photographs is challenging because footprint images vary widely and few labelled datasets are available. To address this issue, a novel Unity application is used to generate procedurally simulated footprints. The simulated data are used to train a Convolutional Neural Network (CNN) to classify the different simulated footprints, and transfer learning is then used to fine-tune a CNN that classifies real footprints. Two datasets were used to evaluate this approach: (1) 100,000 simulated footprints from 10 different animal groups, and (2) 800 real footprints from 5 different animal groups. Transfer learning improved the accuracy of the CNN trained with the real footprints by more than 30%. These results highlight the importance of innovative data augmentation techniques for improving accuracy and reliability, especially when data are scarce.

Seminar held as part of the Master's in Biostatistics (Mestrado em Bioestatística)