Themes: Mathematics, Physics
Thesis location: Laboratoire de Statistique et des Méthodes Avancées (LSMA) - Cadarache
Start: October 2021
Master's degree in Data Science/Statistics/Applied Mathematics
Age limit: 26 years old unless otherwise stated.
This thesis concerns the reconstruction of data from the safety studies performed at IRSN. In practice, the amount of data, whether measured experimentally or simulated by complex computer codes, is not always sufficient to capture a phenomenon of interest precisely over its whole range of variation. To circumvent this problem, prediction methods are used to approximate the phenomenon where it has not been observed. There is a large literature on such methods, several of which exploit the appealing framework of machine learning. However, their adaptation to multidimensional objects remains an active field of research, known as object-oriented data analysis. This analysis relies on interpreting an object as a point in a feature space. Among feature spaces, the Wasserstein space is well suited to prediction, since it makes it possible to use optimal transport methods, which are efficient for handling complex objects. The objective of the thesis is to establish the connection between classical machine learning approaches and optimal transport in order to propose a prediction tool for multidimensional data. A mathematical analysis will then be conducted to quantify the uncertainty associated with the new predictor. Finally, several applications drawn from IRSN projects will be carried out to validate the new developments and compare them with existing approaches from the literature.