DQM: Data Quality Metrics for AI components in the industry - IRT SystemX
Conference paper, Year: 2024

Sabrina Chaouche
  • Role: Author
  • PersonId: 1349359
Yoann Randon
  • Role: Author
  • PersonId: 1420969
Faouzi Adjed
Nadira Boudjani
  • Role: Author
  • PersonId: 1420970
Mohamed Ibn Khedher
  • Role: Author
  • PersonId: 1224892

Abstract

In industrial settings, measuring the quality of the data used to represent an intended domain of use and its operating conditions is both crucial and challenging. This paper presents a set of metrics that address this data quality issue, packaged as a library named DQM (Data Quality Metrics) for Machine Learning (ML) use. The library also includes additional metrics specific to industrial applications, and the work aims to assess various types of data and datasets. These metrics are used to characterize the training and evaluation datasets involved in building ML models for industrial use cases. Two categories of metrics are implemented in DQM: inherent data metrics, which evaluate the quality of a given dataset independently of the ML model (for example, statistical properties and attributes), and model-dependent metrics, which measure the quality of a dataset by considering the ML model outputs (for example, the gap between two datasets with respect to a given ML model). DQM is used within the Confiance.ai program to evaluate datasets used for industrial purposes such as autonomous driving.
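
To make the two categories concrete, here is a minimal sketch of one metric of each kind. It is not the DQM API: the function names `class_balance_entropy` and `output_gap`, and the choice of these particular statistics, are assumptions made for illustration only. The first function depends only on the dataset labels (an inherent metric); the second needs a model and compares its outputs on two datasets (a model-dependent metric).

```python
import numpy as np


def class_balance_entropy(labels):
    """Inherent metric (illustrative, hypothetical name): normalized Shannon
    entropy of the label distribution. 1.0 means perfectly balanced classes,
    0.0 means a single class."""
    _, counts = np.unique(np.asarray(labels), return_counts=True)
    if len(counts) < 2:
        return 0.0
    p = counts / counts.sum()
    # Divide by log(number of classes) so the result lies in [0, 1].
    return float(-(p * np.log(p)).sum() / np.log(len(counts)))


def output_gap(model, data_a, data_b):
    """Model-dependent metric (illustrative, hypothetical name): absolute
    difference between the mean model outputs on two datasets.
    `model` is any callable mapping an array of samples to numeric outputs."""
    out_a = np.asarray(model(data_a), dtype=float)
    out_b = np.asarray(model(data_b), dtype=float)
    return float(abs(out_a.mean() - out_b.mean()))


def toy_model(x):
    """Stand-in for a trained model: sums the features of each sample."""
    return np.asarray(x, dtype=float).sum(axis=1)


if __name__ == "__main__":
    train = np.zeros((4, 2))
    test = np.ones((4, 2))
    print("label balance:", class_balance_entropy([0, 0, 1, 1]))
    print("output gap:", output_gap(toy_model, train, test))
```

The mean-output difference is only the simplest possible notion of a "gap"; a distance between full output distributions would be a natural alternative, and nothing here implies which definition the library itself uses.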
Main file
AAAI_DQM___Data_centric_author_version.pdf (1.91 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04719346, version 1 (03-10-2024)

Identifiers

  • HAL Id: hal-04719346, version 1

Cite

Sabrina Chaouche, Yoann Randon, Faouzi Adjed, Nadira Boudjani, Mohamed Ibn Khedher. DQM: Data Quality Metrics for AI components in the industry. AI Trustworthiness and Risk Assessment for Challenged Contexts workshop (ATRACC). AAAI Fall symposium, Nov 2024, Arlington, United States. ⟨hal-04719346⟩
