EU AI Act Browser

Article 10 — Data and Data Governance

1. High-risk AI systems which make use of techniques involving the training of AI models with data shall be developed on the basis of training, validation and testing data sets that meet the quality criteria referred to in paragraphs 2 to 5 whenever such data sets are used.

2. Training, validation and testing data sets shall be subject to data governance and management practices appropriate for the intended purpose of the high-risk AI system. Those practices shall concern in particular:
(a) the relevant design choices;
(b) data collection processes and the origin of data, and in the case of personal data, the original purpose of the data collection;
(c) relevant data-preparation processing operations, such as annotation, labelling, cleaning, updating, enrichment and aggregation;
(d) the formulation of relevant assumptions, in particular with respect to the information that the data are supposed to measure and represent;
(e) an assessment of the availability, quantity and suitability of the data sets that are needed;
(f) examination in view of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations;
(g) appropriate measures to detect, prevent and mitigate possible biases.
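
Illustrative note (not part of the legal text): the practices listed in points (a) to (g) can be tracked as a simple documentation checklist. The following is a minimal sketch in Python, assuming one free-text entry per point. The class name `GovernanceRecord`, its field names, and the helper `undocumented_points` are hypothetical and are not prescribed by the Regulation.

```python
from dataclasses import dataclass, fields


@dataclass
class GovernanceRecord:
    """Hypothetical checklist: one free-text entry per point of paragraph 2."""
    design_choices: str = ""           # point (a)
    collection_and_origin: str = ""    # point (b)
    preparation_operations: str = ""   # point (c)
    assumptions: str = ""              # point (d)
    availability_assessment: str = ""  # point (e)
    bias_examination: str = ""         # point (f)
    bias_mitigation: str = ""          # point (g)


def undocumented_points(record: GovernanceRecord) -> list[str]:
    """Return the names of entries that are still empty, in (a)-(g) order."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Example: only points (a) and (b) have been documented so far.
record = GovernanceRecord(
    design_choices="Tabular features only",
    collection_and_origin="Public registry export",
)
print(undocumented_points(record))
```

A checklist of this kind only records whether each point has been addressed; whether the documented practice is adequate for the intended purpose of the system remains a matter of substantive assessment.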
