Article 10, Chapter III: High-Risk AI Systems
Data and Data Governance
High-Risk
Summary
Sets data governance requirements for high-risk AI systems whose models are trained with data. Training, validation and testing data sets must meet the quality criteria laid down in paragraphs 2 to 5, which cover relevance, representativeness, freedom from errors, and completeness.
Keywords
data governance, training data, bias, data quality, representativeness
Legal Text
Article 10: Data and Data Governance

1. High-risk AI systems which make use of techniques involving the training of AI models with data shall be developed on the basis of training, validation and testing data sets that meet the quality criteria referred to in paragraphs 2 to 5 whenever such data sets are used.

2. Training, validation and testing data sets shall be subject to data governance and management practices appropriate for the intended purpose of the high-risk AI system. Those practices shall concern in particular:
(a) the relevant design choices;
(b) data collection processes and the origin of data, and in the case of personal data, the original purpose of the data collection;
(c) relevant data-preparation processing operations, such as annotation, labelling, cleaning, updating, enrichment and aggregation;
(d) the formulation of relevant assumptions, in particular with respect to the information that the data are supposed to measure and represent;
(e) an assessment of the availability, quantity and suitability of the data sets that are needed;
(f) examination in view of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations;
(g) appropriate measures to detect, prevent and mitigate possible biases.
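As an illustration only, the completeness and representativeness criteria and the bias examination in point (f) could be supported by a simple data set report. The sketch below is a minimal example under stated assumptions: the function name `dataset_quality_report`, the field names, and the `min_group_share` threshold are all hypothetical and do not come from the Regulation, which prescribes no specific metric or threshold. Such a report may inform a governance process but does not by itself demonstrate conformity.

```python
from collections import Counter


def dataset_quality_report(records, required_fields, group_field, min_group_share=0.1):
    """Summarise completeness and group representation for a list of dict records.

    All names and the default threshold are illustrative assumptions.
    """
    total = len(records)
    if total == 0:
        raise ValueError("records must not be empty")

    # Completeness: share of records where each required field is present and not None.
    completeness = {
        field: sum(1 for r in records if r.get(field) is not None) / total
        for field in required_fields
    }

    # Representation: share of records per value of the grouping attribute.
    counts = Counter(r.get(group_field) for r in records)
    group_shares = {group: n / total for group, n in counts.items()}

    # Groups whose share falls below the (hypothetical) minimum share.
    underrepresented = sorted(
        str(group) for group, share in group_shares.items() if share < min_group_share
    )

    return {
        "completeness": completeness,
        "group_shares": group_shares,
        "underrepresented": underrepresented,
    }


# Toy data, invented for this example.
records = [
    {"age": 34, "income": 52000, "region": "north"},
    {"age": None, "income": 48000, "region": "north"},
    {"age": 51, "income": 61000, "region": "north"},
    {"age": 29, "income": 39000, "region": "south"},
]

report = dataset_quality_report(
    records, required_fields=["age", "income"], group_field="region", min_group_share=0.3
)
print(report["completeness"])
print(report["underrepresented"])
```

In this toy data the `age` field is missing in one of four records and the `south` group falls below the chosen 30% share, so both would be flagged for review. What counts as sufficiently complete or representative depends on the intended purpose of the system and has to be decided and documented case by case.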