# Explainable & Trustworthy AI

## Overview
This course focuses on techniques for improving the trustworthiness of machine-learning (ML) based systems, including methods for explaining decisions and for ensuring the safety of ML models. By the end of this course, students will be able to:

- generate explanations from complex models
- evaluate the quality of these explanations with respect to expectations
- formulate properties that an ML model should enforce
- verify whether these properties hold, and exhibit counter-examples when they do not.
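As a taste of the first two objectives, the sketch below shows one of the simplest post-hoc explanation ideas: feature occlusion, where each input feature is removed in turn and the drop in the model's score is read as that feature's importance. The toy linear `model` and its weights are purely illustrative stand-ins for any black-box predictor, not part of the course material.

```python
def model(x):
    # Hypothetical toy "black box": a linear score w . x.
    w = [0.5, -1.0, 2.0]
    return sum(wi * xi for wi, xi in zip(w, x))

def occlusion_importance(f, x, baseline=0.0):
    """Importance of each feature = score drop when it is occluded."""
    base_score = f(x)
    importances = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline          # "remove" feature i
        importances.append(base_score - f(occluded))
    return importances

x = [1.0, 1.0, 1.0]
print(occlusion_importance(model, x))   # [0.5, -1.0, 2.0]
```

For a linear model the recovered importances are exactly the weights, which is a useful sanity check; for non-linear models the same procedure yields a local, model-agnostic approximation.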
## Outline
- Introduction to trustworthy AI - 1.5h
- Reminders on machine learning and deep learning architectures
- Why do we need trustworthy systems?
- General overview and definitions
- Introduction to formal methods for verification - 1.5h
- Explainable AI (XAI) - 9h
- Explaining models and evaluating explanations (3h)
- Taxonomy of explanations
- Post-hoc explanation methods for models and decisions
- Designing self-explainable models
- Properties of explanations and how to measure them
- Introduction to formal XAI (1.5h)
- Practical work (4.5h)
- AI Safety - 6h
- Current trends in the evaluation of non-linear ML models (1.5h)
  - Adversarial machine learning (1.5h)
- Practical work (3h)
## In brief
- Period: semester 9
- Credits: 3 ECTS
- Number of hours: 18h
- Apogée:
## Recommended prerequisites
- Theory: fundamentals of machine learning (gradient descent) and deep learning architectures (CNNs, transformers), notions of computer vision.
- Practice: development in Python, notions of deep learning frameworks (PyTorch or TensorFlow).
## Pedagogical team
- Romain Xu-Darme: romain.xu-darme@cea.fr