# Explainable & Trustworthy AI

## Overview
This course focuses on techniques for improving the trustworthiness of machine-learning (ML) based systems, including methods for explaining decisions and for ensuring the safety of ML models. By the end of this course, students will be able to:

- generate explanations from complex models
- evaluate the quality of these explanations with respect to expectations
- formulate properties that an ML model should enforce
- verify whether these properties hold, and exhibit counter-examples when they do not.
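As a taste of the first two objectives, the sketch below shows one of the simplest post-hoc explanation ideas: feature occlusion, where each input feature is removed in turn and the drop in the model's score is read as that feature's importance. The toy linear `model` and its weights are purely illustrative stand-ins for any black-box predictor, not part of the course material.

```python
def model(x):
    # Hypothetical toy "black box": a linear score w . x.
    w = [0.5, -1.0, 2.0]
    return sum(wi * xi for wi, xi in zip(w, x))

def occlusion_importance(f, x, baseline=0.0):
    """Importance of each feature = score drop when it is occluded."""
    base_score = f(x)
    importances = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline          # "remove" feature i
        importances.append(base_score - f(occluded))
    return importances

x = [1.0, 1.0, 1.0]
print(occlusion_importance(model, x))   # [0.5, -1.0, 2.0]
```

For a linear model the recovered importances are exactly the weights, which is a useful sanity check; for non-linear models the same procedure yields a local, model-agnostic approximation.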
## Outline
- Introduction to trustworthy AI - 1.5h
- Reminders on machine learning and deep learning architectures
- Why do we need trustworthy systems?
- General overview and definitions
- Introduction to formal methods for verification - 1.5h
- Explainable AI (XAI) - 9h
- Explaining models and evaluating explanations (3h)
- Taxonomy of explanations
- Post-hoc explanation methods for models and decisions
- Designing self-explainable models
- Properties of explanations and how to measure them
- Introduction to formal XAI (1.5h)
- Practical work (4.5h)
- AI Safety - 6h
- Current trends in the evaluation of non-linear ML models (1.5h)
  - Adversarial machine learning (1.5h)
- Practical work (3h)
## In brief
- Period: semester 9
- Credits: 3 ECTS
- Number of hours: 18h
- Apogée:
## Recommended prerequisites
- Theory: fundamentals of machine learning (gradient descent) and deep learning architectures (CNNs, transformers), notions of computer vision.
- Practice: development in Python, notions of deep learning frameworks (PyTorch or TensorFlow).
## Pedagogical team
- Romain Xu-Darme: romain.xu-darme@cea.fr