Optimized Management & Processing for learning


Overview

Artificial Intelligence has experienced unprecedented growth in recent years. This growth comes at the cost of more complex models, with more parameters, trained on increasingly large datasets. To be efficient, data preparation and training of these AI models must therefore take into account the various constraints of the training platforms: the number of available CPU cores, the amount of RAM, the number of GPUs or other specialized compute cores, the available VRAM, the available disk space, and the specific constraints of certain file systems, particularly when working on a cluster (e.g., a limited number of inodes).

This also requires a thorough understanding of the nature of the data being processed, whether images/videos, sound, text, or lidar data from an autonomous vehicle: their peculiarities in terms of processing and storage, and how efficiently they can be accessed when preparing batches for training. The parallelization capabilities of the system performing the training are equally decisive when choosing how to optimize the training process.

In this course, we focus on different types of data, discuss how to prepare them efficiently, and how to store them so that they can be read and processed quickly. We also explore how this fits with the parallelization capabilities of the host system (CPU/RAM/disk) and with the GPUs available for training machine learning models, including Deep Learning ones. The course syllabus can be summarized as follows:

  • Analysis of unimodal and multimodal datasets, study of their characteristics
  • Case study of a multimodal perception algorithm: group-level emotion recognition in videos in the wild
  • How to optimize data loading based on the limitations of the host machine(s) used for training? (see the first sketch after this list)
  • Data augmentation: why augment data, types of augmentation, static vs. dynamic augmentation, etc. (dynamic augmentation also appears in that sketch)
  • Parallelizing training (single CPU/multiple CPUs, single GPU, model parallelism, data parallelism in the case of multiple GPUs): when, why, how (see the second sketch after this list)
  • Case studies of low-power platforms (e.g., smartphones, real-time platforms such as STM32)
  • Distributed training on dedicated multi-user clusters
  • "Reproducible science": how to share code and models
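
To make the data loading and augmentation items above concrete, here is a minimal sketch, assuming a PyTorch/torchvision setup; the dataset directory data/train, the image size, and the batch size are hypothetical placeholders rather than values used in the course. It adapts the number of DataLoader worker processes to the CPU cores actually available to the job and applies dynamic (on-the-fly) augmentation, so no augmented copies are ever written to disk.

    import os
    import torch
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    # Dynamic augmentation: new random crops/flips are drawn every time a sample
    # is read, so nothing extra is stored on disk (unlike static augmentation).
    train_transforms = transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ColorJitter(brightness=0.2, contrast=0.2),
        transforms.ToTensor(),
    ])

    # Hypothetical image folder; any torch.utils.data.Dataset works the same way.
    train_set = datasets.ImageFolder("data/train", transform=train_transforms)

    # Count the cores actually allocated to the job (relevant on clusters),
    # with a portable fallback for systems without sched_getaffinity.
    if hasattr(os, "sched_getaffinity"):
        cpu_cores = len(os.sched_getaffinity(0))
    else:
        cpu_cores = os.cpu_count() or 1

    num_workers = min(8, cpu_cores)
    train_loader = DataLoader(
        train_set,
        batch_size=64,
        shuffle=True,
        num_workers=num_workers,                # parallel decoding/augmentation on the CPU
        pin_memory=torch.cuda.is_available(),   # faster host-to-GPU transfers
        persistent_workers=num_workers > 0,     # keep workers alive between epochs
    )

Capping the number of workers (here at 8) is a deliberate choice: each worker keeps its own prefetched batches in RAM, so more workers is not always better on a memory-constrained machine.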

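For the multi-GPU case, the sketch below illustrates data parallelism with PyTorch DistributedDataParallel, assuming the script is launched with torchrun --nproc_per_node=<number of GPUs>; the toy dataset and the linear model are placeholders for a real pipeline.

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP
    from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

    def main():
        # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for every process it spawns.
        dist.init_process_group(backend="nccl")
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        # Toy dataset and model, stand-ins for a real dataset and network.
        dataset = TensorDataset(torch.randn(1024, 32), torch.randint(0, 2, (1024,)))
        sampler = DistributedSampler(dataset)      # each process reads its own shard
        loader = DataLoader(dataset, batch_size=64, sampler=sampler)

        model = DDP(torch.nn.Linear(32, 2).cuda(local_rank), device_ids=[local_rank])
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
        loss_fn = torch.nn.CrossEntropyLoss()

        for epoch in range(5):
            sampler.set_epoch(epoch)               # reshuffle the shards every epoch
            for x, y in loader:
                x, y = x.cuda(local_rank), y.cuda(local_rank)
                optimizer.zero_grad()
                loss = loss_fn(model(x), y)
                loss.backward()                    # gradients are all-reduced across GPUs
                optimizer.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()

Model parallelism, by contrast, splits a single model across several GPUs and is mainly needed when the model itself no longer fits in the VRAM of one device.
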
In brief

  • Period: semester 9
  • Credits: 3 ECTS
  • Number of hours: 18h
  • Prerequisite: Python programming
  • Prerequisite: PyTorch framework (optional)

Grading

  • Short quizzes
  • Practical work submissions
  • Mini-project

Pedagogical team

Dominique Vaufreydaz: Dominique.Vaufreydaz@univ-grenoble-alpes.fr