
IMAGE Seminar: "PETRA: Parallel End-to-End Training with Reversible Architectures", Edouard Oyallon
20 March / 14:00 - 15:00
We will have the pleasure of hearing Edouard Oyallon, of the MLIA team, Sorbonne Université.
He will give an IMAGE seminar on Thursday, 20 March 2025 at 14:00 in seminar room F-200.
Title: "PETRA: Parallel End-to-End Training with Reversible Architectures"
Abstract:
Deep learning models are increasingly large, requiring new strategies for efficient training. In this talk, I will present PETRA, a novel approach that leverages reversible architectures and delayed gradient methods to enable efficient model parallelism. PETRA allows different stages of a model to compute independently across devices, drastically reducing memory usage while achieving significant computational speedups. By removing the need for weight stashing and decoupling forward and backward computations, PETRA provides a scalable alternative to standard backpropagation. I will discuss its implementation within a custom autograd framework and demonstrate its effectiveness on computer vision benchmarks such as ResNet-18, ResNet-34, and ResNet-50.
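To make the reversibility idea concrete, here is a minimal sketch (not the PETRA implementation, and independent of the delayed-gradient machinery) of an additive-coupling reversible block in the spirit of RevNets: the inputs of the block can be reconstructed exactly from its outputs, which is what allows a stage to recompute activations locally instead of storing them. The names F, G and RevBlock are illustrative assumptions.

import torch
import torch.nn as nn


class RevBlock(nn.Module):
    """Additive-coupling reversible block: y1 = x1 + F(x2), y2 = x2 + G(y1)."""

    def __init__(self, dim: int):
        super().__init__()
        self.F = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.G = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x1: torch.Tensor, x2: torch.Tensor):
        y1 = x1 + self.F(x2)
        y2 = x2 + self.G(y1)
        return y1, y2

    def inverse(self, y1: torch.Tensor, y2: torch.Tensor):
        # Exact reconstruction of the inputs from the outputs: this is what lets
        # a stage recompute its activations on the fly instead of stashing them.
        x2 = y2 - self.G(y1)
        x1 = y1 - self.F(x2)
        return x1, x2


if __name__ == "__main__":
    torch.manual_seed(0)
    block = RevBlock(dim=16)
    x1, x2 = torch.randn(4, 16), torch.randn(4, 16)
    with torch.no_grad():
        y1, y2 = block(x1, x2)
        r1, r2 = block.inverse(y1, y2)
    # Reconstruction is exact up to floating-point roundoff.
    print(torch.allclose(r1, x1, atol=1e-5), torch.allclose(r2, x2, atol=1e-5))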
This work was carried out in collaboration with Stephane Rivaud, Louis Fournier, Thomas Pumir, Eugene Belilovsky, and Michael Eickenberg.
The associated paper (ICLR 2025) is available at: https://openreview.net/pdf?id=0fhzSFsGUT