Coordinate descent on the Stiefel manifold for deep neural network training

Massart, Estelle; Abrol, Vinayak
(2023) 31st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Location: Bruges, Belgium (4 October 2023)

Files

esann_paper12.pdf
  • Open Access
  • Adobe PDF
  • 270.74 KB

Details

Authors
  • Massart, Estelle
  • Abrol, Vinayak (CSE Department, Infosys Centre for AI, IIIT Delhi, India)
Abstract
To alleviate the cost incurred by orthogonality constraints in optimization and model training, we propose a stochastic coordinate descent algorithm on the Stiefel manifold. We compute expressions for geodesics on the Stiefel manifold with initial velocity aligned with coordinates of the tangent space and show that, analogously to the orthogonal group, iterate updates of coordinate descent methods can be efficiently implemented in terms of multiplications by Givens matrices. We illustrate our proposed algorithm on deep neural network training.
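The idea summarized in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it only shows why updates built from Givens matrices are cheap and feasible, assuming a matrix X with orthonormal columns stored as a list of rows, a toy objective 0.5 * ||X - T||^2 with a hypothetical target T, and rotations applied from the left to a randomly chosen pair of rows. All function names below are made up for the illustration.

```python
import math
import random


def givens_rotate_rows(X, i, j, theta):
    """Left-multiply X by a Givens rotation acting on rows i and j (in place)."""
    c, s = math.cos(theta), math.sin(theta)
    for k in range(len(X[0])):
        xi, xj = X[i][k], X[j][k]
        X[i][k] = c * xi - s * xj
        X[j][k] = s * xi + c * xj


def directional_derivative(X, grad, i, j):
    """Derivative at theta = 0 of f(G(i, j, theta) X), given the Euclidean gradient of f."""
    return sum(grad[j][k] * X[i][k] - grad[i][k] * X[j][k] for k in range(len(X[0])))


def gram_error(X):
    """Largest absolute entry of X^T X - I, a measure of lost orthonormality."""
    n, p = len(X), len(X[0])
    err = 0.0
    for a in range(p):
        for b in range(p):
            g = sum(X[r][a] * X[r][b] for r in range(n))
            err = max(err, abs(g - (1.0 if a == b else 0.0)))
    return err


def loss(X, T):
    """Toy objective (an assumption for this sketch): 0.5 * ||X - T||_F^2."""
    return 0.5 * sum((x - t) ** 2 for rx, rt in zip(X, T) for x, t in zip(rx, rt))


def grad_loss(X, T):
    """Euclidean gradient of the toy objective."""
    return [[x - t for x, t in zip(rx, rt)] for rx, rt in zip(X, T)]


def coordinate_descent(X, grad_fn, steps, step_size, seed=0):
    """Stochastic coordinate steps: pick a random row pair, rotate against the slope."""
    rng = random.Random(seed)
    n = len(X)
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        d = directional_derivative(X, grad_fn(X), i, j)
        givens_rotate_rows(X, i, j, -step_size * d)
    return X


if __name__ == "__main__":
    # X starts as the first two columns of the 4x4 identity; T is a hypothetical target.
    X = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
    T = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print("loss before:", loss(X, T))
    coordinate_descent(X, lambda Y: grad_loss(Y, T), steps=200, step_size=0.2)
    print("loss after:", loss(X, T), "orthonormality error:", gram_error(X))
```

Each step touches only two rows of X, so its cost is linear in the number of columns, and because a Givens matrix is orthogonal the columns of X stay orthonormal up to rounding without any retraction or re-orthogonalization.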

Citations

Massart, E., & Abrol, V. (2023). Coordinate descent on the Stiefel manifold for deep neural network training. ESANN 2023 proceedings. 31st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges, Belgium.