Meta Learning Seminar
Recent Advancements on Natural Gradient: Meta-Learned Gradient Preconditioning
The natural gradient, first introduced by Amari (1998), allows neural networks to be trained by explicitly taking into account the non-Euclidean geometry of the (conditional) statistical models associated with them. Computing the natural gradient requires inverting the Fisher matrix, which limits its adoption in training large networks unless approximations are introduced to reduce the computational cost. Meta-learned gradient preconditioning is an approach in meta-learning in which the gradient of a neural network is preconditioned based on the task. For instance, the preconditioning can be obtained through another network that processes the available information about the task to be learned, e.g., a small set of images together with the labels of the classification task. In general, the natural gradient and second-order optimization methods can be considered specific instances of gradient preconditioning, exploiting information either about the geometry of the space (e.g., the natural gradient) or about the function to be optimized (e.g., Newton's method). In this presentation we discuss the use of the natural gradient in meta-learning and propose a combined framework for meta-learning natural gradient preconditioning.
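To make the notion of gradient preconditioning concrete, here is a minimal NumPy sketch of a generic preconditioned update and of the natural gradient as one instance of it. It is an illustration under stated assumptions, not the framework proposed in the talk: the model is a plain logistic regression (chosen only because its Fisher matrix has a closed form), the Fisher matrix is inverted exactly with a small damping term, and all function names (`preconditioned_step`, `natural_gradient_step`, and so on) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def nll(w, X, y):
    """Mean negative log-likelihood of logistic regression (numerically stable)."""
    z = X @ w
    return np.mean(np.logaddexp(0.0, z) - y * z)


def nll_gradient(w, X, y):
    """Euclidean gradient of the mean negative log-likelihood."""
    return X.T @ (sigmoid(X @ w) - y) / len(y)


def fisher_matrix(w, X):
    """Fisher matrix of the conditional model p(y | x; w), averaged over the inputs X."""
    p = sigmoid(X @ w)
    return (X * (p * (1.0 - p))[:, None]).T @ X / X.shape[0]


def preconditioned_step(w, grad, P, lr=1.0):
    """Generic preconditioned update: w <- w - lr * P @ grad.

    P = identity gives plain gradient descent, P = inverse Hessian gives a
    Newton step, and P = inverse Fisher gives a natural gradient step.
    """
    return w - lr * P @ grad


def natural_gradient_step(w, X, y, lr=1.0, damping=1e-3):
    """Natural gradient update with a damped Fisher matrix (damping is an assumption).

    A linear solve replaces the explicit inverse; for large networks this
    exact solve is the costly step that approximations aim to avoid.
    """
    F = fisher_matrix(w, X) + damping * np.eye(len(w))
    return w - lr * np.linalg.solve(F, nll_gradient(w, X, y))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    w_true = np.array([1.0, -2.0, 0.5])  # synthetic ground truth, illustration only
    y = (rng.uniform(size=200) < sigmoid(X @ w_true)).astype(float)

    w = np.zeros(3)
    for step in range(5):
        w = natural_gradient_step(w, X, y)
        print(step, nll(w, X, y))
```

For this particular model the Fisher matrix coincides with the Hessian of the negative log-likelihood, so the natural gradient step and the Newton step agree; for general neural networks the two preconditioners differ. In a meta-learned setting, the matrix `P` would instead be produced by a separate, task-conditioned model rather than computed in closed form.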