Event
Andrew Mackenzie
Wednesday, September 27, 2023 13:00 to 14:00
Burnside Hall
Room 1214, 805 rue Sherbrooke Ouest, Montreal, QC, H3A 0B9, CA
Title: Tensor Programs and μP
Abstract: We will discuss the limiting behaviour of large neural networks as the layer width goes to infinity. One of the factors that most affects the limiting behaviour is the specific parametrization used; beyond training stability, it determines whether or not the neural network can learn features. We show a technique for mechanically deriving the "best" parametrization, known as μP. As an additional empirical benefit, we demonstrate that under μP, hyperparameters transfer directly across models of different sizes, allowing all experiments to be run at a small scale.