# Weight expansion: A new perspective on dropout and generalization

###### tags: `papers`, `generalization`

**Paper main idea:** propose a new measure of generalization and use it to understand dropout's effectiveness in improving generalization.

**New measure proposed by the paper:** weight expansion. With a larger weight volume, one can achieve better generalization in a PAC-Bayesian setting.

**Application:** apply weight expansion to dropout. The paper argues theoretically and verifies empirically that applying dropout during training "expands" the weight volume.

**Definition (weight volume):** the normalized determinant of the weight covariance matrix. Intuitively, the more correlated the weights are, the smaller the weight volume and the worse the generalization ability; more orthogonal weights give a larger weight volume. (A small numerical sketch of this quantity is given after the key ideas below.)

![](https://i.imgur.com/t28LohU.png)

**Key ideas:**

1. Weight expansion reduces the generalization error of neural networks: the PAC-Bayesian upper bound on the generalization error becomes smaller when vol(W_l) increases. (PAC-Bayes connects weights with generalization by establishing an upper bound on the generalization error in terms of the Kullback-Leibler divergence between posterior and prior; a generic form of such a bound is sketched below.)
2. Dropout leads to weight expansion, verified both theoretically and empirically: dropout reduces the correlation of gradient updates, and therefore the correlation of the updated weights, which makes it an inexpensive way to perform weight expansion. (A toy illustration is sketched at the end.)
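A minimal numpy sketch of the weight-volume computation, as I read the definition above (normalized determinant of the covariance = determinant of the correlation matrix). The assumption that the covariance is estimated from sampled weight vectors, e.g. a layer's flattened weights collected over independent runs, is mine, not the paper's.

```python
import numpy as np

def weight_volume(weight_samples: np.ndarray) -> float:
    """Normalized determinant of the weight covariance matrix.

    weight_samples: shape (n_samples, n_weights), e.g. one layer's flattened
    weights collected over independent runs (an assumed estimation procedure).
    """
    cov = np.cov(weight_samples, rowvar=False)      # weight covariance matrix
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)                 # normalize out the scales
    # det(corr) is 1 for uncorrelated (orthogonal) weights and shrinks
    # toward 0 as the weights become more correlated.
    return float(np.linalg.det(corr))

# Uncorrelated vs. strongly correlated weight samples:
rng = np.random.default_rng(0)
independent = rng.standard_normal((5000, 3))
correlated = independent @ np.array([[1.0, 0.8, 0.8],
                                     [0.0, 0.6, 0.0],
                                     [0.0, 0.0, 0.6]])
print(weight_volume(independent))   # close to 1
print(weight_volume(correlated))    # noticeably smaller (around 0.13)
```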
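For reference on key idea (1), a standard McAllester-style PAC-Bayes bound of the kind the paper builds on (the generic form, not necessarily the exact bound used in the paper): with probability at least $1-\delta$ over an i.i.d. sample of size $m$, for any fixed prior $P$ and every posterior $Q$,

$$
L(Q) \;\le\; \widehat{L}(Q) + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{m}}{\delta}}{2m}},
$$

where $L(Q)$ is the expected risk and $\widehat{L}(Q)$ the empirical risk. The note above says the bound shrinks as vol(W_l) grows; the intuition is that a more spread-out (less correlated) posterior over weights has a smaller KL divergence to a broad prior, though the precise statement is the paper's.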
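A toy illustration of key idea (2): simulate correlated per-step "gradients" with and without independent dropout-style masks and compare the correlation and weight volume of the accumulated updates. This is my own sketch of the mechanism (a crude stand-in for dropout's effect on gradient updates), not the paper's experiment; `keep_prob` and the 0.9 gradient correlation are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_runs, n_steps, keep_prob = 2000, 50, 0.5

# Correlated per-step "gradients" for two weights (correlation ~0.9).
chol = np.linalg.cholesky(np.array([[1.0, 0.9],
                                    [0.9, 1.0]]))

def simulate(use_dropout: bool) -> np.ndarray:
    # Each run accumulates n_steps gradient updates; return final weights per run.
    grads = rng.standard_normal((n_runs, n_steps, 2)) @ chol.T
    if use_dropout:
        # Independent Bernoulli masks per weight and step, standing in for
        # dropout's decorrelating effect on the updates.
        masks = rng.random((n_runs, n_steps, 2)) < keep_prob
        grads = grads * masks / keep_prob        # inverted-dropout rescaling
    return grads.sum(axis=1)                     # accumulated update = "trained" weights

for use_dropout in (False, True):
    w = simulate(use_dropout)
    corr = np.corrcoef(w, rowvar=False)
    vol = np.linalg.det(corr)                    # normalized determinant = weight volume
    print(f"dropout={use_dropout}:  corr={corr[0, 1]:.3f}  vol={vol:.3f}")
```

With the masks the off-diagonal correlation of the accumulated updates drops (roughly from 0.9 to 0.45 here), so the weight volume grows, matching the note's claim that dropout decorrelates updates and thereby expands the weights.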