# Weight expansion: A new perspective on dropout and generalization
###### tags: `papers`, `generalization`
Paper main idea: a new measure connected to generalization, used to explain why dropout improves generalization.
New measure proposed by the paper: weight expansion, i.e. an increase of the weight volume. With a larger weight volume, one obtains a tighter generalization bound in a PAC-Bayesian setting.
Application: apply weight expansion to dropout. The paper argues theoretically and verifies empirically that applying dropout during training “expands” the weight volume.
Definition: weight volume = the normalized determinant of the weight covariance matrix.
Intuitively, the more correlated the weights are, the smaller the weight volume and the worse the expected generalization.
More orthogonal (less correlated) weights -> larger weight volume.
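
A minimal sketch of how one might compute this quantity, assuming the covariance is estimated from samples of a layer's flattened weights (e.g. across runs or checkpoints) and reading "normalized determinant" as the determinant of the correlation matrix (covariance with the per-weight variances divided out). The function name `weight_volume` and the sampling scheme are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def weight_volume(weight_samples):
    """Sketch: weight volume as the normalized determinant of the weight
    covariance matrix, read here as the determinant of the correlation matrix
    (covariance with the per-weight variances divided out).

    weight_samples: array of shape (n_samples, n_weights), e.g. a layer's
    flattened weights collected over several runs/checkpoints (an assumption,
    not necessarily the paper's estimation procedure).
    """
    cov = np.cov(weight_samples, rowvar=False)   # weight covariance matrix
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)              # normalize -> correlation matrix
    # log-determinant is numerically safer than the raw determinant
    return np.linalg.slogdet(corr)[1]

# Toy check: decorrelated weights have a larger (log) weight volume.
rng = np.random.default_rng(0)
independent = rng.normal(size=(1000, 5))
mix = np.eye(5) + 0.9 * np.eye(5, k=1)           # correlates neighbouring weights
correlated = independent @ mix
print(weight_volume(independent))   # close to 0 (det of correlation matrix ~ 1)
print(weight_volume(correlated))    # clearly negative (smaller volume)
```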

Key ideas:
(1) Weight expansion reduces the generalization error of neural networks:
The PAC-Bayesian upper bound on the generalization error becomes smaller as vol(W_l) increases (PAC-Bayes connects weights to generalization by bounding the generalization error in terms of the Kullback-Leibler divergence between the posterior and the prior over weights; a standard bound of this form is shown after this list).
(2) Dropout leads to weight expansion:
Verified theoretically and empirically: dropout reduces the correlation of gradient updates, and thereby the correlation of the updated weights (see the toy sketch at the end of this note).
Dropout is therefore an inexpensive way to achieve weight expansion.
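
For key idea (1), a standard McAllester-style PAC-Bayesian bound for reference (the paper's exact bound may differ in constants and in how the prior and posterior are instantiated). Here $L(Q)$ is the expected error of the stochastic predictor drawn from posterior $Q$, $\hat{L}(Q)$ its empirical error on $m$ samples, $P$ the prior, and $\delta$ the confidence parameter; with probability at least $1-\delta$:

$$
L(Q) \;\le\; \hat{L}(Q) + \sqrt{\frac{\mathrm{KL}(Q\,\|\,P) + \ln\frac{2\sqrt{m}}{\delta}}{2m}}
$$

If $Q$ and $P$ are taken to be Gaussian over a layer's weights, $\mathrm{KL}(Q\,\|\,P)$ contains the term $\tfrac{1}{2}\ln\frac{\det\Sigma_P}{\det\Sigma_Q}$; with the per-weight variances held fixed, a larger normalized $\det\Sigma_Q$ (a larger weight volume) makes this term, and hence the bound, smaller. This is one reading of the connection, not necessarily the paper's exact argument.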
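
For key idea (2), a toy simulation (my own illustration, not the paper's experiment): each gradient carries a shared component that correlates the weight updates, and a dropout-style mask that zeroes random coordinates of each update reduces that correlation. All names and hyperparameters below are made up for the illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def final_weights(dropout_p, n_runs=200, n_steps=300, n_weights=8, lr=0.1):
    """Each gradient has a shared component, which correlates the weight
    updates; a dropout-style mask zeroes random coordinates of the update,
    which should reduce that correlation. Purely illustrative setup."""
    out = np.empty((n_runs, n_weights))
    for r in range(n_runs):
        w = np.zeros(n_weights)
        for _ in range(n_steps):
            shared = rng.normal()                       # shared gradient component
            grad = shared + 0.3 * rng.normal(size=n_weights)
            keep = rng.random(n_weights) >= dropout_p   # dropout-style mask on the update
            w -= lr * grad * keep
        out[r] = w
    return out

def mean_offdiag_corr(samples):
    corr = np.corrcoef(samples, rowvar=False)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return np.abs(off_diagonal).mean()

print("mean |corr|, no dropout :", mean_offdiag_corr(final_weights(0.0)))
print("mean |corr|, dropout 0.5:", mean_offdiag_corr(final_weights(0.5)))
# Lower correlation between weights with dropout means a larger normalized
# determinant of the weight covariance, i.e. a larger weight volume.
```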