# Notes on "Replay in Deep Learning: Current Approaches and Missing Biological Elements" ###### tags: `continual learning` #### Author [Rishika Bhagwatkar](https://https://github.com/rishika2110) ## Introduction * Replay is the reactivation of a sequence of cells that occurred during an activity. It is observed in sleep, rapid eye movement and while awake takes place on a faster time scale. * It plays a crucial role in memory formation, retrieval, and consolidation. * Stability-Plasticity Dilemma: Learning requires the model to update its weights, but updating its weights according to new data results in overwriting the weights critical to performance on past data i.e it forgets previous knowledge. It occurs because ANNs assume data to be iid. It's the root cause of catastrophic forgetting. * In mammals, catastrophic forgetting is very rare. <!-- * To mitigate catastrophic forgetting, variants of replays are incorporated. In ANN, replay is implemented by storing a subset of previously seen data and making the model learn on the mixture of new data and few examples from data. --> <!-- ## Replay in Biological Networks * Spiliting learning into long-term and short-term memories allows the brain to efficiently solve the stability-plasticity dilemma. --> <!-- * Replay is reacitvation of sequence of cells that occured during an activity. It is observed in sleep, rapid eye movement and while awake, takes place on a faster time scale. * It plays an crucial role in memory formation, retrieval, and consolidation. * Stablitiy-Plasticity Dilemma: Learning requires model to update its weights, but updating its weights according to new data results in overwriting the weigths critical to performance on past data i.e it forgetts previous knowledge. It occurs because ANNs assume data to be iid. It's the root cause of catastrophic forgetting. * In mammals, catastrophic forgetting is very rare. <!-- * To mitigate catastrophic forgetting, variants of replays are incorporated. In ANN, replay is implemented by storing a subset of previous seen data and making the model to learn on mixture of new data and few examples from pata data. --> <!-- ## Replay in Biological Networks * Spiliting learning into long-term and short-term memories allows the brain to efficiently solve stability-plasticity dilemma. --> --> ## Replay in Aritificial Networks <!-- * An offline ML model learn on certain sets on assumptions: * Training and testing data comes from same underlying iid distribution * All of the data is available at once * There are distinct periods of training and testing * In continual learing, the data stream is evolving in non-iid manner over time. Hence, resulting in catastrophic forgetting. * Lifelong ML agents have been developed to overcome the challenges faced in continual learning of networks from non-iid data and should be able to utilize past knowledge to learn similar knowledge in future better and quickly. * Methods adapted for mitigating catastrophic forgetting are: * Regularisation schemes: Apply constriant weight updates with GD * Network expansion techniques: Adding new parameters for learning new parameters * Replay mechanisms: Storing representations of seen data to mix with the new data ### Replay in Supervised Learning * There are two major paradigms in which continual learning agents are trained: * Incremental batch learning * Streaming learning * Replay has been the most successful mechanism in mitigation of catastrophic forgetting in both of the paradigms. * There are two ways in which replay is used in ANNs: * Partial replay: Agent stores all or subset of previously seen data in replay buffer, mix it with the new data and finetune the model on this mixture. * Generative replay: Generative models (AEs or GANs) are used for generating samples from previously seen data. Leading to reduction in memory consumption but addition of similar number of parameters in classification network. Additionally, these are difficult to optimise being susceptible to mode collapse. * Generative replay seems to be more biologically plausible as it is impractical to assume that brain explicity stores previously seen data. * It is important for the agent should know *what* to replay. There are various strategies for storing subset of previously seen inputs: * Storing examples by uniform random sampling * Storing examples closest to class decision boundaries * Storing examples with the highest entropy * Storing a mean vector for each class in deep feature space. * Storing examples for which performance would be harmed the most by paramter updates * Selective replay have shown promising reults but uniform random sampling gives almost similar results with less compute. --> * An offline ML model learn on certain sets of assumptions: * Training and testing data comes from the same underlying iid distribution * All of the data is available at once * There are distinct periods of training and testing * In continual learning, the data stream is evolving in a non-iid manner over time. Hence, resulting in catastrophic forgetting. * Lifelong ML agents have been developed to overcome the challenges faced in continual learning of networks from non-iid data and should be able to utilize past knowledge to learn similar knowledge in future better and quickly. * Methods adapted for mitigating catastrophic forgetting are: * Regularisation schemes: Apply constraint weight updates with GD * Network expansion techniques: Adding new parameters for learning new parameters * Replay mechanisms: Storing representations of seen data to mix with the new data ### Replay in Supervised Learning * There are two major paradigms in which continual learning agents are trained: * Incremental batch learning * Streaming learning * Replay has been the most successful mechanism in the mitigation of catastrophic forgetting in both of the paradigms. * There are two ways in which replay is used in ANNs: * Partial replay: Agent stores all or subset of previously seen data in replay buffer, mix it with the new data and finetune the model on this mixture. * Generative replay: Generative models (AEs or GANs) are used for generating samples from previously seen data. Leading to the reduction in memory consumption but the addition of a similar number of parameters in the classification network. Additionally, these are difficult to optimise being susceptible to mode collapse. * Generative replay seems to be more biologically plausible as it is impractical to assume that the brain explicitly stores previously seen data. * It is important for the agent should know *what* to replay. There are various strategies for storing a subset of previously seen inputs: * Storing examples by uniform random sampling * Storing examples closest to class decision boundaries * Storing examples with the highest entropy * Storing a mean vector for each class in deep feature space. * Storing examples for which performance would be harmed the most by parameter updates * Selective replay has shown promising results but uniform random sampling gives almost similar results with less compute. ### Replay in Reinforcement Learning * In RL an agent is trained to take actions in an environment to maximize its reward, using experience replay. * Experience replay is a method used to create iid batches of data for the agent to learn and allows agent to store and replay rare experiences. * Experience selection strategies: * Temporal DIfference error * Absolute reward * Distribution matching based on reservoir sampling * State-space coverage maximisation based on the nearest neighbours to experience * Other kinds of replay: * Standard prioritized experience replay: Each experience is associated with a single reward. * Hindsight experience replay: Experiences can be replayed with various rewards which allow learning when reward signals are sparse or binary (a common challenge in RL) and it serves as curriculum learning by structuring the rewards such that they start off simple and grow increasingly more complex during training. ### Replay in Unsupervised Learning * Replays are also used for continual learning of GANs for image and scene generation. Various GANs are used in this aspect. * Unserpervised models such as AEs and GANs have been used to generate replay in supervised learning paradigms. ## Juxtaposing Biological and Artificial Replay <!-- * Representational replay or generative representational replay increase the performance in continual learning than verdical replay. * In biological networks, replay happens both independently and concurrently in several different brain regions, whereas artificial replay implementations only perform replay at a single layer within the neural network. * Efficient replay techniques can help in better and faster learning (analogous to forward knowledge ttransfer in humans). However, more research has been carried out on developing replay strategies that result only in better performance. * CLS-inspired models focus on using generative replay to generate new inputs during training, instead of storing raw inputs explicitly. * Till date, there are no CLS-inspired models that use information from neocrotex-inspired network to influence training of hippocampal-inspired network. However, neocortex influences learning in hippocampas. * Integration of replay and regularisation mechanisms with a larger extent of commuincation with each other, each mechanism could potentially strenthen the other, could yield improved performance of continual learning in ANNs. --> * Representational replay or generative representational replay increase the performance in continual learning than verdical replay. * In biological networks, replay happens both independently and concurrently in several different brain regions, whereas artificial replay implementations only perform replay at a single layer within the neural network. * Efficient replay techniques can help in better and faster learning (analogous to forward knowledge transfer in humans). However, more research has been carried out on developing replay strategies that result only in better performance. * CLS-inspired models focus on using generative replay to generate new inputs during training, instead of storing raw inputs explicitly. * Till date, there are no CLS-inspired models that use information from the neocortex-inspired network to influence the training of the hippocampal-inspired network. However, the neocortex influences learning in the hippocampus. * Integration of replay and regularisation mechanisms with a larger extent of communication with each other, each mechanism could potentially strengthen the other, could yield improved performance of continual learning in ANNs.