## Knowledge Transfer for Continual Learning
## Bi-Weekly Meeting Minutes
### 9th June 2021
- Meeting with Gianma + Happy
- After running Gianma's code, run the same setup on my code to get a good baseline
- Request the moon-data sampling code from Fariz
- Update the loss ablation table with statistical significance results (a sketch of one way to compute these follows this list)
- Gianma to share the BLS code from Darmesh
- Update the plotting code
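
One way to attach significance to the ablation table is a paired test across seeds. A minimal sketch, assuming each loss variant was run with several seeds; the function name and setup are illustrative, not the project's actual evaluation code.

```python
from scipy import stats
import numpy as np

def significance_row(acc_a, acc_b):
    """Paired t-test across seeds for one row of the ablation table.

    acc_a, acc_b: per-seed accuracies of the two loss variants being compared.
    Returns the mean difference and the two-sided p-value.
    """
    acc_a = np.asarray(acc_a, dtype=float)
    acc_b = np.asarray(acc_b, dtype=float)
    t_stat, p_value = stats.ttest_rel(acc_a, acc_b)
    return acc_a.mean() - acc_b.mean(), p_value
```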
### 7th April 2021
- Meeting with Gianma + Emti + Happy
- Write up the experimental results section
- Meeting with CL team
- Think of possible collaboration (scaling current experiments to bigger architectures)
### 24th March 2021
- Repeat all the MNIST experiments:
  - Train the teacher longer to get the best possible performance
  - Use consistent hyperparameters across all experiments
  - Try DSL + KD (a sketch of a standard KD loss follows this list)
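
For the KD part, a minimal sketch of the standard soft-target distillation loss, assuming that is what "KD" refers to here; DSL is not sketched since the acronym is not expanded in these notes, and the temperature/alpha defaults are illustrative.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, temperature=4.0, alpha=0.5):
    """Soft-target distillation: CE on the labels plus KL to the teacher's softened outputs."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * ce + (1.0 - alpha) * kd
```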
### 10th Feb 2021 (Fariz+Gianma)
Suggested next steps for checking large-scale KD (Tiny-ImageNet and CIFAR-100) with DNN2GP:
- Compute the GGN Hessian only on the memorable points (a sketch follows this list)
- The runtime is still large (about 5 days for 10,000 memorable points, with a minimum of 200 classes)
- Start from 1,000 memorable points, 5 per class
- Check this proposed bottleneck on MNIST
- Split by class when updating the GGN Hessian
- Encountered a singular-matrix error in the MNIST experiments
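
A minimal sketch of accumulating a diagonal GGN over the memorable points only, assuming a softmax classifier and keeping just the diagonal of the per-sample output Hessian; the actual FROMP/DNN2GP code will differ in detail. The inner loop over classes is what makes 200-class datasets slow, which motivates starting from fewer memorable points or splitting the computation by class.

```python
import torch
import torch.nn.functional as F

def diag_ggn_memorable(model, memorable_x):
    """Diagonal GGN accumulated over the memorable points only (per-sample, per-class loop)."""
    params = [p for p in model.parameters() if p.requires_grad]
    ggn_diag = [torch.zeros_like(p) for p in params]
    for x in memorable_x:                          # loop over memorable points
        logits = model(x.unsqueeze(0))             # shape (1, num_classes)
        probs = F.softmax(logits, dim=1).squeeze(0).detach()
        for c in range(logits.shape[1]):           # loop over classes: this is the slow part
            grads = torch.autograd.grad(logits[0, c], params, retain_graph=True)
            lam = probs[c] * (1.0 - probs[c])      # diagonal of the softmax Hessian
            for gd, g in zip(ggn_diag, grads):
                gd += lam * g ** 2
    return ggn_diag
```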
### 14th Nov 2020 (Happy+Fariz+Gianma)
- How to overcome the out-of-memory issue
- Re-do the experiment with DSL
- Re-do the experiment with DSL + memorable loss (SL)
- Implement early stopping in the code (a minimal sketch follows this list)
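
A minimal early-stopping sketch for the training loop; the patience and threshold defaults are assumptions, not values agreed in the meeting. It would be called once per epoch with the validation loss, breaking out of the loop when it returns True.

```python
class EarlyStopping:
    """Stop training when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Call once per epoch; returns True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```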
### 11th Nov 2020 (Happy+Fariz+Gianma)
- Check the FROMP code and adapt it for the TL task
- On the 2D toy experiment:
  - Using the loss from FROMP, train with only the second part of the loss (a sketch of such a two-part loss follows this list)
  - Train with the full loss
- Meeting with Siddharth on 11/19 to discuss how to adapt the FROMP code for transfer learning.
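
A sketch of a two-part loss of this form, assuming the "second part" refers to the regularizer on the memorable points; the plain squared-error term and the names are illustrative, not the exact FROMP objective.

```python
import torch
import torch.nn.functional as F

def two_part_loss(model, x, y, memorable_x, past_preds, tau=1.0, only_second_part=False):
    """Task loss plus a functional regularizer on the memorable points (illustrative form)."""
    # Second part: keep the current predictions at the memorable points close to the
    # stored past predictions (a plain squared error here; FROMP itself uses a
    # kernel-weighted functional regularizer).
    reg = F.mse_loss(F.softmax(model(memorable_x), dim=1), past_preds)
    if only_second_part:
        return tau * reg
    # Full loss: cross-entropy on the current task data plus the regularizer.
    return F.cross_entropy(model(x), y) + tau * reg
```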
Meeting with Bayesian duality members
**Kernel transfer in Knowledge Distillation**
Introduction write-up draft:
Knowledge distillation has become an increasingly popular solution for model compression, with a wide range of applications.
### 14th October 2020 (Happy + Fariz + GianMa)
Direction to take on Dropout & Kernel transfer in Knowledge Distillation:
### Feedback
- Landmark selection (with Bayesian duality)
- How to use the DNN2GP representation for kernel computation (a sketch follows this list)
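
One way to turn the DNN2GP representation (Jacobians of the network output with respect to its weights, together with an approximate posterior covariance) into a kernel. A minimal sketch under two assumptions that are mine, not the meeting's: a scalar-output model and a diagonal covariance `sigma_diag` (e.g. from a Laplace/GGN approximation).

```python
import torch

def dnn2gp_kernel(model, x1, x2, sigma_diag):
    """Kernel from the linearised network: k(x1, x2) = J(x1) diag(sigma) J(x2)^T."""
    params = [p for p in model.parameters() if p.requires_grad]

    def jacobian(x):
        out = model(x.unsqueeze(0)).squeeze()       # scalar network output assumed
        grads = torch.autograd.grad(out, params)
        return torch.cat([g.reshape(-1) for g in grads])

    j1, j2 = jacobian(x1), jacobian(x2)
    return (j1 * sigma_diag * j2).sum()
```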
### 6th October 2020 (Happy+GianMa)
- Toy experiment results for a simple FFN from scratch (Cats Vs Dogs)
- Discussion about "Efficient kernel Transfer in Knowledge Distillation"
- How to contribute to MCDropout?
### Feedback
- Improve kernel transfer computation with DNN2GP
- A simple 2D version of the current experiment (Cats Vs Dogs)
- Set up a RAIDEN environment to train all Inception layers
### 8th September 2020 (Happy + GianMa + Emti)
#### Toy Experiment
- Image classification with Inception Model (Cats Vs Dogs)
- Some Visualisation
- Paper discussion "Parameter Efficient Transfer learning in NLP"
#### Feedback
- Train a simple FFN model from scratch (Cats Vs Dogs)
- Freeze some layers and transfer using Inception (a sketch follows this list)
- Train all Inception layers from scratch
- Compare the two networks
- Compute the function representation at each layer
- Get an account for RAIDEN
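
A minimal sketch of the freeze-and-transfer setup, plus capturing a per-layer representation with a forward hook. It assumes torchvision's Inception v3 with the newer weights API (torchvision >= 0.13); the chosen layer (Mixed_7c) and the 2-way head are illustrative.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load Inception v3 pretrained on ImageNet and freeze all of its parameters.
net = models.inception_v3(weights=models.Inception_V3_Weights.IMAGENET1K_V1)
for p in net.parameters():
    p.requires_grad = False

# Replace the main and auxiliary classifier heads with fresh 2-way layers (Cats vs Dogs);
# only these new layers remain trainable.
net.fc = nn.Linear(net.fc.in_features, 2)
net.AuxLogits.fc = nn.Linear(net.AuxLogits.fc.in_features, 2)

# Capture the function representation at a chosen layer with a forward hook.
features = {}
net.Mixed_7c.register_forward_hook(lambda m, inp, out: features.update(layer=out.detach()))

net.eval()
with torch.no_grad():
    _ = net(torch.randn(1, 3, 299, 299))   # Inception v3 expects 299x299 inputs
print(features["layer"].shape)
```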
### 25th August 2020
- Survey on transfer learning
- Identify key transfer learning papers
- Prepare slides and present at the reading session
### Feedback
- Identify a specific/simple but promising transfer learning task, run some experiments and visualisations, and create benchmarks
- Identify the relationship between transfer learning and continual learning
- Think about an analogy for knowledge transfer
- Look into the similarity between NN layers (a sketch of one similarity measure follows this list)
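
Linear CKA is one commonly used measure for comparing layer representations (named here as an option, not something decided in the meeting). A minimal sketch over activation matrices collected from two networks on the same set of examples.

```python
import torch

def linear_cka(feats_a, feats_b):
    """Linear CKA between two activation matrices of shape (n_examples, n_features)."""
    a = feats_a - feats_a.mean(dim=0, keepdim=True)   # centre each feature dimension
    b = feats_b - feats_b.mean(dim=0, keepdim=True)
    cross = torch.norm(b.t() @ a, p="fro") ** 2
    norm_a = torch.norm(a.t() @ a, p="fro")
    norm_b = torch.norm(b.t() @ b, p="fro")
    return cross / (norm_a * norm_b)
```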
### 10th August 2020 (Happy + GianMa + Emti)
- Paper discussion "Neural Variational Inference for Text Processing"
### Feedback
- The paper is not related to transfer learning and not relevant to the team
- Have a broad view/picture of transfer learning
- Have a broader picture of the losses used in transfer learning and continual learning
### 29th July 2020
- Decided to hold a bi-weekly meeting to discuss progress and next steps
- Discussed my current research on question answering with memory networks and KGs
- Think about solving a related problem (or problems) in a simpler way.
- Specifically: find work that has tried to solve the problem in a somewhat simpler way.
- Goal: Figure out how to apply transfer learning to broad NLP problems