# GN1 Dev Tasks
## Tasks
Please write your name if you are going to investigate a task!
#### 1. Very quick (will turn on by default):
- [x] layer norm: added, appears to increase convergence
- [x] dropout: seems to degrade performance, so it is disabled for now; can re-test when we increase the model size
- [x] residual connections: added additional residual in first gatconv layer
- [x] ReLU -> elu/silu/selu/mish/gelu?
  - Note that we are actually using ELU in the GAT, for no particular reason.
  - This needs to be changed everywhere (MLPs and GAT).
  - Also, in the init_params method of the gatconv class, make sure the parameters are initialised with the correct gain for the chosen activation.
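
For the activation swap, the gain used at initialisation should follow the chosen activation. Below is a minimal sketch of one way to keep the two in sync, assuming Xavier initialisation. The names `activation_gain`, `init_linear` and `make_mlp` are hypothetical helpers for illustration, not functions from the repository. Note that `torch.nn.init.calculate_gain` has no entry for ELU, SiLU, GELU or Mish, so the sketch falls back to the ReLU gain for those, which is an assumption rather than a recommended value.

```python
import torch
from torch import nn

# Activation classes we may want to compare (all exist in torch.nn).
ACTIVATIONS = {
    "relu": nn.ReLU,
    "elu": nn.ELU,
    "silu": nn.SiLU,
    "selu": nn.SELU,
    "mish": nn.Mish,
    "gelu": nn.GELU,
}


def activation_gain(name: str) -> float:
    """Return an init gain for `name` (hypothetical helper).

    calculate_gain only knows a few nonlinearities (relu, selu, tanh, ...).
    For anything else we fall back to the ReLU gain as an assumption.
    """
    try:
        return nn.init.calculate_gain(name)
    except ValueError:
        return nn.init.calculate_gain("relu")


def init_linear(layer: nn.Linear, activation: str) -> None:
    """Xavier-initialise a linear layer with the gain for `activation`."""
    nn.init.xavier_normal_(layer.weight, gain=activation_gain(activation))
    if layer.bias is not None:
        nn.init.zeros_(layer.bias)


def make_mlp(in_dim: int, hidden_dim: int, out_dim: int, activation: str = "silu") -> nn.Sequential:
    """Two-layer MLP whose activation and init gain are set from one name."""
    mlp = nn.Sequential(
        nn.Linear(in_dim, hidden_dim),
        ACTIVATIONS[activation](),
        nn.Linear(hidden_dim, out_dim),
    )
    init_linear(mlp[0], activation)  # followed by the activation
    init_linear(mlp[2], "linear")    # output layer, no activation after it
    return mlp
```

The same `activation_gain` call could be used inside the GAT layer's parameter initialisation so that MLPs and GAT stay consistent when the activation is changed from the config.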
#### 2. Test one-by-one:
- [x] dense node updates (already supported, not enabled)
- [x] add a separate value projection (already supported, not enabled)
- [ ] promote all projections to full MLPs, calculate attention weights with MLPs
- [x] go deeper/narrower (3->6 layers, config change)
- [x] increase number of heads 2 -> 8
- [x] concat jet pt, eta to any linear transformation/MLP [!83](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/GNNJetTagger/-/merge_requests/83) -- *Dmitrii*
- [x] add LR scheduler (e.g. [torch.optim.lr_scheduler.OneCycleLR](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.OneCycleLR.html))
  - You might have to disable SWA in the trainer (or config file), as it may interfere with the scheduler.
  - Run with verbose=True to check that the learning rate follows the expected schedule.
- [ ] Add label smoothing (just an argument to the loss)
- [ ] Train for more epochs (up to 200)
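
The scheduler and label smoothing items above can be sketched together in plain PyTorch. This is an illustration only: the model, optimizer settings, `max_lr`, smoothing value and random data are placeholders, not the values used in the GN1 training config. Logging `get_last_lr()` is shown as an alternative way to check the schedule.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Placeholder model and data sizes (assumptions for illustration only).
model = nn.Linear(8, 3)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

epochs, steps_per_epoch = 2, 5
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=1e-3,                      # peak learning rate of the cycle
    epochs=epochs,
    steps_per_epoch=steps_per_epoch,  # OneCycleLR is stepped once per batch
)

# Label smoothing is a single argument to the loss.
loss_fn = nn.CrossEntropyLoss(label_smoothing=0.1)

lrs = []
for _ in range(epochs * steps_per_epoch):
    x = torch.randn(16, 8)            # fake batch of features
    y = torch.randint(0, 3, (16,))    # fake class labels
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()                  # after optimizer.step(), every batch
    lrs.append(scheduler.get_last_lr()[0])

print([f"{lr:.2e}" for lr in lrs])    # LR should rise to max_lr, then decay
```

Stepping the scheduler more than `epochs * steps_per_epoch` times raises an error, so the total number of steps has to match the trainer settings when training for more epochs.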
#### 3. Slightly more involved:
- [ ] persistent edge features (this is not so bad)
- [ ] persistent global features (this requires using a heterograph, which is supported but deprecated) -- *Dmitrii*, I will try to add heterograph support
## Merge requests:
Changes will be collected in the main MR into main, [!81](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/GNNJetTagger/-/merge_requests/81).
Please open MRs to this branch (`svanstro/model-updates`).
- [!82](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/GNNJetTagger/-/merge_requests/82): small improvements to dropout and layernorm - should speed up convergence
- [!83](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/GNNJetTagger/-/merge_requests/83): draft version of pt/eta concatenation -- *Dmitrii*