---
title: engineering-meetings
hackmd_link: https://hackmd.io/@understanding-search/engineering-meetings
---
# engineering-meetings
# Week 1
## Agenda
- Codebase tour
- High level goals
- Status updates
- Week 2 plans
## Notes
### Configuration management resources
- Hydra
- Weights & Biases
- Big open question: what is our config management strategy going forward?
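To make the structured-config option concrete, here is a minimal, hypothetical sketch in the style of Hydra's structured configs: plain dataclasses plus dotted-key overrides. The names (`TrainConfig`, `ModelConfig`, `apply_overrides`) are illustrative, not from our codebase.

```python
from dataclasses import dataclass, field

@dataclass
class ModelConfig:
    d_model: int = 128
    n_layers: int = 4

@dataclass
class TrainConfig:
    lr: float = 1e-3
    batch_size: int = 64
    model: ModelConfig = field(default_factory=ModelConfig)

def apply_overrides(cfg, overrides):
    """Apply CLI-style dotted overrides, e.g. {"model.d_model": 256}."""
    for key, value in overrides.items():
        obj = cfg
        *path, leaf = key.split(".")
        for part in path:
            obj = getattr(obj, part)
        if not hasattr(obj, leaf):
            raise AttributeError(f"unknown config key: {key}")
        setattr(obj, leaf, value)
    return cfg

cfg = apply_overrides(TrainConfig(), {"lr": 3e-4, "model.d_model": 256})
```

Hydra would additionally give us YAML composition and CLI overrides for free; this sketch only shows the typed-config core of that pattern.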
### High-level goals
- We don't have strong confidence in our roadmap yet, because much of it will depend on the results of experiments over the next 1-2 weeks. Some possibilities:
  - Get a trained model: this may be harder than we initially thought, so we may need to train many models.
  - Training and logging infrastructure
  - Training & evaluation pipeline
  - Using Weights & Biases for training & evaluation might be high priority
  - We definitely want to improve how we handle configs; this may be pending completion of ZANJ
  - We will definitely be implementing a range of interpretability and visualization techniques; timeline and details TBD
  - There's always work to be done fixing bugs, adding tests, etc.
### Status updates
- Everyone on the team has the codebase set up locally, and most have made a small PR! 🥳
- There are no remaining blockers to making the repo public
### Week 2 goals
- To discuss early next week - need to wait and see how experiments go
- For now we can focus on any TODO issues on the board
- **Update:** Our main focus this week is on enabling the research team to train a successful model. In our meeting with the research team we identified several issues that are currently making this hard, and we've started working on them.
### Next actions
- [x] Arrange meeting to discuss training & evaluation infrastructure
- [x] Create status update doc (or Slack channel if we get the Pro plan)
- [x] Set up GitHub integration for Slack
- [ ] Create project board README
- [x] Discuss week 2 goals by Tuesday at the latest
# Week 2
## Agenda
- Status updates
- Week 3 plans
## Notes
- Test coverage map? Coverage by branch? A dashboard to indicate the health of the codebase?
- Question for W&B: we now have an integration test that checks the loss logs after training; can we use W&B for this?
- We could have E2E tests leveraging W&B (outside of CI) to evaluate training performance
- External contributors? A bigger demo notebook would help onboard them.
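As a sketch of what the loss-log assertion in such an E2E test might look like, here is a self-contained check on a loss series; the window sizes and threshold are illustrative assumptions.

```python
def loss_improved(losses, head=5, tail=5, min_ratio=0.8):
    """Return True if the mean loss over the last `tail` logged steps
    dropped below `min_ratio` times the mean over the first `head` steps."""
    if len(losses) < head + tail:
        raise ValueError(f"need at least {head + tail} logged steps")
    start = sum(losses[:head]) / head
    end = sum(losses[-tail:]) / tail
    return end < min_ratio * start
```

In an out-of-CI E2E test, the `losses` series could plausibly be pulled from a finished run's history via W&B's public API (`wandb.Api().run(...).history()`), rather than parsed from local logs.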
# Week 3
## Agenda
- Status updates
- Michael
- Dan
- Rusheb
- Lucia
- Can
- Chris
- Alex
- Look at board - current priorities?
- In-flight
- Any pain points from research team?
- Non-urgent but high value work
- eg features we will need later or big refactorings
- Rusheb - cfg.device bug https://searchingforsearch.slack.com/archives/C04S0P93BNE/p1679579393252949
- Week 4 plans
## Notes
- Michael - ZANJ serialization
- Dan - batched model evals
- Rusheb - WandB PR is up
- Logging demo
- Open questions about the value of some of the data in the run summary
- Lucia - truncation wrapped up, looking for work
- Can - Wrapping up as_img PR, moving on to more visualizations of attention
- Alex - merged big PR, might need a ticket for some tests
- Office hours
- Do we need more?
- Should Alex do some?
- Priority
- Eval stuff
- Resuming runs
- Model sharing
- Alex: Might be some complexity here due to W&B not using ConfigHolder
- Maybe ConfigHolder should use muutils to serialize/load (Michael can work on this)
- Alex: When loading a model, should be able to use current workflow, or pass a link to a W&B run or a link to a local run
- Configs?
- Handling multiple tokenization schemes?
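To make the "current workflow, W&B link, or local run" loading idea concrete, here is a hypothetical dispatch sketch. `resolve_model_source` and the spec formats are assumptions for illustration, not the actual maze_transformer API.

```python
from pathlib import Path

def resolve_model_source(spec: str) -> tuple[str, str]:
    """Classify a model spec as a W&B reference or a local run directory.
    Hypothetical sketch: real loading code would dispatch on the result."""
    if spec.startswith(("https://wandb.ai/", "wandb://")):
        return ("wandb", spec)
    if Path(spec).exists():
        return ("local", spec)
    if spec.count("/") == 2:  # entity/project/run_id style run path
        return ("wandb", spec)
    raise ValueError(f"unrecognized model spec: {spec}")
```

The W&B branch would likely need its own config handling, since (per Alex's point above) W&B does not use ConfigHolder.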
## Action items
- [ ] Make ticket to resume a run from a W&B checkpoint
- [ ] Make ticket to improve ease of imports from maze_transformer
  - `__init__.py` files?
  - maybe IDEs/VS Code can add imports automatically?
- [ ] Create ticket for multiple tokenization schemes
- [ ] Start a conversation about config refactoring and tokenization schemes on related tickets
- [ ] Create ticket for adding x,y tokenization scheme (for Lucia to work on)
- [ ] Create label for big-picture discussion issues