---
tags: workshop
---
# BAS & Equadratures Workshop | 28 March 2022
Zoom: https://turing-uk.zoom.us/j/92760137048?pwd=dUlqOUp5L3NaMU1Sd1VPTzZtbkhIUT09&from=addon
[TOC]
### Useful links
* [Equadratures Website](https://equadratures.org/)
* [Equadratures Github](https://github.com/equadratures/equadratures)
* [BAS AI Lab](https://www.bas.ac.uk/ai)
* https://equadratures.org/research/apps.html
* [Equadratures Workshop on March 31](https://www.eventbrite.co.uk/e/equadratures-midlands-workshop-tickets-294564689917)
### Equadratures Team
* Pranay Seshadri (pseshadri@turing.ac.uk)
### Attendees
> *Please add your name below*
> Add a fun emoji that fits your mood today :rocket: :star2: :fries:
* Scott Hosking :sunglasses:
* James Byrne :fireworks:
* Jen Ding :cactus:
* Dan(i) Jones :ocean:
* Alden Conner :turtle:
## Agenda
| Time | Event |
|:------------- | ------------------ |
| 14:00 - 14:15 | Introductions and Ice Breaker |
| 14:15 - 14:45 | Equadratures Workshop Presentation |
| 14:45 - 14:50 | Break |
| 14:50 - 15:20 | Discussion & Interactive Coding |
| 15:20 - 15:30 | Workshop Closing |
### Pre-Workshop Question
What are your biggest gaps/weaknesses in your current model training pipelines?
> Please add your initials with your response if you are comfortable.
* JB: consistency of architectural approach to development
*
### Discussion
1. Do the following keywords resonate with you: uncertainty quantification, data-driven dimension reduction, sensitivity analysis, numerical integration, optimisation with response surfaces, etc.
* DJ: Yes, a technique called "4DVAR", which uses a lot of these concepts, is used often in ocean-atmosphere science for constraining numerical models using observational data.
2. How do you quantitatively ascertain which parameters in your models are important?
* DJ: Adjoint sensitivity analysis (code derived using algorithmic differentiation)
3. Do you work with PDE-based models that require some form of reduced order modelling?
* DJ: We use lots of PDEs, but we tend to just throw more compute power at them instead of reducing their complexity!
### Post-Workshop
What features of Equadratures stood out the most to you? How might they help support your work at BAS?
> Please add your initials with your response if you are comfortable.
* DJ: Equadratures stood out as an efficient way to design a sampling strategy. This may be especially useful for work that Simon Thomas is planning, which involves running a large numerical model to generate a training dataset.
*
*
Do you have any additional questions about Equadratures?
> Please add your initials with your response if you are comfortable.
### Post-Workshop Follow-up 25 April
- General feedback on workshop
- PS: Potential applications: sensitivity analysis + parameter sweeps
- How might equadratures integrate with existing tools that BAS uses?
- SH: Dani (mathematical background/applying similar ideas to the environmental space), James
- Identifying a stalled project that could benefit from equadratures' expertise for collaboration?
- Clean slate project: problem, data, and we have freedom to take it in different directions
- James identified the ozone measurement project
- JB: Investigating Dobson values in calculating ozone levels
- Challenge: parameterization for calculating Dobson values from readings is difficult and often performed manually
- Multi-dim variable optimisation to create a usable, improvable model
- Follow-up call with scientists?
- Self contained problem, with relatively clean data
- SH: starting with a small project can help us identify what bigger collaborations might be possible
- ASG delivery
- Impact: an area of science where measurements haven't changed
- Stakeholders: Ask Steve (WMO? WMO Assessment Report. Met Office?), Satellite and Sensor companies
- Challenge: calibrating ozone data on the satellites
- Very few calibration readings in direct sun
- SH: Charles (postdoc) started a pipeline before his contract ended
- Handing over of notebooks from Josh and Charles?
- What can we do in 6-8 months?
- Criticisms from mid-term review: not enough cross-collaboration between ASG PIs; this is a great opportunity for DT/DCE + E&S collaboration demo
- Position for future funding if it opens up
- Dobson processing is very manual and relies on expert knowledge (which is not being transferred by those who are retiring)
- Translating real-world data into models > Digital Twinning opp?
- PDC - Polar Data Centre
- Timelines/Capacity to keep in mind
- Equadratures: postdoc (contract ends in June), Pranay
- Maybe some part-time support from the postdoc?
- BAS: John (retired), Steve
- Onboarding equadratures to prior work
- Sharing jupyter notebooks via Github
- Initial meeting/workshop with James, Pranay, Steve, Bryn (bubald@turing.ac.uk)
- Understanding the question
- Translating Charles' handover
- To do: James to organise meeting
### Shared Notes
* Extreme weather use cases
* Computational efficiency (targeting geography, trajectories to focus on) for antarctic ice sheet surrogate models
* Scaling down emulation with equadratures?
* Equadratures version 10
* Piecewise continuous polynomials (e.g. housing price modelling application)
* Some information is passed across models, but different boundaries might be set per region
* Deep learning impact? Size of the models for environmental data
* 1:1 comparison with CNN
* Using dimensionality reduction for data (as pre staging step) to then expand your computational capabilities
* Maximising knowledge at various points
* State estimates (constructed using PDEs + a suite of data, which is rare in oceanography); the cost function is a weighted sum of squared differences
* Currently need a lot of computational power to reach answer
* Could this be used for data assimilation?
* Parameter selection for ozone calculations
* Opportunities for non-manual calculations
* Can you lazy load model data on evaluation? Pass them as generators
* Latin hypercube design (sampling) - Dani's PhD student
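
The Latin hypercube design mentioned in the last note can be sketched in a few lines. This is a minimal standard-library illustration, not the equadratures implementation: the function names `latin_hypercube` and `scale` and the parameter ranges are hypothetical. Each dimension is cut into `n_samples` equal strata, one point is drawn per stratum, and the strata are shuffled independently per dimension.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=None):
    """Return n_samples points in the unit cube, one per stratum in every dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One random point inside each of the n_samples equal-width strata.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired randomly across dimensions.
        rng.shuffle(column)
        columns.append(column)
    # Transpose the per-dimension columns into a list of sample points.
    return [tuple(col[k] for col in columns) for k in range(n_samples)]


def scale(points, bounds):
    """Map unit-cube points onto parameter ranges given as [(low, high), ...]."""
    return [
        tuple(low + u * (high - low) for u, (low, high) in zip(point, bounds))
        for point in points
    ]


# Hypothetical example: 8 model runs over two parameters with made-up ranges.
design = scale(latin_hypercube(8, 2, seed=0), [(0.0, 90.0), (200.0, 300.0)])
for run in design:
    print(run)
```

Compared with plain random sampling, every one-dimensional projection of the design covers its range evenly, which is the appeal when each sample is an expensive model run.
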
# 5 May Dobson Ozone
- James Byrne (RSE), Scott Hosking, Pranay, Bryn Ubald (Research Associate)
- Steve Colwell: look over the meteorological instruments used in Antarctica
- Scott Hosking: Senior Research Fellow @ Turing, Lead AI Lab @ BAS
## Introduction to Dobson (Steve)
- Instruments split up incoming light into a spectrum
- 3 wavelength pairs (A, C, D) - values calculated using AD and CD pairs
- Equation for calculating ozone from direct sun measurements constrained based on things like sun's elevation
- Automated a Dobson to work during the winter (no people at Halley); only works in zenith mode so we need to check the values against the manual direct sun and zenith measurements made during the summer
- Direct sun equation for calculating ozone (can only be taken when there aren't clouds in the sky)
- Zenith equation: majority are these; observations take scattered light from the open sky, each photon will have a different path as they bounce off different molecules in the atmosphere
- Variety of paths means there's no "one size fits all" equation for deriving ozone from zenith measurements for all Dobsons
- Need to derive relationship between column ozone (obtained through direct sun measurements close in time to zenith measurements) and factors in direct sun equation

## Discussion
- Pulled together some datasets already
- autodobson
- manual ozone measurements (w/ comments)
- Prior work (James):
- Previous work handed over includes:
- Handover notes on equations, data processing
- Dobson configurations
- Processed data into CSV and excel spreadsheets
- ozone_demo_6 notebook
- No automated data pipeline: take machine measurements in text format to local servers at Halley; data management infra pulls data back up North
- Manual data processing by experts
- How often is data updated?
- Multidimensional exploration of parameters seems precisely to be what equadratures is good at?
- Second challenge seems to be a sensitivity analysis problem, whereas the first challenge seems to be model development in order to identify the right equation
- PS: is there a structure you need for the equation (first challenge?)
- SC: Code exists to process the data from the auto-dobson, so we need an equation that can process into the graphs
- E.g. if sun is at this angle, use this equation
- Currently 2 equations: one for when the sun is above 30 degrees and one for when it is below
- These were derived by analysing the manual values collected so far (zenith if cloudy, direct/zenith if clear)
- Software could identify many more equations, each covering a narrower range of sun angles
- Direct sun measurements are the "gold standard" and the objective is to leverage the sparse direct sun measurements to better inform the zenith model
- SC: if there are 5 across the day, we can make a fairly linear model throughout the day
- Example parameters: sun angle, temperature of the atmosphere, cloudiness
- JB: opportunity to explore other environmental parameters to feed in
- AD calculation better than CD calculation (use different equations)
- About 5-10% difference in AD vs. direct sun measurements
- JB: "Producing a domain of equations" that can help us calculate ozone metric from zenith measurements
- SC: unclear how current parameters impact (e.g. cloudiness might not actually impact very much)
- SC: which parameters make a difference to the measurement when they vary, and which don't?
- JB: a whole suite of atmospheric measurements are being generated at Halley continuously - humidity, radiation
- PS: Is there a good estimate in uncertainty for each parameter?
- Because you know location and time you can calculate sun's elevation to a fraction of a degree
- BU: How do you define cloud cover?
- SC: A scale 1-8 for amount of coverage (oktas) depending on patchiness and heights
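
The idea of one zenith equation per sun-angle range, tuned against the sparse direct sun measurements, can be sketched as below. This is only an illustration under stated assumptions: the straight-line form `ozone = a + b * zenith_reading` is a stand-in and not the actual Dobson zenith equation, the function names are hypothetical, and the records are synthetic numbers invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def fit_by_elevation(records, edges):
    """Fit one line per sun-elevation bin.

    records: (elevation_deg, zenith_reading, direct_sun_ozone) tuples
    edges:   ascending bin edges in degrees, e.g. [0, 30, 90]
    """
    fits = {}
    for low, high in zip(edges[:-1], edges[1:]):
        in_bin = [(z, o) for e, z, o in records if low <= e < high]
        zs = [z for z, _ in in_bin]
        ozone = [o for _, o in in_bin]
        # A line needs at least two distinct zenith readings in the bin.
        if len(set(zs)) >= 2:
            fits[(low, high)] = fit_line(zs, ozone)
    return fits


def predict(fits, elevation_deg, zenith_reading):
    """Apply the fitted equation for the bin containing elevation_deg."""
    for (low, high), (a, b) in fits.items():
        if low <= elevation_deg < high:
            return a + b * zenith_reading
    raise ValueError("no fitted equation covers this elevation")


# Synthetic records (not real Dobson data): two regimes split at 30 degrees.
records = [
    (10, 1.0, 12.0), (20, 2.0, 14.0), (25, 3.0, 16.0),
    (40, 1.0, 8.0), (60, 2.0, 11.0), (80, 4.0, 17.0),
]
fits = fit_by_elevation(records, [0, 30, 90])
print(fits)
print(predict(fits, 45, 2.0))
```

Passing finer edges such as `[0, 15, 30, 45, 60, 90]` gives more granular equations, at the cost of fewer direct sun measurements per bin. Further covariates (temperature, cloud cover in oktas) would need a multivariate fit, which is where a tool like equadratures would come in.
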
## Next Steps
- equadratures team to review previous work (notebooks) before requesting more data
- Charles Simpson (previous worker) at UCL - perhaps he can drop by
- SH: can we work openly (github repo) so we can work asynchronously
- SC: no restriction on the data (Antarctic Treaty - all data must be open)
- JB: not into storing the data on github; how about onedrive
- SH: perhaps a sample dataset?
- SC: this problem has been going on for 2-3 years; would be great to see it moving forward again
- Potential for high impact + calibration with satellites
- JD: set up github repo for the project
## 14 October
- SH to link PY with Charles Simpson (UCL), prior notebook author
- Meeting with BAS to narrow down literature
- Steve Colwell (leads group to take measurements) + Jonathan Shanklin (wrote original ozone hole paper)
- Start with Steve
- SH: I like the Gaussian process/physics-driven approach (nice stretch goal)
- PY has done something like this before, not too different from setting up a GP
- Not a massive dataset, but plenty of coherence
- PY: main challenge is unclear data (some of the covariates, how do they fit into the equation?)
- Timeline/ASG impact goal
- GPs as the concrete goal - ask James how to deploy on the ground
- Testing with instruments in Cambridge
- Stick on a Raspberry Pi and deploy in Antarctica (might not get there until next year)
- Physics-informed + satellite to build on in the application
- After May 2023 - write up fellowship with SH
- SH: you can measure ozone with satellites - can we combine them to make a more complete surface dataset with uncertainty to fill in gaps and calibrate satellite data (far cheaper)?
- SH's project environmental monitoring on fusing data - https://www.turing.ac.uk/research/research-projects/environmental-monitoring-blending-satellite-and-surface-data
- Neural processes
- Goal: scale up to the whole globe
- Next Steps
- SH intro to Charles
- Fill in gap with Steve if needed
- Meeting with James when he's back from leave (in early Nov) to discuss concrete goals for real world impact
- On the ground deployment in Cambridge and/or Antarctica
- PY to upload code/data on the open repo - https://github.com/alan-turing-institute/dobson-eq
- PY to share draft of fellowship application