# AZmet QA/QC

- Matt "plugged in" rounding to correct precision--only applied to *new* data (since the last month or so)
- Part of `azmetr` could be checking for correct precision
- Matt has a place online where we can edit the measured variables and it will propagate to derived variables (we think)
- Split tasks: modeling and workflow automation
- Phase one is to alert Jeremy of extreme values and imputed values (a daily report table published with Quarto, for example)
- Could fit a multivariate model as an additional step (it can't be raining all day and also have high solar radiation)
- Could we detect things in derived variables that we don't see in measured variables because of transformations?

## Report refinements

The report currently uses **all data** for rule-based validations and **just one day** for forecast-based validations. This doesn't make sense. We need flexibility in which days are shown in the report and consistency between the two types of validations.

- [ ] Get forecast-based validation to work for past dates and potentially store results in a dataset somewhere

Then there are two options:

1. Create reports with one week of data weekly and save them. Make a Quarto website to easily view the weekly reports.
2. Make the report a Shiny app / flexdashboard with a date-range selector.

There's also the issue that the `pointblank`-style tables are not very customizable and are sometimes hard to read (e.g. with segmented validations).

## Kickoff meeting (Oct 7)

Agenda:

- look at proposal
- set some dates

Notes:

- When the CCT system was down, Matt Harmon put some static QA/QC checks into place
- Jeremy will send a spreadsheet of active stations
- Check if stations haven't reported for > 6 hrs

Dynamic QA/QC:

- Probability distributions for parameters using historical data; flag values outside some threshold
- Look at how other systems do weather QA/QC
  - https://data.ess-dive.lbl.gov/view/doi:10.15485/1823516
  - https://github.com/WSWUP/pyWeatherQAQC
- Compare to climatology products (e.g. Daymet) or use those products for modeling
- Compare to forecast data products (e.g. from NOAA). Use ensemble forecasts to create an interval for QA/QC.

## Initial Meeting

AZMet update:

- equipment maintenance / optimization
- modernization of data workflow
- need to update website

QA/QC needs:

- sensor ranges
  - check for physically impossible values
- Some (but not all) stations have data loggers that deal with small errors (e.g. RH = 100.1%)
- Collection frequency = hourly
- Maybe a Shiny dashboard (or other data visualization) to see the data plotted in addition to just alerts
  - Basic Shiny apps for this exist
- Done some research on industry standards for QA/QC of environmental sensors. Can use this as a source for QA/QC checks
- Reach goal: inter-station comparisons

Q: Is there software already in existence for meteorology QA/QC?
A: Haven't seen anything from mesonets. Need to look around.

- FluxNet has shared tools for QA/QC

David's suggestion---for every data point, what's the probability that it's an error given the history? What's the probability that it's a "true" data point?

Matt Harmon:

- was responsible for porting AZMet to more modern infrastructure
- dataloggers -> JSON -> python scripts
- some basic range checking and normalization is in a python object
- Future idea: ranges would get pulled from a database table
  - E.g. some of the weather stations are on golf courses that use sprinklers
  - ranges in the database table could be dynamic
- Simple validations built into the code that pulls the data (e.g. laws of physics), with more sophisticated validations that rely on spatial and temporal variation in another layer

3 layers:

1. laws-of-physics checks---hard coded
2. range checks from ranges in the database (seasonally dynamic?)
3. spatio-temporal checks on the data output, including calculated values

- Layer 1 is in the existing python code (or can be added)
- Layer 2 could maybe be added to the existing python code
- Layer 3 is the dashboard
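A rough sketch of what layer 1- and layer 2-style checks could look like on our side in R, using `pointblank` as in the existing report. The column names, limits, and `az_daily()` arguments are assumptions for illustration and would need to be checked against the real `azmetr` output and the eventual ranges table.

```r
library(azmetr)
library(pointblank)

# Pull a week of daily data (dates are arbitrary for illustration)
daily <- az_daily(start_date = "2022-09-01", end_date = "2022-09-07")

agent <- create_agent(tbl = daily, label = "AZMet daily QA/QC sketch") |>
  # Layer 1: laws-of-physics checks with hard-coded limits
  col_vals_between(vars(relative_humidity_max), left = 0, right = 100, na_pass = TRUE) |>
  col_vals_gte(vars(sol_rad_total), value = 0, na_pass = TRUE) |>
  # Layer 2: plausible-range checks (limits would eventually come from a database table)
  col_vals_between(vars(temp_air_maxC), left = -40, right = 55, na_pass = TRUE) |>
  interrogate()

agent # printing the agent renders the pointblank validation report table
```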
Notifications:

- Some of this is done currently using the SQL database
- Whatever tools do QA/QC just need to be able to interact with the SQL database or REST API for notifications

Q: Are all tables exposed as a REST API?
A: Depends on what we want to do with the data. Currently the API hides some things---you can't address a particular table or see metadata without authorization.

Data integrity:

- Previously there was some manual manipulation of data coming off the data loggers without any tracking
- Now manual modifications and updates are versioned

Q: Do you want provisional data (ASAP) and validated data? What's the desired turn-around for validated data?
A: Not a problem to have near-real-time data be provisional. Don't know what the turn-around should be for validated data. Need to figure it out.

Q: Could you define what kind of dashboard or automated reporting would be most useful?
A: Need to think through it still

- Need a separate dashboard for QA/QC for use in the field for diagnosing problems (doesn't need to be constrained to the AZmet website)
- Separate dashboard for data consumers
- For the website, focus on tabular data presentation (`gt` package?)
- For the website, haven't gone beyond the built-in capabilities of Drupal
- Haven't gone into visualizations for the website

API conversation:

- As long as we (David and Eric) only have read access to the database, that's a good thing.

Next steps:

- Drafting proposal for the data incubator
- Jeremy will get it started and send it around

## Meeting

- Loading time isn't a big issue since the report is only used internally
- Likes the Shiny app interface with the date selector
- AZMET is in a good position to fund continued work
- First step: consistency checks, date slider, daily-data Shiny app (rough sketch below)
- The report just pulls data from `azmetr`, so it doesn't need a `targets` pipeline
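A minimal sketch of the daily-data Shiny app with a date-range selector, pulling straight from `azmetr` with no `targets` pipeline. The column names used for plotting (`datetime`, `temp_air_maxC`, `meta_station_name`) are assumptions about the `az_daily()` output and may need adjusting.

```r
library(shiny)
library(azmetr)
library(ggplot2)

ui <- fluidPage(
  titlePanel("AZMet daily data"),
  # Default to the last week of data
  dateRangeInput("dates", "Date range",
                 start = Sys.Date() - 7, end = Sys.Date() - 1),
  plotOutput("temp_plot")
)

server <- function(input, output, session) {
  # Re-query azmetr whenever the date range changes
  daily <- reactive({
    az_daily(start_date = input$dates[1], end_date = input$dates[2])
  })

  output$temp_plot <- renderPlot({
    ggplot(daily(), aes(x = datetime, y = temp_air_maxC, color = meta_station_name)) +
      geom_line()
  })
}

shinyApp(ui, server)
```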