###### tags: `TopoSCALE`
# TMAPP-EVO Meetings
## Meet 25 Feb
Todos:
* J draws high-level schematic
* J ports py2 -> py3 new repo
* S Romania case to J
* J setup example sim
### Agenda:
1. Overview structure tscale
2. Known issues
3. Simon's points
4. How we can organise ourselves
5. Romania project
*Let's keep it high-level today, not too much on the details.*
### Overview structure tscale:
*I will make this schematic!*
- starts with config
- download ERA5 for AOI
- points or map?
    - Points: construct listpoints from sparse DEM pixels of interest (POIs)
    - Map: TopoSUB maps the domain (AOI); listpoints represent the TopoSUB samples
- New map!:
> In the case of generating spatial models, TopoSCALE is coupled to the spatial subgrid scheme TopoSUB (Fiddes and Gruber, 2012), which reduces the modelling problem of a given domain from N pixels to S samples via a k-means multivariate clustering algorithm, the clustering variables being topographic predictors. In this study domain this reduces 0.5 million 50 m pixels to 50 samples. However, to account for spatial variability in the forcing, each sample is driven by each of the G coarse grid cells, so the total number of TopoSCALE runs is S × G, in this case 1750 (G = 35). This generates S TopoSCALE time series per coarse-grid centroid, representing the topographic variability that exists within that COSMO grid cell. Finally, a 2D interpolation is performed for each sample (from the coarse-grid centroids) to each pixel member of that sample. This produces an efficient multidimensional interpolation of the coarse-grid forcing to a subgrid, high-resolution topography at each model timestep (hourly).
- TopoSCALE is then always driven by listpoints.txt
- Run TopoSCALE -> set of time series corresponding to listpoints (see the driver sketch below)
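
A minimal sketch of how this flow could look as a single driver, purely for discussion; every function name here is a placeholder, not the current code:

```python
# Hypothetical driver for the flow above; all names are placeholders.

def download_era5(bbox, start, end):
    """Placeholder: fetch ERA5 surface + pressure-level data for the AOI."""
    print(f"download ERA5 {bbox} {start}..{end}")

def listpoints_from_pois(poi_file):
    """Placeholder: sparse DEM pixels of interest -> listpoints rows."""
    return [{"id": 0, "ele": 2540.0, "slp": 22.0, "asp": 180.0, "svf": 0.95}]

def listpoints_from_toposub(dem, n_samples):
    """Placeholder: TopoSUB k-means samples -> listpoints rows."""
    return [{"id": s, "ele": 1800.0 + 20 * s, "slp": 15.0, "asp": 90.0, "svf": 0.9}
            for s in range(n_samples)]

def run_toposcale(point):
    """Placeholder: one downscaled forcing time series per listpoints row."""
    print(f"TopoSCALE run for point {point['id']}")

def main(mode="points"):
    download_era5(bbox=(45.5, 6.5, 46.5, 7.5), start="2000-09", end="2001-08")
    if mode == "points":
        listpoints = listpoints_from_pois("pois.csv")        # Points case
    else:
        listpoints = listpoints_from_toposub("dem.tif", 50)  # Map case
    for point in listpoints:  # TopoSCALE always runs off listpoints
        run_toposcale(point)

if __name__ == "__main__":
    main()
```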
### Known issues:
- multi-platform requirement (laptop/HPC) -> multiple use cases (points/maps) = lots of run files and wrappers
- R code
- py2
- tests
- dev/master branch
- terminology
- name!
- fsm online
- tclim
### Simon's points:
1. make a PyPI package combining the whole toolbox (data download, TopoSUB and TopoSCALE)
2. organize the code in classes (see the sketch after this list):
with classes you can segment the code into one package, with for instance a class for ERA5 download, one for CORDEX, and so on, one class for TopoSUB and another for TopoSCALE. These are suggestions and other break-outs of the code might be better; the point is to avoid repeating code and to have everything in one toolbox -> tlib.py / t3d
3. create a main branch separate from dev as you suggested yesterday.
4. add a License (https://choosealicense.com/)
5. one centralized documentation site with Read the Docs: https://tscalev2.readthedocs.io/en/latest/
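
To make point 2 concrete, a rough sketch of what such a class layout could look like; the names (`tlib`, `Era5Downloader`, etc.) are illustrative only, not a decided API:

```python
# tlib.py -- illustrative layout only, one class per concern

class Era5Downloader:
    """Fetch ERA5 surface and pressure-level fields for an AOI."""
    def __init__(self, bbox, start, end):
        self.bbox, self.start, self.end = bbox, start, end
    def fetch(self):
        print(f"ERA5 {self.bbox} {self.start}..{self.end}")

class CordexDownloader:
    """Same interface as Era5Downloader, but for CORDEX scenarios."""
    def fetch(self):
        raise NotImplementedError

class TopoSub:
    """Cluster a DEM into S samples and build listpoints."""
    def __init__(self, dem, n_samples=50):
        self.dem, self.n_samples = dem, n_samples
    def listpoints(self):
        raise NotImplementedError

class TopoScale:
    """Downscale a coarse forcing to each listpoints sample."""
    def run(self, listpoints, forcing):
        raise NotImplementedError
```

The point of the shared `fetch()` interface is that downloaders can be swapped (ERA5 vs CORDEX) without touching the TopoSUB/TopoSCALE code.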
### How we can organise ourselves
1. py3!
2. high-level overview schematic
3. New repo dev branch
4. start attacking the code!
### Romania project
1. start with current code base to get to know how it works
2. this will help with any dev work
## TopoSCALE-COSMO desc.
TSCALE-COSMO is based on the TopoSCALE downscaling scheme (Fiddes and Gruber, 2014). TopoSCALE can be described as an efficient, physically based downscaling scheme that applies physical principles to account for the effect of high-resolution topography on boundary-layer meteorology. Its main purpose is to generate high-resolution forcings that account for the main dimensions of surface variability and enable efficient regional-scale simulations. TopoSCALE performs a 3D interpolation of atmospheric fields available on pressure levels to account for time-varying lapse rates, and a topographic correction of radiative fluxes. The latter includes a cosine correction of incident direct shortwave on a slope, adjustment of diffuse shortwave and longwave by the sky view factor, and elevation correction of both longwave and direct shortwave. Additionally, a simple lapse rate is implemented to allow orographic enhancement of precipitation to be considered (Liston and Elder, 2006). In practice, variability in snow accumulation and depletion is controlled mainly by the energy balance (wind redistribution is not considered). The scheme is coupled to the snow model FSM (Essery, 2015) to simulate snow parameters and surface temperature. It has been well tested in various geographical regions and applications, e.g. permafrost in the European Alps (Fiddes, Endrizzi and Gruber, 2015), Northern Hemisphere permafrost (Westermann et al., 2015), Antarctic permafrost (Obu et al., 2020), Arctic snow cover (Aalstad et al., 2018) and Alpine snow cover (Fiddes, Aalstad and Westermann, 2019). Typically, TopoSCALE has been driven by reanalyses from ECMWF (ERA-Interim, ERA5); in this study the scheme is adapted to a COSMO-1 forcing.
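
For reference, the sky-view-factor and slope corrections listed above usually take forms like the following. This is a sketch of the standard formulations, not the exact TopoSCALE code; the elevation corrections of longwave and direct shortwave are omitted:

```python
import numpy as np

SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant [W m-2 K-4]

def sw_direct_on_slope(sw_dir_horiz, cos_zenith, cos_illum):
    """Cosine correction of direct SW: horizontal flux -> flux on the slope,
    where cos_illum is the cosine of the sun-slope illumination angle."""
    return np.where(cos_zenith > 0.01, sw_dir_horiz * cos_illum / cos_zenith, 0.0)

def sw_diffuse_terrain(sw_diff, svf):
    """Diffuse SW scaled by the sky view factor (SVF)."""
    return svf * sw_diff

def lw_terrain(lw_sky, svf, t_surround_k, emissivity=0.99):
    """LW adjusted by SVF: sky portion plus emission from the surrounding
    terrain at temperature t_surround_k [K]."""
    return svf * lw_sky + (1.0 - svf) * emissivity * SIGMA * t_surround_k**4
```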
Fundamentally, TopoSCALE operates on 1D samples with the attributes elevation, slope, aspect, SVF, longitude and latitude. In the case of a single point or a set of a few points this is a trivial array of TopoSCALE runs. In the case of generating spatial models, TopoSCALE is coupled to the spatial subgrid scheme TopoSUB (Fiddes and Gruber, 2012), which reduces the modelling problem of a given domain from N pixels to S samples via a k-means multivariate clustering algorithm, the clustering variables being topographic predictors. In this study domain this reduces 0.5 million 50 m pixels to 50 samples. However, to account for spatial variability in the forcing, each sample is driven by each of the G coarse grid cells, so the total number of TopoSCALE runs is S × G, in this case 1750 (G = 35). This generates S TopoSCALE time series per coarse-grid centroid, representing the topographic variability that exists within that COSMO grid cell. Finally, a 2D interpolation is performed for each sample (from the coarse-grid centroids) to each pixel member of that sample. This produces an efficient multidimensional interpolation of the coarse-grid forcing to a subgrid, high-resolution topography at each model timestep (hourly). COSMO forcings were aggregated to 10 km to make the study more comparable to standard TopoSCALE setups, where there is a large difference in resolution between the coarse-grid forcing and the subgrid simulation domain.
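
The N pixels -> S samples step can be sketched with scikit-learn's k-means. The predictor set and scaling here are illustrative (aspect is split into sine and cosine because it is circular), not the exact TopoSUB configuration:

```python
# Sketch of the TopoSUB reduction (N pixels -> S samples) via k-means on
# topographic predictors; predictor choice is illustrative only.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def toposub_samples(elevation, slope, aspect, svf, n_samples=50, seed=0):
    """Cluster DEM pixels into n_samples TopoSUB samples.

    Inputs are 1D arrays of per-pixel predictors; returns the sample id of
    each pixel and the (unscaled) predictor centroid of each sample."""
    X = np.column_stack([elevation, slope,
                         np.cos(np.radians(aspect)), np.sin(np.radians(aspect)),
                         svf])
    scaler = StandardScaler()
    km = KMeans(n_clusters=n_samples, n_init=10, random_state=seed)
    labels = km.fit_predict(scaler.fit_transform(X))
    centroids = scaler.inverse_transform(km.cluster_centers_)
    return labels, centroids
```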
The method is implemented as an array of points that loops through simulation months (the smaller loop, rarely greater than several hundred iterations), so it is highly scalable: runtimes for a single point and for thousands of points are similar. With this approach it is possible to generate high-resolution simulations at continental scales without the need for HPC clusters. However, the scheme is also implemented as a standard 100-node HPC job, so that simulations measured in several hours can be run in minutes by parallelising the time dimension.
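
The scalability claim follows from the job structure being embarrassingly parallel: every (point, month) pair is independent. A minimal sketch with `multiprocessing` (names are placeholders, not the actual run scripts):

```python
# Each (point, month) job is independent, so points and the time dimension
# can both be parallelised trivially.
from multiprocessing import Pool

def run_toposcale_job(job):
    point, month = job
    # placeholder: downscale one point for one month
    return f"point {point} month {month}: done"

if __name__ == "__main__":
    points = range(50)                              # S samples
    months = [f"2019-{m:02d}" for m in range(1, 13)]
    jobs = [(p, m) for p in points for m in months]
    with Pool(processes=8) as pool:                 # e.g. one per core/node
        for result in pool.imap_unordered(run_toposcale_job, jobs):
            pass
```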
## Meet 20 Feb 2021
- [ ] schematic / pseudo code description
- [ ] py3 port
- [ ] new clean repo - "the lifeboat"
## A place to document dev work
## Issues to create (I think a Slack channel would be a good way to document discussion)
### Bugs
Hi Joel,
I tested the TOA incoming SW a bit more today just to be on the safe side based on your finding that we actually don't need to download it since it's just calculated from mu0*S0.
As expected (since it says so in the IFS documentation) this is indeed the case, but we have to be a bit careful with the implementation. If we simply use mu0 (cosine of the solar zenith angle) at the current time we get the wrong result (see test_toa_simple.jpg), with an error (mean absolute error) of as much as 30% (this is for a site in Finnmark). You can clearly see the error in the scatter plot (the y-axis is the estimate using mu0*S0 while the x-axis is the ERA5 TOA SW). I'm quite sure that there is no offset in the time vector, everything is in UTC, and the solar geometry should be accurate.
Correcting this (you may already have done this in your implementation), we have to account for the fact that the SWtoa values are not instantaneous but averages over the timestep. As such it's better to use the average mu0 over the current hourly timestep; a pseudocode approximation is mu0av = 0.5*(mu0(i) + mu0(i-1)). This gives better results (3% mean absolute error), as shown in test_toa.jpg. Still, the SW from TOA is slightly overestimated with the current approximation, but maybe we can live with this.
Maybe we could compare this for Weissfluhjoch as well; it would be cool to see if you get a similar slight overestimation.
I'll hopefully also have some results for there to share either tonight or tomorrow.
Cheers,
Kris
P.S. I used 1360 for the solar constant (based on Wiki)
- Yeah, I was comparing to ERA5. Thanks for giving me the correct value for S0; unfortunately I think this just makes the positive bias worse (since it increases the values on the y-axis, i.e. our estimates with mu0*S0). Still, the bias is quite small (as long as we use the average mu0 over the time period, to account for the accumulated energy) and we are probably making much bigger errors elsewhere.
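
A sketch of the fix Kris describes: TOA SW as S0 times the timestep-averaged mu0. The declination and hour-angle formulas here are deliberately simple textbook approximations for illustration, not the project's actual solar-geometry code:

```python
import numpy as np

S0 = 1360.0  # solar constant [W m-2], value used above

def cos_zenith(doy, hour_utc, lat_deg, lon_deg):
    """Instantaneous cosine of the solar zenith angle (simple approximation)."""
    decl = np.radians(23.44) * np.sin(2 * np.pi * (doy - 81) / 365.25)
    hour_angle = np.radians(15.0 * (hour_utc + lon_deg / 15.0 - 12.0))
    lat = np.radians(lat_deg)
    mu0 = (np.sin(lat) * np.sin(decl)
           + np.cos(lat) * np.cos(decl) * np.cos(hour_angle))
    return np.maximum(mu0, 0.0)

def sw_toa(doy, hour_utc, lat_deg, lon_deg):
    """Hourly-mean TOA SW: average mu0 at the start and end of the timestep,
    since ERA5 SWtoa is a timestep average, not an instantaneous value
    (this is the ~30% -> ~3% error fix described above)."""
    mu0_now = cos_zenith(doy, hour_utc, lat_deg, lon_deg)
    mu0_prev = cos_zenith(doy, hour_utc - 1.0, lat_deg, lon_deg)
    return S0 * 0.5 * (mu0_now + mu0_prev)
```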

- I found another important problem that we should be aware of! In the humidity calculations that I do (and that you may also use now in the Python version) there was a mistake (typo) in the original paper that I based them on (Lawrence, 2005, https://doi.org/10.1175/BAMS-86-2-225). The correct value for the parameter "A1" in both the Magnus formula (saturation vapor pressure) and the dewpoint -> RH conversion should be 17.625, NOT 7.625, which is what I was using (based on Lawrence). I checked the original paper that he got it from (and a similar expression used by ECMWF) and the correct value is 17.625. This can have a significant impact on our humidity calculations. Sorry about that! But it's good to know, and just a coincidence that I found it (I was checking our formulation vs the ERA5 formulation), since one would expect such a highly cited reference not to contain such typos.
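
For the record, the corrected formulation with the constants as given by Lawrence (2005) and the typo fixed; a sketch, not the project code:

```python
# Magnus-type formula, Lawrence (2005): e_s(T) = C * exp(A1*T / (B1 + T)),
# T in deg C, with the corrected A1 = 17.625 (not 7.625).
import numpy as np

A1 = 17.625   # dimensionless (corrected value)
B1 = 243.04   # [deg C]
C = 610.94    # [Pa]

def sat_vapor_pressure(t_c):
    """Saturation vapor pressure [Pa] at air temperature t_c [deg C]."""
    return C * np.exp(A1 * t_c / (B1 + t_c))

def rh_from_dewpoint(t_c, td_c):
    """Relative humidity [%] from air and dewpoint temperatures [deg C]."""
    return 100.0 * np.exp(A1 * td_c / (B1 + td_c)) / np.exp(A1 * t_c / (B1 + t_c))
```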
### Features
- For the interpolation, we could go closer to what you originally had and treat the horizontal component separately from the vertical component, using a natural neighbor approach in the horizontal (this only needs to be done for one time step, since the horizontal structure is fixed) and linear interpolation in the vertical (for each time step); see the sketch below. This is also a note to self for tomorrow.
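
A sketch of that split: the horizontal weights are computed once from the (fixed) grid geometry and reused at every timestep, followed by a linear vertical interpolation. Plain barycentric (linear) weights stand in for natural neighbor here, which would need an extra dependency such as MetPy:

```python
import numpy as np
from scipy.spatial import Delaunay

def horizontal_weights(grid_xy, target_xy):
    """Barycentric (linear) weights of the target point within its Delaunay
    triangle; computed once, since the horizontal structure is fixed.
    Assumes the target lies inside the convex hull of the grid columns."""
    tri = Delaunay(grid_xy)
    s = int(tri.find_simplex(target_xy))
    verts = tri.simplices[s]                  # indices of 3 grid columns
    A = tri.transform[s]
    b = A[:2].dot(np.asarray(target_xy) - A[2])
    return verts, np.append(b, 1.0 - b.sum())

def interp_to_point(field_lp, z_l, verts, weights, target_z):
    """One timestep: horizontal step with the precomputed weights, then
    linear vertical interpolation. field_lp is (levels, columns); z_l must
    be the ascending level heights at the target location."""
    col = field_lp[:, verts].dot(weights)     # horizontal, per level
    return np.interp(target_z, z_l, col)      # vertical
```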
- Also, I've made some potential changes to the TopoSCALE routine that I would want to briefly discuss with you, mainly concerning the humidity, since I think we should deal directly with specific humidity (q) on pressure levels (instead of converting to q from RH); associated with that is a new methodology to go from q to e (vapor pressure), needed for the longwave computation (see the sketch below). In addition, I've become a bit more sceptical about whether "bilinear" interpolation is the best way for us to go after reading more into the MATLAB implementation of this routine ( https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.8.6623&rep=rep1&type=pdf ). We could also consider IDW or natural neighbor, see e.g. https://blogs.mathworks.com/loren/2015/07/01/natural-neighbor-a-superb-interpolation-method/ . I'm sure there are similar issues in Python, since this is an algorithm issue, not a language issue. I also realized that at high latitudes it may be important to convert from lat/lon to UTM, otherwise the convergence of longitude spacing with latitude will skew the interpolation. This is not so important for Switzerland, but quite important for the (high) Arctic.
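
Two quick sketches for the points above: the standard q -> e relation (with eps = Rd/Rv ≈ 0.622), and a lat/lon -> UTM projection with pyproj (the UTM zone here is just an example):

```python
import numpy as np
from pyproj import Transformer

EPS = 0.622  # ratio of gas constants Rd/Rv

def vapor_pressure_from_q(q, p_pa):
    """Vapor pressure e [Pa] from specific humidity q [kg/kg] and air
    pressure p [Pa], via e = q * p / (eps + (1 - eps) * q)."""
    return q * p_pa / (EPS + (1.0 - EPS) * q)

# Project lat/lon to UTM before horizontal interpolation, so longitude
# convergence at high latitude does not skew distances (zone 33N is just
# an example; pick the zone for the domain).
to_utm = Transformer.from_crs("EPSG:4326", "EPSG:32633", always_xy=True)
x, y = to_utm.transform(25.0, 70.0)  # lon, lat -> easting, northing [m]
```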