# SwissTwins AiiDA W&C Meeting 2024-06-13

###### tags: `bi-weekly meetings`
###### time: 11:00 CET

[TOC]

### Present

* Julian
* Alexander
* Matthieu
* Rico

### Progress

Current WorkGraph:

![image](https://hackmd.io/_uploads/BJ6lXNuHR.png)

Simple AiiDA provenance graph for one simplified cycle:

![image](https://hackmd.io/_uploads/Syp-amurA.png)

---

### Questions Alexander and Julian

* Just to verify the logic, this
  ```yaml=
  ICON:
    input:
      - restart:
          lag: '-P2M'
  ```
  means that as soon as `start_date + period*i + lag >= start_date` holds, we depend only on the input file with timestamp `start_date + period*i + lag`; otherwise it is optional.
  -> Correct. We might need to consider cycles with their own start date. Don't forget the other side of the case either: `start_date + period*i + lag <= end_date`. Maybe an error should be raised? We will think about it once it becomes relevant.
* Not specifying the date/lag -> Take the time of the current cycle/step
* `preproc` depends on the time and creates different files
* I am not sure how to put the argument information into the YAML. In principle this is simple
  ```yaml=
  preproc:
    input:
      - grid_file
      - extpar_file:
          date: *root_start_date
      - ERA5
    arguments:
      ERA5: ERA5
      extpar-file: extpar_file
      grid-file: grid_file
      custom-argument: hello
  ```
  -> Consider options that don't have arguments (e.g. `-v` / `--verbose`)
* Actual commands run on the command line:
  ```shell=
  ./extpar extpar_namelist  # contains path to obs_data
  ./preproc
  ./icon icon_namelist      # contains everything
  ./cosmo
  ```
* How to provide environment variables via `runtime` (in general)
* Setting different working directories, e.g. for the different ICON runs (this is something we need to check in `aiida-core/shell/workgraph`)
* Does `icon_input` need the timestamp attached to it, or is it always the same file? The same question applies to `Extpar` and `extpar_file`, `grid_file`, etc.
* Add executables to the `scheduling: \n graph:` section.
  Other runtime metadata can remain in the extra section at the end, but we need actual, callable commands / AiiDA codes to generate the WorkGraph
  -> No.
* Also, which way should the command information be given:
    * Absolute path to the executable (most Linux users might be familiar with that), or
    * Name of the AiiDA `Code`

  -> We need an AiiDA code to run the executable via AiiDA. Users either have to define the codes beforehand themselves, or we generate `AbstractCode` instances on the fly (similar to how `aiida-shell` does it for built-in Linux commands when they are being used)
  -> Don't **require** an absolute path for the executable. Accept either a label or an executable. When generating the AiiDA code on the fly, if the database already has an entry for that code with a different path, we have to alert the user with an error. The advantage of supporting labels is that sharing YAML files becomes easier: the path is configured per user while the label stays the same.
* How is the namelist specified by the user? Or is it a hard-coded filename in the current path?
* The user **has** to set up the AiiDA `Computer`, even if we generate, e.g., the `Code` on the fly
  -> Then specify the computer either once in the workflow, globally, or also on a per-task basis

#### JG

* Tasks/data are ignored if outside of bounds
* Differentiate between:
    * absolute files that are already present on the file system, don't originate from a process, don't get modified, and don't have a date attached to them (this is done already -> `obs_data`, `ERA5`, `grid_file`)
    * files that don't have a date attached to them (or it doesn't make a difference, as it's not being used), but originate from a process (`extpar_file`) -> handle this as the case above
    * files that get created multiple times during the cycling, for instance at different dates (all other ones: `icon_input`, `stream_*`, `restart`)
* Why even attach `root_start_date` to `extpar_file`? Does that depend on the date (apart from the "cycle" just being run once)?
* I find attaching dates to the files a bit weird, e.g. `- restart: \n lag: '-P2M'`
* Hard-code (predefine) the jobs on which other jobs depend in the functions that add the WorkGraph nodes, e.g.
  ```python=
  def add_preproc_job_node(
      wg, era5_file_abs, grid_file_abs, extpar_file, extpar_job  # or prev_extpar_job
  ):
      ...  # body not discussed in the meeting
  ```
  as we require the `extpar_job.outputs["extpar_file"]` in the `nodes` dictionary
  -> Need to keep track of the tasks that have been converted to WorkGraph (e.g. create a new empty, nested task dictionary and pop internal ones to the new one through )

---

### Collab example

Group: setup_icon_for_aiida.sh

```bash
setup_icon_for_aiida "computer" "/path/to/my/icon"
```

User1 DB: Code "icon" -> /home/user1/bin/icon

User2 DB: Code "icon" -> /home/user2/bin/my_icon

Only the label needs to be referenced; changing paths is not required (optional).
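
The label-based lookup in the collab example, combined with the decision to raise an error when a code generated on the fly clashes with an existing database entry, could look roughly like the sketch below. It is plain Python and does not use the AiiDA API: the `registry` dict is only a stand-in for the codes in a user's AiiDA database, and `resolve_code` and `CodeConflictError` are hypothetical names introduced for illustration.

```python
from pathlib import PurePosixPath


class CodeConflictError(Exception):
    """Raised when a label is already registered with a different path."""


def resolve_code(registry, label=None, path=None):
    """Return the executable path for a task given a label and/or a path.

    `registry` maps code labels to executable paths. It is an assumption
    standing in for the codes stored in a user's AiiDA database.
    """
    if label is None and path is None:
        raise ValueError("Specify a code label, an executable path, or both")

    if label is None:
        # Only a path was given: use the file name as the label.
        label = PurePosixPath(path).name

    known = registry.get(label)

    if path is None:
        # Only a label was given: the code must already be configured.
        if known is None:
            raise KeyError(f"No code with label {label!r} is configured")
        return known

    if known is None:
        # Unknown label: register the code on the fly.
        registry[label] = path
        return path

    if known != path:
        # Same label, different path: make the user aware with an error.
        raise CodeConflictError(
            f"Code {label!r} already points to {known!r}, not {path!r}"
        )
    return known


# The same YAML label resolves to a different path for each user.
user1_db = {"icon": "/home/user1/bin/icon"}
user2_db = {"icon": "/home/user2/bin/my_icon"}
print(resolve_code(user1_db, label="icon"))
print(resolve_code(user2_db, label="icon"))
```

With this shape, a shared YAML file only needs `icon` as the label, while an explicit path that contradicts an existing entry fails loudly instead of silently creating a second code.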