# AiiDA WorkGraph coding days
###### tags: `event`
###### time: 07.07.2025 – 11.07.2025
[TOC]
**Participants**:
* Alex
* Edan
* Julian
* Xing (coordinator)
---
## Goal
Improve the documentation and clean up the codebase in preparation for a stable release.
---
## Strategy
Each participant will review the documentation, identify issues, discuss them with the group, and work on fixes or improvements. These improvements may apply to both the documentation and the codebase, depending on what's needed.
Some fixes may take longer to complete. In such cases, they can be added to the GitHub project board and scheduled for the next coding week, where everyone will work independently.
Xing, as the coordinator, will focus on cleaning up the codebase. He will join the group discussions and be available for discussion at any time during the week.
## Place
We will work on site for three days, with the option to work remotely for the other two.
---
## Topics
### Documentation
Everyone (Alex, Edan, Julian) will pick topics from the documentation, try them out, and evaluate:
* Is the content clear? Too much or too little?
* Is the syntax user-friendly? Are the API names clear and simple?
* Is there outdated content? Are the latest conventions used?
* Are there any errors? Are the error messages informative?
Focus areas:
* [**HowTos** section](https://aiida-workgraph.readthedocs.io/en/latest/howto/index.html)

* [**Built-in tasks** section](https://aiida-workgraph.readthedocs.io/en/latest/built-in/index.html) <img width="290" src="https://hackmd.io/_uploads/rJO7MWSrge.png" />
From Thursday onward, review additional sections:
* **Quick start** – Edan
* **Tutorials** – Alex
* **Concepts** – Julian
---
### Code
Refer to this [GitHub project](https://github.com/orgs/aiidateam/projects/12/views/1) for open issues. More items will be added throughout the week based on documentation feedback.
---
## Schedule
### Monday
* **10:00** Kickoff meeting (OVGA/999)
* Discuss major WorkGraph issues
* Define goals for this week
* Identify items for independent follow-up week
* **10:30–11:15** Read and test 1–2 documentation topics
* **11:20–12:00** Group discussion on what can be improved (OVGA/102)
* **13:30–14:15** Read and test another 1–2 topics
* **14:20–15:30** Group discussion (OVGA/999)
* **16:00 onward** Start improving docs/code, create PRs
---
### Tuesday to Friday (same structure)
* **9:00–10:00** Work on improvements (docs and code), submit PRs
* **10:00–10:30** Daily sync – present PRs and plan the day
* **10:30–11:15** Read and test 1–2 documentation topics
* **11:20–12:30** Group discussion
* **13:30–14:15** Read and test another 1–2 topics
* **14:20–15:30** Group discussion
* **16:00 onward** Work on improvements (docs and code), submit PRs
---
## Thoughts
* Use cases of the syntax:
    * Easy to use (does not offer all features, but is the most intuitive)
    * Fully tunable (programmatic usage, fine-tuning of data provenance)
* Types of users:
    * End user: mainly runs calculations, with simple manipulation of workflows
    * Workflow developer: needs to write complex workflows or even meta-workflows
* Remove the `%load_ext aiida` Jupyter magic throughout? (See the sketch below.)
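If the magic is dropped, a plain-Python equivalent that works in scripts and notebooks alike is `load_profile()`; a minimal sketch:
```python=
# instead of the notebook-only magics:
#   %load_ext aiida
#   %aiida
from aiida import load_profile

load_profile()  # load the default AiiDA profile
```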
---
## Open questions to be discussed
* [AG] Can we replace `@task` with only the `@task.pythonjob` task? Could it run on localhost when no remote computer is specified? I am not sure about the technical differences.
**-->** Xing pointed out that a key difference is that all `@task` tasks run in the same directory (in `run` mode, the directory of the Python environment; in `submit` mode, the daemon directory), so they share a working directory, which is a feature people have requested.
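For reference, a minimal sketch contrasting the two decorators, based on the difference described above (the `computer` input on the PythonJob task is an assumption about the current API):
```python=
from aiida_workgraph import WorkGraph, task

@task  # plain task: shares a working directory with other @task tasks
def add_local(x, y):
    return x + y

@task.pythonjob()  # PythonJob task: runs as a calculation job in its own working directory
def add_remote(x, y):
    return x + y

wg = WorkGraph("task_vs_pythonjob")
wg.add_task(add_local, "add_local", x=1, y=2)
# assumption: the target computer is selected via the `computer` input
wg.add_task(add_remote, "add_remote", x=1, y=2, computer="localhost")
```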
* [AG] The sections `Run tasks in parallel`, `Aggregate data from multiple tasks`, and `Use map to generate tasks dynamically` overlap heavily. My impression is that we should merge `Aggregate data from multiple tasks` and `Use map to generate tasks dynamically` into one section, and that `Run tasks in parallel` should become a `Concepts` section on `Dataflow programming`: WorkGraph tracks the usage of your inputs and outputs and thereby determines which tasks can run in parallel.
**-->** We agreed to have one section, `Run tasks in parallel (using graph builder, map zone)`, that covers the Python for-loop, graph-builder, and map-zone approaches. In addition, we will add a `Concepts` section about the dataflow-programming aspects of WorkGraph. The aggregation part can be moved to the new `Namespace` section, but it still needs to be mentioned in the `Run tasks in parallel` section.
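For the new dataflow-programming `Concepts` section, a minimal sketch of the underlying idea (task and socket names are illustrative): tasks whose inputs do not depend on each other's outputs have no edge between them, so the engine is free to run them concurrently.
```python=
from aiida_workgraph import WorkGraph, task

@task
def add(x, y):
    return x + y

wg = WorkGraph("dataflow_example")
# add1 and add2 share no data dependency, so they can run in parallel
add1 = wg.add_task(add, "add1", x=1, y=2)
add2 = wg.add_task(add, "add2", x=3, y=4)
# add3 consumes both results, so it only starts once add1 and add2 are done
wg.add_task(add, "add3", x=add1.outputs.result, y=add2.outputs.result)
wg.submit()
```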
* [JG]: Utility function to generate "MultiplyAddWorkgraph"
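A hypothetical sketch of what such a utility could look like (the function name, the `multiply`/`add` tasks, and the socket names are illustrative, not existing API):
```python=
from aiida_workgraph import WorkGraph, task

@task
def multiply(x, y):
    return x * y

@task
def add(x, y):
    return x + y

def multiply_add_workgraph(x, y, z) -> WorkGraph:
    """Hypothetical utility returning a WorkGraph that computes x * y + z."""
    wg = WorkGraph("MultiplyAddWorkGraph")
    mul = wg.add_task(multiply, "multiply", x=x, y=y)
    wg.add_task(add, "add", x=mul.outputs.result, y=z)
    wg.outputs.result = wg.tasks.add.outputs.result
    return wg

wg = multiply_add_workgraph(2, 3, 4)
wg.run()
```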

* [AG] Is the graph builder still useful? We can combine workflows with factories (see below). The runtime evaluation of the graph builder was historically useful for expressing for- and if-logic, but we now have syntactically easy-to-use versions of for- and if-logic via the context-manager solution. So the question is: is the graph builder still useful, i.e. is there a use case for it that we can express? There are still minor advantages: its syntax is more consistent with the rest of WorkGraph, you can ignore data provenance, ... The provenance part is still not clear to me [AG].
```python=
# --- Workflow developer, lives in a different module (e.g. myworkflow.py)
from aiida_workgraph import task, WorkGraph

@task
def add(x, y):
    return x + y

# also possible to do
# def add_workflow_factory(x, y):
def add_workflow_factory():
    wg = WorkGraph()
    wg.add_input("workgraph.int", "x")
    wg.add_input("workgraph.int", "y")
    # if-logic could go here, e.g. with If(...): ...
    wg.add_task(add, "add1", x=wg.inputs.x, y=wg.inputs.y)
    wg.add_task(add, "add2", x=5, y=wg.tasks.add1.outputs.result)
    wg.outputs.sum1 = wg.tasks.add1.outputs.result
    wg.outputs.sum2 = wg.tasks.add2.outputs.result
    return wg

# --- Workflow user
from aiida_workgraph import WorkGraph
# from myworkflow import add_workflow_factory, add

wg = WorkGraph()
task1 = wg.add_task(add, "add0", x=1, y=1)  # some upstream task providing a result
wg.add_task(add_workflow_factory(), x=task1.outputs.result, y=2)
wg.run()
```
**-->** Still open
* Graph builder vs. context manager: show people an EOS example using both the graph builder and the context manager.
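Not the EOS example itself, but a minimal sketch of the same toy workflow in both styles, to make the comparison concrete (the exact `@task.graph_builder(outputs=...)` signature and the behaviour of calling tasks inside `with WorkGraph(...)` are assumptions about the current API):
```python=
from aiida_workgraph import WorkGraph, task

@task
def add(x, y):
    return x + y

# graph-builder style: a decorated function builds and returns a WorkGraph
@task.graph_builder(outputs=[{"name": "result", "from": "add2.result"}])
def add_twice(x, y):
    wg = WorkGraph()
    wg.add_task(add, "add1", x=x, y=y)
    wg.add_task(add, "add2", x=wg.tasks.add1.outputs.result, y=y)
    return wg

# context-manager style: tasks are called directly inside the active graph
with WorkGraph("add_twice_ctx") as wg:
    out1 = add(x=1, y=2)
    out2 = add(x=out1.result, y=2)
    wg.outputs.result = out2.result
```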
* Syntax to continue a finished WorkGraph: in principle, restarting a finished WorkGraph and continuing a finished WorkGraph are the same thing.
* [AG] Is there a public `waiting_on` for sockets?
```python=
should_run = compare(x=wg.ctx.n, y=50)
should_run._waiting_on.add(result1)
```
**-->** There is this syntax `result1 << should_run`
* [AG] The automatic creation of the `python3@localhost` code seems to look for a Python executable outside of my current environment. Using `sys.executable` should work (see the sketch below).
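A minimal sketch of the suggested fix when setting up the code manually (assuming an existing `localhost` computer; the `python3` label matches the automatically created code):
```python=
import sys

from aiida import load_profile, orm

load_profile()

code = orm.InstalledCode(
    label="python3",
    computer=orm.load_computer("localhost"),
    # use the interpreter of the current environment instead of whatever
    # `python3` resolves to on PATH
    filepath_executable=sys.executable,
)
code.store()
```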
- New table of contents for the How-To section, with adaptations of https://github.com/aiidateam/aiida-workgraph/issues/525:
```
Overview (Xing - Edan reviews)
* use cases (brief with links)
* context manager
* graph builder approach
* programmatic approach
```
```
Quick start (Edan)
* 5 minute read max
* Context manager syntax
* What's next?
```
```
How To:
- Interoperate with aiida-core components (Alex)
* Integrate workflows and calculations
* Integrate WorkGraph into a WorkChain
- Graph-level input/output (Edan)
- Control task flow (Julian)
* Introduction
* While
* Workflow description
* Context manager
* If
* Workflow description
* Context manager
* Graph builder
- Run tasks in parallel (Alex)
* Conventional for-loop
* Graph builder
* Map
- Wait on a task (Julian)
* Group tasks (Julian)
- Combine workgraphs (Edan)
- Save and restart a workgraph (Edan)
* Continue a finished workgraph (Edan)
- Write error-resistant workflows (Alex)
- Monitor a state (remove this from built-in tasks) (Edan)
- Run shell commands as a task (Xing)
- Run remotely (the name here needs work because this can also be useful locally when simply wanting to isolate the execution...) (Xing)
* PythonJob
* ShellJob
* CalcJob
- Generate workflows programmatically (Alex)
* If
* While
* Map
```
```
Concepts: (Xing, then others)
- Task
- Socket
* nested and dynamic namespaces
- WorkGraph
* Creation
* Engine
- Context variable
(remove the api references)
- Graph Builder
```
```
CLI (Julian)
- Pause/play/kill a task
```
```
GUI (Xing)
- Pause/play/kill a task
```
```
Tutorials (Julian)
- Materials Science
- Atomization energy
- Difference of atomization energies
- Relaxation
- EOS
```
The From Zero to Hero tutorial will be removed and maybe integrated into other tutorials.
```
Migration from `aiida-core`: (Xing)
- Migrate `PwBandsWorkChain` to WorkGraph
- Error handler
```
```
Blogs (move to AiiDA blogs) (Xing)
```
```
API reference (Alex 🤖)
```
## Group discussions
#### [Graph builder](https://aiida-workgraph.readthedocs.io/en/latest/howto/autogen/graph_builder.html)
* [JG]: Is the graph builder doc source only in `docs/gallery/howto/autogen/graph_builder.py`? Should it also be elsewhere? How does one build the docs?
* [JG]: Why not combine these two lines:
```
wg.update_ctx({"task_out": task.outputs.result})
wg.outputs.result = wg.ctx.task_out
```
to
```
wg.outputs.result = task.outputs.result
```
Why do we need the context?
- [xw] In this case, there is no need to use ctx. By the way, the example is not intuitive: why output only the result of the last task? I suggest changing it to either 1) remove the outputs, or 2) gather the results of all tasks using ctx (see the sketch below).
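A sketch of option 2, gathering every task's result under a nested ctx key and exposing it as a single graph-level output (the nested `results.*` keys and the loop are illustrative):
```
# inside the graph builder, collect the result of every task in ctx
for i in range(3):
    t = wg.add_task(add, f"add{i}", x=i, y=1)
    wg.update_ctx({f"results.add{i}": t.outputs.result})
# expose the collected namespace as a single graph-level output
wg.outputs.results = wg.ctx.results
```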
- Graph builder content is dispersed -> move it to a concept section
- Concepts:
    - Dataflow programming (node-graph programming)
- Decision needed: the context-manager way or the node-graph-programming way