Metatensor-models

# Metatensor-models

Plan for unifying metatensor-based models

> **note:**
>
> mlelec would provide another set of tools to define models, used as a dependency for some models in metatensor-models

## Existing stuff

### MLIPs

- https://github.com/lab-cosmo/equisolve (public)
    - Linear models with numpy, energies and forces
    - Converters from ASE to metatensor for properties
    - RMSE on metatensor
    - blocks to build a model
- https://github.com/abmazitov/torch_alchemical (public)
    - Metatensor wrappers for linear layers, activation functions, graph convolutions, capable of doing forward pass directly on `metatensor.torch.TensorMap` objects
    - Features calculators from either torch_spex or rascaline [only torch_spex at the moment]
    - Power Spectrum, Behler-Parrinello Power Spectrum and Alchemical models based on torch_spex Spherical Expansion
    - Convenient training tools from PyTorch Lightning (allows for training callbacks, model checkpoints, multi-GPU training, etc.)
    - Automatic logging of the experiments with Weights & Biases
    - TorchScript compatibility
- https://github.com/Luthaf/alchemical-learning (public)
    - only alchemical models
    - train model with either linear algebra or gradient descent
    - very bad, don't use it
- https://github.com/bananenpampe/H2O/tree/move-rascaline (public)
    - SOAP/LODE/RS BPNN with rascaline-torch (energies, forces)
    - roughly oriented on Schnet (aggregation, feature, response torch layers)
    - and torch
    - atomic properties (chemical shieldings)
    - i-pi driver
    - uncertainty quantification (UQ)
    - UQ biased PES
    - uses PyTorch Lightning to simplify training

### Electronic structure

- https://github.com/jwa7/rho_learn (public)
    - torch/metatensor equivariant global model, wrapping linear/arbitrary nonlinear block models
    - torch dataset + dataloaders, L2 loss for scalar fields and arbitrary-rank tensors
    - Parsers + calculators for FHI-aims, specifically for RI fitting of scalar fields
    - End-to-end predictions (i.e. integrated w/ AIMS)
    - Examples for learning tensors and fields
    - Assumptions: input = ASE frames, reps w/ rascaline, targets in angular basis, train w/ torch by gradient descent
- https://github.com/curiosity54/mlelec (private)
    - Tangential stuff to unify electronic structure o/p (multicenter) including Hamiltonian stuff/interface with electronic structure, i.e. self-contained codebase to compute the desired target, train, and provide the output back to the electronic structure code (with focus on PySCF for now)
    - interface with other (non acdc) models
    - torch based
- https://serfg.github.io/pet/

## Goals & non-goals

We want something for two classes of users: ML developers who want to create new models/new architectures; and ML users who want to train existing models/architectures on new datasets.

We will have two libraries:

- Building blocks for ML developers, living in the metatensor repository
- End to end models for ML users, living in [metatensor-models](https://github.com/lab-cosmo/metatensor-models)

Both will follow all the usual software engineering practices:

- pip installable
- conda installable
- documentation
- tests
- examples
- CI
- *etc.*

### Building blocks library

#### Goals

- Make prototyping of new models easy
- Provide most of `torch.nn`
- Provide kernel & linear models
- Support both linear algebra and gradient descent for training

#### Non-Goals

- Contains code used by a single application/model. This will live either in metatensor-models (for stable stuff) or in standalone repositories

### End to end architecture and model training library

Nomenclature: the `architecture` is defined by the code in `forward()` and the training procedure, while the `model` is an architecture trained with a specific dataset (i.e. a model is something that can be used as-is for MD, an architecture needs to be trained first).

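To illustrate this nomenclature, here is a minimal sketch in plain Python. It is not taken from any of the repositories above: `LinearArchitecture`, `train` and the two toy datasets are hypothetical names, and the class only stands in for what would be a `torch.nn.Module` in practice, so that the snippet runs without dependencies. The same architecture (same `forward()` and training procedure) yields two different models once it is trained on two different datasets.

```python
import random


class LinearArchitecture:
    """Hypothetical one-parameter architecture: forward(x) = w * x."""

    def __init__(self, seed=0):
        # Untrained random weight; a fixed seed makes the start reproducible.
        self.w = random.Random(seed).uniform(-1.0, 1.0)

    def forward(self, x):
        return self.w * x


def train(architecture, dataset, epochs=200, lr=0.05):
    """Training procedure: gradient descent on the mean squared error."""
    for _ in range(epochs):
        grad = sum(
            2.0 * (architecture.forward(x) - y) * x for x, y in dataset
        ) / len(dataset)
        architecture.w -= lr * grad
    return architecture


# Two toy datasets (assumptions for illustration only).
dataset_a = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]     # y = 2 x
dataset_b = [(1.0, -1.0), (2.0, -2.0), (3.0, -3.0)]  # y = -x

# One architecture, two datasets, two models.
model_a = train(LinearArchitecture(seed=0), dataset_a)
model_b = train(LinearArchitecture(seed=0), dataset_b)

print(round(model_a.w, 6), round(model_b.w, 6))
```

In this sketch, `model_a` and `model_b` are usable as-is for predictions, while a freshly constructed `LinearArchitecture` is not.
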
#### Goals

- Use existing architectures with new system/datasets
- Allow us to market our models/architectures in conferences, have a single place to point people to if they want to try our models
- Invariant & equivariant predictions
- Per-structure and per-atom predictions
- Allow external contributions of new architectures with clear guidelines
- Provide pre-trained models (as GitHub release artifacts); automatically download them on user request
- Does not require users to write any code to train a model
- Allow less technical users to train models and compare different models
- Make it clear which models are stable/experimental
    - we will include stuff used in a paper
    - we will include stuff in development for a paper
    - we will **not** include tools to build new models
- Some facilities to compose existing architectures together, not necessarily easy to use by end users
- Minimal building blocks to allow someone else to build active learning on top of this repo
- regression tests with 0/1 epoch training + fixed seed random weights, only for non-experimental models
- per-architecture dependencies management:
    - Each architecture comes with a `requirements.txt` or `requirements.py` defining the dependencies for this model. Users can install metatensor-models alone (and get an import error when trying to use some models); or install metatensor-models + the dependencies for some architectures:

      ```
      pip install metatensor-models
      pip install metatensor-models[allegro]
      pip install metatensor-models[allegro,mesh_lode]
      pip install metatensor-models[allegro_mesh_lode]
      ```

    - it must be possible to install all architectures at the same time, meaning two different architectures can not have incompatible dependencies

#### Non-goals

- Anything that is not PyTorch
- Jupyter notebooks
- Reproducing results from papers
- Architectures that are not usable for research (i.e. small building blocks for larger models, such as atomic composition)
- Full active learning loop

## Requirements/technical decisions

### Building blocks

#### Make prototyping of new models easy

- Simple custom dataloader & dataset (prototype in: Joe's rho learn)
    - support both loading from memory & from disk
    - support both pre-computed features & systems as input
    - loading arbitrary number of data in each minibatch
- Loss functions
    - MSE/MAE/L2/...
    - **PROTOTYPE** loss on values and gradients
- Export metatensor System data to torch-geometric format
- Building blocks that can handle equivariant properties

#### Provide most of `torch.nn`

- Wrapper for something similar to ModuleDict, applying the module to blocks one by one
- Some well used models should also be directly provided (mainly `LinearModel`)

#### Support both linear algebra and gradient descent for training

- Option to solve with numpy, then export to Torch?
- Option to solve with Torch directly for GPU support
- Explicit user choice for one or the other linalg backend

### End to end models

#### Use existing models with new dataset/system/research topic

- consistent API boundary: input & output are well defined

#### Allow us to market our architectures in conferences, have a single place to point people to if they want to try our architectures

- give each architecture a name
- Have an example of how to train a model with a specific architecture, export it & use it

#### Invariant & equivariant predictions

#### Per-structure and per-atom predictions

- architectures need to define what properties they can train on & what they can output

#### Store architecture & way to train this architecture into a model together

- two files per architecture, one for the architecture itself, the second one containing a trainer

#### Allow external contributions of new architectures with clear guidelines

- Document + stupidly easy example of how to add architecture & what's required (tests, comments, docs, …)

#### Does not require users to write any code to train a model with existing architectures

- Single `train.py config.yml` script that handles all architectures
- Single `eval.py model.pt dataset.xyz --output excel --output chemiscope --output txt,csv` command
- + tools to do this in
    - Extracting properties from XYZ/AIMS/VASP/... to `TensorMap`
    - Extracting systems data to `metatensor.torch.System`
    - Convert from cartesian `torch.Tensor` to spherical `TensorMap`
- Integration with TensorBoard/Weights & Biases/...
    - Everything should still work without any of these tools
- Workflow for training:

  ```
  pip install metatensor-models[allegro,mesh-lode]
  cd where-the-training-data-and-logs-and-anything-else-should-live/
  vim config.yml
  train-metatensor-model config.yml  # create checkpoints folder + log file + log to stdout
  export-metatensor-model config.yml --epoch=100  # export the best model from checkpoint
  cp exported/my-model.pt lammps-simulation-folder
  cp -r exported/my-model/torch-extensions lammps-simulation-folder/torch-extensions
  ```

#### Allow less technical users to train models and compare different models

- How does the user interact with the code?
    1. single script for the repo w/ configuration
    2. import from python
    3. ~~one script per model~~

#### Make it clear which models are stable/experimental

- we will include stuff used in a paper
- we will include stuff in development for a paper
- we will **not** include tools to build new models
- have an `experimental` folder for new architectures being beta-tested
- have a `deprecated` folder for old stuff that will be removed if it becomes a maintenance issue

#### Some facilities to compose existing architectures together, not necessarily easy to use by end users

- you can import from one architecture into another one, no additional structure requirements

#### Minimal building blocks to allow someone else to build active learning on top of this repo

- initially only making the trainer script importable, long term to be determined

## Milestones

Team leads for the libraries:

- `metatensor-learn`: Arslan, Alex (1/2), Joe (1/2)
- `metatensor-models`: Philip, Filippo

### Minimal Viable Product

#### Metatensor-learn

- Decide name
- Python package scaffolding
- ModuleMap (i.e. ModuleDict working on TensorMap blocks) ([prototype with tests](https://github.com/lab-cosmo/equisolve/blob/main/src/equisolve/nn/module_tensor.py))
- Dataloader, once metatensor #405 is merged
- Neighborlist computation via `rascaline` ATM

Folder structure

```
src/metatensor/learn
    data/
        __init__.py
        dataset.py
        dataloader.py
    nn/
        __init__.py
        module_map.py
    utils/
        __init__.py
        collate_fn.py  # A utility for a dataloader
tests/
    test_dataloader.py
    test_module_map.py
```

DEADLINE: Christmas 2023

- Alex
- Arslan
- Joe

#### Metatensor-models

END GOAL of MVP: train on QM7, get a %RMSE from CLI

- Python package scaffolding
- Single architecture: SOAP Power Spectrum + BPNN
- Corresponding trainer code
- Read energies from ASE to TensorMap
- Convert positions+cell from ASE to System (re-use rascaline for now, will be removed)
- Single train script + config + checkpoint
    - look into hydra
- Basic evaluation script
- Use `rascaline.torch.System` in place of `metatensor.torch.atomistic.System` until [metatensor#405](https://github.com/lab-cosmo/metatensor/pull/405) is merged

Folder structure

```
bin/
    train-metatensor-model.py
    eval-metatensor-model.py
    export-metatensor-model.py
src/metatensor/models/
    utils/
        __init__.py
        ...
    experimental/
        bad-model/
        worst-model
        next-sota??/
        actually-its-wrong/
    first-model/
        __init__.py  # empty
        model.py  # MODEL_CLASS = Something
        trainer.py  # MODEL_TRAINER = Something
        defaults.yml  # Default values for hypers
        requirements.py
        tests/
            regression_tests.py
            unit_tests.py
    second-model/
        __init__.py
        model.py
        trainer.py
        defaults.yml
        requirements.py
    combined-model/
        from ..first_model import ModelPart1, ModelPart2
        from ..second_model import ModelPart1, ModelPart2
        requirements.py  # re-export from the two others
```

DEADLINE: Christmas 2023

- Filippo
- Philip
- Matthias

### Second milestone

#### Metatensor-models

- Per architecture CI setup (dynamically generate jobs for each architecture in GitHub Actions)
- Look into lightning
- Write documentation on what can go in experimental & what's required to move out of experimental
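
One possible way to approach the per-architecture CI setup is sketched below, using only the Python standard library. It assumes that an architecture is any folder directly under the models directory that contains a `model.py`, as in the MVP folder structure; the helper names `list_architectures` and `github_matrix` are hypothetical, and the demo runs against a throwaway directory rather than the real repository.

```python
import json
import tempfile
from pathlib import Path


def list_architectures(models_dir):
    """Return the sorted names of sub-folders that contain a model.py file."""
    root = Path(models_dir)
    return sorted(
        path.name
        for path in root.iterdir()
        if path.is_dir() and (path / "model.py").is_file()
    )


def github_matrix(architectures):
    """Serialize the architecture names as a JSON object usable as a job matrix."""
    return json.dumps({"architecture": architectures})


# Demo on a temporary directory that mimics the planned layout.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("first-model", "second-model"):
        (Path(tmp) / name).mkdir()
        (Path(tmp) / name / "model.py").write_text("MODEL_CLASS = None\n")
    (Path(tmp) / "utils").mkdir()  # no model.py, so not an architecture

    matrix = github_matrix(list_architectures(tmp))

print(matrix)  # {"architecture": ["first-model", "second-model"]}
```

A first CI job could print this JSON as a job output, and a second job could read it through `fromJSON(...)` in its `strategy.matrix`, so that adding a new architecture folder adds a CI job without editing the workflow file. Whether experimental architectures should be part of this matrix is left open here.
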

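
The `MODEL_CLASS` / `MODEL_TRAINER` convention from the MVP folder structure suggests that a single train script could look architectures up by name. The following is a rough sketch of that idea under stated assumptions: the package name `demo_models`, the function `load_architecture` and the placeholder model and trainer are all hypothetical, and the architecture folder uses an underscore (`first_model`) so that it is importable, matching the `from ..first_model import ...` lines in the layout.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Placeholder sources standing in for a real model.py and trainer.py.
MODEL_SOURCE = '''
class Something:
    def __init__(self, **hypers):
        self.hypers = hypers

MODEL_CLASS = Something
'''

TRAINER_SOURCE = '''
def train(model, dataset):
    return model  # placeholder: a real trainer would fit the model here

MODEL_TRAINER = train
'''


def load_architecture(package, name):
    """Import <package>.<name>.model and .trainer; return (MODEL_CLASS, MODEL_TRAINER)."""
    model_module = importlib.import_module(f"{package}.{name}.model")
    trainer_module = importlib.import_module(f"{package}.{name}.trainer")
    return model_module.MODEL_CLASS, trainer_module.MODEL_TRAINER


with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway package following the two-files-per-architecture layout.
    arch_dir = Path(tmp) / "demo_models" / "first_model"
    arch_dir.mkdir(parents=True)
    (arch_dir.parent / "__init__.py").write_text("")
    (arch_dir / "__init__.py").write_text("")  # empty, as in the layout
    (arch_dir / "model.py").write_text(MODEL_SOURCE)
    (arch_dir / "trainer.py").write_text(TRAINER_SOURCE)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        model_class, trainer = load_architecture("demo_models", "first_model")
    finally:
        sys.path.remove(tmp)

model = trainer(model_class(cutoff=5.0), dataset=[])
print(type(model).__name__, model.hypers)  # Something {'cutoff': 5.0}
```

With such a lookup, the architecture name could come from the `config.yml` passed to the train script, and a missing optional dependency would surface as an `ImportError` at this point rather than at install time.
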