## Development cases

Software development at the MSE Lab follows best practices. Practices are divided into cases because different projects might have to meet different demands. These cases define practices that must be followed; multiple cases can apply to the same project.

[Figure cases, coordinate system users vs workers]

### Case "Always"

Indicator: This case always applies.

Guidelines:

- Each project folder must have a **README** in Markdown format. It contains
  - a description,
  - installation or setup instructions,
  - and a responsible person's name and email address.
- Each module, class and function must have a **minimal docstring** (see Knowledge-base > Writing docstrings (Python)). A similar approach is recommended for programming languages other than Python.

### Case "Single worker"

Indicators: Only a single worker is expected to work on a task / project at a time, or a single worker is responsible for integrating feedback from others. E.g. a student thesis or a student programming exercise.

Guidelines:

- Provide the dependencies in an **environment.yml** (from conda-forge if possible).
- Source files must be **versioned with Git** and synced with a remote repository on https://gitlab.hrz.tu-chemnitz.de. Don't commit data files (see Knowledge-base > Git).

Optional guidelines (should be used for public code):

- Code must be **linted with [Ruff](https://beta.ruff.rs/docs/)** in its default configuration. All files must pass `ruff check <file path>`.
- **Code style** is determined by the code formatter [**Black**](https://github.com/psf/black). All Python files must pass `black --check <file path>`.

### Case "Group of workers"

Indicators: Multiple workers (are expected to) drive a project.

Guidelines:

- **Extends case "Single worker".** All guidelines of that case, including the optional ones, apply.
- Parallel development uses a **collaborative Git workflow** (see Knowledge-base > Collaborative development (Git)).
- Changes are only incorporated into a project if an additional worker approves.
- Doctests are mandatory for objects which are used by users (public API). The public API consists of all objects that can be reached without a leading "_" in any part of their path, e.g. `msetools.io.load_hdf` (public) vs `msetools.io._handle_path` (private).

### Case "Group of users / remote users"

Indicators: Many users (are expected to) use the resulting code, tool or resource. E.g. a Jupyter notebook for a student exercise.

Guidelines:

- Each **public** module, class and function must have a **complete docstring** (see Knowledge-base > Writing docstrings (Python)).
- The README must contain user installation instructions (as opposed to developer installation instructions).
- Optional: Add a usage example (narrative description).

### Case "Storing data"

Indicators: Non-trivial amounts of (binary) data are created by a tool.

Guidelines:

- Data must be placed inside a "data root folder". A "data root folder" must contain a `DATA.md` file with
  - a description of the data and how it was produced,
  - and the name and email address of the person who produced it / is responsible for it.
- Use the [HDF5 file format](https://docs.h5py.org) if possible to store array-like data (see Knowledge-base > Data storage with h5py (Python)).
- For array-like data, the unit and sampling rate (step) of each dimension must be described. Ideally, a storage format like HDF5 is used, which supports this annotation; otherwise the description must be contained in the `DATA.md`.

Optional:

- Use [TOML](https://docs.python.org/3/library/tomllib.html) to store hierarchical and inhomogeneous information, e.g. configuration.

## Knowledge-base and practices

### Git

...

### Collaborative development (Git)

...

### Writing docstrings (Python)

A [docstring](https://peps.python.org/pep-0257/) is a text block that documents a code object, e.g. a class, function, method or namespace. Depending on the language, it is placed directly above or below the definition of an object or at the top of a file; in Python it is the first statement below the definition line. Docstrings give context to users and are distinct from comments, which give context to developers.
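The difference between a docstring and a comment can be seen in a minimal sketch. The function `celsius_to_kelvin` is a hypothetical example and not part of any MSE Lab package. The docstring is stored on the object and is what `help()` shows to users, whereas the comment is only visible to developers reading the source.

```python
def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin."""
    # Comment for developers: 273.15 is the offset between both scales.
    return temp_c + 273.15


# The docstring is available at runtime, e.g. via help() or __doc__.
print(celsius_to_kelvin.__doc__)
```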
For Python, multiple docstring styles exist, but the [NumPy Docstring style](https://numpydoc.readthedocs.io/en/latest/format.html) is the most common within the scientific ecosystem. For complete details, refer to the linked style guide.

**Minimal docstring:** A docstring starts with a one-line summary. For methods and functions, it should be worded in the imperative mood, e.g. "Apply a low-pass filter forward and backward." For files and classes, the summary should be descriptive, e.g. "Arduino temperature sensor."

```python
def foo(x):
    """One-line summary.

    Short description.
    """
```

Consider adding an "Examples" section: it demonstrates the intended usage and makes simple doctests possible.

**Recommended docstring:**

```python
def foo(x):
    """One-line summary.

    Short description.

    Parameters
    ----------
    x : int
        Some parameter.

    Returns
    -------
    out : int
        The input incremented by one.

    Examples
    --------
    >>> foo(3)
    4
    """
    # Placeholder body so that the doctest above passes.
    return x + 1
```

### Doctests (Python)

...

### Testing with PyTest (Python)

... (short overview, refer to msetools for examples)

### Data storage with h5py (Python)

See `msetools.io.save_hdf`, `msetools.io.load_hdf` and `msetools.io.check_hdf`.

Example structure:

```
group "/"
    attribute "created_on" (string) = "2023-06-29"
    attribute "author" (string) = "alex@mselab.de"
    attribute "tool_reference" (string) = "https://gitlab.hrz.tu-chemnitz.de/some/repo/-/commit/63f4dfc8acd48ae298d9a1beca87620b8a2bd6e0"
    attribute "command line args" (string) = "measure.py --out data/2023-06-29_measurment.h5"
    attribute "desc" (string) = "Recording of the ..."
    dataset "/tx_signal" with shape (100,)
        attribute "value_desc" (string) = "Excitation pattern"
        attribute "axis_0_desc" (string) = "sample number / time"
    dataset "/raw_rx_signal" with shape (16, 30000)
        attribute "value_desc" (string) = "Array voltage"
        attribute "axis_0_desc" (string) = "FPGA number / channel group"
        attribute "axis_1_desc" (string) = "sample number, time axis, 33 ms"
        attribute "f_sample" = 30_000
```

### Narrative documentation

- Sphinx vs
GitLab Wiki

### Installing with pip directly from a GitLab URL

**SSH:** Requires [registering a public SSH key with GitLab](https://gitlab.hrz.tu-chemnitz.de/-/profile/keys) for the current machine. Replace `{org_project_path}` with the project path on the GitLab instance and `{revision}` with a Git revision (e.g. a commit hash or tag). The revision is optional: drop the `@{revision}` suffix to install from the default branch.

```bash
pip install git+ssh://git@gitlab.hrz.tu-chemnitz.de/{org_project_path}.git@{revision}
```

**HTTPS:** Requires a [personal access token](https://gitlab.hrz.tu-chemnitz.de/-/profile/personal_access_tokens) `{read_repo_token}` with the `read_repository` scope. Replace `{token_user}` with the appropriate user name, `{org_project_path}` with the project path on the GitLab instance and `{revision}` with a Git revision (e.g. a commit hash or tag). As with SSH, the `@{revision}` suffix is optional.

```bash
pip install git+https://{token_user}:{read_repo_token}@gitlab.hrz.tu-chemnitz.de/{org_project_path}.git@{revision}
```
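How the placeholders in the two URL patterns above fit together, including the optional revision, can be sketched in Python. The helper `pip_git_url` and the tag `v1.0` are hypothetical and only serve as an illustration; `some/repo` stands in for a real project path.

```python
def pip_git_url(org_project_path, revision=None, token_user=None, token=None,
                host="gitlab.hrz.tu-chemnitz.de"):
    """Build a pip requirement URL for a GitLab project (hypothetical helper)."""
    if token_user and token:
        # HTTPS variant: authenticates with a personal access token.
        base = f"git+https://{token_user}:{token}@{host}/{org_project_path}.git"
    else:
        # SSH variant: relies on a public SSH key registered with GitLab.
        base = f"git+ssh://git@{host}/{org_project_path}.git"
    # The "@{revision}" suffix is only appended when a revision is given.
    return f"{base}@{revision}" if revision else base


print(pip_git_url("some/repo", revision="v1.0"))
```

The resulting string is what would be passed to `pip install`. Avoid storing a real token in source files or shell history.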