---
tags: TechMon
---

# 2026/02/03 (notes)

:::info
- **Date:** February 3, 2026, 11 pm EST
- **Participants:** OE, MG, EM, MG, TS, CP, JHL
:::

:::warning
REMINDER: Today is the last day to nominate NiPreps for the OHBM Open Science award.
Form: https://humanbrainmapping.org/i4a/forms/index.cfm?id=248&pageid=3954&showTitle=1
If you have trouble with the form, please submit the required nomination information in a single PDF to the OHBM Executive Office at info@humanbrainmapping.org by Tuesday, February 3, 2026 at 11:59 pm EST, USA.
:::

:::success
## Agenda
1. *TemplateFlow* example in *nilearn*
   * Please help! https://github.com/nilearn/nilearn/pull/5968
1. fMRIPrep (Sentry) stats migration
1. RSMF Round 2
   * GH Roadmap (incl. Gantt!): https://github.com/orgs/nipreps/projects/14
   * Working [list of issues below](#RSMF-Round-2-Proposed-issues)
   * Deadline 25 February 2026, 16:00 GMT
1. Copilot agents: https://github.com/organizations/nipreps/settings/copilot/coding_agent
   * Auto-on for all repos. Can disable for all or switch to enabling on a per-repository basis. Any reason to keep it on?
1. Quick question about the HCPh dataset for ME EPI / MRIQC
:::

## AI Summary

### Quick recap
The meeting focused on several key updates and discussions. Oscar invited the team to provide feedback to the *nilearn* developers as they integrate TemplateFlow examples, and highlighted changes to fMRIPrep's stats analytics system, which now uses GitHub Actions to generate Parquet files for tracking usage. Eilidh presented plans for RSMF Round 2, focusing on community tools and fit/transform functionality for rodents, with a submission deadline of 25 February 2026. The team discussed concerns about data leakage in Melanie's MRIQC study using multi-echo data, with Oscar suggesting that redefining the research question might be more effective than attempting a complete re-annotation of the data. The conversation ended with a brief discussion about Copilot agents in repositories, though this topic was deferred until Chris joins the next meeting.

**Next steps**
- All team members: Submit nominations for NiPreps for the OHBM Open Science Award using the provided text or alternative text before the deadline (today).
- All team members: Review and provide feedback/help on the *nilearn* team's documentation and pull request regarding TemplateFlow integration in *nilearn*, including providing examples if possible.
- Mathias: Bring up the possibility of using OpenNeuro's private S3 storage (or a new S3 bucket) for NiPreps Sentry data in the next relevant meeting.
- Oscar: Investigate and resolve the issue preventing automatic updates of the fMRIPrep analytics on the nipreps.org website.
- Eilidh (with support from all relevant team members): Review, vet, and curate [the list of issues below](#RSMF-Round-2-Proposed-issues) (generated by ChatGPT) for the RSMF Round 2 grant, ensuring the list is specific and ready for inclusion in the grant application by the end of February.
- Melanie: Meet with Martin after February 15 to work on the methods section of the multi-echo MRIQC paper.
- Oscar: Coordinate with Melanie to review the multi-echo MRIQC paper and discuss how to address the data leakage issue when she is in Lausanne.

### OHBM Award
Oscar asked team members to submit their nomination texts by the end of the day (the deadline is close), noting that Martin had prepared a boilerplate. The links to the nomination form and Martin's boilerplate were both shared.
### fMRIPrep's Sentry analytics fetch
Regarding fMRIPrep's analytics, Oscar explained that while Sentry provides valuable usage data, the free tier limits access to 3 months of data. To address this, Oscar has developed a new system using GitHub Actions to generate Parquet files daily for started, successful, and failed runs, with plans to store these files in Dropbox temporarily until a more secure solution is found. Oscar suggested moving from his private Dropbox subscription to Russ's Google Drive or Amazon S3. Mathias proposed using private S3 space for NiPreps, which he will discuss with Russ before the next meeting. Oscar showed the GHAs fetching data (daily) and generating stats (weekly). He discovered we no longer automatically update the plots on nipreps.org; during the meeting he thought this was a glitch, but he later realized the updates were disabled on purpose to keep Git's history from growing, and he still needs to figure out where to post the figures weekly without bloating git.

### MRIQC Web API Monitoring
Additionally, Oscar described a new GitHub Action he developed that monitors the MRIQC web API every 15 minutes, providing status updates (https://www.nipreps.org/mriqcwebapi/health/) and automatically creating or closing issues when the API goes down or is restored. Mathias and Oscar discussed the MRIQC tool's functionality and its tolerance for web API errors.

### MRIQC and RSMF Grant Review
They then focused on the RSMF Round 2 grant application, which will maintain the community and fit/transform components while removing FactoryPrep due to funding constraints. Eilidh mentioned the need to refine the YAML file generated by Codex and proposed organizing a meeting to align on fit/transform issues with relevant project stakeholders. The YAML file contains a list of issues that will be automatically submitted to the designated repos and assigned to a [new GH Project](https://github.com/orgs/nipreps/projects/14) Oscar created, so that a Gantt chart/roadmap is generated and available to support the new application. Oscar and Eilidh discussed the need to refine the list of future GitHub issues to create a clear roadmap for the project, ensuring it meets the requirements for rodent-specific implementations and generalizability. They emphasized the importance of collaboration to vet the issues and align the project with grant expectations. The deadline for this task was set for the end of February.

### Copilot and GitHub
Additionally, Taylor mentioned that Copilot agents were enabled on NiPreps repositories, and Oscar suggested waiting for further clarification from Chris at the next meeting.

### Multi-echo MRIQC Data Transparency Issue
Melanie discussed an issue with her MRIQC study on multi-echo data, where data leakage occurred because IQM features were used both in producing the annotations and in the training set (most prominently, head motion metrics such as FD). She considered two options: being transparent about the limitation in the paper, or re-annotating the scans from scratch using the raw images. Oscar suggested reframing the research question to focus on the dimensionality reduction of ME-IQMs rather than on predicting human annotations, as this would reduce the risk of data leakage. They also discussed the possibility of using IQMs from different echo times to address some of the concerns. Melanie agreed to continue working on the project and planned to meet with Martin after February 15 to discuss the methods section.
## Notes

### RSMF Round 2 Proposed issues

```YAML
---
# Issue 0
title: "Create the RSMF Round 2 delivery board and operating cadence"
brief: "Stand up a public GitHub Project that encodes the 1-year plan as milestone views, and define the cadence for triage and milestone reviews. This becomes the auditable execution plan we link in the Round 2 proposal."
repo: nipreps/niworkflows
effort: low
impact: high
type: Task
x-refs: [1, 2, 5, 6, 42, 47, 51]
body: |-
  ## Context
  Round 1 feedback asked for tightened scope and milestone-level success criteria. A public Project board with curated issues is the most legible way to show "we know exactly what we will do next" and to reduce perceived delivery risk.

  ## What we will do
  - Create a public GitHub Project (beta) owned by `nipreps`.
  - Import all issues in this YAML set into the Project.
  - Configure required fields: **Milestone**, **Status**, **Effort**, **Impact**, **Area**, **Owner**.
  - Define operating cadence:
    - Weekly asynchronous triage (labels, prioritisation, routing to maintainers).
    - Monthly milestone review (public notes).
    - Quarterly public progress note (paired with telemetry/impact report).

  ## Deliverables
  1. Public Project link.
  2. Views:
     - Roadmap (milestone × month)
     - Execution Kanban (Status)
     - Risk/Blocked (Status + blockers)
     - Impact-first (impact:high)
  3. `docs/rsmf-r2-board.md` describing how the board maps to proposal milestones.

  ## Acceptance criteria
  - Project is public and stable.
  - ≥95% of issues in this YAML are added with Milestone + Status set.
  - Monthly milestone notes are published in-repo and linked from the Project.
---
# Issue 1
title: "Define milestone-level success criteria and KPIs for ELT adoption"
brief: "Translate the proposal narrative into measurable, milestone-scoped success criteria (technical, adoption, community). This directly addresses the 'milestone-level success criteria' gap."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Task
x-refs: [6, 13, 42, 45, 46, 51]
body: |-
  ## Context
  Reviewers want confidence via explicit success criteria tied to milestones (not just 'we will track things'). This issue defines the KPI set and the minimum reporting cadence.

  ## What we will do
  Define a small KPI set with baseline + targets for a 12-month small grant:
  - **Technical**: equivalence gates passed (ETL vs ELT), schema validity, CI stability, performance budgets.
  - **Adoption**: % of runs using ELT features (fit-only/transform-only), migration uptake in pilot pipelines.
  - **Community**: # unique contributors, # first-time contributors, review throughput, issue closure rate.

  ## Deliverables
  - `docs/rsmf-r2-kpis.md` containing:
    - Baseline definition (how we measure, from which sources)
    - Target ranges (conservative vs stretch)
    - Reporting schedule (quarterly)

  ## Acceptance criteria
  - KPIs are measurable from telemetry/GitHub logs without manual curation.
  - Each KPI is mapped to an RSMF Round 2 milestone and an owner.
  - KPI doc is referenced from the Project board (Issue 0).
---
# Issue 2
title: "Publish Round 2 scope statement and explicit FactoryPrep de-scope"
brief: "Produce a crisp scope statement for the 1-year small grant that focuses on fit&transform/ELT generalisation and evaluation/telemetry, explicitly de-scoping FactoryPrep. This reduces reviewer ambiguity about what will and will not be delivered."
repo: nipreps/niworkflows
effort: low
impact: high
type: Task
x-refs: [16, 31, 47, 48]
body: |-
  ## Context
  The Round 2 call constraints require a smaller, tighter scope.
  We should prevent any confusion that we're still promising a pipeline factory (FactoryPrep) in a 1-year budget envelope.

  ## What we will do
  Write and publish a scope note that:
  - Defines the two technical outcomes for Round 2:
    1) A shared ELT/fit&transform contract + artifact serialisation.
    2) Pilot ELT adoption in a small set of NiPreps components/pipelines (see milestones).
  - Explicitly lists out-of-scope items, including FactoryPrep/pipeline factory work.
  - States how this Round 2 work keeps the door open for future factory-style tooling without committing to it.

  ## Deliverables
  - `docs/rsmf-r2-scope.md` (used verbatim in proposal).
  - Project board milestone descriptions updated to match.

  ## Acceptance criteria
  - Scope note is readable by non-domain reviewers (minimal NiPreps jargon).
  - Includes an explicit "Out of scope (Round 2)" section with FactoryPrep named.
  - Linked from the Project board and from user-facing docs.
---
# Issue 3
title: "Establish ELT transition and versioning policy for pilot pipelines"
brief: "Define how ELT features ship (flags, defaults, legacy ETL mode), and how we prevent breaking changes during the transition. This mitigates perceived risk that refactoring mature pipelines delays impact."
repo: nipreps/fmriprep
effort: medium
impact: high
type: Task
x-refs: [14, 23, 31, 32]
body: |-
  ## Context
  ELT/fit&transform refactors touch mature, widely used pipelines. We need an explicit transition policy that balances innovation with stability (including a clear legacy ETL mode).

  ## What we will do
  - Define CLI/API stability rules for:
    - `--fit-only`, `--transform-only`
    - `--legacy-etl` (or equivalent)
    - default behaviour during the grant period (what is opt-in vs default)
  - Define versioning expectations for fit-artifact manifests (schema + compatibility).
  - Define "release readiness" requirements tied to evaluation gates.

  ## Deliverables
  - `docs/elt-transition-policy.md` in-repo.
  - Release checklist updated to reference evaluation gates and artifact schema validation.

  ## Acceptance criteria
  - Policy includes a deprecation/transition timeline (even if conservative).
  - Policy is referenced by CI release workflow (Issue 14).
  - Policy explicitly states what does *not* change for typical users during the grant.
---
# Issue 4
title: "Automate weekly progress snapshots from the RSMF2 Project board"
brief: "Generate a weekly markdown snapshot of status-by-milestone, blockers, and recently completed work. This makes delivery legible and reduces coordination overhead for a small core team."
repo: nipreps/niworkflows
effort: low
impact: medium
type: Task
x-refs: [46, 51]
body: |-
  ## Context
  A recurring reviewer concern is execution confidence. Automated progress snapshots provide transparency without adding manual reporting burden to maintainers.

  ## What we will do
  - Implement a GitHub Action that:
    - Reads Project board items via GitHub GraphQL API
    - Produces `reports/weekly/YYYY-MM-DD.md`
    - Highlights blockers (Status=Blocked or missing owner)
  - Include a short template section for manual notes (decisions made, risks).

  ## Deliverables
  - GitHub Action workflow + script.
  - A first generated weekly report.

  ## Acceptance criteria
  - Runs on schedule (weekly) and on-demand.
  - Produces stable links suitable for sharing in funder updates.
  - Requires no secrets beyond GitHub token.
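
  As a non-binding sketch, the snapshot script could query the Projects v2 GraphQL API roughly as below; the org/project number are taken from this proposal, but field handling and output format are assumptions:

  ~~~python
  # Sketch only: fetch Project items and render a minimal weekly snapshot.
  # Assumes GITHUB_TOKEN in the environment; filtering by Status is omitted.
  import datetime
  import os

  import requests

  QUERY = """
  query($org: String!, $number: Int!) {
    organization(login: $org) {
      projectV2(number: $number) {
        items(first: 100) {
          nodes { content { ... on Issue { title url state } } }
        }
      }
    }
  }
  """

  def weekly_snapshot(org: str = "nipreps", number: int = 14) -> str:
      resp = requests.post(
          "https://api.github.com/graphql",
          json={"query": QUERY, "variables": {"org": org, "number": number}},
          headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
          timeout=30,
      )
      resp.raise_for_status()
      items = resp.json()["data"]["organization"]["projectV2"]["items"]["nodes"]
      lines = [f"# Weekly snapshot, {datetime.date.today().isoformat()}", ""]
      for item in items:
          issue = item.get("content") or {}
          if issue:
              lines.append(f"- [{issue['state']}] [{issue['title']}]({issue['url']})")
      return "\n".join(lines)

  if __name__ == "__main__":
      print(weekly_snapshot())
  ~~~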
---
# Issue 5
title: "Create RSMF2 issue templates and label taxonomy across NiPreps repos"
brief: "Standardise issue metadata (milestone, area, effort/impact, acceptance criteria) to keep execution consistent across repos. This also supports onboarding and distribution of throughput beyond the core."
repo: nipreps/niworkflows
effort: low
impact: high
type: Task
x-refs: [6, 16, 42]
body: |-
  ## Context
  A large part of execution risk in multi-repo work is inconsistent issue quality. Templates and labels reduce ambiguity and make it easier for occasional contributors to pick up tasks.

  ## What we will do
  - Create templates for:
    - Feature (with acceptance criteria + migration notes)
    - Task (with checklist)
    - Bug (with minimal reproducible example guidance)
  - Define label taxonomy (shared across repos):
    - `milestone:*`, `area:*`, `risk:*`, `needs:*`, `good first issue`
  - Document how effort/impact labels map to prioritisation.

  ## Deliverables
  - Template files and label list (with recommended color codes) in `docs/`.
  - A short "How we write issues" guide.

  ## Acceptance criteria
  - All new issues created during the grant use a template.
  - At least 10 existing issues are retrofitted to the new metadata format as exemplars.
---
# Issue 6
title: "Specify the evaluation framework for ETL→ELT refactors"
brief: "Define the technical and scientific equivalence criteria we will use to validate ELT refactors, including qualitative and quantitative gates. This anchors confidence that refactors won't degrade outputs."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Task
x-refs: [7, 8, 9, 10, 11, 12, 13, 14, 41]
body: |-
  ## Context
  The core risk of ELT refactoring is silent scientific drift. A shared evaluation framework makes refactors measurable and reviewable across pipelines.

  ## What we will do
  Define a framework with:
  - **Quantitative** comparisons:
    - file counts and BIDS-derivatives structure
    - summary metrics (runtime, disk footprint)
    - transform equivalence checks where meaningful (e.g., affine similarity)
  - **Qualitative** comparisons:
    - structured visual report review rubric (Issue 12)
  - Explicit "acceptable difference" policy (where ELT changes ordering/caching).

  ## Deliverables
  - `docs/evaluation-framework.md` specifying:
    - datasets used (Issues 8–11)
    - what is tested per pipeline
    - pass/fail criteria per milestone
  - A minimal JSON schema for evaluation outputs.

  ## Acceptance criteria
  - Framework is implementable in CI (no manual-only gates except the explicit visual review gate).
  - Each pilot pipeline has a mapped test plan section.
---
# Issue 7
title: "Implement a BIDS-derivatives comparison harness for CI equivalence checks"
brief: "Create a reusable tool that compares 'before' vs 'after' derivatives directories and produces a structured diff report. This enables release gating for ELT refactors."
repo: nipreps/niworkflows
effort: high
impact: high
type: Feature
x-refs: [14, 41]
body: |-
  ## Context
  We need automated evidence that ELT refactors preserve outputs (or change them in documented, acceptable ways). Directory diffs alone are not enough; we need structured comparisons.

  ## What we will do
  - Implement a comparison tool that:
    - validates both trees as BIDS derivatives
    - compares file inventory (adds/removes/renames)
    - compares metadata JSON fields for expected invariants
    - optionally compares image headers and basic statistics
  - Emit a machine-readable report (JSON) + human summary (markdown).
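
  As a non-binding sketch, the file-inventory comparison could start as simply as this (BIDS validation and metadata/header checks omitted):

  ~~~python
  # Sketch: compare file inventories of two derivatives trees by content hash.
  # The real harness would add BIDS validation, metadata and header checks.
  import hashlib
  import json
  from pathlib import Path

  def inventory(root: Path) -> dict[str, str]:
      """Map relative file paths to SHA-256 content hashes."""
      return {
          str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
          for p in sorted(root.rglob("*"))
          if p.is_file()
      }

  def compare(before: Path, after: Path) -> dict:
      """Return added/removed/changed files between two trees."""
      inv_a, inv_b = inventory(before), inventory(after)
      return {
          "added": sorted(set(inv_b) - set(inv_a)),
          "removed": sorted(set(inv_a) - set(inv_b)),
          "changed": sorted(k for k in set(inv_a) & set(inv_b) if inv_a[k] != inv_b[k]),
      }

  if __name__ == "__main__":
      import sys
      print(json.dumps(compare(Path(sys.argv[1]), Path(sys.argv[2])), indent=2))
  ~~~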
  ## Deliverables
  - Python module + CLI in `niworkflows` (or a small companion package if needed).
  - Example output checked into `docs/`.

  ## Acceptance criteria
  - Can run in GitHub Actions on small datasets.
  - Produces deterministic output suitable for gating.
  - Supports "allowlist" rules for known acceptable changes.
---
# Issue 8
title: "Curate a minimal fMRIPrep benchmark dataset and baseline derivatives for ELT testing"
brief: "Select and freeze a small human fMRI dataset (including an edge-case subset) plus baseline derivatives from a known release. This becomes the reference for CI equivalence gates."
repo: nipreps/fmriprep
effort: medium
impact: high
type: Task
x-refs: [14, 41]
body: |-
  ## Context
  Automated equivalence checks require stable, reproducible test inputs and baseline outputs.

  ## What we will do
  - Identify a small OpenNeuro dataset with:
    - BOLD + T1w, ideally with fieldmaps or SDC-relevant metadata
    - at least one run with motion/artefacts to stress the pipeline
  - Create a scripted fetch/subset procedure (no manual steps).
  - Generate baseline derivatives with a pinned fMRIPrep release.

  ## Deliverables
  - `tools/fetch-mini-dataset.py` (or similar) + documentation.
  - Baseline derivative manifest (hashes + version metadata).
  - CI job skeleton that downloads inputs + baseline for comparisons.

  ## Acceptance criteria
  - Dataset fetch is deterministic and lightweight enough for CI.
  - Baseline derivatives include full provenance and can be regenerated.
---
# Issue 9
title: "Curate a minimal fMRIPrep-rodents benchmark dataset and baseline derivatives for ELT testing"
brief: "Freeze a small rodent fMRI dataset subset suitable for CI and validation, with baseline derivatives from a known release. This supports rigorous ELT refactoring without scientific drift."
repo: nipreps/fmriprep-rodents
effort: medium
impact: high
type: Task
x-refs: [14, 41]
body: |-
  ## Context
  fMRIPrep-rodents is a key pilot for demonstrating cross-pipeline ELT generalisation. We need a minimal, reproducible benchmark dataset.

  ## What we will do
  - Select a rodent BIDS dataset subset with:
    - representative anatomy and acquisition (including at least one challenging subject/run)
    - metadata sufficient for SDC/registration testing
  - Script the fetch/subset.
  - Generate baseline derivatives with pinned fMRIPrep-rodents release.

  ## Deliverables
  - Dataset fetch script + README.
  - Baseline derivatives manifest (hashes, versions, parameters).
  - CI job skeleton wired to the comparison harness.

  ## Acceptance criteria
  - Data subset runs in CI within a reasonable walltime budget.
  - Baseline derivatives are reproducible and provenance-complete.
---
# Issue 10
title: "Curate a minimal PETPrep benchmark dataset and baseline derivatives for ELT testing"
brief: "Freeze a small PET-BIDS dataset subset and generate baseline PETPrep derivatives for automated equivalence testing. This enables ELT adoption in PETPrep with confidence."
repo: nipreps/petprep
effort: medium
impact: high
type: Task
x-refs: [14, 41]
body: |-
  ## Context
  PET workflows are sensitive to transform ordering and metadata. ELT refactors must be validated with representative data.

  ## What we will do
  - Identify a PET-BIDS dataset subset with:
    - a dynamic series (multi-frame)
    - MRI reference if applicable (coregistration)
  - Script fetch/subset.
  - Generate baseline derivatives with a pinned PETPrep release.

  ## Deliverables
  - Dataset fetch script + README.
  - Baseline derivatives manifest and provenance.
  - CI job skeleton that runs equivalence checks.
  ## Acceptance criteria
  - Dataset is small enough for automated testing.
  - Baseline derivatives can be regenerated with pinned containers/environments.
---
# Issue 11
title: "Curate a minimal sdcflows benchmark dataset and baseline outputs for ELT testing"
brief: "Freeze a small dataset that stresses susceptibility distortion correction (SDC) estimation and application. This underpins refactoring sdcflows into explicit fit/transform stages."
repo: nipreps/sdcflows
effort: medium
impact: high
type: Task
x-refs: [14, 28, 29, 30]
body: |-
  ## Context
  sdcflows is a high-leverage target for ELT because it naturally decomposes into estimation (fit) and application (transform). We need a stable dataset to test both stages.

  ## What we will do
  - Select a dataset subset with:
    - EPI + fieldmap or metadata for fieldmap-less SDC
    - known distortion patterns (for meaningful validation)
  - Script fetch/subset.
  - Generate baseline "fit artifacts" and baseline corrected outputs under current implementation.

  ## Deliverables
  - Dataset script + baseline manifest.
  - Reference outputs for both estimation and application components.

  ## Acceptance criteria
  - Baseline is fully reproducible and small enough for CI.
  - Includes at least one case that exercises failure-handling paths.
---
# Issue 12
title: "Implement a structured visual report review rubric and tooling for release gating"
brief: "Define a lightweight rubric and workflow for human qualitative review of report deltas at major ELT milestones. This complements quantitative CI checks and increases confidence in scientific validity."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Task
x-refs: [14]
body: |-
  ## Context
  Some scientific regressions are not captured by file diffs or summary stats. A minimal, structured visual review at key milestones is an effective risk control.

  ## What we will do
  - Define a rubric (scored checklist) for:
    - alignment quality (coregistration/normalisation)
    - distortion correction plausibility
    - obvious artefacts introduced by refactor
  - Implement a workflow for collecting ratings (e.g., markdown form or small web form).
  - Document how many reviewers are required per milestone (e.g., 2–3 maintainers).

  ## Deliverables
  - `docs/visual-review-rubric.md`
  - Template form + instructions for running a review session.

  ## Acceptance criteria
  - Rubric can be applied in <30 minutes for a small benchmark set.
  - Results are archived alongside the release candidate notes.
---
# Issue 13
title: "Add performance regression tracking (runtime/disk) and budgets for ELT refactors"
brief: "Instrument CI to track runtime and disk footprint over time on benchmark datasets, and define budgets that prevent regressions. This prevents 'maintenance wins' that accidentally increase compute cost."
repo: nipreps/niworkflows
effort: medium
impact: medium
type: Feature
x-refs: [14]
body: |-
  ## Context
  ELT refactors can change caching and ordering, which can shift runtime and output size. Performance budgets make these changes explicit and reviewable.

  ## What we will do
  - Add CI jobs that record:
    - walltime
    - peak disk usage (approximate)
    - output directory size
  - Store metrics in a machine-readable artifact and track trend over PRs/releases.
  - Define conservative budgets per pilot pipeline (fail CI on major regressions).

  ## Deliverables
  - CI workflow additions.
  - `docs/performance-budgets.md` with baseline + thresholds.

  ## Acceptance criteria
  - Metrics are produced deterministically on CI runners.
  - Budget thresholds are documented and reviewed quarterly.
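
  For illustration only, the budget gate could read the recorded metrics and fail CI on regressions beyond documented thresholds (file names and fields are hypothetical):

  ~~~python
  # Sketch: fail CI when recorded metrics exceed documented budgets.
  # The metrics.json / budgets.json layouts are placeholders, not a spec.
  import json
  import sys
  from pathlib import Path

  def check_budgets(metrics_file: str, budgets_file: str) -> int:
      metrics = json.loads(Path(metrics_file).read_text())
      budgets = json.loads(Path(budgets_file).read_text())
      failures = [
          f"{key}: {metrics[key]} exceeds budget {limit}"
          for key, limit in budgets.items()
          if metrics.get(key, 0) > limit
      ]
      for failure in failures:
          print(f"BUDGET EXCEEDED: {failure}", file=sys.stderr)
      return 1 if failures else 0

  if __name__ == "__main__":
      sys.exit(check_budgets("metrics.json", "budgets.json"))
  ~~~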
---
# Issue 14
title: "Integrate evaluation gates into CI/CD and release workflow for pilot pipelines"
brief: "Make equivalence checks, schema validation, and (where applicable) visual review rubric completion explicit release requirements. This turns evaluation from a promise into a mechanism."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Task
x-refs: []
body: |-
  ## Context
  Reviewers explicitly asked for milestones tied to tracking and delivery confidence. CI gates and release checklists operationalise milestones.

  ## What we will do
  - Implement required checks for ELT-related PRs:
    - derivatives comparison harness results (Issue 7)
    - fit-artifact schema validation (Issue 25)
    - performance budgets (Issue 13)
  - Add a release checklist section:
    - confirm benchmark datasets updated as needed
    - archive evaluation report
    - link visual review notes (Issue 12) for major milestones

  ## Deliverables
  - Updated CI workflows.
  - `docs/release-checklist-elt.md`

  ## Acceptance criteria
  - Major ELT PRs cannot merge without evaluation artifacts attached.
  - Release notes include links to evaluation summaries.
---
# Issue 15
title: "Create a reproducible baseline-derivatives archive workflow (pin env + artifacts)"
brief: "Standardise how we freeze baseline derivatives and fit artifacts (inputs, container/environment, manifests). This makes equivalence testing reproducible and defensible."
repo: nipreps/nifreeze
effort: medium
impact: medium
type: Feature
x-refs: [26]
body: |-
  ## Context
  Baseline derivatives are only credible if they can be regenerated. A repeatable 'freeze' workflow increases trust in evaluation claims.

  ## What we will do
  - Define a minimal 'freeze bundle' format:
    - input dataset revision identifier
    - container image digest or environment lock
    - derivative manifest + hashes
    - fit-artifact manifests (where applicable)
  - Provide a CLI or script to generate the bundle.

  ## Deliverables
  - `nifreeze` command or helper to build baseline bundles.
  - Documentation and an example bundle for one pilot dataset.

  ## Acceptance criteria
  - Bundle can be recreated from scratch with the same hashes.
  - Bundle format is compatible with CI download/extract workflows.
---
# Issue 16
title: "Author the shared fit&transform (ELT) API specification and reference skeleton"
brief: "Write the concrete ELT contract (interfaces, artifacts, manifests, invariants) that pipelines will implement. This is the keystone for consistent generalisation across NiPreps."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Task
x-refs: [17, 20, 21, 22, 23, 24, 27, 28, 33, 36]
body: |-
  ## Context
  To generalise ELT across NiPreps, we need a shared contract that is precise enough to implement and test, but flexible enough for modality differences.

  ## What we will do
  - Specify:
    - what counts as a "fit artifact"
    - how artifacts are versioned and stored
    - the minimal manifest fields (provenance, inputs, parameters)
    - invariants for transform-only replay
  - Provide a reference skeleton (classes/protocols) that pilot pipelines can adopt.

  ## Deliverables
  - `docs/elt-api-spec.md` (human-readable spec)
  - minimal reference skeleton in `niworkflows` (non-breaking)

  ## Acceptance criteria
  - Spec includes at least 3 worked examples (e.g., SDC, registration, motion correction).
  - Spec is mapped to evaluation framework sections (Issue 6).
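
  A possible shape for the reference skeleton, sketched below with provisional names pending the spec (only the `fit(inputs) -> FitResult` / `transform(inputs, fit_result)` signatures come from this plan):

  ~~~python
  # Sketch of the fit/transform contract; all names are provisional.
  from dataclasses import dataclass, field
  from pathlib import Path
  from typing import Any, Protocol

  @dataclass
  class FitResult:
      """Pointer to serialised fit artifacts plus their manifest."""
      artifacts: list[Path]
      manifest: dict[str, Any] = field(default_factory=dict)

  class FitTransform(Protocol):
      def fit(self, inputs: dict[str, Any]) -> FitResult:
          """Estimate parameters/transforms and persist them as artifacts."""
          ...

      def transform(self, inputs: dict[str, Any], fit_result: FitResult) -> dict[str, Any]:
          """Apply previously estimated artifacts to new inputs."""
          ...
  ~~~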
---
# Issue 17
title: "Define canonical on-disk layout and manifest schema for fit artifacts in BIDS derivatives"
brief: "Standardise where ELT fit artifacts live and how they are described (schema + versioning). This enables reuse across sessions/runs and supports tooling/telemetry."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Task
x-refs: [25, 31, 32, 33, 34, 36, 39, 40]
body: |-
  ## Context
  A key reviewer concern is whether the ELT shift will have uneven/delayed impact. A canonical artifact layout makes the benefits immediately real: reuse and interoperability.

  ## What we will do
  - Define directory conventions for fit artifacts (within derivatives).
  - Define a JSON manifest schema:
    - schema version
    - software versions
    - inputs (BIDS URIs + hashes)
    - parameters
    - produced transforms/artifacts (with paths + hashes)
  - Provide a JSON Schema file for validation (used in CI).

  ## Deliverables
  - `schemas/fit-artifact.schema.json`
  - `docs/fit-artifact-layout.md`

  ## Acceptance criteria
  - Schema supports forward-compatible versioning.
  - Schema is implementable without leaking sensitive data (no IPs, no user identifiers).
---
# Issue 18
title: "nitransforms: add versioned serialization for transform chains with provenance"
brief: "Implement a stable, versioned serialization format for transform chains (affine + nonlinear where possible), including provenance fields. This underpins transform-only replay and cross-pipeline reuse."
repo: nipy/nitransforms
effort: high
impact: high
type: Feature
x-refs: [19]
body: |-
  ## Context
  ELT requires that 'what was fitted' can be stored and later applied. Transform chain serialization is therefore infrastructure, not a pipeline detail.

  ## What we will do
  - Implement `TransformChain.to_dict()` / `from_dict()` with:
    - schema version
    - transform types and parameters
    - coordinate frame metadata
    - provenance fields (software versions, creation time, input hashes if provided)
  - Ensure format is JSON-serializable and stable across minor versions.

  ## Deliverables
  - Serialization API + unit tests.
  - Documentation with at least one example round-trip.

  ## Acceptance criteria
  - Round-trip equality for supported transform types.
  - Backward compatibility policy documented (what breaks when).
---
# Issue 19
title: "nitransforms: implement composition/apply utilities for common transform types used in NiPreps"
brief: "Provide robust composition and application utilities for the transform types most frequently produced by NiPreps pipelines. This reduces pipeline-specific glue code and accelerates ELT adoption."
repo: nipy/nitransforms
effort: high
impact: high
type: Feature
x-refs: [41]
body: |-
  ## Context
  If transform application remains bespoke per pipeline, ELT generalisation will be slow and uneven. Centralizing these utilities reduces technical debt.

  ## What we will do
  - Implement (or harden) utilities for:
    - composing affine chains
    - applying transforms to NIfTI images with explicit reference grids
    - representing and applying nonlinear warps where supported
  - Add test vectors for composition correctness.

  ## Deliverables
  - API additions + tests.
  - `docs/transform-composition.md` with examples relevant to NiPreps usage.

  ## Acceptance criteria
  - Utilities can be consumed by pilot pipelines with minimal glue.
  - Correctness tests cover at least affine-only and affine+warp scenarios (where feasible).
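
  For intuition, affine composition reduces to matrix products in homogeneous coordinates; a toy NumPy sketch (deliberately not the nitransforms API) follows:

  ~~~python
  # Toy sketch: compose two affines and map a voxel coordinate.
  # Illustrates only the underlying algebra, not nitransforms itself.
  import numpy as np

  def compose(*affines: np.ndarray) -> np.ndarray:
      """Compose 4x4 affines so the first argument is applied first."""
      out = np.eye(4)
      for aff in affines:
          out = aff @ out
      return out

  def apply_affine(affine: np.ndarray, coords: np.ndarray) -> np.ndarray:
      """Apply a 4x4 affine to an (N, 3) array of coordinates."""
      homog = np.hstack([coords, np.ones((coords.shape[0], 1))])
      return (homog @ affine.T)[:, :3]

  scale = np.diag([2.0, 2.0, 2.0, 1.0])
  shift = np.eye(4)
  shift[:3, 3] = [0.0, 0.0, 5.0]
  chain = compose(scale, shift)  # scale first, then shift
  print(apply_affine(chain, np.array([[1.0, 1.0, 1.0]])))  # -> [[2. 2. 7.]]
  ~~~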
---
# Issue 20
title: "niworkflows: introduce a FitTransform protocol and base classes"
brief: "Create the shared Python protocol/base classes that encode the ELT contract at the workflow-component level. This is the adoption path for pipelines that aren't ready for deep refactors."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Feature
x-refs: [21, 22, 23, 28, 33, 36, 39]
body: |-
  ## Context
  A shared contract must exist at the component level (not only in prose). Protocols enable gradual adoption and allow tooling (CI, telemetry) to reason about stages.

  ## What we will do
  - Add `FitTransform` protocol (typing + runtime hints) that defines:
    - `fit(inputs) -> FitResult`
    - `transform(inputs, fit_result) -> outputs`
  - Provide base classes/helpers to reduce boilerplate.
  - Define minimal expectations for `FitResult` (path to artifacts + manifest).

  ## Deliverables
  - New module in `niworkflows`.
  - Unit tests and minimal reference examples.

  ## Acceptance criteria
  - Does not break existing API users.
  - Enables at least one pilot component refactor without pipeline-level redesign.
---
# Issue 21
title: "niworkflows: implement cache-keying and storage backend for fit artifacts"
brief: "Build the caching layer that makes ELT practically valuable: detect when a fit artifact can be reused and store/retrieve it deterministically. This is crucial for reducing duplication and compute waste."
repo: nipreps/niworkflows
effort: high
impact: high
type: Feature
x-refs: [22, 41]
body: |-
  ## Context
  ELT is only compelling if fit results can be reused safely. We need deterministic cache keys tied to inputs and parameters.

  ## What we will do
  - Define cache key strategy based on:
    - relevant input identifiers + hashes
    - parameters
    - software versions where required
  - Implement storage backend that can write/read fit artifacts in the canonical layout (Issue 17).

  ## Deliverables
  - Cache key function + tests.
  - Storage backend implementation + docs.

  ## Acceptance criteria
  - Cache hits are safe (no false reuse across incompatible inputs).
  - Cache behaviour is explicitly logged for transparency and debugging.
---
# Issue 22
title: "niworkflows: implement a transform-only executor helper"
brief: "Provide a helper that runs transform stages given stored fit artifacts, enabling pipelines to support 'transform-only' mode without bespoke glue. This makes the ELT split usable for end users."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Feature
x-refs: [41]
body: |-
  ## Context
  A user-visible ELT benefit is the ability to separate estimation from application. A shared executor avoids each pipeline implementing this independently.

  ## What we will do
  - Implement an executor that:
    - locates the relevant fit artifacts and manifests
    - validates schema version and compatibility
    - applies transforms to specified inputs
    - records provenance in outputs
  - Provide hooks for pipeline-specific customization.

  ## Deliverables
  - Executor utility + tests.
  - Example integration snippet for one pilot pipeline.

  ## Acceptance criteria
  - Works on at least one benchmark dataset in CI.
  - Produces provenance-complete outputs (manifest updated accordingly).
---
# Issue 23
title: "niworkflows: implement ETL legacy compatibility shims for ELT-adopting components"
brief: "Provide a backward-compatible 'legacy ETL' path that runs fit+transform in one go and produces current-style derivatives. This reduces adoption friction and addresses concerns about delayed/uneven impact."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Feature
x-refs: [31, 32, 33, 37]
body: |-
  ## Context
  Transitioning mature pipelines requires a safety valve. A legacy mode lets users keep current behaviour while enabling staged ELT adoption.

  ## What we will do
  - Implement shims/wrappers that:
    - execute `fit()` then `transform()` in one run
    - write both standard derivatives and the new fit artifacts (if enabled)
  - Document how pipelines should expose this to users (CLI flags/policy).

  ## Deliverables
  - Wrapper utilities + tests.
  - Documentation for pipeline maintainers.

  ## Acceptance criteria
  - Default user output does not change when legacy mode is active.
  - Legacy mode can be removed later with a documented deprecation path.
---
# Issue 24
title: "niworkflows: implement provenance capture and manifests for fit artifacts"
brief: "Ensure fit artifacts carry sufficient provenance for reuse, debugging, and audit (software versions, inputs, parameters). This improves reviewer confidence in transparency and reproducibility."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Feature
x-refs: [25, 41]
body: |-
  ## Context
  ELT can only be trusted if fit artifacts are transparent and reproducible. Provenance must be part of the contract, not an afterthought.

  ## What we will do
  - Implement manifest generation that records:
    - schema version
    - software versions
    - command-line/config snippet (sanitised)
    - input dataset identifiers/hashes
    - produced artifacts (paths + hashes)
  - Ensure manifests are updated for transform-only runs.

  ## Deliverables
  - Manifest utilities + tests.
  - Documentation describing which fields are required vs optional.

  ## Acceptance criteria
  - Manifests validate against the schema (Issue 17).
  - No sensitive identifiers are recorded by default.
---
# Issue 25
title: "niworkflows: add schema validation and linting for fit artifact manifests"
brief: "Provide a CI-friendly validator that checks fit artifact manifests against the schema and enforces invariants. This prevents drift across repos and reduces reviewer skepticism about 'plans' vs 'mechanisms'."
repo: nipreps/niworkflows
effort: low
impact: high
type: Feature
x-refs: [41]
body: |-
  ## Context
  Consistency across pipelines requires automated enforcement. A validator makes schema compliance routine rather than aspirational.

  ## What we will do
  - Implement a `niworkflows` CLI command (or python API) that:
    - validates manifest JSON against JSON Schema
    - checks basic invariants (hash fields present, paths exist)
  - Integrate it into CI templates for pilot repos.

  ## Deliverables
  - Validator tool + unit tests.
  - CI snippet to copy into pilot pipeline workflows.

  ## Acceptance criteria
  - Validator runs in <1 minute on CI.
  - Fails with actionable error messages for maintainers/contributors.
---
# Issue 26
title: "nifreeze: extend freeze metadata to capture fit artifacts and ELT manifests"
brief: "Update nifreeze so frozen bundles can include ELT fit artifacts and their manifests, ensuring baselines and regression tests remain reproducible as ELT adoption grows."
repo: nipreps/nifreeze
effort: medium
impact: medium
type: Feature
x-refs: [41]
body: |-
  ## Context
  If we 'freeze' derivatives for reproducibility, fit artifacts and manifests must be included; otherwise transform-only replay cannot be audited later.
  ## What we will do
  - Extend metadata model to:
    - enumerate fit artifacts included in the bundle
    - record manifest schema versions
    - record the environment/container digest used to generate artifacts

  ## Deliverables
  - Updated metadata spec and implementation.
  - Migration notes for existing freeze bundles.

  ## Acceptance criteria
  - Bundles produced by Issue 15 validate and include fit artifacts when present.
  - Documentation includes a minimal end-to-end example.
---
# Issue 27
title: "Write the ELT/fit-artifacts developer documentation (adoption guide + examples)"
brief: "Create developer-facing docs that explain the ELT contract, how to refactor a component, and how to validate it. This supports onboarding and mitigates 'community uptake' risk."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Task
x-refs: [48, 49]
body: |-
  ## Context
  ELT adoption depends on other maintainers being able to implement it consistently. Clear developer docs reduce reliance on the small core team.

  ## What we will do
  - Write docs that include:
    - 'what is a fit artifact' and 'what is transform-only replay'
    - how to use the FitTransform protocol (Issue 20)
    - how to implement manifests (Issue 24)
    - how to add equivalence tests (Issue 6/7)
  - Provide at least two worked code examples drawn from pilot refactors.

  ## Deliverables
  - `docs/dev/elt-adoption.md` + examples.
  - Link from README(s) and Project board.

  ## Acceptance criteria
  - A non-core contributor can follow the doc to implement a small FitTransform component change.
  - Docs include explicit "common pitfalls" and "review checklist" sections.
---
# Issue 28
title: "sdcflows: refactor fieldmap/SDC estimation into an explicit fit stage"
brief: "Separate estimation from application in sdcflows by producing reusable fit artifacts (fieldmap estimates, warps, manifests). This is a high-leverage ELT win across multiple NiPreps pipelines."
repo: nipreps/sdcflows
effort: high
impact: high
type: Feature
x-refs: [29, 30, 33, 41]
body: |-
  ## Context
  Susceptibility distortion correction naturally fits the ELT paradigm: estimate (fit) once, apply (transform) many times. Refactoring sdcflows unlocks reuse across fMRIPrep(-rodents) and others.

  ## What we will do
  - Introduce a `fit_sdc()` stage that:
    - computes required estimates (fieldmap-based or fieldmap-less)
    - stores outputs as fit artifacts in canonical layout (Issue 17)
    - writes a manifest with provenance (Issue 24)
  - Ensure this stage is callable from pipelines without forcing immediate application.

  ## Deliverables
  - New fit-stage API + tests.
  - Updated documentation describing produced artifacts.

  ## Acceptance criteria
  - Fit artifacts validate against schema.
  - Fit stage runs on sdcflows benchmark dataset (Issue 11) in CI.
---
# Issue 29
title: "sdcflows: refactor SDC application into a transform stage driven by fit artifacts"
brief: "Implement transform-only distortion correction using stored fit artifacts, enabling deferred application and reuse across runs. This completes the ELT split for sdcflows."
repo: nipreps/sdcflows
effort: high
impact: high
type: Feature
x-refs: [30, 41]
body: |-
  ## Context
  The value of ELT is only realised when transform-only replay is possible. For sdcflows, this means applying unwarps based on stored fit artifacts.

  ## What we will do
  - Implement `transform_sdc()` that:
    - loads fit artifacts + validates schema
    - applies distortion correction to target images
    - records provenance for the transform-only run
  - Provide a minimal integration example for downstream pipelines.
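
  A rough sketch of the replay entry point, assuming the manifest fields and the unwarp step are placeholders (here stubbed as a copy):

  ~~~python
  # Sketch: transform-only replay driven by a stored fit-artifact manifest.
  # Manifest fields are hypothetical; the actual unwarp is stubbed out.
  import json
  import shutil
  from pathlib import Path

  SUPPORTED_SCHEMAS = {"1.0"}

  def apply_unwarp(image: Path, fieldmap: Path, out_dir: Path) -> Path:
      """Stand-in for the real resampling step (placeholder copy)."""
      out = out_dir / image.name
      shutil.copy(image, out)
      return out

  def transform_sdc(artifact_dir: Path, targets: list[Path], out_dir: Path) -> list[Path]:
      manifest = json.loads((artifact_dir / "manifest.json").read_text())
      if manifest["schema_version"] not in SUPPORTED_SCHEMAS:
          raise ValueError(f"unsupported schema {manifest['schema_version']}")
      fieldmap = artifact_dir / manifest["fieldmap"]
      out_dir.mkdir(parents=True, exist_ok=True)
      # Provenance for the replay run would be appended to the manifest here.
      return [apply_unwarp(t, fieldmap, out_dir) for t in targets]
  ~~~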
  ## Deliverables
  - Transform-stage API + tests.
  - Documentation describing transform-only usage patterns.

  ## Acceptance criteria
  - Transform-only outputs match ETL outputs within evaluation tolerances.
  - Works in CI on the benchmark dataset.
---
# Issue 30
title: "sdcflows: add ETL vs ELT equivalence tests and document acceptable deltas"
brief: "Build automated tests that compare current ETL behaviour to the new ELT split, with explicit allowlists for acceptable differences. This derisks refactoring of a mature, widely used component."
repo: nipreps/sdcflows
effort: medium
impact: high
type: Task
x-refs: [41]
body: |-
  ## Context
  Refactoring mature components can produce subtle behavioural drift. Equivalence tests prevent regressions and make differences explicit.

  ## What we will do
  - Add CI jobs that run:
    - current ETL path
    - new ELT path (fit-only + transform-only)
  - Compare results using the comparison harness and metric checks.
  - Write a short "acceptable deltas" doc for any differences due to ordering/caching.

  ## Deliverables
  - CI workflow additions and test scripts.
  - `docs/elt-equivalence.md` (sdcflows-specific).

  ## Acceptance criteria
  - Tests fail on unexpected drift.
  - Any expected differences are documented and justified.
---
# Issue 31
title: "fMRIPrep: expose fit-only/transform-only/legacy-etl CLI flags with user-facing docs"
brief: "Make ELT functionality explicit and discoverable for users by adding CLI flags and documentation, while keeping legacy behaviour stable. This also creates a consistent interface for other NiPreps tools to emulate."
repo: nipreps/fmriprep
effort: medium
impact: high
type: Feature
x-refs: [32, 41, 47]
body: |-
  ## Context
  Even if ELT is implemented under the hood, users and reviewers need to see it as a concrete, testable capability. CLI flags define the contract.

  ## What we will do
  - Implement/standardise:
    - `--fit-only`
    - `--transform-only`
    - `--legacy-etl` (or naming aligned with Issue 3 policy)
  - Add help-text that explains intended use cases.
  - Add docs page describing common workflows.

  ## Deliverables
  - CLI implementation + tests.
  - Documentation page (linked from `--help` output).

  ## Acceptance criteria
  - Flags are stable and covered by smoke tests.
  - Docs explain which outputs are produced in each mode.
---
# Issue 32
title: "fMRIPrep: adopt canonical fit-artifact manifests for selected core transforms"
brief: "Align fMRIPrep with the shared fit-artifact layout and manifest schema (limited scope) so sister pipelines can reuse artifacts and tooling. Scope is intentionally constrained for the small-grant year."
repo: nipreps/fmriprep
effort: high
impact: high
type: Feature
x-refs: [33, 34, 41]
body: |-
  ## Context
  fMRIPrep is the reference implementation for fit&transform. To generalise across NiPreps, it must conform to the same manifest/layout contract as the new adopters.

  ## Scope (small-grant constrained)
  Only align manifests/layout for a limited set of transforms that downstream tools commonly need:
  - SDC-related warps (via sdcflows)
  - a core registration chain needed by pilot pipelines (as feasible)

  ## Deliverables
  - Updated artifact writing to match Issue 17 schema.
  - CI validation via schema linter (Issue 25).

  ## Acceptance criteria
  - Default user outputs remain unchanged in legacy mode.
  - Fit artifacts validate and can be consumed by transform-only tooling.
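
  For concreteness, a manifest writer could look like the sketch below; field names mirror the Issue 17 draft and are not final:

  ~~~python
  # Sketch: write a fit-artifact manifest with hashes and provenance.
  # Field names follow the Issue 17 draft and remain subject to change.
  import hashlib
  import json
  from datetime import datetime, timezone
  from pathlib import Path

  def sha256(path: Path) -> str:
      return hashlib.sha256(path.read_bytes()).hexdigest()

  def write_manifest(artifact_dir: Path, inputs: list[Path], params: dict) -> Path:
      manifest = {
          "schema_version": "1.0",
          "created": datetime.now(timezone.utc).isoformat(),
          "software": {"fmriprep": "x.y.z"},  # pinned at build time
          "inputs": [{"path": str(p), "sha256": sha256(p)} for p in inputs],
          "parameters": params,
          "artifacts": [
              {"path": str(p.relative_to(artifact_dir)), "sha256": sha256(p)}
              for p in sorted(artifact_dir.rglob("*"))
              if p.is_file() and p.name != "manifest.json"
          ],
      }
      out = artifact_dir / "manifest.json"
      out.write_text(json.dumps(manifest, indent=2))
      return out
  ~~~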
---
# Issue 33
title: "fMRIPrep-rodents: replace local SDC logic with sdcflows ELT stages"
brief: "Adopt the refactored sdcflows fit/transform stages in fMRIPrep-rodents to reduce duplication and demonstrate cross-pipeline ELT reuse. This is a high-signal sustainability win."
repo: nipreps/fmriprep-rodents
effort: high
impact: high
type: Task
x-refs: [34, 41]
body: |-
  ## Context
  The feedback highlighted concerns about uneven/delayed impact when rebasing mature pipelines. Reusing sdcflows ELT stages is an immediate, concrete reduction of duplicated logic.

  ## What we will do
  - Swap rodent-specific SDC estimation/application glue to call:
    - `sdcflows.fit_*` (Issue 28)
    - `sdcflows.transform_*` (Issue 29)
  - Ensure rodent-specific parameters are explicit and recorded in manifests.

  ## Deliverables
  - Updated workflow wiring.
  - Equivalence testing on rodent benchmark dataset (Issue 9).

  ## Acceptance criteria
  - No loss of functionality vs current rodent pipeline.
  - CI passes equivalence checks; any deltas are documented.
---
# Issue 34
title: "fMRIPrep-rodents: refactor registration chain into reusable fit artifacts + transform-only replay"
brief: "Implement fit artifacts for the rodent registration chain and enable transform-only reapplication. This demonstrates the ELT pattern beyond SDC and reduces recurring maintenance from duplicated fixes."
repo: nipreps/fmriprep-rodents
effort: high
impact: high
type: Feature
x-refs: [41]
body: |-
  ## Context
  Rodent pipelines often need the same registration logic as human pipelines but with parameter tweaks. Encoding registration as fit artifacts reduces duplication and makes validation clearer.

  ## What we will do
  - Identify the minimal registration steps to split into fit/transform.
  - Store transforms via nitransforms serialization (Issue 18/19) where applicable.
  - Ensure transform-only replay can run on the benchmark dataset.

  ## Deliverables
  - Fit artifact writing + manifests.
  - Transform-only replay wiring.
  - Tests validating replay reproduces ETL outputs within tolerances.

  ## Acceptance criteria
  - Fit artifacts validate and can be reapplied deterministically.
  - Transform-only mode produces the expected derivatives without rerunning estimation.
---
# Issue 35
title: "Extract shared rodent/human subworkflows into niworkflows to reduce duplication"
brief: "Identify a targeted list of duplicated subworkflows and move the shared parts into niworkflows, leaving only rodent-specific deltas in fMRIPrep-rodents. This directly tackles technical debt and small-core maintenance load."
repo: nipreps/niworkflows
effort: high
impact: high
type: Task
x-refs: [41, 48]
body: |-
  ## Context
  A persistent sustainability issue is duplicated logic across -Preps. A small grant can't rewrite everything, but it can extract a curated set of high-churn shared components.

  ## What we will do
  - Audit fMRIPrep-rodents for duplicated components that are:
    - frequently updated upstream
    - logically general (not rodent-only)
  - Migrate a constrained set (explicitly listed in the PR) into `niworkflows`.
  - Update fMRIPrep-rodents to import from niworkflows with minimal wrappers.

  ## Deliverables
  - New shared subworkflow modules in `niworkflows`.
  - Migration PR in fMRIPrep-rodents (linked in this issue).

  ## Acceptance criteria
  - Duplicated code removed in rodents repo for the targeted set.
  - CI equivalence checks pass; deltas documented if any.
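
  The intended end state, sketched with fully hypothetical names and stand-in types: rodent code keeps only a thin wrapper carrying species-specific deltas over the shared implementation:

  ~~~python
  # Sketch: division of labour after extraction; all names hypothetical.
  def init_confounds_wf(*, fd_radius: float = 50.0, name: str = "confounds_wf"):
      """Shared implementation that would live in niworkflows."""
      return {"name": name, "fd_radius": fd_radius}  # stand-in for a Nipype workflow

  def init_rodent_confounds_wf(**kwargs):
      """fMRIPrep-rodents keeps only the species-specific deltas."""
      kwargs.setdefault("fd_radius", 5.0)  # smaller head radius for rodents (example)
      return init_confounds_wf(**kwargs)

  print(init_rodent_confounds_wf())
  ~~~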
---
# Issue 36
title: "PETPrep: define fit artifacts for motion correction and coregistration (ELT-ready)"
brief: "Design and implement the minimal set of PETPrep fit artifacts needed to support transform-only replay for dynamic series. This establishes ELT generalisation across modalities."
repo: nipreps/petprep
effort: high
impact: high
type: Feature
x-refs: [37, 38, 41]
body: |-
  ## Context
  PET workflows provide a clear ELT use case: estimate transforms once, apply to multiple frames or analyses. Defining fit artifacts carefully avoids future refactor churn.

  ## What we will do
  - Define fit artifacts for:
    - motion correction estimates
    - PET↔MRI coregistration transforms (where applicable)
  - Encode artifacts with canonical manifests and nitransforms-compatible serialization where feasible.
  - Ensure artifacts are stored in derivatives in a predictable way.

  ## Deliverables
  - Fit-stage implementation and manifest writing.
  - Documentation describing produced artifacts and intended reuse.

  ## Acceptance criteria
  - Fit artifacts validate against schema and are replayable.
  - Works on PETPrep benchmark dataset (Issue 10).
---
# Issue 37
title: "PETPrep: implement transform-only replay applying stored transforms to dynamic series"
brief: "Enable running PETPrep in transform-only mode, applying precomputed transforms/parameters to generate derivatives without re-estimation. This is the user-visible ELT deliverable for PETPrep."
repo: nipreps/petprep
effort: high
impact: high
type: Feature
x-refs: [38, 41]
body: |-
  ## Context
  Transform-only replay is the concrete ELT benefit for users and downstream tooling. PET dynamic data makes this particularly valuable.

  ## What we will do
  - Implement a transform-only path that:
    - loads fit artifacts + validates manifests
    - applies transforms to required frames
    - records provenance for the replay run

  ## Deliverables
  - Transform-only implementation + tests.
  - CLI exposure consistent with fMRIPrep flags (Issue 31 policy).

  ## Acceptance criteria
  - Transform-only outputs match ETL outputs within tolerances.
  - Produces complete provenance and updates manifests appropriately.
---
# Issue 38
title: "PETPrep: add ETL vs ELT equivalence tests and document acceptable deltas"
brief: "Add automated equivalence tests for PETPrep's ELT split using the benchmark dataset, with explicit documentation of any acceptable differences. This increases confidence for a mature pipeline refactor."
repo: nipreps/petprep
effort: medium
impact: high
type: Task
x-refs: [41]
body: |-
  ## Context
  PET transformations are sensitive; equivalence tests are required to claim ELT safety.

  ## What we will do
  - Run:
    - ETL pipeline
    - ELT split (fit-only + transform-only)
  - Compare outputs with the standard harness and PET-specific sanity checks.
  - Document any acceptable deltas (ordering/caching differences).

  ## Deliverables
  - CI workflow and test scripts.
  - `docs/elt-equivalence.md` (PETPrep-specific).

  ## Acceptance criteria
  - Tests fail on unexpected drift.
  - Expected deltas (if any) are justified and tracked.
---
# Issue 39
title: "MRIQC: split 'mriqc group' into a cached fit stage and a report-generation transform stage"
brief: "Refactor MRIQC's group step so group statistics/model fitting is cached and report generation can be replayed without recomputation. This extends ELT thinking beyond preprocessing while remaining scope-tight."
repo: nipreps/mriqc
effort: medium
impact: medium
type: Feature
x-refs: [41]
body: |-
  ## Context
  Although MRIQC is not a preprocessing pipeline, it has a natural fit/transform split in the group stage: compute group-level summaries once, render/update reports many times.

  ## What we will do
  - Identify the minimum split:
    - fit stage: compute group statistics/summaries and store as artifacts + manifest
    - transform stage: generate HTML/plots from stored artifacts
  - Add caching and a transform-only replay path.

  ## Deliverables
  - Updated `mriqc group` implementation.
  - Minimal tests using a small synthetic or benchmark dataset.

  ## Acceptance criteria
  - Re-running reporting does not recompute group fit by default.
  - Artifacts include provenance and validate against a lightweight schema.
---
# Issue 40
title: "nirodents: provide reusable template/atlas transform utilities compatible with fit-artifact manifests"
brief: "Add utilities for rodent template transforms and metadata handling that can be stored/reused as fit artifacts. This supports ELT adoption in rodent pipelines and reduces bespoke transform glue."
repo: nipreps/nirodents
effort: medium
impact: medium
type: Task
x-refs: [41]
body: |-
  ## Context
  Rodent preprocessing often depends on consistent template/atlas mapping. Making these utilities reusable and artifact-compatible reduces repeated bespoke implementations across tools.

  ## What we will do
  - Provide small utilities/helpers that:
    - represent template/atlas transforms with explicit coordinate frame metadata
    - integrate with nitransforms serialization where appropriate
    - can be referenced from fit-artifact manifests

  ## Deliverables
  - Utility functions + tests.
  - Documentation snippet demonstrating usage from fMRIPrep-rodents.

  ## Acceptance criteria
  - Utilities are general-purpose (not tied to one dataset).
  - Compatible with manifest layout/schema conventions.
---
# Issue 41
title: "Cross-pipeline integration test: fit-only + transform-only replay matches monolithic ETL run"
brief: "Create an end-to-end integration test that runs fit-only then transform-only, and compares to a monolithic run. This is the strongest 'proof' mechanism for ELT correctness across NiPreps."
repo: nipreps/niworkflows
effort: high
impact: high
type: Task
x-refs: [46]
body: |-
  ## Context
  Component-level tests are necessary but not sufficient. An end-to-end test demonstrates that ELT is not just an internal refactor, but a coherent pipeline execution model.

  ## What we will do
  - For at least two pilot pipelines (e.g., sdcflows-driven path and one full pipeline):
    1) run monolithic (legacy ETL) execution
    2) run fit-only, then transform-only
    3) compare outputs via harness and pipeline-specific checks
  - Integrate into CI as a gated workflow (can be nightly if too expensive for PRs).

  ## Deliverables
  - CI workflows and scripts.
  - Stored evaluation artifacts (JSON + markdown summaries).

  ## Acceptance criteria
  - Test reliably detects regressions (fails when expected).
  - Test outputs are stable and archived for release candidates.
---
# Issue 42
title: "migas-py: define telemetry event schema v2 for multi-stage (fit/transform) pipelines"
brief: "Define a telemetry schema that can represent fit vs transform stages, success/failure categories, and key performance counters, without collecting identifying data. This directly strengthens the 'evidence at scale' story."
repo: nipreps/migas-py
effort: medium
impact: high
type: Feature
x-refs: [43, 44, 45, 46]
body: |-
  ## Context
  Reviewers were not fully convinced by telemetry/tracking plans and asked for clearer milestones. A versioned schema with explicit stage events makes telemetry auditable and interpretable.

  ## What we will do
  - Define event types for:
    - stage start/end (fit/transform)
    - pipeline success/failure (with categorised errors)
    - optional performance counters (duration, disk estimates)
  - Document which fields are required and which are optional.
  - Define privacy constraints (no IPs, no user identifiers).

  ## Deliverables
  - `migas` schema document + reference JSON Schema.
  - Example payloads for a pilot pipeline.

  ## Acceptance criteria
  - Schema supports aggregation by pipeline version and stage.
  - Schema is stable and versioned; breaking changes require a new major schema version.
---
# Issue 43
title: "migas-py: implement opt-in telemetry configuration + client-side country inference (privacy-preserving)"
brief: "Implement explicit opt-in controls and a privacy-preserving method for generating coarse country-level aggregates, plus documentation. This improves UK-impact measurability while respecting privacy."
repo: nipreps/migas-py
effort: medium
impact: high
type: Feature
x-refs: [44, 45, 46]
body: |-
  ## Context
  Country-level adoption is a funder-relevant measure, but must be privacy-preserving. Telemetry must be opt-in and transparent.

  ## What we will do
  - Add explicit opt-in/out controls (env var, config file, CLI integration hooks).
  - Implement a client-side country inference option that:
    - does not send IP addresses to the server
    - records only coarse country code (if enabled)
  - Publish a telemetry transparency statement in docs.

  ## Deliverables
  - Configuration implementation + tests.
  - Documentation describing:
    - what is collected
    - how to opt out
    - how country inference works (and limitations)

  ## Acceptance criteria
  - Default is privacy-preserving and policy-compliant.
  - Opt-out is easy and documented in all pilot pipelines.
---
# Issue 44
title: "migas-server: implement ingestion + aggregation for telemetry schema v2"
brief: "Extend migas-server to ingest the new stage-aware schema and compute aggregates required for KPI reporting (runs, success rates, stage durations). This is the backend milestone reviewers wanted to see."
repo: nipreps/migas-server
effort: high
impact: high
type: Feature
x-refs: [45, 46]
body: |-
  ## Context
  A telemetry schema only matters if the server can aggregate and report it. This issue adds the backend capability needed for impact evidence.

  ## What we will do
  - Add endpoints/handlers for schema v2 payloads.
  - Store only what is required for aggregate reporting.
  - Implement aggregation jobs:
    - runs/week by pipeline version
    - success/failure rate
    - stage durations (fit vs transform)
    - optional country-level counts (if enabled, with privacy thresholds)

  ## Deliverables
  - Server implementation + tests.
  - Documentation for deployment/maintenance.

  ## Acceptance criteria
  - Aggregation outputs match expected totals on test fixtures.
  - No raw identifying information is stored.
---
# Issue 45
title: "migas-server: publish a privacy-preserving dashboard + exportable aggregates for reporting"
brief: "Provide a simple, public-facing dashboard (or static reports) showing aggregated telemetry and adoption metrics with privacy thresholds. This makes 'evidence at scale' tangible to reviewers and stakeholders."
---
# Issue 43
title: "migas-py: implement opt-in telemetry configuration + client-side country inference (privacy-preserving)"
brief: "Implement explicit opt-in controls and a privacy-preserving method for generating coarse country-level aggregates, plus documentation. This improves UK-impact measurability while respecting privacy."
repo: nipreps/migas-py
effort: medium
impact: high
type: Feature
x-refs: [44, 45, 46]
body: |-
    ## Context
    Country-level adoption is a funder-relevant measure, but must be privacy-preserving. Telemetry must be opt-in and transparent.

    ## What we will do
    - Add explicit opt-in/opt-out controls (env var, config file, CLI integration hooks).
    - Implement a client-side country inference option that:
      - does not send IP addresses to the server
      - records only a coarse country code (if enabled; see the sketch below)
    - Publish a telemetry transparency statement in the docs.

    ## Deliverables
    - Configuration implementation + tests.
    - Documentation describing:
      - what is collected
      - how to opt out
      - how country inference works (and its limitations)

    ## Acceptance criteria
    - Default is privacy-preserving and policy-compliant.
    - Opt-out is easy and documented in all pilot pipelines.
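    ## Sketch (non-normative)
    A minimal sketch of the opt-in gate and a client-side country guess; the `MIGAS_TELEMETRY` variable and the helper names are hypothetical, and deriving the country from the configured locale is only one candidate approach to be evaluated here:

    ```python
    import locale
    import os

    def telemetry_enabled() -> bool:
        """Telemetry is off unless the user explicitly opts in."""
        return os.getenv("MIGAS_TELEMETRY", "0") == "1"

    def coarse_country() -> str | None:
        """Guess a two-letter country code from the locale, fully client-side.

        No network information (such as the IP address) is consulted or sent;
        if the locale carries no territory, nothing is reported at all.
        """
        territory = (locale.getlocale()[0] or "").partition("_")[2]
        return territory[:2].upper() or None

    # Attach the country field only when the user opted in.
    extra = {"country": coarse_country()} if telemetry_enabled() else {}
    ```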
---
# Issue 44
title: "migas-server: implement ingestion + aggregation for telemetry schema v2"
brief: "Extend migas-server to ingest the new stage-aware schema and compute the aggregates required for KPI reporting (runs, success rates, stage durations). This is the backend milestone reviewers wanted to see."
repo: nipreps/migas-server
effort: high
impact: high
type: Feature
x-refs: [45, 46]
body: |-
    ## Context
    A telemetry schema only matters if the server can aggregate and report it. This issue adds the backend capability needed for impact evidence.

    ## What we will do
    - Add endpoints/handlers for schema v2 payloads.
    - Store only what is required for aggregate reporting.
    - Implement aggregation jobs:
      - runs/week by pipeline version
      - success/failure rate
      - stage durations (fit vs transform)
      - optional country-level counts (if enabled, with privacy thresholds)

    ## Deliverables
    - Server implementation + tests.
    - Documentation for deployment/maintenance.

    ## Acceptance criteria
    - Aggregation outputs match expected totals on test fixtures.
    - No raw identifying information is stored.
---
# Issue 45
title: "migas-server: publish a privacy-preserving dashboard + exportable aggregates for reporting"
brief: "Provide a simple, public-facing dashboard (or static reports) showing aggregated telemetry and adoption metrics with privacy thresholds. This makes 'evidence at scale' tangible to reviewers and stakeholders."
repo: nipreps/migas-server
effort: medium
impact: high
type: Feature
x-refs: [46]
body: |-
    ## Context
    Reviewers were unconvinced that ecosystem gains were evidenced at scale. A dashboard showing aggregate usage and adoption trends is a direct rebuttal.

    ## What we will do
    - Implement a minimal dashboard or static report outputs that show:
      - global runs and success rates
      - ELT feature uptake (fit-only/transform-only usage)
      - (optional) country-level aggregates with k-anonymity thresholds (see the sketch below)
    - Provide an export endpoint for CSV/JSON aggregates for quarterly reports.

    ## Deliverables
    - Dashboard/report implementation.
    - Documentation describing privacy thresholds and suppression rules.

    ## Acceptance criteria
    - No small-n country buckets are exposed.
    - Aggregates are reproducible and versioned.
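    ## Sketch (non-normative)
    One way the suppression rule could work; the threshold value and the `"other"` fold-in are assumptions to be fixed in the privacy documentation:

    ```python
    def suppress_small_buckets(
        country_counts: dict[str, int], k: int = 10
    ) -> dict[str, int]:
        """Fold any country bucket with fewer than k runs into 'other'."""
        kept = {c: n for c, n in country_counts.items() if n >= k}
        leftover = sum(n for n in country_counts.values() if n < k)
        if leftover:
            kept["other"] = kept.get("other", 0) + leftover
        return kept

    # {"GB": 120, "CH": 45, "LU": 3} -> {"GB": 120, "CH": 45, "other": 3}
    print(suppress_small_buckets({"GB": 120, "CH": 45, "LU": 3}))
    ```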
---
# Issue 46
title: "Automate quarterly 'Impact at scale' reporting (telemetry + GitHub + citations)"
brief: "Build an automated reporting pipeline that generates a quarterly markdown/PDF-style report from telemetry aggregates, GitHub activity, and citation counts. This turns impact claims into a repeatable deliverable."
repo: nipreps/migas-py
effort: medium
impact: high
type: Task
x-refs: [52]
body: |-
    ## Context
    The feedback requested stronger evidence at scale and clearer milestones for tracking. Quarterly automated reporting is an explicit milestone and an output reviewers can understand.

    ## What we will do
    - Create a report generator that pulls:
      - migas-server aggregates (Issue 45)
      - GitHub stats (PRs, issues, contributors)
      - citation counts for key papers (where feasible via scripted sources)
    - Produce a quarterly report artifact and publish it in-repo.

    ## Deliverables
    - Scripted pipeline (CLI) + GitHub Action.
    - `reports/quarterly/` directory with the first generated report.

    ## Acceptance criteria
    - Report is generated without manual data entry.
    - Report includes the KPI table defined in Issue 1.
    - Report links to raw aggregate exports for auditability.
---
# Issue 47
title: "User documentation: ELT workflows (fit-only/transform-only), migration, and troubleshooting"
brief: "Publish a user-facing guide explaining ELT concepts in plain language, how to run fit-only/transform-only, and how legacy ETL mode behaves. This supports community uptake and reduces support burden."
repo: nipreps/fmriprep
effort: medium
impact: high
type: Task
x-refs: [49, 50]
body: |-
    ## Context
    The transformative upside depends on community uptake. Uptake depends on clear, accessible documentation and predictable user workflows.

    ## What we will do
    - Write a guide that covers:
      - what ELT means in NiPreps terms
      - when fit-only/transform-only is useful
      - how outputs differ (if at all) vs legacy ETL
      - common failure modes and how to report issues
    - Include at least one end-to-end worked example using a benchmark dataset.

    ## Deliverables
    - Docs page(s) in the fMRIPrep documentation.
    - Cross-links from CLI help and release notes.

    ## Acceptance criteria
    - Guide is readable by non-developer users.
    - Includes a short "FAQ" and "support checklist" section.
---
# Issue 48
title: "Maintainer documentation: checklist for adopting FitTransform + fit-artifact manifests in a pipeline"
brief: "Create a step-by-step maintainer guide and checklist for ELT adoption, including what must be tested and what artifacts must be produced. This reduces reliance on the core team and improves execution confidence."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Task
x-refs: [49, 50, 51]
body: |-
    ## Context
    Reviewers noted that throughput concentrates in a small core. A maintainer checklist is a concrete mechanism to distribute work and standardise reviews.

    ## What we will do
    - Publish a checklist that covers:
      - adopting the FitTransform protocol
      - writing manifests in the canonical layout
      - adding schema validation in CI
      - adding equivalence tests and performance budgets
    - Provide a template PR description for ELT refactor PRs.

    ## Deliverables
    - `docs/maintainers/elt-checklist.md`
    - Example PR template content.

    ## Acceptance criteria
    - Checklist is used by at least one pilot refactor PR during the grant year.
    - Checklist includes "reviewer sign-off" items aligned with the Issue 12 rubric.
---
# Issue 49
title: "Training: develop a 2-hour ELT/fit&transform workshop (slides + exercises + sample data)"
brief: "Create a compact workshop package suitable for UK RSE groups and NiPreps community sessions. This is a bounded dissemination deliverable aligned to the reduced-scope Round 2 plan."
repo: nipreps/niworkflows
effort: medium
impact: medium
type: Task
x-refs: [50]
body: |-
    ## Context
    Adoption requires more than docs; a small, repeatable workshop makes the ELT shift legible and teaches contributors how to help.

    ## What we will do
    - Develop:
      - a slide deck (concepts + architecture)
      - hands-on exercises:
        - inspect a fit-artifact manifest
        - run a transform-only replay on a tiny dataset
        - validate manifests with the linter
    - Use only minimal sample data (CI-friendly).

    ## Deliverables
    - Slides + exercises in-repo.
    - Instructor notes and timing.

    ## Acceptance criteria
    - Workshop can be delivered end-to-end in 2 hours.
    - Exercises run on standard laptops without specialised infrastructure.
---
# Issue 50
title: "Community: run an online ELT mini-sprint focused on onboarding and closing curated issues"
brief: "Organise a remote mini-sprint with a curated set of 'good first issue' and 'help wanted' tickets related to ELT adoption, docs, and telemetry. This addresses the small-core throughput constraint with an achievable intervention."
repo: nipreps/niworkflows
effort: medium
impact: medium
type: Task
x-refs: [51]
body: |-
    ## Context
    We cannot magically double the core team in a year, but we *can* reduce onboarding friction and convert some users into contributors via structured events.

    ## What we will do
    - Curate ~10 sprint-sized issues across repos with clear acceptance criteria.
    - Prepare onboarding materials (links to the Issue 48 checklist and Issue 27 docs).
    - Run a 1–2 day remote sprint (time-zone friendly for the UK/Europe).
    - Publish a post-sprint summary with outcomes and follow-up actions.

    ## Deliverables
    - Sprint plan doc + curated issue list.
    - Post-sprint report (markdown).

    ## Acceptance criteria
    - At least 5 issues receive external contributions (PRs or reviews).
    - Sprint report documents lessons learned and updates onboarding docs accordingly.
---
# Issue 51
title: "Governance: implement a throughput-diversification plan (triage rotation, reviewer pool, mentorship)"
brief: "Create and pilot lightweight governance mechanisms that distribute workload: rotating triage duty, an expanded reviewer pool, and mentorship pathways. This is a pragmatic response to the 'small core' throughput feedback."
repo: nipreps/niworkflows
effort: medium
impact: high
type: Task
x-refs: [52]
body: |-
    ## Context
    The feedback notes that throughput concentrates in a small core. This issue creates mechanisms to spread the load without assuming new long-term hires.

    ## What we will do
    - Define a triage rotation (weekly/monthly) and document responsibilities.
    - Create a reviewer pool:
      - identify contributors willing to review specific areas
      - document review checklists and escalation paths
    - Set up a mentorship pathway:
      - pair new contributors with maintainers for one PR cycle
      - publish expectations and time budget

    ## Deliverables
    - `docs/governance/throughput-plan.md`
    - Initial roster for the triage rotation and reviewer pool.

    ## Acceptance criteria
    - Rotation runs for at least 8 consecutive weeks.
    - Metrics (issue response time, PR review latency) are reported in the quarterly impact report (Issue 46).
---
# Issue 52
title: "Dissemination: write and submit a technical report/perspective on ELT for neuroimaging preprocessing"
brief: "Produce a concise technical perspective describing the ELT/fit&transform approach, the evaluation methodology, and early results from pilot adoption. This is a bounded, high-leverage dissemination output for the reduced-scope grant."
repo: nipreps/niworkflows
effort: high
impact: medium
type: Task
x-refs: []
body: |-
    ## Context
    A short, citable technical report helps translate engineering work into an impact artifact that reviewers and the broader community can evaluate. It also helps demonstrate that ELT generalisation is a reusable pattern.

    ## What we will do
    - Draft an outline covering:
      - motivation for ELT in neuroimaging preprocessing
      - the fit-artifact contract and manifest approach
      - the evaluation framework and equivalence gates
      - the telemetry approach and initial adoption metrics
    - Submit to an appropriate venue (journal perspective, preprint, or technical report series) consistent with the Round 2 scope.

    ## Deliverables
    - Manuscript draft in-repo.
    - Submission-ready version + public link once available.

    ## Acceptance criteria
    - Includes reproducible pointers to benchmark datasets and evaluation outputs.
    - Clearly distinguishes what is delivered in the 1-year scope vs future directions.
```

### Item 2

* A note