
Metadata

  • Authors: Peer Herholz, Oscar Esteban, Chris Markiewicz, Ariel Rokem, and Franco Pestilli
  • Date: 2023-07-21

Abstract

In this document, we propose a general principle for developing BIDS extension proposals (BEPs) for derivative data. The goal is to establish consensus so that parts of BEPs that propose terms in line with this document will be considered accepted in principle. The proposal is to ask for feedback from the community, provide a timeline for the discussion, and settle on a decision-making process. At the end of the timeline, we request that a decision be reached. The proposal is RECOMMENDED, not REQUIRED, in that BEPs would be allowed to deviate when deemed necessary.

Problem statement

In working through BEPs 12 and 16, we have identified a repeated pattern in generating derivatives within several imaging modalities' workflows where:

  1. We require a reference map that is used to encode spatial features and parameters. There is an antecedent of this in BIDS with BEP23 (see below). In that BEP, the proposed naming takes the pattern _<suffix>ref (e.g., _boldref, _dwiref, etc.), and that solution has been suggested as a possibility in issue #1532 of the spec repository.

  2. We have derived data that are no longer of the same type as the original, but for which we would like to keep the notion of the modality from which this was derived, while also signalling that it is derived (i.e., non-raw).

Proposal

Introduce a new suffix pattern: _<suffix>map, where <suffix> is a BIDS suffix used in the raw data (e.g., dwi or bold). For example, the proposed pattern produces the suffixes _dwimap or _boldmap. BEPs may use this suffix pattern under the conditions specified below and MUST specify the extension and metadata that are required with the suffix.

(1) The file descriptor does not fall under one of the generic derivatives descriptors.
(2) No other descriptor exists in the BIDS spec. For example, statsmap cannot be used, because it is already used, or soon will be, by a different specification.
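
As a minimal sketch of how the pattern composes, the snippet below builds a `<suffix>map` suffix from a raw suffix and rejects candidates that collide with an existing suffix. The names `RAW_SUFFIXES`, `RESERVED_SUFFIXES`, and `map_suffix` are hypothetical, and the two sets contain only the examples named in this document; a real check would have to consult the BIDS specification itself.

```python
# Illustrative sets only: these hold the examples named in this document,
# not the authoritative lists from the BIDS specification.
RAW_SUFFIXES = {"bold", "dwi"}
RESERVED_SUFFIXES = {"statsmap"}


def map_suffix(raw_suffix, raw_suffixes=RAW_SUFFIXES, reserved=RESERVED_SUFFIXES):
    """Return the proposed '<suffix>map' suffix for a raw-data suffix.

    Raises ValueError if the input is not a known raw suffix, or if the
    resulting suffix is already taken (condition 2 in the text).
    """
    if raw_suffix not in raw_suffixes:
        raise ValueError(f"{raw_suffix!r} is not a known raw suffix")
    candidate = f"{raw_suffix}map"
    if candidate in reserved:
        raise ValueError(f"{candidate!r} is already used elsewhere")
    return candidate


print(map_suffix("dwi"))
print(map_suffix("bold"))
```

Condition (1), whether a generic derivatives descriptor already covers the file, is a judgment about the content of the file and is not modelled here.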

Motivation

Many users are not equipped to understand fine distinctions between different classes of derivatives (e.g., those produced by a model fit versus those produced by a direct computation).

This suffix pattern provides context through the concatenation of a raw data suffix and the word "map", which implies that the file still contains information that is spatially contiguous (in contrast to tabular/"tidy" data, with each row representing a brain region, for example).

Precedents and interactions with other BEPs

BEP 23: PET Derivatives

BEP 23 has introduced "maps" that correspond to the conventions introduced by BEP 001 (qMRI), such as T1map, T2map, etc. The following maps were introduced:

  • RDmap (receptor density map)
  • BPmap (binding potential map)
  • GEmap (genetic expression map)

These generally will be distributed as mean/standard-deviation pairs, for example: sub-01_stat-mean_desc-5HT_RDmap.nii.gz/sub-01_stat-std_desc-5HT_RDmap.nii.gz.

BEP 12: Functional MRI derivatives

BEP 12 proposes a collection of summary statistics, including mean, standard deviation, temporal SNR, regional homogeneity, etc. Following the example of BEP 23, it has adopted the pattern:

  • <source_entities>_stat-<mean|std|...>_boldmap.nii.gz
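
To make the structure of these filenames concrete, here is a rough sketch that splits a name into its entities, suffix, and extension. The function `parse_name` is hypothetical and assumes the simple `key-value_..._suffix.extension` layout used in the examples above; it is not a full BIDS filename parser.

```python
def parse_name(filename):
    """Split a filename into (entities, suffix, extension).

    Assumes underscore-separated 'key-value' entities followed by a suffix,
    and that the extension starts at the first dot.
    """
    stem, _, extension = filename.partition(".")
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix, "." + extension


entities, suffix, extension = parse_name("sub-01_stat-mean_desc-5HT_RDmap.nii.gz")
print(entities, suffix, extension)
```

Under this layout, the `stat` entity carries the statistic and the suffix carries the kind of map, so a mean/standard-deviation pair differs only in the value of `stat`.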

BEP 16: diffusion-weighted imaging derivatives

We discuss this option as a considered alternative below.

The current draft of the proposal follows the alternative listed below, in which model-fit and model-derived parameters are distinguished:

  • <source_entities>_model.<extension>
  • <source_entities>_mdp.<extension>

This pattern is, in principle, more generalizable across the other ongoing BEPs and Derivatives in general:

  1. A data process might have generated primary parameters that are either 3D (x,y,z) or 4D (x,y,z,v). These parameters might be of help for further data analysis or data interpretation, and ultimately the data end user. Examples include "statistics" such as mean, std, etc. or model derivatives, such as DTI FA.

  2. At the same time, the process might have generated secondary parameters. These are not strictly necessary for further processing or data interpretation, but they can be useful for interpreting the outputs of the data process, tracking the history of the processing, supporting reproducibility, and ultimately helping the developer/modeller debug the code.

BEP 39: dimensionality reduction-based networks

The current version of the proposal uses a pattern comparable to the one outlined for BEP 16:

  • <source_entities>_mdp.<extension>
  • <source_entities>_mfp.<extension>

Alternatives Considered

  1. Suffixes that distinguish between model-fit and model-derived parameters. This alternative is implemented in the current state of BEP16 and BEP39.
    We assess that this option should be deemed rejectable for the following reasons:

    1. This distinction does not seem useful for end users: there is no precedent of its adoption by any neuroimaging sub-community, and end users may or may not care about understanding the distinction.
    2. The distinction between model-fit and model-derived parameters is not always clear. To take one example, the eigenvalues and eigenvectors of the DTI tensor model could be seen as fit or derived. The utility of this high-level distinction is undermined if every such case is either left to the determination of the tool developer or requires an explicit declaration in the spec.
    3. For BIDS purposes, it is more important to state what something is than how it was derived.
    4. This is not something that any currently-existing software does.
  2. For the word that modifies the <suffix>, the following options have been considered:

    • tensor: this was deemed rejectable because, although Google's branding has popularized a looser usage, the term still has a specific meaning in physics.
    • array: all non-scalar data may be considered an array, but the term lacks any association with spatial meaning.
    • image: this was deemed rejectable because, in its common usage in neuroimaging software, it implies raw data (e.g., boldimage would most likely be read as an image containing BOLD data).
  3. Allowing each BEP to create separate suffixes that provide a good match to the use-case in that BEP. This is the status quo. It was deemed rejectable because a shared pattern makes both decision making and technical implementation simpler, provides a reference rule for future implementations, and avoids the proliferation of suffixes.

Decision making

As outlined above, we propose a two-stage decision-making process within a set timeline in order to reach a consensus. Furthermore, we aim to evaluate the feasibility of this process for other BIDS-related discussions, i.e., community-driven/guided decision making.

Stage 1
In the first stage, comments from the entire community are solicited and discussed. We suggest a time period of 2 weeks, starting the day after the proposal was initially circulated/posted.

Stage 2
In the second stage, a vote on the proposed options (based on the Stage 1 outcomes) will take place. Here, we also suggest a time period of 2 weeks, starting the day after Stage 1 finishes.

After this time, this proposal will become part of the standard operating procedures of BIDS and be referenced in BEP development guidelines.
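
For illustration, the two-stage timeline can be worked out with simple date arithmetic. The sketch below assumes that each stage lasts 14 calendar days counted inclusively, and it uses the document date as a stand-in for the posting date; neither assumption is stated in the text, and `stage_windows` is a hypothetical helper.

```python
from datetime import date, timedelta


def stage_windows(posted, weeks=2):
    """Return ((stage1_start, stage1_end), (stage2_start, stage2_end)).

    Each stage starts the day after the preceding event and spans
    `weeks` weeks, with the end date counted inclusively (an assumption).
    """
    length = timedelta(weeks=weeks)
    one_day = timedelta(days=1)
    stage1_start = posted + one_day
    stage1_end = stage1_start + length - one_day
    stage2_start = stage1_end + one_day
    stage2_end = stage2_start + length - one_day
    return (stage1_start, stage1_end), (stage2_start, stage2_end)


# Stand-in posting date (the document date); the actual circulation date may differ.
stage1, stage2 = stage_windows(date(2023, 7, 21))
print("Stage 1:", stage1[0], "to", stage1[1])
print("Stage 2:", stage2[0], "to", stage2[1])
```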