In this document, we propose a general principle for developing BIDS extension proposals (BEPs) for derivative data. The goal is to establish consensus so that parts of BEPs that propose terms in line with this document will be considered accepted in principle. We propose to solicit feedback from the community, set a timeline for the discussion, and settle on a decision-making process. At the end of the timeline we request that a decision be reached. The principle is RECOMMENDED, not REQUIRED, in that BEPs would be allowed to deviate when deemed necessary.
In working through BEPs 12 and 16, we have identified a repeated pattern in generating derivatives within several imaging modalities' workflows where:
We require a reference map that is used to encode spatial features and parameters. There is an antecedent of this in BIDS with BEP23 (see below). In that BEP, the proposed naming takes the pattern _<suffix>ref (e.g., _boldref, _dwiref, etc.), and that solution has been suggested as a possibility in issue #1532 of the spec repository.
We have derived data that are no longer of the same type as the original, but for which we would like to keep the notion of the modality from which this was derived, while also signalling that it is derived (i.e., non-raw).
Introduce a new suffix pattern: _<suffix>map, where <suffix> is a BIDS suffix used in the raw data (e.g., dwi or bold). For example, the proposed pattern produces the suffixes _dwimap or _boldmap. BEPs may use this suffix pattern under the conditions specified below and MUST specify the extension and metadata that are required with the suffix.
(1) The file descriptor does fall under one of the generic derivatives descriptors.
(2) No other descriptor exists in the BIDS spec. For example, statsmap cannot be used, because it is already in use, or soon will be, for a different specification.
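The suffix rule and its second condition can be sketched in code. The following is a minimal illustration, not part of any BIDS tooling: the function name map_suffix and both sets are hypothetical, the raw suffixes listed are only the two named above, and the reserved set contains only the statsmap example.

```python
# Illustrative subset of raw-data suffixes (assumption: only the two
# examples named in the text are listed here).
RAW_SUFFIXES = frozenset({"bold", "dwi"})

# Suffixes already claimed elsewhere (assumption: only the "statsmap"
# example from the text is listed here).
RESERVED_SUFFIXES = frozenset({"statsmap"})


def map_suffix(raw_suffix, raw_suffixes=RAW_SUFFIXES, reserved=RESERVED_SUFFIXES):
    """Return '<suffix>map' for a raw-data suffix, or raise ValueError."""
    if raw_suffix not in raw_suffixes:
        raise ValueError(f"{raw_suffix!r} is not a known raw-data suffix")
    candidate = f"{raw_suffix}map"
    if candidate in reserved:
        raise ValueError(f"{candidate!r} is already used for another purpose")
    return candidate


print(map_suffix("dwi"))
print(map_suffix("bold"))
```

A BEP-specific validator would additionally need to check the extension and metadata that the BEP declares for the suffix, which this sketch leaves out.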
Many users are not equipped to understand fine distinctions between different classes of derivatives (e.g., those that are produced by a model fit and those produced by a direct computation).
This suffix pattern provides context through the concatenation of a raw data suffix and the word "map", which implies that the file still contains information that is spatially contiguous (in contrast to tabular/"tidy" data, with each row representing a brain region, for example).
BEP 23 has introduced "maps" that correspond to the conventions introduced by BEP 001 (qMRI), such as T1map, T2map, etc. The following maps were introduced:
RDmap (receptor density map)
BPmap (binding potential map)
GEmap (genetic expression map)
These generally will be distributed as mean/standard-deviation pairs, for example: sub-01_stat-mean_desc-5HT_RDmap.nii.gz / sub-01_stat-std_desc-5HT_RDmap.nii.gz.
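To make the anatomy of such a filename concrete, the following sketch splits a name into its entities, suffix, and extension. The helper split_bids_name is hypothetical. It assumes that entities are key-value pairs joined by underscores and that the first dot starts the extension, so it is not a substitute for a full BIDS parser.

```python
def split_bids_name(filename):
    """Split a BIDS-style filename into (entities, suffix, extension).

    Assumes 'key-value' entities separated by underscores, followed by a
    suffix, with the extension starting at the first dot.
    """
    stem, _, extension = filename.partition(".")
    *entity_parts, suffix = stem.split("_")
    entities = dict(part.split("-", 1) for part in entity_parts)
    return entities, suffix, "." + extension


entities, suffix, extension = split_bids_name(
    "sub-01_stat-mean_desc-5HT_RDmap.nii.gz"
)
print(entities)   # {'sub': '01', 'stat': 'mean', 'desc': '5HT'}
print(suffix)     # RDmap
print(extension)  # .nii.gz
```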
BEP 12 proposes a collection of summary statistics, including mean, standard deviation, temporal SNR, regional homogeneity, etc. Following the example of BEP 23, it has adopted the pattern:
<source_entities>_stat-<mean|std|...>_boldmap.nii.gz
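As an illustration, the pattern can be filled in programmatically. This sketch assumes a hypothetical helper stat_map_filename and an example source-entity string (sub-01_task-rest) that does not come from BEP 12. Only the stat entity, the boldmap suffix, and the .nii.gz extension are taken from the pattern above.

```python
def stat_map_filename(source_entities, stat, raw_suffix="bold", extension=".nii.gz"):
    """Build '<source_entities>_stat-<stat>_<raw_suffix>map<extension>'."""
    if not stat.isalnum():
        # Entity values in BIDS filenames are alphanumeric labels.
        raise ValueError(f"stat label must be alphanumeric, got {stat!r}")
    return f"{source_entities}_stat-{stat}_{raw_suffix}map{extension}"


# Hypothetical source entities, used for illustration only.
for label in ("mean", "std"):
    print(stat_map_filename("sub-01_task-rest", label))
```

Keeping the raw suffix as a parameter means the same helper would also cover a dwimap file, should another BEP adopt the pattern.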
The current writing of the proposal follows the alternative listed below where model fit and model derived parameters are described:
<source_entities>_model.<extension>
<source_entities>_mdp.<extension>
This pattern is, in principle, more generalizable across the other ongoing BEPs and Derivatives in general:
A data process might have generated primary parameters that are either 3D (x,y,z) or 4D (x,y,z,v). These parameters might be of help for further data analysis or data interpretation, and ultimately the data end user. Examples include "statistics" such as mean, std, etc. or model derivatives, such as DTI FA.
At the same time, the process might have generated secondary parameters. These are not strictly necessary for further processing or data interpretation, but they can be useful for interpreting the outputs of the data process, for tracking the history of the processing, for reproducibility, and ultimately for debugging by the developer/modeller of the code.
The current version of the proposal uses a comparable pattern as outlined for BEP16:
<source_entities>_mdp.<extension>
<source_entities>_mfp.<extension>
Suffixes that distinguish between model-fit and model-derived parameters. This alternative is implemented in the current state of BEP16 and BEP39.
We assess that this option should be deemed rejectable for the following reasons:
For the word that modifies the <suffix>, the following options have been considered:
tensor: this was deemed rejectable because, while the fancy Google branding has run with it, the term still means something in physics.
array: all non-scalar data may be considered an array, but the term lacks the association with spatial meaning.
image: this was deemed rejectable because in its common usage in neuroimaging software, it implies raw data (e.g., boldimage would most likely be read as an image containing BOLD data).
Allowing each BEP to create separate suffixes that provide a good match to the use-case in that BEP. This is the status quo. It was deemed rejectable because a common pattern makes both decision making and technical implementation simpler, provides a reference rule for future implementations, and avoids the proliferation of suffixes.
As outlined above, we propose a two-stage decision-making process within a set timeline in order to reach a consensus. Furthermore, we aim to evaluate the feasibility of this process for other BIDS-related discussions, i.e., community-driven/guided decision making.
Stage 1
In the first stage, comments from the entire community are solicited and discussed. We suggest a time period of 2 weeks, starting the day after the proposal was initially circulated/posted.
Stage 2
In the second stage, a vote on the proposed options (based on the Stage 1 outcomes) will take place. Here, we also suggest a time period of 2 weeks, starting the day after Stage 1 has finished.
After this time, this proposal will become part of the standard operating procedures of BIDS and be referenced in BEP development guidelines.