# PyBIDS 1.0 layout API
The following API provides the basic data structures needed to represent a BIDS
dataset. It aims to be simple enough to support multiple implementations.
```python
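from __future__ import annotations

from functools import cached_property
from pathlib import Path
from typing import Any, Literal

# PaddedInt is assumed to be an int subclass that preserves zero-padding
Index = PaddedInt
Value = str
Entity = Literal['subject', 'session', ...]

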
class Schema:
    """Representation of the state of BIDS schema

    This replaces the concept of a config file, allowing PyBIDS to accept
    BEP or application-specific schemas.

    For the time being, this should be considered opaque to users.
    """


class File:
    """Generic file representation

    This permits unparseable filenames to be represented.
    """
    path: Path
    _layout: BIDSLayout | None

    # Implement os.PathLike
    def __fspath__(self) -> str:
        return str(self.path)

    @cached_property
    def relative_path(self) -> Path:
        # A relative path is only defined with respect to a layout root
        if self._layout is None:
            raise ValueError("File is not associated with a layout")
        return self.path.relative_to(self._layout.root)


class BIDSFile(File):
    entities: dict[Entity, Value | Index]
    datatype: str | None
    suffix: str | None
    extension: str | None

    @cached_property
    def metadata(self) -> dict[str, Any]:
        """Sidecar metadata aggregated according to inheritance principle"""


class BIDSLayout:
    # Inspired by context: https://github.com/bids-standard/bids-specification/blob/master/src/schema/meta/context.yaml
    schema: Schema
    root: Path
    dataset_description: dict[str, Any]

    # tree: Tree[BIDSFile]  # Probably not

    # Non-conforming files, maybe better name?
    # could be property
    ignored: list[File]

    ## layouts: list[BIDSLayout]

    @cached_property
    def files(self) -> list[BIDSFile]: ...

    @cached_property
    def datatypes(self) -> list[str]: ...

    @cached_property
    def modalities(self) -> list[str]: ...

    @cached_property
    def subjects(self) -> list[str]: ...

    @cached_property
    def entities(self) -> list[Entity]: ...

    def get_entities(
        self,
        entity: Entity,
        **filters,
    ) -> list[Value | Index]: ...

    def get_metadata(self, term: str, **filters) -> list[Any]: ...

    def get_files(self, **filters) -> list[BIDSFile]: ...
```
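
The semantics of `**filters` are not spelled out above. As a minimal sketch, assuming each filter names an entity (or one of `datatype`, `suffix`, `extension`) and accepts either a single value or a collection of allowed values, matching might look like the following. `SimpleFile`, `matches`, and this free-standing `get_files` are hypothetical stand-ins for illustration, not part of the proposal.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class SimpleFile:
    """Hypothetical stand-in for BIDSFile, holding only queryable fields."""
    entities: dict[str, Any] = field(default_factory=dict)
    datatype: str | None = None
    suffix: str | None = None
    extension: str | None = None


def matches(f: SimpleFile, **filters: Any) -> bool:
    """Return True if the file satisfies every filter (assumed semantics)."""
    for key, wanted in filters.items():
        # datatype/suffix/extension are attributes; anything else is an entity
        if key in ("datatype", "suffix", "extension"):
            actual = getattr(f, key)
        else:
            actual = f.entities.get(key)
        # A list/tuple/set filter is read as "any of these values"
        allowed = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
        if actual not in allowed:
            return False
    return True


def get_files(files: list[SimpleFile], **filters: Any) -> list[SimpleFile]:
    """Keep only the files that satisfy all filters."""
    return [f for f in files if matches(f, **filters)]


files = [
    SimpleFile({"subject": "01", "task": "rest"}, "func", "bold", ".nii.gz"),
    SimpleFile({"subject": "02", "task": "rest"}, "func", "bold", ".nii.gz"),
    SimpleFile({"subject": "01"}, "anat", "T1w", ".nii.gz"),
]
selected = get_files(files, subject="01", suffix=["bold", "T1w"])
```

Under these assumptions, a file lacking a filtered entity never matches. Whether the real API should also support "entity absent" queries is left open here.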
It can be useful to group multiple datasets into a single logical layout,
for example, when querying raw data and derivatives:
```python
class LayoutCollection(BIDSLayout):
    primary: BIDSLayout
    layouts: list[BIDSLayout]


mylayout = LayoutCollection('/path/to/ds')
mylayout.primary.entities  # Only once
mylayout.entities  # Iterative
```
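
One possible reading of the distinction between `primary.entities` and `entities` is that the collection's properties aggregate over all member layouts. The sketch below assumes that reading, and also assumes that `layouts` includes the primary layout. `MiniLayout` and `MiniCollection` are hypothetical stand-ins; how a real `LayoutCollection` discovers its member datasets from a path is not addressed.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from functools import cached_property


@dataclass
class MiniLayout:
    """Hypothetical stand-in exposing only `entities`."""
    entities: list[str] = field(default_factory=list)


class MiniCollection:
    """Aggregates properties over a primary layout and additional layouts."""

    def __init__(self, primary: MiniLayout, others: list[MiniLayout]):
        self.primary = primary
        # Assumption: the primary layout is also a member of `layouts`
        self.layouts = [primary, *others]

    @cached_property
    def entities(self) -> list[str]:
        # Union of entities across layouts, keeping first-seen order
        seen: dict[str, None] = {}
        for layout in self.layouts:
            seen.update(dict.fromkeys(layout.entities))
        return list(seen)


raw = MiniLayout(["subject", "session", "task"])
deriv = MiniLayout(["subject", "task", "space", "desc"])
collection = MiniCollection(raw, [deriv])
```

Here `collection.primary.entities` reports only what the raw dataset uses, while `collection.entities` also picks up `space` and `desc` from the derivatives.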
### `layout.utils`
A module will provide utilities that rely only on the `BIDSLayout` API. The goal
is to offer implementations that survive refactors and that serve as templates
for working with the API.

The following components of the current `BIDSLayout` would be better presented as utilities:
```python
def get_bval(x: BIDSFile) -> BIDSFile: ...
def get_bvec(x: BIDSFile) -> BIDSFile: ...
def get_fieldmap(x: BIDSFile) -> list[BIDSFile]: ...
```
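
To illustrate how such a utility could be written purely against the layout API, here is a rough sketch of `get_bval` built on `get_files`. `MiniFile` and `MiniLayout` are hypothetical stand-ins with exact-match filters, and the public `layout` attribute is an assumption (the proposal only shows a private `_layout`). The sketch looks for a `.bval` file with the same suffix and entities and deliberately ignores the inheritance principle, which a real implementation would need to honor.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class MiniFile:
    """Hypothetical stand-in for BIDSFile."""
    entities: dict[str, Any]
    suffix: str
    extension: str
    # Back-reference to the owning layout; excluded from repr and comparison
    layout: MiniLayout | None = field(default=None, repr=False, compare=False)


@dataclass
class MiniLayout:
    """Hypothetical stand-in for BIDSLayout with exact-match filters."""
    files: list[MiniFile] = field(default_factory=list)

    def add(self, f: MiniFile) -> MiniFile:
        f.layout = self
        self.files.append(f)
        return f

    def get_files(self, **filters: Any) -> list[MiniFile]:
        def value(f: MiniFile, key: str) -> Any:
            if key in ("suffix", "extension"):
                return getattr(f, key)
            return f.entities.get(key)

        return [
            f for f in self.files
            if all(value(f, k) == v for k, v in filters.items())
        ]


def get_bval(x: MiniFile) -> MiniFile:
    """Find the .bval file sharing the suffix and all entities of `x`."""
    if x.layout is None:
        raise ValueError("File is not attached to a layout")
    found = x.layout.get_files(suffix=x.suffix, extension=".bval", **x.entities)
    if len(found) != 1:
        raise LookupError(f"Expected exactly one .bval file, found {len(found)}")
    return found[0]


layout = MiniLayout()
dwi = layout.add(MiniFile({"subject": "01"}, "dwi", ".nii.gz"))
bval = layout.add(MiniFile({"subject": "01"}, "dwi", ".bval"))
layout.add(MiniFile({"subject": "02"}, "dwi", ".bval"))
```

Because the function touches only `get_files` and file attributes, it would keep working if the layout backend were swapped out, which is the point of collecting such helpers in `layout.utils`.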
Some parsing and creation helpers may also be useful:
```python
def parse_file(x: Path | str) -> BIDSFile:
    """Construct a layout-free BIDSFile"""


def new_file(
    template: BIDSFile | None = None,
    layout: BIDSLayout | None = None,
    *,
    datatype: str | None = None,
    suffix: str | None = None,
    extension: str | None = None,
    **entities,
) -> BIDSFile:
    """Generate new file

    If given a template, additional arguments are overrides.
    """
```
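
As a rough idea of what layout-free parsing involves, the sketch below splits a BIDS-style filename into entities, suffix, and extension. The function name `parse_name` and the small `ENTITY_NAMES` mapping are hypothetical; a real `parse_file` would presumably take the key-to-entity mapping from the schema, return a `BIDSFile`, and convert index-like values rather than leaving them as strings.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical subset of the short-key -> entity-name mapping; a real
# implementation would take this from the schema.
ENTITY_NAMES = {"sub": "subject", "ses": "session", "task": "task", "run": "run"}


def parse_name(path: Path | str) -> dict:
    """Split a BIDS-style filename into entities, suffix and extension."""
    name = Path(path).name
    # The extension is everything from the first dot, e.g. ".nii.gz"
    stem, dot, ext = name.partition(".")
    # All underscore-separated parts but the last are key-value pairs
    *pairs, suffix = stem.split("_")
    entities = {}
    for pair in pairs:
        key, sep, value = pair.partition("-")
        if not sep:
            raise ValueError(f"Not a key-value pair: {pair!r}")
        # Unknown keys are kept under their short name
        entities[ENTITY_NAMES.get(key, key)] = value
    return {
        "entities": entities,
        "suffix": suffix,
        "extension": dot + ext if dot else None,
    }


parsed = parse_name("sub-01/func/sub-01_task-rest_run-02_bold.nii.gz")
```

Values are kept as strings here (`"02"` rather than a padded integer), and the datatype, which lives in the directory name rather than the filename, is not extracted.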
## Todo
* API strategy
    * Release API proposal for comment
    * Options:
        * Implement new API using existing backend
        * Implement old API w/ ANCP-BIDS backend to test ANCP-BIDS (existing pybids-refactor repo)
            * Go over failing tests and categorize
                * Deficiencies in ANCP-BIDS
                    * (Check `get_bval` in particular)
            * Check ancpbids validation (ask Erdal to implement)
            * If ancpbids is performing reasonably, maybe go ahead and do new API with ancpbids
* Open API questions:
    * How to handle multiple layouts?
        * `LayoutCollection` or single `Layout`. Leaning to `LayoutCollection`
    * `layout.utils` for migrating old helper object methods