6th NGFF Community Call 2022-01-27

---
tags: NGFF, community-call
---

See: [Connection information](https://forum.image.sc/t/connection-information-for-next-gen-call-on-jan-27th-2022/61335) and [recording](https://downloads.openmicroscopy.org/presentations/2022/NGFF-community-call-2022-01-27/)

Please paste this into the Zoom chat as new people join:

:::warning
Welcome to the community call. Please be aware that this session may be recorded. Live notes for the session are available in https://hackmd.io/QfiBKHIoTZ-CJSp3q0Wykg?edit Where possible, help to structure the notes for later publication rather than commenting in Zoom's chat. Thanks!
:::

# 6th NGFF Community Call 2022-01-27

This document is a place where you can help drive what needs discussing on the 27th. Add your thoughts, needs, etc. or even new sections if need be. If there's an idea already in place that you like, give it a :thumbsup:

If you are unclear about this document, **just add a question here** and someone will tidy it up or get in touch:

* no problems yet? Excellent!
  :surfer:

## "User registration"

| Name | Institute | Twitter Handle | GitHub Handle |
|------|-----------|----------------|---------------|
| Copy | and | paste | me |
| Josh Moore | University of Dundee | notjustmoore | joshmoore |
| Sébastien Besson | University of Dundee | | sbesson |
| Jean-Marie Burel | University of Dundee | | jburel |
| David Gault | University of Dundee | | dgault |
| Ken Ho | The Francis Crick Institute | @DrKenHo | DrKenHo-crick |
| John Bogovic | HHMI Janelia | @BogovicJohn | bogovicj |
| Koji Kyoda | RIKEN BDR | | |
| Tim-Oliver Buchholz | FMI Basel | tibuch_ | tibuch |
| Norman Rzepka | scalable minds | normanrz | normanrz |
| Will Moore | University of Dundee | | will-moore |
| Guillaume Gay | France BioImaging | morpholg | glyg |
| Matthew Hartley | EMBL-EBI | | mrmh2 |
| Jean-Karim Heriche | EMBL-HD | | jkh1 |
| Christian Tischer | EMBL-HD | tischi | |
| Constantin Pape | EMBL-HD | cppape | constantinpape |
| Stephan Wagner-Conrad | ZEISS | | swg08 |
| Andras Lasso | PerkLab, Queen's University | lassoan | lassoan |
| Sebastian Rhode | ZEISS | sebisabs | sebi06 |
| Martin Schorb | EMBL-HD | | martinschorb |
| Ilan Gold | TUM/HMS | ilanbassgold | ilan-gold |
| Bugra Oezdemir | EMBL - EuroBioImaging | bugraoezdemir | |
| Davis Bennett | HHMI Janelia | @davisvbennett | d-v-b |
| Kevin Kozlowski | Glencoe Software | None | kkoz |
| Jeremy Maitin-Shepard | Google | | jbms |
| Melissa Linkert | Glencoe Software | | melissalinkert |
| Jason Swedlow | University of Dundee/Glencoe Software/Wellcome Leap | jrswedlow | jrswedlow |
| Lee Kamentsky | MIT / Kwanghun Chung Lab | paste | LeeKamentsky |
| Erin Diel | Glencoe Software | dielwithit | erindiel |
| Eric Perlman | Yikes LLC | perlman | perlman |
| Trevor Manz | Harvard Medical School | trevmanz | manzt |
| Erick Martins Ratamero | The Jackson Laboratory | erickratamero | erickmartins |
| Caterina Strambio De Castillia | UMass Medical School | StrambioLab | strambc0 |
| Mina Gheiratmand | Glencoe Software | | mgheirat |
| Damir Sudar | Quantitative Imaging Systems LLC | | dsudar |
| Fernando Cervantes Sanchez | The Jackson Laboratory | | fercer |

_Feel free to link your projects at the bottom using `[shortcut]: URL` markup._

## Agenda

### Welcome & Introductions {<30m}

* Round table (Josh)

### Topics {90m}

* Report: Zarr Update (Josh Moore) {10m}
  - Sharding: [prototype I](https://github.com/zarr-developers/zarr-python/pull/876) and [prototype II](https://github.com/zarr-developers/zarr-python/pull/947)
  - Community manager & [Extension Process](https://github.com/zarr-developers/governance/issues/14)
  - [xarray](http://xarray.pydata.org/) & https://cfconventions.org/ (e.g. [coordinate subsampling](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.9/cf-conventions.html#compression-by-coordinate-subsampling))
  - [Geo](https://github.com/zarr-developers/zarr-specs/issues/124), NASA, Brain, etc.
* Report: **v0.4** (Constantin Pape/Will Moore/Seb Besson) {20m}
  - Intro: [v0.4 proposal](https://github.com/ome/ngff/pull/57) (Constantin)
  - Strategy: merge the current v0.4 [PR #57](https://github.com/ome/ngff/pull/57); propose the changes for switching `scale` and `offset` to dict and dropping `axisIndices` in a separate PR
  - Demos (Will)
    - "Look there's a scalebar" (add screenshots here)
      - example dataset: https://minio-dev.openmicroscopy.org/idr/v0.4/2022-01-05/idr0062/6001240.zarr
      - opening in vizarr: ![](https://i.imgur.com/X5rfkmJ.png)
      - opening in BDV/MoBIE: ![](https://i.imgur.com/unEoajV.png)
      - In napari: ![](https://i.imgur.com/jsZvy9u.png)
  - HCS (Seb)
    - [specification proposal](https://github.com/ome/ngff/pull/24)
    - [example plate](https://uk1s3.embassy.ebi.ac.uk/0.4_samples/9512.zarr/)
    - opening in vizarr: ![](https://i.imgur.com/uby1NMs.png)
* Proposal: "Transforms" (John Bogovic) {???m}
  - consider adding new transform types in future versions
    - [the list from John's initial
proposal](https://github.com/saalfeldlab/n5-ij/wiki/Transformation-spec-proposal-for-NGFF#transformation-types)
  - consider ("views/spaces") [see discussion here](https://github.com/ome/ngff/pull/57#issuecomment-1021494911)
    - [a sketch of how it could work](https://github.com/saalfeldlab/n5-ij/wiki/Named-Spaces-for-Transformations)
  - Collections [spec issue #31](https://github.com/ome/ngff/issues/31)
* Report: [Validation](https://github.com/ome/ngff/pull/69) (Josh & Will Moore) {5m}
* Report: [Rendering](https://github.com/ome/ngff/issues/78) (Jean-Marie Burel) {5m}
  - [specification proposal](https://github.com/ome/ngff/issues/78)
  - https://github.com/saalfeldlab/render/
* Proposal: "Metadata" (Josh Moore) {10m}
* Progress: "BDZ - quantitative data or ROIs" (Koji Kyoda/Ken Ho) {10m}
  - https://forum.image.sc/t/polygon-and-other-roi-annotations-in-ome-zarr/47990/11
* **(Please feel free to list topics here)**

### Next steps (for those who are interested) [15m]

* Next meeting? Host?
* Roadmap for upcoming specs & implementations?
* ...

----

## Session 1 live notes

### Intro (misc. notes)

* Bugra: conversion at BioHub into OME-Zarr, in the cloud
* J-K/Tischi: "all images at EMBL in OME-Zarr within 5 years"
  - also EOSC and looking into Zarr in R (ZarrR)
* Guillaume: looking into French National OMERO; interested in how access will be zarr-ified
* Ken/Koji: ROIs (segmentations) in 3D in NGFF
* Martin: "huge data"
* Norman: moving webknossos to OME-Zarr
* Sebi: responsible for CZI image format and AI solutions incl. the "model" format at ZEISS

### Zarr specs (10:18 CET)

* Josh: Zarr weekly call, v2 vs v3, moving towards a specification applicable to both HDF/Zarr, sharding. New implementation in Julia, still no MATLAB. Decide on extension proposal. Community manager hired to organize this process. How to avoid drift in similar specifications (e.g. multiscale) between communities?
  - J-K: like having an "imaging specification" level, since it might be hard to get everyone into the zarr process. Lots of similarities, like point clouds (Josh: e.g. PANOSC)
* Seb: TODO need to specify that OME-Zarr is Zarr V2 only in order to have a proposal.
  * https://github.com/ome/ngff/issues/80
* Tischi: re: cross-language, bioformats2raw is the reference implementation?
  - Seb: GS-specific, but no write permissions
  - Josh: primary advantage is parallelism when writing
  - Seb: custom top-level group (collection-like), everything underneath is compatible with OME-Zarr

### v0.4 (10:30 CET)

* https://github.com/ome/ngff/pull/57
* Description: better description of "axes" & initial transformations
* Starting point for more general transformations, but only translation & scale for now.
* Propose:
  - merge current state
  - then move spec from axis index to axis name for easier interpretation.
  - get prototypes working and then finish up
  - ETA: 1-2 weeks
* J-K: how would you represent a temporal sequence that is not equally spaced? Could have the timestamp, but how would you represent it here? Need the _order_ for a sequence.
  - Seb: e.g. multi-cycle imaging. Perhaps a "cycle" dimension that is unit-less (could inject a timestamp later)
  - Constantin: nothing yet. You could add a custom label, but it would not be cross-compatible. Formally would need to have a proposal. Question is perhaps: does this help you to express what you want (in a proposal)?
* Constantin: non-uniform steps should be straightforward to introduce as a transformation that gives spacing for one dimension.
* Matthew: restrictions on the encoding?
  https://github.com/ome/ngff/issues/81

### Transformations (10:52 CET)

* Constantin for John Bogovic
* Defining affine & non-regular offsets
* B-splines and more complicated would/could follow
* v0.4 is based on some of John's proposals
* need to re-evaluate the axisIndex / axisName subsetting logic
* transformations likely to be an independent entity in the spec in order to be re-usable, i.e. not just for single images but for composition
  - GG: also on point cloud? Yes.
* Martin: **VERSIONING** of metadata when, e.g., a transformation changes.
  - Josh: could be similar to the 3D / 2D use case.
  - J-K: would argue against it. Simpler. (Josh: hour copy cost on S3)
  - GG: "diff"? Constantin: easy on the metadata, but harder for the binary
  - Martin: treat them externally like a view in MoBIE? CP: need multiple images in the same space, plus other use cases.
  - Josh: https://www.datalad.org/ (potential for splitting meta- and data trees)
  - Norman: https://github.com/janelia-flyem/dvid
  - Martin: https://github.com/saalfeldlab/render/ for alignment/registration
  - Tischi: looking at collections? different "views"?
  - Martin: just need the metadata _somewhere_ (don't want to create a mess)

### Collections/validation (11:07 CET)

* Josh: define collections of multiscale images, i.e. some form of loose relationship as in derived images
* reference issue https://github.com/ome/ngff/issues/31
* brings up the idea of subclassing. From simplest (JSON array of images), adding metadata, HCS use case being a specialized version of a collection
* Started looking into options to get into richer representation. Investigating JSON-LD framework and started to validate these concepts.
* J-K: YAML is a superset of JSON. But YAML does not give validation yet.
* Went with JSON schemas as the simplest version for now. Introduced for every version of the spec e.g. https://github.com/ome/ngff/blob/main/0.3/schemas/image.schema.
* Something to think about for 0.4?
* JSON schema limitation is extensibility.
  How to allow the community to define their concepts?
* Constantin: JSON schema is a great thing to have. What prevents users from forking and proposing their extensions? Josh: the limitation is the ability of end users to load this extension.
* issue of building JSON framework vs using JSON framework
* Tischi: question of whether it is critical until there is a clear extension use case?
* J-K: think extensibility is a pretty high priority.
* Josh: something we could apply for transformations?

### Rendering (11:21 CET)

* J-M: reviewed supported elements and compared to other tools like BDV. Tried to capture all elements in proposal.
* See https://github.com/ome/ngff/issues/78
* Complex but allows for advanced concepts such as slicing
* Will try out these concepts in own CLI plugins
* Open questions regarding look-up tables, i.e. name-based conventions vs shipping binaries etc.
* Josh: overlap with the concept of "views" in MoBIE. Question of the level, i.e. at the level of the image group or above
* Tischi: immutable image properties -- like transform. but brightness/contrast feels more mutable. therefore not a core image property. perhaps a "default"?
* J-K: what is the job of the rendering? how does a viewer decide what to show? perhaps "Default view" for the rendering is useful.
* J-M: defining these defaults e.g. channel color is the intent of the proposal.
* Ken: coordinate system standardized? right-hand rule?
* Constantin: related to the transformation discussion (came up there already)
* Constantin: Transformations can also be relevant for the rendering; for example BDV uses an affine to map from data to display
* Tischi: value of drag-n-drop into napari/Fiji and have the image just look good
* Ken: Does the rendering also include stitching and multiple cameras? Or is that transformation?
* Will: Stitching not mentioned yet.
  Angles and views have been
* Constantin: Probably unclear, relation between rendering and transformations needs to be defined
* JK: renderings could reference transformations

### Metadata (11:33 CET)

* Josh: trying to decide next body of work. Transformations might be happening.
* From OME side, need metadata equivalent to OME-TIFF. Goal being to import OME-NGFF into IDR
* Driven by recent IDR submissions where data is converted into OME-Zarr
* OME-TIFF/OME-XML has some form of collection concept
* All metadata captured by Bio-Formats is turned into OME-XML when using bioformats2raw, and used in turn when running raw2ometiff. Not defined in the OME-NGFF specification
* Paths:
  * store/read OME-XML file, i.e. standardize the bioformats2raw layout. Already an issue e.g. in Viv/Vitessce
  * convert into OME-JSON e.g. via JSON-LD
  * embed somewhere else e.g. as an array
* Timeline would be before May/June (possibly OME community meeting TBD)
* Jean-Karim: would be simpler to convert all the metadata
* Tischi: hard to have all these concepts as part of the specification. Also worry about opening very large JSON in the cloud
* Jean-Karim: coming back to extensibility proposal
* Sebastian: some options allowing to control the amount of metadata that is being stored
* Ken: relationship between NGFF metadata and MicroMeta/QUAREP
* Josh: at OME-XML level, Zarr could point at OME-XML and/or Micro-Metadata
* Ken: external concept to the specification? Josh: TBD
* J-M: also issue of the spec becoming obsolete quickly with respect to acquisition
* Ilan: also 4DN extension of OME-XML. Josh: effectively a fork which points back at the extensibility issue with (OME-)XML specification

### BDZ (11:49 CET)

* Ken: being developed independently of OME-NGFF at the moment
* Initially developed HDF5-based format (BD5) for quantitative data. Working on BDZ
* Embed or standalone to be decided
* Specification for "sphere" data with (x, y, z, t) information
* Issue with chunking other shapes e.g.
  polygons
* Might be related to the future of ROIs within OME-NGFF
* Constantin: also initial work on tables specification. Feels there is some overlap.
  - https://github.com/ome/ngff/pull/64
* Josh: underlying zarr issues: ragged arrays; not yet supported in zarr
* Generally points at Regions of Interest as a next step

### Closing Points (11:57 CET)

* Josh
  * Anyone want to take on a spec extension? Constantin agreed to walk through the process.
  * If anyone wants to support ome.zarr in their tool and needs help, please reach out!
  * Anyone want to host a meeting on it?
  * Next meeting may be part of OME meeting
* Ken: hackathon next week?
  * Tischi: won't be in Milan. Meeting online. Starting Monday.
  * Ken: will use it as an excuse for working on MATLAB ome-zarr.
* Juan:
  * https://j.mp/imagesc-island for the hackathon!
  * Tischi: agreed.
* J-M: re: next topic
  * look at view/rendering during hackathon? too early?

----

## Session 2 live notes

### Intro (misc. notes)

* Davis: "fewer, better formats ... with good multiscales metadata"
* Lee: "petascale brain hemisphere in NGFF"

### Update on zarr (Josh)

* support for Julia (now in standard I/O library)
* too many C/C++ implementations
* missing: MATLAB and R
* CZI funding: zarr & hdf5 interop
* sharding extension, work by scalable minds
* zarr v3 specification ongoing
* zarr for geo-spatial data (is using ngff multiscale spec), how to implement this connection?
* imaging spec more general than ngff?

### v0.4 & Transformations (18:22 CET)

* Constantin: reversing the conversation from this morning, axisIndices may be better for the future (binary transformations)
* John (for the group): when a dataset contains a mix of types, but if just space then it doesn't matter.
  - so since we're only doing scale & translation, do we exclude it for now? or include it now?
* John: these simpler transforms are just moving pixels to physical space
* Andras: for storing NIfTI, NRRDs would need to have handedness
  - Affine would suffice (and includes direction). Perhaps consider it. Also removes redundancy. ITK struggling with this for a long time.
  - Constantin: longer discussion in the issue, but his a roadblock. Concern that there are viewers that don't support affine. (e.g. Fiji)
  - @John: will create an issue with the more fleshed out transform spec for the transforms, and will tag Andras, Constantin, Jeremy and others
* Jeremy: solution for viewers not supporting affine transforms is possibly to just fail. neuroglancer uses ND affine everywhere though it only supports, e.g., rotation in 3.
* Josh: give ourselves the freedom to make a breaking change
* Davis: the definition of a type (function that maps a set of axes to another set of axes)
  - Parser could detect input axes length = output axes length
* Andras: is this also transforming RGB to grayscale? Perhaps call it "SpatialTransform"
  - Constantin/John: "AxisTransform", Jeremy: "CoordinateTransform" **RELEVANT FOR IMMEDIATE CHANGES**
  - Jeremy: serving multiple purposes. Just feeding to the viewer. Might be other use cases, e.g. representing a cut-out of the data. If you're writing them back, then the representation may matter more. Transposing them might be difficult because it would be hard to understand the original resolution.
  - **Take away** for the immediate decision: 1.) only "scale", which needs to be of the same length as "axes"; 2.) "axesNames" (optional) to specify the subset of axes

### Next Transformations (19:02 CET)

* Moving to https://github.com/saalfeldlab/n5-ij/wiki/Named-Spaces-for-Transformations
* Currently calling them "named spaces".
* Transforms from "raw/original" space to e.g. "physical" space.
  - can also transform to "cutout physical" space
  - all views on the original data
* Andras: came to a similar conclusion after many years of experimenting (even though different field)
  - clean, non-redundant, also the direction
  - inverses can be computed, all straightforward
  - :+1: for the "invert" flag.
* Ken: PSF? That would change the pixel values, this only changes the _grid_.
* Jeremy: "Coordinate Transform" is also used. Maybe nicer.
  - Perhaps distinguish between **continuous and discrete spaces** (programmatically you eventually want an array)
  - John: good idea but would like to hear the use cases
  - Martin: channels are discrete
* Will: in which order does it get applied? Andras: use affine! ("multiply from the right")

### BDZ (19:16 CET)

* issue about needing non-uniform chunk size
* Davis: wouldn't suggest storing in zarr things that aren't naturally arrays. Not designed for it.
* Jeremy: interested in talking more. in neuroglancer, there's an annotation format, multiscale spatial index of annotations
  - could collaborate on something new
* Andras: polygon is out of discussion?

### [Rendering](https://github.com/ome/ngff/issues/78)

* specifying rendering setting
* channel groups
* "views" - how do these overlap with transform views (if at all?)
* transforms / rendering settings may need to live globally? or higher in the hierarchy depending on users / use cases

### [Validation](https://github.com/ome/ngff/pull/69)

* Developed from the [Collections](https://github.com/ome/ngff/issues/31) issue
* there exist many related collections specifications
* v0.4 will have some version of validation

### driving usage

* users of IDR must submit ome-zarr
* "all data at embl must be ome-zarr within N years"!
* Ken - people are worried about abandoning proprietary formats?
* Tischi - the original data are stored on tape, ome-zarr is what we work with
* Josh M - we will try to get support to get that metadata into ome-zarr and into omero
* where and how will the metadata be stored? there are some choices + tradeoffs - to be decided
  * add extra files?
  * or strip the metadata out of those files and add to the zarr json
  * having an extra OME-XML file breaks the fewest things now
* Q for zarr people - is it valid to put extra files in a zarr?
  * Alistair of zarr declared "yes"
  * https://github.com/zarr-developers/zarr-specs/issues/112
* Where is the intersection of OME attributes (like voxel size) & Zarr scale/transform metadata?
* Tischi - Worry of having metadata jsons be too big / slow to read (1GB!?)
* Davis - that's not metadata, that's data (Josh: :laughing:)
* Will S - there exist SIMD json libs that parse json quickly
* Caterina - https://www.nature.com/articles/s41592-021-01327-9
* Tischi - if we "chunk" the metadata then it could be ok, because one could decide not to load all of it.
* Caterina - scope + settings jsons from micro-meta are in the few kb range

## Next meeting

* When?
* Josh - please let the community know if you start a new implementation
* Damir - please use permissive licenses!

### "Metadata" Proposal

*

----

[announcement]: https://forum.image.sc/t/next-call-on-next-gen-bioimaging-data-tools-2022-01-27/60885
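
As a rough illustration of the Session 2 take-away for v0.4 (a "scale" with one entry per axis), the following Python sketch checks that length constraint and maps an array index to physical coordinates. The axis names, the scale values, and the helper `index_to_physical` are assumptions made for illustration; they are not taken from the v0.4 specification text or from any existing library.

```python
from typing import Dict, Sequence

# Hypothetical metadata values; names and numbers are illustrative only.
axes = ["t", "c", "z", "y", "x"]
scale = [1.0, 1.0, 0.5, 0.25, 0.25]


def index_to_physical(
    index: Sequence[int], scale: Sequence[float], axes: Sequence[str]
) -> Dict[str, float]:
    """Multiply each index entry by the scale factor of the matching axis."""
    # The take-away from the call: "scale" must have the same length as "axes".
    if len(scale) != len(axes):
        raise ValueError("scale must have one entry per axis")
    if len(index) != len(axes):
        raise ValueError("index must have one entry per axis")
    return {name: i * s for name, i, s in zip(axes, index, scale)}


coords = index_to_physical([0, 0, 10, 100, 200], scale, axes)
print(coords)
```

A translation, if adopted alongside scale, would be one more per-axis list added after the multiplication; it is left out here because the notes record only "scale" as part of the immediate decision.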