---
tags: NGFF
---
# IDR-NGFF 2020-W44
## 2020-10-30
### Feedback from the calls?
- JMB: people want to move forward. Need to reach out to Java community.
- Seb: C/C++ also missing (following Java)
- JMB: showing performance differences there will be difficult (no bindings). Metrics in Java and Python are easier.
- JRS: WIP but a lot of constituencies. Support & attention. Need to make clear our long-term intentions.
- Seb: Roadmap incl. the most active players
- JRS: writ large, how will I use it, how do I incorporate this into my institution? e.g. Nico EPFL -- how do *others* write grants based on this?
- Petr: where is OMERO in this picture? Josh: moving to OMERO 6 (support reading files in the cloud)
- Must someone use OMERO?
- Josh: Not in the case of the prototype, i.e. one can create, upload, and visualize a dataset
- Probably comes into play when having many datasets. Will need some database for querying!
- JRS: numerically OME = "OME-TIFF". (Mostly deal with that through Bio-Formats) But our focus is largely on OMERO ... since the other is "done". But cf. the programming languages.
- J-m: marketing slide? About what happens to your original data.
- Seb: build on DMP that everyone is writing (i.e. text to replace Bio-Formats and IDR as the solution)
### Clients
- J-m: could imagine figure of the IDR data in S3 (Seb: OME.figure. Josh: IDR.figure!)
- Seb: Spec decision re: new changes like plate name. Need a process. Open until the end of the month or separate milestones?
- Decision: keep open (at the level of omero-ms-zarr/spec) + no bump of IDR layout
### Datasets
-
### Formats
-
### Infrastructure
-
## 2020-10-29
### Clients
- J-M:
- CP notebook using new plate layout ready (idr0002). Data retrieved from s3.embassy
- metadata read from IDR
- TODO: Tag repo when PR is merged
- next: working on idr0033
- Will
- handling empty wells
- definitive list of datasets?
- Josh: see in https://hackmd.io/_sftykiGR9mSyUan3l1WmA
### Community call
* Will: videos
* Will: where are the pages?
- original page: https://forum.image.sc/t/upcoming-calls-on-next-gen-bioimaging-data-tools-starting-oct-29/43489
- private post: https://forum.image.sc/t/connection-information-for-next-gen-call-on-oct-29th/44210/20
- agenda/notes: https://hackmd.io/_sftykiGR9mSyUan3l1WmA
* Process
* if someone asks, add them to the private post which links to the Markdown etc
* Mention videos are available to watch
* Notifications post-mortem
- JRS: don't think it works because it's too sensitive to personal settings
* Josh: breakouts?
- social breakout only
- 3-5 people
- needs leader/topic/etc. to get something done.
### Datasets & Formats
- Seb: omero-cli-zarr releases up to 0.0.5 (Spec more or less timestamped)
- Available datasets
- idr0033
- idr0002 (whole timeseries + time 0 only)
- idr0004
- Seb: https://github.com/ome/omero-ms-zarr/pull/75
- captures current implementation of the hierarchy + metadata
- next priorities (November?)
- multiple acquisitions (incl. sparseness)
- well vs column vs row spec
- metadata distribution/redundancy
- spatial context (multiple fields of view)
- more suggestions from community e.g. label names
- Simon: encourage people to open PRs against the spec
### AOB
- Simon: Nada on infrastructure
- Will testing napari 0.4.0
- Dom/David: good
## 2020-10-28
### Clients
- Josh: See demo conversation
- Will: sent video
- Will: looking at omero-cli-zarr download
- Simon: using download code vs S3 client?
- Josh: if something is not listable, downloader code will be useful
- Simon: what if half of your chunks are empty. Getting lots of 404/403
- Driven by Australian use case --> (use awscli)
- https://github.com/ome/omero-ms-zarr/issues/74#issuecomment-717212598
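
One way to read the 404/403 point: Zarr treats a missing chunk as "use the fill value", so a downloader can map those statuses to "absent" rather than failing. A minimal sketch under that assumption, standard library only; `fetch_chunk`, `MISSING_STATUSES` and the `opener` parameter are hypothetical names for illustration, not part of omero-cli-zarr or any S3 client.

```python
import urllib.error
import urllib.request

# Assumption: a non-listable store may answer 403 instead of 404 for an absent key.
MISSING_STATUSES = {403, 404}


def fetch_chunk(url, opener=urllib.request.urlopen):
    """Return the chunk bytes, or None if the store reports the chunk as absent."""
    try:
        with opener(url) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code in MISSING_STATUSES:
            return None  # empty chunk: the caller falls back to the fill value
        raise  # any other status is a real failure
```

Treating 403 as "absent" also hides genuine permission errors, so this is only reasonable for buckets known to be publicly readable.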
### Datasets
- Ordered datasets
- idr0033 (complete)
- idr0002 (complete)
- Seb: bisected issues associated with the idr0033 conversion.
- ScreenReader file leak: will affect any IDR studies using .screen files. OOM due to the rendering metadata addition: will affect the conversion of any large number (>100) of images
- Will opened PR closing resources. Review looks good and looking for a tag of omero-cli-zarr.
- J-m: this needs to be clear to people using the (Python) code. Have something in the Java code too.
- Seb: throttle number of servants. Simon: set low limit on merge-ci?
- J-m: also need bold warning, "YOU MUST CLOSE THIS"
- All the above :+1:
- Targets
- Today: production datasets export for Thursday?
- Next: ScreenReader (to not bring IDR down)
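
The "YOU MUST CLOSE THIS" point can be enforced in Python rather than only documented. A minimal sketch using `contextlib.closing`; `DemoReader` and `export_all` are hypothetical stand-ins for illustration, not the omero-cli-zarr or OMERO API.

```python
from contextlib import closing


class DemoReader:
    """Hypothetical stand-in for a reader that holds an open file handle."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def read(self):
        if self.closed:
            raise ValueError("reader is closed")
        return f"pixels of {self.name}"

    def close(self):
        self.closed = True


def export_all(names, reader_factory=DemoReader):
    """Read each item in turn, closing every reader before opening the next."""
    results = []
    readers = []  # kept only so the caller can inspect the closed state
    for name in names:
        reader = reader_factory(name)
        readers.append(reader)
        with closing(reader):  # close() runs even if read() raises
            results.append(reader.read())
    return results, readers
```

With this shape, a conversion loop over hundreds of images holds at most one open reader at a time.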
### Formats
- Seb: working on omero-ms-zarr specification PR for tomorrow
- J-M: plates formats on minio-dev.
- adjusting cellprofiler notebook. need plate id.
- Josh: currently paths are opaque and paths can't be discovered
- Will: have multiple versions of each. Simon: even multiple S3 servers
- Seb: will need a registry
- Josh: Propose s3.embassy.ebi.ac.uk/idr/v0.1/idr/share/20201029
- J-M: adjusting https://github.com/ome/omero-guide-cellprofiler/blob/master/notebooks/idr0002_zarr.ipynb
- Simon: Trying to update conda-bioformats2raw
- see issues. it's messy.
- https://github.com/ome/conda-bioformats2raw/issues/3
- https://github.com/glencoesoftware/bioformats2raw/issues/62#issuecomment-717808003
- David: making progress, don't need anything. Hopefully a first draft this week.
### Infrastructure
- Simon: Fighting with molecule/travis on the ome-zarr-dev1 PRs
### Misc
- J-m: focusing on getting notebook working. Video? (Simon: link to notebook is more powerful)
- Josh: https://github.com/orgs/ome/projects/13
- Petr: training/NGFF -- horrific misunderstanding?
- Josh: same structure as I2K. Tischi involved in NGFF workflow, proposed to redo that for GBI
## 2020-10-27
### Infrastructure
- Simon: vm (ome-zarr-dev1) is present with docker. Mounted files as v4 (hopefully that's ok)
- ome-zarr-dev1.openmicroscopy.org
- single place to do all conversions (rather than needing the devspaces)
- Seb: separate docker partition (since they can grow quickly)? No. Dev only.
### Datasets
- Seb: omero-cli-zarr bringing down IDR. idr0033 is leaking. Prioritizing that.
- Josh: move to idr-testing? (Seb: readwrite would also work)
- Simon: try not to paste idr-next/idr-testing into public comments
- Will: not trying to convert at the moment.
- Seb: to retest workflow against idr-testing, trying the new VM (Will: see "NGFF Workflow")
### Formats
- Josh: one possible use/context for ZarrReader is I2K at the end of November
### Clients
- Will: comment from napari guys re: performance issues possibly from dask.concatenate
- dask.map_blocks is supposedly better. Trying to use that. Not grok'd yet.
- Then working on video.
- Latest omero-cli-zarr spec isn't documented
- Seb: can open a PR against omero-ms-zarr. Slightly split world since we have HCS and non-HCS data. Can prioritize that tomorrow.
- Current image spec: https://github.com/ome/omero-ms-zarr/blob/master/spec.md
- Seb: release omero-cli-zarr?
- Will: have one PR open to write metadata first, but otherwise would be good to have it released for Thursday. (Seb: also a versioning PR from Simon. Release as is, and then list PRs tomorrow.)
## 2020-10-26
### Paper
- Jason: thinking about a 300-600 word letter, "Commentary" in Nature Methods on the NGFF work starting with the conversions.
- Less political and just addressing the technical reasons for why we're doing it. Enabling object stores. Data that's not otherwise manageable. Etc.
- Often useful to show that something is a quantifiable improvement (faster, cheaper, etc.)
- Accessing KLB/lightsheet off of EBI S3
- Worth a discussion about what the comparison would be.
- Text to start appearing.
- Simon: links to youtube videos in commentary? JRS: in principle yes, but academic elitism says things should be DOI'd.
- Josh: doing well to defend against possible arguments.
- JMB: depends how far we want to go. Showing all possibilities and what it enables requires several metrics. Plates in napari, segmentation on KLB in ..., etc. Too much for a commentary? JRS: perhaps less is more, but can include in supplementary.
### Clients
- Will: Demo on 29th (3 days) - what are we going to show? - what needs doing?
- little way off yet. what do we want to show?
- currently we have one idr0002 plate up (and truncated version) and idr0033 needs to be regenerated.
- Josh: could see doing a 3 spec review. Will to do a viewer video? (If can skip the initial loading time)
- J-m: also points to the practical aspects of bare minimum to load.
- Will: do we need to make do with the spec or update the metadata?
- Josh: I assume that we need more metadata
- Simon: need to figure out where the slow down is coming from
- Will: don't know what's moving down the wire
- napari:
- See https://github.com/ome/ome-zarr-py
- `napari https://minio-dev.openmicroscopy.org/idr/idr0002-heriche-condensation/plate1_1_013/422.zarr`
- loads slowly - nearly 2 mins from hitting Enter till plate loads
- Same for `napari https://minio-dev.openmicroscopy.org/idr/idr0002-heriche-condensation/plate1_1_013/422_no_T/422.zarr` (no Time-lapse)
- Need a different strategy?
- What's the file chunk-size?
- For 'No_T' set: (1, 2, 1, 64, 84)
- With T (329, 2, 1, 64, 84)
```
$ du -hsc 422_no_T/422.zarr/0/A/*
3.9M 422_no_T/422.zarr/0/A/1
3.9M 422_no_T/422.zarr/0/A/10
3.7M 422_no_T/422.zarr/0/A/11
3.8M 422_no_T/422.zarr/0/A/12
3.9M 422_no_T/422.zarr/0/A/2
3.8M 422_no_T/422.zarr/0/A/3
3.8M 422_no_T/422.zarr/0/A/4
3.8M 422_no_T/422.zarr/0/A/5
4.0M 422_no_T/422.zarr/0/A/6
4.2M 422_no_T/422.zarr/0/A/7
3.9M 422_no_T/422.zarr/0/A/8
4.2M 422_no_T/422.zarr/0/A/9
47M total
```
- https://forum.image.sc/t/connection-information-for-next-gen-call-on-oct-29th/44210/20
- Basically, how to set up a break out that you would potentially want to have.
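
For the chunk shapes noted above, a quick calculation of the uncompressed size per chunk may help the "different strategy?" discussion. This is a sketch only: the notes do not state the pixel type, so the 2-byte item size is an assumption, and `chunk_bytes` is a hypothetical helper.

```python
from math import prod

ITEMSIZE = 2  # assumption: 16-bit pixels; the dtype is not given in the notes


def chunk_bytes(chunk_shape, itemsize=ITEMSIZE):
    """Uncompressed size in bytes of one chunk with the given shape."""
    return prod(chunk_shape) * itemsize


no_t = chunk_bytes((1, 2, 1, 64, 84))      # 'No_T' set
with_t = chunk_bytes((329, 2, 1, 64, 84))  # full time-lapse in one chunk

print(no_t)            # 21504
print(with_t)          # 7074816
print(with_t // no_t)  # 329
```

Under that assumption, showing a single timepoint of the time-lapse version means fetching roughly 7 MB (before compression) per 64 x 84 tile, 329 times more than the 'No_T' layout.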
### Datasets
- idr0033 needs pyramids. (Dom)
- Will: rsync'ing onto idr0-slot3 took many hours
- Simon: long-term need to think about the number of files
### Formats
- Everyone else is good (time for IDR)
### Infrastructure
- Tabled
- mounting minio's objectstore on idr1-slot2
- Setting up `ome-zarr-dev1.openmicroscopy.org`, problem with Docker installation at the moment
- Short hostname?
----
## Template
### Clients
-
### Datasets
-
### Formats
-
### Infrastructure
-