---
tags: scverse, open2c, data-structures, zarr, meeting
---
# 2022-11-09: open2c + scverse data-structures meeting
*Attendees: Trevor Manz, Isaac Virshup, Nezar Abdennur, Geoff Fudenberg, Aleksandra Galitsyna*
## Agenda
* Presentation
* HiGlass (10 min) [Trevor's slides](https://docs.google.com/presentation/d/1TujqZGOmFewW1v45rIOT9vj5YZaqRZ6_WHVC1G5tiOg/edit?usp=sharing)
* AnnData (10 min) [Isaac's slides](https://drive.google.com/file/d/1XGO9Etil_0vFPPOSsG_AmqS3WPtzAcvu/view?usp=sharing)
* File formats / interoperability
* AnnData + multivec
* Zarr, Parquet, Kerchunk, Dask
* Pairwise or higher-order data
* Performance / scaling
* Multiscale data
* Memory usage
* hg
* higlass tileset "protocol"
* Standardized tileset adaptors for multiscale
* Single-cell and single-locus embeddings
## Notes
* https://pangeo-forge.org
* Zarr as a reader for HDF5
* Trevor's presentation
* HiGlass tilesets and data formats
* HiGlass supports many formats
* higlass client
* Consistent API for retrieving data regardless of input format
* Especially multiscale
* Abstraction over 1D/2D genomics formats
* Abstraction
* TilesetInfo – metadata about the pyramid
* Tilesize, pyramid shape
* TileData
* Can be pixel or sparse
* Q: Multiscale for SNPs?
* SNPs are supported
* Multiscale is a bit more complicated
* clodius implements tilesets for genomics formats
* https://github.com/higlass/clodius
* HiGlass visualization defined in a JSON
* Server contains a set of visualizations and datasets
* Clodius does the range querying
* higlass-python is the Python API for use in a notebook environment
* v2 (https://github.com/manzt/hg)
* User-defined tilesets; the user defines the functions
* Idea: representing a tileset as a Zarr dataset
* Once it is a Zarr dataset, maybe you don't need a server process
* Or maybe you don't need one with Kerchunk
* ZEP 3 discussion – non-fixed-size chunks
* https://github.com/orgs/zarr-developers/discussions/52
* Isaac Presentation
* Multivec, can we overlap?
* Spatial indexing?
* Can also be used in genomic coordinate systems (R-trees used internally for some formats)
* Z-curve indexing (https://en.wikipedia.org/wiki/Z-order_curve)
* Arrow
* Accessing single rows from coolers
* Follow-up – on Zulip?
* Multiscale access
* Spatial indexing
* Bioframe PRs
* Common file formats for bioinformatic data types
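
For reference, the Z-curve indexing mentioned under spatial indexing can be sketched as below. This is only a minimal illustration of a Morton (Z-order) code for 2D integer coordinates such as tile or bin positions. The function names and the 16-bit coordinate width are assumptions made for the example; it does not describe how HiGlass, cooler, or any other format discussed here actually indexes data.

```python
from __future__ import annotations


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Return the Morton (Z-order) code of (x, y).

    Bit i of x is placed at position 2*i and bit i of y at position 2*i + 1.
    `bits` is an assumed coordinate width; larger values are truncated.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def deinterleave_bits(code: int, bits: int = 16) -> tuple[int, int]:
    """Invert interleave_bits: recover (x, y) from a Morton code."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y


# Hypothetical usage: order the tiles of a 4 x 4 grid along the Z-curve,
# so that tiles close together in 2D tend to be close together in 1D.
tiles = [(x, y) for y in range(4) for x in range(4)]
ordered = sorted(tiles, key=lambda t: interleave_bits(*t))
```

Sorting by this key groups each 2 x 2 block of neighbouring tiles together, which is the locality property that makes the curve attractive for laying out 2D data in a linear chunk order.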