---
tags: scverse, meeting, performance, scaling, anndata
---

# 2022-11-02: scverse scaling discussion

*Attendees:* Adam, Rahul, Danila, Isaac

## Agenda

* Short presentations on current work
    * Isaac (10 min)
    * Danila (10 min)
* Discussion on:
    * Needs from scvi-tools
    * Data access patterns + use cases
    * Prioritization

## Notes

* Question from Danila
    * IO for JAX arrays
        * AnnData does not know how to handle JAX arrays
        * Isaac: this should be easy
    * JAX + PyTorch
* Adam
    * Fast random access to X, layers
    * Concurrency
        * Issues with memmap
* Danila – PostData
    * Working with multiple anndata objects
    * `PostData` object
        * Collection of tables with named axes
        * Constraints on axes
        * Hashed data + unique names for tables
        * Query based on sets of axes/tags
        * `map`: map a function over input
        * `bind`
    * Written as:
        * directory of parquet + json
    * `Shadows`
        * Calling scanpy on a shadow, cached loading on access
        * Cannot subset
    * Parquet storage
        * `uns` as json
        * Separate complex stuff, then merge
        * Can use SQL files on `parquet`
* Usage from Adam
    * Random access asap
    * Multithreaded access from PyTorch
    * Benchmark case?
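
On the open "Benchmark case?" item: one possible starting point is sketched below. It is not an AnnData benchmark and was not discussed in the meeting. It assumes a dense float32 matrix in a `.npy` file as a stand-in for `X`, read through `numpy` memory mapping, with a thread pool standing in for concurrent loader workers (real PyTorch `DataLoader` workers are processes, so this only approximates that access pattern). The function names `make_matrix`, `read_batch` and `benchmark` are hypothetical.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def make_matrix(path, n_obs=2_000, n_vars=200, seed=0):
    """Write a dense float32 matrix to `path` (.npy) as a stand-in for X."""
    rng = np.random.default_rng(seed)
    X = rng.random((n_obs, n_vars), dtype=np.float32)
    np.save(path, X)
    return X


def read_batch(path, rows):
    """Memory-map the file and copy the requested rows into memory."""
    X = np.load(path, mmap_mode="r")
    # Sorted indices keep the reads roughly sequential on disk.
    return np.asarray(X[np.sort(rows)])


def benchmark(path, n_obs, batch_size=128, n_batches=32, n_workers=4, seed=0):
    """Time random-row batch reads issued from several threads at once."""
    rng = np.random.default_rng(seed)
    batches = [
        rng.choice(n_obs, size=batch_size, replace=False)
        for _ in range(n_batches)
    ]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda rows: read_batch(path, rows), batches))
    elapsed = time.perf_counter() - start
    return batches, results, elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "X.npy")
        make_matrix(path)
        _, results, elapsed = benchmark(path, n_obs=2_000)
        n_rows = sum(r.shape[0] for r in results)
        print(f"{n_rows} rows in {elapsed:.4f} s")
```

The same harness could be pointed at other backends (HDF5, zarr, parquet) by swapping out `read_batch`, which would give a like-for-like comparison of random access under concurrency.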