# 20220126 LEAPS-INNOV BLOSC meeting
Present:
- Vincent Favre-Nicolin
- Thomas Vincent
- Guifré Cuní Soler
- Francesc Alted
- Nicolas Soler
BLOSC funding generally: small dev. grants (e.g. NumFOCUS), Huawei, Google
FA co-founded ironArray
## Ideas which could be included:
* TV: **lossy compression**, e.g. jpeg2k.
* FA: would be nice to have a review of BLOSC's achievements and provide feedback
* ZFP work: https://github.com/Blosc/c-blosc2/tree/main/plugins/codecs/zfp
* TV: better tomography compression ratio with jpeg2k than with ZFP (but maybe slower, according to FA). High-throughput jpeg2k (HTJ2K) and JPEG XL look promising (informal source)
* FA: all imported plugins should be in C (open-source licence)
* TV: **sparse data compression**? -> FA: ironArray; zero suppression already implemented
* TV: Blosc2 available as a filter for HDF5; FA: it would then be a twin project
* FA: Blosc1 is not forward compatible with Blosc2, but Blosc2 is backward compatible: the Blosc1 API is respected in Blosc2, so Blosc1 data can be read with Blosc2
* TV: bitshuffle + LZ4 multithreaded mode (see the sketch after this list)
* **HDF5 connection**
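A minimal sketch of the bitshuffle + LZ4 HDF5 filter in use from Python, assuming the hdf5plugin and h5py packages; file name, dataset name and shapes are placeholders:
```python
import h5py
import hdf5plugin  # registers the bitshuffle filter with HDF5
import numpy as np

frames = np.random.randint(0, 1000, size=(10, 512, 512), dtype=np.uint16)
with h5py.File("bitshuffle_demo.h5", "w") as h5file:
    # hdf5plugin.Bitshuffle() defaults to LZ4 as the inner codec;
    # chunks=(1, 512, 512) gives one compressed chunk per image
    h5file.create_dataset("frames", data=frames, chunks=(1, 512, 512),
                          **hdf5plugin.Bitshuffle())
```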
FA proposals:
Roadmap (e.g. Python wrapper):
https://github.com/Blosc/c-blosc2/blob/main/ROADMAP.rst
New Blosc2 functionality: group chunks in containers ("super-chunks"), stored in memory or on disk (=> persistency, performance); a sketch follows below.
VFN: can it be used inside an HDF5 file? FA: not in this way; it is another format
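A minimal sketch of a super-chunk persisted on disk, assuming the python-blosc2 SChunk API; file name and sizes are illustrative only:
```python
import blosc2
import numpy as np

frame = np.zeros((512, 512), dtype=np.uint16)
# A super-chunk stored on disk in Blosc2's own format (not HDF5)
schunk = blosc2.SChunk(chunksize=frame.nbytes,
                       urlpath="stack.b2frame", mode="w")
for i in range(10):
    schunk.append_data(frame + i)  # one compressed chunk per image
first_frame = schunk.decompress_chunk(0)  # chunks remain individually addressable
```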
TV: direct chunk read makes it possible to bypass HDF5 I/O: [h5py's read_direct_chunk](https://github.com/h5py/h5py/blob/cff8537cc74a4897e2c0dc523309be17bec06dd4/h5py/h5d.pyx#L471)
## Code snippet for HDF5 direct chunk read for a 3D stack of images with one chunk per image:
```python
import h5py

def hdf5_raw_chunk_reader(filename, datasetname):
    """Yield the raw (still compressed) chunk of each image in the stack."""
    with h5py.File(filename, mode="r") as h5file:
        dataset = h5file[datasetname]
        dtype = dataset.dtype
        chunk_shape = dataset.chunks
        for index in range(dataset.shape[0]):
            # Bypass the HDF5 filter pipeline: return the chunk bytes as stored
            filter_mask, chunk = dataset.id.read_direct_chunk((index, 0, 0))
            yield chunk, dtype, chunk_shape
```
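Hypothetical usage of the generator above, e.g. to measure the on-disk size of a stack without decompressing anything (file and dataset names are placeholders):
```python
total = sum(len(chunk) for chunk, _, _ in
            hdf5_raw_chunk_reader("scan.h5", "/entry/data/frames"))
print(f"{total} compressed bytes on disk")
```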
## Technique-specific schemes
- tomography: lossy compression needed (ESRF uses jpeg2k)
VFN: decompression can be slower than the data processing itself (time wasted while reading images); need to reach 25~40 GB/s (see the throughput sketch below)
ZFP could be a good candidate (suited for multidimensional data)
VFN: we want to avoid lossy compression before processing. Lots of data (DECTRIS detectors): bitshuffle + LZ4
Dectris contacts: Stefan Brandstetter <stefan.brandstetter@dectris.com>, dubravka.sisak@dectris.com
https://www.dectris.com/landing-pages/application-note-mx/
GCS: https://github.com/dectris
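As a rough sanity check against the 25~40 GB/s target above, a hedged benchmark sketch using python-blosc with bitshuffle + LZ4 on synthetic data (array size, thread count and codec settings are illustrative, not a tuned benchmark):
```python
import time
import blosc
import numpy as np

blosc.set_nthreads(8)  # multithreaded (de)compression
frame = np.random.randint(0, 100, size=(2048, 2048), dtype=np.uint16)
packed = blosc.compress(frame.tobytes(), typesize=2,
                        cname="lz4", shuffle=blosc.BITSHUFFLE)
start = time.perf_counter()
for _ in range(100):
    blosc.decompress(packed)
elapsed = time.perf_counter() - start
print(f"~{100 * frame.nbytes / elapsed / 1e9:.1f} GB/s decompression")
```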
NS: different use cases: acquisition, archiving
FA: study which combinations of codecs and filters to use in each case.
TV: maintenance would be easier with BLOSC
VFN: legal status of BLOSC:
* autonomous worker (officially a company: Francesc Alted)
* also possible through ironArray (avoiding the single-person company if necessary)
GCS: how to handle the amendments/orders to BLOSC?
FA: ESRF hired me for the bitshuffle implementation
# 12 April 2022 [NICOLAS]
## Bullet point list for launching the collaboration
* the production of generic lossless and lossy compression methods that can ideally be integrated into NeXus/HDF5.
* the design of lossless and lossy compression algorithms tailored to particular techniques (tomography, macromolecular crystallography, SAXS and others), again best if integrated into HDF5.
* A synergy between Blosc and other groups engaged in the project like Peter's (in cc)
* A possible integration at the detector level
* Use heuristics based on actual data as well as storage requirements to determine the best codec and filters inside Blosc (see the sketch below). This will help optimize storage resources, potentially reducing infrastructure costs and making data handling easier and snappier.
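A brute-force sketch of that codec/filter survey, using python-blosc on one synthetic frame; a real study would use actual beamline data and also weigh decompression speed:
```python
import time
import blosc
import numpy as np

frame = np.random.poisson(5, size=(2048, 2048)).astype(np.uint16)
raw = frame.tobytes()
for cname in blosc.compressor_list():  # e.g. blosclz, lz4, zstd, ...
    for shuffle in (blosc.NOSHUFFLE, blosc.SHUFFLE, blosc.BITSHUFFLE):
        start = time.perf_counter()
        packed = blosc.compress(raw, typesize=2, cname=cname, shuffle=shuffle)
        elapsed = time.perf_counter() - start
        print(f"{cname:>8} shuffle={shuffle}: "
              f"ratio {len(raw) / len(packed):5.2f}, "
              f"{frame.nbytes / elapsed / 1e9:.2f} GB/s compression")
```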
## Administrative details
- idea: 3 person-months per facility (and we are 4 facilities) ~ 80 000 EUR/yr
## List of tasks / deliverables
- Implement parallel compression/decompression using BLOSC orchestration to replace the internal (slow) HDF5 I/O; see the sketch at the end of this list.
  - would need to patch h5py or provide a filter
- Focusing on different types of data like (but not limited to) the following:
  - raw tomography images
  - processed tomography data (reconstructed volume)
  - MX/SX
  - XPCS (sparse compression; Blosc2 already detects series/blocks of zeros and stores them with a single 32-bit word)
--> Use Blosc2 as an orchestrator to explore the different codecs adapted to each kind of data
--> test JPEG2000 (or equivalent) with BLOSC on tomography data
--> Automatically learn the best parameters for compression (ML-based BTune) to determine the best compromise between compression/decompression speed and compression ratio
- explore the potential of ironArray in terms of computation (done on chunks in the cache)
  - would it be suitable for a particular technique?
- Building a tool to decompress directly on GPUs (avoid decompressing in host memory and then sending to the GPU)
  - -> Jerome Kieffer mentioned an example with bitshuffle/LZ4 (decompressing only on the GPU, some code taken from NVIDIA, expecting 10x faster calculation; John Wright)
  - would have to define the codecs to use
  - diffraction tomography data
  - NVIDIA has put effort into codecs for GPUs; check online
  - CUDA or OpenCL? (PyOpenCL or PyCUDA)
  - ~~vkfft might prove useful~~
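A hedged sketch of the first task in this list: read raw chunks with h5py's read_direct_chunk and decompress them in a thread pool, bypassing HDF5's serial filter pipeline. It assumes the dataset was written with the plain Blosc HDF5 filter, so that each raw chunk is a standard Blosc buffer; names are placeholders:
```python
from concurrent.futures import ThreadPoolExecutor
import blosc
import h5py
import numpy as np

def read_frames_parallel(filename, datasetname, workers=8):
    """Read all chunks raw, then decompress them concurrently."""
    with h5py.File(filename, mode="r") as h5file:
        dataset = h5file[datasetname]
        chunk_shape, dtype = dataset.chunks, dataset.dtype
        raw_chunks = [dataset.id.read_direct_chunk((i, 0, 0))[1]
                      for i in range(dataset.shape[0])]
    # blosc.decompress releases the GIL, so threads give real parallelism
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [np.frombuffer(buf, dtype=dtype).reshape(chunk_shape)
                for buf in pool.map(blosc.decompress, raw_chunks)]
```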
## Timeline
## Evaluation of resources needed