# LEAPS-INNOV WP7 Coordination Kickoff
**collaborative meeting minutes**
present:
- Darren Spruce / MAXIV
- David Pennicard / DESY
- Axel Kaprolat / ESRF
- Vincent Favre Nicolin / ESRF
- Nicolas Soler / ALBA
- Guifré Cuní / ALBA
- Alun Ashton / PSI
- Peter Steinbach / HZDR
## Project Documents
Shared by Peter Steinbach
https://cloud.hzdr.de/s/L9m6LbLacwrgZ6w
PW: Inn0V2021
## Tasks
### WP7.1 Collaboration platform for data reduction and compression
**(ULUND, DESY)**
- DP: GitHub as platform for sharing code
- AA: a dedicated area on GitHub may be premature; maybe first act only as a forwarding party
- DP: do we know if others are doing similar things?
- DS: providing access to code or existing projects
- VFN: maybe set up a wiki
- DP: what are best practices?
- AA: not sure
- DS: mixed bag of practices
- AN: same here
- PS: what is the objective - connecting people or establishing a "brand"? involved parties have their own side conditions (with respect to funding); a wiki or the like should perhaps be supported by satellite events/roadshows
- DS: more connecting people
- DP: focus on a single kick-off event for now
- DS: use the kick-off to ask these questions to the community
- AS: how much do we want to share? (would be nice to succeed in this)
- DP: LEAPS kick-off planned for April 20/21? (circulated through LEAPS-INNOV mailing list)
- DP: will circulate the tentative agenda
- DP: mornings reserved for work on the work packages
- AA: open to people from the facilities
- DP: TODO circulate mail to WP7 coordinators mailing list
- DS: content of kickoff
- DP: 4 WP for 12 sites - having talks by all is overdoing it
- DP: split the time in WP
- present ideas about our ideas
- invite feedback to some degree
- DS: prepare questions to guide discussion?
- PS: we need the commitment by participants as we rely on that down the road
- DS: combine both!
- GS: prepare template presentation with a concrete example
- DS: TODO for all: provide questions to Darren+David for this
- VN: maybe use gdoc or hackmd document
- PS: will provide hackmd document to collect questions (circulate in WP7 coordinators mailing list)
- DS: timeline -> April 1/2
- AA: ask presenters to pre-record talks (like the flipped classroom)
- AA: should we invite industry?
- PS: maybe not for the 1st meeting -> but it is important to meet (perhaps a 2nd iteration)
### WP7.2 Assessment of future needs and development of metrics for data compression and reduction
(**ALBA-CELLS, ESRF**, DESY, DIAMOND, ELETTRA, EuXFEL, HZB, HZDR, PSI,
SOLARIS, SOLEIL, ULUND)
- NS: touring the beamlines and getting in touch with people
- DS: ESRF will help to get in touch with people
- DP: typically, hardware upgrades involve an assessment of data production and data types
- DS: maybe a survey?
- DS: frame the survey to attract people
- VFN: 12 partners - how to deal with such a heterogeneous field?
- PS: collecting information is super critical
- where does the data come from?
- what ideas are there about where the noise is coming from?
- how is the data used downstream?
- how is it stored?
- summary: press people to provide the most detailed information possible -> super important for WP7.3
- had good experiences with [mlcanvas](http://machinelearningcanvas.com/)
- VFN: we likely have to select projects/target specific techniques
- DP: include questions on what people would like to achieve?
- VFN: Month 1 == April 2021
- NS: select specific techniques?
- total storage capacity (and future increase)
- data volume generated per year and per beamline from say 2018 to now to see the trend
- Current strategies to avoid saturation
- DP: preparing data for compression or ML is challenging
- PS: indeed, most time my team is spending revolves around
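
The survey items on yearly data volume and total storage capacity could feed a simple trend estimate. A minimal sketch, assuming a survey reply gives the volume per year in TB; the function names and all numbers are made up for illustration and are not from any facility:

```python
def annual_growth_rate(volumes_by_year):
    """Compound annual growth rate between the first and last recorded year."""
    years = sorted(volumes_by_year)
    first, last = years[0], years[-1]
    if last == first:
        raise ValueError("need data for at least two different years")
    ratio = volumes_by_year[last] / volumes_by_year[first]
    return ratio ** (1 / (last - first)) - 1


def years_until_full(used_tb, capacity_tb, next_year_tb, growth, horizon=100):
    """Whole years until stored data exceeds capacity, or None within `horizon`."""
    volume = next_year_tb
    for year in range(1, horizon + 1):
        used_tb += volume
        if used_tb > capacity_tb:
            return year
        volume *= 1 + growth
    return None


# Made-up numbers for one hypothetical beamline, in TB per year.
volumes = {2018: 100.0, 2019: 150.0, 2020: 225.0}
growth = annual_growth_rate(volumes)
next_year = volumes[2020] * (1 + growth)
remaining = years_until_full(sum(volumes.values()), 2000.0, next_year, growth)
print(f"growth per year: {growth:.0%}, years until full: {remaining}")
```

This only extrapolates a constant growth rate; detector upgrades would show up as jumps that such a fit cannot anticipate, which is one reason to also ask about planned hardware changes.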
### WP7.3 Evaluate and adopt new strategies for data reduction and compression
(HZDR, DESY, ALBA-CELLS, DIAMOND, ELETTRA, ESRF, EuXFEL, HZB, PSI, SOLARIS, SOLEIL, ULUND)
- PS: depending on WP7.2 output
- PS: mostly aiming to publish an open library for fast pipelines (hoping that )
- PS: need to infer tons of technical details (prominent container format?, bandwidth limitations, ...)
- PS: lossy algorithms will involve knowing the downstream algorithms to see WHAT data can be discarded
- PS: hope to get in closer contact with ESRF (blosc devs) and/or PSI (contact to IBM wrt hardware)
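
To illustrate the point about lossy algorithms, here is a rough sketch of the trade-off: a lossy step (coarser quantization of counts) raises the compression ratio at the cost of a bounded error, and whether that error is acceptable depends on the downstream analysis. It uses Python's standard `zlib` purely as a stand-in for a real compressor such as blosc, and the "detector frame" is synthetic random data, not a measurement:

```python
import random
import zlib
from array import array


def to_bytes(counts):
    """Pack non-negative integer counts as unsigned shorts (typically 16-bit)."""
    return array("H", counts).tobytes()


def compression_ratio(raw, level=6):
    """Uncompressed size divided by zlib-compressed size."""
    return len(raw) / len(zlib.compress(raw, level))


def quantize(counts, step):
    """Lossy step: round every count down to a multiple of `step`."""
    return [(c // step) * step for c in counts]


# Synthetic stand-in for a detector frame (assumption, not real data).
random.seed(0)
counts = [random.randint(0, 1000) for _ in range(65536)]
coarse = quantize(counts, 16)

lossless_ratio = compression_ratio(to_bytes(counts))
lossy_ratio = compression_ratio(to_bytes(coarse))
max_error = max(a - b for a, b in zip(counts, coarse))
print(f"lossless: {lossless_ratio:.2f}x, lossy: {lossy_ratio:.2f}x, max error: {max_error}")
```

The quantization step of 16 is arbitrary; in practice it would have to come from what the downstream algorithm can tolerate, which is the information WP7.2 is meant to collect.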
### WP7.4 Research infrastructure integration with detector suppliers on data
(PSI, DESY, DIAMOND, ESRF)
- DS: connecting to industry
- AA: work ongoing with respect to blosc on PowerX?
- AA: industry interest is present
- AA: need to work out contacts with manufacturers
- AA: perfect spot for compression is tricky to find
- compression close to detection?
- compression close on detection?
- compression offline when data lands in storage
- each of the above has specific merits wrt industry
- some sites have a tradition of buying/deploying end-to-end solutions