Dashboards (like W&B / MLflow) are a good way to bring people onto the same page and to compare/review/debug performance
Frameworks that are built around hooks and a plugin/callback structure are a good way to allow extensibility without growing "dinosaur classes". For example, Lightning hooks like on_validation_epoch_end allow you to write callbacks that do things at the end of an epoch rather than subclassing/modifying your class
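A minimal sketch of that idea (assuming PyTorch Lightning; the callback class and metric handling here are illustrative):

```python
import pytorch_lightning as pl  # in newer releases: import lightning.pytorch as pl


class ValidationSummary(pl.Callback):
    """Do end-of-epoch work via a hook instead of subclassing the LightningModule."""

    def on_validation_epoch_end(self, trainer, pl_module):
        # trainer.callback_metrics holds whatever the module logged via self.log(...)
        metrics = {name: float(value) for name, value in trainer.callback_metrics.items()}
        print(f"epoch {trainer.current_epoch}: {metrics}")


# usage: pl.Trainer(callbacks=[ValidationSummary()], ...)
```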
User experience for physics data analysis tools/Documentation/Training (Tuesday)
Note: This has been copied down to the Thursday session
We should be able to label / name axes for readability.
Awkward Array can/should add support for named axes, cf. xarray with dimension names
No need to add labels because they're much more specific to plotting
New function e.g. array = ak.with_axis_name(array, 0, "events") to permit ak.sum(array, axis="events")
The flatten(array[argmax(..., keepdims=True)]) pattern is hard to understand, even once you recognize it as an idiom
We could add an accessor that forces ragged indexing. This would permit a single-index accessor if need be, one that directly consumes the result of argmax(..., keepdims=False), e.g. array.at[...]. We could also have a similar-yet-different array.select[...] that accepts a keepdims=True positional reducer result and flattens the result afterwards (we know that keepdims=True produces regular dimensions, which can be identified statically).
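For reference, a small example of the existing idiom using today's Awkward Array API (array.at / array.select above are proposals, not existing functions):

```python
import awkward as ak

pt = ak.Array([[40.0, 25.0, 10.0], [], [60.0, 55.0]])

# keepdims=True keeps one (possibly missing) entry per event, so the result is a valid
# ragged index; flattening afterwards gives one leading value (or None) per event.
idx = ak.argmax(pt, axis=1, keepdims=True)   # [[0], [None], [0]]
leading_pt = ak.flatten(pt[idx], axis=1)     # [40.0, None, 60.0]
```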
In discussion, Lukas brought up the view of coarseness / analysis resolution:
A workflow language (like Luigi) allows for the user to define coarse, high-level analysis operations (e.g. event selection, histogramming, statistical analysis) (scale: What resources you're needing)
Lukas: Should think about orthogonalization of views. Need the workflow systems to think about how the operations are distributed.
Dask then takes the blocks of the things that are defined in this workflow language and operates on them efficiently (scale: what expression you're calculating)
Sharing our outputs/provenance:
If you do an expensive computation you want to be able to share that with colleagues
Tiled: In the future will be able to store Awkward arrays
Upcoming IRIS-HEP topical seminar
Workflow languages
Lukas points out that in the bioinformatics world (where Snakemake came from) the workflows are much more focused on using well-designed CLI tools, whereas our HEP workflows are much more complicated and more bespoke.
Conclusions:
nested workflows (Luigi & Dask)
getting people to use it ("making it advantageous")
Matthew: I think "not promising" is maybe a bit too bleak. But more a shift of where to start using workflows.
Lukas: Success / Problems:
Success: New physics results are coming out that reuse published ATLAS analyses. People are also using it, and there is evidence that analyses actually are expressible in a workflow language.
Problems: People don't tend to use yadage/workflow languages in their day to day (a first step is usually building a new Docker image for your analysis).
Discussed why people might want to use a workflow system and while there can still be an "escape hatch" this also allows you to work for a long time before you might need to "pull it".
What is the "workflow file" and how does it work? For columnflow this is Python files vs. Snakemake/Yadage that use a YAML definition file.
Integration with REANA:
Seems like this should be quite doable with Luigi/Law-and-order!
Already an AGC implementation using columnflow
Open PR to add it to the AGC implementation listing
Should have seamless integration with coffea on Dask (but not yet)
Fitting and Histograms (Wednesday)
Seemed like there was quite a bit of disagreement and uncertainty about what "fitting" means, so this is being moved later into the week (tomorrow, Wednesday)
Focus on importance of serialization that isn't just binary but also "human readable" (to Matthew "human readable" means JSON or YAML, regardless of how gnarly)
Mike Sokoloff suggested writing both binary and human readable
Why did these not take off (code rot, people's shifting interests, etc.)?
Unit tests of fitting / inference approaches (Wednesday)
Mike Sokoloff, Josue Molina, Matt Bellis
Our discussion broke down along two lines, a) datasets for comparing and stress-testing fitting/inference frameworks and b) agreed upon values for "standard" amplitudes (e.g. Breit-Wigner)
Datasets for comparing frameworks
Imagine a location where datasets are hosted in a hypersimple, future-proof (lol) format like ASCII-text. These datasets are observables with the features labeled in some way (README, .csv, .json (?)).
Each dataset could come with suggested PDFs or amplitudes that could be used as a hypothesis for the underlying distribution. These suggestions should be typed up in LaTeX or from some paper and not provided as code. A framework-developer can then try to code up this "physics" however they see fit.
Should the dataset contributors be required to provide information on how the data was generated?
Is Project Euler a template for this type of website?
There should be some agreed-way (HS3?) for framework developers to compare their outputs. But in general, these comparisons should involve
Central values of the parameters
Uncertainties (this means different things to different people)
Likelihood scans
Confidence intervals
Nuisance parameters / constraints (how they varied)
Anything else?
So as an example, imagine a single-column dataset of an invariant mass distribution that was generated as two interfering Breit-Wigner (BW) functions with some resolution applied (see the sketch after this list). The suggested fits could be
1 or 2 Gaussian functions
1 BW
2 BWs
2 BWs with resolution (convolution with Gaussian) with "correct" resolution
2 BWs with incorrect resolution provided as central value (e.g. for Gaussian constraint)
Should come up with examples for binned fits as well. But the above lets someone play around with "native" PDFs or PDFs generated from interfering amplitudes.
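A rough sketch (not an agreed reference implementation) of how such a single-column dataset could be generated, assuming a simple Breit-Wigner amplitude and arbitrary toy parameters:

```python
import numpy as np

rng = np.random.default_rng(42)


def bw(m, m0, gamma):
    """Simple Breit-Wigner amplitude as a complex number."""
    return 1.0 / (m0**2 - m**2 - 1j * m0 * gamma)


# Toy accept-reject draw from the intensity of two interfering BWs (parameters arbitrary).
m_trial = rng.uniform(0.5, 2.5, 500_000)
intensity = np.abs(bw(m_trial, 1.0, 0.1) + 0.7 * np.exp(1j * 0.5) * bw(m_trial, 1.8, 0.2)) ** 2
sample = m_trial[rng.uniform(0.0, intensity.max(), m_trial.size) < intensity]

# Apply a Gaussian resolution and write a single-column ASCII dataset.
smeared = sample + rng.normal(0.0, 0.02, sample.size)
np.savetxt("two_bw_smeared.csv", smeared, header="invariant_mass")
```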
It should be straightforward for people to upload datasets that they feel are useful for the community (PWA, hist-templates, etc.) to check their frameworks on. This makes it easier for frameworks to compare not just the code but the assumptions about the physics and how they coded it up. Josue has experience with this, comparing Rio++ and Laura++ and finding that they gave different results with a particular physics model because of how it was coded up in Laura++ and how the integrals were calculated.
There could be datasets that map observable values to amplitude values. This way people could test their code to see if they get the same numbers.
For example, imagine a 2-column dataset designed to let people vet their code for a BW: the first column is an observed mass and the second column is the amplitude as a complex number (see the sketch after this list). Challenges are:
The BW depends on the mass and width so do you have this very finely grained for many different masses and widths?
How do you agree upon the canonical values?
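A minimal sketch of what such a vetting dataset could look like (here with the complex amplitude split into real and imaginary columns); the mass and width values are illustrative, not canonical:

```python
import numpy as np


def breit_wigner(m, m0=0.775, gamma=0.149):
    """Simple Breit-Wigner amplitude; m0/gamma are example values only."""
    return 1.0 / (m0**2 - m**2 - 1j * m0 * gamma)


mass = np.linspace(0.3, 1.5, 1201)
amp = breit_wigner(mass)

# Plain-text table: mass, Re(A), Im(A); a framework author can re-evaluate and diff.
np.savetxt(
    "bw_reference.csv",
    np.column_stack([mass, amp.real, amp.imag]),
    header="mass re_amplitude im_amplitude",
)
```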
Related to this was a spin-off discussion of how amplitude analyses should share their results. In addition to the parameter values, should they also share their
Data
Efficiency maps
Need to make sure the phase space is sampled finely enough to provide good enough coverage
Chalkboard of our discussion.
Histogram and Fit Serialization (Wednesday)
Present: Peter, Jim, Angus, Ben, Henry
General
Jim: It is not a technical difficulty, but a standardization difficulty.
Desirables
From and to ROOT files
Good C++ and Python interfaces
Bin contents and edge data should be binary
Embeddable in ROOT or HDF5
Nice to have
Metadata as human readable, rest could/needs to be binary
Complete custom streamer. From boost-histogram, define the serialization as a separate C++ library; uproot can recognize that and deserialize appropriately.
Difficulties
Cannot put everything in JSON efficiently; need to add binary data.
Discussion:
Peter: could add metadata through the attributes of HDF5. Metadata could be a JSON with attributes.
Jim: Most distinguishing thing that may work is to serialize it as a TString, but ROOT won't be able to inspect it.
Jim: Hans proposed to do the serialization from Python using Boost.
HDF5 easy to inspect, tons of libraries for that.
Protocol buffers.
HS3 standard to/from JSON.
Angus: If HS3 standard is JSON, we could define a superset that maps directly onto e.g. BSON.
Jim: But, there are many JSON-like binary formats.
C++ needs to be able to define the ROOT streamers.
Can't depend upon ROOT → Need to use special ROOT macros in streamers.
Implies multiple libraries; histlib C++ histogram serialization, another library that depends upon histlib to serialize into ROOT.
Henry: we could avoid worrying about a dependency tree and just write the code twice
Angus: I imagine this code might not change much, so that's probably reasonable
Henry: It probably will with feature additions (schema evolution) etc.
We can avoid writing a ROOT deserialiser given that we have a TH1 converter for boost-histogram
Jim: Streamers are bidirectional, so this is a moot point
Common miscommunication is that any-JSON means all-JSON
Angus: What does human-readable mean?
We should try and define this for the purpose of future conversations
Protobuf requires protocol definition to understand metadata, so not very readable
HDF5 metadata is easily findable
JSON tools can include linked data, but do we really worry about not following JSON standards?
Henry: should bin edges be part of metadata?
Worst case - a very long list of edges
Angus: should this not be out-of-band for variable edges
Henry: consider a zip file with two files: metadata + binary blob (see the sketch below)
Angus: is this intended to be easily modified, or is it write-once read-once.
Yes!
Jim: there are ways to embed in zip, ROOT, HDF5.
Zip can have many files (single file with well defined name, metadata).
HDF5: you'd need to make a group with a special interpreter for that group.
ROOT is all packed up in streamers anyway.
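A small sketch of the zip idea above (human-readable JSON metadata plus an out-of-band binary blob); the member names and metadata fields are illustrative, not a proposed standard:

```python
import json
import zipfile

import numpy as np

counts = np.random.default_rng(0).poisson(5.0, size=100).astype("<f8")

metadata = {
    "axes": [{"type": "regular", "lower": 0.0, "upper": 1.0, "bins": 100}],
    "storage": {"type": "double", "file": "counts.bin", "dtype": "<f8", "shape": [100]},
}

with zipfile.ZipFile("histogram.zip", "w") as archive:
    archive.writestr("metadata.json", json.dumps(metadata, indent=2))  # human readable
    archive.writestr("counts.bin", counts.tobytes())                   # binary blob
```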
If the primary expression of the histogram metadata was JSON, then we could use cling on the JSON schema to generate the C++ classes representing JSON.
If we start by saying that we want a JSON schema and binary blob(s).
We have decided upon JSON-like data, i.e. it trivially maps to JSON (without wedding to a particular format)
Jim: if we're using JSON schema to describe the metadata, then we're "all good".
We need to be really clear that we're talking about binary blobs out-of-band.
Henry: it would be complicated to have edges as data (not metadata)
Jim: multiple-data (exploded) exposes internals more, at a tradeoff of memory usage
We want to say - it should be easy enough to put things into HDF5, ROOT, zip.
We (Jim, Angus) are thinking about this as we do for Awkward Arrays; we can target many formats using this, even those that we don't yet know of.
Juraj: Example of JSON+binary blobs: glTF; the format has two modes, "everything in one file" or "structure + addresses"
Could probably add histogram fitting and need to work with them
For hist, need to be able to fit a simple model to a histogram very easily
Lukas: There could be a separate inference library where pyhf or zfit provide the model and logpdf function and then the inference library can do the operations
Think of how blackjax works: you provide the logpdf
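A hedged sketch of that separation: the modelling side only exposes a logpdf of the parameters, and an independent inference layer (here just scipy's minimizer, but it could equally be blackjax or similar) consumes it. The model below is a toy, not a real pyhf/zfit model:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

observed = np.array([12, 7, 3])


def logpdf(params, data=observed):
    """Model side: Poisson counts with expected yields mu * nominal (toy model)."""
    mu = params[0]
    expected = mu * np.array([10.0, 6.0, 2.5])
    return poisson.logpmf(data, expected).sum()


# Inference side: knows nothing about HistFactory internals, only about logpdf.
result = minimize(lambda p: -logpdf(p), x0=np.array([1.0]), method="Nelder-Mead")
print(result.x)  # maximum-likelihood estimate of mu
```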
Alex: The things you need to do require you to be able to modify the model (e.g. hold parameters constant or remove sections of it). If you need to go through some generic interface it becomes much harder. How do we fix this?
Alex: How do you write the model? For HistFactory you either go through some framework or you reinvent the wheel. So how would you reasonably write out the model in some language (e.g. Stan, PyMC; probably not these exactly, but something along these lines)?
Nathan: Can we bridge the models that we use the most to probabilistic programming language like PyMC?
So maybe we should have pyhf just go from JSON to \(\textrm{Pois}(n | \lambda(\theta))\)
Having a clean separation between providing these functions and being able to work with probabilistic programming languages would be something beneficial to do.
Nick: It will be important to have explicit log-normal functions, as constraints can be different if you start with a Normal and then construct a log-normal in a composite manner.
Lukas: A problem with using probabilistic programming languages for modeling and building is that all these tools are designed for Bayesian methodology. If we were going to try to use a probabilistic programming language in a frequentist manner we might have to do all this work ourselves, which is maybe quite a heavy lift.
To be able to interface between the inference library and the model need a clear API
Alex: The part for external libraries that is perhaps interesting is being able to do things that are more complicated than MLE, like root finding and being able to freeze specific model parameters. All of this should be able to be abstracted away, though it is unclear if making this generic is actually useful.
Lukas: Model mutations should happen on the modeling library side.
Nathan: It seems that the key ingredient would be a minimum viable API for what the model should be/implement, to be able to do impact plots and the like for any model.
Alex: Has an example of where this becomes quite complicated with goodness-of-fit. Example in cabinetry: cabinetry.fit._goodness_of_fit: this relies on evaluating constraint terms separately and requires a notion of what those are
Nick: pytrees are a nice way of representing structured parameter vectors while still easily flattening them for the purpose of minimization. Each bin in a binned fit could be annotated with labels that allow one to isolate different contributions to the total NLL. I assume the same is true for domains of integration in unbinned fits.
(nathan) even using a dict (which is still a pytree) helps here more than a vector – you can pass a subset of params etc by name into a method!
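A minimal sketch of the pytree idea (parameter names and the toy NLL are illustrative): keep the parameters as a named dict and only flatten at the optimizer boundary, e.g. with jax.flatten_util.ravel_pytree:

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree
from scipy.optimize import minimize

params = {"mu": jnp.array(1.0), "shape_np": jnp.array([0.0, 0.0])}


def nll(p):
    # Structured access by name keeps the model code readable.
    expected = p["mu"] * jnp.array([10.0, 6.0]) * (1.0 + 0.1 * p["shape_np"])
    observed = jnp.array([12.0, 7.0])
    constraint = 0.5 * jnp.sum(p["shape_np"] ** 2)  # Gaussian constraint term
    return jnp.sum(expected - observed * jnp.log(expected)) + constraint


flat0, unravel = ravel_pytree(params)                     # flat vector for the optimizer
value = jax.jit(lambda x: nll(unravel(x)))
gradient = jax.jit(jax.grad(lambda x: nll(unravel(x))))

result = minimize(
    lambda x: float(value(x)),
    np.asarray(flat0),
    jac=lambda x: np.asarray(gradient(x)),
    method="BFGS",
)
best = unravel(result.x)  # back to a named dict: {"mu": ..., "shape_np": ...}
```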
Lindsey: We need to break everything in the above out into individual parts at the high level and then agree on APIs there.
Why str? Seems very specific to fitting (namely: the str represents a parameter or variable). Why enforce that on non-fitting packages that don't care about names? In other words: x is more than just \(\mathbb{R}^n\). Answer (partial): for example, if you want to just vary some parameters?
Warning: each dimension has its own dimension because it's ArrayLike
InterOp between C++ and Python (Wednesday)
Present: Baidyanath, Ianna, Ioana, Juraj
Use Case by Juraj: FCC team uses RDataFrame to run their analysis. New users expect to write everything in Python but RDataFrame requires C++ functions to be passed into it. They are heavily dependent on C++ libraries. They use Spack for distribution which can interfere with other environments. Solutions:
RDataFrame can add better Python support for all functions passed to it. This is ongoing but will take a bit more time. Depends on tighter Numba-Cppyy integration.
Awkward Array can be used. This solution is not viable for many use cases due to the user's heavy dependence on C++-based libraries (will most probably be fixed once cppyy 3.1.0 is released). Another issue is that additional columns cannot be added to the data.
Investigate interplay between EDM4hep and Awkward Array
JIT compilation
Issues:
JIT compile time
What is a good amount of time for compilation: within an order of magnitude of an out-of-run (ahead-of-time) compile
This was a problem in Garima's project. gcc/clang took well under 1 second; gInterpreter.Declare took 17 seconds.
Less knowledge in JIT compilation than in AOT, hard for it to know what to spend time optimizing.
How about users of JIT compilation having hooks to be able to inject knowledge about which collection instances are big (need more optimization than the rest).
Is this related to C++? Does C++ JIT need to compile for cases that may never be called?
Isolating state of JITted code
So if the JITted code crashes, the outer binary does not crash, and is able to restart.
This is a C++ thing; Numba and Julia don't provide a way to access raw pointers / the ability to segfault.
Python/C++ interoperability
Cppyy maintenance
Avoiding ROOT dependency (faster turn-around time than asking the experiments if adding a feature breaks them).
Hard to find people who want to work on PyROOT.
clang-repl is already in LLVM; libInterop should be, too (eventually).
xeus
QuantStack wants to move xeus-cling to clang-repl.
Demo exists of passing variables from C++ to Python in a Jupyter notebook, but not the other way.
numba-rvsdg to deal with new, rapidly changing Python bytecode.
Jim's side-project: extending this (without LLVM) to provide Python bytecode → AST.
Awkward
has extensions for Numba and for cppyy; Numba and cppyy have integrations; will Awkward work with Numba and cppyy?
unclear what Jim is asking, whether there are any problems there.
Automatic Differentiation and Gradient Passing (Wednesday)
Lukas: If we start basic with the IRIS-HEP AGC, what are the parameters that we would want to optimize?
Alex: Cuts that you apply, machine learning piece that gets applied
Lukas: One initial approach is to replace cuts with weights (each event gets a weight)
What is the thing that you want to optimize? We can then plot the loss landscape in 1D, and this would allow us to understand whether this is something that would be achievable for optimization with AD. Visualization of this would be quite important.
Run the AGC at full fidelity, as is, with many cut values for 1 parameter that can be wiggled.
For the AGC there is a way to run a small version with limited data all in memory on a single machine, so could possibly get enough events for sufficient loss landscape
Lukas: You might somewhat solve that by, when you fill the histogram, filling it with an array of weights the size of the histogram.
The AGC currently uses hist.
Normally you have 3 bins. If you wiggle the parameters a bit then an event will fall into different bins.
If now you say that each event goes into all bins, with each bin getting a different weight (the weights sum to 1), then as you wiggle the parameters the weights in each bin shift. This is basically doing a binned KDE.
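A rough JAX sketch of that all-bin filling (binned-KDE-like) idea, with a sigmoid in place of a hard cut; the bin edges, bandwidth, and toy loss are all arbitrary choices for illustration:

```python
import jax
import jax.numpy as jnp

edges = jnp.array([0.0, 1.0, 2.0, 3.0])   # 3 bins, as in the example above
centers = 0.5 * (edges[:-1] + edges[1:])


def soft_hist(x, event_weights, bandwidth=0.2):
    """Each event contributes to every bin, with per-event weights summing to 1 across bins."""
    logits = -((x[:, None] - centers[None, :]) ** 2) / (2.0 * bandwidth**2)
    per_bin = jax.nn.softmax(logits, axis=1) * event_weights[:, None]
    return per_bin.sum(axis=0)


def loss(cut, x=jnp.array([0.3, 0.9, 1.4, 2.2, 2.8])):
    selection = jax.nn.sigmoid((x - cut) / 0.1)   # soft selection instead of a hard cut
    counts = soft_hist(x, selection)
    return -counts[-1]                            # e.g. maximize the yield in the last bin


print(jax.grad(loss)(1.0))  # the derivative w.r.t. the cut is finite, not zero
```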
For hist, instead of trying to change hist itself, we could also just implement a custom backward AD rule to deal with it, since this just comes down to summation.
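A minimal sketch of the custom-backward-rule mechanics with jax.custom_vjp: a weighted fill is a scatter-add, so the gradient with respect to each event weight is just the output gradient of the bin that event fell into. In practice the forward pass could delegate to hist; here it is written with jnp for brevity, and the bin assignment is a toy constant:

```python
import jax
import jax.numpy as jnp

# Fixed per-chunk bin assignment (in practice obtained by digitizing the observable).
bin_index = jnp.array([0, 1, 1, 2, 2])
NBINS = 3


@jax.custom_vjp
def weighted_fill(weights):
    # Forward: a weighted fill is a scatter-add of event weights into bins.
    return jnp.zeros(NBINS).at[bin_index].add(weights)


def weighted_fill_fwd(weights):
    return weighted_fill(weights), None   # no residuals needed


def weighted_fill_bwd(_, g):
    # Backward: each event weight receives the output gradient of its bin.
    return (g[bin_index],)


weighted_fill.defvjp(weighted_fill_fwd, weighted_fill_bwd)

w = jnp.ones(5)
print(jax.grad(lambda w: (weighted_fill(w) * jnp.array([1.0, 2.0, 3.0])).sum())(w))
# -> [1. 2. 2. 3. 3.]
```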
Lukas: Plan for IRIS-HEP Analysis Systems
Run AGC and visualize the landscape for 1 cut.
Change the AGC to work with weights.
Change to all-bin filling.
In the event-weight scenario it must be that any event downstream must be evaluatable, so if you have a selection cut on 4 leptons that is failed and you then downstream require a 5th lepton pT, this won't work and you would be required to use a stochastic approach.
put together a simple "analysis": read jet four-vector information from file, calibrate energy as a function of a nuisance parameter and calculate invariant mass of dijet pair
do this with awkward+jax and take derivatives of the mean of invariant masses (across events) wrt. the nuisance parameter
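A simplified sketch of that prototype using plain jax.numpy on regular per-event arrays (the real exercise would use ragged awkward+jax input); the "calibration" and the four-vectors are toy values:

```python
import jax
import jax.numpy as jnp

# (E, px, py, pz) of the two leading jets in each of three toy events.
jet1 = jnp.array([[100.0, 30.0, 20.0, 90.0], [80.0, 10.0, 50.0, 60.0], [120.0, 70.0, 10.0, 95.0]])
jet2 = jnp.array([[60.0, -25.0, 15.0, 50.0], [90.0, -40.0, -30.0, 70.0], [55.0, -20.0, 25.0, 40.0]])


def mean_dijet_mass(alpha):
    # Toy "calibration": scale the jet four-vectors by (1 + alpha).
    total = (1.0 + alpha) * (jet1 + jet2)
    m2 = total[:, 0] ** 2 - jnp.sum(total[:, 1:] ** 2, axis=1)
    return jnp.mean(jnp.sqrt(m2))


print(mean_dijet_mass(0.0), jax.grad(mean_dijet_mass)(0.0))
```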
Actor histogramming: Dask automatically knows about actors, and triggers a connection to the actor where the data is stored.
Compress histograms in HDF5. Efficient because they are often sparse.
One histogram internally chunked in HDF5 for compression.
One possible issue is that hist.fill needs to keep everything in memory.
Ideas:
Henry: Maybe subclass hist.fill so that it knows to work in chunks, rather than keeping everything in memory. Work in "snapshots" of the fill.
Lindsey: At the end of computation, how do histograms look to the user?
Ben: When does the computation end? Does it end when all chunks have been filled for a postprocessing analysis? Or when they have been virtually merged?
How to do the dispatch?
Henry: Transfer all fill to all workers and then filter to what is not needed. E.g. some histograms just look at slices of the axes.
How do you do the sums across the fills?
How to do conditional dask? E.g. split on categories before dispatch. (numpy masks?)
Ben: Lists of numpy masks that select rows of array. Label compute nodes so that they process a particular mask.
Peter: Treat the fills like a Triton server. Send a signal that dumps the histogram at the end.
Peter: How do you know that a data chunk belongs to a particular fill? Lindsey: the binning is done beforehand.
Lindsey:
h = hist.dask.sharded.new....()
# then do...
h.fill(*arrays)
# rows of *arrays are dispatched to compute nodes, and...
dask.compute(... sum over bin segments ....)
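Separately, a small h5py sketch of the chunked/compressed HDF5 storage idea mentioned earlier in this discussion (dataset names and sizes are illustrative):

```python
import h5py
import numpy as np

counts = np.zeros((2000, 2000))
counts[100:110, 100:110] = 42.0   # mostly empty, so chunked gzip compression is cheap

with h5py.File("histograms.h5", "w") as f:
    f.create_dataset("h2d/counts", data=counts, chunks=(100, 100), compression="gzip")
    f.create_dataset("h2d/edges_x", data=np.linspace(0.0, 200.0, 2001))
    f.create_dataset("h2d/edges_y", data=np.linspace(0.0, 200.0, 2001))
```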
Teaching, training, documentation and coordination
Present Tuesday session: Matthew Bellis, Angus, Oksana, Mason, Aman, Zoe, Kilian, Ben, Remco, Clemens, Josue, Benjamin, Juraj
How to connect the different resources (how-to guides, training)? Scikit-HEP is decentralized development, but with a central vision. Possible solution: Analysis gallery
Analysis Gallery: Should there be a central "analysis" gallery
Have Plausible web analytics around all of our training material to investigate discoverability/user behavior
Had GSoC proposal to rebuild this in a more dynamic website able to list more and filter it by need (and some alpha-versions were made that we could start from)
Need to find good balance between too narrow in focus and too wide
Negative example for "too wide": "Awesome lists" that keep on expanding and stop being useful
How to interlink different documentations/trainings
Relation between different kinds of material: Diataxis: https://diataxis.fr/ - lots of work but can be used as an overall guide
Getting people to contribute & getting user feedback
Should we have a mechanism for third-parties to contribute documentation to specific repositories from a centralised source?
How can we get users (especially new users) to write documentation?
How to add a "comment" box for notebooks:
Jim: Could have link to gh issue/hackmd/etc.
How to give notification to developer
Could do "hypothesis"
How can we break down hurdles:
Lean more heavily on Web-IDE (GitHub codespaces etc.) for PRs
Essential to give users the option to give feedback quickly (without having to create issues/pull requests). Ideal: feedback/comment button. Example sphinx-comments, sphinx-disqus (from this FAQ on RTD)
Could people earn "Karma". Become official contributors to the project if they report a documentation issue.
Angus: Let's come up with a "social ethos" to help build a community of users. How to build a community?
Matthew: People are very shy/apprehensive about doing things "in public". Could there be a sandbox for it? Or having private tickets (also think of GDPR)? Philip: Pythia does this.
Juraj: Tutorials ready from CI.
Angus: create stubs:
Users can see which topics are already identified
New developers can see where to start contributing
Long-term developers can contribute in free time
Training paradigms:
Matthew: People may start to prefer video resources rather than reading material. One problem is that it is harder to keep video up-to-date.
Kilian: There is already some prototype video documentation for some HSF Training modules (like Docker etc.).
Having more things as prerequisites to even out level of participants and to make sure we don't lose time with "trivial things"
Platforms for running code for training:
Angus: Should we have central "Binder" service?
Forum & chat
similar to ROOT forum?
Both chat and forum have merit
Been over a year that we talked about this.
what to use for chat? discord?
How to balance chat vs forum?
Jerry: Bot in chat that generates discourse post, initially invisible, for posterity.
Forum ~ stackoverflow-ish
Other ideas/suggestions:
Benjamin: Workshop about "what to do if there's no documentation?"
Philip: Discussion sounds similar to HepForge: find packages, link them together on a page. Do we repeat history? So what can we learn from that? Reasons HepForge failed:
ran out of funding
overambitious: wanted to be more than just an organizing/discoverability project but wanted to solve versioning (and failed)
switch from SVN to Phabricator/git (and lost people in the switch)
Oksana: How can we train developers to engage users and write proper documentation.
Office hours similar to SciPy? Recurring new-contributors meeting
✨ Conclusions & actionable items ✨
Discoverability: Need to link resources together and make them discoverable:
Documentation/training material should be interlinked
HSF Training Center can be expanded to make tutorials discoverable. Must strike balance between notable/maintained and inclusive. Considerations:
Resources have to be curated. Set minimum standards for quality and notability.
Be clear about scope, don't be HEP forge or one of the awesome-xyz lists
Plausible can be used to understand where users come from/where they go
Contributions: Making it easier to contribute/give comments (rather than opening PRs): Options include:
teaching people about GitHub Web IDEs for simple PRs
include more feedback/comment buttons (like sphinx-comments), ideally also anonymous.
include stubs for things that are missing in docs and should be filled in
Prerequisites: Having prerequisites for workshops can "even out" experience levels of people and avoid "trivial questions"
Feedback: Jim recommended directly implementing feedback buttons into notebooks (e.g., via Slido)
Maintainability: guaranteeing that code examples work (see also CI remark; documentation from Jupyter notebooks) and that interlinking is correct (e.g. linkcheck).
Hacking away & other ideas:
Regular "documentation day"
Half-day workshop as part of PyHEP to help developers write good (or any at all) user guides/documentation. Could also use that to get everyone to interlink things.
HSF Training has around 3-5 alternative "from scratch" implementations/PoCs of similar platforms that we could consider/start from. Might also rope in some of the GSoC candidates with JS knowledge.
Purpose is to provide a minimal (basis?) set of tutorials that provide a starting point for self-learning(?).
Do we split tutorial from guides at the top-level, or as a filter criterion?
Can be non-linear, to provide a better visual overview of how different packages integrate/can be used at different stages
Adding a place for videos (for the ones that have been presented live), and a place for time estimate: "This tutorial will take 10 minutes."
Easy pipeline for contributors to add new ones.
Amplitude analysis (PWA)
Present: Henry, Jonas, Josue, Mike, Remco
General questions
Can amplitude building be completely separated from (efficient) function evaluation and fitting? If so, it would make it easier to let PWA frameworks talk to each other. -> Easier to do with Python than with C++
What do we consider to be a PDF? How does it relate to amplitudes? How does it do its normalisation?
Can we standardise plotting in amplitude analyses?
Issues that might bring PWA frameworks together
Can PWA frameworks become a consumer of more general fitter packages?
Can we define a standard to describe amplitudes and amplitude models? (Like UFO, DecayLanguage, …)
We need to improve comparisons between different frameworks to make our results more reliable (and reproducible!)
Central place for documentation would be great. References to important literature and inventory of frameworks.
Benchmark analyses / unit tests, would be solved by UHI-like test package
Challenges with comparisons
Most PWA tools are isolated frameworks (see list here); hard to make them talk to each other
Differences in lineshapes, handling of spin, sign conventions
Integration strategies for likelihoods
Concrete follow-up steps
Regular PWA software meetings (tie in to existing conferences like PWA/ATHOS)
streamline amplitude analyses (optional dependencies), i.e. input/output to different frameworks
standardisation, protocols of data and/or amplitudes
test suites for comparing performance and result consistency, think array-api-tests
host test data sets (e.g. through release notes, like GooFit does)
Note: pwa is still available on PyPI; pwa.readthedocs.io is already 'ours' ;)
Task Graphs and Using Annotations to Match Resources and Modify the Computation
Present: Nick, Ben, Angus
Dask annotations: metadata about tasks that schedulers may choose to respect, e.g.:
with dask.annotate(big_mem_machine=True, prefer_gpu=True):
    z = something.compute()

with dask.annotate(machine_profile=..., checkpoint=True, environment=...):
    ...

# then somewhere in the custom scheduler:
dask.get_annotations()
# {'big_computation': True}
# or, per layer of the graph:
[l.annotations for (n, l) in z.dask.layers.items()]
E.g.: Give a hint to the scheduler that it is a good idea to cache something because it was expensive to compute. But cache only what is relevant in the context of the task graph in question (e.g. do not cache everything as persist would do.)
An easy case to test is to use these annotations as ClassAds for HTCondor.
Label graph for things like systematic variations, and modify the graph accordingly:
Offload expensive sections of code to accelerators
def superMLalgo(jet):
    # pretend this needs a GPU to be fast enough
    return jet.pt * 20 + jet.eta

with dask.annotate(needs_gpu=True):
    events["jets", "discriminant"] = superMLalgo(events.jets)
Cache results that are expensive to compute and/or small relative to input
New CMS OD data campaign (NanoAOD 2016) will require adapting the analysis -> let's track it via an issue (e.g. year-dependent analysis decisions / specific file handling)
L.Gray: wishlist to have proper AGC tutorial as well with more CMS complexity
L.Gray: the training example of ParticleNet would be amazing (very computationally expensive model to train / evaluate)
Autodiff integration: how to figure out what is already differentiable
moving in small prototypes: continue until hitting issues, then feeding back to developers to resolve them
small prototypes can already conceptually capture lots of important functionality and be extended in various directions
What would using the dask+jax pattern look like?
supposedly can work, none of us has practical experience
Training in differentiable analysis: we would need a different model each time a cut changes, meaning that the model should be retrained, which makes it very expensive
Q from Tal: what about systematics? Would the whole chain be aware that the pT cut was changed?
yes, can conceptually propagate this all through (would require e.g. differentiable correctionlib)
Q: Is there interest in including running event generators in the AGC setup? -> We expect that the AGC pipeline starts from "common" data formats; however, this is rather interesting for workflows such as those based on madminer
Next step will be to move to coffea 2023: no obvious blockers, need to test this again
can also adopt recent changes like ML interface that should help
Marcel is asking if we can add a "debugging style" AGC implementation (e.g. cutflow); it could be useful while teaching
should come "for free" with PackedSelection in coffea now, other such types of functionality would be useful to probe
showcase of columnflow implementation of AGC
File formats
Jim, Matt B., Zoe, Jerry, Nikolai, Ioana
Nikolai (PHYSLITE in ATLAS): Stores some classes, not just simple data values.
RNtuple may have been informed by early discussions about Parquet.
Arrow is probably the way of things going forward because of how they have carefully considered the memory layout, informed by frustrations with pandas dataframe.
Arrow is in-memory. Parquet and Feather are on-disk storage.
Action items
What should we do about the HDF5 projects? (hepfile, HEP-CCEs, TDTree)
Matt B will push forward with hepfile so that we can profile its performance going forward.
Jerry and Jim (CHEP 21 suite) will provide Matt with some tests to convert to the different formats. This could turn into a general performance suite for file formats (ROOT, feather, HDF5, etc.). Profile tests could also help with discoverability of different approaches.
Nikolai will work on an output module for Belle2 efforts.
odo tool to convert from one format to another.
Future items
Could the AGC provide a test case to see how much time is spent doing the following? Since this is a real analysis test case, it would allow us to understand where time is spent.
Decompressing data
Reading from file
In-memory operations (after read in from file)
Code readability
Present: Angus, Aman, Ianna, Kilian, Gordon
Defining readability:
How many "units" of code a concept requires to explain
Extrema of APL vs code-gen
Tradeoff between terseness and units of action per line
Tradeoff between modularity and succinctness
User happiness vs developer happiness. And who's the user here?
Examples:
Sometimes the most performant code comes at a cost of readability
The axis=1 discussion -> named axes
General Discussion:
Tradeoff between production and development w.r.t terseness
Tradeoff between users and developers; both are important, and different kinds of readability matter
Type hints:
Is this a good goal, does it help readability?
Can we type hint array libraries?
Ianna suggests that we touch on FuncADL
Gordon has a hitlist.
argmin, argmax, axis=1
argmin - the common pattern is to slice into a record to pull out multiple fields. To be able to minimise with respect to a field.
Should we have some kind of manual pass through repos
Jim's work to analyse repos?
How do the language models feel about our syntax?
Use ChatGPT et al. to explain what code does, which implicitly encodes user understanding?
Office hours?
Requires advertising in multiple formats, and driving the point home.
Physicist nature is to assume they are misunderstanding things, rather than the tooling being confusing.
We should try and improve this.
PyHEP forum & chat
There will be a vote to reach community consensus about what to do