# UW-EPE Group Meeting - 2021-05-26

tags: group-meeting

Present: Gordon, Mason, Tal

## Announcements

* [PyHEP](https://indico.cern.ch/event/1019958/) deadline is June 6
  * *ACTION*: Next week come up with ideas
* [Secret sauce](https://hackmd.io/features?both) for hackmd
* Really excellent [talk on ATLAS Tau's by Quentin](https://indico.cern.ch/event/1039328/) at last week's EPE seminar (he is a new hire @ UW). Zoom recording available (you have to be logged into UW Zoom to view for now).
* Looks like we will soon have an official team of people coordinating the Analysis Challenge (Oksana and Alex), which might mean that the Analysis Systems leadership comes free.

## Discussion

* How is the [LLP Workshop going?](https://indico.cern.ch/event/980853/)
  * Not a lot is being recorded - so slides
  * Ideas for places to branch out?
  * BIB for HL-LHC discussion
  * Interesting talk on HNLs from Richard Ruiz: https://indico.cern.ch/event/980853/contributions/4375091/attachments/2252025/3820438/rruiz_LLP9_HeavyN.pdf
* Anything to note from [vCHEP](https://indico.cern.ch/event/1019088/)?
  * The Analysis Systems video - should we do something like that for a CalRatio recorded seminar?
* Anything we should be submitting for PyHEP?
* Run 3: What new things should we be exploring? What is worth continuing?

## Physics & IRIS-HEP Projects - Active last week

Try to keep a week-to-week list of the projects and what we've done, with a bit about plans for the following week. Hopefully filled out before the meeting.

* **`func_adl_xAOD` for CMS Run 1 AOD's**. Baid, an IRIS-HEP fellow near the end of his fellowship, is implementing this. Eventual goal: be able to run the CMS Run 1 $h\rightarrow 4 \ell$ example analysis.
  * Status: Baid is implementing the `Range` sequence generator. This will generate a range between two numbers and be used to loop over an integer index. Useful for when loop boundaries aren't implied. The code changes to `func_adl` are minor. Currently debugging the C++ code generation.
  * Next: Complete this integration and build a container with the CMS AOD transformer in it.
  * Does `func_adl_uproot` need to support this?
* **CERN Open Data DID Finder**: Allows `ServiceX` to run on files stored in [CERN's Open Data portal](http://opendata.cern.ch/). CMS's Run 1 AOD files are all stored there, for example.
  * Status: Have converted the original `rucio` DID finder. Now have a demo, a CERN Open Data, and an improved `rucio` DID finder, all based on a library that makes them easy to write.
  * Next: Clean up conversion and get _buy-in_ from rest of ServiceX community.
  * Question: Are there any other sources of data we should be looking at that would help us do something cool and interesting?
* **servicex_frontend**: The bit of code that makes it easy for a user to execute and download data from a servicex backend
  * Status: Released, and in production. No major upgrades planned.
  * Next: Two bugs recently found:
    * CMS filenames are so long they break the Linux filesystem when combined with metadata that we use to track the files downloaded. Need to implement hashing.
    * If someone requests over 100 transforms to be run at the same time, there are connection failures.
* CalRatio + trackless jets
  * Started some more serious BDT tests with XGBoost
    * Starting with only per-jet NN outputs as the BDT input
    * Not surprisingly, this doesn't do as well as the nominal 2 CR per-event BDT yet
  * Also iterating with Louie on his `dv_ntuple_reader` MR
* CalRatio
  * Started looking at R22 validation of analysis code
    * Migrate EXOT15 derivation to R22
    * Validate using DAOD_PHYSVAL derivation (check with Cristiano)
  * Submitted QCD jobs rerun with the new BDT + download to LLPData (nearly finished)
* func_adl_uproot
  * Tracing through issue [func_adl#64](https://github.com/iris-hep/func_adl/issues/64)
    * There seem to be some hacky AST tricks going on in the xAOD backend, so I'm trying to understand this better
* Cabinetry
  * Started learning pyhf and cabinetry structure, following tutorials, etc.
  * Useful to meet with Alex in one of the next few weeks?
    * ACTION: See when the appropriate time is (early this week).

## Physics & IRIS-HEP Projects - Not Active

Place things here that are active, but were not worked on in the last week.

* **ABCD Axes By ML**: Use ML training to determine axes for the ABCD method.
  * Status: Finished proof-of-concept. Using actual MC data. However, the MC multijet data, after basic analysis cuts, is too sparse, so we have to generate data.
  * Next: Figure out why generated data looks nothing like the distribution it is being pulled from.
* **hep_tables**: Use a generic array programming interface to hide how we actually manipulate the data (e.g. ServiceX, awkward, etc.)
  * Status: Prototype done, see [CHEP talk](https://www.youtube.com/watch?v=81WFn040j9s).
  * Next: Figure out what is needed to drive project forward. Perhaps by narrowing scope a bit.

## Random Stuff

* This week is the big IRIS-HEP review. The NSF does this every single year to make sure we are not doing something crazy (oversight!). A great deal of my time over the past month has been spent preparing for this.
Next year's review is going to be a really big one, as it will decide whether we run for another five years or end after this first five.
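
Coming back to the `servicex_frontend` long-filename bug listed under the active projects: a minimal sketch of what "implement hashing" could look like, assuming the problem is the usual 255-byte limit on a single filename component on common Linux filesystems. The function name `shorten_filename`, the 16-character digest, and the choice of SHA-256 are all hypothetical and for illustration only; none of this is the actual `servicex_frontend` code.

```python
import hashlib
from pathlib import PurePosixPath

# Assumption: 255 bytes is the per-component filename limit (NAME_MAX)
# on common Linux filesystems such as ext4.
MAX_NAME_BYTES = 255


def shorten_filename(name: str, max_bytes: int = MAX_NAME_BYTES,
                     digest_chars: int = 16) -> str:
    """Return `name` unchanged if it fits, else a truncated, hash-tagged name.

    Hypothetical helper: keeps a readable prefix and the file extension, and
    inserts a short SHA-256 digest of the full name so that two long names
    sharing the same prefix still map to different files.
    """
    encoded = name.encode("utf-8")
    if len(encoded) <= max_bytes:
        return name

    digest = hashlib.sha256(encoded).hexdigest()[:digest_chars]

    # Keep a short extension such as ".root"; drop anything implausibly long.
    suffix = PurePosixPath(name).suffix
    if len(suffix) > 16:
        suffix = ""

    tail = f"-{digest}{suffix}"
    keep = max(max_bytes - len(tail.encode("utf-8")), 0)

    # Truncate on bytes, discarding any partially cut multi-byte character.
    prefix = encoded[:keep].decode("utf-8", errors="ignore")
    return prefix + tail
```

Short names pass through untouched, so only the problematic CMS-style names change. The digest is computed from the full original name, which means the mapping is deterministic across runs and the metadata used to track downloads could be hashed into the same string if needed.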