# DP Meeting 2023
## DP meeting notes for 2022 and earlier are at https://hackmd.io/dbskL9vDQjeQ1PXuu-Zzog?both
# DP Meeting 7th December 2023
## Attending
> Alexander Heidelbach
Anja Novosel
Bianca Scavino
Christian Wessel
Doris Kim
Frank Meier
Gayane Ghevondyan
Giacomo "the GOAT" De Pietro
Giovanni Gaudino
Giulia Casarosa
Giulio Dujany
Giuseppe Finocchiaro
Jaeyoung Kim
Karim Trabelsi
Laura Zani
Luigi Corona
Marko Staric
Matt Barrett (he/him)
Noah Brenny
Pere Gironella
Stefano Lacaprara
Valerio Bertacchi
Xiao HAN
Xiaodong Shi
# SW Shift report (Alex Heidelbach)
* Validation
* ECLChargedPID still needs a contact person
* KLM, FEI no response yet
* CDCdEdxPID, EKLM TRG possibly fixed soon.
* Red plot in charge asymmetry in tracking looks like a normal fluctuation. Christian Wessel confirmed it; he will upload a new reference plot.
* Arul fixed a GitLab issue: plots with invalid contact info.
* Encoding declarations in the Python scripts
* format string fixes
* multiple MRs going on at the same time.
* b2luigi
* Tried to import all GitHub issues to the Belle II GitLab
* A comparison between the GitHub and GitLab versions is needed.
* CI issues: failing tests
* Development build
* Trigger simulation is now part of FullSim, which changed resource plots.
* Somehow DAQ has more problems. Need to open an issue.
* Fluctuation in ARICH memory leak?
* Thanks Giacomo for help.
* Would it be possible to include fluctuation sigma band in the plots?
* Frank: There is a summary page. Better to report all the strange issues.
# Skim report (Bianca Scavino)
* Collections
* A tool for automation is being prepared.
* Re-skim preparation
* The due date of Dec. 15th is tight. Hopefully we can merge all the issues.
* WG's are responding
* Skim meeting next Monday.
* Giulia: It will be useful to have a time schedule for data, especially for Chunk 1.
* Stefano: there is no answer yet.
* Stefano: A technical issue with the light version this morning.
# Calibration
* CDC de/dx issue (Renu Garg)
* During the summer, the order in which modules run was changed, including the MC matcher. This is still problematic.
* "flip and refit" & "MC matcher" should go before dEdx. It is not trivial to implement the fix now.
* Giacomo suggested adding MCtrack-matching inside CDCdEdxPIDModule
* Frank(?): Why did MC matching not look so bad before?
* Renu: We were using private-version ntuples and did not realize the order was causing problems.
* Marko: the right position should be post-tracking & post-filtering. The modification should not be too difficult.
* Showed pi/K separation with CDC only in March, now, and after moving dEdx to postfilter_posttracking. Last and first are similar, so that is the proposal.
* Giacomo: we also need to check all of the CDC reconstruction with this reshuffling, to be sure that it is correct for calibration and not only for MC
* mentioned possible timeline to generate the necessary MC sample.
* Frank is not convinced that this is the right time to try the modification and checks.
* Marko: it can be done in a day.
* Stefano: What would be next steps after Marko's test?
* Marko needs CDST for D* (maybe calib cDST)
* Giacomo: Two projects in parallel
* check effect of his branch in MC
* also check Sviat's check on s-proc5 on data
* Need to be sure that data and MC are consistent
# Data Processing (Pere)
* no news
# MC report (Giovanni Gaudino)
* s-proc5 MCrd: almost finished. 116 jobs remaining.
* Stefano analyzed the speed of the production during s-proc5. The data will be used to set the reference.
* MC15rd_B signals: 85% done. ~200 to be submitted. The finished channels can be passed to collection creation.
# DP news
* calibration review: Updated note Belle2-NOTE-TE-2023-013
* Re-calibration at KEKCC. Dedicated disk and LSF. Resources will be OK until 2028.
* Continuing discussion for resource after 2028.
* Signal request for MC16
* will request a list with priorities from the WGs (via the DP liaison)
* want to start generation ASAP, right after the calibration of a given chunk is completed. Both for generic MCrd and signal (at least high priority ones)
* Xiao will help Boyang.
* Milestone setup at GitLab.
* Label setup at GitLab, too.
* Please use "submitted"
* Giulio: proposal to re-organize DP repositories
* data ---> dataprod https://gitlab.desy.de/belle2/data-production/data/-/issues/20
* Giulio: proposal for rel8 DP global tags
* the default names: data_reprocessing_., data-processing_., mc_production_.: https://gitlab.desy.de/belle2/data-production/data/-/issues/22
* Giulia: Question on run 2 payload name conventions.
* Giulio: data_prompt
* Maintenance days every 4 weeks https://gitlab.desy.de/belle2/data-production/data/-/issues/25
* patch releases will be synchronized
* buckets? to be decided
---
# DP Meeting 23rd November 2023
## Attending
>Alessandro Gaz
Andreas Gellrich
Anja Novosel
Bianca Scavino
Doris Kim
Gaurav Sharma
Giacomo "the GOAT" De Pietro
Giulia Casarosa
Giulio Dujany
Jaeyoung Kim
Luigi Corona
Marko Staric
Michel Hernandez Villanueva
Noah Brenny
Pere Gironella
Shanette (guest)
Stefano Lacaprara
Swagato Banerjee
Valerio Bertacchi
Xiaodong Shi
## Software shift report (Vard)
## DP status
* New policy for signal requests: if a new dec/dat file is needed, DP will start production only after it has been merged into a patch release
* Technical meeting DP-DC resumed: schedule is bi-weekly. Next is Thursday Dec 7th 15:00-16:00 JST
* Airflow future:
* Either we find someone to support it, or we should consider moving to a different tool
* Candidates: b2luigi, or maybe https://github.com/justanhduc/task-spooler (used by CC).
* Would need a detailed description of what AirFlow is doing and evaluate the possiblity.
* Goal is to have a plan for full BPAC in Feb.
## Calibration
* Ale did PID validation, still found the problem with dEdx calibration
* Issue: https://gitlab.desy.de/belle2/software/basf2/-/issues/10204
* Seems very serious
* The suspicion is that the calibration code for CDC dEdx produces a bad payload
* Marko volunteers to help
* Once it is solved, the CDC sproc5 calibration should be redone and the validation redone before we can start with proc16
## Processing (Pere)
* mDST submitted, now at 99% both for all and hadron
* Giacomo: Large logs might be related to a problem in a payload, as changing the order of GTs in the chain makes it disappear. Also, it was not there in cDST production
## Montecarlo
* pipipi0pi0ISR production almost done, will be saved on DSS with pipipi0pi0ISR event type
* s-proc5 GT: payload 10/1 ready, BeamParameter not yet. Stefano and Giulio created payload using script from Radek (with some hack): need to have confirmation from Radek.
* Once ready we can start
* MCrd taupair production with different lifetimes
* Resubmitted and running
## Skims (Stefano)
* collection preparation ongoing
* which release should we use? A patch release or a light release?
* A light one if technically possible (it seems so)
* Analysts should be using a newer release than the one that was used for skims
* Bottomline: we will run skims with the light release and we will do that also for the future
* Skim librarian should anyway cherry-pick changes to skim to the release patch branch.
---
# DP Meeting 9th November 2023
## Attending
> Bianca Scavino
Carlos Lizama
Doris Kim
Frank Meier
Gaurav Sharma
Giovanni Gaudino
Jaeyoung Kim
Jim Libby
Karim Trabelsi
Luigi Corona
Marko Staric
Matt Barrett (he/him)
Noah Brenny
Pere Gironella
Racha Cheaib
Renu Garg
Shanette (guest)
Stefano Lacaprara
Thomas Lueck
Valerio Bertacchi
Yubo Han
## Software shift report (Th. Lueck)
* nightly build failures (ongoing for months)
* a few in HLT; created an issue
* FM: known, multiple attempts to fix, no success so far
* Validation
* ECLChargedPid, existing issue
* TDCPV_ccs skim (MR merged)
* analysis_pi0 (cannot reproduce locally on different dataset)
* FM: fixed
* Other
* updated jira->gitlab issue for DP
* fix additional lib and warning for several packages
* Occasional memory leak reproduced locally (sending many jobs to lsf) - Issue 10181 created by Christian
* failing at beginRun, in modules using MVA
* MVA writes to /tmp: about 1.2GB for these jobs, maybe the max allowed.
* changing by hand /tmp to local dir --> no crashes
* Fix not clear
* Frank: great work, especially the memory leak investigation
* confirmed that there is now a quota on /tmp (1 GB per user)
* maybe possible to increase the quota for the validation jobs.
* more critical is to understand why memory increases if /tmp is full
* SL: concern whether this heavy use of /tmp could be a problem
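The "change /tmp to a local dir" workaround above can be sketched in Python; `TMPDIR` as the steering mechanism and the directory naming are assumptions, not the shifter's actual setup:

```python
# Hypothetical sketch: point temporary files at a per-job local scratch
# directory instead of the quota-limited shared /tmp.
import os
import tempfile

# Create a scratch directory next to the job's working directory.
job_tmp = tempfile.mkdtemp(prefix="b2tmp.", dir=os.getcwd())

# TMPDIR is honored by tempfile and many libraries that write scratch files.
os.environ["TMPDIR"] = job_tmp
tempfile.tempdir = None  # force tempfile to re-read TMPDIR

print(tempfile.gettempdir())  # now the per-job directory, not /tmp
```

A wrapper like this avoids the shared quota entirely, at the cost of using the job's own disk allocation.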
## MC status (Giovanni)
* ee -> pi+ pi- pi0 pi0 gamma
* special request since it was not included in generic hhISR, now it is.
* Issue with one long run (8h) resulting in a too-long job
* s-proc5:
* Using Luigi's (fast) tool to create GT for MCrd for s-proc5.
* asked all experts to provide payloads to be included.
* Few replies so far.
* BGO production with release8
* a single campaign for all BGO for all experiments
* done for bucket36
* MCri
* validation sample with release8 done
* production of B->tau ell (24/ab)
* progressing, some issues with the retention rate due to statistical fluctuation
## Calibration (Renu)
* post-tracking calib for s-proc5 (by hand). All completed.
* last was ECL done by chris
* Last is Beam Spot calib running now
* GT to be ready ~today
* SL and Pere to start production of data mDST as soon as ready
## Skim (Valerio)
* plan to define with the WGs which collections are useful
* proposal is to have two collection per skim
* main (with signal) + background
* the definition depends on WG requests
* Need feedback from WG
* For proc16
* will add MCtype variable (string) to select all MCType
* Have a single collection with all generic
* need feedback
* Re-skimming campaign
* collecting all MRs with new or updated skims
* uDST future
* FEI is using it, what about other WG?
* Jim: some analyses in DM are actually B-like analyses, so they might need some fine tuning
* Frank: are the MRs directly affecting the skim packages, or also ones which might affect them indirectly?
* e.g., fix KL
## Status of MCrd signal
* delayed by a few technical issues, plus the B2GM
* Now it should be back on track
* Total Done 187 - 2 running - 31 requested
---
# DP Meeting 12 October 2023
## Attending
> Anja Novosel
Bianca Scavino
Doris Kim
Frank Meier
Gaetano de Marino
Gaurav Sharma
Giacomo "the GOAT" de Pietro
Giovanni Gaudino
Jaeyoung Kim
Jim Libby
Karim Trabelsi
Karol Adamczyk
Marko Staric
Matt Barrett (he/him)
Michel Hernandez Villanueva
Noah Brenny
Pere Gironella
Priyanka Cheema
Renu Garg
Stefano Lacaprara
Tadeas Bilka (guest)
Tommy Lam
Tommy Martinov
Trevor Shillington
Valerio Bertacchi
## Software shift report (Tommy Lam)
* several memcheck issues from Oct 11
* several validation plots turned red
* some failures from HLT due to a missing file
* a push problem in git resolved itself just before the meeting
* work done on documentation, the MVA tool interface, and improving the DQM documentation
* Frank: MVA tools need to update description on ticket (TL: Done)
* The script for ECL (ECLChargedPID.py) times out for some reason. The person responsible for it has left Belle II, so until a replacement starts, the script will continue to fail. This means that the ECLChargedPID plots will have no nightly plots to compare to (they will be blue)
## DP status (Stefano)
* MC
* B -> tau ell (EvtGen skim) running, 60% done.
* MCrd: Backlog delayed due to switch from CCDB to RunDB for BGO info.
* MR!398 merged
* MCrd signal collections
* collections for all existing MC15rd signals are listed on Confluence and in the /belle directory.
* How to merge MCrd signal
* Three ideas emerged: Merged by exp, additional merge step for every production, or run locally.
* MCrd for misalignment studies requested by tau group.
* Technical difficulties preventing creation of the sample: the mis-alignment GT is run-independent, the random seed cannot be set on the grid, but 1M events is too large for KEKCC.
* S-proc5 (Umberto's report)
* Running at KEKCC manually w/o airflow.
* Airflow
* Tasks to migrate Airflow to KEKCC and GitLab are not finished yet.
* Migration to Airflow2 done.
* Summary on skim
* Discussion is going on about how to create collections. Proposals of directory structure have emerged for both data and MC.
* Handling collections is not a trivial task, given the large number of collections.
* B2GM meeting items
* DP group restructuring
* skim schema
## Calibration (Renu)
* pre-tracking started, jobs are running
* ETA by early next week
* no problem with CDB so far
## Skim (Bianca & Trevor)
* Bianca: Discussion is ongoing about collections of skim samples
* Trevor:
---
# DP Meeting 28 September 2023
## Attending
> Anja Novosel
Bianca Scavino
Boyang Zhang
Carlos Lizama
Christian Wessel
Doris Kim
Frank Meier
Gaurav Sharma
Giacomo "the GOAT" De Pietro
Giovanni Gaudino
Giulia Casarosa
Giulio Dujany
I Ueda
Jaeyoung Kim
Jim Libby
Karim Trabelsi
Karol Adamczyk
Marko Staric
Matt Barrett (he/him)
Patrick Ecker
Pere Gironella (guest)
Stefano Lacaprara
Thomas Grammatico
Trevor Shillington
Ueda (guest)
Umberto Tamponi (he/him)
Valerio Bertacchi
Xiaodong Shi
Yu Nakazawa
## Software shift report (Frank Meier)
* Dev build: few fixes
* on 25/26 still problem with CDB issue
* Validation failed twice (disk space exceeded):
* buildbot reconfigured w/o using /tmp
* Still many validation and even more expert plots failing
* Memory footprint increased on Sept 20
* RootOutput, CDC track finder, and V0 finder show two variations at the same time
* One related to dEdx payload update
* The other to KLM geometry (also mDST/DST size increased, not clear why)
* File size: Two changes not related to the above problem.
* initial particles added on September 15th
* DEC files related work
* charm dec file updated MR !1551
* discussion: convention on aliases should be reverted.
* discussion on DECAY_Belle2.dec and mixing & CP parameters MR !2324
* Jim agrees that we should have a centralized solution for mixing and CP. The matter will be discussed in a meeting.
* Added global tags to cvmfs squid MR !2328
* Stefano says DP group should have handled this merge earlier.
* Can we use rel8 BGO for real production (validation etc.)?
* Frank: yes.
* Double check that these have been produced.
## DP status (Stefano)
* Nothing running on grid from DP point of view.
* All the jobs are user jobs.
* The only request is a large B to tau ell with gen-level skim.
* An issue is being investigated.
* CoreComputing DB access issue
* CCDB access discontinued due to security concerns.
* hRAW files will be produced by CC, not DP anymore.
* BGO and run related info will be retrieved from RunDB, not CCDB.
* MC15rd skim production 100% done.
* Thanks to Trevor.
* Though EWP skim needed a patch.
* Jira legacy tickets.
* asking managers to close old tickets.
* Proposal for future
* There will be only one data-production/request project.
* labels will be used to distinguish Data, MCri, MCrd, etc.
* B2GM planning started. Tuesday slot will be assigned to DP.
* DP group restructuring, new managers, skim schema improvement, etc.
## Calibration (Umberto)
* calibration@KEKCC
* 400 cores, disk for cDST, squid
* Stress test to saturate all cores and see the impact
* mostly load of CDB
* Pick 5 calibration (~random), and run them together
* Queue filled and kept full
* Some load peak at the beginning of jobs, not very much
* Then at some point other (non-calib) jobs started somewhere else and killed the CDB
* During the debugging, we found that squid was not actually running
* KEKCC is basically fine (also w/o squid), but no other test before squid is fixed and CDB issue is understood
* sproc-5 plans
* Airflow at KEKCC in progress but not fully ready
* Release 8 not validated yet
* DB is fragile
* devising ways to speed things up.
* partial calibration instead of full recalibration
* prerel-08d
* SW wanted sproc-5 to be run with prerel-08d
* Umberto: How about tracking validation?
* Frank says release 8 after sproc-5. Because SW needed data to validate release 8.
* Decision: We will use pre8d for s-proc5
* do not reproduce cDST
* manual management instead of Airflow
* We can start s-proc5 now, but should we with the issue of CDB?
* CDB issue:
* not understood; BNL is investigating, but no explanation has been found yet.
* For s-proc5 we can get the payloads locally, via cvmfs, or via squid.
* squid issue:
* KEKCC are tuning parameters
* Need to deploy squid server in B2 grid sites, starting with RawProc ones.
## Monte Carlo rd (Giovanni)
* issue 879 tau pair samples: This will be treated as a signal MC.
* Users should use the special MC event numbers 3400100004{0,1,2,3} to search for them.
* collection directory is the same as before.
* There was an observation from Marcus that gbasf2 got timeout looking for signal MC collections. Stefano will open a ticket.
## Skims
* MCrd done
* EWP will start as soon as patch release will be ready
## Validation (Patrick)
* 9 validation modes implemented in VIBE
* But the last one got into trouble on Grid due to a GNN flavor tagger.
* No feedback from WGs to SW/DP yet. One WG is looking into it.
* VIBE ran very fast.
* EWP, KLM, Quarkonium, SKIM are interested in doing things in VIBE. TDCPV not sure.
---
# DP Meeting 14 September 2023
## Attending
> Christian Wessel
Doris Kim
Frank Meier
Giacomo De Pietro
Giovanni Gaudino
Giulio Dujany
Jake Bennett
Matt Barrett (he/him)
Radek Zlebcik
Thomas Kuhr
Umberto Tamponi
## Software shifter (Thomas Kuhr)
* Unusual shift week
* CDB access failures: builds and tests failed. No monitoring and no merging of MRs
* The cause is not known yet. A temporary remedy is increasing the client timeout.
* Giacomo: But the strategy does not change basf2.
* Umberto: The higher plateau of CPU Usage cannot be explained easily. Could it be the KEK batch system load?
* a KEKCC /tmp quota problem suddenly appeared, causing build failures.
* FEI validation consumes large disk space.
* Reduce the number of parallel jobs?
* Status of migration of the analysis repository
* TDCPV
* SL and ME
* Directory structure to be changed?
* A discussion on Doxygen update in the externals.
## Production status (Umberto)
* The next step is calibration.
* Stress test: Transport to KEKCC needs another week.
* sproc-5 can be started after the stress test.
* Nothing serious on the processing side
## MCrd (Giovanni Gaudino)
---
# DP Meeting 31 August 2023
## Attending
>Anja Novosel
Bianca Scavino
Christian Wessel
Doris Kim
Frank Meier
Giacomo "the GOAT" De Pietro
Giulia Casarosa
Jake Bennett
Karol Adamczyk
Marko Staric
Matt Barrett (he/him)
Michel Hernandez Villanueva
Noah Brenny
Priyanka Cheema
Renu Garg
Stefano Lacaprara
Umberto Tamponi (he/him)
Yang Li (guest)
## Data processing and other topics (Stefano Lacaprara)
* request for additional sproc-4 came in.
* Marko asked to use release 7 since it has a fix on bunch finding.
* only hadron, with patch in release7, to follow up in ticket.
* preparation for s-proc5 started
* release 8 + scripts test runs
* HLT in monitoring mode will be included in the DP tools.
* Matt Barrett: The number of full events in the prerelease-08 validation samples looks OK.
* Watanuki-san is changing institute.
* MC special request
* Request for a large sample (1/ab) of generic MCri with heavy gen-level filtering
* Since we are moving from ri to rd, a compromise is being sought.
* Signal: MCri and MCrd status are shown on a slide by Stefano.
* MCri with BGx0 is being generated: most probably the last production with the option.
* MCrd: reduce numbers of ProdID
* one production per experiment/beam energy
* produce together multiple signals
* multiple processes, each writing one signal type to a specific output, run together
* Skim news
* completing MC15rd skim production
* new skim team: Racha, Trevor -> Valerio, Bianca, X
## Calibration (Renu Garg)
* Ueda-san managed to grant permission to the calibration group at KEKCC.
* /gpfs/group/belle2/grid/storm/CALIB/belle/group/detector/CAF/prompt/
* /gpfs/group/belle2/grid/storm/CALIB/belle/group/detector/CAF/prompt/s-proc4
* Need to be part of B2_calib group
* The folder structure at KEKCC is the same as the one at NAF.
* calibration documentation: TBA
## Skim
* There is no comprehensive list of skim collections. Should be discussed with the new skim manager.
## SW shifter report (Christian Wessel)
* Overview plots mostly stable
* 0.9% increase in mdst size between Aug. 29th and Aug 30th
* Interestingly enough, there was also a change in the tracking FOMs in validation between the two days (increase in charge asymmetry)
* And the number of shifter validation plots in an error state changed between the two days
* Only two MRs merged on August 29th:
* [2180](https://gitlab.desy.de/belle2/software/basf2/-/merge_requests/2180) `Fix definition of backward EKLM shield layers 13 and 14. `
* [2232](https://gitlab.desy.de/belle2/software/basf2/-/merge_requests/2232) `Bugfix/10056 remove energycutbkgd0`
* Work done this week:
* Fix compiler warnings in beast, generators, daq
## AOB
* Giacomo: JWT authentication
* https://gitlab.desy.de/belle2/admin/jwt-server/-/blob/main/cdb_token.py
* Several authors of the global tag tools are not in the collaboration any more.
* Umberto
* We saw a number of questions related to very basic issues, such as how MC is produced, how to use the different data tiers, etc., showing a lack of information
* Proposal: when you register to B2MMS, you get a welcome mail with list of documentation and links
* for B2MMS the right person to ask is Daniel or Andreas.
* eg https://confluence.desy.de/display/BI/Belle+II+Newcomers
* Analysis model is empty there: we need to fill.
* Frank: How about a basic level quiz for newcomers?
---
# DP Meeting 17 August 2023
## Attending
>Doris Kim
Frank Meier
Kirill Chilikin
Noah Brenny
Priyanka Cheema
Severino Bussino
## SW shifter report (Severino Bussino)
* Validation plots:
* 1 ECL script failed at the beginning.
* Priyanka is looking at the failed ECL scripts.
* Frank: If he remembers correctly, there is a time limit on the execution time of the scripts. How about contacting the performance group or asking the ECL group for a new contact person?
* New failed scripts in TOP and tracking appeared yesterday (Aug 16th), but then disappeared.
* Frank wonders what is causing the LSF memory usage limit in the failed TOP script, which was not a problem in the past.
* Fluctuations in memory leaks, as usual.
* After KEKCC was turned on, configuration had to be changed.
* Christian's suggestions as the August 3-10 shifter.
* However, could not test the basf2 examples. This task will be the suggestion for the next shifter. Please discuss the matter with Gayane and collaborate with him.
* Overall, smooth shift.
* Frank: Validation summary plots: the number of failed scripts and comparisons is not accurate, and there are no warning emails for failures. It is not clear what is happening.
* Kirill: What is happening with updating the external libraries? Frank says let's try after the completion of release 8 validation.
## Others
* Frank: Release 8 validation sample productions have been finished.
* Next week will be the regular SW meeting with bug fix reports.
# DP Meeting 3 August 2023
## Attending
> Christian Wessel
Doris Kim
Frank Meier
Gayane Ghevondyan
Matt Barrett
## SW shifter report (Gayane Ghevondyan)
* Some memcheck fluctuations, but most metrics are stable.
## AOB
* Matt
* KEKCC will close down in 12 hours.
* Frank says only KEKCC will be down out of the entire Belle II grid, but no Belle II grid jobs may run, as DIRAC services will be suspended for the duration of the KEKCC shutdown.
* In one of Andreas's emails, he says the confluence migration will happen. The details are not announced yet.
* Frank: This means older and not used information should be deleted from the current Belle II confluence pages.
* Christian: Please make sure that we do not lose access to the Belle II code on stash/bitbucket. There could still be some private analysis code not properly stored in the Belle II system yet.
* Christian Wessel
* DESY limits the memory usage of JupyterHub server to 10 GB. This limit was in place before, according to DESY IT, but somehow jobs could exceed this limit without any consequences. Only now jobs started crashing last week and according to DESY IT nothing changed in the setup.
* The US summer school did not use the DESY JupyterHub so it didn't interfere with this.
* What will happen when the Belle II people come back from summer vacation and find that their jobs at DESY JupyterHub are crashing?
# DP Meeting July 20
## Attending
> Andreas Gellrich
Bianca Scavino
Boyang Yu
Christian Wessel
Cédric Serfon
Doris Kim
Dmitry Matvienko
Frank Meier
Gaetano de Marino
Gaurav Sharma
Giacomo "the GOAT" De Pietro
Giovanni Gaudino
Giulia Casarosa
Jake Bennett
Karim Trabelsi
Luigi Corona
Matt Barrett
Michel Hernandez Villanueva
Patrick Ecker
Stefano Lacaprara
Suryanarayan Mondal
Swagato Banerjee
Trevor Shillington
## SW shifter report (Christian Wessel)
* Old (former Jira) issues
* pinged O(450) old issues where the last update was before the migration
* including many unassigned
* 30 issues sent to backlog
* What shall we do with those issues created by people who left the collaboration?
* Comments to Christian: impressive work on the huge number of issues pinged
* Parts of nightly build pages were broken.
* EPIC sub-issues lost links during migration.
* Geant4 memory issues: will not be easy to fix from our side.
* Doris will look into the matter later.
* Also fixed some of memory warnings and doxygen warnings.
* Recommendation for the next shifter: look into the remaining O(200) old issues.
* Validation issue report page: https://gitlab.desy.de/belle2/software/basf2/-/issues/3281
## General production status (Stefano)
* Nothing serious is happening now.
* Stefano will report on the data production group restructuring at the coordinator's meeting.
* Stefano will go on vacation next week, till August 7.
* Umberto is on vacation now and will be back at the end of next week.
## Calibration (Giacomo)
* Giacomo and co are discussing with BNL on the migration of DB using JWT (the token). Thomas is also in the loop.
* JWT is issued by a service (already in place), by default the validity is 24h.
* Is this fine for Airflow? Unlikely.
* proposal is to extend the token validity to 1 month for calibration manager.
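The 24 h default validity discussed above can be checked client-side by decoding the token's `exp` claim. This is a generic stdlib sketch with a self-built toy token, not the actual cdb_token.py logic:

```python
import base64
import json
import time

def jwt_expiry(token: str) -> int:
    """Return the 'exp' claim (unix time) from an unverified JWT payload."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["exp"]

# Build a toy token valid for 24 hours to demonstrate the check.
now = int(time.time())
claims = {"sub": "calibration-manager", "exp": now + 24 * 3600}
body = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=").decode()
toy_token = f"header.{body}.signature"

remaining = jwt_expiry(toy_token) - now
print(f"token valid for another {remaining // 3600} hours")
```

Note this only inspects the claim; it does not verify the signature, which the CDB server would do.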
## Speed up SVD calibration (Surya)
* most time consuming is SVD cluster calibration: 8-12 h
* Ran all the files on NAF to recreate/identify the issues.
* Many histograms, O(1000), are created by each job, and it takes a long time to access them via getObjectPtr()
* 1D histo per sensor -> 2D time vs sensor -> 3D to also include the calibrated time
* merge time greatly reduced.
* Plan to implement a time limit for collector jobs: if too long, skip the rest of the events.
* Giacomo:
* can be implemented in the base class, so other collectors can also use the functionality.
* Does boost have a helpful mechanism to alleviate the situation?
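The 1D-per-sensor to 2D reorganization above can be illustrated with plain Python containers standing in for ROOT histograms; getObjectPtr() and the real SVD geometry are not reproduced, and the sizes are toy values:

```python
# Toy illustration: many per-sensor 1D histograms vs. one 2D histogram
# (sensor index vs. time bin). Plain Python lists stand in for ROOT objects.
N_SENSORS = 4    # the real SVD has many more sensors
N_TIME_BINS = 8

# Before: one 1D time histogram per sensor -- O(N_SENSORS) separate
# objects that the merge step must fetch one by one.
per_sensor = {s: [0] * N_TIME_BINS for s in range(N_SENSORS)}

# After: a single 2D histogram, fetched once and merged in one go.
merged_2d = [[0] * N_TIME_BINS for _ in range(N_SENSORS)]

def fill(sensor: int, time_bin: int) -> None:
    """Fill both layouts so their contents can be compared."""
    per_sensor[sensor][time_bin] += 1
    merged_2d[sensor][time_bin] += 1

for sensor, time_bin in [(2, 5), (2, 5), (0, 1)]:
    fill(sensor, time_bin)

# Both layouts hold identical counts; only the number of objects differs.
assert all(merged_2d[s] == per_sensor[s] for s in range(N_SENSORS))
```

The speedup comes from the object count, not the data volume: one lookup and one merge replace thousands.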
## MCrd (Giovanni)
* MCrd for dark sector BIIDP-6356 has finished.
* For each energy
* The final set size of a certain event is small as expected.
* collections to be created
* Request from tau group with different shifted lifetime
* s-proc4:
* Manifested as a problem in Kpi mass resolution
* Giulia tracked down the problem and found a wrong payload from bucket35 and later (i.e., 35 and 36). 34 and earlier buckets are ok.
* Fixed by hand.
* In the future, we need an automatic procedure to avoid the same problem.
* Should MCrd sproc4 be redone?
* Giacomo: the wrong payload is used only in release 7, so MC15rd is fine
* what about s-proc4? Giulia: probably not worth it
* probably better to focus on s-proc5
* Action item:
* create new GT with correct payload
* produce BBbar and Giulia, Bianca to use VIBE
* BGO
* The problem with the first production is understood; new-version samples are being produced.
* better to wait for new BGO production for validation sample?
* probably yes but need quick validation
## Skim (Trevor)
* MCrd skim remaining skims
* increased time per event x2 and fewer skims per combined skim
* now finally running smoothly.
* Hadron skim 95% done, taus as well
* remaining all_skim to be resubmitted in 3 combined skims
## Validation (Patrick)
* group on grid working smoothly.
* Implementation of modes:
* Gaurav from DM and LM, Tracking (Bianca, Martina), nothing else from the other WGs.
# DP Meeting July 06
## Attending
> Carlos Lizama
David Dossett
Doris Kim
Frank Meier
Gaurav Sharma
Giacomo "the GOAT" De Pietro
Giovanni Gaudino
Jake Bennett
Marko Staric
Matt Barrett (he/him)
Michel Hernandez Villanueva
Patrick Ecker
Ravinder Dhayal
Renu (guest)
Stefano Lacaprara
Swagato Banerjee
Suryanarayan Mondal
Trevor Shillington
Umberto Tamponi (he/him)
## SW shifter report (Doris Kim)
* Confluence is not working from outside.
* Also, many shift tools are not working.
* Frank is aware of many of the issues reported by the shifter.
* Red validation plots are an issue, especially because it seems that nobody is feeling responsible for them.
* When shift tools are not working, inform Thomas Kuhr. These index problems happen even when DESY can be accessed from outside.
* Clang check is switched off now. We had too many messages.
* Error reporting tool on geometry is not working.
* HLT test scripts failing: experts have tried to fix the issue for more than a year, but could not stabilize the problem.
* Several memory issues fixed by merging into the new release. The issue around July 1st should be fine now.
## Other SW issues
* Giacomo reported that Gitlab pipelines break from time to time. The cause is not known yet. https://gitlab.desy.de/belle2/software/basf2/-/issues/9994
* BGO: some issues with the job submission limit
* first batch should be ready by tomorrow
* Jake Bennett found a segmentation failure issue coming from Geant4. https://gitlab.desy.de/belle2/software/basf2/-/issues/9998
* JIRA: deadline for migration? End of August.
* In August JIRA will be in read-only mode
## Data production
* as announced, mDST for s-proc4 are fully done with fixed alignment.
* Old mDST (with wrong alignment) will be removed after a follow-up mail
## MCrd status (Giovanni)
* MC for BIIDP-6356 is finally almost done
* s-proc4 mDST are fully done and collection created
* Request for BIIDP-6367 for tau with shifted lifetime
* MR created
* BGO for MCrd: can we use prerelease-08-00-00a?
* it should be fine
* Stefano: The MCrd signal production is going on. In the beginning, there was a very long queue. Some of them were even 1 month old.
* Sometimes they had to stop signal generation since there were more urgent requests.
* Umberto thinks the current size of backlog is manageable.
## Calibration
* Umberto: Not much is happening in this corner now.
* David Dossett: In the midst of migration of Airflow.
* The DB should be upgraded, which incurred a certain compatibility problem.
* https://gitlab.desy.de/belle2/data-production/calibration/automated_calibration/-/issues/4
* Back up plan? Migration to MySQL is rather easy.
* A long-term strategy is using PostgreSQL.
* Currently, the DB part is within Docker, which would give compatibility when switching to another type of DB.
* Michel Hernandez Villanueva: Just to comment on the DB at DESY, I could ask Andreas for MySQL (and a very specific version) installed in the VM instead of MariaDB without issues.
* Need to test squid server at KEK:
* can try with validation jobs, but the payload access pattern is different. Still can be useful (and very easy)
## Validation (Patrick Ecker)
* Good attendance for VIBE tutorial: some feedback
* We are collecting steering files (Gaurav)
* a GitLab ticket to be created directly (skipping JIRA) to follow it up
* Number of events is what? 1M.
## Skim (Trevor)
* Update on MCrd skims
* Lost a month due to failing skims.
* US B2 meeting @ Duke University soon: Trevor to present DP to students.
* Skim presentation by Umberto at PGM:
* The end purpose is sustainability of skimming process.
* interest from WGs, not much feedback yet
* following up with Jim and Diego
* Umberto is thinking of reducing the number of skims.
# DP Meeting June 22
## Attending
> Andreas Gellrich
Carlos Lizama
Doris Kim
Frank Meier
Giacomo "the GOAT" De Pietro
Giovanni Gaudino
Jake Bennett
Marko Staric
Markus Prim
Matt Barrett (he/him)
Michel Hernandez Villanueva
Patrick Ecker
Stefano Lacaprara
Suryanarayan Mondal
Tommy Lam
Trevor Shillington
Umberto Tamponi (he/him)
## SW shifter report (Suryanarayan Mondal)
* many tracking plots went green, but a problem was reported today on the mailing list
* Frank's comment: an important fix came from updating the reference plot, very important! Tracking looks OK now.
* Frank's comments on validation plots:
* Since Geant4 version got updated a while ago, some plots may have changed.
* not sure what happened to ECL around June 19 2023.
* 1000th pipeline crossed by Giacomo
* working on issues
* 7700 (Python update to PEP8): convert % formatting to f-strings
* 9960 cleaning SVDSpacePointCreator: separate header from helper
* 9961 SVDTimeGrouping:
* 9963 update contact info of validation plots: did not have time to tackle it yet. Next shifter perhaps?
* next shifter: none. The week after (June 29-July 6) is Doris.
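Issue 7700 above concerns converting old %-style formatting to f-strings. A minimal illustration of the kind of rewrite involved (the example strings are made up, not taken from basf2):

```python
name, count = "tracking", 3

# Old %-style formatting (what the issue asks to replace)
old = "package %s has %d warnings" % (name, count)

# Equivalent f-string (PEP 498), preferred by the style update
new = f"package {name} has {count} warnings"

assert old == new  # both render identically
```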
## DP status (Stefano)
* S-proc4 status
* Additional MC14ri for D(*)
* MC15ri for pythia tuning: To help Gevorg and his group. Going smoothly.
* Mis-aligned signals. Now at the merging status.
* Running productions: 48
* 11 lines are somehow at 99.9% instead of 100%. Need to understand why.
* GitLab migration
* restructuring desired, but still at the planning stage.
* Opening PRs is now allowed only on GitLab.
* Calibration manager: Markus Prim -> Renu Garg.
* Skim manager? Trevor Shillington steps down. Two candidates.
* Patrick: Validation scripts should be moved to GitLab before the deadline. https://stash.desy.de/projects/SRVF/repos/srvf/browse
## Calibration (Markus)
* proc16 will be done at KEKCC
* There will be a request for staging at DESY to copy cDST to KEKCC
## MCrd status (Giovanni)
* BIIDP-6356: MCrd ee channel, all experiments, but with heavy skimming, so <0.01% retention rate.
* resulting in very short jobs, overloading the sandbox server
* If we rescale the retention rate, the jobs become too long now, exceeding max job length.
* Tuning is complex: we should have it done
* s-proc4
* different tracking configuration is completed
* collections are ready
* All info updated on the Confluence page
* s-proc4 standard only 1 merge job running
* collection will follow
* Then we cancel old productions
* Reported size of all productions
* BIIDP-6367
* taupair only, with different tau lifetime
* total expected size 2 GB * 4
## Skims (Trevor)
* MC15rd: many skims cancelled probably due to CPUTime issue (according to Miyake-san).
* Some samples did not have enough events to set CPUTime correctly.
* The above solution does not fix all the affected lines. Should we increase ExpectedEventCPUTime artificially?
* No skim requests for s-proc4 yet. Stefano thinks we do not need skims for this line.
* Umberto: time profile for different exp is very different, with few skims dominating the total time
## Validation (Patrick)
* Fri and Mon validation tutorial
* Next Wednesday: discussion with distributed computing to have special queues.
* Michel: Is there any special sample needed? The answer is no from Patrick.
## Schedule of validation (Giacomo)
* BGO is converging on numbers for BGO production
* so we can start BGO immediately after freeze
* 1-2 weeks (hopefully better than that, comments by Jake Bennett)
* Jake: Do we need BGO for exp 1004?
* Let's start with 1003 then 1004 and 0 (?)
* Marko: there is MR to be merged (still some open issue). Asking Giacomo for help.
* For example, a new XML file for Run 2 should be updated. Who will take on this project? Jake mentions some names such as Giulia (PXD/SVD), since they should be aware of the situation.
* The simulation time schedule should be discussed between the relevant people.
* Steering for validation sample
---
# DP Meeting May 25
## Attending
>
Andreas Gellrich
Carlos Lizama
Doris Kim
Frank Meier
Gaurav Sharma
Giacomo "the GOAT" De Pietro
Giovanni Gaudino
Giulio Dujany
Jim Libby
Karim Trabelsi
Kenkichi Miyabayashi
Luigi Corona
Marko Staric
Matt Barrett (he/him)
Michel Hernandez Villanueva
Noah Brenny
Patrick Ecker
Radek Zlebcik
Stefano Lacaprara
Swagato Banerjee
Tadeas Bilka
Thomas Lueck
Tommy Lam
Trevor Shillington
## SW shifter report (Tommy)
* no change in validation: maybe KL cluster time, but already reported by previous shifter
* worked on some beginner-friendly tickets
## DP news (Stefano)
* Luigi: where to put the MCrd GT preparation scripts
## MC (Giovanni/Gaurav)
* MCrd: request from dark sector: ee (427/fb) but with skimming in place so size will be small. Needed to increase the BGO reuse rate by x100 (40k)
* s-proc4: done with missing GT.
* Prepared new production w/ correct GT
* prepared also for 3 special tracking configuration
* GT approved by Giacomo and Tracking group.
## Skim (Trevor)
* MC15rd skim running: split in two batches with fewer skims each.
* still failure for wall time issue
* cancelled and further split
* one is done, other still having walltime issue
* Need to understand why
* Might be that the culprit is one specific skim, even though each one is tested against CPUtime.
* Also "all" skim has suffered the same problem, and will need to be resubmitted
* Also problem with disk space
* 200 TB still to be produced, might be too much
* Carlos:
* preliminary work on skim overlap matrices and possible combinations
* https://indico.belle2.org/event/8745/contributions/56796/attachments/22465/33153/Overlap%20Skims%20v2.pdf
## validation (Patrick)
* grid queue for validation, discussing with DC
## Gitlab before August
* will make jira read-only
# DP Meeting May 11 2023
## Attending
>Andreas Gellrich
Ansu Johnson
Carlos Lizama
Doris Kim
Frank Meier
Gaurav Sharma
Giacomo "the GOAT" De Pietro
Giovanni Gaudino
Giulia Casarosa
Giulio Dujany
Karim Trabelsi
Karol Adamczyk
Kihong Park
Ludovico Massaccesi
Luigi Corona
Mario Merola
Marko Staric
Markus Prim
Matt Barrett (he/him)
Patrick Ecker
Stefano Lacaprara
Tadeas Bilka
Umberto Tamponi
Vismaya V S
## Jira -> gitlab migration
* status: no progress for DP
* Andreas: needs to be finished by summer, since DESY has to pay for a new licence in September.
## Linux update
* CERN, DESY, and DAQ are moving to Linux 9 (skipping 8): AlmaLinux 9 for DESY, the same or Rocky 9 for DAQ
## Shifter report (Frank)
* dev build:
* new warning in ECL (loop optimization failed)
* others are from external packages
* clang warnings are now shown (new)
* good task for next shifters
* tests for HLT; skipped tests still trigger a mail (intended)
* validation summary plot now ok (Thomas)
* all current failures are known
* Memory dropped by 45MB, still >2GB (2.65)
* file size increase: not clear where from, but no worry about that
* Buildbot still failing (limits), as well as the dev build (HLT); ext failing (no worry for shifters)
* Worked on new externals v02-00-01
* G4 update, one Python package (codeautolink), config fixes
* working on upgrading ROOT to 6.26 or even 6.28
* only two geometry overlaps
* new GT for validation and new main GT
* for the next shifter: clang warnings, look at stalled MRs, b2questions
* Umberto:
* which version of RooFit?
* Stefano: PXD2 geometry: when do we expect it to be in?
* new MC experiment number 1004 (new geometry)
* likely we will produce MC with 1003 and 1004 for next campaign
* to be discussed
* For validation we will use 1003
* Giulia: we might need to retrain CKF for tracking.
* also collecting condition that might be changed at restart of data taking.
* UT: what about background? Would it be needed?
* Yes.
* #
* So we need two sample of BGO (1003 and 1004)
* BGO is expected to start as soon as release 8 is frozen
* need to decide what to do by then
## DP general news
* Ansu stepping down from MC deputy manager, new deputy manager will be Gaurav (IIT MADRAS)
* Next reprocessing proc16 (skip proc14 and proc15 to align name with next MC campaign MC16)
* Start in fall 2023, end by spring 2024
* Rel8 deadlines: freeze by end of June, ready for validation by July, s-proc5 by September, and prompt calibration dry run by October/November
* No progress on skim optimization
* Markus: must complete s-proc5 by deadline (David is leaving then, might become a problem if plans are delayed)
## Calibration
* Markus: will **not** recalibrate everything for proc16, so experts must opt-in if they need recalibration
## MC
* Giovanni: no news
## Skim
* Remaining skims (MCrd) are running/queued on the grid
## Validation (Patrick)
* Data/MC quality control workflow ready on gitlab
* Also gitlab pipeline to test the steering files and tutorial are in place
* Stefano: where/when should we show the tutorial? Schedule a remote session one afternoon
* Local testing mode available to check steering files on your own machine
* Still need to decide which machine to run this on
* Need the monitoring modes to be added to the framework to actually test it
* Need the possibility to merge runs to increase statistics
* Remaining issues/questions
* Current gbasf2 setup script breaks the automation (user has to input their password): can this be skipped?
* Integrate this into mirabelle or make a new mirabelle instance just for this?
* Stefano: if not a problem, should use current one to avoid proliferation
* Umberto: using same mirabelle instance allows to compare different calibrations, so it would be better to use current one
* Giulia: online quantities computed by DQM, so one must make sure that the quantities we compare are computed the same way
* Store number of streams in the metadata to speedup & simplify future MCRD productions?
* Stefano: in the metadata of files it is not possible (number available only after processing all files)
* Stefano: Number available in the metadata of the collection, can be accessed automatically
* Umberto: there was an idea to put this in the MCRD GT, but it requires some work (the format design is not trivial)
## AOB
### Requests from tracking (Giulia)
* Have some raw data permanently staged and available on KEKCC/NAF
* Order of 1 run/bucket, with different bkg and detector conditions
* Purpose: allow developers to test their new code whenever they want, always on the same data, to compare/validate different sw versions
* s-proc can be too late to fix certain bugs
* Stefano: already working on this on KEKCC (Alessandro from Torino), for NAF need to interact with Andreas and understand size requirements
* Umberto: workaround is to use cDST (they contain all raw data), when using them must manually set which branches to read and which GTs
* Andreas: on NAF cDST still need to be staged from tape
* Action Item: need to have an idea about size: enough to be run on and see something, no B physics
* suggestion: run w/ and w/o HLT filtering
* Request for 1 stream of MCrd for s-proc4 with special reconstruction (x3)
* improvement for fake rate on data is visible and important, would like to have MCrd
* which processing?
* need to ask Petar and co.
### Giacomo (b2luigi):
* b2luigi is now unmaintained (dev graduated and has no contract now)
* we can branch and get it into our repo and maintain it
# DP Meeting April 27 2023
## Attending
>Alessandro Boschetti
Andreas Gellrich
Doris Kim
Frank Meier
Giacomo "the GOAT" De Pietro
Giovanni Gaudino
Kihong Park
Markus Prim
Matt Barrett (he/him)
Patrick Ecker
Racha Cheaib
Trevor Shillington
Umberto Tamponi (he/him)
Michel Hernandez Villanueva
## Shifter report (Radek)
## DP Status
* Not much happening, except skimming
* Discussing how to migrate calibration to the grid and migrate to gitlab
## Calibration
* proc-16
* release freeze end of June. Validation throughout September.
## MC production (Giovanni Gaudino)
* run dependent status - all jobs done for sproc-4
* showed a slide on how to find the files. Collections created, but a problematic ROOT file could not be removed.
* Showed a table of MC file size for sproc-4
* Umberto suggests adding luminosity information to the table
## Skim (Trevor and Racha)
* MCrd15
* Systematics done
* FEI almost done
* Hadron_skims and all_skims resubmitted
## Validation
* GitLab pipeline for the validation framework in place to test reconstruction scripts
* Umberto to Giacomo: Would it be possible to prepare validation scripts? Giacomo says doable.
# DP Meeting April 13 2023
## Attending
Alessandro Boschetti
Andreas Gellrich
Doris Kim
Francesco Tenchini
Gayane Ghevondyan
Giacomo "the GOAT" De Pietro
Giovanni Gaudino
Karim Trabelsi
Matt Barrett
Racha Cheaib
Stefano Lacaprara
Umberto Tamponi
## Shifter report (Gayane Ghevondyan)
* quiet week
* one new red plot in TRG validation: a ticket to be opened
## Skim (Trevor)
* MC15r/proc13+prompt Done.
* Two productions for EWP due to an error in skim code
* FEI redone w/o ECL cut
* disk usage:
* FEI skims are using most of the space (but run on the largest sample)
* about 10 skims (out of 70) use ~50% of the space
* plots shown for disk usage for skims divided by MC type and WG
* Also for data
* MCrd: all submitted.
* Syst are done
* other progressing (slowly)
* Issue with jobs being too long,
* need to be resubmitted with fewer skims for each production
* FEI to be submitted
* Plan for B2GM:
* move to b2luigi for skim workflow
* skim optimization (Carlos and Leo) - no update yet
## DP status
* s-proc4 still not completed due to an issue with run 1825 exp 26, which has only empty events
* Giacomo found a problem in the way basf2 initializes modules, and will produce a patch
* some issue in removing the input files (https://gitlab.desy.de/belle2/computing/distributed-computing/belledirac/-/issues/1925 )
## MC production
* semileptonic production (1/ab uDST) is almost complete, 1 job in a weird state
* Got request for a large production with heavy skimming at generation level:
* will require a new release
# DP Meeting March 30 2023
## Attending
Alberto Martini
Andreas Gellrich
Carlos Lizama
Christian Wessel
Doris Kim
Frank Meier
Giacomo De Pietro
Giovanni Gaudino
Hiroaki Ono
Karim Trabelsi
Karol Adamczyk
Marko Staric
Matt Barrett
Patrick Ecker
Priyanka Cheema
Stefano Lacaprara
Stefano Spatato
Tadeas Bilka
## Shifter report (Tadeas Bilka)
* shifter report is available [here](https://indico.belle2.org/event/8741/contributions/56770/attachments/21646/32054/SWShift2023a.pdf)
* Nightly build and validation page not yet updated
* Sometimes random errors in pipeline
* Replaced remaining TVector3 with XYZVector in the alignment package
* A trivial bug found and fixed in the VXD geometry reconstruction (no effect)
* had to unmerge an old PR to fix issues in prompt alignment
* Several examples in the display package were not working; fixed. Some of them require ROOT files which are missing; maybe a script is needed to generate the files when they are missing. There are some default DSTs, needs checking.
* Notifications for old BIIDB tickets
## DP status (Stefano Lacaprara)
* s-proc4 calibration ended; the processing started this week and should be completed in a few days. Hadrons were staged last night. Reprocessed old mDST (all) with different global tags requested by the tracking group.
* MCrd was submitted, mostly done (very fast jobs)
* Additional MCri productions BIIDP-6283 and BIIDP-6301 with new decay files and dedicated SL skims, respectively. The skims are not in the release; this is an exception. Skims MUST be in the release before being submitted, otherwise they are untraceable.
* Still tails of MCri signals
* Next weeks:
* calibration on the grid: a document is under preparation to be discussed, aimed to be ready for the next B2GM
* GDP: when will calibration run on the grid? SL: release 8, but realistically try both grid and standard calibration and see how it goes
* GDP: a demonstrator for B2GM would be nice. SL: not sure about the time scale; a stable API for the grid is needed and not yet ready
* KT: an earlier discussion, before the B2GM, would be better
* GitLab migration in April (the plan was presented by UT last meeting but is not in Indico yet), plus some code restructuring
* GDP deadline for migration September
* skim optimization, goal June B2GM
## Monte Carlo (Giovanni Gaudino)
* Done jobs for MC14rd_b hhISR exp12
* s-proc4 almost done
* Several payloads had to be added manually; in the next iteration it will be automatic
* ROIParameters
* ECLWFNoiseParams
* ECLWFParameters
* ECLWFAlgoParams
* BeamParameters
* KLMTriggerParameters
* KLMScintillatorDigitizationParameters
* SVDChargeSimulationCalibrations
* CDClayerTimeCut
* CDCFEElectronicss
* CDCCrossTalkLibrary
* CDCEDepToADCConversion
* Report of all the sizes for MC15ri
| MC15ri_b | uubar | ddbar | ssbar | ccbar | mixed | charged | taupair | mumu | ee | gg | lowmulti |
| ------------------------------------ | ------- | ------- | ------- | ------- | ------- | ------- | ------- | ------ | ------- | ------ | -------- |
| Size(GB) | 10847.4 | 2701.2 | 2682.0 | 11229.4 | 16324.4 | 16857.5 | 3536.1 | 2682.9 | 18209.3 | 2719.8 | 8652.8 |
| Skim Size(GB) | 59571.5 | 15085.8 | 13990.5 | 58612.6 | 30561.4 | 32090.1 | 6941.4 | 2843.1 | 14822.0 | 2552.6 | 0 |
| TDCPV Size(GB) | 4759.0 | 783.5 | 738.4 | 2486.5 | 120.9 | 166.0 | 281.9 | 0 | 0 | 0 | 0 |
| Tau Size(GB) | 3950.5 | 938.0 | 1632.3 | 2489.0 | 114.9 | 127.9 | 0 | 1071.7 | 5111.3 | 264.0 | 0 |
| SystematicsCombinedHadronic Size(GB) | 2934.2 | 1880.1 | 835.2 | 4950.4 | 5177.6 | 3490.8 | 291.5 | 0 | 0 | 0 | 0 |
| SystematicsCombinedLowMulti Size(GB) | 1141.1 | 486.7 | 270.8 | 839.7 | 47.2 | 48.1 | 2220.1 | 659.6 | 4449.5 | 0 | 0 |
| FEI Size(GB) | 7346.7 | 1766.4 | 1471.6 | 9551.6 | 16848.0 | 19410.7 | 0 | 0 | 0 | 0 | 0 |
| CharmAll Size(GB) | 2807.6 | 695.8 | 683.5 | 2936.7 | 621.0 | 644.6 | 218.9 | 0 | 0 | 0 | 0 |
| CharmHad Size(GB) | 10601.2 | 2444.2 | 2581.4 | 11893.7 | 2837.3 | 2984.5 | 376.6 | 0 | 0 | 0 | 0 |
| Quarkonium Size(GB) | 433.3 | 105.6 | 168.1 | 512.1 | 185.9 | 174.3 | 85.9 | 0 | 0 | 0 | 0 |
| Dark Size(GB) | 3427.3 | 877.8 | 873.3 | 1986.0 | 144.1 | 147.0 | 2271.7 | 1111.8 | 5261.2 | 2288.6 | 0 |
| EWPetal Size(GB) | 4159.1 | 1048.5 | 892.7 | 4783.4 | 2310.1 | 2374.2 | 425.3 | 0 | 0 | 0 | 0 |
| BToCharmlessHad Size(GB) | 7475.8 | 1642.3 | 1594.7 | 5337.5 | 359.8 | 531.1 | 570.9 | 0 | 0 | 0 | 0 |
| BToCharm Size(GB) | 17866.5 | 4024.4 | 3813.9 | 15989.7 | 2154.4 | 2522.0 | 769.5 | 0 | 0 | 0 | 0 |
| BToCharmNoTau Size(GB) | 2271.9 | 540.8 | 422.3 | 2937.8 | 899.7 | 768.3 | 0 | 0 | 0 | 0 | 0 |
## Validation (Patrick Ecker)
* Work on the tutorial, presented some slides
* Validation framework is on GitLab; installed via b2luigi
* Monitored quantities are inherited from ValidationModeBaseClass
* Do not use the common variablesToNtuple but the custom variables_to_ntuple
* b2luigi pickles the basf2 path, which needs to be available before sending the job to the grid
* How to create monitoring objects
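Since b2luigi pickles the path before submission (as noted above), an unpicklable object attached to it fails only at submit time; a quick local round-trip catches this earlier. The sketch below uses a plain dict as a hypothetical stand-in for a configured basf2 path:

```python
import pickle

# Hypothetical stand-in for a configured basf2 path: b2luigi serializes
# the real path object, so everything it references must be picklable.
path_config = {"modules": ["VariablesToNtuple"], "variables": ["p", "E"]}

# Local round-trip: fails here, before any grid submission, if
# anything in the configuration cannot be pickled.
blob = pickle.dumps(path_config)
restored = pickle.loads(blob)
assert restored == path_config
```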
# DP Meeting March 16 2023
## Attending
> Alessandro Boschetti
Andreas Gellrich
Carlos Lizama
Christian Wessel
David Dossett
Doris Kim
Frank Meier
Giacomo "the GOAT" De Pietro
Giovanni Gaudino
Mario Merola
Marko Staric
Matt Barrett (he/him)
Michel Hernandez Villanueva
Racha Cheaib
Samantha Taylor
Stefano Lacaprara
Tadeas Bilka (guest)
## Shifter report (Sourabh Chutia)
* shifter report is available [here](https://notes.desy.de/s/N7McNPVEq#)
* gitlab migration going on
* still some problem with buildbot
* reported some problems with validation plots: the tracking ones are due to statistical fluctuations w.r.t. the reference, some have normalization issues
* automatic issue creation for the next shifter not working yet
## DP general news
## Calibration
* s-proc4 almost done, only ECMS still running; issue in plotting due to lack of pdflatex at BNL
* do not want to further delay s-proc4 for a fancy plot.
* ping Radek
## Monte Carlo
* hhISR has been fixed for MCrd in DP repository
* do we need to reproduce these for MC15rd? Yes, for exp12.
* all is in place and production can start.
* The updates of the code for the hhISR and XXll productions are already implemented for future productions
## Skim
* FEI skim w/o ECL cut done for had and SL FEI
* analysts requested a bbbar FEI skim in chunks of 200/fb
* MCrd still running: some problem with jobs being too long
* Not clear if this happens only in MCrd and not in MCri or in Data
* Trevor is investigating with Hideki-san: possibly split combined skim a bit more
## review of /dataprod disk at KEKCC (Alessandro Boschetti)
* Review the directory structure of background samples and tidy up /dataprod
* Also provide a function to get the background file location https://gitlab.desy.de/belle2/software/basf2/-/issues/9552
* What's done:
* wrote a Confluence page listing /dataprod.
* wrote an executable which downloads grid data into a KEKCC directory.
* /dataprod/new has directories named according to skims.
* It's tricky to create a new directory structure for background files. Ideas from Giulia and Christian
* [Michel]:
* try using --new for gb2_ds_get
* consider also udst
* Also cDST
* possibly one bucket (now it is the case for bucket 16)
* Also RAW/hRAW
## AoB (David)
* progress moving calibration to gitlab
* https://gitlab.desy.de/david.dossett/automated_calibration/-/tree/1-rework-setup-instructions-and-scripts-to-use-ansible/
* updated to newer long-term support version (we were a bit behind)
* Then start updating to airflow2
* then move from jira to gitlab
* Tentative timescale: update host after s-proc4 (next week)
* few more weeks for airflow2
# DP Meeting March 2 2023
## Attending
>Ansu Johnson
Christian Wessel
David Dossett
Doris Kim
Frank Meier
Giacomo "the GOAT" De Pietro
Kirill Chilikin
Luigi Corona
Matt Barrett (he/him)
Racha Cheaib
Stefano Lacaprara
Swarna Prabha Maharana
Umberto Tamponi (he/him)
Vismaya V S
## Shifter's report (Vismaya)
* nightly development fails now.
* test script failures seen
* validation failures in the background package cannot be fixed until the next major release.
* Frank: next shifter Swarna.
* Moving repo from stash to gitlab, so monitoring will change. Don't be surprised since we will see many new error messages.
* Updating documentation can continue
* Jira will be put on read-only. Also, Jira for software will not be accessible during the migration week.
* users should be warned.
## Skimming
* Running now
* FEI re-skim almost done
* MCrdskim progressing
* Some MCrd signal
* Few MCri signal
* Alberto wrote MCrd ACAT 2022 proceedings.
* new people for skim optimization
* Carlos Lizama and Leonardo Salina from Cinvestav-IPN under Eduard Cruz Berelo
* Based on Varghese's proposal on skim groupings, will continue and develop the skim flagging mechanism.
* Will also work on automation of the Collection documentation.
* Skim collections
* What is the correct procedure to ask for a skim collection?
* Racha: Jira ticket, then go to confluence.
* Racha suggests differentiating MC from data collections. One may not need the entire continuum MC.
* Frank: We do not have the correct metadata parameters for this problem.
* How to distinguish which sample is in the MC?
* Currently only by production, so by querying the DSS via MCEventType
* Need a flag in metadata
## s-proc4 status
* no processing pending, only waiting for expert to sign-off
## GitLab migration (Umberto)
* is based on projects
* each project can contain just code, or just issues, or whatever
* On top of projects we have groups, each containing a number of projects
* a group can contain a group as well
* Proposal:
* project for central tools: libraries and tools
* need work to rewrite some tools
* project (w/o code): with old tickets from JIRA
* group (each with projects)
* processing
* validation
* calibrations
* Separation between tooling and operations: how?
* have an operations group in parallel to the specific tools
* migration will be mostly manual
* for testing we have test group
* https://gitlab.desy.de/belle2/test-group
* Matt:
* suggestion to use labelling as we don't have epics in GitLab
* group label or project label are possible
* Legacy tickets: sync JIRA ticket numbers with GitLab numbers; there are tools from DESY IT.
* can be very handy to find the tickets again
## MCrd automation (Luigi)
* status: currently subsystem experts produce the payloads
* Goal: collect all scripts and centralize
* Proposal:
* IOV can be run independent or RD
* a payload can be new (need to run a script) or existing (copy from an old GT and/or extend the IOV)
* UT: copy from DP_prompt or releaseGT since they are used in MC production
* unless there is a need to change IOV
* or need to override some payloads
* Where to do the work?
* GT manipulation is already done in AirFlow for calibration
* CDC need di-muon sample: this can be moved to prompt calibration
* TRG can be problematic: need to have a GT prepared by TRG accessing relevant info
* and we need a script to check that the payloads are (at least) coherent, so a sanity check
# DP/DC session B2GM
## Python3, API, streaming
* py3 transition ongoing. Client side ready and usable; server side should be completed by March.
* API work still ongoing, no direct progress on the single tools but several blockers have been identified and are being solved. The API will be the main focus as soon as the py3 transition is completed in May. The goal is to have it working in summer and use it for the reprocessing in proc16.
* Why is argparse a problem?
* Streaming: preliminary studies done. The changes needed to DIRAC and gbasf2 don't seem to be radical, but currently there is very little workforce available to work on this. Advertised as a service task!
## Skimming status
* MC15ri and release-06 data skims are basically over. Overall, many fewer productions and a smoother operation than MC14 and release-05 data.
* MC15rd ongoing with some important changes being made to speed-up the system
* no limit on number of runs per production
* a few outputs in one directory instead of many small files in many directories, thanks to the intra-run merging
* Fewer combined skims, some of them carrying even 30 skims (30 outputs per job, and it's working!)
* Several improvements over the previous issues: warnings are now efficiently suppressed, no more large memory usage by the charm skims. We had, however, issues with duplicated particle-list names and last-minute changes that required reproducing some skims.
* The total size of the skims is roughly equal to the size of the input mDST. Optimization by grouping overlapping skims is badly needed!
## Production system
* No major grid problems since last B2GM
* Volume-dependent merge: the file merge will now take into account the actual size of the files (no more assuming all files are the same size!). Work in progress to deploy it in production (PRs to be opened around the end of February). The run-boundary-less scheme is currently available only for productions with input files (so not for MC production...). Need to keep doing the fake skim step?
* Code for the automatic staging is being prepared. Need further development on the production system
* Merge campaigns will still be needed for run-dependent MC productions.
## MC summary
* summary of all the improvements introduced in the MC generation
* Overall, MC15 has been much, much smoother than MC14. Still, a large delay at the beginning due to the collection of payloads, bug fixing, and waiting for feedback.
* Signal MCrd very smooth.
* We managed to distribute the workload much better
* Alberto's proposal to have one job submit multiple runs; Stefano's objection: what is important is to have fewer final files, and to achieve that it is better to focus on the merge step, which is already set up, rather than a huge refactoring of the submission procedure
## (Re)Calibration
- bucket size
- min size we need to make calibration work: currently 9 fb-1
- if data taking is bigger: prescaled
- generally 14 days to collect that data: accidental but good
- due to the change to the HLT trigger from Chris, 9 fb-1 is not enough anymore
- bhabha gets heavily prescaled
- this doesn’t affect the modus operandi of prompt
- affects only bhabha, right?
- **recalib on grid**
- “current setup doesn’t scale with lumi”
- no new agreement; NAF can't keep up
- overview:
- airflow resolves the dependencies between different CAF jobs (CAF = different calibrations of subdetectors)
- CAF resolves the dependencies between collector and algorithm jobs
- 99% of jobs are collectors; algorithms are more memory-heavy
- what to do
- Development of the CAF gbasf2 backend
- handle stuck jobs
- new validation when there is a new release
- constraints
- outputs < 5 GB
- output = root
- memory consumption < 4 GB (TOP memory leak: 8 GB required)
- no interdependence between collector and algorithm jobs
- requirement on gbasf2: stable Python 3 API for submission via gbasf2
- requirement on computing: dedicated queue for prompt and recalibration with high priority; squid proxy so that database requests are not submitted every time
- 20k jobs per prompt calibration (TBC)
- 300k jobs per recalibration (TBC)
- 14 TB/bucket of cDST files (TBC, and can likely be reduced by a bit)
Discussion points:
- Maybe 4 GB of memory is too much; 2 would be better. Software probably has the answers; memory consumption will mainly come from reconstruction (check!).
- old cDST could go on tape, keep on disk only the new ones (1 year?)
- SQUID proxy setup works and is a good candidate to resolve the database "issue". **Technical, solvable problem(?)**
- High priority queue needs discussion on how to implement it in DIRAC. **Technical, solvable problem(?)**
- Are multicore jobs a possibility for improved performance? Development from basf2 software side should directly propagate to calibration.
- BNL confirmed (once more) they are the prompt calibration center. We will just access them via DIRAC (may need to setup a dedicated high priority queue?)
- We should check if the collector execution time is dominated by CPU or I/O. Strong suspicion it's CPU, from analysis-job experience.
- Check with DESY (Andreas) on what kind of resources are pledged from the written document because this caused confusion. Current understanding from calibration is that the pledged resources are GRID resources and not NAF resources.
# DP Meeting Feb 9 2023
## Attending
> Alberto Martini
Andreas Gellrich
Ansu Johnson
Doris Kim
Frank Meier
Hiroaki Ono
Jake Bennett
Marko Staric
Michel Hernandez Villanueva
Moritz Bauer
Patrick Ecker
Racha Cheaib
Sourav
Stefano Lacaprara
Swagato Banerjee
Trevor Shillington
Tsovinar Karapetyan
Umberto Tamponi
## SW shifter (Tsovinar Karapetyan)
* some days w/o buildbot, restarted
* FM: is Philip Grace still listed somewhere? Needs to be updated
## Signal Event DB (Swagato)
* mcEventType: a 10-digit code to represent a signal, defined in a technical note
* Existing dec files are stored on a Confluence page, with links to the dec files.
* difficult to search; updates are done by hand by DP liaisons, but not all the time
* Create a DB to hold this information
* first version available at https://mirabelle.belle2.org/Signal/
* including instant filtering by description and nickname
* based on PostgreSQL on a VM at DESY
* possibly have a dedicated DNS alias
* A few suggestions received:
* should try to use the same kind of DB as all other services (MariaDB?)
## EventType proposal (Stefano)
Total of 650 signal productions in MC15 (~575 in MC14ri).
Additional requests added to the default productions are ~180 -> towards automatic job submission
Therefore, event type definition has to be addressed.
* very important but complicated/not user-friendly. Not easy to memorise, or to identify the corresponding channel.
* 10 digit numbers not strictly followed. Only 1 is free to use as extra flag.
* Codes are created by hand (following the instructions on Confluence) without checks.
* Hard to check if the mcEventType is already used and where the corresponding .dec file is.
Proposal on how to improve this:
* Every dec file has an EventType, Descriptor, and NickName, and we want the Descriptor and/or NickName to be unique (Frank already checked that this is currently not the case).
* Create the EventType automatically from the Descriptor and/or NickName. The latter need not be meaningful.
* Check if the event type is unique (relying also on Swagato's tool).
What do we need?
* tools to create the EventType number and check it properly.
The dec files are in the release, but there are many duplications in the MC repository.
Targeting the next MC campaign:
* if .dec file is in release -> all good.
* if .dec file is not in release:
* a merged PR to the main branch is **mandatory**.
* Need modification of signal production tool (not a showstopper).
* MC production tools take
Q&A:
Giacomo: Software can accept .dec files at every patch release; the proposals are reasonable.
S.L.: we have to test the releases to be used on the grid -> maybe we can avoid considering that situation (?).
Swagato: we do not want a unique something (did not hear that properly..)
I basically lost the whole thing..
suggestion is to do "instant server-side filtering" on "Documentation" or "EventType" fields
Need to fill the .dec file with a description that allows users to search for it meaningfully; nickname and descriptor might not be enough.
## Skims (Trevor)
* few skims still running for B2charm (not clear why)
* EWP log file (solution understood, needs to be implemented)
* MC15rd: syst almost done
* ~30 skims in each production, seems to work ok.
* FEI submitted separately for Knunubar analysis
* Using the new intra-run merge: the result is one directory with 100 MB-1 GB files, instead of hundreds of directories with many more files
## Validation (Patrick)
* Presentation from Tracking F2F meeting.
* Define a task = the quantity which should be monitored.
* An example workflow is there (running and uploading to a local copy of MiraBelle)
* Machine to run on:
* Proposal: Some machine at DESY (tbd)
* Transfer grid outputs to DESY with a short lifetime (~1 month) to not block storage space
## AOB
# DP Meeting Jan 26th 2023
## Attending:
> Stefano Lacaprara
Alberto Martini
Andreas Gellrich
Doris Kim
Frank Meier
Giacomo "the GOAT" De Pietro
Giovanni Gaudino
Giulio Dujany
Luka Santelj
Marko Staric
Markus Prim
Matt Barrett (he/him)
Patrick Ecker
Priyanka Cheema
Racha Cheaib
Swagato Banerjee
Tommy Lam
Umberto Tamponi (he/him)
Wei Shan
## SW shift (Doris Kim)
* Validation:
* Jump of shifter plots with errors from 20 Jan coming from background, ECL, and reconstruction plots.
* Another jump on Jan 24, TOP and SVD have recent scripts with errors as well.
* Frank: E/BKLM hits merging causes issues here. One combined EKLM issue for background. This one will stay for a bit until new background file comes.
* Marko: For TOP, an issue in tracking due to a bug in the track/MC relations. Could be related to the mDST size drop.
* Giacomo: Can add future test to check these relations in the future.
* Builds:
* Short review on how to address the many Intel warnings in build. Already discussed heavily at the last SW developers meeting.
* Test Result Failures for generators and HLT.
* Frank) these are long-standing issues.
* for generators, permission errors (PR coming to suppress).
* for HLT, ideas are being created.
* Dependency problems for display as well.
* B2 Questions: AlmaLinux 9.x for immediate future? (discussed previously in the SWD meeting)
* Monitoring:
* Computing time has increased but somewhat stable.
* MDST total size and memory usage decreased.
* Marko) Could be due to missing relations in track/MC.
* Compilation issues: Huge intel warnings jump on Jan 14 (new version)
* Frank) The newly occurring Intel warnings should be fixed.
* Memory Leaks: INIT memory leaks happen from time to time but not all the time.
* Changes in tracking could cause low/high multiplicity budget plot increases?
* JIRA Issues with no assignees
* Several empty assignee slots were filled.
* Checked components information of these issues and they looked okay.
* After prompts, several issues were either closed or sent to backlog (usually person power issue).
* For next shifter:
* test failures for generators/hlt
* Next shifter should be aware of these, but experts will handle them.
* replace assignees of JIRA issues when original ones have left the collaboration.
* The slots can be left empty when person power search is not successful.
* find where more validation plots became red and inform librarians
* Except background and top packages. Marko knows the causes of errors for these packages, as mentioned previously in the minutes.
## DP status (Stefano)
* plan for B2GM and BPAC
* trying to setup a task force for scheduler (Airflow/b2luigi/etc)
* Looking for a Skim developer to help Trevor develop the new Skim strategy
* MCrd "post-mortem" (Alberto, similar talk for BPAC)
* validation status (by Patrick, B2GM and BPAC)
* calibration after LS1 (by Markus, included in BPAC)
* DP priorities/status (by SL/UT for B2GM/BPAC)
* What about the future of MCri for physics? Should we move totally to MCrd? To be raised somewhere during the B2GM
* Found a candidate for replacing Alberto for MCrd
## Calibration (Markus)
* Try to make concrete proposal for remainder of LS1.
* Issue with personpower (even just with sproc4, difficult to fix issues, let alone develop)
* SL: Hopefully, the task force mentioned above will help with this
* stuck in alignment phase (problem of new release)
## Processing (Stefano)
* Recent collection has been replaced and should be used instead
## Monte Carlo (Alberto)
* run-dependent campaign is done; signal is still being produced
* for run independent, signal samples being produced (one 400 fb-1 sample requested and in progress)
* Boyang is working on allowing dec files not in release to be processed
* Debjit Gosh is helping manage the collections
## Skim (Racha)
* b->charm finishing up
* MCrd skim should be in process by Trevor
* New Skim strategy under development
* Collection skims have not been requested much so far: need to remind people at the next PGM
## Validation (Patrick)
* For analysis, productions are running
* AI: add Patrick to some high priority group
* Starting to focus on other parts of MC quality (working example expected for B2GM)
* Discussions with tracking on MiraBelle usage in progress
# DP Meeting Jan 12th 2023
> Alberto Martini
Andreas Gellrich
Christian Wessel
Doris Kim
Francesco Tenchini
Frank Meier
Hiroaki Ono
Jake Bennett
Mario Merola
Marko Staric
Merle Graf-Schreiber
Michel Villanueva
Patrick Ecker
Priyanka Cheema
Racha Cheaib
Stefano Lacaprara
Swagato Banerjee
Tadeas Bilka
Trevor Shillington
Umberto Tamponi
## SW shifter (Christian Wessel)
* small and steady increase of memory usage since release 07
* review of validations: many plots have been broken for a long time; maybe we should stop producing them if nobody is looking at them.
* missing references in the expert plots (PR 1591, 1597, 1602). It's time to rethink the purpose of these plots.
* started PRs to increase the efficiency of the validation packages.
* Old unassigned issues in backlog
* the issue will be discussed again during SWD.
* Modified ARICH to follow modern C++ coding conventions.
* the original aim: Remove unnecessary MC reconstruction from paths when running HLT.
* (Frank) Intel compiler warnings have been ignored so far
* KLM tools fix in progress
## DP status (Stefano)
* Not much activity
* MCrd almost completed; only exp 8 is missing
* MCri off-resonance completed
* Merging of hlt_hadron events ongoing
## Calibration
* Progress on running calibration on the grid?
## Processing
* merging: proc13 done and tested all ok.
* prompt in progress, well advanced. Eventually will run the test
* register to DSS (and remove the unmerged from DSS to avoid double counting)
* Need to be careful with DSS registration, as these are technically skims (and maybe we want to change that)
* Next will be MCrd, checking the reduction factor
* 4x lumi but split samples (uubar/ssbar)
## MC (Alberto Martini)
* exp8 MCrd off-resonance: BGO is very limited; will use any BGO for exp8 4S and 4S_offres, so it will not be "run dependent" for BGO, nor will it use only 4S_offres BGO.
* Meeting tomorrow at 14 CET/22 JST to discuss how to do the signal w/ misaligned geometry
* Not a trivial task on grid
* Generation of samples with different decay tables: no follow-up.
## Skim (Trevor Shillington)
* mostly done. B2Charm stopped due to production.
* to be resubmitted with fix
* For the next MCrd, syst skims will be submitted first, then the rest
* progress on merging skims and attaching flags: Varghese had a tool for that, but no time to test it
## Validation (Patrick)
* will use pre7d for validation; PR ready to be merged.
* can start production of validation modes
* Need to enable pre7d on grid to allow running
* need on prod server: what about cert server?
* Can use latest light release, which is identical to main branch
* PID and tracking to be added in QA for processing; groups contacted, no feedback from PID yet
* need some work to get the steering scripts
* Tracking is ok.
* Is a full exercise doable for the B2GM? Should be.
## s-proc4 status (Umberto)
* stuck in the alignment job
## B2GM agenda
* A short parallel session (~2 hours)
* will likely have round table plus some longer term project report
* e.g. skim merge, validation and QA status