# Data Production and Computing Workshop - 43rd B2GM parallel week
###### tags: `Calibration`

## MC production
* reuse rate of background overlay samples for run-independent MC in low-multiplicity samples
    * might be due to a specific background overlay component; try with a private production and pin down which component is causing the spikes

Action items:
* Nothing to do here

## Calibration
* hraw production with prescale: stream it as soon as raw data are acquired; assume two-week buckets and an overall lumi safety factor of +30% on all applied prescales
    * the margin covers the case where an accident stops the data taking and we do not have enough lumi for running a calibration loop
    * hraw_calib size must be included in the resource estimate
* skimming: useful to have a deadline for implementing new HLT flags to be computed on data, and possibly used in the next reprocessing
* scalability, understand requirements for going to the grid:
    * numbers needed; job output size is the main bottleneck
    * memory consumption
    * action item: collect characteristics of calibration jobs, size of input files, job output sizes
    * you want DIRAC to run CAF itself somewhere
    * CAF is a basf2 thing, b2 caf command line tools
* bookkeeping for payloads to fix in prompt (Giulio's):
    * data_reprocessing_prompt_fix as auxiliary tag in which to store the fixed payloads that were merged wrongly in the prompt
    * first priority in the baseline tag chain for the next recalibration
    * establish a policy for managing the auxiliary fixing tag, but don't overthink it, don't build any workflow

Action items:
* Ping Stefano about first bullet point
* Provide deadline for HLT skim flag during LS1, potentially implement a spare flag. Inquire with calibration experts whether new flags are required and whether some skims can be cleaned up / removed / refined. Airflow scripts have to be updated if/when skim names are updated etc., so we need sufficient time to test everything. Start discussion on this topic soon.
* Get Dave to provide numbers.
* Start discussion on auxiliary tag.
  Experts should act in the 48-hour buffer window when prompt is finished.

## Dataset searcher
* without LPN addresses provided by Alberto, impossible to find the right LPNs
* udst mixed with mdst; the skim type code is not enough
* wrong MC campaign in the ticket

## Condition Database
* when to schedule a dry test with the new JWT for the DB?
    * sprocXX for rel07 data validation is the right occasion
* promote the "private" DP script currently used to modify/copy/remove IoVs into the official b2conditionsdb tools
* GT clean-up, starting from the larger GTs (> 100 payloads)
    * all backups and snapshots from 2022 will be kept
* GT intervention: needs a developer who has good knowledge of DB functionality; the calibration manager would be the right person (Michael?)
    * volunteered (accepted)

Action items:
* Michael will work on this. Make the db tools official.

## Airflow development priorities
* to make it a collaborative development effort, we must migrate everything to a basf2-like project
    * airflow folder already on stash, but the last commit dates back to 2019... where is the current actual code?
    * we need to be able to retrieve and push branches

Action items:
* Push Dave to make this a collaboration-type project.
* Encourage Dave to delegate.

## GRID or no-GRID
* Follow up on GRID/DIRAC/gbasf2 vs airflow for calibration.

Action items:
* Understand the workflows and start a discussion

# Special Processing (sProc4)
1. present the plan at the Physics Performance (for the official sign-off of the release to use), calibration and DP meetings,
2. make sure first with the calibration experts that the statistics for the needed release validation are enough,
3. the rel-07 scripts are ready to be run in Airflow,
4. data are available at BNL,
5. people tuned and reactive.

Action items:
* Which software should we use? --> software group. Get green light for patch.
* Is the last bucket sufficient statistics? --> data production / calibration / physics performance.
* Ping BNL to keep hraw from last bucket.
  If hraw is not available we can also use cDST at DESY. Which is easier/faster?
* Default: last bucket with prompt calibration constants
* E.g. tracking group wants 2 sprocs (with and without SVD time)
* Are special recalibrations required?
* Think about a dry-run test for the airflow/caf chain
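
A note on the hraw prescale item under Calibration: one possible reading of the "+30% on all applied prescales" rule is that each prescale is chosen so that a two-week bucket still keeps 30% more events than a calibration loop needs. The sketch below illustrates that reading only. The function name, the event-count inputs and the example numbers are hypothetical and were not discussed at the meeting.

```python
import math

# "+30%" safety margin quoted in the notes.
SAFETY_FACTOR = 1.30


def prescale_with_margin(expected_events: float,
                         required_events: float,
                         safety_factor: float = SAFETY_FACTOR) -> int:
    """Return the largest integer prescale that still keeps at least
    required_events * safety_factor events in one bucket.

    expected_events: events expected before prescaling in one bucket
                     (hypothetical input, e.g. for a two-week bucket).
    required_events: events one calibration loop needs (hypothetical input).
    """
    if expected_events <= 0 or required_events <= 0:
        raise ValueError("event counts must be positive")
    target = required_events * safety_factor
    # A prescale below 1 is meaningless: keep every event in that case.
    return max(1, math.floor(expected_events / target))


# Illustration with made-up numbers: 1M events expected, 100k needed.
example_prescale = prescale_with_margin(1_000_000, 100_000)
print(example_prescale)  # 7 (a prescale of 10 would leave no margin)
```

If a bucket is cut short, the same function can be re-evaluated with the reduced expected yield to see whether the stream would need a smaller prescale.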
