# Benchmark $B^+\rightarrow K^+J/\psi \mu^+\mu^-$ analysis
## Work status
### Datasets
* [x] THEORETICAL MOTIVATION
* [ ] CHOICE OF DATASET
* [ ] DATA SAMPLES, STRIPPED
* [ ] 2015 ?
* [x] 2016
* [x] 2017
* [x] 2018
* [ ] SIGNAL SIMULATION SAMPLES B->K CHIC MM
* [ ] 2015 ?
* [x] 2016
* [x] 2017
* [x] 2018
* [x] Truth-matching
* [ ] PEAKING BACKGROUND SIMULATION SAMPLES
* [ ] 2015 ?
* [x] 2016
* [x] 2017
* [x] 2018
* [x] Truth-matching
* [x] Stripping line
### Selection
* [x] TRIGGER CUTS
* [x] OFFLINE SELECTION
* [ ] COMBINATORIAL BACKGROUND BOOSTED DECISION TREE
* [x] Variables list
* [x] Performance
* [x] Train-test
* [x] Correlations
* [ ] Hyperparameters choice
* [x] Manual optimization
* [ ] Proper systematic optimisation
* [ ] Epochs plot
* [ ] MISIDENTIFIED BACKGROUND STUDY
* [x] D0->Kpi MC gen-level study, kink study
* [ ] ...
* [ ] MISIDENTIFIED BACKGROUND BOOSTED DECISION TREE
* [x] Variables list
* [x] Performance
* [x] Train-test
* [x] Correlations
* [ ] Hyperparameters choice
* [x] Manual optimisation
* [ ] Proper systematic optimisation
* [ ] Epochs plot
* [ ] k folding
* [ ] OPTIMISATION OF THE MULTIVARIATE SELECTION
* [ ] Method using FoM and inverted cuts -> validated, final ?
* [x] Combinatorial fit of the upper sideband + extrapolation
* [x] MisID fit including decay in flight and punch-through contributions
* [x] Inverted cut transfer factor extraction (redo for each cut pair)
* [x] Final 2D scan of the figure of merit and result on the optimal cut
* [ ] Other way?
* [ ] PID CORRECTIONS (!!)
* [ ] Correcting PID distributions in (P,PT, ntrack) bins using PIDGen package
* [ ] SIMULATION CORRECTIONS
* [ ] EXPECTED BACKGROUND CONTRIBUTION USING DATA-DRIVEN METHOD
* [x] D0->pipi results and choice not to keep it (Xiafei's studies on old setup)
* [ ] Ks->pipi fake rate maps (ongoing)
* [x] prompt Ks selection choice
* [x] sWeights recomputation using the sPlot method
* [x] Fake rate maps per-year computation
* [x] Closure tests on data
* [x] Total background evaluation
* [ ] Real muons pollution in signal selection using normalisation channel (ongoing)
* [ ] EFFICIENCIES
* [x] Signal efficiency
* [ ] Control data muons efficiency
* [x] @optimal cuts
* [ ] @cuts pair to ensure flat efficiency (--> what to do with this information)
* [ ] Resonances vetoing
* [x] Chic yield evaluation
* [ ] Chic vetoing?
* [ ] Include $\rho/\omega \to \mu \mu$ resonances in the analysis? Does it make sense for dimuon tagging?
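The multivariate-selection optimisation outlined above (a 2D scan of a figure of merit over pairs of BDT cuts) can be sketched as follows. The yield models and the grid ranges here are toy stand-ins, not the analysis inputs, which come from the sideband extrapolation and the misID transfer factors:

```python
import numpy as np

def fom(s, b):
    """Significance-style figure of merit S / sqrt(S + B)."""
    return s / np.sqrt(s + b)

# Scan a grid of (BDT1, BDT2) cut-value pairs.
cuts1 = np.linspace(0.0, 0.9, 19)
cuts2 = np.linspace(0.0, 0.9, 19)

# Toy expected yields after the cuts: signal efficiency falls slowly,
# background is rejected much faster (hypothetical stand-ins for the
# data-driven yield estimates).
S = np.outer(100.0 * (1.0 - 0.5 * cuts1), 1.0 - 0.5 * cuts2)
B = np.outer(500.0 * (1.0 - cuts1) ** 3, (1.0 - cuts2) ** 3)

grid = fom(S, B)
i, j = np.unravel_index(np.argmax(grid), grid.shape)
best_cuts = (cuts1[i], cuts2[j])  # optimal (BDT1, BDT2) cut pair on this toy
```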
### Mass fits
* [ ] $B\to KJ/\psi \mu\mu$ DATA BLINDED FIT
* [ ] MisID bkg "shape" from data-driven method
* [ ] NORMALISATION CHANNEL $B\to KJ/\psi \pi\pi$ FIT
* [ ] Corrections to the MC (Parzival)
* [ ] Efficiency computation (Parzival)
### Branching ratio
* [ ] Result
### Systematics
* [ ] Fit model, pull distributions (toys)
* [ ] Efficiencies
* [ ] BDT
* [ ] PID
## Repositories:
* Sonia: https://gitlab.cern.ch/sbouchib/multilepton
* Matthieu (TP4b): https://c4science.ch/diffusion/12329/
* Elena: https://gitlab.cern.ch/egraveri/multilepton-eg
## Preselections
### Simulation models and particle codes
* [**Evttype `12145067`**](https://gitlab.cern.ch/lhcb-datapkg/Gen/DecFiles/-/blob/v30r76/dkfiles/Bu_KOmegaJpsi,mm=LSFLAT,DecProdCut.dec): $B \xrightarrow[]{\text{PHSP}}K^+J/\psi( \xrightarrow[]{\text{VLL}} \mu^+ \mu^-)\omega( \xrightarrow[]{\text{VLL}} \mu^+ \mu^-)$ with flat $\omega$ mass profile.
* **Code**:`B`$\rightarrow$`Kaon Jpsi`$(\rightarrow$`mu1 mu2`$)$`omega`$(\rightarrow$`mu3 mu4`$)$
* [**Evttype `12145068`**](https://gitlab.cern.ch/lhcb-datapkg/Gen/DecFiles/-/blob/v30r76/dkfiles/Bu_KOmegaJpsi,mm=PHSP,DecProdCut.dec): same as before with usual $\omega$ mass profile.
* **Code**:`B`$\rightarrow$`Kaon Jpsi`$(\rightarrow$`mu1 mu2`$)$`omega`$(\rightarrow$`mu3 mu4`$)$
* [**Evttype `12145069`**](https://gitlab.cern.ch/lhcb-datapkg/Gen/DecFiles/-/blob/v30r76/dkfiles/Bu_chic1K,Jpsimumu=DecProdCut.dec): $B \xrightarrow[]{\text{PHSP}}K^+\chi_{c1}( \xrightarrow[]{\text{SVS}} J/\psi \mu^+ \mu^-)$.
* **Code**:`B`$\rightarrow$`Kaon chic`$(\rightarrow$`Jpsi`$(\rightarrow$`mu1 mu2`$)$ `mu3 mu4`$)$
### Explicit selections
* **Precuts** when getting back DST files from the grid:
```
TCut AllCuts = " Kaon_PIDK>4 && mu1_ProbNNmu>0.1 && mu2_ProbNNmu>0.1
&& abs(Jpsi_M-3097)<50 && mu1_TRACK_GhostProb<0.3
&& mu2_TRACK_GhostProb<0.3 && mu3_TRACK_GhostProb < 0.3
&& mu4_TRACK_GhostProb < 0.3 && Kaon_ProbNNk >0.2
&& Kaon_TRACK_GhostProb<0.3
&& (B_L0MuonDecision_TOS || B_L0DiMuonDecision_TOS)
&& mu3_isMuon==1 && mu4_isMuon==1";
```
* **Stripping lines**
* **Inclusive Jpsi**: FullDSTDiMuonJpsi2MuMuDetachedLine. Currently used line, coded as `inclusiveJpsi`. https://lhcbdoc.web.cern.ch/lhcbdoc/stripping/config/stripping34/dimuon/strippingfulldstdimuonjpsi2mumudetachedline.html
* **JpsiKmumu**: the line used previously for feasibility studies, coded as `stripped`. https://lhcbdoc.web.cern.ch/lhcbdoc/stripping/config/stripping34r0p2/dimuon/strippingb2mumumumub2jpsikmumuline.html
* **Preselection**: code = `preselected`
```
"Kaon_PIDK>-2
&& mu1_ProbNNmu>0.05 && mu2_ProbNNmu>0.05 && abs(Jpsi_M-3097)<50
&& mu1_TRACK_GhostProb<0.3 && mu2_TRACK_GhostProb<0.3
&& mu3_TRACK_GhostProb < 0.3 && mu4_TRACK_GhostProb < 0.3
&& Kaon_TRACK_GhostProb<0.3
&& (B_L0MuonDecision_TOS || B_L0DiMuonDecision_TOS)
&& B_DIRA_OWNPV > 0.9999"
# && B_LOKI_DTF_CHI2NDOF<5
```
* **Signal cuts**
$M(B) = 5279.70$ MeV
$\sigma(B) = 8.08554$ MeV
For 1$\sigma$ window: `B_LOKI_MASS_JpsiConstr>5271.61446&&B_LOKI_MASS_JpsiConstr<5287.78554`
For 2$\sigma$ window: `B_LOKI_MASS_JpsiConstr>5263.52892&&B_LOKI_MASS_JpsiConstr<5295.87108`
* **Upper sideband**: for background training sample
```
B_LOKI_MASS_JpsiConstr>5400
```
* **Truth-matched** : for signal MC
```
(B_BKGCAT==10 || B_BKGCAT==50)
```
### Samples for $B \rightarrow K J/\psi \mu^+\mu^-$
#### LPHE cluster
Latest versions
* **MC2018 signal ntuple**, Evttype `12145069`
* Location: `/panfs/bouchiba/TP4b_spring22/NTUPLES/BDT1update_ntuples/BDT1added_truthmatched_cleanChi2Vtx234_preselectedDimuon_inclusiveJpsi_gangaAll_2018_MC69_MM_220412.root`
* Cuts applied: precuts, inclusive Jpsi stripping, preselection, cleanChi2Vtx234. Branch `BDT1` added: BDTG1 response, including Kmu3mu4 vertex $\chi^2$.
* **Data 2018 ntuple**, latest version:
* Location: `/panfs/bouchiba/TP4b_spring22/NTUPLES/BDT1update_ntuples/BDT1added_preselectedDimuon_inclusiveJpsi_gangaAll_2018_Data_MD_220412.root`
* Cuts applied: precuts, inclusive Jpsi stripping, preselection. BDTG1 including Kmu3mu4 vertex $\chi^2$ applied in branch `BDT1`
### Collection
Branches:
```
{ "B_LOKI_MASS_JpsiConstr",
"B_M", "B_PX", "B_PY", "B_PZ", "B_PE", "B_PT",
//"B_BKGCAT",
"B_IPCHI2_OWNPV", "B_DIRA_OWNPV", "B_LOKI_DTF_CHI2NDOF", "B_ENDVERTEX_CHI2",
"Kaon_M", "Kaon_PX", "Kaon_PY", "Kaon_PZ", "Kaon_PE", "Kaon_PT", "Kaon_ETA",
"Jpsi_M", "Jpsi_PX", "Jpsi_PY", "Jpsi_PZ", "Jpsi_PE", "Jpsi_PT",
"Jpsi_ETA", "Jpsi_FDCHI2_OWNPV",
"mu1_M", "mu1_P", "mu1_PX", "mu1_PY", "mu1_PZ", "mu1_PE", "mu1_PT", "mu1_isMuon",
"mu2_M", "mu2_P", "mu2_PX", "mu2_PY", "mu2_PZ", "mu2_PE", "mu2_PT", "mu2_isMuon",
"phi_M", "phi_PX", "phi_PY", "phi_PZ", "phi_PE", "phi_PT",
"phi_ETA", "phi_ENDVERTEX_CHI2",
"mu3_M", "mu3_P", "mu3_PX", "mu3_PY", "mu3_PZ", "mu3_PE", "mu3_PT",
"mu3_ProbNNmu", "mu3_ProbNNk", "mu3_PIDmu", "mu3_PIDK", "mu3_TRACK_CHI2",
"mu3_hasCalo", "mu3_EcalPIDmu", "mu3_HcalPIDmu", "mu3_RichDLLmu", "mu3_RichDLLk",
"mu3_MuonMuLL", "mu3_MuonChi2Corr", "mu3_TRACK_MatchCHI2", "mu3_isMuon",
"mu4_M", "mu4_P", "mu4_PX", "mu4_PY", "mu4_PZ", "mu4_PE", "mu4_PT",
"mu4_ProbNNmu", "mu4_ProbNNk", "mu4_PIDmu", "mu4_PIDK", "mu4_TRACK_CHI2",
"mu4_hasCalo", "mu4_EcalPIDmu", "mu4_HcalPIDmu", "mu4_RichDLLmu", "mu4_RichDLLk",
"mu4_MuonMuLL", "mu4_MuonChi2Corr", "mu4_TRACK_MatchCHI2", "mu4_isMuon",
});
```
## Soft muon BDT
### Variables
```
'muX_P',
'muX_PT',
#'muX_TRACK_CHI2', -> technical issue being fixed
'log10(abs(muX_ProbNNmu))',
'log10(abs(muX_ProbNNk))',
'muX_PIDmu - muX_PIDK',
'muX_RichDLLmu',
'muX_RichDLLk',
'muX_TRACK_MatchCHI2',
'muX_CaloEcalE',
'muX_MuonMuLL'
```
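As an illustration, the variable list above can be assembled into a feature matrix for one muon leg. This is a sketch assuming the branches are available as NumPy arrays keyed by name; the `soft_muon_features` helper is hypothetical, not the analysis code:

```python
import numpy as np

def soft_muon_features(branches):
    """Build the soft-muon BDT input matrix for one muon leg (muX).

    `branches` is assumed to map branch names to equal-length NumPy arrays.
    """
    return np.column_stack([
        branches['muX_P'],
        branches['muX_PT'],
        # 'muX_TRACK_CHI2' omitted, as in the variable list above
        np.log10(np.abs(branches['muX_ProbNNmu'])),
        np.log10(np.abs(branches['muX_ProbNNk'])),
        branches['muX_PIDmu'] - branches['muX_PIDK'],
        branches['muX_RichDLLmu'],
        branches['muX_RichDLLk'],
        branches['muX_TRACK_MatchCHI2'],
        branches['muX_CaloEcalE'],
        branches['muX_MuonMuLL'],
    ])
```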
### Hyperparams/algo
```
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Split into development/evaluation sets, then train/test within development.
X_dev, X_eval, y_dev, y_eval = train_test_split(X, y,
                                                test_size=0.33, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X_dev, y_dev,
                                                    test_size=0.33, random_state=492)
print("X_test", X_test.shape)

# Shallow decision trees boosted with AdaBoost (SAMME algorithm).
dt = DecisionTreeClassifier(max_depth=3, min_samples_leaf=1)
bdt = AdaBoostClassifier(dt, algorithm='SAMME', n_estimators=450, learning_rate=0.5)
```
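The train-test overtraining check from the list above can be sketched end to end. The real training samples are the truth-matched signal MC and the upper-sideband data; `make_classification` is a toy stand-in for them here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for the signal-MC / upper-sideband training samples.
X, y = make_classification(n_samples=2000, n_features=11, random_state=42)
X_dev, X_eval, y_dev, y_eval = train_test_split(X, y, test_size=0.33,
                                                random_state=42)

# Same classifier configuration as in the hyperparameter block above.
dt = DecisionTreeClassifier(max_depth=3, min_samples_leaf=1)
bdt = AdaBoostClassifier(dt, n_estimators=450, learning_rate=0.5)
bdt.fit(X_dev, y_dev)

# Compare train vs. held-out AUC: a large gap would indicate overtraining.
auc_train = roc_auc_score(y_dev, bdt.decision_function(X_dev))
auc_eval = roc_auc_score(y_eval, bdt.decision_function(X_eval))
```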
### Training samples
## Ongoing studies
### $B^+\rightarrow K^+ J/\psi \pi^+ \pi^-$
Normalization channel. Evaluate preselection effect.