
Analysis strategies at LHCb
Vincenzo Vagnoni
INFN Bologna
CCR – INFN GRID Workshop
Palau, 13 May 2009
LHCb Computing Model

- Baseline DAQ settings
  - ~2 kHz output rate
  - 35 kB/event
    - probably larger at the beginning, until optimal thresholds for zero suppression are set (40–50 kB/event)
  - ~70 MB/s recording rate at Tier-0 (cross-check below)

[Diagram: RAW DATA at CERN, with RAW DATA distributed to the 6 Tier-1s: CNAF, PIC, RAL, IN2P3, GRIDKA, NIKHEF]

- Reconstruction output → rDST
  - Performed at the 7 sites (CERN + 6 Tier-1s)
2
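The recording rate quoted above follows directly from the trigger rate and the event size; the same arithmetic (mine, not on the slide) with the larger initial event size gives 80–100 MB/s:

```latex
2\,\mathrm{kHz} \times 35\,\mathrm{kB/event} = 70\,\mathrm{MB/s},
\qquad
2\,\mathrm{kHz} \times (40\text{--}50)\,\mathrm{kB/event} = 80\text{--}100\,\mathrm{MB/s}
```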
LHCb Computing Model (II)

- Stripping (pre-selection) output → DST
  - In order to reduce the data sample with specific physics pre-selection algorithms
  - Performed at the site where the rDSTs were produced
  - Output DST replicated to the 6 Tier-1s + CERN
- Analysis
  - Can be run indifferently at any Tier-1 + CERN
- MC Production
  - Performed at Tier-2s (main difference w.r.t. other experiments)

[Diagram: DST replicated in each Tier-1 (CERN, CNAF, PIC, RAL, IN2P3, GRIDKA, NIKHEF); O(100) TB per L = 2 fb-1 (10^7 seconds of data taking for LHCb); see the volume estimate below]
3
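For scale, a rough estimate (not quoted on the slide) of the RAW volume accumulated over a nominal year of data taking, using the 70 MB/s recording rate from the previous slide:

```latex
70\,\mathrm{MB/s} \times 10^{7}\,\mathrm{s} = 7\times 10^{8}\,\mathrm{MB} \approx 700\,\mathrm{TB\ of\ RAW}
```

against which the O(100) TB of DST replicated at each Tier-1 is roughly an order of magnitude smaller.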
Stripping Workflow

- First run stripping on rDST-reconstructed events
  - Produces Event Tag Collections for events passing the stripping pre-selection, as "pointers" to the corresponding RAW events (FETC); sketched in code below
- Re-reconstruct events and produce full DSTs
- Stripping re-run after re-reconstruction
  - includes stripping results in the DST (which pre-selection algorithm fired)
- Stripping outputs merged together
  - Tag collection created in the merging job (SETC)

[Diagram: RAW + rDST → DaVinci (stripping) → FETC; Brunel (reconstruction) → DST → DaVinci (stripping) → DST1; several DST1 files → Merger/Tagger → merged DST1 + SETC1]
4
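The FETC idea above can be pictured as a list of lightweight records pointing back into the RAW files. The sketch below is a hypothetical illustration in Python, not the actual Gaudi/DaVinci event tag collection machinery; all names (EventTag, run_stripping, the predicate dictionary) are invented for the example.

```python
# Hypothetical sketch of an "event tag collection": instead of copying
# events, record which events passed which stripping lines, together
# with a pointer back to the RAW file and event number.
from dataclasses import dataclass, field

@dataclass
class EventTag:
    raw_file: str                 # logical file name of the RAW file
    event_number: int             # position of the event in that file
    fired_lines: list = field(default_factory=list)  # pre-selections that fired

def run_stripping(events, stripping_lines):
    """events: iterable of (raw_file, event_number, event_data);
    stripping_lines: dict mapping line name -> predicate on event_data."""
    tags = []
    for raw_file, evt_no, data in events:
        fired = [name for name, passes in stripping_lines.items() if passes(data)]
        if fired:                              # keep only selected events
            tags.append(EventTag(raw_file, evt_no, fired))
    return tags                                # plays the role of the FETC
```

A later re-reconstruction job would then iterate over the tags and fetch only the referenced RAW events, rather than re-reading the whole sample.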
Data stripping and analysis (space token view)

- Stripping on rDST files
  - Input rDST files and associated RAW files
    - Space tokens: LHCb_RAW (T0D1) and LHCb_rDST (T0D1)
  - DST files and ETC produced during the process stored locally on T1D1
    - Space token: LHCb_M-DST ("M" stands for master copy)
  - DST and ETC files then distributed to all the other computing centres on T0D1
    - Space token: LHCb_DST
- Data analysis performed on LHCb_DST (if running on replicas) or on LHCb_M-DST (if running at the site owning the master copy); illustrated in the sketch below
5
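The last rule can be summarised in a few lines of code. This is a hypothetical helper for illustration only, not part of DIRAC or any LHCb tool; the token-to-storage-class table simply restates what the slide says.

```python
# Illustrative mapping of LHCb space tokens to storage classes, as
# described on the slide, plus the rule for picking the analysis input.
SPACE_TOKENS = {
    "LHCb_RAW":   "T0D1",   # input RAW (as quoted on the slide)
    "LHCb_rDST":  "T0D1",   # input rDST
    "LHCb_M-DST": "T1D1",   # master copy, at the site that ran the stripping
    "LHCb_DST":   "T0D1",   # replicas at the other centres
}

def analysis_space_token(job_site: str, master_site: str) -> str:
    """Analysis reads the master copy only at the site that owns it."""
    return "LHCb_M-DST" if job_site == master_site else "LHCb_DST"

print(analysis_space_token("CNAF", "CNAF"))  # -> LHCb_M-DST
print(analysis_space_token("RAL", "CNAF"))   # -> LHCb_DST
```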
Centralized and end-user analyses

- Physics Working Group analysis
  - Aim
    - provide general-purpose datasets to the whole Working Group
    - Output data: µDST + ETC
    - Factor 10 reduction in size (see the estimate below)
  - Advantages
    - Datasets registered in the BK, available to all users on the Grid
    - Saves computing time (no need for each user to run over the full DST dataset)
- User analysis
  - Algorithm development and tuning on limited datasets
  - Full statistics from Working Group datasets
    - Using real data as well as MC
    - On the Grid
  - Non-data processing (e.g. toy MC simulations) and CP fits (Root/RooFit jobs)
    - Needs to be clarified and quantified
6
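Combining the factor-10 reduction with the O(100) TB DST volume quoted on slide 3 gives a rough idea of the Working Group dataset size (an estimate, not a number from the talk):

```latex
\frac{\mathcal{O}(100)\,\mathrm{TB\ (DST)}}{10} \approx \mathcal{O}(10)\,\mathrm{TB\ (\mu DST + ETC)}
```

small enough that individual users no longer need to run over the full DST sample.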
Baseline event rates

- Current baseline → maximum 2 kHz HLT output rate
  - Until 2005, the HLT output was 200 Hz
  - In 2005, additional HLT streams were introduced for calibration, systematic-error studies and data mining: 2000 Hz (but core physics still at 200 Hz)
- Very Early Physics
  - Minimum/small bias HLT for inclusive studies (no b-physics)
  - HLT rate can be higher than 2 kHz
- Early physics
  - Start from the stripping rate of selected channels: a few tens of Hz
  - Open cuts in the HLT, select a working point in efficiency/retention: << 2 kHz?
  - Add other channels once the stripping and HLT rates are demonstrated
  - Add calibration (dimuons?) and inclusive b channels, up to a maximum of 2 kHz
  - Very close to the Computing TDR model
7
But… the HLT strategy is being reworked

- The HLT is now undergoing a third review process
  - For the first year or so, the rate might increase further, maybe to 5 kHz
- A further increase of the rate means more data to store
  - … constrained by the WLCG pledges, mainly by tape storage
- More data to reconstruct…
  - But already with the current baseline → 2 kHz @ 2.4 kSI2k per event
  - ATLAS: 200 Hz @ 15 kSI2k per event
    - LHCb = 1.6 times ATLAS! (arithmetic below)
- … and of course more data to understand
- However, full reconstruction in the HLT farm at CERN could be an option
  - Multi-core revolution! The online farm is moving from the original 2000 cores to O(10000) cores
  - A 5 kHz event rate may require "only" ~2000 cores if the reconstruction is optimised
  - Avoids unnecessary storage and offline reconstruction
8
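The "LHCb = 1.6 times ATLAS" figure is the ratio of the total reconstruction power implied by the numbers above:

```latex
\frac{2000\,\mathrm{Hz}\times 2.4\,\mathrm{kSI2k}}{200\,\mathrm{Hz}\times 15\,\mathrm{kSI2k}}
= \frac{4800}{3000} = 1.6
```

Taken at face value, the online option quoted above (5 kHz handled by ~2000 cores) would correspond to about 2000/5000 = 0.4 s of optimised reconstruction per event per core.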
Controlled tests of data analysis at Tier-1s

- Report (not by INFN people → unbiased) circulated a few days ago
9
[Slides 10–15: plots from the controlled analysis tests, broken down by site and storage back-end: CASTOR, GPFS/StoRM, dCache; one slide is dedicated to CNAF]
Detailed performance by site

[Plots: distribution of wall-clock time to process 100 consecutive events, and the same distribution including the opening of a new file; a measurement sketch follows below]
16
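A minimal sketch of how such per-block timings could be collected, assuming a generic event loop; this is not the instrumentation actually used for the report, and read_events stands in for whatever reader the job uses.

```python
import time

def time_blocks(files, read_events, block_size=100):
    """Yield (wall-clock seconds, includes_file_open) for every complete
    block of block_size events read from the given files."""
    t0, n, opened = time.perf_counter(), 0, False
    for fname in files:
        opened = True                       # the current block spans a file opening
        for _event in read_events(fname):   # read_events: user-supplied reader
            n += 1
            if n == block_size:
                yield time.perf_counter() - t0, opened
                t0, n, opened = time.perf_counter(), 0, False
```

Histogramming the timings with and without a file opening separately gives the two kinds of distributions shown on the slide.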
Detailed performance by site
17
Why did CNAF perform so well?

- CNAF was the only site which simultaneously had
  - High CPU efficiency (metric illustrated below)
  - Low failure rate
  - Low processing time per event
  - Low file opening time
  - All of these amongst the 7 LHCb Tier-1s (CERN, CNAF, GRIDKA, IN2P3, NIKHEF, PIC, RAL)
- And the main reason is StoRM/GPFS
  - Fast SRM response, SRM stability, fast and reliable access to files via GPFS
  - (for the moment) something to be proud of!
- Main pending issues from the current analysis
  - Poor performance of the SRM at GRIDKA (dCache)
  - Poor performance of the SE at IN2P3 (dCache)
  - Large file opening time at RAL (CASTOR)
18
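For reference, the CPU efficiency mentioned above is simply CPU time over wall-clock time, aggregated per site; the job records below are invented purely for illustration.

```python
# Illustration of the CPU-efficiency metric: total CPU time divided by
# total wall-clock time, summed over the jobs run at each site.
from collections import defaultdict

jobs = [  # invented example records: (site, cpu seconds, wall-clock seconds)
    ("SITE_A", 9500.0, 10000.0),
    ("SITE_B", 6000.0, 10000.0),
]

totals = defaultdict(lambda: [0.0, 0.0])
for site, cpu, wall in jobs:
    totals[site][0] += cpu
    totals[site][1] += wall

for site, (cpu, wall) in sorted(totals.items()):
    print(f"{site}: CPU efficiency = {cpu / wall:.1%}")
```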
Conclusions

- LHCb's main issue for running an analysis on the GRID?
  - DISK STORAGE!
  - Reliability, Stability, Performance, Scalability for both the SRM and the SE
19