Analysis strategies at LHCb
Vincenzo Vagnoni
INFN Bologna
CCR – INFN GRID Workshop
Palau, 13 May 2009
LHCb Computing Model
Baseline DAQ settings
~2 kHz output rate
35 kB/event (probably larger at the beginning, 40-50 kB/event, until optimal zero-suppression thresholds are set)
~70 MB/s recording rate at Tier-0
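As a quick cross-check of these baseline numbers (a back-of-envelope sketch, not part of the slides), the recording rate and the yearly RAW volume follow directly from the trigger rate, the event size and the 10^7 s of data taking quoted later in the talk:

```python
# Back-of-envelope check of the baseline DAQ numbers quoted above.
hlt_rate_hz = 2e3          # ~2 kHz HLT output rate
event_size_bytes = 35e3    # 35 kB/event (40-50 kB possible at start-up)
seconds_per_year = 1e7     # 10^7 s of data taking in a nominal LHCb year

recording_rate = hlt_rate_hz * event_size_bytes        # bytes/s
raw_volume_year = recording_rate * seconds_per_year    # bytes/year

print(f"recording rate ~ {recording_rate / 1e6:.0f} MB/s")      # ~70 MB/s
print(f"RAW volume     ~ {raw_volume_year / 1e12:.0f} TB/year")  # ~700 TB/year
```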
[Diagram: RAW DATA flow from CERN to the Tier-1s: CNAF, PIC, RAL, IN2P3, GRIDKA, NIKHEF]
Reconstruction, output rDST
Performed at the 7 sites (CERN + 6 Tier-1s)
LHCb Computing Model (II)
Stripping (pre-selection), output DST
In order to reduce the data sample with specific physics pre-selection algorithms
Performed at the site where the rDSTs were produced
Output DST replicated to the 6 Tier-1s + CERN
Analysis
Can be run indifferently at any Tier-1 + CERN
MC Production
Performed at Tier-2s (main difference w.r.t. other experiments)
DST replicated at each Tier-1: O(100) TB per L = 2 fb^-1 (10^7 seconds of data taking for LHCb); see the sketch below
[Diagram: DST replicas at CERN, PIC, IN2P3, CNAF, RAL, NIKHEF, GRIDKA]
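A rough sketch of what these volumes imply (illustrative arithmetic only; everything below is derived from the numbers quoted on the slides, not additional measurements): with O(100) TB of stripped DST per 2 fb^-1 replicated at CERN and the 6 Tier-1s, the total disk footprint is several hundred TB, and the average stripped-DST cost per triggered event is a few kB.

```python
# Illustrative arithmetic around the O(100) TB per 2 fb^-1 figure quoted above.
dst_per_copy_tb = 100          # O(100) TB of stripped DST per 2 fb^-1 (from the slide)
n_copies = 7                   # CERN + 6 Tier-1s, each holding a replica
events_per_year = 2e3 * 1e7    # 2 kHz HLT rate * 10^7 s of data taking

total_disk_tb = dst_per_copy_tb * n_copies
avg_bytes_per_event = dst_per_copy_tb * 1e12 / events_per_year

print(f"stripped-DST disk over all replicas ~ {total_disk_tb} TB")                 # ~700 TB
print(f"average stripped-DST volume per event ~ {avg_bytes_per_event / 1e3:.0f} kB")  # ~5 kB
```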
Stripping Workflow
First run stripping on rDST reconstructed events
Produces Event Tag Collections (FETC) for events passing the stripping pre-selection, as “pointers” to the corresponding RAW events
Re-reconstruct events and produce full DSTs
Stripping re-run after re-reconstruction
Includes the stripping results in the DST (which pre-selection algorithm fired)
Stripping outputs merged together
Tag collection created in the merging job (SETC)
[Workflow diagram: RAW + rDST → DaVinci (stripping) → FETC; Brunel (reconstruction) → DST; DaVinci (stripping) → DST1; DST1 files → Merger + Tagger → merged DST1 + SETC1]
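Purely as a schematic illustration of the bookkeeping described above (this is not DaVinci/Brunel code; the class, the line names and the helper functions are made up), the role of the FETC/SETC as “pointers” plus the per-event stripping decisions could be modelled like this:

```python
# Schematic model of the stripping workflow (illustrative only, not LHCb software).
from dataclasses import dataclass, field

@dataclass
class Event:
    raw_id: int                                     # pointer back to the RAW event
    decisions: dict = field(default_factory=dict)   # stripping line -> fired?

def run_stripping(events, lines):
    """First pass on rDST: record which pre-selection lines fired per event."""
    for ev in events:
        ev.decisions = {name: line(ev) for name, line in lines.items()}
    # FETC: tag collection pointing to the RAW events that passed any line.
    return [ev.raw_id for ev in events if any(ev.decisions.values())]

def merge(dst_files):
    """Merge DST1 outputs and build the SETC tag collection in the merging job."""
    merged = [ev for f in dst_files for ev in f]
    setc = [ev.raw_id for ev in merged]
    return merged, setc

# Toy usage: two dummy stripping lines over ten fake events.
lines = {"LineA": lambda ev: ev.raw_id % 3 == 0,
         "LineB": lambda ev: ev.raw_id % 5 == 0}
events = [Event(raw_id=i) for i in range(10)]
fetc = run_stripping(events, lines)
dst1 = [ev for ev in events if any(ev.decisions.values())]
merged_dst, setc = merge([dst1])
print("FETC:", fetc)   # RAW events selected for re-reconstruction
print("SETC:", setc)   # tag collection written by the merging job
```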
Data stripping and analysis (space token view)
Stripping on rDST files
Input rDST files and associated RAW files
Space tokens: LHCb_RAW (T0D1) and LHCb_rDST (T0D1)
DST files and ETC produced during the process stored locally on T1D1
Space token: LHCb_M-DST (“M” stands for master copy)
DST and ETC files then distributed to all the other computing centres on T0D1
Space token: LHCb_DST
Data analysis performed on LHCb_DST (if running on replicas) or on LHCb_M-DST (if running at the site owning the master copy)
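The space-token layout above can be summarised in a small lookup table (a sketch: the token names and storage classes are taken from the slide, while the selection helper is hypothetical):

```python
# Space-token layout for stripping and analysis data (token names from the slide).
SPACE_TOKENS = {
    "LHCb_RAW":   {"storage_class": "T0D1", "content": "RAW input to stripping"},
    "LHCb_rDST":  {"storage_class": "T0D1", "content": "rDST input to stripping"},
    "LHCb_M-DST": {"storage_class": "T1D1", "content": "master copy of DST + ETC"},
    "LHCb_DST":   {"storage_class": "T0D1", "content": "replicated DST + ETC"},
}

def analysis_token(running_at_master_site: bool) -> str:
    """Hypothetical helper: pick the space token an analysis job reads from."""
    return "LHCb_M-DST" if running_at_master_site else "LHCb_DST"

print(analysis_token(running_at_master_site=True))   # LHCb_M-DST
print(analysis_token(running_at_master_site=False))  # LHCb_DST
```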
Centralized and end-user analyses
Physics Working Group analysis
Aim: provide general-purpose datasets to the whole Working Group
Output data: µDST + ETC (factor 10 reduction in size)
Advantages
Datasets registered in the Bookkeeping (BK), available to all users on the Grid
Saves computing time (no need for each user to run over the full DST dataset)
User analysis
Algorithm development and tuning on limited datasets
Full statistics from the Working Group datasets
Using real data as well as MC
On the Grid
Non-data processing (e.g. toy MC simulations) and CP fits (Root/RooFit jobs); see the sketch below
Needs to be clarified and quantified
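For the toy MC and Root/RooFit item above, a minimal example of the kind of non-data-processing job meant here could look like the following (assuming a PyROOT/RooFit installation; the model and numbers are arbitrary illustration, not an LHCb analysis):

```python
# Minimal toy-MC study with RooFit via PyROOT (illustrative model and numbers).
import ROOT

# Observable and a simple Gaussian model.
x = ROOT.RooRealVar("x", "x", -5.0, 5.0)
mean = ROOT.RooRealVar("mean", "mean", 0.0, -1.0, 1.0)
sigma = ROOT.RooRealVar("sigma", "sigma", 1.0, 0.1, 3.0)
model = ROOT.RooGaussian("model", "gaussian model", x, mean, sigma)

# Generate a toy dataset and fit the model back to it.
toy_data = model.generate(ROOT.RooArgSet(x), 10000)
fit_result = model.fitTo(toy_data, ROOT.RooFit.Save(), ROOT.RooFit.PrintLevel(-1))
fit_result.Print()

print(f"fitted mean  = {mean.getVal():.4f} +/- {mean.getError():.4f}")
print(f"fitted sigma = {sigma.getVal():.4f} +/- {sigma.getError():.4f}")
```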
Baseline event rates
Current baseline: maximum 2 kHz HLT output rate
Until 2005, the HLT output rate was 200 Hz
In 2005, additional HLT streams were introduced for calibration, systematic-error studies and data mining: 2000 Hz (but core physics still at 200 Hz)
Very early physics
Minimum/small-bias HLT for inclusive studies (no b-physics)
HLT rate can be higher than 2 kHz
Early physics
Start from the stripping rate of selected channels: a few tens of Hz
Open cuts in the HLT, select a working point in efficiency/retention: << 2 kHz?
Add other channels once stripping and HLT rates are demonstrated
Add calibration (dimuons?) and inclusive b channels up to a maximum of 2 kHz
Very close to the Computing TDR model
But… the HLT strategy is being reworked
The HLT is now undergoing a third review process
For the first year or so, the rate might increase further, maybe to 5 kHz
A further increase of the rate means more data to store
… constrained by the WLCG pledges, but mainly by tape storage
More data to reconstruct...
Already with the current baseline, 2 kHz @ 2.4 kSI2k per event vs. ATLAS at 200 Hz @ 15 kSI2k per event: LHCb = 1.6 times ATLAS! (see the check below)
… and of course more data to understand
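The "LHCb = 1.6 times ATLAS" statement is just the ratio of trigger rate times per-event CPU cost; a quick check, using only the numbers quoted on the slide:

```python
# Sustained CPU power needed for prompt reconstruction: rate * cost per event.
lhcb_ksi2k = 2000 * 2.4    # 2 kHz  @ 2.4 kSI2k per event
atlas_ksi2k = 200 * 15.0   # 200 Hz @ 15  kSI2k per event

print(f"LHCb : {lhcb_ksi2k:.0f} kSI2k")          # 4800 kSI2k
print(f"ATLAS: {atlas_ksi2k:.0f} kSI2k")         # 3000 kSI2k
print(f"ratio: {lhcb_ksi2k / atlas_ksi2k:.1f}")  # 1.6
```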
However, full reconstruction in the HLT farm at CERN could be an option
Multi-core revolution! The online farm is moving from the original 2000 cores to O(10000) cores
A 5 kHz event rate may require “only” ~2000 cores if the reconstruction is optimised (see the sketch below)
Avoids unnecessary storage and offline reconstruction
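Inverting the claim above (a sketch using only the slide's numbers): sustaining 5 kHz with ~2000 cores means the optimised reconstruction would have to take about 0.4 s per event per core, leaving the rest of the O(10000)-core farm for the trigger itself.

```python
# What "~2000 cores for 5 kHz" implies for the per-event reconstruction time.
rate_hz = 5000.0    # possible first-year HLT output rate
cores = 2000.0      # cores devoted to reconstruction in the online farm

time_per_event_s = cores / rate_hz
print(f"required reconstruction time per event: {time_per_event_s:.2f} s/core")  # 0.40 s
```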
Controlled tests of data analysis at Tier-1s
Report (not by INFN people, hence unbiased) circulated a few days ago
[Figure: per-site results labelled by SE technology: CASTOR, GPFS/StoRM, dCache, dCache, dCache, dCache, CASTOR]
[Figure: CNAF]
Detailed performance by site
[Plots: distribution of wall-clock time to process 100 consecutive events; distribution of wall-clock time to process 100 consecutive events including the opening of a new file]
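To make the plotted quantity concrete, here is an illustrative sketch (not the code used for the report) of how the wall-clock time per 100 consecutive events, with and without a new file opening, could be extracted from hypothetical per-event timestamps in a job log:

```python
# Illustrative sketch: wall-clock time per block of 100 consecutive events,
# flagging blocks that include the opening of a new input file.
from datetime import datetime, timedelta

def block_times(event_timestamps, file_open_indices, block=100):
    """Return (seconds, includes_file_open) for each block of `block` events."""
    results = []
    for start in range(0, len(event_timestamps) - block, block):
        dt = (event_timestamps[start + block] - event_timestamps[start]).total_seconds()
        opens_file = any(start < i <= start + block for i in file_open_indices)
        results.append((dt, opens_file))
    return results

# Toy usage: 1000 events, ~0.05 s each, with a new file opened at event 400.
t0 = datetime(2009, 5, 1)
stamps = [t0 + timedelta(seconds=0.05 * i + (12 if i >= 400 else 0)) for i in range(1001)]
for seconds, with_open in block_times(stamps, file_open_indices=[400]):
    label = "incl. file open" if with_open else "no file open"
    print(f"{seconds:6.2f} s  ({label})")
```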
Detailed performance by site
Why did CNAF have such excellent performance?
CNAF was the only site which simultaneously had
High CPU efficiency
Low failure rate
Low processing time per event
Low file opening time
All of these amongst the 7 LHCb Tier-1s (CERN, CNAF, GRIDKA, IN2P3, NIKHEF, PIC, RAL)
And the main reason is StoRM/GPFS
Fast SRM response, SRM stability, fast and reliable access to files via GPFS
(for the moment) something to be proud of!
Main pending issues from the current analysis
Poor performance of the SRM at GRIDKA (dCache)
Poor performance of the SE at IN2P3 (dCache)
Large file opening time at RAL (CASTOR)
Conclusions
LHCb's main issue for running an analysis on the Grid?
DISK STORAGE!
Reliability, Stability, Performance, Scalability for both SRM and SE