CHEP 04 conference report
Judith Katzy, DESY

Conference Contributions

• 522 participants
• 25 plenary talks
• 228 parallel talks organized in 7 parallel tracks:
  • Online Computing
  • Event Processing
  • Core Software
  • Distributed Computing Services
  • Distributed Computing Systems and Experiences
  • Computer Fabrics
  • Grid Security
• 3 poster sessions
• Industrial program
• 1 birthday party
• All talks and papers available at http://www.cern.ch/chep04

Outline

Online computing
Event processing & core software
  • Simulation
  • Software Frameworks
  • Developments in languages and tools
Distributed Computing in the Grid century
  • Wide area networking
  • Computer Fabrics
  • Grid Middleware
  • Experiences
Summary

Online Computing

• Online computing track: very direct reports of findings by BES, D0, PHENIX, H1, ZEUS, ALICE, ATLAS, LHCb, ...
• Major shift of technology: pervasive adoption of commodity computing and networking in ALL areas of trigger and DAQ
• Growing importance of control software and simulation
Pierre VANDE VYVRE – CERN/PH

Online Operating Systems

• Online computing used to be a zoo of operating systems
• Special OSs now tend to disappear from the landscape
• The present generation of experiments is still using special kernels:
  • But even there, Linux is becoming a credible competitor
• Transition away from Windows
  • PHENIX EB: Windows -> Linux for performance
• New developments (even extremely demanding ones such as BTeV or the LHCb L1) plan to use a standard OS (Linux)
Pierre VANDE VYVRE – CERN/PH

Online @ HERA2

H1 HLT
- HERA luminosity upgrade
- Transition from VME SBCs to commodity hardware
- CORBA as transport layer for control and data
A. Campbell

ZEUS TRG L2
- HERA luminosity upgrade
- ZEUS detector upgrade: combined trigger of the CTD (Central Tracking Detector) with the MVD (Micro Vertex Detector) and STT (Straw Tube Tracker)
- Transition from Transputers to commodity hardware
M. Sutton

Outline

Online computing
Event processing & core software
  • Simulation
  • Software Frameworks
  • Developments in languages and tools
Distributed Computing in the Grid century
  • Wide area networking
  • Computer Fabrics
  • Grid Middleware
  • Grid experiences
Summary

Simulation

• Currently Geant3, Geant4, and FLUKA are being used for HEP and other applications
  • 3 LHC experiments use Geant4; ALICE and STAR are considering FLUKA
• Virtual Monte Carlo: an interface that allows an easy switch between different MCs (G3, G4, FLUKA); see the sketch after this slide
• Validation of simulation is done on test beams:
  • Macroscopic tests: measure the response of LHC detector components
  • Microscopic tests: typically thin-target setups to measure single interactions or cross sections

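To illustrate the Virtual Monte Carlo idea mentioned above, here is a minimal C++ sketch of the pattern: user code is written against one abstract interface and the concrete transport engine is selected at run time. The class and function names are illustrative assumptions, not the actual ROOT TVirtualMC API.

```cpp
// Hedged sketch of the Virtual Monte Carlo idea: user code talks to one
// abstract interface; the concrete transport engine (Geant3, Geant4, FLUKA)
// is chosen at run time.  All names below are illustrative, not the real
// ROOT TVirtualMC API.
#include <memory>
#include <string>
#include <stdexcept>
#include <iostream>

// Hypothetical common interface for a transport engine.
struct VirtualMC {
    virtual ~VirtualMC() = default;
    virtual void Init() = 0;
    virtual void ProcessEvent() = 0;
};

// Hypothetical concrete engines wrapping the real simulation packages.
struct Geant3MC : VirtualMC {
    void Init() override { std::cout << "Geant3 init\n"; }
    void ProcessEvent() override { std::cout << "Geant3 event\n"; }
};
struct Geant4MC : VirtualMC {
    void Init() override { std::cout << "Geant4 init\n"; }
    void ProcessEvent() override { std::cout << "Geant4 event\n"; }
};

// The framework picks the engine from a configuration string; geometry and
// user actions stay untouched when switching.
std::unique_ptr<VirtualMC> MakeEngine(const std::string& name) {
    if (name == "G3") return std::make_unique<Geant3MC>();
    if (name == "G4") return std::make_unique<Geant4MC>();
    throw std::runtime_error("unknown engine: " + name);
}

int main() {
    auto mc = MakeEngine("G4");   // switch to "G3" without touching user code
    mc->Init();
    mc->ProcessEvent();
    return 0;
}
```
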
Test beam at BESSY I

Advanced Concepts and Science Payloads
• Pure material samples: Cu, Si, Fe, Al, Ti, stainless steel
• Monochromatic photon beam
• HPGe detector
[Figure: test-beam setup geometry (beam, material samples, HPGe detector at 45°; dimensions of 67 mm and 40 mm)]
A. Owens, A. Peacock

Comparison of Simulations

• Most data/Geant3/Geant4 comparisons favor Geant4
• Geant4 electromagnetic physics has already been validated at the percent level; in the near future, efforts to reach the permil level are foreseen.
• A reasonable agreement between data and both Geant4 and FLUKA in hadronic physics was found; the shape of hadronic showers needs further improvement.

Parameterized Simulations

• Fast simulation is a MUST for the LHC to get from O(min) to <1 s/event

Software frameworks

• Reports on data models, reconstruction and analysis frameworks from ATLAS, BaBar, CMS, H1, STAR
• Component-based architectures
  • More modularity and flexibility in the assembly of the application
• OS: much of the software only works on Linux. Alternatives?
  • Solaris: still seen in "server" environments
  • Mac OS X: growing popularity (supported by LCG); compiler is the same as on Linux

Object Persistency

• ROOT I/O used by many experiments (all LHC experiments, H1, GLAST, BaBar, ...); a minimal ROOT I/O sketch follows below
• Hybrid solution: a relational DB for the catalog plus ROOT I/O for the data has proved very successful
• POOL implements the hybrid solution; widely used, grid-enabled

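As a concrete illustration of the ROOT I/O mentioned above, a minimal sketch of writing and reading back a TTree; the file, tree, and branch names are made up for the example.

```cpp
// Minimal ROOT I/O sketch: write a TTree with one branch, then read it back.
// File, tree and branch names are illustrative.
#include "TFile.h"
#include "TTree.h"
#include <iostream>

int main() {
    // Write a toy tree to a ROOT file.
    {
        TFile f("events.root", "RECREATE");
        TTree t("events", "toy event data");
        double pt = 0;
        t.Branch("pt", &pt, "pt/D");
        for (int i = 0; i < 1000; ++i) { pt = 0.1 * i; t.Fill(); }
        t.Write();
    }   // TFile destructor closes the file
    // Read it back.
    {
        TFile f("events.root", "READ");
        TTree* t = static_cast<TTree*>(f.Get("events"));
        double pt = 0;
        t->SetBranchAddress("pt", &pt);
        t->GetEntry(0);
        std::cout << "first pt = " << pt << "\n";
    }
    return 0;
}
```
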
Improvements of ROOT I/O

• TFile improvements
  • Large files and trees, Double32_t, XML output format
  • Support for non-instrumented classes
• Enhancements in I/O and tree queries for collections
  • Split collections
  • Fast histogramming of (potentially) any collection
  • Lifted restrictions on STL I/O: nested containers
  • Reading without compiled code
• TTree
  • Removed stringent requirements on CloneTree
  • Added support for auto-loading of referenced objects
  • Support for an RDBMS database back-end coming soon
• TTree queries (see the sketch below)
  • Can call any function taking numerical arguments
  • Can use arbitrary C++ and still use the branch names as variables
  • TTree friends linked by an index
Talk by P. Canal

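A hedged sketch of the TTree query and friend-tree features listed above, reusing the toy file from the previous example; the second file, its tree, and the run/event/correction branches are assumptions for illustration. It can be run as a ROOT macro.

```cpp
// Sketch of TTree queries and friend trees.  Files, trees and branches other
// than "events.root"/"pt" are assumed for the example.
#include "TFile.h"
#include "TTree.h"
#include "TMath.h"

void queries() {
    TFile f("events.root");
    TTree* t = static_cast<TTree*>(f.Get("events"));

    // Arbitrary C++/functions, with branch names used as variables:
    t->Draw("TMath::Sqrt(pt)", "pt > 10");

    // Friend tree linked by an index (assumes a second file whose tree holds
    // 'run', 'event' and 'correction' branches matching the main tree):
    TFile g("extra.root");
    TTree* e = static_cast<TTree*>(g.Get("extra"));
    e->BuildIndex("run", "event");   // index the friend by run/event
    t->AddFriend(e);                 // friend branches become usable in queries on t
    t->Draw("pt + extra.correction");
}
```
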
Languages

• Most experiments use more than one language
  • Use the best tool for each application
• Mostly C++
• Only legacy FORTRAN discussed
  • Still some being replaced (e.g. ZOOM replacing MINUIT)
  • The majority of what remains is in event generators
• Java (still) mostly confined to graphics and event viewing
  • Places where performance isn't the limiting factor
  • Used for event processing by the North American work on the ILC: estimated 20-30% slower than C++
• Python becoming more popular
  • One big use is quick prototyping
• Need for bridges
  • I/O exchange (LCIO, Java, ROOT I/O, XML data files)
  • Interpreter bindings (PyROOT, Java JNI)
  • Language-independent APIs (AIDA)

Standard C++ for HEP

• FNAL is a member of the ANSI C++ standards committee
• Likely to become standard soon:
  • Improved performance through
    • Enhanced function declarations
    • Move semantics (10-20 fold speed increase; see the sketch below)
  • Improved domain support with
    • A random number toolkit
    • Mathematical special functions (math.h has not changed in 30 years)
  • Improved core language through
    • Compile-time reflection
    • Dynamic libraries
• For the full list of improvements to the standard library and the core language, see the talk and http://www.open-std.org/jtc1/sc22/wg21
M. Paterno, W. Brown, FNAL

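At the time of the talk these were proposals; move semantics later entered the standard with C++11. A minimal sketch of the idea in present-day syntax, with an illustrative type and size:

```cpp
// Sketch of move semantics (standard since C++11): transferring ownership of
// a large buffer instead of copying it.  The type and size are illustrative.
#include <vector>
#include <utility>
#include <iostream>

std::vector<double> makeHits() {
    std::vector<double> hits(1000000, 0.5);  // large temporary
    return hits;                             // moved (or elided), not deep-copied
}

int main() {
    std::vector<double> a = makeHits();      // no copy of 10^6 doubles
    std::vector<double> b = std::move(a);    // steal a's buffer in O(1);
                                             // a is left valid but unspecified
    std::cout << "b holds " << b.size() << " hits\n";
    return 0;
}
```
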
Outline

Online computing
Event processing & core software
  • Simulation
  • Software Frameworks
Distributed Computing in the Grid century
  • Wide area networking
  • Computer Fabrics
  • Grid Middleware
  • Experiences
Summary

Grid Vision

What is a Grid?
"A Grid provides an abstraction for resource sharing and collaboration across multiple administrative domains ..."
(Source: NGG Expert Group, 16 June 2003, "European Grid Research 2005-2010")
• Benefits (for Industry & Business Grids and for e-Science)
  • Increased productivity by reducing Total Cost of Ownership
  • Any-type, anywhere, anytime services by/for all
  • Infrastructure for dynamic virtual organisations
  • Next-generation Internet services backbone
M. Lemke, European Commission

Grid Reality

Many different grids, e.g.:
• LCG-2: 30 sites, 3200 CPUs
• Grid3: 25 universities, 4 national labs, 2800 CPUs

LHC Data Grid Hierarchy

[Diagram: the LHC data grid hierarchy. The experiment online system sends ~100-1500 MBytes/sec to the CERN Tier 0+1 centre (PBs of disk, tape robot); Tier 1 centres (IN2P3, INFN, RAL, FNAL, ...) are connected at 10-40 Gbps; Tier 2 centres at ~1-10 Gbps; Tier 3 institutes and Tier 4 workstations at 1 to 10 Gbps, with physics data caches at the institutes. CERN/outside resource ratio ~1:2; Tier0 : Tier1 : Tier2 ~ 1:1:1. Tens of petabytes by 2007-8; an exabyte ~5-7 years later.]

Network for Grid

[The same tier diagram as above, emphasizing the network requirements: ~TByte/sec at the online system, 10-40 Gbps to the Tier 1 centres, ~1-10 Gbps to Tier 2/3, and 1 to 10 Gbps to the institutes.]

Existing Networks

Europe: GÉANT; USA: ESnet (DOE Energy Sciences Network); Japan: SINET
All at 10 Gbit/s currently, with upgrades in the near future; internationally linked
[Map: the ESnet backbone connecting DOE laboratories and university sites across the US]

Network development

• Network backbones and major links used by HENP and other fields are advancing rapidly
  • To the 10 G range in < 3 years; much faster than Moore's Law
• New HENP and DOE roadmaps: a factor of ~1000 bandwidth growth per decade
• We are learning to use long-distance 10 Gbps networks effectively
  • 2004 developments: up to 7.5 Gbps flows with TCP over 16,000 km (see the estimate below)
(B. Newman, Caltech)

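A back-of-the-envelope sketch of why such long-distance flows are demanding: the TCP window must cover the bandwidth-delay product. The propagation speed and path length below are rough assumptions for illustration.

```cpp
// Back-of-the-envelope bandwidth-delay product for a long-distance TCP flow.
// Assumptions (illustrative): 10 Gbit/s link, 16,000 km path, light travels
// at roughly 2e8 m/s in fibre, so the round-trip time is ~160 ms.
#include <cstdio>

int main() {
    const double bandwidth_bps = 10e9;   // 10 Gbit/s
    const double path_m        = 16e6;   // 16,000 km one way
    const double v_fibre_mps   = 2e8;    // ~2/3 of c in glass
    const double rtt_s         = 2 * path_m / v_fibre_mps;   // ~0.16 s
    const double bdp_bytes     = bandwidth_bps * rtt_s / 8;  // window needed

    std::printf("RTT ~ %.0f ms, TCP window needed ~ %.0f MB\n",
                rtt_s * 1e3, bdp_bytes / 1e6);  // ~160 ms, ~200 MB
    return 0;
}
```
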
The near future

GLORIAD (Global Ring Network for Advanced Applications Development): a 5-year proposal (with the US NSF) for expansion to 2.5G-10G Moscow-Amsterdam-Chicago-Pacific-Hong Kong-Pusan-Beijing in early 2005; a 10G ring around the northern hemisphere in 2007; multi-wavelength hybrid service from ~2008-9

Network status & outlook

• Today's R&D networks are excellent, reliable, and responsive to research needs (today they are under-loaded)
• In the future we will see a greater diversity of requirements from "high-demand" applications, including the Grid
• Today there is a global movement to develop new ways of networking for grids and research in general
• HEP must undertake network data challenges now to demonstrate that it really needs ~100 Gbit/s in the LHC era
Talk by P. Clarke

Computer Fabrics

• Scale (CERN, BNL, GridKa, Belle, FNAL, ...)
  • 500-1300 nodes
  • O(100) TB of disk, O(100) disk servers
  • O(1) PB of tape, O(100) tape servers
• Monitoring is a maturing field
  • Ganglia: widespread use; LEMON: similar
• Installation and maintenance philosophies
  • Rocks: reproducible installations (reinstall to update)
  • Quattor: actively manage the running environment
  • Grid underware is complex to install, including the grid installation server itself
• Storage management
  • CASTOR and dCache in full growth
  • SRM proliferating to support all major storage managers
• OS: still a large demand for OS variety
  • Move to RHES3 / Scientific Linux

Computer Fabrics on the Grid

• Tiers 0/1:
  • Turnkey solutions
  • Low-maintenance central services
  • "The challenge is to run cheap HW with minimal staff and moderate expertise"
• Tiers 2 and beyond:
  • Large variety: different HEP experiments, different Tier 0/1s
  • Large teams want tools to coordinate large and diverse services
  • Emphasis on coordinated information
  • Central servers to orchestrate the automation

Data management

• POOL widely used
• Interplay of database technology and native grid services for data distribution and replication:
  • FronTier (FNAL, running experiments); see the sketch of the caching pattern below
    • Decouples development and user data access
    • Scalable
    • Many commodity tools and techniques (Squid)
    • Simple to deploy
    • Sustainable infrastructure
  • LCG 3D (CERN, LHC experiments): convergence foreseen and conceivable
  • SAM, BaBar DM: experience with running experiments
  • gLite data management: new technology and experience

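To make the FronTier bullet concrete: conditions data are fetched over HTTP so that intermediate Squid proxies can cache the responses. A hedged sketch of that access pattern using libcurl; the server URL, query string, and proxy address are invented for the example and are not the actual FronTier client API.

```cpp
// Sketch of the FronTier-style access pattern: the client issues an HTTP
// request for a database query; a local Squid proxy caches the response so
// repeated requests never reach the central server.  URL, query and proxy
// below are illustrative assumptions, not the real FronTier protocol.
#include <curl/curl.h>
#include <string>
#include <iostream>

static size_t collect(char* data, size_t size, size_t nmemb, void* out) {
    static_cast<std::string*>(out)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* h = curl_easy_init();
    std::string payload;

    curl_easy_setopt(h, CURLOPT_URL,
                     "http://conditions.example.org/query?table=calib&run=1234");
    curl_easy_setopt(h, CURLOPT_PROXY, "http://squid.example.org:3128");
    curl_easy_setopt(h, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(h, CURLOPT_WRITEDATA, &payload);

    CURLcode rc = curl_easy_perform(h);   // served from the Squid cache if hot
    if (rc == CURLE_OK)
        std::cout << "received " << payload.size() << " bytes\n";

    curl_easy_cleanup(h);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}
```
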
Workload management

• BNL STAR SUMS system
  • Used for distributed / Grid simulation job submission
  • Emphasis on stability
  • Running experiments, lots of users!
• EGEE gLite WMS is being released
• Optimization by
  • "Phenomenological" estimates based on a few parameters such as CPU power and event size (J. Huth et al.); a toy sketch follows below
  • Workload monitoring (Sphinx)

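A toy sketch of what a "phenomenological" estimate from a few parameters could look like; the formula and all numbers are illustrative assumptions, not the actual model by J. Huth et al.

```cpp
// Toy "phenomenological" job-time estimate from a few parameters.
// The formula and every number here are illustrative assumptions.
#include <cstdio>

int main() {
    const double nEvents       = 1e5;   // events in the job
    const double secPerEvt1kSI = 1.0;   // seconds per event on a 1 kSI2k CPU (assumed)
    const double cpuPower_kSI  = 1.5;   // worker-node power in kSI2k (assumed)
    const double eventSize_MB  = 2.0;   // output size per event (assumed)
    const double stageOut_MBps = 50.0;  // transfer rate to storage (assumed)

    const double cpu_s = nEvents * secPerEvt1kSI / cpuPower_kSI;
    const double io_s  = nEvents * eventSize_MB / stageOut_MBps;
    std::printf("CPU ~ %.1f h, stage-out ~ %.1f h, total ~ %.1f h\n",
                cpu_s / 3600, io_s / 3600, (cpu_s + io_s) / 3600);
    return 0;
}
```
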
Grid Security

• Pre-Grid services like ssh on Grid machines are already under attack!
• People are developing tools to look for attacks.
• Grids still need to interface to the security used by pre-Grid systems like Kerberos, AFS and the WWW.
• We are developing tools to manage thousands of users in big experiments.
• Application-level software developers are starting to interface to Grid security systems.
Andrew McNab

Grid Middleware Status
M. Lamanna

• New generation of middleware becoming available:
  • LCG-2 -> gLite
    • LCG-2: focus on production and large-scale data handling, tested in the 2004/5 LHC data challenges
    • gLite: focus on analysis
  • GAE
    • Interactivity as a goal, as opposed to "production" mode
    • RPC-based web service framework (Clarens)
    • Compatibility with gLite to be explored
  • DIRAC
    • XML-RPC: no need for WSDL ...
    • Instant-messaging protocol for inter-service/agent communication
    • Interacts with other experiment-specific services (cf. "File-Metadata Management System For The LHCb Experiment")
• Some commonality in technology
  • Service-Oriented Architecture; Web Services
• The dynamics of the evolution of the middleware are very complex
  • Experience is injected into the projects via a large user community and data challenges

Grid Middleware – General remarks

How many middleware flavours? Many opposing forces are at work:
• Innovation, standardisation
• Sources of funding, national/regional services
• Functionality, security
• Stability, maturity & experience
• HEP goals: worldwide collaborations, reliability, ease of access
On the timescale of LHC startup we are going to have to live with a few different middleware implementations/standards.
But we really should try to avoid two solutions that do the same thing slightly differently!
Les Robertson, CERN

Grid experiences at the LHC ...

• All 4 LHC experiments performed 2004 data challenges:
  • Based on LCG-2 software
  • 30-60 TBytes (~30 M events) in 6-100k jobs and 972 MSI2k hours of total CPU
  • Focus on production and large-scale data handling
• Result:
  • ~65% job efficiency
  • Stability is the largest source of problems
Summary of I. Bird

... pre-Grid experience at CDF

• CDF: Global User Analysis Computing
  • Computing environment for user analysis with elements of global computing and the use of standard components
  • Consists of commodity Linux-based systems and file servers
  • Presently 300 TB of data + 1500 CPUs available to physicists
  • 770 registered users, 100% utilization
Alan Sill

Summary

• The conference covered a large variety of aspects of HEP computing
• Component-based architectures in all areas of software development
• Move to commodity hardware and the Linux OS
• The GRID is coming!