CHEP 04 conference report
Judith Katzy, DESY
Conference Contributions
522 participants
25 plenary talks
228 parallel talks organized in 7 parallel tracks:
Online Computing
Event Processing
Core Software
Distributed Computing Services
Distributed Computing Systems + Experiences
Computer Fabrics
Grid Security
3 Poster sessions
Industrial program
1 Birthday Party
All talks and papers available at http://www.cern.ch/chep04
Outline
Online computing
Event processing & core software
• Simulation
• Software Frameworks
• Developments in languages and tools
Distributed Computing in the Grid century
• Wide area networking
• Computer Fabrics
• Grid Middleware
• Experiences
Summary
Online Computing
Online computing track: very direct reports of findings from BES, D0, PHENIX, H1, ZEUS, ALICE, ATLAS, LHCb, ...
Major shift of technology: pervasive adoption of
commodity computing and networking in ALL
areas of trigger and DAQ
Growing importance of control software and
simulation
Pierre VANDE VYVRE – CERN/PH
Online Operating Systems
Online computing used to be a zoo of operating systems
Special OSs now tend to disappear from the landscape
The present generation of experiments is still using special kernels:
• But even there, Linux is becoming a credible competitor
Transition away from Windows
• PHENIX EB (Event Builder): Windows -> Linux for performance
New developments (even extremely demanding ones such as BTeV or the LHCb Level-1 trigger) plan to use a standard OS (Linux)
Pierre VANDE VYVRE – CERN/PH
Online @ HERA2
H1 HLT
- HERA luminosity upgrade
- Transition from VME SBCs to commodity hardware
- CORBA as transport layer for control and data
A.Campbell
ZEUS TRG L2
- HERA luminosity upgrade
- ZEUS detector upgrade: combined trigger of the CTD (Central Tracking Detector) with the MVD (Micro Vertex Detector) and the STT (Straw Tube Tracker)
- Transition from transputers to commodity hardware
M.Sutton
Outline
Online computing
Event processing & core software
• Simulation
• Software Frameworks
• Developments in languages and tools
Distributed Computing in the Grid century
• Wide area networking
• Computer Fabrics
• Grid Middleware
• Grid experiences
Summary
Simulation
Geant3, Geant4 and FLUKA are currently used for HEP and other applications
• 3 LHC experiments use G4; ALICE and STAR are considering FLUKA
Virtual Monte Carlo:
• an interface that allows an easy switch between different MCs (G3, G4, FLUKA); see the sketch below
Validation of simulation is done on test beams:
• Macroscopic tests:
measure the response of LHC detector components
• Microscopic tests:
typically thin-target setups to measure single interactions or cross sections
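The Virtual Monte Carlo idea can be illustrated with a minimal sketch: user code is written against one abstract transport interface, and the concrete engine is chosen at run time. The class and method names below (VirtualMC, ProcessEvent, Geant4MC, FlukaMC) are hypothetical stand-ins for illustration, not the actual ROOT TVirtualMC API.

// vmc_sketch.cxx -- conceptual sketch of the Virtual Monte Carlo pattern
#include <iostream>
#include <string>

// Hypothetical abstract transport interface (stand-in, not the real TVirtualMC).
class VirtualMC {
public:
  virtual ~VirtualMC() {}
  virtual void ProcessEvent() = 0;               // transport all particles of one event
  virtual std::string EngineName() const = 0;
};

// Concrete bindings: each wraps one transport engine behind the same interface.
class Geant4MC : public VirtualMC {
public:
  void ProcessEvent() { /* would call Geant4 transport here */ }
  std::string EngineName() const { return "Geant4"; }
};

class FlukaMC : public VirtualMC {
public:
  void ProcessEvent() { /* would call FLUKA transport here */ }
  std::string EngineName() const { return "FLUKA"; }
};

// The experiment framework only ever sees the abstract interface.
void RunSimulation(VirtualMC& mc, int nEvents) {
  for (int i = 0; i < nEvents; ++i) mc.ProcessEvent();
  std::cout << nEvents << " events simulated with " << mc.EngineName() << "\n";
}

int main() {
  Geant4MC g4;
  FlukaMC fluka;
  RunSimulation(g4, 10);      // switching engines requires no change to the framework code
  RunSimulation(fluka, 10);
  return 0;
}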
Test beam at BESSY I
Advanced Concepts and Science Payloads
Pure material samples:
• Cu
• Si
• Fe
• Al
• Ti
• Stainless steel
Monochromatic photon beam
HPGe detector
(Setup sketch: monochromatic photon beam incident on the material samples, viewed by an HPGe detector at 45° to the beam.)
A. Owens, A. Peacock
Comparison of Simulation
• Most data/Geant3/Geant4 comparisons favor G4
• Geant4 electromagnetic physics has already been validated at the percent level; in the near future, efforts to reach the permil level are foreseen.
• A reasonable agreement between data and both Geant4 and FLUKA was found in hadronic physics; the shape of the hadronic shower needs further improvement.
Parameterized Simulations
Fast simulation is a MUST for the LHC, to get from O(min) to <1 s/event
Software frameworks
Reports on data models, reconstruction and analysis frameworks from ATLAS, BaBar, CMS, H1, STAR
Component-based architectures
• More modularity and flexibility in assembling the application
OS: much of the software only works on Linux
• Alternatives?
Solaris
• Still seen in “server” environments
Mac OS X
• Growing popularity (supported by LCG)
• Compiler is the same as on Linux
Objects Persistency
ROOT I/O used by many experiments (all LHC, H1, GLAST, BaBar, ...)
Hybrid solution: a relational DB for the catalogue plus ROOT I/O for event data proved to be very successful
POOL: implements the hybrid solution; widely used, grid-enabled
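As an illustration of the event-data half of the hybrid approach, here is a minimal ROOT macro that writes toy events to a ROOT file as a TTree; in the hybrid/POOL scheme the logical file name and its physical replicas would additionally be registered in a relational file catalogue (not shown here). The branch names and values are made up for the example.

// write_events.C -- minimal ROOT I/O sketch (run with: root -l -b -q write_events.C)
#include "TFile.h"
#include "TTree.h"
#include "TRandom.h"

void write_events() {
  TFile f("events.root", "RECREATE");          // the "event data" part of the hybrid store
  TTree t("events", "toy events");

  Float_t px, py;                              // made-up event quantities
  Int_t   nhits;
  t.Branch("px", &px, "px/F");
  t.Branch("py", &py, "py/F");
  t.Branch("nhits", &nhits, "nhits/I");

  for (int i = 0; i < 10000; ++i) {            // fill a few toy events
    px = gRandom->Gaus(0, 1);
    py = gRandom->Gaus(0, 1);
    nhits = gRandom->Poisson(20);
    t.Fill();
  }

  t.Write();                                   // in the hybrid scheme, a relational catalogue
  f.Close();                                   // would now record "events.root" and its replicas
}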
Improvements of ROOT I/O
TFile improvements
• Large files and trees, Double32_t, XML output format
• Support for non-instrumented classes
Enhancements in I/O and tree queries for collections
• Split collections
• Fast histogramming of (potentially) any collection
• Lift restrictions on STL I/O
Nested containers
Reading without compiled code
TTree
• Remove stringent requirements on CloneTree
• Add support for auto-loading of referenced objects
• Support for RDBMS database back-ends coming soon
TTree Queries (see the example sketch below)
• Can call any function taking numerical arguments
• Can use arbitrary C++ and still use branch names as variables
• TTree friends linked by index
Talk by P.Canal
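A minimal sketch of the kind of TTree query meant above: branch names are used directly as variables inside an arbitrary expression, and numeric functions can be called in both the expression and the selection. It assumes the toy events.root file and branch names from the earlier sketch.

// query_events.C -- run with: root -l query_events.C
#include "TFile.h"
#include "TTree.h"

void query_events() {
  TFile f("events.root");                      // file written by the previous sketch
  TTree* t = (TTree*)f.Get("events");

  // Branch names act as variables; arbitrary numeric expressions are allowed.
  t->Draw("sqrt(px*px + py*py)", "nhits > 15");

  // The same expression machinery works for simple scans as well.
  t->Scan("px:py:nhits", "nhits > 30");
}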
Languages
Most experiments use more than one language
• Use the best tool for each application
Mostly C++
Only legacy FORTRAN was discussed
• Still some being replaced (e.g. ZOOM replacing MINUIT)
• The majority of what remains is in event generators
Java (still) mostly confined to graphics and event viewing
• Used where performance isn’t the limiting factor
• Used for event processing by the North American ILC work: estimated 20-30% slower than C++
Python becoming more popular
• One big use is quick prototyping
Need for bridges
• I/O exchanges (LCIO, Java, ROOT I/O, XML data files)
• Interpreter bindings (PyROOT, Java JNI)
• Language-independent APIs (AIDA)
Standard C++ for HEP
FNAL is a member of the ANSI C++ standards committee
Likely to become standard soon:
• Improve performance by
Enhanced function declarations
Move semantics (10-20-fold speed increase; sketched below)
• Improve domain support with
A random number toolkit
Mathematical special functions (math.h hasn’t changed in 30 years)
• Improve the core language with
Compile-time reflection
Dynamic libraries
For a long list of improvements to the standard library and the core language, please check the talk and
http://www.open-std.org/jtc1/sc22/wg21
M.Paterno,W.Brown, FNAL
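A minimal sketch of two of the proposed features, written in the syntax that was eventually standardized (it needs a C++11-capable compiler); the container size and distribution parameters are arbitrary illustration values.

// cxx_proposals_sketch.cxx -- random number toolkit + move semantics in a nutshell
#include <iostream>
#include <random>      // the proposed random number toolkit
#include <utility>     // std::move
#include <vector>

// Returning a large container by value: with move semantics the buffer is
// handed over instead of copied, which is where the big speed-ups come from.
std::vector<double> generate_hits(std::size_t n) {
  std::mt19937 engine(12345);                        // engine and distribution are
  std::normal_distribution<double> gauss(0.0, 1.0);  // separate concepts in the toolkit
  std::vector<double> hits(n);
  for (std::size_t i = 0; i < n; ++i) hits[i] = gauss(engine);
  return hits;                                       // moved, not copied
}

int main() {
  std::vector<double> hits = generate_hits(1000000);
  std::vector<double> owner = std::move(hits);       // explicit move: steal the buffer
  std::cout << "generated " << owner.size() << " hits\n";
  return 0;
}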
Outline
Online computing
Event processing & core software
• Simulation
• Software Frameworks
Distributed Computing in the Grid century
• Wide area networking
• Computer Fabrics
• Grid Middleware
• Experiences
Summary
Grid Vision
What is a Grid?
“A Grid provides an abstraction for resource sharing and collaboration across multiple administrative domains …”
(Source: NGG Expert Group, 16 June 2003, “European Grid Research 2005-2010”)
• Benefits
Increased productivity by reducing Total Cost of Ownership
Any-type, anywhere, anytime services by/for all
• Industry & Business: Grids as infrastructure for dynamic virtual organisations
• e-Science: Grids as the backbone for next-generation Internet services
M.Lemke, European Commission
Grid Reality
Many different grids:
• LCG-2: 30 sites, 3200 CPUs
• Grid3: 25 universities, 4 national labs, 2800 CPUs
LHC Data Grid Hierarchy
(Diagram: the online system at the experiment sends ~100-1500 MBytes/s to the CERN Tier 0+1 centre (PBs of disk, tape robot); Tier 1 centres (IN2P3, INFN, RAL, FNAL) are connected at 10-40 Gbps; Tier 2 centres at ~1-10 Gbps; Tier 3 institutes with physics data caches and Tier 4 workstations at 1-10 Gbps. CERN/outside resource ratio ~1:2; Tier0 : Tier1 : Tier2 ~1:1:1. Tens of petabytes by 2007-8, an exabyte ~5-7 years later.)
Network for Grid
(The same LHC data grid hierarchy diagram as above, here emphasising the network links and the aggregate ~TByte/s rate at the online system.)
Existing Networks
Europe: GÉANT, USA: ESnet (DOE Energy Sciences Network), Japan: SINET
All currently at 10 Gbit/s, with upgrades in the near future; internationally linked
(Map of the ESnet IP backbone connecting DOE laboratories and partner sites, e.g. BNL, FNAL, SLAC, LBNL, NERSC, LLNL, PPPL, MIT; 22 sites sponsored by the Office of Science.)
Network development
Network backbones and major links used by HENP and
other fields are advancing rapidly
To the 10 Gbps range in < 3 years; much faster than Moore’s Law
New HENP and DOE roadmaps: a factor of ~1000 bandwidth growth per decade
We are learning to use long-distance 10 Gbps networks effectively
2004 developments: up to 7.5 Gbps TCP flows over 16,000 km
(H. Newman, Caltech)
The near future
Global Ring Network for Advanced Applications Development
GLORIAD: a 5-year proposal (with the US NSF) for expansion to 2.5-10G
Moscow-Amsterdam-Chicago-Pacific-Hong Kong-Pusan-Beijing in early 2005; a 10G ring around the northern hemisphere in 2007;
multi-wavelength hybrid service from ~2008-9
Network status & outlook
Today’s R&D networks are excellent, reliable, and responsive to research needs
(today they are underloaded)
In the future we will see a greater diversity of requirements from “high demand” applications, including the Grid
Today there is a global movement to develop new ways of networking for grids and for research in general
HEP must undertake network data challenges now to demonstrate that it really needs ~100 Gbit/s in the LHC era
Talk by P.Clarke
(The LHC data grid hierarchy diagram is shown once more as an introduction to the computer fabrics discussion.)
Computer Fabrics
Scale (CERN, BNL, GridKa, Belle, FNAL, ...)
• 500-1300 nodes
• O(100) TB of disk, O(100) disk servers
• O(1) PB of tape, O(100) tape servers
Monitoring is a maturing field
• Ganglia: widespread use; LEMON: similar
Installation and maintenance philosophies
• Rocks: reproducible installations (reinstall to update)
• Quattor: actively manage the running environment
• Grid underware is complex to install
… including the grid installation server itself
Storage management:
• CASTOR and dCache in full growth
• SRM proliferating to support all major storage managers
OS: still a large demand for OS variety
• Move to RHES3 / Scientific Linux
Computer Fabrics on the Grid
Tiers 0,1:
• Turnkey solutions
• Low maintenance central services
“Challenge is to run cheap HW with minimal staff and
moderate expertise”
Tiers 2 and beyond:
• Large variety: different HEP experiments, different Tier 1/0s
• Large teams want tools to coordinate large and diverse services
• Emphasis on coordinated information
• Central servers to orchestrate the automation
Data management
POOL widely used
Interplay of database technology and native grid services for data distribution and replication
• FronTier (FNAL, running experiments); see the sketch below
Decouples development and user data access
Scalable
Many commodity tools and techniques (SQUID)
Simple to deploy
Sustainable infrastructure
• LCG 3D (CERN, LHC experiments); convergence foreseen
• SAM, BaBar DM
Experience with running experiments
• gLite data management
New technology; experience still to be gained
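The key point of the FronTier/SQUID approach, that clients fetch database payloads as plain HTTP objects which any commodity web cache can serve, can be sketched in a few lines of libcurl; the server name, proxy address and query path below are made-up placeholders, not FronTier’s actual URL scheme.

// frontier_style_fetch.cxx -- conceptual sketch only; build with: g++ frontier_style_fetch.cxx -lcurl
#include <curl/curl.h>

int main() {
  curl_global_init(CURL_GLOBAL_DEFAULT);
  CURL* handle = curl_easy_init();

  // A database query encoded as an ordinary HTTP URL (placeholder address).
  curl_easy_setopt(handle, CURLOPT_URL,
                   "http://dbserver.example.org/query?table=calib&run=1234");

  // Routing the request through a local SQUID cache is what decouples the
  // central database from the many user jobs (placeholder proxy address).
  curl_easy_setopt(handle, CURLOPT_PROXY, "http://squid.example.org:3128");

  curl_easy_perform(handle);     // response body is printed to stdout by default
  curl_easy_cleanup(handle);
  curl_global_cleanup();
  return 0;
}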
Workload management
BNL STAR SUMS system
• Used for distributed / Grid simulation job submission
• Emphasis on stability
Running experiments, lots of users!
EGEE gLite WMS is being released
Optimization by
• “Phenomenological” estimates based on a few parameters such as CPU power and event size (J. Huth et al.); a toy sketch follows below
• Workload monitoring (Sphinx)
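As an illustration of what such a phenomenological estimate looks like, here is a toy formula combining CPU power and event size into an expected job duration; the functional form and all numbers are invented for the example and are not the model of J. Huth et al.

// job_estimate_sketch.cxx -- toy job-duration estimate from a few parameters
#include <iostream>

double estimate_job_seconds(long   nEvents,
                            double kSI2kSecPerEvent,   // CPU cost per event at 1 kSI2k
                            double nodeKSI2k,          // CPU power of the worker node
                            double eventSizeMB,        // input size per event
                            double ioRateMBperSec) {   // effective staging bandwidth
  double cpuTime = nEvents * kSI2kSecPerEvent / nodeKSI2k;  // compute component
  double ioTime  = nEvents * eventSizeMB / ioRateMBperSec;  // data-movement component
  return cpuTime + ioTime;
}

int main() {
  // Illustrative values: 1000 events, 10 kSI2k-s/event, 1.5 kSI2k node, 2 MB/event, 20 MB/s.
  double t = estimate_job_seconds(1000, 10.0, 1.5, 2.0, 20.0);
  std::cout << "estimated job duration: " << t << " s\n";
  return 0;
}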
Grid Security
Pre-Grid services like ssh on Grid machines
are already under attack!
People are developing tools to look for
attacks.
Grids still need to interface to the security systems used by pre-Grid services such as Kerberos, AFS and the WWW
We are developing tools to manage 1000s of
users in big experiments.
Application-level software developers are
starting to interface to Grid security systems.
Andrew McNab
Grid Middleware Status
M.Lamanna
New generation of middleware becoming available:
• LCG-2 -> gLite
LCG-2: focus on production and large-scale data handling; tested in the 2004/5 LHC data challenges
gLite: focus on analysis
• GAE
Interactivity as a goal, as opposed to “production” mode
RPC-based web service framework (Clarens)
Compatibility with gLite to be explored
• DIRAC
XML-RPC: no need for WSDL…
Instant-messaging protocol for inter-service/agent communication
Interacts with other experiment-specific services (cf. “File-Metadata Management System for the LHCb Experiment”)
Some commonality of technology
• Service-oriented architecture; web services
The dynamics of the evolution of the middleware are very complex
• Experience is injected into the projects via a large user community / data challenges
Grid Middleware – General remarks
How many middleware flavours?
Many opposing forces at work: innovation, standardisation, sources of funding, national/regional services, functionality, security, stability, maturity & experience, and HEP goals (worldwide collaborations, reliability, ease of access)
On the timescale of LHC startup we are going to have to live with a few different middleware implementations/standards
But we really should try to avoid two solutions that do the same thing slightly differently!
Les Robertson, CERN
Grid experiences at LHC…
All 4 LHC experiments performed 2004 data challenges:
• Based on LCG-2 software
• 30-60 TBytes (~30M events) in 6-100k jobs and 972 MSI2k-hours of total CPU
• Focus on production and large-scale data handling
Result:
• ~65% job efficiency
• Stability is the largest source of problems
Summary by I.Bird
…pre-Grid experience at CDF
CDF: Global User Analysis Computing
• Computing environment for user analysis with elements of
global computing and use of standard components
• Consists of commodity Linux-based systems and file servers
• Presently 300 TB of data + 1500 CPUs available to physicists
• 770 registered users, 100% utilization
Alan Sill
Summary
The conference covered a large variety of aspects of HEP computing
Component-based architectures in all areas of software development
Move to commodity hardware and the Linux OS
The GRID is coming!