Caltech/CERN/HP Joint Project

The GIOD Project
(Globally Interconnected Object Databases)
For High Energy Physics
A Joint Project between Caltech(HEP and CACR), CERN and Hewlett Packard
http://pcbunn.cithep.caltech.edu/
I2-DSI Applications Workshop
Julian Bunn/Caltech & CERN
March 1999
CERN's Large Hadron Collider - 2005 to >2025
• Biggest machine yet built: a proton-proton collider
• Four experiments: ALICE, ATLAS, CMS, LHCb
March 5th, 1999
I2-DSI workshop: J.J.Bunn
WorldWide Collaboration
• CMS: >1700 physicists, 140 institutes, 30 countries
• 100 Mbytes/sec from online systems
• ~1 Pbyte/year raw data
• ~1 Pbyte/year reconstructed data
• Data accessible across the globe
Data Distribution Model
[Diagram: hierarchical data flow]
• Online System (fed at ~PBytes/sec from the detector) → ~100 MBytes/sec → Offline Processor Farm (~10 TIPS) → ~100 MBytes/sec → CERN Computer Centre
  • There is a "bunch crossing" every 25 nsecs; there are 100 "triggers" per second; each triggered event is ~1 MByte in size
• CERN Computer Centre → ~622 Mbits/sec (or air freight, deprecated) → Regional Centres (~1 TIPS each): USA, France, Italy, Germany
• Regional Centre → ~622 Mbits/sec → Institutes (~0.1 TIPS, with a physics data cache)
• Institute → ~1 MBytes/sec → Physicist workstations
• Physicists work on analysis "channels". Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
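As a quick sanity check of the rates above, a minimal Java sketch; the ~10^7 seconds of live running per year is an assumed figure (a common HEP rule of thumb), not stated on the slide:

    // Sanity-check the slide's data rates.
    // ASSUMPTION: ~1e7 seconds of live beam time per year (HEP rule of thumb).
    public class DataRates {
        static final double ONLINE_MBYTES_PER_SEC = 100.0;  // from the slide
        static final double LIVE_SECONDS_PER_YEAR = 1e7;    // assumption

        // Raw data per year in PBytes (1 PByte = 1e9 MBytes)
        static double rawPBytesPerYear() {
            return ONLINE_MBYTES_PER_SEC * LIVE_SECONDS_PER_YEAR / 1e9;
        }

        // Triggered events per year, at 100 triggers per second
        static double eventsPerYear() {
            return 100.0 * LIVE_SECONDS_PER_YEAR;
        }

        public static void main(String[] args) {
            System.out.println("raw data: ~" + rawPBytesPerYear() + " PBytes/year");
            System.out.println("events:   ~" + eventsPerYear() + " per year");
        }
    }

At 100 MBytes/sec this reproduces the ~1 Pbyte/year raw-data figure, and a yearly event count of order 10^9.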
Large Hadron Collider - Computing Models
• Requirement: Computing hardware, network and software systems to support timely and competitive analysis by a worldwide collaboration
• Solution: Hierarchical networked ensemble of heterogeneous, data-serving and processing computing systems
• Key technologies:
  • Object-Oriented software
  • Object Database Management Systems (ODBMS)
  • Sophisticated middleware for query brokering (agents)
  • Hierarchical Storage Management systems
  • Networked collaborative environments
The GIOD Project Goals
• Build a small-scale prototype Regional Centre using:
  • Object Oriented software, tools and an ODBMS
  • Large scale data storage equipment and software
  • High bandwidth LANs and WANs
• Measure, evaluate and tune the components of the Centre for LHC physics
• Confirm the viability of the proposed LHC Computing Models
• Use measurements of the prototype as input to simulations of realistic LHC Computing Models for the future
Why ODBMS?
• The OO programming paradigm:
  • is the modern, industry direction
  • is supported by the C++ and Java high-level languages
  • offers an excellent choice of both free and commercial class libraries
  • suits our problem space well: a rich hierarchy of complex data types (raw data, tracks, energy clusters, particles, missing energy, time-dependent calibration constants)
• Allows us to take full advantage of industry developments in software technology
• Need to make some objects "persistent":
  • raw data
  • newly computed, useful objects
• Need an object store that supports an evolving data model and scales to many PetaBytes (10^15 Bytes)
  • An (O)RDBMS won't work: one year's data would need a virtual table with 10^9 rows and many columns
• Require persistent heterogeneous object location transparency and replication
  • Multiple platforms, arrays of software versions, many applications, widely distributed in the collaboration
• Need to banish huge "logbooks" of correspondences between event numbers, run numbers, event types, tag information, file names, tape numbers, site names etc.
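The "rich hierarchy of complex data types" that suits an object store can be pictured with a minimal, hypothetical Java sketch. All class and field names here are invented for illustration; this is not the actual CMS schema and not the Objectivity API:

    // A toy event hierarchy: an event owns its tracks and energy clusters,
    // which an ODBMS can store directly as objects instead of flattening
    // everything into ~10^9 relational rows.
    import java.util.ArrayList;
    import java.util.List;

    public class SchemaSketch {
        static class Track {
            double pt;                       // transverse momentum (invented field)
            Track(double pt) { this.pt = pt; }
        }
        static class EnergyCluster {
            double energy;
            EnergyCluster(double e) { this.energy = e; }
        }
        static class Event {
            long run, eventNumber;
            byte[] rawData;                  // the raw detector payload
            List<Track> tracks = new ArrayList<>();
            List<EnergyCluster> clusters = new ArrayList<>();
            Event(long run, long num) { this.run = run; this.eventNumber = num; }
        }

        public static void main(String[] args) {
            Event e = new Event(42, 1001);
            e.tracks.add(new Track(15.3));
            e.clusters.add(new EnergyCluster(27.8));
            System.out.println("event " + e.eventNumber + ": "
                + e.tracks.size() + " track(s), " + e.clusters.size() + " cluster(s)");
        }
    }

With persistence, navigating from an event to its tracks replaces the "logbook" lookups between event numbers, file names and tape numbers.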
ODBMS - choice of Objectivity/DB
• Commercial ODBMS:
  • embody hundreds of person-years of development effort
  • tend to conform to standards
  • offer a rich set of management tools and language bindings
• At least one (Objectivity/DB) seems capable of handling PetaBytes
• Objectivity is the best choice for us right now:
  • Very large databases can be created as "Federations" of very many smaller databases, which are themselves distributed and/or replicated amongst servers on the network
  • Features data replication and fault tolerance
  • I/O performance, overhead and efficiency are very similar to traditional HEP systems
  • OS support (NT, Solaris, Linux, Irix, AIX, HP-UX, etc.)
  • Language bindings (C++, Java, [C, SmallTalk, SQL++ etc.])
  • Commitment to HEP as a target business sector
  • Close relationships built up with the company, at all levels
  • Attractive licensing schemes for HEP
Storage management - choice of HPSS
• Need to "backend" the ODBMS with a large scale media management system, because:
  • Tapes are still foreseen to be the most cost effective medium (may be DVDs in practice)
  • System reliability is not enough to avoid "backup copies"
• Unfortunately, large scale data archives are a niche market
• HPSS is currently the best choice:
  • Appears to scale into the PetaByte storage range
  • Heavy investment of CERN/Caltech/SLAC… effort to make HPSS evolve in directions suited for HEP
  • Unfortunately, only supported on a couple of platforms
• A layer between the ODBMS and an HPSS filesystem has been developed: it is interfaced to Objectivity's Advanced Multithreaded Server. This is one key to development of the system middleware.
ODBMS worries
• Buoyancy of the commercial marketplace?
  + Introduction of Computer Associates' "Jasmine" pure ODBMS (targeted at multimedia data)
  + Oracle et al. paying lip-service to OO, with Object features "bolted on" to their fundamentally RDBMS technology
  - Breathtaking fall of Versant stock!
  - Still no IPO for Objectivity
• Conversion of "legacy" ODBMS data from one system to another?
  • 100 PetaBytes via an ODMG-compliant text file?!
  • A good argument for keeping raw data outside the ODBMS, in simple binary files (BUT this doubles storage needs)
Federated Database - Views of the Data
[Diagram: federated database views relating Detector, Track and Hit objects]
ODBMS tests
• Developed a simple scaling application: matching 1000s of sky objects at different wavelengths
• Runs entirely in cache (disk I/O performance can be neglected); applies a matching algorithm between pairs of objects in different databases (one database per wavelength)
• Looked at usability, efficiency and scalability for:
  • number of objects
  • location of objects
  • object selection mechanism (matching using indexes, predicates or cuts)
  • database host platform
• Results:
  • Application is platform independent
  • Database is platform independent
  • No performance loss for a remote client
  • Fastest access: objects are "indexed"
  • Slowest: using predicates
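The idea of the scaling test can be sketched in Java. All names and the matching criterion here are hypothetical; the real application matched sky objects between two Objectivity databases. The sketch contrasts an "indexed" match (sorted lookup) with a "predicate" match (linear scan), echoing the finding that indexed access was fastest:

    // Match objects observed at two wavelengths by (1-D) position.
    import java.util.TreeSet;

    public class SkyMatch {
        // Indexed match: positions of the second set held in a sorted index,
        // so each lookup is a range query - O(n log n) overall.
        static int matchIndexed(double[] a, double[] b, double tol) {
            TreeSet<Double> index = new TreeSet<>();
            for (double x : b) index.add(x);
            int matches = 0;
            for (double x : a) {
                Double near = index.ceiling(x - tol);
                if (near != null && near <= x + tol) matches++;
            }
            return matches;
        }

        // Predicate match: test every pair - O(n^2).
        static int matchPredicate(double[] a, double[] b, double tol) {
            int matches = 0;
            for (double x : a)
                for (double y : b)
                    if (Math.abs(x - y) <= tol) { matches++; break; }
            return matches;
        }

        public static void main(String[] args) {
            double[] red  = {1.00, 2.50, 7.10};
            double[] blue = {1.01, 4.00, 7.05};
            // Both strategies find the same matches; only the cost differs.
            System.out.println(matchIndexed(red, blue, 0.1));
            System.out.println(matchPredicate(red, blue, 0.1));
        }
    }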
ODBMS tests
• Other tests:
  • Looked at Java binding performance (~3 times slower)
  • Created a federated database in HPSS managed storage, using NFS export
  • Tested database replication from CERN to Caltech
[Chart: create and commit times (milliseconds) for LAN and WAN replication vs. update number (time of day); the saturated link ran at ~10 kbits/second, the unsaturated link at ~1 Mbits/second]
ODBMS tests
• The Caltech Exemplar was used as a convenient testbed for Objectivity multiple-client tests
• Evaluated the usability and performance of the Versant ODBMS, Objectivity's main competitor
• Results: the Exemplar is very well suited for this workload; with two (of four) node filesystems it was possible to utilise 150 processors in parallel with very high efficiency
• Results: Versant is a decent "fall-back" solution for us
• Outlook: expect to utilise all processors with near 100% efficiency when all four filesystems are engaged
GIOD - Database of "real" LHC events
• Would like to evaluate system performance with realistic Event objects
• Caltech/HEP submitted a successful proposal to NPACI to generate ~1,000,000 fully-simulated multi-jet QCD events
  • Directly study Higgs backgrounds for the first time
  • The computing power of Caltech's 256-CPU (64 Gbyte shared memory) HP Exemplar makes this possible in a few months
• Event production has run on the Exemplar since May '98: ~1,000,000 events of 1 MByte each
• Used by GIOD as a copious source of "raw" LHC event data
• Events are analysed using Java Analysis Studio and "scanned" using a Java applet
Large data transfer over the CERN-USA link to Caltech
[Chart annotations: Try one file…, Let it rip, HPSS fails, Tidy up…]
• Transfer of ~31 GBytes of Objectivity databases from Shift20/CERN to HPSS/Caltech
• Achieved ~11 GBytes/day (equivalent to ~4 Tbytes/year, and equivalent to 1 Pbyte/year on a 622 Mbits/sec link)
• An HPSS hardware problem at Caltech, not the network, caused the transfer to abort
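The bracketed equivalences follow from simple arithmetic, sketched below; the yearly figure scales 11 GBytes/day by 365 days, while the slide's 1 Pbyte/year figure additionally assumes the same transfer efficiency applied to a 622 Mbits/sec link:

    // Arithmetic behind the slide's throughput equivalences.
    public class TransferRates {
        // 11 GBytes/day expressed as a sustained bit rate in Mbits/sec
        static double achievedMbits() {
            return 11e9 * 8 / 86400 / 1e6;   // roughly 1 Mbit/s sustained
        }
        // 11 GBytes/day carried on for a full year, in TBytes
        static double tbytesPerYear() {
            return 11e-3 * 365;              // roughly 4 TBytes
        }
        public static void main(String[] args) {
            System.out.printf("sustained: %.2f Mbits/s%n", achievedMbits());
            System.out.printf("per year:  %.2f TBytes%n", tbytesPerYear());
        }
    }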
GIOD - Database Status
• Over 200,000 fully simulated di-jet events in the database
• Population is continuing, using parallel jobs on the Exemplar (from a pool of over 1,000,000 events)
• Building the TAG database:
  • For optimising queries, each event is summarised by a small object that contains the salient features
  • The TAG objects are kept in a dedicated database, which is replicated to client machines
• Preparing for WAN tests with SDSC
• Preparing for HPSS/AMS installation and tests
• For MONARC: making a replica at Padua/INFN (Italy)
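The TAG idea can be sketched in Java; the field names and the selection cut below are invented for illustration, not taken from the GIOD schema. Queries run first over the small tag objects, so only matching events need be fetched from the full (~1 MByte/event) database:

    // A toy TAG database: one small summary object per event.
    import java.util.Arrays;
    import java.util.List;
    import java.util.stream.Collectors;

    public class TagSketch {
        static class EventTag {
            long eventNumber;
            int nJets;
            double sumET;        // scalar sum of transverse energy (invented field)
            EventTag(long n, int j, double s) { eventNumber = n; nJets = j; sumET = s; }
        }

        // Select candidate di-jet events using the tags alone.
        static List<Long> selectDijets(List<EventTag> tags, double minSumET) {
            return tags.stream()
                       .filter(t -> t.nJets >= 2 && t.sumET >= minSumET)
                       .map(t -> t.eventNumber)
                       .collect(Collectors.toList());
        }

        public static void main(String[] args) {
            List<EventTag> tags = Arrays.asList(
                new EventTag(1, 2, 120.0),
                new EventTag(2, 1, 200.0),
                new EventTag(3, 3, 95.0));
            // Only event 1 has >= 2 jets AND sumET >= 100
            System.out.println(selectDijets(tags, 100.0));
        }
    }

Replicating such a tag database to client machines, as the slide describes, keeps the common first pass of a query entirely local.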
GIOD - WAN/LAN Tests
[Diagram: parallel CMSOO production jobs feed a temporary FDB (10 GByte) and the master FDB (80 GByte) over 155 Mbits/s links; DB files (ftp) and Oofs traffic travel over an OC12/622 Mbits/s link to the San Diego SuperComputing Center (SDSC), which holds a clone FDB]
MONARC - Models Of Networked Analysis At Regional Centers
Caltech, CERN, FNAL, Heidelberg, INFN, KEK, Marseilles, Munich, Orsay, Oxford, Tufts, …
• GOALS:
  • Specify the main parameters characterizing the Model's performance: throughputs, latencies
  • Determine classes of Computing Models feasible for LHC (matched to network capacity and data handling resources)
  • Develop "Baseline Models" in the "feasible" category
  • Verify resource requirement baselines (computing, data handling, networks)
• REQUIRES:
  • Define the Analysis Process
  • Define Regional Centre Architectures
  • Provide Guidelines for the final Models
[Diagram: FNAL (2×10^6 MIPS, 200 Tbyte robot), Caltech (n×10^6 MIPS, 50 Tbyte robot) and CERN (10^7 MIPS, 200 Tbyte robot), each serving desktops]
JavaCMS - 2D Event Viewer Applet
• Created to aid in Track Fitting algorithm development
• Fetches objects directly from the ODBMS
• The Java binding to the ODBMS is very convenient to use
CMSOO - Java 3D Applet
• Attaches to any GIOD database and allows the user to view/scan all events in the federation, at multiple detail levels
• Demonstrated at the Internet-2 meeting in San Francisco in Sep '98 and at SuperComputing '98 in Florida at the iGrid, NPACI and CACR stands
• Running on a 450 MHz HP "Kayak" PC with an fx4 graphics card: excellent frame rates in free rotation of a complete event (~5 times the performance of a Riva TNT)
• Developments: "drill down" into the database for picked objects; refit tracks
Java Analysis Studio

public void processEvent(final EventData d) {
    final CMSEventData data = (CMSEventData) d;
    final double ET_THRESHOLD = 15.0;

    Jet jets[] = new Jet[2];
    Iterator jetItr = (Iterator) data.getObject("Jet");
    if (jetItr == null) return;

    int nJets = 0;
    double sumET = 0.;
    FourVectorRecObj sum4v = new FourVectorRecObj(0., 0., 0., 0.);

    // Loop over the jets: accumulate the total transverse energy and the
    // summed four-vector, keeping (at most) the first two jets above threshold
    while (jetItr.hasMoreElements()) {
        Jet jet = (Jet) jetItr.nextElement();
        sum4v.add(jet);
        double jetET = jet.ET();
        sumET += jetET;
        if (jetET > ET_THRESHOLD && nJets <= 1) {
            jets[nJets] = jet;
            nJets++;
        }
    }
    njetHist.fill(nJets);

    if (nJets >= 2) {
        // dijet event: histogram its mass, the total ET, the missing ET
        // and the two leading jet ETs
        FourVectorRecObj dijet4v = jets[0];   // note: aliases jets[0], modified in place
        dijet4v.add(jets[1]);
        massHist.fill(dijet4v.get_mass());
        sumetHist.fill(sumET);
        missetHist.fill(sum4v.pt());
        et1vset2Hist.fill(jets[0].ET(), jets[1].ET());
    }
}
GIOD - Summary
• LHC Computing models specify:
  • Massive quantities of raw, reconstructed and analysis data in an ODBMS
  • Distributed data analysis at CERN, Regional Centres and Institutes
  • Location transparency for the end user
• GIOD is investigating:
  • Usability, scalability and portability of Object Oriented LHC codes
  • In a hierarchy of large servers and medium/small client machines
  • With fast LAN and WAN connections
  • Using realistic raw and reconstructed LHC event data
• GIOD has:
  • Constructed a large set of fully simulated events and used these to create a large OO database
  • Learned how to create large database federations
  • Developed prototype reconstruction and analysis codes that work with persistent objects
  • Deployed facilities and database federations as useful testbeds for Computing Model studies
GIOD - Interest in I2-DSI
• LHC Computing needs timely access to powerful resources:
  • Measure the prevailing network conditions
  • Predict and manage the (short term) future conditions
  • Implement QoS with policies on end-to-end links
  • Provide for the movement of large datasets
  • Match the Network, Storage and Compute resources to the needs
  • Synchronize their availability in real time
  • Overlay the distributed, tightly coupled ODBMS on a loosely coupled set of heterogeneous servers on the WAN
• Potential areas of research with I2-DSI:
  • Test ODBMS replication in burst mode, using I2 backbones up to the Gbits/sec range
  • Experiment with data "localization" strategies: the roles of caching, mirroring and channeling, and their interaction with Objectivity/DB
  • Experiment with policy-based resource allocation strategies
  • Evaluate autonomous agent implementations