presentation source - PCBUNN

Experience with Objectivity in the GIOD Project
The GIOD Project (Globally Interconnected Object Databases)
A Joint Project between Caltech (HEP and CACR), CERN and Hewlett Packard
http://pcbunn.cithep.caltech.edu/
Digital Sky @ Caltech
Julian Bunn/CERN
April 1999
CERN’s Large Hadron Collider - 2005 to >2025
- Biggest machine yet built: a proton-proton collider
- Four experiments: ALICE, ATLAS, CMS, LHCb
April 1997
Digital Sky meeting.
J.J.Bunn
WorldWide Collaboration
- CMS
  - >1700 physicists
  - 140 institutes
  - 30 countries
- 100 Mbytes/sec from online systems
- ~1 Pbyte/year raw data
- ~1 Pbyte/year reconstructed data
- Data accessible across the globe
Data Distribution Model
[Diagram: tiered data flow down to the physicist workstations. Labels, from top to bottom:]
- ~PBytes/sec into the Online System
- ~100 MBytes/sec from the Online System to the Offline Processor Farm (~10 TIPS)
- ~100 MBytes/sec to the CERN Computer Centre
- ~622 Mbits/sec, or Air Freight (deprecated), to the Regional Centres: USA (~1 TIPS), France, Italy, Germany
- ~622 Mbits/sec to the Institutes (~0.1 TIPS), each with a physics data cache
- ~1 MBytes/sec to the physicist workstations
Notes on the diagram:
- There is a “bunch crossing” every 25 nsecs.
- There are 100 “triggers” per second.
- Each triggered event is ~1 MByte in size.
- Physicists work on analysis “channels”.
- Each institute will have ~10 physicists working on one or more channels.
- Data for these channels should be cached by the institute server.
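The rates quoted on this slide can be checked with a little arithmetic. The sketch below multiplies the trigger rate by the event size and then estimates a yearly volume. The figure of 10^7 seconds of data taking per year is an assumption (the slide does not state it), and the class and method names are purely illustrative.

```java
// Back-of-envelope check of the data rates quoted on the slide.
// ASSUMPTION: about 1e7 seconds of data taking per year (not from the slide).
final class DataRates {
    static final double TRIGGERS_PER_SEC = 100.0;       // from the slide
    static final double EVENT_SIZE_MBYTES = 1.0;        // from the slide
    static final double LIVE_SECONDS_PER_YEAR = 1.0e7;  // assumed

    /** Rate out of the online system, in MBytes/sec. */
    static double onlineRateMBytesPerSec() {
        return TRIGGERS_PER_SEC * EVENT_SIZE_MBYTES;
    }

    /** Raw data volume per year, in PBytes (1 PByte = 1e9 MBytes). */
    static double rawPBytesPerYear() {
        return onlineRateMBytesPerSec() * LIVE_SECONDS_PER_YEAR / 1.0e9;
    }

    /** Number of triggered events per year. */
    static double eventsPerYear() {
        return TRIGGERS_PER_SEC * LIVE_SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        System.out.println("Online rate: " + onlineRateMBytesPerSec() + " MBytes/sec");
        System.out.println("Raw data:    " + rawPBytesPerYear() + " PBytes/year");
        System.out.println("Events:      " + eventsPerYear() + " per year");
    }
}
```

Under the assumed live time this gives about 100 MBytes/sec, about 1 PByte of raw data per year and about 10^9 events per year, which is in line with the figures quoted elsewhere in the talk.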
Why ODBMS?
- OO programming paradigm
  - is the modern, industry direction
  - supported by C++, Java high level languages
  - excellent choice of both free and commercial class libraries
  - suits our problem space well: rich hierarchy of complex data types (raw data, tracks, energy clusters, particles, missing energy, time-dependent calibration constants)
  - allows us to take full advantage of industry developments in software technology
- Need to make some objects “persistent”
  - raw data
  - newly computed, useful, objects
- Need an object store that supports an evolving data model and scales to many PetaBytes (10^15 Bytes)
  - (O)RDBMS won’t work: one year’s data would need a virtual table with 10^9 rows and many columns
- Require persistent heterogeneous object location transparency, replication
  - Multiple platforms, arrays of software versions, many applications, widely distributed in collaboration
- Need to banish huge “logbooks” of correspondences between event numbers, run numbers, event types, tag information, file names, tape numbers, site names etc.
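To make the “rich hierarchy of complex data types” concrete, here is a minimal in-memory sketch of an event that holds raw hits and reconstructed tracks, with the tracks referring back to their hits. Every class and field name is hypothetical, no ODBMS API is used, and nothing here is persistent. The point is only that related objects are reached by following references rather than by looking up event and run numbers in a separate logbook.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory sketch of a HEP event model. All names are
// illustrative; no ODBMS API is used and nothing is made persistent.
final class EventModelSketch {
    static final class Hit {
        final int channel;
        final double energy;
        Hit(int channel, double energy) {
            this.channel = channel;
            this.energy = energy;
        }
    }

    static final class Track {
        final List<Hit> hits = new ArrayList<>();   // references to raw hits

        double summedHitEnergy() {
            double sum = 0.0;
            for (Hit h : hits) {
                sum += h.energy;
            }
            return sum;
        }
    }

    static final class Event {
        final int runNumber;
        final int eventNumber;
        final List<Hit> rawHits = new ArrayList<>();
        final List<Track> tracks = new ArrayList<>();
        Event(int runNumber, int eventNumber) {
            this.runNumber = runNumber;
            this.eventNumber = eventNumber;
        }
    }

    /** Builds a tiny event: three raw hits, one track using two of them. */
    static Event exampleEvent() {
        Event ev = new Event(1, 42);
        Hit a = new Hit(10, 1.5);
        Hit b = new Hit(11, 2.0);
        Hit c = new Hit(12, 0.25);
        ev.rawHits.add(a);
        ev.rawHits.add(b);
        ev.rawHits.add(c);
        Track t = new Track();
        t.hits.add(a);
        t.hits.add(b);
        ev.tracks.add(t);
        return ev;
    }

    public static void main(String[] args) {
        Event ev = exampleEvent();
        // Navigate event -> track -> hits without any external lookup table.
        System.out.println("Run " + ev.runNumber + ", event " + ev.eventNumber
                + ": first track hit energy = " + ev.tracks.get(0).summedHitEnergy());
    }
}
```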
ODBMS - choice of Objectivity/DB
- Commercial ODBMS
  - embody hundreds of person-years of effort to develop
  - tend to conform to standards
  - offer rich set of management tools & language bindings
- Objectivity is the best choice for us right now
  - Architecture of Objectivity/DB seems capable of handling PetaBytes.
  - Very large databases can be created as “Federations” of very many smaller databases, which themselves are distributed and/or replicated amongst servers on the network
  - Features data replication and fault tolerance
  - I/O performance, overhead and efficiency are very similar to traditional HEP systems
  - OS support (NT, Solaris, Linux, Irix, AIX, HP-UX, etc.)
  - Language bindings (C++, Java, [C, SmallTalk, SQL++ etc.])
  - Commitment to HEP as target business sector
  - Close relationships built up with the company, at all levels
  - Attractive licensing schemes for HEP
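The idea of a federation of many smaller databases, with objects located transparently, can be sketched as follows. This is not the Objectivity API: it is a plain Java model that assumes a four-part object identifier (database, container, page, slot) and a hypothetical catalogue mapping each database number to the host that serves it. The host names are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Plain Java model of a federation catalogue; NOT the Objectivity API.
// ASSUMPTION: objects are addressed by a four-part identifier.
final class FederationSketch {
    /** Hypothetical object identifier: database, container, page, slot. */
    static final class ObjectId {
        final int database;
        final int container;
        final int page;
        final int slot;
        ObjectId(int database, int container, int page, int slot) {
            this.database = database;
            this.container = container;
            this.page = page;
            this.slot = slot;
        }
        @Override
        public String toString() {
            return "#" + database + "-" + container + "-" + page + "-" + slot;
        }
    }

    // Hypothetical catalogue: database number -> host serving that database.
    private final Map<Integer, String> hostOfDatabase = new HashMap<>();

    /** Registers a database file as being served by the given host. */
    void attach(int database, String host) {
        hostOfDatabase.put(database, host);
    }

    /** Resolves an identifier to a host; the caller never names a file or site. */
    String locate(ObjectId id) {
        String host = hostOfDatabase.get(id.database);
        if (host == null) {
            throw new IllegalArgumentException("Database " + id.database + " is not attached");
        }
        return host;
    }

    public static void main(String[] args) {
        FederationSketch federation = new FederationSketch();
        federation.attach(1, "host-at-cern.example");     // made-up host names
        federation.attach(2, "host-at-caltech.example");
        ObjectId track = new ObjectId(2, 7, 130, 4);
        System.out.println(track + " is served by " + federation.locate(track));
    }
}
```

Moving or replicating a database then only changes the catalogue entry; application code that holds identifiers is unaffected, which is the property the slide calls location transparency.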
Federated Database - Views of the Data
[Diagram labels: Track, Hit, Detector]
GIOD - Database Tests (I)
- Developed simple scaling test application:
  - Looked at usability, efficiency and scalability while varying
    - number of objects
    - location of objects
    - object selection mechanism
    - database host platform
- Results
  - Application is platform independent
  - Database is platform independent
  - No performance loss for remote client
  - Fastest access: objects are “indexed”
  - Slowest: using predicates
- Tested database replication from CERN to Caltech
  - Remote database replication transparent given sufficient bandwidth
- Tested creating federated database in HPSS managed storage, using an NFS export
  - Can use HPSS/NFS interface to store databases, but performance poor and timeouts a problem
[Figure: create and commit times (apparently in milliseconds, axis 0 to 20000) on LAN and WAN, plotted against Update Number (Time of Day), axis 0 to 250. Annotations: Saturated hours ~10 kbits/second; Unsaturated ~1 Mbits/second.]
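The difference between indexed access and predicate selection noted in the results can be illustrated with ordinary Java collections. This is only an analogy under stated assumptions: a hash map stands in for a database index and a linear scan stands in for a predicate query, and the `Track` class and its fields are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Analogy only: a HashMap plays the role of an index, a linear scan the
// role of a predicate query. No database is involved.
final class SelectionSketch {
    static final class Track {
        final int id;
        final double pt;
        Track(int id, double pt) {
            this.id = id;
            this.pt = pt;
        }
    }

    private final List<Track> all = new ArrayList<>();
    private final Map<Integer, Track> byId = new HashMap<>();

    void store(Track t) {
        all.add(t);
        byId.put(t.id, t);   // maintain the "index" on insertion
    }

    /** Indexed access: one hash lookup, independent of the number of objects. */
    Track findIndexed(int id) {
        return byId.get(id);
    }

    /** Predicate selection: every stored object is examined. */
    List<Track> findByPredicate(Predicate<Track> wanted) {
        List<Track> out = new ArrayList<>();
        for (Track t : all) {
            if (wanted.test(t)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        SelectionSketch store = new SelectionSketch();
        for (int i = 0; i < 1000; i++) {
            store.store(new Track(i, i));
        }
        System.out.println("Indexed lookup found track " + store.findIndexed(42).id);
        System.out.println("Predicate scan found "
                + store.findByPredicate(t -> t.pt >= 990.0).size() + " tracks");
    }
}
```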
GIOD - WAN/LAN Tests (I)
Large data transfer over CERN-USA link to Caltech
[Figure annotations, in order: Try one file ...; Let it rip; HPSS fails; Tidy up ...]
- Transfer of ~31 GBytes of Objectivity databases from CERN to HPSS at Caltech
- Achieved ~11 GBytes/day (equivalent to ~4 Tbytes/year, equivalent to 1 Pbyte/year on a 622 Mbits/sec link)
- HPSS hardware problem at Caltech, not network, caused transfer to abort
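The scaling from a daily volume to yearly volumes can be worked through explicitly. The sketch below assumes decimal units (1 GByte = 10^9 bytes) and a 365-day year; neither is stated on the slide, and the names are illustrative.

```java
// Arithmetic behind the transfer figures quoted on the slide.
// ASSUMPTIONS: decimal units (1 GByte = 1e9 bytes) and a 365-day year.
final class TransferRates {
    static final double SECONDS_PER_DAY = 86_400.0;
    static final double DAYS_PER_YEAR = 365.0;

    /** Yearly volume in TBytes for a sustained daily volume in GBytes. */
    static double tBytesPerYear(double gBytesPerDay) {
        return gBytesPerDay * DAYS_PER_YEAR / 1000.0;
    }

    /** Average line rate in Mbits/sec for a sustained daily volume in GBytes. */
    static double mbitsPerSec(double gBytesPerDay) {
        return gBytesPerDay * 1.0e9 * 8.0 / SECONDS_PER_DAY / 1.0e6;
    }

    /** Yearly volume in PBytes carried by a link running flat out at the given rate. */
    static double pBytesPerYearAt(double mbitsPerSec) {
        return mbitsPerSec * 1.0e6 / 8.0 * SECONDS_PER_DAY * DAYS_PER_YEAR / 1.0e15;
    }

    public static void main(String[] args) {
        System.out.printf("11 GBytes/day = %.2f TBytes/year%n", tBytesPerYear(11.0));
        System.out.printf("11 GBytes/day = %.2f Mbits/sec on average%n", mbitsPerSec(11.0));
        System.out.printf("622 Mbits/sec = %.2f PBytes/year if saturated%n", pBytesPerYearAt(622.0));
    }
}
```

With these assumptions, 11 GBytes/day is about 4.0 TBytes/year at an average of roughly 1 Mbit/sec. A fully saturated 622 Mbits/sec link would carry about 2.45 PBytes/year, so the 1 Pbyte/year quoted on the slide would correspond to a utilisation of roughly 40%; that reading is an inference, not something the slide states.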
GIOD - WAN/LAN Tests (II)
- The traffic destined for the database is exclusively object data, and consists of raw event objects, reconstructed objects and analysis objects, together with small "tag" objects.
- The write rate to the disk is always lower than the input rate from the ATM.
- There is significant traffic towards the Reconstruction client, particularly when the client starts up: unexpected (it is schema information)
- Summary: A lot of traffic in both directions on the ATM: only a small fraction of it ends up as objects in the database.
GIOD - Database Tests (II)
- Caltech Exemplar used as a convenient testbed for Objy multiple-client tests
- Initial tests showed instability of lockserver
  - subsequently ran lockserver on a different machine
- Tested clients running simulated Track reconstruction
  - CPU-intensive with modest I/O.
- Event level (coarse-grained) parallelism
- N = 15 to 210 reconstruction processes evenly distributed in the Exemplar system.
- Data in an Objectivity/DB database federation
  - hosted on the Exemplar.
- Objects read with simple read-ahead optimisation layer.
  - Performance gain of a factor 2
- Results: Exemplar very well suited for this workload. With two (of four) node filesystems it was possible to utilise 150 processors in parallel with very high efficiency.
- Outlook: expect to utilise all processors with near 100% efficiency when all four filesystems are engaged.
- (Work by Koen Holtman: CHEP’98 paper)
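Event level (coarse-grained) parallelism means that each event is reconstructed independently of the others, so the work divides cleanly among workers. The tests on the slide used separate processes on the Exemplar; the sketch below substitutes a thread pool in a single JVM purely for illustration, and `reconstruct` is a hypothetical stand-in for the CPU-intensive track reconstruction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustration of event-level parallelism with a thread pool.
// ASSUMPTION: threads stand in for the separate processes used in the tests.
final class EventParallelSketch {
    /** Hypothetical stand-in for CPU-intensive reconstruction of one event. */
    static long reconstruct(int eventNumber) {
        long acc = eventNumber;
        for (int i = 0; i < 100_000; i++) {
            acc = (acc * 31 + i) % 1_000_003;
        }
        return acc;
    }

    /** Reconstructs events 0..nEvents-1 on nWorkers threads, results in event order. */
    static List<Long> reconstructAll(int nEvents, int nWorkers) {
        ExecutorService pool = Executors.newFixedThreadPool(nWorkers);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int ev = 0; ev < nEvents; ev++) {
                final int eventNumber = ev;
                Callable<Long> task = () -> reconstruct(eventNumber);
                futures.add(pool.submit(task));   // events share no state
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while reconstructing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Reconstruction task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Long> results = reconstructAll(64, 4);
        System.out.println("Reconstructed " + results.size() + " events");
    }
}
```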
GIOD - Database Tests (III)
- Evaluated usability and performance of Versant ODBMS, Objectivity’s main competitor.
- Converted our test application without problem, then measured performance in same way as we had for Objectivity
  - Conversion took ~1/2 a day: minor changes to schema and source code
  - Systems operation of Versant more cumbersome and time consuming (e.g. applying schema to each and every database)
  - API easier to use and offered some convenient built-in features (e.g. LinkVstr - Objectivity equivalent is user-constructed ooVArray).
  - “Predicate” queries seemed better implemented: certainly faster.
  - Versant Java binding worked well: at the time we tested, Objectivity did not offer Java.
- Conclusion: Versant a decent “fall-back” solution for us
- Following these tests we concentrated solely on Objectivity
GIOD - LHC Event Reconstruction and Analysis
- In 1998, CMS Physicists had produced several sub-detector orientated OO prototypes (e.g. Tracker, ECAL, HCAL …) (Release of ORCA now supersedes much of this work)
  - Written in C++, occasionally using some Fortran, and without persistent objects
- We took these codes and
  - Integrated them into an overall structure
  - Redesigned and restructured where necessary
  - Made the objects persistent, with data written to an Objectivity database
- Then we reviewed the code and its structure for speed, performance and effectiveness of the algorithms
- We added the global reconstruction aspects, such as track/ECAL cluster matching, Jet finding, event tagging
- Using the resulting “CMSOO” application, we are processing data in a large store of fully simulated di-jet events on the Exemplar
[Figure labels: ECAL Cluster; Individual ECAL crystal with energy; Reconstructed Track; Tracker geometry]
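The slides do not say how the track/ECAL cluster matching is done. One common approach, assumed here purely for illustration, is to pair each track with the nearest cluster in the pseudorapidity-azimuth plane, using the distance ΔR = sqrt(Δη² + Δφ²) and a maximum allowed ΔR. All names and the cut value are hypothetical, and this is not a description of the CMSOO code.

```java
// Hypothetical nearest-neighbour matching in (eta, phi); this is NOT taken
// from CMSOO. The delta-R metric and the cut value are assumptions.
final class MatchSketch {
    /** Difference in azimuth, wrapped into the range [-pi, pi]. */
    static double deltaPhi(double phi1, double phi2) {
        double d = phi1 - phi2;
        while (d > Math.PI) {
            d -= 2.0 * Math.PI;
        }
        while (d < -Math.PI) {
            d += 2.0 * Math.PI;
        }
        return d;
    }

    /** Distance in the eta-phi plane. */
    static double deltaR(double eta1, double phi1, double eta2, double phi2) {
        double dEta = eta1 - eta2;
        double dPhi = deltaPhi(phi1, phi2);
        return Math.sqrt(dEta * dEta + dPhi * dPhi);
    }

    /** Index of the nearest cluster within maxDeltaR of the track, or -1 if none. */
    static int nearestCluster(double trackEta, double trackPhi,
                              double[] clusterEta, double[] clusterPhi,
                              double maxDeltaR) {
        int best = -1;
        double bestDr = maxDeltaR;
        for (int i = 0; i < clusterEta.length; i++) {
            double dr = deltaR(trackEta, trackPhi, clusterEta[i], clusterPhi[i]);
            if (dr <= bestDr) {
                bestDr = dr;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] eta = {0.5, 1.5};
        double[] phi = {-3.1, 3.1};
        // The track sits just across the phi = +/-pi boundary from cluster 0.
        int match = nearestCluster(0.5, 3.1, eta, phi, 0.3);
        System.out.println("Matched cluster index: " + match);
    }
}
```

The wrap-around in `deltaPhi` matters: without it, a track at φ = 3.1 and a cluster at φ = -3.1 would look far apart although they are close on the detector.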
GIOD - Database of “real” LHC events
- Caltech/HEP submitted a proposal to NPACI to generate ~1,000,000 fully simulated multi-jet QCD events
  - Using CMSIM on the Exemplar
  - Directly study Higgs → γγ backgrounds for first time
  - Computing power of Exemplar makes this possible in < 1 year
- Accepted in March ’98.
- Event production on the Exemplar since May ’98.
  - ~1.0 TBytes of FZ files (~1,000,000 events) in HPSS
- Physics now being analysed at Caltech by Shevchenko/Wilkinson
- Events also used as copious source of “raw” LHC event data for GIOD
  - Files are read using the “ooZebra” utility developed in CMS
  - Raw data objects (Tracker, Muon hit maps, ECAL, HCAL energy maps) created for each event and stored in federated database
  - Tracks and energy clusters reconstructed: new objects stored in the database
  - Pattern matching creates “physics” objects like Jets, Photons, Electrons, Missing ET which are stored in the database as “analysis objects”
GIOD - WAN/LAN Tests
[Diagram labels: Temporary FDB; Parallel CMSOO Production Jobs; 10 GByte; 155 Mbits/s (two links); OC12/622 Mbits/s to San Diego SuperComputing Center (SDSC); DB files (ftp); Master FDB; Oofs traffic; 80 GByte; Clone FDB]
Java Analysis Studio
public void processEvent(final EventData d) {
    final CMSEventData data = (CMSEventData) d;
    final double ET_THRESHOLD = 15.0;
    Jet jets[] = new Jet[2];              // first two jets above threshold
    Iterator jetItr = (Iterator) data.getObject("Jet");
    if (jetItr == null) return;           // no jets stored for this event
    int nJets = 0;                        // counts at most two jets
    double sumET = 0.;
    FourVectorRecObj sum4v = new FourVectorRecObj(0., 0., 0., 0.);
    while (jetItr.hasMoreElements()) {
        Jet jet = (Jet) jetItr.nextElement();
        sum4v.add(jet);                   // vector sum of all jets
        double jetET = jet.ET();
        sumET += jetET;
        if (jetET > ET_THRESHOLD) {
            if (nJets <= 1) {
                jets[nJets] = jet;
                nJets++;
            }
        }
    }
    njetHist.fill(nJets);
    if (nJets >= 2) {
        // dijet event!
        // Fill the ET scatter plot first: dijet4v below is an alias of
        // jets[0], so add() changes jets[0] if it updates in place.
        et1vset2Hist.fill(jets[0].ET(), jets[1].ET());
        FourVectorRecObj dijet4v = jets[0];
        dijet4v.add(jets[1]);
        massHist.fill(dijet4v.get_mass());
        sumetHist.fill(sumET);
        missetHist.fill(sum4v.pt());
    }
}
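The listing above relies on a four-vector class whose definition is not shown. As a minimal sketch of what the dijet mass calculation involves, the class below adds two four-vectors and computes the invariant mass from m² = E² - px² - py² - pz² (natural units, c = 1). It is a hypothetical stand-in, not the `FourVectorRecObj` class from the talk, and its `plus` method returns a new object rather than modifying its receiver.

```java
// Hypothetical stand-in for a four-vector class; NOT FourVectorRecObj.
// Natural units (c = 1); energies and momenta in the same unit, e.g. GeV.
final class FourVectorSketch {
    final double px;
    final double py;
    final double pz;
    final double e;

    FourVectorSketch(double px, double py, double pz, double e) {
        this.px = px;
        this.py = py;
        this.pz = pz;
        this.e = e;
    }

    /** Component-wise sum, returned as a new object (the operands are unchanged). */
    FourVectorSketch plus(FourVectorSketch other) {
        return new FourVectorSketch(px + other.px, py + other.py, pz + other.pz, e + other.e);
    }

    /** Transverse momentum. */
    double pt() {
        return Math.sqrt(px * px + py * py);
    }

    /** Invariant mass; returns 0 if rounding makes m^2 slightly negative. */
    double mass() {
        double m2 = e * e - px * px - py * py - pz * pz;
        return m2 > 0.0 ? Math.sqrt(m2) : 0.0;
    }

    public static void main(String[] args) {
        // Two massless, back-to-back jets of 50 units each.
        FourVectorSketch jet1 = new FourVectorSketch(50.0, 0.0, 0.0, 50.0);
        FourVectorSketch jet2 = new FourVectorSketch(-50.0, 0.0, 0.0, 50.0);
        FourVectorSketch dijet = jet1.plus(jet2);
        System.out.println("Dijet mass = " + dijet.mass() + ", pt = " + dijet.pt());
    }
}
```

Returning a new object from `plus` is a deliberate design choice in this sketch: the individual jets keep their values after the sum is formed, so quantities such as each jet's ET can still be read afterwards.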
CMSOO - Java 3D Applet
- Attaches to any GIOD database and allows viewing/scanning of all events in the federation, at multiple detail levels
- Demonstrated at the Internet-2 meeting in San Francisco in Sep’98 and at SuperComputing’98 in Florida at the iGrid, NPACI and CACR stands
- Running on a 450 MHz HP “Kayak” PC with fx4 graphics card: excellent frame rates in free rotation of a complete event (~5 times performance of Riva TNT)
- Developments: “Drill down” into the database for picked objects, Refit tracks
GIOD - Database Status
- Over 200,000 fully simulated di-jet events in the database
- Population continuing using parallel jobs on the Exemplar (from a pool of over 1,000,000 events simulated)
- Created the TAG database
- Preparing for WAN test with SDSC
- Completed HPSS/AMS installation; starting tests
- For MONARC: Made a GIOD replica at Padua/INFN
GIOD - Upcoming Work
- Move towards a flexible, distributed data access and analysis system:
  - Expand the GIOD database tests to the WAN at >155 Mbits/s
  - Run reconstruction in “production mode” on the Exemplar
  - Test integration of the Caltech and SDSC HPSS systems with the GIOD database
  - Decide on detailed mechanisms and policies for replicating, mirroring and copying data
  - Obtain performance measurements of distributed reconstruction, analysis and event scanning using the GIOD database
  - Develop a few prototype physics analyses e.g. “Dijet analysis”
  - Develop tests with agent-based clients
Using the Objy Java binding with a C++ Objy database
- No multiple inheritance
  - C++/Java language issue
- No fixed-length arrays
  - C++/Java language issue
- No arrays of objects
  - Objy deficiency
- No zero length arrays
  - Objy deficiency
- No pointer data members
- No access to named Schema
- Tedious schema conversion
  - Use “Hudson” from Micram?
- Cannot use Visual J++
  - VM not supported by Objy
- Careful with that Schema, Eugene
  - Default open mode is update!
- Remember to “fetch”
  - Need to ensure latest version
- Remember to use ooReleaseReadLock() frequently
Objectivity Usage Heuristics
- If in doubt, keep re-OODDLXing
- Application crashes with weird, never-seen-before error deep in system kernel? Re-OODDLX.
- If in doubt, re-OOINSTALLFD
- Multiple writers in one database or container? Best of luck to you ...
- Locking problems in MROW mode? MROW doesn’t work, does it?
- Buffers full? Check OO_CACHE variables - should be suitably big
- Database replication? Use ftp … and ooinstallfd/ooattachdb
- Bad errors when closing Windows apps? Use ooExitCleanup()
- Avoid evolving your Schema
- Run regular checks on all objects in your database: if they’re OK today, they may not be tomorrow
- Keep locking rate below ~5 locks per second. Exceed this at your peril
- The Advanced Multithreaded Server isn’t … until 5.2
- (That’s enough Heuristics … Ed.)
GIOD - Publications and Press
- “Scalability to Hundreds of Clients in HEP Object Databases”, Koen Holtman, Julian Bunn, Proc. of CHEP ’98, Chicago, USA
- “Status Report from the Caltech/CERN/HP ‘GIOD’ Joint Project - Globally Interconnected Object Databases”, Julian Bunn, Harvey Newman, and Rick Wilkinson, Proc. of CHEP ’98, Chicago, USA
- “GIOD - Globally Interconnected Object Databases”, Julian Bunn and Harvey Newman, CACR Annual Report 1998
- “Global Initiatives Challenge Traditional Data Management Tools”, Internet-2 Press Release, September 1998
- “Caltech HP Exemplar Supports Test of Data Analysis from CERN Large Hadron Collider”, Julian Bunn and Tina Mihaly, NPACI “Online” Article, April 1998
- “Data Topics”, Electronic News article, December 1, 1997
- “CERN, Caltech and HP open scientific datacenter”, HPCN News Article, November 1997
- “Large-scale Scientific Data-analysis Center Will Address Future Computing and Data-handling Challenges”, Hewlett Packard Press Release, November 1997