RAW tests - hep-proj-database Site
Download
Report
Transcript RAW tests - hep-proj-database Site
CERN/IT/DB
An evaluation of Data
Storage and Analysis With
Oracle DBMS
Eric Grancher, CERN IT/DB
Database Workshop, CERN
July 11th-13th 2001
CERN/IT/DB
Outline
Initial study, using Oracle 8i
RAW data tests
Various methods
Bandwidth
Overhead
TAG data analysis
Type of queries
Bitmap indexes
Binning and function based indexes
CERN/IT/DB
RAW Data Tests
Purpose
Verify on a reduced scale that oracle is not
too consuming when stressed to acquire a
large data volume
First design options
Ideas about the overhead
CERN/IT/DB
Model
Typical model (NA 45) and random generator
thanks to Dirk Düllmann
Run, burst, event and label types
Many ways to map it in an ORDBMS
*
RUN
1
*
BURST
1
1
*
*
EVENT
1
1
*
1
*
LABEL
1
*
Vector<long>
CERN/IT/DB
Storage
We implemented:
Run, Burst and Event: relational tables
Label: one Binary Large Object in order to
store the doubles
There is more than one way to do it !
CERN/IT/DB
Tests
We used a workgroup server, sun E420R with
a D1000 (ultra SCSI) disk subsystem
Veritas volume manager (manageability and
tuning)
RAW devices tests show that disk subsystem
saturates with 4 disks stripe (38MBytes/s
compared to the 40MBytes/s specification)
CERN/IT/DB
Host Language
There are many ways to connect to
Oracle DBMS, supported ones:
C/C++ and Oracle8i: OCI, Pro*C and
ODBC
Pro*C: pre-compiler (embedded SQL)
ODBC: generic interface
CERN/IT/DB
Choice: OCI
The low level C native interface, all
facilities
Encapsulated into C++
Typically quite complex (callback and
C++) and “verbose”
CERN/IT/DB
Setup
Linux (lxplus)
Sun E420R
+ D1000
Oracle
100Mbit/s
Random
CERN/IT/DB
Optimisation
We use “NO CACHE” tables,
“NOLOGGING” hint
Using VM, we could see that control
files were the hot spot -> we moved
them to free disks, factor of 5 gain in
speed
Networking subsystem is the bottleneck
(100 Mbit/s), several Linux clients
CERN/IT/DB
Data files and redo log files
Log history
Current log sequence
Checkpoint information
Control Files
Instance
SGA
Shared pool
Server
process
LGWR CKPT
DBWR
User
process
Parameter
file
Modified slide from
Bjørn Engsig / Oracle
Redo log
files
Data files
Database
Control
files
CERN/IT/DB
Overhead
Overhead: 4.1% over the C++ data
NUMBER type in Oracle
LOB are multiple of the block size
(8kB), size of the Label was often
around 20k long
(ie 80kB on Linux/i386+)
CERN/IT/DB
Results and perspectives
Sustained 10.88 MB/s during 2000s,
8 Linux client machines
The server networking sub-system is the
bottleneck
Study during this summer with Gbit/s
Ethernet and large Linux PC
Larger machines later (how does “Oracle Real
Application Cluster” scales for such an application?)
CERN/IT/DB
Conclusion
At this scale, it works
LOB -> byte ordering dependency
Other tests with non-LOB types
more overhead
better facilities to analyse the data
Overhead has to be further studied
Other bindings (OCCI and refs)