Transcript Colt - Cern

The Colt Distribution
Open Source Libraries for High Performance Scientific and Technical Computing in Java
Wolfgang Hoschek
CERN IT/PDP
Atlas Graphics Group Meeting
Dec, 1999
[email protected]
Overview
  Technology Tracking
  Motivation & Goals
  Colt distribution
  Features
  Status & Future plans
  Conclusions
Technology Tracking
  Scientific and technical computing
    Demanding problem sizes
    Need for high performance at a reasonably small memory footprint
  Technology Tracking
    Don’t be religious about Java, C++ or whatever
    Gain enough experience to be able to make well-founded strategic decisions when the time comes…
  Increased adoption in the field
    Performance gap steadily closing
    Ease of use
    Cross-platform nature (no compiler/architecture/linker issues)
    Built-in support for multi-threading, network-friendly APIs, ...
  IBM Watson's Ninja project
    BLAS matrix computations up to 90% as fast as optimized Fortran
Motivation & Goals
  Users need libraries to get their job done
  Java lacks the foundation toolkits that are broadly available and conveniently accessible in C/C++ and Fortran
  Build an infrastructure for scalable scientific and technical computing in Java
    a la CLHEP
  Don’t reinvent the wheel - share resources in common efforts
  Open source
  User convenience
    Document, package and distribute a loosely coupled set of libraries under one single uniform umbrella
    Avoid compiler/linker/architecture headaches
    Set a single environment variable to point at the cross-platform shared library and run a program no matter where you are
Colt
  Efficient High Level Data structures & algorithms for
    On-line & Off-line Data Analysis
    Histogramming
    NTuple like manipulations, multi-dim. arrays, matrices
    Random Numbers, Monte Carlo Simulation
    Concurrent & Parallel Programming
  Approach
    Summon some of the best designs and implementations thought up over time by the community
    Port or improve them; Introduce new approaches where need arises
  Results so far
    In overlapping areas competitive or superior to toolkits such as STL, Root, HTL, CLHEP, TNT, GSL, C-RAND / WIN-RAND (all C/C++) as well as IBM Array, JDK 1.2 Collections framework, JGL (all Java), in terms of performance (!), functionality and (re)usability
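Histogramming is one of the areas listed on this slide. As a rough illustration of what such a component does, here is a minimal fixed-width 1D histogram sketch in plain Java. The class name `Histogram1DSketch` and all of its methods are hypothetical and are not taken from the Colt distribution; the only point is that bin contents live in a primitive array, so filling costs a few arithmetic operations per value.

```java
import java.util.Arrays;

/** Hypothetical fixed-width 1D histogram; illustrative only, not the Colt API. */
final class Histogram1DSketch {
    private final double min;   // inclusive lower edge
    private final double max;   // exclusive upper edge
    private final long[] bins;  // primitive counters, no per-entry objects
    private long underflow;
    private long overflow;

    Histogram1DSketch(int nBins, double min, double max) {
        if (nBins <= 0 || !(min < max)) {
            throw new IllegalArgumentException("need nBins > 0 and min < max");
        }
        this.min = min;
        this.max = max;
        this.bins = new long[nBins];
    }

    /** Adds one value; values outside [min, max) are counted as under/overflow. */
    void fill(double x) {
        if (Double.isNaN(x)) return;               // NaN has no bin; ignore it
        if (x < min) { underflow++; return; }
        if (x >= max) { overflow++; return; }
        int i = (int) ((x - min) / (max - min) * bins.length);
        if (i >= bins.length) i = bins.length - 1; // guard against rounding at the top edge
        bins[i]++;
    }

    long binCount(int i) { return bins[i]; }
    long underflow() { return underflow; }
    long overflow() { return overflow; }

    public static void main(String[] args) {
        Histogram1DSketch h = new Histogram1DSketch(4, 0.0, 1.0);
        double[] data = {0.1, 0.2, 0.3, 0.9, 1.5, -0.5};
        for (double x : data) h.fill(x);
        System.out.println("bins=" + Arrays.toString(h.bins)
                + " underflow=" + h.underflow() + " overflow=" + h.overflow());
    }
}
```

Variable-width bins, weights and multi-dimensional histograms are left out of this sketch.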
Features (1)
  Several free libraries
    For user convenience documented, packaged and bundled under one single uniform umbrella
  Colt library
    Fundamental general-purpose data structures optimized for numerical data, e.g.
    Dense and sparse matrices (multi-dimensional arrays), Linear Algebra, resizable arrays, associative containers, buffer management
  Jet library
    Mathematical and statistical tools for data analysis,
    Histogramming functionality,
    Random Number Generators and Distributions for simulations
    more
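To make the dense versus sparse distinction concrete, the following sketch puts two storage schemes behind one small interface. All names here (`MatrixSketch`, `Matrix2D`, `Dense`, `Sparse`) are hypothetical and chosen for illustration; they are not the Colt classes. The sparse variant uses a boxed `HashMap` for brevity, which is an assumption of this sketch: a library tuned for numerical data would more likely use a hash map over primitives to avoid boxing.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical dense and sparse 2D matrices behind one interface; not the Colt API. */
final class MatrixSketch {
    interface Matrix2D {
        int rows();
        int columns();
        double get(int row, int column);
        void set(int row, int column, double value);
    }

    private static void checkBounds(Matrix2D m, int row, int column) {
        if (row < 0 || row >= m.rows() || column < 0 || column >= m.columns()) {
            throw new IndexOutOfBoundsException("(" + row + "," + column + ")");
        }
    }

    /** Dense storage: one contiguous row-major array, memory is rows * columns. */
    static final class Dense implements Matrix2D {
        private final int rows, columns;
        private final double[] elements;
        Dense(int rows, int columns) {
            this.rows = rows;
            this.columns = columns;
            this.elements = new double[rows * columns];
        }
        public int rows() { return rows; }
        public int columns() { return columns; }
        public double get(int row, int column) {
            checkBounds(this, row, column);
            return elements[row * columns + column];
        }
        public void set(int row, int column, double value) {
            checkBounds(this, row, column);
            elements[row * columns + column] = value;
        }
    }

    /** Sparse storage: only non-zero cells are kept, keyed by linear index. */
    static final class Sparse implements Matrix2D {
        private final int rows, columns;
        private final Map<Long, Double> nonZeros = new HashMap<Long, Double>();
        Sparse(int rows, int columns) { this.rows = rows; this.columns = columns; }
        public int rows() { return rows; }
        public int columns() { return columns; }
        private long key(int row, int column) { return (long) row * columns + column; }
        public double get(int row, int column) {
            checkBounds(this, row, column);
            Double v = nonZeros.get(key(row, column));
            return v == null ? 0.0 : v;
        }
        public void set(int row, int column, double value) {
            checkBounds(this, row, column);
            if (value == 0.0) nonZeros.remove(key(row, column)); // zeros cost no memory
            else nonZeros.put(key(row, column), value);
        }
        int cardinality() { return nonZeros.size(); }
    }

    /** Works on either representation because it only uses the interface. */
    static double sum(Matrix2D m) {
        double s = 0.0;
        for (int r = 0; r < m.rows(); r++)
            for (int c = 0; c < m.columns(); c++) s += m.get(r, c);
        return s;
    }

    public static void main(String[] args) {
        Sparse s = new Sparse(1000, 1000); // a million cells, but only two stored
        s.set(3, 4, 2.5);
        s.set(999, 999, 1.5);
        System.out.println("stored cells: " + s.cardinality() + ", sum: " + sum(s));
    }
}
```

Code written against the interface, such as `sum`, does not need to know which representation it is given; that is the design idea the sketch is meant to show.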
Features (2)
  JAL library
    a partial port of the C++ Standard Template Library
    developed by Silicon Graphics
    contains a wide range of efficiently coded general-purpose algorithms on arrays
  Random library
    A complete port of CLHEP’s random number library
  Concurrent library
  VNI library
    special math functions, complex numbers
  Contributions from
    Sun, SGI, Visual Numerics, Univ. New York
  Your package or library ?
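Random number generators matter mainly as input to simulations. The sketch below shows the basic Monte Carlo pattern with a seeded generator, estimating pi from the fraction of random points that fall inside a quarter circle. It uses `java.util.Random` from the JDK as a stand-in; it does not use the Random library described on the slide, and the class name `MonteCarloPiSketch` is hypothetical.

```java
import java.util.Random;

/** Hypothetical Monte Carlo example using the JDK generator as a stand-in. */
final class MonteCarloPiSketch {

    /** Estimates pi by sampling points uniformly in the unit square. */
    static double estimatePi(long samples, long seed) {
        if (samples <= 0) throw new IllegalArgumentException("samples must be > 0");
        Random rng = new Random(seed); // fixed seed makes the run reproducible
        long inside = 0;
        for (long i = 0; i < samples; i++) {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            if (x * x + y * y <= 1.0) inside++; // point lies in the quarter circle
        }
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        System.out.println("pi is approximately " + estimatePi(1_000_000L, 42L));
    }
}
```

The statistical error shrinks only with the square root of the sample count, which is why generator throughput is worth benchmarking.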
Download Contents
  Documentation
    Executive summary, installation details, FAQs, news, feedback
  HTML API documentation
    Extensive doc for each package, class, and method. Examples, Tutorials
    Built by javadoc
      High quality, starting from single top entry point, easy navigation, browsing, exploration of features
  Source codes for all libraries,
    and everything else needed to build the entire distribution from scratch
  One single cross-platform shared lib
Benchmarks
  Matrix Computations
    2D Assignment: 320 MB/sec, Element-wise Mult: 10 Mflops/sec
    Linear Equation Solving: ~ 15 Mflops/sec
    2D matrix-matrix mult: 25+ Mflops/sec

    Mflops/sec, type=dense, SparcII@400 MHz, Solaris, SunJDK1.2.2, Classic VM

         | density
    size |   0.0010     0.01      0.1     0.99
    -----+------------------------------------
      30 |   23.432   23.579   23.318    23.17
      33 |   49.397   30.667   20.509   19.953
      66 |  110.442   63.632   24.201   25.161
     100 |  149.674   73.202   28.417   28.014
     300 |  415.985  153.482   29.826   30.901

  Random Numbers ~ 3*10^6 numbers/sec
  Histogram filling ~ 10^6 numbers/sec
  JDK1.2 on Solaris, Linux, NT, AIX, SGI, HP, …
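The slide does not say how the Mflops/sec figures were computed. A common convention, assumed here, counts 2*n^3 floating-point operations for an n x n matrix-matrix product (one multiply and one add per inner step) and divides by the elapsed time. Under that assumption, a rate of about 28 Mflops/sec at size 100 corresponds to 2*100^3 = 2*10^6 operations in roughly 0.07 seconds per product. The sketch below times a naive product in plain Java; the class `MflopsSketch` and its methods are hypothetical and are not the benchmark code behind the table.

```java
/** Hypothetical timing sketch; assumes the 2*n^3 flop-counting convention. */
final class MflopsSketch {

    /** Naive product C = A * B for n x n matrices stored row-major in flat arrays. */
    static double[] multiply(double[] a, double[] b, int n) {
        double[] c = new double[n * n];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < n; k++) {
                double aik = a[i * n + k];
                for (int j = 0; j < n; j++) {
                    c[i * n + j] += aik * b[k * n + j];
                }
            }
        }
        return c;
    }

    /** Converts a measured time into Mflops/sec, counting 2*n^3 operations. */
    static double mflops(int n, long elapsedNanos) {
        double flops = 2.0 * n * n * n;
        double seconds = elapsedNanos / 1e9;
        return flops / seconds / 1e6;
    }

    public static void main(String[] args) {
        int n = 200;
        double[] a = new double[n * n];
        double[] b = new double[n * n];
        for (int i = 0; i < n * n; i++) { a[i] = i % 7; b[i] = i % 5; }

        multiply(a, b, n); // warm-up run so the JIT compiler can kick in
        long start = System.nanoTime();
        double[] c = multiply(a, b, n);
        long elapsed = Math.max(1L, System.nanoTime() - start);

        // Print one result element so the computation cannot be optimized away.
        System.out.println("c[0]=" + c[0] + ", Mflops/sec=" + mflops(n, elapsed));
    }
}
```

A single timed run like this is noisy; a serious benchmark would repeat the measurement and report the spread.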
Related Work

JAS (www-sldnt.slac.stanford.edu/jas)


Java Grande Forum (math.nist.gov/javanumerics)



Random Number
TNT (math.nist.gov/tnt)


Similar design as Colt matrix classes
CLHEP (wwwinfo.cern.ch/asd/lhc++/clhep)


Working group on numerical computing in Java
Jama Linear Algebra package + many more
IBM Array@IBM Watson (www.research.ibm.com/ninja)


Histogram package
Linear Algebra
Colt (nicewww.cern.ch/~hoschek/colt/index.htm)

Beta 1.3 under ASIS, Beta 1.4 under ASIS starting next week
Status & Future Plans
  Currently V1.0 Beta 4
  Open Source
  V1.0 Final mid Feb. 2000
  CVS access ?
  Under construction
    Histogram package
    Transparent Parallel matrix computations for SMPs
  Contributions welcome
Conclusions
  Technology Tracking
    On the LHC time scale, change is inevitable
    Java may soon be a major player in performance-sensitive scientific and technical computing
    Ease of use, Portability, Productivity, Fun
  Colt distribution
    Users need libraries to get their job done
    Java lacks the foundation toolkits that are broadly available and conveniently accessible in C/C++ and Fortran
    Build an infrastructure for scalable scientific and technical computing in Java
    Don’t reinvent the wheel - share resources in Open Source efforts
    Document, package and distribute a loosely coupled set of libraries under one single uniform umbrella
    Performance is good and improving - only a question of time until Java is faster than C++