Scientific Computing at SLAC

Richard P. Mount
Director, Scientific Computing and Computing Services
Stanford Linear Accelerator Center
HEPiX
October 11, 2005
SLAC Scientific Computing: A Balancing Act
• Aligned with the evolving science mission of SLAC,
  but neither
• subservient to the science mission,
  nor
• unresponsive to SLAC mission needs.
SLAC Scientific Computing Drivers
• BaBar (data-taking ends December 2008)
  – The world's most data-driven experiment
  – Data-analysis challenges until the end of the decade
• KIPAC
  – From cosmological modeling to petabyte data analysis
• Photon Science at SSRL and LCLS
  – Ultrafast science, modeling and data analysis
• Accelerator Science
  – Modeling electromagnetic structures (PDE solvers in a demanding application)
• The Broader US HEP Program (aka LHC)
  – Contributes to the orientation of SLAC Scientific Computing R&D
DOE Scientific Computing Funding at SLAC
• Particle and Particle Astrophysics
  – $14M SCCS
  – $5M other
• Photon Science
  – $0M SCCS
  – $1M SSRL?
• Computer Science
  – $1.5M
Scientific Computing:
The relationship between Science and the components of Scientific Computing (layers, with approximate SCCS FTE)
• Application Sciences: High-energy and Particle-Astro Physics, Accelerator Science, Photon Science …
• Issues addressable with "computing": Particle interactions with matter, Electromagnetic structures, Huge volumes of data, Image processing …
• Computing techniques (~20 SCCS FTE): PDE Solving, Algorithmic geometry, Visualization, Meshes, Object databases, Scalable file systems …
• Computing architectures: Single system image, Low-latency clusters, Throughput-oriented clusters, Scalable storage …
• Computing hardware (~26 SCCS FTE): Processors, I/O devices, Mass-storage hardware, Random-access hardware, Networks and Interconnects …
Scientific Computing:
SLAC's goals for Scientific Computing
[Diagram: the same layered stack, annotated with goals]
• Application Sciences: SLAC Science and Stanford Science (SLAC + Stanford)
• The Science of Scientific Computing: spanning the issues addressable with "computing" and the computing techniques
• Computing for Data-Intensive Science: spanning the computing architectures and computing hardware
• Collaboration with Stanford and Industry across the stack
Scientific Computing:
Current SLAC leadership and recent
achievements in Scientific Computing
What does SCCS run (1)?
• Data analysis "farms" (also good for HEP simulation)
  – ~4000 processors, including 256 dual-Opteron Sun V20zs
  – Linux and Solaris
• Shared-memory multiprocessor
  – SGI 3700
  – 128 processors
  – Linux
• Myrinet cluster
  – 72 processors
  – Linux
What does SCCS run (2)?
• Application-specific clusters
  – each 32 to 128 processors
  – Linux
• PetaCache prototype
  – 64 nodes
  – 16 GB memory per node
  – Linux/Solaris
What does SCCS run (3)?
• Disk servers
  – About 120 Sun/Solaris servers
  – About 500 TB of network-attached Sun FibreChannel disk arrays
  – Mainly xrootd
  – Some NFS
  – Some AFS
• Tape storage
  – 6 STK Powderhorn silos
  – Up to 6 petabytes capacity
  – Currently store 2 petabytes
  – HPSS
What does SCCS run (4)?
• Networks
  – 10 Gigabits/s to ESnet
  – 10 Gigabits/s for R&D
  – 96 fibers to Stanford
  – 10 Gigabits/s core in the computer center (as soon as we unpack the boxes)
SLAC Computing - Principles and Practice
(Simplify and Standardize)
• Lights-out operation – no operators for the last 10 years
  – Run 24x7 with 8x5 (in theory) staff
  – (When there is a cooling failure on a Sunday morning, 10–15 SCCS staff are on site by the time I turn up)
• Science (and business-unix) raised-floor computing
  – Adequate reliability essential
  – Solaris and Linux
  – Scalable "cookie-cutter" approach
  – Only one type of farm CPU bought each year
  – Only one type of file-server + disk bought each year
  – Highly automated OS installations and maintenance
    • e.g. see the talk on how SLAC does clusters by Alf Wachsmann:
      http://www.slac.stanford.edu/~alfw/talks/RCcluster.pdf
SLAC-BaBar Computing Fabric
[Diagram: client nodes, disk servers and tape servers interconnected by Cisco IP network switches]
• Clients: 1700 dual-CPU Linux nodes and 800 single-CPU Sun/Solaris nodes
• Data access: HEP-specific ROOT software (Xrootd) + Objectivity/DB object database; some NFS
• Disk servers: 120 dual/quad-CPU Sun/Solaris servers; ~400 TB of Sun FibreChannel RAID arrays (+ some SATA)
• Server software: HPSS + SLAC enhancements to ROOT and Objectivity server code
• Tape servers: 25 dual-CPU Sun/Solaris servers; 40 STK 9940B and 6 STK 9840A drives; 6 STK Powderhorn silos; over 1 PB of data
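To make the tiered data flow in this fabric concrete, here is a minimal sketch assuming a simple disk cache in front of the tape tier; the class names (TapeTier, DiskServer) and staging logic are illustrative, not SLAC production code.

```python
# Minimal sketch of the tiered BaBar data flow (hypothetical names, not SLAC code):
# clients ask disk servers for byte ranges; a disk server stages a file from the
# tape tier (HPSS in the real system) the first time it is requested.

class TapeTier:
    """Stand-in for the tape/HPSS tier: returns whole files by name."""
    def __init__(self, archive):
        self.archive = archive          # {filename: bytes}

    def recall(self, filename):
        return self.archive[filename]   # in reality: mount tape, copy to disk

class DiskServer:
    """Stand-in for a disk server with a disk cache in front of tape."""
    def __init__(self, tape):
        self.tape = tape
        self.disk_cache = {}            # {filename: bytes} resident on disk

    def read(self, filename, offset, nbytes):
        if filename not in self.disk_cache:          # cache miss: stage from tape
            self.disk_cache[filename] = self.tape.recall(filename)
        return self.disk_cache[filename][offset:offset + nbytes]

# A client sees only (filename, offset, bytecount) requests over the IP network.
tape = TapeTier({"/store/babar/run42.root": bytes(range(256)) * 4})
server = DiskServer(tape)
print(server.read("/store/babar/run42.root", 100, 16))
```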
Scientific Computing Research Areas (1)
(Funded by DOE-HEP, DOE SciDAC and DOE-MICS)
• Huge-memory systems for data analysis
  (SCCS Systems group and BaBar)
  – Expected major growth area (more later)
• Scalable data-intensive systems
  (SCCS Systems and Physics Experiment Support groups)
  – "The world's largest database" (OK, not really a database any more)
  – How to maintain performance with data volumes growing like "Moore's Law"?
  – How to improve performance by factors of 10, 100, 1000, …? (intelligence plus brute force)
  – Robustness, load balancing, troubleshootability in 1000–10000-box systems
  – Astronomical data analysis on a petabyte scale (in collaboration with KIPAC)
Scientific Computing Research Areas (2)
(Funded by DOE-HEP, DOE SciDAC and DOE-MICS)
• Grids and security
  (SCCS Physics Experiment Support, Systems and Security groups)
  – PPDG: building the US HEP Grid – Open Science Grid
  – Security in an open scientific environment
  – Accounting, monitoring, troubleshooting and robustness
• Network research and stunts
  (SCCS Network group – Les Cottrell et al.)
  – Land-speed record and other trophies
• Internet monitoring and prediction
  (SCCS Network group)
  – IEPM: Internet End-to-End Performance Monitoring (~5 years)
  – INCITE: Edge-based Traffic Processing and Service Inference for High-Performance Networks
Scientific Computing Research Areas (3)
(Funded by DOE-HEP, DOE SciDAC and DOE-MICS)
• GEANT4: simulation of particle interactions in million- to billion-element geometries
  (SCCS Physics Experiment Support group – M. Asai, D. Wright, T. Koi, J. Perl …)
  – BaBar, GLAST, LCD …
  – LHC program
  – Space
  – Medical
• PDE solving for complex electromagnetic structures
  (Kwok Ko's Advanced Computing Department + SCCS clusters)
Growing Competences
• Parallel Computing (MPI …)
  – Driven by KIPAC (Tom Abel) and ACD (Kwok Ko)
  – SCCS competence in parallel computing (= Alf Wachsmann currently)
  – MPI clusters and the SGI SSI system
• Visualization
  – Driven by KIPAC and ACD
  – SCCS competence is currently experimental-HEP focused (WIRED, HepRep …)
  – (A polite way of saying that growth is needed)
A Leadership-Class Facility for
Data-Intensive Science
Richard P. Mount
Director, SLAC Computing Services
Assistant Director, SLAC Research Division
Washington DC, April 13, 2004
PetaCache Goals
• The PetaCache architecture aims at revolutionizing the query and analysis of scientific databases with complex structure.
  – Generally this applies to feature databases (terabytes–petabytes) rather than bulk data (petabytes–exabytes)
• The original motivation comes from HEP
  – Sparse (~random) access to tens of terabytes today, petabytes tomorrow
  – Access by thousands of processors today, tens of thousands tomorrow
PetaCache: The Team
• David Leith, Richard Mount, PIs
• Randy Melen, Project Leader
• Chuck Boeheim (Systems group leader)
• Bill Weeks, performance testing
• Andy Hanushevsky, xrootd
• Systems group members
• Network group members
• BaBar (Stephen Gowdy)
Latency and Speed – Random Access
[Chart: random-access storage performance – retrieval rate (MB/s, log scale from 10^-9 to 10^3) versus log10(object size in bytes) for PC2100 memory, a WD 200 GB disk, and an STK 9940B tape drive]
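The shape of those curves follows from a simple model: a random access costs a fixed latency plus transfer time, so the effective retrieval rate is size / (latency + size/bandwidth). The sketch below evaluates that model with rough, assumed ballpark parameters for 2005-era devices; these are illustrative values, not the measured numbers behind the chart.

```python
# Simple model of random-access retrieval rate: rate = size / (latency + size / bandwidth).
# The latency and bandwidth figures below are assumed ballpark values, used only to
# illustrate why small objects favour memory so dramatically.

DEVICES = {
    # name: (access latency in seconds, streaming bandwidth in bytes/s)
    "PC2100 memory": (100e-9, 2.1e9),     # ~100 ns, ~2.1 GB/s
    "WD 200 GB disk": (10e-3, 50e6),      # ~10 ms seek, ~50 MB/s
    "STK 9940B tape": (60.0, 30e6),       # ~tens of seconds to mount/seek, ~30 MB/s
}

def retrieval_rate(size_bytes, latency_s, bandwidth_bps):
    """Effective MB/s when fetching one object of the given size."""
    seconds = latency_s + size_bytes / bandwidth_bps
    return size_bytes / seconds / 1e6

for exponent in (2, 4, 6, 8, 10):        # 100 B ... 10 GB objects
    size = 10 ** exponent
    rates = ", ".join(f"{name}: {retrieval_rate(size, lat, bw):.3g} MB/s"
                      for name, (lat, bw) in DEVICES.items())
    print(f"10^{exponent} bytes -> {rates}")
```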
The PetaCache Strategy
• Sitting back and waiting for technology is a BAD idea
• Scalable petabyte memory-based data servers require much more than just cheap chips. Now is the time to develop:
  – Data-server architecture(s) delivering latency and throughput cost-optimized for the science
  – Scalable data-server software supporting a range of data-serving paradigms (file-access, direct addressing, … – see the sketch after this list)
  – "Liberation" of entrenched legacy approaches to scientific data analysis that are founded on the "knowledge" that accessing small data objects is crazily inefficient
• Applications will take time to adapt not just codes, but their whole approach to computing, to exploit the new architecture
• Hence: three phases
  1. Prototype machine (in operation)
     • Commodity hardware
     • Existing "scalable data server software" (as developed for disk-based systems)
     • HEP-BaBar as co-funder and principal user
     • Tests of other applications (GLAST, LSST …)
     • Tantalizing "toy" applications only (too little memory for flagship analysis applications)
     • Industry participation
  2. Development machine (next proposal)
     • Low-risk (purely commodity hardware) and higher-risk (flash-memory system requiring some hardware development) components
     • Data-server software – improvements to performance and scalability, investigation of other paradigms
     • HEP-BaBar as co-funder and principal user
     • Work to "liberate" BaBar analysis applications
     • Tests of other applications (GLAST, LSST …)
     • Major impact on a flagship analysis application
     • Industry partnerships, DOE Lab partnerships
  3. Production machine(s)
     • Storage-class memory with a range of interconnect options matched to the latency/throughput needs of differing applications
     • Scalable data-server software offering several data-access paradigms to applications
     • Proliferation – machines deployed at several labs
     • Economic viability – cost-effective for programs needing dedicated machines
     • Industry partnerships transitioning to commercialization
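To illustrate the two data-serving paradigms named in the list above (file access and direct addressing), here is a minimal sketch of the interfaces such a memory-based data server might expose; the class and method names are hypothetical, not actual PetaCache software.

```python
# Hypothetical sketch of two data-serving paradigms a memory-resident data server
# could offer (illustrative interfaces only, not PetaCache APIs).

class MemoryDataServer:
    """Illustrative memory-resident data server exposing two access paradigms."""

    def __init__(self):
        self._files = {}    # filename -> bytes (for the file-access paradigm)
        self._objects = {}  # object id -> bytes (for the direct-addressing paradigm)

    # --- Paradigm 1: file access (filename, offset, bytecount) ---
    def put_file(self, filename, payload):
        self._files[filename] = payload

    def read(self, filename, offset, nbytes):
        return self._files[filename][offset:offset + nbytes]

    # --- Paradigm 2: direct addressing of small objects by identifier ---
    def put_object(self, object_id, payload):
        self._objects[object_id] = payload

    def fetch(self, object_id):
        return self._objects[object_id]

# Example: the same event data served either way.
server = MemoryDataServer()
server.put_file("/babar/events.dat", b"\x00" * 1024)
header = server.read("/babar/events.dat", 0, 64)          # file-access paradigm
server.put_object(("run42", "event7"), b"compressed event record")
event = server.fetch(("run42", "event7"))                  # direct addressing
```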
Prototype Machine (Operational)
[Diagram: PetaCache data servers (MICS + HEP-BaBar funding) attached via Cisco switches to the existing HEP-funded BaBar client systems]
• Clients: up to 2000 nodes, each with 2 CPUs and 2 GB memory; Linux
• Data servers: 64–128 nodes, each a Sun V20z with 2 Opteron CPUs and 16 GB memory; up to 2 TB total memory; Solaris or Linux (mix and match)
Object-Serving Software
• Xrootd/olbd (Andy Hanushevsky/SLAC) – see the sketch below this list
  – Optimized for read-only access
  – File-access paradigm (filename, offset, bytecount)
  – Make 1000s of servers transparent to user code
  – Load balancing
  – Self-organizing
  – Automatic staging from tape
  – Failure recovery
• Allows BaBar to start getting benefit from a new data-access architecture within months, without changes to user code
• The application can ignore the hundreds of separate address spaces in the data-cache memory
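A minimal sketch of the idea, under assumptions of my own (the redirector logic and names below are illustrative and do not reproduce the real xrootd/olbd protocol): an olbd-style redirector chooses one of many data servers for each file, and the client then issues (filename, offset, bytecount) reads against whichever server it was sent to, so the many separate server address spaces stay invisible to user code.

```python
# Illustrative sketch of redirector-based, file-access-paradigm serving.
# The classes and selection policy are assumptions for illustration; the real
# xrootd/olbd protocol and APIs are not reproduced here.
import random

class DataServer:
    def __init__(self, name, files):
        self.name = name
        self.files = files                      # filename -> bytes held by this server

    def read(self, filename, offset, nbytes):   # file-access paradigm
        return self.files[filename][offset:offset + nbytes]

class Redirector:
    """olbd-style load balancer: points each open() at one server holding the file."""
    def __init__(self, servers):
        self.servers = servers

    def locate(self, filename):
        candidates = [s for s in self.servers if filename in s.files]
        return random.choice(candidates)        # trivial stand-in for real load balancing

# The client sees one logical namespace; the separate server address spaces are hidden.
servers = [DataServer(f"server{i:03d}", {"/babar/run42.root": bytes(1_000_000)})
           for i in range(3)]
redirector = Redirector(servers)
chosen = redirector.locate("/babar/run42.root")
chunk = chosen.read("/babar/run42.root", offset=123_456, nbytes=4096)
print(f"read {len(chunk)} bytes from {chosen.name}")
```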
Prototype Machine: Performance Measurements
• Latency
• Throughput (transaction rate)
• (Aspects of) Scalability
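For context, per-request latency and transaction rate of this kind are what a client-side timing loop over many small synchronous requests produces. Below is a hedged, self-contained sketch of such a measurement using a loopback TCP echo server; it is not SLAC's actual test harness, and loopback latencies are far lower than those of a real client/server network path.

```python
# Minimal sketch of how per-request latency and transaction rate can be measured:
# time many small request/response round trips over a TCP connection and average.
import socket, threading, time

HOST, NREQ, REQSIZE = "127.0.0.1", 10_000, 100     # 100-byte "objects"

def echo_server(listener):
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            conn.sendall(data)                      # echo the request back

listener = socket.socket()
listener.bind((HOST, 0))                            # let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=echo_server, args=(listener,), daemon=True).start()

client = socket.create_connection((HOST, port))
payload = b"x" * REQSIZE
start = time.perf_counter()
for _ in range(NREQ):                               # one small synchronous request each
    client.sendall(payload)
    client.recv(65536)
elapsed = time.perf_counter() - start
client.close()

print(f"mean latency: {elapsed / NREQ * 1e6:.1f} microseconds per request")
print(f"throughput:   {NREQ / elapsed:,.0f} transactions per second")
```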
Latency (microseconds) versus data retrieved (bytes)
[Stacked chart: latency contributions from server xrootd overhead, server xrootd CPU, client xroot overhead, client xroot CPU, TCP stack/NIC/switching, and minimum transmission time, for requests of 100 to 8100 bytes; vertical scale 0–250 microseconds]
Throughput Measurements
[Chart: transactions per second (0–100,000) versus number of clients per server (1–50), for Linux client – Solaris server, Linux client – Linux server, and Linux client – Solaris server (bge); annotation: 22 processor microseconds per transaction]
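The quoted 22 processor-microseconds per transaction implies, if server CPU is the bottleneck, a ceiling of roughly 45,000 transactions per second per CPU, or about 90,000 per second for a dual-CPU data server. The dual-CPU figure is an inference from the Sun V20z prototype servers described earlier, not a number read off the chart; a quick check:

```python
# Back-of-envelope: transaction rate allowed by 22 processor microseconds per
# transaction, assuming server CPU is the bottleneck. The dual-CPU assumption
# matches the dual-Opteron V20z data servers described earlier (an inference).

cpu_us_per_transaction = 22           # processor microseconds per transaction
cpus_per_server = 2                   # dual-Opteron Sun V20z data server (assumed)

per_cpu_rate = 1_000_000 / cpu_us_per_transaction
per_server_rate = per_cpu_rate * cpus_per_server

print(f"per CPU:    {per_cpu_rate:,.0f} transactions/s")    # ~45,000
print(f"per server: {per_server_rate:,.0f} transactions/s") # ~91,000
```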
Storage-Class Memory
• New technologies coming to market in the next 3–10 years (Jai Menon, IBM)
• Current not-quite-crazy example is flash memory
Development Machine Plans
[Diagram: the PetaCache development machine attached to the SLAC-BaBar system through a Cisco switch fabric with 10-Gigabit ports]
• Clients: up to 2000 nodes, each with 2 CPUs and 2 GB memory; Linux
• Data servers: 80 nodes, each with 8 Opteron CPUs and 128 GB memory; up to 10 TB total memory; Solaris/Linux
• Data servers (flash): 30 nodes, each with 2 Opteron CPUs and 1 TB flash memory; ~30 TB total; Solaris/Linux
Minor Details?
• 1970s
  – SLAC Computing Center designed for ~35 Watts/square foot
  – 0.56 MWatts maximum (see the check after this list)
• 2005
  – Over 1 MWatt by the end of the year
  – Locally high densities (16 kW racks)
• 2010
  – Over 2 MWatts likely needed
• Onwards
  – Uncertain, but increasing power/cooling need is likely
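A quick consistency check on the 1970s figures, assuming (for illustration only) that the 0.56 MW ceiling is simply the design power density times the raised-floor area, and taking ~20 square feet as a rough assumed footprint for one rack:

```python
# Back-of-envelope check relating the 1970s design figures quoted above.
# Assumption (illustration only): 0.56 MW maximum = design density x floor area.

design_density_w_per_sqft = 35        # Watts per square foot (1970s design)
max_power_w = 0.56e6                  # 0.56 MW maximum

implied_floor_area_sqft = max_power_w / design_density_w_per_sqft
print(f"implied floor area: {implied_floor_area_sqft:,.0f} square feet")   # ~16,000 sq ft

# A single 16 kW rack over an assumed ~20 sq ft is roughly 800 W/sq ft,
# i.e. more than 20 times the original design density.
rack_density = 16_000 / 20
print(f"16 kW rack over ~20 sq ft: {rack_density:,.0f} W/sq ft")
```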
Crystal Ball (1)
• The pessimist's vision:
  – PPA computing winds down to about 20% of its current level as BaBar analysis ends in 2012 (GLAST is negligible, KIPAC is small)
  – Photon Science is dominated by non-SLAC/non-Stanford scientists who do everything at their home institutions
  – The weak drivers from the science base make SLAC unattractive for DOE computer science funding
Crystal Ball (2)
• The optimist's vision:
  – PPA computing in 2012 includes:
    • Vigorous/leadership involvement in LHC physics analysis using innovative computing facilities
    • Massive computational cosmology/astrophysics
    • A major role in LSST data analysis (petabytes of data)
    • Accelerator simulation for the ILC
  – Photon Science computing includes:
    • A strong SLAC/Stanford faculty, leading much of LCLS science, fully exploiting SLAC's strengths in simulation, data analysis and visualization
    • A major BES accelerator research initiative
  – Computer Science includes:
    • National/international leadership in computing for data-intensive science (supported at $25M to $50M per year)
  – SLAC and Stanford:
    • University-wide support for establishing leadership in the science of scientific computing
    • New SLAC/Stanford scientific computing institute