Grid_Experiences_at_ATO.ppsx


Experiences with a SAS Grid
Ray Lindsay ATO
ACT SAS Users Group
21 May 2015
1
Background
• ATO has had a SAS licence for its Analytics
capability going back to early 2000s
• Initially a pair of Windows servers running 9.1.3
– Included Enterprise Miner
• Switched to Linux in about 2008, initially Debian, now Ubuntu
– A Red Hat install failed in 2008
– Debian and Ubuntu are not officially supported, but everything worked
2
Linux environment
• Set of initially 5 Linux servers running 9.1, then 9.2, 9.3 and 9.4
• But also running many other Analytics capabilities
– R, Python etc.
• Plusses and minuses
– Contention for resources
– Camp ‘A’ vs camp ‘B’
3
Advantages of staying within Linux
• Although there are some subtle differences, most Linux/Unix commands are the same on Red Hat as on Ubuntu
• But a much reduced set of commands is available on Red Hat
4
SAS Grid
• Settled on a 42-core grid, with 2 metadata servers and 3 compute servers
• 24 compute cores (3 x 8)
• Delivered Dec 2014
• All users migrated
5
Overall architecture
6
Compute nodes
7
Logical separation
• Discovery and Production
– But same physical hardware
• 75% and 25% of the disk space respectively
• Different queues with different priorities
• Most of our work is within the ‘Discovery’
environment – only 2 queues
8
Speed of nodes
• 3.3 GHz versus 2.4 GHz on older hardware
• Compute nodes have 128 GB and metadata servers 32 GB of memory
– This is notably less memory than is needed/specified for our R machines, typically 750 GB
9
Notes on Moore’s law
• In conventional computers, the number of transistors has been doubling every two years
– Valid since 1965
– http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4567410 argues this will reach its limit in 2036
– Others think sooner
• But also note that chip speeds have not increased
as rapidly in recent years.
• Hence the need for explicit and implicit parallelisation
10
Grid Workload Distribution
• http://support.sas.com/rnd/scalability/grid/SGF09_SASHA.pdf
11
Notes on a supercomputer
• Beacon Project,
https://www.nics.tennessee.edu/beacon
• 768 conventional cores, 11520 accelerator
cores
• 210 TeraFLOPS, 2.499 GigaFLOPS/Watt
• 12 TB of system memory
• 1.5 TB of coprocessor memory
• 73 TB of SSD storage
12
Our Grid – run time in seconds
Task                 Anet    SAS Grid
HPForest             161     20
HPForest             149     16
GradientBoost        240     141
GradientBoost        149     84
DecisionTree         32      18
DecisionTree         24      14
Variable Selection   13      6.4
Variable Selection   30      8.9
Data processing      up to 15 times faster
13
Shared data
• Shared file system /sasdata accessible to all
• Each machine has its own copy of SAS installed
– Only compute nodes have SAS/Access for Teradata installed
• Also a shared work area (see the sketch below)
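A minimal sketch of how the shared and database libraries are assigned; the project path and the Teradata connection details are hypothetical placeholders, not the ATO's actual configuration.

/* Libraries on the shared file system are visible from every node,    */
/* so the same code runs wherever the grid schedules it.               */
/* The path is hypothetical.                                           */
libname project '/sasdata/project';

/* A Teradata library can only be assigned on the compute nodes, where */
/* SAS/Access for Teradata is installed. Server, user, password and    */
/* database are placeholders.                                          */
libname tdwh teradata server='tdprod' user=tduser password="XXXXXXXX"
        database=analytics;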
14
SAS software
• As before
– Base
– Stat
– ETS
– Graph
– Access for Teradata and ODBC
– Enterprise Miner, High Performance Suite
• Plus: IML, Add-in for MS Office
15
Clients
• Enterprise Guide 6.1
• SAS Studio
• IML Studio
• Personal Login Manager
• Management Console
16
Parallelisation of code
• Not for the faint-hearted – at this stage
• Only really useful for regular long-running tasks
• Task A and Task B do not depend on each other
• Task C depends on both A and B (see the sketch below)
– Two-stage models
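The task pattern above can be expressed with SAS/CONNECT grid sessions. The following is a minimal sketch only, assuming a grid-enabled SAS/CONNECT installation with a grid resource named SASApp; the library path and dataset names are hypothetical, not the code actually used at the ATO.

/* Sketch: run Task A and Task B in parallel on the grid, then Task C. */
/* "SASApp" is an assumed grid resource name; datasets are placeholders. */
%let rc = %sysfunc(grdsvc_enable(_all_, resource=SASApp));

signon taskA;
rsubmit taskA wait=no;
   libname shared '/sasdata/project';        /* shared file system on every node */
   /* Task A: first-stage summary (placeholder step) */
   proc means data=shared.inputs noprint;
      output out=shared.taskA_out;
   run;
endrsubmit;

signon taskB;
rsubmit taskB wait=no;
   libname shared '/sasdata/project';
   /* Task B: independent of Task A (placeholder step) */
   proc sort data=shared.other out=shared.taskB_out;
      by id;
   run;
endrsubmit;

/* Task C depends on both A and B, so wait for them to finish */
waitfor _all_ taskA taskB;
signoff _all_;

libname shared '/sasdata/project';
data shared.taskC_out;                       /* Task C combines both results */
   set shared.taskA_out shared.taskB_out;
run;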
17
Proc SCAPROC
• Can be used to identify task dependencies within a large program
• Requires the program to be run in order to output a new, parallelised (grid-enabled) program (see the sketch below)
• The same functionality exists within Enterprise Guide
– Analyse program for Grid Computing
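A minimal sketch of the recording step; the output path and the included program are hypothetical.

/* Wrap an existing program with PROC SCAPROC to record its step        */
/* dependencies; the GRID option asks SCAPROC to emit a grid-enabled    */
/* version of the program.                                              */
proc scaproc;
   record '/sasdata/scaproc/analysis_grid.sas' grid expandmacros;
run;

/* ... the original long-running program must actually be run here ... */
%include '/sasdata/programs/analysis.sas';

proc scaproc;
   write;           /* flush the recorded analysis to the output file */
run;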
18
SCAPROC continued
• Nearly all the code is marked ‘no changes below here’, so the original code must be kept in case changes are needed
• Our efforts on actual problems have so far not been successful
• We have had to sanitise code and logs and send them to Tech Support
19
What is still missing
• An X Windows environment, with tools such as a file browser, version control software, etc.
• Interface to R (see the sketch below)
– Enterprise Miner and IML both have interfaces to open-source R
– (EM in fact uses IML for this)
– Also IML/Studio
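For reference, the IML interface to R works roughly as in the sketch below, assuming R is installed on the SAS server and SAS is started with the RLANG system option; the dataset and model are illustrations only.

/* Sketch of the PROC IML interface to open-source R. */
proc iml;
   /* send a SAS dataset to R as a data frame */
   call ExportDataSetToR("sashelp.class", "class");

   /* run R statements in-line */
   submit / R;
      model <- lm(Weight ~ Height, data = class)
      coefs <- coef(summary(model))
   endsubmit;

   /* bring an R object back into an IML matrix */
   call ImportMatrixFromR(coefs, "coefs");
   print coefs;
quit;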
20
Disclaimer
• While based on my experiences using SAS at
the ATO all opinions expressed are my own
• Ray Lindsay, [email protected]
21
Questions
22