Copyright Gordon Bell Clusters & Grids What can be learned from


The CC – GRID? Era
CCGSC 2002
Gordon Bell
([email protected])
Bay Area Research Center
Microsoft Corporation
Copyright Gordon Bell
Clusters & Grids
Observations from a mostly Grid workshop
• Clusters. Let’s finish the job!
• Grids generally.
• Grids as arbitrary cluster platforms…why?
• Examples of Grid-types, especially web services
• Summary…

Blades aka a “cluster in a cabinet”

• 366 servers per 44U cabinet
  – Single processor
  – 2 - 30 GB/computer (24 TBytes)
  – 2 - 100 Mbps Ethernets
• ~10x perf*, power, disk, I/O per cabinet
• ~3x price/perf
• Network services… Linux based
*42, 2 processors, 84 Ethernet, 3 TBytes
Clusters aren’t as bad as programs make them out to be, but we need to make them work better and be more transparent.
• Everything is becoming a cluster. Certainly all of 500!
• 64 bit addressing will cause more change!
• Future nodes should bet on CLMP smP’s (p = 4-32). Utilize existing and emerging smP’s nodes versus assuming lcd PM-pairs & MPI.
• Massive gains from compiler and runtime.
• ES has set a new standard of efficiency and system transparency for “clusters”.
• Expand the MPI programming model:
  – Full transparency of MPI needs to be the goal
  – Objectify for greater flexibility and greater insulation from latency
Grids: If they are the solution what’s the problem?
• Economics… thief, scavenger, power, efficiency or resource sharing?
• Research funding… that’s where the money is
• Are they where the problems lie?
• Does massive collaboration that the Grids enable, create massive overhead and generally less output? Unless the output is for a community!
• Is funding and middleware a good investment?
Same observations as 2000

• GRID was/is an exciting concept …
  – They can/must work within a community, organization, or project. Apps need to drive.
  – “Necessity is the mother of invention.”
• Taxonomy… interesting vs necessity
  Web SVCs
  – Cycle scavenging and object evaluation (e.g. seti@home, QCD)
  – File distribution/sharing for IP theft e.g. Napster
  – Databases &/or programs for a community (astronomy, bioinformatics, CERN, NCAR)
  – Workbenches: web workflow chem, bio…
  – Exchanges… many sites operating together
  – Single, large objectified pipeline… e.g. NASA.
  – Grid as a cluster platform! Transparent & arbitrary access including load balancing
Grid nj. An arbitrary distributed, cluster platform
• A geographical and multi-organizational collection of diverse computers dynamically configured as cluster platforms responding to arbitrary, ill-defined jobs “thrown” at it.
• Costs are not necessarily favorable e.g. disks are less expensive than cost to transfer data.
• Latency and bandwidth are non-deterministic, thereby changing cluster characteristics
• Once a large body of data exists for a job, it is inherently bound to (set into) fixed resources.
• Large datasets & I/O bound programs need to be with their data or be database accesses…
• But are there resources there to share?
• Bound to cost more?
Bright spots… near term, user focus, a lesson for Grid suppliers
• Tony Hey apps-based funding. Web services based Grid & data orientation.
• David Abramson - Nimrod.
  – Parameter scans… other low hanging fruit
  – Encapsulate apps! “Excel”-- language/control mgmt.
  – “Legacy apps are programs that users just want, and there’s no time or resources to modify code …independent of age, author, or language e.g. Java.”
• Andrew Grimshaw - Avaki
  – Making Legion vision real. A reality check.
• Lip 4 pairs of “web services” based apps
• Gray et al Skyservice and Terraservice
• Goal: providing a web service must be as easy as publishing a web page…and will occur!!!
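The "parameter scans" idea noted under Nimrod can be sketched in a few lines: treat the application as a black box and run it once for every combination of parameter values. This is only an illustrative sketch, not Nimrod's interface; `parameter_sweep`, `legacy_app`, and the parameter names are hypothetical, and each run happens locally rather than on remote machines.

```python
from itertools import product


def parameter_sweep(app, grid):
    """Run `app` once per combination of the values listed in `grid`.

    `grid` maps a parameter name to the list of values to try.
    Returns a list of (params, result) pairs in sweep order.
    """
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        # The application is called unmodified: it only sees its own arguments.
        results.append((params, app(**params)))
    return results


def legacy_app(angle, speed):
    """Hypothetical stand-in for an encapsulated legacy application."""
    return angle * speed


runs = parameter_sweep(legacy_app, {"angle": [0, 10], "speed": [1, 2, 3]})
for params, result in runs:
    print(params, result)
```

Because every run is independent, the same loop could hand each parameter set to a different machine, which is what makes this kind of job low-hanging fruit.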
SkyServer: delivering a web service to the
astronomy community.
Prototype for other sciences?
Gray, Szalay, et al
First paper on the SkyServer
http://research.microsoft.com/~gray/Papers/MSR_TR_2001_77_Virtual_Observatory.pdf
http://research.microsoft.com/~gray/Papers/MSR_TR_2001_77_Virtual_Observatory.doc
Later, more detailed paper for database community
http://research.microsoft.com/~gray/Papers/MSR_TR_01_104_SkyServer_V1.pdf
http://research.microsoft.com/~gray/Papers/MSR_TR_01_104_SkyServer_V1.doc
What can be learned from Sky Server?
• It’s about data, not about harvesting flops
• 1-2 hr. query programs versus 1 wk programs based on grep
• 10 minute runs versus 3 day compute & searches
• Database viewpoint. 100x speed-ups
  – Avoid costly re-computation and searches
  – Use indices and PARALLEL I/O. Read / Write >>1.
  – Parallelism is automatic, transparent, and just depends on the number of computers/disks.
• Limited experience and talent to use dbases.
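The point that parallelism is transparent and "just depends on the number of computers/disks" can be sketched as a query that is written once and run unchanged over however many partitions exist. This is a toy illustration under stated assumptions: the partitions stand in for disks, the names `partition`, `scan_partition`, and `parallel_query` are hypothetical, and Python threads show only the structure of the idea, not a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_disks):
    """Spread records round-robin over n_disks partitions (stand-ins for disks)."""
    return [records[i::n_disks] for i in range(n_disks)]


def scan_partition(part, predicate):
    """Scan one partition and keep the records that satisfy the predicate."""
    return [r for r in part if predicate(r)]


def parallel_query(partitions, predicate):
    """Run the same scan on every partition at once and merge the answers.

    The caller never mentions the degree of parallelism: it follows
    from the number of partitions.
    """
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        chunks = list(pool.map(lambda p: scan_partition(p, predicate), partitions))
    return [r for chunk in chunks for r in chunk]


data = list(range(100))
hits = parallel_query(partition(data, 4), lambda x: x % 10 == 0)
print(sorted(hits))
```

Changing `4` to `8` changes how the work is split but not the query or its result.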
Heuristics for building communities that need to share data & programs
• Always go from working to working
• Do it by induction in time and space (Why version 3 is pretty good.)
• Put ONE database in place that’s useful by itself in terms of UI, content, & queries
• Invent and demo 10-20 instances of use
• Get two working in a single location
• Extend to include a second community, with an appropriate superset capability
Some science is hitting a wall
FTP and GREP are not adequate (Jim Gray)

• You can FTP 1 MB in 1 sec.
• You can FTP 1 GB / min.
• … 2 days and 1K$
• … 3 years and 1M$

• You can GREP 1 GB in a minute
• You can GREP 1 TB in 2 days
• You can GREP 1 PB in 3 years.

1PB ~10,000 >> 1,000 disks

At some point you need
– indices to limit search
– parallel data search and analysis
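Why an index limits search can be shown with a small sketch: a grep-style scan touches every record, while a sorted key list lets a binary search jump straight to the matching range. The record layout (key, payload) and the function names are assumptions made for illustration; a real database index is far more elaborate.

```python
import bisect


def grep_scan(records, key):
    """Linear scan: examines every record, like grep over a flat file."""
    return [r for r in records if r[0] == key]


def build_index(records):
    """Sort the records by key once; the sorted key list acts as a simple index."""
    ordered = sorted(records, key=lambda r: r[0])
    keys = [r[0] for r in ordered]
    return keys, ordered


def index_lookup(index, key):
    """Binary-search the key list: O(log n) steps instead of n comparisons."""
    keys, ordered = index
    lo = bisect.bisect_left(keys, key)
    hi = bisect.bisect_right(keys, key)
    return ordered[lo:hi]


records = [(i % 1000, f"obj{i}") for i in range(10000)]
index = build_index(records)
print(index_lookup(index, 42) == grep_scan(records, 42))
```

The index costs one sort up front, which pays off as soon as the same data is searched more than a few times.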

Goal using dbases. Make it easy to
– Publish: Record structured data
– Find data anywhere in the network
– Get the subset you need!
– Explore datasets interactively
Database becomes the file system!!!
Network concerns
• Very high cost
  – $(1 + 1) / GByte to send on the net; Fedex and 160 GByte shipments are cheaper
  – DSL at home is $0.15 - $0.30
• Disks cost less than $2/GByte to purchase
• Low availability of fast links (last mile problem)
  – Labs & universities have DS3 links at most, and they are very expensive
  – Traffic: Instant messaging, music stealing
• Performance at desktop is poor
  – 1- 10 Mbps; very poor communication links
• Manage: trade-in fast links for cheap links!!
Gray’s $2.4 K, 1 TByte Sneakernet aka Disk Brick
Cost to move a Terabyte
Cost, time, and speed to move a Terabyte
Cost of a “Sneaker-Net” TB
• We now ship NTFS/SQL disks.
• Not good format for Linux.
• Ship NFS/CIFS/ODBC servers (not disks).
• Plug “disk” into LAN.
• DHCP then file or DB serve…
• Web Service in long term
Courtesy of Jim Gray, Microsoft Bay Area Research
Cost to move a Terabyte

Context      Speed    Rent        Raw      Raw $/TB   Time/TB
             Mbps     $/month     $/Mbps   sent
home phone   0.04     40          1,000    3,086      6 years
home DSL     0.6      70          117      360        5 months
T1           1.5      1,200       800      2,469      2 months
T3           43       28,000      651      2,010      2 days
OC3          155      49,000      316      976        14 hours
100 Mbps     100                                      1 day
Gbps         1000                                     2.2 hours
OC192        9600     1,920,000   200      617        14 minutes
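The "$/TB sent" and "Time/TB" columns follow from the link speed and the monthly rent. The sketch below reproduces them under two assumptions that are not stated on the slide but match its figures: 1 TB is taken as 10^12 bytes and a month as 30 days. The function and constant names are illustrative.

```python
BITS_PER_TB = 8 * 10**12            # assumption: 1 TB = 10^12 bytes
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day billing month


def seconds_per_tb(mbps):
    """Time to push one terabyte through a link running flat out."""
    return BITS_PER_TB / (mbps * 1e6)


def dollars_per_tb(mbps, rent_per_month):
    """Rent paid while the link is busy sending that one terabyte."""
    return rent_per_month * seconds_per_tb(mbps) / SECONDS_PER_MONTH


# (context, speed in Mbps, rent in $/month), taken from the table above
LINKS = [
    ("home phone", 0.04, 40),
    ("home DSL", 0.6, 70),
    ("T1", 1.5, 1200),
    ("T3", 43, 28000),
    ("OC3", 155, 49000),
    ("OC192", 9600, 1920000),
]

for name, mbps, rent in LINKS:
    days = seconds_per_tb(mbps) / 86400
    print(f"{name:<11} {days:10.3f} days  ${dollars_per_tb(mbps, rent):,.0f}/TB")
```

Even on the fastest link listed, the rent consumed per terabyte is several hundred dollars, which is the basis for the comparison with shipping disks.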
Cost, time of Sneaker-net vs Alts

Media              Robot$   Media$   TB read +    ship     TotalTime/TB   Mbps   Cost      $/TB
                                     write time   time                           (10 TB)   shipped
CD          1500   2x800    240      60 hrs       24 hrs   6 days         28     $2 K      $208
DVD         200    2x8K     400      60 hrs       24 hrs   6 days         28     $20 K     $2,000
Tape        25     2x15K    1000     92 hrs       24 hrs   5 days         18     $31 K     $3,100
DiskBrick   7      1K       1,400    19 hrs       24 hrs   2 days         52     $2.6 K    $260

Courtesy of Jim Gray, Microsoft Bay Area Research
Grids: Real and “personal”
Two carrots, one downside. A bet.
• Bell will match any Gordon Bell Prize (parallelism, performance, or performance/cost) winner’s prize that is based on “Grid Platform Technology”.
• I will bet any individual or set of individuals of the Grid Research community up to $5,000 that a Grid application will not win the above by SC2005.
The End
How can GRIDs become a real, useful, computer structure?
Get a life.
Adopt an application community!
Success if CCGSC2004 is the last …by making Grids ubiquitous.