PPT - GPU : a Global Processing Unit


GPU, a Gnutella Processing Unit
an extensible P2P framework for
distributed computing
(joint work involving source code of at least 25 people)
GPU logo created by Mark Grady and David A. Lucas
How many of you did not shut down
your computer and are now here in
this room?
• Assume we are 15 people running a screensaver without performing
real work. The talk lasts one hour.
• Opportunity loss for one hour:
Speed: 15 * 0.8 GFlops = 12 GFlops
Computation: 12 GFlops * 1 h = 43'200 billion floating-point operations
• Costs for one hour:
Power consumption: 15 * 300 W =
4500 W for one hour = 4.5 kWh
Money: 4.5 kWh at 0.20 CHF/kWh = 0.90 CHF
Gasoline needed: 0.36 liter
(gasoline: 12.3 kWh/kg)
CO2 emissions: 0.81 kg CO2
(gasoline: 2.27 kg CO2/liter)
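The back-of-the-envelope figures above can be reproduced with a short script. This is just a sketch of the slide's own arithmetic; the machine count, per-machine speed, wattage and electricity price are all the slide's assumptions:

```python
# Sketch: reproduce the slide's back-of-the-envelope numbers.
# Assumptions come straight from the slide: 15 machines, 0.8 GFlops
# and 300 W each, electricity at 0.20 CHF/kWh.

PEOPLE = 15
GFLOPS_PER_MACHINE = 0.8
WATTS_PER_MACHINE = 300
CHF_PER_KWH = 0.20

def idle_cost(hours):
    """Return (aggregate GFlops, billions of FP ops, kWh, CHF) for `hours`."""
    gflops = PEOPLE * GFLOPS_PER_MACHINE             # aggregate speed
    ops_billions = gflops * hours * 3600             # 1 GFlops = 1e9 ops/s
    kwh = PEOPLE * WATTS_PER_MACHINE / 1000 * hours  # energy drawn
    chf = kwh * CHF_PER_KWH                          # electricity bill
    return gflops, ops_billions, kwh, chf

print(idle_cost(1))      # one talk hour
print(idle_cost(8760))   # one year
```

Running `idle_cost(8760)` reproduces the one-year numbers as well.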
During one year (15 people)…
• Opportunity loss for one year:
Speed: 15 * 0.8 GFlops = 12 GFlops
Computation: 12 GFlops * 1 y = 378'432'000 billion floating-point operations
• Costs for one year:
Power consumption: 15 * 300 W =
4500 W for one year = 39.42 MWh
Money: 39.42 MWh at 0.20 CHF/kWh = 7'884 CHF (525.60 CHF per head)
Gasoline needed: 3'153.6 liters
(gasoline: 12.3 kWh/kg)
CO2 emissions: 7 t CO2
(gasoline: 2.27 kg CO2/liter)
We signed the Kyoto Protocol…
Let's jump into the matter
This video was computed in a distributed fashion by 10 computers
running for about one day. The GPU program distributed jobs,
Terragen computed the frames. Hacker Red created terrain, sky
and camera path.
Terragen artists: Nico, Red, paulatreides, nikoala, nanobit
Distributed Computing
• Distributed computing is a science that solves a large problem by
giving small parts of the problem to many computers to solve, and
then combining the partial solutions into a solution for the whole
problem.
• Recent distributed computing projects have been designed to use
the computers of hundreds of thousands of volunteers all over the
world, via the Internet: to look for extraterrestrial radio signals, to
look for prime numbers so large that they have more than ten million
digits, and to find more effective drugs to fight cancer and the AIDS
virus.
• These projects run while the computer is idle.
• These projects are so large, and require so much computing power,
that they would be impossible for any one computer or person to
solve in a reasonable amount of time. (from distributedcomputing.info)
Distributed Computing vs.
Supercomputers
A lot of computational power is available through distributed computing.
However, supercomputers support tightly coupled inter-processor
algorithms, while distributed computing projects keep running the
same simple, independent task on all computers.
Timeline
[Timeline diagram, 1998–2004, contrasting the two topologies]
• Centralized (client/server) approach: Seti@Home (Anderson et al.).
Many other projects follow: Folding@home, Chessbrain.net…
The BOINC project provides a framework to unify several projects.
• Peer-to-peer approach: file-sharing systems such as Gnutella
(J. Frankel, T. Pepper), Kazaa, eMule…
P2P research frameworks: Sun JXTA, Triana, GPU.
Grid computing: Globus Toolkit.
Principle of Gnutella
• Flooding: a node forwards every incoming packet on all of its
outgoing connections.
[Diagram: packets arriving at node A are flooded to all outgoing edges]
Several ideas limit the geometric growth of packets:
a list of already-seen packets, a time-to-live (TTL) stamp, QueryHit routing,
and an ultrapeer system that keeps the network more tree-like.
Gnutella is implemented in GPU thanks to Kamil Pogorzelski.
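The two packet-limiting ideas, the seen-packet list and the TTL stamp, can be sketched in a toy in-memory simulation. This is illustrative Python, not GPU's actual Delphi networking code:

```python
# Toy simulation of Gnutella-style flooding with the two basic
# growth-limiting ideas from the slide: a TTL stamp and a list of
# already-seen packet IDs. Not GPU's real networking code.
import itertools

class Node:
    _ids = itertools.count()

    def __init__(self):
        self.id = next(Node._ids)
        self.neighbors = []
        self.seen = set()          # packet IDs already forwarded

    def receive(self, packet_id, ttl, delivered):
        if packet_id in self.seen or ttl <= 0:
            return                 # drop: duplicate or expired
        self.seen.add(packet_id)
        delivered.add(self.id)
        for n in self.neighbors:   # flood on all outgoing edges
            n.receive(packet_id, ttl - 1, delivered)

def link(a, b):
    a.neighbors.append(b)
    b.neighbors.append(a)

# Small topology: a ring of 6 nodes plus one chord.
nodes = [Node() for _ in range(6)]
for i in range(6):
    link(nodes[i], nodes[(i + 1) % 6])
link(nodes[0], nodes[3])

reached = set()
nodes[0].receive(packet_id=42, ttl=3, delivered=reached)
print(sorted(reached))
```

Without the seen-set, packets would circulate around the ring forever; without the TTL, every packet would still visit the entire network.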
P2P networks are random
graphs…
Each node is a computer; the length of an edge is the distance in
milliseconds between two nodes.
…random graphs with given edge
lengths do not necessarily fit into
the plane…
…random graphs are fractal in
nature.
• Fractal dimension of the Gnutella network:
D = 7.79 (difficult to imagine).
• Interesting property of fractals: patterns repeat and are
similar at different length scales.
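A network dimension like this is typically estimated from how the number of nodes within r hops grows with r, i.e. N(r) ~ r^D. A sketch of that idea, validated on a 2-D grid graph where the slope should come out near 2 (the value D = 7.79 for Gnutella is the authors' own measurement, not reproduced here):

```python
# Sketch: estimate a network's "dimension" from ball growth,
# N(r) ~ r^D, via BFS. Demonstrated on a 2-D grid, where D ≈ 2.
import math
from collections import deque

def neighbors(x, y, size):
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < size and 0 <= ny < size:
            yield nx, ny

def ball_sizes(start, size, rmax):
    """N(r): number of nodes within r hops of `start`, for r = 1..rmax."""
    dist = {start: 0}
    q = deque([start])
    while q:
        node = q.popleft()
        if dist[node] == rmax:
            continue
        for n in neighbors(*node, size):
            if n not in dist:
                dist[n] = dist[node] + 1
                q.append(n)
    return [sum(1 for d in dist.values() if d <= r) for r in range(1, rmax + 1)]

def fit_slope(xs, ys):
    """Least-squares slope of ys versus xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

size, rmax = 81, 20
ns = ball_sizes((size // 2, size // 2), size, rmax)
rs = list(range(1, rmax + 1))
# Fit on r = 5..20 to reduce small-radius effects.
D = fit_slope([math.log(r) for r in rs[4:]], [math.log(n) for n in ns[4:]])
print(f"estimated dimension: {D:.2f}")
```

On a grid, N(r) = 2r² + 2r + 1, so the fitted slope approaches 2; on the Gnutella graph the same procedure yields a much larger exponent.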
Framework Architecture
Network Architecture
Suitable Tasks for the framework
Extensions
Distributed Search Engine
Terragen Landscape Generator
Long term goals
GPU architecture on a single
computer
Example
[Diagram: the main GPU application loads plugins and frontends]
• The main GPU application coordinates plugins and frontends.
• Plugins perform the computation.
• Frontends visualize the results of computations and monitor
network performance.
Delphi GL port and effects by Tom Nuydens and Chris Rorden.
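The plugin/frontend split can be sketched as a simple registry. This is a hypothetical Python analogue; the real GPU plugins are native libraries loaded by the Delphi core, and all names below are illustrative, not GPU's actual API:

```python
# Hypothetical sketch of the plugin/frontend split on one machine.
# Real GPU plugins are native libraries loaded by the Delphi core;
# the names below are illustrative, not GPU's actual API.

class GPUCore:
    def __init__(self):
        self.plugins = {}       # name -> compute function
        self.frontends = []     # callbacks that visualize results

    def register_plugin(self, name, fn):
        self.plugins[name] = fn

    def register_frontend(self, callback):
        self.frontends.append(callback)

    def run_job(self, plugin_name, payload):
        result = self.plugins[plugin_name](payload)   # plugin computes
        for fe in self.frontends:                     # frontends display
            fe(plugin_name, result)
        return result

core = GPUCore()
core.register_plugin("square", lambda x: x * x)
core.register_frontend(lambda name, res: print(f"{name} -> {res}"))
core.run_job("square", 7)
```

The point of the split is that computation (plugins) and presentation (frontends) never talk to each other directly, only through the core.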
Network Architecture
GPUs advertise their IP addresses on a public list; GPUs
know each other through autopings and autopongs; GPUs
know the IP addresses of entry gates.
GPU Network in practice
(December 2004)
• Around 10 computers available at any time
of the day, on average
• 3 FTP servers: one for collecting
generated images, one to distribute
updated binaries and one to distribute
videos
• Web on sourceforge.net
• CVS on sourceforge.net
• Documentation on sourceforge.net
Special features: Chat System
• Allows developers and users to meet, to exchange ideas
and bug reports
• Debugging on the fly: if the network is running correctly,
you should definitely not see the same sentence
repeated five times (as has happened before).
Suitable Tasks for the framework
The star topology is a subset of the
random graph, so any centralized approach
can run, with some overhead, on the P2P
network (Rene Tegel, applaunch.dll).
No overhead for:
• Monte Carlo methods
• Evolutionary algorithms
• Randomized algorithms
• Distributed databases (to some extent)
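Monte Carlo methods incur no overhead because every node draws independent samples and the partial results are simply averaged. A minimal sketch, estimating π, with the participating nodes simulated as loop iterations:

```python
# Sketch: why Monte Carlo runs with no overhead on a P2P network.
# Each "node" samples independently; combining is a plain average.
import random

def node_estimate_pi(samples, seed):
    """One node's independent estimate of pi by dart-throwing."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * hits / samples

# Ten independent "nodes", results merged by simple averaging.
estimates = [node_estimate_pi(100_000, seed) for seed in range(10)]
pi_hat = sum(estimates) / len(estimates)
print(pi_hat)
```

No node ever needs to communicate with another during the computation, which is exactly the property that makes the task fit a loosely coupled P2P network.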
GPU Extension I
• Distributed Search Engine by Rene Tegel
Each GPU can run crawlers on websites. Links are visited randomly.
Visited pages are indexed in a local database. Each GPU can query the
databases of other GPUs. Status for this extension: experimental.
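The random link-visiting can be sketched on a toy in-memory link graph. No real HTTP is involved; the page graph and the index dict are stand-ins for the crawler's websites and local database:

```python
# Toy sketch of the random crawler: pages are nodes in an in-memory
# link graph, the "local database" is a dict. No real HTTP involved.
import random

LINKS = {  # hypothetical mini-web
    "a.html": ["b.html", "c.html"],
    "b.html": ["a.html", "d.html"],
    "c.html": ["d.html"],
    "d.html": ["a.html"],
}

def crawl(start, steps, seed=0):
    rng = random.Random(seed)
    index = {}                         # url -> times visited ("indexed")
    page = start
    for _ in range(steps):
        index[page] = index.get(page, 0) + 1
        page = rng.choice(LINKS[page])  # follow a random outgoing link
    return index

index = crawl("a.html", 50)
print(index)
```

In the real extension, each GPU would answer queries against its own index, so a search fans out across the databases of many peers.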
GPU Extension II
• Terragen Landscape Generator
(PlanetSide Software and Rene Tegel)
Terragen™
• Terragen is software written by PlanetSide, a UK
company. It is not open source, but it is free for personal use.
• GPUs download terrain description and camera path
from the FTP server, and decide to render a particular
frame randomly. The computed frame is uploaded back
to the FTP server.
• By merging frames together with a codec, we are able to
generate videos (merging on one computer only)
• Status for this extension: production
• Typical centralized extension
• Download already produced videos here.
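The random frame-picking scheme is simple but can duplicate work. A toy simulation (the FTP server modeled as a shared set, frame rendering elided) shows that every frame still gets rendered eventually:

```python
# Toy model of the Terragen extension's scheduling: each node
# repeatedly picks a random frame and uploads it; the FTP server is
# modeled as a shared set. Duplicated renders are wasted but harmless.
import random

def render_video(num_frames, num_nodes, seed=0):
    rng = random.Random(seed)
    rendered = set()                 # frames already on the FTP server
    wasted = 0
    while len(rendered) < num_frames:
        for _ in range(num_nodes):   # one round: every node picks a frame
            frame = rng.randrange(num_frames)
            if frame in rendered:
                wasted += 1          # duplicate work
            rendered.add(frame)
    return rendered, wasted

frames, wasted = render_video(num_frames=20, num_nodes=10)
print(len(frames), wasted)
```

In practice a node can first list the frames already on the server and only pick among the missing ones, trading an extra directory listing for less duplicated work.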
Special features: Autoupdate
system
Releasing through Sourceforge takes about three quarters of an hour.
Quick way to deliver fixes and to keep the cluster updated:
Download new files from FTP server (tea.ch)
Long term goals
• At present, the system scales up to
40-60 computers. Change this to
scale up to 500'000 computers, as any good P2P
network does.
• Try to extend the framework so that it
supports agents (agent-based model)
• Try to implement an example of an evolutionary
algorithm (e.g. Core Wars)
• Try to implement a project with public appeal,
like Near Earth Objects Hazard Monitoring
(ORSA)
Long term goals II
• GPU Core
– Keep it under the GPL license.
– Rewrite it to be less ugly and more object-oriented.
– Abandon Gnutella in favor of a connection layer based on
Distributed Hash Tables.
– Generalize the connection layer to support
any sort of communication (chat, computation, file sharing)
– Native Linux implementation (not only through Wine,
which is already stable and fast)
– Support architectures other than x86
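The Distributed Hash Table idea mentioned above can be sketched with consistent hashing. This is a generic Chord-like toy, not a design GPU actually adopted:

```python
# Sketch of the Distributed Hash Table idea: keys and node IDs share
# one hash ring; a key is stored on the first node clockwise from its
# position. Generic consistent hashing, not an actual GPU design.
import bisect
import hashlib

def ring_pos(name):
    """Map a string to a position on a 2^32-slot hash ring."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % 2**32

class DHT:
    def __init__(self, nodes):
        self.ring = sorted((ring_pos(n), n) for n in nodes)

    def lookup(self, key):
        """Node responsible for `key`: first node at or after its position."""
        i = bisect.bisect_left(self.ring, (ring_pos(key), ""))
        return self.ring[i % len(self.ring)][1]

dht = DHT(["node-a", "node-b", "node-c", "node-d"])
print(dht.lookup("some-file.dat"))
```

Unlike Gnutella's flooding, a DHT routes each lookup toward the one responsible node, which is what makes such networks scale to hundreds of thousands of peers.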
Special features: CVS support
• Goal: keep the source code of the developers synchronized.
• Done through the CVS repository on Sourceforge, using the
bash Unix shell, Cygwin or TortoiseCVS.
• Red files are not in sync with the repository.
GPU Cluster Pictures
Thank you for your attention!
Home of the project is
http://gpu.sourceforge.net
More videos here…
And thanks to the GPU Team and