Distributed Systems
CSE 490h – Introduction to
Distributed Computing, Spring
2007
Except as otherwise noted, the content of this presentation is
licensed under the Creative Commons Attribution 2.5 License.
Outline
DNS
BOINC
PlanetLab
OLPC & Ad-hoc Mesh Networks
Lecture content wrap-up
DNS: The Distributed
System in the
Distributed System
Domain Name System
Mnemonic identifiers work considerably better
for humans than IP addresses
“www.google.com? Surely you mean
66.102.7.99!”
Who maintains the mappings from name → IP?
A Manageable Problem
Photo: © 2006 Computer History Museum (www.computerhistory.org)
In the beginning…
Every machine had a file named hosts.txt
Each line contained a name/IP mapping
New hosts files were updated and
distributed via email
… This clearly wasn’t going to scale
DNS Implementations
Modern DNS system first proposed in
1983
First implementation in 1984 (Paul
Mockapetris)
BIND (Berkeley Internet Name Domain)
written by four Berkeley students in 1985.
Many other implementations today
Hierarchical Naming
DNS names are arranged in a hierarchy:
www.cs.washington.edu
Entries are either subdomains or
hostnames
subdomains contain more subdomains, or
hosts (up to 127 levels deep!)
Hosts have individual IP addresses
Mechanics: Theory
DNS Recurser (client) parses address
from right to left
Asks root server (with known, static IP
address) for name of first subdomain DNS
server
Contacts successive DNS servers until it
finds the host
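To make that walk concrete, here is a minimal sketch in Python. The ask(server_ip, label) helper is hypothetical: assume it queries the given server and returns either the IP of the next DNS server down the hierarchy or, at the last step, the host's own address.

ROOT_SERVER_IP = "198.41.0.4"   # root servers have well-known, static addresses

def resolve(hostname, ask):
    # Parse right to left: "www.cs.washington.edu" -> edu, washington, cs, www
    labels = list(reversed(hostname.split(".")))
    server_ip = ROOT_SERVER_IP
    for label in labels:
        # Each server either refers us to the server responsible for the
        # next subdomain or, at the final label, returns the host's IP.
        server_ip = ask(server_ip, label)
    return server_ip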
Mechanics: In Practice
ISPs provide a DNS recurser for clients
DNS recursers cache lookups for a period of
time after a request
Greatly speeds up retrieval of entries and
reduces system load
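As an illustration (not any particular resolver's code), a caching recurser might wrap the full lookup with a time-to-live check along these lines:

import time

class CachingRecurser:
    def __init__(self, do_full_lookup, ttl_seconds=300):
        self._lookup = do_full_lookup    # e.g., the resolve() walk sketched above
        self._ttl = ttl_seconds
        self._cache = {}                 # hostname -> (ip, expiry time)

    def lookup(self, hostname):
        hit = self._cache.get(hostname)
        if hit and hit[1] > time.time():
            return hit[0]                # cache hit: no DNS servers contacted
        ip = self._lookup(hostname)      # miss or expired: do the full walk
        self._cache[hostname] = (ip, time.time() + self._ttl)
        return ip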
BOINC
What is BOINC?
“Berkeley Open Infrastructure for Network
Computing”
Platform for Internet-wide distributed
applications
Volunteer computing infrastructure
Relies on many far-flung users volunteering spare CPU power
Some Facts
1,000,000+ active nodes
521 TFLOPS of computing power
20 active projects (SETI@Home,
Folding@Home, Malaria Control…) and
several more in development
(Current as of March 2007)
Comparison to MapReduce
Both are frameworks on which “useful”
systems can be built
Does not prescribe particular programming
style
Much more heterogeneous architecture
Does not have a formal aggregation step
Designed for much longer-running
systems (months/years vs. minutes/hours)
Architecture
Central server runs LAMP architecture for
web + database
End-users run client application with
modules for actual computation
BitTorrent used to distribute data elements
efficiently
System Features
Homogeneous redundancy
Work unit “trickling”
Locality scheduling
Distribution based on host parameters
Client software
Available as regular application,
background “service”, or screensaver
Can be administered locally or over the LAN via RPC
Can be configured to use only “low
priority” cycles
Client/Task Interaction
Client software runs on a variety of operating systems, each with different IPC mechanisms
Uses shared memory message passing to
transmit information from “manager” to
actual tasks and vice versa
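A toy sketch of the mechanism, using Python's multiprocessing.shared_memory as a stand-in for BOINC's per-OS shared-memory IPC; the "suspend" message and 16-byte segment are invented for illustration.

from multiprocessing import Process, shared_memory

def task(segment_name):
    # The task attaches to the same segment and reads the manager's message.
    shm = shared_memory.SharedMemory(name=segment_name)
    print(bytes(shm.buf[:16]).rstrip(b"\x00").decode())   # -> "suspend"
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=16)
    shm.buf[:7] = b"suspend"             # "manager" writes a control message
    worker = Process(target=task, args=(shm.name,))
    worker.start()
    worker.join()
    shm.close()
    shm.unlink()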
Why Participate?
Sense of accomplishment, community
involvement, or scientific duty
Stress testing machines/networks
Potential for fame (if your computer “finds”
an alien planet, you can name it!)
“Bragging rights” for computing more units (“BOINC Credits”)
Credit & Cobblestones
Work done is rewarded with “cobblestones”
100 cobblestones = 1 day of CPU time for a
computer with performance equaling 1,000
double-precision floating-point MIPS
(Whetstone) & 1,000 integer VAX MIPS
(Dhrystone)
Computers are benchmarked by the BOINC
system and receive credit appropriate to their
machine
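As a rough sketch of that definition (averaging the two benchmarks is an assumption of this sketch, not necessarily the exact BOINC formula):

def claimed_credit(cpu_seconds, whetstone_mips, dhrystone_mips):
    # 100 cobblestones per CPU-day on the 1,000/1,000 MIPS reference machine.
    cpu_days = cpu_seconds / 86_400
    benchmark_ratio = ((whetstone_mips + dhrystone_mips) / 2) / 1_000
    return 100 * cpu_days * benchmark_ratio

# One CPU-day on the reference machine earns exactly 100 cobblestones:
print(claimed_credit(86_400, 1_000, 1_000))   # -> 100.0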
Anti-Cheating Measures
Work units are computed redundantly by
several different machines, and results are
compared by the central server for
consistency
Credit is awarded after the internal server
validates the returned work units
Work units must be returned before a
deadline
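A toy sketch of that validation step; the quorum size and the exact-equality comparison are illustrative assumptions (real projects often need fuzzier comparisons for floating-point results):

from collections import Counter

def validate(results, quorum=2):
    # results: list of (host_id, answer) pairs returned for one work unit.
    tally = Counter(answer for _, answer in results)
    answer, votes = tally.most_common(1)[0]
    if votes >= quorum:
        credited = [host for host, a in results if a == answer]
        return answer, credited          # canonical result + hosts that get credit
    return None, []                      # no consensus: reissue the work unit

print(validate([("host-a", 42), ("host-b", 42), ("host-c", 41)]))
# -> (42, ['host-a', 'host-b'])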
Conclusions
Versatile infrastructure
SETI tasks take a few hours
Climate simulation tasks take months
Network monitoring tasks are not CPU-bound
at all!
Scales extremely well to internet-wide
applications
Provides another flexible middleware layer
to base distributed applications on
Volunteer computing comes with add’l
considerations (rewards, cheating)
PlanetLab
What if you wanted to:
Test a new version of BitTorrent that might generate GBs and GBs of data?
Design a new distributed hash table algorithm for thousands of nodes?
Create a gigantic caching structure that mirrors web pages at several sites across the USA?
Problem Similarities
Each of these problems requires:
Hundreds or thousands of servers
Geographic distribution
An isolated network for testing and controlled
experiments
Developing one-off systems to support these
would be
Costly
Redundant
PlanetLab
A multi-university effort to build a network
for large-scale simulation, testing, and
research
“Simulate the Internet”
Usage Stats
Servers: 722+
Slices: 600+
Users: 2500+
Bytes-per-day: 3 - 4 TB
IP-flows-per-day: 190M
Unique IP-addrs-per-day: 1M
As of Fall, 2006
Project Goals
Supports short- and long-term research
goals
System put up “as fast as possible” –
PlanetLab design evolves over time to
meet changing needs
PlanetLab is a process, not a result
Simultaneous Research
Projects must be isolated from one another
Code from several researchers:
Untrustworthy?
Possibly buggy?
Intellectual property issues?
Time-sensitive experiments must not
interfere with one another
Must provide realistic workload simulations
Architecture
Built on Linux, ssh, other standard tools
Provides a “normal” environment for application development
Hosted at multiple universities w/ separate
admins
Requires trust relationships with respect to previous goals
Architecture (cont.)
Network is divided into “slices” – server
pools created out of virtual machines
Trusted intermediary “PLC” system grants
access to network resources
Allows universities to specify who can use slices at each site
Distributed trust relationships
Central system control → federated control
Resource allocation
PLC authenticates users and understands relationships
between principals; issues tickets
SHARP system at site validates ticket + returns lease
request
[Diagram: the PLC issues a ticket to the user; the user presents the ticket to the site, which returns a lease on a slice]
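A toy rendering of that handshake; the field names and the 24-hour lease are invented for illustration and are not the real SHARP or PLC formats:

from dataclasses import dataclass

@dataclass
class Ticket:
    user: str
    slice_name: str
    issued_by: str = "PLC"

@dataclass
class Lease:
    slice_name: str
    site: str
    duration_hours: int

def grant_lease(ticket, site, trusted_issuers=("PLC",)):
    # The site trusts the PLC as an intermediary; unknown issuers are refused.
    if ticket.issued_by not in trusted_issuers:
        raise PermissionError("untrusted ticket issuer")
    return Lease(ticket.slice_name, site, duration_hours=24)

print(grant_lease(Ticket(user="alice", slice_name="uw_dht_test"), site="washington"))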
User Verification
Public-key cryptography used to sign
modules entered into PlanetLab
X.509 + SSL keys are used by PLC + slices to verify user authenticity
Keys distributed “out of band” ahead of time
Final Thoughts
Large system with complex relationships
Currently upgrading to version 4.0
New systems (GENI) are being proposed
Still provides lots of resources to
researchers
CoralCache and several other projects run on PlanetLab
OLPC
“They want to deliver vast amounts
of information over the Internet.
And again, the Internet is not
something you just dump
something on. It's not a big truck.
It's a series of tubes.”
The Internet is a series of tubes
The internet is composed of a lot of
infrastructure:
Clients and servers
Routers and switches
Fiber optic trunk lines, telephone lines, tubes
and trucks
And if we map the density of this
infrastructure…
… it probably looks something like this
Photo: cmu.edu
How do we distribute knowledge
when there are no tubes?
What if we wanted to share a book?
Pass it along, door-to-door.
What if we wanted to share 10,000 books?
Build a community library.
How about 10 million books? Or 300 copies of one book?
A very large library?
Solutions
We need to build infrastructure to make
large-scale distribution easy (i.e.,
computers and networking equipment)
We need to be cheap
Most of those dark spots don’t have much money
We need reliability where reliable power is
costly
Again, did you notice that there weren’t so many lights? It’s because there’s no reliable power.
The traditional solution: a shared
computer with Internet
India
75% of people in rural villages
90% of phones in urban areas
Many villagers share a single phone, usually
located in the town post office
Likewise, villages typically share a few
computers, located at the school (or
somewhere with reliable power)
What’s the downside to this model?
It might provide shared access to a lot of information, but it doesn’t solve the “300 copies of a book” case
The distributed solution: the XO
AKA: Children’s Machine, OLPC, $100
laptop
A cheap (~$150) laptop designed for
children in developing countries
OLPC = One Laptop Per Child
Photo: laptop.org
XO design
Low power consumption
No moving parts (flash memory, passive cooling)
Dual-mode display
In color, the XO consumes 2-3 watts
In high-contrast monochrome, less than 1 watt
Can be human powered by a foot-pedal
Rugged, child-friendly design
Low material costs
Open-source software
XO networking
The XO utilizes far-reaching, low-power
wireless networking to create ad-hoc mesh
networks
If any single XO is connected to the Internet, other nearby computers can share the connection in a peer-to-peer scheme
Networks can theoretically sprawl as far as
ten miles, even connecting nearby villages
XO storage and sharing
XO relies on network for content and
collaboration
Content is stored on central servers
Textbooks
Cached websites (Wikipedia)
User content
Software makes it easy to see other users on the network and share content
XO distribution
XO must be purchased in orders of 1
million units by governments in developing
nations (economies of scale help to lower
costs)
Governments are responsible for
distribution of laptops
Laptops are only for children, designed
solely as a tool for learning
XO downfalls
Distribution downfalls
What about children in developed nations?
Sell to developed markets at a higher price to subsidize
costs for developing nations.
Can governments effectively distribute? What about black markets?
OLPC could perhaps partner with local schools and
other NGOs to aid in distribution, training and
maintenance
Too expensive?
Some nations can only afford as much as $20 per child per year. How can we cater to them?
What can the XO achieve?
Today, only 16 percent of the world’s
population is estimated to have access to the
Internet
Develop new markets
Microcredit
Make small loans to the impoverished without requiring
collateral
Muhammad Yunus and the Grameen Bank won the
2006 Nobel Peace Prize for their work here
The power of the village economy
As millions of users come online in developing nations,
there will be many new opportunities for commerce.
Helps those in developing nations to advance their
economies and develop stronger economic models
Why give the XO to children?
UN Millennium Development Goal #2:
“achieve universal primary education”
Empower children to think and compete in
a global space
Children are a nation’s greatest resource
Backed by a bolstered economy, they will
grow to solve other issues (infrastructure,
poverty, famine)
The Course Again (in 5 minutes)
So what did we see in this class?
Moore’s law is starting to fail
More computing power means more
machines
This means breaking problems into sub-problems
Sub-problems cannot interfere with or depend on
one another
Have to “play nice” with shared memory
MapReduce
MapReduce is one paradigm for breaking
problems up
Makes the “playing nice” easy by enforcing a decoupled programming model
Handles lots of the behind-the-scenes work
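A single-process toy of that decomposition (real MapReduce spreads the map and reduce phases over many machines and handles the shuffle, retries, and so on for you):

from collections import defaultdict

def map_phase(document):
    # Map: emit (word, 1) pairs with no shared state between documents.
    return [(word, 1) for word in document.split()]

def reduce_phase(pairs):
    # Shuffle + reduce: group by key and sum the counts.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["the cat sat", "the cat ran"]
intermediate = [pair for doc in docs for pair in map_phase(doc)]
print(reduce_phase(intermediate))   # -> {'the': 2, 'cat': 2, 'sat': 1, 'ran': 1}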
Distributed Systems & Networks
The network is a fundamental part of a
distributed system
Have to plan for bandwidth, latency, etc.
We’d like to think of the network as an
abstraction
Sockets = pipes
RPC looks like a normal procedure call,
handles tricky stuff under the hood
Still have to plan for failures of all kinds
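To see the “looks like a normal procedure call” point, here is a small example using Python’s standard xmlrpc modules (just one illustrative RPC mechanism, not the one any system in this lecture uses; the port number is arbitrary):

import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    return a + b                          # runs on the "server" side

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# The caller just invokes a method on the proxy; serialization and the
# network round trip happen under the hood.
proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))                    # -> 5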
Distributed Filesystems
The network allows us to make data
available across many machines
Network file systems can hook into existing infrastructure
Specialized file systems (like GFS) can offer
better performance with loss of generality
Raises issues of concurrency, process
isolation, and how to combat stale data
And finally…
There are lots of distributed systems out
there
MapReduce, BOINC, MPI, several other
architectures, styles, problems to solve
You might be designing an important one
soon yourself!