Lecture 6 - Other Distributed Systems
CSE 490h – Introduction to Distributed Computing, Spring 2007
Except as otherwise noted, the content of this presentation is
licensed under the Creative Commons Attribution 2.5 License.
Outline
DNS
 BOINC
 PlanetLab
 OLPC & Ad-hoc Mesh Networks
 Lecture content wrap-up

DNS: The Distributed System in the Distributed System
Domain Name System

Mnemonic identifiers work considerably better for humans than IP addresses
  “www.google.com? Surely you mean 66.102.7.99!”

Who maintains the mappings from name → IP?
A Manageable Problem
© 2006 Computer History Museum. All rights reserved.
www.computerhistory.org
In the beginning…
Every machine had a file named hosts.txt
  Each line contained a name/IP mapping
  New hosts files were updated and distributed via email

… This clearly wasn’t going to scale
DNS Implementations
Modern DNS system first proposed in 1983
First implementation in 1984 (Paul Mockapetris)
BIND (Berkeley Internet Name Domain) written by four Berkeley students in 1985
Many other implementations today

Hierarchical Naming

DNS names are arranged in a hierarchy: www.cs.washington.edu
Entries are either subdomains or hostnames
  Subdomains contain more subdomains or hosts (up to 127 levels deep!)
  Hosts have individual IP addresses

Mechanics: Theory
The DNS recurser (client) parses the address from right to left
  Asks a root server (with a known, static IP address) for the name of the first subdomain’s DNS server
  Contacts successive DNS servers until it finds the host

Mechanics: In Practice
ISPs provide a DNS recurser for clients
DNS recursers cache lookups for a period of time after a request
  Greatly speeds up retrieval of entries and reduces system load
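
To make the two slides above concrete, here is a minimal sketch of a caching recurser: it splits the name, walks the hierarchy from the root, and remembers answers for a while. The zone tables, server names, address, and TTL below are made up for illustration; a real recurser speaks the DNS wire protocol to real name servers.

# Toy model of DNS resolution: parse the name right to left, start at the
# root, and follow delegations until a host record is found.
import time

# Each "server" maps a label either to a delegation (another server) or to an IP.
ROOT = {"edu": "edu-server"}
ZONES = {
    "edu-server": {"washington": "uw-server"},
    "uw-server":  {"cs": "cs-server"},
    "cs-server":  {"www": "128.208.3.88"},    # made-up address
}

CACHE = {}           # name -> (ip, expiry time)
TTL_SECONDS = 300    # recursers cache answers for a period like this

def resolve(name):
    cached = CACHE.get(name)
    if cached and cached[1] > time.time():
        return cached[0]                      # served from cache: no lookups needed
    labels = name.split(".")[::-1]            # right to left: edu, washington, cs, www
    table = ROOT                              # root server has a well-known address
    for label in labels:
        entry = table[label]
        if entry in ZONES:                    # a delegation to the next DNS server
            table = ZONES[entry]
        else:                                 # a host record: we are done
            CACHE[name] = (entry, time.time() + TTL_SECONDS)
            return entry
    raise KeyError(name)

print(resolve("www.cs.washington.edu"))       # walks root -> edu -> washington -> cs
print(resolve("www.cs.washington.edu"))       # second call hits the cache
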
BOINC
What is BOINC?
“Berkeley Open Infrastructure for Network Computing”
Platform for Internet-wide distributed applications
Volunteer computing infrastructure
  Relies on many far-flung users volunteering spare CPU power
Some Facts
1,000,000+ active nodes
521 TFLOPS of computing power
20 active projects (SETI@Home, Folding@Home, Malaria Control…) and several more in development

(Current as of March 2007)
Comparison to MapReduce
Both are frameworks on which “useful” systems can be built
BOINC does not prescribe a particular programming style
Much more heterogeneous architecture
Does not have a formal aggregation step
Designed for much longer-running systems (months/years vs. minutes/hours)

Architecture
Central server runs a LAMP architecture for web + database
End users run a client application with modules for the actual computation
BitTorrent is used to distribute data elements efficiently

System Features
Homogeneous redundancy
Work unit “trickling”
Locality scheduling
Distribution based on host parameters

Client software
Available as a regular application, background “service”, or screensaver
Can be administered locally or LAN-administered via RPC
Can be configured to use only “low-priority” cycles

Client/Task Interaction
Client software runs on a variety of operating systems, each with different IPC mechanisms
Uses shared-memory message passing to transmit information from the “manager” to the actual tasks and vice versa
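
As a rough illustration of that manager-to-task channel, the sketch below passes one control message through a shared-memory segment using Python’s multiprocessing.shared_memory module; the fixed-size mailbox and the <suspend/> message are illustrative assumptions, not BOINC’s actual protocol.

# Minimal manager -> task messaging over a shared-memory segment.
from multiprocessing import Process, shared_memory

SIZE = 64  # fixed-size mailbox, null-padded

def science_task(segment_name):
    # The task attaches to the segment the manager created and reads its command.
    shm = shared_memory.SharedMemory(name=segment_name)
    command = bytes(shm.buf[:SIZE]).rstrip(b"\x00").decode()
    print("task received:", command)
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=SIZE)
    message = b"<suspend/>"                       # made-up control message
    shm.buf[:len(message)] = message
    worker = Process(target=science_task, args=(shm.name,))
    worker.start()
    worker.join()
    shm.close()
    shm.unlink()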

Why Participate?
Sense of accomplishment, community involvement, or scientific duty
Stress-testing machines/networks
Potential for fame (if your computer “finds” an alien planet, you can name it!)
“Bragging rights” for computing more units
  “BOINC Credits”
Credit & Cobblestones
Work done is rewarded with “cobblestones”
  100 cobblestones = 1 day of CPU time on a computer with performance equal to 1,000 double-precision floating-point MIPS (Whetstone) and 1,000 integer VAX MIPS (Dhrystone)
  Computers are benchmarked by the BOINC system and receive credit appropriate to their machine
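
Taking the slide’s definition literally, claimed credit scales linearly with CPU time and with the machine’s benchmark scores relative to the 1,000-MIPS reference. A small sketch follows; the function name, the simple averaging of the two benchmarks, and the sample numbers are illustrative assumptions rather than BOINC’s exact code.

# Claimed credit from the slide's definition: 100 cobblestones = 1 CPU-day on a
# machine scoring 1,000 Whetstone MIPS and 1,000 Dhrystone MIPS.
SECONDS_PER_DAY = 86_400

def claimed_cobblestones(cpu_seconds, whetstone_mips, dhrystone_mips):
    # Averaging the two benchmarks is an assumption made for this sketch.
    benchmark_ratio = ((whetstone_mips + dhrystone_mips) / 2) / 1_000
    return 100 * (cpu_seconds / SECONDS_PER_DAY) * benchmark_ratio

# A machine twice as fast as the reference, crunching for half a day:
print(claimed_cobblestones(43_200, 2_000, 2_000))   # -> 100.0
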
Anti-Cheating Measures
Work units are computed redundantly by several different machines, and results are compared by the central server for consistency
Credit is awarded after the server validates the returned work units
Work units must be returned before a deadline
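
A minimal sketch of the validation idea, assuming a simple majority quorum over identical results; BOINC’s real validators are project-specific and often compare results approximately rather than exactly.

# Accept a work unit's result only when enough independent machines agree,
# then grant credit to the agreeing hosts. Quorum size is an illustrative choice.
from collections import Counter

QUORUM = 3

def validate(results):
    """results: list of (host_id, result_value) returned before the deadline."""
    if len(results) < QUORUM:
        return None                          # keep waiting for more replicas
    counts = Counter(value for _, value in results)
    value, votes = counts.most_common(1)[0]
    if votes >= QUORUM:
        return value                         # canonical result; credit the agreeing hosts
    return None                              # inconsistent -> send out more copies

print(validate([("a", 42), ("b", 42), ("c", 42), ("d", 41)]))   # -> 42
print(validate([("a", 42), ("b", 17), ("c", 99)]))              # -> None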

Conclusions

Versatile infrastructure
  SETI tasks take a few hours
  Climate simulation tasks take months
  Network monitoring tasks are not CPU-bound at all!
Scales extremely well to Internet-wide applications
Provides another flexible middleware layer on which to base distributed applications
Volunteer computing comes with additional considerations (rewards, cheating)

PlanetLab
What if you wanted to:
Test a new version of BitTorrent that might generate GBs and GBs of data?
Design a new distributed hash table algorithm for thousands of nodes?
Create a gigantic caching structure that mirrored web pages at several sites across the USA?

Problem Similarities

Each of these problems requires:
  Hundreds or thousands of servers
  Geographic distribution
  An isolated network for testing and controlled experiments

Developing one-off systems to support these would be:
  Costly
  Redundant
PlanetLab
A multi-university effort to build a network for large-scale simulation, testing, and research
“Simulate the Internet”

Usage Stats
Servers: 722+
Slices: 600+
Users: 2,500+
Bytes per day: 3–4 TB
IP flows per day: 190M
Unique IP addresses per day: 1M

(As of Fall 2006)
Project Goals
Supports short- and long-term research goals
The system was put up “as fast as possible” – the PlanetLab design evolves over time to meet changing needs
  PlanetLab is a process, not a result
Simultaneous Research
Projects must be isolated from one another
Code from several researchers: untrustworthy? Possibly buggy? Intellectual property issues?
Time-sensitive experiments must not interfere with one another
Must provide realistic workload simulations

Architecture

Built on Linux, ssh, and other standard tools
  Provides a “normal” environment for application development
Hosted at multiple universities with separate admins
  Requires trust relationships with respect to the previous goals
Architecture (cont.)
Network is divided into “slices” – server pools created out of virtual machines
Trusted intermediary “PLC” system grants access to network resources
  Allows universities to specify who can use slices at each site
  Distributed trust relationships
  Central system control → Federated control
Resource allocation
PLC authenticates users and understands relationships between principals; issues tickets
SHARP system at each site validates the ticket and returns a lease

[Diagram: the user requests a ticket from the PLC, presents it to a site, and receives a lease on a slice]
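
A rough sketch of that handshake, assuming a shared signing key and made-up field names; the real PLC and SHARP use public-key certificates rather than an HMAC, so this only shows the shape of the ticket-then-lease flow.

# PLC signs a ticket for an authenticated user; a site exchanges a valid
# ticket for a lease on a slice.
import hmac, hashlib, time
from dataclasses import dataclass

PLC_SECRET = b"plc-demo-key"        # stands in for the PLC's signing key

@dataclass
class Ticket:
    user: str
    slice_name: str
    signature: bytes

@dataclass
class Lease:
    slice_name: str
    expires_at: float

def plc_issue_ticket(user, slice_name):
    payload = f"{user}:{slice_name}".encode()
    return Ticket(user, slice_name, hmac.new(PLC_SECRET, payload, hashlib.sha256).digest())

def site_grant_lease(ticket, duration_s=3600):
    payload = f"{ticket.user}:{ticket.slice_name}".encode()
    expected = hmac.new(PLC_SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(ticket.signature, expected):
        raise PermissionError("ticket not issued by the PLC")
    return Lease(ticket.slice_name, time.time() + duration_s)

ticket = plc_issue_ticket("researcher@uw", "uw_cse_dht")   # made-up names
print(site_grant_lease(ticket))
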
User Verification

Public-key cryptography is used to sign modules entered into PlanetLab
  X.509 + SSL keys are used by the PLC and slices to verify user authenticity
  Keys are distributed “out of band” ahead of time
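
As a simplified stand-in for the X.509 + SSL machinery, the sketch below signs a module with an RSA private key and verifies it against the public key that was distributed out of band. It uses the third-party cryptography package; the key size, padding choice, and module contents are illustrative assumptions, not PlanetLab’s actual certificate handling.

# Sign a module with a private key and verify it with the matching public key.
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.exceptions import InvalidSignature

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()        # distributed "out of band" ahead of time

module_bytes = b"experiment code to be installed in a slice"
signature = private_key.sign(module_bytes, padding.PKCS1v15(), hashes.SHA256())

try:
    public_key.verify(signature, module_bytes, padding.PKCS1v15(), hashes.SHA256())
    print("module accepted")
except InvalidSignature:
    print("module rejected")
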
Final Thoughts
Large system with complex relationships
Currently upgrading to version 4.0
New systems (GENI) are being proposed
Still provides lots of resources to researchers
  CoralCache and several other projects run on PlanetLab
OLPC
“They want to deliver vast amounts
of information over the Internet.
And again, the Internet is not
something you just dump
something on. It's not a big truck.
It's a series of tubes.”
The Internet is a series of tubes

The Internet is composed of a lot of infrastructure:
  Clients and servers
  Routers and switches
  Fiber optic trunk lines, telephone lines, tubes and trucks

And if we map the density of this infrastructure…
… it probably looks something like this
Photo: cmu.edu
How do we distribute knowledge when there are no tubes?

What if we wanted to share a book?
  Pass it along, door-to-door.
What if we wanted to share 10,000 books?
  Build a community library.
How about 10 million books? Or 300 copies of one book?
  A very large library?
Solutions
We need to build infrastructure to make large-scale distribution easy (i.e., computers and networking equipment)
We need to be cheap
  Most of those dark spots don’t have much money
We need reliability where reliable power is costly
  Again, did you notice that there weren’t so many lights? It’s because there’s no reliable electricity.
The traditional solution: a shared computer with Internet
India
  75% of people in rural villages
  90% of phones in urban areas
  Many villagers share a single phone, usually located in the town post office
  Likewise, villages typically share a few computers, located at the school (or somewhere with reliable power)
What’s the downside to this model?
  It might provide shared access to a lot of information, but it doesn’t solve the “300 copies of a book” case
The distributed solution: the XO
AKA: Children’s Machine, OLPC, $100 laptop
A cheap (~$150) laptop designed for children in developing countries
• OLPC = One Laptop Per Child
Photo: laptop.org
XO design

Low power consumption
  No moving parts (flash memory, passive cooling)
  Dual-mode display
    In color, the XO consumes 2-3 watts
    In high-contrast monochrome, less than 1 watt
  Can be human-powered by a foot pedal
Rugged, child-friendly design
Low material costs
Open-source software
XO networking

The XO utilizes far-reaching, low-power wireless networking to create ad-hoc mesh networks
  If any single XO is connected to the Internet, other nearby computers can share the connection in a peer-to-peer scheme
Networks can theoretically sprawl as far as ten miles, even connecting nearby villages
XO storage and sharing

The XO relies on the network for content and collaboration
Content is stored on a central server
  Textbooks
  Cached websites (Wikipedia)
  User content
Software makes it easy to see other users on the network and share content
XO distribution
The XO must be purchased in orders of 1 million units by governments in developing nations (economies of scale help to lower costs)
Governments are responsible for distribution of the laptops
Laptops are only for children, designed solely as a tool for learning

XO downfalls

Distribution downfalls
  What about children in developed nations?
    Sell to developed markets at a higher price to subsidize costs for developing nations
  Can governments effectively distribute? What about black markets?
    OLPC could perhaps partner with local schools and other NGOs to aid in distribution, training, and maintenance
Too expensive?
  Some nations can only afford as much as $20 per child per year. How can we cater to them?
What can the XO achieve?


Today, only 16 percent of the world’s population is estimated to have access to the Internet
Develop new markets
  Microcredit
    Make small loans to the impoverished without requiring collateral
    Muhammad Yunus and the Grameen Bank won the 2006 Nobel Peace Prize for their work here
  The power of the village economy
    As millions of users come online in developing nations, there will be many new opportunities for commerce
    Helps those in developing nations to advance their economies and develop stronger economic models
Why give the XO to children?
UN Millennium Development Goal #2: “achieve universal primary education”
Empower children to think and compete in a global space
  Children are a nation’s greatest resource
  Backed by a bolstered economy, they will grow to solve other issues (infrastructure, poverty, famine)
The Course Again (in 5 minutes)

So what did we see in this class?
  Moore’s law is starting to fail
  More computing power means more machines
  This means breaking problems into sub-problems
    Sub-problems cannot interfere with or depend on one another
    Have to “play nice” with shared memory

MapReduce

MapReduce is one paradigm for breaking problems up
  Makes the “playing nice” easy by enforcing a decoupled programming model
  Handles lots of the behind-the-scenes work
Distributed Systems & Networks

The network is a fundamental part of a distributed system
  Have to plan for bandwidth, latency, etc.
We’d like to think of the network as an abstraction
  Sockets = pipes
  RPC looks like a normal procedure call and handles the tricky stuff under the hood
Still have to plan for failures of all kinds
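
A tiny sketch of the RPC abstraction using Python’s built-in xmlrpc modules: the caller writes what looks like a normal function call while sockets and serialization are handled underneath. The port number and the add function are arbitrary choices for illustration, and real systems still need to handle the failures the slide mentions.

# RPC dressed up as a normal procedure call.
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def add(x, y):
    return x + y

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add)
threading.Thread(target=server.serve_forever, daemon=True).start()

proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))    # a network round trip that reads like a function call
server.shutdown()
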
Distributed Filesystems

The network allows us to make data available across many machines
  Network file systems can hook into existing infrastructure
  Specialized file systems (like GFS) can offer better performance at the cost of some generality
Raises issues of concurrency, process isolation, and how to combat stale data
And finally…
There are lots of distributed systems out there
  MapReduce, BOINC, MPI, and several other architectures, styles, and problems to solve
  You might be designing an important one soon yourself!
