Reliability Tools and Options
Professor Ken Birman
Dept. of Computer Science
Cornell University
Last Time
We saw that reliability is a complex
spectrum of properties and tradeoffs
We developed the idea of e-triage
And we glanced at some technologies
Today
Last of three lectures on reliability
Focus on technologies in more depth
What can they do for us?
How do they work?
How well do they integrate?
Limitations? Scalability issues?
Technologies
Communication Tools:
TCP/IP
Remote Procedure Call (or “method invocation”)
Process group membership tracking and multicast
Publish/subscribe (also called “MOMS”, for message-oriented middleware)
Checkpoint and Restart, perhaps with mirrored disks
Transactions and Databases
Web servers and Java/JNI/JavaScript
Components and Object-oriented architectures
Cluster fault-tolerance and load-balancing
Traditional Linux tools and “scripts”
Hardware reliability – fault-tolerant computers
Sorting Things Out
Computer scientists like to think in terms
of big chunks of technology that they
classify into categories
Often we talk about “layers”
Lowest layers are close to the hardware
Higher layers deal with things closer to
the user who sits in front of a screen
Examples of Layers
Applications
Server Technologies
Middleware
Network Protocols
Operating System
What Makes a Layer?
Layer uses stuff below it but nothing from
above it
And the layer offers a set of services to things
above it
Sometimes we imagine a layer as a thing that
transforms a computer or a network into a
new one with new properties!
Somewhat like looking through a set of magic
eyeglasses, each one somehow transforming
the world into a magic new world…
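A minimal sketch of the layering idea, assuming two made-up classes (LossyLink and ReliableLink are illustrative names, not part of any real system). The upper layer uses only the send service of the layer below it, and turns a link that drops packets into one that appears reliable.

```python
class LossyLink:
    """Lower layer (hypothetical): drops the packet on scheduled attempts."""

    def __init__(self, drop_on_attempts):
        self.drop_on_attempts = set(drop_on_attempts)
        self.attempts = 0

    def send(self, packet):
        self.attempts += 1
        # None stands for "the packet was lost".
        return None if self.attempts in self.drop_on_attempts else packet


class ReliableLink:
    """Upper layer: uses only LossyLink.send and offers a send that retries."""

    def __init__(self, lower, max_tries=5):
        self.lower = lower
        self.max_tries = max_tries

    def send(self, packet):
        for _ in range(self.max_tries):
            delivered = self.lower.send(packet)
            if delivered is not None:
                return delivered
        raise TimeoutError("gave up after %d tries" % self.max_tries)


link = LossyLink(drop_on_attempts={1, 2})   # first two transmissions are lost
reliable = ReliableLink(link)
delivered = reliable.send("hello")
```

Code written against ReliableLink never sees the two lost transmissions; that is the "new computer with new properties" effect.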
Operating System
Major ones are
Windows (several variants), from Microsoft
Linux (one of many versions of Unix)
Macintosh OS
Palm OS
VxWorks, QNX
Many other minor ones
Operating System
It runs the hardware for one computer
Also supports “processes”, manages memory and other
resources, provides security
People refer to the OS as a “platform”
Applications use OS features and run “on” it
They don’t need to deal with special issues involving the
hardware because the OS handles them
These days OS also includes components that handle
networking
A modern OS is structured as a set of “objects”
Protocols
These are little programs that run in
applications or in the OS
They work by sending messages over network
connections
Goal is to do something useful in a distributed
manner
For example, network can lose packets
But web pages can’t tolerate missing chunks of data!
So web uses a protocol that resends lost packets
Representative
Protocols?
Just look at two examples to get the feel
Don’t worry about the details
Idea is to understand the “kinds of
things” each layer is doing, not the
specifics
We do teach the specifics in Cornell courses
But any one of these would take weeks to
cover in a comprehensive way
Communication Tools:
TCP/IP
The basic communication technology of the
Web
Works like a telephone call:
Your browser connects to a server using its IP
address (looks like 128.64.77.133)
Your request is sent as a message over the
connection. The result comes back.
The connection automatically matches sending and
receiving rates (easily fooled by noisy links!)
Also, automatically corrects for data loss
TCP sliding window
[Figure: a window of k “segments” between sender and receiver, with segments up to m(i+k) in flight, m(i+k+1) waiting at the sender and m(i) at the receiver; IP packets carry the segments]
Sender provides data; receiver consumes data
Receiver replies with acks and nacks; sender resends missing data
When an acknowledgement is received, the segment number keeps
incrementing but the slot number is reused
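The sliding window can be sketched as a small simulation. This is a simplified model, not real TCP: the function name and the drop_first_attempt parameter are assumptions made for illustration, and the sender is allowed to see directly which segments arrived, standing in for the acks and nacks.

```python
def sliding_window_send(data, k, drop_first_attempt=frozenset()):
    """Push `data` through a window of k slots over a lossy link.

    drop_first_attempt: segment numbers whose first transmission is lost
    (a deterministic stand-in for packet loss).
    Returns (data as seen by the receiver, total transmissions).
    """
    received = {}        # receiver buffer: segment number -> payload
    attempts = {}        # transmissions so far, per segment
    base = 0             # oldest segment not yet acknowledged
    transmissions = 0
    while base < len(data):
        # Send every segment in the window [base, base + k) that is still missing.
        for seq in range(base, min(base + k, len(data))):
            if seq in received:
                continue
            attempts[seq] = attempts.get(seq, 0) + 1
            transmissions += 1
            lost = seq in drop_first_attempt and attempts[seq] == 1
            if not lost:
                # Segment seq would occupy window slot seq % k.
                received[seq] = data[seq]
        # Acknowledgements slide the window past everything received in order.
        while base in received:
            base += 1
    return [received[i] for i in range(len(data))], transmissions


out, sent = sliding_window_send(list("abcdefg"), k=3, drop_first_attempt={1, 4})
```

With seven segments and two losses, the receiver still sees all seven in order, at the cost of two extra transmissions.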
TCP/IP: Pros and
Cons
Simple, widely supported way to communicate
Overcomes packet loss, duplication, out of
order delivery
But can reduce rate down to zero when network
becomes congested, easily fooled by a noisy link
Also, connections can break even if neither
endpoint actually fails.
Things that use TCP, like web browsers, inherit
these benefits… and these problems!
Communication Tools:
RPC
Idea is that each program declares a set
of actions it can perform – “methods”
that can be invoked using an “interface”
Client programs “bind” to interface
Send a message to invoke a method,
reply comes back in form of a message
too. Special protocols overcome failure
The basic RPC protocol
Server registers with name service
Client “binds” to server
Client prepares, sends request
Server receives request, invokes handler, sends reply
Client unpacks reply
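The steps above can be sketched in a single process. This is an illustration only: the name_service dictionary, the Server and Client classes and the JSON message format are all assumptions, and a direct method call stands in for the network.

```python
import json

name_service = {}   # hypothetical registry: service name -> server


class Server:
    def __init__(self, name, handlers):
        self.handlers = handlers
        name_service[name] = self                        # registers with name service

    def receive(self, request_bytes):
        request = json.loads(request_bytes)              # receives request
        handler = self.handlers[request["method"]]
        result = handler(*request["args"])               # invokes handler
        return json.dumps({"result": result}).encode()   # sends reply


class Client:
    def __init__(self, name):
        self.server = name_service[name]                 # "binds" to server

    def call(self, method, *args):
        # Prepare the request as a message (bytes), as it would travel on the wire.
        request = json.dumps({"method": method, "args": args}).encode()
        reply_bytes = self.server.receive(request)       # stands in for send + wait
        return json.loads(reply_bytes)["result"]         # unpacks reply


Server("calculator", {"add": lambda a, b: a + b})
client = Client("calculator")
total = client.call("add", 2, 3)
```

A real RPC system would add timeouts and retransmission around the send step, which is where the failure cases in the summary come from.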
RPC Summary
Basic technology in most “client-server”
situations with exception of the Web
Can hide packet loss but not server failure
Can certainly fail (due to timeout) when server
and client are actually both healthy
Many limitations in terms of form of data you
can send, packet size, etc.
When are they used?
TCP is used to transfer “objects”
Usually objects are reasonably large
Examples are email messages, files, web
pages, copies of programs
RPC is used when a program asks for a
service provided by some other program
Best for small requests and replies
Concept Of
Middleware
Middleware is any kind of a software tool that
runs over a basic infrastructure
Provides a standard set of services for some class
of applications
Idea is that OS and network may be “too general”
Middleware creates a better environment for some
large class of applications that all share a need
poorly addressed by the lower layers
Middleware is increasingly important
Communication Middleware
Example: Multicast
Broad term covering a variety of one-to-many
communication tools
We talk about the:
Process group: set of programs for which
membership is tracked
Multicast: a way of sending data to group
State transfer: brings a joining program up to date
Order, atomicity: guarantee that messages are seen
in same order by all members, despite failure
Virtual Synchrony Model
[Figure: timelines for processes p, q, r, s, t, showing the sequence of group views]
G0={p,q}: r, s request to join
G1={p,q,r,s}: r, s added; state transfer
p fails (crash)
G2={q,r,s}: t requests to join
G3={q,r,s,t}: t added; state transfer
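The view sequence in the figure can be reproduced with a toy model. This is a sketch under simplifying assumptions (one process, no real failures or network); the ProcessGroup class and its method names are invented for illustration and do not come from any particular toolkit.

```python
class ProcessGroup:
    """Toy process group: tracks views and each member's delivered messages."""

    def __init__(self, members):
        self.views = [frozenset(members)]             # G0
        self.delivered = {m: [] for m in members}     # per-member message log

    def multicast(self, message):
        # Every member of the current view sees the message, in the same order.
        for member in self.views[-1]:
            self.delivered[member].append(message)

    def join(self, *newcomers):
        # State transfer: copy an existing member's log to each joiner,
        # then install a new view that includes them.
        donor = min(self.views[-1])
        for member in newcomers:
            self.delivered[member] = list(self.delivered[donor])
        self.views.append(self.views[-1] | frozenset(newcomers))

    def fail(self, member):
        # A crash is reported to the survivors as a new, smaller view.
        self.views.append(self.views[-1] - {member})


group = ProcessGroup(["p", "q"])   # G0 = {p, q}
group.multicast("m1")
group.join("r", "s")               # G1 = {p, q, r, s}
group.fail("p")                    # G2 = {q, r, s}
group.multicast("m2")
group.join("t")                    # G3 = {q, r, s, t}
```

After the run, t holds both messages even though it joined last, because state transfer brought it up to date; p holds only the message sent before it crashed.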
Communication Middleware
Example: Publish/Subscribe
Packaging of one-to-many communication
tools into an elegant, easily understood
form
Idea is that data producers “publish”
information, marked with “subjects” that
each item is about
Subscribers “subscribe” to the subjects
of interest to them
Conceptually, a
message “bus”
Boxes are publishers (red / green subjects)
Circles are subscribers (red / green subjects)
Disks represent spoolers used for playback
Flexible and easily extended over time
Supports huge numbers of subjects
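A minimal in-memory sketch of such a bus, assuming invented names (MessageBus, subscribe, publish, playback) that are not taken from any real product. Each subject keeps a spool so a late subscriber can ask for playback.

```python
from collections import defaultdict


class MessageBus:
    def __init__(self):
        self.subscribers = defaultdict(list)   # subject -> list of callbacks
        self.spool = defaultdict(list)         # subject -> messages published so far

    def subscribe(self, subject, callback, playback=False):
        if playback:
            # Replay the spooled history before delivering new messages.
            for message in self.spool[subject]:
                callback(message)
        self.subscribers[subject].append(callback)

    def publish(self, subject, message):
        self.spool[subject].append(message)
        for callback in self.subscribers[subject]:
            callback(message)


bus = MessageBus()
red_seen, late_seen = [], []
bus.subscribe("red", red_seen.append)
bus.publish("red", "r1")
bus.publish("green", "g1")                              # nobody subscribed to green
bus.subscribe("red", late_seen.append, playback=True)   # late joiner replays r1
bus.publish("red", "r2")
```

Publishers and subscribers never name each other; adding a new subject or a new subscriber requires no change to existing code, which is what makes the model easy to extend.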
Publish/Subscribe
Pros and Cons
Conceptually very simple, popular
But in practice the infrastructure can be
limiting and cumbersome
Often end up with more or less all processes
receiving more or less all the messages,
anyhow
Example of a technology that made more
sense when computers were slower
When Are They Used?
Process groups?
New York Stock Exchange, Swiss Exchange
French air traffic control system
AEGIS rebuild
Publish-Subscribe message bus
Most trading floors
Factory automation and process control
Some internal use for gluing databases to web sites
Servers
Many modern technologies follow a client-server programming model
You are the client
The server handles incoming requests
This model is probably the big success of the
1980-2000 period for computing
Normally, client connects to server on network
and uses some form of RPC to talk to it
Servers
Web servers
Database servers
Weblogic: a fancy web server that combines
features needed for eCommerce sites
Mail servers, message queuing servers
Other application-specific servers
E.g. computer-aided design, payroll, etc…
Servers
Secretly, most servers are a database
perhaps extended to know about a
specific category of application or use
We call these domain-specific refinements
Idea is that an Oracle database, out of the
box, is a very general platform but that a lot
of work is needed to use it for, say, payroll
Databases use “transactional” model
Transactions and
Databases
One of the very big, well supported
technologies
Associated with databases
Each program “runs a transaction”
begin
action1 action2 action3 ….
commit or abort
Either entire transaction is performed, or entire
transaction is erased (if disrupted by crash)
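The begin / actions / commit-or-abort pattern can be tried with Python's built-in sqlite3 module. This is a small sketch: the accounts table, the transfer function and the "insufficient funds" rule are made up for illustration. Using the connection as a context manager commits if the block finishes and rolls back if it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction; return True if it committed."""
    try:
        with conn:   # commit on success, roll back on exception
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")   # abort: both updates are undone
        return True
    except ValueError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)        # commits
failed = transfer(conn, "alice", "bob", 500)   # aborts, nothing changes
```

The second transfer performs both updates and then aborts, yet the balances afterwards show no trace of it: either the entire transaction happens or none of it does.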
ACID Properties
Atomicity: entire group of actions is treated as
one “atomic unit”
Consistency: each transaction takes the database
from one valid state to another
Isolation: more than one can run at the same time on the
same database, but they are isolated from each
other, as if only one ran at a time
Durability: committed transactions survive
failures and recoveries
Pros and Cons
Mixture of a powerful model with powerful,
comprehensive vendor support
More or less integrated with web
But recovery can be slow
And high availability databases usually sacrifice some aspects
of ACID guarantee
Note that vendors offer “replication” products but
nobody uses these – performance is terrible.
Hot topic: cluster-style parallel servers
Clustering is a way to get scalability
Trends in Systems
Enough on layers
In previous lectures looked at business issues
associated with the Internet
Today have also seen lots of technology
Mixture of current systems
Emerging products and systems
Technologies
What comes next in distributed computing?
Ways of posing
questions
As a business question:
I want to get rich, what should I invest in?
Ultimately a flakey and meaningless question
Should ask “what should I learn about”
As a research question
I want to be famous, what should I invent?
If you’re so smart, you should tell me!
As a big-picture question
Where is dramatic change inevitable?
This question makes more sense than the others
Looking for Exciting
Change
Our goal is to anticipate dramatic,
unexpected change
Is there a methodology for identifying
the big opportunities?
How can we apply it to networks and
distributed computing?
Traditional Areas
File systems
Communications
Naming of objects, interoperation
Security
Resource management
Transactions
Extensibility
Emerging areas
Scalable service management
Tools for hosting data
Mechanisms for offloading work from
customers onto 3rd party solution
provider systems
QoS mechanisms
Power-aware and mobility support
Where are the big
opportunities?
We could review these one topic at a
time, but that might get dull
Can we develop a methodology for
recognizing big opportunities and
“leaping in”?
Technology trends
[Chart: CPU MIPS, Memory MB, LAN Mbits, WAN Mbits and O/S overhead, on a 0 to 700 scale, for five-year periods from 1985-1990 to 2000-2005]
Source: Scientific American, Sept. 1995
Note tremendous growth in WAN speeds
Typical latencies (milliseconds)
[Chart: Disk I/O, Ethernet RPC, ATM roundtrip and WAN roundtrip latencies, on a log scale from 0.01 to 1000 ms, for five-year periods from 1985-1990 to 2000-2005]
WAN, disk latencies are fairly constant due to
physical limitations
O/S latency: the most expensive
overhead on LAN communication!
[Chart: O/S overhead in proportional terms, on a 0 to 40 scale, over 1985 to 2000]
Suggests?
Notice that revolutionary opportunity is
triggered by technical discontinuity
To predict a revolution…
… just identify a technology sector about to
be shaken up by a trend that breaks the
usual relationships
… predict “big things will happen”
Recent revolutions
Internet became much faster, more
widely available
Operating systems became object
oriented
Enabled the Web
Which enabled all sorts of B2B
developments people knew were coming…
Other examples?
For a long time, PCs were slow and
balky, but very cheap
But around 1990 technology gave us a
fast, big PC
Suddenly, desktop world yielded to PC world
Price point can trigger a discontinuity
Other examples?
We used to be short on memory hence relied
heavily on disks
But around 1985 memory sizes and cost
changed the equation
Suddenly massive caches made sense
Giving us ideas like log-structured file systems and
new styles of caching in file and database systems
A world where 100% hit rates made sense
Looking to the
future?
Major discontinuities:
Move from PC to PDA/telephone hybrids
Mobility, disconnected operation
Emergence of huge numbers of computing
systems that need to cooperate
Perhaps, some form of QoS?
Want to have an
impact?
Trick is to zero in on one of these areas
Be an early player
For example, get a mobile hand-held system and
start to play with it
Lots of things in the legacy infrastructure just aren’t
right for it
Your opportunity: fix a few of them by doing the obvious
things
And you’ll instantly be famous!
Mobile Trends
Nomadicity: increasingly powerful nomadic
devices
Anticipate fusion of web browser, telephone and
also PDA functionality
Some devices of this sort already exist – but they
remain primitive
Low bandwidth interaction a big obstacle right now
– you can’t talk to it, but typing without a keyboard
is a pain
Mobile trends
Communications standards
We already are seeing widespread use of
wireless ethernet cards
Bluetooth is the next big step: widespread
low-power connectivity for small devices
XML helps: data objects are readily
understood… fewer proprietary standards
Mobile trends
Power conservation
Also better understood
Flexibility: compute faster or slower, move
code or data, sleep or run more actively
Signal strength also a factor
Mobile trends
Suggests a future in which
We’ll move from place to place with our
computing context
In a given setting, devices find the
appropriate local resources and can talk to
them
And device is smart about when to ship
code, when to ship data
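One very rough way to picture the "ship code or ship data" decision is to compare how many bytes each choice puts on the wireless link. The function below is a toy with an assumed cost model (bytes only; it ignores power, latency and signal strength), and its name and parameters are invented for illustration.

```python
def cheaper_to_ship(code_bytes, data_bytes, result_bytes):
    """Return "code" or "data": whichever puts fewer bytes on the link."""
    ship_code_cost = code_bytes + result_bytes   # send code to the data, get result back
    ship_data_cost = data_bytes                  # pull the data to the device instead
    return "code" if ship_code_cost < ship_data_cost else "data"


small_query_big_table = cheaper_to_ship(2_000, 5_000_000, 100)
big_program_small_input = cheaper_to_ship(2_000_000, 10_000, 100)
```

A small query over a large remote table favors shipping code; a large program with a small input favors shipping data.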
Mobile trends
But this also points to a missing link: exciting
research opportunity
How to do naming of objects in this new mobile
world?
User wants a single personalized name for resources and a
single name space
But we also need to share things
And how to organize or structure a nomadic or
wireless environment
Peer-to-peer and multi-peer opportunity will be
enormous
Illustrating…
A discontinuous development
From fixed infrastructure to mobile wireless one
High performance but power-aware
Fusion of previously independent technologies
(voice, web, email)
Stress on existing infrastructure
We tend to adapt the existing infrastructure to the
new setting
But a whole new approach may be needed
Driving…
New ideas in file systems
How should we do file systems for mobile and
wireless systems?
Communication
How should we do point to point and multicast for
wireless peer-to-peer or “ad-hoc” networks?
Is TCP the right protocol for a wireless connection
to a server?
The list goes on…
Dangers
It is easy to overreach
People tend to try to do 10 things all at the
same time…
Need to be incremental
Challenge?
Picking the right first step
The right infrastructure can enable just
about anything!
But we’re out of time…
Take-aways from this lecture series?
Business roles in eCommerce
Examples of existing sectors
Some thought about business role in developing
new technology-limited ventures
And some review of how technologies are
structured
Leading to an angle on how to identify big
emerging opportunity areas
What should I know?
If you want to remember just one thing…
Remember the French air traffic control project
Where the US project overreached and failed, the
French went slowly, tested like crazy, and built a
better system that really worked
Scalability and stability of technology is the key
Be French!
Also drink moderate amounts of good red wine
Visit http://www.fromages.com now and then
Remember that vision of the world as 100 people…