Lecture #1 - Wayne State University

Challenge Issues in Distributed Systems
ECE7650
Challenge Issues
1-1
Glory of the Internet
 1960s: queuing theory and packet-switching principles (ARPANET)
 1970s: proprietary networks (Ethernet) and internetworking (TCP/IP)
 1980s: network protocols (SMTP, FTP, etc.); new networks such as NSFNET and BITNET
 1990s: killer applications (Web and e-commerce); commercialization
 2000s: applications blooming (P2P, VoIP, social networks, cloud and storage services, etc.)
Reasons for Internet Success
 Cerf and Kahn’s internetworking principles (1974)
  • minimalism, autonomy: no internal changes required to interconnect networks
  • best-effort service model
  • stateless routers
  • decentralized control
 The design philosophy is to keep it simple
 A key architectural feature is the “narrow-waisted hourglass model”, with a small, well-defined interface at the mid-level
Assumptions
 Stationary hosts in a wired network
  • Each host is assigned a topologically dependent IP address
  • Routing is based on the IP address
  • But mobile and wireless communication has become pervasive
 Friendly environment
  • Hosts trust each other; little concern for security and privacy
  • TCP/IP is non-secure by design
  • The “identity assumption” is no longer valid: accountability problem
 Small scale and uniform edge
  • Grew out of the early, small-scale ARPANET experience
  • No one could imagine today’s hundreds of millions of hosts, billions of cellular phones ready to be plugged in, sensor networks, etc.
Assumptions (cont’)
 Simple applications
  • Alternative reliable communication infrastructure based on Cerf and Kahn’s principles and the “narrow-waisted hourglass model”
  • Clearly defined applications are supported by a well-defined functional interface
 Good will and cooperation
  • Best effort, store-and-forward, autonomous, distributed decisions in intra-domain as well as inter-domain (BGP) routing
  • Reality is a battlefield with multiple players; competition and economic incentives must be taken into account
Ad Hoc Work-Around
 To accommodate mobile hosts
  • Mobile IP, but
    • IP addr corresponding host and triangle routing is inefficient
    • TCP hides the high delay and loss rate of wireless networks by treating them as congestion
 Hostile environment
  • Firewall, but
    • Violates the end-to-end argument; application designers must take the possibility of a firewall into account
  • IPsec? How can you prevent attacks or harassment by unsolicited traffic?
 Large scale, diversified edge
  • NAT relieves the address shortage; routers should process only up to layer 3, but a NAT router needs to process layer 4
Ad Hoc Work-Around (cont’)
 Meet various application requirements
  • QoS-aware routers: IntServ, DiffServ
  • RSVP, etc.
  • Hard to deploy widely (all routers along the path)
 Non-cooperative, competitive
  • Service Level Agreement (SLA) enforcement?
  • Big BGP problems in inter-domain routing: a single mistyped command at a router at one ISP has disrupted connectivity across many neighbors
 Economic incentive?
  • Hard to reach consensus between competitors; standardization can sometimes mean losing a market advantage
Application-Level Mitigation
TCP/UDP/IP: “best-effort service”
 no guarantees on delay, loss
But multimedia apps require QoS and a level of performance to be effective!
Today’s Internet multimedia applications use application-level techniques to mitigate (as best possible) the effects of delay and loss
Any problem in computer science can be solved with
another layer of indirection [except the problem of too
many layers of indirection] (David Wheeler, PhD’51)
Distributed systems layer
[Figure: the protocol stack (application, transport, network, link, physical) shown alongside the system layers (Distributed Apps, Middleware Service, OS/Net module, NIC/Driver)]
Middleware Service:
 Communication: sync vs async comm, group comm, reliable comm, transactional comm, latency tolerance, etc.
 Coordination
What is a Distributed System
 A system in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages. [CDK]
  • Autonomous: independent failures
  • Concurrent program execution is the norm
  • No global clock: coordination by exchanging messages
 Examples
  • Basic Internet services such as the Web, email, FTP
  • Streaming apps (audio, video)
  • P2P file sharing (BitTorrent)
  • Cloud computing and storage services
Middleware
 Computer software that connects software components, or people and their applications. It consists of a set of services that allows multiple processes running on one or more machines to interact.
  • Together, the services define a uniform computing model for use by the programmers of servers and distributed apps
Challenge Issues
 Heterogeneity
 Heterogeneous components must be able to interoperate
 Distribution transparency
 Distribution should be hidden from the user as much as
possible
 Fault tolerance
 Failure of a component (partial failure) should not result in
failure of the whole system
 Scalability
 System should work efficiently with an increasing number of
users
 System performance should increase with inclusion of
additional resources.
Challenge Issues (cont’)
 Concurrency
 Shared access to resources must be possible
 Openness
 Interfaces should be publicly available to ease
adding new components
 Security
 The system should only be used in the way
intended
Heterogeneity
 Variety of computers in a DS
  • Networks, computer hardware, OS, programming languages, various implementations, etc.
  • E.g. network protocols, data types
 Middleware is a software layer that provides a programming abstraction and masks the heterogeneity.
  • CORBA and Java RMI are examples
 The virtual machine approach provides a way of making code executable on any hardware, e.g. the JVM
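
To make the "data types" point concrete, here is a minimal sketch of how a middleware layer might mask differences in byte order between machines by marshalling values into one agreed wire format. It assumes Python's standard struct module; the record layout and the names WIRE_FORMAT, marshal, and unmarshal are hypothetical and chosen only for illustration.

```python
import struct

# Hypothetical record layout, assumed for illustration only:
# a 32-bit unsigned sensor id followed by a 64-bit IEEE 754 reading.
# "!" selects network (big-endian) byte order with no padding.
WIRE_FORMAT = "!Id"


def marshal(sensor_id, reading):
    """Convert native values into an agreed, machine-independent byte layout."""
    return struct.pack(WIRE_FORMAT, sensor_id, reading)


def unmarshal(payload):
    """Convert the wire bytes back into native values on the receiver."""
    sensor_id, reading = struct.unpack(WIRE_FORMAT, payload)
    return sensor_id, reading


message = marshal(7, 21.5)
print(len(message))        # 12 bytes: 4 for the id, 8 for the reading
print(unmarshal(message))  # (7, 21.5)
```

Because both sides agree on the wire format, the sender and receiver can run on hardware with different native byte orders without the application code noticing.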
Openness
 Characteristic that determines whether the system can be extended or re-implemented in various ways without disruption to or duplication of existing services.
  • Hardware extension: peripherals, memory, network interfaces
  • Software extension: OS features, communication protocols, resource-sharing services
    • e.g. Unix utilities, browser protocols and handlers
 Key interfaces are published or standardized (ISO, IEEE, etc.); industry de facto standards bypass cumbersome official standardization procedures
 Any component implementation must conform to the published standard.
Openness: Unix
 Openness is achieved by specifying and documenting the key software interfaces
 Unix features are fully accessible through system calls
  • add drivers
  • develop applications
  • include new features: IPC
 Linux: the kernel is open too!

Openness: Web Browser
 Openness is achieved through a set of helpers or content handlers (plug-ins)
 Different data formats are decoded using different tools
  • E.g. .html/.gif/.jpeg/.pdf
 Built-in content handler: extensible?
 Built-in protocol handler: extensible?
  • A protocol is a set of communication rules
Transparency
 Concealment from the user and the application programmer of the separation of components, so that the system is perceived as a coherent whole
 Eight forms of transparency (ANSA’89, ISO’92)
  • Access transparency: enables local and remote resources to be accessed using identical operations
  • Location transparency: enables resources to be accessed without knowledge of their location
  • Concurrency transparency: enables several processes to operate concurrently using shared resources without interference between them
  • Replication transparency: enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers
Transparency (cont’)
 Eight forms of transparency (cont’)
  • Failure transparency: enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components (e.g. email delivery)
    • Middleware generally converts the failures of networks and processes into programming-level exceptions
  • Mobility transparency: allows the movement of resources and clients within a system without affecting the operation of users or programs
  • Performance transparency: allows the system to be reconfigured to improve performance as loads vary
  • Scaling transparency: allows the system and applications to expand in scale without change to the system structure or the application algorithms.
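
The remark that middleware converts network and process failures into programming-level exceptions can be sketched roughly as follows. This is an illustrative assumption, not any particular middleware's API: RemoteServiceError, invoke, and broken_transport are hypothetical names, and the transport is a plain callable standing in for a real socket exchange.

```python
class RemoteServiceError(Exception):
    """Hypothetical programming-level exception exposed to application code."""


def invoke(transport, request):
    """Call a remote operation, translating low-level failures."""
    try:
        return transport(request)
    except (ConnectionError, TimeoutError) as exc:
        # Low-level network failures are mapped to one application-level type
        raise RemoteServiceError(f"remote call failed: {exc}") from exc


def broken_transport(request):
    # Stand-in for a socket send/receive that hits a crashed server
    raise ConnectionRefusedError("server not reachable")


try:
    invoke(broken_transport, b"ping")
except RemoteServiceError as err:
    print(err)  # remote call failed: server not reachable
```

The application handles a single exception type and never sees socket-level detail, which is one modest step toward failure transparency.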
Transparency
Access transparency
Location transparency
Mobility transparency
Failure transparency
Replication transparency
Concurrency transparency
Performance transparency
Scaling transparency
Network Transparency
Different forms of transparency in a distributed system.
Full transparency is too costly and, in some situations, impossible.
Scalability: High Perf./Availability
 Distributed systems operate effectively and efficiently at different scales of resources and users
  • Size, geographical location, administration
 Objectives:
  • Control the cost of physical resources. E.g. if a single file server can support 20 users, can two servers support 40?
  • Control the performance loss: can it stay independent of resource size?
  • Prevent software resources from running out.
    • E.g. 32-bit Internet addresses in IPv4 versus 128-bit Internet addresses in IPv6.
    • The cost of scalability can’t be ignored: overhead of a scalable machine (power, fans, ...)
    • Over-compensating for future growth may be worse than adapting to a change when we are forced to
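
As a quick check on the address-space example above, this small sketch uses Python's standard ipaddress module to compare the sizes of the IPv4 and IPv6 address spaces. The variable names are illustrative.

```python
import ipaddress

# The networks 0.0.0.0/0 and ::/0 cover the entire IPv4 and IPv6 spaces
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

print(ipv4_total)                         # 4294967296, i.e. 2**32
print(ipv6_total == 2**128)               # True
print(ipv6_total // ipv4_total == 2**96)  # True
```

A 32-bit field that looked generous for a small research network ran out at Internet scale, which is the kind of software resource limit the slide warns about.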
Scalability (cont’)
 Objectives (cont’)
  • Avoid performance bottlenecks
    • Centralized vs decentralized organization

Concept                  Example
Centralized services     A single server for all users
Centralized data         A single on-line telephone book
Centralized algorithms   Doing routing based on complete information
Scaling Techniques
 Hide communication latency
  • Asynchronous communication
 Distribution
 Naming
 Replication
  • Cache
 Consistency
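
A minimal sketch of how asynchronous communication can hide latency, assuming Python's asyncio. The function fake_request is hypothetical, and asyncio.sleep stands in for a network round trip, so no real I/O takes place.

```python
import asyncio
import time


async def fake_request(name, delay):
    # asyncio.sleep stands in for network latency (an assumption; no real I/O)
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # Issue all three requests before waiting for any reply
    return await asyncio.gather(
        fake_request("a", 0.2),
        fake_request("b", 0.2),
        fake_request("c", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results)                     # ['a done', 'b done', 'c done']
print(f"elapsed: {elapsed:.2f}s")  # close to 0.2s rather than 0.6s
```

The three waits overlap, so the total time is governed by the slowest request instead of the sum of all of them.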

Scaling Tech. for Interactive App
[Figure 1.4] The difference between letting (a) a server or (b) a client check forms as they are being filled in.
Scalable Naming
[Figure 1.5] An example of dividing the DNS name space into zones.
Concurrency: High Perf./Availability
 More than one client may want to access a shared resource at the same time; the requests need to be handled in parallel
 Server-side concurrency
  • Server-side operations: database/mining, CGI
  • Servers on single-CPU machines (interleaving):
    • multiprogramming
  • Servers on symmetric multiple-CPU machines
    • multiprogramming and multithreading
  • Servers on networks of workstations
    • Scalable server technology
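
The multithreaded server case can be sketched as below, assuming Python's threading module and a thread pool from concurrent.futures. The handler handle_request and the shared counter hit_count are hypothetical; the lock is what keeps concurrent access to the shared resource safe.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

hit_count = 0                 # shared resource updated by every request
hit_lock = threading.Lock()   # serializes updates to the shared counter


def handle_request(client_id):
    """Hypothetical request handler run by a worker thread."""
    global hit_count
    with hit_lock:
        hit_count += 1
    return f"served client {client_id}"


# Four worker threads serve 100 client requests concurrently
with ThreadPoolExecutor(max_workers=4) as pool:
    replies = list(pool.map(handle_request, range(100)))

print(hit_count)   # 100
print(replies[0])  # served client 0
```

Without the lock, the read-modify-write on the counter could interleave between threads and lose updates.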
Concurrency (cont.)
 Clients share load with server
  • Data compression/decompression
  • Data encryption/decryption
  • Input verification, decoration, calculation
    • Java applet or JavaScript
    • Client-side JavaScript allows “executable content” to be included in web pages.
 Do it in parallel!
Failure Handling for High Availability
 Hardware and software failures are common. The challenge is how to deal with them.
 Failures in a distributed system are often partial, which makes failure handling even harder.
 Service availability: a server’s ability to provide uninterrupted service over time, measured as the percentage of uptime
  • 99.9% availability corresponds to roughly 8 hours 45 minutes (8.76 hours) of downtime per year
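
The downtime figure above follows from simple arithmetic, sketched here in Python. The sketch assumes a 365-day year, and the function name downtime_hours_per_year is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent):
    """Hours per year a service is down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.0, 99.9, 99.99):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% availability -> {hours:.2f} hours of downtime per year")
```

For 99.9% this gives 8.76 hours, and each additional "nine" cuts the allowed downtime by a factor of ten.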
Failure Handling
 How to handle failures:
  • Failure detection:
    • A checksum is used to detect corrupted data in a message
    • How to detect a remote crashed server?
  • Failure masking, e.g. retransmit messages that are lost
  • Recovery from failure:
    • Software is designed so that the state of permanent data can be recovered or “rolled back” after a server has crashed.
  • Tolerate failure through the use of redundant components
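
A minimal sketch of the detection and masking ideas above, assuming a CRC-32 checksum from Python's zlib module and a hypothetical channel callable that returns the bytes as delivered. Here a corrupted message is treated like a lost one and simply resent. The names frame, unframe, send_with_retry, and flaky_channel_factory are illustrative and not from the lecture.

```python
import zlib


def frame(payload):
    """Append a CRC-32 checksum (4 bytes, big-endian) to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unframe(message):
    """Return the payload, or raise ValueError if the checksum does not match."""
    payload, received = message[:-4], int.from_bytes(message[-4:], "big")
    if zlib.crc32(payload) != received:
        raise ValueError("checksum mismatch")
    return payload


def send_with_retry(payload, channel, max_attempts=3):
    """Mask corruption by resending until a copy passes the checksum."""
    for _ in range(max_attempts):
        try:
            return unframe(channel(frame(payload)))
        except ValueError:
            continue  # corrupted copy detected: retransmit
    raise ConnectionError("gave up after repeated corruption")


def flaky_channel_factory():
    """Hypothetical channel that corrupts only the first message it carries."""
    calls = {"n": 0}

    def channel(message):
        calls["n"] += 1
        if calls["n"] == 1:
            return bytes([message[0] ^ 0xFF]) + message[1:]  # flip bits in byte 0
        return message

    return channel


print(send_with_retry(b"hello", flaky_channel_factory()))  # b'hello'
```

The caller sees only the successful result; the first, corrupted delivery is detected by the checksum and masked by the retransmission.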
Security
 Security is a primary concern in an open distributed system
 A secure system has three aspects:
  • Confidentiality (privacy): protection against disclosure to unauthorized individuals
  • Integrity: protection against alteration or corruption
  • Availability: protection against interference with the means to access the resources

Challenge Issues: In Summary
 Heterogeneity
 Distribution transparency
 Fault tolerance
 Scalability
 Concurrency
 Openness
 Security