Architectures

Architectural Styles (1)
• Software architecture: the logical organization of a distributed
system into software components.
• Important architectural styles for distributed systems:
– Layered architectures
– Object-based architectures
– Data-centered architectures
– Event-based architectures
2
Architectural Styles (2)
The basic idea for the layered style is simple: components are
organized in a layered fashion where a component at layer L_i is
allowed to call components at the underlying layer L_(i-1), but not
the other way around.
A key observation is that control generally flows from layer to
layer: requests go down the hierarchy whereas the results flow upward.
Figure 2-1. (a) The layered architectural style.
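The call discipline of the layered style can be illustrated with a minimal sketch. The three classes and their method names are hypothetical and chosen only for illustration; each layer holds a reference to the layer directly beneath it and to nothing else, so requests travel down and results travel back up.

```python
class DataLayer:
    """Lowest layer: stores records and calls nothing beneath it."""
    def __init__(self):
        self._rows = {"alice": 120}

    def read(self, key):
        return self._rows.get(key)


class ProcessingLayer:
    """Middle layer: may call only the data layer beneath it."""
    def __init__(self, data):
        self._data = data

    def balance_report(self, user):
        amount = self._data.read(user)  # the request goes down one layer
        if amount is None:
            return "unknown user"
        return f"{user} has {amount}"


class InterfaceLayer:
    """Top layer: may call only the processing layer beneath it."""
    def __init__(self, processing):
        self._processing = processing

    def handle(self, user):
        # The result flows back up and is formatted for display.
        return self._processing.balance_report(user).upper()


ui = InterfaceLayer(ProcessingLayer(DataLayer()))
print(ui.handle("alice"))
```

Note that no lower layer holds a reference to a higher one, which is what rules out calls "the other way around".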
3
Architectural Styles (3)
Each object corresponds to what we have defined as a component, and
these components are connected through a (remote) procedure call
mechanism. This software architecture matches the client-server
system architecture. The layered and object-based architectures
still form the most important styles for large software systems.
Figure 2-1. (b) The object-based architectural style.
4
Reminder
• IPC
– Pipes, TCP/IP, shared storage
• RPC
– CORBA
– RMI
– Web Services
– REST
5
Architectural Styles (4)
• Data-centered architectures revolve around the idea that processes
communicate through a common (passive or active) repository.
• It can be argued that for distributed systems these architectures
are as important as the layered and object-based architectures.
• For example, a wealth of networked applications have been developed
that rely on a shared distributed file system in which virtually all
communication takes place through files.
6
Architectural Styles (5)
In event-based architectures, processes essentially communicate
through the propagation of events, which optionally also carry data.
For distributed systems, event propagation has generally been
associated with what are known as publish/subscribe systems.
The basic idea is that processes publish events, after which the
middleware ensures that only those processes that subscribed to those
events will receive them. The main advantage of event-based systems
is that processes are loosely coupled. In principle, they need not
explicitly refer to each other.
Figure 2-2. (a) The event-based architectural style.
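The publish/subscribe idea can be sketched in a few lines. This is a single-process toy under stated assumptions, not real middleware: the EventBus class and the topic names are hypothetical. It shows that a publisher names only a topic, never a receiver, and that only subscribers to that topic are called.

```python
from collections import defaultdict


class EventBus:
    """Toy stand-in for the middleware: routes events to subscribers only."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, data=None):
        # Events optionally carry data; unsubscribed topics reach nobody.
        for callback in self._subscribers.get(topic, []):
            callback(topic, data)


bus = EventBus()
received = []
bus.subscribe("temperature", lambda t, d: received.append(("logger", t, d)))
bus.subscribe("door", lambda t, d: received.append(("alarm", t, d)))

bus.publish("temperature", 21.5)  # delivered to the logger only
bus.publish("humidity", 40)       # no subscriber, so nothing is delivered
print(received)
```

The publisher and the two subscribers never refer to each other, which is the loose coupling the slide describes.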
7
Architectural Styles (6)
Event-based architectures can be combined with data-centered
architectures, yielding what is also known as shared data spaces.
The essence of shared data spaces is that processes are now also
decoupled in time: they need not both be active when communication
takes place.
Furthermore, many shared data spaces use an SQL-like interface to the
shared repository, in the sense that data can be accessed using a
description rather than an explicit reference, as is the case with
files.
Figure 2-2. (b) The shared data-space architectural style.
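A shared data space can be sketched as a tuple store that is queried by description. The SharedDataSpace class below is a hypothetical, in-memory illustration; it assumes that None in a template acts as a wildcard, which is one common convention rather than something the slides specify.

```python
class SharedDataSpace:
    """Toy shared repository: tuples are found by template, not by name."""
    def __init__(self):
        self._tuples = []

    def write(self, *fields):
        self._tuples.append(fields)

    def read(self, *template):
        # A template field of None matches any value (assumed convention).
        for tup in self._tuples:
            if len(tup) == len(template) and all(
                want is None or want == have
                for want, have in zip(template, tup)
            ):
                return tup
        return None


def producer(space):
    space.write("job", 7, "pending")
    space.write("job", 8, "done")


space = SharedDataSpace()
producer(space)  # the producer has finished before any consumer runs

# Later, a consumer asks for "any pending job" without knowing its id.
found = space.read("job", None, "pending")
print(found)
```

Because the producer is no longer running when the consumer reads, the two are decoupled in time as well as in reference.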
8
ESB
• An Enterprise Service Bus, ESB, is an
application that gives access to other
applications and services. Its main task is to be
the messaging and integration backbone of an
enterprise.
9
System Architectures
• We now look at how many distributed systems are actually organized
by considering where software components are placed.
• Deciding on software components, their interaction, and their
placement leads to an instance of a software architecture, also
called a system architecture.
• Two types of system architecture:
– Centralized architecture
– Decentralized architecture
10
Centralized Architectures (1)
• In the basic client-server model, processes in a distributed system
are divided into two (possibly overlapping) groups.
– A server is a process implementing a specific service, for example,
a file system service or a database service.
– A client is a process that requests a service from a server by
sending it a request and subsequently waiting for the server's reply.
• This client-server interaction is also known as request-reply
behavior.
11
Centralized Architectures (2)
• Figure 2-3. General interaction between a
client and a server.
12
Centralized Architectures (3)
• Communication between a client and a server can be implemented by
means of a simple connectionless protocol when the underlying network
is fairly reliable, as in many local-area networks.
• When a client requests a service, it simply packages a message for
the server, identifying the service it wants, along with the necessary
input data. The message is then sent to the server.
• The server always waits for an incoming request, subsequently
processes it, and packages the results in a reply message that is then
sent to the client.
13
Centralized Architectures (4)
• Advantage of a connectionless protocol:
– It is efficient.
– As long as messages do not get lost or corrupted, the request/reply
protocol works fine.
• Making the protocol resistant to occasional transmission failures is
not trivial.
– Solution: the client must resend the request when no reply message
comes in.
– Problem: the client cannot detect whether the original request
message was lost or the transmission of the reply failed.
14
Centralized Architectures (5)
• If the reply was lost, then resending a request may result in
performing the operation twice.
• Examples:
– If the operation was "transfer $10,000 from my bank account," then it
would be better to report an error instead.
– If the operation was "tell me how much money I have left," then it
would be perfectly acceptable to resend the request.
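The two examples can be turned into a small client-side retry policy. The sketch below is an assumption-laden illustration: FlakyServer, LostReply, and the operation names are hypothetical, and a lost reply is modeled as an exception rather than a real timeout. Operations that can safely be repeated (often called idempotent) are resent; all others are reported as errors.

```python
class LostReply(Exception):
    """Models a timeout: the client cannot tell what the server did."""


class FlakyServer:
    """Executes every request it receives but loses its first reply."""
    def __init__(self):
        self.balance = 50_000
        self.executions = 0
        self._drop_next_reply = True

    def handle(self, op, amount=0):
        self.executions += 1
        if op == "transfer":
            self.balance -= amount
        if self._drop_next_reply:
            self._drop_next_reply = False
            raise LostReply(op)
        return self.balance


IDEMPOTENT = {"get_balance"}  # safe to repeat


def call(server, op, amount=0, retries=1):
    for _ in range(retries + 1):
        try:
            return server.handle(op, amount)
        except LostReply:
            if op not in IDEMPOTENT:
                # Resending might perform the operation twice.
                raise RuntimeError(f"{op}: no reply, outcome unknown")
    raise RuntimeError(f"{op}: no reply after {retries + 1} attempts")


query_server = FlakyServer()
balance = call(query_server, "get_balance")  # resent after the lost reply

transfer_server = FlakyServer()
try:
    call(transfer_server, "transfer", 10_000)
    outcome = "ok"
except RuntimeError:
    outcome = "error reported"
print(balance, outcome, transfer_server.executions)
```

In the transfer case the server did execute the operation once; the client simply cannot know that, which is why it reports an error instead of resending.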
15
Centralized Architectures (6)
• As an alternative, many client-server systems use a reliable
connection-oriented protocol.
• Although this solution is not appropriate in a local-area network
due to relatively low performance, it works perfectly well in
wide-area systems in which communication is inherently unreliable.
16
Centralized Architectures (7)
Example:
Virtually all Internet application protocols are based on
reliable TCP/IP connections.
In this case, whenever a client requests a service, it first sets
up a connection to the server before sending the request.
The server generally uses that same connection to send the
reply message, after which the connection is torn down.
Trouble: setting up and tearing down a connection is
relatively costly, especially when the request and reply
messages are small.
17
Application Layering (1)
Recall the layered architectural style. Applications are commonly
divided into three levels:
• The user-interface level
• The processing level
• The data level
18
Application Layering (2)
• Figure 2-4. The simplified organization of an
Internet search engine into three different
layers.
19
Multitiered Architectures (1)
• The simplest organization is to have only two types
of machines:
• A client machine containing only the programs
implementing (part of) the user-interface level
• A server machine containing the rest,
– the programs implementing the processing and
data level
20
Multitiered Architectures (2)
• Figure 2-5. Alternative client-server
organizations (a)–(e).
21
Multitiered Architectures (3)
• Figure 2-6. An example of a server acting as
client.
22
Structured Peer-to-Peer Architectures (1)
• Figure 2-7. The
mapping of data
items onto nodes
23
Chord protocol
• A consistent hash function assigns each node and key an m-bit
identifier, using SHA-1 as the base hash function.
• A node’s identifier is obtained by hashing its IP address.
• Identifiers are ordered on an identifier circle modulo 2^m, called
the Chord ring.
• successor(k) = the first node whose identifier is equal to or follows
k in the identifier space.
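The identifier assignment and the successor rule can be sketched as follows. Two assumptions are made here and are not stated on the slides: the SHA-1 digest is reduced to m bits by taking it modulo 2^m, and the node set is known globally (a real Chord node only knows part of the ring). The node identifiers reuse the m = 6 example ring from the slides.

```python
import hashlib
from bisect import bisect_left

M = 6  # identifier length in bits


def chord_id(name, m=M):
    """Hash a node address or key to an m-bit identifier (assumed: mod 2^m)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(k, node_ids):
    """First node whose identifier is equal to or follows k on the circle."""
    ring = sorted(node_ids)
    i = bisect_left(ring, k)     # index of the first identifier >= k
    return ring[i % len(ring)]   # wrap around to the smallest identifier


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
print(successor(45, nodes))  # key 45 is stored at node 48
print(successor(60, nodes))  # wraps around the circle to node 1
print(chord_id("10.0.0.1"))  # some identifier in the range 0..63
```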
24
Chord Protocol
• Assumes communication in underlying
network is both symmetric and transitive.
• Assigns keys to nodes with consistent hashing
• Hash function balances the load
• When the Nth node joins or leaves, only an O(1/N) fraction of the
keys is moved.
25
Chord protocol
m=6
10 nodes
26
Theorem
• For any set of N nodes and K keys, with high probability:
1. Each node is responsible for at most (1+ε)K/N keys.
2. When an (N+1)st node joins or leaves the network, responsibility
for O(K/N) keys changes hands.
• Here ε = O(log N).
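The second claim can be made concrete on the small example ring. The sketch below assumes global knowledge of the node set and uses every identifier 0..63 as a key, purely for illustration; it lists the keys whose responsible node changes when a node joins or leaves.

```python
from bisect import bisect_left


def successor(k, node_ids):
    """First node whose identifier is equal to or follows k on the circle."""
    ring = sorted(node_ids)
    return ring[bisect_left(ring, k) % len(ring)]


def moved_keys(keys, before, after):
    """Keys whose responsible node differs between two node sets."""
    return [k for k in keys if successor(k, before) != successor(k, after)]


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
keys = range(64)

# A node with identifier 45 joins: only keys in (42, 45] change hands.
joined = moved_keys(keys, nodes, nodes + [45])
print(joined)

# Node 21 leaves: only its own keys, those in (14, 21], move to node 32.
left = moved_keys(keys, nodes, [n for n in nodes if n != 21])
print(left)
```

In both cases the affected keys belong to a single arc of the circle; all other nodes keep exactly the keys they had.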
27
Simple Key Location Scheme
• Figure: identifier circle with nodes N1, N8, N14, N21, N32, N38,
N42, and N48, and key K45.
28
Scalable Lookup Scheme
• Figure: identifier circle with nodes N1, N8, N14, N21, N32, N38,
N42, N48, N51, and N56, showing the fingers of N8.
Finger Table for N8
N8+1   N14
N8+2   N14
N8+4   N14
N8+8   N21
N8+16  N32
N8+32  N42
finger[k] = first node that succeeds (n + 2^(k-1)) mod 2^m
29
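The finger-table rule can be checked against the table for N8 with a short sketch. It assumes global knowledge of the ten node identifiers of the m = 6 example, which a real node would not have; the function names are illustrative.

```python
from bisect import bisect_left

M = 6
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]


def successor(k, ring):
    """First node whose identifier is equal to or follows k on the circle."""
    ring = sorted(ring)
    return ring[bisect_left(ring, k) % len(ring)]


def finger_table(n, ring, m=M):
    """Rows (k, start, node) with start = (n + 2^(k-1)) mod 2^m."""
    rows = []
    for k in range(1, m + 1):
        start = (n + 2 ** (k - 1)) % 2 ** m
        rows.append((k, start, successor(start, ring)))
    return rows


table = finger_table(8, NODES)
for k, start, node in table:
    print(f"finger {k}: start {start:2d} -> N{node}")
```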
Scalable Lookup Scheme
// ask node n to find the successor of id
n.find_successor(id)
  if (id ∈ (n, successor])
    return successor;
  else
    n0 = closest_preceding_node(id);
    return n0.find_successor(id);
// search the local table for the highest predecessor of id
n.closest_preceding_node(id)
  for i = m downto 1
    if (finger[i] ∈ (n, id))
      return finger[i];
  return n;
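The pseudocode can be exercised with a small in-memory simulation. This is a sketch under stated assumptions: finger tables are filled in from global knowledge by a hypothetical build_ring helper, remote calls are plain method calls, and the optional path argument is added only to record which nodes are visited.

```python
M = 6


def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # the interval wraps past zero


def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return x == b or in_open(x, a, b)


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.finger = []  # finger[i - 1] holds the i-th finger

    def find_successor(self, ident, path=None):
        if path is not None:
            path.append(self.id)  # record the nodes that are contacted
        if in_half_open(ident, self.id, self.successor.id):
            return self.successor
        n0 = self.closest_preceding_node(ident)
        return n0.find_successor(ident, path)

    def closest_preceding_node(self, ident):
        for f in reversed(self.finger):  # i = m downto 1
            if in_open(f.id, self.id, ident):
                return f
        return self


def build_ring(ids, m=M):
    """Hypothetical helper: builds correct finger tables from global knowledge."""
    ids = sorted(ids)
    nodes = {i: Node(i) for i in ids}

    def succ_of(k):
        return nodes[next((i for i in ids if i >= k), ids[0])]

    for n in nodes.values():
        n.finger = [succ_of((n.id + 2 ** (k - 1)) % 2 ** m)
                    for k in range(1, m + 1)]
        n.successor = n.finger[0]
    return nodes


nodes = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
path = []
owner = nodes[8].find_successor(54, path)
print(owner.id, path)
```

Starting at N8, the query for identifier 54 visits N8, N42, and N51 before N51 returns its successor N56.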
30
Lookup Using Finger Table
• Figure: lookup(54) on the identifier circle with nodes N1, N8, N14,
N21, N32, N38, N42, N48, N51, and N56.
31
Scalable Lookup Scheme
• Each node forwards the query at least halfway along the remaining
distance to the target
• Theorem: With high probability, the number
of nodes that must be contacted to find a
successor in an N-node network is O(log N)
32
Dynamic Operations and Failures
Need to deal with:
– Node Joins and Stabilization
– Impact of Node Joins on Lookups
– Failure and Replication
– Voluntary Node Departures
33
Node Joins and Stabilization
• Node’s successor pointer should be up to date
– For correctly executing lookups
• Each node periodically runs a “Stabilization”
Protocol
– Updates finger tables and successor pointers
34
Node Joins and Stabilization
• The stabilization scheme consists of six functions:
– create()
– join()
– stabilize()
– notify()
– fix_fingers()
– check_predecessor()
35
Create()
• Creates a new Chord ring
n.create()
  predecessor = nil;
  successor = n;
36
Join()
• Asks m to find the immediate successor of n.
• Doesn’t make rest of the network aware of n.
n.join(m)
  predecessor = nil;
  successor = m.find_successor(n);
37
Stabilize()
• Called periodically to learn about new nodes
• Asks n’s immediate successor about successor’s predecessor p
– Checks whether p should be n’s successor instead
– Also notifies n’s successor about n’s existence, so that successor may
change its predecessor to n, if necessary
n.stabilize()
  x = successor.predecessor;
  if (x ∈ (n, successor))
    successor = x;
  successor.notify(n);
38
Notify()
• m thinks it might be n’s predecessor
n.notify(m)
  if (predecessor is nil or m ∈ (predecessor, n))
    predecessor = m;
39
Fix_fingers()
• Periodically called to make sure that finger table entries are
correct
– New nodes initialize their finger tables
– Existing nodes incorporate new nodes into their finger tables
n.fix_fingers()
  next = next + 1;
  if (next > m)
    next = 1;
  finger[next] = find_successor(n + 2^(next-1));
40
Check_predecessor()
• Periodically called to check whether
predecessor has failed
– If yes, it clears the predecessor pointer, which can
then be modified by notify()
n.check_predecessor()
  if (predecessor has failed)
    predecessor = nil;
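The six functions can be combined into a single-process simulation. The sketch makes several assumptions that are not in the slides: remote calls are plain method calls, the successor is kept in a field separate from the finger table, find_successor falls back to the successor when no finger qualifies (so lookups still work with stale fingers), and failure detection is a simple alive flag. Nodes join one at a time through the first node, with a few stabilization rounds after each join.

```python
M = 6  # identifier bits, matching the m = 6 example


def between(x, a, b, inclusive_right=False):
    """Circular interval test: x in (a, b), or (a, b] if inclusive_right."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # the interval wraps past zero


class Node:
    def __init__(self, ident):
        self.id = ident
        self.alive = True
        self.predecessor = None
        self.successor = self
        self.finger = [self] * M  # finger[i - 1] is the slides' finger[i]
        self.next = 0

    def create(self):
        self.predecessor = None
        self.successor = self

    def join(self, known):
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def find_successor(self, ident):
        if between(ident, self.id, self.successor.id, inclusive_right=True):
            return self.successor
        n0 = self.closest_preceding_node(ident)
        if n0 is self:  # assumption: stale fingers, so use the successor
            n0 = self.successor
        return n0.find_successor(ident)

    def closest_preceding_node(self, ident):
        for f in reversed(self.finger):
            if between(f.id, self.id, ident):
                return f
        return self

    def stabilize(self):
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, m):
        if self.predecessor is None or between(m.id, self.predecessor.id, self.id):
            self.predecessor = m

    def fix_fingers(self):
        self.next += 1
        if self.next > M:
            self.next = 1
        start = (self.id + 2 ** (self.next - 1)) % 2 ** M
        self.finger[self.next - 1] = self.find_successor(start)

    def check_predecessor(self):
        if self.predecessor is not None and not self.predecessor.alive:
            self.predecessor = None


ids = [8, 1, 14, 21, 32, 38, 42, 48, 51, 56]
ring = [Node(ids[0])]
ring[0].create()
for ident in ids[1:]:
    node = Node(ident)
    node.join(ring[0])
    ring.append(node)
    for _ in range(3):  # interleave stabilization with joins
        for n in ring:
            n.stabilize()
for _ in range(M):      # refresh every finger of every node once
    for n in ring:
        n.fix_fingers()

cycle, n = [], ring[0]
while True:
    cycle.append(n.id)
    n = n.successor
    if n is ring[0]:
        break
print(cycle)
```

After the joins and stabilization rounds, following successor pointers from N8 visits every node once in identifier order, and N8's finger table matches the one shown for the example ring.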
41
Theorem 3
• If any sequence of join operations is executed
interleaved with stabilizations, then at some
time after the last join the successor pointers
will form a cycle on all nodes in the network
42
Stabilization Protocol
• Guarantees that nodes are added in a way that preserves
reachability
• By itself, it won’t correct a Chord system that has
split into multiple disjoint cycles, or a single
cycle that loops multiple times around the
identifier space
43
Impact of Node Joins on Lookups
• Correctness
– If finger table entries are reasonably current
• Lookup finds the correct successor in O(log N) steps
– If successor pointers are correct but finger tables
are incorrect
• Correct lookup but slower
– If incorrect successor pointers
• Lookup may fail
44
Impact of Node Joins on Lookups
• Performance
– If stabilization is complete
• Lookup can be done in O(log N) time
– If stabilization is not complete
• Existing nodes’ finger tables may not reflect the new nodes
– Doesn’t significantly affect lookup speed
• Newly joined nodes can affect the lookup speed if the new nodes’
IDs are in between the target and the target’s predecessor
– Lookup will have to be forwarded through the intervening nodes,
one at a time
45
Theorem 4
• If we take a stable network with N nodes with
correct finger pointers, and another set of up
to N nodes joins the network, and all
successor pointers (but perhaps not all finger
pointers) are correct, then lookups will still
take O(log N) time with high probability
46
Failure and Replication
• Correctness of the protocol relies on each node
knowing its correct successor
• To improve robustness
– Each node maintains a successor list of r nodes
– This can be handled using a modified version of the
stabilize procedure
– Also helps higher-layer software to replicate data
47
Theorem 5
• If we use a successor list of length r = O(log N) in
a network that is initially stable, and then
every node fails with probability ½, then with
high probability find_successor returns the
closest living successor to the query key
48
Theorem 6
• In a network that is initially stable, if every
node fails with probability ½, then the
expected time to execute find_successor is
O(log N)
49
Voluntary Node Departures
• Can be treated as node failures
• Two possible enhancements
– A leaving node may transfer all its keys to its
successor
– A leaving node may notify its predecessor and
successor about each other so that they can
update their links
50
Structured Peer-to-Peer Architectures (2)
• Figure 2-8. (a) The
mapping of data items
onto nodes in CAN.
51
From (0.2,0.3) to (0.9,0.6)?
• There are several possibilities, but if we want
to follow the shortest path according to a
Euclidean distance, we should follow the route
(0.2,0.3) →(0.6,0.7) →(0.9,0.6), which has a
distance of 0.882.
• The alternative route (0.2,0.3) → (0.7,0.2) →
(0.9,0.6) has a distance of 0.957.
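The two route lengths can be verified by summing the Euclidean distance of each hop. The helper name route_length is illustrative; math.dist requires Python 3.8 or later.

```python
from math import dist


def route_length(points):
    """Sum of Euclidean distances between consecutive points on a route."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))


shortest = route_length([(0.2, 0.3), (0.6, 0.7), (0.9, 0.6)])
alternative = route_length([(0.2, 0.3), (0.7, 0.2), (0.9, 0.6)])
print(f"{shortest:.3f} {alternative:.3f}")
```

The first route is sqrt(0.32) + sqrt(0.10) and the second is sqrt(0.26) + sqrt(0.20), which round to 0.882 and 0.957.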
52
Structured Peer-to-Peer Architectures (3)
• Figure 2-8. (b)
Splitting a region
when a node joins.
53
Unstructured Peer-to-Peer Architectures (1)
• Figure 2-9. (a) The steps taken by the active thread.
54
Unstructured Peer-to-Peer Architectures (2)
• Figure 2-9. (b) The steps taken by the passive thread.
55
Topology Management of Overlay Networks (1)
• Figure 2-10. A two-layered approach for constructing
and maintaining specific overlay topologies using
techniques from unstructured peer-to-peer systems.
56
Topology Management of Overlay Networks (2)
• Figure 2-11. Generating a specific overlay network using a
two-layered unstructured peer-to-peer system [adapted with
permission from Jelasity and Babaoglu (2005)].
57
Superpeers
• Figure 2-12. A hierarchical organization of
nodes into a superpeer network.
58
Edge-Server Systems
• Figure 2-13. Viewing the Internet as consisting
of a collection of edge servers.
59
Collaborative Distributed Systems (1)
• Figure 2-14. The principal working of BitTorrent
[adapted with permission from Pouwelse et al. (2004)].
60
Collaborative Distributed Systems (2)
Components of Globule collaborative content
distribution network:
• A component that can redirect client requests to
other servers.
• A component for analyzing access patterns.
• A component for managing the replication of Web
pages.
61
Interceptors
• Figure 2-15. Using interceptors to handle
remote-object invocations.
62
General Approaches to Adaptive Software
Three basic approaches to adaptive software:
• Separation of concerns
• Computational reflection
• Component-based design
63
The Feedback Control Model
• Figure 2-16. The logical organization of a
feedback control system.
64
Example: Systems Monitoring
with Astrolabe
• Figure 2-17. Data collection and information
aggregation in Astrolabe.
65
Example: Differentiating Replication Strategies
in Globule (1)
• Figure 2-18. The edge-server model assumed by Globule.
66
Example: Differentiating Replication Strategies
in Globule (2)
• Figure 2-19. The dependency between
prediction accuracy and trace length.
67
Example: Automatic Component Repair
Management in Jade
• Steps required in a repair procedure:
– Terminate every binding between a component on a
nonfaulty node and a component on the node that just
failed.
– Request the node manager to start and add a new node
to the domain.
– Configure the new node with exactly the same
components as those on the crashed node.
– Re-establish all the bindings that were previously
terminated.
68