View-synchronous group communication

Download Report

Transcript View-synchronous group communication

Distributed Systems Fall 2009
Replication
Outline
• Group communication
• Fault-tolerant services
– Passive and active replication
• Highly available services
• Summary
Fall 2009
5DV020
3
Group communication
• Static vs. Dynamic groups
• Primary partition vs. partitionable
groups
• Group management
– Interface for membership changes
– Failure detection
– Notification upon membership changes
– Provide group address expansion
Fall 2009
5DV020
4
Group views
• Views contain a set of members at
a given point in time
– Failed identified processes are not in
the view
• Events occur in views
• View-synchronous group
communication
– Allows based on view delivery, we can
know which messages must have been
delivered to other members
Fall 2009
5DV020
5
View-synchronous group
communication
• Correct processes deliver the same
set of messages in any given view
• Messages are delivered at most
once
• Correct processes always deliver
messages they send
– If delivering to q fails, the next view
excludes q
Fall 2009
5DV020
6
Why replication?
• Many algorithms require a working
server node
• Performance (load balancing)
• Increased availability
1 – p(all replicas crashed) = 1 – pn
• Fault-tolerance
– Correct servers in majority
Fall 2009
5DV020
7
Replication
• Replication transparency
– Client unaware of replication
• Problem with >1 client
– Concurrent access, rather than
exclusive
– Operations are interleaved
• How do we ensure correctness?
Fall 2009
5DV020
8
Correctness of interleavings
• Always
– Interleaved sequence of operations must meet
the specification of a single correct copy of the
object(s)
• Sequential consistency property
– Order of operations is consistent with the
program order in which each individual process
executed them
• Linearizability property
– Order of operations is consistent with the real
times at which the operations occurred during
execution
Fall 2009
5DV020
9
Example (interleaved operations)
• C1: A, B, C
• C2: d, e, f
• Order during execution:
A, B, d, C, e, f
• An interleaving with sequential
consistency:
A, B, d, e, f, C
• Interleaving with linearizability:
A, B, d, C, e, f
Fall 2009
5DV020
10
Generalized replication
1. Request: client makes request
2. Coordination: replica managers
decide upon order of request
3. Execution: request is executed
4. Agreement: replica managers
agree on result of execution
5. Response: response is sent back to
the client
Fall 2009
5DV020
11
Passive replication
• One Primary replica
manager, many backups
• If primary fails, backups
can take its place (election!)
• Implements linearizability if:
– A failing primary is replaced by a
unique backup
– Backups agree on which operations
had been performed when primary
crashed
• View-synchronous group communication!
Fall 2009
5DV020
12
Passive replication
1. Request: front end issues request with
unique ID
2. Coordination: primary checks if request
has been carried out, if so, returns
cached response
3. Execution: perform operation, cache
results
4. Agreement: primary sends updated state
to backups
5. Response: primary sends result to front
end, which forwards to the client
Fall 2009
5DV020
13
Active replication
• More distributed
• All replica managers carry out all
operations
• Requests to RM are totally ordered
• Front ends issue one request at a
time (FIFO)
• Implements sequential consistency
Fall 2009
5DV020
14
Active replication
1. Request: front end adds unique identifier
to request, mcasts it to RMs
2. Coordination: totally ordered request
delivery to RMs
3. Execution: each RM executes request
4. Agreement: not needed
5. Response: all RMs respond to front end,
front end interprets response and
forwards interpretation to client
Fall 2009
5DV020
15
Comparison (Active/Passive)
• Handling of crash failures?
– Both: yes (but differently)
• Handling of arbitrary failures?
– Active: yes, Passive: no
• Complexity?
• Optimizations?
– Send “reads” to backups in passive
• Lose linearizability property!
– Send “reads” to single backup in active
• Lose fault tolerance
Fall 2009
5DV020
16
Highly available services
• Goal is to allow clients to use
service for as long as possible
– Even if network connections are lost
– Even if results may be inconsistent
• Study this deeper on your own
– Basic questions about this have been
on exams before
Fall 2009
5DV020
17
Gossip
• Queries vs. updates
• Guarantees by Gossip
– Each client gets a consistent service
over time
• Replicas will provide data that is fresher
than what the client has seen so far
– Relaxed consistency between replicas
• Generally less than sequential consistency
• Eventually, all updates are applied (in
order), but clients may observe stale data
Gossip
• Forced (total and causal) or
immediate ordering
• Choice is up to application developer
Gossip steps
1. Request – blocking queries to a single
RM, updates perhaps to many RMs for
increased reliability
2. Update response – return immediately
on updates
3. Coordination – wait until request can be
processed according to ordering rules
4. Execution – execute request
5. Query response – reply to query
6. Agreement – update other RMs via
gossip messages occasionally
Gossip architecture
• Consider Figure 15.8 in the book © Pearson
Education 2005.
• Value timestamp: vector timestamp showing
which updates have been applied
• Update log:
• Record all updates, even if they cannot be applied
yet (not yet stable)
• Keep updates that have become stable so they can
be propagated to others
• Replica timestamp: updates that have been
accepted, but may not yet be stable
• Executed operation table: avoid repeated updates
• Timestamp table: timestamp for each RM, keep
track of applied updates in other RMs
Summary
• Group communication
– Views
– View-synchronous group
communication
• Replication
– Correctness
• Linearizability: time
• Sequential consistency: program order
– Passive and active replication schemes
Fall 2009
5DV020
22
Next lecture
• Transactions
– Nested transactions
• Concurrency control
– Locks
– Optimistic concurrency control
Fall 2009
5DV020
23