A Framework for Highly Available Services
Based on Group Communication
Alan Fekete, University of Sydney
Idit Keidar, MIT
1
Highly Available Services
• Availability through replication
• Dynamic set of servers
– For load-balancing, adding new servers
– For fault-tolerance, when servers fail / detach
• Clients connect to ‘abstract’ service
• Preemptive migration: client can be migrated
during an ongoing session / transaction
2
Inspired by
• Highly available Video-on-Demand (VoD)
[Anker, Keidar, Dolev ICDCS 99]
• Uses group communication
• Server written in 2500 lines of C++ code,
including all availability logic
• Use of group communication not obvious
3
Framework
• Servers store content units
– Partially replicated among servers
– Static: no updates to content
• Client-service interaction in sessions
• Service is stateful during session
– Service stores changing context for client
• Content served according to client context
4
Examples
• VoD
– Content unit = movie, partially replicated
– Context: location in movie, transmission rate,...
– Movie frames sent to client depend on context
– Client can use random access, which changes context
• Courseware
• Interactive queries
5
Design Goals
• Client request leads to appropriate response
• Availability in face of failures
– Need replication (can be partial)
– Need preemptive migration
• Availability for varying number of clients
– Need to vary number of servers
• Simple clients, flexible service
– Availability is servers’ responsibility
6
Possible Problems at Migration
• Lost request (sent to dead server)
– “Stale” context
– Irrelevant responses
• Lost response (sent by dying server)
• Duplicate response
Study which failure patterns cause problems,
and the costs of minimizing them
7
Our Solution: The Basics
• Framework, not service
• Configurable in several parameters
– Support for different policies
• Primary server assigned to session
• Preemptive migration to backup server
• Backups mirror session context
– Context freshness at backup is configurable
– 3 levels of freshness
8
Replication of Context Info:
Three levels of Freshness
• Unit database per content unit
– Replicated among all servers of content unit
– Periodic updates, frequency configurable
• Context reflecting all client requests
– at primary and 1st level backups
• Context reflecting server responses too
– at primary
9
Group Communication (GC)
• Processes organized in groups,
communication addressed to group
• Groups are dynamic (join, leave, crash,..)
• Groups can partition (“partitionable GC”)
10
GC Service Interface
Multicast:
Input send( group, message )
Output receive( message )
Membership:
Input join( group )
Input leave( group )
Output view( group, members, id )
– view id is increasing
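
To make the interface concrete, here is a toy in-process stand-in, not a real group communication system. It has no network, failures, or partitions. `ToyGroupCommunication` and `register` are hypothetical names, and the caller's identity is passed explicitly as `member`, which the interface on the slide leaves implicit.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

ReceiveFn = Callable[[str], None]
ViewFn = Callable[[str, Tuple[str, ...], int], None]


class ToyGroupCommunication:
    """Single-process imitation of send / receive / join / leave / view."""

    def __init__(self) -> None:
        self._members: Dict[str, List[str]] = defaultdict(list)
        self._view_id: Dict[str, int] = defaultdict(int)
        self._handlers: Dict[str, Tuple[ReceiveFn, ViewFn]] = {}

    def register(self, member: str, on_receive: ReceiveFn, on_view: ViewFn) -> None:
        # Output events (receive, view) are delivered through these callbacks.
        self._handlers[member] = (on_receive, on_view)

    def join(self, member: str, group: str) -> None:
        if member not in self._members[group]:
            self._members[group].append(member)
            self._install_view(group)

    def leave(self, member: str, group: str) -> None:
        if member in self._members[group]:
            self._members[group].remove(member)
            self._install_view(group)

    def send(self, group: str, message: str) -> None:
        # Multicast: every current member of the group receives the message.
        for member in list(self._members[group]):
            self._handlers[member][0](message)

    def _install_view(self, group: str) -> None:
        # View ids increase with every membership change in the group.
        self._view_id[group] += 1
        members = tuple(self._members[group])
        for member in members:
            self._handlers[member][1](group, members, self._view_id[group])
```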
11
Semantics: “Virtual Synchrony”
[Birman et al. 87]
• Group members that remain connected see
events in same order
– events: messages, views (totally ordered mcast)
• Framework for “state-machine” replication
with fault tolerance, local consistency
– Connected members go through same states
– New members get state transfer
• Used to replicate the unit database
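
A minimal sketch of the state-machine idea, assuming the GC layer already delivers events in the same order to all connected members. `UnitDatabaseReplica` and the event format `(kind, client, value)` are hypothetical, chosen only for illustration.

```python
from typing import Dict, Optional, Tuple

Event = Tuple[str, str, Optional[int]]  # (kind, client, value)


class UnitDatabaseReplica:
    """One server's copy of a unit database, updated only by ordered events."""

    def __init__(self, state: Optional[Dict[str, int]] = None) -> None:
        # A new member starts from a state transfer instead of an empty copy.
        self.sessions: Dict[str, int] = dict(state or {})

    def apply(self, event: Event) -> None:
        kind, client, value = event
        if kind == "update" and value is not None:
            self.sessions[client] = value
        elif kind == "end":
            self.sessions.pop(client, None)

    def snapshot(self) -> Dict[str, int]:
        # Copy handed to a joining member as its state transfer.
        return dict(self.sessions)
```

Replicas that apply the same events in the same order end in the same state, so a joining server only needs a snapshot plus the events delivered after it.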
12
Highly-Available Service:
Multicast Groups
[Figure: one service group; one content group per movie (Gladiator, Crouching, Spy Kids); several session groups]
13
Messages to Groups
14
Server Session Setup
• When start-session arrives, use local unit
database to choose primary and 1st level backups
• Primary and 1st level backups join session
group (thus creating it)
• Primary sends session group name to client,
serves content to client
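
The choice of primary and backups can be sketched as a deterministic function of the local unit database. The least-loaded policy and the name `choose_servers` are assumptions for illustration; the framework leaves the policy configurable.

```python
from typing import Dict, List, Tuple


def choose_servers(load: Dict[str, int], num_backups: int) -> Tuple[str, List[str]]:
    """Pick a primary and 1st level backups from per-server session counts.

    `load` stands in for what a server reads from its local unit database.
    Ties are broken by server name so every server picks the same result.
    """
    ranked = sorted(load, key=lambda server: (load[server], server))
    return ranked[0], ranked[1:1 + num_backups]
```

Because the function has no randomness and depends only on replicated data, servers holding identical unit databases need no extra agreement round.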
15
Migration
• Triggered by view in content group
• State transfer to new members
• Use local unit database to choose new
primary and backups per client
– Choose 1st level backup as primary if possible
– By virtual synchrony, same decision made at all connected servers
• Chosen primaries and backups join session
group, primary sends content, ...
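
The reassignment on a view change can be sketched as follows. This is one possible policy, not the authors' code: `reassign` is a hypothetical name, and the fallback to the first server in sorted order is an assumption.

```python
from typing import Iterable, List, Optional, Tuple


def reassign(
    view_members: Iterable[str],
    old_primary: str,
    old_backups: List[str],
    num_backups: int,
) -> Tuple[Optional[str], List[str]]:
    """Choose primary and 1st level backups for one client after a new view."""
    alive = sorted(view_members)
    if not alive:
        return None, []  # all replicas of the content unit are gone
    surviving_backups = [b for b in old_backups if b in alive]
    if old_primary in alive:
        primary = old_primary
    elif surviving_backups:
        # Prefer a 1st level backup: it has seen all client requests.
        primary = surviving_backups[0]
    else:
        primary = alive[0]  # context comes from the unit database only
    candidates = [b for b in surviving_backups if b != primary]
    candidates += [s for s in alive if s != primary and s not in old_backups]
    return primary, candidates[:num_backups]
```

Every server that installs the same view and holds the same unit database computes the same result.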
16
Configurable Parameters
• Number of replicas per content unit
• Number of 1st level backups
• Frequency of periodic updates
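
These three parameters could be grouped in a small configuration object. The field names are hypothetical. The check that backups number fewer than replicas is an assumption, based on the primary and its backups being chosen among the content unit's replicas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameworkConfig:
    replicas_per_content_unit: int
    first_level_backups: int
    update_period_seconds: float  # time between periodic updates

    def __post_init__(self) -> None:
        if self.replicas_per_content_unit < 1:
            raise ValueError("need at least one replica per content unit")
        # Assumption: primary plus backups must fit within the replicas.
        if not 0 <= self.first_level_backups < self.replicas_per_content_unit:
            raise ValueError("1st level backups must be fewer than replicas")
        if self.update_period_seconds <= 0:
            raise ValueError("update period must be positive")
```

More backups and a shorter update period lower the risk of losing context at migration, at the cost of more load.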
17
Availability Analysis:
Bad Scenarios
• Membership service not live, or does not
give servers consistent views
– Consistent migration decisions not made
– Can lead to no service or duplicate service
• All content unit replicas crash
– No service possible
• Context lost in migration
– Risk depends on configurable parameters
18
Conclusions
• High availability by replication of content
and context
• Group communication facilitates context
replication
• Risk vs. load tradeoff
– 3 levels of freshness
– Configurable freshness
19