Transcript SCRIBE
SCRIBE
A large-scale and decentralized application-level
multicast infrastructure
Overview
Pastry
PAST
distributed
file system layered on top of Pastry
SCRIBE
decentralized
publish/subscribe system
Pastry – Quick Review
Chord like routing
Consistent
hashing
Prefix routing
Leaf set
Pastry – locality properties
Short routes
Total distance traveled
Average dist 1.59 to 2.2
times actual dist
Route convergence
Dist Traveled by 2
messages sent to same
key
Equal to dist between to
nodes before routes
converge
Pastry API
nodeID = pastryInit(Credentials)
Causes
node to join pastry network
route(msg,key)
send(msg,IP-addr)
Applications must export:
deliver(msg,key)
forward(msg,key,nextID)
newLeafs(leafSet)
SCRIBE
Built on top of Pastry
Support large number of groups
Handle a high rate of membership turnover
SCRIBE nodes can:
Create
groups
Join groups
Multicast messages to groups
SCRIBE API
create(credentials, groupID)
join(credentials, groupID)
leave(credentials, groupID)
multicast(credentials, groupID, message)
SCRIBE – Creating a Group
Pastry route(msg, key)
SCRIBE route(CREATE, groupID)
groupID => hash textual name cat creator name
Message delivered to closest key which become
rendez-vous point for the group (root of multicast tree
for group)
Adds to local list of groups
Stores credentials
Alternative use itself as root
good choice if creator sends to group often
SCRIBE – Joining a Group
Pastry route(msg, key)
SCRIBE route(JOIN, groupID)
routed
to rendez-vous point
along the way multicast tree formed
SCRIBE – Leaving a Group
Remove from local group children list
If list becomes empty forward to parent
Part of the multicast tree may be removed
SCRIBE – Sending a multicast message
route(MULTICAST, groupID)
ask
for rendez-vous IP address
If rendez-vous fails re-request rendez-vous point
Pastry
handles node duplication
All messages are sent through the rendez-vous point
SCRIBE – Repairing the Multicast Tree
Messages are delivered only in best-effort
may
be out of order delivery
Periodic heartbeat message sent to all children
Child rejoins the tree through sending a new JOIN
message if suspects parent has failed
Can repair rendez-vous point
Pastry
handles node duplication in leaf nodes
Children nodes JOIN new root when missing heartbeat
is detected
SCRIBE – Forming a Multicast Tree
Rendez-vous point (root)
Forwarders
may
or may not be members of the group
maintain a children table (IP and nodeID) for group
SCRIBE - Strengths
Pastry handles root duplication
Rendez-vous point does not handle all join requests
Locality properties of Pastry
short
routes
delay
route
from rendez-vous point to member is short
convergence
load
imposed on physical network is small
SCRIBE – Experimental Evalutation
Simulation experimental results
Focus on three metrics:
delay
to deliver events to group members
stress on each node
stress on each physical network link
SCRIBE – Simulator Evaluation
5050 routers and 100,000 end nodes
1,500 groups of different sizes
10 different runs using same parameters but
different random seeds
Averaged all results
Compared results with IP multicast
SCRIBE – Delay Penalty
RMD – ratio between max delay using SCRIBE &
max delay using IP multicast
RAD – ratio between average delay using SCRIBE
& average delay using IP multicast
SCRIBE – Node Stress
Average node is responsible for forwarding a small
number of multicast messages
SCRIBE – Link Stress
Total num links = 1,035,295
SCRIBE = 2,489,824 messages (mean 2.4)
IP multicast = 758,853 messages (mean 0.7)
SCRIBE – Bottleneck remover
Bottlenecks
Low capacity nodes
High capacity nodes with extremely high children entries
Drop children if over capacity
Select child to drop and send message with children table
Child chooses new parent node and sends JOIN message
Result
Removes long tail in node stress graph
Increases average link stress
SCRIBE – Scalability Small Groups
50,000 nodes
30,000 groups 11 members each group
SCRIBE performs poorly for large number small
groups
SCRIBE collapse
Removes
long paths
removing
nodes that are not members of a group & have
only one entry in their children table
SCRIBE – Scalability Small Groups
Average link stress 6.1 to 3.3
Average number of children 21.2 to 8.5
SCRIBE - Conclusion
Fully decentralized
Support large number of groups
Support large group size
Multiple multicast sources per group
QUESTIONS