Transcript SCRIBE

SCRIBE
A large-scale and decentralized application-level
multicast infrastructure
Overview

Pastry

PAST
 distributed

file system layered on top of Pastry
SCRIBE
 decentralized
publish/subscribe system
Pastry – Quick Review

Chord like routing
 Consistent
hashing
 Prefix routing
 Leaf set
Pastry – locality properties

Short routes
Total distance traveled
 Average dist 1.59 to 2.2
times actual dist


Route convergence
Dist Traveled by 2
messages sent to same
key
 Equal to dist between to
nodes before routes
converge

Pastry API

nodeID = pastryInit(Credentials)
 Causes



node to join pastry network
route(msg,key)
send(msg,IP-addr)
Applications must export:
 deliver(msg,key)
 forward(msg,key,nextID)
 newLeafs(leafSet)
SCRIBE




Built on top of Pastry
Support large number of groups
Handle a high rate of membership turnover
SCRIBE nodes can:
 Create
groups
 Join groups
 Multicast messages to groups
SCRIBE API




create(credentials, groupID)
join(credentials, groupID)
leave(credentials, groupID)
multicast(credentials, groupID, message)
SCRIBE – Creating a Group




Pastry route(msg, key)
SCRIBE route(CREATE, groupID)
groupID => hash textual name cat creator name
Message delivered to closest key which become
rendez-vous point for the group (root of multicast tree
for group)
Adds to local list of groups
 Stores credentials


Alternative use itself as root

good choice if creator sends to group often
SCRIBE – Joining a Group


Pastry route(msg, key)
SCRIBE route(JOIN, groupID)
 routed
to rendez-vous point
 along the way multicast tree formed
SCRIBE – Leaving a Group



Remove from local group children list
If list becomes empty forward to parent
Part of the multicast tree may be removed
SCRIBE – Sending a multicast message

route(MULTICAST, groupID)
 ask

for rendez-vous IP address
If rendez-vous fails re-request rendez-vous point
 Pastry

handles node duplication
All messages are sent through the rendez-vous point
SCRIBE – Repairing the Multicast Tree

Messages are delivered only in best-effort
 may



be out of order delivery
Periodic heartbeat message sent to all children
Child rejoins the tree through sending a new JOIN
message if suspects parent has failed
Can repair rendez-vous point
 Pastry
handles node duplication in leaf nodes
 Children nodes JOIN new root when missing heartbeat
is detected
SCRIBE – Forming a Multicast Tree


Rendez-vous point (root)
Forwarders
 may
or may not be members of the group
 maintain a children table (IP and nodeID) for group
SCRIBE - Strengths



Pastry handles root duplication
Rendez-vous point does not handle all join requests
Locality properties of Pastry
 short
routes
 delay
 route
from rendez-vous point to member is short
convergence
 load
imposed on physical network is small
SCRIBE – Experimental Evalutation


Simulation experimental results
Focus on three metrics:
 delay
to deliver events to group members
 stress on each node
 stress on each physical network link
SCRIBE – Simulator Evaluation





5050 routers and 100,000 end nodes
1,500 groups of different sizes
10 different runs using same parameters but
different random seeds
Averaged all results
Compared results with IP multicast
SCRIBE – Delay Penalty



RMD – ratio between max delay using SCRIBE &
max delay using IP multicast
RAD – ratio between average delay using SCRIBE
& average delay using IP multicast
SCRIBE – Node Stress

Average node is responsible for forwarding a small
number of multicast messages
SCRIBE – Link Stress



Total num links = 1,035,295
SCRIBE = 2,489,824 messages (mean 2.4)
IP multicast = 758,853 messages (mean 0.7)
SCRIBE – Bottleneck remover

Bottlenecks
Low capacity nodes
 High capacity nodes with extremely high children entries


Drop children if over capacity
Select child to drop and send message with children table
 Child chooses new parent node and sends JOIN message


Result
Removes long tail in node stress graph
 Increases average link stress

SCRIBE – Scalability Small Groups




50,000 nodes
30,000 groups 11 members each group
SCRIBE performs poorly for large number small
groups
SCRIBE collapse
 Removes
long paths
 removing
nodes that are not members of a group & have
only one entry in their children table
SCRIBE – Scalability Small Groups


Average link stress 6.1 to 3.3
Average number of children 21.2 to 8.5
SCRIBE - Conclusion

Fully decentralized
Support large number of groups
Support large group size
Multiple multicast sources per group

QUESTIONS


