Transcript 111802-gisik

<1>
A Peer-to-Peer Approach to
Resource Discovery in Grid
Environments
(in HPDC’02, by U of Chicago)
Gisik Kwon
Nov. 18, 2002
Motivation
• The two general resource discovery systems, Grid and
P2P, will eventually converge on the same goal
– A large-scale, decentralized, and self-configuring system
with complex functionalities
• So a general guideline is needed for designing such a
resource discovery system
– Proposes 4 axes (components) that guide the design of
any resource discovery architecture
– Presents an emulated environment and a preliminary
performance evaluation with simulations
How
• 4 axes (components) to be considered (see the sketch after this list)
– Membership protocol
• How new nodes join and learn about the network
– Overlay construction function
• Selects the set of active collaborators from the local
membership list
– Preprocessing
• Off-line preparation for better search performance
• Pre-fetching qualifies (O); caching does not (X), since caching
happens at request time
• E.g., dissemination of resource descriptions
– Request processing
• Local processing
– Look up the requested resource in the local info., process
aggregated resources, ...
• Remote processing
– Request propagation rule
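A minimal Python sketch of how the four axes could be expressed as pluggable hooks of a node. The class and method names (ResourceDiscoveryNode, join, build_overlay, preprocess, handle_request) are illustrative assumptions, not the paper's code.

# Illustrative skeleton only: the four design axes as hooks of a
# hypothetical ResourceDiscoveryNode. Names are assumptions, not the paper's.
import random
from typing import Callable, Dict, List, Optional

class ResourceDiscoveryNode:
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.membership: List[str] = []      # locally known nodes
        self.neighbors: List[str] = []       # active collaborators (overlay)
        self.local_resources: Dict[str, dict] = {}

    # Axis 1: membership protocol -- how a node joins and learns the network
    def join(self, bootstrap_members: List[str]) -> None:
        self.membership.extend(m for m in bootstrap_members
                               if m not in self.membership)

    # Axis 2: overlay construction -- pick collaborators from the membership list
    def build_overlay(self, max_neighbors: int = 4) -> None:
        self.neighbors = random.sample(self.membership,
                                       min(max_neighbors, len(self.membership)))

    # Axis 3: preprocessing -- off-line work, e.g. pre-fetching descriptions
    def preprocess(self, fetch_descriptions: Callable[[str], dict]) -> None:
        self.remote_summaries = {n: fetch_descriptions(n) for n in self.neighbors}

    # Axis 4: request processing -- local lookup, then a propagation rule
    def handle_request(self, resource: str, ttl: int = 5) -> Optional[str]:
        if resource in self.local_resources:
            return self.node_id                 # local processing succeeded
        if ttl == 0 or not self.neighbors:
            return None
        return random.choice(self.neighbors)    # remote: forward (random walk)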
Evaluation
• Modeling the Grid environment (4 parameters)
– Resource info. distribution and density
• Some organizations share a large number of resources,
others just a few
• Some resources are common, others unique
– Resource info. dynamism
• Highly variable (CPU load) or static (type of CPU)
– Request distribution
• Pattern of users' requests (see the sketch after this list)
• E.g., Zipf distribution, uniform, ...
– Peer participation
• Varies over time more significantly in P2P than in Grid
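For concreteness, here is one way the request-distribution parameter could be modeled. The exponent, resource count, and function name are arbitrary assumptions for illustration, not values from the paper.

# Sketch: drawing user requests over N resources from a Zipf-like distribution
# (the skewed case) or uniformly. Parameters are illustrative assumptions.
import random

def request_stream(num_resources: int, num_requests: int,
                   skew: float = 1.0, uniform: bool = False):
    resources = list(range(num_resources))
    if uniform:
        weights = [1.0] * num_resources
    else:
        # Zipf: probability of the r-th most popular resource ~ 1 / r^skew
        weights = [1.0 / (rank + 1) ** skew for rank in range(num_resources)]
    return random.choices(resources, weights=weights, k=num_requests)

requests = request_stream(num_resources=1000, num_requests=10_000, skew=1.0)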
Evaluation
• Preliminary experimental setup
– Optimistic Grid model
• Static resource attributes, constant peer participation, no failures
– Passive membership protocol
• A new node learns about the network through out-of-band means
• When a new node makes contact, the membership list is enriched
– Overlay function accepts an unlimited number of
neighbors
– No preprocessing
– Request processing
• Perfect matching
• 4 request propagation strategies (next slide)
Evaluation
• Random walk
– Choose a neighbor at random
• Learning-based
– Choose a neighbor that answered similar requests previously
• Best-neighbor
– Choose the neighbor that answered the largest number of requests
• Learning-based + best-neighbor
– Learning-based first; fall back to best-neighbor otherwise
(see the sketch below)
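A small sketch of how the four propagation strategies could be coded against per-neighbor statistics. The bookkeeping structures (answered_requests, total_answers) and the exact-match similarity test are assumptions made for illustration.

# Sketch of the four request-propagation strategies. The per-neighbor
# bookkeeping (answered_requests, total_answers) is an assumed structure.
import random
from typing import Dict, List, Optional, Set

def pick_next_hop(strategy: str, request: str, neighbors: List[str],
                  answered_requests: Dict[str, Set[str]],  # neighbor -> requests it answered
                  total_answers: Dict[str, int]) -> Optional[str]:
    if not neighbors:
        return None
    if strategy == "random-walk":
        return random.choice(neighbors)
    if strategy == "learning-based":
        # a neighbor that answered this (or a similar) request before, if any
        known = [n for n in neighbors if request in answered_requests.get(n, set())]
        return random.choice(known) if known else random.choice(neighbors)
    if strategy == "best-neighbor":
        return max(neighbors, key=lambda n: total_answers.get(n, 0))
    if strategy == "learning+best":
        known = [n for n in neighbors if request in answered_requests.get(n, set())]
        if known:
            return random.choice(known)
        return max(neighbors, key=lambda n: total_answers.get(n, 0))
    raise ValueError(f"unknown strategy: {strategy}")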
Evaluation
• Figures: user request distribution, resource distribution, and
quantitative estimation of resource location costs
Grid vs P2P
• Scale
– GRID: thousands of peers, hundreds of users
– P2P: hundreds of thousands of peers and users
• Centralization
– GRID: some (centralized repositories for shared data)
– P2P: less than GRID, or no central point
• Functionality
– GRID: complex (program execution, R/W access to data, resource monitoring, ...)
– P2P: specialized (file sharing, parallel computations)
• Participation
– GRID: stable (long, predefined periods of time)
– P2P: unpredictable
• User behavior
– GRID: some homogeneity (incentives, policies)
– P2P: uneven (free riding in Gnutella)
• System
– GRID: powerful, good connection
– P2P: highly heterogeneous
• Administration
– GRID: yes
– P2P: no
<2>
Adaptive Replication
in Peer-to-Peer Systems
(in ICS’02, by UMD)
Gisik Kwon
Nov. 18, 2002
Motivation
• Recent P2P systems handle uniform query demand
well
• But demand can be heavily skewed
• So a lightweight, adaptive, and system-neutral
replication mechanism is needed to handle
skewed demand
– Proposes LAR
– Evaluates it on the Chord and TerraDir systems with
simulations
How
• LAR uses two kinds of soft state: caches and replicas
(sketched after this list)
– Why soft: created and destroyed by local decision,
without any coordination with the item's home
• Caches
– Consist of <data item label, item's home, home's
address (IP), a set of known replica locations>
– Use an LRU replacement strategy
• Replicas
– Contain the item data itself, plus
– more state: home address (IP), neighbors of the home,
known replicas
– Must be advertised
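The two kinds of soft state could look roughly like the following. Field names and the OrderedDict-based LRU container are assumptions based on the tuple listed above, not the paper's data structures.

# Sketch of LAR's two kinds of soft state, based on the fields listed above.
# Field names and the OrderedDict-based LRU are illustrative assumptions.
from collections import OrderedDict
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class CacheEntry:
    item_label: str          # data item label
    home_id: str             # the item's home server
    home_addr: str           # home's IP address
    replica_locations: List[str] = field(default_factory=list)

@dataclass
class Replica:
    item_label: str
    data: Any                # the item data itself
    home_addr: str           # home's IP address
    home_neighbors: List[str] = field(default_factory=list)
    known_replicas: List[str] = field(default_factory=list)

class LRUCache:
    """Tiny LRU over CacheEntry objects, keyed by item label."""
    def __init__(self, capacity: int = 32):
        self.capacity = capacity
        self.entries: "OrderedDict[str, CacheEntry]" = OrderedDict()

    def put(self, entry: CacheEntry) -> None:
        self.entries[entry.item_label] = entry
        self.entries.move_to_end(entry.item_label)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict least recently used

    def get(self, label: str):
        if label in self.entries:
            self.entries.move_to_end(label)
            return self.entries[label]
        return None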
How
• Load balancing by creating replicas (see the sketch after this list)
– Each time a packet is routed, the current load l_i of
server s_i is checked
– Load is defined as the number of messages sent to a
server during one time unit
– If l_i > l_max (overloaded)
• s_i creates a replica at s_j (if l_i > l_j)
– If l_hi <= l_i <= l_max (highly loaded)
• s_i creates a replica at s_j (only if l_j <= l_low)
• After a replica is created, it is disseminated
– 2/32 policy: 2 per msg, 32 per server
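The threshold logic above could be sketched as follows. The function shape is an illustrative assumption; the watermark values mirror those in the setup slide (l_max = 10/sec, l_hi = 0.75*l_max, l_low = 0.3*l_max).

# Sketch of the replica-creation decision made while routing a packet.
# l_max / l_hi / l_low mirror the watermarks in the setup slide; the
# function itself is an illustrative assumption, not the paper's code.
def should_replicate(l_i: float, l_j: float,
                     l_max: float = 10.0,
                     l_hi: float = 0.75 * 10.0,
                     l_low: float = 0.3 * 10.0) -> bool:
    """Should overloaded/highly loaded server s_i push a replica to s_j?"""
    if l_i > l_max:                      # overloaded: offload if s_j is lighter
        return l_i > l_j
    if l_hi <= l_i <= l_max:             # highly loaded: only to a light server
        return l_j <= l_low
    return False                         # normal load: no replication

# Example: s_i at 12 msgs/sec, s_j at 4 msgs/sec -> replicate
print(should_replicate(l_i=12.0, l_j=4.0))   # True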
Evaluation
• Based on the Chord simulator
– Single network hop = 25 ms
• Query distribution (see the workload sketch after this list)
– Follows a Poisson distribution
– Average input rate = 500/sec
– Default skewed input: 90-1
• 90% of queries skewed to one item, the other 10% randomly distributed
• 1k servers, 32k data items
• lmax = 10/sec, lhi = 0.75*lmax, llow = 0.3*lmax
• Queue length = 32
• Default load window size = 2 sec
• Dissemination policy = 2/32
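A short sketch of how the 90-1 Poisson workload could be generated. The arrival-time bookkeeping and the names are assumptions for illustration, not the simulator's code.

# Sketch of the 90-1 skewed query workload: Poisson arrivals at 500 queries/sec,
# 90% of queries hitting one hot item, the rest spread uniformly over 32k items.
import random

def generate_queries(duration_sec: float, rate: float = 500.0,
                     num_items: int = 32_000, hot_item: int = 0,
                     hot_fraction: float = 0.9):
    queries, t = [], 0.0
    while t < duration_sec:
        t += random.expovariate(rate)          # Poisson process: exponential gaps
        if random.random() < hot_fraction:
            item = hot_item                     # 90% of queries -> one item
        else:
            item = random.randrange(num_items)  # remaining 10% uniform
        queries.append((t, item))
    return queries

workload = generate_queries(duration_sec=10.0)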
Result figures: static vs. adaptive replication; load balancing (lhi = 0.75);
parameter sensitivity; scalability
Chord Lookup
• Example ring (7-bit ID space): N32, N40, N52, N60, N70, N79, N80, N85, N102, N113
• N32's finger table: 33..33 → N40, 34..35 → N40, 36..39 → N40, 40..47 → N40,
48..63 → N52, 64..95 → N70, 96..31 → N102
• N70's finger table: 71..71 → N79, 72..73 → N79, 74..77 → N79, 78..85 → N80,
86..101 → N102, 102..5 → N102, 6..69 → N32
• N80's finger table: 81..81 → N85, 82..83 → N85, 84..87 → N85, 88..95 → N102,
96..111 → N102, 112..15 → N113, 16..79 → N32
• Node 32, lookup(82): 32 → 70 → 80 → 85 (see the sketch below)
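The hop sequence above can be reproduced with a small sketch that walks the figure's finger tables. The constants and helper names are assumptions made for illustration; this is a reconstruction of the walk in the figure, not the simulator's code.

# Minimal sketch of the Chord lookup walk shown above, using the figure's
# finger tables verbatim (7-bit ID space, 0..127).
M = 128  # size of the identifier space (2^7)

# finger[i] of node n covers keys starting at (n + 2^i) mod 128;
# the value is the node listed in the figure for that interval.
FINGERS = {
    32: [40, 40, 40, 40, 52, 70, 102],
    70: [79, 79, 79, 80, 102, 102, 32],
    80: [85, 85, 85, 102, 102, 113, 32],
}

def between(x, a, b):
    """True if x lies in the clockwise interval (a, b] on the mod-128 ring."""
    x, a, b = x % M, a % M, b % M
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(start, key):
    """Follow closest-preceding fingers toward `key`; return the node path."""
    path, node = [start], start
    while True:
        succ = FINGERS[node][0]           # first finger = immediate successor
        if between(key, node, succ):      # successor owns the key: done
            path.append(succ)
            return path
        # otherwise forward to the closest finger preceding the key
        nxt = succ
        for f in reversed(FINGERS[node]):
            if between(f, node, key - 1):
                nxt = f
                break
        path.append(nxt)
        node = nxt

print(lookup(32, 82))   # [32, 70, 80, 85], matching the figure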
TerraDir