Towards a Content-Based Aggregation Network
By Shagun Kakkar
May 29, 2002
University of Cincinnati
Outline
• Motivation
• Introduction
• Current Approaches
• Problems with Current Approaches
• Overview of our Approach
• Conclusion
Motivation
• To design a content-based aggregation scheme for peer-to-peer file-sharing networks
Introduction
• Peer-to-peer (P2P) networks let people share files directly with one another
• The major attraction of such systems is that users can share content without an intermediary party
• P2P is still a very vague term, covering systems such as Napster (a centralized system) and Gnutella and Freenet (decentralized systems)
Peer-to-Peer Network
• Overlay network composed purely of end systems, without any intermediate systems
• Decentralized approach
• Similar to Freenet, not Napster
Current Approaches
• Gnutella
* Takes a radical approach to decentralization
* Does not use any form of indexing
* Characterized as an application-layer network
* Different from IP routing
- we address content, i.e., the files being shared between users
* Concept of content-based routing
- allows us to address the real problem of P2P systems rather than trying to emulate IP-style networking
Contd:
• Content-based routing
* the content is exposed to the network transport mechanism to influence the routing of messages
* no information other than content is used
* producers generate messages with no particular destination intended
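The idea of content-based routing can be illustrated with a minimal Python sketch. It assumes that each neighbor registers its interests as predicates over message fields; the `Router` class, the field names, and the node names are hypothetical and are not taken from any system described in these slides.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Predicate = Callable[[Message], bool]


class Router:
    """Toy content-based router: forwarding depends on message content alone."""

    def __init__(self) -> None:
        # neighbor name -> predicates describing the content it wants
        self.interests: Dict[str, List[Predicate]] = {}

    def register(self, neighbor: str, predicate: Predicate) -> None:
        self.interests.setdefault(neighbor, []).append(predicate)

    def route(self, message: Message) -> List[str]:
        """Return the neighbors whose interests match the message content."""
        return sorted(
            name for name, preds in self.interests.items()
            if any(pred(message) for pred in preds)
        )


router = Router()
router.register("node_a", lambda m: m.get("format") == "mp3")
router.register("node_b", lambda m: m.get("composer") == "Bach")
router.register("node_c", lambda m: m.get("format") == "ascii")

# The producer names no destination; the content alone selects the receivers.
targets = router.route({"format": "mp3", "composer": "Bach"})
```

Here `targets` contains `node_a` and `node_b`, because only their predicates match the message fields.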
Contd:
• Freenet
* Uses an index based on metadata called descriptive strings
* To retrieve an item from the Freenet network, the user must first derive, or already know, the key that matches the file associated with the item
* The request is sent to the querying node’s neighbor, which in turn forwards the request to its own neighbor
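The retrieval steps above can be sketched roughly as follows. This is a simplified illustration, not Freenet's actual protocol: hashing the descriptive string with SHA-1 stands in for key derivation, requests are forwarded to neighbors in list order, and the `Node` class and `hops_to_live` parameter are assumptions made for the example.

```python
import hashlib
from typing import Dict, List, Optional, Set


def derive_key(descriptive_string: str) -> str:
    """Simplified stand-in for key derivation: hash the descriptive string."""
    return hashlib.sha1(descriptive_string.encode("utf-8")).hexdigest()


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}    # key -> stored content
        self.neighbors: List["Node"] = []

    def request(self, key: str, hops_to_live: int,
                seen: Optional[Set[str]] = None) -> Optional[bytes]:
        """Answer locally if possible, otherwise forward to unvisited neighbors."""
        seen = set() if seen is None else seen
        seen.add(self.name)
        if key in self.store:
            return self.store[key]
        if hops_to_live == 0:
            return None
        for neighbor in self.neighbors:
            if neighbor.name not in seen:
                data = neighbor.request(key, hops_to_live - 1, seen)
                if data is not None:
                    return data
        return None


# A three-node chain a - b - c, with the content stored only at c.
a, b, c = Node("a"), Node("b"), Node("c")
a.neighbors, b.neighbors, c.neighbors = [b], [a, c], [b]
key = derive_key("music/bach/toccata-and-fugue")
c.store[key] = b"audio bytes"

found = a.request(key, hops_to_live=5)
too_far = a.request(key, hops_to_live=1)
```

The requester must already hold the exact descriptive string: a slightly different string hashes to an unrelated key, which is why this style of lookup does not support searching.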
Contd:
• In Freenet, the need to know or derive a key is a severe weakness, as it does not facilitate searching for an item of content
• Addresses content-location problem
Contd:
• CAN and Chord
* Address content-location in a peer-to-peer
environment.
* They are distributed hash tables
* Provide techniques to ensure the operation of hash
tables when nodes join or leave the network
* Effective in providing a content location service
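A distributed hash table of this kind can be approximated with a small consistent-hashing ring, sketched below in Python. This is a simplified, single-process illustration loosely in the spirit of Chord (no finger tables, no networking); the `HashRing` class, the 32-bit identifier space, and the node and key names are assumptions for the example.

```python
import hashlib
from bisect import bisect_left
from typing import Dict, List


def ring_id(name: str, bits: int = 32) -> int:
    """Hash a node name or content key onto a ring of size 2**bits."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class HashRing:
    def __init__(self) -> None:
        self.ids: List[int] = []           # sorted node identifiers
        self.names: Dict[int, str] = {}    # identifier -> node name

    def join(self, node: str) -> None:
        nid = ring_id(node)
        if nid not in self.names:
            self.ids.insert(bisect_left(self.ids, nid), nid)
            self.names[nid] = node

    def leave(self, node: str) -> None:
        nid = ring_id(node)
        if self.names.get(nid) == node:
            self.ids.remove(nid)
            del self.names[nid]

    def lookup(self, key: str) -> str:
        """Return the first node at or after the key's position, wrapping around."""
        if not self.ids:
            raise LookupError("ring is empty")
        i = bisect_left(self.ids, ring_id(key))
        return self.names[self.ids[i % len(self.ids)]]


ring = HashRing()
for name in ("n1", "n2", "n3"):
    ring.join(name)

keys = ["toccata.mp3", "geb.txt", "fugue.mp3"]
before = {k: ring.lookup(k) for k in keys}
departed = before["toccata.mp3"]
ring.leave(departed)
after = {k: ring.lookup(k) for k in keys}
```

When a node leaves, only the keys it was responsible for move to another node; every other key keeps its owner, which is what keeps the table usable as nodes join and leave.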
Problems with Current Approaches
• Problem with scalability
• Gnutella
* absence of indexing
• Freenet
* no mechanism to query for a group that represents a subset of the metadata, e.g., all MP3 files
* does not help locate an item of content in the first place
Our Approach
• Overview
* To achieve scalability
- reduce the number of search requests that the system sends out in order to locate content
- a concept of aggregation with a hierarchical scheme is proposed
Definitions
* Aggregation point
In content-based routing, an aggregation point advertises aggregated content to other aggregation points in the overlay network
* Helper node
A node that performs this function can provide a new node wishing to join the overlay network with information about other nodes in the network
Contd:
- When a node wishes to join a network, it contacts the helper node
- When a node wishes to advertise a file, it should consult the helper node to find an existing aggregation point for that type
- If no aggregation point is found, the advertising node can itself become the aggregation point
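The join and advertise steps above might look roughly like this in Python. The sketch assumes the helper node keeps a simple in-memory table of members and of aggregation points per content type, and that a node which becomes an aggregation point registers itself with the helper; the `HelperNode` class and `advertise` function are hypothetical names.

```python
from typing import Dict, List, Optional


class HelperNode:
    """Toy helper node: tracks members and aggregation points by content type."""

    def __init__(self) -> None:
        self.members: List[str] = []
        self.aggregation_points: Dict[str, str] = {}

    def join(self, node: str) -> List[str]:
        """Register a new node and return the nodes already known."""
        known = list(self.members)
        self.members.append(node)
        return known

    def find_aggregation_point(self, content_type: str) -> Optional[str]:
        return self.aggregation_points.get(content_type)

    def register_aggregation_point(self, content_type: str, node: str) -> None:
        self.aggregation_points[content_type] = node


def advertise(helper: HelperNode, node: str, content_type: str) -> str:
    """Return the aggregation point that will carry this advertisement."""
    point = helper.find_aggregation_point(content_type)
    if point is None:
        # No aggregation point exists for this type yet:
        # the advertising node takes on the role itself (assumed to register).
        helper.register_aggregation_point(content_type, node)
        point = node
    return point


helper = HelperNode()
first_view = helper.join("alice")    # nobody else is known yet
second_view = helper.join("bob")     # bob learns about alice

mp3_point = advertise(helper, "alice", "mp3")    # alice becomes the mp3 point
same_point = advertise(helper, "bob", "mp3")     # bob reuses alice's point
ascii_point = advertise(helper, "bob", "ascii")  # bob becomes the ascii point
```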
Contd:
• Route Aggregation in Overlay Networks
* Introduce a helper node to overcome the problem of bootstrapping
(when a node first comes online, it does not necessarily know the addresses of other nodes in the network)
* When a node comes online, it attempts to select an aggregation group from the history list of aggregation points provided by the helper node
* The node then makes some policy-based measurements, e.g., delay (latency), and based on these decides which group to join
* Enables the system to aggregate on network performance
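A policy-based choice of aggregation group could be sketched as below, assuming the policy is simply "lowest measured latency under a cutoff". The `choose_group` function, the 200 ms cutoff, and the canned measurements are illustrative assumptions; a real node would probe the candidates over the network.

```python
from typing import Callable, Dict, Iterable, Optional


def choose_group(
    candidates: Iterable[str],
    measure_latency: Callable[[str], Optional[float]],
    max_latency_ms: float = 200.0,
) -> Optional[str]:
    """Pick the reachable aggregation point with the lowest measured latency."""
    best: Optional[str] = None
    best_latency: Optional[float] = None
    for point in candidates:
        latency = measure_latency(point)
        if latency is None or latency > max_latency_ms:
            continue  # unreachable or too slow under this policy
        if best_latency is None or latency < best_latency:
            best, best_latency = point, latency
    return best


# Canned measurements standing in for real probes (milliseconds).
measured: Dict[str, Optional[float]] = {
    "ap-east": 80.0,
    "ap-west": 35.0,
    "ap-far": 450.0,
    "ap-down": None,
}
history = ["ap-east", "ap-west", "ap-far", "ap-down"]  # from the helper node

chosen = choose_group(history, measured.get)
```

Because nodes pick nearby aggregation points, groups end up clustered by network performance as well as by content.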
Figure: The hierarchy in an aggregation-point-based peer-to-peer network
Contd:
• Metadata: Towards Content-based Aggregation
* To enable content-based aggregation
- every shared file is accompanied by metadata
- the metadata files are encoded in XML and comprise various fields, organized in order of significance
- these fields are used to obtain different levels of aggregation
- the top level of the hierarchy, the data format (normally a file extension), serves as a first classification
Examples of Metadata
Two metadata examples: one describing an MP3 file with a recording of the Organ Toccata and Fugue in D minor by J.S. Bach, the other an ASCII file version of D.R. Hofstadter’s Gödel, Escher, Bach
Metadata represented in XML
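The original XML figure is not reproduced in this transcript. As a stand-in, the Python sketch below parses a hypothetical metadata record whose fields are ordered by significance, with the data format first. The element names (`metadata`, `format`, `genre`, `composer`, `title`) and the `aggregation_path` function are assumptions for illustration and may differ from the layout on the slide.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical record: fields appear in decreasing order of significance.
METADATA = """
<metadata>
  <format>mp3</format>
  <genre>classical</genre>
  <composer>J.S. Bach</composer>
  <title>Toccata and Fugue in D minor</title>
</metadata>
"""


def aggregation_path(xml_text: str) -> List[Tuple[str, str]]:
    """Return (field, value) pairs in document order, most significant first."""
    root = ET.fromstring(xml_text.strip())
    return [(child.tag, (child.text or "").strip()) for child in root]


path = aggregation_path(METADATA)
top_level = path[0]   # the data format, used as the first classification
```

Each successive entry in `path` is a candidate for a deeper level of aggregation.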
Dealing with change in the system
• Metadata is used as the basis of aggregation in our network
For example, the first field of the metadata is the file type, <ASCII> or <mp3>
• A threshold is used to determine when an aggregation group should be split
Contd:
• Strategy used:
* An aggregation point chooses to create a new aggregation point based on a lower-precedence field of the metadata
* This strategy is used when the number of items of content in a group grows beyond the threshold
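The splitting strategy can be sketched as a function that partitions a group by the next metadata field once the group exceeds the threshold. The `split_group` function, the `"*"` label for an unsplit group, and the sample items are assumptions for illustration; the slides do not specify how the threshold is chosen.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List

Item = Dict[str, str]


def split_group(items: List[Item], next_field: str,
                threshold: int) -> Dict[str, List[Item]]:
    """Split a group by a lower-precedence field once it exceeds the threshold."""
    if len(items) <= threshold:
        return {"*": items}  # still small enough: keep a single group
    subgroups: DefaultDict[str, List[Item]] = defaultdict(list)
    for item in items:
        subgroups[item.get(next_field, "unknown")].append(item)
    return dict(subgroups)


mp3_group = [
    {"format": "mp3", "genre": "classical", "title": "Toccata and Fugue"},
    {"format": "mp3", "genre": "classical", "title": "Goldberg Variations"},
    {"format": "mp3", "genre": "jazz", "title": "So What"},
    {"format": "mp3", "genre": "jazz", "title": "Take Five"},
]

unsplit = split_group(mp3_group, "genre", threshold=10)
split = split_group(mp3_group, "genre", threshold=3)
```

With a threshold of 3, the four-item mp3 group is divided into one subgroup per genre, each of which could be served by its own aggregation point.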
Conclusion
• Introduced a scheme for content-based
aggregation, for achieving scalability, without
relying on centralized resources
• Future work: further elaboration of the scheme
along with implementing the system
Reference
• R. Gold, D. Tidhar: Towards a Content-based Aggregation Network. Proceedings of the International Conference on Peer-to-Peer Computing, Linköping, 2001.