P2P Design Ideas and
Their Application to Storage and Sharing
Security Group: Lü Jianming (吕建明)
2004-3
outline
What is p2p
Why p2p
How p2p
P2p storage and sharing – my work
Roadmap
What is p2p
A type of network in which each workstation has
equivalent capabilities and responsibilities. This differs
from client/server architectures, in which some
computers are dedicated to serving the others.
(Source: http://www.webopedia.com/)
What is p2p
Peer-to-peer is a communications model in which each party
has the same capabilities and either party can initiate a
communication session. Other models with which it might
be contrasted include the client/server model and the
master/slave model.
On the Internet, peer-to-peer (referred to as P2P) is a type of
transient Internet network that allows a group of computer
users with the same networking program to connect with
each other and directly access files from one another's
hard drives.
(Source: http://searchnetworking.techtarget.com/,
powered by whatis.com)
What is p2p
Features of the p2p communication model

Peer

interconnection

interoperation
outline
What is p2p
Why p2p
How p2p
P2p storage and sharing –my work
Roadmap
Why p2p
Centralized
•Single point of failure
•High maintenance cost
•Expensive to scale
•Performance bottleneck
Why p2p
p2p
•Self-organized
•mobile computing
•self-cooperation
•Scalability
•fault-tolerance
•cost effective
Why p2p

p2p is a communication model; you can use it anywhere
to remove the centralized point:
web service / storage system / grid computing /
office software / instant messaging /

Peers differ (from the physical layer to the logical
layer).

No common p2p platform suits every use
case.
Why p2p

The typical use of this communication model, however, is a type of
transient Internet network that allows a group of
computer users with the same networking
program to connect with each other and directly
access files from one another's hard drives over the
Internet (like KaZaA/Napster/pp点点通/eDonkey).
outline
What is p2p
Why p2p
How p2p
P2p storage and sharing – my work
Roadmap
How p2p - JXTA

There are many different
scenarios for deploying p2p
applications.

JXTA (Sun's p2p platform) shows some common
scenarios and raises some common problems.
www.jxta.org

80 projects have been built on JXTA.
How p2p – JXTA (2)
How p2p – JXTA (3)

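JXTA defines some common problems in the p2p field:
peer addressing, peer group, resource location,
security, communication model, peer
monitoring, etc.

It also provides reference protocol implementations: JXTA for J2SE, etc.
But it is not a master key. Concrete p2p systems have different requirements, e.g.
special security requirements, special routing and location requirements,
special efficiency requirements, special communication-model requirements.
There is no unified platform.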
How p2p – Common steps

Find the centralized points and remove them:
directory server / scheduling server /
monitoring server / indexing server

Determine what a peer is and how many types
of peer there are.
How p2p – Common steps

How peers communicate (communication model)

What a communication message is

How peers discover each other and resources

Security issues


… …
Build the p2p middleware
How p2p – communication model

Unstructured p2p networks

Structured p2p networks

Loose structured p2p networks
Unstructured networks

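Network formation: nodes join the network at random or by a
heuristic strategy, and the topology evolves as nodes come and go
and as network communication proceeds.

Systems of this kind include Gnutella, FastTrack, KaZaA,
LimeWire, Usenet, Freenet, PlanetP, etc.

Routing characteristics: a heuristic-guided broadcast search. Heuristics
include expanding ring, degree-based, super nodes,
routing index, LSI method, Bloom filter, etc.

Centralized nodes are often introduced to improve efficiency (hybrid p2p).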
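
The broadcast search described above can be sketched roughly as a flood with a hop limit (TTL). This is only a minimal illustration, not the protocol of any listed system: the overlay graph, the file table and the names `flood_search`, `overlay` and `shared` are all assumptions made up for the example.

```python
from collections import deque

def flood_search(graph, files, start, wanted, ttl):
    """Flood a query from `start`, forwarding it at most `ttl` hops.

    graph: dict mapping node -> list of neighbor nodes (hypothetical overlay)
    files: dict mapping node -> set of file names held by that node
    Returns the sorted list of reachable nodes that hold `wanted`.
    """
    seen = {start}
    queue = deque([(start, 0)])
    hits = []
    while queue:
        node, hops = queue.popleft()
        if wanted in files.get(node, set()):
            hits.append(node)
        if hops == ttl:
            continue  # TTL exhausted: do not forward the query further
        for neighbor in graph.get(node, []):
            if neighbor not in seen:  # a node handles each query only once
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return sorted(hits)

# Hypothetical five-node overlay; nodes "c" and "e" share the wanted file.
overlay = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"],
           "d": ["b", "e"], "e": ["d"]}
shared = {"c": {"song.mp3"}, "e": {"song.mp3"}}

print(flood_search(overlay, shared, "a", "song.mp3", ttl=1))
print(flood_search(overlay, shared, "a", "song.mp3", ttl=3))
```

A small TTL keeps traffic low but can miss distant copies, which is one reason heuristics and centralized index nodes are added.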
Unstructured networks – pure p2p
Unstructured networks – hybrid p2p
Unstructured networks – hybrid p2p(2)
Structured networks
CAN topology: n-dimensional Cartesian space
Chord topology: ring structure
Structured networks(2)
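☆ Network topology: every node has a fixed address, and the whole
network has a relatively stable and compact topology.
☆ Data storage: DHT (Distributed Hash Table)
☆ Network routing: the target node can be located in O(lg N) hops (N
is the total number of nodes).
☆ Systems of this kind include CAN, Tapestry, Pastry, Kademlia,
Chord, etc.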
Structured networks - DHT
[Figure: keys key0 to key14 and their data data0 to data14 distributed
over six peers a to f, with a lookup table of key and peer address:
0 → e, 1 → f, 4 → d, 7 → c, 10 → b, 12 → a]
Structured networks – DHT(2)

Fast data location method:
DATA → KEY → Route(KEY) → reach the node
whose address is closest to KEY
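
The DATA → KEY → node chain above can be sketched as follows. This is a toy under stated assumptions: a 16-bit identifier space, SHA-1 as the hash, and global knowledge of all node addresses in place of the Route(KEY) step. `ToyDHT`, `key_for` and `closest_node` are hypothetical names, not part of any DHT library.

```python
import hashlib

ID_SPACE = 2 ** 16  # assumed size of the identifier space

def key_for(data: bytes) -> int:
    """DATA -> KEY: hash the data and keep the first 16 bits."""
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest[:2], "big")

def closest_node(key: int, node_addrs):
    """KEY -> node: the node whose address is numerically closest to KEY."""
    return min(node_addrs, key=lambda addr: (abs(addr - key), addr))

class ToyDHT:
    def __init__(self, node_addrs):
        # One local key/value store per node address.
        self.stores = {addr: {} for addr in node_addrs}

    def put(self, data: bytes):
        key = key_for(data)
        node = closest_node(key, self.stores)
        self.stores[node][key] = data
        return key, node

    def get(self, key: int):
        node = closest_node(key, self.stores)
        return self.stores[node].get(key)

dht = ToyDHT([100, 20000, 40000, 60000])
key, node = dht.put(b"movie.avi")
print(key, node, dht.get(key))
```

In a real structured network each node knows only a few other nodes, so the closest node is reached hop by hop rather than by a global `min`.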
Loose Structured networks
[Figure: key1 to key3 and data1 to data3 stored on the nodes of a loose structured network]
Loose Structured networks(2)
☆ Network topology: unstructured networks
☆ Data storage: DHT
☆ Example: communication between the centralized nodes of JXTA
Loose structured
outline
What is p2p
Why p2p
How p2p
P2p storage and sharing - my work
Roadmap
P2p storage and sharing
File sharing on the Internet is quite popular.

Search for and download movies / mp3s from others.

Unstructured hybrid p2p networks can handle this.
Centralized servers are used to improve
performance.

There are many clients: Gnutella, FastTrack, KaZaA, LimeWire,
eDonkey, pp点点通, etc.
P2p storage and sharing
But unstructured p2p does not suit the design of a more
reliable, high-performance system:

automatic content indexing & content searching

reliable ocean storage

reliable p2p computing environment
Use case study
How can many machines be organized into one
system that offers:

automatic content distribution / indexing / searching

high scalability,

automatic p2p backup,

self-organized resource discovery,

self-organized resource scheduling,

self-organized monitoring,
… …
Use case study (2)
Benefits:

Makes full use of computing and storage resources

Reliable

Convenient to maintain

Much cheaper
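What do we need to do?
Use structured p2p networks as the communication
model, because:

A relatively stable and compact topology is suited to providing stable services

DHT data storage speeds up data retrieval

Efficient network routing (O(log N)) suits self-organized resource scheduling and
mobile computing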
What kind of structured p2p

Tapestry (UCB) ?

CAN (UCB) ?

Pastry (Rice) ?

Chord (MIT)?

Or all of them ?
Difference between them


How does a peer find another peer in O(log N) hops?
How is a peer's local routing table formed?
Tapestry / Pastry routing table
Tapestry / Pastry
Node 5230 finds node 42AD
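
The digit-by-digit lookup in the 5230 → 42AD example can be sketched as prefix routing, where each hop moves to a node that shares a longer prefix with the target. The intermediate node IDs below (400F, 4227, 42A2) are assumed purely for illustration, and the sketch lets every node see the whole node list; a real Tapestry/Pastry node consults only its own routing table.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that `a` and `b` have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current: str, target: str, known_nodes):
    """Pick a node matching the target in more leading digits than `current`.

    Among the candidates, take the one with the smallest improvement, so the
    prefix typically grows by one digit per hop.
    """
    have = shared_prefix_len(current, target)
    better = [n for n in known_nodes if shared_prefix_len(n, target) > have]
    if not better:
        return None
    return min(better, key=lambda n: shared_prefix_len(n, target))

def route(source: str, target: str, known_nodes):
    path = [source]
    while path[-1] != target:
        hop = next_hop(path[-1], target, known_nodes)
        if hop is None:
            break  # no node is closer to the target: routing stops here
        path.append(hop)
    return path

# Hypothetical node IDs (hexadecimal digits).
nodes = ["5230", "400F", "4227", "42A2", "42AD"]
print(route("5230", "42AD", nodes))
```

Because every hop fixes at least one more digit, the number of hops is bounded by the number of digits in the ID, which grows logarithmically with the size of the ID space.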
CAN
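Neighbor: a node's address differs from its neighbor's
in exactly one position, where the digits are adjacent;
all other positions are the same.
The neighbors of node 0011 are:
1011, 0111, 0001, 0010
Routing table:
IP addresses of the neighbor set
Routing: according to the logical
n-dimensional Cartesian coordinate space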
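
The neighbor rule above can be sketched for binary addresses, where "adjacent digit" simply means the flipped bit. This is a simplification: an actual CAN assigns each node a zone of a continuous coordinate space, and the function names here are hypothetical.

```python
def neighbors(addr: str):
    """All addresses that differ from `addr` in exactly one bit."""
    flip = {"0": "1", "1": "0"}
    return [addr[:i] + flip[addr[i]] + addr[i + 1:] for i in range(len(addr))]

def greedy_route(source: str, target: str):
    """Walk from `source` to `target` (same length), one neighbor per hop.

    Each hop corrects the first coordinate that still differs from the target.
    """
    path = [source]
    current = source
    while current != target:
        i = next(k for k in range(len(current)) if current[k] != target[k])
        current = current[:i] + target[i] + current[i + 1:]
        path.append(current)
    return path

print(neighbors("0011"))
print(greedy_route("0011", "1110"))
```

Every hop in `greedy_route` lands on a neighbor of the previous node, so a node only needs the IP addresses of its neighbor set to forward a message.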
We will use CAN

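Nodes with similar address vectors are neighbors of each other:
1011, 0111, 0001, 0010
(in Chord/Tapestry/Pastry, neighbors are linearly close: 1000, 1001,
1002 and 1003 are neighbors)
So in the DHT, data with similar key vectors are placed on adjacent
nodes, which helps content retrieval.
Grid topology, stable and flexible (more stable than Tapestry/Pastry, more
flexible than Chord), well suited to organizing scheduling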
Is CAN perfect?
Routing follows the logical
space only
The gap between the logical and physical networks
makes routing inefficient
No use is made of local physical
information
Improve CAN
Extend each routing table entry to:
(peer address, IP address, ping time)
(like Tapestry)
Extend the routing table with a local
neighbor set, which records the
closest peers on the physical network layer.
(like Pastry)
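
A rough sketch of how the extended entry (peer address, IP address, ping time) might be used: among neighbors that bring a message logically closer to the target, prefer the one that is physically nearest. The names `RouteEntry`, `pick_next_hop` and `local_neighbor_set`, the binary addresses and the sample IP addresses are all assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RouteEntry:
    peer_address: str  # logical overlay address
    ip_address: str    # physical address
    ping_ms: float     # measured round-trip time

def distance(a: str, b: str) -> int:
    """Logical distance: number of positions in which two addresses differ."""
    return sum(x != y for x, y in zip(a, b))

def pick_next_hop(entries, current: str, target: str):
    """Among entries logically closer to the target, pick the lowest ping."""
    closer = [e for e in entries
              if distance(e.peer_address, target) < distance(current, target)]
    if not closer:
        return None
    return min(closer, key=lambda e: e.ping_ms)

def local_neighbor_set(entries, size: int):
    """The `size` physically closest peers, regardless of logical address."""
    return sorted(entries, key=lambda e: e.ping_ms)[:size]

# Hypothetical routing table of node 0011.
table = [
    RouteEntry("1011", "192.0.2.11", 80.0),
    RouteEntry("0111", "192.0.2.12", 12.0),
    RouteEntry("0001", "192.0.2.13", 3.0),
    RouteEntry("0010", "192.0.2.14", 40.0),
]
hop = pick_next_hop(table, "0011", "1110")
print(hop.peer_address, hop.ping_ms)
```

Here 0001 has the lowest ping but moves away from the target, so it is skipped; it would still appear in the local neighbor set.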
Pastry routing table
What we will build in the end

A middleware providing APIs for:
Grid topology connection
Routing and content indexing
Fast resource searching
Automatic p2p backup
Automatic resource scheduling
middle-ware use case
Entrance; distributed according to the key index
middle-ware use case
Auto content searching
middle-ware use case
P2p auto backup
middle-ware use case
Auto finding the idle machine
middle-ware use case
Fault tolerance
What will it look like?

Like Grid computing!

But more self-organized

But with more p2p character
outline
What is p2p
Why p2p
How p2p
P2p scenarios
P2p storage and sharing –my work
Road map
Roadmap

Build the basic routing module . (modify the routing
core of tapestry open source project)

Build the content-searching module

Build the auto backup module

Build the resource schedule module

Share it …
Thank you! Any further advice is welcome.