P2P Design Principles and
Their Application to Storage and Sharing
Security Group: Lü Jianming (吕建明)
2004-3
outline
What is p2p
Why p2p
How p2p
P2p storage and sharing: my work
Roadmap
What is p2p
a type of network in which each workstation has
equivalent capabilities and responsibilities. This differs
from client/server architectures, in which some
computers are dedicated to serving the others.
Source: http://www.webopedia.com/
What is p2p
Peer-to-peer is a communications model in which each party
has the same capabilities and either party can initiate a
communication session. Other models with which it might
be contrasted include the client/server model and the
master/slave model.
On the Internet, peer-to-peer (referred to as P2P) is a type of
transient Internet network that allows a group of computer
users with the same networking program to connect with
each other and directly access files from one another's
hard drives.
Source: http://searchnetworking.techtarget.com/
(powered by whatis.com)
What is p2p
Features of the p2p communication model:
Peers
Interconnection
Interoperation
Why p2p
Centralized
•Single point of failure
•High maintenance cost
•Expensive to scale
•Performance bottleneck
Why p2p
p2p
•Self-organized
•mobile computing
•self-cooperation
•Scalability
•fault-tolerance
•cost effective
Why p2p
p2p is a communication model; you can use it anywhere
to remove the centralized point:
web services / storage systems / grid computing /
office software / instant messaging
Peers differ from one use case to another (from the physical
layer to the logical layer).
No common p2p platform suits every use case.
Why p2p
The typical p2p system on the Internet, however, is a
transient network that allows a group of computer users
with the same networking program to connect with each
other and directly access files from one another's hard
drives (like KaZaA / Napster / pp点点通 / eDonkey).
How p2p - JXTA
There are many different scenarios in which to deploy
p2p applications.
JXTA (Sun's p2p platform) demonstrates some common
scenarios and raises some common problems.
www.jxta.org
80 projects have been built upon JXTA.
How p2p – JXTA (2)
How p2p – JXTA (3)
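JXTA defines some common problems in the p2p domain:
peer addressing, peer groups, resource location,
security, communication model, peer monitoring, etc.
It also provides reference protocol implementations: JXTA for J2SE, etc.
But it is not a master key. Specific p2p systems have different
requirements, such as special security requirements, special routing
and location requirements, special efficiency requirements, and
special communication-model requirements.
There is no unified platform.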
How p2p – Common steps
Find the centralized points and remove them:
directory server / scheduling server /
monitoring server / indexing server
Decide what a peer is and how many types of peer
there are.
How p2p – Common steps
How peers communicate (communication model)
What the communication messages are
How peers discover each other and resources
Security issues
… …
Build the p2p middleware
How p2p – communication model
Unstructured p2p networks
Structured p2p networks
Loosely structured p2p networks
Unstructured networks
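Network formation: nodes join the network at random or by a
heuristic strategy, and the topology evolves as nodes come and
go and as network communication proceeds.
Systems of this kind include Gnutella, FastTrack, KaZaA,
LimeWire, Usenet, Freenet, PlanetP, etc.
Routing: a heuristic-guided broadcast search. The heuristics
include span ring, degree based, super nodes,
routing index, LSI method, Bloom filter, etc.
Centralized nodes are often introduced to improve efficiency (hybrid p2p).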
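The broadcast search above can be illustrated with a minimal sketch of the simplest case: plain flooding with a hop limit (TTL) and no heuristic. The overlay graph, the file names, and the function name flood_search are hypothetical, chosen only for illustration; no real client's protocol is reproduced here.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent).

    graph: dict mapping each peer to the list of its overlay neighbors.
    files: dict mapping a peer to the set of file names it shares.
    ttl:   maximum number of hops a query may travel.
    """
    hits = []
    seen = {start}
    queue = deque([(start, ttl)])
    messages = 0
    while queue:
        peer, hops_left = queue.popleft()
        if wanted in files.get(peer, ()):
            hits.append(peer)
        if hops_left == 0:
            continue  # TTL exhausted: do not forward any further
        for neighbor in graph[peer]:
            messages += 1  # every forwarded copy costs one message
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops_left - 1))
    return hits, messages


# Hypothetical five-peer overlay; only peer "e" shares the wanted file.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
files = {"c": {"movie.avi"}, "e": {"song.mp3"}}

print(flood_search(graph, files, "a", "song.mp3", ttl=2))  # ([], 6)
print(flood_search(graph, files, "a", "song.mp3", ttl=3))  # (['e'], 9)
```

With a TTL of 2 the query never reaches peer e even though the file exists, and raising the TTL costs more messages. This is the trade-off that the heuristics and the hybrid (centralized index) approach try to improve.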
Unstructured networks – pure p2p
Unstructured networks – hybrid p2p
Unstructured networks – hybrid p2p(2)
Structured networks
CAN topology: n-dimensional Cartesian space
Chord topology: ring
Structured networks(2)
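☆ Topology: every node has a fixed address, and the whole network
has a relatively stable and compact topology.
☆ Data storage: DHT (Distributed Hash Table)
☆ Routing: the target node can be located in O(lg N) hops
(N is the total number of nodes).
☆ Systems of this kind include CAN, Tapestry, Pastry, Kademlia,
Chord, etc.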
Structured networks - DHT
[Figure: a DHT spread over six peers a to f. Keys key0 to key14 and their data (data0 to data14) are split into contiguous ranges, one range per peer. A lookup table maps the first key of each range to a peer address: 0 → e, 1 → f, 4 → d, 7 → c, 10 → b, 12 → a.]
Structured networks – DHT(2)
Fast data location:
DATA → KEY → Route(KEY) → reach the node
whose address is closest to KEY
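A minimal sketch of the DATA → KEY → closest-node step, under stated assumptions: keys are hashed with SHA-1 into a 16-bit circular ID space, and every node ID is known to the caller. In a real structured network the closest node is reached hop by hop through routing tables; that part is omitted here. ToyDHT, key_id, and closest_node are hypothetical names, not part of any system named in these slides.

```python
import hashlib

ID_BITS = 16              # assumed size of the ID space for this sketch
ID_SPACE = 1 << ID_BITS


def key_id(key: str) -> int:
    """Hash a data key into the node ID space."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ID_SPACE


def distance(a: int, b: int) -> int:
    """Distance between two IDs on a circular ID space."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def closest_node(node_ids, kid: int) -> int:
    """Return the node whose address is closest to the key ID (ties: lowest ID)."""
    return min(node_ids, key=lambda n: (distance(n, kid), n))


class ToyDHT:
    """Each node keeps its own dict; put/get go to the node closest to the key."""

    def __init__(self, node_ids):
        self.stores = {n: {} for n in node_ids}

    def put(self, key: str, value) -> int:
        node = closest_node(self.stores, key_id(key))
        self.stores[node][key] = value
        return node  # the node that now stores the entry

    def get(self, key: str):
        node = closest_node(self.stores, key_id(key))
        return self.stores[node].get(key)


dht = ToyDHT([100, 20000, 40000, 60000])
holder = dht.put("movie.avi", "stored-on-peer-7")
print(holder, dht.get("movie.avi"))
```

Because put and get hash the same key to the same ID, both land on the same node without any central index.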
Loosely structured networks
[Figure: key1 to key3 and their data (data1 to data3) placed on nodes of a loosely structured network.]
Loosely structured networks (2)
☆ Topology: unstructured network
☆ Data storage: DHT
☆ Example: communication between JXTA's centralized nodes
Loosely structured
P2p storage and sharing
File sharing on the Internet is quite popular:
searching for and downloading movies / mp3s from others.
Unstructured hybrid p2p networks can handle this.
Centralized servers are used to improve
performance.
There are a lot of clients: Gnutella, FastTrack, KaZaA, LimeWire,
eDonkey, pp点点通, etc.
P2p storage and sharing
But unstructured p2p does not suit the design of a more
reliable, high-performance system:
automatic content indexing and content searching
reliable ocean storage
reliable p2p computing environment
Use case study
How can a large number of machines be organized to compose a
system that provides:
automatic content distribution / indexing / searching,
high scalability,
automatic p2p backup,
self-organized resource discovery,
self-organized resource scheduling,
self-organized monitoring,
… …
Use case study (2)
Benefits:
Makes full use of computing and storage resources
Reliable
Convenient to maintain
Much cheaper
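What do we need to do?
Use structured p2p networks as the communication
model, because:
a relatively stable and compact topology is suitable for stable services
DHT data storage speeds up data retrieval
efficient routing (O(log n)) is suitable for self-organized resource
scheduling and mobile computing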
What kind of structured p2p
Tapestry (UCB) ?
CAN (UCB) ?
Pastry (Rice) ?
Chord (MIT)?
Or all of them ?
Difference between them
How does a peer find another peer in O(log n)?
How is a peer's local routing table formed?
Tapestry / Pastry routing table
Tapestry / Pastry
Node 5230 finds node 42AD
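A minimal sketch of the digit-by-digit prefix matching that Tapestry and Pastry routing is based on, using the 5230 to 42AD lookup as the example. The set of intermediate nodes is hypothetical and is not read from the slide's figure. For brevity the sketch picks the next hop from a global node list; a real node would consult only its local routing table.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two node IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_route(nodes, source: str, target: str):
    """Route from source to target, matching a longer prefix of target on each hop."""
    if target not in nodes:
        raise ValueError("target must be a live node in this sketch")
    path = [source]
    current = source
    while current != target:
        have = shared_prefix_len(current, target)
        # Any node sharing a longer prefix with the target makes progress;
        # take the smallest improvement to mimic one digit resolved per hop.
        better = [n for n in nodes if shared_prefix_len(n, target) > have]
        current = min(better, key=lambda n: (shared_prefix_len(n, target), n))
        path.append(current)
    return path


# Hypothetical node set, IDs written as 4 hexadecimal digits.
NODES = ["5230", "400F", "4227", "42A2", "42AD", "1D76"]
print(prefix_route(NODES, "5230", "42AD"))
# ['5230', '400F', '4227', '42A2', '42AD']
```

Each hop fixes at least one more digit, so the path length is bounded by the number of digits in an ID, which grows with log N.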
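CAN
Neighbor: node addresses that are adjacent in
exactly one digit and identical in all
other digits.
The neighbors of node 0011 are:
1011, 0111, 0001, 0010
Routing table:
IP addresses of the neighbor set
Routing: according to the logical
n-dimensional Cartesian coordinate space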
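A minimal sketch of the neighbor rule and of greedy routing in the logical space, assuming binary addresses as in the 0011 example (one bit per dimension). It does not model CAN's zone splitting or node joins; the function names are hypothetical.

```python
def neighbors(addr: str):
    """All addresses that differ from addr in exactly one binary digit."""
    flip = {"0": "1", "1": "0"}
    return [addr[:i] + flip[addr[i]] + addr[i + 1:] for i in range(len(addr))]


def hamming(a: str, b: str) -> int:
    """Number of digit positions in which two addresses differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_route(source: str, target: str):
    """Forward to the neighbor logically closest to the target until it is reached."""
    path = [source]
    current = source
    while current != target:
        current = min(neighbors(current), key=lambda n: hamming(n, target))
        path.append(current)
    return path


print(neighbors("0011"))             # ['1011', '0111', '0001', '0010']
print(greedy_route("0011", "1010"))  # ['0011', '1011', '1010']
```

The number of hops equals the number of differing digits, so it is at most the address length.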
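We will use CAN
Nodes with similar address vectors are neighbors of each other
(e.g. 0011 with 1011, 0111, 0001, 0010).
(Chord / Tapestry / Pastry are linearly adjacent: 1000, 1001,
1002, 1003 are neighbors.)
So in the DHT, data whose key vectors are similar are placed on
neighboring nodes, which helps content retrieval.
Grid-like topology, stable and flexible (more stable than Tapestry / Pastry,
more flexible than Chord), suitable for organizing and scheduling.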
Is CAN perfect?
It routes according to the logical
space only
The gap between the logical and physical
topology makes routing inefficient
It makes no use of local physical
information
Improve CAN
Extend each routing table entry to:
(peer address, IP address, ping time)
(like Tapestry)
Extend the routing table with a local
neighbor set, which records the
closest peers on the physical network layer
(like Pastry)
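A minimal sketch of how the extended entry (peer address, IP address, ping time) could be used: among the neighbors that bring the message logically closer to the target, forward to the one with the lowest measured ping time. The selection rule, the class RouteEntry, and the sample addresses and timings are assumptions made for illustration; the slides do not specify the exact policy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RouteEntry:
    peer_address: str   # logical address, e.g. "1011"
    ip_address: str     # physical address of that peer
    ping_ms: float      # last measured round-trip time


def hamming(a: str, b: str) -> int:
    """Logical distance: number of differing digits."""
    return sum(x != y for x, y in zip(a, b))


def next_hop(current: str, target: str, table: List[RouteEntry]) -> Optional[RouteEntry]:
    """Pick the physically nearest neighbor that still makes logical progress."""
    here = hamming(current, target)
    closer = [e for e in table if hamming(e.peer_address, target) < here]
    if not closer:
        return None  # no neighbor is logically closer (e.g. already at target)
    return min(closer, key=lambda e: e.ping_ms)


# Hypothetical routing table of node 0011 (IPs from the documentation range).
table = [
    RouteEntry("1011", "192.0.2.11", 80.0),
    RouteEntry("0111", "192.0.2.12", 5.0),
    RouteEntry("0001", "192.0.2.13", 10.0),
    RouteEntry("0010", "192.0.2.14", 12.0),
]

hop = next_hop("0011", "1010", table)
print(hop.peer_address, hop.ping_ms)  # 0010 12.0
```

Both 1011 and 0010 are one step closer to 1010, and the ping time breaks the tie in favor of 0010. Node 0111 is faster still but moves away from the target, so it is never chosen.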
Pastry routing table
What we will build in the end
A middleware providing APIs for:
Grid topology connection
Routing content indexing
Fast resource searching
Automatic p2p backup
Automatic resource scheduling
Middleware use case
entrance
Distributed according to the key
index
Middleware use case
Automatic content searching
Middleware use case
Automatic p2p backup
Middleware use case
Automatically finding the idle machine
Middleware use case
Fault tolerance
What will it look like?
Like grid computing!
But more self-organized
But with more p2p character
Roadmap
Build the basic routing module (modify the routing
core of the Tapestry open source project)
Build the content-searching module
Build the automatic backup module
Build the resource scheduling module
Share it …
Thank you! Any further advice is welcome.