Project Schedule
Victor Gau, Yi-Hsien Wang, Trevor Bosaw,
and Jenq-Neng Hwang
2007.12.14
Current P2P Identification Products
Many companies have products that detect P2P traffic, usually to control
or limit the bandwidth these applications consume.
Products can be:
• Port-based
• DPI-based
• Flow-based
Project: Build a Streaming Media Detector
Develop a streaming media (both unicast and P2P) traffic identifier and
controller that lets networks identify which flows contain streaming
media.
As carriers upgrade networks, the ability to deliver Internet video is becoming
a key differentiator.
Enabling theoretical 20-100 Mbps connectivity to the home is only important if
consumer applications (Joost, Babelgum, Xunlei, ppstream, etc.) can be
delivered with high QoS.
State-of-the-art edge routers can support flow-based QoS.
By elevating the QoS of flows needed for streaming media, the consumer
experience can be enhanced.
Related work
P2P Optimized Traffic Control
Riad Hartani and Joe Neil, Caspian Networks
http://www.apricot.net/apricot2005/slides/KT8_2.pdf
Optimizing the Internet Quality of Service and Economics for the Digital
Generation
Lawrence Roberts, Anagran Inc.
http://www.itu.int/osg/spu/presentations/2006/worldtelecom2006/lroberts-itutelecom2006.pdf
Anagran flow router (FR-1000)
http://www.anagran.com/
Tasks
1. Classify network traffic
Identify streaming media traffic on a flow-by-flow basis, using port, DPI, and
flow information.
Implement a control model to update and manage new signatures for both DPI
and flows.
Analyze the accuracy of these methods on a real-world network that heavily
uses streaming media applications.
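One way the three information sources could be combined is sketched below. This is only an illustration, not the project's design: the signature tables, the FlowSummary fields, the label names, and the rate and duration thresholds are all hypothetical. The only facts assumed are that 554 and 1935 are the default RTSP and RTMP ports and that a BitTorrent handshake starts with the byte 0x13 followed by "BitTorrent protocol".

```python
from dataclasses import dataclass

# Hypothetical signature tables; a real deployment would load and update
# these through the control model rather than hard-code them.
PORT_HINTS = {554: "streaming", 1935: "streaming"}  # RTSP, RTMP default ports
PAYLOAD_SIGNATURES = {
    b"RTSP/1.0": "streaming",
    b"\x13BitTorrent protocol": "p2p-file",
}


@dataclass
class FlowSummary:
    dst_port: int
    first_payload: bytes   # payload of the first data packet in the flow
    mean_rate_bps: float   # average transmission rate, bits per second
    duration_s: float      # flow duration in seconds


def classify(flow: FlowSummary) -> str:
    # 1. DPI: look for a known byte pattern in the payload.
    for signature, label in PAYLOAD_SIGNATURES.items():
        if signature in flow.first_payload:
            return label
    # 2. Port: fall back to well-known port numbers.
    if flow.dst_port in PORT_HINTS:
        return PORT_HINTS[flow.dst_port]
    # 3. Flow behavior: illustrative thresholds only (assumption).
    if flow.duration_s >= 30 and flow.mean_rate_bps >= 200_000:
        return "streaming"
    return "unknown"
```

Checking DPI first means a flow with a recognized file-sharing handshake is never promoted just because it is long and fast; whether that ordering is right for this project is an open design question.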
Consideration 1
Which one should be used in this project?
Port/Signature-based identification
Accuracy in identifying known protocols
Ability to identify encrypted or new P2P protocols
Flow-based identification
Ability to identify encrypted or new P2P protocols
False positive rate (could annoy users)
We will design our own algorithms.
Consideration 2
Which performance metrics will be used
in this project?
Accuracy of detection on a per-flow basis
Further deep dives on accuracy: per byte, per 5-tuple, per packet.
Average delay on a per-flow basis before
identification is accomplished
Ease of update of Patterns and Flow Behavior
False positives: identifying background
P2P file delivery as streaming media is very
undesirable.
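As a rough sketch of how per-flow accuracy and average identification delay might be computed from labeled test results (the Result record and its field names are hypothetical, introduced only for this example):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    true_label: str                    # ground-truth class of the flow
    predicted: str                     # class assigned by the identifier
    flow_start_s: float                # arrival time of the flow's first packet
    identified_at_s: Optional[float]   # time of the decision, None if never made


def per_flow_accuracy(results: List[Result]) -> float:
    # Fraction of flows whose predicted class matches the ground truth.
    return sum(r.true_label == r.predicted for r in results) / len(results)


def mean_identification_delay(results: List[Result]) -> Optional[float]:
    # Average time from flow start to identification, over identified flows.
    delays = [r.identified_at_s - r.flow_start_s
              for r in results if r.identified_at_s is not None]
    return sum(delays) / len(delays) if delays else None
```

Flows that are never identified are left out of the delay average here; counting them separately would be an equally reasonable choice.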
Task 1
Classify Network Traffic
1.1 Capture Packets
Construct a detailed list of current P2P and unicast streaming
methods and clients/trackers/servers, as well as the techniques
these clients use to defeat traffic shaping
(for example, Azureus supports encryption, proxying of control
traffic, etc.).
Capture packets generated by popular Internet applications.
P2P file Sharing and Streaming
eMule, FastTrack, BitTorrent, Gnutella, …
Joost, Babelgum, ppstream, Xunlei,
HTTP, FTP, mail, streaming, game, telnet, …
Instant Messaging Services
Skype, MSN, Yahoo! Messenger
Capture real world packets.
Packet Information
Src [IP, port]
Dest [IP, port]
Type (TCP or UDP)
Size
Arrival time
Payload
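The fields above map naturally onto a small record type. The sketch below is one possible representation; the class name, field names, and the direction-independent flow key are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str           # "TCP" or "UDP"
    size: int            # bytes
    arrival_time: float  # seconds
    payload: bytes


def flow_key(p: Packet):
    # Sort the two endpoints so both directions of a conversation
    # share one key (5-tuple, ignoring direction).
    low, high = sorted([(p.src_ip, p.src_port), (p.dst_ip, p.dst_port)])
    return (low, high, p.proto)


def group_into_flows(packets):
    # Map each flow key to the list of its packets, in input order.
    flows = defaultdict(list)
    for p in packets:
        flows[flow_key(p)].append(p)
    return dict(flows)
```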
1.2 Extract Characteristics
Flow duration
Packet counts of the flow
Average packet rate of the flow
Packet size distribution of the flow
Total bytes of the flow
Average transmission rate of the flow
…
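A minimal sketch of extracting these characteristics for a single flow, assuming the flow is given as a non-empty, time-ordered list of (arrival time in seconds, size in bytes) pairs. The function and key names are hypothetical.

```python
from collections import Counter


def flow_features(packets):
    """packets: non-empty list of (arrival_time_s, size_bytes), sorted by time."""
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    duration = times[-1] - times[0]
    total_bytes = sum(sizes)
    return {
        "duration_s": duration,
        "packet_count": len(packets),
        "total_bytes": total_bytes,
        # Rates are reported as 0.0 for single-instant flows to avoid
        # dividing by zero.
        "packet_rate_pps": len(packets) / duration if duration > 0 else 0.0,
        "byte_rate_Bps": total_bytes / duration if duration > 0 else 0.0,
        # Packet size distribution as a histogram: size -> count.
        "size_distribution": Counter(sizes),
    }
```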
1.3 Find Particular Flow Behavior
Find the behavior observed by others.
T. Karagiannis et al.
F. G. Chou
Both TCP and UDP (Ratio)
Ratio of the number of distinct ports versus
number of distinct IPs
Packet size switching frequency per flow
Packet size standard deviation per flow
…
Observe behavior ourselves.
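Some of the behaviors listed above can be expressed as simple per-host or per-flow statistics. The sketch below is illustrative only: in particular, "switching frequency" is interpreted here as the fraction of consecutive packet pairs whose sizes differ, which is an assumption rather than a definition taken from the cited work.

```python
from statistics import pstdev


def port_ip_ratio(peers):
    """peers: iterable of (ip, port) endpoints contacted by one host."""
    peers = list(peers)
    ips = {ip for ip, _ in peers}
    ports = {port for _, port in peers}
    return len(ports) / len(ips) if ips else 0.0


def size_switching_frequency(sizes):
    # Fraction of consecutive packet pairs whose sizes differ (assumed definition).
    if len(sizes) < 2:
        return 0.0
    switches = sum(a != b for a, b in zip(sizes, sizes[1:]))
    return switches / (len(sizes) - 1)


def size_std(sizes):
    # Population standard deviation of packet sizes within one flow.
    return pstdev(sizes) if sizes else 0.0
```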
1.4 Design Program
Packet parser/analyzer
Flow identifier based on a neural network
…
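For the packet parser, a minimal sketch using only the standard library is shown below. It assumes raw IPv4 packets with the link-layer header already stripped, handles only TCP and UDP, and returns a plain dictionary whose keys are hypothetical. The neural-network identifier is not covered here.

```python
import struct

PROTOCOLS = {6: "TCP", 17: "UDP"}  # IP protocol numbers


def parse_ipv4(packet: bytes):
    """Return the packet's addresses, ports, type, size, and payload, or None."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None                         # too short or not IPv4
    ihl = (packet[0] & 0x0F) * 4            # IP header length in bytes
    total_len = struct.unpack("!H", packet[2:4])[0]
    proto = PROTOCOLS.get(packet[9])
    if proto is None or len(packet) < ihl + 4:
        return None                         # unsupported protocol or truncated
    src_port, dst_port = struct.unpack("!HH", packet[ihl:ihl + 4])
    if proto == "UDP":
        payload_off = ihl + 8               # UDP header is always 8 bytes
    else:
        if len(packet) < ihl + 13:
            return None
        # TCP data offset: upper 4 bits of byte 12, in 32-bit words.
        payload_off = ihl + (packet[ihl + 12] >> 4) * 4
    return {
        "src": (".".join(map(str, packet[12:16])), src_port),
        "dst": (".".join(map(str, packet[16:20])), dst_port),
        "type": proto,
        "size": total_len,
        "payload": packet[payload_off:total_len],
    }
```

The arrival time is not part of the packet itself; it would come from the capture layer's timestamp.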
1.5 Test & Improve the Program
Use the identifier on real world data
Optimize its performance
Detection rate
False positive rate
Per Flow
Per Byte
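A sketch of how detection rate and false positive rate could be computed both per flow and per byte. It assumes each test flow is a tuple of (is streaming, flagged as streaming, byte count); this layout and the function name are hypothetical.

```python
def rates(flows, weight=lambda f: 1):
    """flows: iterable of (is_streaming, flagged, n_bytes) tuples.

    Returns (detection_rate, false_positive_rate). With the default
    weight every flow counts once; pass a byte-count weight to get
    per-byte rates instead.
    """
    tp = fn = fp = tn = 0
    for f in flows:
        is_streaming, flagged, _ = f
        w = weight(f)
        if is_streaming and flagged:
            tp += w          # streaming flow correctly detected
        elif is_streaming:
            fn += w          # streaming flow missed
        elif flagged:
            fp += w          # non-streaming flow wrongly flagged
        else:
            tn += w          # non-streaming flow correctly ignored
    detection = tp / (tp + fn) if tp + fn else 0.0
    false_positive = fp / (fp + tn) if fp + tn else 0.0
    return detection, false_positive
```

For example, `rates(flows)` gives the per-flow figures and `rates(flows, weight=lambda f: f[2])` gives the per-byte figures for the same test set.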