Torrent file


BitTorrent
1
Overview
 Released in 2002 by Bram Cohen
 Specification at http://www.bittorrent.org
 Designed so that publishing popular files does not impose a high bandwidth load on the publisher
 Quality of service improvements
 Hybrid architecture
 BitTorrent is a protocol
 There are many clients that implement this
protocol
 Each client is a server as well
2
Some clients
 Mainline client (BitTorrent)
    Began as an open source project by B. Cohen
    Currently a continuation of µTorrent and not open source
 Azureus/Vuze
 Xunlei – Chinese P2P-style download network, which also supports BitTorrent
 BitTornado
 BitComet
 ABC
 Transmission
3
Differences between Clients
 All clients support BitTorrent protocol
 Convenience
 Mainly different graphic options, e.g. download
bar
 Networking features
 Almost pure P2P network via “tracker-less” downloads
 NAT traversal
 Encryption of data and/or control
 Loading multiple files concurrently
4
Differences (cont.)
 Bundling with other capabilities
    Media downloads
    Instant Messaging
 Commercial model
 Freeware and open source
 Adware
 Commercial software
5
Entities in BitTorrent
 Hosts
 Web sites
 Regular nodes
 Tracker
 Swarms
 Software
 BitTorrent client (downloader)
 Web server and web browsers
 Data
 Torrent file
 Data file
 Pieces
6
Discovery of content
 Use the content name as the key
 Search the web (or known sites) for a “torrent” file that includes the content name
 Download the torrent file and discover the tracker from it
7
Downloading content
 The number of pieces and the size of each piece are defined in the torrent file
 A peer queries the tracker for a list of peers
 The list of peers defines the peer's “neighborhood”
 The list of peers is typically incomplete, containing up to 40 peers
 The peer serves pieces to requesting peers
 The peer requests pieces from other peers
 Ethics
    A peer “is supposed” to share content for some time after completing the download
8
Actual Uptime after Download
9
Actual Uptime (cont.)
Uptime after download completion    Percentage of total
1 hour                              17%
10 hours                            3.1%
100 hours                           0.34%
10
Basic Download Flow
[Figure: basic download flow among a web server (torrent file, metadata), a tracker (control), a new peer, and existing peers (control + data).]
11
Torrent file
 Data on tracker and files it is tracking.
 Tracker information
 URL of tracker
 File information
 Name of file
 Size of file
 Number of pieces
 Length of each piece
 SHA-1 hash of each piece
 Optionally, one torrent file can describe several files to download
 Other information
    Date of creation of the torrent file
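
As a rough illustration of the file information listed above, the sketch below splits a byte string into fixed-length pieces and computes the SHA-1 hash of each piece. The function name `piece_hashes` and the tiny piece length are assumptions made for the example; it is not taken from any particular client.

```python
import hashlib
import math


def piece_hashes(data: bytes, piece_length: int) -> list[bytes]:
    """Return the 20-byte SHA-1 digest of every piece of `data`.

    Hypothetical helper: the last piece may be shorter than piece_length.
    """
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    count = math.ceil(len(data) / piece_length)
    return [
        hashlib.sha1(data[i * piece_length:(i + 1) * piece_length]).digest()
        for i in range(count)
    ]


# Ten bytes with a piece length of 4 give pieces of 4, 4 and 2 bytes.
hashes = piece_hashes(b"x" * 10, 4)
print(len(hashes), "pieces")
```

A downloader can recompute the hash of each received piece in the same way and compare it with the value stored in the torrent file.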
12
Swarm
 Set of hosts concurrently downloading content according to a single torrent file
 Single tracker (super node)
 Regular nodes
    Initially, at least one “seeder”
     • Anyone who has the complete file
    Multiple regular nodes (sometimes called “leechers”)
13
Tracker
 Who is a tracker?
 Connectivity
 HTTP connection
 By default on port 6969
 Usual NAT and firewall problems
 Data
 List of all swarm peers (active and not active)
 List of peers who have completed the download
 List of upload/download ratio for each peer
14
Tracker functions
 Maintain lists of swarm members and their
state
 Maintain upload : download ratio for each
peer
 Allow a peer to receive peer lists or filter
the peers it is receiving based on its ratio
 The tracker does not have exact
information on the download/upload
process
15
Tracker-Peer connections
 Peer connects to the tracker over the TCP port specified in the torrent file
 The connection uses an HTTP GET request and the response to it
 Peer request, some important fields
    port – the port number that the client listens on for BitTorrent data, by default in the range 6881-6889
    uploaded & downloaded – number of bytes uploaded to (downloaded from) other peers
    left – number of bytes left to download
    event – one of “started”, “stopped” (e.g. when the client is turned off), or “completed”
    numwant – number of peers that this peer wants to receive from the tracker
    peer_id (Peer ID) – the peer's ID in the overlay network
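
The request fields above can be sketched as the query string of an HTTP GET. This is a minimal illustration only: the tracker URL, the peer ID and the helper name `announce_url` are hypothetical, and the `info_hash` field (which the published protocol also requires) is included as an assumption because the slide lists only some of the fields.

```python
from urllib.parse import urlencode


def announce_url(tracker_url: str, info_hash: bytes, peer_id: bytes,
                 port: int, uploaded: int, downloaded: int, left: int,
                 event: str = "started", numwant: int = 40) -> str:
    """Build the GET URL a peer would send to the tracker (sketch)."""
    query = urlencode({
        "info_hash": info_hash,    # raw 20 bytes, percent-encoded by urlencode
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
        "event": event,
        "numwant": numwant,
    })
    return f"{tracker_url}?{query}"


# Hypothetical tracker and peer values, for illustration only.
url = announce_url("http://tracker.example.org:6969/announce",
                   info_hash=bytes(20), peer_id=b"-XX0001-123456789012",
                   port=6881, uploaded=0, downloaded=0, left=1000)
print(url)
```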
16
Tracker response
 Includes
    HTTP header
    Data – a bencoded dictionary
     • Interval – time the peer should wait before requesting the peer list again
     • List of peer IP addresses and ports
17
Overlay Network
 Separate overlay network for each swarm
 Nodes
 Tracker
 Peers
 Edges
 Bi-directional between the tracker and every
peer
 Possibly unidirectional between peers
 A peer obtains neighbors by:
     • Receiving a peer list from the tracker
     • Being contacted by another peer that requests data
18
Network Dynamism
 Peers change
    New peers add themselves to the swarm
    Peers leave the swarm due to:
     • Completion of download
     • Need for upload bandwidth for other purposes
     • Any other reason
 Pieces
    Mapping of pieces to peers changes constantly
19
Peer to Peer connections
 Sometimes called Peer Wire Protocol (PWP)
 Begins with a handshake packet sent by the requesting peer
    Includes the SHA-1 hash identifying the requested torrent (the info hash)
    Includes the peer ID of the requesting client
 A peer Alice has two states for each peer Bob
    Interested (True/False) – Alice wants (doesn’t want) to download data from Bob
    Choked (True/False) – Bob isn’t allowed (is allowed) to download data from Alice
 All peers maintain the list of states they have for other peers and the list of states other peers have for them
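
The handshake and the per-connection state can be sketched as follows. The 68-byte layout (length byte, protocol name, eight reserved bytes, hash, peer ID) follows the published protocol specification rather than the slide itself, and the names `build_handshake` and `LinkState` are hypothetical.

```python
import struct
from dataclasses import dataclass

PROTOCOL_NAME = b"BitTorrent protocol"


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Pack the handshake sent by the requesting peer (sketch)."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    return struct.pack("!B19s8s20s20s", len(PROTOCOL_NAME), PROTOCOL_NAME,
                       bytes(8), info_hash, peer_id)


@dataclass
class LinkState:
    """Alice's view of her connection to Bob.

    Connections start out choked and not interested in both directions.
    """
    am_interested: bool = False    # Alice wants data from Bob
    am_choking: bool = True        # Bob is not allowed to download from Alice
    peer_interested: bool = False  # Bob wants data from Alice
    peer_choking: bool = True      # Alice is not allowed to download from Bob


handshake = build_handshake(b"\x01" * 20, b"-XX0001-123456789012")
print(len(handshake), "bytes")
```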
20
Tit for Tat
 Problem: users want to download, not to upload
 Solution philosophy :
 Upload only to those that allow your download (tit for
tat)
 Specifics:
 Each peer uploads to several other peers (e.g. five), which are unchoked
    The four peers that provide it with the most download bandwidth
    One random peer from the swarm
     • This is called “optimistic unchoking”
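
The selection rule above can be sketched in a few lines. This is a simplified illustration, assuming a hypothetical `download_rates` map from peer name to measured download rate; real clients also consider which peers are interested and re-evaluate the choice periodically.

```python
import random


def choose_unchoked(download_rates: dict[str, float],
                    regular_slots: int = 4, rng=random):
    """Return (best_peers, optimistic_peer) for one round (sketch).

    best_peers: the peers giving us the most download bandwidth.
    optimistic_peer: one random peer among the rest, or None.
    """
    by_rate = sorted(download_rates, key=download_rates.get, reverse=True)
    best = by_rate[:regular_slots]
    others = by_rate[regular_slots:]
    optimistic = rng.choice(others) if others else None
    return best, optimistic


rates = {"a": 10, "b": 50, "c": 30, "d": 5, "e": 40, "f": 1}
best, optimistic = choose_unchoked(rates)
print(best, optimistic)
```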
21
Tit for Tat (cont.)
 Uploading to a random peer lets a peer discover better peers to download from
 Without randomness what would happen to newcomers?
 Peers with similar upload bandwidth tend to download from one another
22
Peer to Peer Messages
 Peer to peer messages are sent by each peer to all its neighbors in the overlay network
 Message types
    Notification of state (interested, choked)
    Request list of pieces
    Publish local list of pieces
    Have message – receipt of a new piece
    Request a block of data
    Send a block of data
    Cancel a requested block of data
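
A sketch of how such messages are framed on the wire. The four-byte big-endian length prefix and the numeric message IDs follow the published protocol specification, not the slide; the helper names are hypothetical.

```python
import struct
from enum import IntEnum


class MsgId(IntEnum):
    CHOKE = 0
    UNCHOKE = 1
    INTERESTED = 2
    NOT_INTERESTED = 3
    HAVE = 4
    BITFIELD = 5   # publishes the sender's local list of pieces
    REQUEST = 6
    PIECE = 7      # carries a block of data
    CANCEL = 8


def encode_message(msg_id: MsgId, payload: bytes = b"") -> bytes:
    """Length prefix (ID byte + payload), then the ID, then the payload."""
    return struct.pack("!IB", 1 + len(payload), msg_id) + payload


def encode_request(index: int, begin: int, length: int) -> bytes:
    """Request `length` bytes at offset `begin` of piece `index`."""
    return encode_message(MsgId.REQUEST, struct.pack("!III", index, begin, length))


def decode_message(frame: bytes):
    """Split one complete frame into (MsgId, payload)."""
    (length,) = struct.unpack("!I", frame[:4])
    body = frame[4:4 + length]
    return MsgId(body[0]), body[1:]


print(encode_request(0, 0, 16 * 1024).hex())
```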

23
File Structure
 File: divided into pieces (Piece 1, Piece 2, Piece 3, ...)
 Piece: divided into blocks (Block 1, Block 2, Block 3, ...)
    Typical size: 256 Kbytes
    Verified by SHA-1
 Block
    Typical size: 16 Kbytes
    Basic element in traffic
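
With the typical sizes above, one piece is fetched as sixteen block requests. A small sketch of that arithmetic, using a hypothetical helper `block_requests`:

```python
def block_requests(piece_size: int, block_size: int = 16 * 1024):
    """Return (offset, length) pairs covering one piece, block by block."""
    return [
        (begin, min(block_size, piece_size - begin))
        for begin in range(0, piece_size, block_size)
    ]


full_piece = block_requests(256 * 1024)
print(len(full_piece), "blocks in a 256 Kbyte piece")

# The last piece of a file may be shorter, so its final block is shorter too.
print(block_requests(40000))
```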
24
Download Order
 Sequential order
    Typical of client-server architectures, e.g. HTTP, FTP
 Random order
    Used by BitTorrent clients upon joining a swarm
 Rarest first order
    Used by BitTorrent clients most of the time
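
Rarest first selection can be sketched as counting how many neighbors hold each piece and picking one of the least common pieces that is still needed. The data layout (`peer_pieces` as a map from peer name to a set of piece indices) and the random tie-break are assumptions for the example.

```python
import random
from collections import Counter


def rarest_first(needed: set[int], peer_pieces: dict[str, set[int]], rng=random):
    """Pick a needed piece held by the fewest neighbors, or None (sketch)."""
    availability = Counter(
        piece for pieces in peer_pieces.values() for piece in pieces
    )
    candidates = [p for p in needed if availability[p] > 0]
    if not candidates:
        return None  # no neighbor has any piece we still need
    rarest = min(availability[p] for p in candidates)
    return rng.choice(sorted(p for p in candidates if availability[p] == rarest))


peers = {"a": {0, 1, 2}, "b": {0, 1}, "c": {0}}
print(rarest_first({0, 1, 2}, peers))  # piece 2 is held by only one peer
```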
25
Endgame mode
 When a peer needs only a few pieces to complete the file, they tend to trickle in very slowly
    The peer is usually downloading from very few peers
    The peer will get the remaining pieces only if those peers have them, or through optimistic unchoking
 Solution
    Send requests to all peers in the swarm
    Send cancel messages once the piece is retrieved
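
The bookkeeping behind this solution can be sketched as two small steps: request every missing block from every peer that has it, then cancel the duplicates when the first copy arrives. The function names and the (piece, offset) block keys are hypothetical.

```python
def endgame_requests(missing_blocks, peers_with_block):
    """Map each missing block to every peer that can supply it."""
    return {
        block: set(peers_with_block.get(block, ()))
        for block in missing_blocks
    }


def cancels_after_receipt(block, received_from, outstanding):
    """Return the peers that should be sent a cancel for `block`.

    Removes the block from `outstanding` because it is now complete.
    """
    return outstanding.pop(block, set()) - {received_from}


outstanding = endgame_requests([(7, 0)], {(7, 0): ["a", "b", "c"]})
print(cancels_after_receipt((7, 0), "b", outstanding))
```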
26
Bencoding
 Encoding of metadata in BitTorrent
 Strings
    <length in base-10>:<string value>
    For example 8:announce
 Integers
    i<integer value in base-10>e
    For example i123e
 List: sequence of bencoded values
    l<values>e
    For example li123ei456ee, which holds the two integers 123 and 456
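
The three rules above fit in a short encoder. This is a minimal sketch with a hypothetical function name `bencode`; it covers only strings, integers and lists.

```python
def bencode(value) -> bytes:
    """Bencode a str/bytes, int, or list of such values (sketch)."""
    if isinstance(value, bool):
        raise TypeError("booleans are not bencodable")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


print(bencode("announce"))
print(bencode(123))
print(bencode([123, 456]))
```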
27
Bencoding (cont.)
 Dictionary: list of <key, value> pairs, where each key is a bencoded string
    d<pairs of key, value>e
    For example:
d8:announce30:http://tracker.prq.to/announcee

 Show example of Torrent file
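
Reading bencoded data back is a small recursive parse. The sketch below, with a hypothetical function name `bdecode`, handles strings, integers, lists and dictionaries and is run on the dictionary example from this slide; it does little validation and is not meant as a complete decoder.

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value at `pos`; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                      # list: l<values>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                      # dictionary: d<key value ...>e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():                    # string: <length>:<bytes>
        colon = data.index(b":", pos)
        start = colon + 1
        length = int(data[pos:colon])
        return data[start:start + length], start + length
    raise ValueError(f"unexpected byte at position {pos}")


raw = b"d8:announce30:http://tracker.prq.to/announcee"
decoded, consumed = bdecode(raw)
print(decoded)
```

Strings are returned as bytes because bencoded strings may carry binary data, such as the piece hashes in a torrent file.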
28
Encryption in BitTorrent
 There are various methods to identify BitTorrent traffic
    Specific fields in peer-tracker communication, such as the HTTP User-Agent
    PWP commands in peer to peer traffic
    Others
 ISPs identify BitTorrent traffic to throttle or disrupt it
 Encryption is designed to hide BitTorrent traffic
29
Encryption (cont.)
 Protocol Encryption (PE) or Message Stream Encryption (MSE)
 Key establishment
    Diffie-Hellman
    Hash of the torrent file is used to further identify users (they exchange a hash of the DH shared key and a hash of the torrent file hash)
 RC4 encryption of data
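
The Diffie-Hellman step can be illustrated with toy numbers. The prime 23 and generator 5 below are assumptions chosen only to keep the arithmetic readable; they are insecure and are not the parameters used by PE/MSE, and the hashing and RC4 steps are not shown.

```python
import secrets

P = 23  # toy prime, for illustration only
G = 5   # toy generator, for illustration only


def dh_keypair():
    """Return (private, public) where public = G**private mod P."""
    private = secrets.randbelow(P - 2) + 1  # value in 1..P-2
    return private, pow(G, private, P)


def dh_shared(their_public: int, my_private: int) -> int:
    """Both sides compute the same secret from the other's public value."""
    return pow(their_public, my_private, P)


alice_private, alice_public = dh_keypair()
bob_private, bob_public = dh_keypair()
print(dh_shared(bob_public, alice_private) == dh_shared(alice_public, bob_private))
```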
30
BitTorrent without trackers
 Trackers are a form of super-node
 In tracker-less mode, all nodes are equal
 Files and file pieces are associated with a unique key
    Their SHA-1 hash
 Publisher
    Publishes a torrent file with the SHA-1 hash of the file (instead of a tracker URL)
    Publishes it on a web site, by mail, etc.
 Peers use distributed hash tables (DHTs) to locate pieces
 Peers connect to swarms using pre-set IP addresses
 DHT implementations in BitTorrent
    Kademlia based (for mainline client)
    Azureus/Vuze
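
Kademlia-style DHTs measure how close a node is to a key by the XOR of their identifiers. A minimal sketch of that idea follows, using one-byte IDs for readability (real IDs are as long as a SHA-1 hash); the function names are hypothetical and the routing tables and network messages of an actual DHT are not shown.

```python
def node_distance(a: bytes, b: bytes) -> int:
    """XOR distance between two equal-length identifiers."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_nodes(key: bytes, node_ids: list[bytes], k: int = 2) -> list[bytes]:
    """Return the k node IDs closest to `key` under the XOR metric."""
    return sorted(node_ids, key=lambda node: node_distance(node, key))[:k]


key = bytes([0b0101])
nodes = [bytes([0b0100]), bytes([0b1101]), bytes([0b0111])]
print(closest_nodes(key, nodes))
```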
31