Malware Prevalence in the
Kazaa File-Sharing Network
Authors:
Seungwon Shin,
Jaeyeon Jung,
and Hari Balakrishnan
Internet Measurement Conference 2006
Presented by:
Arun Krishnamurthy
The Outline

Intro and problems of Kazaa
How does Kazaa work? Is piracy the only problem?
Krawler: The Kazaa Web Crawler
What does it do? How does it work?
Experimentation and Results
What nasty stuff did Krawler find? How did it propagate?
My Comments
What was good? What was bad? How to improve?
Let’s talk Kazaa!
Intro to Kazaa

File-sharing software created in 2000 by
Sharman Networks [1].

The main program bundles spyware/adware.
Variants of Kazaa do not contain malware.
Uses supernodes to search for files, unlike Napster, which uses a centralized server for searching.
[1] Wikipedia
Centralized Server Searching
(Like Napster)
[Diagram: Peers 1 to 6 and a pirate around a main server. Request: "I want 'A Pirates Life for me'!" Reply: "Peer 6 has 'A Pirates Life for me'."]
Supernodes Searching
(Like Kazaa)
[Diagram: Hook wants the Peter Pan movie. One lookup comes back "404'D!"; Alligator has the Peter Pan movie.]
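The two search models in the diagrams can be contrasted with a small sketch. It is only an illustration, not the actual Napster or Kazaa protocol: the function names, the supernode names (`SN-A`, `SN-B`), the links between them, and the hop limit are all made up for this example.

```python
from collections import deque

def central_search(index, filename):
    """Napster-style: one server holds the index of every peer's files."""
    return sorted(peer for peer, files in index.items() if filename in files)

def supernode_search(supernodes, links, start, filename, max_hops=3):
    """Kazaa-style: ask one supernode, which forwards the query to its neighbors."""
    seen, queue, hits = {start}, deque([(start, 0)]), []
    while queue:
        node, hops = queue.popleft()
        # Each supernode indexes only the files of its own ordinary peers.
        hits += [p for p, files in supernodes[node].items() if filename in files]
        if hops < max_hops:
            for nxt in links.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
    return sorted(hits)

# Hypothetical toy data mirroring the two diagrams.
napster_index = {"Peer 1": set(), "Peer 6": {"A Pirates Life for me"}}
kazaa_supernodes = {
    "SN-A": {"Hook": set()},
    "SN-B": {"Alligator": {"Peter Pan movie"}},
}
kazaa_links = {"SN-A": ["SN-B"], "SN-B": ["SN-A"]}

print(central_search(napster_index, "A Pirates Life for me"))
print(supernode_search(kazaa_supernodes, kazaa_links, "SN-A", "Peter Pan movie"))
```

In the centralized model a single lookup answers the query; in the supernode model the answer depends on which supernodes the query reaches.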
Problems with Kazaa

The problem isn’t just piracy!

We also have to worry about malware!!!
Malware created by malicious peers to attack other peers' computers.
Dummy files created by the RIAA and MPAA to track and sue illegal uploaders/downloaders!

Krawler: A Kazaa Web Crawler
What’s a Crawler?

A web crawler is a program or automated script that browses the World Wide Web in a methodical, automated manner [1].
[Diagram: the web crawler (spider) asks the World Wide Web, "Give me data!", and data comes back.]
[1] Wikipedia
Krawler: A Kazaa Crawler

Browses Kazaa in search of malicious programs.

Two components:

Dispatcher
Maintains the list of supernodes.
Fetcher
Communicates with the dispatcher.
Updates the set of supernodes to crawl.
Sends query strings to individual supernodes.

Krawler: A Kazaa Crawler
(Basic Idea)

Begin with the IP addresses of 200 known supernodes and a set of query strings for the files being sought.
Try to connect to each supernode.
If the connection fails, wait until the next round to get an IP address.
If it succeeds, exchange handshake messages with the supernode.
Retrieve a supernode refresh list of 200 supernode IP addresses. Save the list in the dispatcher.
Send a set of queries to each supernode and wait for responses.
Download any matches and scan them for viruses.
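The steps above can be sketched as one crawl round. This is a toy simulation under stated assumptions, not Krawler's code: the `Dispatcher` class, `crawl_round`, the in-memory `network` dictionary standing in for real connections and handshakes, and the `scan` callback standing in for a virus scanner are all hypothetical.

```python
class Dispatcher:
    """Maintains the list of known supernode addresses."""

    def __init__(self, seeds):
        self.known = set(seeds)

    def next_batch(self, n):
        # Deterministic choice for the sketch; a real crawler might rotate.
        return sorted(self.known)[:n]

    def save(self, refresh_list):
        self.known.update(refresh_list)


def crawl_round(dispatcher, network, queries, scan, batch_size=200):
    """One round: connect, save the refresh list, query, scan the matches."""
    infected = []
    for addr in dispatcher.next_batch(batch_size):
        node = network.get(addr)      # stand-in for connect + handshake
        if node is None:
            continue                  # connection failed: wait for next round
        dispatcher.save(node["refresh"])
        for query in queries:
            for name, data in node["files"].items():
                if query.lower() in name.lower() and scan(data):
                    infected.append((addr, name))
    return infected


# Hypothetical two-supernode network; 10.0.0.9 is unreachable.
network = {
    "10.0.0.1": {"refresh": ["10.0.0.2"], "files": {"ICQ Lite (new).exe": b"EVIL"}},
    "10.0.0.2": {"refresh": ["10.0.0.1"], "files": {"WinZip 8.1.exe": b"clean"}},
}
dispatcher = Dispatcher(["10.0.0.1", "10.0.0.9"])
found = crawl_round(dispatcher, network, ["icq", "winzip"],
                    scan=lambda data: data == b"EVIL")
print(found)
print(sorted(dispatcher.known))
```

After the round, the dispatcher knows one more supernode (taken from the refresh list), which is how the crawl widens from its initial seed set.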
Experimentation and Results
Collecting Data

Three machines used:



2.1 GHz dual-core CPU w/ 1 GB RAM
2.1 GHz CPU w/ 1.5 GB RAM
1.42 GHz CPU w/ 1 GB RAM

Allowed Krawler to investigate 60K files/hour.

Two Measurement Methods:


Query Strings
Virus Signatures
Collecting Data
(Query Strings)

File information is limited to files whose names match a query string.

Many viruses create multiple copies of themselves under different legitimate-looking file names to increase their chances of being downloaded.

Only .exe files are investigated.
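A small sketch of why one virus can show up under several names. It assumes each query response carries a file name and a content hash; the function name `group_exe_by_hash` and the placeholder hashes (`hash-A`, etc.) are hypothetical and only serve the illustration.

```python
from collections import defaultdict

def group_exe_by_hash(responses):
    """Map each content hash to the distinct .exe names it was shared under."""
    groups = defaultdict(set)
    for name, content_hash in responses:
        if name.lower().endswith(".exe"):   # only .exe files are investigated
            groups[content_hash].add(name)
    return {h: sorted(names) for h, names in groups.items()}

# Hypothetical responses: two names share one hash, so they are the same file.
responses = [
    ("WinZip 8.1.exe", "hash-A"),
    ("ICQ Lite (new).exe", "hash-A"),
    ("Adobe Photoshop 10 full.exe", "hash-B"),
    ("song.mp3", "hash-C"),
]
groups = group_exe_by_hash(responses)
for content_hash, names in groups.items():
    print(content_hash, names)
```

Grouping by hash rather than by name counts each distinct program once, no matter how many disguises it uses.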
Collecting Data
(Virus Signatures)

In 2002, security vendor sites found more
than 200 viruses propagating over P2P.


Krawler has 71 content hashes of these viruses.
A Kazaa content hash is 20 bytes in size.
First 16 bytes: MD5 signature.
Last 4 bytes: length of the file.
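The 20-byte layout can be illustrated with a simplified sketch. The slides do not say which bytes of the file are fed to MD5 or how the length is encoded, so this example assumes MD5 over the whole file and a 4-byte big-endian length; the real Kazaa hash may differ in both respects. The function names and sample data are hypothetical.

```python
import hashlib

def content_hash(data: bytes) -> bytes:
    """Build a 20-byte signature: 16-byte MD5 digest + 4-byte file length."""
    md5_part = hashlib.md5(data).digest()                      # 16 bytes
    # Assumption: length stored big-endian, truncated to 32 bits.
    length_part = (len(data) & 0xFFFFFFFF).to_bytes(4, "big")  # 4 bytes
    return md5_part + length_part

def is_known_virus(data: bytes, signatures: set) -> bool:
    """Check a downloaded file against a set of known content hashes."""
    return content_hash(data) in signatures

# Hypothetical signature set built from one made-up sample.
known_signatures = {content_hash(b"pretend this is a virus body")}

print(len(content_hash(b"abc")))
print(is_known_virus(b"pretend this is a virus body", known_signatures))
print(is_known_virus(b"harmless file", known_signatures))
```

Matching on hashes lets a crawler flag a known virus from the query response alone, without downloading the file first.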

Malware Distribution

Krawler found 45 viruses in Feb 06 and 52
viruses in May 06.

SdDrop infected the largest number of clients!

Files returned for ICQ and Trillian had the highest chance of
being infected (over 70%)!
Malware Distribution
(Top 10 Viruses Graph)
Malware Distribution
(Most Infected Files Graph)
Virus Propagation

Many viruses disguise themselves under legitimate-looking filenames.
Adobe Photoshop 10 full.exe
WinZip 8.1.exe
ICQ Lite (new).exe
Many viruses use peers to propagate.
They are placed in folders used for file sharing.
Some viruses don’t rely on P2P alone for propagation.
Emails, web sites, messengers, etc.
Virus Propagation
(Breakdown Chart)
Characteristics of Infected Hosts

Krawler found 1,618 infected hosts in Feb 06.
Krawler found 2,576 infected hosts in May 06.
78 hosts (about 5 percent of those infected in Feb) were still infected in May!
Many infected hosts were used for botnets, DoS attacks, and spam relaying.
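The percentages follow directly from the host counts on this slide. A quick check (the variable names are just for this example):

```python
feb_hosts = 1618        # infected hosts found in Feb 06
may_hosts = 2576        # infected hosts found in May 06
still_infected = 78     # hosts infected in both measurements

# Share of the Feb hosts that were still infected in May.
persist_pct = 100 * still_infected / feb_hosts
# Growth in the number of infected hosts between the two measurements.
growth_pct = 100 * (may_hosts - feb_hosts) / feb_hosts

print(f"still infected: {persist_pct:.1f}% of Feb hosts")
print(f"growth Feb to May: {growth_pct:.1f}%")
```

The "about 5 percent" figure is relative to the February count; measured against the May count the same 78 hosts would be closer to 3 percent.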
Characteristics of Infected Hosts
(Attack Methods Chart)
My Comments
Strengths

Identifies many types of viruses in the Kazaa
network.

Identifies the infected programs as well!

Easy to understand and possibly implement.

So easy, a caveman can understand it!
Weaknesses

Only searched the Kazaa network.
How about BitTorrent, LimeWire, Morpheus, etc.?
Only searched .exe files.
MP3 files can also be a problem (think RIAA).
Experiments could have lasted a bit longer.
Feb 06 to May 06 is a little short.
How about running them for 6 months or 1 year?

Suggestions

Scan for viruses in other file types.
mp3, mov, dll, doc, etc.
Scan for viruses in other P2P applications.

Scan and filter out any dummy files from those
RIAA and MPAA <explicit deleted>!
Conclusion

Piracy isn’t the only problem in Kazaa and other P2P networks.
We also have to worry about malware!
Krawler does a very good job of finding malicious programs in Kazaa.
Also easy to understand!
Would love Krawler to search other file extensions and run longer experiments.
Anti-Piracy PSA
Piracy Hurts! 

Piracy not only hurts well-paid artists!
Hurts producers!
Hurts directors!
Hurts low-paid workers!
Also hurts consumers!!!
Higher prices to counter lost sales.
Piracy is not only wrong, it’s a CRIME!!!
PROPAGANDA WARNING!!!
Put an end to piracy…
…use open source
materials instead!
Find out more at Free Software Foundation and Creative Commons.