CyberProbe: Towards Internet-Scale
Active Detection of Malicious Servers
Antonio Nappa*‡, Zhaoyan Xu†, M. Zubair Rafique*, Juan Caballero*, Guofei Gu†
*IMDEA Software Institute, ‡Universidad Politécnica de Madrid
{antonio.nappa, zubair.rafique, juan.caballero}@imdea.org
†SUCCESS Lab, Texas A&M University
{z0x0427, guofei}@cse.tamu.edu
Presented by: Shasha Wen
Outline
Problem
Current ways and limitations
CyberProbe approach
Fingerprint generation
Scanner
Evaluation
Discussion and conclusion
Problem: Cybercrime
spam
click fraud
ransomware
theft
C&C server → controls the malware
Exploit server → distributes the malware
Web server → monitors the operation
Redirector → leads fake clicks
…
Goal: identify these servers
Ways to detect the server
Passive: monitoring
Monitor protected hosts
Run malware in contained environment
Observes only the servers involved → limited coverage; can it be increased to Internet scale?
Slow, detects asynchronously → the server may already be dead
Active: Honey client farms
Visit URLs, crawling
Focus on exploit servers
Achieving coverage is expensive
CyberProbe: approach
Send probes to remote hosts and examine their responses to
determine whether the remote hosts are malicious.
What probes to send → adversarial fingerprint
How to send the probes → scanning
[Figure: CyberProbe architecture. Labels: benign traffic, network trace, adversarial fingerprint generator, fingerprints, port, target range, scanning, malicious servers]
Problem definition
Network fingerprinting
Fingerprint: the type, version, configuration of networking software
Identify software at different layers
A fingerprint → one malicious family
e.g. C&C software; exploit kit
A family → multiple fingerprints
Problem definition
Host h; target hosts H; target family x
Fingerprint: FG_x = <P, f_P>
P(h): sequence of probes sent to h; R_P: the responses
f_P(R_P): true if h ∈ x
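The definition above can be read as a small data structure: a probe construction function P paired with a classification function f_P. The sketch below is only an illustration of that reading. The names `Fingerprint`, `is_member`, `fake_send`, the toy family, and its `X-Toy-Panel` token are all hypothetical and do not come from CyberProbe. A fake transport is used so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration; none of these names come from CyberProbe.
ProbeBuilder = Callable[[str], List[bytes]]   # P: target IP -> probe payloads
Classifier = Callable[[List[bytes]], bool]    # f_P: responses R_P -> member of family x?


@dataclass
class Fingerprint:
    family: str                # target family x
    build_probes: ProbeBuilder
    classify: Classifier


def is_member(fp: Fingerprint, send: Callable[[str, bytes], bytes], host: str) -> bool:
    """Send P(h) to host h, collect R_P, and apply f_P."""
    responses = [send(host, probe) for probe in fp.build_probes(host)]
    return fp.classify(responses)


# Toy fingerprint: one HTTP probe, classified by a made-up response token.
toy = Fingerprint(
    family="toyfamily",
    build_probes=lambda ip: [b"GET /gate.php HTTP/1.1\r\nHost: " + ip.encode() + b"\r\n\r\n"],
    classify=lambda responses: any(b"X-Toy-Panel" in r for r in responses),
)


def fake_send(host: str, probe: bytes) -> bytes:
    """Stand-in for the network: only one documentation address 'runs' the toy family."""
    if host == "203.0.113.7":
        return b"HTTP/1.1 200 OK\r\nX-Toy-Panel: 1\r\n\r\n"
    return b"HTTP/1.1 404 Not Found\r\n\r\n"
```

Calling `is_member(toy, fake_send, host)` for each host in H corresponds to one scan with one fingerprint.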
Fingerprint generation Overview
Framework, different from other fingerprint generation approaches (FiG)
Minimize traffic
produce inconspicuous probes
Replay observed requests
Network signature → classification function fP
Fingerprint generation: RRP [1] extraction
Protocol feature
Protocol signatures capture keywords in the early part of a message
e.g. GET or POST in HTTP
Unknown → fall back to the transport protocol
Filter
Endpoint is one of the top 100,000 Alexa domains → benign
RRPs with identical requests → avoid replaying the same request
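The extraction and filtering steps above can be sketched as follows. This is a minimal illustration, not CyberProbe's code: the `RRP` tuple, the `protocol_of` and `filter_rrps` helpers, and the short list of HTTP keywords are assumptions, and the benign domain set stands in for the Alexa top 100,000 list.

```python
from typing import List, NamedTuple, Set


class RRP(NamedTuple):
    host: str        # domain or IP address of the remote endpoint
    request: bytes
    response: bytes


# Assumption: a tiny keyword list standing in for real protocol signatures.
HTTP_KEYWORDS = (b"GET ", b"POST ", b"HEAD ")


def protocol_of(request: bytes, transport: str) -> str:
    """Label a request by keywords in its early bytes, else by its transport."""
    return "http" if request.startswith(HTTP_KEYWORDS) else transport


def filter_rrps(rrps: List[RRP], benign_domains: Set[str]) -> List[RRP]:
    """Drop RRPs to well-known benign domains and RRPs that repeat a request."""
    kept: List[RRP] = []
    seen_requests: Set[bytes] = set()
    for rrp in rrps:
        if rrp.host.lower() in benign_domains:
            continue                      # popular domain, assumed benign
        if rrp.request in seen_requests:
            continue                      # identical request, replay it only once
        seen_requests.add(rrp.request)
        kept.append(rrp)
    return kept
```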
[1] RRP: request response pairs
Fingerprint generation: Replay
Replay request to every malicious endpoint
Identify requests that lack replay protection
Requests replayed with a distinctive response
Use a Virtual Private Network (VPN)
Malware managers may notice replays that arrive in an incorrect order or are invalid
Independence
Requests that generate a response without prior communication
Fingerprint generation: Replay
Filtering benign servers
Discard RRPs with no response or an error response
Discard RRPs where the server's responses to the replayed request and to a random request are similar
Replay the remaining RRPs twice more
Output
Replayed RRPs, excluding the original ones
Unique endpoints → seed servers
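The benign-server filter above can be sketched as a single predicate. This is a hedged illustration only: the slides do not say how response similarity is measured, so the use of `difflib.SequenceMatcher`, the 0.9 threshold, and the rule that HTTP 4xx/5xx status lines count as errors are all assumptions, as are the function names.

```python
from difflib import SequenceMatcher
from typing import Optional


def is_error(response: bytes) -> bool:
    """Assumption: for HTTP, a 4xx or 5xx status line counts as an error."""
    parts = response.split(b" ", 2)
    return (
        len(parts) >= 2
        and parts[0].startswith(b"HTTP/")
        and parts[1][:1] in (b"4", b"5")
    )


def keep_replayed(replayed: Optional[bytes],
                  random_reply: Optional[bytes],
                  threshold: float = 0.9) -> bool:
    """Return True if a replayed RRP survives the benign-server filter."""
    if not replayed or is_error(replayed):
        return False                      # no response, or an error response
    if random_reply:
        # A server that answers a random request the same way is not distinctive.
        similarity = SequenceMatcher(
            None, replayed.decode("latin-1"), random_reply.decode("latin-1")
        ).ratio()
        if similarity >= threshold:
            return False
    return True
```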
Fingerprint generation: Cluster by request similarity
For HTTP
Requests have the same method, same path, similar parameters
For other protocols
Clustering
Same transport protocol, size, and content, and sent to the same port
Probe construction function
One for each cluster
One of the probes in the cluster, with value fields replaced by TARGET and SET macros
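A minimal sketch of the HTTP clustering and of a probe construction function follows. Several things here are assumptions: "similar parameters" is approximated as "same set of parameter names", the `{TARGET}` placeholder syntax is hypothetical, and the SET macro is left out because the slides do not describe it.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def http_cluster_key(request_line: str):
    """Key on method, path, and the set of parameter names (values ignored)."""
    method, url, _version = request_line.split(" ", 2)
    parts = urlsplit(url)
    names = frozenset(
        name for name, _ in parse_qsl(parts.query, keep_blank_values=True)
    )
    return (method, parts.path, names)


def cluster(request_lines):
    """Group HTTP request lines that share a cluster key, in first-seen order."""
    clusters = defaultdict(list)
    for line in request_lines:
        clusters[http_cluster_key(line)].append(line)
    return list(clusters.values())


def make_probe_builder(template: str):
    """Build a probe construction function from one cluster representative."""
    def build(target_ip: str) -> bytes:
        # Hypothetical macro syntax: {TARGET} is replaced by the scanned address.
        return template.replace("{TARGET}", target_ip).encode("ascii")
    return build
```

One builder is created per cluster, so the scanner only needs to pass in each target IP address to obtain the payload.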
Fingerprint generation: Signature generation
Find the distinctive tokens: coverage > 0.4 and fp < 10^{-9}
$$\text{coverage} = \frac{\text{responses in cluster that contain the token}}{\text{total responses in cluster}}$$
$$\text{fp} = \frac{\text{responses in benign traffic that contain the token}}{\text{total responses in benign traffic}}$$
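The two ratios can be computed directly, as in the sketch below. It is an illustration under stated assumptions: tokens are taken to be whitespace-separated substrings of the cluster's responses, which is a simplification chosen here, and `select_tokens` is a hypothetical name. With a small benign corpus, fp < 10^{-9} simply means the token never appears in benign traffic.

```python
def select_tokens(cluster_responses, benign_responses,
                  min_coverage=0.4, max_fp=1e-9):
    """Return (token, coverage, fp) for tokens that are distinctive for a cluster."""
    # Assumption: candidate tokens are whitespace-separated pieces of responses.
    candidates = set()
    for response in cluster_responses:
        candidates.update(response.split())

    selected = []
    for token in sorted(candidates):
        # Fraction of the cluster's responses that contain the token.
        coverage = sum(token in r for r in cluster_responses) / len(cluster_responses)
        # Fraction of benign responses that contain the token.
        fp = sum(token in r for r in benign_responses) / len(benign_responses)
        if coverage > min_coverage and fp < max_fp:
            selected.append((token, coverage, fp))
    return selected
```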
Scanning overview
Target ranges
Internet-wide: full, unreserved, allocated, BGP
Localized-reduced: BGP routes that contain a seed's IP address
Localized-extended: extract the route description
Scan
Scan in random order
Whitelisting: exclude certain ranges; 512MB bit array
Multiple scanners iterate over the targets
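The 512 MB figure follows from keeping one bit per IPv4 address: 2^32 bits / 8 = 2^29 bytes = 512 MB. The sketch below shows such a bit array next to one way of visiting addresses in random order. The ordering technique is an assumption and not taken from the slides: a full-period linear congruential generator is swapped in because it visits every address exactly once without storing a permutation. All names are hypothetical, and the demo uses a 16-bit address space to keep memory small.

```python
import ipaddress


class IPBitArray:
    """One bit per address in a 2**bits address space (bits=32 needs 512 MB)."""

    def __init__(self, bits: int = 32):
        self.size = 1 << bits
        self.data = bytearray(self.size // 8)

    def exclude_range(self, first: int, last: int) -> None:
        """Mark the inclusive integer range [first, last] as excluded."""
        for addr in range(first, last + 1):
            self.data[addr >> 3] |= 1 << (addr & 7)

    def is_excluded(self, addr: int) -> bool:
        return bool(self.data[addr >> 3] & (1 << (addr & 7)))


def cidr_to_range(cidr: str):
    """Convert a CIDR block to its first and last address as integers."""
    net = ipaddress.ip_network(cidr)
    return int(net.network_address), int(net.broadcast_address)


def random_order(bits: int, seed: int = 1):
    """Yield every value in [0, 2**bits) exactly once, in scrambled order.

    Assumption: a linear congruential generator. With a power-of-two modulus,
    an odd increment and a multiplier congruent to 1 mod 4 give a full period.
    """
    modulus = 1 << bits
    a, c = 1664525, 1013904223
    x = seed % modulus
    for _ in range(modulus):
        yield x
        x = (a * x + c) % modulus


def targets(bits: int, excluded: IPBitArray, seed: int = 1):
    """Iterate over the address space in random order, skipping excluded ranges."""
    return (addr for addr in random_order(bits, seed)
            if not excluded.is_excluded(addr))
```

Multiple scanners could each take a slice of this sequence, although how CyberProbe splits the work is not stated in the slides.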
Scanning: Horizontal scanner
Sender
Raw sockets
Initialization: buffer pre-filled with the IP and TCP headers
Rate limiting: inter-probe sleeping time
Receiver
Catch SYN-ACK packets
Mark the target alive
Keep listening after the sender completes
Check the validity and log the target IP
No retransmission
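Two pieces of the horizontal scanner lend themselves to a short sketch: the inter-probe sleep used for rate limiting, and a validity check on received packets. The slides do not define "validity", so the check below is one plausible reading and an assumption: the packet must come from the scanned port, carry SYN and ACK without RST, and acknowledge the sender's sequence number plus one. Function names are hypothetical.

```python
import struct

# TCP flag bits from the TCP header definition.
SYN, RST, ACK = 0x02, 0x04, 0x10


def inter_probe_sleep(probes_per_second: float) -> float:
    """Seconds to sleep between probes for a target sending rate."""
    return 1.0 / probes_per_second


def is_valid_synack(tcp_header: bytes, scan_port: int, sent_seq: int) -> bool:
    """Check that a raw TCP header is a SYN-ACK answering our SYN."""
    if len(tcp_header) < 20:
        return False                      # shorter than a minimal TCP header
    # Source port, destination port, sequence number, ack number, offset+flags.
    src_port, _dst_port, _seq, ack, offset_flags = struct.unpack(
        "!HHIIH", tcp_header[:14]
    )
    flags = offset_flags & 0x3F
    return (
        src_port == scan_port
        and (flags & (SYN | ACK)) == (SYN | ACK)
        and not (flags & RST)
        and ack == (sent_seq + 1) & 0xFFFFFFFF
    )
```

Because the receiver only logs targets that pass the check and nothing is retransmitted, a lost SYN or SYN-ACK means the target is simply not marked alive.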
Scanning: AppTCP & UDP scanners
Probe construction function
First: initialization builds a default probe
P: pass the target IP and get the TCP or UDP payload
AppTCP scanner
Input: the list of live hosts given by the horizontal scanner
Maximum size for a response
UDP scanner
Raw socket
Snort
Store traces and analyze offline
Evaluation: fingerprint generation
23 fingerprints for 13 families
3 exploit server families, 10 malware families
One UDP, rest use HTTP
Evaluation: Horizontal scanning
67%
Test scan infrastructure and provider locality
4.1% - 57.5%, most seeds are located on cloud hosting providers
Difference on live hosts ← BGP advertised routes
Reusing the results
localized
internet-wide
Evaluation: HTTP scanning
66(34 new)
128(72 new)
14
151
unique
Evaluation: UDP scanning
Fingerprint: ZeroAccess botnet
getL command: request supernodes list
Scan further
7884 → 15,943 supernodes
6257(39%) found, 61% unreachable
19% of supernodes were still alive one day after the Internet-wide scan
Speed of active probing makes IP variability a small issue
Evaluation: server operations
Bestav: winwebsec, uraysy,...
winwebsec → 2 fingerprints
Internet-wide scan reveals:
16 payment servers;
11 C&C servers;
in 4 providers.
Provider: payment servers / C&C servers
Provider A: 6 / 5
Provider B: 9 / 4
Provider C
Provider D
Cybercriminals host multiple servers on the same hosting provider
2
1
Discussion
Ethical consideration
Their probes are not malicious
Unsolicited nature of the probes
Explaining page → 106K IP addresses
Completeness
Fingerprints cannot be generated for some families
Scanning capacity
Complex protocol semantics
Replaying requests may fail
...
Conclusion
Novel active probing approach for detecting malicious servers
Fast, cheap, easy to deploy
Identify different server types
Implement CyberProbe
One adversarial fingerprint generation
Three scanners
Internet-wide scan and localized scan
Build fingerprints for 13 families
Found 151 malicious servers and 7881 P2P bots through 24 scans
4 times better than existing techniques
Reveal provider locality property
Q&A