All Your iFRAMEs Point to Us

Cheng Wei
Acknowledgement
• This presentation is extended and modified from:
– The presentation by Bruno Virlet
– The presentation by YouZhi Bao
Motivation
• Generally improve safety of web browsing
• Report owners of malicious nets to authorities
• Study distribution of malicious sites
• Study relationship between user browsing habits and exposure to malware
• Study the malware distribution network
Introduction
• What is a drive-by download, and why should we even care?
• Malware delivery:
– Social engineering: attackers use various social engineering techniques to entice visitors of a website to download and run malware.
– Browser vulnerabilities: attackers lure users to connect to malicious servers, which exploit the browser to download and run malware automatically.
Injection techniques
• Adversaries use a number of techniques to inject content under their control into benign websites. They exploit web servers via vulnerable scripting applications and use invisible HTML components (e.g., IFRAMEs) to hide the injected content.
• Advertisements: adversaries inject content into ads
– Particularly dangerous, as they target popular sites
– 75% of malicious landing sites delivered malware via ads
• Another common technique is to use websites that allow users to contribute their own content: forums, blogs, or advertisements are used to inject exploit URLs (we focus on this)
What are the characteristics of malicious sites?
Infrastructure and Methodology
Our primary goal is to identify malicious web sites and help improve the safety of the Internet.
Useful terms:
• Malicious URL: a URL that initiates a drive-by download when a user visits it
• Landing site: malicious URLs are grouped by top-level domain name; the resulting groups are the landing sites
• Distribution site: the remote site that hosts the malicious payload (loaded via an IFRAME or a script)
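The grouping of malicious URLs into landing sites can be sketched as follows. This is a minimal illustration, not the study's implementation: `site_key` and `group_landing_sites` are hypothetical names, and taking the last two hostname labels is an assumed shortcut that a real system would replace with a public-suffix lookup.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def site_key(url: str) -> str:
    """Return a naive site key: the last two labels of the hostname.

    Assumption: this ignores multi-part suffixes such as "co.uk";
    it is only meant to illustrate the grouping step.
    """
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def group_landing_sites(malicious_urls):
    """Map each site key to the malicious URLs observed under it."""
    sites = defaultdict(list)
    for url in malicious_urls:
        sites[site_key(url)].append(url)
    return dict(sites)
```

Each key of the returned dictionary then stands for one landing site, however many malicious URLs were found under it.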
Preprocessing phase
• Our goal is to inspect URLs from this repository and identify the ones that trigger drive-by downloads.
• Web repository maintained by Google. Exhaustive inspection of each URL is too expensive given the large number of URLs in the repository, so we use lightweight techniques to extract URLs that are likely malicious, then subject them to the verification phase.
• For each website extract:
– Out of place iFrames
– Obfuscated JavaScript
– iFrames to known distribution sites
• Pages that proceed to the more expensive verification process:
– Those labeled as suspicious by the above procedure (1 million / day)
– A random selection of several hundred thousand URLs
– URL reported to
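A rough sketch of how the three page features listed above might be extracted from raw HTML. Everything here is an assumption for illustration: the slides do not say how "out of place" or "obfuscated" is decided, so the heuristics, the `KNOWN_DISTRIBUTION_HOSTS` set, and all function names are hypothetical.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical blocklist; a real system would use previously verified hosts.
KNOWN_DISTRIBUTION_HOSTS = {"malware.example"}


class PageCollector(HTMLParser):
    """Collect the attributes of every <iframe> and the text of every <script>."""

    def __init__(self):
        super().__init__()
        self.iframes = []
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            self.iframes.append({k: (v or "") for k, v in attrs})
        elif tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts.append(data)


def looks_hidden(attrs):
    """Assumed heuristic: zero- or one-pixel size, or a hiding CSS rule."""
    style = attrs.get("style", "").replace(" ", "").lower()
    tiny = attrs.get("width") in {"0", "1"} or attrs.get("height") in {"0", "1"}
    return tiny or "display:none" in style or "visibility:hidden" in style


def looks_obfuscated(script):
    """Assumed heuristic: eval/unescape calls or long runs of escaped bytes."""
    return bool(
        re.search(r"\b(eval|unescape)\s*\(", script)
        or re.search(r"(%[0-9a-fA-F]{2}|\\x[0-9a-fA-F]{2}){8,}", script)
    )


def extract_features(html):
    """Count the three suspicious features in one page."""
    parser = PageCollector()
    parser.feed(html)
    hosts = [urlsplit(f.get("src", "")).hostname for f in parser.iframes]
    return {
        "hidden_iframes": sum(looks_hidden(f) for f in parser.iframes),
        "obfuscated_scripts": sum(looks_obfuscated(s) for s in parser.scripts),
        "known_distribution_iframes": sum(h in KNOWN_DISTRIBUTION_HOSTS for h in hosts),
    }
```

Cheap counts like these are only a filter: they decide which pages deserve the expensive verification step, not which pages are malicious.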
Pre-processing Phase
• Extract several features and translate them into a likelihood score using a machine learning framework
– Map-reduce
– 5-fold cross-validation
• The URLs are randomly sampled from popular URLs as well as from the global index. We also process URLs reported by users.
• Roughly 1 billion URLs are narrowed down to about 1 million candidates for verification
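To make the 5-fold cross-validation step concrete, here is a small sketch in plain Python. The slides do not name the model or the scoring metric, so `fit` and `score` are hypothetical callables supplied by the caller, and `k_fold_indices` and `cross_validate` are illustrative names.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k roughly equal, contiguous folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(features, labels, fit, score, k=5):
    """Average the held-out score over k folds.

    `fit(train_x, train_y)` returns a model and `score(model, test_x, test_y)`
    returns a number; both are assumptions standing in for the real classifier.
    """
    scores = []
    for train, test in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        scores.append(
            score(model, [features[i] for i in test], [labels[i] for i in test])
        )
    return sum(scores) / len(scores)
```

Every labeled URL is used for testing exactly once, which gives a less optimistic estimate of the classifier than scoring it on its own training data.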
Verification Process
• This phase aims to verify whether a candidate URL from the preprocessing phase is malicious.
– Equipment: a large-scale web honeynet that runs Microsoft Windows images in virtual machines.
– Method: execution-based heuristics combined with results from an antivirus engine to detect malicious URLs.
– For each visited URL, we run the VM for 2 minutes and monitor system behavior for abnormal state changes.
• Heuristic score: based on the number of created processes, the number of observed registry changes, and the number of file system changes
• Threshold met: the URL is marked suspicious
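The scoring step above could look like the sketch below. The slides name the three counted signals but give neither the combination rule nor the threshold, so summing the counts and the value of `SUSPICION_THRESHOLD` are assumptions, as are all the names.

```python
from dataclasses import dataclass


@dataclass
class VmObservation:
    """State changes recorded while the VM visits one URL."""
    new_processes: int
    registry_changes: int
    file_system_changes: int


# Hypothetical value; the slides do not state the real threshold.
SUSPICION_THRESHOLD = 3


def heuristic_score(obs: VmObservation) -> int:
    """Assumed rule: the score is the plain sum of the three counts."""
    return obs.new_processes + obs.registry_changes + obs.file_system_changes


def is_suspicious(obs: VmObservation, threshold: int = SUSPICION_THRESHOLD) -> bool:
    """A URL is marked suspicious once its score meets the threshold."""
    return heuristic_score(obs) >= threshold
```

For example, `is_suspicious(VmObservation(1, 4, 2))` is true under these assumptions, while a visit that changes nothing scores zero and passes.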
Constructing the Malware Distribution Network
• Malware distribution network: a set of malware delivery trees, each consisting of landing sites (leaves), hop points (intermediate nodes), and a distribution site (root)
• To construct a delivery tree, we extract the edges connecting these nodes by inspecting the Referer header of HTTP requests
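As an illustration of how Referer headers yield the edges of a delivery tree, the sketch below walks back from a payload URL to the landing page. The input format (a list of `(requested_url, referer)` pairs) and the function name `delivery_chain` are assumptions made for this example.

```python
def delivery_chain(requests, distribution_url):
    """Follow Referer links from the payload URL back to the landing page.

    `requests` is a list of (requested_url, referer) pairs, with referer set
    to None when the header was absent. Returns the chain ordered from the
    landing page (leaf) to the distribution site (root).
    """
    referer_of = {url: referer for url, referer in requests}
    chain = [distribution_url]
    seen = {distribution_url}
    while referer_of.get(chain[-1]):
        previous = referer_of[chain[-1]]
        if previous in seen:  # guard against Referer loops
            break
        chain.append(previous)
        seen.add(previous)
    return list(reversed(chain))
```

Running this for every payload request observed during a visit gives one path per payload; merging the paths that end at the same distribution site gives that site's delivery tree.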
Prevalence of drive-by downloads
• 1.3% of the overall incoming search queries to Google return at least one malicious result, based on data collected over a period of 10 months (Jan 2011 - Oct 2011)
• From the top 1 million URLs appearing in Google search engine results, about 6,000 belong to sites that are verified as malicious; they are uniformly distributed across the ranking (the most popular landing page had a rank of 1,588)
Geographic locality of web based malware
• The above findings provide evidence of poor security practices by administrators (running outdated and/or unpatched versions of web server software)
• Correlation between distribution sites and landing sites: the malware distribution networks are highly localized within common geographical boundaries.
Malware Distribution Infrastructure
• 45% of the detected malware distribution sites used only a single landing site at a time.
• 70% of the malware distribution sites have IP addresses within the 58.* - 61.* and 209.* - 221.* network ranges.
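The IP concentration figure above can be checked against a list of distribution-site addresses with a few lines of code. This is a sketch only: the first-octet ranges are taken from the slide, while the function names and the input list are hypothetical.

```python
import ipaddress

# First-octet ranges quoted on the slide: 58.* to 61.* and 209.* to 221.*
CONCENTRATED_RANGES = [(58, 61), (209, 221)]


def in_concentrated_range(ip: str) -> bool:
    """True if the IPv4 address falls in one of the quoted first-octet ranges."""
    first_octet = ipaddress.IPv4Address(ip).packed[0]
    return any(low <= first_octet <= high for low, high in CONCENTRATED_RANGES)


def share_in_ranges(ips) -> float:
    """Fraction of the given addresses that fall inside the quoted ranges."""
    ips = list(ips)
    if not ips:
        return 0.0
    return sum(in_concentrated_range(ip) for ip in ips) / len(ips)
```

A result near 0.7 on the study's data would match the 70% stated on the slide.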
Impact of browsing habits
• DMOZ: a knowledge base used to measure the prevalence of malicious websites across different website functional categories (category information is available for about 50% of URLs)
• A random selection of 7.2 million URLs was mapped to the corresponding DMOZ categories
Detecting malicious Emails
Malicious content Injection: Drive-by Downloads via Ads
• The majority of web advertisements are distributed in the form of third-party content to the advertising web site.
• A web page is only as secure as its weakest component!
• Insecure ad content poses a risk to advertising web sites, even if the web page itself does not contain any exploits.
• Common practice:
– An advertiser sells advertising space to another advertising company, which sells the space to yet another company, and so on
– Somewhere along the chain, something can go wrong
Related Work
This paper differs from all of these works in
that it offers a far more comprehensive
analysis of the different aspects of the
problem posed by web-based malware,
including an examination of its prevalence, the
structure of the distribution networks, and the
major driving forces.
Conclusion
• Our study uses a large-scale data collection infrastructure that continuously detects and monitors the behavior of websites that perpetrate drive-by downloads.
• Our analysis reveals several forms of relations between some distribution sites and networks.
• We show that merely avoiding the dark corners of the Internet does not limit exposure to malware (even anti-virus engines are lacking in their ability to protect against drive-by downloads).
Thank you
Questions?