polytechnic-afrl-2009
INFER: Detecting Infected Hosts Through Host Communication Behavior
Dr. William Hery
Polytechnic Institute of NYU
Prof. Nasir Memon
Polytechnic Institute of NYU & Vivic Labs
Dr. Kulesh Shanmugasundaram
Vivic Labs
UNCLASSIFIED
Outline
• Primary focus: Polytechnic’s INFER Research Project
• Background on the important security problem INFER addresses
(detection of hosts running malicious code)
• The INFER approach and results
• Performance considerations
• Extensions of INFER
• Directions for future research, including application to MANETs and
Trust in Network Sciences.
UNCLASSIFIED
[Figure: State of the art for defense in depth in enterprise networks. Defenses shown: Firewall, IDS/IPS, Antistuff. Threats shown: Script Kiddie, Known Exploits, Known Virii, Trojans, Zero day attacks, Phishing/Pharming, Infected Sites, Mobile/Offsite Infected Hosts. Targets and effects shown: Assets (Passwords, IP/Trade Secrets, Network Intel), Port Scan, DDoS out, Spam out.]
UNCLASSIFIED
As a Result ...
• Botnets of hundreds of thousands of hosts are known to be active. Botnets of
millions are speculated.
• Spyware that steals information from a computer and sends it to criminals,
other countries (keystroke loggers, rootkits) on millions of computers.
• Infected computers in Fortune 500 companies, DoD networks, etc.
• Remote monitoring of 1200 computers used by Tibetan groups traced to
computers in China (NY Times, 3/29/09; Shishir Nagaraja and Ross
Anderson)
• Vint Cerf: 25% of computers may be controlled by cyber criminals (Davos,
2007); 1% of Google searches lead to web sites which download infections
(CBS Sixty Minutes, 3/29/09)
UNCLASSIFIED
Problem Statement
• Malicious software gets into and runs on computers (infects the computers)
• PCs, workstations, servers, PDAs, phones, games…
• Malicious software attacks
• The host—deny use of resources; steal information to transmit, change
information.
• Other hosts and networks—denial of service, spam, infect other hosts.
• How can we
• Block the infection from getting in?
• Detect the infection?
• Disable and remove the infection?
UNCLASSIFIED
Major Types of Infections
• Botnet—network of infected computers remotely controlled by a
malicious entity through a Command and Control (C&C) channel,
often in a chain of command through “stepping stones” (which are
usually bots themselves). Bots in a botnet may lie dormant until
instructed to execute some kind of attack.
• Rootkit—software inserted between the operating system and the
hardware that is invisible to the OS and can potentially execute
anything surreptitiously.
• Trojan—software that performs malicious acts in addition to the
tasks a user thinks it should be doing.
UNCLASSIFIED
Major Types of Infections (continued)
• Virus/worm—software that attempts to spread itself to other
computers.
• Spyware—software that collects information from a computer and
transmits it to the attacker.
• Keystroke logger—spyware that logs keystrokes to send out;
used to collect passwords, etc.
• Adware—software that displays ads (e.g., pop-ups) a user may
not want to see.
Bots in a botnet may be all of the above!
UNCLASSIFIED
Other Types of Infections
• The infections listed above are typical infections frequently seen on the
public internet, in government networks and even in “closed” networks
• For some networks, or subsets of networks, it is valuable to look at different
types of infections due to the nature of the traffic, the architecture of the
network, or specific threats of concern. For example:
• SCADA networks should only have structured control and data acquisition
traffic
• MANETs carry readily identified routing traffic between all nodes
• Some DoD networks may be primarily concerned with exfiltration
UNCLASSIFIED
How do the Infections Get There?
• Malicious web sites
• Malicious attachments
• Removable media
• Malicious insiders
• Infected hosts within your local network
• Intrusions—automated and targeted
UNCLASSIFIED
How Do You Prevent Infections?
• Anti-virus, Anti-spyware—depend on signatures of known attacks
• Intrusion Detection Systems (IDS), Intrusion Prevention Systems
(IPS)—usually depend on signatures of known attacks, or
anomalies
• Firewalls that block access to untrusted sites, or recognize malware
(using AV software, etc.)
• Do not allow removable media, or limit capabilities (e.g., no
autorun for USB drives)
• User training
These don’t always work, and infections still get in!
UNCLASSIFIED
How Do You Detect Infections That Are There?
• Typical current methods:
• Anti-virus, anti-spyware, and other software that looks for known
malicious software
• Look for very specific behavior associated with known malware
(signature), or very specific threats
• Look for changes in behavior from a “baseline” (anomalies)
These are not effective against new kinds of attacks that have not yet been
analyzed and for which signatures have not been distributed to detection software
UNCLASSIFIED
INFER Approach
UNCLASSIFIED
Primary INFER Goals
• Detect host infections that other tools (firewall, AV, ASP,
etc.) miss:
• Detect new infections and new variants of old ones
• Allow security analysts to focus on a few new, more
serious problems
• Be adaptable to enterprise policies and threat environment
• Policy may include both threats and responses (e.g.,
selective shutdown or quarantine in response to some
types of new attacks)
UNCLASSIFIED
In Our View, A Good Solution Should ...
• Be independent of signatures
• Simple variations on old attacks and new attacks get by
signature based systems
• Be based on invariants of infections
• Attackers are quick to adapt to defenses, so what are the
observable behaviors common to varied infections?
• Have low false positives & false negatives
UNCLASSIFIED
A Good Solution Could...
• Focus on host actions
• We do not identify infected subjects by examining microbes in the air.
• We inspect subjects for symptoms.
• Be Network-based using a passive monitor
• If a host is compromised, you can’t trust it.
• Scalability and management are easier.
• Bad guys won’t know you are watching!
• Characterize Misuse
• Anomaly doesn’t help much
• Characterize and look for what is bad. Medical analogy - fever, cough, high blood
pressure etc.
UNCLASSIFIED
INFER: Network Based Infection Detection and Containment
[Figure: an infection sensor on a router/switch (1) produces synopses for infection diagnosis (2), which drives infection containment and eradication (3) to protect business assets]
1. Sensors
» Collect and synopsize network traffic
2. Infection Diagnosis
» Analyze synopses for “symptoms”
» Facilitate retroactive detection
3. Treatment
» Solutions for cleanup, containment
UNCLASSIFIED
Elements of INFER (in Reverse Order!)
• Identification of infected (compromised) hosts
• Based on network observable, invariant symptoms
• Determination of symptoms
• Based on network communications patterns
• Finding communications patterns
• Based on data mining of synopsized network activity
• Data collection and synopsizing
UNCLASSIFIED
Status of INFER
• Research prototype
• Data collection through identification of likely infected hosts
• GUI to drill down through evidence of infection
• Treatment (e.g., quarantine) not in prototype (deployment specific)
• Pilot Deployments
• Polytechnic production network (~2 years) and lab
• One other university net and two local government nets
• Two others expected to start soon
• Very encouraging results (discussed later)
• Plans for productization and further testing/research
UNCLASSIFIED
Data Collection & Synopsizing
UNCLASSIFIED
Synopses
• Volume of network traffic too high to save for extended time
• OC-3 (~150Mb/s) is ~ 1 TB/day
• Properties of a good synopsis
• Capture enough data for analysis
• Capture enough data to quantify the confidence in the result
• Efficient enough to keep up with network traffic in real time
• Tunable to resource constraints & network properties
• More detailed synopses on local nets, less on backbones, but
with ways to link them
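The storage figure quoted for an OC-3 can be checked with a few lines of arithmetic. This is only a sketch: the function name and the `utilization` parameter are illustrative, decimal units are assumed, and the link is assumed to be fully loaded unless stated otherwise. At the nominal ~150 Mb/s it gives roughly 1.6 TB/day, the same order of magnitude as the ~1 TB/day on the slide.

```python
def bytes_per_day(rate_mbps: float, utilization: float = 1.0) -> float:
    """Raw capture volume in bytes per day for a link of rate_mbps megabits/s."""
    bits_per_second = rate_mbps * 1_000_000 * utilization
    return bits_per_second / 8 * 86_400  # 86,400 seconds in a day


full_load = bytes_per_day(150)
print(f"{full_load / 1e12:.2f} TB/day at full load")
```

A lower average utilization scales the result proportionally, which is one reason a rounded figure such as ~1 TB/day is reasonable for planning.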
UNCLASSIFIED
Advantages of Using Synopses
• Makes it feasible to store a reasonable amount of history on disk
• Makes it feasible to transfer over network for central management
• Query processing is relatively efficient
• Easily adaptable to resource availability
• Allows for cascading different techniques in network hierarchy
UNCLASSIFIED
Examples of Synopses
• Connection Records: who talked to whom, when
• Data Type Classification: audio, image, video, compressed,
encrypted, text, etc.
• Bloom Filters, Hierarchical Bloom Filters: hash of payload
contents that allows matching and string search within payload
• Sampling
• Histograms
• Wavelets
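To make the Bloom filter entry concrete, the following is a minimal sketch of a payload synopsis: fixed-size blocks of a payload are hashed into a bit array, so a later query can ask whether a given block was probably seen without storing the payload itself. The class, the block size, and the use of SHA-256 are assumptions for illustration, not the ForNet or INFER implementation. A hierarchical variant would additionally insert larger, multi-block ranges, which is not shown here.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over byte strings (illustrative parameters)."""

    def __init__(self, num_bits: int = 8192, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        # True means "probably seen"; False means "definitely not seen".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def synopsize_payload(bf: BloomFilter, payload: bytes, block_size: int = 8) -> None:
    """Insert each fixed-size block of the payload into the filter."""
    for offset in range(0, len(payload), block_size):
        bf.add(payload[offset:offset + block_size])


bf = BloomFilter()
synopsize_payload(bf, b"GET /malware.exe HTTP/1.1")
print(b"GET /mal" in bf)  # first 8-byte block of the payload
```

Queries can return false positives but never false negatives, which is why the filter size and number of hashes need to be tuned to the traffic volume.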
UNCLASSIFIED
NeoFlow
• Similar to NetFlow, sFlow, etc.
• Contains more information
• Content types
• Packet inter arrival times (IAT)
• Packet size distribution
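A flow record carrying the extra fields listed above might look like the sketch below. The field names, bucket widths, and the `observe` method are assumptions made for illustration; the slides do not give the actual NeoFlow record layout.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FlowRecord:
    """Hypothetical NeoFlow-style record: flow identity plus content type,
    packet size distribution, and inter-arrival time (IAT) distribution."""
    src: str
    dst: str
    content_type: str = "unknown"
    packets: int = 0
    last_seen: Optional[float] = None
    size_hist: Counter = field(default_factory=Counter)  # 256-byte buckets
    iat_hist: Counter = field(default_factory=Counter)   # 10 ms buckets

    def observe(self, timestamp: float, size: int) -> None:
        """Fold one packet (timestamp in seconds, size in bytes) into the record."""
        if self.last_seen is not None:
            iat_ms = (timestamp - self.last_seen) * 1000
            self.iat_hist[int(iat_ms // 10) * 10] += 1
        self.size_hist[(size // 256) * 256] += 1
        self.last_seen = timestamp
        self.packets += 1


flow = FlowRecord("10.10.2.10", "192.168.1.10", content_type="compressed")
for ts, size in [(0.000, 60), (0.012, 1500), (0.031, 1500)]:
    flow.observe(ts, size)
print(flow.packets, dict(flow.size_hist), dict(flow.iat_hist))
```

Keeping histograms instead of per-packet values is what keeps the record small enough to store for long periods.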
UNCLASSIFIED
Samples of Synopsis Methods Used
Data type: data for building symptoms -> synopses
• Link / protocol layer (packet): IPs, ports, protocol; TOS, TTL -> Connection record, Neoflow, Header-dump
• Link content: type of payload (audio, video…); IM keywords; content itself -> Neoflow, Hierarchical Bloom Filter, Packet-dump
• Mappings: MAC to IP; DNS to IP; Virtual Host to IP; AS-Name to IP -> MAC Records, DNS Records, HTTP Records
• Aggregates: frequency of packets (packet sizes, inter-arrival times); BGP updates -> Proto-Histogram, SCBF*, BGP Records
UNCLASSIFIED
Data Collection and Synopsizing Engine
• Based on NYU-Poly’s ForNet (Forensic Network) system
• Developed with NSA and NSF funding
• Proven platform in use since 2003 for forensics and other research
• Version used for INFER achieves a 100:1 size reduction of actual
traffic. (1.5 Gb/s ~ 100 MB/day)
UNCLASSIFIED
How do we characterize an “infection”?
UNCLASSIFIED
Help Users Define “Infection”
• Symptoms
• Characterized by actions (or lack thereof) of a host
• Roles
• Characterized by the type of interaction with other
hosts
• Reputation
• Computed as a function of the reputation of associated
hosts
UNCLASSIFIED
A Symptom: Reboot
• Skews in periodic events
• Groups of events (at reboot)
• Frequent reboot is a bad symptom
UNCLASSIFIED
Example Symptoms
• Host-based symptoms (network observable): slowdown,
reboot
• Link-based symptoms: command and control channels
• Protocol adherence-based symptoms: DNS-free
connections, evasive traffic
• Association-based symptoms: access to darkspace, change
in role, contact with untrusted hosts
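As one illustration of a protocol adherence-based symptom, the sketch below flags "DNS-free" connections: connections to a remote address that the host never obtained from a DNS answer. The event tuple format and the function name are assumptions for this example; a real deployment would read these events from the DNS and connection record synopses.

```python
from collections import defaultdict


def dns_free_connections(events):
    """Return connections whose remote IP the host never resolved via DNS.

    events: iterable of (timestamp, kind, host, remote_ip) tuples,
    where kind is "dns" (an answer the host received) or "conn".
    """
    resolved = defaultdict(set)  # host -> remote IPs learned through DNS
    flagged = []
    for timestamp, kind, host, remote_ip in sorted(events, key=lambda e: e[0]):
        if kind == "dns":
            resolved[host].add(remote_ip)
        elif remote_ip not in resolved[host]:
            flagged.append((timestamp, host, remote_ip))
    return flagged


events = [
    (1.0, "dns", "10.10.2.10", "192.0.2.7"),
    (1.1, "conn", "10.10.2.10", "192.0.2.7"),
    (2.0, "conn", "10.10.2.10", "198.51.100.9"),  # no preceding DNS answer
]
print(dns_free_connections(events))
```

Legitimate software sometimes uses hard-coded addresses too, so a symptom like this is evidence to be weighed with others rather than a verdict on its own.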
UNCLASSIFIED
Reputation
• Infer reputation based on neighborhood (e.g., the enclosing /16)
• Infer reputation based on (probable) associations
• Infected web sites. Typo squatters.
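Neighborhood-based reputation can be sketched as follows: an address with no reputation of its own inherits the average score of known hosts in the same prefix. Treating the /16 as the neighborhood, the 0-to-1 score scale, the neutral default of 0.5, and the function names are all assumptions for illustration, not the INFER algorithm.

```python
import ipaddress
from collections import defaultdict
from statistics import mean


def neighborhood_reputation(known_scores, prefix_len=16):
    """Average the known per-host scores within each enclosing prefix."""
    by_net = defaultdict(list)
    for ip, score in known_scores.items():
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        by_net[net].append(score)
    return {net: mean(scores) for net, scores in by_net.items()}


def infer_reputation(ip, net_scores, prefix_len=16, default=0.5):
    """Look up the score of the prefix containing ip, else a neutral default."""
    net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return net_scores.get(net, default)


known = {"203.0.113.5": 0.1, "203.0.200.9": 0.3, "198.51.100.1": 0.9}
nets = neighborhood_reputation(known)
print(infer_reputation("203.0.77.1", nets))  # inherits from 203.0.0.0/16
print(infer_reputation("192.0.2.1", nets))   # unknown neighborhood
```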
UNCLASSIFIED
Roles
• Patterns of data flows can identify the role(s) a host is
performing
• At the highest level, hosts can be categorized as:
• Consumers—much more data coming in than going out
• Relays—similar amounts of data going in and going out
• Producers—much more data going out than coming in
• Improper roles, or changes in role may be important
symptoms
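The three top-level roles can be told apart from byte counts alone. The sketch below is one simple way to do it; the 3:1 threshold and the extra "idle" label are assumptions chosen for illustration, since the slides do not state how much is "much more".

```python
def classify_role(bytes_in: int, bytes_out: int, ratio: float = 3.0) -> str:
    """Classify a host from the volume of data flowing in and out of it."""
    if bytes_in == 0 and bytes_out == 0:
        return "idle"  # not a role from the slides; avoids a meaningless ratio
    if bytes_in >= ratio * bytes_out:
        return "consumer"
    if bytes_out >= ratio * bytes_in:
        return "producer"
    return "relay"


print(classify_role(9_000, 500))    # mostly inbound
print(classify_role(500, 9_000))    # mostly outbound
print(classify_role(1_000, 1_100))  # roughly balanced
```

Comparing the label for successive time windows gives a basic way to notice a change in role, such as a client that starts behaving like a producer.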
UNCLASSIFIED
Taxonomy of Roles…
UNCLASSIFIED
Example Role: Relay
[Figure: incoming and outgoing flows plotted against time, divided into time slots]
• Coordinated ingress and egress flows
• A linear time, probabilistic solution
• Some bots are relays
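The idea of coordinated ingress and egress flows can be illustrated with a time-slot comparison. The sketch below is a simple heuristic stand-in, not the probabilistic algorithm the slide refers to: it bins flow start times into slots and reports the fraction of active slots in which the host both received and sent traffic. The slot width and function name are assumptions. It runs in time linear in the number of flow timestamps.

```python
def relay_score(incoming, outgoing, slot: float = 1.0) -> float:
    """Fraction of active time slots with both incoming and outgoing flows.

    incoming, outgoing: iterables of flow timestamps in seconds.
    """
    in_slots = {int(t // slot) for t in incoming}
    out_slots = {int(t // slot) for t in outgoing}
    active = in_slots | out_slots
    if not active:
        return 0.0
    return len(in_slots & out_slots) / len(active)


print(relay_score([0.1, 1.2, 2.3, 3.1], [0.4, 1.5, 2.8, 3.9]))  # coordinated
print(relay_score([0.1, 1.2], [5.5, 6.1]))                      # unrelated
```

A busy server would also score high on this measure, so in practice the score would have to be compared against what chance alone would produce for that host's traffic volume.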
UNCLASSIFIED
Common Infections and Their Symptoms
• Botnets
• Interactive sessions (C&C), Identical connections, Relays, Darkspace
access, Host role change
• Spyware/Adware
• Slowdown, Untimely reboots, Role change to publisher, Protocol non-conformance
• Trojans
• Interactive session (C&C), Slowdown, Untimely reboots, Role change to
publisher
UNCLASSIFIED
Common Infections and Their Symptoms
• Rootkits
• Slowdown, Interactive session (C&C), Contact with mule
• Virus/Worm
• Identical connections, Protocol non-conformance, Access to
Darkspace
• Infected Site
• Variance of reputation
UNCLASSIFIED
Putting it all together ...
UNCLASSIFIED
INFER in Operation
• INFER data collection and synopsizing is done by passive network monitors
(Synapps), typically on router span ports. No impact on traffic, and not
visible to attackers.
• INFER symptom identification (Sieve) is done on a periodic basis
(adjustable; currently every hour).
• A prioritized list of the most suspicious hosts for the last 24 hours
(adjustable), with explanation of symptoms, is produced.
• Drill down allows detailed analysis of the symptoms and host activities on
the network
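The prioritized list described above could be produced along these lines: each symptom seen for a host in the reporting window contributes a weight, and hosts are sorted by total score with their symptoms attached as the explanation. The weights, symptom names, and function name here are hypothetical; the slides do not give INFER's actual scoring.

```python
def prioritize(observations, weights, top_n=10):
    """Rank hosts by the summed weight of their observed symptoms.

    observations: dict mapping host -> set of symptom names in the window.
    weights: dict mapping symptom name -> weight (unknown symptoms count 0).
    Returns a list of (host, score, sorted symptoms), most suspicious first.
    """
    ranked = []
    for host, symptoms in observations.items():
        score = sum(weights.get(symptom, 0.0) for symptom in symptoms)
        ranked.append((round(score, 2), host, sorted(symptoms)))
    ranked.sort(key=lambda row: row[0], reverse=True)
    return [(host, score, symptoms) for score, host, symptoms in ranked[:top_n]]


weights = {"slowdown": 0.2, "untrusted download": 0.4, "role change": 0.3}
observations = {
    "10.10.2.10": {"slowdown", "untrusted download", "role change"},
    "10.10.2.34": {"untrusted download"},
}
for host, score, symptoms in prioritize(observations, weights):
    print(host, score, symptoms)
```

Keeping the weights in a plain table is also what would let an enterprise tune the ranking to the threats it cares most about.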
UNCLASSIFIED
What if ...
Infection Detection
[Figure: evidence from symptoms, roles, and reputation combined over time: slowdown (t1), untrusted download (t2), role of host changed (t3)]
Infection Report for 10.10.2.10
Owner: Jon Doe
Virulence: 0.87
Symptoms:
- Host slowed down at t1.
- Downloaded exe from untrusted hosts
-- at time t2 from 192.168.1.10 (30KB)
-- at time t2’ from 192.168.3.12 (194KB)
- Change in host role
-- role changed from web/mail client to p2p-node at time t3
(t1 > t3 > t2)
Retroactive Query
Which hosts downloaded or uploaded the payload?
Retroactive Query Results
Downloaded:
- 10.10.2.10 from 192.168.1.10 at time t2
- 10.10.2.34 from 192.168.52.26 at time t4
- 10.10.2.34 from 192.168.52.26 at time t5
Uploaded:
- 10.10.2.54 uploaded to 192.168.52.26 at time t3
Recover evidence
• Direct link to packet data, OR
• Manual download from source
Containment
• Restrict all network access
• Restrict outbound access
UNCLASSIFIED
INFER by Vivic
UNCLASSIFIED
Scalability
• INFER data collection and synopsizing (Synapp) is distributed and highly
parallelizable
• Data collection is at enterprise and local routers (e.g., span ports)
• INFER symptom identification (Sieve) is also distributed and highly
parallelizable.
• We expect linear scaling with network throughput for both the Synapp and
Sieve
• Infection identification and reporting can be centralized; Admin and/or
automated response workload dependent on number of infected hosts
UNCLASSIFIED
Measurements and Projections
• Research prototype Synapp using a dual core 2 GHz AMD/Intel processor
running Red Hat Linux supports production networks at 300 Mbps average
load, and a lab network at 500 Mbps sustained load
• Projection: 2 dual core 2 GHz processor Synapp for 1 Gbps sustained load
• Projection: Can support an OC-48 with a 6-8 processor Synapp or network
processors (preferred solution); 4 processor Sieve or network processor
• Expectation: network processor performance must keep pace with network
speed, enabling Synapps to keep pace
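The projections above follow from the linear scaling assumption stated on the Scalability slide. As a rough check, the sketch below takes the 500 Mbps sustained lab figure as the capacity of one dual core processor and scales it up; the function name and the OC-48 line rate of 2488.32 Mbps are supplied here, not taken from the slides.

```python
import math


def processors_needed(load_mbps: float, per_processor_mbps: float = 500.0) -> int:
    """Processors required for a sustained load, assuming linear scaling."""
    return math.ceil(load_mbps / per_processor_mbps)


print(processors_needed(1_000))    # 1 Gbps sustained load
print(processors_needed(2_488.32)) # OC-48 line rate
```

The linear estimate reproduces the 2 processors projected for 1 Gbps and gives a lower bound of 5 for an OC-48; the 6-8 processors quoted on the slide presumably leave headroom above that bound.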
UNCLASSIFIED
Status
• Pilot at Polytechnic running for 2 years. 3,000 hosts.
• Pilot in a New York County government network. About 3,000 hosts.
• Detection of new attacks a few days before a commercial signature based
product.
• Multiple infected hosts detected with very low false positives.
• New pilot on an NYC government network with 43,000 hosts. Single
Synapp/Sieve running on a COTS Linux platform, quad core Intel CPU.
• 3 additional pilots coming on line soon.
• Custom port for government use to start soon.
UNCLASSIFIED
INFER Forensic Support
• Once an infection has been detected, the saved and searchable
synopses allow the analyst to
• Identify other hosts with similar symptoms likely to have the
same infection
• Identify the attack vector (by network behavior, not signature)
• Identify other hosts that have been attacked with the same
vector and may be infected
• Trace back the source of the malware within the network
(possibly to an insider)
UNCLASSIFIED
INFER Enterprise Policy Support
• The list and weights of symptoms used in building the prioritized
lists of likely infected hosts can easily be adapted to threats of
primary concern to the enterprise.
• Specific enterprise policies on encrypted traffic, P2P traffic, chat,
video downloads, etc. can be supported by INFER via reports or
interfaces to response software based on those policies
• New symptoms can be developed for specific policies not covered
by the existing symptoms.
UNCLASSIFIED
Planned Validation and Testing
• Further validation of detection mechanisms on networks with
known malware
• Testing and validation on “clean” production networks (university
networks are a “target rich” environment)
• Evaluation of false positive/false negative rates on both controlled
and clean production networks
UNCLASSIFIED
Further Research
• Research and identify further useful symptoms and
efficient algorithms to detect the symptoms.
• Develop more efficient storage and query mechanisms for the
collected data.
• Formalize the abstractions more rigorously and develop a
language around them.
UNCLASSIFIED
Potential Adaptations
• The previous material describes the basic version of INFER that
was developed for general purpose networks connected to the
Internet.
• The approach taken here can be adapted to specialized networks,
possibly with the development of new symptoms and/or new
architectures. Examples under consideration include:
• SCADA systems
• MANETs
• Trust in “Network Sciences” research
UNCLASSIFIED
SCADA Systems
• SCADA (Supervisory Control and Data Acquisition) systems are critical for
the control of power plants, the electric grid (particularly the proposed
“smart grid”), chemical plants, natural gas pipelines, etc. Many of the
elements of SCADA systems are connected over the Internet.
• Why the INFER approach is potentially useful for SCADA networks:
• Passive monitoring will not interfere with time critical functions, even on
existing “legacy” systems
• SCADA communications are structured, so symptoms will be easier to
define and detect (less “noise” in the system)
UNCLASSIFIED
MANETs
• MANETs (Mobile Ad-hoc NETworks) are/will be widely used for military
networks and sensor networks, and potentially for intelligent vehicle nets,
etc.
• Every node in a MANET is both a host and a router
• Individual links may come and go rapidly as nodes move
• Node to node path through other nodes may change frequently as nodes
move
• No central control
UNCLASSIFIED
INFER and MANETs
• Why the INFER approach is potentially useful
• MANETs often have specialized functions (including routing at every node)
and policies that INFER can capitalize on
• INFER is signature free, with considerably less computational and storage
overhead than signature based approaches
• Issues for INFER on a MANET
• No central points for monitoring—each node may only see a few other
nodes, and that set changes over time.
• Where to perform analysis (Sieve)? Locally and share with neighbors for a
consensus approach? Send all synopses to a central node for analysis?
• MANETs are typically very bandwidth, CPU, and storage constrained.
UNCLASSIFIED
Network Sciences Research
• Holistic approach to
distributed (network
supported) decision making
• Simultaneously consider
activities at the
• Social/Cognitive layer
• Information Layer
• Communications layer
UNCLASSIFIED
Trust in Distributed (Network Supported) Decision Making
• “Trust” component in the network sciences research program
• The goal is essentially a measure of trust in the information presented to
decision makers
• Not security per se, but a measure of how well security is working
• Trust simultaneously considers trust of elements and processes at the
Social/Cognitive layer, Information Layer, and the Communications layer,
and, most important, how those trust metrics can be composed/combined to
derive the trust metric for the information delivered
• Trust metrics will be dynamic, as will the relationships between the entities
UNCLASSIFIED
Example: Interaction of Layers for a Trust Decision
UNCLASSIFIED
INFER and Trust in (Network Supported) Distributed Decisions
• Trust is a complex issue that requires inputs from elements at all three
layers
• Trust of an individual node will be based on many things, including
• Inherent trustability of the hardware/software of the node
• Location and other elements of situation awareness
• Inputs from other layers
• Observable actions of the node
• An INFER-like approach can be an efficient and effective way to generate a
trust metric for the observable actions of the node
UNCLASSIFIED
Conclusions
• INFER is a valuable new approach to detecting infected hosts based on the
hosts' network observable behavior, which complements existing signature
based approaches
• INFER has demonstrated effectiveness in pilots on production networks, but
still needs further evaluation in other and controlled environments
• INFER is extensible to include enterprise policies and special needs
• INFER is adaptable to special purpose networks and new network
architectures
UNCLASSIFIED
UNCLASSIFIED