Transcript Slide 1

ECE 526 – Network
Processing Systems Design
Network Security
11/10-12/2008
What is network security
Confidentiality: only sender, intended receiver should
“understand” message contents
Authentication: sender, receiver want to confirm identity of
each other
Message integrity: sender, receiver want to ensure
message not altered (in transit, or afterwards) without
detection
Access and availability: services must be accessible and
available to valid users only
Ning Weng
ECE 526
2
Attack Types
eavesdrop: intercept messages
actively insert messages into connection
impersonation: can fake (spoof) source address
in packet (or any field in packet)
hijacking: “take over” ongoing connection by
removing sender or receiver, inserting himself
in place
denial of service: prevent service from being
used by others (e.g., by overloading
resources)
3
Security in Existing Internet
“At least we understand cryptography now…”
4
Cryptography for Confidentiality
[Figure: Alice feeds the plaintext and her encryption key K_A into the encryption algorithm to produce ciphertext; Bob feeds the ciphertext and his decryption key K_B into the decryption algorithm to recover the plaintext]
symmetric key crypto: sender, receiver keys identical
Pros and cons:
public-key crypto: encryption key public, decryption key secret (private)
Pros and cons:
5
Cryptography for Message Integrity
[Figure: Bob applies hash function H to the large message m to get the message digest H(m), then encrypts the digest with his private key K_B^- to produce the digital signature K_B^-(H(m)). Alice applies H to the received message m to get H(m), decrypts the signature with Bob's public key K_B^+ to get H(m), and checks whether the two digests are equal]
Alice verifies signature and integrity of digitally signed message
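Integrity is checked by recomputing the hash of the received message.
The signature is the message digest H(m) encrypted with the signer's private key; signing H(m) instead of the whole message keeps the signature short.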
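The sign-then-verify flow on this slide can be sketched in a few lines. This is a toy illustration only: the RSA parameters (P, Q, E, D) are textbook values far too small for real use, the SHA-256 digest is reduced modulo N so that it fits the toy modulus, and the function names are hypothetical.

```python
import hashlib

# Textbook RSA parameters (assumption for illustration; never use sizes like this).
P, Q = 61, 53
N = P * Q      # modulus, 3233
E = 17         # public exponent, stands in for Bob's public key
D = 2753       # private exponent, stands in for Bob's private key (E*D = 1 mod 3120)

def digest(message: bytes) -> int:
    """H(m): SHA-256 digest, reduced mod N so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Bob encrypts the digest with his private key."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Alice decrypts the signature with Bob's public key and compares digests."""
    return pow(signature, E, N) == digest(message)

msg = b"large message m"
sig = sign(msg)
print(verify(msg, sig))
```

Because only the digest is signed, the cost of the public-key operation does not grow with the message length.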
6
Cryptography for Authentication
Nonce: number (R) used only once in a lifetime
Potential problem: man-in-the-middle attack
Alice -> Bob: “I am Alice”
Bob -> Alice: R
Alice -> Bob: $K_A^-(R)$
Bob -> Alice: “send me your public key”
Alice -> Bob: $K_A^+$
Bob computes $K_A^+(K_A^-(R)) = R$
and knows only Alice could have the private key that encrypted R such that $K_A^+(K_A^-(R)) = R$
7
What NP systems can Do
to improve Access and availability
8
Firewalls
A firewall isolates the organization’s internal net from the larger
Internet, allowing some packets to pass and blocking others.
[Figure: firewall placed between the administered network and the public Internet]
9
Firewalls: Why
prevent denial of service attacks:
─ SYN flooding: attacker establishes many bogus TCP connections,
no resources left for “real” connections
prevent illegal modification/access of internal data.
─ e.g., attacker replaces CIA’s homepage with something else
allow only authorized access to inside network (set of authenticated
users/hosts)
three types of firewalls:
─ stateless packet filters
─ stateful packet filters
─ application gateways
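As a rough sketch of the first type, a stateless packet filter looks at each packet in isolation and applies the first matching rule. The rule fields, the example rule list, and the default-deny policy below are assumptions chosen for illustration, not a description of any particular product.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Rule:
    action: str               # "allow" or "deny"
    protocol: Optional[str]   # None matches any protocol
    dst_port: Optional[int]   # None matches any port

# Hypothetical policy: let web and mail traffic in, drop everything else.
RULES = [
    Rule("allow", "tcp", 80),
    Rule("allow", "tcp", 25),
    Rule("deny", None, None),
]

def filter_packet(protocol: str, dst_port: int, rules=RULES) -> str:
    """Return the action of the first rule matching this single packet."""
    for rule in rules:
        if rule.protocol not in (None, protocol):
            continue
        if rule.dst_port not in (None, dst_port):
            continue
        return rule.action
    return "deny"  # default-deny when no rule matches

print(filter_packet("tcp", 80), filter_packet("udp", 53))
```

A stateful filter would additionally keep a table of open connections and consult it before applying the rules.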
10
Credential-based Networks
to improve Access and availability
11
Setup Credentials
12
Credentials Data Structure
• m: # of bits in the array
• n: # of hash functions
• r: # of hops
Two steps of a Bloom filter
1. Programming
2. Query
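The two steps can be sketched with a small Bloom filter class. The parameter names m and n follow the slide; deriving the n hash functions from SHA-256 with a one-byte prefix is an assumption made only to keep the example self-contained.

```python
import hashlib

class BloomFilter:
    def __init__(self, m: int, n: int):
        self.m = m              # number of bits in the array
        self.n = n              # number of hash functions
        self.bits = [0] * m

    def _positions(self, item: bytes):
        # Assumed hash family: SHA-256 over a one-byte index plus the item.
        for i in range(self.n):
            h = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def program(self, item: bytes) -> None:
        """Step 1 (programming): set the n bits selected by the hash functions."""
        for pos in self._positions(item):
            self.bits[pos] = 1

    def query(self, item: bytes) -> bool:
        """Step 2 (query): the item may be present only if all n bits are set."""
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter(m=64, n=3)
bf.program(b"router-1")
print(bf.query(b"router-1"))
```

A programmed item always passes the query; an item that was never programmed can still pass by chance, which is the false positive case.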
13
False Positive Probability
1 nr nr
 r n / m r n
(1  (1  ) )  (1  e
)
m
• A false negative is impossible -> a legal packet will always be forwarded
• A false positive is possible -> how big is the chance?
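To get a feel for how big the chance is, the expression can be evaluated numerically. The sketch below assumes the slide's formula reads (1 - (1 - 1/m)^(nr))^(nr), approximated by (1 - e^(-rn/m))^(rn), with m bits, n hash functions, and r hops; the sample values of m, n, and r are made up.

```python
import math

def false_positive_exact(m: int, n: int, r: int) -> float:
    """Exact form: (1 - (1 - 1/m)^(n*r))^(n*r)."""
    return (1 - (1 - 1 / m) ** (n * r)) ** (n * r)

def false_positive_approx(m: int, n: int, r: int) -> float:
    """Exponential approximation: (1 - e^(-r*n/m))^(r*n)."""
    return (1 - math.exp(-r * n / m)) ** (r * n)

for m in (64, 256, 1024):
    print(m, false_positive_exact(m, 4, 8), false_positive_approx(m, 4, 8))
```

With n and r fixed, enlarging the bit array m drives the probability down quickly.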
14
Intrusion detection systems
• multiple IDSs: different types of checking at different
locations
[Figure: network layout showing the Internet, firewall, application gateway, internal network, IDS sensors, and a demilitarized zone containing the Web, FTP, and DNS servers]
15
Intrusion detection systems
• packet filtering:
─ operates on TCP/IP headers only
─ no correlation check among sessions
• IDS: intrusion detection system
─ deep packet inspection: look at packet contents (e.g., check
character strings in packet against database of known virus,
attack strings)
─ examine correlation among multiple packets
• port scanning
• network mapping
• DoS attack
16
NIDS Techniques
• Signature-based
• Anomaly-based
• Stateful detection
• Application-level detection
17
Signature-based NIDS
Similar to the traditional anti-virus applications
Example:
Martin Overton, “Anti-Malware Tools: Intrusion Detection Systems”,
European Institute for Computer Anti-Virus Research (EICAR), 2005
Signature found at W32.Netsky.p binary sample
Rules for Snort:
18
Signature matching
• Used in intrusion prevention/detection, application classification, load
balancing
• Input: byte string from the payload of packet(s)
─ Hence the name “deep packet inspection”
• Output: the positions at which various signatures match.
• Challenges
─ thousands of possible signatures
─ high performance requirements
─ must be easy to update with new patterns
19
DFA construction
• Example: P = {he, she, his, hers}
[Figure: DFA for P with states 0 through 9 and transitions labeled h, e, r, s, i; the legend marks the initial state, the transition function, ordinary states, and accepting states]
20
DFA Searching
[Figure: the DFA for P = {he, she, his, hers} (states 0 through 9) stepping through the input stream, with the matching string highlighted]
• Input stream: h x h e r s
• Scanning input stream only once
• Complexity: linear time
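A single-pass multi-pattern search of this kind can be sketched with the Aho-Corasick construction. Note the assumption: the code below uses goto and failure links instead of a fully expanded DFA transition table, which reports the same matches; the function names are illustrative.

```python
from collections import deque

def build_automaton(patterns):
    """Build the keyword trie, then failure links and merged outputs (breadth-first)."""
    goto, output = [{}], [set()]
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                output.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].add(pat)
    fail = [0] * len(goto)
    queue = deque(goto[0].values())   # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            output[t] |= output[fail[t]]
    return goto, fail, output

def search(text, automaton):
    """Scan text once; return (start_index, pattern) for every match."""
    goto, fail, output = automaton
    state, matches = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in output[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches

automaton = build_automaton(["he", "she", "his", "hers"])
print(sorted(search("hxhers", automaton)))   # [(2, 'he'), (2, 'hers')]
```

Each input character is consumed exactly once, so the scan time does not depend on how many patterns were compiled into the automaton.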
21
Network Attack Patterns
22
DFA mapped to Traditional Memory
• 256 entries for each state
• Snort Dec. 2005 has 2733 patterns
• Needs 27000 states
• Memory size – 13 MB
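The 13 MB figure follows from the table dimensions if each entry holds a 2-byte next-state index. The entry width is an assumption (2 bytes is the smallest whole-byte width that can address 27000 states); the slide does not state it.

```python
STATES = 27_000
ENTRIES_PER_STATE = 256      # one next-state entry per possible input byte
BYTES_PER_ENTRY = 2          # assumption: 16-bit next-state index

table_bytes = STATES * ENTRIES_PER_STATE * BYTES_PER_ENTRY
print(table_bytes)           # 13824000 bytes, i.e. roughly 13 MB
```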
23
SAM-FSM
• Traditional – 13 MB; Ours – 16KB
24
Overall System
25
Anomaly-based NIDS
• Signature-based NIDS can’t detect zero-day attacks
• Anomaly: Operations deviate from normal behavior.
• What could cause an anomaly?
─ Malfunction of network devices
─ Network overload
─ Malicious attacks, like DoS/DDoS attacks
─ Other network intrusions
• Two main kinds of network anomalies.
1. Related to network failures and performance problems.
2. Security-related problems:
(1) Resource depletion
(2) Bandwidth depletion
26
Key Technical Challenges
• Large data size
─ Millions of network connections are common for commercial network sites
• High dimensionality
─ Hundreds of dimensions are possible
• Temporal nature of the data
─ Data points close in time are highly correlated
• Skewed class distribution
─ Interesting events are very rare -> looking for the “needle in a haystack”
“Mining needle in a haystack. So much hay and so little time”
• High Performance Computing (HPC) is critical for on-line analysis and scalability to very large data sets
27
Anomaly detection runs into trouble
• There are many schemes based on checking
abrupt traffic changes.
─ E.g. apply signal processing techniques to detect
abrupt changes in traffic
• However, this kind of anomaly is not always
illegitimate.
─ An abrupt change of traffic does not necessarily mean an attack has
happened
• We call this case a
legitimately-abrupt-change (LAC)
28
Legitimately abrupt changes
• Example 1:
─ Famous information gateway websites, e.g. Yahoo.
• When major breaking news is announced, an LAC
appears.
• Example 2:
─ Special-purpose announcement sites, e.g. the website of a
national meteorological agency
• When a natural disaster is forecast, an LAC
occurs.
– Typhoon, Earthquake, Tsunami
• Important outdoor holidays
29
Anomaly Detection
• Already used by industry
--Protocol Anomaly
--Statistical/Threshold based
• In Research
--Data mining
30
Protocol Anomaly Detection
Based on the well established RFCs
Focus on the packet header
Example:
--All SMTP commands have a fixed maximum size. If the size exceeds
the limit, it could be a buffer overflow or malicious code injection
attack
--SYN flood attack: attacker sends SYNs with fake source addresses
--Teardrop attack: fragmented IP packets with overlapping offsets
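Two of these checks are simple enough to sketch directly. The 512-octet SMTP command-line limit (including CRLF) is the base limit from RFC 5321; the fragment representation as (offset, length) pairs in bytes and the function names are assumptions for illustration.

```python
SMTP_MAX_COMMAND_LINE = 512   # octets including CRLF (RFC 5321 base limit)

def smtp_command_too_long(line: bytes) -> bool:
    """Flag an SMTP command line that exceeds the protocol's maximum size."""
    return len(line) > SMTP_MAX_COMMAND_LINE

def fragments_overlap(fragments) -> bool:
    """Teardrop-style check: True if any two fragments cover the same bytes.

    fragments is an iterable of (offset, length) pairs, both in bytes.
    """
    end = 0
    for offset, length in sorted(fragments):
        if offset < end:          # starts before the previous data ended
            return True
        end = max(end, offset + length)
    return False

print(smtp_command_too_long(b"MAIL FROM:<a@example.com>\r\n"))
print(fragments_overlap([(0, 36), (24, 4)]))
```

Both checks need only protocol knowledge, not a database of attack signatures.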
31
Threshold based
Use training data to generate a statistical
model, then select proper thresholds for the
network environment (traffic volume, TCP
packet count, IP fragment count, etc.)
-- usually used as a complementary tool
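A minimal sketch of the idea, assuming the statistical model is just the mean and standard deviation of a per-interval counter and the threshold is set k standard deviations above the mean. The choice of k = 3 and the sample counts are made up for illustration.

```python
import statistics

def fit_threshold(training_counts, k: float = 3.0) -> float:
    """Learn a threshold as mean + k * standard deviation of the training data."""
    mean = statistics.mean(training_counts)
    stdev = statistics.stdev(training_counts)
    return mean + k * stdev

def is_anomalous(count: float, threshold: float) -> bool:
    """Raise an alert when the observed counter exceeds the learned threshold."""
    return count > threshold

# Hypothetical TCP packet counts per second observed during training.
training = [100, 110, 90, 105, 95]
threshold = fit_threshold(training)
print(round(threshold, 1), is_anomalous(500, threshold))
```

The same pattern applies to any of the counters listed on the slide; each one gets its own threshold.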
32
Stateful IDS
• No practical solutions
• Only very simple implementations
Example:
Snort uses pattern matching across consecutive packets.
Traditional signature rules: “pattern1” “pattern1 || pattern2”
The rule can now be defined as: “pattern1.*pattern2”
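The difference from per-packet matching can be sketched as follows: payloads of one flow are appended to a buffer and the rule is evaluated over the whole buffer, so a pattern split across packet boundaries is still found. This is an illustrative sketch, not Snort's implementation; the unbounded buffer is a simplification.

```python
import re

# The stateful rule from the slide; DOTALL lets ".*" span newlines in the stream.
RULE = re.compile(rb"pattern1.*pattern2", re.DOTALL)

def stream_matches(packets) -> bool:
    """Return True once pattern1 followed by pattern2 appears in the flow."""
    buffer = b""
    for payload in packets:
        buffer += payload          # state carried from packet to packet
        if RULE.search(buffer):
            return True
    return False

flow = [b"xx patt", b"ern1 yy", b" pattern2"]
print(stream_matches(flow))
print(any(RULE.search(p) for p in flow))   # per-packet matching misses it
```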
33
Application-level IDS
Focus on specific services or programs
(Web Server, Database, etc.)
Example
--Monitoring all invocations of Microsoft RPCs
--Analyzing HTTP requests for malicious query strings
Products:
--mod_security: an optional IDS component for Apache
Web Server
34
Current NIDS Challenges
• High false positive rate
-- An FP rate of 0.1% means one of every 1000 normal packets is misclassified as an
alert, which is about one false alert
per minute on a 100M network
• Zero-day attacks (unknown attacks)
--Most current products rely on signature-based detection, which makes it difficult to
detect new attacks.
• Poor automatic prevention ability
--Human interaction is required when an attack is detected
35
IDS Products Today
• Snort
• McAfee Intrushield
• ISS RealSecure
• Cisco IPS
• Symantec IDS
36
Snort
• Open Source, since 1998
• Used by many major network security products
• Signature-based (more than 3000)
• Simple IP header protocol anomaly detection
• Simple stateful pattern matching
37
McAfee
• Profile-based anomaly detection
--Manually create profile
--Create profile by self-learning through a training period
• Using profile plus threshold for defending against DOS
and DDOS
• Inspect encrypted traffic by collecting the server side
private keys
38
ISS RealSecure
• About 2000 signatures
• Application-based approach
--identifying any possible exploit to the published vulnerabilities of
MS RPC, IIS, Apache, Lotus, etc.
• Additional support for P2P and instant messengers
• Virtual Prevention System
--a virtual environment to examine the execution of a file in order to
find any possible malicious behaviors
• Support for IPv6
--Detect possible backdoors which enable the IPv6 of a system
(usually off)
39
Cisco IPS products
• Protocol decoding
• Threshold-based property checking
• Signature matching
• Protocol anomaly detection
• Checking file behaviors by intercepting all calls to the system resources
40
Symantec
• Multi-step checking (protocol, vulnerability, signature, DoS, traffic,
evasion check)
• Unique feature: evasion check
e.g. request “/index.html” can be replaced with
“/%69nd%65x.html” to evade the signature matching
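The usual countermeasure is to normalize the request before matching. A minimal sketch using percent-decoding from the Python standard library; the helper names and the plain substring match are assumptions for illustration, not Symantec's method.

```python
from urllib.parse import unquote

def normalize_path(raw_path: str) -> str:
    """Undo percent-encoding so that encoded and plain requests compare equal."""
    return unquote(raw_path)

def matches_signature(raw_path: str, signature: str = "/index.html") -> bool:
    """Match the signature against the normalized path, not the raw bytes."""
    return signature in normalize_path(raw_path)

evasive = "/%69nd%65x.html"
print("/index.html" in evasive)       # naive matching on the raw request
print(matches_signature(evasive))     # matching after normalization
```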
41
Summary of Current Products
[Table: feature comparison of Snort, McAfee Intrushield, ISS RealSecure, Cisco IDS, Symantec, and IMUNE. Features compared: signature (general, application based); anomaly detection (profile-based, vulnerability-based, statistical-based, protocol-based, self-learning, application specific); stateful; IPv6 support; behavior; encrypted traffic detection. The per-product marks are not recoverable from this transcript]
42
Academia on Anomaly Detection
• Columbia University
--Data mining based (since 1997)
• University of California at Santa Barbara
--Service Specific (HTTP)
--Stateful IDS
• Florida Institute of Technology
--Protocol Anomaly (Statistical based)
• University of Minnesota
--MIND (Minnesota Intrusion Detection System)
43
Columbia Univ. IDS
• 1997, Applied the RIPPER rule learning algorithm
to UNIX system call monitoring for malicious
event detection
• 1998, Applied the algorithm to off-line network
traffic data (clean training data)
• 2000, Applied EM and clustering algorithms for
dealing with noisy datasets
• 2001, Developed a complete experimental NIDS
based on those algorithms.
• 2004, New approach towards payload anomaly
detection
44
Implementation Procedure
• Wenke Lee, Sal Stolfo, and Kui Mok., “A Data Mining
Framework for Building Intrusion Detection Models”,
Proceedings of the 1999 IEEE Symposium on Security and
Privacy, Oakland, CA, May 1999
• Pre-processing: process raw packet data
• Feature construction: create statistical features
• Rule learning: apply the RIPPER algorithm
45
Pre Processing
•SYN flood attack
46
Feature Construction
(service=http, flag=S0, dst_host=victim),
(service=http, flag=S0, dst_host=victim)
-> (service=http, flag=S0, dst_host=victim)
[0.93, 0.03, 2]
93% of the time, after two http connections with S0
flag are made to host victim, within 2 seconds from
the first of these two, the third similar connection is
made, and this pattern occurs in 3% of the data
47
RIPPER Rules
smurf :- service=ecr_i, host_count >= 5, host_srv_count >= 5
(if the service is ICMP echo request, and there are at least 5 connections with the same
destination host, and at least 5 connections with the same service,
then it is a smurf/DOS attack)
satan :- host_REJ_% >= 83%, host_diff_srv_% >= 87%
(for connections with the same destination host, if the rejection rate is at least
83%, and the percentage of different services is at least 87%, then it is a
satan/PROBING attack)
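Learned rules of this form are straightforward to apply to connection records. In the sketch below the dictionary keys mirror the feature names in the rules (with % spelled as pct), and the "normal" fallback label is an assumption; the thresholds are the ones shown on the slide.

```python
def classify(conn: dict) -> str:
    """Apply the two example rules to one connection record."""
    if (conn["service"] == "ecr_i"
            and conn["host_count"] >= 5
            and conn["host_srv_count"] >= 5):
        return "smurf"      # DOS attack
    if conn["host_REJ_pct"] >= 83 and conn["host_diff_srv_pct"] >= 87:
        return "satan"      # PROBING attack
    return "normal"         # assumed default when no rule fires

record = {"service": "ecr_i", "host_count": 12, "host_srv_count": 12,
          "host_REJ_pct": 0, "host_diff_srv_pct": 0}
print(classify(record))
```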
48
Experiment Results
•Applied on DARPA’98 Intrusion Detection
Evaluation Data Set
49
Payload based Approach
K. Wang, S. J. Stolfo, “Anomalous Payload-based Network
Intrusion Detection”, RAID 2004
• Construct the statistical model for all bytes in the payload
• Use Mahalanobis distance to measure the difference
Problems:
• Clean training data is required
• False positive rate (unacceptable)
50
Service Specific IDS by UCSB
Giovanni Vigna et al. at the University of California at Santa Barbara
Since 2002
• Application level
• Focuses on HTTP request
• HTTP request analyzing
• Constructing models for important fields in the request
instead of all bytes of the payload (Columbia payload
approach)
51
Sample Request
Request
GET /scripts/access.pl?user=johndoe&cred=admin
Properties for Detection
Request Type: e.g. GET
Request Length: e.g. Length(“GET /scripts/access.pl?user=johndoe&cred=admin”)
Payload Distribution
52
Request Type
Assumption:
If a rarely used request type is found, it is likely to
indicate malicious activity
Anomaly Score:
AStype=-log2(p[type])
p[type] stands for the probability of a certain type
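The score can be sketched by estimating p[type] from the relative frequency of each request type in training traffic. The training sample below is made up, and a type never seen in training would need separate handling since its estimated probability is zero.

```python
import math
from collections import Counter

def type_scores(training_types):
    """Return AStype = -log2(p[type]) for every request type seen in training."""
    counts = Counter(training_types)
    total = sum(counts.values())
    return {t: -math.log2(c / total) for t, c in counts.items()}

# Hypothetical training traffic: 6 GET requests and 2 POST requests.
scores = type_scores(["GET"] * 6 + ["POST"] * 2)
print(scores)
```

Rare types get high scores: POST, with probability 0.25, scores 2.0, while the common GET scores below 0.5.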
53
Request Length
Assumption:
The request length of a certain type should not vary much.
Otherwise, it is probably caused by an attack
(e.g. buffer overflow)
Anomaly Score:
ASlen = 1.5^((l-μ)/(2.5*σ))
l is the request length; μ and σ are the mean and standard deviation of request lengths for that type
54
Characters Distribution
256 ASCII Characters
e.g. “passwd” -> “112 97 115 115 119 100”
Distributions: {0.33, 0.17, 0.17, 0.17, 0.17}
χ² = f(Oi, Ei) (i ranges over segments 0 to 5)
ASpd = χ² * (15/L) (L stands for the payload length)
Segment:      0 | 1   | 2   | 3    | 4     | 5
ASCII Value:  0 | 1-3 | 4-6 | 7-11 | 12-15 | 16-255
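The “passwd” example can be reproduced in a few lines. The sketch assumes that the distribution is the list of relative byte frequencies sorted in descending order, and that the six segment ranges index positions in that sorted list; both are interpretations of the slide rather than something it states outright.

```python
from collections import Counter

# Segment boundaries from the table above (inclusive index ranges).
SEGMENTS = [(0, 0), (1, 3), (4, 6), (7, 11), (12, 15), (16, 255)]

def sorted_frequencies(payload: bytes):
    """Relative frequency of each byte value, sorted descending, padded to 256."""
    counts = Counter(payload)
    freqs = sorted((c / len(payload) for c in counts.values()), reverse=True)
    return freqs + [0.0] * (256 - len(freqs))

def segment_totals(freqs):
    """Aggregate the 256 sorted frequencies into the six segments."""
    return [sum(freqs[lo:hi + 1]) for lo, hi in SEGMENTS]

freqs = sorted_frequencies(b"passwd")
print([round(f, 2) for f in freqs[:5]])    # [0.33, 0.17, 0.17, 0.17, 0.17]
print([round(s, 2) for s in segment_totals(freqs)])
```

The segment totals of an observed payload (Oi) would then be compared with the totals expected from training (Ei) in the χ² test.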
55
Final Anomaly Score
AS=0.3*AStype + 0.3*ASlen+0.4*ASpd
56
Later Research at UCSB
Structure Inference with Markov Model
57
Other Properties Used
• Token Finder
checks whether the query parameter is drawn from a set of known candidates
• Attribute presence or absence
maliciously crafted requests usually ignore the order of parameters
• Access Frequency
• Invocation order
• Request time interval
58
Experiment Results
• Tested at UCSB campus network and Google
• False positive 0.06%
Major cons:
Limited to HTTP service
59
Packet Header Anomaly Detection
Packet Header Anomaly Detection (PHAD)
developed by Florida Institute of Technology since 2001
Basic Assumption:
If an event x happened n times with r different results in the
training period, the probability of observing a novel result is r/n
60
Implementation
Step 1:
Assign the novel-data probability to the important fields
of the packet header (protocol type, flags, etc.)
Step 2:
Add all the novel-data probabilities together to form the
threshold
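The two steps can be sketched as below, following the slide's description only: each header field gets r/n (distinct values seen over observations), and the per-field values are summed. The packet representation as a dictionary of header fields and the sample packets are assumptions; the published PHAD scoring differs in detail.

```python
from collections import defaultdict

def novelty_probabilities(packets):
    """Step 1: for each header field, r/n = distinct values / observations."""
    seen = defaultdict(set)
    observations = defaultdict(int)
    for pkt in packets:
        for field, value in pkt.items():
            observations[field] += 1
            seen[field].add(value)
    return {f: len(seen[f]) / observations[f] for f in observations}

def combined_value(probabilities):
    """Step 2: add the per-field novelty probabilities together."""
    return sum(probabilities.values())

# Hypothetical training packets with two header fields.
training = [
    {"proto": "tcp", "flags": "S"},
    {"proto": "tcp", "flags": "SA"},
    {"proto": "udp", "flags": "S"},
    {"proto": "tcp", "flags": "S"},
]
probs = novelty_probabilities(training)
print(probs, combined_value(probs))
```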
61
MINDS
MINDS (Minnesota Intrusion Detection System)
Statistical outlier-based anomaly detection
Compared 5 outlier-based schemes:
• K-th nearest neighbor
• Nearest neighbor
• Mahalanobis-distance based
• Local Outlier Factor (LOF)
• Unsupervised SVMs
62
Comparison Result
•A. Lazarevic, et al, “A Comparative Study of Anomaly
Detection Schemes in Network Intrusion Detection”,
Proceedings of the 3rd SIAM Conference on Data Mining,
San Francisco, 2003
63
Some Emerging Approaches
• SVMs
(unsupervised and supervised)
• PCA
• PCA + SVMs
• Neural Network
64
Conclusion
• The network lacks security
• Crypto is well understood and used
• NIDS
─ Signature based approaches still play the major part
in practical IDS
─ Anomaly detection has only very limited success
─ New approaches are proposed every day, but false
positive and detection rates are still the major problems
─ Various mechanisms should work together for
maximum success
65
Reference
• Jim Kurose: Computer Networks
• Tilman Wolf: Credential-based Networks
66