Experiences with Specification-Based Intrusion Detection


Lecture Notes In Computer Science (LNCS), Vol. 2212, 2001.
Proceedings of the 4th International Symposium on Recent Advances in Intrusion Detection (RAID)
Prem Uppuluri and R. Sekar
Department of Computer Science
SUNY at Stony Brook, NY
Presented by: Mike Hsiao, 2007/11/30
Outline
• Introduction and Background
• What is specification-based?
• Behavioral Monitoring Specification Language and
Example
• Specification development
• 5 steps for developing specifications
• Experimental Results
• On DARPA/Lincoln and DARPA/AFRL 1999
• Discussions and Conclusions
• 20% false negative rate with zero false positives
Three Intrusion Detection
Techniques
• Misuse Detection
• It can accurately detect known attacks
• Its drawback is the inability to detect previously
unseen attacks
• Anomaly Detection
• It is capable of detecting novel attacks,
• but suffers from a high rate of false alarms
• Specification-based Detection
• a promising alternative that combines the strengths
of misuse and anomaly detection
Specification-based Detection
• Manually developed specifications are used
to characterize legitimate program behaviors.
• It does not generate false alarms when unusual
(but legitimate) program behaviors are
encountered.
• Its false positive rate can be comparable to misuse
detection.
• Since it detects attacks as deviations from
legitimate behaviors, it has the potential to detect
previously unknown attacks.
Questions about realization
• How much effort is required to develop
program behavioral specifications?
• compared with the effort required for training anomaly
detection systems
• How effective is the approach in detecting
novel attacks?
• Are there classes of attacks that can be detected
by specification-based techniques that cannot be
detected by anomaly detection or vice-versa?
• Can it achieve false alarm rates that are
comparable to misuse detection?
General Experiences of
Specification-based Detection
• With a modest specification development effort,
on the order of tens of hours, an effective
IDS can be developed.
• for many security-critical programs on Solaris
• These efforts need to be undertaken just
once for each operating system.
• further customization on the basis of individual
hosts or installations seems to be unnecessary
• anomaly detection systems typically need
training/tuning for each host/system installation.
Behavioral Monitoring
Specification Language (BMSL)
• BMSL enables concise specifications of event
based security-relevant properties
• These properties can capture either normal
behavior of programs and systems, or misuse
behaviors associated with known
exploitations.
• BMSL specifications can be based on packet
contents, log file entries, and system calls
(including the values of system call arguments).
• In this paper, only system calls are considered.
BMSL (cont’d)
• BMSL supports a rich set of language
constructs that allow reasoning about not only
single events, but also temporally related
event sequences.
• For a given event stream (system calls), an
interceptor component placed in the stream
provides efficient interception of raw events.
• The interceptors deliver raw event streams to a
runtime environment associated with each stream.
• A single detection engine monitors each defended
process.
pat -> action
• pat is a pattern on event sequences,
otherwise known as histories.
• action specifies the response to be launched
when the observed history satisfies pat.
• We typically initiate responses when abnormal
behaviors are witnessed
• thus, the pattern components of rules usually correspond to
negations of properties of normal event histories.
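The rule form above can be sketched in Python. This is only an illustration, not BMSL itself: the event representation and the names Rule and monitor are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical event representation: (event name, argument tuple).
Event = Tuple[str, tuple]

@dataclass
class Rule:
    pat: Callable[[List[Event]], bool]     # True when the history satisfies pat
    action: Callable[[List[Event]], None]  # response launched on a match

def monitor(events, rules):
    """Feed events one at a time; run a rule's action whenever its pattern holds."""
    history = []
    for ev in events:
        history.append(ev)
        for rule in rules:
            if rule.pat(history):
                rule.action(history)

alerts = []
# The pattern describes ABNORMAL behavior (the negation of the normal
# property "this process never calls execve").
no_exec = Rule(pat=lambda h: h[-1][0] == "execve",
               action=lambda h: alerts.append(("term", h[-1])))
monitor([("open", ("/etc/passwd",)), ("execve", ("/bin/sh",))], [no_exec])
```

After the run, alerts holds one entry, recorded when the execve event arrived.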
Two System Call Events
• The first event for a system call corresponds
to the system call invocation
• it has the same name as the system call with the
arguments that are exactly the same as the
system call
• The second event corresponds to the return
from the system call
• its name is obtained by suffixing _exit to the name
of the system call
• The arguments include all of the arguments to
the system call, plus another argument that
captures the return value from the system call.
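As a small illustration of the naming convention above, one system call can be expanded into its two events. The tuple layout and the function name syscall_events are assumptions for this sketch.

```python
def syscall_events(name, args, retval):
    """Return the entry event and the exit event for one system call."""
    entry = (name, tuple(args))
    # The exit event carries all original arguments plus the return value.
    exit_event = (name + "_exit", tuple(args) + (retval,))
    return [entry, exit_event]

events = syscall_events("open", ["/etc/passwd", "O_RDONLY"], 3)
```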
The pattern language: Regular
Expressions over Events
• REE extends the concepts from sequences of
characters to events with arguments.
• occurrence of single event: e(x1, …, xn)|cond is
satisfied if cond evaluates to true (x are arguments)
• occurrence of one of many events: E1||E2|| … ||En,
where Ei captures an occurrence of a single event
• event non-occurrence: !(E1||E2|| … || En)
• Temporal operators
• sequencing: pat1;pat2
• alternation: pat1 || pat2
• repetition: pat*
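To see how these operators relate to ordinary regular expressions, the sketch below encodes a history of event names as a string and uses Python's re module. It ignores event arguments, and the encoding and the names encode, PATTERN and satisfied are assumptions for illustration only.

```python
import re

def encode(history):
    """Join event names into one string, each name followed by a space."""
    return "".join(name + " " for name in history)

# REE pattern being imitated:  open_exit ; (!(read || write))* ; close
#   sequencing      -> concatenation
#   alternation     -> |
#   repetition      -> *
#   non-occurrence  -> negative lookahead over one event name
# (?<!\S) starts the match at an event boundary; $ ties it to the end of
# the history, so the pattern is tried against suffixes of the history.
PATTERN = re.compile(r"(?<!\S)open_exit (?:(?!(?:read|write) )\S+ )*close $")

def satisfied(history):
    return PATTERN.search(encode(history)) is not None
```

For example, satisfied(["open_exit", "stat", "close"]) holds, while a history with a read between the open_exit and the close does not match.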
The pattern language (cont’d)
• pattern: e1(x);e2(x)
• e1(a);e2(b) -> not satisfied
• e1(a);e2(a) -> satisfied
• when does an event history H satisfy a pattern p?
• H satisfies p if some suffix of H matches p.
• anchor
• i.e., constrain the match to start at the
beginning of the history H
• a special event begin is introduced at the
beginning of the pattern.
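The variable binding, suffix matching and anchoring described above can be sketched for the simplest case, a plain sequence of single-argument events. The function name matches and the list-of-pairs encoding are assumptions for this example; the anchored flag stands in for a leading begin event.

```python
def matches(pattern, history, anchored=False):
    """pattern: list of (event name, variable); history: list of (event name, value).

    A sequence of n events can only match the last n events of the history.
    With anchored=True the match must also start at the beginning of the
    history, which is the effect of a leading `begin` event.
    """
    start = len(history) - len(pattern)
    if start < 0 or (anchored and start != 0):
        return False
    bindings = {}
    for (pname, var), (ename, value) in zip(pattern, history[start:]):
        if pname != ename:
            return False
        # A variable used twice must be bound to the same value both times.
        if bindings.setdefault(var, value) != value:
            return False
    return True

pattern = [("e1", "x"), ("e2", "x")]   # e1(x);e2(x)
```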
Example Specifications
• restrict a process from making a certain set of
system calls
• execve||connect||chmod||chown||creat||sendto||mkdir -> term()
• if arguments are not used elsewhere, we can omit
them.
• term() is an external function provided by the runtime
environment.
• restrict the files that a process may access for
reading or writing
• admFiles = {"/etc/utmp", "/etc/passwd"}
• open(f,mode)|(realpath(f)≠admFiles||mode≠O_RDONLY)->term()
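Both example rules can be sketched as a single check in Python. The sketch assumes that the comparison against admFiles means "is not a member of the set"; the function name violates and the resolve parameter are introduced only for this example.

```python
import os

FORBIDDEN = {"execve", "connect", "chmod", "chown", "creat", "sendto", "mkdir"}
ADM_FILES = {"/etc/utmp", "/etc/passwd"}

def violates(name, args, resolve=os.path.realpath):
    """Return True when the event should trigger term()."""
    if name in FORBIDDEN:
        return True
    if name == "open":
        path, mode = args
        # Assumption: the rule fires unless the file is one of the listed
        # files AND it is opened read-only.
        return resolve(path) not in ADM_FILES or mode != os.O_RDONLY
    return False
```

Here violates("open", ("/tmp/x", os.O_RDONLY)) is a violation because the file is outside the allowed set, and opening /etc/passwd for writing is a violation because of the mode.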
Example Specifications (cont’d)
• assert that a program never opens and
closes a file without reading from or writing to it.
• openExit(fd)::=open_exit(..., fd)||creat_exit(..., fd)
• rwOp(fd)::=read(fd)||readdir(fd)||write(fd)
• openExit(fd);(!rwOp(fd))*;close(fd)
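The last pattern can be checked with a small amount of state: the set of descriptors that have been opened but not yet read or written. The sketch below assumes events are (name, args) tuples with the descriptor as the last argument of the exit events and the first argument of the other calls; the function name is hypothetical.

```python
RW_OPS = {"read", "readdir", "write"}
OPEN_EXITS = {"open_exit", "creat_exit"}

def unused_file_violations(events):
    """Return descriptors closed with no read/write since they were opened."""
    unused = set()       # opened, and no rwOp(fd) seen since
    violations = []
    for name, args in events:
        if name in OPEN_EXITS:
            unused.add(args[-1])         # the return value is the descriptor
        elif name in RW_OPS:
            unused.discard(args[0])
        elif name == "close" and args[0] in unused:
            violations.append(args[0])   # openExit(fd);(!rwOp(fd))*;close(fd)
    return violations

trace = [
    ("open_exit", ("/tmp/a", "O_RDONLY", 3)),
    ("open_exit", ("/tmp/b", "O_RDONLY", 4)),
    ("read", (3,)),
    ("close", (3,)),
    ("close", (4,)),
]
```

On this trace only descriptor 4 is reported, since descriptor 3 was read before being closed.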
Specification development
1. Developing generic specifications
• parameterize the system calls and arguments
2. Securing program groups
• group programs that have similar security
implications (e.g., setuid programs, daemons)
3. Developing application-specific
specifications
4. Customizing specifications for an operating
system/site
5. Adding misuse specifications
• the detection of some attacks requires
knowledge about attack behavior.
1. Developing generic
specifications
• group system calls of similar functionality
• this allows us to consider a small number of system
call groups (a few tens) while writing specifications,
rather than a few hundred system calls
• also helps in developing portable specifications
• The authors identified 23 groups, further
organized into 9 categories.
• File Access Operations
• WriteOperations: open, creat
• ReadOperations
• FileAttributeChangeOps: chgrp, chmod
1. Developing generic
specifications (cont’d)
• Process Operations
• ProcessCreations: fork, execve
• ProcessInterference: kill
• Network Calls
• ConnectCalls
• AcceptCalls
• Setting resource attributes
• such as scheduling algorithm and parameters
• Privileged Calls
• mount, reboot
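A minimal sketch of how such groups let a rule name a handful of groups instead of individual system calls. Only the groups whose members are listed on the slides are included; the key PrivilegedCalls and the function name forbidden are assumptions for this example.

```python
# Group membership as listed on the slides (partial; the authors define 23 groups).
GROUPS = {
    "WriteOperations": {"open", "creat"},
    "FileAttributeChangeOps": {"chgrp", "chmod"},
    "ProcessCreations": {"fork", "execve"},
    "ProcessInterference": {"kill"},
    "PrivilegedCalls": {"mount", "reboot"},   # assumed group name
}

def forbidden(syscall, denied_groups):
    """True when the system call belongs to any of the denied groups."""
    return any(syscall in GROUPS[g] for g in denied_groups)

policy = ["ProcessCreations", "PrivilegedCalls"]
```

A specification written against policy stays the same on another operating system; only the contents of GROUPS would need to change.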
2. Securing program groups
• restrict setuid programs (e.g., passwd) to
input arguments of bounded size
• execve(path,argv,envp) | checkSize(path,argv,envp,max) -> term()
• failed login attempts (e.g., Telnet, FTP)
• execve(prog) | prog = "telnetd" && tooManyAttempts() -> log("too many failed logins")
• (setuid || setreuid) -> resetAttempt()
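The helper functions in these rules are not defined on the slides, so the following Python sketch guesses at plausible semantics: check_size is assumed to report an over-long argument, and every matching execve is assumed to count as one login attempt. All names and the limits are hypothetical.

```python
def check_size(path, argv, envp, max_len):
    """Assumed semantics: True when any string argument exceeds max_len."""
    return any(len(s) > max_len for s in [path, *argv, *envp])

class LoginAttempts:
    """Counts attempts until a successful setuid/setreuid resets the counter."""

    def __init__(self, limit=3):
        self.limit = limit
        self.count = 0
        self.log = []

    def on_event(self, name, args=()):
        if name == "execve" and args and args[0] == "telnetd":
            self.count += 1                      # assumed: one attempt per execve
            if self.count > self.limit:          # tooManyAttempts()
                self.log.append("too many failed logins")
        elif name in ("setuid", "setreuid"):
            self.count = 0                       # resetAttempt()
```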
3. Developing application-specific specifications
• The authors identified the principal security-related
system calls made by ftpd and plausible orders
for their execution.
• (!setreuid)*;setreuid(r,e) -> loggedUser := e
• (!setreuid())*;ftpInitBadCall() -> term()
• setreuid();any()*;ftpAccessBadCall() -> term()
• connect(s, sa)|((getIPAddress(sa) != clientIP) && (getPort(sa) ≠ ftpAccessedSvcs)) -> term()
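The first three ftpd rules describe two phases separated by the first setreuid call. A rough Python sketch of that idea follows; it reads the rules as applying from process start, and the contents of the two "bad call" sets are placeholders, since the slides do not list them.

```python
class FtpdMonitor:
    """Two-phase monitor: before and after the first setreuid."""

    def __init__(self, init_bad, access_bad):
        self.init_bad = set(init_bad)      # stand-in for ftpInitBadCall()
        self.access_bad = set(access_bad)  # stand-in for ftpAccessBadCall()
        self.logged_user = None
        self.terminated = False

    def on_event(self, name, args=()):
        if name == "setreuid":
            self.logged_user = args[1]     # loggedUser := e
        elif self.logged_user is None and name in self.init_bad:
            self.terminated = True         # bad call before login
        elif self.logged_user is not None and name in self.access_bad:
            self.terminated = True         # bad call after login

# Placeholder sets, chosen only to exercise the sketch.
mon = FtpdMonitor(init_bad={"execve"}, access_bad={"mount"})
mon.on_event("open", ("/etc/passwd",))
mon.on_event("setreuid", (0, 1001))
```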
4. Customizing specifications
for an OS/site
• system calls and files may differ from one
OS/site to another.
• e.g., password file, private directories
• site specific security policies
• e.g., anonymous ftp
5. Adding Misuse rules to
specification
• Certain attacks can be detected only based
on knowledge about specific attack behavior.
• E.g., a small number of login failures using the
user name guest usually indicates an attack.
• But a small number of failures is not that unusual
with other user names, as users frequently
mistype or forget their passwords.
• This knowledge pertains to attacker behavior,
rather than the behavior of the program itself.
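The guest example amounts to a per-user threshold on login failures. The Python sketch below shows the idea; the numeric limits and the function name are made up for illustration and do not come from the paper.

```python
from collections import Counter

def flag_login_failures(failed_users, default_limit=5, special_limits=None):
    """Return the user names whose failure count exceeds their limit."""
    # Hypothetical limits: guest is flagged after very few failures.
    limits = {"guest": 1} if special_limits is None else special_limits
    counts = Counter(failed_users)
    return sorted(u for u, n in counts.items()
                  if n > limits.get(u, default_limit))

flagged = flag_login_failures(["alice", "alice", "guest", "guest", "bob"])
```

Two failures are enough to flag guest here, while the same count for alice is treated as ordinary mistyping.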
Experimental Data
• Data
• 1999 DARPA/Lincoln Lab offline evaluation
• audit log data is used to recover the corresponding system calls
(name, time, args, return value, process id, user id,
group id, IPs, and ports)
• 1999 DARPA/AFRL online evaluation
• Specifications for programs
• in.ftpd, in.telnetd, eject, ffbconfig, fdformat, ps, cat,
cp, su, netstat, mail, crontab and login
Experimental Results
                                 % instances detected
Attack Name     # of instances   before misuse   after misuse
fdformat        3                100%            100%
ffbconfig       1                100%            100%
eject           1                100%            100%
secret          4                100%            100%
ps              4                100%            100%
ftpwrite        2                100%            100%
warez master    1                100%            100%
warez client    2                100%            100%
guess telnet    3                100%            100%
http tunnel     7                0%              100%
guest           2                0%              100%
total           30               82%             100%
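The totals in the table above can be rechecked with a few lines of Python. The instance counts add up to 30. The 82% figure is consistent with counting attack types (9 of 11) rather than instances (21 of 30); that reading is an inference from the numbers, not something the slides state.

```python
# (attack name, number of instances, detected before misuse rules were added)
rows = [
    ("fdformat", 3, True), ("ffbconfig", 1, True), ("eject", 1, True),
    ("secret", 4, True), ("ps", 4, True), ("ftpwrite", 2, True),
    ("warez master", 1, True), ("warez client", 2, True),
    ("guess telnet", 3, True), ("http tunnel", 7, False), ("guest", 2, False),
]

total_instances = sum(n for _, n, _ in rows)
types_detected = sum(1 for _, _, ok in rows if ok)
instances_detected = sum(n for _, n, ok in rows if ok)

pct_types = round(100 * types_detected / len(rows))            # by attack type
pct_instances = round(100 * instances_detected / total_instances)  # by instance
```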
Development Effort
• The authors claim that general specifications
seem to be sufficient for detecting a majority
of the attacks.
• Many attacks produce violations of multiple
rules.
• attacks may be detectable with even less effort in
specification development.
• For ftpd, the authors developed 14 rules, which
took 18 lines of code in BMSL.
• Setuid-to-root programs required 6 man-hours
to refine and customize the specification.
Development Effort (comparing with
training anomaly detection systems)
• Production of training data (by a human) for
anomaly detection systems is resource
intensive.
• It is very important to ensure that the training
data encompasses almost all legitimate
behaviors, which is difficult, and typically
requires manual intervention on a program by
program basis.
• Training data is usually specific to a particular
installation of an operating system.
Novel attacks and false
positive rate
• Over 80% of the attacks could be detected
without encoding any attack-specific
information into the specifications.
• Viewed differently, all of these attacks are
“unknown” to our system.
• In the online evaluation, the authors did not
encode any misuse rules
• all detections were based on normal behavior
specifications.
• In this context, the system detected all
three attacks launched during this evaluation,
once again with zero false alarms.
Novel attacks and false
positive rate (comparing with anomaly
detection systems)
• A specification is aimed at capturing a
superset of possible behaviors of a program.
• As such, there is no inherent reason to have false
positives, except due to specification errors.
• There may be some behaviors that are
outside of the legitimate behaviors of a
program, but within the superset captured by
a specification.
• Exploits that involve such behaviors will be missed
by a specification-based approach, while they can
be captured by a learning based approach.
Novel attacks and false
positive rate (what anomaly detection can’t detect)
• A learning-based approach essentially learns
that execution of a security-relevant error is
normal.
• Viewed alternatively, exploitation of the error to
inflict intrusion does not lead to any change in the
behavior of the program containing the error.
• Existing anomaly detection approaches
based on system calls generally ignore
system call argument values.
• Anomaly detectors typically need to be
trained/tuned for each host/site.
Comments
• Unusual vs. legitimate
• Focus on the program behavior, not attack
behavior.
• Using system calls as a basis may not be a
good solution for network-based systems.
• But system calls are a good basis. (They are the basis of
all programs.)
• Classification is a smart approach.
• Using regular expressions is a flexible design.