Slides - TERENA Networking Conference 2008

Connect. Communicate. Collaborate
A Network Security Service for
GÉANT2 (and beyond…)
Maurizio Molina, DANTE
TNC 08, Bruges, 20th May 2008
Outline
• The vision
• Proof of concept
• Supporting tools
• Service Outlook
The vision:
enhance NRENs security
• NRENs have their CERTs to deal with security
• and collaborate with each other
– Trusted Introducer
– GN2 JRA2
• and DANTE can filter traffic on GN2 if NRENs request it…
! BUT !
• Can we be more proactive towards NREN CERTs by exploiting
the visibility of the GN2 core?
The vision (cont.):
enhance NRENs security
• To spot security anomalies in the GN2 core you need data
• Good old SNMP? Too coarse!
• Router logs? OK, but you need to know what you’re looking for
• Run a darknet?
– It’s not where a core network makes a difference
– Others already do it
• NetFlow? Yes, but you need good tools!
• Routing data? Only as a complement to NetFlow
Proof of concept: what can
we see with NetFlow data?
NfSen, enhanced with self-written
Anomaly Detection extensions
NetFlow collected on all
peering interfaces
1/1,000 sampling
3k flows/s
Bits, Packets or Flows?
What to use?
• Flows/s are more indicative of security incidents
• But with fixed thresholds, small interesting peaks will
disappear in daily cycles!
OK, we’re smart… let’s filter!
• It’s an “observer”
[Diagram: “observer” block diagram with the Input, Σ blocks, states X1 and X2, gains K1 and K2, and the forecast error as output]
k1 = (1-p1)*(1-p2) ; k2 = (1-p1)+(1-p2)
Choice: p1 = p2 = 0.9
• The “error” is used in control loops
• Here we use it to spot a deviation from a baseline
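A minimal Python sketch of such an observer, assuming a two-state predictor (a level X1 and a trend X2) that is corrected by the forecast error through the gains k1 and k2. The exact wiring of the block diagram cannot be recovered from this transcript, so the update equations are an assumption: they are chosen so that k1 = (1-p1)*(1-p2) and k2 = (1-p1)+(1-p2) place the filter poles at p1 and p2. The function name `observer_errors` and the initialisation from the first sample are illustrative only.

```python
def observer_errors(samples, p1=0.9, p2=0.9):
    """Return the forecast error for each sample of a time series.

    Assumed structure (not taken from the slides):
        error = input - x1
        x1   <- x1 + x2 + k2 * error   (level)
        x2   <- x2 + k1 * error        (trend)
    """
    k1 = (1 - p1) * (1 - p2)
    k2 = (1 - p1) + (1 - p2)
    # Start from the first sample to avoid a long start-up transient.
    x1 = samples[0] if samples else 0.0
    x2 = 0.0
    errors = []
    for u in samples:
        e = u - x1  # deviation of the input from the forecast baseline
        errors.append(e)
        # Both states are updated from their previous values.
        x1, x2 = x1 + x2 + k2 * e, x2 + k1 * e
    return errors


if __name__ == "__main__":
    # Flat baseline with one isolated spike.
    series = [100.0] * 300 + [200.0] + [100.0] * 50
    errs = observer_errors(series)
    print(errs[300], errs[301])
```

With p1 = p2 = 0.9 the gains are k1 = 0.01 and k2 = 0.2. In this sketch an isolated spike of +100 over a flat baseline gives an error of +100, followed by a negative overshoot of about -20 on the next sample, because the filter state has been pulled up by the spike.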
Does it help? Not if we stick
to volumes (e.g. flows/s) …
TCP flows (filtered)
UDP flows (filtered)
Are there other more “security
sensitive” features?
• Recent work on Anomaly Detection suggests focusing on
the concentration or dispersion of
– Flows per IP source address
– Flows per IP destination address
– Flows per IP source port
– Flows per IP destination port
• AKA “IP features entropies”
Explanation of IP feature
entropy
[Figure: fraction of total flows received per IP address vs. IP (ranked). Two panels: “Normal” and “Traffic more focused towards a few hosts”]
The entropy H is: $H(x) = -\sum_{i=1}^{N} \frac{n_i}{S} \log_2 \frac{n_i}{S}$
H varies between 0 (“one point takes all”)
and $\log_2 N$ (uniform distribution)
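As an illustration, the entropy of one IP feature can be computed from a list of observed values, one entry per flow. This is a minimal Python sketch: the function name `feature_entropy` and the input format are assumptions, with n_i taken as the number of flows carrying the i-th distinct value and S as the total number of flows.

```python
import math
from collections import Counter


def feature_entropy(values):
    """Entropy (in bits) of the distribution of flows over feature values.

    `values` holds one entry per flow, e.g. the destination IP of each flow.
    """
    counts = Counter(values)          # n_i for each distinct value
    total = sum(counts.values())      # S, the total number of flows
    h = 0.0
    for n_i in counts.values():
        p = n_i / total
        h -= p * math.log2(p)
    return h


if __name__ == "__main__":
    uniform = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
    focused = ["10.0.0.1"] * 4
    print(feature_entropy(uniform))  # log2(4) for four equally likely values
    print(feature_entropy(focused))  # "one point takes all"
```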
IP feature entropy
(simplified)
[Figure: fraction of total flows received per IP address vs. IP (ranked). Two panels: “Normal” and “Traffic more focused towards a few hosts”; panel annotations f=0.81 and f=0.6]
• Percentage of flows associated with the top N src IPs, dst IPs, src ports, dst ports
• We tried N = 1, 10, 100, 500
• N=10 was the best choice (anomalies are more evident)
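The simplified metric can be sketched in a few lines of Python. The function name `top_n_fraction` and the one-value-per-flow input are assumptions made for illustration; the same function would be applied separately to source IPs, destination IPs, source ports and destination ports.

```python
from collections import Counter


def top_n_fraction(values, n=10):
    """Fraction of flows that belong to the n most frequent feature values.

    `values` holds one entry per flow (e.g. the destination port of each flow).
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total


if __name__ == "__main__":
    dst_ports = [6667] * 6 + [80] * 3 + [443]
    print(top_n_fraction(dst_ports, n=1))
    print(top_n_fraction(dst_ports, n=2))
```

A value close to 1 means the feature is concentrated on a few values; a low value means it is dispersed.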
IP features entropies (after
“observer” filtering)
TCP features entropies
UDP features entropies
10 days of GN2 traffic
Drilling down on a TCP peak
- Concentration of DST IPs and DST ports receiving flows
- Dispersion of SRC IPs and SRC ports
- The “bounce” is due to the filter, and needs a state machine to be correctly interpreted!
• IRC server in Slovenia, receiving a lot of 60-byte SYN packets on port 6667, mainly from a /16 subnet of a university in the Netherlands.
• Likely a “BotNet war”?
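The slides do not describe the state machine used to interpret the “bounce”. The following Python sketch shows one possible way to do it, under stated assumptions: an excursion of the forecast error beyond a threshold opens a peak, and an excursion of the opposite sign that immediately follows is treated as filter bounce and not reported. The state names, the function name `detect_peaks` and the threshold are all hypothetical.

```python
def detect_peaks(errors, threshold):
    """Report (index, sign) of peaks in a forecast-error series.

    Opposite-sign excursions that directly follow a peak are assumed to be
    the filter "bounce" and are suppressed.
    """
    IDLE, PEAK, BOUNCE = "idle", "peak", "bounce"
    state = IDLE
    peak_sign = 0
    events = []
    for i, e in enumerate(errors):
        sign = 1 if e > 0 else -1
        exceeds = abs(e) > threshold
        if state == IDLE:
            if exceeds:
                state, peak_sign = PEAK, sign
                events.append((i, sign))
        elif state == PEAK:
            if not exceeds:
                state = IDLE
            elif sign != peak_sign:
                state = BOUNCE          # overshoot after the peak: ignore
        else:  # BOUNCE
            if not exceeds:
                state = IDLE
            elif sign == peak_sign:
                state = PEAK            # a new excursion in the peak direction
                events.append((i, sign))
    return events


if __name__ == "__main__":
    errors = [0, 0, 100, -20, -17, -14, -5, 0, 0, -50, 10, 0]
    print(detect_peaks(errors, threshold=10))
```

In this example a plain threshold test would flag samples 2, 3, 4, 5 and 9, while the state machine reports only the two genuine excursions at samples 2 and 9.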
Drilling down on a UDP peak
- Concentration of SRC and DST IPs and SRC ports
- Dispersion of DST ports
- Observe again the “bounce”!
• Port scan of a host in CARNET, from 4 hosts, 29-byte packets
And on smaller aggregates?
“DWS” NREN example
- Concentration of SRC and DST IPs and SRC ports
- Dispersion of DST ports
- Observe again the “bounce”!
• A routing shift event lasting a few hours (primary to backup access) triggers a lot of “noise”
• One MUST be able to correlate feature entropies & traffic shifts!
• Other than that, peaks are still very clear!
And on smaller aggregates?
“NON DWS” NREN example
• Fewer peaks, but
still evident
Lessons learnt so far
• IP feature entropies also reveal low-volume anomalies,
and can give an initial hint on the anomaly type, but:
– they need a state machine to be interpreted
– fully automatic conclusions are difficult
– one must not be oblivious to big volume shifts and
macroscopic events!
• A lot of anomalies are “observable” on DWS connectivity
– Good reason for having a security service protecting
DWS customers!
– But we’ve seen attacks/scans between NRENs as well
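The “initial hint on the anomaly type” can be pictured as a lookup from the pattern of concentration and dispersion across the four features to a candidate explanation. The Python sketch below encodes only the two patterns reported in the drill-down slides (the TCP peak towards the IRC server and the UDP port scan); the table layout, the labels and the function name `anomaly_hint` are assumptions, and any other pattern still requires a manual drill-down.

```python
CONC = "concentrated"
DISP = "dispersed"

# Key order: (src IPs, dst IPs, src ports, dst ports).
# Only the two patterns seen in the drill-down examples are listed.
HINTS = {
    (DISP, CONC, DISP, CONC): "many sources hitting one service (e.g. SYN flood)",
    (CONC, CONC, CONC, DISP): "port scan of one host from a few sources",
}


def anomaly_hint(src_ips, dst_ips, src_ports, dst_ports):
    """Return a first hint for a pattern of feature-entropy changes."""
    pattern = (src_ips, dst_ips, src_ports, dst_ports)
    return HINTS.get(pattern, "unknown pattern: drill down manually")


if __name__ == "__main__":
    print(anomaly_hint(DISP, CONC, DISP, CONC))
    print(anomaly_hint(CONC, CONC, CONC, DISP))
    print(anomaly_hint(DISP, DISP, DISP, DISP))
```

Such a table gives a starting point only; as noted above, fully automatic conclusions are difficult.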
Moving forward
• With NfSen and self-written extensions we have enough
evidence that:
– anomalies are observable in the GÉANT2 core
– novel automatic methodologies for their classification
are applicable
• However, we are looking at commercial tools for moving to
a service
– To reduce the effort to engineer / maintain / evolve code
– Scalability and tool support are an issue for a service
Tools requirements
• Detection of both low and high volume anomalies
– (DoS, DDoS, host and network scans, worms, phishing
sites, etc.)
• Automatic classification, collection of evidence
• Detection of anomaly entry points, suggestion of ACLs
• Give correct indications also in the presence of sudden traffic
shifts due to routing changes/network outages
• Robustness to occasional loss of NetFlow records
• Work well also with sampled NetFlow
Tools’ comparison
• Work just started, no conclusion yet
• We just report the “lessons learnt” so far
– On-paper analysis of some tools (four in some detail)
– Interaction with vendors
Tools approaches
• IP features entropy + volume
– Pros: no additional info needed, works with low sampling
rate, can catch a wide range of anomalies
– Cons: needs drill down after “alert”
• Volume + “fingerprints”
– Pros: precise, an alert is already “a conclusion”
– Cons: won’t catch what you don’t look for
• Per host behavioural analysis
– Pros: precise
– Cons: scalability? robustness to low sampling?
Tools common features
• Require NetFlow on ingress links only
• Capable of doing NetFlow v5 and v9
• Require SNMP access to routers to read configuration data
Tools distinguishing
features
• BGP processing
– create POP to POP (or even prefix to prefix) matrices
– correlate big volume shifts to routing changes
• Internal routing (e.g. IS-IS) processing
– traffic split of peers on internal (backbone) links
• NetFlow collection on multiple points (routing tracing)
– But this is not really a plus, rather an additional burden
for NOT using routing data!
Tools distinguishing
features (cont.)
• Different approaches to distinguish “normal” from “not
normal” behaviour
– Principal Component Analysis
– Host type classification & rather complex “scoring”
system
– moving averages
– fixed thresholds
Service Outlook
• Primary recipients: NREN CERTs
• Info provided:
– security alerts about all types of discovered anomalies
– Collected evidence
– Suggested mitigation actions
– Periodic summary reports
• Other recipients: APMs, NREN PC, EU Commission
– For strategic decisions
Acknowledgements
• Prof. Francesco Donati and Dott.sa Gabriella Caporaletti of
EICAS automazione S.P.A. for the useful hints on the
“observer” design and tuning
• Peter Haag from SWITCH for the development of NfSen
Thank you! – Questions?
[email protected]