slides - Carleton Computer Security Lab

Download Report

Transcript slides - Carleton Computer Security Lab

Technological Networks:
From Empirical Laws to Theory
Stephanie Forrest
Dept. of Computer Science
University of New Mexico
http://cs.unm.edu/~forrest
[email protected]
UNM CS Dept. Profile
• 18 Faculty:
– 9 tenured (5 full, 4 assoc), 5 assistant,1 lecturer
– 3 openings (2 junior, 1 senior)
– Prince of Asturias Endowed Chair in Information Technology
– External faculty appointments from other departments and national labs.
• Close collaborations with SFI and national labs (SNL and LANL)
• Students:
– Degrees: BS: ~40, MS: ~35, PhD: ~5
– Undergraduate: ~200 majors; BS degrees
• >20% Female; >35% Minorities
– Graduate: ~120 MS ~80 Ph. D.
• >20% Female; 40-60% Foreign
• Funding: (2004-2005)
– Total: $3.5M
– NSF, DARPA, DOE, NIH, Sandia and Los Alamos
UNM CS Dept. Research
•
Strongly Interdisciplinary:
– Adaptive computation
– New paradigms of computing (molecular and quantum computation)
– Computational biology and bioinformatics (phylogenetic tree
reconstruction, radiology, RNAi)
•
•
Graphics and visualization
High-performance computing:
– Light-weight distributed operating systems
•
Automated reasoning and machine learning:
– Otter
– POMDP
•
Complex networks:
– Provably robust scalable algorithms for P2P networks
– Phase transitions in NP-complete problems
Themes of Talk
• The real world isn't exactly scale free.
• Understanding and predicting network structure is important for
engineering:
– Network properties can be exploited to enhance computer security
(computer epidemiology / border gateway protocol)
• We lack theory to explain/predict the structure of technological
networks:
– Preferential attachment isn’t good enough.
• Initial steps toward theory.
Technological Networks
• Distribute resources:
– Energy
– Materials
– Information
• Energy distribution:
– Power grids
– Gas pipelines
• Transportation:
– Highways
– Airline routes
• The Internet:
–
–
–
–
Physical connectivity
Autonomous systems (AS)
World-wide web
Social contacts, e.g., email
• Microarchitecture
Network Structure
•
Network topology affects network
properties:
– Shortest distance between two
nodes
– Bisection width
– Rate and extent of contagion
•
Analysis:
– Epidemiological models.
– The epidemic threshold.
•
Degree distribution of network:
– Scale-free (power law) networks:
– Pk = k-c
•
Controlling infections on scale-free
networks:
– Random vaccination is ineffective
(e.g., anti-virus software).
– Targeted vaccination of highconnectivity nodes.
QuickTime™ and a
TIFF (LZW) decompressor
are needed to see this picture.
Example 1: Computer Epidemiology
Justin Balthrop, Mark Newman, Matt Williamson
•
Viruses and worms spread over networks of contacts between
computers:
– Email address books.
– URL links.
•
Different types of networks are exploited by different types of infections.
UNM CS Dept. Network of Address Books
Degree Distributions of Four Networks
Relevant to Computer Security
300
IP network
Adminstrator network
250
Email traffic
Address books
10000
200
1000
150
100
100
10
50
0
1
0
100
200
300
400
1
Degree k
10
100
1000
Degree k
Science 304:527-529 (2004)
– Not scale-free.
– Targeted vaccination unlikely to be effective:
•
•
> 10% of nodes required for address book data
> 87% of nodes required for email traffic data
– Computer infections can choose their own topology, so network topology is not
static.
– Viruses spread faster than repairs.
Throttling: Generic Control of Epidemics
Matt Williamson and Justin Balthrop
•
Control network topology in time rather than space.
–
–
•
Assumes that virus traffic is significantly different from normal traffic:
–
–
•
Effective when the form of infection not known in advance.
Reduces amount of traffic generated during an epidemic.
Limitations:
–
•
Nimbda infects up to 400 new machines per second compared to normal rate of
connections to new Web servers of about one connection per second or slower
(Williamson, 2002).
Throttling Nimbda would have increased the epidemic time from one day to over one
year.
Advantages:
–
–
•
Limit the rate at which a computer can make new connections
Limits spreading rates rather than stopping.
Effectiveness decreased if not all nodes are throttled (altruism), stealth attacks.
Implementations:
–
–
RIOT
HP’s Virus Throttle
Responsive I/O Throttle (RIOT)
Justin Balthrop and Matt Williamson
•
Vision: An adaptive desktop firewall:
–
–
–
–
•
Personal desktop firewalls:
–
–
•
Hamper worms, viruses, DOS,
misconfigurations, etc.
Graduated automated response
(throttling) on all connections.
Adaptive (learn normal).
Robust to false positives
Coarse-grained
Static and preprogrammed
•
How to detect anomalous connections?
–
Generic rate limit (throttle):
–
–
Delay network connections that
occur at an anomalous rate.
Detector activation triggers delay
and packets are dropped.
IP Address 1
32 bits
IP Address 2
32 bits
Set of detectors (lymphocytes) that
observe TCP connections:
•
•
–
–
TCP Port 1
16 bits
Data Path (below)
Meta-information (direction, TCP flags)
Each detector matches some portion of IP
space.
Each detector has its own normal activity
level.
TCP Port 2
16 bits
Learning and Throttling Connections
1. A new connection is initiated
(SYN Packet)
2. The connection information is
translated to a bitstring
3. The bitstring is shown to all
detectors
4. An immature detector exceeds
its activation threshold
5. The detector’s activation
threshold is raised
6. Detectors with stable activation
thresholds become mature
Autonomic Responses: A Repertoire
LISYS
(Steven Hofmeyr)
E-mail system administrator when there is
an anomalous network connection.
Personal
Desktop
Firewall
Block network connections based on
user-specified policy.
pH
Slow down system calls for anomalous
processes.
(Anil Somayaji)
Throttling
(Matt Williamson)
RIOT
(Justin Balthrop)
PGBGP
(Josh Karlin)
Limit the overall rate of new network
connections for a computer.
Limit rate of all anomalously behaving
TCP connections. Adaptive.
Limit the rate at which new BGP routes
are adopted.
Example 2: Inter-Domain Routing
Josh Karlin, Stephanie Forrest, and Jennifer Rexford
•
•
~20,000 Autonomous Systems
(ASs) connected via the Border
Gateway Protocol (BGP)
ASs route blocks of IP’s, know
as prefixes
AT&T
– 64.106.0.0/17 (Owned by UNM)
•
•
~170,000 prefixes owned by
ASs today.
Border gateway protocol (BGP):
– Tell neighbors about new routes
(Announcements).
– Tell neighbors about old routes
gone bad (Withdrawals).
– That’s it.
Time Warner
Telecom
Comcast
SWCP
UNM
BGP Networks are Interesting and Important
• Distributed:
– Nodes are AUTONOMOUS systems.
– No centralized routing information (routes stored and maintained locally).
– No authentication of new nodes or routes.
• Dynamic:
– Network connectivity changes routinely and continually.
– Network updates are spread through local contact (BGP).
• Confluence of technological and economic constraints:
– A “policy” network as well as a routing network.
• All inter-domain Internet traffic relies upon BGP.
• Vulnerable:
– Trivial to inject false routing information into network.
– Man-in-the-Middle attacks.
• Pretty Good BGP (PGBGP) and Internet Alert Registry (IAR).
– Throttle the adoption of new routes.
PGBGP Algorithm
• Main Idea: Delay Suspicious Routes
– Lower the preference of suspicious routes (24hr)
• Detection:
– Monitor BGP update messages
– Treat origin ASs for a prefix seen within the past few days as
normal
– Treat new origin ASs as suspicious for 24 hours, then accept as
normal (possible prefix hijack)
– Treat new sub-prefixes as suspicious for 24 hours, then accept as
normal (possible sub-prefix hijack)
• Response:
– Suspicious origin AS routes are temporarily given low local
preference
– Suspicious sub-prefixes are temporarily ignored (not forwarded to)
PGBGP Advantages
• Incremental deployability
– No change to BGP protocol, just to path selection
– Immediate benefits to adopting AS and customers
• Automated and immediate response
– Avoid using and propagating the bogus route
– Network has chance to stop the attack before it spreads
• Robust to false positives
– Lowering preference for suspicious routes
– No loss of reachability
– Accidental short-term delays do no harm
• Offline investigation of suspicious route
– Internet Alert Registry, active probing, …
• Adaptive, simple
Incremental Deployment
Prefix Hijack
Subprefix Hijack
ICNP (2006)
•
Limitations:
– Doesn’t address path spoofing, redistribution attacks.
– Negligence
BGP Network Structure
Petter Holme and Josh Karlin
•
Barabasi-Albert (BA) model (Barabasi and Albert, 1999):
–
–
•
Vertices and edges added iteratively
Probability of attaching to vertex i is proportional to k(i)
Inet model (Winick and Jamin, 2000)
–
–
–
No simple growth principle
Generate random graph with known degree distribution
Augment to mimic additional known correlations (e.g., connecting all high-degree
nodes)
Proc. Royal Acad. A (in press)
Tentative Conclusions
• Real AS graph is more heterogeneous than can be expected
from degree distribution alone:
–
–
–
–
Core providers in the low-d tail
Peak at d=3 (vertices directly connected to the core)
Second peak at d=4 (vertices directly connected to d=3 nodes)
More structure in periphery than predicted by earlier models.
• Preferential attachment is a poor model for network growth.
• What constraints determine network architecture and growth?
Allometric Scaling in Biology
Y  Y0 X 3 / 4
ln (metabolic rate *eE/kT)
30
Endotherms
Reptiles
Fish
Amphibians
Invertebrates
Unicells
Plants
10
y = 0.71x + C
-10
-30
0
30
ln (body mass)
•
Dominant design constraint:
– Distribute resources to every cell in the organism
•
•
•
Internal, space-filling, hierarchical networks (vascular system)
Invariant terminal units (capillaries)
Optimality (minimize transport time, maximize metabolism)
Examples: Scaling in Software
H. Inoue
HelloWorld: Unique Function Calls vs. Invocation Freq
LimeWire Behavior
Example: Scaling in Software
Dave Ackley and Terry Van Belle
•
How to measure evolvability?
– Likelihood (how likely is a location to
change).
– Impact (change in one location
affects other locations).
•
•
Work = [Likelihood x Impact]:
–
Software maintenance costs.
–
Expected time to evolve.
Study long-lived Java code
bases:
–
Code change sizes and frequencies
are power law ish.
Van Belle, 2004
A Theory of Network Scaling: Conclusions
• Why it’s important
– Networking infrastructure continues to expand by orders of
magnitude:
• NSF GENI project: Redesign the Internet from the ground up.
• Interplanetary networking.
– Architecture:
• Move from performance-oriented to power-aware designs.
• The end of silicon.
– Scaling problems in software, security problems everywhere
• Why it’s hard
– Terminal units aren’t necessarily invariant sized
– Dimensionality and geometry are not obvious
– May require new mathematics