Internet Intrusions: Global Characteristics and Prevalence
Trends in Internet Measurement
Fall, 2004
Paul Barford
Assistant Professor
Computer Science
University of Wisconsin
Motivation
• The Internet is gigantic, complex, and constantly evolving
– Began as something quite simple
• Infrequent use of “scientific method” in network research
– Historical artifact
– Lack of inherent measurement capability
– Decentralization and privacy concerns
• Recognition of importance of empirically-based research
– Critical trend over past five years (Internet Measurement Conf.)
• Good research hypothesis + good data + good analysis =
good research results
– Focus of this talk: “good data” - where we’ve been and where
we’re going
wail.cs.wisc.edu
In the beginning…
• Measurement was part of the original Arpanet in ’70
– Kleinrock’s Network Measurement Center at UCLA
– Resources in the network were reserved for measurement
• Formation of Network Measurement Group in ’72
– RFC 323 – who is involved and what is important
• First network measurement publication in ’74
– “On Measured Behavior of the ARPA Network,” Kleinrock
and Taylor
• No significant difference between operations and
research
– Size kept things tractable
From ARPAnet to Internet
• In the 80’s, measurement-based publications increased
– “The Experimental Literature of the Internet: An Annotated
Bibliography,” J. Mogul, ’88.
• RFC 1262 – Guidelines for Internet Measurement
Activities, 1991
– V. Cerf, “Measurement of the Internet is critical for future
development, evolution and deployment planning.”
• What happened?
• “On the Self-Similar Nature of Ethernet Traffic”, Leland
et al., ‘94.
– Novel measurement combined with thorough analysis
– A transition point between operational and research
measurement (?)
Gold in the streets in the 90’s
• Lots of juicy problems garnered much attention in 90’s
– Transport, ATM, QoS, Multicast, Lookup scalability, etc.
• The rise of simulation (aaagggghhhhh!!!!)
• Measurement activity didn’t die…
– Research focus on Internet behavior and structure
• Self-similarity established as an invariant in series of studies
• Paxson’s NPD study from ’93 to ’97
• Routing (BGP) studies by Labovitz et al.
• Structural properties (the Internet as a graph) by Govindan et al.
– Organizations focused on measurement
• National Laboratory for Applied Network Research (‘95)
• Cooperative Association for Internet Data Analysis (‘97)
Measurement must be hard…
• Well, not really…lots of folks are measuring the
Internet
– See CAIDA or SLAC pages
– Businesses get paid to measure the Internet
• Lots of tools are available for Internet measurement
– See CAIDA and SLAC pages
– Dedicated hardware
– Public infrastructures
So, what’s the problem?
• “Strategies for Sound Internet Measurement,” Paxson
‘04.
– Lack of consistent methods for measurement-based experiments
– Problems faced in other sciences years ago
• Issues of scale in every direction
– What is representative?
– HUGE, HIGH-DIMENSION data sets make things break
• Disconnect between measurements for operations and
measurements for research
– Operational interests: SLAs, billing, privacy, …
– Research interests: network-wide properties
Current measurement trends
1. Open end host network measurement infrastructures
– Available for a variety of uses
2. Large public data repositories
– Domain specific
– Suitable for longitudinal studies
3. Network telescope monitors
– Malicious traffic
4. Laboratory-based testbeds
– Bench environments
5. Standard anonymization methods
– Address privacy concerns
End host infrastructures
• Paxson’s NPD study; an end-host prototype
– Accounts on 35 systems distributed throughout the Internet
– Active, end-to-end measurement focus
• National Internet Measurement Infrastructure (NIMI)
and others evolved from NPD
– Perhaps a bit too ambitious at the time
• Today’s end host infrastructure “success story”:
PlanetLab
PlanetLab overview
• Collaboration between Intel, Princeton, Berkeley,
Washington, others starting in early ‘02
• Began as a distributed, virtualized system project
– Peer-to-peer overlay systems were getting hot
– Applications BOF at SIGCOMM ‘02 had only 6/80 people
• Systems were donated to an initial set of sites in ‘02
– Most major universities and Abilene POPs
• Available to members who host systems
• Developers have done a fine job creating a
management environment
– Isolates individual experiments from each other
PlanetLab sites
449 nodes at 209 sites: source www.planet-lab.org
End host infrastructures & SP
• End host infrastructures are primarily for active
measurement
– Generate probes and measure responses
• Problem domains
– Network structure via tomography
– Network distance estimation
– End-to-end resource estimation
– End-to-end packet dynamics
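The "generate probes and measure responses" idea above can be sketched in a few lines. This is a minimal illustration, not a tool from the talk: it assumes a TCP connect is an acceptable probe and treats the time to complete the handshake as one round-trip sample. The function names `tcp_connect_rtt` and `probe` are hypothetical.

```python
import socket
import time
from typing import List


def tcp_connect_rtt(host: str, port: int, timeout: float = 2.0) -> float:
    """Time (in seconds) to complete one TCP connect to (host, port)."""
    start = time.perf_counter()
    # The handshake finishes when create_connection returns.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed


def probe(host: str, port: int, count: int = 5) -> List[float]:
    """Send `count` probes; keep one RTT sample per probe that got a response."""
    samples = []
    for _ in range(count):
        try:
            samples.append(tcp_connect_rtt(host, port))
        except OSError:
            # Refused, unreachable, or timed-out probes yield no sample.
            pass
    return samples
```

Run from many end hosts toward many targets, samples like these are the raw input for the distance and resource estimation problems listed above; real infrastructures would also record timestamps and losses.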
Large public data repositories
• First data repository - Internet Traffic Archive (LBL)
– Hodgepodge of traces from various projects
• Current projects are more focused
– Passive Measurement and Analysis Project
• Packet traces from high performance monitors
– Abilene Observatory
• Flow traces from the Internet2 backbone routers
– Route views/RIPE
• BGP routing updates from ~150 networks
– Datasets for network security
• DHS project focused on making attack and intrusion data
available for research
Data repositories & SP
• Most of the data in aforementioned repositories was
gathered via passive means
– Counters/logs on devices
– Installed instrumentation
– Configuration to measure specific traffic (BGP)
• Problem domains
– Anomaly detection
– Traffic dynamics
– Routing dynamics
Network telescopes
• Simple observation 1: number of globally routed IP
addresses ≠ number of live hosts
– Network address translation
– Networks (ranges of IP addresses) are routed
• Simple observation 2: traffic to/from standard
services should only arrive at live hosts
– Misconfigurations and malicious traffic are the exceptions
• Network telescope = traffic monitor on routed but
otherwise unused IP addresses
– This traffic is otherwise usually dropped at border router
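The telescope definition above can be illustrated with a small filter. This is a sketch under stated assumptions, not the monitors used in any cited study: packets are assumed to arrive as `(src, dst, dst_port)` tuples, the routed prefix and live-host list are made-up values from documentation address ranges, and `make_telescope` and `summarize` are hypothetical names.

```python
import ipaddress
from collections import Counter


def make_telescope(routed_prefix, live_hosts):
    """Return a predicate that is True for routed but unused destinations."""
    network = ipaddress.ip_network(routed_prefix)
    live = {ipaddress.ip_address(h) for h in live_hosts}

    def is_dark(dst):
        addr = ipaddress.ip_address(dst)
        return addr in network and addr not in live

    return is_dark


def summarize(packets, is_dark):
    """Count packets that hit dark addresses, grouped by destination port."""
    counts = Counter()
    for _src, dst, dport in packets:
        if is_dark(dst):
            counts[dport] += 1
    return counts


# Hypothetical traffic toward an assumed routed prefix with one live host.
is_dark = make_telescope("192.0.2.0/24", ["192.0.2.1"])
packets = [
    ("198.51.100.7", "192.0.2.10", 80),   # dark address: counted
    ("198.51.100.7", "192.0.2.20", 445),  # dark address: counted
    ("198.51.100.9", "192.0.2.1", 80),    # live host: ignored
    ("198.51.100.9", "203.0.113.5", 80),  # outside the prefix: ignored
]
dark_counts = summarize(packets, is_dark)
```

Because no legitimate service lives on the dark addresses, everything the filter keeps is either misconfiguration or unsolicited (often malicious) traffic.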
So, what’s the point?
• Bad guys don’t know which IP addresses in a
network are live
– Random and systematic scanning commonly used
– Spoofed source addresses are used in DoS attacks
– Misconfigurations are fairly rare
• Ergo, network telescopes can provide important
perspective on malicious traffic
– Most importantly, a clean signal
• Implementation is fairly simple
– Honeypots of live systems or honeypot specific monitors
What do we see?
• “Characteristics of Internet Background Radiation,”
Yegneswaran et al., ‘04.
– Active monitors (small, medium, large) at 3 sites
• Traffic is dominated by activity on common services
– Worms and probes targeting HTTP and NetBIOS
– The focus of our study
• Traffic is highly variable and diverse
– Perspectives from 3 monitors are quite different
• Traffic mutates rapidly
• Much deeper analysis is necessary
Network telescopes & SP
• An emerging, rich source of data
• Network security is critically important
• Problem domains
– Outbreak and attack detection
– Collaborative monitoring
– Dynamic quarantine
– (Misconfiguration analysis)
Laboratory-based testbeds
• Most scientific disciplines commonly use bench
environments to conduct research
– Control
– Instrumentation
– Repeatability
• Network research community has relied on analytic
modeling, simulation and empirical measurement
• Openly available bench environments for network
research are emerging
– EMULAB at Utah - collection of end hosts
– Wisconsin Advanced Internet Lab - collection of routers and
end hosts
Laboratory testbeds & SP
• The effectiveness of bench research hinges on
research hypothesis and experimental design
– Aspects of scale (emergent behavior) are difficult to capture
• Problem domains
– Inference tool analysis
– Protocol (implementation) analysis
– Anomaly detection
– Network system evaluation
Data anonymization
• Lots of people measure, most are scared s*!#less
about sharing data
– This is a legal issue
– No standards (sure payloads are off limits, but addresses?)
– Don’t ask, don’t tell?
• The community is developing tools for trace
anonymization
– “A High-Level Programming Environment for Packet Trace
Anonymization and Transformation,” Pang et al., ‘03.
– Prefix preserving address anonymization
– Payload hashing
• Probably no direct SP application
– But, implications vis-à-vis future data availability
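To make "prefix preserving address anonymization" and "payload hashing" concrete, here is a minimal sketch. It is not the tool from Pang et al. and not a vetted implementation: it assumes the common bit-by-bit construction in which each output bit is the input bit XORed with a keyed pseudorandom function of the preceding bits, with HMAC-SHA256 standing in as that function. The names `anonymize_ipv4` and `hash_payload` and the key handling are hypothetical.

```python
import hashlib
import hmac
import ipaddress


def _prf_bit(key: bytes, prefix_bits: str) -> int:
    """One pseudorandom bit derived from the key and the bits seen so far."""
    digest = hmac.new(key, prefix_bits.encode("ascii"), hashlib.sha256).digest()
    return digest[0] & 1


def anonymize_ipv4(addr: str, key: bytes) -> str:
    """Map an IPv4 address so that shared prefixes stay shared.

    Bit i is flipped (or not) based only on bits 0..i-1, so two addresses
    with a common k-bit prefix get outputs with a common k-bit prefix.
    """
    bits = format(int(ipaddress.IPv4Address(addr)), "032b")
    out = []
    for i, bit in enumerate(bits):
        out.append(str(int(bit) ^ _prf_bit(key, bits[:i])))
    return str(ipaddress.IPv4Address(int("".join(out), 2)))


def hash_payload(payload: bytes, key: bytes) -> str:
    """Replace a payload with a keyed digest: equality survives, content does not."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()
```

With a fixed key, two hosts in the same subnet still appear in the same (renamed) subnet after anonymization, which keeps routing and locality analyses possible while hiding the real addresses.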
Summary
• Trends over past 30 years
– Divergence of research and operations
– Decline of importance of measurement in research
– Empirical study of the Internet as an artifact
• Current trends
– Rise of measurement as a discipline
– Open infrastructures and network testbeds
– Large-scale domain specific data repositories
– Novel measurement methods
• Future trends
– ??
– Embedded measurement systems