Developing Network Testbed Data Sets
Visualisation Network-of-Experts Working Group
Supporting NATO Research Task Group IST-059/RTG-025
November 6-8, 2007
Amy Vanderbilt
Marcus Lem
Cristin Hall
Joanne Treurniet
Rob Young
NATO IST-059 Network of Experts
BACKGROUND
Q6: How can “clean” data sets be produced for various types of networks to provide testbeds with realistic traffic and other elements?
• The DARPA Intrusion Detection Experiment data is a good, although outdated, example (closed computer network traffic)
• How can we develop similar testbed data sets for other network types?
• One way may be to accept real-world networks (social and otherwise) where a certain sub-network is modeled in detail based on historical (and hopefully unbiased) data
• Such testbed data sets may be the first step towards answering many questions
WHY
• To provide initial validation of algorithms for various purposes (prediction, detection, etc.)
• For ease of use: to allow the larger research community to test and advance their research towards needed solutions without having to hand out classified, current real-world data
• To determine the independent variables and minimal models
• Research hypothesis testing: honing in on applications for transferring work from theory
• Stages of validation: test on simulated data → test on historical data → test on current data
WHAT SHOULD A DATA SET BE?
• Can we create a generic network data set that would be applicable across applications?
• We could create a generic motif (subnet) component for each network type and property set based on the framework categorization
• Then we can build larger networks from these motifs depending on the application
• We will need a mapping from the applications to the network types and properties
• Each node and link must have parameters and/or constraints
  • Battery life, bandwidth, distance capabilities, latency, restrictions on traffic, etc.
• Need to be able to test on the margins (extreme cases, catastrophic scenarios)
• Need to allow the users of these data sets to change attributes on links and nodes
• We need to have a probability distribution based on historical data
• Need to tag submissions as coming from open or closed systems
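As a rough illustration of the motif idea above, the following Python sketch builds a larger network from two copies of a small motif and then edits one node attribute to probe a margin case. The `Motif` class, the `compose` and `star_motif` helpers, and every attribute name and value (`battery_h`, `bandwidth_kbps`, `latency_ms`, `open_system`) are hypothetical and chosen only for illustration; the working group did not specify a data format.

```python
from dataclasses import dataclass, field


@dataclass
class Motif:
    """A small subnet: node parameters plus links with their own parameters."""
    name: str
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    links: dict = field(default_factory=dict)  # (src, dst) -> attribute dict


def star_motif(name, leaves=2):
    """Hypothetical sensor motif: one hub linked to battery-powered leaves."""
    nodes = {"hub": {"battery_h": 48.0, "open_system": True}}
    links = {}
    for i in range(leaves):
        leaf = f"leaf{i}"
        nodes[leaf] = {"battery_h": 12.0, "open_system": True}
        links[("hub", leaf)] = {"bandwidth_kbps": 250, "latency_ms": 5.0}
    return Motif(name, nodes, links)


def compose(motifs, bridges):
    """Merge motifs into one network, prefixing node ids with the motif name.

    bridges is a list of ((motif, node), (motif, node), attrs) tuples that
    link nodes living in different motifs.
    """
    net = Motif(name="composed")
    for m in motifs:
        for node, attrs in m.nodes.items():
            net.nodes[f"{m.name}.{node}"] = dict(attrs)
        for (a, b), attrs in m.links.items():
            net.links[(f"{m.name}.{a}", f"{m.name}.{b}")] = dict(attrs)
    for (ma, a), (mb, b), attrs in bridges:
        net.links[(f"{ma}.{a}", f"{mb}.{b}")] = dict(attrs)
    return net


net = compose(
    [star_motif("s1"), star_motif("s2")],
    bridges=[(("s1", "hub"), ("s2", "hub"),
              {"bandwidth_kbps": 1000, "latency_ms": 20.0})],
)

# A user edit to test on the margins: nearly drain one leaf's battery.
net.nodes["s1.leaf0"]["battery_h"] = 0.1
```

Because `compose` copies each attribute dictionary, changing an attribute in the composed network leaves the other motif instances untouched.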
VITA SEARCH
• VITA search on Network Data Test Social shows no actual data repositories, only a few sporadic lists compiled by individuals
HISTORICAL DATA COLLECTION
• Current data sets are collected sporadically, have little relevance (too specific), and are often not accessible
• We need a way to collect historical real-world data on which to base the probability distribution and other aspects of the nodes and links
  • Social: massively multiplayer online games
  • Computer: need to collect the traffic, connections, computer nodes, and services running on each node
  • Sensors: a simpler version of computer networks. We could place sensors in an appropriate environment and collect the network properties (ARL has open source data available)
• Might be able to extract properties from a real network and generate the motif based on that
• Need to be able to handle embedding fields and networks
• IDEA: Set up a data bank where people can contribute data sets coded in a preferred way and containing required parameters, etc. (they can get data if they contribute data) AND develop a way to auto-generate data sets
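One possible reading of "extract properties from a real network and generate the motif based on that" is sketched below in Python: it takes the degree distribution of an observed edge list and resamples it to wire up a synthetic network by random stub matching (a configuration-model style construction). This is an assumption about how auto-generation might work, not a method the group described. The function names and the tiny `historical` edge list are invented for illustration, and dropping self-loops and duplicate links means the synthetic degrees only approximate the sampled ones.

```python
import random
from collections import Counter


def degree_sequence(edges):
    """Degree of every node in an undirected edge list."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return list(deg.values())


def generate_like(edges, n_nodes, seed=0):
    """Build a synthetic undirected graph whose node degrees are resampled
    from the empirical degree distribution of `edges`."""
    rng = random.Random(seed)
    observed = degree_sequence(edges)
    degrees = [rng.choice(observed) for _ in range(n_nodes)]
    if sum(degrees) % 2:      # stubs must pair up, so force an even total
        degrees[0] += 1
    # One "stub" per unit of degree; shuffling and pairing stubs wires links.
    stubs = [node for node, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    synthetic = set()
    for a, b in zip(stubs[::2], stubs[1::2]):
        if a != b:            # drop self-loops; duplicates collapse in the set
            synthetic.add((min(a, b), max(a, b)))
    return synthetic


# Made-up "historical" observation, standing in for collected data.
historical = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]
g = generate_like(historical, n_nodes=50, seed=1)
```

The synthetic nodes are plain integers, so nothing from the original network's identifiers carries over into the generated data set.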
STEPS TAKEN
• First we listed the attributes needed for
  • Motifs: structure of a subnet
  • Nodes
  • Links
  • Traffic
• Then we considered how to “scrub” such a data set to allow open source use
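One simple way to scrub identifying values is to replace each distinct value with an anonymous code label while leaving structural values alone. The Python sketch below shows that idea under stated assumptions: the record layout, the field names (`host`, `os`, `degree`), and the label format are hypothetical, and a real scrubbing procedure would also have to consider what the remaining attributes reveal.

```python
import itertools


def make_scrubber(prefix):
    """Return a function mapping each distinct value to a stable code label."""
    labels = {}
    counter = itertools.count(1)

    def scrub(value):
        if value not in labels:
            labels[value] = f"{prefix}-{next(counter):04d}"
        return labels[value]

    return scrub


def scrub_records(records, fields):
    """Return copies of the records with the named fields replaced by labels."""
    scrubbers = {name: make_scrubber(name.upper()) for name in fields}
    cleaned = []
    for record in records:
        copy = dict(record)
        for name, scrub in scrubbers.items():
            if name in copy:
                copy[name] = scrub(copy[name])
        cleaned.append(copy)
    return cleaned


# Hypothetical node records; "degree" is structural and is left unchanged.
records = [
    {"host": "10.0.0.5", "os": "Linux", "degree": 3},
    {"host": "10.0.0.9", "os": "Linux", "degree": 1},
    {"host": "10.0.0.5", "os": "Linux", "degree": 3},
]
clean = scrub_records(records, fields=("host", "os"))
```

The same input value always maps to the same label, so relationships between records survive scrubbing. Labels are assigned in order of first appearance, which may itself leak information if the record order is meaningful.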
ATTRIBUTES
COMPUTER NETWORKS
Category | Attributes | Generic | Multi? | Scrub | Notes
-------- | ---------- | ------- | ------ | ----- | -----
Motifs | Structure: same as generic | Centrality | | None |
 | | Node degree distribution | | None |
 | | Moments of degree distribution | | None |
 | | Node betweenness distribution | | None |
 | | Connectedness | | None |
 | | Reachability | | None |
 | | Shortest path length | | None |
 | | Diameter | | None |
 | | Size | | None |
 | | Average clustering coefficient | | None |
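Several of the structural motif attributes listed for computer networks (node degree distribution, diameter, average clustering coefficient) can be computed directly from an edge list. The following Python sketch uses only the standard library; the function names and the four-node example are illustrative assumptions, and `diameter` assumes a connected, undirected graph.

```python
from collections import Counter, deque


def adjacency(edges):
    """Undirected adjacency sets built from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_distribution(adj):
    """Fraction of nodes having each degree."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    return {k: c / len(adj) for k, c in sorted(counts.items())}


def average_clustering(adj):
    """Mean over nodes of (links among neighbours) / (possible such links)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def diameter(adj):
    """Longest shortest-path length, assuming the graph is connected."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from each source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


# Illustrative motif: a triangle (1, 2, 3) with one extra node hanging off 3.
adj = adjacency([(1, 2), (2, 3), (1, 3), (3, 4)])
dist = degree_distribution(adj)
clustering = average_clustering(adj)
diam = diameter(adj)
```

None of these quantities depend on node identities, which fits the "None" entries in the Scrub column: they can be published without relabeling.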
ATTRIBUTES
COMPUTER NETWORKS
Category | Attributes | Generic | Multi? | Scrub | Notes
-------- | ---------- | ------- | ------ | ----- | -----
Motifs | Type: same as generic | Regular lattice | | |
 | | Small world | | |
 | | Random homogeneous | | |
 | | Scale-free | | |
 | | Etc. | | |
Nodes | Hostname or IP | Unique ID - who | | Anonymous code label |
 | Services offered | Purpose - why | | Anonymous code label |
 | Access control | Conditions - when | | Anonymous code label |
 | Hardware and OS | Traits - what | | Anonymous code label |
 | Asset value criticality | | | Anonymous code label |
Notes: Varies by app.
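To make two of the motif types concrete, the Python sketch below generates a regular ring lattice and then rewires its links with some probability, in the style of the Watts-Strogatz small-world construction. It is offered as an illustration of how typed motifs might be produced, not as the generator the group had in mind; the function names and parameter choices are assumptions.

```python
import random


def ring_lattice(n, k):
    """Regular lattice: each node links to its k nearest ring neighbours.

    Assumes k is even and n > k.
    """
    return {(i, (i + j) % n) for i in range(n) for j in range(1, k // 2 + 1)}


def small_world(n, k, p, seed=0):
    """Rewire each lattice link with probability p (Watts-Strogatz style)."""
    rng = random.Random(seed)
    edges = {tuple(sorted(e)) for e in ring_lattice(n, k)}
    for a, b in sorted(edges):  # iterate over a snapshot of the lattice links
        if rng.random() < p:
            # New endpoints must avoid self-loops and duplicate links.
            candidates = [c for c in range(n)
                          if c != a and tuple(sorted((a, c))) not in edges]
            if candidates:
                edges.remove((a, b))
                edges.add(tuple(sorted((a, rng.choice(candidates)))))
    return edges


lattice = small_world(10, 4, p=0.0)          # no rewiring: a regular lattice
sw = small_world(20, 4, p=0.3, seed=2)       # some long-range shortcuts
```

With `p=0.0` the result is the regular lattice itself, and rewiring keeps the number of links constant, so the two types can be compared at equal size and density.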
ATTRIBUTES
COMPUTER NETWORKS
Category | Attributes | Generic | Multi? | Scrub | Notes
-------- | ---------- | ------- | ------ | ----- | -----
Links | Bandwidth | Capacity | | Anonymous code label |
 | Direction | Direction | | Anonymous code label |
 | Physical path | GIS embedding | | Anonymous code label |
 | Transmission medium | Link traits | | Anonymous code label |
 | Range & attenuation | Conditions | | Anonymous code label |
Traffic | Interaction | Traffic type | | Anonymous code label |
 | Transmission rate & encryption | Traits | | Anonymous code label |
NEXT STEPS
• Complete and finalize attribute lists
• Write a paper
• Seek out who might have data
• Look for funding!
1 Data Bank Development
1.1 Determine formats for submission
1.2 Develop archive and web-services architecture
1.3 Set up website
1.4 Advertise to the research community
2 Develop Automated Data Set Generation
2.1 Historical data collection planning
2.2 Historical data collection
2.3 Develop initial motif sets
2.4 Develop user interface
2.5 Testing
2.6 Deploy to website
3 Data Bank Maintenance & Improvement (phase 2)
3.1 Develop additional motif sets
3.2 Maintain web services
3.3 Evangelize the data bank at conferences, etc.
WORKSHOP TOPICS
• Bootstrapping for creation of more substantial data sets
• Amelioration of uncertainty
• Prediction via hypothetical network models
• Self-generating networks
• Dynamic uncertainty: eruption and propagation of uncertainty in the evolution of networks