Developing Network Testbed Data Sets
Visualisation Network-of-Experts Working Group
Supporting NATO Research Task Group IST-059/RTG-025
November 6-8, 2007
Amy Vanderbilt
Marcus Lem
Cristin Hall
Joanne Treurniet
Rob Young
NATO IST-059 Network of Experts
BACKGROUND
Q6 – how can “clean” data sets be produced for various types
of networks to provide testbeds with realistic traffic and other
elements?
The DARPA Intrusion Detection Experiment data is a good,
although outdated, example (closed computer network traffic)
How can we develop similar testbed data sets for other
network types?
One way may be to accept real world networks (social and
otherwise) where a certain sub-network is modeled in detail
based on historical (and hopefully unbiased) data.
Such testbed data sets may be the first step towards
answering many questions
WHY
To provide initial validation of algorithms for various
purposes (prediction, detection, etc.)
For ease of use – to allow the larger research
community to test and forward their research
towards needed solutions without having to hand
out classified, current real world data
To determine the independent variables, minimal
models
Research hypothesis testing – honing in on
applications for transferring work from theory
Stages of validation: (1) test on simulated data,
(2) test on historical data, (3) test on current data
WHAT SHOULD A DATA SET BE?
Can we create a generic network data set that would be applicable
across applications?
We could create a generic motif (subnet) component for each
network type and property set based on the framework
categorization
Then we can build larger networks from these motifs depending
on the application
We will need a mapping from the applications to the network
types and properties
Each node and link must have parameters and/or constraints:
battery life, bandwidth, distance capabilities, latency, restrictions
on traffic, etc.
Need to be able to test on the margins (extreme cases, catastrophic
scenarios).
Need to allow the users of these data sets to change attributes on
links and nodes
We need to have a probability distribution based on historical data
Need to tag submissions as coming from open or closed systems
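
The requirements above can be sketched as a small data model. This is only an illustrative sketch, not a format proposed by the working group: the class names (`Node`, `Link`, `Network`), the `star_motif` helper, and the attribute keys (`bandwidth_mbps`, `latency_ms`) are all assumptions made for the example. It shows motifs being combined into a larger network, per-node and per-link attributes that users can change, and an open/closed tag.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    attrs: dict = field(default_factory=dict)  # e.g. battery life


@dataclass
class Link:
    src: str
    dst: str
    attrs: dict = field(default_factory=dict)  # e.g. bandwidth, latency


@dataclass
class Network:
    nodes: dict = field(default_factory=dict)  # node_id -> Node
    links: list = field(default_factory=list)
    open_system: bool = True  # tag: data from an open or a closed system

    def add_motif(self, motif, prefix):
        """Copy a motif into this network; the prefix keeps node IDs unique."""
        for node in motif.nodes.values():
            new_id = f"{prefix}:{node.node_id}"
            self.nodes[new_id] = Node(new_id, dict(node.attrs))
        for link in motif.links:
            self.links.append(
                Link(f"{prefix}:{link.src}", f"{prefix}:{link.dst}", dict(link.attrs))
            )

    def connect(self, src, dst, **attrs):
        """Join two existing nodes, e.g. to link two motif copies."""
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must already exist")
        self.links.append(Link(src, dst, attrs))


def star_motif(n_leaves, link_attrs):
    """Hypothetical motif: one hub connected to n_leaves leaf nodes."""
    motif = Network()
    motif.nodes["hub"] = Node("hub")
    for i in range(n_leaves):
        leaf = f"leaf{i}"
        motif.nodes[leaf] = Node(leaf)
        motif.links.append(Link("hub", leaf, dict(link_attrs)))
    return motif


# Build a larger network from two copies of the same motif.
motif = star_motif(3, {"bandwidth_mbps": 10, "latency_ms": 5})
net = Network(open_system=False)
net.add_motif(motif, "a")
net.add_motif(motif, "b")
net.connect("a:hub", "b:hub", bandwidth_mbps=100)

# Users can change attributes to test on the margins (extreme latency here).
net.links[0].attrs["latency_ms"] = 5000
```

Attributes are copied when a motif is added, so changing a link in the assembled network leaves the motif itself untouched.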
VITA SEARCH
A VITA search on “Network Data Test Social” shows no
actual data repositories, only a few sporadic lists
compiled by individuals
HISTORICAL DATA COLLECTION
Current data sets collected are sporadic, have little relevancy
(too specific) and are often not accessible
We need a way to collect historical real world data on which to
base the probability distribution and other aspects of the
nodes and links
Social – massive multiplayer online games
Computer – need to collect the traffic, connections, computer
nodes and services running on each node
Sensors – a simpler version of computer networks. We could
place sensors in an appropriate environment and collect the
network properties (ARL has open source data available)
Might be able to extract properties from a real network and
generate the motif based on that
Need to be able to handle embedding fields and networks
IDEA: Set up a data bank where people can contribute
datasets coded in a preferred way and containing required
parameters, etc (they can get data if they contribute data)
AND develop a way to auto-generate data sets
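
One possible reading of "base the probability distribution on historical data" is to estimate an empirical distribution from observed values and sample from it when auto-generating data sets. The sketch below assumes that reading. The function names are hypothetical, and `historical_degrees` is made-up illustrative input, not real collected data.

```python
import random
from collections import Counter


def empirical_distribution(observations):
    """Map each observed value to its relative frequency."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def sample(distribution, k, rng):
    """Draw k synthetic values according to the empirical distribution."""
    values = list(distribution)
    weights = [distribution[v] for v in values]
    return rng.choices(values, weights=weights, k=k)


# Made-up node degrees standing in for a historical network capture.
historical_degrees = [1, 1, 1, 2, 2, 3, 1, 4, 2, 1]
dist = empirical_distribution(historical_degrees)

# A seeded generator keeps the generated data set reproducible.
rng = random.Random(42)
synthetic_degrees = sample(dist, 20, rng)
```

The same approach could be applied to any node or link attribute, such as link bandwidth or traffic volume, as long as enough historical observations are available.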
STEPS TAKEN
First we listed out attributes needed for:
Motifs – structure of a subnet
Nodes
Links
Traffic
Then we considered how to “scrub” such a data set
to allow open source use
ATTRIBUTES
COMPUTER NETWORKS
Category | Attributes                      | Generic         | Multi? | Scrub | Notes
Motifs   | Structure:                      | Same as generic |        |       |
         | Centrality                      |                 |        | None  |
         | Node degree distribution        |                 |        | None  |
         | Moments of degree distribution  |                 |        | None  |
         | Node betweenness distribution   |                 |        | None  |
         | Connectedness                   |                 |        | None  |
         | Reachability                    |                 |        | None  |
         | Shortest path length            |                 |        | None  |
         | Diameter                        |                 |        | None  |
         | Size                            |                 |        | None  |
         | Average clustering coefficient  |                 |        | None  |
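
Several of the structural attributes listed for motifs can be computed directly from an adjacency list. The following is a minimal sketch under stated assumptions: the graph is undirected and unweighted, it is stored as a dict mapping each node to a set of neighbours, and the four-node example graph is invented for illustration. Centrality and betweenness are left out to keep the example short.

```python
from collections import Counter, deque
from itertools import combinations


def degree_distribution(adj):
    """Count how many nodes have each degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


def degree_moments(adj):
    """First moment (mean) and second central moment (variance) of degree."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    mean = sum(degrees) / len(degrees)
    variance = sum((d - mean) ** 2 for d in degrees) / len(degrees)
    return mean, variance


def bfs_distances(adj, source):
    """Shortest path length (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_connected(adj):
    """True if every node is reachable from an arbitrary start node."""
    start = next(iter(adj))
    return len(bfs_distances(adj, start)) == len(adj)


def diameter(adj):
    """Longest shortest path; assumes the graph is connected."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * closed / (k * (k - 1))
    return total / len(adj)


# Invented example: a triangle A-B-C with one extra node D attached to C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
size = len(adj)
mean_degree, degree_variance = degree_moments(adj)
```

Because these values describe only the shape of the network and not who or what the nodes are, they match the "None" entries in the Scrub column.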
ATTRIBUTES
COMPUTER NETWORKS
Category | Attributes          | Generic           | Multi? | Scrub                | Notes
Motifs   | Type:               | Same as generic   |        |                      |
         | Regular lattice     |                   |        |                      |
         | Small world         |                   |        |                      |
         | Random homogeneous  |                   |        |                      |
         | Scale-free          |                   |        |                      |
         | Etc…                |                   |        |                      |
Nodes    | Hostname or IP      | Unique ID – who   |        | Anonymous code label |
         | Services offered    | Purpose – why     |        | Anonymous code label |
         | Access control      | Conditions – when |        | Anonymous code label |
         | Hardware and OS     | Traits – what     |        | Anonymous code label |
         | Asset value         | Criticality       |        | Anonymous code label |
Notes: Varies by app.
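
The "anonymous code label" scrub could be implemented as consistent pseudonymisation: each distinct real value is replaced by an opaque label, and the same value always gets the same label so the network structure survives. The sketch below is one assumed way to do this, not the working group's method. The `Scrubber` class, the field names, the label prefixes, and the sample records are all hypothetical.

```python
import itertools


class Scrubber:
    """Replace each distinct value with a stable anonymous code label."""

    def __init__(self, prefix):
        self._prefix = prefix
        self._labels = {}  # real value -> code label (keep private or discard)
        self._counter = itertools.count(1)

    def label(self, value):
        if value not in self._labels:
            self._labels[value] = f"{self._prefix}{next(self._counter):04d}"
        return self._labels[value]


def scrub_records(records, field_prefixes):
    """Scrub the named fields of each record; other fields pass through."""
    # One scrubber per field, so a label in one column says nothing about another.
    scrubbers = {name: Scrubber(prefix) for name, prefix in field_prefixes.items()}
    return [
        {key: scrubbers[key].label(val) if key in scrubbers else val
         for key, val in record.items()}
        for record in records
    ]


# Made-up node records for illustration only.
records = [
    {"host": "10.0.0.1", "services": "http", "os": "linux", "degree": 3},
    {"host": "10.0.0.2", "services": "http", "os": "bsd", "degree": 1},
]
scrubbed = scrub_records(records, {"host": "H", "services": "S", "os": "O"})
```

Labels are assigned in order of first appearance, so the input order can leak into the labels; shuffling the records first would avoid that. Consistent labels alone may also not be enough to prevent re-identification, because the structure of the network itself can be distinctive.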
ATTRIBUTES
COMPUTER NETWORKS
Category | Attributes                     | Generic       | Multi? | Scrub                | Notes
Links    | Bandwidth                      | Capacity      |        | Anonymous code label |
         | Direction                      | Direction     |        | Anonymous code label |
         | Physical path                  | GIS embedding |        | Anonymous code label |
         | Transmission medium            | Link traits   |        | Anonymous code label |
         | Range & attenuation            | Conditions    |        | Anonymous code label |
Traffic  | Interaction                    | Traffic type  |        | Anonymous code label |
         | Transmission rate & encryption | Traits        |        | Anonymous code label |
NEXT STEPS
Complete and finalize attribute lists
Write a paper
Seek out who might have data
Look for funding!
1 Data Bank Development
  1.1 Determine formats for submission
  1.2 Develop archive and web-services architecture
  1.3 Set up website
  1.4 Advertise to the research community
2 Develop Automated Data Set Generation
  2.1 Historical data collection planning
  2.2 Historical data collection
  2.3 Develop initial motif sets
  2.4 Develop user interface
  2.5 Testing
  2.6 Deploy to website
3 Data Bank Maintenance & Improvement (ph 2)
  3.1 Develop additional motif sets
  3.2 Maintain web services
  3.3 Evangelize the data bank at conferences, etc.
WORKSHOP TOPICS
Bootstrapping for creation of more substantial data
sets
Amelioration of uncertainty
Prediction via hypothetical network models
Self generating networks
Dynamic uncertainty – eruption and propagation of
uncertainty in the evolution of networks