Transcript: 20070211-measurement-zekauskas

Measurement on the Internet2 Network:
an evolving story
Matt Zekauskas
Joint Techs, Minneapolis
11-Feb-2007
Outline
• The Internet2 Observatory
• What we are measuring today
• The perfSONAR vision
• What is happening in the near term
• LHC OPN “e2emon”
2
The Observatory
• Collect data for operations
• Understanding the network, and how well it
is operating
• How we started
• Collect data for research
• Part of Internet2’s long-standing
commitment to network research
3
The Observatory
• Two components
• Data collected by NOC and Internet2 itself
• Ability for researchers to collocate
equipment when necessary
4
The New Internet2 Network
• Expanded Layer 1, 2 and 3 Facilities
• Includes SONET and Wave equipment
• Includes Ethernet Services
• Greater IP Services
• Requires expanded Observatory
5
In Brief
• Extends to all optical Add/Drop Sites
• Add capability:
• Run the control software
• Other out-of-band mgmt. tasks
• Refresh of Observatory
• Refresh PCs
• 10G capabilities on IPO
• 10G capability on Ciena Network
(planned, next year)
• Experimental NetFPGA Cards
(planned, next year)
• Standing up each node as it is installed
6
The New Internet2 Observatory
• Seek Input from the Community, both Engineers and
Network Researchers
• Current thinking is to support three types of services
• Measurement (as before)
• Collocation (as before)
• Experimental Servers to support specific projects - for
example, Phoebus (this is new)
• Support different types of nodes:
• Optical Nodes
• Router Nodes
7
Existing Observatory Capabilities
• One way latency, jitter, loss
• IPv4 and IPv6 (“owamp”)
• Regular TCP/UDP throughput tests – ~1 Gbps
• IPv4 and IPv6; On-demand available (“bwctl”)
• SNMP
• Octets, packets, errors; collected 1/min
• Flow data
• Addresses anonymized by zeroing the low-order 11 bits
• Routing updates
• Both IGP and BGP - Measurement device participates in both
• Router configuration
• Visible Backbone – Collect 1/hr from all routers
• Dynamic updates
• Syslog; also alarm generation (~nagios); polling via router proxy
8
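The flow-data anonymization rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard `ipaddress` module; the function name is mine, and the slide specifies only that the low-order 11 bits are zeroed.

```python
import ipaddress

# Illustrative sketch of the anonymization rule described above:
# zero the low-order 11 bits of an IPv4 address before storing flow
# data, so every address in a /21 maps to the same stored value.
LOW_11_BITS = (1 << 11) - 1  # 0x7FF

def anonymize_ipv4(addr: str) -> str:
    """Return addr with its low-order 11 bits set to zero."""
    n = int(ipaddress.IPv4Address(addr))
    return str(ipaddress.IPv4Address(n & ~LOW_11_BITS))
```

Since 32 - 11 = 21, this preserves the enclosing /21 prefix while hiding the individual host.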
Observatory Functions

Device    | Function     | Details
----------|--------------|--------------------------------------------------
nms-rthr1 | Measurement  | BWCTL on-demand 1 Gbps router throughput, Thrulay
nms-rthr2 | Measurement  | BWCTL on-demand 10 Gbps router throughput, Thrulay
nms-rexp  | Experimental | NDT/NPAD
nms-rpsv  | Measurement  | Netflow collector
nms-rlat  | Measurement  | OWAMP with locally attached GPS timing
nms-rpho  | Experimental | Phoebus 2 x 10GE to Multiservice Switch
nms-octr  | Management   | Controls Multiservice Switch
nms-oexp  | Experimental | NetFPGA
nms-othr  | Measurement  | On-demand Multiservice Switch 10 Gbps throughput
9
[Diagrams: Measurement Router Nodes (slides 10-11); Optical Nodes (slides 12-13)]
Observatory Hardware
• Dell 1950 and Dell 2950 servers
  • Dual Core 3.0 GHz Xeon processors
  • 2 GB memory
  • Dual RAID 146 GB disk
  • Integrated 1 GE copper interfaces
  • 10 GE interfaces
• Hewlett-Packard 10GE switches
• 9 servers at router sites, 3 planned at optical-only sites (initially 1 - control)
14
Observatory Databases – Data Types
• Data is collected locally and stored in distributed databases
• Databases
• Usage Data
• Netflow Data
• Routing Data
• Latency Data
• Throughput Data
• Router Data
• Syslog Data
15
Lots of Work to be Done
• Internet2 Observatory realization inside racks
set for initial deployment, including planning for
research projects (NetFPGA, Phoebus)
• Software and links easily changed
• Could add or change hardware depending on
costs
• Researcher tools, new datasets
• Consensus on passive data
16
New Challenges
• Operations and Characterization of new
services
• Finding problems with stitched-together VLANs
• Collecting and exporting data from Dynamic Circuit
Service...
• Ciena performance counters
• Control plane setup information
• Circuit usage (not utilization, although that is also nice)
• Similar for underlying Infinera equipment
• And consider inter-domain issues
17
Observatory Requirements Strawman
• Small group: Dan Magorian, Joe Metzger and
Internet2
• See document off of
http://measurement.internet2.edu/
• Want to start working group under new
Network Technical Advisory Committee
• Interested? Talk to Matt or watch NTAC Wiki on
wiki.internet2.edu; measurement page will also
have some information…
18
Strawman: Potential New Focus Areas
• Technology Issues
• Is it working? How well? How do we debug problems?
• Economy Issues – interdomain circuits
• How are they used? Are they used
effectively? Monitor violation of any rules
(e.g. for short-term circuits)
• Compare with “vanilla” IP services?
19
Strawman: Potential High-Level Goals
• Extend research datasets to new equipment
• Circuit “weathermap”; optical proxy
• Auditing Circuits
• Who requested (at suitable granularity)
• What for? (ex: bulk data, streaming media,
experiment control)
• Why? (add’l bw, required characteristics,
application isolation, security)
20
Inter-Domain Issues Important
• New services (various circuits)
• New control plane
• That must work across domains
• Will require some agreement among
various providers
• Want to allow for diversity…
21
Sharing Observatory Data
We want to make Internet2 Network
Observatory Data:
• Available:
• Access to existing active and passive
measurement data
• Ability to run new active measurement tests
• Interoperable:
• Common schema and semantics, shared across
other networks
• Single format
• XML-based discovery of what’s available
22
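The "XML-based discovery of what's available" goal above can be illustrated with a short Python sketch. The element and attribute names below are invented for illustration only; they are not the real perfSONAR/NM-WG schema, and the host names are taken from the Observatory Functions table earlier in the talk.

```python
import xml.etree.ElementTree as ET

# Hypothetical service catalog; the <observatory>/<measurement> vocabulary
# is made up here and is NOT the actual perfSONAR schema.
CATALOG = """
<observatory>
  <measurement type="owamp" host="nms-rlat"/>
  <measurement type="bwctl" host="nms-rthr1"/>
  <measurement type="snmp"  host="nms-rpsv"/>
</observatory>
"""

def discover(xml_text: str) -> dict:
    """Map each advertised measurement type to the host that serves it."""
    root = ET.fromstring(xml_text)
    return {m.get("type"): m.get("host") for m in root.iter("measurement")}
```

A client could then look up, say, where OWAMP data lives without hard-coding hosts, which is the interoperability point the slide is making.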
What is perfSONAR?
• Performance Middleware
• perfSONAR is an international consortium in which
Internet2 is a founder and leading participant
• perfSONAR is a set of protocol standards for
interoperability between measurement and
monitoring systems
• perfSONAR is a set of open source web services
that can be mixed-and-matched and extended to
create a performance monitoring framework
23
perfSONAR Design Goals
• Standards-based
• Modular
• Decentralized
• Locally controlled
• Open Source
• Extensible
• Applicable to multiple generations of network monitoring systems
• Grows “beyond our control”
• Customized for individual science disciplines
24
perfSONAR Integrates
• Network measurement tools
• Network measurement archives
• Discovery
• Authentication and authorization
• Data manipulation
• Resource protection
• Topology
25
perfSONAR Credits
• perfSONAR is a joint effort:
  • ESnet
  • GÉANT2 JRA1
  • Internet2
  • RNP
• ESnet includes:
  • ESnet/LBL staff
  • Fermilab
• Internet2 includes:
  • University of Delaware
  • Georgia Tech
  • SLAC
  • Internet2 staff
• GÉANT2 JRA1 includes:
  • Arnes
  • Belnet
  • Carnet
  • Cesnet
  • CYNet
  • DANTE
  • DFN
  • FCCN
  • GRNet
  • GARR
  • ISTF
  • PSNC
  • Nordunet (Uninett)
  • Renater
  • RedIRIS
  • Surfnet
  • SWITCH
26
perfSONAR Adoption
• R&E Networks
  • Internet2
  • ESnet
  • GÉANT2
  • European NRENs
  • RNP
• Application Communities
  • LHC
  • GLORIAD Distributed Virtual NOC
  • Roll-out to other application communities in 2007
• Distributed Development
  • Individual projects (10 before first release) write components that integrate into the overall framework
  • Individual communities (5 before first release) write their own analysis and visualization software
27
Proposed Data to be made available via perfSONAR
• First Priorities
  • Link status (CIENA data)
  • SNMP data
  • OWAMP
  • BWCTL
• Second Priorities
  • Additional CIENA data
    • Ethernet stats
    • SONET (Severely errored seconds, etc.)
    • Light levels
  • Similar Infinera data
• Later: Flow data
• Feedback? Alternate priorities?
28
What will (eventually) consume data?
• We intend to create a series of web pages that will display the data
• Third-party Analysis/Visualization Tools
  • European and Brazilian UIs
  • SLAC-built analysis software
  • LHC OPN E2EMON
  • More …
• Real applications
  • Network-aware applications
    • Consume performance data
    • React to network conditions
    • Request dynamic provisioning
  • Future Example: Phoebus
29
JRA4 E2EMon slides
From Mauro Campanella, GARR, 2006-Nov
Demo:
http://cnmdev.lrz-muenchen.de/e2e/html/G2_E2E_index.html
Problem space
Connect. Communicate. Collaborate
[Diagram: end-to-end link E2ELink A-B from Point A to Point B, crossing Domain A, Domain B, and Domain C]
Goal: (near) real-time monitoring (link status) of constituent DomainLinks (and links between domains) and the whole end-to-end Link A-B.
The following applies to the GÉANT2+ service and the cross border fibres.
Hopi Meeting, 3 Nov 2006
The Italian Research and Education
31
Divide & conquer (JRA4 E2Emon info model)
[Diagram: JRA4 view of the world; note WDM systems and static lambdas]
32
Approach
[Diagram: each of Domain A, B, and C exposes DomainLink and (partial) ID_Link info through a perfSONAR Measurement Point (MP) or Measurement Archive (MA); the E2Emon correlator combines them into a “weathermap” view for users and for E2ECU operators]
33
LHC-OPN e2e Monitoring
[Diagram: lightpath from CNAF (BO) through MI and Manno to Karlsruhe, crossing the GARR, SWITCH, and DFN WDM domains]
• e2e lightpath from CNAF (Bologna, Italy) to Karlsruhe (Germany)
• The logical topology built for the e2e monitoring system abstracts the internal topology of each domain and produces a simpler topology.
34
LHC-OPN e2e Monitoring
[Diagram: the same lightpath modeled as five domains (Domain1-Domain5), annotated with End Points (EP), Demarcation Points (DP), Domain Links, and inter-domain ID Links]
35
LHC-OPN e2e Monitoring
[Diagram: E2E Monitoring System architecture. In each network (Network 1 … Network n), a script polls the equipment (acquisition) and feeds a Domain MP; domain aggregation and XML generation populate each Domain MA; the E2E Monitoring System performs interdomain aggregation and presents the result to the user via web services]
36
Monitoring CNAF - CERN: GARR monitoring flow
GINS (the GARR network monitoring system) checks the status of the logical circuits in the GARR domain and provides the result to the GARR MP.
The central e2e measurement system queries each domain and provides the global e2e status.
This shows the domain independence, the ease of aggregating the information, and its scalability.
[Diagram: within the GARR monitoring domain, the GINS e2e Monitor checks the end point at CNAF over the IP Link, MPLS LSP, and IP/L2 Link; the GARR MP exports XML data to the E2E MS, alongside GEANT2]
37
GARR monitoring domain
[Diagram: GINS checks the status of segments at the IP, MPLS, and lambda layers along the CNAF-Karlsruhe path (GARR, SWITCH, and DFN WDM domains); the GINS e2e Service reports to the E2E Monitoring System, which performs status aggregation for the user]
38
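The status-aggregation step in the monitoring architecture above can be sketched as a worst-segment rule: an end-to-end circuit is only as healthy as its worst constituent segment. This is my reading of the slides, not E2Emon's actual implementation; the "degraded" state is taken from the wish list later in the talk, and the severity ordering is an assumption.

```python
# Sketch of end-to-end status aggregation over per-domain segment
# statuses. The state names and severity ordering are assumptions;
# the real E2Emon correlator's states and rules may differ.
SEVERITY = {"up": 0, "degraded": 1, "unknown": 2, "down": 3}

def e2e_status(segment_statuses):
    """An end-to-end link is only as healthy as its worst segment."""
    return max(segment_statuses, key=SEVERITY.__getitem__)
```

For example, if GARR, SWITCH, and DFN each report a segment status, the correlator-style roll-up reports "down" as soon as any one segment is down.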
GARR User Interface
[Screenshot]
39
CNAF - CERN: E2E MS user interface
[Screenshot]
40
CNAF - CERN: GARR GINS user interface
[Screenshot]
(Slides from Marco Marletta, Giovanni Cesaroni, GARR)
41
Measurement System Future work - wish list
• Define & implement “degraded” link status
• Add scheduled maintenance indication
• Add more detail to data model
  – Break down DomainLink into constituent parts? (e.g. OCh trails)
  – Use more info from equipment
42