20080122-boote-swany

Performance Update
“10 pounds of stuff in a 5 pound bag”
Jeff Boote
Senior Network Software Engineer
Internet2
Martin Swany
Assistant Professor
University of Delaware
Overview
• Performance Measurement Goals and Vision
• Measurement Tools
• perfSONAR
• Transport Middleware
Goals
• Increase network awareness
• Set user expectations accurately
• Reduce diagnostic costs
• Performance problems noticed early
• Performance problems addressed efficiently
• Network engineers can see & act outside their turf
• Transform application design
• Incorporate network intuition into application behavior
Vision: Performance Information is …
• Available
• People can find it (Discovery)
• “Community of trust” allows access across administrative
domain boundaries
• Ubiquitous
• Widely deployed (Paths of interest covered)
• Reliable (Consistently configured correctly)
• Valuable
• Actionable (Analysis suggests course of action)
• Automatable (Applications act on data)
NDT
• 3.4.1 is current version
• Latest enhancements were related to
administrator ability to analyze data using
JAnalyze (Google summer of code project)
• Test points available at all Internet2 IP network
router locations:
• ndt.POP.net.internet2.edu
• POP=losa,salt,hous,kans,chic,atla,newy,wash
OWAMP (One way latency data)
• 3.0c (RFC 4656 version) available now
• Maintenance mode
• Diagnostic test points available at all
Internet2 IP Network router locations:
• owamp.POP.net.internet2.edu
• POP = losa,salt,hous,kans,chic,atla,newy,wash
BWCTL (Throughput tests)
• 1.2a is current version
• 1.3 in testing (new testers: nuttcp, thrulay)
• Diagnostic test points available at all
Internet2 IP Network router locations:
• bwctl.POP.net.internet2.edu
• POP = losa,salt,hous,kans,chic,atla,newy,wash
NPToolKit
• Recent versions of Measurement tools
installed and pre-configured
• Knoppix Live-CD bootable system
• Current Version: 1.9
• http://e2epi.internet2.edu/networkperformance-toolkit.html
What is perfSONAR
• A collaboration
• Production network operators focused on designing and
building tools that they will deploy and use on their networks
to provide monitoring and diagnostic capabilities to
themselves and their user communities.
• An architecture & a set of protocols
• Web Services Architecture
• Protocols based on the Open Grid Forum Network
Measurement Working Group Schemas
• Several interoperable software implementations
• Java & Perl
• A Deployed Measurement infrastructure
perfSONAR Collaborators
•RNP
•ARNES
•BELNET
•CARNET
•CESNET
•CYNET
•DANTE
•DFN
•ESnet
•FCCN
•FERMI
•GARR
•GEANT
•GRNET
•HEAnet
•Internet2
•ISTF
•POZNAN
•UNINETT
•University of Delaware
•Renater
•RedIRIS
•SLAC
•SWITCH
•SURFnet
And anybody else I missed
perfSONAR Architecture
• Interoperable network measurement middleware:
• Modular
• Web services-based
• Decentralized
• Locally controlled
• Integrates:
• Network measurement tools
• Network measurement archives
• Discovery
• Authentication and authorization
• Data manipulation
• Resource protection
• Topology
• Based on:
• Open Grid Forum Network Measurement Working Group
schema.
perfSONAR-PS Motivation
• Create separate implementation of perfSONAR
standard
• Use same protocol/standards
• Proof of interoperability (strengthens the standard)
• Targeted for NOC deployments
• Lightweight
• Easy to deploy/manage
• (We were unable to convince our primary users to deploy
Java services due to the complexity of dependencies)
perfSONAR-PS Beta Release (0.06)
(1/21/08)
• Focus on development of major perfSONAR components
• LS - perfSONAR_PS::Services::LS::LS
• SNMP MA - perfSONAR_PS::Services::MA::SNMP
• Status MA - perfSONAR_PS::Services::MA::Status
• CircuitStatus MA - perfSONAR_PS::Services::MA::CircuitStatus
• Topology MA - perfSONAR_PS::Services::MA::Topology
• PingER (SLAC) *
• Not yet released
• OWAMP/BWCTL archive (perfSONARBUOY)
• Not released via CPAN
SNMP Measurement Archive
• Provide access to network performance data
• Utilization
• Errors
• Discards
• Numerous tools exist to collect passive measurements
(via SNMP):
• MRTG
• Cacti
• Cricket
• Expose archives from RRD files
SNMP Measurement Archive
• Current Deployment:
• Internet2 Network
• ESnet
• Georgia Tech/SOX
• Fermilab
PingER-Based MP/MA
• Joint effort between Fermilab and SLAC
• Present views of historic PingER data
• Expose interface to schedule live tests
• Built with perfSONAR-PS infrastructure
Link Status Measurement Archive
• Provide access to up/down status information
about layer2 links
• Data stored in a SQL database
• Database schema allows for storing time ranges during
which a link had a certain status
• Minimizes storage costs for rarely changing links
• Communication/Configuration via XML
• Target audience is network operators and users
interested in obtaining the status of the links over
which their data flows
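The time-range storage idea above can be sketched as follows. This is only an illustration under stated assumptions: the slides do not give the actual database schema, so the `StatusRange` record and the `record` helper are hypothetical, and a plain Python list stands in for the SQL table. The point is that repeated observations of an unchanged status extend one row instead of adding new ones.

```python
from dataclasses import dataclass


@dataclass
class StatusRange:
    """Hypothetical row: one status held by one link over a time range."""
    link_id: str
    status: str
    start: int  # first time this status was observed (epoch seconds)
    end: int    # most recent time this status was observed


def record(ranges, link_id, status, timestamp):
    """Store one observation.

    If the link's latest range already has this status, only its end time
    is extended; a new range is appended only when the status changes.
    """
    for r in reversed(ranges):
        if r.link_id == link_id:
            if r.status == status:
                r.end = timestamp
                return
            break  # latest range for this link has a different status
    ranges.append(StatusRange(link_id, status, timestamp, timestamp))
```

With this layout a link that stays up for a month of one-minute polls still occupies a single row, which is the storage saving the slide refers to.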
Link Status Measurement Archive
• Collector
• Allows for the periodic collection of the status of
one or more links
• Can use SNMP, Scripts or simply Constants
• Can store results directly into a database or into
a remote Measurement Archive
Link Status Measurement Archive
• Visualization
• A perfSONAR-UI Plugin is available that can display a
network and the status of its links
• Current Deployment
• Internet2 Network
• HOPI (in2p3 circuit)
• Planned Deployment
• SLAC
Circuit Status Measurement Archive
• An e2emon-compatible service
• Integrates with the Link Status MA to provide the
information stored in MAs
• Can work with local MAs directly or with remote MAs
• Can use the Topology MA to obtain necessary information
about nodes
• Can use a Lookup Service to lookup the MA containing
information on each link
• Target audience is administrators who want to
publish circuit status information to e2emon clients
Circuit Status Measurement Archive
• Visualization
• Any tool that is compatible with e2emon will
work with this service
• Current Deployment
• Internet2 Network
• HOPI (in2p3 circuit)
• Planned Deployment
• SLAC
Topology Service
• Provides a queryable repository for
obtaining topology information about a
domain
• Can obtain the entire network
• XQuery interface allows the construction of
complex queries about the network
• Topology is specified according to the
schema in development in the OGF
Topology Service
• Current Deployments
• Internet2
• Planned Deployments
• Internet2 DCN
• SLAC (PingER Topology Information)
perfSONAR Lookup Service
• Directory service of perfSONAR deployments
• Accept service registrations
• Handle queries for service location, capabilities,
and the location of available data
• Manage the lifetimes of data and services to keep
information up to date
• Web Service interface to XML Database
• Sleepycat XML Database
• Service Info/Data kept in native formats
• Offload complex query tasks from
otherwise 'busy' services
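The lifetime management mentioned above can be illustrated with a small in-memory sketch. It is not the perfSONAR-PS implementation (which, per the slide, sits on an XML database); the `Registry` class, its method names, and the single fixed time-to-live are all assumptions made for illustration. A service that stops renewing its registration simply ages out.

```python
import time


class Registry:
    """Toy in-memory stand-in for a lookup service's registration table."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so the sketch can be tested
        self._expires = {}          # service URL -> expiry time

    def register(self, service_url):
        """Register a service, or renew it if it is already known."""
        self._expires[service_url] = self.clock() + self.ttl

    def purge(self):
        """Drop registrations whose lifetime has run out."""
        now = self.clock()
        self._expires = {s: t for s, t in self._expires.items() if t > now}

    def services(self):
        """Return the services that are still considered alive."""
        self.purge()
        return sorted(self._expires)
```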
Lookup Service
• Also XML based configuration/protocol
• Native storage/query mechanisms [XPath/XQuery]
• Message format to exchange the data
• Targeted at single domain deployment
• Single instance to manage multiple services
• Client components and applications use the LS
to find services
• perfSONAR-UI
• perfAdmin
Lookup Service
• Current Deployment:
• Internet2 (Ann Arbor)
• University of Delaware
• Planned Deployment:
• IU for Internet2 network and regionals
• International Partners
Distributed Lookup Service
• Federation of individual LS instances into a
global system
• “Meta”-lookup phase allows a query to find
the specific LS that has relevant information
• Or perhaps the relevant LSes that have said info
• The specific query is sent directly to the LS in
question
• Recent active design and development
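The two-phase lookup described above might look roughly like this. The sketch assumes each LS has advertised the networks it holds data about; the LS names, the `SUMMARIES` table, and both function names are hypothetical, and the real protocol exchanges XML messages rather than Python calls.

```python
import ipaddress

# Hypothetical advertised summaries: LS name -> networks it knows about.
SUMMARIES = {
    "ls.domain-a.example": ["192.0.2.0/25"],
    "ls.domain-b.example": ["192.0.2.128/25", "198.51.100.0/24"],
}


def meta_lookup(address, summaries=SUMMARIES):
    """Phase 1: return the LSes whose advertised summaries cover address."""
    ip = ipaddress.ip_address(address)
    return sorted(
        ls
        for ls, nets in summaries.items()
        if any(ip in ipaddress.ip_network(n) for n in nets)
    )


def lookup(address, query_ls, summaries=SUMMARIES):
    """Phase 2: send the specific query directly to each candidate LS.

    query_ls is a caller-supplied function (ls_name, address) -> results.
    """
    return {ls: query_ls(ls, address) for ls in meta_lookup(address, summaries)}
```

Only the LSes returned by the first phase are contacted in the second, so distant domains never see queries they cannot answer.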
Distributed Lookup Service
• Service and measurement metadata is
“summarized” for propagation to distant
domains
• IP addresses in service and measurement
metadata are compressed into network/netmask
pairs in the same way that routes are advertised
(CIDR-style)
• These summarized metadata elements are
advertised to external “scopes”
• A “scope” is a set of LSes that are related by e.g.
being in the same administrative domain (although
multiple scopes within a single domain are
possible)
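The CIDR-style compression of addresses can be sketched with Python's standard `ipaddress` module. This is an illustration only: the function name is hypothetical, and the sketch performs exact aggregation, whereas the slides do not say whether real summaries may cover more addresses than were registered.

```python
import ipaddress


def summarize(addresses):
    """Collapse individual IPv4 addresses into the smallest set of
    network/prefix pairs that covers exactly those addresses."""
    hosts = [ipaddress.ip_network(a) for a in addresses]  # each becomes a /32
    return [str(n) for n in ipaddress.collapse_addresses(hosts)]
```

Four consecutive, aligned addresses such as 192.0.2.0 through 192.0.2.3 collapse into the single pair 192.0.2.0/30, while addresses that do not share an aligned block stay separate.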
Weather Maps - Internet2
Gmaps from SLAC
CNM from DFN
perfSONARUI from acad.bg
PerfsonarUI 1
PerfsonarUI 2
PerfsonarUI 3
Oscars Circuit plugin - Internet2
Oscars circuit plugin
E2Emon - Monitoring Circuits
E2Emon: Status of E2E link CERN-LHCOPN-FNAL-001
E2Emon generated view of the data for one OPN link [E2EMON]
Traceroute Visualizer
• Forward direction bandwidth utilization on application path from
LBNL to INFN-Frascati (Italy)
• Traffic shown as bars on those network device interfaces that have an
associated MP service (the first 4 graphs are normalized to 2000 Mb/s,
the last to 500 Mb/s)
• Link capacity is also provided
1 ir1000gw (131.243.2.1)
2 er1kgw
3 lbl2-ge-lbnl.es.net
4 slacmr1-sdn-lblmr1.es.net (GRAPH OMITTED)
5 snv2mr1-slacmr1.es.net (GRAPH OMITTED)
6 snv2sdn1-snv2mr1.es.net
7 chislsdn1-oc192-snv2sdn1.es.net (GRAPH OMITTED)
8 chiccr1-chislsdn1.es.net
9 aofacr1-chicsdn1.es.net (GRAPH OMITTED)
10 esnet.rt1.nyc.us.geant2.net (NO DATA)
11 so-7-0-0.rt1.ams.nl.geant2.net (NO DATA)
12 so-6-2-0.rt1.fra.de.geant2.net (NO DATA)
13 so-6-2-0.rt1.gen.ch.geant2.net (NO DATA)
14 so-2-0-0.rt1.mil.it.geant2.net (NO DATA)
15 garr-gw.rt1.mil.it.geant2.net (NO DATA)
16 rt1-mi1-rt-mi2.mi2.garr.net
17 rt-mi2-rt-rm2.rm2.garr.net (GRAPH OMITTED)
18 rt-rm2-rc-fra.fra.garr.net (GRAPH OMITTED)
19 rc-fra-ru-lnf.fra.garr.net (GRAPH OMITTED)
20
21 www6.lnf.infn.it (193.206.84.223) 189.908 ms 189.596 ms 189.684 ms
Phoebus Motivation
• We’re addressing performance problems and easing
adoption of DC network circuits by deploying intelligent
network services like Phoebus in order to actively
enable users to better leverage their network
connectivity (and network investment) by consistently
achieving maximum performance
• The Phoebus service seeks to bridge the E2E
Performance Gap by providing end-users a seamless
way to access new types of high performance networks
like the Dynamic Circuit (DC) Network to maximize
their application performance.
The State of the Net
• High loss due to shared infrastructure
• High latency due to distance
Phoebus in Action
• Phoebus is based on the concept of creating a unique data-moving
“session” for each application
• Each time an application is run, specific adaptation points in the
backbone, known as Phoebus Gateways, are utilized to determine the best,
highest-performance path
• For example, a file transfer application may traditionally use the IP
network. Once the application is set in motion, Phoebus determines the best
network path from end to end for this specific application, which could
include a combination of IP, DC, or other future services.
• Since the intelligence is in the core of the network, Phoebus enables all
types of applications to leverage improved network performance with
little to no modification by the end-user
• The Phoebus model is applicable to future applications as well and may prove
to be a factor in the evolution of data transport technology
The Phoebus Model
• Phoebus is a framework and protocol for high-performance
networks
• Phoebus works to transparently split the end-to-end network
path into distinct segments
• Adaptation points are typically chosen at the ingress and
egress points of the backbone
• This minimizes the negative effects of high latency and packet
loss on data transfer
• By localizing their effects
• By allocating dedicated resources to mitigate the issues
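One way to see why localizing latency and loss helps is the widely used Mathis et al. approximation for steady-state TCP throughput, rate ≈ (MSS / RTT) · (C / √p), with C ≈ 1.22 and p the packet loss rate. That model is not part of the slides; it is brought in here purely as an illustration, and every number below is hypothetical. The sketch compares one end-to-end connection against the same path split at a gateway, where each segment runs its own control loop and the slowest segment bounds the session.

```python
import math


def mathis_rate_bps(mss_bytes, rtt_s, loss):
    """Approximate steady-state TCP throughput in bits per second."""
    return (mss_bytes * 8 / rtt_s) * (math.sqrt(1.5) / math.sqrt(loss))


# Hypothetical path: a short, lossy campus edge, then a long, clean backbone.
edge = {"rtt_s": 0.005, "loss": 1e-4}
backbone = {"rtt_s": 0.070, "loss": 1e-7}

# One end-to-end connection sees the summed RTT and the combined loss.
e2e_loss = 1 - (1 - edge["loss"]) * (1 - backbone["loss"])
direct = mathis_rate_bps(1460, edge["rtt_s"] + backbone["rtt_s"], e2e_loss)

# Split at a gateway: the slowest segment bounds the whole session.
split = min(mathis_rate_bps(1460, s["rtt_s"], s["loss"]) for s in (edge, backbone))

if __name__ == "__main__":
    print(f"direct: {direct / 1e6:.1f} Mb/s, split: {split / 1e6:.1f} Mb/s")
```

Under these assumed numbers the split path has a much higher bound, because the loss on the edge is no longer multiplied by the long backbone round-trip time.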
The Phoebus Model - Cont’d
• Transparent adaptation for existing applications
• Perform well to nearest Phoebus Gateway and allow
the system to do the rest
• No modification necessary for most applications
• The Phoebus system has the ability to optimize the
performance with a variety of techniques and insights into
the state of the network
Phoebus-Enabled DC Network
[Figure labels: End-to-End Session; DC Network]

Session Layer Protocol
• The Phoebus Session Protocol (PSP)
can be used to manage a multi-layer
connection
[Figure labels: PSP, TCP, Layer 2 (e.g. DCN)]
Enabling Applications
• Phoebus can be enabled on Linux
systems with software
• Applications don’t need to be
recompiled
• Windows support under investigation
• Alternatively, we can intercept
certain traffic with a special host
acting as a router
• No modifications to the users’
workstations
Phoebus - Future
• Deployment in nine router POPs over the
next few months
• Simple file transfer tool
• Transparently use Phoebus/Dynamic Circuits
• Utilize Measurement Infrastructure
• Help find best routes, provide information about
paths and achievable bandwidth
• Extension of Path Finding / Routing
• Authentication and Authorization
Protocols and Schema Documents
• Base network measurement schema
• OGF Network Measurement Working Group
• Topology Schema
• OGF Network Markup Language WG
• Includes Topology Network ID
• perfSONAR Protocol Documents
• perfSONAR Consortium
Schema/Protocol Developments
• The perfSONAR Topology schema is also used in the
DCN control plane
• We’ve spent quite a bit of effort harmonizing these
• The obvious win is that the measurement
system has immediate access to dynamic circuits
• The broader impact is that we’re approaching a
unified network interaction model (UNIM)
Schema - Network Element Identifiers
• A scheme for identifying network
elements
• Each network element gets a unique
identifier
• This identifier will be included with any
measurement associated with that
element.
Network Element Identifiers
• Use Cases:
• A topology service can be used to find the
identifier for a network element
• An LS could then be queried to find all
measurements associated with that element
• Dynamic service path-finding can be
integrated with ongoing measurements
Network Element Identifiers
• Identifiers use URN notation
• Prefixed with “urn:ogf:network:”
• Consists of name/value pairs separated by
colons
• Possible field names: domain, node, port,
link, path, network
• Set of rules defined for each field to keep
identifiers compact and finite
Network Element Identifiers
• Examples
• urn:ogf:network:domain=Internet2.edu
• urn:ogf:network:domain=internet2.edu:node=packrat
• urn:ogf:network:domain=internet2.edu:node=rtr.seat:port=so-2%2F1%2F0.16
• urn:ogf:network:domain=internet2.edu:node=rtr.seat:port=198.32.8.200
• urn:ogf:network:domain=Internet2.edu:node=packrat:port=eth0:link=1
• urn:ogf:network:domain=internet2.edu:link=WASH to ATLA OC192
• urn:ogf:network:path=anna-11537-176
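The identifier layout above (a fixed prefix followed by colon-separated name=value pairs) is simple enough to parse mechanically. The following is a minimal sketch, assuming that values use standard percent-encoding, as the so-2%2F1%2F0.16 port example suggests; the function names are hypothetical, and the per-field rules mentioned on the previous slide are not enforced here.

```python
from urllib.parse import quote, unquote

PREFIX = "urn:ogf:network:"


def parse_urn(urn):
    """Split an identifier into an ordered {field: value} dict."""
    if not urn.startswith(PREFIX):
        raise ValueError("not a urn:ogf:network identifier")
    fields = {}
    for pair in urn[len(PREFIX):].split(":"):
        name, _, value = pair.partition("=")
        fields[name] = unquote(value)  # e.g. %2F becomes /
    return fields


def build_urn(fields):
    """Inverse of parse_urn; percent-encodes characters such as '/'."""
    return PREFIX + ":".join(
        f"{name}={quote(value, safe='')}" for name, value in fields.items()
    )
```

Parsing the third example yields the port name so-2/1/0.16, and building an identifier from those fields reproduces the encoded form.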
Distributed Systems Infrastructure
• perfSONAR, DCN Control Plane and
Phoebus have similar system requirements
• Lookup and Topology Services comprise a generic
Information Service that is useful to all these
Network Services
• Authentication and Policy services are crosscutting as well
• Rather than have silos of mission-specific
functionality, we envision pervasive system
components
Distributed Systems Infrastructure
• Synergies of information bases are
obvious
• Multi-layer path-finding including current
network state, available resources on a
variety of layers
• It is a compelling vision to imagine a
dynamic, reactive, visible service-rich
network
Summary
• A rich set of tools is being developed
• To federate network monitoring and diagnostics
• To enable dynamic network resource allocations
• To leverage new network capabilities from an ‘end-user’
application (Phoebus)
• A longer view toward an evolution of “in the network”
services
Questions?
• Jeff Boote
• [email protected]
• Martin Swany
• [email protected]