20061205-performanceupdate-boyd

Performance Update
Eric Boyd
Director of Performance Architecture
and Technologies
Internet2
Network support of Science
• Science is a global community
• Networks link scientists
• Collaborative research occurs across network
boundaries
• For the scientist, the value of the network is the
achieved network performance
• Scientists should not have to focus on the network;
good end-to-end performance should be a given
Large Hadron Collider
• International Physics facility located at CERN, Switzerland
• Major US involvement
• 2 major US data repositories (PetaBytes/year)
• 17 US Institutions provide data analysis and storage
• 68 Universities and National Laboratories with scientists
looking at the data
• Dedicated transatlantic networks connect US to CERN
• Advanced network services required over existing campus,
connector/regional, and national networks
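To give a sense of scale for the "PetaBytes/year" figure above, the following sketch converts a yearly data volume into the average sustained rate it implies. The 1 PB/year input and the decimal unit (1 PB = 10^15 bytes) are assumptions for illustration; the slide does not state exact volumes.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s, ignoring leap years


def sustained_rate_mbps(petabytes_per_year: float) -> float:
    """Average rate in Mb/s needed to move the given volume within one year."""
    bits = petabytes_per_year * 1e15 * 8  # decimal petabytes -> bits (assumed unit)
    return bits / SECONDS_PER_YEAR / 1e6


if __name__ == "__main__":
    # Hypothetical input: one petabyte per year.
    print(f"{sustained_rate_mbps(1.0):.1f} Mb/s sustained")
```

This is an average only; real transfers are bursty, so the achievable end-to-end rate has to be well above it.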
Achieving Good End-to-End Performance
• Internet2 consists of:
• Campuses
• Regional networks
• Internet2 backbone network
• Our members care about connecting with:
• Other members
• Government labs & networks
• International partners
• The Internet2 community cares about making all of this
work
Identifying the Problem
[Diagram: a path through Applications Developer, System Administrator, LAN Administrator, Campus Networking, and Gigapop at each end, joined by the Backbone, with these comments along it:]
• Applications Developer: “Hey, this is not working right!”
• “Others are getting in ok”
• “Not our problem”
• “Talk to the other guys”
• “Everything is AOK”
• “The computer is working OK”
• “No other complaints”
• “Looks fine”
• “All the lights are green”
• “We don’t see anything wrong”
• “The network is lightly loaded”
How do you solve a problem along a path?
Status Quo
• Performance is excellent across backbone networks
• Performance is a problem end-to-end
• Problems are concentrated towards the edge and in
network transitions
• We need to:
• Diagnose: Understand limits of performance
• Address: Work with members and application
communities to address those performance issues
Vision: Performance Information is …
• Available
• People can find it (Discovery)
• “Community of trust” allows access across
administrative domain boundaries (AA)
• Ubiquitous
• Widely deployed (Paths of interest covered)
• Reliable (Consistently configured correctly)
• Valuable
• Actionable (Analysis suggests course of action)
• Automatable (Applications act on data)
e-VLBI Network
e-VLBI Result
• Use of integrated network monitoring helped to enable
identification of bottleneck (hardware fault)
• Automated monitoring allowed view of network
throughput variation over time
• Highlights route changes, network outages
• Automated monitoring also helps to highlight any
throughput issues at end points:
• E.g. Network Interface Card failures, Untuned TCP
Stacks
• Integrated monitoring provides overall view of network
behavior at a glance
Goal: No more mystery …
• Increase network awareness
• Set user expectations accurately
• Reduce diagnostic costs
• Performance problems noticed early
• Performance problems addressed efficiently
• Network engineers can see & act outside their turf
• Transform application design
• Incorporate network intuition into application
behavior
Strategy: Build & Empower the Community
Decouple the Problem Space:
• Analysis and Visualization
• Performance Data Sharing
• Performance Data Generation
Grow the Footprint:
• Clean APIs and protocols between each layer
• Widespread deployment of measurement infrastructure
• Widespread deployment of common performance measurement tools
[Diagram: two stacks, each layered as Analysis & Visualization / API / Measurement Infrastructure / API / Performance Tools]
Tactics: Leverage position
• Internet2 can use its position to help provide diagnostic
information for the “backbone” portion of the problem
• Create *some* diagnostic tools
• Make Abilene data as public as is reasonable
• Work on efforts to make performance data more widely
available (perfSONAR)
• Contribute to ‘base’ perfSONAR development
• Integrate ‘our’ diagnostic tools as ‘good’ example of
perfSONAR services
From the scientist’s perspective
On behalf of the scientist, network engineer or
application can easily/automatically:
• Discover additional monitoring resources
• Authenticate locally
• Be authorized to use remote network resources to a
limited extent
• Acquire performance monitoring data from remote
sites via standard protocol
• Innovate where needed
• Customize the analysis and visualization
Internet2 End-to-End Performance Initiative
(E2Epi)
• Includes:
• Internet2 staff
• Internet2 members
• Federal partners
• International partners
• Building:
• Performance monitoring tools
• Performance middleware frameworks
• Performance improvement tools
Support for E2Epi
• Funded out of network revenues
• Partnerships
• Leveraging GÉANT2, ESnet, and RNP resources through
consortium leadership
• Grants
• NSF Apps - Targeted Assistance and Instrumentation for Internet2
Applications
• NSF SGER - Leveraging Internet2 Facilities for the Network
Research Community
• NSF SGER2 - Network Measurement for International Connections
• NSF BTG - Bridging the Gap: End-to-End Networking for Landmark
Applications
• NLM Pilot - User Experience with the High Performance Internet
Infrastructure: Critical Incidents of Success and Failure
• NLM NDT - Enhancing the Web100-based Network Diagnostic Tool
Performance Tools
• Diagnosis
• Throughput (BWCTL)
• One-Way Delay (OWAMP)
• Top 10 Problems in First Mile (NDT)
• Solutions
• Alternate congestion control (VFER)
• Partition the session (Phoebus)
Network Performance Toolkit (NPToolkit)
• Knoppix (v5.0) based Live-CD
• Automatically starts 4 E2E performance tools
with usable default configurations
• BWCTL
• NDT
• NPAD
• OWAMP
• Easy customization scripts allow admins to tailor the
system to site needs
Network Diagnostic Tool (NDT)
• New Simple Firewall Test added
• Google Summer of Code project
• Detects blocked ephemeral ports on server and
client
• New IPv6 address support
• General code cleanup
• Virginia Tech contribution
• Client’s location can be plotted on map
OWAMP: One-Way Active Measurement
Protocol
• What is it?
• Measures one-way latency: 1-way ping
• Control connection used to broker test
request based upon policy restrictions and
available resources. (Bandwidth/disk limits)
• Specification
• http://www.rfc-editor.org/rfc/rfc4656.txt
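The "1-way ping" idea above can be sketched as follows: each test packet carries a sequence number, the sender records when it left, the receiver records when it arrived, and the one-way delay is the difference. This is a minimal illustration, not the OWAMP implementation; the function name and the dictionary layout are hypothetical, and the subtraction only makes sense if both hosts' clocks are synchronized.

```python
from typing import Dict, List, Tuple


def one_way_delays(
    sent: Dict[int, float], received: Dict[int, float]
) -> Tuple[List[float], int]:
    """Return per-packet one-way delays (seconds) and the count of lost packets.

    `sent` and `received` map a sequence number to a timestamp in seconds.
    Both timestamps are assumed to come from synchronized clocks.
    """
    delays = [
        received[seq] - ts for seq, ts in sorted(sent.items()) if seq in received
    ]
    lost = sum(1 for seq in sent if seq not in received)
    return delays, lost


if __name__ == "__main__":
    # Hypothetical timestamps: packet 1 never arrives.
    sent = {0: 10.000, 1: 10.100, 2: 10.200}
    received = {0: 10.025, 2: 10.230}
    print(one_way_delays(sent, received))
```

Unlike a round-trip ping, this separates the forward path from the return path, which is useful when the two directions take different routes.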
OWAMP Flow Diagram
[Diagram: on the client side, an owping client [Control] and an OWD Test Endpoint; on the server side, owampd [Resource Broker], owampd [Control], and an OWD Test Endpoint. Arrows are labeled “Initial connection” and “Requests/Results”.]
What’s New? (1)
• Protocol status: RFC 4656
• IANA allocated port: 861
• Authentication/Authorization changes
• Uses HMAC-SHA1 for message validation
• Uses PBKDF2 for AES session key creation
• keys are now session specific and
dynamically generated from
passphrases.
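The two primitives named above can be illustrated with Python's standard library: PBKDF2 turns a passphrase into a fixed-size key, and HMAC-SHA1 produces a tag that lets the receiver check that a message was not altered. This is a sketch of the primitives only, not the OWAMP wire format or key hierarchy (RFC 4656 defines those); the function names, salt, iteration count, and message are hypothetical.

```python
import hashlib
import hmac
import os


def derive_key(passphrase: bytes, salt: bytes, count: int) -> bytes:
    """Derive a 16-byte (AES-128 sized) key from a passphrase via PBKDF2-HMAC-SHA1."""
    return hashlib.pbkdf2_hmac("sha1", passphrase, salt, count, dklen=16)


def make_tag(hmac_key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA1 tag over the message."""
    return hmac.new(hmac_key, message, hashlib.sha1).digest()


def verify_tag(hmac_key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(make_tag(hmac_key, message), tag)


if __name__ == "__main__":
    salt = os.urandom(16)        # illustrative per-session salt
    key = derive_key(b"example passphrase", salt, 1024)  # illustrative count
    hmac_key = os.urandom(32)    # separate integrity key, also illustrative
    tag = make_tag(hmac_key, b"test request")
    print(len(key), verify_tag(hmac_key, b"test request", tag))
```

Because the salt changes per session, the same passphrase yields a different key each time, which matches the "session specific and dynamically generated" point on the slide.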
What’s New? (2)
• Powstream is now a fully supported application
with documentation
• As always - more bug fixes and ports
• Details in the distribution
Availability
• 3.0a release available
• Source tarball
• Supported release out in the next month after
more extensive testing on Abilene
measurement hosts
• Supported releases will also be provided as
RPMs with many thanks to GA-TECH
Bulk Transport: Killer App
• Q: What do we need fat pipes for?
• A: Bulk Transport
• Flavors:
• Straightforward huge file transfer
• Interactive high throughput
• Instrument data transfer
• Poor Performance (~3 Mb/s performance where we
should have ~60-100 Mb/s)
• #1 Reason for poor performance: Transport Protocols
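To make the gap between ~3 Mb/s and ~60-100 Mb/s concrete, the sketch below computes how long a single file takes at each rate. The 1 GB file size (decimal, 10^9 bytes) is an assumption chosen for illustration; only the rates come from the slide.

```python
def transfer_seconds(size_bytes: float, rate_mbps: float) -> float:
    """Time in seconds to move size_bytes at a steady rate given in Mb/s."""
    return size_bytes * 8 / (rate_mbps * 1e6)


if __name__ == "__main__":
    size = 1e9  # hypothetical 1 GB file
    for rate in (3, 60, 100):
        minutes = transfer_seconds(size, rate) / 60
        print(f"{rate:>3} Mb/s: {minutes:.1f} minutes")
```

At the observed rate the transfer takes most of an hour; at the expected rates it takes about one to two minutes.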
VFER – Bulk Transport Tool
• Command-line remote copy tool
• SCP-style interface
• Easy to use on today’s advanced networks
• Download, make, install
• Portable (no kernel mods)
• Out-of-the-box performance
• Tolerate minor non-congestive packet loss
• Both static file transfer and interactive applications
• Runs over UDP
• TCP-friendly
VFER – Current Status
• Alpha release v0.98
(http://vfer.internet2.edu)
• Working, not polished, delay-based
congestion control
• SSH-based security
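A delay-based controller reacts to rising round-trip time rather than to packet loss, which is one way a sender can tolerate minor non-congestive loss. The sketch below shows the general idea only; it is not VFER's algorithm, and the function name, thresholds, and step sizes are all hypothetical.

```python
def next_rate(
    rate_mbps: float,
    rtt_ms: float,
    base_rtt_ms: float,
    *,
    threshold_ms: float = 5.0,
    step_mbps: float = 1.0,
    backoff: float = 0.8,
    min_rate_mbps: float = 1.0,
) -> float:
    """Pick the next sending rate from the latest RTT sample.

    Queueing delay is estimated as the RTT above the smallest RTT seen so far
    (base_rtt_ms). Above the threshold the rate is cut multiplicatively;
    otherwise it is probed upward by a fixed step.
    """
    queueing_delay_ms = rtt_ms - base_rtt_ms
    if queueing_delay_ms > threshold_ms:
        return max(min_rate_mbps, rate_mbps * backoff)
    return rate_mbps + step_mbps


if __name__ == "__main__":
    rate = 10.0
    for rtt in (50.0, 51.0, 58.0, 52.0):  # hypothetical RTT samples in ms
        rate = next_rate(rate, rtt, base_rtt_ms=50.0)
        print(f"rtt={rtt} ms -> rate={rate:.1f} Mb/s")
```

A lost packet with no accompanying rise in delay leaves the rate unchanged in this scheme, whereas a loss-based controller would back off.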
Network Performance Measurement
Workshops
• Example Course Materials:
• http://e2epi.internet2.edu/npw/presentations.html
Goals:
• Grow installed base of BWCTL/Iperf, OWAMP, and NDT at
GigaPoP and regional campuses.
• http://e2epi.internet2.edu/pipes/pmp/pmp-dir.html
• Begin integration into IT support processes.
• Create an installed base for perfSONAR deployment.
• Teach Internet2 community to use performance tools.
Bridging the Gap
• Multi-discipline team addressing 2 major issues
• Reset user expectations
• 10 Mbytes per second is ‘acceptable’
• Problem resolution takes too long
• Better tools and self-guided documentation to improve
troubleshooting
• Documentation that can be used by both novice and
expert
Getting There: Build & Empower the Community
Decouple the Problem Space:
• Analysis and Visualization
• Performance Data Sharing
• Performance Data Generation
Grow the Footprint:
• Clean APIs and protocols between each layer
• Widespread deployment of measurement infrastructure
• Widespread deployment of common performance measurement tools
[Diagram: two stacks, each layered as Analysis & Visualization / API / Measurement Infrastructure / API / Performance Tools]
What is perfSONAR?
• Performance Middleware
• perfSONAR is an international consortium
in which Internet2 is a founder and leading
participant
• perfSONAR is a set of protocol standards
for interoperability between measurement
and monitoring systems
• perfSONAR is a set of open source web
services that can be mixed-and-matched
and extended to create a performance
monitoring framework
perfSONAR Design Goals
• Standards-based
• Modular
• Decentralized
• Locally controlled
• Open Source
• Extensible
• Applicable to multiple generations of network
monitoring systems
• Grows “beyond our control”
• Customized for individual science disciplines
perfSONAR Integrates
• Network measurement tools
• Network measurement archives
• Discovery
• Authentication and authorization
• Data manipulation
• Resource protection
• Topology
perfSONAR Credits
• perfSONAR is a joint effort:
• ESnet
• GÉANT2 JRA1
• Internet2
• RNP
• ESnet includes:
• ESnet/LBL staff
• Fermilab
• Internet2 includes:
• University of Delaware
• Georgia Tech
• SLAC
• Internet2 staff
• GÉANT2 JRA1 includes:
• Arnes
• Belnet
• Carnet
• Cesnet
• CYNet
• DANTE
• DFN
• FCCN
• GRNet
• GARR
• ISTF
• PSNC
• Nordunet (Uninett)
• Renater
• RedIRIS
• Surfnet
• SWITCH
perfSONAR Adoption
• R&E Networks
• Internet2
• ESnet
• GÉANT2
• European NRENs
• RNP
• Application Communities
• LHC
• GLORIAD Distributed Virtual NOC
• Roll-out to other application communities in 2007
• Distributed Development
• Individual projects (10 before first release) write components
that integrate into the overall framework
• Individual communities (5 before first release) write their own
analysis and visualization software
More Information
• Eric Boyd
• [email protected]
• 734-352-7032
• http://e2epi.internet2.edu/
• http://www.perfsonar.net/