Perils of GigE

Size Matters:
Network Performance on Jumbo Packets
Summer 2004 Joint Techs
Loki Jorgenson, Chief Scientist, Apparent Networks
9k MTU Project(s)
• test global path MTU on Abilene, CA*net4, CUDI and other R & E
networks, plus create a useful researcher mapping tool
• Internet2 ATEAM - Advanced Test Engineering and Measurement
• Bill Rutherford (Rutherford Research/GAIT)
• Kevin Walsh, Nathaniel Mendoza (San Diego Supercomputer Center/SDSC)
• John Moore (Centaur Internet2 Technology Evaluation Center
ITEC/NCSU North Carolina State University)
• Loki Jorgenson (Apparent Networks/SFU)
• CA*net testing
• Bryan Caron (Network Manager Subatomic Physics,
University of Alberta)
• Damir Pobric (Network engineer - CANARIE)
Why Jumbo?
Performance Benefits
• High performance data transfers
• Grid networks
• Meteorology / physics / biotech
• Collaborative/interactive multi-media
Performance Requirements
• End-to-end path
• From NIC to NIC MTU requirement
• End station is typically the bottleneck
• Gig-E to the desktop
Steady State TCP
If TCP window size and network capacity are not
rate limiting factors then (roughly):
$$\text{e2e throughput} < \frac{0.7 \times \text{Max Segment Size (MTU)}}{\text{Round Trip Time (latency)} \times \sqrt{\text{loss}}}$$
(M. Mathis et al.)
• Double the MTU, double the throughput
• Halve the latency, double the throughput (shortest path matters)
• Halve the loss rate, about 40% higher throughput
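The three rules of thumb above can be checked numerically. The following is a minimal Python sketch of the steady-state bound, assuming the segment size is given in bytes, the round trip time in seconds and the loss as a packet-loss probability. The function name and the sample values (1460 bytes, 70 ms, 0.01% loss) are illustrative assumptions, not measurements from this talk.

```python
import math

def mathis_bound_bps(mss_bytes, rtt_s, loss, c=0.7):
    """Rough upper bound on steady-state TCP throughput, in bits per second."""
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss))

# Hypothetical baseline path: 1460-byte MSS, 70 ms RTT, 0.01% loss.
base = mathis_bound_bps(1460, 0.070, 1e-4)
double_mss = mathis_bound_bps(2920, 0.070, 1e-4)
half_rtt = mathis_bound_bps(1460, 0.035, 1e-4)
half_loss = mathis_bound_bps(1460, 0.070, 0.5e-4)

print(f"double MSS: x{double_mss / base:.2f}")
print(f"half RTT:   x{half_rtt / base:.2f}")
print(f"half loss:  x{half_loss / base:.2f}")
```

Halving the loss only helps by a factor of sqrt(2), which is where the "about 40%" figure comes from.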
Frame Size vs. MTU vs. MSS – An Ethernet Example
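[Diagram: Ethernet frame layout: IFG, PRE, MAC/LLC, IP Header, TCP Header, Payload Data, FCS]
• MSS (1460 bytes): Maximum Segment Size, the payload data (Layer 4, Transport)
• Packet (1500 bytes = MTU): Maximum Transmission Unit, IP header + TCP header + payload (Layer 3, Network)
• Frame (1518 bytes): MAC/LLC header + packet + FCS (Layer 2, Data Link)
OSI layers: 7 Application, 6 Presentation, 5 Session, 4 Transport, 3 Network, 2 Data Link, 1 Physical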
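The relationship between the three sizes is simple arithmetic. The sketch below assumes IPv4 and TCP headers without options (20 bytes each), an untagged Ethernet header plus FCS (18 bytes), and the usual 8-byte preamble and 12-byte inter-frame gap. The function name and the efficiency figure are illustrative additions.

```python
IP_HEADER = 20        # bytes, IPv4 without options
TCP_HEADER = 20       # bytes, TCP without options
ETH_HEADER_FCS = 18   # 14-byte MAC header + 4-byte FCS, no VLAN tag
PREAMBLE_IFG = 20     # 8-byte preamble + 12-byte inter-frame gap

def ethernet_sizes(mtu):
    """Derive MSS, frame size and payload efficiency on the wire from an MTU."""
    mss = mtu - IP_HEADER - TCP_HEADER
    frame = mtu + ETH_HEADER_FCS
    on_wire = frame + PREAMBLE_IFG
    return {"mss": mss, "frame": frame, "efficiency": mss / on_wire}

for mtu in (1500, 9000):
    sizes = ethernet_sizes(mtu)
    print(mtu, sizes["mss"], sizes["frame"], f"{sizes['efficiency']:.3f}")
```

For a 1500-byte MTU this reproduces the 1460-byte MSS and 1518-byte frame shown on the slide.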
MTU Performance Testing: ATEAM
Objectives:
• Measurements using multiple methodologies (SmartBits, iPerf, AppareNet/aNA)
• Determine current dependency on transmission packet size
• Determine extent of jumbo MTU access
9k MTU Project
1. Create Project Plan. Status: done. Deliverable: Plan. Owners: Kevin Walsh/Bill Rutherford. Date: Wednesday, July 17, 2002.
2. Formulate 9k MTU Interesting Target List. Status: done. Deliverable: Abilene Target List. Owners: Paul Love/Kevin Walsh. Date: Friday, July 19, 2002.
3. Probe Abilene Targets for Basic MTU Capabilities. Status: done. Deliverable: Abilene Probe data. Owners: Bill Rutherford/Nathaniel Mendoza/Loki Jorgenson/Kevin Walsh/Paul Schopis. Date: Friday, July 26, 2002.
4. Probe TRIUMF to CERN for MTU Capabilities. Status: done. Deliverable: HEP Probe data. Owners: Bill Rutherford/Loki Jorgenson/Steven McDonald. Date: July 28 - August 1, 2002. Notes: as of Sept 30/02.
5. Formulate Presentation based on combined Probe Data. Status: done. Deliverable: Joint Techs Presentation. Owners: Kevin Walsh/Fred Klassen. Date: September 2002.
6. Formulate MTU Test SW for Spirent SM6000B. Status: in progress - late. Deliverable: Procedure/SW. Owners: All. Date: October 2002.
7. Detailed Analysis of MTU Capabilities. Status: in progress - late. Deliverable: Experiment Results. Owners: All. Date: October 2002. Notes: preliminary discussion; waiting for 9k MTU config mods at all sites; preliminary 9k MTU testing completed on CA*net 4 as of Dec 30/02.
8. Funding Application for Further Work. Status: in progress - late. Deliverable: Application. Owners: All. Date: March 2004. Notes: preliminary discussion; pending funding arrangement.
9. Global 9k MTU Route Map. Deliverable: Map. Owners: All. Notes: preliminary discussion.
About aNA
• AppareNet Network for Academics
• Currently 16 sequencers across CA*net and Abilene
• NIS in Vancouver, Canada
• 10 Gig-E/Jumbo hosts
• 4 hosts in Canada
• BCnet
• Netara Alliance
• CA*net NOC
• ACORN-NS
• Upgrade ANA to v2.5 (mid-August 2004) from v1.6.2
• Web access
• On-demand deployment
About AppareNet
• Uses light, non-intrusive active probing
• ICMP or UDP packets in various configurations
• Point-and-shoot to most IP addresses
• Performs network path characterization
• Performs expert system diagnostics
• Single-ended two-way measures (e.g. half-duplex different from full-duplex)
Samples network to generate same view as best effort application (pre-TCP)
Example Measurement Paths: SDSC to Ottawa
Example Measurement Paths: SDSC to Halifax
Example Measurement Paths: SDSC to Mexico
Why use Jumbo Frames?
[Chart: GigE maximum achievable 2-way bandwidth vs. MTU, from Kansas City to various universities. Y-axis: bandwidth (Mbps), 0 to 2000. X-axis: MTU size (bytes), 0 to 10000. Labeled series: 512, 1500 (standard), 2048, 3072, 4096, 5120, 6144, 7168, 8192 and 9000 MTU.]
Increasing MTU gives better performance.
Advanced Test Engineering and Measurement (ATEAM) performance measurements taken across the Abilene and CA*net4 backbone
http://www.ncne.nlanr.net/training/techs/2003/0803/presentations/0803-moore1_files/v3_document.htm
More GigE Test Results
[Chart: CA*net4 - MTU Performance. Y-axis: 2-way Bandwidth (bps), 0 to 2000000. X-axis: Packet Size (bytes), 0 to 10000.]
Preventing MTU conflicts – Network Negotiation
Network must be able to handle MTU negotiation
[Diagram: servers (9000 MTU) and a client (1500 MTU) connected through a mixed MTU network]
GigE Black Hole Hop
[Diagram: Server (9000 MTU), Layer 2 switch, Client (1500 MTU). Packets labeled req and resp, with DF-flagged packets of 9000, 4500, 2250 and 1125 bytes.]
What is happening?
• RFC 1191 and “TCP Slow Start” are interacting
• Packets are lost
• Retransmissions happen, causing performance degradation
• Client responds to some packets, keeping connection open
• Overall performance appears slow to client
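The black hole behavior can be illustrated with a toy simulation. This sketch assumes a sender that halves its packet size after every silent drop, which is a reading of the 9000, 4500, 2250, 1125 sequence in the diagram, not a description of any particular TCP stack. The function name is hypothetical.

```python
def black_hole_probe(start_size, hop_mtu):
    """Return the packet sizes a sender tries; the last one is delivered.

    The hop silently drops oversized DF packets and sends no ICMP, so the
    sender only learns of the problem through timeouts (assumed: it halves
    the size after each one).
    """
    if hop_mtu < 1:
        raise ValueError("hop_mtu must be at least 1 byte")
    tried = []
    size = start_size
    while True:
        tried.append(size)
        if size <= hop_mtu:   # packet fits and is delivered
            return tried
        size //= 2            # dropped without notice: retry at half size

print(black_hole_probe(9000, 1500))   # [9000, 4500, 2250, 1125]
```

Under this assumption the sender wastes three timeouts and settles on 1125 bytes, below the 1500 bytes the hop could actually carry.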
MTU handling via RFC 1191 PMTU discovery
[Diagram: Server (9000 MTU), Router, Client (1500 MTU). Packets labeled req, a 9000-byte DF packet, an ICMP message from the router, and a 1500-byte DF packet.]
Advantages:
• Router is not loaded
• Maximum performance achieved
Disadvantages:
• reliance on ICMP
• easy to misconfigure
Applications:
• almost all modern applications
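For contrast with the black hole case, here is a minimal simulation of RFC 1191 discovery when ICMP works. It assumes every router that cannot forward a DF packet returns an ICMP "fragmentation needed" message carrying its next-hop MTU, and that the sender adopts that value at once. The function name and the hop list are illustrative.

```python
def discover_path_mtu(local_mtu, hop_mtus):
    """Simulate RFC 1191 path MTU discovery.

    Returns (path_mtu, icmp_messages), where icmp_messages counts the
    "fragmentation needed" replies the sender received.
    """
    size = local_mtu
    icmp_messages = 0
    while True:
        for hop_mtu in hop_mtus:
            if size > hop_mtu:
                # Router drops the DF packet and reports its next-hop MTU.
                icmp_messages += 1
                size = hop_mtu
                break
        else:
            # The packet crossed every hop without being dropped.
            return size, icmp_messages

print(discover_path_mtu(9000, [9000, 1500]))   # (1500, 1)
```

A single ICMP message is enough to settle on the full 1500 bytes, which is why filtering ICMP turns this mechanism into the black hole described earlier.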
Intel Pro XT 1000 version 6.2.22.1
Intel Pro XT 1000 version 7.0.36.0
Intel Pro XT 1000 version 7.4.19.0
Avoiding GigE MTU problems
• Maintain logical Layer 3 diagrams
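[Diagram: two subnets joined by a router (R); each box lists Addr, Mask, Routes and Default GW]
Subnet 1: Addr: 10.0.1.1-254; Mask: 255.255.255.0; Routes/Default GW: 10.0.1.1
Subnet 2: Addr: 10.0.2.1-254; Mask: 255.255.255.0; Routes/Default GW: 10.0.2.1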
• Assign MTUs on a per-subnet basis
• Be consistent with MTU values used
• Use 1500 bytes for legacy Ethernet (no registry hacks)
• Use recommended 9000 bytes MTU for GigE when jumbo frames are used (standard for CA*net and Abilene networks)
• Don’t forget to add 18 bytes when adjusting frame size (e.g. set NIC to 9018 bytes frame size to maintain a 9000 byte MTU)
• Don’t arbitrarily filter out ICMP messages
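The per-subnet rule lends itself to an automated check. The sketch below assumes a host inventory is available as (address, MTU) pairs and that subnets are /24, as in the diagram. The inventory and the function name are hypothetical.

```python
import ipaddress
from collections import defaultdict

def mixed_mtu_subnets(hosts, prefix_len=24):
    """Return {subnet: sorted MTUs} for subnets whose hosts disagree on MTU."""
    by_subnet = defaultdict(set)
    for ip, mtu in hosts:
        # strict=False lets a host address stand in for its network.
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        by_subnet[str(net)].add(mtu)
    return {net: sorted(mtus) for net, mtus in by_subnet.items() if len(mtus) > 1}

# Hypothetical inventory using the two subnets from the diagram.
hosts = [
    ("10.0.1.10", 1500),
    ("10.0.1.11", 1500),
    ("10.0.2.10", 9000),
    ("10.0.2.11", 1500),   # inconsistent with its jumbo neighbor
]
print(mixed_mtu_subnets(hosts))   # {'10.0.2.0/24': [1500, 9000]}
```

Any subnet reported here mixes MTU values and is a candidate for the conflicts described in the earlier slides.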
Resources
Path MTU tools:
• ANA pMTU service – uses ANA sequencers across I2 and CA*net
http://pathmtu.apparenet.com:8282/
[email protected]:guest42
• NCNE MTU Discovery Service – uses service located at NCNE
http://www.ncne.org/jumbogram/mtu_discovery.php
• pMTU Applet - Java-based client for end-user station
http://sourceforge.net/projects/mtu-calculator/
http://ana.apparenet.com:8282/pMTU/ Download
Jumbo MTU Performance whitepaper
• http://www.apparentNetworks.com/wp/
Path MTU Mapping Service
[Diagram: components are a web interface, map client code, map service code, path MTU discovery, an archive, route analysis and route parameter analysis. Flows are labeled request, response, path MTU route and path MTU route(s).]
pMTU Java Client
pMTU applet available from SourceForge
pMTU Database and Map Visualization
pMTU Database and MTU map
pMTU Project Status
• Initial prototypes available
• Open source (SourceForge) projects
pMTU Applet
• Java/Swing client
• Webstart
• Native code driver
• Windows only
• Linux-ready
pMTU DB/Map
• mySQL/Servlet
• accepts connections
• limited map capability
• saves Applet results
Current Work – Application Performance
End-user Performance Benefits
• End-to-end requirement
• Very limited jumbo-access from campus
• Influence of real-world application dynamics
• Identify classes of need that require jumbo
• Identify classes of implementation susceptible to jumbo
Target Implementations
Distributed File Systems
• Simple, commonly useful utility
• Challenged by latency, loss
• Often data-intensive
• Persistent use (not one-time)
• Non-denominational
Visualization Server
• Visual/subjective benefit
• Very demanding
• Resolution dependent load
• Specific to particular fields/users
Resources
On-Line Path MTU tools:
• ANA pMTU service – uses ANA sequencers across I2 and CA*net
http://pathmtu.apparenet.com:8282/
[email protected]:guest42
• NCNE MTU Discovery Service – uses service located at NCNE
http://www.ncne.org/jumbogram/mtu_discovery.php
• pMTU Applet - Java-based client for end-user station
http://sourceforge.net/projects/mtu-calculator/
http://ana.apparenet.com:8282/Download
Jumbo MTU Performance whitepaper
• http://www.apparentNetworks.com/wp/
Fin
End of Presentation
aNA.apparenet.com
Backup Slides
GigE Black Hole Hop
[Diagram: Server, Layer 2 switch, Client. A request (req) with DF-flagged packets of 9000, 4500, 2250 and 1125 bytes; a response (resp) with DF-flagged packets of 2250 and 1125 bytes.]
MTU handling via fragmentation
[Diagram: Server, Router, Client. A request (req) and 9000-byte packets on the server side appear as series of 1500-byte packets on the client side.]
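The cost of fragmentation is easy to quantify. This sketch computes the IPv4 fragments a router would produce for one oversized packet, assuming a 20-byte IP header without options and the IPv4 rule that every fragment except the last carries a multiple of 8 data bytes. The function name is illustrative.

```python
def fragment_lengths(packet_len, mtu, ip_header=20):
    """Total length (header + data) of each IPv4 fragment of one packet."""
    if packet_len <= mtu:
        return [packet_len]               # no fragmentation needed
    per_fragment = (mtu - ip_header) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any fragment data")
    data = packet_len - ip_header
    lengths = []
    while data > 0:
        chunk = min(per_fragment, data)
        lengths.append(chunk + ip_header)  # every fragment repeats the header
        data -= chunk
    return lengths

frags = fragment_lengths(9000, 1500)
print(len(frags), frags, sum(frags))
```

A 9000-byte packet becomes seven fragments (six of 1500 bytes and one of 120), so the router handles seven packets for every one it receives and the loss of any fragment costs the whole original packet.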
MTU handling via RFC 1191 PMTU discovery
[Diagram: Server, Router, Client. A request (req), a 9000-byte DF packet, an ICMP message from the router, and three 1500-byte DF packets.]