Measurement Tools

Finding Network Problems that
Influence Applications:
Measurement Tools
Matt Zekauskas
Rich Carlson
Spring Member Meeting
May 2, 2005
Outline
Problems, typical causes, diagnostic
strategies
Tools: First mile, host issues
Tools: Path issues
Tools: Others to be aware of
Tools within Abilene
Wrap-up
Finding Network Problems: Measurement Tools
5/2/05 22-Apr-2005
We Would Like Your Help
What problems are you experiencing?
Have you used a good tool?
Give us the benefit of your experience:
successful problem resolution?
What Are The Problems?
Packet loss
Jitter
Out-of-order packets (extreme jitter)
Duplicated packets
Excessive latency
• Interactive applications
• TCP’s control system
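Most of the problems listed above can be read off the sequence numbers of a test stream. The following is a minimal sketch, assuming the sender numbers its packets from 0 and the receiver logs them in arrival order; the function name and the definition of "reordered" are illustrative choices, not taken from any particular tool.

```python
from collections import Counter

def classify_sequence(received, sent_count):
    """Summarize loss, duplication, and reordering.

    received: sequence numbers in arrival order (0-based).
    sent_count: number of packets the sender transmitted.
    """
    counts = Counter(received)
    # Sequence numbers that never arrived at all.
    lost = sent_count - len(counts)
    # Every arrival beyond the first copy of a packet is a duplicate.
    duplicates = sum(n - 1 for n in counts.values())
    # Count a packet as reordered if its first copy arrives after
    # a packet with a higher sequence number.
    reordered = 0
    highest = -1
    seen = set()
    for seq in received:
        if seq in seen:
            continue
        seen.add(seq)
        if seq < highest:
            reordered += 1
        else:
            highest = seq
    return {"lost": lost, "duplicates": duplicates, "reordered": reordered}

if __name__ == "__main__":
    # Packet 4 is missing, packet 2 arrives late and twice.
    print(classify_sequence([0, 1, 3, 2, 2, 5], sent_count=6))
```

Jitter and latency need timestamps as well as sequence numbers, so they are not covered by this sketch.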
Link vs Path loss rates
Eliminating loss has been the goal
• TCP: 100 Mbit Ethernet coast-to-coast:
–Full size packets… need 10^-6 Ploss [Mathis]
–Less than 1 loss every 83 seconds
• http://www.psc.edu/~mathis/papers/JTechs200105/
• GigE: 10^-8, 1 loss every 497 seconds
Not all loss is avoidable
• Fair sharing can lead to congestive loss
• Non-congestive losses especially tricky
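The loss targets above follow from the Mathis model, in which steady-state TCP throughput is roughly (MSS/RTT) * C/sqrt(p). The sketch below solves that relation for p. The 70 ms round-trip time, the 1460-byte MSS, and C = sqrt(3/2) are assumptions made here for illustration; the figures on the slide come from the cited paper and its own assumptions, so this only reproduces the order of magnitude and the fact that ten times the rate needs one hundred times less loss.

```python
import math

def max_loss_probability(target_bps, rtt_s, mss_bytes=1460, c=math.sqrt(1.5)):
    """Loss probability at which the Mathis model predicts target_bps.

    Model: rate = (MSS / RTT) * C / sqrt(p), solved for p.
    mss_bytes and c are assumed values, not taken from the slide.
    """
    return (c * mss_bytes * 8 / (rtt_s * target_bps)) ** 2

# Assumed coast-to-coast round-trip time of 70 ms.
p_100m = max_loss_probability(100e6, 0.070)
p_1g = max_loss_probability(1e9, 0.070)

if __name__ == "__main__":
    print(f"100 Mbit/s needs p below roughly {p_100m:.1e}")
    print(f"1 Gbit/s needs p below roughly {p_1g:.1e}")
```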
What Are The Problems?
TCP: lack of buffer space
• Forces protocol into stop-and-wait
• Number one TCP-related performance
problem.
• 70ms * 1Gbps = 70*10^6 bits, or about 8.75MB (8.3MiB)
• 70ms * 100Mbps = 7*10^6 bits, or about 875KB (855KiB)
• Many stacks default to 64KB, which limits a 70ms path to about 7.4Mbps
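The buffer arithmetic is the bandwidth-delay product: a sender needs rate * RTT bytes in flight to fill the path, and a fixed window caps throughput at window / RTT. A small sketch of both directions of that calculation follows; the function names are illustrative.

```python
def bdp_bytes(rate_bps, rtt_s):
    """Bytes that must be in flight to fill a path of this rate and RTT."""
    return rate_bps * rtt_s / 8

def window_limited_bps(window_bytes, rtt_s):
    """Best-case throughput when at most window_bytes can be unacknowledged."""
    return window_bytes * 8 / rtt_s

if __name__ == "__main__":
    rtt = 0.070  # 70 ms round-trip time, as on the slide
    for label, rate in [("1 Gbps", 1e9), ("100 Mbps", 100e6)]:
        print(f"{label}: buffer of {bdp_bytes(rate, rtt):,.0f} bytes needed")
    limit = window_limited_bps(64 * 1024, rtt)
    print(f"64 KiB window: at most {limit / 1e6:.1f} Mbps")
```

The same two functions can be reused with a measured RTT (for example from ping) to choose a socket buffer size for a specific path.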
What Are The Problems?
Video/Audio: lack of buffer space
• Makes broadcast streams very sensitive
to previous problems
Application behaviors
• Stop-and-wait behavior; Can’t stream
• Lack of robustness to network anomalies
The Usual Suspects
Host performance problems (TCP buffers)
Duplex mismatch (Ethernet)
Wiring/Fiber problem
Bad equipment
Bad routing
Congestion
• “Real” traffic
• Unnecessary traffic (broadcasts, multicast, denial
of service attacks)
JPL/Caltech – GSFC
The situation
• Using Abilene
• Tuned hosts
• Things work locally
Therefore it MUST be Abilene
• Tests show good flows router-router
• Intermediate tests point towards CA
Bad fiber connection!
Strategy
Most problems are local…
Test ahead of time!
Is there connectivity & reasonable
latency? (ping -> OWAMP)
Is routing reasonable? (traceroute)
Is host reasonable? (NDT; web100)
Is path reasonable? (iperf -> BWCTL)
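The checklist above can be scripted so it is easy to run ahead of time. This is a minimal sketch covering only the command-line steps (ping, traceroute, iperf); the host check with NDT is browser-based and is left out. It assumes a Unix-like system where `ping -c` is available and an iperf server is already listening on the far end. The host name is a placeholder, and the script defaults to a dry run that only lists the commands.

```python
import shutil
import subprocess

def build_checks(host):
    """Return (question, command) pairs in the order the slide suggests."""
    return [
        ("connectivity and latency", ["ping", "-c", "5", host]),
        ("routing", ["traceroute", host]),
        ("path throughput", ["iperf", "-c", host, "-t", "20"]),
    ]

def run_checks(host, dry_run=True):
    """Run each check, or just list it when dry_run is set or the tool is missing."""
    results = []
    for question, cmd in build_checks(host):
        if dry_run or shutil.which(cmd[0]) is None:
            results.append((question, " ".join(cmd), None))
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results.append((question, " ".join(cmd), proc.returncode))
    return results

if __name__ == "__main__":
    # "test.example.org" is a placeholder, not a real test point.
    for question, cmd, rc in run_checks("test.example.org"):
        print(f"{question}: {cmd} -> {rc}")
```

A return code of None means the command was listed but not executed.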
One Technique: Problem
Isolation via Divide and Conquer
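Divide and conquer along a path amounts to a bisection over the test points between the two ends. The sketch below assumes an ordered list of test points and a hypothetical `path_ok` callable that reports whether a test from the source to a given point looks healthy; it also assumes that once the path goes bad it stays bad for every point further along.

```python
def first_bad_hop(hops, path_ok):
    """Return the index of the first test point where path_ok fails, or None.

    hops: ordered test points from source to destination.
    path_ok: hypothetical callable; True if source-to-hop tests look healthy.
    """
    lo, hi = 0, len(hops)
    # Invariant: every hop before lo is good, every hop from hi onward is bad.
    while lo < hi:
        mid = (lo + hi) // 2
        if path_ok(hops[mid]):
            lo = mid + 1
        else:
            hi = mid
    return lo if lo < len(hops) else None

if __name__ == "__main__":
    # Illustrative names only; the fault is placed at index 4.
    hops = ["campus", "gigapop", "backbone-west", "backbone-east",
            "far-gigapop", "far-campus"]
    bad = first_bad_hop(hops, lambda h: hops.index(h) < 4)
    print("first bad test point:", hops[bad] if bad is not None else "none")
```

With n test points this needs about log2(n) tests instead of n, which matters when each test is a 20-second throughput run.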
Strategy (references)
See also
• http://e2epi.internet2.edu/
Look at stories, documents, tools
• http://e2epi.internet2.edu/ndt/
Pointer to the tool, and using it for debugging
the last mile
Strategy (references)
• http://www.psc.edu/networking/projects/tcptune/
How to tweak OS parameters (also scp pointer)
• http://www.ncne.org/research/tcp/
TCP debugging the detailed way
• http://dast.nlanr.net/Guides/WritingApps/
Tips for app writers
• http://dast.nlanr.net/Guides/GettingStarted
And some checking to do by hand & debugging.
Outline
Problems, typical causes, diagnostic
strategies
Tools: First mile, host issues
Tools: Path issues
Tools: Others to be aware of
Tools within Abilene
Wrap-up
Internet2 Detective
A simple “is there any hope” tool
• Windows “tray” application
• Red/green lights, am I on Internet2
• Multicast available
• IPv6 available
http://detective.internet2.edu/
NLANR Performance Advisor
Geared for the naive user
Run at both ends, and see if a standard
problem is detected.
Can also work with intermediate servers
http://dast.nlanr.net/Projects/Advisor
NDT
Network Diagnostic Tool
Java applet
Connects to server in middle, runs tests, and
evaluates heuristics looking for host and first
mile problems.
Has detailed output.
You’ll see lots of detail later today.
A commercial tool that tests for TCP buffer
problems: http://www.dslreports.com/tweaks/
Host/OS Tuning: Web100
Goal: TCP stack, tuning not bottleneck
Large measurement component
• TCP performance not what you expect?
Ask TCP why!
–Receiver bottleneck (out of receiver window)
–Sender bottleneck (no data to send)
–Path bottleneck (out of congestion window)
–Path anomalies (duplicate, out of order, loss)
www.web100.org
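"Ask TCP why" can be pictured as comparing how long a connection spent limited by each cause. The sketch below assumes the per-connection statistics include the time spent receiver-limited, sender-limited, and congestion-window-limited; the argument names are hypothetical and are not Web100 variable names.

```python
def dominant_limit(rwin_s, sender_s, cwnd_s):
    """Return (label, fraction) for the state the sender spent the most time in.

    rwin_s, sender_s, cwnd_s: seconds spent limited by the receiver window,
    by the sending application, and by the congestion window (hypothetical inputs).
    """
    times = {
        "receiver bottleneck (out of receiver window)": rwin_s,
        "sender bottleneck (no data to send)": sender_s,
        "path bottleneck (out of congestion window)": cwnd_s,
    }
    total = sum(times.values())
    if total <= 0:
        return None, 0.0
    label = max(times, key=times.get)
    return label, times[label] / total

if __name__ == "__main__":
    label, fraction = dominant_limit(rwin_s=8.0, sender_s=1.0, cwnd_s=1.0)
    print(f"{label}: {fraction:.0%} of the transfer")
```

A mostly receiver-limited transfer points back to the buffer-space problem; a mostly congestion-window-limited one points to the path.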
Reference Servers (Beacons)
H.323 conferencing
• Goal: portable machines that tell you if system
likely to work (and if not, why?)
• Moderate-rate UDP of interest
• E.g., H.323 Beacon
http://www.osc.edu/oarnet/itecohio.net/beacon/
• ViDeNet Scout, http://scout.video.unc.edu/
Outline
Problems, typical causes, diagnostic
strategies
Tools: First mile, host issues
Tools: Path issues
Tools: Others to be aware of
Tools within Abilene
Wrap-up
OWAMP
One-Way Active Measurement Protocol
Requires NTP-Synchronized clocks
Look for one-way latency, loss
Authentication and Scheduling
Again, lots more later today
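The clock requirement follows from how one-way delay is computed: the send timestamp is taken on one host and the receive timestamp on another, so any clock offset lands directly in the result. The sketch below summarizes a set of such samples. The input format is an assumption made for illustration and is not OWAMP's own output format.

```python
import statistics

def summarize_one_way(samples):
    """Summarize one-way delay and loss.

    samples: list of (send_time_s, recv_time_s), with recv_time_s set to
    None for packets that never arrived. Timestamps come from two hosts
    whose clocks are assumed to be synchronized.
    """
    delays = [recv - send for send, recv in samples if recv is not None]
    lost = len(samples) - len(delays)
    summary = {
        "sent": len(samples),
        "lost": lost,
        "loss_rate": lost / len(samples) if samples else 0.0,
    }
    if delays:
        summary["min_delay_s"] = min(delays)
        summary["median_delay_s"] = statistics.median(delays)
        summary["max_delay_s"] = max(delays)
    return summary

if __name__ == "__main__":
    samples = [(0.0, 0.035), (0.1, 0.136), (0.2, None), (0.3, 0.334)]
    print(summarize_one_way(samples))
```

Running the same summary separately in each direction shows whether loss or queuing is confined to one side of the path, which round-trip ping cannot tell you.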
BWCTL
A tool for throughput testing that
includes scheduling and authentication.
Currently uses iperf for actual tests.
Can assign users (or IP addresses) to
classes, give classes different
throughput limits or time limits.
Periodic and on-demand testing.
Lots more later today.
A commercial tool
Apparent Networks has a tool that can
diagnose path properties from a single
vantage point. Finds things like duplex
mismatches, MTU black holes, rate
limiting links. Depends on ICMP (or a
reflector).
http://www.apparentnetworks.com/
Outline
Problems, typical causes, diagnostic
strategies
Tools: First mile, host issues
Tools: Path issues
Tools: Others to be aware of
Tools within Abilene
Wrap-up
Some Commercial Tools
Caveat: only a partial list, give me more!
Spirent (nee Netcom/Adtech):
• working on a box for ‘end-to-end’ measurements
• SmartBits: test at low & high rates, QoS; test
components or end-to-end path
NetIQ: Chariot/Pegasus
Agilent (like SmartBits, and FireHunter)
Ixia (like SmartBits/Spirent)
Brix Networks (like AMP/OWAMP, for ‘QoS’)
Apparent Networks: path debugger
Some Noncommercial Tools
Iperf: dast.nlanr.net/Projects/iperf
• See also http://www-itg.lbl.gov/nettest/
• http://www-didc.lbl.gov/NCS/
Flowscan:
• http://www.caida.org/tools/utilities/flowscan/
• http://net.doit.wisc.edu/~plonka/FlowScan/
SLAC’s traceroute perl script:
• http://www.slac.stanford.edu/comp/net/wan-mon/traceroute-srv.html
One large list:
• http://www.slac.stanford.edu/xorg/nmtf/nmtf-tools.html
Outline
Problems, typical causes, diagnostic
strategies
Tools: First mile, host issues
Tools: Path issues
Tools: Others to be aware of
Tools within Abilene
Wrap-up
Abilene:
Measurements from the Center
Active (latency, throughput)
• Measurement within Abilene
• Measurements to the edge
Passive
• SNMP stats (esp. core Abilene links)
• Variables via router proxy
• Router configuration
• Route state
• Characterization of traffic
– Netflow; OCxMON
Goal
Abilene’s goal: to be an exemplar
• Measurements open
• Tests possible to router nodes
• Throughput tests routinely through
backbone
• …as well as existing utilization, etc.
Abilene: Machines
GigE connected high-performance tester
• bwctl, “nms1”, 9000 byte MTU
Latency tester
• owamp, “nms4”, 100bT
Stats collection
• SNMP, flow-stats, “nms3”, 100bT
Ad-hoc tests
• NDT server, “nms2”, gigE, 1500 byte MTU
Throughput
Take tests 1/hr, 20 seconds each
• IPv4 TCP
• IPv6 TCP (no discernible difference)
• IPv4 UDP (on our platforms flaky at 1G)
• IPv6 UDP (ditto)
Others test to our nodes
Others test amongst themselves
Net result: 25% of traffic (NOT capacity) is
measurement
Latency
CDMA used to synchronize NTP
• www.endruntechnologies.com
Test among all router node pairs
10/sec
IPv4 and IPv6
Minimal sized packets
Poisson schedule
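A Poisson schedule means the gaps between probe packets are drawn from an exponential distribution, so probes do not fall into step with periodic behavior in the network. A minimal sketch of generating send times at an average of 10 per second follows; the function name and the seed parameter are illustrative.

```python
import random

def poisson_schedule(rate_per_s, count, seed=None):
    """Return `count` send times (seconds) with exponential gaps.

    The mean gap is 1 / rate_per_s; seed makes the schedule repeatable.
    """
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(count):
        t += rng.expovariate(rate_per_s)
        times.append(t)
    return times

if __name__ == "__main__":
    for t in poisson_schedule(rate_per_s=10, count=5, seed=1):
        print(f"send at {t:.3f} s")
```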
Passive - Utilization
The Abilene NOC takes
• Packets in, out
• Bytes in, out
• Drops/Errors
• …for all interfaces; publishes internal links
& peering points (at 5 min intervals)
• …via SNMP polling, every 60 sec
http://hydra.uits.iu.edu/~abilene/traffic/
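Utilization is derived from the difference between two successive counter readings. The sketch below turns two octet-counter samples into a fraction of link capacity and allows for a single counter wrap between polls. The function name and parameters are illustrative; the SNMP polling itself is not shown.

```python
def utilization(prev_octets, curr_octets, interval_s, link_bps, counter_bits=32):
    """Fraction of link capacity used between two octet-counter samples.

    Handles at most one counter wrap between the samples.
    """
    modulus = 1 << counter_bits
    delta = (curr_octets - prev_octets) % modulus
    return delta * 8 / interval_s / link_bps

if __name__ == "__main__":
    # 75,000,000 octets in 60 s on a 100 Mbps link.
    print(f"{utilization(0, 75_000_000, 60, 100e6):.1%}")
    # Counter wrapped between polls: 1000 octets before the wrap, 500 after.
    print(utilization(2**32 - 1000, 500, 60, 100e6))
```

A 32-bit octet counter wraps in about 34 seconds at a sustained 1 Gbps, so with 60-second polling on gigabit links the 64-bit counters (counter_bits=64) are the safer choice.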
Abilene Pointers
http://www.abilene.iu.edu/
• Monitoring
• Tools
http://www.itec.oar.net/abilene-netflow
http://netflow.internet2.edu/weekly/
(summaries)
Outline
Problems, typical causes, diagnostic
strategies
Tools: First mile, host issues
Tools: Path issues
Tools: Others to be aware of
Tools within Abilene
Wrap-up
End-to-End Measurement
Infrastructure vision
Ongoing monitoring to test major
elements, and (some, important) end-to-end paths.
• Elements: gigaPoP links, peering, …
• Utilization
• Delay
• Loss
• Occasional throughput
• Multicast connectivity
End-to-End Measurement
Infrastructure Vision II
There are many more paths end to end
than can be monitored.
Diagnostic tools available on-demand
(with authorization)
• Show routes
• Perform flow tests (perhaps app tests)
• Parse/debug flows (à la tcpdump or
OCxMON with heuristic tools)
What Campuses Can Do
Export SNMP data
• I have an “Internet2 list”, can add you
• Monitor loss as well as throughput
Performance test point at campus edge
• Hopefully, the result of today’s workshop
• Possibly also traceroute “looking glass”
• Commercial (e.g., NetIQ) complements
• We have a master list
Acknowledgements
The original presentation was by Matt Zekauskas,
using ideas inspired by material from NLANR
DAST, Matt Mathis, and others.
Copyright Internet2 2005, All Rights
Reserved.
Your mileage may vary. Caveat Emptor. It’s a
desert topping and a floor wax. They all do
that. It’s a feature. It’s wafer-thin. Sleep is for
the weak. Coffee won’t hurt you, look what it’s
done for meeee…
www.internet2.edu