Network Troubleshooting Tools
Network
Troubleshooting Tools
Kent Reuber
ITS Networking
[email protected]
April 6, 2007
Outline
What problems do you need to
solve?
Tool descriptions
Q&A time
Tool descriptions are in the
“Software” section of the LNA
Guide:
http://lnaguide/software.html
What are the problems?
Are hosts online? (ping)
How do you get to hosts? (traceroute)
What are hosts running? (nmap)
Where/when have hosts been seen?
(ipm)
“The network is slow” (Netspeed, iperf)
DHCP and DNS (SUNet reports)
Wireless problems (various)
Packet sniffing (wireshark), and batch
NetDB changes (NetDB CLI)
Ping and traceroute
Ping: Are you there?
Ping sends ICMP echo requests to a host and asks
for a reply. Reply time is also returned.
Some hosts may choose not to reply by security
policy. It may not mean that they’re down.
Stanford de-prioritizes pings at some of our
borders, so a long ping time or dropped pings
does not indicate a poor connection.
Stanford maintains a special host:
“ping-me.stanford.edu”
Exempt from ping filter.
Have outside users ping “ping-me” if they claim that
connections to Stanford are unavailable or slow.
Ping for Advanced Users
Can increase packet size to see
duplex errors. (Unix: ping -s)
Default small (<60 byte) ping packets
don’t generate enough traffic to show
duplex problems.
Try using pings of 1000+ bytes.
Use nmap or similar utility for “ping
sweeps” of entire networks:
“nmap -sP <network range>” (Ex.
“nmap -sP 171.64.18.0/23”)
Nmap: http://insecure.org/
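To see what a ping sweep of a range like the example above would cover, here is a minimal Python sketch that expands a CIDR range into its usable host addresses. The function name `sweep_targets` is made up for illustration; it only lists addresses and sends no packets.

```python
import ipaddress

def sweep_targets(cidr):
    """Return the usable host addresses in a network range as strings."""
    network = ipaddress.ip_network(cidr, strict=True)
    # hosts() skips the network and broadcast addresses
    return [str(host) for host in network.hosts()]

targets = sweep_targets("171.64.18.0/23")
print(len(targets), targets[0], targets[-1])
```

A /23 spans two adjacent /24s, so the sweep covers 171.64.18.x and 171.64.19.x.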
Traceroute: How do I get
there?
How traceroute works:
Source sends a series of packets with increasing time-to-live (TTL) values. (TTL is the allowed number of router hops.)
Unix/Mac: UDP, Windows “tracert”: ICMP.
Each router decrements the TTL and responds with an ICMP
“time exceeded” message when the TTL reaches 0.
Like ping, the round-trip time for each reply is reported.
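The TTL mechanism can be sketched as a small simulation. This is an illustration only, not a real traceroute: no packets are sent, the router names are hypothetical, and it assumes every hop answers. Routers report an expired TTL with an ICMP "time exceeded" message.

```python
def simulate_traceroute(path, max_ttl=30):
    """Simulate TTL-limited probes along a non-empty `path`.

    `path` lists the routers in order, with the destination last.
    Returns (ttl, node, reply) tuples, one per probe.
    """
    hops = []
    last = len(path) - 1
    for ttl in range(1, max_ttl + 1):
        remaining = ttl
        for index, node in enumerate(path):
            if index == last:
                # Probe reached the destination before the TTL ran out
                hops.append((ttl, node, "destination reply"))
                break
            remaining -= 1  # each router decrements the TTL
            if remaining == 0:
                hops.append((ttl, node, "ICMP time exceeded"))
                break
        if hops[-1][2] == "destination reply":
            break
    return hops

for hop in simulate_traceroute(["router-a", "router-b", "target-host"]):
    print(hop)
```

Each probe gets one hop further than the last, which is how the source learns the path one router at a time.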
Traceroute notes
Routers need not reply to traceroutes.
Lack of a reply does not mean that the
router is down.
Return traffic doesn’t necessarily use the
same path.
This can cause problems with firewalls and
packet shapers that assume they see the
whole conversation.
When troubleshooting connection problems,
you may want to have the destination send
traceroutes to you as well.
nmap
Nmap: Scanning nets
In addition to ping scans, you can
scan for open ports on hosts.
This can be useful for seeing who is
running a service (intentionally or
otherwise!)
My recipe for scanning for open
TCP ports:
“nmap -P0 -sT <net> -p <ports> -oG - | grep open”
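If you want more than grep, the grepable output can be post-processed. The Python sketch below assumes each line has tab-separated "Host:" and "Ports:" fields, with port entries laid out as port/state/protocol/...; the sample line and the address 192.0.2.10 are invented for illustration, so check them against your own nmap output.

```python
def open_ports(grepable_line):
    """Return (host, [open port numbers]) from one line of `nmap -oG -` output."""
    host = None
    ports = []
    for field in grepable_line.strip().split("\t"):
        if field.startswith("Host:"):
            host = field.split()[1]
        elif field.startswith("Ports:"):
            for entry in field[len("Ports:"):].split(","):
                parts = entry.strip().split("/")
                # parts[0] is the port number, parts[1] is its state
                if len(parts) >= 2 and parts[1] == "open":
                    ports.append(int(parts[0]))
    return host, ports

# Hypothetical sample line, not real scan output
sample = ("Host: 192.0.2.10 ()\t"
          "Ports: 22/open/tcp//ssh///, 80/closed/tcp//http///, 443/open/tcp//https///")
print(open_ports(sample))
```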
Getting nmap
Download from
http://insecure.org
Unix and MacOS X usually
require compiling from source.
Windows binary available.
ipm
IPM: IP <-> MAC addresses
Stanford-specific utility
How it works:
Devices broadcast ARP packets when
they need to communicate locally.
Routers see these ARP packets and cache them.
Information is periodically harvested
and kept in a database.
Using IPM, you can track when an
IP/MAC was first and last seen and
where.
IPM: What’s it good for?
You can find MAC addresses
which aren’t in Netdb.
Find out where a particular
device has been seen.
See if multiple devices are
using a single IP address.
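The "multiple devices on one IP" check amounts to grouping sightings by IP address. A minimal Python sketch follows; the (IP, MAC) pairs are hypothetical and this is not IPM's actual output format.

```python
from collections import defaultdict

def shared_ips(sightings):
    """Return {ip: [macs]} for every IP seen with more than one MAC address."""
    macs_by_ip = defaultdict(set)
    for ip, mac in sightings:
        macs_by_ip[ip].add(mac.lower())  # ignore case differences in MACs
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}

# Hypothetical sightings for illustration
sightings = [
    ("171.64.18.20", "00:0b:db:11:22:33"),
    ("171.64.18.20", "00:0B:DB:11:22:33"),  # same card, different case
    ("171.64.18.21", "00:0b:db:aa:bb:cc"),
    ("171.64.18.21", "00:11:22:33:44:55"),  # second device on the same IP
]
print(shared_ips(sightings))
```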
More on IPM
Where is it:
AFS: /usr/pubsw/sbin/ipm
Note: this directory is not in your
default PATH.
Using IPM:
Wildcards: “_” (single
character), “%” (multiple
characters)
Run “ipm -h” to see list of options.
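To get a feel for the two wildcards, here is a Python sketch that translates them into a regular expression. It assumes SQL LIKE-style semantics ("_" is exactly one character, "%" is zero or more) and case-insensitive matching; how IPM itself evaluates patterns may differ.

```python
import re

def wildcard_to_regex(pattern):
    """Translate '_' and '%' wildcards into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")    # exactly one character
        elif ch == "%":
            parts.append(".*")   # any run of characters (assumed to include none)
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matches(pattern, value):
    """True if the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None

print(matches("00:0b:db:%", "00:0B:DB:12:34:56"))
print(matches("171.64.18._", "171.64.18.55"))
```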
MAC vendor codes
MAC addresses are 48-bit (6 bytes)
xx:xx:xx:xx:xx:xx, where each “x” is a
hexadecimal digit 0-9,a-f.
First 3 bytes are the Organizationally
Unique Identifier (OUI), which tells you who
made the network card.
Can look this up. My favorite site:
http://www.coffer.com/mac_find/
Can tell you when NetDB records are
outdated. For example, a NetDB record
for a Macintosh with a MAC address
beginning 00:0b:db (Dell) is clearly wrong.
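Pulling the OUI out of a MAC address is easy to script. A minimal Python sketch follows; the `oui` helper and the one-entry vendor table are illustrative only (the Dell prefix is the one from the example above), and real lookups need a full OUI list.

```python
import re

def oui(mac):
    """Return the first 3 bytes of a MAC address as 'xx:xx:xx' (lowercase)."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()  # drop ':', '-', '.' separators
    if len(digits) != 12:
        raise ValueError("expected 12 hex digits, got %r" % mac)
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))

# Tiny illustrative table; not a complete vendor list
VENDORS = {"00:0b:db": "Dell"}

mac = "00:0B:DB:12:34:56"
print(oui(mac), VENDORS.get(oui(mac), "unknown"))
```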
Netspeed and Iperf
Netspeed & Iperf: Speed
testing
Often hear “the network is slow”.
Is it the client, the network or a server?
Where’s the bottleneck?
Useful tools:
Netspeed (Web based speed to
campus backbone).
Iperf (command line tool for point-to-point).
Netspeed
Web based speed testing to
Stanford backbone:
http://netspeed.stanford.edu/
or http://iperf.stanford.edu/
Useful for finding duplex errors
(misconfigured hubs or
switches) in the path.
Iperf
Command line testing tool.
Can also run speed tests against
netspeed.stanford.edu and
iperf.stanford.edu
Can be run in server mode for
testing speed between arbitrary
points (e.g., within your network)
http://dast.nlanr.net/Projects/Iperf/
How fast can you go?
DSL: 1 Mbps (asymmetric)
802.11b wireless: 1-5 Mbps
802.11g wireless: 1-12 Mbps
Fast Ethernet: 80+ Mbps
Gigabit: ??
Note: consider these tests as upper
bounds. For gigabit especially, you
may not be able to transfer real
data this fast.
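To turn these rates into something users can relate to, estimate a best-case transfer time. The Python sketch below assumes 1 MB = 1,000,000 bytes and ignores protocol overhead, so real transfers will take longer.

```python
def transfer_seconds(size_megabytes, rate_mbps):
    """Best-case seconds to move `size_megabytes` at `rate_mbps` (megabits/s)."""
    bits = size_megabytes * 8_000_000   # 1 MB taken as 1,000,000 bytes
    return bits / (rate_mbps * 1_000_000)

# A 100 MB file at a few of the rates listed above
for label, rate in [("DSL", 1), ("802.11g (best case)", 12), ("Fast Ethernet", 80)]:
    print(label, round(transfer_seconds(100, rate), 1), "seconds")
```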
DHCP
Troubleshooting DHCP
Many things can go wrong.
Problems are rarely caused by
DHCP server unavailability.
Things to check:
What IP is the host getting?
Netdb record for the host.
DHCP server logs, roaming pool
utilization reports.
Understanding DHCP
Stanford has two DHCP servers: dusk and dawn.
Info from Netdb is uploaded approximately every 15
minutes. Give Netdb the time to upload data.
At Stanford, MAC address information is required for
successful DHCP.
Initial DHCP is a four step process using broadcasts;
renews are different.
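For reference, the four steps of an initial DHCP exchange are the standard DISCOVER, OFFER, REQUEST, ACK sequence. The Python sketch below just lists them in order; the one-line descriptions are paraphrases, not server log text.

```python
# The standard four-message exchange for a client with no address yet
DHCP_HANDSHAKE = [
    ("DHCPDISCOVER", "client", "asks whether any DHCP server is available"),
    ("DHCPOFFER", "server", "offers an address and lease terms"),
    ("DHCPREQUEST", "client", "asks for the offered address"),
    ("DHCPACK", "server", "confirms the lease"),
]

for step, (message, sender, purpose) in enumerate(DHCP_HANDSHAKE, start=1):
    print(step, message, "from", sender + ":", purpose)
```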
Leases
DHCP addresses are valid for a
limited period (wired and wireless).
Normal DHCP: 2 days
Roaming DHCP: 42 minutes
Hosts will re-confirm their leases
halfway through the lease period.
Clients use unicast directly to the
DHCP server (clients have an address
and they know who their server is).
Renew message type is used.
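The halfway rule gives the renewal times directly. A small Python sketch, using the lease lengths quoted above (the names `LEASES` and `renewal_delay` are made up for illustration):

```python
from datetime import timedelta

# Lease lengths as quoted on this slide
LEASES = {
    "normal": timedelta(days=2),
    "roaming": timedelta(minutes=42),
}

def renewal_delay(lease):
    """Time after lease start at which the host re-confirms (halfway point)."""
    return lease / 2

for kind, lease in LEASES.items():
    print(kind, "lease", lease, "renews after", renewal_delay(lease))
```

So a normal host renews about once a day, and a roaming host about every 21 minutes.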
DHCP roaming
If the Netdb record has a “home” IP address
appropriate for the network where the device is
located, DHCP servers will send it.
Can have “home” IP addresses and still be able to
roam to other networks.
Can have multiple “home” addresses bound to
each MAC address.
If no appropriate address is entered, DHCP will
look for available roaming addresses on the local
network.
The number of roaming addresses is specified by the LNA.
Defined in the Netdb network record.
Usually there are only a handful of roaming
addresses. Can easily run out of them.
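The selection logic described above can be sketched in Python as follows. This is a simplified model, not the DHCP servers' actual code: the function name, arguments, and addresses are hypothetical.

```python
import ipaddress

def choose_address(home_ips, local_net, free_roaming):
    """Pick an address the way the slide describes.

    home_ips:     "home" addresses bound to the MAC in NetDB
    local_net:    the network where the device is located (CIDR)
    free_roaming: roaming addresses still available on that network
    """
    network = ipaddress.ip_network(local_net)
    for ip in home_ips:
        if ipaddress.ip_address(ip) in network:
            return ip, "home"          # a home address fits this network
    if free_roaming:
        return free_roaming[0], "roaming"
    return None, "no free leases"      # roaming pool is exhausted

print(choose_address(["171.64.18.20"], "171.64.18.0/23", []))
print(choose_address(["171.64.30.5"], "171.64.18.0/23", ["171.64.19.200"]))
print(choose_address([], "171.64.18.0/23", []))
```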
What address did you get?
The address received may tell you what
the problem is.
Self assigned (169.254.*.*):
NetDB record not set up properly.
No roaming address available.
Routing or DHCP server problem (less likely).
10.x.x.x:
Used by Network self-registration system.
(SNSR)
Could also be used by a rogue.
192.168.*.*:
Probably a rogue DHCP server.
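This triage can be captured in a few lines of Python. A minimal sketch: the hint text paraphrases the slide, and the function name `diagnose` is made up for illustration.

```python
import ipaddress

# (network, hint) pairs taken from the cases on this slide
HINTS = [
    (ipaddress.ip_network("169.254.0.0/16"),
     "self-assigned: check NetDB record and roaming pool, then routing/DHCP server"),
    (ipaddress.ip_network("10.0.0.0/8"),
     "self-registration system (SNSR), or possibly a rogue DHCP server"),
    (ipaddress.ip_network("192.168.0.0/16"),
     "probably a rogue DHCP server"),
]

def diagnose(ip):
    """Return a troubleshooting hint for the address a host received."""
    address = ipaddress.ip_address(ip)
    for network, hint in HINTS:
        if address in network:
            return hint
    return "no hint from the address alone"

print(diagnose("169.254.10.1"))
print(diagnose("192.168.1.5"))
```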
Finding rogues
Try pinging the gateway that’s
being distributed.
Use “arp” command to get the
MAC address of the gateway. Or
use a sniffer if you have one.
Look at switch MAC tables and find
the offending hosts. Shut off the
port or go have a “chat”.
New Net-to-Switch configs block
rogue DHCP servers!
Available DHCP reports
DHCP logs for a given host.
Type in MAC address and see the conversation.
Takes practice to read.
Roaming address utilization
How many roaming addresses were used in a day.
DHCP reports from dusk and dawn
Hourly logs show number of DHCP messages for
hosts.
“No free leases” may indicate that you’re out of
roaming addresses.
All reports are linked from LNA Guide software
section: http://lnaguide/software.html
DNS
DNS at Stanford
Host information is entered in NetDB
Uploads to DHCP servers about every
15 minutes.
Uploads to DNS servers about every
hour.
Starts at 5 minutes after the hour.
Takes about 20 minutes. Should be done
by 30 minutes past the hour.
Specific info on timing is kept in the NetDB
help files.
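A rough way to answer "when will my change show up in DNS?" is sketched below in Python. It assumes a change saved after an upload has started waits for the next hourly run, and it uses the approximate 5-past and 30-past times above; the NetDB help files remain the authority on exact timing.

```python
from datetime import datetime, timedelta

def dns_visible_by(change_time):
    """Estimate when a NetDB change should be live in DNS."""
    # The upload run in the same hour starts at 5 minutes past
    start = change_time.replace(minute=5, second=0, microsecond=0)
    if change_time > start:
        start += timedelta(hours=1)  # missed this run; wait for the next one
    return start.replace(minute=30)  # uploads should finish by half past

print(dns_visible_by(datetime(2007, 4, 6, 10, 2)))
print(dns_visible_by(datetime(2007, 4, 6, 10, 7)))
```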
DNS inspection tools
Standard: “host”, “nslookup”, “dig”.
Stanford whois can show you most NetDB
information:
“whois -h whois.stanford.edu <query>”
Use “%” and “_” as wildcards as per ipm.
Great for people who need “read-only”
access, since you don’t need a NetDB
account.
For host names, you need to end query in a “.”
or specify “.stanford.edu” so that whois knows
you want information on a host.
Wireless
Wireless problems
Wireless is slow or unavailable.
Reports can be vague. “Wireless is
slow on the 2nd floor.”
Isolating the problem can speed
resolution.
Exactly where is the problem
occurring?
What access point is the user
connecting to?
Do others have problem in the area?
Wireless tools
Access point association:
Mac: Internet Connect utility
PC: ??
Access point discovery for seeing
available AP’s and channels:
NetStumbler, iStumbler
Iperf and Netspeed are useful for
checking speed problems.
Often, an AP reboot will solve the problem.
AP jack (tso) information is in Netdb.
Can unplug and replug if necessary.
Packet sniffer
EtherPeek and Wireshark
Stanford has site license for
Etherpeek, but it’s still expensive.
Wireshark (formerly Ethereal) is free.
(Motto: “Sniff free or die!”)
X windows application for Unix/Mac.
Binary for Windows.
http://www.wireshark.org/
Some books are available!
Advice on Sniffing
Need for a sniffer is rare, but
invaluable when you need it.
Learn to use it before you need it!
You will need to set up special
“span” ports on your switches to
see all traffic.
No need if you’re interested in
broadcasts and multicasts.
Most useful for seeing traffic entering
and leaving your net.
NetDB Command Line
NetDB CLI overview
Designed for power users.
Provides a subset of NetDB
functionality (mostly nodes) for
batch changes. New features
are periodically added.
Use with caution. Try one or
two hosts before doing big
batches.
How to run NetDB CLI
Located in AFS space:
/usr/pubsw/sbin/netdb (note: this
directory is probably not in your PATH)
Use -h option to get command syntax
Stuff you can do (to a single
machine or list of machines):
Change administrators, locations.
Change IP addresses.
Delete nodes.
Q&A??