Troubleshooting Methods

Network Troubleshooting
Accessing the WAN – Chapter 8
Modified by Tony Chen
04/28/2009
ITE I Chapter 6
© 2006 Cisco Systems, Inc. All rights reserved.
Cisco Public
1
Notes:

If you see any mistake on my PowerPoint slides or if
you have any questions about the materials, please
feel free to email me at [email protected].
Thanks!
Tony Chen
College of DuPage
Cisco Networking Academy
Objectives

In this chapter, you will learn to:
– Establish and document a network baseline.
– Describe the various troubleshooting methodologies and
troubleshooting tools.
– Describe the common issues that occur during WAN
implementation.
– Identify and troubleshoot common enterprise network
implementation issues using a layered model approach.
Documenting Your Network
 To efficiently diagnose and correct network problems,
a network engineer needs to know the network baseline.
–This information is captured in documentation.
 Network documentation includes three components:
1. Network configuration table
2. End-system configuration table
3. Network topology diagram
1. Network Configuration Table
–Contains up-to-date records of hardware and software
•Type of device, model designation
•IOS image name
•Device network hostname
•Location of the device (building, floor, room, rack, panel)
•If it is a modular device, include all module types and in
which module slot they are located
•Data link layer addresses
•Network layer addresses
•Any additional important information about physical aspects
of the device
Documenting Your Network
2. End-system Configuration Table
–Contains baseline records used in end-system devices
such as servers, and desktop workstations.
•Device name (purpose)
•Operating system and version
•IP address
•Subnet mask
•Default gateway, DNS server, and WINS server addresses
•Any high-bandwidth network applications that the end system runs
3. Network Topology Diagram
–Graphical representation of a network, which illustrates
how each device in a network is connected and its
logical architecture.
–Routing protocols can also be shown.
•Symbols for all devices and how they are connected
•Interface types and numbers
•IP addresses
•Subnet masks
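The two configuration tables described above can be kept as structured records rather than free-form notes. The following is a minimal Python sketch that is not part of the original slides. The class names, field names, and sample values are hypothetical and simply mirror the bullet items.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkDeviceRecord:
    """One row of the network configuration table (field names are illustrative)."""
    hostname: str
    device_type: str
    model: str
    ios_image: str
    location: str
    # interface name -> (data link layer address, network layer address)
    interfaces: dict = field(default_factory=dict)
    # module slot -> module type, for modular devices
    modules: dict = field(default_factory=dict)
    notes: str = ""


@dataclass
class EndSystemRecord:
    """One row of the end-system configuration table (field names are illustrative)."""
    name: str
    purpose: str
    operating_system: str
    ip_address: str
    subnet_mask: str
    default_gateway: str
    dns_server: str = ""
    wins_server: str = ""
    high_bandwidth_apps: list = field(default_factory=list)


# Hypothetical sample entries
r1 = NetworkDeviceRecord(
    hostname="R1",
    device_type="router",
    model="example-model",
    ios_image="example-ios-image.bin",
    location="Building A, floor 1, room 101, rack 2",
    interfaces={"Fa0/0": ("0000.0c12.3456", "192.168.10.1/24")},
)
web1 = EndSystemRecord(
    name="WEB1",
    purpose="web server",
    operating_system="example OS 1.0",
    ip_address="192.168.10.10",
    subnet_mask="255.255.255.0",
    default_gateway="192.168.10.1",
)
print(r1.hostname, web1.default_gateway)
```

Keeping the records in one place like this makes it easier to compare documented values with what the devices report later.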
Network Documentation Process
 When you document your network, you may have to
gather information directly from routers and switches.
 Commands that are useful to the network
documentation process include:
–The ping command is used to test connectivity with
neighboring devices. Pinging other PCs in the network
also initiates the MAC address auto-discovery process.
–The telnet command is used to log in remotely to a
device for accessing configuration information.
–The show ip interface brief command is used to display the up
or down status and IP address of all interfaces.
–The show ip route command is used to display the
routing table in a router to learn the directly connected
neighbors, more remote devices (through learned
routes), and the routing protocols.
–The show cdp neighbor detail command is used to
obtain detailed information about directly connected
Cisco neighbor devices.
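Output gathered with these commands can be turned into documentation records with a small script. The sketch below parses text in the usual column layout of show ip interface brief. The sample output and the function name are illustrative assumptions, not captured from a real device.

```python
SAMPLE_OUTPUT = """\
Interface              IP-Address      OK? Method Status                Protocol
FastEthernet0/0        192.168.10.1    YES manual up                    up
Serial0/0/0            10.1.1.1        YES manual up                    down
Serial0/0/1            unassigned      YES unset  administratively down down
"""


def parse_ip_interface_brief(text):
    """Return a list of dicts, one per interface row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row
        if len(fields) < 6 or fields[0] == "Interface":
            continue
        rows.append({
            "interface": fields[0],
            "ip_address": fields[1],
            # Status may be two words ("administratively down")
            "status": " ".join(fields[4:-1]),
            "protocol": fields[-1],
        })
    return rows


interfaces = parse_ip_interface_brief(SAMPLE_OUTPUT)
not_up = [row["interface"] for row in interfaces
          if row["status"] != "up" or row["protocol"] != "up"]
print(not_up)
```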
Why is Establishing a Baseline Important?
 Establishing a network performance baseline requires
collecting key performance data from the ports and
devices that are essential to network operation.
–How does the network perform during a normal or
average day?
• Measuring the initial performance allows a network
administrator to determine the difference between
abnormal behavior and proper network performance.
–Where are the underutilized and over-utilized areas?
• It may also reveal areas in the network that are
underutilized and quite often can lead to network redesign
efforts based on quality and capacity observations.
–Where are the most errors occurring?
• In addition, analysis after an initial baseline tends to
reveal hidden problems.
–What thresholds should be set for the devices that
need to be monitored?
–Can the network deliver the identified policies?
• The baseline also provides insight into whether the
current network design can deliver the required policies.
Steps for Establishing a Network Baseline
3 steps for planning the first baseline:
 Step 1. Determine what types of data to collect
–When conducting the initial baseline, start by selecting
a few variables that represent the defined policies. If too
many data points are selected, the amount of data can
be overwhelming.
• Generally, some good measures are interface utilization
and CPU utilization.
 Step 2. Identify devices and ports of interest
–Devices and ports of interest include:
• Network device ports that connect to other network
devices
• Servers
• Key users
• Anything else considered critical to operations.
–By narrowing the ports polled, the results are concise,
and network management load is minimized.
Steps for Establishing a Network Baseline
 Step 3. Determine the baseline duration
–This period should be at least seven days to
capture any daily or weekly trends.
–A baseline needs to last no more than six weeks.
–Generally, a two-to-four-week baseline is
adequate.
• The figure shows examples of several
screenshots of CPU utilization trends captured
over a daily, weekly, monthly, and yearly period.
• The work week trends are too short to accurately
reveal the recurring nature of the utilization surge
that occurs every weekend when a database
backup operation consumes network bandwidth.
• The yearly trend shown in the example is too long
a duration to provide meaningful baseline
performance details.
–Baseline analysis of the network should be
conducted on a regular basis.
• Analysis must be conducted regularly to
understand how the network is affected by growth
and other changes.
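The point about baseline duration can be illustrated with a short Python sketch. It averages utilization samples by day of the week, so that a recurring weekend surge becomes visible, and checks that the collection period falls between seven days and six weeks. The sample data, function names, and the noon sampling time are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def average_by_weekday(samples):
    """samples: iterable of (datetime, utilization_percent) pairs."""
    buckets = defaultdict(list)
    for timestamp, utilization in samples:
        buckets[WEEKDAYS[timestamp.weekday()]].append(utilization)
    return {day: sum(values) / len(values) for day, values in buckets.items()}


def duration_in_range(samples, min_days=7, max_days=42):
    """True if the samples cover at least 7 and at most 42 calendar days."""
    dates = [timestamp.date() for timestamp, _ in samples]
    covered_days = (max(dates) - min(dates)).days + 1
    return min_days <= covered_days <= max_days


# Hypothetical data: one noon sample per day for two weeks, with a
# Saturday surge standing in for the weekend database backup.
start = datetime(2009, 4, 6, 12, 0)
samples = []
for offset in range(14):
    moment = start + timedelta(days=offset)
    samples.append((moment, 80.0 if moment.weekday() == 5 else 30.0))

averages = average_by_weekday(samples)
print(averages["Saturday"], averages["Monday"])
print(duration_in_range(samples))      # 14 days: inside the 7 to 42 day window
print(duration_in_range(samples[:5]))  # 5 days: too short
```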
Measuring Network Performance Data
 Sophisticated network management software is
often used to baseline large networks.
– For example, the Fluke Network SuperAgent module
enables administrators to automatically create
reports using the Intelligent Baselines feature.
• This feature compares current performance levels
with historical observations and can automatically
identify performance problems and applications
that do not provide expected levels of service.
 In simpler networks, the baseline tasks may
require a combination of manual data collection
and simple network protocol inspectors.
– Hand collection using show commands on
individual network devices is extremely time
consuming and should be limited to mission-critical network devices.
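The idea of comparing current performance with historical observations can be sketched in a few lines of Python. This is not how any vendor feature works internally. The three-standard-deviation rule, the function names, and the sample readings are assumptions chosen only for illustration.

```python
from statistics import mean, stdev


def build_baseline(history):
    """history: list of past measurements, for example CPU utilization in percent."""
    return {"mean": mean(history), "stdev": stdev(history)}


def deviates(baseline, current, k=3.0):
    """Flag a reading more than k standard deviations from the baseline mean.

    The k-sigma rule is an assumption for illustration, not a vendor algorithm.
    """
    return abs(current - baseline["mean"]) > k * baseline["stdev"]


# Hypothetical hand-collected readings from one device
history = [22.0, 25.0, 24.0, 23.0, 26.0, 24.0, 25.0]
baseline = build_baseline(history)
print(deviates(baseline, 24.5))  # close to the historical mean
print(deviates(baseline, 90.0))  # far above the historical mean
```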
General Approach to Troubleshooting
 Using efficient troubleshooting techniques shortens overall
troubleshooting time.
 Two extreme approaches to troubleshooting almost always result
in disappointment, delay, or failure.
– At one extreme is the theorist, or rocket scientist, approach.
• The rocket scientist analyzes and reanalyzes the situation until the exact
cause at the root of the problem has been identified.
• While this process is fairly reliable, few companies can afford to have
their networks down for the hours or days it can take.
– At the other extreme is the impractical, or caveman, approach.
• The caveman's first instinct is to start swapping cards, cables, and
software until miraculously the network begins operating again.
• Although this approach may achieve a change in symptoms faster, it is not
reliable.
 The better approach is somewhere in the middle, using elements
of both.
– It is important to analyze the network as a whole rather than in a
piecemeal fashion.
– A systematic approach minimizes confusion and cuts down on time
otherwise wasted with trial and error.
Using Layered Models for Troubleshooting
OSI Versus TCP/IP Layered Models
 OSI Reference Model
–The upper layers (5-7) deal with application issues and
are implemented only in software.
–The lower layers (1-4) handle data-transport issues.
•Layers 3 and 4 are generally implemented only in software.
•The physical layer (Layer 1) and data link layer (Layer 2)
are implemented in hardware and software.
 TCP/IP Model
–The application layer in the TCP/IP suite actually
combines the functions of the three OSI model layers:
session, presentation, and application.
–The transport layer of TCP/IP is responsible for
exchanging segments between devices.
–The Internet layer is responsible for placing messages in
a fixed format that allows devices to handle them.
–The network access layer communicates directly with
the network media and provides an interface between
the architecture of the network and the Internet layer.
General Troubleshooting Procedures
 The stages of the general troubleshooting process are:
–Stage 1 Gather symptoms - Troubleshooting begins with
the process of gathering and documenting symptoms from
the network, end systems, and users.
•Symptoms may appear in many different forms, including
alerts from the network management system, console
messages, and user complaints.
–Stage 2 Isolate the problem - The problem is not isolated
until a single problem, or a set of problems, is identified.
–Stage 3 Correct the problem - Having isolated and
identified the cause of the problem, the network
administrator works to correct the problem by
implementing, testing, and documenting a solution.
–If the network administrator determines that the
corrective action has created another problem,
the attempted solution is documented, the changes are
removed, and the network administrator returns to
gathering symptoms and isolating the problem.
Troubleshooting Methods
 There are three main methods for troubleshooting:
 Bottom-Up Troubleshooting Method
–In bottom-up troubleshooting you start with the physical
components of the network and move up through the layers.
•Bottom-up troubleshooting is a good approach to use when the
problem is suspected to be a physical one.
 Top-Down Troubleshooting Method
–In top-down troubleshooting you start with the end-user
applications and move down the layers of the OSI model.
•Use this approach for simpler problems or when you think the
problem is with a piece of software.
 Divide-and-Conquer Troubleshooting Method
–In divide-and-conquer troubleshooting you start by collecting
user experience of the problem, document the symptoms
and then, using that information, make an informed guess as
to which OSI layer to start your investigation.
•For example, if users can't access the web server and you can
ping the server, then you know that the problem is above Layer 3.
•If you can't ping the server, then you know the problem is likely at
a lower OSI layer.
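The ping example for divide-and-conquer can be written as a tiny decision helper. This Python sketch only encodes the reasoning in the bullets above. The function name and the returned wording are hypothetical.

```python
def suggest_starting_point(can_ping_server, can_use_application):
    """Suggest where to begin, following the ping example in the slide."""
    if can_use_application:
        return "no fault reproduced; gather more symptoms"
    if can_ping_server:
        # Layers 1-3 carried the echo request and reply, so look higher.
        return "start above Layer 3"
    # No echo reply: the fault is likely at Layer 3 or below.
    return "start at Layer 3 and work down"


print(suggest_starting_point(can_ping_server=True, can_use_application=False))
print(suggest_starting_point(can_ping_server=False, can_use_application=False))
```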
Guidelines for Selecting a Troubleshooting Method
 To quickly resolve network problems,
take the time to select the most effective
troubleshooting method.
–Use the process shown in the figure to
help you select the most efficient
troubleshooting method.
 For example: Two IP routers are not
exchanging routing information. The last
time this type of problem occurred it was
a protocol issue. So you choose the
divide-and-conquer troubleshooting
method.
Gathering Symptoms
 Step 1. Analyze existing symptoms
–Analyze symptoms gathered from the trouble ticket or
users to form a definition of the problem.
 Step 2. Determine ownership
–If the problem is within your system, move on to the next stage.
–If the problem is outside the boundary of your control (for
example, lost Internet connectivity), you need to contact
an administrator for the external system.
 Step 3. Narrow the scope
–Determine if the problem is at the core, distribution, or
access layer of the network.
 Step 4. Gather symptoms from suspect devices
–Use knowledge and experience to determine if the
problem is a hardware or software problem.
 Step 5. Document symptoms
–Sometimes the problem can be solved using the
documented symptoms. If not, begin the isolating phase
of the general troubleshooting process.
Gathering Symptoms
 Use the Cisco IOS commands to gather
symptoms about the network.
–Although the debug command is an
important tool for gathering symptoms, it
generates a large amount of console
message traffic, and the performance of a
network device can be noticeably affected.
–Make sure you warn network users that a
troubleshooting effort is underway and that
network performance may be affected.
–Remember to disable debugging when you
are done.
Gathering Symptoms: Questioning End Users
 When you question end users about a network problem they
may be experiencing, use effective questioning techniques.
 This way you will get the information you need to effectively
document the symptoms of a problem.
 The table in the figure provides some guidelines and example end-user questions.
Software Troubleshooting Tools
 NMS Tools
–Network management system (NMS) tools
include device-level monitoring, configuration,
and fault management tools.
–Network monitoring software graphically
displays a physical view of network devices,
allowing network managers to monitor remote
devices without physically checking them.
–Examples are CiscoView, HP OpenView,
SolarWinds, and WhatsUp Gold.
 Knowledge Bases
–On-line network device vendor knowledge
bases have become indispensable sources of
information.
–When vendor-based knowledge bases are
combined with Internet search engines like
Google, a network administrator has access to
a vast pool of experience-based information.
Software Troubleshooting Tools
 Baselining Tools
–Many tools for automating the network
documentation and baselining process are available.
–For example, they can help you draw network
diagrams, keep network software and hardware
documentation up to date, and cost-effectively
measure baseline network bandwidth use.
–The figure shows a screen capture of the SolarWinds
LANsurveyor and CyberGauge software.
 Protocol Analyzers
–A protocol analyzer decodes the various protocol
layers in a recorded frame and presents this
information in a relatively easy-to-use format.
–The figure shows a screen capture of the Wireshark
protocol analyzer.
–Most protocol analyzers can filter traffic that meets
certain criteria so that, for example, all traffic to and
from a particular device can be captured.
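The filtering idea can be shown without any capture library. The following Python sketch keeps only the frames sent to or from one device. The record type, field names, and addresses are hypothetical and do not represent Wireshark's data model or filter syntax.

```python
from dataclasses import dataclass


@dataclass
class CapturedFrame:
    """Simplified stand-in for a decoded frame (fields are illustrative)."""
    src_ip: str
    dst_ip: str
    protocol: str
    length: int


def traffic_for_device(frames, device_ip):
    """Keep only frames sent to or from device_ip."""
    return [f for f in frames if device_ip in (f.src_ip, f.dst_ip)]


capture = [
    CapturedFrame("192.168.10.10", "192.168.20.5", "TCP", 1500),
    CapturedFrame("192.168.10.11", "192.168.10.12", "UDP", 120),
    CapturedFrame("192.168.20.5", "192.168.10.10", "TCP", 60),
]
matches = traffic_for_device(capture, "192.168.20.5")
print(len(matches))
```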
Hardware Troubleshooting Tools
 Network Analysis Module
–A network analysis module (NAM) can be
installed in Cisco Catalyst 6500 series switches
and Cisco 7600 series routers to provide a
graphical representation of traffic.
 Digital Multimeters
–Digital multimeters (DMMs) are test instruments
that are used to directly measure electrical values
of voltage, current, and resistance.
 Cable Testers
–Cabling testers can be used to detect broken
wires, crossed-over wiring, shorted connections,
and improperly paired connections.
–These devices can be inexpensive continuity
testers, moderately priced data cabling testers, or
expensive time-domain reflectometers (TDRs).
•TDRs are used to test the distance to a break in a
cable.
•TDRs used to test fiber optic cables are known as
optical time-domain reflectometers (OTDRs).
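A TDR locates a break by timing a reflected pulse. As a rough sketch, the distance is half the round-trip time multiplied by the propagation speed in the cable. The Python below assumes the propagation speed is given as a velocity factor (a fraction of the speed of light) taken from the cable data sheet. The function name and the sample reading are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def distance_to_fault_m(round_trip_seconds, velocity_factor):
    """Distance to the reflection point in meters.

    The pulse travels to the fault and back, so the result is divided by two.
    velocity_factor is the cable's propagation speed as a fraction of c.
    """
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity_factor must be in (0, 1]")
    return velocity_factor * SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2


# Hypothetical reading: reflection seen 1 microsecond after the pulse,
# on a cable with an assumed velocity factor of 0.66
print(round(distance_to_fault_m(1e-6, 0.66), 1))
```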
Hardware Troubleshooting Tools
 Cable Analyzers
–Cable analyzers are multifunctional handheld devices that
are used to test and certify copper and fiber cables for
different services and standards.
–The more sophisticated tools include advanced
troubleshooting diagnostics that measure distance to
performance defect (NEXT, RL), identify corrective
actions, and graphically display crosstalk and impedance
behavior.
 Portable Network Analyzers
–Portable devices that are used for troubleshooting
switched networks and VLANs.
–By plugging the network analyzer in anywhere on the
network, a network engineer can see the switch port to
which the device is connected and the average and peak
utilization.
–The analyzer can also be used to discover VLAN
configuration, identify top network talkers, analyze
network traffic, and view interface details.
Troubleshooting Tools: Research Activity
 The following are links to various troubleshooting tools.
 Software Tools
– Network Management Systems:
•http://www.ipswitch.com/products/whatsup/index.asp?t=demo
•http://www.solarwinds.com/products/network_tools.aspx
– Baselining Tools:
•http://www.networkuptime.com/tools/enterprise
– Knowledge Bases:
•http://www.cisco.com
– Protocol Analyzers:
•http://www.flukenetworks.com/fnet/en-us/products/OptiView+Protocol+Expert/
 Hardware Tools
– Cisco Network Analysis Module (NAM):
•http://www.cisco.com/en/US/docs/net_mgmt/network_analysis_module_software/3.5/user/guide/user.html
– Cable Testers:
•http://www.flukenetworks.com/fnet/en-us/products/CableIQ+Qualification+Tester/Demo.htm
– Cable Analyzers:
•http://www.flukenetworks.com/fnet/en-us/products/DTX+CableAnalyzer+Series/Demo.htm
– Network Analyzers:
•http://www.flukenetworks.com/fnet/en-us/products/OptiView+Series+III+Integrated+Network+Analyzer/Demos.htm
WAN Communications
 WAN technologies function at the lower three
layers of the OSI reference model.
 A communications provider normally owns the
data links that make up a WAN.
–The links are made available to subscribers for a
fee and are used to interconnect LANs or connect
to remote networks.
–WAN data transfer speed (bandwidth) is
considerably slower than the common LAN
bandwidth.
–The charges for link provision are the major cost
element, therefore the WAN implementation must
aim to provide maximum bandwidth at acceptable
cost.
Steps in WAN Design
 Because WAN connectivity is both important to the business and expensive,
follow these steps when designing or modifying a WAN:
 Step 1. Locate LANs - Establish the source and destination
endpoints that will connect through the WAN.
 Step 2. Analyze traffic - Know what data traffic must be
carried, its origin, and its destination.
 Step 3. Plan the topology - A high requirement for availability
requires extra links that provide alternative data paths for
redundancy and load balancing.
 Step 4. Estimate the required bandwidth - Traffic on the links
may have varying requirements for latency and jitter.
 Step 5. Choose the WAN technology - Suitable link
technologies must be selected.
 Step 6. Evaluate costs - When all the requirements are
established, installation and operational costs for the WAN
can be determined and compared with the business need
driving the WAN implementation.
WAN Traffic Considerations
 The table in the figure shows
the wide variety of traffic
types and their varying
requirements of bandwidth,
latency, and jitter that WAN
links are required to carry.
–To determine traffic flow
conditions and timing of a WAN
link, you need to analyze the
traffic characteristics specific to
each LAN that is connected to
the WAN.
WAN Topology Considerations
 Designing a WAN topology consists of the following:
– Selecting an interconnection pattern or layout for the links
between the various locations
– Selecting the technologies for those links to meet the
enterprise requirements at an acceptable cost
• More links increase the cost of the network services, but
having multiple paths between destinations increases
reliability.
• Adding more network devices to the data path increases
latency and decreases reliability.
 Many WANs use a star topology.
– As the enterprise grows and new branches are added, the
branches are connected back to the head office,
producing a traditional star topology.
 Star endpoints are sometimes cross-connected, creating
a mesh or partial mesh topology.
– This provides for many possible combinations for
interconnections.
WAN Topology Considerations - Hierarchical
 When many locations must be joined, a hierarchical
solution is recommended.
 For example, imagine an enterprise that is
operational in every country of the European Union
and has a branch in every town with a population
over 10,000. Each branch has a LAN, and it has been
decided to interconnect the branches.
–A mesh network is clearly not feasible because there
would be hundreds of thousands of links.
–A three-layer hierarchy is often useful when the network
traffic mirrors the enterprise branch structure and is
divided into regions, areas, and branches.
•Group the LANs in each area and interconnect them to
form a region.
– The area could be based on the number of locations to be
connected with an upper limit of between 30 and 50.
– The area would have a star topology, with the hubs of the stars
linked to form the region.
•Interconnect the regions to form the core of the WAN.
– Regions could be geographic, connecting between three and 10
areas, and the hub of each region could be linked point-to-point.
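The claim that a full mesh is not feasible follows from simple counting: connecting every one of n sites to every other needs n(n-1)/2 links, while a single star needs only n-1. The Python sketch below makes the comparison. The branch count of 1000 is a hypothetical figure, not one given in the slides.

```python
def full_mesh_links(sites):
    """Every site linked directly to every other site: n(n-1)/2."""
    return sites * (sites - 1) // 2


def star_links(sites):
    """One hub with a link to each of the other sites: n-1."""
    return max(sites - 1, 0)


branches = 1000  # hypothetical branch count, not a figure from the slides
print(full_mesh_links(branches))  # 499500
print(star_links(branches))       # 999
```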
WAN Connection Technologies
 A typical private WAN uses a combination
of technologies that are usually chosen
based on traffic type and volume.
–ISDN, DSL, Frame Relay, or leased lines are
used to connect individual branches into an
area.
–Frame Relay, ATM, or leased lines are used
to connect external areas back to the
backbone.
–ATM or leased lines form the WAN backbone.
–Technologies that require the establishment
of a connection before data can be
transmitted, such as basic telephone, ISDN,
or X.25, are not suitable for WANs that
require rapid response time or low latency.
WAN Connection Technologies
 Frame Relay and ATM are examples of shared
networks.
–Because several customers are sharing the link, the cost
to each is generally less than the cost of a direct link of
the same capacity.
–Although ATM is a shared network, it has been designed
to produce minimal latency and jitter through high-speed
internal links sending easily manageable units of data,
called cells.
•ATM cells have a fixed length of 53 bytes, 48 bytes for data
and 5 bytes for the header. ATM is widely used for carrying
delay-sensitive traffic.
–Frame Relay may also be used for delay-sensitive
traffic, often using QoS mechanisms to give priority to
the more sensitive data.
 Leased lines are typically more expensive than
access links but are available at virtually any
bandwidth and provide very low latency and jitter
[They are not shared].
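The fixed ATM cell size mentioned above (53 bytes, of which 48 carry data) has a measurable cost in overhead. The Python sketch below estimates how many cells a block of data needs and what fraction of the transmitted bytes is user data. It is a simplification that ignores any adaptation-layer trailer and padding rules. The function names and the 1500-byte example are assumptions.

```python
import math

CELL_BYTES = 53
PAYLOAD_BYTES = 48
HEADER_BYTES = CELL_BYTES - PAYLOAD_BYTES  # 5


def cells_needed(data_bytes):
    """Cells required to carry data_bytes, padding the last cell."""
    return math.ceil(data_bytes / PAYLOAD_BYTES)


def wire_efficiency(data_bytes):
    """Fraction of transmitted bytes that are user data."""
    if data_bytes <= 0:
        raise ValueError("data_bytes must be positive")
    return data_bytes / (cells_needed(data_bytes) * CELL_BYTES)


print(cells_needed(1500))                   # 32
print(round(wire_efficiency(1500), 3))      # 0.884
print(round(HEADER_BYTES / CELL_BYTES, 3))  # 0.094
```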
WAN Topology Considerations
 Although the Internet may pose a
security problem, it does provide
an alternative for inter-branch
traffic.
–Part of the traffic that must be
considered during design is
going to or coming from the
Internet.
–Common implementations are
to have each network in the
company connect to a different
ISP, or to have all company
networks connect to a single
ISP from a core layer
connection.
WAN Bandwidth Considerations
 Many companies rely on the high-speed transfer of data between
remote locations.
–Consequently, higher bandwidth is
crucial because it allows more
data to be transmitted in a given
time.
–When bandwidth is inadequate,
competition between various types
of traffic causes response times to
increase, which reduces employee
productivity and slows down
critical web-based business
processes.
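The effect of bandwidth on transfer time can be estimated with one division: bits to send divided by link rate. The Python sketch below gives an idealized figure that ignores protocol overhead, latency, and competing traffic. The file size and link rates are hypothetical examples.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Ideal transfer time, ignoring protocol overhead, latency, and contention."""
    return size_bytes * 8 / link_bits_per_second


file_size = 100 * 10**6  # hypothetical 100 MB file
for name, bps in [("1.544 Mb/s", 1.544e6), ("10 Mb/s", 10e6), ("100 Mb/s", 100e6)]:
    print(f"{name}: {transfer_seconds(file_size, bps):.0f} s")
```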
Common WAN Implementation Issues
 The figure summarizes the common WAN
implementation issues and the questions you
need to answer before you can effectively
implement a WAN.
Case Study: WAN Troubleshooting From an ISP’s Perspective
 A significant proportion of the support calls received
by an ISP refer to slowness of the network. To
troubleshoot, you have to isolate the components:
–Individual PC host
•A large number of user applications open on the PC.
•Tools like Task Manager can help determine CPU utilization
–LAN
•If the customer LAN is frequently reaching 100 percent
utilization, this is an internal customer problem.
•This is why a network baseline is so important.
–Link from the edge of the user network to the edge
of the ISP
•This problem is the ISP's responsibility.
–Backbone of the ISP
•The ISP can determine which link is causing the problem.
–Server being accessed
•In some cases the slowness attributed to the
network may actually be caused by server congestion. This problem
is the hardest to diagnose, and it should be the last option
pursued after all other options have been eliminated.
Interpreting Network Diagrams to Identify Problems
 It is impossible to troubleshoot any type of network
connectivity issue without a network diagram.
 Physical Network Diagram
–A physical network diagram shows the physical layout of
the devices connected to the network.
•Device type
•Model and manufacturer
•Operating system version
•Cable type and identifier
•Cable specification
•Connector type
•Cabling endpoints
 Logical Network Diagram
–A logical network diagram shows how data is transferred
on the network.
•Device identifiers
•Site-to-site VPNs
•IP address and subnet
•Routing protocols
•Interface identifiers
•Static routes
•Connection type
•Data-link protocols
•DLCI for virtual circuits
•WAN technologies used
Symptoms of Physical Layer Problems
 The physical layer transmits bits from one computer to
another and regulates the transmission of a stream of
bits over the physical medium.
 Common symptoms of problems at the physical layer:
–Performance lower than baseline - If performance is
unsatisfactory all the time, the problem is probably related to
inadequate capacity.
–Loss of connectivity - If a cable or device fails, the most
obvious symptom is a loss of connectivity.
–High collision counts - Collisions are normally a more
significant problem on shared media. Collision-based
problems may be caused by a bad cable to a single station on a hub.
–Network bottlenecks or congestion - If an interface fails,
routing protocols may redirect traffic to other routes that are
not designed to carry the extra capacity.
–High CPU utilization rates - High CPU utilization rates are a
symptom that a device, such as a router, switch, or server,
is operating at or exceeding its design limits.
–Console error messages - Error messages reported on the
device console can indicate a physical layer problem.
Causes of Physical Layer Problems
 Power-related
– Power-related issues are the most fundamental reason for
network failure. Check the operation of the fans.
 Hardware faults
– Faulty NICs can be the cause of transmission errors due to late
collisions, short frames, and jabber.
 Cabling faults
– Many problems can be corrected by simply reseating cables that
have become partially disconnected.
– Look for damaged cables, improper cable types, and poorly
crimped RJ-45s.
 Attenuation
– Attenuation can be caused if a cable length exceeds the design
limit for the media (for example, an Ethernet cable is limited to
100 meters, or 328 feet).
 Interface configuration errors
– Many things can be misconfigured on an interface to cause it to go down.
•Serial links reconfigured as asynchronous instead of synchronous
•Incorrect clock rate
•Incorrect clock source
•Interface not turned on
Causes of Physical Layer Problems
 Noise
– Local electromagnetic interference (EMI) is known as noise.
There are four types of noise that are significant to networks:
•Impulse noise that is caused by voltage fluctuations or
current spikes induced on the cabling.
•Random (white) noise that is generated by many sources, such as FM
radio stations, police radio, and building security.
•Alien crosstalk, which is noise induced by other cables in
the same pathway.
•Near end crosstalk (NEXT), which is noise originating from
crosstalk from other adjacent cables or noise from nearby
electric cables, devices with large electric motors, or
anything that includes a transmitter more powerful than a
cell phone.
 Exceeding design limits
– A component may be operating suboptimally at the physical
layer because it is being utilized at a higher average rate than it
is configured to operate.
 CPU overload
– One of the causes of CPU overload in a router is high traffic. If
some interfaces are regularly overloaded with traffic, consider
redesigning the traffic flow in the network or upgrading the
hardware.
To isolate problems at the physical layer
 Check for bad cables or connections
–Verify that the cable is properly connected and is in good
condition. Your cable tester might reveal an open wire.
 Check that the correct cabling standard is adhered to
throughout the network
–Verify that the proper cable is being used. For example, in
the figure, the Fluke meter detected that although a cable was good
for Fast Ethernet, it is not qualified to support 1000BASE-T.
 Check that devices are cabled correctly
–Check that cables are connected to their correct ports.
• This is where having a neat and organized wiring closet saves
you a great deal of time.
 Verify proper interface configurations
–Check that all switch ports are set in the correct VLAN and
that speed and duplex settings are correctly configured.
 Check operational statistics and data error rates
–Use Cisco show commands to check for statistics such as
collisions and input and output errors.
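The error counters reported by show commands can be extracted with a short script and compared against the baseline. The Python sketch below pulls a few counters out of text shaped like show interfaces output for a single interface. The sample text is abbreviated and invented for illustration, and the function name is hypothetical.

```python
import re

# Abbreviated, invented sample in the general shape of "show interfaces" output
SAMPLE_OUTPUT = """\
FastEthernet0/1 is up, line protocol is up
     1200 packets input, 98000 bytes
     15 input errors, 12 CRC, 3 frame, 0 overrun, 0 ignored
     900 packets output, 76000 bytes, 0 underruns
     4 output errors, 27 collisions, 1 interface resets
"""

COUNTER_PATTERN = re.compile(r"(\d+) (input errors|CRC|output errors|collisions)")


def error_counters(show_interfaces_text):
    """Return {counter name: value} for one interface's output."""
    return {name: int(value)
            for value, name in COUNTER_PATTERN.findall(show_interfaces_text)}


counters = error_counters(SAMPLE_OUTPUT)
nonzero = {name: value for name, value in counters.items() if value > 0}
print(nonzero)
```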
Symptoms of Data Link Layer Problems
 Common symptoms at the data link layer include:
 No functionality or connectivity at network layer or above
–Some Layer 2 problems can stop the exchange of frames across a link.
 Network is operating below baseline performance levels
–There are two types of suboptimal Layer 2 operation:
• Frames take an illogical path to their destination but do arrive.
An example of a problem which could cause frames to take a
suboptimal path is a poorly designed Layer 2 spanning-tree.
• Some frames are dropped. An extended or continuous ping
also reveals if frames are being dropped.
 Excessive broadcasts
–Excessive broadcasts result from one of the following:
• Poorly programmed or configured applications
• Large Layer 2 broadcast domains
• Underlying network problems, such as STP loops.
 Console messages
–In some instances, a router recognizes a Layer 2 problem
has occurred and sends alert messages to the console.
–The most common console message is the line protocol down message.
Causes of Data Link Layer Problems
 Encapsulation errors
–This condition occurs when the encapsulation at
one end of a WAN link is configured differently from
the encapsulation used at the other end.
 Address mapping errors
–When using static maps in Frame Relay, an
incorrect map is a common mistake.
• Simple configuration errors can result in a mismatch
of Layer 2 and Layer 3 addressing information.
–In a dynamic environment, the mapping of Layer 2
and Layer 3 information can fail for the following
reasons:
• Devices may have been specifically configured not
to respond to ARP or Inverse-ARP requests.
• The Layer 2 or Layer 3 information that is cached
may have physically changed.
• Invalid ARP replies are received because of a
misconfiguration or a security attack.
Causes of Data Link Layer Problems
 Framing errors
–Frames usually work in groups of 8-bit bytes. A
framing error occurs when a frame does not end on
an 8-bit byte boundary. When this happens, the
receiver may have problems determining where one
frame ends and another frame starts.
• Framing errors can be caused by a noisy serial line,
an improperly designed cable (too long), or an
incorrectly configured CSU line clock.
 STP failures or loops
–Most STP problems revolve around these issues:
• Forwarding loops that occur when no port in a
redundant topology is blocked and traffic is
forwarded in circles indefinitely.
• Excessive flooding because of a high rate of STP
topology changes.
• Slow STP convergence, which can be caused by a
mismatch between the real and documented
topology or by a configuration error, such as an
inconsistent configuration of STP timers.
Troubleshooting Layer 2 - PPP
 Most problems with PPP involve link negotiation.
 The steps for troubleshooting PPP are as follows:
 Step 1. Check that the appropriate encapsulation
is in use at both ends.
–Use the show interfaces serial command.
–In the figure, the output reveals that R2 has been
incorrectly configured to use HDLC encapsulation.
 Step 2. Confirm that the Link Control Protocol
(LCP) negotiations have succeeded by checking
the output for the LCP Open message.
–In the figure, the encapsulation on R2 has been
changed to PPP. The output shows the LCP Open
message, so the LCP negotiations have succeeded.
 Step 3. Verify authentication on both sides of the
link using the debug ppp authentication command.
–In the figure, the output of the debug ppp
authentication command shows that R1 is unable to
authenticate R2 using CHAP, because the username
and password have not been configured on R1.
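Steps 1 and 2 can be sketched as a small check over captured command output. This is a sketch under assumptions: the two sample strings only imitate the style of show interfaces serial output for the R1 and R2 routers in the figure, and the function names are hypothetical.

```python
import re

# Illustrative excerpts in the style of "show interfaces serial" output.
R1_OUTPUT = "Serial0/0/0 is up, line protocol is down\n  Encapsulation PPP, loopback not set\n"
R2_OUTPUT = "Serial0/0/0 is up, line protocol is down\n  Encapsulation HDLC, loopback not set\n"


def ppp_link_state(text):
    """Return (encapsulation, lcp_open) extracted from interface output."""
    match = re.search(r"Encapsulation (\S+?),", text)
    encapsulation = match.group(1) if match else None
    return encapsulation, "LCP Open" in text


def diagnose(local, remote):
    """Apply Step 1 and Step 2 of the checklist to both ends of a link."""
    (local_encap, local_lcp), (remote_encap, remote_lcp) = local, remote
    if local_encap != remote_encap:
        return f"encapsulation mismatch: {local_encap} vs {remote_encap}"
    if local_encap != "PPP":
        return f"both ends use {local_encap}; the PPP checks do not apply"
    if not (local_lcp and remote_lcp):
        return "PPP on both ends, but LCP is not open"
    return "PPP on both ends and LCP is open"


print(diagnose(ppp_link_state(R1_OUTPUT), ppp_link_state(R2_OUTPUT)))
```

Step 3 (authentication) is left out because it relies on live debug output rather than a static snapshot.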
Troubleshooting Layer 2 - Frame Relay
 Step 1. Verify the physical connection between
the CSU/DSU and the router.
 Step 2. Verify that the router and Frame Relay
provider are properly exchanging LMI by using
the show frame-relay lmi command.
–In the figure, the output of R2 shows no errors.
This indicates that R2 and the Frame Relay
switch are properly exchanging LMI information.
 Step 3. Verify that the PVC status is active by
using the show frame-relay pvc command.
–In the figure, the output of R2 verifies that the
PVC status is active.
 Step 4. Verify that the Frame Relay
encapsulation matches on both routers with the
show interfaces serial command.
–In the figure, the output of routers R2 and R3
shows that there is an encapsulation mismatch.
–R3 has been incorrectly configured to use HDLC
encapsulation instead of Frame Relay.
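The PVC status check in Step 3 can be sketched as follows. This is a minimal sketch, assuming the command output has been captured as text; the sample lines imitate the summary line that show frame-relay pvc prints per DLCI, and the DLCI numbers and function names are invented for illustration.

```python
import re

# Illustrative excerpt in the style of "show frame-relay pvc" output.
SAMPLE_PVC_OUTPUT = """\
DLCI = 201, DLCI USAGE = LOCAL, PVC STATUS = ACTIVE, INTERFACE = Serial0/0/1
DLCI = 203, DLCI USAGE = LOCAL, PVC STATUS = INACTIVE, INTERFACE = Serial0/0/1
"""

PVC_LINE = re.compile(r"DLCI = (\d+),.*?PVC STATUS = (\w+)")


def pvc_statuses(text):
    """Map each DLCI number to its reported PVC status."""
    return {int(dlci): status for dlci, status in PVC_LINE.findall(text)}


def non_active_pvcs(text):
    """Return the DLCIs whose status is anything other than ACTIVE."""
    return sorted(d for d, s in pvc_statuses(text).items() if s != "ACTIVE")


print(pvc_statuses(SAMPLE_PVC_OUTPUT))
print("needs attention:", non_active_pvcs(SAMPLE_PVC_OUTPUT))
```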
Troubleshooting Layer 2 - STP Loops
 If you suspect an STP loop is causing a Layer 2 problem,
verify whether STP is running on each of the switches.
 Step 1. Identify that an STP loop is occurring.
–When a forwarding loop has developed in the network, these
are the usual symptoms:
•Loss of connectivity to, from, and through the affected network
•High CPU utilization on routers connected to affected VLANs
•High link utilization (often 100 percent)
•High switch backplane utilization
•Syslog messages that indicate packet looping in the network (for
example, HSRP duplicate IP address messages)
•Syslog messages that indicate constant address relearning or
MAC address flapping messages
•Increasing number of output drops on many interfaces
 Step 2. Discover the topology (scope) of the loop.
–The highest priority is to stop the loop and restore network operation.
–To stop the loop, you must know which ports are involved.
Look at the ports with the highest link utilization. The show
interface command displays the utilization for each interface.
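Ranking ports by utilization, as Step 2 suggests, can be sketched like this. The sketch assumes per-interface output has been collected into a dictionary; the txload and rxload values are reported as fractions of 255, the interface names and values are invented, and the 0.9 threshold is an arbitrary assumption rather than a recommended figure.

```python
import re

# Illustrative per-interface excerpts; the load values are invented.
SAMPLE_INTERFACES = {
    "Gi0/1": "reliability 255/255, txload 250/255, rxload 248/255",
    "Gi0/2": "reliability 255/255, txload 3/255, rxload 2/255",
    "Gi0/3": "reliability 255/255, txload 255/255, rxload 251/255",
}

LOAD = re.compile(r"txload (\d+)/255, rxload (\d+)/255")


def utilization(text):
    """Return the higher of transmit and receive load as a fraction 0..1."""
    match = LOAD.search(text)
    if match is None:
        raise ValueError("no load values found")
    return max(int(match.group(1)), int(match.group(2))) / 255


def loop_suspects(interfaces, threshold=0.9):
    """List ports at or above the threshold, busiest first."""
    loads = {name: utilization(text) for name, text in interfaces.items()}
    busy = [name for name, load in loads.items() if load >= threshold]
    return sorted(busy, key=lambda name: loads[name], reverse=True)


print(loop_suspects(SAMPLE_INTERFACES))
```

The ports returned are only candidates; they still need to be checked against the topology diagram before any of them is shut down.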
Troubleshooting Layer 2 - STP Loops
 Step 3. Break the loop.
–Shut down or disconnect the involved ports one at a time.
–After you disable or disconnect each port, check whether the
switch backplane utilization is back to a normal level.
–Document your findings.
 Step 4. Find and fix the cause of the loop.
–Investigate the topology diagram to find a redundant path.
–For every switch on the redundant path, check these issues:
•Does the switch know the correct STP root?
•Is the root port identified correctly?
•Are Bridge Protocol Data Units (BPDUs) received regularly on the
root port and on ports that are supposed to be blocking?
•Are BPDUs sent regularly on non-root, designated ports?
 Step 5. Restore the redundancy.
–After the device or link that is causing the loop has been found
and the problem has been resolved, restore the redundant
links that were disconnected.
–http://cisco.com/en/US/tech/tk389/tk621/technologies_tech_note09186a0080136673.shtml#troubleshoot
Symptoms of Network Layer Problems
 Network layer problems include any problem that
involves a Layer 3 protocol, both routed protocols and
routing protocols.
–This topic focuses primarily on IP routing protocols.
 Problems at the network layer:
–Network failure
•The network is nearly or completely nonfunctional, affecting
all users and applications using the network.
•These failures are usually noticed quickly by users and
network administrators, and are obviously critical to the
productivity of a company.
–Network optimization problems
•These problems usually involve a subset of users, applications,
destinations, or a particular type of traffic.
•Optimization issues in general can be more difficult to detect
and even harder to isolate and diagnose because they usually
involve multiple layers or even the host computer itself.
•Determining that the problem is a network layer problem can
take time.
Troubleshooting Layer 3 Problems
 In most networks, static routes are used in combination with
dynamic routing protocols.
–Improper configuration of static routes can lead to less than optimal
routing and, in some cases, cause the network to become unreachable.
 Here are some possible problems involving routing protocols:
 General network issues
–Often a change in the topology, such as a down link, may have
effects on other areas that might not be obvious at the time.
 Connectivity issues
–Check for any equipment problems, cabling, and ISP problems.
 Neighbor issues
–Check if there are any problems with the routers forming neighbor relationships.
 Topology database
–Check the topology table for any missing or unexpected entries.
 Routing table
–Check the routing table for any missing or unexpected routes.
–Use debug commands to view routing updates and maintenance.
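The routing table check can be sketched as a comparison against the documented baseline. This is a minimal sketch, assuming both the baseline and the current routes are available as lists of prefixes; the prefixes and the function name are invented for illustration.

```python
def compare_routes(baseline, current):
    """Return (missing, unexpected) prefixes relative to the baseline."""
    baseline, current = set(baseline), set(current)
    return sorted(baseline - current), sorted(current - baseline)


# Hypothetical prefixes: what the documentation says versus what is seen now.
BASELINE = ["10.1.1.0/24", "10.1.2.0/24", "172.16.0.0/16"]
CURRENT = ["10.1.1.0/24", "172.16.0.0/16", "192.168.50.0/24"]

missing, unexpected = compare_routes(BASELINE, CURRENT)
print("missing:", missing)
print("unexpected:", unexpected)
```

A missing prefix points toward a neighbor or connectivity issue; an unexpected one points toward a misconfigured static route or redistribution.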
Transport Layer Troubleshooting: Access List Issues
1. Selection of traffic flow
–An ACL must be applied to the correct interface, and the correct
traffic direction must be selected, for the ACL to function properly.
–If the router is running both ACLs and NAT, the order in
which each of these technologies is applied is important:
•Inbound traffic is processed by the inbound ACL before being
processed by outside-to-inside NAT.
•Outbound traffic is processed by the outbound ACL after being
processed by inside-to-outside NAT.
2. Order of access control elements
–The elements in an ACL should be ordered from specific to general.
3. Implicit deny all
–Forgetting about this implicit access control element may be
the cause of an ACL misconfiguration.
4. Addresses and wildcard masks
–Complex wildcard masks provide significant improvements
in efficiency, but are more subject to configuration errors.
•For example, the address 10.0.32.0 with wildcard mask 0.0.32.15 selects
the first 15 host addresses in either the 10.0.0.0 or the 10.0.32.0
network.
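Items 2 through 4 can be illustrated with a small first-match evaluator. This is a sketch, not how a router implements ACLs: the two-entry ACL is hypothetical, only the source address is examined (as in a standard ACL), and the function names are invented. A wildcard bit of 0 means "must match" and a bit of 1 means "ignore".

```python
import ipaddress


def to_int(address):
    """Convert a dotted-decimal IPv4 address to an integer."""
    return int(ipaddress.IPv4Address(address))


def wildcard_match(candidate, base, wildcard):
    """True if candidate equals base in every bit where the wildcard is 0."""
    care_bits = ~to_int(wildcard) & 0xFFFFFFFF
    return ((to_int(candidate) ^ to_int(base)) & care_bits) == 0


def evaluate(acl, source):
    """Return the action of the first matching entry, else the implicit deny."""
    for action, base, wildcard in acl:
        if wildcard_match(source, base, wildcard):
            return action
    return "deny (implicit)"


# Hypothetical ACL, ordered from specific to general.
ACL = [
    ("deny", "10.0.32.7", "0.0.0.0"),      # one specific host
    ("permit", "10.0.32.0", "0.0.32.15"),  # the address and mask from the text
]

for source in ("10.0.32.7", "10.0.0.9", "10.0.32.15", "10.0.32.16"):
    print(source, "->", evaluate(ACL, source))
```

If the two entries were swapped, the permit would match 10.0.32.7 first and the deny would never be reached, which is why the order matters. The last address falls through both entries and hits the implicit deny.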
Transport Layer Troubleshooting: Access List Issues
5. Selection of transport layer protocol
–When configuring ACLs, it is important that only the correct
transport layer protocols [TCP, UDP] be specified.
6. Source and destination ports
–Address and port information for traffic generated by a replying
host is the mirror image of the address and port information for
traffic generated by the initiating host.
7. Use of the established keyword
–If the keyword is applied to an outbound ACL, unexpected
results may occur.
8. Uncommon protocols
–Uncommon protocols that are gaining popularity are VPN and
encryption protocols.
 Troubleshooting Access Control Lists
–A useful tool for viewing ACL operation is the log
keyword on ACL entries.
•This keyword instructs the router to place an entry in the system
log whenever that entry condition is matched.
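The mirror relationship in item 6 can be made concrete with a short sketch. The Flow type, the function name, and the addresses and ports are all invented for illustration.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def reply_of(flow):
    """Return the flow a replying host generates: addresses and ports swapped."""
    return Flow(flow.protocol, flow.dst_ip, flow.dst_port,
                flow.src_ip, flow.src_port)


# Hypothetical web request from a client to a server.
request = Flow("tcp", "192.168.1.10", 49152, "203.0.113.5", 80)
print(reply_of(request))
```

An ACL entry written for the request direction therefore does not match the reply: the return traffic has source port 80 and an ephemeral destination port, so the entry for the reverse direction must be written with the fields swapped.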
Transport Layer Troubleshooting: NAT Issues
 The biggest problem with all NAT technologies is
interoperability with other network technologies:
–BOOTP and DHCP - NAT requires both a valid destination and source
IP address, but a new client's first DHCP packet has a source address of 0.0.0.0.
•Configuring the IP helper feature can help solve this problem.
–DNS and WINS - Because NAT changes the relationship between inside
and outside addresses, a DNS or WINS server outside the NAT router may
not have an accurate view of the inside network.
•Configuring the IP helper feature can help solve this problem.
–SNMP - NAT is not able to alter the addressing information
stored in the data payload of the packet.
•Configuring the IP helper feature can help solve this problem.
–Tunneling and encryption protocols - Encryption and
tunneling protocols often require that traffic be sourced from a
specific UDP or TCP port.
•If encryption or tunneling protocols must be run through a NAT
router, the network administrator can create a static NAT entry for the
required port on the inside of the NAT router.
 Improperly configured NAT timers can also result in unexpected operation.
•If NAT timers are too short, entries in the NAT table may expire before replies are received,
so packets are discarded.
•If timers are too long, entries may stay in the NAT table longer than necessary, consuming
the available connection pool.
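The effect of a NAT timer that is too short can be shown with a toy model. This is a sketch only: the class, its methods, the port number, and the timings are invented, and time is passed in as plain numbers rather than read from a clock.

```python
class NatTable:
    """Toy translation table whose entries expire after a fixed timeout."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.entries = {}  # outside port -> (inside address, last-used time)

    def outbound(self, outside_port, inside_address, now):
        """Record (or refresh) a translation when an inside host sends."""
        self.entries[outside_port] = (inside_address, now)

    def inbound(self, outside_port, now):
        """Return the inside address for a reply, or None if it is discarded."""
        entry = self.entries.get(outside_port)
        if entry is None or now - entry[1] > self.timeout:
            self.entries.pop(outside_port, None)  # expired entries are removed
            return None
        return entry[0]


# The inside host sends at t=0 and the reply arrives at t=8.
for timeout in (5, 60):
    table = NatTable(timeout)
    table.outbound(40001, "192.168.1.10", now=0)
    print(f"timeout={timeout}:", table.inbound(40001, now=8))
```

With the short timeout the entry has already expired when the reply arrives, so the reply is discarded. With the long timeout the reply is delivered, but the entry keeps occupying the table until it finally ages out.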
Application Layer Overview
 The most widely known application layer protocols:
–Telnet - Enables users to establish terminal session
connections with remote hosts.
–HTTP - Supports the exchanging of text, graphic,
sound, video, and other multimedia files on the web.
–FTP - Performs interactive file transfers between hosts.
–TFTP - Performs basic interactive file transfers typically
between hosts and networking devices.
–SMTP - Supports basic message delivery services.
–POP - Connects to mail servers and downloads e-mail.
–Simple Network Management Protocol (SNMP) - Collects
management information from network devices.
–DNS - Maps IP addresses to the names assigned to
network devices.
–Network File System (NFS) - Enables computers to
mount drives on remote hosts and operate them as if
they were local drives.
Symptoms of Application Layer Problems
 A problem at the application layer can result in
unreachable or unusable resources when the
physical, data link, network, and transport layers are
functional.
–It is possible to have full network connectivity, but the
application simply cannot provide data.
 Another type of problem at the application layer
occurs when the physical, data link, network, and
transport layers are functional, but the data transfer
and requests for network services from a single
network service or application do not meet the normal
expectations of a user.
 A problem at the application layer may cause users to
complain that the network or the particular application
that they are working with is sluggish or slower than
usual when transferring data or requesting network
services.
Troubleshooting Application Layer Problems
 The steps for troubleshooting application layer problems are:
 Step 1. Ping the default gateway.
–If successful, Layer 1 and Layer 2 services are functioning properly.
 Step 2. Verify end-to-end connectivity.
–If Layers 1-3 are functioning properly, the issue must exist at a higher layer.
 Step 3. Verify access list and NAT operation.
–If the ACLs and NAT are functioning as expected, the problem must
lie in a higher layer.
 Step 4. Troubleshoot upper layer protocol connectivity.
–Upper layer protocols, such as FTP, HTTP, or Telnet, ride on top of the
basic IP transport but are subject to protocol-specific problems relating
to packet filters and firewalls.
–Troubleshooting an upper layer protocol connectivity problem requires
understanding the process of the protocol.
–This information is usually found in the latest RFC for the protocol or
on the developer web page.
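The four steps form a simple elimination sequence, which can be sketched as a decision function. The function name and the returned labels are invented; the three inputs stand for the outcomes of Steps 1 to 3, however those checks are actually performed.

```python
def next_place_to_look(gateway_ping_ok, end_to_end_ok, acl_nat_ok):
    """Map the outcomes of Steps 1-3 to the area that should be examined next."""
    if not gateway_ping_ok:
        return "Layers 1-2: physical and data link"
    if not end_to_end_ok:
        return "Layer 3: addressing and routing"
    if not acl_nat_ok:
        return "ACL and NAT operation"
    return "upper layer protocol (Step 4)"


print(next_place_to_look(True, True, True))
print(next_place_to_look(True, False, True))
```

The order of the tests mirrors the order of the steps: each lower layer is cleared before the next one is suspected.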
Correcting Application Layer Problems
 The steps for correcting application layer problems are:
 Step 1: Make a backup.
–Ensure that a valid configuration has been saved.
 Step 2: Make an initial configuration change.
–Make only one change at a time.
 Step 3: Evaluate each change and its results.
–If the results of any problem-solving steps are unsuccessful,
immediately undo the changes.
 Step 4: Determine if the change solves the problem.
–Verify that the change actually resolves the problem without
introducing any new problems.
–If the problem is not solved, undo all the changes.
 Step 5: Stop when the problem is solved.
 Step 6: If necessary, get assistance from outside resources.
–This may be a co-worker, a consultant, or TAC.
 Step 7: Document.
–Once the problem is resolved, document the solution.
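The discipline of Steps 1 through 5 (back up, change one thing, undo if it does not help, stop when solved) can be sketched as a loop. This is an illustration under assumptions: the configuration is modeled as a plain dictionary, and the function name, the candidate changes, and the success test are all invented.

```python
import copy


def try_changes(config, changes, problem_solved):
    """Try candidate changes one at a time, undoing each one that fails.

    Returns the resulting configuration and a log of what was tried.
    """
    backup = copy.deepcopy(config)         # Step 1: keep a known-good copy
    log = []
    for description, change in changes:    # Step 2: one change at a time
        candidate = copy.deepcopy(backup)  # start from the backup each time
        change(candidate)
        if problem_solved(candidate):      # Steps 3-4: evaluate the result
            log.append(f"solved by: {description}")
            return candidate, log          # Step 5: stop when solved
        log.append(f"undone: {description}")
    return backup, log                     # nothing helped: config unchanged


# Hypothetical example: the link only works with PPP encapsulation.
config = {"encapsulation": "hdlc", "clock_rate": 64000}
changes = [
    ("raise clock rate", lambda c: c.update(clock_rate=128000)),
    ("set encapsulation ppp", lambda c: c.update(encapsulation="ppp")),
]
result, log = try_changes(config, changes, lambda c: c["encapsulation"] == "ppp")
print(result)
print(log)
```

Because every candidate starts from the backup, the unsuccessful clock rate change does not linger in the final configuration, and the log doubles as raw material for the documentation in Step 7.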
Chapter Summary
 In this chapter, you have learned to:
–Establish and document a network baseline.
–Describe the various troubleshooting methodologies and
troubleshooting tools.
–Describe the common issues that occur during WAN
implementation.
–Identify and troubleshoot common enterprise network
implementation issues using a layered model approach.