Transcript Chapter 13

Chapter 13
Network Troubleshooting
Introduction
• Look at:
– Avoiding Potential Problems (13.1)
– Principles of Troubleshooting (13.2)
– Accessing Key Information Resources
(13.3)
– Handling Common Sources of Trouble
(13.4)
Avoiding Potential Problems
• There are two approaches to
troubleshooting:
– preventing potential problems through
proper planning
– quickly fixing what fails
• The former is often referred to as trouble
avoidance or pre-emptive troubleshooting
• The latter is referred to as troubleshooting or
damage control
Avoiding Potential Problems
• To be effective, both should be used in
combination
• Be proactive in managing the environment
• Know how to effectively troubleshoot any
issue
• Documentation is important
• It's a tedious job but it is imperative that you
have proper documentation
Avoiding Potential Problems
• Ensure that users have ongoing access
by making sure that if something
happens you have a backup plan:
– Identify the data that should be backed up
– Determine the backup type and schedule
– Designate someone to be responsible
– Be sure the tapes are properly labeled
– Keep a log
Avoiding Potential Problems
• Security policy should detail hardware
and software along with some of these
areas:
– Clear paths of responsibility and user
expectations
– Awareness of privacy issues that may arise
– A separation of duties, so that total control
is not left in the hands of a single individual
Avoiding Potential Problems
• Security policy should detail hardware
and software along with some of these
areas:
– Password length, duration, history, and
complexity requirements
– A clear policy for the destruction of data
– Procedures for creating and authorizing
accounts
– Incident response and disaster recovery
planning policies
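The password requirements above can be sketched as a simple check. This is a minimal illustration, not a recommended implementation: the values MIN_LENGTH and HISTORY_DEPTH and the function name check_password are assumptions, since the slides give no specific numbers. Password duration would be checked separately against the date of the last change.

```python
import string

# Hypothetical policy values; the slides do not specify numbers.
MIN_LENGTH = 12
HISTORY_DEPTH = 5


def check_password(candidate, previous_passwords):
    """Return a list of policy violations (an empty list means acceptable).

    previous_passwords is compared in plain text for illustration only;
    a real system would compare against stored hashes.
    """
    problems = []

    # Length requirement.
    if len(candidate) < MIN_LENGTH:
        problems.append("too short")

    # Complexity requirement: at least three of four character classes.
    classes = [
        any(c.islower() for c in candidate),
        any(c.isupper() for c in candidate),
        any(c.isdigit() for c in candidate),
        any(c in string.punctuation for c in candidate),
    ]
    if sum(classes) < 3:
        problems.append("not complex enough")

    # History requirement: no reuse of the most recent passwords.
    if candidate in previous_passwords[-HISTORY_DEPTH:]:
        problems.append("reused")

    return problems
```

For example, check_password("short", []) reports both the length and the complexity violations.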
Avoiding Potential Problems
• The goal of security is expressed in
terms of:
– Confidentiality
– Integrity
– Availability
• These goals can be achieved through
creating hardware and software
standards
Avoiding Potential Problems
• Workstation consistency is often overlooked; common user-caused risks include:
– Installing unauthorized software
– Downloading infected music and movie
files
– Opening an e-mail message that contains a
virus
– Using weak passwords
– Not logging off the network when leaving
the building
Avoiding Potential Problems
• Standards for laptops, personal digital
assistants (PDAs), Palm Pilots, and Pocket
PCs may be more difficult to define
• If these devices are company issued or
company supported, they must be
standardized as well
• These devices are susceptible to theft
because they are small and valuable
• They often contain important information
about the company
Avoiding Potential Problems
• You must also define and document
standards for new server installations along
with guidelines for current server
configurations
• The configuration process should start with
installing only the services necessary for the
server to function
• Limit physical access to the server
• Use Redundant Array of Inexpensive Disks
(RAID), uninterruptible power supply (UPS)
equipment, and clustering
Avoiding Potential Problems
• Policies should include provisions for change
authorization, documentation, and notification
• Include procedures to be used when
hardware, software, or storage media is
replaced or discarded
• Planning and testing can eliminate corruption
or data deletion problems
• Sufficient time must be spent to ensure that
the transition goes as smoothly as possible
Avoiding Potential Problems
• The following should be considered when
creating a change management policy:
– Establish a schedule for changes
– Make sure users are notified of the
changes
– Conduct proper testing
• Changes should be scheduled during off
hours
Avoiding Potential Problems
• Documentation is critical
• Before actually deploying the change,
testing should be conducted
• Testing should be well documented
• A rollback strategy should be part of
every change plan
Avoiding Potential Problems
• Thorough documentation is a necessary part
of an administrator's job
• Document everything you do and be as
detailed as you can
• Documenting is particularly important
because of the impact it can have on
business if legal action is involved
• All documents should be kept in both hard- and soft-copy form
Avoiding Potential Problems
• Your network documentation should
include these components:
– Policies and procedures
– Network history
– Network map
– Cable diagrams and layouts
– Contact list
– Equipment list
Avoiding Potential Problems
• Your network documentation should
include these components:
– Computer and network device
configuration
– Software and its configuration
– Network address list
– Software licensing information
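A documented network address list can also be checked automatically. The sketch below assumes the list is kept as (device name, IP address) pairs; the function name audit_address_list and the sample data are hypothetical. It reports addresses that appear more than once and addresses that fall outside the documented subnet.

```python
import ipaddress
from collections import Counter


def audit_address_list(entries, subnet):
    """Return (duplicates, outside) for a list of (name, ip) pairs.

    duplicates: addresses assigned to more than one device.
    outside: addresses that do not belong to the given subnet.
    """
    network = ipaddress.ip_network(subnet)
    addresses = [ipaddress.ip_address(ip) for _, ip in entries]

    counts = Counter(addresses)
    duplicates = sorted(str(a) for a, n in counts.items() if n > 1)
    outside = sorted(str(a) for a in set(addresses) if a not in network)
    return duplicates, outside


# Hypothetical address list for illustration.
entries = [
    ("srv1", "192.168.1.10"),
    ("pc1", "192.168.1.20"),
    ("pc2", "192.168.1.10"),
    ("printer", "10.0.0.5"),
]
duplicates, outside = audit_address_list(entries, "192.168.1.0/24")
print("Duplicate addresses:", duplicates)
print("Outside subnet:", outside)
```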
Avoiding Potential Problems
• Pre-emptive troubleshooting is also
called trouble avoidance
• It will save time and may help save data
when problems arise
• Pre-emptive troubleshooting can also
prevent additional expense and
downtime while trying to figure out what
happened after a failure
Avoiding Potential Problems
• The ISO defines five pre-emptive
troubleshooting network management
categories:
– Accounting management
– Configuration management
– Fault management
– Performance management
– Security management
Avoiding Potential Problems
• The measure of normal network activity
is known as a baseline
• This gives you a point of reference
when the network goes awry
• Baselining should be done for both
network and application processes
• This allows you to determine whether
you have a hardware or software issue
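As a rough illustration of using a baseline as a point of reference, the sketch below summarizes a set of normal measurements and flags a later reading that strays far from them. The function names, the sample utilization figures, and the three-standard-deviation threshold are all assumptions chosen for the example, not values from the chapter.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize normal activity; needs at least two samples."""
    return {"mean": mean(samples), "stdev": stdev(samples)}


def is_abnormal(value, baseline, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations from the mean."""
    return abs(value - baseline["mean"]) > threshold * baseline["stdev"]


# Hypothetical utilization readings (percent) taken during normal operation.
normal_utilization = [40, 42, 38, 41, 39]
baseline = build_baseline(normal_utilization)

for reading in (41, 90):
    status = "abnormal" if is_abnormal(reading, baseline) else "normal"
    print(f"{reading}% utilization is {status}")
```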
Avoiding Potential Problems
• There are tools that can be used to gather
network information
• Event Viewer allows you to audit certain
events
• Task Manager can be used to end processes
or applications that get hung up without
having to reboot the machine
• Auditing is the process of tracking users and
their actions on the network
Avoiding Potential Problems
• Keep in mind that auditing uses system
resources and space
• The Performance console is used for
tracking and viewing the utilization of
operating system resources
• A network monitor can be used to
capture network traffic and generate
statistics for creating reports
Principles of Troubleshooting
• Troubleshooting requires skill
• These skills are acquired through
experimentation and experience
• You cannot learn the resolution to every
problem that exists
• You can, however, learn a methodology to
find and diagnose nearly every problem in a
systematic and logical manner
Principles of Troubleshooting
• The following are the most common
network problems:
– User error
– Physical connections
– System needs a reboot
• If checking these doesn't help, then it's time to
move on and try other troubleshooting
options
Principles of Troubleshooting
• Research on problem solving and
reasoning is fundamental to
understanding troubleshooting skills
• You can choose from several different
methodologies of troubleshooting
• These give us guidelines for solving
problems logically using a step-by-step
process
Principles of Troubleshooting
• The first step is to determine the scope of the
problem by identifying the symptoms
• The next step is to collect specific information
about the problem at hand
• Once you have the pertinent information, then
the scope is determined
• Begin to isolate the problem by testing each
of the possible causes, starting with the most
obvious
Principles of Troubleshooting
• Attempt to re-create the problem
• Make only one change at a time
• Test each change
• Don't be afraid to ask for help
• Read the documentation that came with
the hardware or software
• Don't forget about the obvious
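The advice to make only one change at a time and to test each change can be sketched as a small loop. This is an illustrative outline only: the function isolate_problem and the (name, apply, undo) structure are assumptions made for the example, and undoing a change that did not help is one reasonable reading of the advice rather than something the slides state.

```python
def isolate_problem(candidate_fixes, problem_is_fixed):
    """Apply one change at a time, test it, and undo it if it did not help.

    candidate_fixes: list of (name, apply_change, undo_change) tuples,
        ordered with the most obvious cause first.
    problem_is_fixed: callable that returns True once the symptom is gone.

    Returns (name_of_successful_fix_or_None, list_of_fixes_tried).
    """
    tried = []
    for name, apply_change, undo_change in candidate_fixes:
        apply_change()          # exactly one change
        tried.append(name)      # keep a record for the documentation
        if problem_is_fixed():  # test the change
            return name, tried
        undo_change()           # revert before trying the next cause
    return None, tried
```

The returned list of attempted fixes can go straight into the troubleshooting log.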
Principles of Troubleshooting
• Creating a Hardware Toolkit:
– Crossover cable
– Hardware loopback adapter
– Tone generator
– Cable tester or cable checker
– Voltmeters
– Time domain reflectometer (TDR)
– Oscilloscope
Principles of Troubleshooting
• Creating a Software Toolkit:
– Ping
– Netstat
– Nbtstat
– Traceroute
– Network monitors
– Protocol analyzer
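When the tools above are not at hand, a basic reachability test can be scripted. The sketch below is not a replacement for Ping, which uses ICMP; it only attempts a TCP connection to a given port, and the function name tcp_reachable is an assumption made for this example.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A False result only shows that the connection attempt failed; the cause could be the cable, the remote service, a firewall, or name resolution, so it narrows the scope rather than identifying the fault.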
Accessing Key Information
Resources
• One of the best resources for troubleshooting a
problem is the manufacturer's Web site
• Subscription services such as TechNet can
be used to obtain a wealth of information
• Vendor-provided CDs should be one of the
first places you go to look for information
• Look at the readme.txt file even before the
product is installed
Accessing Key Information
Resources
• Resource kits are another excellent
source of information about your
operating system
• Call the vendor and open up a technical
support incident to solve the problem
• If it is a known issue the vendor may
have documented fixes available
Accessing Key Information
Resources
• Have the following information ready to
assist the support department:
– The operating system you are running
– Service packs that are installed
– Version numbers of hardware and software
– Serial numbers
– Detailed account of the problem and
troubleshooting steps you have taken
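Part of this information can be collected with a short script before calling support. The sketch below uses Python's standard platform module; the function name support_summary is hypothetical, and serial numbers, service packs, and hardware versions still have to be gathered by other means.

```python
import platform


def support_summary(problem, steps_taken):
    """Build a plain-text summary to read out or send to technical support."""
    lines = [
        f"Operating system: {platform.system()} {platform.release()}",
        f"OS version: {platform.version()}",
        f"Machine: {platform.machine()}",
        f"Problem: {problem}",
        "Steps taken:",
    ]
    # Number the troubleshooting steps in the order they were performed.
    lines += [f"  {i}. {step}" for i, step in enumerate(steps_taken, 1)]
    return "\n".join(lines)


print(support_summary("No link light on NIC",
                      ["Reseated cable", "Swapped patch cable"]))
```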
Accessing Key Information
Resources
• Other excellent sources of information are
periodicals and white papers
• Many new magazines and periodicals are
introduced each year; some of them deal with
specific computing environments
• Besides white papers and periodicals, don’t
forget to keep a couple of good reference
books handy, especially when you first start
out
Handling Common Sources of
Trouble
• Not all problems will be easy to fix
• The two most common causes for data
not moving reliably are:
– A physical connection breaks, such as a
cable being unplugged or damaged
– A network device is not working properly
Handling Common Sources of
Trouble
• The majority of networking problems occur at
the Physical layer of the OSI model and
include problems with:
– cables
– connectors
– NICs
• Check cabling and connections first during
your network troubleshooting process
Handling Common Sources of
Trouble
• Power problems will crop up in various
ways
• One of the most obvious is when power
strips are daisy chained together
• The devices will not get enough power
• At the other end of the spectrum,
this can occasionally trip the circuit
breakers or start a fire
Handling Common Sources of
Trouble
• Power that is not properly conditioned can
have devastating effects on equipment:
– Noise
– Spikes
– Surges and overvoltages
– Sags and brownouts
– Blackouts
Handling Common Sources of
Trouble
• Always connect your sensitive
electronic equipment to:
– power conditioners
– surge protectors
– for the best protection, an uninterruptible
power supply (UPS)
• A UPS keeps the computer powered during an
outage so that you can take action without data loss
Handling Common Sources of
Trouble
• There are basically three different types of
devices that are classified as UPSs:
– Standby power supply (SPS)
– Hybrid or ferroresonant UPS systems
– Continuous UPS
• Never plug a printer into a UPS
• Power problems cannot be eliminated but the
damage can be minimized or prevented
Handling Common Sources of
Trouble
• A software upgrade can cause issues on the
system even though you tested the upgrade
• You should be prepared to roll back or reverse
the process
• This process is also referred to as
backleveling
• Most often the best source of help when a
problem occurs is the manufacturer's
documentation
Handling Common Sources of
Trouble
• You will also have to provide for a
backup plan in the event a hardware
upgrade doesn't go as planned
• It is important not to discard the old
device in the event the upgrade causes
issues
• This applies to the drivers that may be
necessary as well
Handling Common Sources of
Trouble
• Network topologies and communication
equipment have become more and more
complex
• Performance management as well as
response time management is more difficult
• Sometimes you will find that for an unknown
reason the network performance begins to
suffer
Handling Common Sources of
Trouble
• Here are some avenues for you to
consider when there are issues with
performance:
– Change is the biggest factor that can
cause poor network performance
– Another big factor that affects network
performance is users playing games or
downloading music and movie files
Handling Common Sources of
Trouble
• Here are some avenues for you to
consider when there are issues with
performance:
– Sometimes applications have memory
leaks or a new version may be bloated or
have an improperly programmed query
function
– Adding new electrical equipment may have
a negative effect on the network
Handling Common Sources of
Trouble
• Here are some avenues for you to
consider when there are issues with
performance:
– Adding new hardware such as additional
servers or workstations may cause
performance to decrease
– Other changes in workload or workplace
behavior, including adding more users,
could affect performance