NETWORK MONITORING

Table of Contents
Introduction
Monitored Types of Information
Network Monitoring Configurations
Network Monitoring Methods
Performance Monitoring
Performance Indicators
Performance Monitoring Functions
Fault Monitoring
Problems of Fault Monitoring
Fault Monitoring Functions
Accounting Monitoring
Introduction
Network monitoring is concerned with observing and
analyzing the status and behavior of the end systems,
intermediate systems, and subnetworks that make up the
network to be managed.
Issues in network monitoring
what to monitor?
• define what is to be monitored
how to monitor?
• how to obtain information from managed resources
what to do with the monitored information?
• how the monitored information is used in various
management functional areas
Monitored Types of Information
Static information
• hardly changes
• current configuration information
• e.g., the number and identification of ports on a router
Dynamic information
• changes frequently
• information related to events in the network
• e.g., change of state, transmission/reception of packets
Statistical information
• derived from dynamic information
• e.g., average number of packets transmitted per unit time
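As a rough illustration of the three information types above, the sketch below keeps a static record (the ports on a router), a dynamic packet counter sampled over time, and a statistical value (average packets per second) derived from those samples. The names RouterInfo, CounterSample, and average_packet_rate, and all the sample values, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouterInfo:
    """Static information: configuration that hardly changes."""
    name: str
    port_ids: tuple


@dataclass(frozen=True)
class CounterSample:
    """Dynamic information: a packet counter read at a point in time."""
    timestamp_s: float
    packets_sent: int


def average_packet_rate(samples):
    """Statistical information: packets per second derived from the samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    first, last = samples[0], samples[-1]
    elapsed = last.timestamp_s - first.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return (last.packets_sent - first.packets_sent) / elapsed


# Illustrative values only.
router = RouterInfo(name="r1", port_ids=("eth0", "eth1"))
samples = [
    CounterSample(0.0, 1000),
    CounterSample(10.0, 1500),
    CounterSample(20.0, 2600),
]
rate = average_packet_rate(samples)
print(len(router.port_ids), rate)
```

The statistical value is never stored by the device itself in this sketch; it is always recomputed from dynamic samples, which mirrors the "derived from dynamic information" relationship.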
Organization of a Management Information Base (MIB)
[Figure: organization of the MIB into data bases. Labels recovered from the figure: static data base (configuration data base and sensor data base, with entries such as Buffer, Switch_server, Source, Server, Station_Info, Switch_Buffer, Switch_Source, Status_Sensor, Derived_Status_Sensor, Event_Sensor); sensor activation and data collection; dynamic data base (State_Variable, Event_Variable); abstraction of state and event variables; statistical data base (Call_Blocked, Packet_Loss, Time_Delay, Throughput).]
Monitoring System Components
monitoring application
• includes the functions of monitoring that are visible to the user
• e.g., performance, fault, accounting
manager function
• performs the basic monitoring function of retrieving information
agent function
• gathers and records management information for one or more network elements and delivers the information to the monitor
managed objects
• management information that represents resources and their activities
monitoring agent
• generates summaries and statistical analysis of management information
Functional Architecture for Network Monitoring
[Figure, two panels. (a) Manager-agent model: monitoring application, manager function, agent function, managed objects. (b) A model for summarization: monitoring application, manager function, monitoring agent, and several agent functions, each with its own managed objects.]
Network Monitoring Configurations
[Figure, four panels, each built from a monitoring application, a manager function, and an agent function. (a) Managed resources in manager system. (b) Resources in agent system, reached over a subnetwork or internet. (c) External monitor (observed traffic on a LAN). (d) Proxy monitor agent.]
Network Monitoring Methods
Polling
• a request-response interaction between a manager and an agent
• a manager sends a request to an agent, which processes the request and responds with information from its MIB
• a manager may use polling to
  • learn about the configuration it is managing
  • periodically obtain an update of conditions
  • investigate an area in detail after being alerted to a problem
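The request-response pattern described for polling can be sketched with two small in-memory classes. This is only an illustration under stated assumptions: Agent, Manager, and the MIB object names port_count and packets_in are hypothetical, and no real management protocol is involved.

```python
class Agent:
    """Holds a tiny in-memory MIB and answers requests about it."""

    def __init__(self, mib):
        self._mib = dict(mib)

    def handle_request(self, names):
        # Respond only with the objects the manager asked for.
        return {name: self._mib.get(name) for name in names}

    def set_value(self, name, value):
        # Stands in for the managed resource changing over time.
        self._mib[name] = value


class Manager:
    """Polls an agent and keeps a history of the responses."""

    def __init__(self, agent):
        self._agent = agent
        self.history = []

    def poll(self, names):
        response = self._agent.handle_request(names)
        self.history.append(response)
        return response


agent = Agent({"port_count": 4, "packets_in": 100})
manager = Manager(agent)

config = manager.poll(["port_count"])     # learn about the configuration
agent.set_value("packets_in", 250)        # traffic arrives in the meantime
update = manager.poll(["packets_in"])     # obtain an update of conditions
print(config, update)
```

Note that the manager sees the new packet count only because it asked again; between polls it has no knowledge of changes at the agent.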
Event Reporting
• information flow is initiated by the agent and directed to the manager
• an agent may generate a report periodically to give the manager its current status, or whenever a significant event (e.g., a change of state) or an unusual event (e.g., a fault) occurs
• good for detecting problems as soon as they occur
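Event reporting, as described above, reverses the direction of the interaction: the agent pushes information without being asked. A minimal sketch, assuming a simple callback-based subscription (ReportingAgent and its methods are hypothetical names for this example):

```python
class ReportingAgent:
    """Pushes reports to subscribed managers instead of waiting for a poll."""

    def __init__(self, name, state="up"):
        self.name = name
        self.state = state
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def _report(self, kind, detail):
        event = {"agent": self.name, "kind": kind, "detail": detail}
        for callback in self._listeners:
            callback(event)

    def report_status(self):
        # Periodic report of current status; the caller decides the schedule.
        self._report("status", self.state)

    def change_state(self, new_state):
        # A significant event: report only when the state actually changes.
        if new_state != self.state:
            old, self.state = self.state, new_state
            self._report("state_change", f"{old}->{new_state}")


received = []                      # stands in for the manager's event queue
agent = ReportingAgent("router-1")
agent.subscribe(received.append)

agent.report_status()
agent.change_state("down")
agent.change_state("down")         # no change, so no report is sent
print(received)
```

The manager learns of the state change at the moment it happens, which is why event reporting suits early problem detection.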
Performance Monitoring
Measuring the performance of the network (performance monitoring) is essential in Network Management
• to detect and fix problems that cause performance degradation
• to better plan network upgrades
Problems in selecting and using appropriate indicators (or metrics)
• too many indicators are in use
• the meaning of most indicators is not yet clearly understood
• some indicators are supported by only some manufacturers
• frequently, the indicators are measured accurately but interpreted incorrectly by humans or management applications
• the calculation of indicators takes too much time
Network Performance Indicators
Service-oriented
Availability: the percentage of time that a network system, a component, or an application is available to a user
Response Time: how long it takes for a response to appear at a user’s terminal after a user action calls for it
Accuracy: the percentage of time that no errors occur in the transmission and delivery of information
Efficiency-oriented
Throughput: the rate at which application-oriented events
(e.g., file transfers) occur
Utilization: the percentage of the theoretical capacity of a
resource (e.g., transmission line, switch, CPU) that is being used
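The percentage and rate definitions of availability, throughput, and utilization translate directly into small formulas. The sketch below is one possible reading of those definitions; the function names, units, and input values are assumptions made for illustration.

```python
def availability_pct(uptime_s, total_s):
    """Percentage of the observation period during which the resource was available."""
    return 100.0 * uptime_s / total_s


def utilization_pct(bits_carried, capacity_bps, interval_s):
    """Percentage of the theoretical capacity used over the interval."""
    return 100.0 * bits_carried / (capacity_bps * interval_s)


def throughput(events_completed, interval_s):
    """Rate of application-oriented events (e.g., file transfers) per second."""
    return events_completed / interval_s


# Illustrative values only: one day with 864 s of downtime, and a
# 10 Mbit/s line that carried 300 Mbit in one minute.
day_s = 24 * 60 * 60
avail = availability_pct(uptime_s=day_s - 864, total_s=day_s)
util = utilization_pct(bits_carried=300_000_000, capacity_bps=10_000_000, interval_s=60)
rate = throughput(events_completed=120, interval_s=60)
print(avail, util, rate)
```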
Elements of Response Time
[Figure: workstation, network interface (e.g., router), network, and server, annotated with the delay components TI, WI, SI, CPU, WO, SO, and TO.]
RT = TI + WI + SI + CPU + WO + SO + TO
RT = response time
TI = inbound terminal delay
WI = inbound queuing time
SI = inbound service time
CPU = CPU process delay
WO = outbound queuing time
SO = outbound service time
TO = outbound terminal delay
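The response-time sum above can be checked with a short calculation. In this sketch the class name ResponseTimeParts, the choice of milliseconds, and the component values are all assumptions made for illustration; only the formula comes from the slide.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ResponseTimeParts:
    """The seven delay components, all in milliseconds."""
    ti: float   # inbound terminal delay
    wi: float   # inbound queuing time
    si: float   # inbound service time
    cpu: float  # CPU processing delay
    wo: float   # outbound queuing time
    so: float   # outbound service time
    to: float   # outbound terminal delay

    def total(self):
        """RT = TI + WI + SI + CPU + WO + SO + TO."""
        return sum(getattr(self, f.name) for f in fields(self))

    def largest(self):
        """Name of the component contributing the most delay."""
        return max(fields(self), key=lambda f: getattr(self, f.name)).name


# Illustrative values only.
parts = ResponseTimeParts(ti=5, wi=20, si=10, cpu=40, wo=15, so=10, to=5)
rt = parts.total()
bottleneck = parts.largest()
print(rt, bottleneck)
```

Breaking the total into its parts is what makes the measurement useful: the largest component indicates where tuning effort is likely to pay off.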
Performance Monitoring Functions
Performance Measurement
• the actual gathering of statistics about network traffic and timing
• typically performed by agents within network devices
• e.g., amount of data in and out of a node, number of connections, traffic per connection
Performance Analysis
• analyzing the gathered data and presenting it
• e.g., total, average, min, max, histogram
Synthetic Traffic Generation
• generating artificial traffic load
• permits the network to be observed under a controlled load
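The Performance Analysis summaries listed above (total, average, min, max, histogram) can be produced from raw measurements in a few lines. This is a sketch only; the function name summarize, the fixed-width bucketing scheme, and the delay values are assumptions.

```python
from collections import Counter
from statistics import mean


def summarize(values, bucket_width):
    """Return total, average, min, max, and a histogram keyed by bucket start."""
    if not values:
        raise ValueError("no measurements to summarize")
    # Each value falls into the bucket [k * width, (k + 1) * width).
    histogram = Counter((v // bucket_width) * bucket_width for v in values)
    return {
        "total": sum(values),
        "average": mean(values),
        "min": min(values),
        "max": max(values),
        "histogram": dict(sorted(histogram.items())),
    }


# Illustrative delay measurements in milliseconds.
delays_ms = [12, 18, 25, 31, 14, 95, 22]
summary = summarize(delays_ms, bucket_width=20)
print(summary)
```

The histogram exposes the single 95 ms outlier that the average alone would hide.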
Typical Performance-Related Questions
Performance measurements can be used to answer a number of questions
• Why is the response so slow? (a very loaded question!)
• Why is the retransmission rate so high?
• Is traffic evenly distributed among network users, or are there source-destination pairs with unusually heavy traffic?
• What is the percentage of each type of packet?
• What is the channel utilization and throughput?
• What is the effect of traffic load on utilization, throughput, and time delays?
• When does traffic load start to degrade system performance?
• What is the maximum capacity of the channel under normal operating conditions? How many active users are necessary to reach this maximum?
Fault Monitoring
To detect faults as quickly as possible after they occur and to identify the cause of each fault so that corrective action may be taken
Problems of Fault Monitoring
Fault Detection Problems
• Unobservable faults: e.g., deadlock, device not monitorable
• Partially observable faults: insufficient to pinpoint the
problem
• Uncertainty in observation: not clear what the problem is
Fault Isolation Problems
• Multiple potential causes
• Too many related observations
• Interference between diagnosis and local recovery procedures
• Absence of automated testing tools
What happens when the T1 link fails?
[Figure: Heterogeneous Network Environment. Labeled elements: client, server, 802.3 and 802.5 LANs, two routers, two MUXes, two PBXs, and the T1 link.]
Propagation of Failures to Higher Layers
[Figure: a transmission break between the two muxes appears as a data link failure at the routers, and as transport and application failures at the client and server.]
Fault Monitoring Functions
Logging
• record important events and errors
• logs should be accessible to managers (e.g., via polling)
Event Reporting
• sending events and errors to managers
• sending alarms to managers to warn of possible problems
Diagnostic Functions
• connectivity test (e.g., traceroute)
• response-time test
• liveness test (e.g., ping)
• protocol integrity test
• loopback test
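A liveness test and a response-time test can be combined in one probe. Because ping uses ICMP, which usually needs elevated privileges from a script, the sketch below substitutes a TCP connection attempt; this is a swapped-in technique, not ping itself. The function name tcp_probe is hypothetical, and the demo probes a listener opened on the local machine so that it is self-contained.

```python
import socket
import time


def tcp_probe(host, port, timeout_s=1.0):
    """Return the connect time in seconds, or None if the target is unreachable."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return time.perf_counter() - start
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return None


# Self-contained demo: probe a listener opened on the local machine.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
listener.listen(1)
open_port = listener.getsockname()[1]

alive_rtt = tcp_probe("127.0.0.1", open_port)
listener.close()
# With the listener closed, the same probe is expected to fail.
dead_rtt = tcp_probe("127.0.0.1", open_port)
print(alive_rtt, dead_rtt)
```

A returned time answers both questions at once: the target is alive, and this is how long it took to respond at the transport level.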
Accounting Monitoring
Keeping track of users’ usage of network resources
• communication facilities
• computer hardware
• software and systems
• services
Usage may need to be broken down by account, by project, or by individual user for appropriate accounting purposes
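Breaking usage down by account, project, or user amounts to grouping the same usage records by different keys. A minimal sketch, assuming each record is a dictionary with hypothetical fields account, project, user, and bytes:

```python
from collections import defaultdict


def usage_by(records, key):
    """Sum bytes transferred, grouped by 'account', 'project', or 'user'."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["bytes"]
    return dict(totals)


# Illustrative usage records only.
records = [
    {"account": "A1", "project": "web", "user": "pat", "bytes": 500},
    {"account": "A1", "project": "web", "user": "chris", "bytes": 300},
    {"account": "A2", "project": "db", "user": "pat", "bytes": 200},
]

by_account = usage_by(records, "account")
by_user = usage_by(records, "user")
print(by_account, by_user)
```

Keeping the raw records and grouping on demand lets the same data serve whichever breakdown the accounting purpose requires.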
Summary
Network monitoring is the most basic aspect of Network Management
The purpose of network monitoring is to gather information about the status and behavior of network elements
Information to be gathered includes static, dynamic, and statistical information
Monitoring methods: polling and event reporting
Monitoring functions
• performance monitoring
• fault monitoring
• accounting monitoring
ON THE JOB WITH A
NETWORK MANAGER
Network Manager
• The type of activities that are performed by people who run networks for a living
• The term network manager is rarely used for the people involved in managing networks
• Network operator, network administrator, and network planner are much more common
• Each of those terms refers to a more specialized function that is only one aspect of Network Management
• Network management involves not just technology but also a human dimension:
  • How people use management tools and management technology to achieve a given purpose
  • How people who perform management functions, and who are ultimately responsible for keeping networks and networking services running smoothly, can best be supported
• The organizational dimension must also be considered:
  • How the tasks and workflows are organized
  • How people involved in managing a network work together
  • What procedures they have in place and must follow to collectively get the job done
A Day in the Life of a Network Manager
• Pat: A Network Operator for a Global Service Provider
• Chris: Network Administrator for a Medium-Size Business
• Sandy: Administrator and Planner in an Internet Data Center
Pat: A Network Operator for a Global Service Provider
• Pat works as a network operator at the Network Operations Center (NOC) of a Global Service Provider (GSP)
• She and her group are responsible for monitoring both the global backbone network and the access network
• This is a big responsibility: several terabytes of data move over GSP’s backbone daily, connecting several million end customers as well as a significant percentage of global Fortune 500 companies
• Any disruption to this service could have huge economic implications, leading to revenue losses of millions of dollars, exposing GSP to penalties and liability claims, and putting jobs in jeopardy
Pat: A Network Operator for a Global Service Provider
They show statistics on network utilization, information about current delays and service levels experienced by the network’s users, and the number of problems that have been reported in different geographic areas.
This gives everybody in the room a good overall sense of what is currently going on
Chris: Network Administrator for a Medium-Size Business
• Chris is responsible for the computer and networking infrastructure of a retail chain, RC Stores, with a headquarters and 40 branch locations.
• RC Stores’ network contains close to 100 routers: typically, an access router and a wireless router in the branch locations, and additional networking infrastructure in the headquarters and at the warehouse.
• The company has turned to a managed service provider (MSP) to interconnect the various locations of its network.
• The MSP has set up a Virtual Private Network (VPN) with tunnels between the access routers at each site that connects all the branch locations and the headquarters.
• This means that the entire company’s network can be managed as one network
• Although the MSP worries about the interconnectivity among the branch offices, Chris and his colleagues are their points of contact. Also, the contract with the MSP does not cover how the network is being used within the company. This is the responsibility of Chris and his colleagues.
• Chris has a workstation at his desk that runs a management platform.
• This is a general-purpose management application used to monitor the network.
• At the core of the application is a graphical view of the network that displays the network topology. Each router is represented as an icon on the screen that is green, yellow, orange, or red, depending on its alarm state.
• This color coding allows Chris to see at a glance whether everything is up and running.
[Figure: RC Stores’ Network]
[Figure: A Typical Management Application Screen (Cisco Packet Telephony Center)]
[Figure: Sample Screen of a Management Application with Performance Graphs (Cisco Works IP Performance Monitor)]
Sandy: Administrator and Planner in an Internet Data Center
• Sandy works in the Internet Data Center for a global Fortune 500 company, F500, Inc.
• The data center is at the center of the company’s intranet, extranet, and Internet presence:
  • It hosts the company’s external website, which provides company and product information and connects customers to the online ordering system.
  • More important, it is host to all the company’s crucial business data: its product documents and specifications, its customer data, and its supplier data.
  • In addition, the data center hosts the company’s internal website through which most of this data can be accessed, given the proper access privileges.
• Sandy has been tasked with developing a plan for how to accommodate a new partner supplier.
• This will involve setting up the server and storage infrastructure for storing and sharing data that is critical for the business relationship. Also, an extranet over which the shared data can be accessed must be carved out.
• The extranet constitutes essentially its own Virtual Private Network that will be set up specifically for that purpose.
[Figure: Sample Screen of a Management Application That Allows the Configuration of Ports (Cisco WAN Manager 15.1)]