Network Monitoring
Prof. Choong Seon HONG
Kyung Hee
University
Network Monitoring
Access to monitored information
How to define monitoring information, and how to get that information
from a resource to a manager
Design of monitoring mechanisms
How best to obtain information from resources
Application of monitored information
How the monitored information is used in various management
functional areas
Network Monitoring Information
Information type
Static: information that changes infrequently, e.g. port ID, number of ports
Dynamic: state information, e.g. the state of a protocol machine or the transmission of a packet
Statistical: information derived from dynamic information, e.g. the average number of packets transmitted
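As a minimal sketch of how the three information types relate, the Python below keeps a static port ID, records dynamic per-interval packet counts, and derives a statistical average from them. The class and field names (`PortRecord`, `packets_per_interval`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List


@dataclass
class PortRecord:
    # Static information: fixed at configuration time, rarely changes.
    port_id: int
    # Dynamic information: packet counts seen in successive sample intervals.
    packets_per_interval: List[int] = field(default_factory=list)

    def record_interval(self, packets: int) -> None:
        """Store the packet count observed during one sample interval."""
        self.packets_per_interval.append(packets)

    def average_packets(self) -> float:
        """Statistical information: derived from the dynamic samples."""
        if not self.packets_per_interval:
            return 0.0
        return mean(self.packets_per_interval)
```

The statistical value is never stored on its own here; it is recomputed from the dynamic samples, which mirrors the "derived from dynamic information" relationship.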
Network Monitoring Information (cont’d)
Organization of a management information base, after Mazumdar and Lazar (1991)
[Figure: database layers and the processes that link them]
Statistical DB: Call_Blocked, Packet_Loss, Time_Delay, Throughput
(abstraction of state and event variables)
Dynamic DB: State_Variable, Event_Variable
(sensor activation and data collection)
Static DB
Configuration DB: Switch_Server, Buffer, Source, Station_info, Server, Switch_Buffer, Switch_Source
Sensor DB: Status_Sensor, Derived_Status_sensor, Event_Sensor
Network monitoring configurations
Monitoring application: the functions visible to the user, such as performance monitoring, fault monitoring, and accounting monitoring
Manager function: provides the basic monitoring function
Agent function: gathers and records management information from one or more elements and communicates it to the monitor
Managed objects: the management information that represents resources and their activities
Monitoring agent: generates summaries and statistical analyses of management information
Network monitoring configurations (cont’d)
Functional architecture for network monitoring
[Figure, two panels. Manager-agent model: monitoring application, manager function, agent function, managed objects. A model for summarization: monitoring application, manager function, monitoring agent, agent functions, managed objects.]
Network monitoring configurations (cont’d)
[Figure, four configurations, each with a monitoring application and a manager function. Managed resources in manager system: agent function and managed objects reside with the manager. Resources in agent system: the agent function is reached across a subnetwork or internet. External monitor: an agent function observes traffic on a LAN. Proxy monitor agent: an agent function reached across a subnetwork or internet serves resources on a LAN.]
Polling and event reporting
Information is collected and stored by agents, and it is used by multiple managers
Two techniques make the information available to the manager
polling: a request-response interaction between a manager and an agent
the manager queries any agent and requests the values of various information elements
the agent responds with information from its MIB
event reporting: the initiative lies with the agent, and the manager takes the role of a listener
the agent gives its current status to the manager
the reporting period is preconfigured or set by the manager
the agent generates a report when a significant event (e.g. a change of state) or an unusual event (e.g. a fault) occurs
more efficient than polling for monitoring objects whose states or values change relatively infrequently
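The contrast between the two techniques can be sketched in a few lines of Python. This is an in-memory illustration only, not SNMP or any real management protocol: the `Agent` and `Manager` classes and the MIB entry names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


class Agent:
    """Holds a tiny MIB and supports both polling and event reporting."""

    def __init__(self, mib: Dict[str, int]) -> None:
        self.mib = dict(mib)
        self.listeners: List[Callable[[str, int], None]] = []

    def get(self, names: List[str]) -> Dict[str, int]:
        """Polling: answer a manager's request with values from the MIB."""
        return {name: self.mib[name] for name in names}

    def subscribe(self, listener: Callable[[str, int], None]) -> None:
        """Event reporting: register a manager that listens for reports."""
        self.listeners.append(listener)

    def set_value(self, name: str, value: int) -> None:
        """Update the MIB and send a report only when the value changes."""
        if self.mib.get(name) != value:
            self.mib[name] = value
            for listener in self.listeners:
                listener(name, value)


class Manager:
    def __init__(self) -> None:
        self.reports: List[Tuple[str, int]] = []

    def poll(self, agent: Agent, names: List[str]) -> Dict[str, int]:
        """Request-response interaction initiated by the manager."""
        return agent.get(names)

    def on_event(self, name: str, value: int) -> None:
        """Listener role: the agent took the initiative."""
        self.reports.append((name, value))
```

With polling, the manager pays for a request even when nothing changed; with event reporting, an unchanged value produces no traffic at all, which is why it suits objects that change infrequently.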
Polling and event reporting (cont’d)
Telecommunications management systems: very high reliance on event reporting
The SNMP approach: very little reliance on event reporting
SNMP and OSI systems management
Performance monitoring
Performance indicators
an absolute prerequisite for the management of a telecom network: measuring the performance of the network, i.e. performance monitoring
appropriate indicators are difficult to choose because:
there are too many indicators in use
the meanings of most indicators are not yet clearly understood
some indicators are introduced and supported by some manufacturers only
most indicators are not suitable for comparison with each other
frequently, the indicators are accurately measured but incorrectly interpreted
in many cases, the calculation of indicators takes too much time, and the final results can hardly be used for controlling the environment
two categories of indicators: service-oriented measures (availability, response time, accuracy) and efficiency-oriented measures (throughput, utilization)
Performance monitoring (cont’d)
Availability: the percentage of time that a network system, component, or application is available to a user
A = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to repair
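Availability of serial and parallel connections
Serial: two components, each of availability A, have combined availability A^2, e.g. 0.98 x 0.98 = 0.9604 (about 0.96)
Parallel: two components, each of availability A, have combined availability 2A - A^2
1 - 0.98 = 0.02: unavailability of one component
0.02 x 0.02 = 0.0004: probability that both are unavailable
1 - 0.0004 = 0.9996: availability of the combined unit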
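The availability arithmetic above can be checked with a short Python sketch. It assumes that components fail independently; the function names are illustrative, and the MTBF and MTTR figures in the usage note are made-up inputs.

```python
def availability(mtbf: float, mttr: float) -> float:
    """A = MTBF / (MTBF + MTTR), both in the same time unit."""
    return mtbf / (mtbf + mttr)


def serial(*parts: float) -> float:
    """All components must work: multiply the availabilities."""
    result = 1.0
    for a in parts:
        result *= a
    return result


def parallel(*parts: float) -> float:
    """The unit fails only if every component is down at once."""
    all_down = 1.0
    for a in parts:
        all_down *= 1.0 - a
    return 1.0 - all_down
```

For example, `availability(98, 2)` gives 0.98, `serial(0.98, 0.98)` gives about 0.9604, and `parallel(0.98, 0.98)` gives about 0.9996, matching 2A - A^2 for A = 0.98.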
Performance monitoring (cont’d)
Response time
Several studies show that when a computer and a user interact at a pace at which neither has to wait on the other:
productivity increases significantly
the cost of work drops
quality tends to improve
Up to two seconds: a response time considered acceptable for most interactive applications
User response time: the time span between the moment a user receives a complete reply to one command and the moment the user enters the next command; also referred to as think time
System response time: the time span between the moment the user enters a command and the moment a complete response is displayed
Performance monitoring (cont’d)
Elements related to response time
Inbound terminal delay: the delay in getting an inquiry from the terminal to the communications line
Inbound queuing time: the time required for processing by the controller or PAD device
Inbound service time: the time to transmit across the communications link and nodes (from the controller to the host’s front-end processor)
Processor delay: the time for processing in the front-end processor, the host processor, the disk drives, and so on
Outbound queuing time: the time a reply spends at a port in the front-end processor waiting to be dispatched to the network
Outbound service time: the time to transmit across the communications facility from the host’s front-end processor to the controller
Outbound terminal delay: the delay at the terminal itself, primarily due to line speed
Performance monitoring (cont’d)
Elements of response time
TI = inbound terminal delay
WI = inbound queueing time
SI = inbound service time
CPU = CPU processing delay
WO = outbound queueing delay
SO = outbound service time
TO = outbound terminal delay
[Figure: the response-time elements TI, WI, SI, CPU, WO, SO, and TO marked on a path that links a PC, a network interface (e.g., bridge), the network, and a server]
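A minimal Python sketch of how these elements combine is given below. It assumes the seven elements simply add up to the system response time, and the sample delays used in the usage note are invented values in seconds, not measurements.

```python
from dataclasses import dataclass, fields


@dataclass
class ResponseTime:
    ti: float   # inbound terminal delay
    wi: float   # inbound queuing time
    si: float   # inbound service time
    cpu: float  # CPU processing delay
    wo: float   # outbound queuing time
    so: float   # outbound service time
    to: float   # outbound terminal delay

    def total(self) -> float:
        """System response time as the sum of all seven elements."""
        return sum(getattr(self, f.name) for f in fields(self))

    def largest(self) -> str:
        """Name of the element that contributes the most delay."""
        return max(fields(self), key=lambda f: getattr(self, f.name)).name
```

With `ResponseTime(0.05, 0.10, 0.20, 0.80, 0.15, 0.25, 0.05)`, `total()` is about 1.6 seconds and `largest()` returns `"cpu"`, which points at the element to examine first.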
Performance monitoring (cont’d)
Accuracy
Accurate transmission of data between user and host or between two hosts
provided by error-correction mechanisms in protocols such as the data link and transport protocols
generally not a user concern
rejection rate: the percentage of time the network cannot transfer information because of a lack of resources and performance
– a rejection rate above 2% indicates significant problems
Throughput
an application-oriented measure
the number of transactions of a given type for a certain period of time
the number of customer sessions of a given application during a certain period of
time
the number of calls for a circuit-switched environment
Performance monitoring (cont’d)
Goodput
the probability or rate at which packets are received successfully, with no packet loss at the receiver
Utilization
a more fine-grained measure than throughput
determines the percentage of time that a resource is in use over a given period of time
used to search for potential bottlenecks and areas of congestion
response time usually increases exponentially as the utilization of a resource increases (see figure 2.10)
Performance monitoring (cont’d)
Collecting utilization data
On a bridge or router
packet forwarding rate
percentage of dropped frames (on each interface)
number of packets in a queue
processor load
On a file server
processor load
disk access rate
NIC utilization
Performance monitoring (cont’d)
Network Management System
A simple tool
provides real-time information about network devices and links
preferably in graphical form, such as a line or bar graph
A more complex tool
lets thresholds be set so that crossing one triggers a subsequent action
[Figure: utilization (%) versus time (sec), with a threshold for alarm at 60% and the alarm rearmed at 40%]
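The alarm and rearm levels in the figure describe a threshold with hysteresis, which can be sketched in Python as follows. The function name and the choice of inclusive comparisons (`>=` and `<=`) are assumptions; the 60% and 40% defaults come from the figure.

```python
from typing import Iterable, List


def alarm_events(samples: Iterable[float],
                 threshold: float = 60.0,
                 rearm: float = 40.0) -> List[int]:
    """Return the indexes of the samples at which an alarm fires.

    After firing, the alarm stays silent until utilization falls to the
    rearm level, so a value hovering near the threshold raises one alarm
    rather than many.
    """
    armed = True
    fired: List[int] = []
    for index, utilization in enumerate(samples):
        if armed and utilization >= threshold:
            fired.append(index)
            armed = False
        elif not armed and utilization <= rearm:
            armed = True
    return fired
```

For the invented series `[30, 55, 62, 65, 58, 61, 45, 39, 50, 63]`, the alarm fires at indexes 2 and 9: the readings of 65 and 61 are suppressed because utilization has not yet dropped to 40%.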
Performance monitoring (cont’d)
Thresholds have a priority (low, medium, high)
Graphing historical data
– line graph: examining trends in data such as utilization
– bar graph: comparing values
– pie graph: showing the share of each value in the total
[Figure: three example graphs. Line graph: utilization (%) versus time (seconds). Bar graph: memory used (Kbytes) and packets passed (K) for 1/98, 2/98, and 3/98. Pie graph: 35% IP, 29% DECnet, 21% AppleTalk, 6% SNA, 5% OSI, 4% unknown.]
Performance monitoring (cont’d)
An advanced tool
Examining the historical data
– receive the state of the network and performance problems
– retrieve information from the database
– analyze the state of the network
[Figure: utilization (%) versus time (days, 60 to 180), showing computed actual utilization, predicted utilization, and a threshold value of 60]
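One simple way to produce a predicted-utilization line like the one in the figure is a least-squares straight-line fit over the historical samples. The slides do not say which prediction method the tool uses, so the linear trend below is an assumption, and the function names are hypothetical. The fit needs at least two samples taken on different days.

```python
from typing import Optional, Sequence, Tuple


def fit_line(days: Sequence[float],
             utilization: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of utilization = slope * day + intercept."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(utilization) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, utilization))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def day_threshold_reached(days: Sequence[float],
                          utilization: Sequence[float],
                          threshold: float = 60.0) -> Optional[float]:
    """Day on which the fitted trend reaches the threshold, or None if
    utilization is flat or falling."""
    slope, intercept = fit_line(days, utilization)
    if slope <= 0:
        return None
    return (threshold - intercept) / slope
```

With invented history `days = [0, 30, 60, 90]` and `utilization = [30, 35, 40, 45]`, the trend rises by 5 points per 30 days and reaches the threshold of 60 on about day 180.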
Performance Management
Simulating the network
– analyze future performance and determine which configuration can produce the greatest performance
– build the network model
• how the simulation should calculate each component
• how it should react to the simulation
– queuing analysis
[Figure: queuing model with a waiting line (queue), a dispatching discipline, a server, and a departure; labeled quantities are waiting time in the queue, service time, utilization, and waiting time in the queuing system]
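The quantities labeled in the queuing figure can be computed for the simplest textbook case. The sketch below assumes a single-server M/M/1 queue (Poisson arrivals, exponentially distributed service times); the slides do not name a specific queuing model, so that choice and the function name are assumptions.

```python
from typing import Tuple


def mm1_times(arrival_rate: float, service_time: float) -> Tuple[float, float, float]:
    """Return (utilization, waiting time in queue, time in system) for M/M/1.

    arrival_rate is in items per second, service_time in seconds per item.
    """
    rho = arrival_rate * service_time          # utilization of the server
    if not 0 <= rho < 1:
        raise ValueError("utilization must be below 1 for a stable queue")
    waiting = rho * service_time / (1 - rho)   # waiting time in the queue
    in_system = service_time / (1 - rho)       # waiting time plus service time
    return rho, waiting, in_system
```

For example, 8 items per second with a 0.1 second service time gives a utilization of 0.8, a queue wait of about 0.4 seconds, and a total of about 0.5 seconds. Under this model the time in system grows without bound as utilization approaches 1, which is the steep rise in response time at high load.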
Performance Management
predicting response time, rejection rate and availability
sufficient input
simulate traffic
[Figure: response time (4, 8, 12 sec) versus system load (utilization, 0.2 to 0.8), showing actual response time, projected response time, and the limit of experience]
Performance Management
Reporting performance information
text reports are the most common way
utilization and error rates
network devices and links
data in either a graphical format or on a bitmapped display
Fault Management
Problems of Fault Monitoring
Fault observation
unobservable faults (e.g. the existence of deadlock)
partially observable faults (e.g. failure of some low-level protocol in an
attached device)
uncertainty in observation (e.g. lack of response)
Fault diagnosis
multiple potential causes
too many related observations
interference between diagnosis and local recovery
absence of automated testing tools
Fault Management
Propagation of failure to higher layers
[Figure: a transmission break on the path between two clients (client, router, mux, mux, router, client) propagates upward as a data link failure, a transport failure, and an application failure]
Fault Management
Examples of tests that a fault monitoring system should provide
connectivity test
data integrity test
protocol integrity test
data saturation test
connection saturation test
response time test
loopback test
function test
diagnostic test
Accounting Management
Accounting monitoring
Keeps track of users’ usage of network resources
Resources
communication facilities: LANs, WANs, leased lines, dial-up lines, and PBX
systems
computer hardware: workstations and servers
software and systems: applications and utility software in servers, a data
center, and end user sites
services: includes all commercial communications and information
services available to network users
Accounting data
user identification
receiver
number of packets
resources used
time stamps & priority level
Accounting Management
Determining network resource usage
total number of transactions
number of logins
total number of packets
total number of bytes (reflecting bandwidth)
billing on output
bytes received
– Email
– Acknowledgment
Security level
Accounting Management
Accomplishing Accounting Management
Gathering data about the utilization of network resources
Using metrics to help set usage quotas
Billing users for their network use
Metrics and Quotas
SNMP
RFC 1272, “Internet Accounting: Background”
defines the services to be metered and the usage reporting
defines the types of information necessary at various layers
Metrics work with quotas
Billing
One-time installation fee
Monthly fee
Fee based on amount of network resources consumed
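The three billing components listed above can be combined as in the following Python sketch. The `Tariff` fields, the per-megabyte pricing unit, and the amounts in the usage note are hypothetical; a real tariff could meter packets, transactions, or connect time instead.

```python
from dataclasses import dataclass


@dataclass
class Tariff:
    installation_fee: float    # charged once, on the first bill
    monthly_fee: float         # flat charge every month
    price_per_megabyte: float  # usage-based charge (assumed unit)


def bill(tariff: Tariff, megabytes_used: float, first_month: bool = False) -> float:
    """Monthly charge: flat fee plus usage, plus installation if first bill."""
    total = tariff.monthly_fee + tariff.price_per_megabyte * megabytes_used
    if first_month:
        total += tariff.installation_fee
    return round(total, 2)
```

With `Tariff(50.0, 20.0, 0.01)` and 1500 megabytes used, a regular month costs 35.0 and the first month costs 85.0.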
Accounting Management
Network management system
A simple tool
monitor for metrics that exceed quotas
report that data
A more complex tool
perform network billing
determine where to poll for billing information
An advanced tool
forecast the need for network resources
establish reasonable metrics and quotas
predict their billing costs
Reporting
real-time display: the current value of a metric
text reports: historical accounting and billing information
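The behavior of the simple tool described on this slide, monitoring for metrics that exceed quotas and reporting that data, might look like the sketch below. The per-user packet counts, the dictionary layout, and the report wording are all illustrative assumptions.

```python
from typing import Dict, List


def over_quota(usage: Dict[str, int], quotas: Dict[str, int]) -> List[str]:
    """Return one report line per user whose usage exceeds the quota.

    Users without a configured quota are not reported.
    """
    report: List[str] = []
    for user, used in sorted(usage.items()):
        quota = quotas.get(user)
        if quota is not None and used > quota:
            report.append(f"{user}: used {used} packets, quota {quota}")
    return report
```

Given usage of 1200 packets for `alice` against a quota of 1000, and 300 for `bob` against a quota of 500, only `alice` appears in the report. The same lines could feed either a real-time display or a text report.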