Keeping Network Monitoring Current using Automated Nagios Configurations (WIP)
Greg Wickham
APAN
July 2005
Is the network being monitored correctly?
Contents
• Background
• Monitoring Overview / Requirements
• Solution Architecture
• Monitoring Verification
• Conclusion
GrangeNet Architecture
GrangeNet Monitoring
Device Types    Quantity    Probes
Routers                6       310
Servers                6         6
Switches               4         7
Total                 16       323
Nagios Lines (services.cfg): 3172
GrangeNet Monitoring (ACT Edge)
Probe Types       Quantity
Fan                      3
Hardware                17
Ping                     1
Power                    2
Temperature              1
Interfaces              16
MSDP Peerings            8
BGP Peerings            15
OSPF                     2
Total Probes:           65
Notes: (39)
GrangeNet Monitoring (ACT Edge)
• Is that everything that can be monitored?
No!
• What else?
– BGP address family peerings (Multicast / Unicast / IPv6)
– Software versions
– Hardware versions
– Latency (of links)
– Usage (of links)
– …
Monitoring Solution
• Solution Goals:
– Verify that the network is correctly monitored
– Minimise replication of data
– Simple integration with existing systems
– Easy to maintain
– Extensible
– Flexible
– Efficient
Monitoring Overview
• Facts:
– Networks change
– Updating the monitoring is tedious
– Monitoring is difficult to audit
• Answers Required:
– Is the network performing optimally?
– Has a change occurred?
– What is the status of the network?
– Is the monitoring accurate?
Solution Architecture
[Diagram: Monitoring Configuration]
• Configuration data stored as XML
• Describes:
– Devices to monitor
– How to monitor
– Nagios templates
– Device templates
Solution Architecture
[Diagram: Monitoring Configuration, Monitoring Daemon]
• Daemon reads configuration data
• Verifies devices are monitored correctly
• Generates Nagios configurations
• Performs device probes
• Runs periodically
Solution Architecture
[Diagram: Monitoring Configuration, Monitoring Daemon, Nagios Configuration]
Nagios configuration is automatically generated by the Monitoring Daemon
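To make the generation step concrete, here is a minimal sketch of rendering Nagios service objects for passively checked probes. It is written in Python purely for illustration (the software itself is listed as Perl) and is not the author's generator. The function names, the `generic-passive` template and the `passive-placeholder` check command are assumptions; the directives used (`use`, `host_name`, `service_description`, `check_command`, `active_checks_enabled`, `passive_checks_enabled`, `contact_groups`) are standard Nagios service directives.

```python
def render_service(host, description, contact_group, template="generic-passive"):
    """Render one Nagios service object for a passively checked probe."""
    directives = [
        ("use", template),                          # hypothetical Nagios template name
        ("host_name", host),
        ("service_description", description),
        ("check_command", "passive-placeholder"),   # hypothetical command, never run actively
        ("active_checks_enabled", "0"),             # Nagios does no probing of its own
        ("passive_checks_enabled", "1"),            # results are submitted by the daemon
        ("contact_groups", contact_group),
    ]
    body = "\n".join(f"    {key:<24}{value}" for key, value in directives)
    return "define service {\n" + body + "\n}\n"


def render_services(probes):
    """Render a services.cfg fragment from a list of probe dictionaries."""
    return "\n".join(render_service(**probe) for probe in probes)


# Probe data borrowed from the device description later in the deck.
probes = [
    {"host": "edge1.vic", "description": "AS18062 - edge1.nsw", "contact_group": "gn-noc"},
    {"host": "edge1.vic", "description": "AS64670", "contact_group": "gn-noc"},
]
config_text = render_services(probes)
print(config_text)
```

Generating the file from one data source is what keeps the 3172 lines of services.cfg from being maintained by hand.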
Solution Architecture
[Diagram: Monitoring Configuration, Monitoring Daemon, Nagios Configuration, Nagios Daemon]
Nagios uses the configuration supplied by the monitoring daemon;
Nagios is configured to use ‘passive’ checks
Solution Architecture
[Diagram: Monitoring Configuration, Monitoring Daemon, Nagios Configuration, Nagios Daemon, Network Devices]
Monitoring daemon queries all devices using SNMP;
checks device telemetry against known configurations
Solution Architecture
[Diagram: Monitoring Configuration, Monitoring Daemon, Nagios Configuration, Nagios Daemon, Network Devices]
Monitoring daemon sends probe status direct to Nagios (Nagios running passive checks)
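One common way to hand a probe status to Nagios as a passive result is the `PROCESS_SERVICE_CHECK_RESULT` external command, written as a single line to the Nagios command file. The sketch below (Python, for illustration only) formats such a line. Whether the daemon described here uses this exact mechanism is an assumption, and the function names and the command file path passed to `submit` are hypothetical.

```python
import time

# Standard Nagios service return codes.
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def passive_result(host, service, state, output, now=None):
    """Format one passive service check result as a Nagios external command."""
    timestamp = int(time.time() if now is None else now)
    return (f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{STATES[state]};{output}\n")


def submit(line, command_file):
    """Write the command line to the Nagios command file (a named pipe).

    The path is site-specific and must be supplied by the caller.
    """
    with open(command_file, "w") as pipe:
        pipe.write(line)


line = passive_result("edge1.vic", "AS18062 - edge1.nsw", "OK",
                      "bgpPeerState up", now=1120176000)
print(line, end="")
# prints: [1120176000] PROCESS_SERVICE_CHECK_RESULT;edge1.vic;AS18062 - edge1.nsw;0;bgpPeerState up
```

The host name and service description must match the generated Nagios configuration exactly, which is one reason to produce both from the same data.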
Solution Architecture
[Diagram: Monitoring Configuration, Monitoring Daemon, Nagios Configuration, Nagios Daemon, Network Devices, eMail, SMS, Web]
Nagios reports on network health as usual but does no active checking of its own
Solution Architecture
[Diagram: Monitoring Configuration, Monitoring Daemon, Nagios Configuration, Nagios Daemon, Network Devices, eMail, SMS, Web, Report]
A device monitoring comparison report is generated
Solution Architecture
[Diagram: Monitoring Configuration, Monitoring Daemon, Nagios Configuration, Nagios Daemon, Network Devices, eMail, SMS, Web, Report, RRDtool]
Collected data is fed to optional sub-systems
Solution Architecture
• Result
– Only one process communicates with all devices
  • Very efficient
  • Query time for 34 devices is < 10 seconds
– As only one daemon communicates with the devices, the load on each network device is minimised (collected data is distributed as necessary)
– As Nagios does less work, the monitoring server is less loaded (Nagios is heavy)
Monitoring Verification
• Templates are used to define pre-requisite monitoring probes
• Devices are attached to templates
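The relationship between devices, templates and pre-requisite probes can be sketched as a small expansion step. The Python below is illustrative only: it assumes templates may include other templates, and the dictionary layout, the function name `required_probes` and the probe names attached to `system-health` are hypothetical.

```python
def required_probes(template_name, templates, _seen=None):
    """Collect the probes a template requires, following included templates."""
    seen = set() if _seen is None else _seen
    if template_name in seen:          # guard against circular includes
        return set()
    seen.add(template_name)
    template = templates[template_name]
    probes = set(template.get("probes", []))
    for included in template.get("includes", []):
        probes |= required_probes(included, templates, seen)
    return probes


# Hypothetical template data, for illustration only.
templates = {
    "system-health": {"probes": ["fan", "power", "temperature"]},
    "ibgp-mesh": {"includes": ["system-health"], "probes": ["ibgp-mesh"]},
    "ospf": {"probes": ["ospf"]},
}

# A device attached to two templates needs the union of their probes.
device_templates = ["ibgp-mesh", "ospf"]
needed = set()
for name in device_templates:
    needed |= required_probes(name, templates)
print(sorted(needed))
# prints: ['fan', 'ibgp-mesh', 'ospf', 'power', 'temperature']
```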
Monitoring Verification
Device Description
<device>
  <alias>edge1.vic</alias>
  <address>202.0.98.68</address>
  …
  <module type="nagios">
    <template>ibgp-mesh</template>
    <template>ebgp-peerings</template>
    <template>ospf</template>
    <template>system</template>
    …
    <probe type="ibgp-mesh" description="AS18062 - edge1.nsw"
           arg="202.0.98.13" />
    <probe type="ebgp-peering" description="AS64670"
           arg="202.0.98.190" />
    …
  </module>
</device>
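A device description like this is straightforward to load with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` for illustration (the software itself is listed as Perl). The elided parts of the slide are simply left out of the sample document, and the function name `load_device` and the shape of the returned dictionary are assumptions.

```python
import xml.etree.ElementTree as ET

# Sample based on the slide; the elided ("…") portions are omitted.
DEVICE_XML = """
<device>
  <alias>edge1.vic</alias>
  <address>202.0.98.68</address>
  <module type="nagios">
    <template>ibgp-mesh</template>
    <template>ebgp-peerings</template>
    <template>ospf</template>
    <template>system</template>
    <probe type="ibgp-mesh" description="AS18062 - edge1.nsw" arg="202.0.98.13" />
    <probe type="ebgp-peering" description="AS64670" arg="202.0.98.190" />
  </module>
</device>
"""


def load_device(xml_text):
    """Extract the alias, address, templates and probes of one device."""
    root = ET.fromstring(xml_text)
    module = root.find("module[@type='nagios']")
    return {
        "alias": root.findtext("alias"),
        "address": root.findtext("address"),
        "templates": [t.text for t in module.findall("template")],
        "probes": [dict(p.attrib) for p in module.findall("probe")],
    }


device = load_device(DEVICE_XML)
print(device["alias"], device["templates"])
# prints: edge1.vic ['ibgp-mesh', 'ebgp-peerings', 'ospf', 'system']
```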
Monitoring Verification
Template Description
<template name="ibgp-mesh">
  <template>system-health</template>
  <probe name="ibgp-mesh" inherit="bgp-standard">
    <attribute type="field">bgpPeerState</attribute>
    <attribute type="notify">gn-noc</attribute>
    <attribute type="level">level1-service</attribute>
    <match>
      <field name="bgpPeerRemoteAs" value="18062" />
      <field name="bgpPeerState" value="up" />
    </match>
  </probe>
</template>
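The `<match>` block reads naturally as a set of field/value pairs that the collected telemetry must satisfy. The following Python sketch shows one possible evaluation. It assumes the daemon has already normalised SNMP values into strings such as "up", and the function names and the telemetry row format are hypothetical. Only the parts of the template needed for matching are reproduced.

```python
import xml.etree.ElementTree as ET

TEMPLATE_XML = """
<template name="ibgp-mesh">
  <probe name="ibgp-mesh">
    <match>
      <field name="bgpPeerRemoteAs" value="18062" />
      <field name="bgpPeerState" value="up" />
    </match>
  </probe>
</template>
"""


def match_fields(template_xml, probe_name):
    """Return the expected field values declared in a probe's <match> block."""
    root = ET.fromstring(template_xml)
    for probe in root.findall("probe"):
        if probe.get("name") == probe_name:
            return {f.get("name"): f.get("value") for f in probe.findall("match/field")}
    raise KeyError(probe_name)


def mismatches(expected, row):
    """Map each failing field to its (expected, observed) pair."""
    return {name: (want, row.get(name))
            for name, want in expected.items() if row.get(name) != want}


expected = match_fields(TEMPLATE_XML, "ibgp-mesh")
healthy = {"bgpPeerRemoteAs": "18062", "bgpPeerState": "up"}    # assumed row format
broken = {"bgpPeerRemoteAs": "18062", "bgpPeerState": "down"}
print(mismatches(expected, healthy))   # prints: {}
print(mismatches(expected, broken))    # prints: {'bgpPeerState': ('up', 'down')}
```

An empty result means the probe passes; anything else gives the daemon enough detail to report what differed.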
Monitoring Verification
• From the device template and the monitoring template, an accurate report can be generated of the status of monitoring
• All probe details are stored in XML, so they can be easily verified
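A verification report of this kind can be reduced to a comparison of two sets: the probes the templates require and the probes actually configured for the device. The Python sketch below is a minimal illustration under that assumption; the function name, the report keys and the sample probe names are hypothetical.

```python
def monitoring_report(required, configured):
    """Compare required probes against configured probes for one device."""
    required, configured = set(required), set(configured)
    return {
        "missing": sorted(required - configured),      # required but not monitored
        "unexpected": sorted(configured - required),   # monitored but not required
        "ok": sorted(required & configured),
    }


# Hypothetical data for one device.
required = ["ibgp-mesh", "ospf", "fan", "power"]
configured = ["ibgp-mesh", "ospf", "fan", "ping"]

report = monitoring_report(required, configured)
for status in ("missing", "unexpected", "ok"):
    print(f"{status}: {', '.join(report[status]) or '-'}")
```

The "missing" list answers the question on the title slide: it names what should be monitored but is not.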
Conclusion
• Due to efficiencies in the monitoring daemon:
– Nagios doesn’t load the server
– Other applications can share the SNMP data
– Load on the network devices is minimised
– Device probing is very quick
• Reduces the complexity of the Nagios configuration
• Generates reports identifying inaccuracies in existing monitoring
• Unified configuration data
This is a Work in Progress
Status (Work in Progress)
• Current functionality:
– Separate applications:
• Collecting data from devices; feed into Nagios
• Generating Nagios configurations
• To Do
– Integrate applications
– Complete implementation of Nagios templates
– Documentation!
• Software
– Perl
– net-snmp
– Nagios