CIS 187 CCNP SWITCH
Ch. 5 IP SLAs
Rick Graziani
Cabrillo College
[email protected]
Spring 2011
Understanding High Availability
Components of High Availability
Redundancy
Technology (including hardware and software features)
People
Processes
Tools
Redundancy
Geographic diversity and path diversity are often included.
Dual devices and links are common.
Dual WAN providers are common.
Dual data centers are sometimes used, especially for large companies
and large e-commerce sites.
Dual collocation facilities, dual phone central office facilities, and dual
power substations can be implemented.
Technology
Cisco Nonstop Forwarding (NSF)
Stateful Switchover (SSO)
Graceful Restart
Cisco IOS IP Service Level Agreements (SLA)
Object Tracking
Firewall Stateful Failover
People
Prepare, Plan, Design, Implement, Operate, and Optimize
(PPDIOO) is a guide.
Work habits and attention to detail are important.
Skills are acquired via ongoing technical training.
Good communication and documentation are critical.
Use lab testing to simulate failover scenarios.
Take time to design.
Identify roles.
Identify responsibilities.
Align teams with services.
Ensure time to do job.
Processes
Organizations should build repeatable processes.
Organizations should use labs appropriately.
Organizations need meaningful change controls.
Management of operational changes is important.
Tools
Network diagrams.
Documentation of network design evolution.
Key addresses, VLANs, and servers documented.
Documentation tying services to applications and physical servers.
Resiliency for High Availability
Network-Level Resiliency
High Availability and Failover Times
Network-Level Resiliency
Built with device and link redundancy.
Employs fast convergence.
Relies on monitoring with NTP, SNMP, Syslog, and IP SLA.
High Availability and Failover Times
Tuned routing protocols fail over in less than 1 second.
RSTP converges in about 1 second.
EtherChannel can fail over in approximately 1 second.
HSRP default timers are 3 seconds for hello and 10 seconds for hold time.
Stateful service modules typically fail over within 3-5 seconds.
TCP/IP stacks have up to a 9-second tolerance.
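One way to read these figures is to compare each failover time against the 9-second TCP/IP tolerance. The following is a minimal Python sketch that uses only the numbers from this slide. The dictionary and function names are illustrative, and where the slide gives a range or "less than", the largest stated value is used.

```python
# Failover times in seconds, taken from the slide above.
# Where the slide gives a range, the upper bound is used.
FAILOVER_SECONDS = {
    "tuned routing protocol": 1,
    "RSTP": 1,
    "EtherChannel": 1,
    "HSRP (default hold time)": 10,
    "stateful service module": 5,
}

TCP_TOLERANCE_SECONDS = 9  # "up to a 9-second tolerance"


def exceeds_tcp_tolerance(times, tolerance=TCP_TOLERANCE_SECONDS):
    """Return the mechanisms whose failover time is longer than the tolerance."""
    return sorted(name for name, secs in times.items() if secs > tolerance)


print(exceeds_tcp_tolerance(FAILOVER_SECONDS))
```

With these figures, only the default HSRP hold time falls outside the tolerance.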
Optimal Redundancy
Provide alternate paths.
Avoid too much redundancy.
Avoid single point of failure.
Use Cisco NSF with SSO, if applicable.
Use Cisco NSF with routing protocols.
Provide Alternate Paths
Use redundant distribution-to-core links in case a core switch fails.
Link distribution switches to support summarization of routing information
from the distribution to the core.
Avoid Too Much Redundancy
Where should the root switch be placed? With this design, it is not
easy to determine where the root switch is located.
What links should be in a blocking state? It is hard to determine how
many ports will be in a blocking state.
What are the implications of STP and RSTP convergence? The
network convergence is definitely not deterministic.
When something goes wrong, how do you find the source of the
problem? The design is much harder to troubleshoot.
Avoid Single Point of Failure
Key element of high availability.
Easy to implement at core and distribution.
Access layer switch is single point of failure. Reduce outages to 1 to 3
seconds in the access layer with:
• SSO in L2 environment
• Cisco NSF with SSO in L3 environment.
Cisco NSF with SSO (Stateful Switchover)
Supervisor redundancy mechanism in Cisco IOS that enables supervisor
switchover at L2-L3-L4.
SSO enables the standby RP to take control after a fault on the active RP.
Cisco NSF is an L3 function that works with SSO to minimize the time the
network is unavailable following a switchover, by continuing to forward IP
packets after the RP switchover.
Routing Protocols and NSF (Cisco Nonstop Forwarding)
NSF enables continued forwarding of packets along known routes while routing
protocol information is being restored during switchover.
Switchover must complete before NSF dead and hold timers expire, or routing
peers will reset adjacencies and reroute traffic.
Implementing Network Monitoring
Logging Services
Events on networking devices can be logged.
Various events
Various levels of severity
Events are logged to:
Console (default)
Console display
Buffer
Server
Examples
Interfaces up or down
Configuration changes
Routing protocol adjacencies
Logging Services
Logging severity levels on Cisco Systems devices are as follows:
(0) Emergencies
(1) Alerts
(2) Critical
(3) Errors
(4) Warnings
(5) Notifications
(6) Informational
(7) Debugging
By default, all messages from level 0 to 7 are logged to the console
Logging Services
Console
You can also adjust the logging severity level of the console.
By default, all messages from level 0 to 7 are logged to the console.
You can configure the severity level as an optional parameter:
logging console level
Limits the logging of messages displayed on the console terminal to
the specified level and (numerically) lower levels.
You can enter the level number or level name.
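The "specified level and numerically lower levels" rule can be sketched in a few lines of Python. This is only an illustration of the comparison, not how IOS implements it. The function names are hypothetical, and the level names are the keywords from the severity list earlier in this section.

```python
# Cisco severity keywords and their numbers, as listed earlier.
SEVERITY = {
    "emergencies": 0,
    "alerts": 1,
    "critical": 2,
    "errors": 3,
    "warnings": 4,
    "notifications": 5,
    "informational": 6,
    "debugging": 7,
}


def to_level(level):
    """Accept either a level number (0-7) or a level name."""
    if isinstance(level, int):
        if 0 <= level <= 7:
            return level
        raise ValueError(f"level out of range: {level}")
    return SEVERITY[level.lower()]


def is_displayed(message_level, configured_level):
    """True if a message passes a 'logging console <level>' style filter:
    its severity number must be less than or equal to the configured one."""
    return to_level(message_level) <= to_level(configured_level)


print(is_displayed("errors", "warnings"))     # level 3 vs. limit 4
print(is_displayed("debugging", "warnings"))  # level 7 vs. limit 4
```

The first call reports that the message is shown and the second that it is suppressed, because 3 is within the limit of 4 and 7 is not.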
Logging Services
Buffer
logging buffered [buffer-size|level]
Buffered logging may or may not be enabled by default, depending on the platform.
By default, messages of all severity levels are logged to the buffer.
show logging displays the content of the buffer.
The buffer is circular, meaning that when the buffer has reached its
maximum capacity, the oldest messages will be discarded to allow the
logging of new messages.
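The circular behavior can be illustrated with a bounded deque in Python. This is a simplified sketch: the capacity here counts messages, whereas the IOS buffer size is configured in bytes, and the function name is hypothetical.

```python
from collections import deque


def make_log_buffer(capacity):
    """A bounded buffer: appending to a full deque drops the oldest entry."""
    return deque(maxlen=capacity)


buffer = make_log_buffer(3)
for n in range(1, 6):
    buffer.append(f"message {n}")

print(list(buffer))  # ['message 3', 'message 4', 'message 5']
```

After five messages are written to a three-message buffer, the two oldest have been discarded.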
Configuring Syslog
To configure logging to the buffer of the local switch, use the command
logging buffered.
Switch(config)# logging buffered ?
  <0-7>              Logging severity level
  <4096-2147483647>  Logging buffer size
  alerts             Immediate action needed (severity=1)
  critical           Critical conditions (severity=2)
  debugging          Debugging messages (severity=7)
  discriminator      Establish MD-Buffer association
  emergencies        System is unusable (severity=0)
  errors             Error conditions (severity=3)
  informational      Informational messages (severity=6)
  notifications      Normal but significant conditions (severity=5)
  warnings           Warning conditions (severity=4)
  xml                Enable logging in XML to XML logging buffer
Logging Services
Server
Use the logging ip-address command.
In some IOS versions the command is logging host.
By default, only messages of severity level 6 or lower will be logged to the
syslog server.
This can be changed by entering the logging trap level command.
Configuring Syslog
To configure a syslog server, use the logging ip_addr global
configuration command.
To set which severity levels of messages are sent to the syslog server,
use the global configuration command logging trap level.
Switch(config)# logging trap ?
  <0-7>          Logging severity level
  alerts         Immediate action needed (severity=1)
  critical       Critical conditions (severity=2)
  debugging      Debugging messages (severity=7)
  emergencies    System is unusable (severity=0)
  errors         Error conditions (severity=3)
  informational  Informational messages (severity=6)
  notifications  Normal but significant conditions (severity=5)
  warnings       Warning conditions (severity=4)
Sample Syslog Messages
08:01:13: %LINEPROTO-5-UPDOWN: Line protocol on Interface
FastEthernet0/5, changed state to up
08:01:23: %DUAL-5-NBRCHANGE: EIGRP-IPv4:(1) 1: Neighbor 10.1.1.1
(Vlan1) is up: new adjacency
08:02:31: %LINK-3-UPDOWN: Interface FastEthernet0/8, changed state
to up
08:18:20: %LINEPROTO-5-UPDOWN: Line protocol on Interface
FastEthernet0/5, changed state to down
08:18:22: %LINEPROTO-5-UPDOWN: Line protocol on Interface
FastEthernet0/5, changed state to up
08:18:24: %LINEPROTO-5-UPDOWN: Line protocol on Interface
FastEthernet0/2, changed state to down
08:18:24: %ILPOWER-5-IEEE_DISCONNECT: Interface Fa0/2: PD removed
08:18:26: %LINK-3-UPDOWN: Interface FastEthernet0/2, changed state
to down
08:19:49: %ILPOWER-7-DETECT: Interface Fa0/2: Power Device detected:
Cisco PD
08:19:53: %LINK-3-UPDOWN: Interface FastEthernet0/2, changed state
to up
08:19:53: %LINEPROTO-5-UPDOWN: Line protocol on Interface
FastEthernet0/2, changed state to up
Syslog Severity Levels
Smaller numerical levels are the more critical syslog alarms.
Syslog Severity    Severity Level
Emergency          Level 0, highest level
Alert              Level 1
Critical           Level 2
Error              Level 3
Warning            Level 4
Notice             Level 5
Informational      Level 6
Debugging          Level 7
Syslog Facilities
Service identifiers.
Identify and categorize system state data for error and event message
reporting.
Cisco IOS has more than 500 facilities.
Most common syslog facilities:
IP
OSPF
Operating system (SYS)
IP Security (IPsec)
Route Switch Processor (RSP)
Interface (IF)
Syslog Message Format
System messages begin with a percent sign (%)
Facility: A code consisting of two or more uppercase letters that indicates the
hardware device, protocol, or a module of the system software.
Severity: A single-digit code from 0 to 7 that reflects the severity of the
condition. The lower the number, the more serious the situation.
Mnemonic: A code that uniquely identifies the error message.
Message-text: A text string describing the condition. This portion of the
message sometimes contains detailed information about the event, including
terminal port numbers, network addresses, or addresses that correspond to
locations in the system memory address space.
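The four fields can be pulled out of a message with a regular expression. The Python sketch below is an illustration only. The pattern and function name are assumptions, and it treats the facility as a single token without hyphens, which holds for the sample messages in this deck but may not cover every IOS message.

```python
import re

# %FACILITY-SEVERITY-MNEMONIC: message-text
# Assumption: the facility is one token of uppercase letters, digits,
# or underscores, with no embedded hyphen.
SYSLOG_RE = re.compile(
    r"%(?P<facility>[A-Z][A-Z0-9_]+)"
    r"-(?P<severity>[0-7])"
    r"-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<text>.*)"
)


def parse_syslog(line):
    """Return a dict of the four fields, or None if the line does not match."""
    match = SYSLOG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    return fields


sample = "08:18:24: %ILPOWER-5-IEEE_DISCONNECT: Interface Fa0/2: PD removed"
print(parse_syslog(sample))
```

For the sample line, the facility is ILPOWER, the severity is 5, the mnemonic is IEEE_DISCONNECT, and the rest is the message text.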
Verifying Syslog Configuration
Use the show logging command to display the content of the local
log files.
Use the pipe argument (|) in combination with keywords such as
include or begin to filter the output.
Switch# show logging | include LINK-3
2d20h: %LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to up
2d20h: %LINK-3-UPDOWN: Interface FastEthernet0/2, changed state to up
2d20h: %LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to up
Switch# show logging | begin %DUAL
2d22h: %DUAL-5-NBRCHANGE: EIGRP-IPv4:(10) 10: Neighbor 10.1.253.13
(FastEthernet0/11) is down: interface down
2d22h: %LINK-3-UPDOWN: Interface FastEthernet0/11, changed state to down
2d22h: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/11,
changed state to down
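The difference between the two filters shown above is that include keeps only matching lines, while begin prints everything from the first matching line onward. A minimal Python sketch of that behavior, with hypothetical function names and log lines shortened from the output above:

```python
import re


def include(lines, pattern):
    """Keep only the lines that match the pattern, like '| include'."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]


def begin(lines, pattern):
    """Return output starting at the first matching line, like '| begin'."""
    rx = re.compile(pattern)
    for index, line in enumerate(lines):
        if rx.search(line):
            return lines[index:]
    return []


log = [
    "2d20h: %LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to up",
    "2d22h: %DUAL-5-NBRCHANGE: EIGRP-IPv4:(10) 10: Neighbor 10.1.253.13 is down",
    "2d22h: %LINK-3-UPDOWN: Interface FastEthernet0/11, changed state to down",
]

print(include(log, "LINK-3"))  # first and third lines only
print(begin(log, "%DUAL"))     # second line and everything after it
```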
Cisco IP SLA
IP SLA, a feature of Cisco IOS software, allows you to configure a router to
send synthetic traffic to:
A host computer
A router that has been configured to respond (Responder)
IP SLA is very useful for:
performance measurement
monitoring
network baselining.
You can tie the results of the IP SLA operations to other features of your
router and trigger action based on the results of the probe.
To implement IP SLA network performance measurement, you need to
perform the following tasks:
Enable the IP SLA responder, if required.
Configure the required IP SLA operation type.
Configure any options available for the specified operation type.
Configure threshold conditions, if required.
Schedule the operation to run, and then let the operation run for a
period of time to gather statistics.
Display and interpret the results of the operation using the Cisco IOS
CLI or a network management system (NMS), with Simple Network
Management Protocol (SNMP).
Depending on the type of probe you set up, you may or may not need to
configure an IP SLA Responder.
For example, if you are setting up a simple echo probe to an IP host, you do
not need a responder.
An IP SLA Responder allows for more detailed information to be retrieved.
The IP SLA Responder is a component embedded in the destination Cisco
routing device.
It allows the system to anticipate and respond to IP SLA request packets.
It provides accurate measurements without the need for dedicated probes, along
with additional statistics not available from standard ICMP-based measurements.
See information regarding “IP SLAs with Responder Time Stamps”
The IP SLA source (a Cisco device) uses an IP SLA Control Protocol to
communicate with the IP SLA Responder.
The source tells the responder which port it should listen on and respond to.
The responder will enable the specified UDP or TCP port for a specific
duration.
Example: Network Availability
Router(config)# ip route 0.0.0.0 0.0.0.0 fa0/0
Router(config)# ip route 0.0.0.0 0.0.0.0 fa0/1 5
[Figure: network diagram showing interfaces fa0/0 and fa0/1 and address 172.16.1.1]
Customer A is multihoming to two ISPs.
Customer A is not using BGP with the ISPs, but is using static default routes.
Two default static routes with different administrative distances are
configured:
Link to ISP-1 is the primary link
Link to ISP-2 is the backup link
The static default route with the lower administrative distance will be
preferred and injected into the routing table.
However, if there is a problem within the ISP-1 domain but its interface to
Customer A is still up, all traffic from Customer A will still go to that ISP.
The traffic may then get lost within the ISP.
The solution to this issue is the Cisco IOS IP SLAs functionality.
Configure the SLAs to:
Continuously check the reachability of a specific destination such as:
Provider edge (PE) router interface
ISP's DNS server
Any other specific destination (10.1.1.1 and 172.16.1.1 in this example)
Conditionally announce the default route only if the connectivity is
verified.
Probe:
R1(config)# ip sla monitor 11
R1(config-rtr)# type echo protocol ipIcmpEcho 10.1.1.1 source-interface fa0/0
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 11 life forever start-time now
Tracking object:
R1(config)# track 1 rtr 11 reachability
Status of tracking object:
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/0 2 track 1
Defining the Probe
ip sla monitor 11: defines probe 11
type echo protocol ipIcmpEcho: specifies that ICMP echoes are sent:
To destination 10.1.1.1 to check connectivity
With the source interface of FastEthernet0/0
frequency 10: schedules the connectivity test to repeat every 10 seconds.
ip sla monitor schedule 11 life forever start-time now: starts the probe
immediately and runs it indefinitely.
Probe:
R1(config)# ip sla monitor 11
R1(config-rtr)# type echo protocol ipIcmpEcho 10.1.1.1 source-interface fa0/0
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 11 life forever start-time now
Tracking object:
R1(config)# track 1 rtr 11 reachability
Status of tracking object:
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/0 2 track 1
Defining the Tracking Object
track 1 rtr 11 reachability: Specifies that:
Object 1 is tracked (it is used in the next step)
Object 1 is linked to probe 11 (defined in the first step) so that the
reachability of 10.1.1.1 is tracked.
Probe:
R1(config)# ip sla monitor 11
R1(config-rtr)# type echo protocol ipIcmpEcho 10.1.1.1 source-interface fa0/0
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 11 life forever start-time now
Tracking object:
R1(config)# track 1 rtr 11 reachability
Status of tracking object:
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/0 2 track 1
Defining an action based on the status of the tracking object
ip route 0.0.0.0 0.0.0.0 fa0/0 2 track 1: Conditionally announces the default
route out fa0/0, with an administrative distance of 2, if the result of tracking
object 1 is true (that is, if the probe is successful).
To summarize: If 10.1.1.1 is reachable, a static default route out fa0/0 with an
administrative distance of 2 is installed in the routing table.
Probe:
R1(config)# ip sla monitor 22
R1(config-rtr)# type echo protocol ipIcmpEcho 172.16.1.1 source-interface fa0/1
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 22 life forever start-time now
Tracking object:
R1(config)# track 2 rtr 22 reachability
Status of tracking object:
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/1 3 track 2
Defining the Probe
ip sla monitor 22: defines probe 22
type echo protocol ipIcmpEcho: specifies that ICMP echoes are sent:
To destination 172.16.1.1 to check connectivity
With the source interface of FastEthernet0/1
frequency 10: schedules the connectivity test to repeat every 10 seconds.
ip sla monitor schedule 22 life forever start-time now: starts the probe
immediately and runs it indefinitely.
Probe:
R1(config)# ip sla monitor 22
R1(config-rtr)# type echo protocol ipIcmpEcho 172.16.1.1 source-interface fa0/1
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 22 life forever start-time now
Tracking object:
R1(config)# track 2 rtr 22 reachability
Status of tracking object:
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/1 3 track 2
Defining the Tracking Object
track 2 rtr 22 reachability: Specifies that:
Object 2 is tracked (it is used in the next step)
Object 2 is linked to probe 22 (defined in the first step) so that the
reachability of 172.16.1.1 is tracked.
Probe:
R1(config)# ip sla monitor 22
R1(config-rtr)# type echo protocol ipIcmpEcho 172.16.1.1 source-interface fa0/1
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 22 life forever start-time now
Tracking object:
R1(config)# track 2 rtr 22 reachability
Status of tracking object:
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/1 3 track 2
Defining an action based on the status of the tracking object
ip route 0.0.0.0 0.0.0.0 fa0/1 3 track 2: Conditionally announces the
default route out fa0/1, with an administrative distance of 3, if the result of
tracking object 2 is true (that is, if the probe is successful).
To summarize: If 172.16.1.1 is reachable, a static default route out fa0/1 with
an administrative distance of 3 is “offered” to the routing table.
Because this default route has the higher AD of 3, it serves as the backup
path for as long as the primary path via R2 (AD 2) is available.
Probe:
R1(config)# ip sla monitor 11
R1(config-rtr)# type echo protocol ipIcmpEcho 10.1.1.1 source-interface fa0/0
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 11 life forever start-time now
Tracking object:
R1(config)# track 1 rtr 11 reachability
Status of tracking object:
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/0 2 track 1
If 10.1.1.1 is reachable, a static default route via R2 with an administrative
distance of 2 is installed in the routing table.
Probe:
R1(config)# ip sla monitor 22
R1(config-rtr)# type echo protocol ipIcmpEcho 172.16.1.1 source-interface fa0/1
R1(config-rtr)# frequency 10
R1(config)# ip sla monitor schedule 22 life forever start-time now
Tracking object:
R1(config)# track 2 rtr 22 reachability
Status of tracking object:
R1(config)# ip route 0.0.0.0 0.0.0.0 fa0/1 3 track 2
If 172.16.1.1 is reachable, a static default route via R3 with an administrative
distance of 3 is “available” to the routing table as a backup path.
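The selection logic of the two tracked default routes can be modeled in a few lines of Python. This is a conceptual sketch, not router code. The class and function names are hypothetical, and the interfaces, administrative distances, and track object numbers are the ones from the R1 configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrackedRoute:
    interface: str
    admin_distance: int
    track_object: int


# The two tracked default routes from the R1 configuration.
ROUTES = [
    TrackedRoute("fa0/0", 2, track_object=1),
    TrackedRoute("fa0/1", 3, track_object=2),
]


def active_default_route(routes, track_state) -> Optional[TrackedRoute]:
    """Only routes whose tracking object is up are candidates;
    among those, the lowest administrative distance wins."""
    candidates = [r for r in routes if track_state.get(r.track_object, False)]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.admin_distance)


# Both probes succeed: the primary route (AD 2) is used.
print(active_default_route(ROUTES, {1: True, 2: True}))
# Probe 11 to 10.1.1.1 fails: the backup route (AD 3) takes over.
print(active_default_route(ROUTES, {1: False, 2: True}))
```

If both probes fail, the function returns None, which corresponds to neither tracked default route being in the routing table.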
Example: Type DNS
RouterB(config)# ip sla monitor 11
RouterB(config-rtr)# type dns target-addr www.cisco.com name-server 172.20.2.132
RouterB(config-rtr)# frequency 60
RouterB(config-rtr)# exit
RouterB(config)# ip sla monitor schedule 11 life forever start-time now
To measure the difference between the time taken to send a DNS request
and the time a reply is received by a Cisco device, use the IP SLAs DNS
operation.
This configuration defines an IP SLAs operation of type DNS that finds the IP
address of the hostname www.cisco.com.
DNS operation number 11 is scheduled to start immediately and run
indefinitely.
To view and interpret the results of an IP SLAs operation, use the show ip
sla monitor statistics command.
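The quantity the DNS operation measures, the time between sending a request and receiving the reply, can be illustrated on an ordinary host with Python. This sketch is an assumption-laden analogy: it uses the host's configured resolver through the standard library rather than a specific name server, and the function name and injectable resolver parameter are hypothetical.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.gethostbyname):
    """Resolve hostname once and return (address, elapsed milliseconds).

    The resolver argument exists so the timing logic can be exercised
    without network access; by default the system resolver is used.
    """
    start = time.perf_counter()
    address = resolver(hostname)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return address, elapsed_ms


# Example with a stand-in resolver that returns a documentation address.
address, elapsed_ms = time_lookup("www.cisco.com", resolver=lambda name: "192.0.2.1")
print(address, round(elapsed_ms, 3))
```

Calling time_lookup("www.cisco.com") without the resolver argument would perform a real lookup and report how long it took.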
Common IP SLA Issues
Probes will cause a burden if overscheduled:
If multiple senders overwhelm one receiver, or if the device is already a
bottleneck and its CPU utilization is high.
Senders generally suffer more from over-scheduling and high probe frequency.
Probe scheduling can be problematic if the clock on the device is out of
sync.
For this reason, synchronizing through Network Time Protocol (NTP) is highly
recommended.
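To get a rough feel for how quickly scheduled probes add up on a sender, the frequency values can be turned into a probe rate. The Python sketch below is simple arithmetic under an assumption: it considers a hypothetical sender running operations with the frequency values seen in this deck (10, 10, and 60 seconds), and the function name is illustrative.

```python
def probes_per_minute(frequencies_seconds):
    """Total probes sent per minute for operations that each repeat
    every 'frequency' seconds."""
    return sum(60 / f for f in frequencies_seconds)


# Two echo probes every 10 s and one DNS probe every 60 s.
print(probes_per_minute([10, 10, 60]))  # 13.0
```

Each additional operation, or each reduction in frequency interval, raises this rate, which is the load the sender and the receivers have to absorb.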
Cisco Internetwork Performance Monitor (IPM)
Several Cisco network management applications use IP SLAs. One example
is the Cisco Internetwork Performance Monitor (IPM) in the CiscoWorks2000
RWAN bundle.
Intro to Cisco IP SLA Operations - SolarWinds Video
http://www.youtube.com/watch?v=x-fQr24kFKg
Network Performance Monitoring: Using IP SLA Monitor with Orion NPM
http://www.youtube.com/watch?v=YKXoexOVsaE&feature=related
Implementing Redundant Supervisor Engines in Catalyst Switches
Redundancy Features on Catalyst 4500/6500
RPR (Route Processor Redundancy) and RPR+ (only on Catalyst 6500)
SSO (Stateful Switchover)
NSF (Nonstop Forwarding) with SSO
Route Processor Redundancy (RPR)
Redundancy   Catalyst 6500 Failover Time   Catalyst 4500 Failover Time
RPR          2-4 minutes                   Less than 60 seconds
RPR+         30-60 seconds                 ---
With RPR, any of the following events triggers a switchover from the active to the
standby Supervisor Engine:
Route Processor (RP) or Switch Processor (SP) crash on the active Supervisor
Engine.
A manual switchover from the CLI.
Removal of the active Supervisor Engine.
Clock synchronization failure between Supervisor Engines.
In a switchover, the redundant Supervisor Engine becomes fully operational and the
following events occur on the remaining modules during an RPR failover:
All switching modules are power-cycled.
Remaining subsystems on the MSFC (including Layer 2 and Layer 3 protocols)
are initialized on the prior standby, now active, Supervisor Engine.
ACLs based on the new active Supervisor Engine are reprogrammed into the
Supervisor Engine hardware.
Route Processor Redundancy Plus (RPR+)
Redundancy   Catalyst 6500 Failover Time   Catalyst 4500 Failover Time
RPR          2-4 minutes                   Less than 60 seconds
RPR+         30-60 seconds                 ---
RPR+ enhances Supervisor redundancy compared to RPR by
providing the following additional benefits:
Reduced switchover time: Depending on the configuration, the
switchover time is in the range of 30 seconds to 60 seconds.
No reloading of installed modules: Because both the startup
configuration and the running configuration stay continually
synchronized from the active to the redundant Supervisor Engine
during a switchover, no reloading of line modules occurs.
Synchronization of Online Insertion and Removal (OIR)
events between the active and standby: This occurs such that
modules in the online state remain online and modules in the
down state remain in the down state after a switchover.
Configuring and Verifying RPR+ Redundancy
Step 1. Use the redundancy command to start
configuring redundancy modes:
Step 2. Use the mode rpr-plus command under
redundancy configuration submode to configure RPR+:
Switch# configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
Switch(config)# redundancy
Switch(config-red)# mode rpr-plus
Switch(config-red)# end
Switch# show redundancy states
my state = 13 -ACTIVE
peer state = 1 -DISABLED
Mode = Simplex
Unit = Primary
Unit ID = 1
Redundancy Mode (Operational) = Route Processor Redundancy Plus
Redundancy Mode (Configured) = Route Processor Redundancy Plus
Split Mode = Disabled
Manual Swact = Disabled Reason: Simplex mode
Communications = Down Reason: Simplex mode
<output omitted>
Stateful Switchover (SSO)
Provides minimal Layer 2 traffic disruption during Supervisor
switchover.
Redundant Supervisor starts up in fully initialized state and
synchronizes with startup configuration and running configuration of
active Supervisor.
Standby Supervisor in SSO mode keeps in sync with active
Supervisor for all changes in hardware and software states for
features supported via SSO.
Protocols and Features Supported by SSO
802.3x (Flow Control)
802.3ad (LACP) and PAgP
802.1X (Authentication) and Port security
802.3af (Inline power)
VTP
Dynamic ARP Inspection/DHCP snooping/IP source guard
IGMP snooping (versions 1 and 2)
DTP (802.1Q and ISL)
MST/PVST+/Rapid-PVST
PortFast/UplinkFast/BackboneFast /BPDU Guard and filtering
Voice VLAN
Unicast MAC filtering
ACL (VLAN ACLs, Port ACLs, Router ACLs)
QOS (DBL)
Multicast storm control/broadcast storm control
Configuring and Verifying SSO
Step 1. Enter the redundancy command to start configuring redundancy
modes.
Step 2. Use the mode sso command under redundancy configuration
submode to configure SSO:
Switch# configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
Switch(config)# redundancy
Switch(config-red)# mode sso
Changing to sso mode will reset the standby. Do you want to continue?
[confirm]
Switch(config-red)# end
Switch# show redundancy states
my state = 13 -ACTIVE
peer state = 8 -STANDBY HOT
Mode = Duplex
Unit = Primary
Unit ID = 2
Redundancy Mode (Operational) = Stateful Switchover
Redundancy Mode (Configured) = Stateful Switchover
Split Mode = Disabled
Manual Swact = Enabled
Communications = Up
<output omitted>
NSF with SSO
Catalyst 4500 and 6500.
Minimizes time that L3 network is unavailable following Supervisor
switchover by continuing to forward IP packets using CEF entries
built from the old active Supervisor.
Zero or near zero packet loss.
Supports BGP, EIGRP, OSPF, and IS-IS.
Routing protocol neighbor relationships are maintained during
Supervisor failover.
Prevents route flapping.