ppt - Andrew.cmu.edu

15-441 Computer Networking
Network Management
Hui Zhang, Fall 2012
Introduction



We have spent a lot of time on network protocols
This lecture is about the network itself
What comes to mind when you think of networks?
- Devices (switches, routers, repeaters)
- Links (WiFi, SONET, Ethernet, T1, etc.)
- Interface cards
- Topology
What Does a “Device” Look Like?
[Figure: switch chassis with labeled parts: fan tray, port cards, fabric cards, SCPs, switching shelf area, power modules, fan and filter trays]
Switching Shelf Components
SCP
Switching Fabric
Port Cards
Switch Control Processor (SCP)
RS-232 serial port
NMI / RESET buttons
Power LEDs
Ethernet port
NEXT / SELECT buttons
Display LED
System LEDs
Logical Diagram of the Switch
[Figure: logical diagram of the switch showing physical slots 1-14, Fabric #1 through Fabric #4, SCP X and SCP Y, and labels 1A/B, 1C/D, 2A/B, 2C/D, 3A/B, 3C/D, 4A/B, 4C/D]
Documentation
Maybe you’ve asked, “How do you keep track of it all?”...
Document, document, document…
Documentation
Basics, such as documenting your switches...
- What is each port connected to?
- Can be a simple text file with one line for every port in a switch:
• health-switch1, port 1, Room 29 – Director’s office
• health-switch1, port 2, Room 43 – Receptionist
• health-switch1, port 3, Room 100 – Classroom
• health-switch1, port 4, Room 105 – Professor’s office
• …..
• health-switch1, port 25, uplink to health-backbone
- This information might be available to your network staff and help desk staff via a wiki, software interface, etc.
- Remember to label your ports!
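A minimal sketch of how a one-line-per-port text file like the one above could be loaded and queried. It assumes the comma-separated layout from the slide (switch, port, description); the function name `parse_port_docs` and the sample text are illustrative only and do not come from any tool named in the lecture.

```python
from typing import Dict, Tuple


def parse_port_docs(text: str) -> Dict[Tuple[str, int], str]:
    """Map (switch, port number) to the documented description.

    Lines that do not look like 'switch, port N, description' are skipped.
    """
    ports: Dict[Tuple[str, int], str] = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",", 2)]
        if len(parts) != 3:
            continue  # blank line, elision marker, or free-form note
        switch, port_field, description = parts
        tokens = port_field.split()
        if len(tokens) != 2 or tokens[0].lower() != "port" or not tokens[1].isdigit():
            continue
        ports[(switch, int(tokens[1]))] = description
    return ports


# Hypothetical sample in the same layout as the slide.
SAMPLE = """\
health-switch1, port 1, Room 29 - Director's office
health-switch1, port 2, Room 43 - Receptionist
health-switch1, port 25, uplink to health-backbone
"""

port_map = parse_port_docs(SAMPLE)
print(port_map[("health-switch1", 25)])
```

Keeping the file this simple means the help desk can read it directly, while a script like this can answer "what is plugged into port N?" without a database.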
Documentation: Labeling
Nice…
Example Backbone Network Architecture
[Figure: backbone network diagram with nodes labeled edge switch, edge router, and backbone router, arranged around an ATM cloud]
Why Multiple Types of Devices?



Core routers are much more expensive than edge routers
A router port is much more expensive than a switch port
How can we achieve the same network goal while minimizing the number of expensive devices?
- Edge switches aggregate traffic to share an edge router access port
- Core switches reduce the number of core router ports and still achieve a fully logically connected mesh
- Edge routers hold fewer routes than core routers
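A small worked example of the port-count argument for core switches. It is a sketch under a simplifying assumption that is not in the slides: either every core router has a direct physical link to every other core router, or every core router has a single uplink to one shared core switch (real designs would add redundancy). The function names are illustrative.

```python
def full_mesh_ports_per_router(n: int) -> int:
    """Router ports needed per router when n routers are physically fully meshed."""
    return n - 1


def full_mesh_links(n: int) -> int:
    """Total point-to-point links in a physical full mesh of n routers."""
    return n * (n - 1) // 2


def switched_core_ports_per_router(n: int) -> int:
    """Router ports per router when all routers attach to one core switch
    (assumed single uplink); the mesh is then logical, not physical."""
    return 1


for n in (4, 8, 16):
    print(
        f"n={n}: full mesh uses {full_mesh_ports_per_router(n)} ports/router "
        f"and {full_mesh_links(n)} links; switched core uses "
        f"{switched_core_ports_per_router(n)} port/router"
    )
```

The physical mesh grows quadratically in links, while the switched core moves that cost onto cheaper switch ports.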
Management Network
• A completely separate network from the “production” network that provides a means of monitoring and controlling the “production” network without using it
• A “backdoor” to all network devices
• Serial connections (T1s)
• Ethernet (Telnet directly to device)
• Console (Telnet through MC router)
Management Network
[Figure: management network diagram with labels: dialup modem, hub phone #, MT1 (7204), MC1 (3640), HUB1 to HUB3, S1 to S3, UUNET, Fairfax (FFX), WILPAK, WCOM frame relay]
Network Management Example

A typical problem
- People are complaining that Netflix performance was bad last night

Where do you begin?
- Where is the problem?
- What is the problem?
- What is the solution?

You may have different perspectives depending on who you are
- Netflix engineer
- Comcast engineer
- A user at home
Where to Start?
With proper management tools and procedures in place, you may already have the answer
Consider some possibilities
1. What configuration changes were made overnight?
2. Have you received a device fault notification indicating the issue?
3. Have you detected a security breach?
4. Has your performance baseline predicted this behavior on an increasingly congested network link?
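A minimal sketch of answering the first question, "what configuration changes were made overnight?", by diffing two saved configuration snapshots with Python's standard `difflib`. The snapshot contents and the helper name `config_changes` are hypothetical; the sketch assumes the configurations are archived as plain text, which the lecture does not specify.

```python
import difflib
from typing import List


def config_changes(old: str, new: str) -> List[str]:
    """Return only the removed ('-') and added ('+') lines between two snapshots."""
    diff = list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="yesterday",
            tofile="today",
            lineterm="",
            n=0,  # no context lines, only the changes themselves
        )
    )
    # The first two lines are the '---' / '+++' file headers; '@@' lines mark hunks.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Hypothetical snapshots of one device's configuration.
yesterday = "hostname edge1\ninterface eth0\n mtu 1500\n"
today = "hostname edge1\ninterface eth0\n mtu 9000\n"

changes = config_changes(yesterday, today)
for line in changes:
    print(line)
```

This only works if snapshots are taken regularly, which is one reason stable storage matters for management.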
What Do You Need?



An accurate database of your network’s topology, configuration, and performance
A solid understanding of the protocols and models used in communication between your management server and the managed devices
Methods and tools that allow you to interpret and act upon gathered information
FCAPS: Five Areas of Network Management
- Fault management
- Configuration management
- Accounting management
- Performance management
- Security management
Fault Management

When a fault occurs
- Determine “exactly” where the fault is
- Isolate the rest of the network from the failure
- Reconfigure or modify the network to minimize the impact on operation
- Repair or replace the failed components
Configuration Management

Configuration management is concerned with
- Initializing a network
- Gracefully shutting down part or all of the network
- Maintaining, adding, and updating the relationships among components
and the status of components themselves during network operation
Accounting Management

Network managers track the use of network resources by end user or end-user class
- An end user or group of end users may be abusing its access privileges and burdening the network at the expense of other users
- End users may be making inefficient use of the network, and the network manager can assist in changing procedures to improve performance
- The network manager can plan for network growth more easily if end-user activity is known in sufficient detail
Performance Management





What is the level of capacity utilization?
Is there excessive traffic?
Has throughput been reduced to unacceptable levels?
Are there bottlenecks?
Is response time increasing?
Security Management

Managing information protection and access control facilities
- Generating, distributing, and storing encryption keys
- Passwords and other authorization or access control information must be maintained and distributed

Monitoring and controlling access to computer networks and to all or part of the network management information
- Security management involves the collection, storage, and examination of audit records and security logs
- It also covers the enabling and disabling of these logging facilities
Differences of Network Management from Network Control

A human operator is the user of network management
Stable storage is the fundamental building block of network management
- Configuration files
- Log files or databases
• What to measure and then log?
• What granularity?
• How much overhead?
Simple Network Management Protocol (SNMP)

A set of standards for network management
- A protocol
- A database schema or structure specification
- A set of data objects
• Throughput, packet counts, errors, CPU load, temperature, ...

For multi-vendor, interoperable network management
- Used across a broad spectrum of device types: end systems, bridges, switches, routers, and telecommunications equipment
- Based on the TCP/IP protocol suite (typically carried over UDP)

Hundreds of tools are built on top of SNMP
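To make the "set of data objects" idea concrete, here is a toy in-memory model of a MIB: values keyed by object identifier (OID), with a GET and a GETNEXT-style lookup in lexicographic OID order. This is a sketch only and does not speak the SNMP wire protocol; the class `ToyMib` and its values are made up, while the OIDs shown are the standard MIB-II ones for sysDescr, sysUpTime, ifInOctets, and ifOutOctets.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

Oid = Tuple[int, ...]


def parse_oid(text: str) -> Oid:
    """'1.3.6.1' -> (1, 3, 6, 1); tuples compare in lexicographic OID order."""
    return tuple(int(part) for part in text.strip(".").split("."))


def format_oid(oid: Oid) -> str:
    return ".".join(str(part) for part in oid)


class ToyMib:
    """Illustrative stand-in for an agent's management information base."""

    def __init__(self, values: Dict[str, object]) -> None:
        self._values = {parse_oid(k): v for k, v in values.items()}
        self._sorted = sorted(self._values)

    def get(self, oid: str) -> Optional[object]:
        return self._values.get(parse_oid(oid))

    def get_next(self, oid: str) -> Optional[Tuple[str, object]]:
        """Return the first object whose OID is strictly greater than `oid`."""
        idx = bisect_right(self._sorted, parse_oid(oid))
        if idx == len(self._sorted):
            return None  # end of MIB
        nxt = self._sorted[idx]
        return format_oid(nxt), self._values[nxt]

    def walk(self, prefix: str) -> List[Tuple[str, object]]:
        """Repeated get_next under one subtree, as management tools commonly do."""
        root = parse_oid(prefix)
        results: List[Tuple[str, object]] = []
        current = prefix
        while True:
            nxt = self.get_next(current)
            if nxt is None or parse_oid(nxt[0])[: len(root)] != root:
                return results
            results.append(nxt)
            current = nxt[0]


mib = ToyMib(
    {
        "1.3.6.1.2.1.1.1.0": "toy router",   # sysDescr.0 (made-up value)
        "1.3.6.1.2.1.1.3.0": 123456,         # sysUpTime.0
        "1.3.6.1.2.1.2.2.1.10.1": 1000,      # ifInOctets, interface 1
        "1.3.6.1.2.1.2.2.1.10.2": 2500,      # ifInOctets, interface 2
        "1.3.6.1.2.1.2.2.1.16.1": 800,       # ifOutOctets, interface 1
    }
)
print(mib.get("1.3.6.1.2.1.1.1.0"))
print(mib.walk("1.3.6.1.2.1.2.2.1.10"))
```

Because the schema is shared across vendors, the same walk works against any device that implements the relevant MIB objects.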
Network Management Systems (NMS)

An NMS is a collection of tools for network monitoring and control
- Designed to view the entire network as a unified architecture
• Addresses and labels assigned to each point
• Specific attributes of each element and link known to the system
- Single operator interface with a powerful but user-friendly set of commands
- Only a minimal amount of separate equipment (hardware/software) is necessary
• NMS software resides in the host computers and communications processors (bridges, routers)
Architectural model of NMS
[Figure: layered NMS architecture, top to bottom]
- Unified user interface: presentation of network management information to users
- Network management applications, each built from application elements
- Network management data transport service
- MIB access module (to the management information base) and communications protocol stack
- Managed networks
Network Monitoring

Coarse-grained monitoring
- Counters as aggregate statistics
• # of packets on a link
• # of bytes on a link
• # of errors on a link
- Keep packet-level statistics
- Used in SNMP

Fine-grained monitoring
- Examine (and potentially log) each packet and its timing
- The challenge is to control the overhead
• Hard to store, transfer, and process every packet over the entire duration of network operation
- Various techniques have been invented
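As an illustration of what aggregate counters are good for, the sketch below turns two readings of a per-link byte counter into a utilization figure. It assumes a 32-bit counter that wraps around (as many interface octet counters do) and at most one wrap between polls; the function names and sample numbers are illustrative.

```python
def counter_delta(prev: int, curr: int, bits: int = 32) -> int:
    """Difference between two counter readings, tolerating a single wrap-around."""
    return (curr - prev) % (1 << bits)


def link_utilization(
    prev_octets: int,
    curr_octets: int,
    interval_s: float,
    link_bps: float,
    bits: int = 32,
) -> float:
    """Fraction of link capacity used between two polls of a byte counter."""
    delta_bits = counter_delta(prev_octets, curr_octets, bits) * 8
    return delta_bits / (interval_s * link_bps)


# Two polls 300 s apart on a 10 Mb/s link; the counter wrapped in between.
utilization = link_utilization(
    prev_octets=4_294_967_000,
    curr_octets=37_499_704,
    interval_s=300,
    link_bps=10_000_000,
)
print(f"utilization: {utilization:.1%}")
```

The polling interval sets the granularity: a five-minute average can hide short bursts, which is part of why finer-grained monitoring exists.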
Flow Monitoring

Flow monitoring (e.g., Cisco Netflow)
- Statistics about groups of related packets (e.g., same IP/TCP headers and close in time)
- Records header information, counts, and times

More detail than SNMP, less overhead than capturing every packet
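A minimal sketch of flow aggregation as described above: packets sharing the same header fields and arriving close in time are folded into one record with counts and timestamps. The `Packet` and `FlowRecord` types, the 5-tuple key, and the 30-second idle timeout are assumptions made for illustration, not NetFlow's actual defaults.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, NamedTuple, Tuple

FlowKey = Tuple[str, str, int, int, int]


class Packet(NamedTuple):
    ts: float      # arrival time in seconds
    src: str
    dst: str
    sport: int
    dport: int
    proto: int
    size: int      # bytes


@dataclass
class FlowRecord:
    key: FlowKey
    first: float
    last: float
    packet_count: int = 0
    byte_count: int = 0


def aggregate_flows(packets: Iterable[Packet], idle_timeout: float = 30.0) -> List[FlowRecord]:
    """Group packets by 5-tuple; a gap longer than idle_timeout starts a new flow."""
    active: Dict[FlowKey, FlowRecord] = {}
    finished: List[FlowRecord] = []
    for pkt in sorted(packets, key=lambda p: p.ts):
        key = (pkt.src, pkt.dst, pkt.sport, pkt.dport, pkt.proto)
        rec = active.get(key)
        if rec is not None and pkt.ts - rec.last > idle_timeout:
            finished.append(rec)  # the old flow went idle: export it
            rec = None
        if rec is None:
            rec = FlowRecord(key, first=pkt.ts, last=pkt.ts)
            active[key] = rec
        rec.last = pkt.ts
        rec.packet_count += 1
        rec.byte_count += pkt.size
    finished.extend(active.values())
    return finished


trace = [
    Packet(0.0, "10.0.0.1", "10.0.0.9", 1234, 80, 6, 100),
    Packet(1.0, "10.0.0.1", "10.0.0.9", 1234, 80, 6, 200),
    Packet(1.0, "10.0.0.2", "10.0.0.9", 5555, 53, 17, 40),
    Packet(2.0, "10.0.0.1", "10.0.0.9", 1234, 80, 6, 300),
    Packet(100.0, "10.0.0.1", "10.0.0.9", 1234, 80, 6, 50),
]
flows = aggregate_flows(trace)
for flow in flows:
    print(flow)
```

Five packets collapse into three records here; on real traffic the reduction is far larger, which is where the overhead saving comes from.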
Cisco Netflow

Basic output: “Flow record”
- Most common version is v5
- Latest version is v10, also known as IPFIX (RFC 3917 gives the IPFIX requirements)

Current version (10) is being standardized in the IETF (template-based)
- More flexible record format
- Much easier to add new flow record types
[Figure: flow records exported from the core network to a collector (PC) for collection and aggregation]
Export packets: approximately 1500 bytes, 20-50 flow records each, sent more frequently if traffic increases
Slide courtesy of Nick Feamster
Flow Record Contents
Basic information about the flow…
- Source and destination IP address and port
- Packet and byte counts
- Start and end times
- ToS, TCP flags
…plus information related to routing
- Next-hop IP address
- Source and destination AS
- Source and destination prefix
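The fields listed above map closely onto the fixed 48-byte NetFlow v5 flow record. The sketch below decodes one such record with `struct`; the field order follows the v5 layout as commonly documented and should be checked against Cisco's reference before being relied on. The dictionary keys and the sample values are illustrative.

```python
import socket
import struct

# Network byte order: three IPv4 addresses, interface indexes, counters,
# timestamps, ports, pad, TCP flags, protocol, ToS, AS numbers, masks, pad.
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHxBBBHHBBxx")


def parse_v5_record(data: bytes) -> dict:
    """Decode one 48-byte flow record into a dictionary."""
    (src, dst, nexthop, in_if, out_if, pkts, octets, first, last,
     sport, dport, tcp_flags, proto, tos, src_as, dst_as,
     src_mask, dst_mask) = V5_RECORD.unpack(data)
    return {
        "src": socket.inet_ntoa(src), "dst": socket.inet_ntoa(dst),
        "nexthop": socket.inet_ntoa(nexthop),
        "input_if": in_if, "output_if": out_if,
        "packets": pkts, "bytes": octets,
        "first_ms": first, "last_ms": last,   # router uptime at flow start/end
        "sport": sport, "dport": dport,
        "tcp_flags": tcp_flags, "proto": proto, "tos": tos,
        "src_as": src_as, "dst_as": dst_as,
        "src_mask": src_mask, "dst_mask": dst_mask,
    }


# Build a made-up record and decode it again.
raw = V5_RECORD.pack(
    socket.inet_aton("10.0.0.1"), socket.inet_aton("192.0.2.7"),
    socket.inet_aton("10.0.0.254"), 1, 2, 10, 1500, 1000, 2000,
    12345, 80, 0x18, 6, 0, 64512, 64513, 24, 24,
)
record = parse_v5_record(raw)
print(record)

# With an assumed 24-byte export header, about 30 such records fit in 1500 bytes.
records_per_packet = (1500 - 24) // V5_RECORD.size
print(records_per_packet)
```

A fixed layout like this is cheap to parse but hard to extend, which is the motivation for template-based formats.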
Slide courtesy of Nick Feamster
Sampled Netflow

Packet sampling before flow creation
- 1-out-of-m sampling of individual packets (e.g., m=100)
- Creation of flow records over the sampled packets

Reducing overhead
- Avoids per-packet overhead on (m-1)/m of the packets

Accuracy?
- Misses many of the small flows
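A minimal simulation of 1-out-of-m packet sampling, to show both the overhead saving and the accuracy problem. Each packet is kept independently with probability 1/m (one common reading of "1-out-of-m"; periodic sampling of every m-th packet is another), and sampled counts are scaled by m to estimate the true counts. The traffic mix and the function name are made up.

```python
import random
from collections import Counter
from typing import Iterable


def sampled_flow_counts(packet_flow_ids: Iterable[str], m: int, seed: int = 0) -> Counter:
    """Keep each packet with probability 1/m and scale the counts back up by m."""
    rng = random.Random(seed)
    sampled: Counter = Counter()
    for flow_id in packet_flow_ids:
        if rng.randrange(m) == 0:  # true for 1 in m packets on average
            sampled[flow_id] += 1
    return Counter({flow: count * m for flow, count in sampled.items()})


# One large flow of 10,000 packets and 200 small flows of 2 packets each.
packets = ["big"] * 10_000 + [f"small{i}" for i in range(200) for _ in range(2)]
random.Random(1).shuffle(packets)

estimates = sampled_flow_counts(packets, m=100)
small_seen = sum(1 for flow in estimates if flow.startswith("small"))
print("estimated packets in the big flow:", estimates.get("big", 0))
print("small flows seen:", small_seen, "of 200")
```

The large flow is estimated reasonably well, while most small flows never produce a sampled packet and so never get a record.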
Sampled Netflow
Sample packets at random, aggregate into flows
Flow = packets with the same pattern (source and destination address and ports)
[Figure: a packet stream is sampled into a table of (FlowId, Counter) entries, which are exported as flow reports]
Flow reports are used to estimate: heavy hitters, FSD, entropy, changes, superspreaders, ….
Hash-Based Flow Sampling
[Figure: the packet header (version, IHL, TOS, length, identification, flags, offset, TTL, protocol, checksum, source IP address, destination IP address, ……, source port, destination port) is hashed to a flow ID in [0, Max]; packets whose hash falls in the configured hash range, e.g. [3,10], are logged in flow memory as (flow, #pkts)]
Compute hash, log if in range
Pick flows at random; not biased by flow size
Good for “communication” patterns
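A sketch of "compute hash, log if in range". The flow key is hashed to an integer, and only flows whose hash lands in a chosen range are counted, so a selected flow has every one of its packets logged and selection does not depend on flow size. Using SHA-256 over the key's text form and a 32-bit hash space are choices made for this example; a router would use a much cheaper hash.

```python
import hashlib
from collections import Counter
from typing import Iterable, Tuple

FlowKey = Tuple[str, str, int, int]
HASH_MAX = 2**32 - 1


def flow_hash(key: FlowKey) -> int:
    """Deterministic hash of the flow key into [0, HASH_MAX]."""
    digest = hashlib.sha256(repr(key).encode()).digest()
    return int.from_bytes(digest[:4], "big")


def hash_sample(packets: Iterable[FlowKey], lo: int, hi: int) -> Counter:
    """Count packets only for flows whose hash falls in [lo, hi]."""
    memory: Counter = Counter()
    for key in packets:
        if lo <= flow_hash(key) <= hi:
            memory[key] += 1
    return memory


# 50 made-up flows; flow i carries i + 1 packets.
packets = [
    (f"10.0.0.{i}", "10.0.1.1", 1000 + i, 80)
    for i in range(50)
    for _ in range(i + 1)
]
actual = Counter(packets)

# Log roughly a quarter of the hash space.
logged = hash_sample(packets, 0, HASH_MAX // 4)
print(f"logged {len(logged)} of {len(actual)} flows")
```

If several routers use the same hash with different ranges, they can split the flows among themselves without coordination per packet.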
Sample and Hold
Algorithm
- If the flow is already logged → update its counter
- Otherwise, sample the packet with probability p
- If sampled (new flow) → create a counter
[Figure: a packet stream feeding flow memory of (flow, #pkts) entries]
Accurate counts of large flows
Good for “volume” queries
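The algorithm above can be sketched in a few lines. Once a flow has an entry, every later packet of that flow is counted, so a large flow misses only the packets that arrive before it is first sampled. The sampling probability and the traffic mix below are made up for illustration.

```python
import random
from collections import Counter
from typing import Dict, Iterable


def sample_and_hold(packets: Iterable[str], p: float, seed: int = 0) -> Dict[str, int]:
    """Flow memory after processing `packets` with sampling probability p."""
    rng = random.Random(seed)
    memory: Dict[str, int] = {}
    for flow_id in packets:
        if flow_id in memory:
            memory[flow_id] += 1      # already logged: always update
        elif rng.random() < p:
            memory[flow_id] = 1       # sampled: create a counter and hold the flow
    return memory


# One large flow of 10,000 packets and 200 small flows of 2 packets each.
packets = ["big"] * 10_000 + [f"small{i}" for i in range(200) for _ in range(2)]
random.Random(1).shuffle(packets)
actual = Counter(packets)

held = sample_and_hold(packets, p=0.01)
print("big flow counted:", held.get("big", 0), "of", actual["big"])
print("flows in memory:", len(held), "of", len(actual))
```

Compared with plain packet sampling, the counts never overestimate and need no scaling, at the cost of a memory lookup for every packet.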
What do network operators care about?
[Figure: routers send flow reports to applications at the Network Operations Center]
- Respect resource constraints
- High flow coverage
- Provide network-wide goals
- Low data management overhead
flow = same src-dst, ports, proto
flow report = flow + pkt/byte counters
Not A Solved Problem

Routers cannot record every packet/flow
- Constraints: CPU, Memory, Bandwidth

Resource constraints don’t go away!
- Network demands scale even as routers become more
powerful
Summary of Key Concepts

Two keywords in network management
- Network: not just the protocols
- Management: human operators have goals to achieve

First step in network management
- Modeling and documenting all details of the network

Key difference from network control
- Files and databases are fundamental building blocks

Five key areas of network management
- FCAPS: fault, configuration, accounting, performance, security

SNMP and Netflow are just starting points
Many challenges remain
- Opex dominates Capex
- A more scientific/systematic approach is needed