TRILL Routing Scalability Considerations
Alex Zinin
IETF-62
TRILL BOF
General scalability framework
About growth functions for:
Data overhead (adjacencies, LSDB, MAC entries)
BW overhead (Hellos, updates, refreshes/sec)
CPU overhead (computational complexity, frequency)
Scaling parameters:
N: total number of stations
L: number of VLANs
F: relocation frequency
Types of devices
Edge switch (attached to a fraction of N, and L)
Core switch (most of L)
Scenarios for analysis
Single stationary bcast domain: no practical station mobility; N = O(1K) by natural bcast limits
Bcast domain with mobile stations
Multiple stationary VLANs: L = O(1K) total, O(100) visible to switch; N = O(10K) total
Multiple VLANs with mobile stations
Protocol params of interest
What:
Amount of data (topology, leaf entries)
Number of LSPs
LSP refresh rate
LSP update rate
Flooding complexity
Route calculation complexity & frequency
Why:
Required memory [increase] as network grows
Required mem & CPU to keep up with protocol dynamics
Link BW overhead to control the network
How:
Absolute: big-O notation
Relative: compare to e.g. bridging & IP routing
Why is this important
If data-inefficient:
Increased memory requirements
Frequent memory upgrades as network grows
Much more info to flood
If computationally inefficient:
Substantial compute power increase == marginal network size increase
High CPU utilization
Inability to keep up with protocol dynamics
Link-state Protocol Dynamics
Network events are visible everywhere
Main assumption for stationary networks:
For each node: Rprc >> Rinp
Rinp: input update rate (network event frequency); Rinp = f (proto design, network)
Rprc: update processing rate; Rprc = f (proto design, CPU, implementation)
Long-term convergence condition:
Network change is temporary
Topology stabilizes within finite T
What if (Rprc < Rinp)?
Micro bursts are buffered by queues
Short-term (normal for stationary nets): update drops, retransmissions, then convergence
Long-term/permanent: net never converges, CPU upgrade needed
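The relationship between Rinp and Rprc can be illustrated with a toy queue model. This is a minimal sketch, not a model of any real link-state implementation: the function name `simulate`, the one-second time step, and the fixed queue limit are assumptions made purely for illustration.

```python
def simulate(arrivals, r_prc, queue_limit):
    """Toy model of one node's update queue.

    arrivals    -- updates arriving in each one-second step (Rinp per step)
    r_prc       -- updates the node can process per step (Rprc)
    queue_limit -- hypothetical queue capacity; overflow is dropped
    Returns (final_backlog, total_dropped).
    """
    backlog = 0
    dropped = 0
    for arrived in arrivals:
        backlog += arrived
        if backlog > queue_limit:
            # Queue overflow: the excess updates are lost.
            dropped += backlog - queue_limit
            backlog = queue_limit
        # Process up to r_prc queued updates in this step.
        backlog = max(0, backlog - r_prc)
    return backlog, dropped


# Micro burst on an otherwise quiet network: absorbed by the queue.
burst = simulate([50] + [1] * 59, r_prc=10, queue_limit=100)

# Sustained Rinp > Rprc: the queue fills and drops never stop.
overload = simulate([20] * 60, r_prc=10, queue_limit=100)

print(burst, overload)
```

In the burst case the backlog drains within a few steps with no drops; in the sustained case the node keeps dropping updates for as long as the input rate exceeds its processing rate.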
Data-plane parameters
Data overhead
Number of MAC entries in CAM-table
Why worry?
CAM-table is expensive
1-8K entries for small switches
32K-128K for core switches
Shared among VLANs
Entries expire when stations go silent
Single Bcast domain (CP)
Total of O(1K) MAC addresses
IS-IS update packing:
Each address: 12-bit VLAN tag + 48-bit MAC = 60 bits
4 addresses per TLV (TLV is 255B max)
20 addresses per LSP fragment (1470B default)
~5K addresses per node (256 fragments total)
LSP refresh rate:
1K MACs = 50 LSPs
1h renewal = 1 update every 72 secs
MAC update rate:
Depends on MAC learning & dead detection procedure
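The packing and refresh figures on this slide can be cross-checked with a few lines of arithmetic. A minimal sketch: the constants are the slide's numbers, while the variable and function names are illustrative, and "5 TLVs per fragment" is an assumption derived here as 1470 // 255 rather than something the slide states.

```python
ADDRS_PER_TLV = 4            # from the slide
TLV_MAX_BYTES = 255          # from the slide
FRAGMENT_BYTES = 1470        # from the slide (default LSP fragment size)
FRAGMENTS_PER_NODE = 256     # from the slide

# Assumption: whole TLVs only, ignoring LSP header overhead.
tlvs_per_fragment = FRAGMENT_BYTES // TLV_MAX_BYTES
addrs_per_fragment = ADDRS_PER_TLV * tlvs_per_fragment
max_addrs_per_node = addrs_per_fragment * FRAGMENTS_PER_NODE


def fragments_needed(mac_count, per_fragment=addrs_per_fragment):
    """Number of LSP fragments needed to carry mac_count addresses."""
    return -(-mac_count // per_fragment)  # ceiling division


def refresh_interval_s(mac_count, renewal_s=3600):
    """Mean seconds between refreshes if fragments renew evenly over renewal_s."""
    return renewal_s / fragments_needed(mac_count)


print(addrs_per_fragment, max_addrs_per_node)
print(fragments_needed(1000), refresh_interval_s(1000))
```

With these inputs the sketch reproduces 20 addresses per fragment, roughly 5K addresses per node, 50 LSPs for 1K MACs, and one refresh every 72 seconds.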
MAC learning
Traffic + expiration (5-15m):
Announces station activity
1K stations, 30m fluctuations = 1 update every 1.8 seconds on average
Likely bursts due to “start-of-day” phenomenon
Reachability-based
Start announcing MAC when first heard from station
Assume it’s there until there is evidence otherwise, even if silent (presumption of reachability)
Removes activity-sensitive fluctuations
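The difference between the two learning styles can be sketched by counting announcements for a single station. This is an illustrative model only: the function names, the sample frame trace, and the rule that each expiry costs one withdrawal plus one re-announcement are assumptions, not part of any TRILL or IS-IS specification.

```python
def mean_update_interval_s(stations, cycle_s):
    """Mean seconds between updates if every station changes state once per cycle."""
    return cycle_s / stations


def updates_activity_based(frame_times, expiry_s):
    """Traffic + expiration: withdraw after expiry_s of silence, re-announce on next frame."""
    updates = 0
    last_seen = None
    for t in sorted(frame_times):
        if last_seen is None:
            updates += 1          # first announcement
        elif t - last_seen > expiry_s:
            updates += 2          # withdrawal at expiry + re-announcement
        last_seen = t
    return updates


def updates_reachability_based(frame_times):
    """Reachability-based: announce once when first heard, then stay silent."""
    return 1 if frame_times else 0


# Hypothetical trace: seconds at which one station sent frames.
trace = [0, 100, 2000, 2100, 5000]

print(mean_update_interval_s(1000, 30 * 60))
print(updates_activity_based(trace, expiry_s=900), updates_reachability_based(trace))
```

For this trace, activity-based learning generates five updates (one initial announcement and two expiry cycles), while reachability-based learning generates one.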
Single bcast domain (DP)
Number of entries
Bridges: f (traffic)
Limited by local config, location within network
Rbridge: all attached stations
No big change for core switches (see most MACs)
May be a problem for smaller ones
Single bcast: summary
With reachability-based MAC announcements…
CP is well within the limits of current link-state routing protocols
Can comfortably handle O(10k) routes
Dynamics are very similar
There’s an existence proof that this works
CP data overhead is O(N)
Worse than IP routing: O(log N)
However, net size is upper-bounded by bcast limits
Small switches will need to store & compute more
Data-plane may require bigger MAC tables in smaller switches
Note: comfort limit
Always possible to overload a neighbor with updates
Update flow control is employed
Experience-based heuristics: pace updates at 30/sec
Dynamic is possible, yet…
Not a hard rule, ballpark
Limits burst Rinp for neighbor
Prevents drops during flooding storms
Given the (Rprc >> Rinp) condition, want average to be an order of magnitude lower, e.g. O(1) upd/sec max
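The pacing heuristic can be sketched as evenly spaced transmissions toward a neighbor. A minimal illustration, assuming a fixed 30 updates/sec pace; the function names are hypothetical and real implementations may pace differently (the slide itself calls the figure a ballpark).

```python
def paced_send_times(n_updates, rate_per_s=30.0):
    """Send times (seconds) for n_updates queued at t=0 and paced at rate_per_s."""
    gap = 1.0 / rate_per_s
    return [i * gap for i in range(n_updates)]


def headroom(burst_limit_per_s, avg_rate_per_s):
    """How many times lower the average rate is than the paced burst limit."""
    return burst_limit_per_s / avg_rate_per_s


times = paced_send_times(50)          # e.g. a 50-LSP burst
print(len(times), round(times[-1], 3))
print(headroom(30.0, 1.0))
```

A 50-update burst takes a little over 1.6 seconds to leave the node at this pace, and an average of about 1 update/sec leaves roughly an order of magnitude of headroom below the burst limit.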
Note: protocol upper-bound
LSP generation is paced: normally not more frequent than once every 5 secs
Each LSP fragment has its own timer
With equal distribution
Max node origination rate == 51 upd/sec
Does not address long-term stability
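The upper bound follows directly from the fragment count and the generation interval. A small sketch of that arithmetic, assuming all 256 fragments are in use and their timers are spread evenly; the function name is illustrative.

```python
def max_origination_rate(fragments=256, min_interval_s=5.0):
    """Upper bound on updates/sec one node can originate with evenly spread timers."""
    return fragments / min_interval_s


rate = max_origination_rate()
print(rate)
# Compare with the 30 updates/sec pacing heuristic from the comfort-limit slide.
print(rate > 30.0)
```

The bound (about 51 updates/sec) is above the 30/sec pacing heuristic, so pacing rather than LSP generation timers would be the limiting factor in a burst.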
Single bcast + mobility
Same number of stations
Different dynamics
Take IETF wireless network, worst case
~700 stations
New location within 10 minutes
Average 1 MAC move every 0.86 sec, or about 1.17 MAC/sec
Note: every small switch in VLAN will see updates
How does it work now???
Same data efficiency for CP and DP
Bridges (APs + switches) relearn MACs, expire old
Summary: dynamics barely fit within comfort range
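The mobility figures come from dividing the station count by the relocation period. A minimal sketch using the slide's worst-case numbers; the function name is an assumption for illustration.

```python
def relocation_load(stations, relocation_period_s):
    """Return (MAC moves per second, seconds per MAC move) for the whole domain."""
    moves_per_s = stations / relocation_period_s
    seconds_per_move = relocation_period_s / stations
    return moves_per_s, seconds_per_move


moves_per_s, seconds_per_move = relocation_load(stations=700, relocation_period_s=10 * 60)
print(round(moves_per_s, 2), round(seconds_per_move, 2))
```

The result sits right at the O(1) updates/sec average that the comfort-limit slide suggests, which is why the summary describes the dynamics as barely fitting.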
Multiple VLANs
Real networks have VLANs
Assuming current proposal is used
Two possibilities:
Standard IS-IS flooding
Single IS-IS instance for whole network
Separate IS-IS instance per VLAN
Similar scaling challenges as with VR-based L3 VPNs
VLANs: single IS-IS
Assuming reachability-based MAC announcement
Adjacencies and convergence scale well
However…
Easily hit 5K MAC/node limit (solvable)
Every switch sees every MAC in every VLAN
Even if it doesn’t need it
Clear scaling issue
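The gap between what a switch stores and what it needs under single-instance flooding can be sketched as follows. The VLAN layout (100 VLANs of 100 MACs each, an edge switch attached to two of them) is a hypothetical example chosen to match the O(10K) total from the scenarios slide; the function names are illustrative.

```python
def entries_stored_single_instance(macs_per_vlan):
    """Standard flooding: every switch receives every MAC in every VLAN."""
    return sum(macs_per_vlan.values())


def entries_needed(macs_per_vlan, local_vlans):
    """Entries a switch actually needs: only the VLANs it participates in."""
    return sum(count for vlan, count in macs_per_vlan.items() if vlan in local_vlans)


# Hypothetical network: VLAN IDs 1..100, 100 MACs in each.
macs_per_vlan = {vlan: 100 for vlan in range(1, 101)}
edge_vlans = {10, 20}

stored = entries_stored_single_instance(macs_per_vlan)
needed = entries_needed(macs_per_vlan, edge_vlans)
print(stored, needed, stored // needed)
```

In this example the edge switch holds 50 times more MAC information than it needs.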
VLANs: multiple instances
MAC announcements scale well
Good resource separation
However…
N adjacencies for a VLAN trunk
N times more processing for a single topological event
N times more data structures (neighbors, timers, etc.)
N = 100…1000 for a core switch
Clear scaling issue for core switches
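The multiplication effect of per-VLAN instances can be sketched with simple counts. This is an illustrative model: the function name, the `trunks` parameter, and the baseline of one adjacency per trunk for a single instance are assumptions, not figures from the slide.

```python
def multi_instance_cost(vlans_on_trunk, trunks=1):
    """Per-switch cost of one IS-IS instance per VLAN, relative to a single instance."""
    return {
        "adjacencies": vlans_on_trunk * trunks,   # one adjacency per VLAN per trunk
        "baseline_adjacencies": trunks,           # single instance: one per trunk
        "event_processing_factor": vlans_on_trunk,  # each instance reacts to the event
    }


low = multi_instance_cost(vlans_on_trunk=100, trunks=8)
high = multi_instance_cost(vlans_on_trunk=1000, trunks=8)
print(low["adjacencies"], high["adjacencies"])
```

For a hypothetical core switch with 8 trunks, the adjacency count grows from 8 to between 800 and 8000, with neighbor state and timers growing along with it.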
VLANs: data plane
Core switches
No big difference
Exposed to most MACs in VLANs anyway
Smaller switches
Have to install all MACs even if a single port on a switch belongs to a VLAN
May require bigger MAC tables than available today
VLANs: summary
Control plane:
Currently available solutions have scaling issues
Data plane:
Smaller switches may have to pay
VLANs + Mobility
Assuming some VLANs will have mobile stations
Data plane: same as stationary VLANs
All scaling considerations for VLANs apply
Mobility dynamics get multiplied
Single IS-IS: updates hit same adjacency
Multiple IS-IS: updates hit same CPU
Activity not bounded naturally anymore
Update rate easily goes outside comfort range
Clear scaling issues
Resolving scaling concerns
5K MAC/node limit in IS-IS could be solved with RFC 3786
Don’t use per-VLAN (multi-instance) routing
Use reachability-based MAC announcement
Scaling MAC distribution requires VLAN-aware flooding:
Each node and link is associated with a set of VLANs
Only information needed by the remote nbr is flooded to it
Not present in current IS-IS framework
Forget about mobility ;-)
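The VLAN-aware flooding idea above can be sketched as a per-neighbor filter. This is a hypothetical illustration, not an existing IS-IS mechanism (the slide notes it is absent from the current framework): the function name, the neighbor names, and the assumption that each neighbor's VLAN set already includes the VLANs reachable through it are all made up for the example. The MAC addresses are from the documentation range.

```python
def entries_to_flood(mac_entries, neighbor_vlans):
    """Keep only (vlan, mac) entries whose VLAN the neighbor is associated with."""
    return [(vlan, mac) for vlan, mac in mac_entries if vlan in neighbor_vlans]


# Hypothetical local database of (VLAN ID, MAC) reachability entries.
entries = [
    (10, "00:00:5e:00:53:01"),
    (20, "00:00:5e:00:53:02"),
    (30, "00:00:5e:00:53:03"),
    (10, "00:00:5e:00:53:04"),
]

# Hypothetical VLAN sets associated with each neighbor (and the links behind it).
vlans_by_neighbor = {
    "edge-a": {10},
    "core-b": {10, 20, 30},
}

per_neighbor = {
    name: entries_to_flood(entries, vlans)
    for name, vlans in vlans_by_neighbor.items()
}
for name, flooded in per_neighbor.items():
    print(name, len(flooded))
```

The edge neighbor receives only the two VLAN 10 entries, while the core neighbor still receives all four.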