PWE3 WG Document Status

Enhanced ECMP and Large Flow Aware Transport
draft-yong-pwe3-enhance-ecmp-lfat
{lucyyong,yangpeilin}@huawei.com
IETF77, Anaheim CA
Why Enhanced ECMP
• Internet traffic shows the following pattern
– A very small percentage of large flows takes up a large portion of network capacity
– A huge number of small flows consumes the rest of the network capacity
• Hash-based ECMP cannot evenly disperse traffic flows over ECMP paths under such a traffic pattern
– Hashing preserves the order of packets that belong to individual flows within the aggregated flows
– Hashing dispersion is simple and stateless; if flow IDs are random enough, it evenly disperses the number of flows over the paths
• But this does not mean even traffic volume on the paths
• Simulation shows uneven load for Internet traffic
• Unbalanced load over ECMP paths results in
– Congestion on some paths while others are only partially used
– Carriers being forced to deploy more capacity in order to maintain performance -> lower network utilization and increased service cost
• Hash-based distribution is widely used because it is simple and scalable
Enhanced ECMP Proposal
• Apply different treatment to small flows and large flows
– Use hashing to disperse all small flows over ECMP paths
– Use a table to map a small set of large flows to ECMP paths
• A simple load-balancing algorithm can effectively compensate for the imbalance between paths caused by hashing
• The mapping table is refreshed automatically to remove non-live flows
• A very small set of large flows does not put a big burden on the device and does not cause a scalability concern
                                 +-------------+
                                 | Small-Flow  |    | 4 ECMP Paths
                            +--->| Forwarding  |--->|=========
        +------------+      |    | Process     |    |
Packets | Packet     |      |    +-------------+    |=========
------->| Separation |------+                       |
        | Process    |------+                       |=========
        +------------+      |    +-------------+    |
                            |    | Large-Flow  |    |=========
                            +--->| Forwarding  |--->|
                                 | Process     |    |
                                 +-------------+
Simulation Result
• We analyzed Internet traffic captured by CAIDA (http://www.caida.org/data/monitor)
• We programmed a traffic generator that produces
– 2% large flows that take up 30% of traffic volume
– 98% small flows that take up 70% of traffic volume
– Flow rates for both small and large flows are randomly generated
– The generated traffic is applied to 4 ECMP paths using the existing ECMP approach and the enhanced ECMP approach, respectively
– The run is repeated over 10 ECMP paths
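The setup above can be approximated with a short script. This is a minimal sketch, not the authors' simulator: the flow count, the uniform rate distribution, the CRC32 hash, the static (never-ending) flows, and all function names are assumptions made for illustration. Only the 2%/30% and 98%/70% split and the 4-path and 10-path runs come from the slide.

```python
import random
import zlib

def generate_flows(n_flows=10_000, large_share=0.02, large_volume=0.30, seed=1):
    """Return (flow_id, rate, is_large) tuples; total volume is scaled to 100."""
    rng = random.Random(seed)
    n_large = round(n_flows * large_share)
    large_raw = [rng.uniform(0.5, 1.5) for _ in range(n_large)]
    small_raw = [rng.uniform(0.5, 1.5) for _ in range(n_flows - n_large)]
    large_sum, small_sum = sum(large_raw), sum(small_raw)
    # Rescale each class so it carries its configured share of the volume.
    flows = [(rng.getrandbits(32), 100.0 * large_volume * r / large_sum, True)
             for r in large_raw]
    flows += [(rng.getrandbits(32), 100.0 * (1.0 - large_volume) * r / small_sum, False)
              for r in small_raw]
    rng.shuffle(flows)  # random arrival order
    return flows

def hash_path(flow_id, n_paths):
    """Stateless hash of the flow ID onto a path index (CRC32 is an assumption)."""
    return zlib.crc32(flow_id.to_bytes(4, "big")) % n_paths

def hash_ecmp(flows, n_paths):
    """Existing ECMP: every flow is hashed onto a path."""
    load = [0.0] * n_paths
    for flow_id, rate, _ in flows:
        load[hash_path(flow_id, n_paths)] += rate
    return load

def enhanced_ecmp(flows, n_paths):
    """Small flows are hashed; each large flow goes to the least loaded path."""
    load = [0.0] * n_paths
    for flow_id, rate, is_large in flows:
        if is_large:
            path = min(range(n_paths), key=load.__getitem__)
        else:
            path = hash_path(flow_id, n_paths)
        load[path] += rate
    return load

def spread_percent(load):
    """Difference between busiest and idlest path, in percent of the mean load."""
    return 100.0 * (max(load) - min(load)) / (sum(load) / len(load))

if __name__ == "__main__":
    flows = generate_flows()
    for n_paths in (4, 10):
        print(n_paths, "paths:",
              "hash spread %.2f%%" % spread_percent(hash_ecmp(flows, n_paths)),
              "enhanced spread %.2f%%" % spread_percent(enhanced_ecmp(flows, n_paths)))
```

The numbers printed by this sketch depend on the assumed distributions and seed, so they should not be read as a reproduction of the figures reported in the slides.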
Result for Four ECMP Paths
Existing ECMP Simulation
[Chart: weighted average traffic per path, Path1 to Path4, 6 simulations, vertical axis 90 to 110]
• ECMP can get ~10% volume difference between paths
Enhanced ECMP Simulation
[Chart: weighted average traffic per path, Path1 to Path4, 6 simulations, vertical axis 90 to 110]
• Enhanced ECMP obtains <1% volume difference between paths
Result for Ten ECMP Paths
Existing ECMP Simulation
[Chart: weighted average traffic per path, Path1 to Path10, 6 simulations, vertical axis 90 to 115]
• ECMP can get ~15% volume difference between paths
• The more ECMP paths, the worse hashing performs
Enhanced ECMP Simulation
[Chart: weighted average traffic per path, Path1 to Path10, 6 simulations, vertical axis 90 to 115]
• Enhanced ECMP obtains <1% volume difference between paths
Ingress PE Process
• Insert a flow label into each received packet
• Perform Large Flow Recognition
– Large flow criteria can be configured by the operator
• Insert a large flow indication into the packets of a large flow
– By default, a flow is marked as a small flow
• Egress PE trims off the flow label before forwarding to the AC
– Same process as described in FAT-PW
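The recognition step can be sketched as a per-flow byte counter over a measurement window. The slides only say that the criteria are operator-configurable, so the threshold-per-window rule, the class name `LargeFlowRecognizer`, and its parameters are hypothetical. The returned value stands for the large flow indication (1 = large, 0 = small, with small as the default).

```python
from collections import defaultdict

class LargeFlowRecognizer:
    """Hypothetical ingress PE classifier: bytes per flow per time window."""

    def __init__(self, threshold_bytes, window_seconds):
        self.threshold_bytes = threshold_bytes  # operator-configured criterion
        self.window_seconds = window_seconds
        self.window_start = 0.0
        self.byte_count = defaultdict(int)      # flow_label -> bytes in window
        self.large = set()                      # flows currently marked large

    def classify(self, flow_label, length, now):
        """Account for one packet and return its large flow indication."""
        if now - self.window_start >= self.window_seconds:
            # Window rollover: only flows that met the threshold stay large.
            self.large = {f for f, b in self.byte_count.items()
                          if b >= self.threshold_bytes}
            self.byte_count.clear()
            self.window_start = now
        self.byte_count[flow_label] += length
        if self.byte_count[flow_label] >= self.threshold_bytes:
            self.large.add(flow_label)
        return 1 if flow_label in self.large else 0
```

A flow that falls below the threshold for a whole window drops back to the small-flow default at the next rollover, which keeps the set of marked flows small.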
Enhanced ECMP Process at P
• Inspect the large flow indication on each received packet
• Use hashing to disperse small flow packets
• Forward large flow packets based on the mapping table
– For an existing flow
• Forward the packet to the path indicated in the table
– For a new flow
• Select the least loaded path for the new flow
• Add an entry in the table for the new flow
– Remove the flow entry when the flow's transport has completed
                                 +-------------+
                                 | Small-Flow  |    | 4 ECMP Paths
                            +--->| Forwarding  |--->|=========
        +------------+      |    | Process     |    |
Packets | Packet     |      |    +-------------+    |=========
------->| Separation |------+                       |
        | Process    |------+                       |=========
        +------------+      |    +-------------+    |
                            |    | Large-Flow  |    |=========
                            +--->| Forwarding  |--->|
                                 | Process     |    |
                                 +-------------+
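The P node behaviour on this slide can be sketched as follows. This is an illustration under stated assumptions rather than the draft's specified procedure: path load is approximated by cumulative bytes forwarded, flow completion is detected with an idle timeout, the small-flow hash is CRC32, and the class and method names are hypothetical.

```python
import zlib

class EnhancedEcmp:
    """Hypothetical P node: hash small flows, table-map large flows."""

    def __init__(self, n_paths, idle_timeout=30.0):
        self.n_paths = n_paths
        self.idle_timeout = idle_timeout  # assumed way to detect finished flows
        self.load = [0] * n_paths         # bytes forwarded per path (assumed metric)
        self.table = {}                   # flow_label -> [path, last_seen]

    def forward(self, flow_label, large_flow, length, now):
        """Return the path index chosen for one packet."""
        if large_flow:
            entry = self.table.get(flow_label)
            if entry is None:
                # New large flow: pin it to the currently least loaded path.
                path = min(range(self.n_paths), key=self.load.__getitem__)
                entry = self.table[flow_label] = [path, now]
            entry[1] = now
            path = entry[0]
        else:
            # Small flow (or no indication set): stateless hash.
            path = zlib.crc32(flow_label.to_bytes(4, "big")) % self.n_paths
        self.load[path] += length
        return path

    def expire(self, now):
        """Drop entries of large flows not seen within idle_timeout."""
        stale = [f for f, (_, seen) in self.table.items()
                 if now - seen > self.idle_timeout]
        for f in stale:
            del self.table[f]
        return len(stale)
```

Because a packet without the large flow indication always takes the hash branch, traffic from an ingress PE that never sets the indication is handled exactly as in existing ECMP.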
Congestion Control
• When congestion happens, the P node selects some large flows to be dropped, rerouted, or cached
– Dramatically reduces the number of impacted services
– Reduces network convergence time (only a few flows are impacted)
-> Large flow indication brings an advantage in congestion control
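One way to picture the selection step is a largest-first pick on the congested path. The slides do not state a selection policy, so the greedy rule, the function name `select_flows_to_shed`, and its parameters are assumptions for illustration only.

```python
def select_flows_to_shed(path_load, capacity, large_flow_rates):
    """Pick large flows, largest first, until the path fits its capacity.

    large_flow_rates maps flow_label -> rate for large flows on this path.
    The returned flows are candidates to be dropped, rerouted, or cached.
    """
    excess = path_load - capacity
    selected = []
    by_rate = sorted(large_flow_rates.items(), key=lambda kv: kv[1], reverse=True)
    for flow, rate in by_rate:
        if excess <= 0:
            break
        selected.append(flow)
        excess -= rate
    return selected
```

Picking the largest flows first keeps the number of affected flows low, which matches the slide's point that only a few flows are impacted.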
Full Backward Compatibility
• When the ingress PE does not support Large Flow Recognition, it SHALL set the F bit to 0
– All packets will be treated as small flows at the P nodes that implement enhanced ECMP
• When the ingress PE supports Large Flow Recognition but a P node does not
– The P node will not check the large flow indication and will treat all packets as small flow packets
• Enhanced ECMP can co-exist with existing ECMP
– Some LSRs can run enhanced ECMP while others run existing ECMP
– This is a great help in network migration
Applicability
• A single large flow in PW
• IP packets
• LSP traffic with entropy label or application label
• Transport over LAG
• MS-PW
Acknowledgement
The authors would like to thank Stewart Bryant, Frederic Jounay, Simon Delord, and Raymond Key for their review and comments
Next Step
We would like to hear people's comments and advice on moving to the next step