Transcript 2 - OpenStack

SDN Performance & Architecture Evaluation
Vijay Seshadri – Cloud Platform Engineering (CPE), Symantec
Jason Venner – Mirantis
Agenda
1. CPE Overview
2. SDN Objectives and Test Plan
3. Test Setup and Framework
4. Test Results
5. Key Findings/Insights
CPE Overview
• CPE Charter
– Build a consolidated cloud infrastructure that offers platform services to host Symantec cloud applications
• Symantec cloud infrastructure is already hosting diverse (security, data management) workloads
– Analytics – reputation-based security, managed security services
– Storage – consumer and enterprise backup/archival
– Network – hosted email & web security
• We are building an OpenStack-based platform that provides additional storage and analytics services
– Secure multi-tenancy is a core objective for all services
CPE Platform Architecture
[Architecture diagram] Cloud applications reach the platform through CLIs, scripts, and a web portal, all via a REST/JSON API. Core and supporting services shown include Compute (Nova), SDN (Neutron), Image (Glance), Identity & Access (Keystone), Object Store, Batch Analytics, Stream Processing, K/V Store, Load Balancing, Message Queue, Mem Cache, Monitoring, Logging, DNS, SQL, Metering, User Mgmt, Authn, Roles, SSL, Email Relay, Deployment, Tenancy, and Quotas, grouped under Compute, Networking, Storage, Big Data, and Messaging.
SDN Objectives
• Provide secure multi-tenancy using strong network isolation
– Policy-driven network access control within (and across) projects/domains
• Provide an ability for product engineering teams to define a network topology via REST APIs (see the sketch after this list)
– Associate network objects dynamically with VMs and projects (tenants)
– Create and manage network access control policies within and across projects
– Enables easier integration of applications on partner infrastructure
• Interconnect OpenStack with bare-metal storage/analytics services
– East-West traffic needs to transit the underlay network
• Support software-driven network functions
– LBaaS, DNSaaS, etc.
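To make the REST-driven workflow concrete, below is a minimal sketch using the Havana-era python-neutronclient. The credentials, names, CIDRs, and ports are hypothetical placeholders, and both NSX and Contrail also expose their own policy APIs on top of this.

```python
# Minimal sketch: build a tenant network topology through the Neutron API.
# Credentials, names, and CIDRs below are placeholders, not values from the test.
from neutronclient.v2_0 import client

neutron = client.Client(username='demo', password='secret',
                        tenant_name='demo-project',
                        auth_url='http://keystone.example.com:5000/v2.0')

# Tenant network and subnet
net = neutron.create_network({'network': {'name': 'app-net'}})['network']
subnet = neutron.create_subnet({'subnet': {'network_id': net['id'],
                                           'ip_version': 4,
                                           'cidr': '10.20.0.0/24'}})['subnet']

# Router so the network can reach the external gateway / underlay
router = neutron.create_router({'router': {'name': 'app-router'}})['router']
neutron.add_interface_router(router['id'], {'subnet_id': subnet['id']})

# Access-control policy expressed as a security-group rule
sg = neutron.create_security_group(
    {'security_group': {'name': 'app-sg'}})['security_group']
neutron.create_security_group_rule(
    {'security_group_rule': {'security_group_id': sg['id'],
                             'direction': 'ingress', 'protocol': 'tcp',
                             'port_range_min': 443, 'port_range_max': 443,
                             'remote_ip_prefix': '10.30.0.0/24'}})
```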
Test Plan
• Secure Multi-Tenancy
– Test network isolation under various configurations
• Same subnet
• Same network, different subnets
• Different networks, different tenants
• Different networks, identical IP addresses
– Test enforcement of network policies
• “Deny All” works and is the default
• “Allow Specific” works
• Removal of an “Allow Specific” works
• Data Plane Performance
– OpenStack internal
• N x N inter-VM communication
• Client-server TCP mesh using iperf3 (a driver sketch follows this list)
Test Plan – Cont’d
– Egress/Ingress
• Simulate data ingress/egress to non-OpenStack services (e.g. a bare-metal Hadoop cluster)
• Control Plane Scalability
– Where does the solution break? Verify correct operation at the limit
– Test rate of creation and resource limits for Neutron objects
• Number of network ports (vNICs)
• Number of networks and subnets
• Number of routers
• Number of active flows
• Number of connections through the external gateway (ingress/egress)
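The N x N mesh above was driven by our test framework; the snippet below is a stand-in sketch rather than the framework itself. It assumes passwordless SSH into the test VMs, iperf3 installed on each, and placeholder VM hostnames; the real framework coordinates many pairs concurrently, while this sketch runs them sequentially for brevity.

```python
# Stand-in sketch for the N x N iperf3 TCP mesh (not the actual framework).
# Assumes passwordless SSH to each test VM and iperf3 installed on all of them.
import json
import subprocess

SOURCES = ['vm-src-%03d' % i for i in range(1, 61)]   # placeholder names
SINKS   = ['vm-snk-%03d' % i for i in range(1, 61)]

def ssh(host, cmd):
    return subprocess.run(['ssh', host, cmd],
                          capture_output=True, text=True, check=True).stdout

# Start an iperf3 server (daemonized) on every sink
for sink in SINKS:
    ssh(sink, 'iperf3 -s -D')

# Drive each source against each sink and record throughput
# (sequential here for brevity; the real framework ran pairs concurrently)
results = {}
for src in SOURCES:
    for dst in SINKS:
        out = ssh(src, 'iperf3 -c %s -t 30 -P 4 -J' % dst)
        bps = json.loads(out)['end']['sum_received']['bits_per_second']
        results[(src, dst)] = bps / 1e9   # Gb/sec

print('aggregate Gb/sec:', sum(results.values()))
```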
1
CPE Overview
2
SDN Objectives and Test Plan
3
Test Setup and Framework
4
Test Results
5
Key Findings/Insights
2
Overview of SDN solutions
• OpenStack is about networking
• OVS/VLAN
– 4k limit on VLANs (the 802.1Q VLAN ID is a 12-bit field)
– Many VLANs spanning many TORs are operationally challenging, especially in rapidly changing environments
– No failure isolation domain
• Overlay
– Simple physical network; each TOR is an L2 island
– L3 between the TORs
– Packet encapsulation for VM traffic
– Controllers orchestrate the tunnel mesh for VM traffic (see the sketch below)
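As a back-of-the-envelope illustration of why controller orchestration matters, the worst-case tunnel mesh between hypervisors grows quadratically with host count; the numbers below are illustrative, not measured.

```python
# Back-of-the-envelope: a full tunnel mesh between hypervisors grows
# quadratically with host count, which is why controllers manage it.
def full_mesh_tunnels(hosts):
    return hosts * (hosts - 1) // 2

for n in (7, 70, 140):   # e.g. one rack, half the pod, the full 140 nodes
    print(n, 'hosts ->', full_mesh_tunnels(n), 'point-to-point tunnels')
# 140 hosts -> 9730 tunnels in the worst case
```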
Physical Test setup
• 140 two-socket servers in 7 racks; each rack has 80Gb (2x40g) to the spine
• 2x10Gb LACP data, 1g IPMI, and 1g MGMT per server
• 2 Juniper MX routers, each with 4x10G, in place of the last 4 servers
Test Framework Overview
• We build a pool of VMs, each placed into a specific rack (a placement sketch follows)
• We have a control network we use to launch test cases and collect results
[Diagram] The OpenStack public API service, a pool of Test VMs, and a Control VM that launches tests and collects results.
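A minimal placement sketch, using today's openstacksdk for brevity (the original framework predates it) and assuming each rack is exposed to Nova as an availability zone named rack1..rack7; image, flavor, network, and cloud names are placeholders.

```python
# Placement sketch: boot one test VM per rack, assuming each rack is exposed
# as a Nova availability zone (an assumption, not necessarily how the real
# framework did it). All names are placeholders.
import openstack

conn = openstack.connect(cloud='cpe-test')                 # placeholder cloud

image = conn.image.find_image('ubuntu-12.04-test')         # placeholder
flavor = conn.compute.find_flavor('m1.medium')             # placeholder
net = conn.network.find_network('test-net')                # placeholder

for rack in range(1, 8):
    conn.compute.create_server(
        name='test-vm-rack%d' % rack,
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{'uuid': net.id}],
        availability_zone='rack%d' % rack)
```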
NSX Test Setup: OpenStack Internal
• Same 7-rack layout (2x40g per rack to the spine; 2x10Gb LACP data, 1g IPMI, 1g MGMT per server)
• 3 OpenStack controllers
• 4 NSX controllers
• 4 NSX gateways
• NSX controllers and NSX gateway servers sit in the bottom-right rack of the diagram
NSX Test Setup: Ingress/Egress
• 3 OpenStack controllers, 4 NSX controllers, and the NSX gateways on the same rack layout
• 20 outside hosts act as the traffic source/sink on the far side of the NSX gateways
Contrail Test Setup: OpenStack Internal
• Same rack layout (2x40g per rack; 2x10Gb LACP data, 1g IPMI, 1g MGMT per server)
• 1 OpenStack controller
• 3 Contrail controller servers
• 4 servers with 2x10G ports used
• 2 Juniper MX routers, each with 4x10G, in place of the last 4 servers
Contrail Test Setup: Ingress/Egress
• 1 OpenStack controller
• 20 outside hosts act as the traffic source/sink
• 4 servers with 2x10G ports used
• 2 Juniper MX routers, each with 4x10G, in place of the last 4 servers
Multi-tenancy Test Results
• Connectivity within a subnet
– Both solutions proved isolation
– Contrail 1.04 was not “deny all” by default (a policy sketch follows below)
• Same network, different subnets
– Both solutions proved isolation
• Different networks in different projects
– Both solutions proved isolation
– Contrail used floating IPs
• Different networks, overlapping IP addresses
– Both solutions proved isolation
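The “deny all” / “allow specific” / removal checks map naturally onto Neutron security-group operations; the sketch below shows that shape with placeholder names, ports, and CIDRs (Contrail's own network-policy API could be exercised the same way).

```python
# Sketch of the policy-enforcement checks: a fresh security group should deny
# ingress by default, an "allow specific" rule should open only that flow, and
# deleting the rule should close it again. Names, ports, CIDRs are placeholders.
from neutronclient.v2_0 import client

neutron = client.Client(username='demo', password='secret',
                        tenant_name='tenant-a',
                        auth_url='http://keystone.example.com:5000/v2.0')

sg = neutron.create_security_group(
    {'security_group': {'name': 'isolation-test'}})['security_group']

# 1. Default state: no ingress rules -> "deny all" (verify traffic is dropped)
assert not [r for r in sg['security_group_rules'] if r['direction'] == 'ingress']

# 2. "Allow specific": open TCP/5201 (iperf3) from one peer subnet only
rule = neutron.create_security_group_rule(
    {'security_group_rule': {'security_group_id': sg['id'],
                             'direction': 'ingress', 'protocol': 'tcp',
                             'port_range_min': 5201, 'port_range_max': 5201,
                             'remote_ip_prefix': '10.20.1.0/24'}})['security_group_rule']
# ...re-run the traffic test: only 10.20.1.0/24 should now reach port 5201

# 3. Removal: delete the rule and confirm traffic is dropped again
neutron.delete_security_group_rule(rule['id'])
```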
Data Plane Test Setup
[Diagram] A Test Controller VM uses a control channel to drive the iperf sources and iperf sinks; the iperf data traffic flows through the TORs and across the spine. The source and sink pools were run at 1, 60, 160, and 300 VMs each.
Data Plane Test Results: OpenStack Internal
• Both overlay solutions added 4% payload overhead (a back-of-the-envelope breakdown follows)
– This is in addition to the 3% underlay frame overhead
• Both solutions ran close to ‘virtual’ wire speed
– We hit 75Gb/sec per TOR out of 80Gb/sec
– Peak across the TORs was 225Gb/sec out of 240Gb/sec
• Traversing SDN routers had little impact on peak performance
• Neutron OVS/VLAN required jumbo frames to hit wire speed
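As a rough sanity check on those percentages, here is a back-of-the-envelope calculation. The header sizes are generic (Ethernet framing for the underlay, a VXLAN-style outer header for the overlay); the exact encapsulations the two solutions use differ, so treat the output as illustrative only.

```python
# Rough sanity check on the overhead percentages (illustrative only; the
# encapsulation headers each solution actually adds differ in size).
MTU = 1500

# Underlay frame overhead: Ethernet preamble + header + FCS + inter-frame gap
# carried for every 1500-byte frame.
eth_framing = 8 + 14 + 4 + 12
print('underlay framing ~ %.1f%%' % (100 * eth_framing / (MTU + eth_framing)))
# -> roughly 2.5%, in the ballpark of the 3% quoted above

# Extra overlay encapsulation, VXLAN-style: outer Ethernet + outer IPv4
# + UDP + VXLAN headers on every tunnelled packet.
overlay_headers = 14 + 20 + 8 + 8
print('overlay encapsulation ~ %.1f%%' % (100 * overlay_headers / MTU))
# -> roughly 3-4%, depending on the encapsulation actually used
```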
Data Plane Test Results: Ingress/Egress
• On a per-VM basis we ran at virtual wire speed to/from our external hosts
• OVS bugs in OVS 2.1 limited the performance of individual connections under high concurrency (>10,000)
• Saturation of our TOR-to-spine connection was the principal performance limit
• Gateway traffic required 2 passes through our gateway TOR – one busy TOR
• We saturated the gateway TOR, with data payload at 37 Gb/sec
• Tested up to 100,000 concurrent connections
SDN Gateway Services
• OVS gateway performance fell off after 10,000 concurrent connections
• vRouter was able to scale to 100,000 concurrent connections
[Diagram] iperf between virtual machines behind the gateway TOR and iperf on external physical hosts, over the 2x40g uplinks and 2x10Gb LACP data links.
Control Plane Test Results
• Number of Networks – NSX: >16,000 (limiting factor is the amount of RAM in the NSX controllers); Contrail: >2,500 (OpenStack configuration limited the test)
• Number of Subnets per Network – NSX: >1024 (OpenStack configuration limited the test); Contrail: >1024 (OpenStack configuration limited the test)
• Number of Network Ports – NSX: >8192 (OpenStack configuration limited the test); Contrail: >5000 (hardware limits in the cluster limited the test)
• Number of Networks per Router – NSX: >512 (published limit is 10); Contrail: >300 (OpenStack configuration limited the test)
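A minimal sketch of the kind of creation-rate probe behind these numbers (not our actual harness): it creates Neutron networks in a loop, times each call, and stops at the first API failure. Credentials and the loop cap are placeholders.

```python
# Control-plane creation-rate probe sketch (not the real harness): create
# networks until the API fails or a cap is reached, timing each call.
import time
from neutronclient.v2_0 import client
from neutronclient.common import exceptions

neutron = client.Client(username='admin', password='secret',
                        tenant_name='admin',
                        auth_url='http://keystone.example.com:5000/v2.0')

timings = []
for i in range(20000):                      # cap is arbitrary for the sketch
    start = time.time()
    try:
        neutron.create_network({'network': {'name': 'scale-net-%05d' % i}})
    except exceptions.NeutronClientException as err:
        print('failed at network %d: %s' % (i, err))
        break
    timings.append(time.time() - start)

if timings:
    print('created %d networks, mean %.2fs/call, worst %.2fs'
          % (len(timings), sum(timings) / len(timings), max(timings)))
```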
Key Findings: OpenStack configuration
• Building a 100+ node OpenStack cluster requires extensive config tuning
– Host OS, MySQL, DB indexes and RabbitMQ tuning
– Horizontal scaling of API servers
– API service thread counts, connection pools and queues
– Python REST servers are not ideal for handling tens of thousands of requests per second
– Keystone configuration
• Memory cache for tokens
• Short-lived tokens with regular purge
• The Havana Neutron implementation does not meet our scalability goals
– 8+ minutes to turn up a network port on a network with a few hundred VMs
– Long delays with the query APIs when there are a few thousand VMs
Key findings: VM Network Performance
• Tune your VM kernels
– With the default Ubuntu 12.04 or CentOS 6.5 kernel settings for network buffer space and TCP window sizes, we see 1.8Gb/sec/VM with a 1500-byte MTU
• With standard 10G tuning for rmem/wmem and interface txqueuelen (a tuning sketch follows):
– OVS/VLAN: 3.8Gb/sec/VM at MTU 1500; 9Gb/sec/VM at MTU 9000
– NSX: 9.1Gb/sec/VM
– Contrail: 9.1Gb/sec/VM
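A minimal sketch of the kind of 10G guest tuning referred to above; the sysctl values and interface name are illustrative examples, not the settings we shipped, and applying them requires root.

```python
# Illustrative 10G guest tuning (values are examples, not our final settings).
# Requires root; applies larger socket buffers / TCP windows and a deeper
# transmit queue on the VM's data interface.
import subprocess

SYSCTLS = {
    'net.core.rmem_max': '134217728',
    'net.core.wmem_max': '134217728',
    'net.ipv4.tcp_rmem': '4096 87380 67108864',
    'net.ipv4.tcp_wmem': '4096 65536 67108864',
}

for key, value in SYSCTLS.items():
    subprocess.run(['sysctl', '-w', '%s=%s' % (key, value)], check=True)

# Deeper transmit queue on the data interface (interface name is a placeholder)
subprocess.run(['ip', 'link', 'set', 'dev', 'eth0', 'txqueuelen', '10000'],
               check=True)
```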
Conclusion
• SDN is a core capability for us to offer a secure multi-tenant cloud platform
• Overlay solutions provide strong network isolation and access control
• Our use case requires extensive traffic into and out of the SDN zone
– NSX requires gateway host servers and additional network configuration
– Contrail uses MPLS-enabled routers and is more integrated with the underlay infrastructure
• Both overlay solutions met our short-term performance and scalability goals
• We will continue to evaluate the SDN space for solutions that meet our long-term goals
Thank you!
Copyright © 2014 Symantec Corporation. All rights reserved.