Transcript hotnets10
CloudPolice: Taking Access Control Out of the
Network
Lucian Popa (UC Berkeley/ICSI)
Minlan Yu (Princeton Univ.)
Steven Y. Ko (Princeton Univ.)
Sylvia Ratnasamy (Intel Labs Berkeley)
Ion Stoica (UC Berkeley)
Context
Infrastructure as a Service virtualized clouds
[Figure: VMs on a hypervisor; traffic internal to cloud]
Context
Cloud computing requires network access control
Access control policy of tenant X = what network traffic
tenant X is willing to accept
[Figure: Tenant X states "Y can talk to me" about Tenant Y]
Why Access Control in Clouds? (1)
For isolation
Policy: deny incoming traffic from any other tenant
[Figure: Exbay and Amazonia]
Why Access Control in Clouds? (2)
For inter-tenant & tenant-provider communication
Policy: allow/deny traffic from specific tenants
Increasingly common in cloud environments
Low latency and high bandwidth
Ease of service composition
Real-time bidding advertising
1. Send information about client
2. Receive ad bids
3. Return ad of highest bidder
Policy of Exbay: allow traffic from
AdNetworks, deny all other traffic
[Figure: Exbay, Amazonia, and the Ad Networks]
Other service examples: database (SimpleDB), desktop,
communication (SQS), map-reduce++, Facebook, host
managing, locking, etc.
Why Access Control in Clouds? (3)
For inter-tenant & tenant-provider communication
Policy: weighted bandwidth allocation between tenants
Share bandwidth fairly among tenants regardless of #VM sources
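A minimal sketch of what "fair among tenants regardless of #VM sources" could mean in practice. The function name, the weights, and the 1000 Mbps link are illustrative assumptions, not part of the talk: the link is first divided among sending tenants by weight, and each tenant's share is then split evenly among its source VMs, so adding more sources does not increase a tenant's total.

```python
from collections import defaultdict

def allocate_bandwidth(link_capacity_mbps, tenant_weights, sources):
    """Split a link among tenants by weight, then evenly among each
    tenant's active source VMs (hypothetical helper, not from the talk).

    tenant_weights: {tenant: weight}
    sources: iterable of (tenant, vm_id) pairs currently sending
    Returns {(tenant, vm_id): rate in Mbps}.
    """
    vms_by_tenant = defaultdict(list)
    for tenant, vm in sources:
        vms_by_tenant[tenant].append(vm)

    # Only tenants that are actually sending compete for the link.
    total_weight = sum(tenant_weights[t] for t in vms_by_tenant)

    rates = {}
    for tenant, vms in vms_by_tenant.items():
        tenant_share = link_capacity_mbps * tenant_weights[tenant] / total_weight
        for vm in vms:
            # A tenant's total stays fixed no matter how many VMs it uses.
            rates[(tenant, vm)] = tenant_share / len(vms)
    return rates

# Assumed scenario: equal weights, Nextbay sends from three VMs, Exbay from one.
rates = allocate_bandwidth(
    1000,
    {"Exbay": 1, "Nextbay": 1},
    [("Exbay", "e1"), ("Nextbay", "n1"), ("Nextbay", "n2"), ("Nextbay", "n3")],
)
```

With equal weights, Exbay's single VM gets half the link and Nextbay's three VMs share the other half.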
Other example policies:
• Rate-limited access
• Allow only locally initiated connections
• Nighttime access only
Why Access Control in Clouds? (4)
DoS protection
One tenant can attack another tenant
Reduce bandwidth and slow down machines
Attackers more powerful: higher bandwidths
Barrier is lower: pay for attacking hosts (compromise credit cards instead
of hosts)
Hence, the problem
Want access control in clouds that
Is resilient to DoS
Supports rich inter-tenant policies
Scales (100k servers, 10k tenants)
Tolerates high dynamicity (100k VMs started per day, more than one per second)
Traditional access control mechanisms not well suited to
meeting these requirements
Existing Access Control
Cloud APIs are narrow
On/off
No locally initiated connections, no rate-limiting, no weighted
allocation
Mechanisms inherited from enterprises
VLANs
Firewalls
But clouds != enterprises
Clouds != Enterprises
Enterprises are not multi-tenant
Few DoS concerns between departments
Typically simpler policies
Enterprises don’t have the same dynamicity and scale
10k tenants vs. tens of departments; 1 VM/s vs. mostly static
Clouds have different network designs
High bisection bandwidths, multiple paths, different L2/L3 mix
Many new topologies: FatTree, VL2, BCube, DCell, etc.
VLANs not well suited for clouds
Inflexible policies
Difficult to scale (cloud size & dynamicity)
Limited number, spanning tree
Limited network designs
No L3 networks, no multiple paths, inter-VLAN through router
Firewalls not well suited for clouds
Offering DoS protection is difficult
Must be applied at source → hard to update
Inflexible policies
Scale through prefix aggregation
Difficult to manage
10k tenants with multiple prefixes, different scaling requirements
No L3 networks
Recap
Traditional access control is not well suited for clouds
Couple access control with network operation
With switching – VLANs
With address assignment – Firewalls
CloudPolice takes access control out of the network
Outline
Part 1 – Context and Motivation
Access control for clouds: why and what?
Limitations of traditional mechanisms
Part 2 – CloudPolice
Approach
Operation
Goal
Network Access Control for Clouds that is:
1. Independent of network topology and addressing
2. Scalable (millions of hosts, high churn)
3. Flexible (on/off access, rated access, fair access)
4. Robust to (internal) DDoS attacks
CloudPolice
Sufficient and advantageous to implement access control only
within hypervisors
Trusted
Network independent
Full software programmability → flexible
Close to VMs → block unwanted traffic before network and help against DoS
Easy deployability
CloudPolice Policy Model
Group = set of tenant VMs with same access control policy
Policy = set of Rules
Rule = IF Condition THEN Action
Condition = logical expression
with predicates based on:
• Group of sender
• Packet header
• Current time
• History of traffic
Action:
• Allow
• Block
• Rate-limit (token bucket)
Applied per flow, source VM, or source group
Hypervisor-based
VM
VM
Hypervisor
Src.
VM
VM
VM
Hypervisor
Dst.
VM
CloudPolice
Hypervisor-based
Avoid DoS and wasted resources apply policy at source
VM
VM
Hypervisor
Src.
VM
VM
VM
Hypervisor
Dst.
VM
CloudPolice
Hypervisor-based
How to apply destination’s policy at the source hypervisor?
VM
VM
Hypervisor
Src.
VM
VM
VM
Hypervisor
Dst.
VM
CloudPolice
Hypervisor-based
Centralized policy repository?
VM
VM
Hypervisor
Src.
VM
VM
VM
Hypervisor
Dst.
VM
CloudPolice
Hypervisor-based
Centralized policy repository?
VM
VM
Src.
VM
VM
Allow?
Hypervisor
VM
Hypervisor
Dst.
VM
CloudPolice
Hypervisor-based
Centralized policy repository?
Centralized service requires high availability and throughput
100k servers and 10 new flows/VM/s → 1M decisions/s on average!
Caching can be ineffective (random patterns, malicious pollution)
Centralized service can be a DoS target
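The load figure can be reproduced with a quick calculation. The one-VM-per-server value is an assumption made here so that the numbers match the slide's total; the slide itself does not state a VM density.

```python
servers = 100_000
vms_per_server = 1            # assumption: implied by the slide's 1M total
new_flows_per_vm_per_s = 10

# Every new flow needs one policy decision from the central repository.
decisions_per_s = servers * vms_per_server * new_flows_per_vm_per_s
print(f"{decisions_per_s:,} policy decisions/s")
```

A higher VM density would scale the load on the central service proportionally.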
Decentralized
Distribute all policies to all hypervisors?
Too heavyweight if network independent
Full group membership required; Group updates propagated everywhere
100k new VMs/day, 100k servers → 100k updates/s on average
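A rough check of the update rate, assuming each VM start produces one group-membership update that must reach every server (an interpretation of "propagated everywhere", not a stated detail):

```python
new_vms_per_day = 100_000
servers = 100_000
seconds_per_day = 24 * 60 * 60

vm_starts_per_s = new_vms_per_day / seconds_per_day   # a bit more than one per second
# Assumption: one membership update per VM start, delivered to every server.
updates_per_s = vm_starts_per_s * servers
print(f"{vm_starts_per_s:.2f} VM starts/s, about {updates_per_s:,.0f} updates/s")
```

The result is on the order of 100k update deliveries per second, consistent with the slide's rounded figure.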
Apply at destination and enforce at source
Destination hypervisor applies destination’s policy
Source hypervisor enforces policy’s action
Inspired by Internet Research
Internet solutions to DDoS
Push-back filters [AIP, Pushback, AITF, StopIt]
Network Capabilities [SIFF, TVA]
Handle large and dynamic networks, millions of users
More easily deployed: Clouds != Internet
Clouds are controlled environments
Both communication endpoints can be controlled
Single administrative domain
New tools: trusted software layer – Hypervisor
CloudPolice
Each hypervisor needs to know for hosted VMs: group and policy
Hypervisor hosting VMs X, Y and Z holds policies for X, Y and Z
X’s group policy: IF group = A allow
IF group = B block
IF group = C & port = 80 rate-limit to 100Mbps
Y’s group policy: IF …
Z’s group policy: IF …
Installed by provider service that starts VMs
Policy could also be specified / updated by VM
CloudPolice
Hypervisor hosting VMs X, Y and Z: filter for incoming/outgoing flows
CloudPolice
VM Z starts a flow to VM C, hosted on another hypervisor
CloudPolice inserts control packet containing group of Z and first
packet header
CloudPolice
CloudPolice verifies policy of destination VM
If allowed, packets are forwarded to destination VM
If blocked or rate limited, send control packet to source hypervisor to block
or rate-limit source (flow/VM)
Soft-state and timeouts handle policy invalidations and packet losses
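The exchange between the two hypervisors can be sketched as follows. This is a simplified illustration, not the authors' implementation: the class and method names, the string verdicts, and the 30-second timeout are assumptions, and rate-limiting is left out so that only the allow/block path is shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

FlowId = Tuple[str, str, int]  # (source VM, destination VM, destination port)

@dataclass
class ControlPacket:
    sender_group: str   # group of the source VM
    flow: FlowId        # stands in for the first packet's header

@dataclass
class Hypervisor:
    name: str
    # Destination VM -> policy function returning "allow" or "block".
    policies: Dict[str, Callable[[str, FlowId], str]] = field(default_factory=dict)
    # Hosted VM -> group.
    groups: Dict[str, str] = field(default_factory=dict)
    # Soft state at the source: flow -> time at which the block expires.
    blocked: Dict[FlowId, float] = field(default_factory=dict)

    def start_flow(self, flow, dst_hv, now):
        """Source side: drop if a live block exists, else announce the flow."""
        if self.blocked.get(flow, 0.0) > now:
            return "dropped at source"
        self.blocked.pop(flow, None)  # forget an expired block entry
        ctrl = ControlPacket(self.groups[flow[0]], flow)
        return dst_hv.receive_control(ctrl, src_hv=self, now=now)

    def receive_control(self, ctrl, src_hv, now):
        """Destination side: apply the destination VM's policy."""
        verdict = self.policies[ctrl.flow[1]](ctrl.sender_group, ctrl.flow)
        if verdict == "allow":
            return "forwarded to destination VM"
        # Ask the source hypervisor to enforce the block.
        src_hv.install_block(ctrl.flow, now)
        return "blocked; control packet sent to source"

    def install_block(self, flow, now, timeout_s=30.0):
        # Soft state: the entry expires on its own (timeout value is assumed).
        self.blocked[flow] = now + timeout_s

# Assumed scenario: Z belongs to group B; C only accepts traffic from group A.
src = Hypervisor("hv1", groups={"Z": "B"})
dst = Hypervisor("hv2", policies={
    "C": lambda group, flow: "allow" if group == "A" else "block",
})
flow = ("Z", "C", 80)
first = src.start_flow(flow, dst, now=0.0)     # destination evaluates policy
second = src.start_flow(flow, dst, now=1.0)    # stopped by source soft state
third = src.start_flow(flow, dst, now=100.0)   # block expired, policy re-checked
```

Because the block entry is soft state, a later policy change at the destination takes effect once the entry times out and the flow is checked again.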
Scalability
CloudPolice takes the best of both worlds
Centralized vs. every server stores all policies
Load spread across all servers
Maintaining and enforcing policies
Update propagation is contained
Group membership updates not propagated
Policy updates propagated only to group
Security Analysis Sketch
Attackers
VMs – corrupted or paid by malicious tenants
Attacks considered
Violate access control policies to reach destination
DoS with unauthorized traffic
DoS with authorized traffic
Assumptions
Hypervisors not compromised
Security Analysis Sketch
Violate access control policies to reach destination
Policy distributed securely to hypervisor
Control packets cannot be spoofed, only sent by hypervisors
DoS with unauthorized traffic
Control packets block unauthorized traffic at source
Attackers attempt to cause drops of control packets
Retry or prioritize control packets
DoS with authorized traffic
Also need performance isolation for full protection
CloudPolice can implement some performance isolation
Rate-limit to fair share of destination link
Share access link evenly between destination VMs
Future Work
Implement CloudPolice prototype
Extend CloudPolice
Policies with application-level semantics (dynamic policies)
Policies based on group-wide state
Beyond access control?
More flexible actions, e.g., send to middlebox
Performance isolation framework
Summary
Access control in cloud computing requires new mechanisms
and extended policies
CloudPolice
Takes advantage of trusted hypervisors
Inspired by past work on Internet DDoS protection
Properties
Network independent
Scalable
Flexible
Robust to (internal) DDoS attacks
Backup Slides
Related Work
OpenFlow & (Onix | DIFANE) & Open vSwitch
Decisions not based on logical identifier (group/tenant)
Onix only isolation framework
OpenFlow actions designed for switches (e.g., currently can’t
rate-limit)
Require scaling central controller
Vs. software update for CloudPolice
Contributions
Identify that new access control mechanism is needed in clouds
Pinpoint the challenges and requirements
Identify that access control should be done in hypervisors
Propose CloudPolice, a mechanism that satisfies these requirements
Compromise Single Hypervisor
Can prevent compromised hypervisors from violating security
policies
1. Security credentials associated with group identifier
Cannot be sent if unknown (known only for hosted VMs)
E.g., group ID has key in name
2. Prevent spoofed control packets in the network
Like IP anti-spoofing in switches/routers
Today’s Cloud Mechanisms?
Solutions not public
Could be similar to our solution
Could provide fewer properties
API is narrow
On/off between groups
No locally initiated connections, no rate-limiting, no weighted
allocation
Feasibility
Working on implementing CloudPolice prototype
Fast path – act on per flow state
Open vSwitch and software routers [RouteBricks,
PacketShader] suggest this is feasible
Slow path – execute policy and install flow state
1/N of requirements for centralized repository
Few hosted VMs dominated by policy complexity
Software router applications suggest if-then-else structures can be parsed
fast [RBF]
Other Related Work
VL2’s approach if it would be applied to hypervisors
Centralized repository
Can violate policies if IP of destination known
Firewalls not Suited for Clouds
Not well suited against DoS
Must be applied at source → hard to update
Inflexible policies for clouds
Scaling & network designs
With no prefix aggregation
Difficult to scale (100k+ entries)
Needs updating on all VM starts (more than once/s)
With prefix aggregation
Complex to manage
10k tenants with multiple prefixes, different scaling requirements
No L3 networks