Transcript hotnets10

CloudPolice: Taking Access Control Out of the Network
Lucian Popa (UC Berkeley/ICSI)
Minlan Yu (Princeton Univ.)
Steven Y. Ko (Princeton Univ.)
Sylvia Ratnasamy (Intel Labs Berkeley)
Ion Stoica (UC Berkeley)
Context
 Infrastructure as a Service virtualized clouds
 Traffic internal to cloud
 Cloud computing requires network access control
 Access control policy of tenant X = what network traffic tenant X is willing to accept
[Figure: VMs running on hypervisors; Tenant X states "Y can talk to me" about Tenant Y]
Why Access Control in Clouds? (1)
 For isolation
 Policy: deny incoming traffic from any other tenant
Exbay
Amazonia
Why Access Control in Clouds? (2)
 For inter-tenant & tenant-provider communication
 Policy: allow/deny traffic from specific tenants
 Increasingly common in cloud environments
 Low latency and high bandwidth
 Ease of service composition
 Example: real-time bidding advertising
 Send information about client
 Receive ad bids
 Return ad of highest bidder
 Policy of Exbay: allow traffic from AdNetworks, deny all other traffic
 Other service examples: database (SimpleDB), desktop, communication (SQS), map-reduce++, Facebook, host managing, locking, etc.
[Figure: Exbay, Amazonia, and ad networks]
Why Access Control in Clouds? (3)
 For inter-tenant & tenant-provider communication
 Policy: weighted bandwidth allocation between tenants
 Share bandwidth fairly among tenants regardless of #VM sources
 Other example policies:
• Rate-limited access
• Allow only locally initiated connections
• Nighttime access only
[Figure: Nextbay, Exbay, Amazonia, and ad networks]
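A minimal sketch of what "share bandwidth fairly among tenants regardless of #VM sources" could mean in practice, assuming a single destination link, one weight per tenant, and an even split among each tenant's active source VMs. The function name, the weights, and the VM identifiers are hypothetical and do not come from the slides.

```python
from collections import defaultdict


def per_vm_rates(link_capacity_mbps, tenant_weights, active_sources):
    """Split a link among tenants by weight, then evenly among each tenant's source VMs.

    tenant_weights: {tenant: weight}
    active_sources: list of (tenant, vm_id) pairs currently sending
    """
    vms_by_tenant = defaultdict(list)
    for tenant, vm in active_sources:
        vms_by_tenant[tenant].append(vm)

    # Only tenants that are currently sending compete for the link.
    total_weight = sum(tenant_weights[t] for t in vms_by_tenant)

    rates = {}
    for tenant, vms in vms_by_tenant.items():
        tenant_share = link_capacity_mbps * tenant_weights[tenant] / total_weight
        for vm in vms:
            # Adding more source VMs does not increase the tenant's total share.
            rates[(tenant, vm)] = tenant_share / len(vms)
    return rates


# Nextbay sends from 8 VMs, Exbay from 2; both tenants have equal weight.
sources = [("Nextbay", f"n{i}") for i in range(8)] + [("Exbay", "e0"), ("Exbay", "e1")]
rates = per_vm_rates(1000, {"Nextbay": 1, "Exbay": 1}, sources)
```

With equal weights each tenant gets half of the link, so a tenant cannot gain bandwidth simply by starting more source VMs.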
Why Access Control in Clouds? (4)
 DoS protection
 One tenant can attack another tenant
 Reduce bandwidth and slow down machines
 Attackers more powerful: higher bandwidths
 Barrier is lower: pay for attacking hosts (compromise credit cards instead of hosts)
[Figure: Nextbay, Exbay, Amazonia (marked with an X), and ad networks]
Hence, the problem
 Want access control in clouds that
 Is resilient to DoS
 Supports rich inter-tenant policies
 Scales
 100k servers
 10k tenants
 Tolerates high dynamicity
 100k VMs started per day, more than one per second
 Traditional access control mechanisms not well suited to meeting these requirements
Existing Access Control
 Cloud APIs are narrow
 On/off
 No locally initiated connections, no rate-limiting, no weighted allocation
 Mechanisms inherited from enterprises
 VLANs
 Firewalls
 But clouds != enterprises
Clouds != Enterprises
 Enterprises are not multi-tenant
 Few DoS concerns between departments
 Typically simpler policies
 Enterprises don’t have the same dynamicity and scale
 10k tenants vs. 10s of departments; 1 new VM/s vs. mostly static
 Clouds have different network designs
 High bisection bandwidths, multiple paths, different L2/L3 mix
 Many new topologies: FatTree, VL2, BCube, DCell, etc.
VLANs not well suited for clouds
 Inflexible policies
 Difficult to scale (cloud size & dynamicity)
 Limited number, spanning tree
 Limited network designs
 No L3 networks, no multiple paths, inter-VLAN through router
Firewalls not well suited for clouds
 Offering DoS protection is difficult
 Must be applied at source → hard to update
 Inflexible policies
 Scale through prefix aggregation
 Difficult to manage
 10k tenants with multiple prefixes, different scaling requirements
 No L3 networks
Recap
 Traditional access control is not well suited for clouds
 Couple access control with network operation
 With switching – VLANs
 With address assignment – Firewalls
 CloudPolice takes access control out of the network
Outline
 Part 1 – Context and Motivation
 Access control for clouds: why and what?
 Limitations of traditional mechanisms
 Part 2 – CloudPolice
 Approach
 Operation
Goal
Network Access Control for Clouds that is:
1. Independent of network topology and addressing
2. Scalable (millions of hosts, high churn)
3. Flexible (on/off access, rated access, fair access)
4. Robust to (internal) DDoS attacks
CloudPolice
 Sufficient and advantageous to implement access control only within hypervisors
 Trusted
 Network independent
 Full software programmability → flexible
 Close to VMs → block unwanted traffic before network and help against DoS
 Easy deployability
CloudPolice Policy Model
 Group = set of tenant VMs with same access control policy
 Policy = set of Rules
 Rule = IF Condition THEN Action
 Condition = logical expression with predicates based on:
• Group of sender
• Packet header
• Current time
• History of traffic
 Action:
• Allow
• Block
• Rate-limit (token bucket)
 Action applied per flow, per source VM, or per source group
[Figure: VMs running on a hypervisor]
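The policy model (Policy = set of Rules, Rule = IF Condition THEN Action, with a token bucket for rate-limiting) can be sketched as follows. This is an illustration under stated assumptions, not the CloudPolice implementation: the class names, first-match rule ordering, and the default-block behaviour are all hypothetical, and the packet header is reduced to a destination port.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

ALLOW, BLOCK, RATE_LIMIT = "allow", "block", "rate-limit"


@dataclass(frozen=True)
class Packet:
    src_group: str   # group of the sending VM
    dst_port: int    # stands in for the packet header fields


@dataclass
class Rule:
    condition: Callable[[Packet], bool]  # IF Condition ...
    action: str                          # ... THEN Action
    rate_mbps: float = 0.0               # used only by RATE_LIMIT


@dataclass
class Policy:
    rules: List[Rule]
    default: str = BLOCK  # assumption: traffic matching no rule is blocked

    def evaluate(self, pkt: Packet) -> Tuple[str, float]:
        # Assumption: the first matching rule wins.
        for rule in self.rules:
            if rule.condition(pkt):
                return rule.action, rule.rate_mbps
        return self.default, 0.0


class TokenBucket:
    """Token bucket for the rate-limit action; time is passed in explicitly."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0

    def consume(self, nbytes: int, now: float) -> bool:
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


policy = Policy(rules=[
    Rule(lambda p: p.src_group == "A", ALLOW),
    Rule(lambda p: p.src_group == "B", BLOCK),
    Rule(lambda p: p.src_group == "C" and p.dst_port == 80, RATE_LIMIT, 100.0),
])
```

Conditions are plain callables here, so predicates on current time or traffic history could be added by closing over a clock or a counter; that extension is left out to keep the sketch short.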
CloudPolice
 Hypervisor-based
 Avoid DoS and wasted resources → apply policy at source
 How to apply destination’s policy at the source hypervisor?
 Centralized policy repository?
 Centralized service requires high availability and throughput
 100k servers and 10 new flows/VM/s → 1M decisions/s on average!
 Caching can be ineffective (random patterns, malicious pollution)
 Centralized service can be a DoS target
 Decentralized
 Distribute all policies to all hypervisors?
 Too heavyweight if network independent
 Full group membership required; group updates propagated everywhere
 100k new VMs/day, 100k servers → 100k updates/s on average
 Apply at destination and enforce at source
 Destination hypervisor: apply destination’s policy
 Source hypervisor: enforce policy’s action
[Figure: source VM and destination VM hosted on separate hypervisors]
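The two load estimates quoted for the rejected designs can be checked with a few lines of arithmetic. The sketch assumes one VM per server for the centralized case, which is what makes the 1M decisions/s figure work out; that assumption and the variable names are not stated on the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60

servers = 100_000
new_flows_per_vm_per_s = 10
vms_per_server = 1  # assumption: needed to reproduce the 1M decisions/s figure

# Centralized repository: every new flow needs one policy decision.
central_decisions_per_s = servers * vms_per_server * new_flows_per_vm_per_s

# Full replication: every VM start is a group update sent to every server.
new_vms_per_day = 100_000
vm_starts_per_s = new_vms_per_day / SECONDS_PER_DAY
replication_updates_per_s = vm_starts_per_s * servers

print(f"centralized decisions/s: {central_decisions_per_s:,}")
print(f"VM starts/s: {vm_starts_per_s:.2f}")
print(f"replication updates/s: {replication_updates_per_s:,.0f}")
```

The VM start rate comes out slightly above one per second, so the replication load is on the order of the 100k updates/s quoted above.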
Inspired by Internet Research
 Internet solutions to DDoS
 Push-back filters [AIP, Pushback, AITF, StopIt]
 Network Capabilities [SIFF, TVA]
 Handle large and dynamic networks, millions of users
 More easily deployed: Clouds != Internet
 Clouds are controlled environments
 Both communication endpoints can be controlled
 Single administrative domain
 New tools: trusted software layer – Hypervisor
Outline
 Part 1 – Context and Motivation
 Access control for clouds: why and what?
 Limitations of traditional mechanisms
 Part 2 – CloudPolice
 Approach
 Operation
CloudPolice
 Each hypervisor needs to know for hosted VMs: group and policy
 Installed by provider service that starts VMs
 Policy could also be specified / updated by VM
 Example: a hypervisor hosting VMs X, Y, and Z holds the policies for X, Y, and Z
 X’s group policy:
IF group = A → allow
IF group = B → block
IF group = C & port = 80 → rate-limit to 100Mbps
 Y’s group policy: IF …
 Z’s group policy: IF …
 Hypervisor: filter for incoming/outgoing flows
 Example: Z starts a flow to C, which is hosted on a different hypervisor
 CloudPolice inserts control packet containing group of Z and first packet header
 CloudPolice verifies policy of destination VM
 If allowed, packets are forwarded to destination VM
 If blocked or rate limited, send control packet to source hypervisor to block or rate-limit source (flow/VM)
 Soft-state and timeouts handle policy invalidations and packet losses
[Figure: source hypervisor hosting X, Y, Z; destination hypervisor hosting A, B, C]
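The exchange described above (control packet on flow start, policy check at the destination, block/rate-limit pushed back to the source, soft state with timeouts) can be sketched as two cooperating objects. This is a simplified illustration, not the authors' design: the class and field names are hypothetical, the 30-second timeout is an arbitrary placeholder, filters are kept per flow only, and a direct method call stands in for packets crossing the network.

```python
from dataclasses import dataclass

FILTER_TIMEOUT_S = 30.0  # hypothetical soft-state lifetime, not from the slides


@dataclass(frozen=True)
class ControlPacket:
    src_vm: str
    src_group: str   # group of the sender, filled in by the source hypervisor
    dst_vm: str
    dst_port: int    # stands in for the first packet's header


class DestinationHypervisor:
    def __init__(self, policies):
        # policies: {hosted_vm: function(ControlPacket) -> "allow" | "block" | "rate-limit"}
        self.policies = policies

    def check(self, ctrl):
        """Apply the destination VM's policy to a control packet."""
        return self.policies[ctrl.dst_vm](ctrl)


class SourceHypervisor:
    def __init__(self, groups):
        self.groups = groups  # {hosted_vm: group}
        self.filters = {}     # flow -> (action, expiry time); soft state

    def start_flow(self, src_vm, dst_vm, dst_port, dst_hv, now):
        flow = (src_vm, dst_vm, dst_port)
        cached = self.filters.get(flow)
        if cached is not None and cached[1] > now:
            return cached[0]           # filter still valid: enforce at the source
        self.filters.pop(flow, None)   # expired filter: ask the destination again
        ctrl = ControlPacket(src_vm, self.groups[src_vm], dst_vm, dst_port)
        action = dst_hv.check(ctrl)    # direct call stands in for the network
        if action != "allow":
            # Destination replied with a block/rate-limit control packet.
            self.filters[flow] = (action, now + FILTER_TIMEOUT_S)
        return action


src = SourceHypervisor({"X": "gX", "Z": "gZ"})
dst = DestinationHypervisor({"C": lambda c: "allow" if c.src_group == "gX" else "block"})
```

Because the source-side filter expires, a later policy change at the destination takes effect without any explicit invalidation message: the next flow attempt after the timeout is simply re-checked.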
Scalability
 CloudPolice takes the best of both worlds
 Centralized vs. every server stores all policies
 Load spread across all servers
 Maintaining and enforcing policies
 Update propagation is contained
 Group membership updates not propagated
 Policy updates propagated only to group
Security Analysis Sketch
 Attackers
 VMs – corrupted or paid by malicious tenants
 Attacks considered
 Violate access control policies to reach destination
 DoS with unauthorized traffic
 DoS with authorized traffic
 Assumptions
 Hypervisors not compromised
Security Analysis Sketch
 Violate access control policies to reach destination
 Policy distributed securely to hypervisor
 Control packets cannot be spoofed, only sent by hypervisors
 DoS with unauthorized traffic
 Control packets block unauthorized traffic at source
 Attackers attempt to cause drops of control packets
 Retry or prioritize control packets
 DoS with authorized traffic
 Also need performance isolation for full protection
 CloudPolice can implement some performance isolation
 Rate-limit to fair share of destination link
 Share access link evenly between destination VMs
Future Work
 Implement CloudPolice prototype
 Extend CloudPolice
 Policies with application-level semantics (dynamic policies)
 Policies based on group-wide state
 Beyond access control?
 More flexible actions, e.g., send to middlebox
 Performance isolation framework
Summary
 Access control in cloud computing requires new mechanisms and extended policies
 CloudPolice
 Takes advantage of trusted hypervisors
 Inspired by past work on Internet DDoS protection
 Properties
 Network independent
 Scalable
 Flexible
 Robust to (internal) DDoS attacks
Backup Slides
Related Work
 OpenFlow & (Onix | DIFANE) & Open vSwitch
 Decisions not based on logical identifier (group/tenant)
 Onix only isolation framework
 OpenFlow actions designed for switches (e.g., currently can’t rate-limit)
 Require scaling central controller
 Vs. software update for CloudPolice
Contributions
 Identify that new access control mechanism is needed in clouds
 Pinpoint the challenges and requirements
 Identify that access control should be done in hypervisors
 Propose CloudPolice, mechanism that satisfies requirements
Compromise Single Hypervisor
 Can prevent compromised hypervisors from violating security policies
1. Security credentials associated with group identifier
 Cannot be sent if unknown (known only for hosted VMs)
 E.g., group ID has key in name
2. Prevent spoofed control packets in the network
 Like IP anti-spoofing in switches/routers
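One way to read "group ID has key in name" is that the group identifier itself carries an unguessable random component, so a hypervisor can only claim membership for groups whose identifiers it was given for its hosted VMs. The sketch below illustrates that reading only; the identifier format, the function names, and the assumption that the verifying side knows the full identifier of each group named in its policies are all hypothetical.

```python
import hmac
import secrets


def new_group_id(name: str) -> str:
    """Group identifier whose name embeds a random key (hypothetical format)."""
    return f"{name}:{secrets.token_hex(16)}"


class Hypervisor:
    def __init__(self):
        self.hosted = {}  # vm -> full group identifier, known only for hosted VMs

    def host_vm(self, vm: str, group_id: str) -> None:
        self.hosted[vm] = group_id

    def group_claim(self, vm: str) -> str:
        # Only identifiers handed to this hypervisor can appear in its control packets.
        return self.hosted[vm]


def claim_is_valid(claimed: str, expected: str) -> bool:
    # Constant-time comparison of the claimed identifier against the real one.
    return hmac.compare_digest(claimed.encode(), expected.encode())


ad_group = new_group_id("AdNetworks")
honest = Hypervisor()
honest.host_vm("ad-vm-1", ad_group)
forged = "AdNetworks:" + "0" * 32  # a compromised hypervisor guessing the key
```

A compromised hypervisor that never hosted an AdNetworks VM knows the public name but not the random suffix, so its forged claim fails the comparison.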
Today’s Cloud Mechanisms?
 Solutions not public
 Could be similar to our solution
 Could provide fewer properties
 API is narrow
 On/off between groups
 No locally initiated connections, no rate-limiting, no weighted allocation
Feasibility
 Working on implementing CloudPolice prototype
 Fast path – act on per-flow state
 Open vSwitch and software routers [RouteBricks, PacketShader] suggest this is feasible
 Slow path – execute policy and install flow state
 1/N of requirements for centralized repository
 Few hosted VMs → dominated by policy complexity
 Software router applications suggest if-then-else structures can be parsed fast [RBF]
Other Related Work
 VL2’s approach, if it were applied to hypervisors
 Centralized repository
 Can violate policies if IP of destination known
Firewalls not Suited for Clouds
 Not well suited against DoS
 Must be applied at source → hard to update
 Inflexible policies for clouds
 Scaling & network designs
 With no prefix aggregation
 Difficult to scale (100k+ entries)
 Needs updating on all VM starts (more than once/s)
 With prefix aggregation
 Complex to manage
 10k tenants with multiple prefixes, different scaling requirements
 No L3 networks