The CoDeeN Content Distribution Network

Vivek S. Pai, Limin Wang, KyoungSoo Park,
Ruoming Pang, Larry Peterson
Princeton University
August 12, 2003
Content Distribution Networks
Replicates Web content broadly
Redirects clients to “best” copy
Load, locality, proximity
Offloads work from origin servers
Multiplexes load spikes
Reduces overprovisioning
Ex: Akamai, Mirror Image, Speedera
Aug 12, 2003
CoDeeN Overview - IRIS/PlanetLab
What Does It Do?
An Academic Content Distribution Network
Redirects/caches HTTP requests
Based on our OSDI 2002 paper on CDN performance
An Open Proxy Network
Probably the largest in existence
Who Is The Target Audience?
Now
Users wanting better performance
People seeking “anonymity”
Next
Content providers seeking load sharing
Later
General support for absorbing flash crowds
Avoid the “Slashdot Effect”
How Does It Work?
Server surrogates (proxies) on most North American sites
Originally everywhere, but we cut back
Clients specify proxy to use
Cache hits served locally
Cache misses forwarded to CoDeeN nodes
• Maybe forwarded to origin servers
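The hit/miss flow above can be sketched in a few lines of Python. This is a minimal illustration, not CoDeeN's implementation: the slides do not say how a peer node is chosen, so the sketch assumes rendezvous (highest-random-weight) hashing, and `ProxyNode`, `hrw_pick`, and `fetch_origin` are hypothetical names.

```python
import hashlib


def hrw_pick(url, peer_names):
    """Return the peer whose hash(peer, url) is largest (rendezvous hashing).

    Assumption: the slides do not name the selection scheme; this is one
    common way to map a URL to a stable peer.
    """
    def score(peer):
        return hashlib.sha1(f"{peer}|{url}".encode()).digest()
    return max(peer_names, key=score)


class ProxyNode:
    """Hypothetical proxy node: local cache, peer forwarding, origin fetch."""

    def __init__(self, name, peers, fetch_origin):
        self.name = name
        self.peers = peers                # shared dict: name -> ProxyNode
        self.fetch_origin = fetch_origin  # callable(url) -> response body
        self.cache = {}

    def handle_client(self, url):
        # Cache hit: serve locally.
        if url in self.cache:
            return self.cache[url]
        # Cache miss: forward to the peer chosen for this URL.
        owner = self.peers[hrw_pick(url, sorted(self.peers))]
        body = owner.handle_peer(url)
        self.cache[url] = body
        return body

    def handle_peer(self, url):
        # The chosen peer contacts the origin only if it also misses.
        if url not in self.cache:
            self.cache[url] = self.fetch_origin(url)
        return self.cache[url]
```

With three nodes sharing one `peers` dict, requesting the same URL through two different nodes reaches the origin only once, which is the load-sharing effect the slide describes.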
Request Forwarding
When Will It Be Ready?
January – development started
Reliability & stability major concerns
March – stable enough for daily use
April – security problems begin
Shut down for one month
June – Restarted “beta”
Expecting “production” soon
Decisions – Good & Bad
Use commercial proxy with API [USITS 2003]
Good – mostly layer 7 concerns
Bad – limits deployment size (donated licenses)
Deployment on PlanetLab
Good – otherwise impossible
“Bad” – vulnerable to other experiments
Allow open access
Good – generates real traffic
Bad – some traffic just plain mean
Lots of Malicious Traffic
Spammers
SMTP tunnels, POST forms, IRC channels
Bandwidth hogs
Google crawls, steganographers, X-Pacific
Hackers & Spreaders
Yahoo dictionary attacks, IIS vuln tests
Content thieves
E-journals/databases, local content
Countermeasures
Restrict ports & HTTP methods
Multi-scale req & bw accounting
Signature database & Robot test
Determine location & privilege
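One way to read "multi-scale req & bw accounting" is a per-client counter checked over several time windows at once, so that both short bursts and slow, sustained abuse are caught. The sketch below covers only request counts and rests on assumptions: the class name, the window sizes, and the caps are hypothetical, not CoDeeN's actual values.

```python
from collections import deque


class MultiScaleLimiter:
    """Per-client request accounting over several time windows (sketch)."""

    def __init__(self, limits):
        # limits maps window length in seconds -> max requests in that window,
        # e.g. {1: 10, 60: 200, 3600: 5000}. These numbers are illustrative.
        self.limits = dict(limits)
        self.longest = max(self.limits)
        self.history = {}  # client id -> deque of accepted request times

    def allow(self, client, now):
        q = self.history.setdefault(client, deque())
        # Forget requests older than the longest window.
        while q and now - q[0] >= self.longest:
            q.popleft()
        # The request must fit under the cap at every time scale.
        for window, cap in self.limits.items():
            recent = sum(1 for t in q if now - t < window)
            if recent >= cap:
                return False
        q.append(now)  # only accepted requests are counted
        return True
```

A caller would invoke `allow(client_ip, time.time())` per request. Counting only accepted requests is a design choice of this sketch; counting rejected ones as well would penalize clients that keep hammering after being limited.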
Protecting Privilege
Attempted SMTP Tunnels/Day
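SMTP tunneling through an open proxy usually relies on the HTTP CONNECT method aimed at a mail port, so restricting ports and methods is a natural first filter. The following is a minimal sketch under stated assumptions: the allowed method and port sets are hypothetical (POST is left out purely for illustration), and `is_allowed` is not a CoDeeN function.

```python
from urllib.parse import urlsplit

# Assumed policy for illustration only; the slides do not list the real one.
ALLOWED_METHODS = {"GET", "HEAD"}
ALLOWED_HTTP_PORTS = {80, 8080}
ALLOWED_CONNECT_PORTS = {443}


def is_allowed(method, target):
    """Return True if the proxy should accept this request line."""
    method = method.upper()
    if method == "CONNECT":
        # CONNECT targets look like "host:port"; port 25 (SMTP) is refused
        # because it is not in the allowed tunnel ports.
        host, _, port = target.rpartition(":")
        if not host or not (port.isascii() and port.isdigit()):
            return False
        return int(port) in ALLOWED_CONNECT_PORTS
    if method not in ALLOWED_METHODS:
        return False
    parts = urlsplit(target)
    if parts.scheme != "http":
        return False
    try:
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    return (80 if port is None else port) in ALLOWED_HTTP_PORTS
```

For example, `is_allowed("CONNECT", "mail.example.org:25")` is rejected while `is_allowed("GET", "http://example.org/index.html")` is accepted.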
By The Numbers…
Restarted in late May
In continuous operation
Stats from first 8 weeks
Over 59,000 unique IPs as clients
Over 24 million requests serviced
Valid rates up to 15K reqs/hour
Roughly 1 million reqs/day aggregate
More Production Info
About 2000 lines of code
About ¼ is actual decision logic
Uptimes limited by upgrades
Generally 1-2 times/week
Downtimes of 20 seconds/node
Currently on ~40 nodes
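As a quick worked example of what the upgrade figures above imply, the snippet below computes per-node availability when the only downtime counted is the 20-second upgrade restart. It deliberately ignores crashes and other outages, so it is an upper bound under that assumption; the function name is hypothetical.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600


def upgrade_availability(upgrades_per_week, downtime_seconds=20):
    """Fraction of the week a node is up if only upgrade restarts count."""
    lost = upgrades_per_week * downtime_seconds
    return 1.0 - lost / SECONDS_PER_WEEK


for n in (1, 2):
    print(f"{n} upgrade(s)/week: {upgrade_availability(n):.4%} available")
```

At one to two upgrades per week this works out to roughly 99.993% to 99.997% availability per node from planned restarts alone.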
Daily Requests (Serviced)
Welcome
Avoiding (sorted by # avoiding)
Load (sorted by # load average)
Total (sorted by # total req rate)
Users (sorted by # users)
The Troubles We’ve Caused
Routinely trigger open proxy alerts
Educating sysadmins, others
Resource checks generate noise
Got onto planetlab-support
Really good honeypots
6000 SMTP flows/minute at CMU
Spammers do ~1M HTTP ops/day
What We’ve Learned
Parallel ssh is a must
General commands/queries
Basis for parallel scp
Used to detect out-of-date files
Monitoring is a must
Too hard to see anomalies in 40+ nodes
Almost looks like a demo
Be careful accepting outside requests
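The parallel ssh idea above, including its use for spotting out-of-date files, might look roughly like this. It is a sketch, not the tool the authors built: it shells out to the standard `ssh` client with the `BatchMode` and `ConnectTimeout` options, and the function names, worker count, and checksum comparison are assumptions.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Default command prefix; BatchMode avoids hanging on password prompts.
SSH = ("ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10")


def run_on_node(node, command, runner=SSH, timeout=30):
    """Run one command on one node; return (node, exit code, stdout)."""
    argv = [*runner, node, command]
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout)
        return node, proc.returncode, proc.stdout.strip()
    except (subprocess.TimeoutExpired, OSError) as exc:
        return node, None, str(exc)  # unreachable or hung node


def run_everywhere(nodes, command, workers=20, **kwargs):
    """Run the same command on all nodes concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda n: run_on_node(n, command, **kwargs),
                             nodes))


def find_stale(results, expected_checksum):
    """Nodes whose reported checksum differs, or whose command failed."""
    return [node for node, rc, out in results
            if rc != 0 or out.split()[:1] != [expected_checksum]]
```

A typical use would be `find_stale(run_everywhere(nodes, "md5sum some/file"), known_good_md5)`, where the node list, file path, and reference checksum are supplied by the operator. A parallel scp could then be restricted to just the nodes returned.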
What We Still Need
Better layer 4 tools
Hard to tell why things die
Building complete heartbeats isn’t fun
Better isolation on most resources
CPU/OS: Java, VServers, ???
Others: FD exhaustion, disk space
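To make the heartbeat point concrete, here is a minimal sketch of the bookkeeping side of a heartbeat monitor. It is purely illustrative and says nothing about how CoDeeN does it: `HeartbeatMonitor`, the 30-second timeout, and the free-form `stage` label are all assumptions. Recording the last reported stage is one cheap way to get a hint about why a node went quiet.

```python
class HeartbeatMonitor:
    """Track the last heartbeat per node and flag nodes that went silent."""

    def __init__(self, nodes, timeout=30.0):
        self.timeout = timeout  # seconds of silence before a node is suspect
        self.last = {n: None for n in nodes}  # node -> (time, stage) or None

    def beat(self, node, now, stage="ok"):
        # `stage` is a label the node reports, e.g. "cache-full" (assumed).
        self.last[node] = (now, stage)

    def suspects(self, now):
        """Return [(node, last_stage)] for nodes silent longer than timeout."""
        out = []
        for node, seen in self.last.items():
            if seen is None:
                out.append((node, None))  # never heard from
            elif now - seen[0] > self.timeout:
                out.append((node, seen[1]))
        return sorted(out, key=lambda item: item[0])
```

The transport (UDP packets, piggybacked HTTP headers, or anything else) is left out on purpose; the sketch only shows the table a monitor would keep and query.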
What We Wouldn’t Mind…
Customizable DNS mapping
Map project.planet-lab.org to some node
Projects could provide feedback
• Node availability, utility, etc
Most IP geolocation seems locked up
More Info
http://codeen.cs.princeton.edu