Transcript PPT

GENERAL SCALABILITY CONSIDERATIONS
http://www.flickr.com/photos/jamescridland/613445810
Overview of scalability
As the number of users grows, maintain:
– Low latency
– High throughput
– High reliability
Latency
• Latency = total time between when an
operation is initiated and when the operation
completes
[Figure: timeline contrasting responsiveness and latency, both measured in seconds]
Throughput
• Throughput = number of operations
completed per unit time
[Figure: seven web pages send requests to one web server at 10, 2, 10, 2, 2, 4, and 2 requests per minute. Throughput: 32/minute]
Reliability
• Reliability = percentage of operations
successfully completed
[Figure: seven web pages send requests to one web server, with failures of 1/10, 0/2, 0/10, 0/2, 0/2, 2/4, and 0/2 operations. Reliability: 29/32 ≈ 90.6%]
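The three definitions above reduce to simple arithmetic. Here is a minimal sketch in Python, assuming a hypothetical log in which each operation is recorded as (start time, end time, succeeded); the `summarize` function and the log format are illustrative and not from the slides. It reuses the 29-out-of-32 reliability example.

```python
from typing import List, Tuple

# Hypothetical log record: (start_seconds, end_seconds, succeeded)
Operation = Tuple[float, float, bool]

def summarize(ops: List[Operation], window_seconds: float) -> dict:
    """Compute mean latency, throughput, and reliability for a log of operations."""
    latencies = [end - start for start, end, _ in ops]
    successes = sum(1 for _, _, ok in ops if ok)
    return {
        # Latency: time from initiation to completion, averaged over operations
        "mean_latency_s": sum(latencies) / len(latencies),
        # Throughput: operations finished per unit time (here, per minute)
        "throughput_per_min": len(ops) / (window_seconds / 60.0),
        # Reliability: percentage of operations successfully completed
        "reliability_pct": 100.0 * successes / len(ops),
    }

# 32 operations in one minute, each taking 0.25 s; the first 3 fail
ops = [(float(i), float(i) + 0.25, i >= 3) for i in range(32)]
stats = summarize(ops, window_seconds=60.0)
print(stats)
```

Whether failed operations should count toward throughput is a judgment call; this sketch counts every finished operation so that the total matches the 32/minute figure.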
Scalability
• Scalability means that even when the number
of users grows into the thousands or millions,
your website still maintains
– Low latency
– High throughput
– High reliability
Very rough reasonable goals
Configuration | Reasonable # "simultaneous" users | Latency | Throughput | Reliability
One single-core server | Hundreds or maybe thousands | Low hundreds of milliseconds | A few hundred operations per second | 99%
One multi-core server | Thousands or tens of thousands | Around 100 milliseconds | Thousands of operations per second | 99%
A cluster of a few multi-core computers | Tens or hundreds of thousands | Under 100 milliseconds | Tens of thousands of operations per second | 99.99%
A small datacenter with a few dozen multi-core computers | Millions | A few dozen milliseconds (assuming a great network connection) | Hundreds of thousands of operations per second | 99.999%
Techniques to improve scalability
• Minimal size of messages
• Minimal number of messages
• Minimal amount of computation
• Local computation
• Replication
• Aggressive caching
• Aggressive indexing
Minimal size of messages
• When client and server communicate…
– Only send data needed at that moment
– Use a concise data format (in most cases, JSON)
• For example, suppose that an app needed to
retrieve a list of courses in response to a query
in order to show a list of links
– http://www.myserver.com/info.php?prof=cscaffid
Option #1
565 bytes
<?xml version="1.0"?>
<courses>
<course><dept>CS</dept><num>361</num><prof>cscaffid</prof>
<title>Intro to SE</title><description>Blah blah blah blah blah blah
blah blah blah</description></course>
<course><dept>CS</dept><num>494</num><prof>cscaffid</prof>
<title>Web development</title><description>Blah blah blah blah
blah blah blah blah blah</description></course>
<course><dept>CS</dept><num>496</num><prof>cscaffid</prof>
<title>Cloud+Mobile development</title><description>Blah blah
blah blah blah blah blah blah blah</description></course>
</courses>
Option #2
108 bytes
[{n:"CS361",t:"Intro to SE"},
{n:"CS494",t:"Web development"},
{n:"CS496",t:"Cloud+Mobile development"}]
1. Combine fields if appropriate (e.g., dept and number)
2. Omit fields if not needed (e.g., description)
3. Shorten field names if appropriate (e.g., n and t)
4. Use JSON if feasible
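The four steps can be tried out with Python's standard `json` module. This is a minimal sketch: the full records mirror the fields in Option #1, but the description text is abbreviated and the `compact` helper is a made-up name. Strict JSON requires double-quoted keys, so the output is a few bytes larger than the 108 bytes quoted for Option #2.

```python
import json

# Full records with the same fields as Option #1 (descriptions abbreviated)
courses = [
    {"dept": "CS", "num": 361, "prof": "cscaffid",
     "title": "Intro to SE", "description": "Blah blah blah"},
    {"dept": "CS", "num": 494, "prof": "cscaffid",
     "title": "Web development", "description": "Blah blah blah"},
    {"dept": "CS", "num": 496, "prof": "cscaffid",
     "title": "Cloud+Mobile development", "description": "Blah blah blah"},
]

def compact(course: dict) -> dict:
    """Combine dept and num, omit unused fields, and shorten field names."""
    return {"n": f"{course['dept']}{course['num']}", "t": course["title"]}

verbose = json.dumps(courses)
# separators=(",", ":") drops the spaces json.dumps inserts by default
concise = json.dumps([compact(c) for c in courses], separators=(",", ":"))

print(len(verbose.encode("utf-8")), "bytes with all fields")
print(len(concise.encode("utf-8")), "bytes after trimming")
```

With this data the trimmed message comes to 114 bytes, still a small fraction of the XML version.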
Minified JS and CSS
• Online services for squeezing the whitespace
and other wasted characters out of your JS
– Search for JS "minifier" or "minimizer"
– E.g., http://closure-compiler.appspot.com/home
• Ditto for CSS
– E.g., https://cssminifier.com/
Minimal number of messages
• Eliminate unnecessary messages
– E.g., eliminate unnecessary images from UI
• Combine messages if feasible
– E.g., if you need to query CS and ECE courses,
design server to handle both queries at once
• Defer messages if feasible
– E.g., give the user the option to defer logging in
until it’s absolutely necessary
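The "combine messages" idea can be sketched as follows. This is only an illustration: `CourseServer` is a hypothetical stand-in for a real web server, and the ECE course numbers are placeholders. It counts requests to show that one combined query replaces two separate ones.

```python
from typing import Dict, List

class CourseServer:
    """Hypothetical stand-in for a web server that counts the requests it handles."""

    def __init__(self, catalog: Dict[str, List[str]]) -> None:
        self.catalog = catalog
        self.requests_handled = 0

    def query(self, depts: List[str]) -> Dict[str, List[str]]:
        """Return the courses for every requested department in one response."""
        self.requests_handled += 1
        return {dept: self.catalog.get(dept, []) for dept in depts}

# Made-up catalog; the ECE course numbers are placeholders
catalog = {"CS": ["CS361", "CS494", "CS496"], "ECE": ["ECE101", "ECE102"]}

# Separate messages: one request per department
separate = CourseServer(catalog)
cs_only = separate.query(["CS"])
ece_only = separate.query(["ECE"])

# Combined message: both departments in a single request
combined = CourseServer(catalog)
both = combined.query(["CS", "ECE"])

print(separate.requests_handled, combined.requests_handled)
```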
Minimal amount of computation
• Avoid "feature bloat"
– Only implement the features you need
– This also will enhance usability!
• Avoid blithely copy-pasting code
– E.g., it is tempting to do certain things at the top of
every web page in your site (send JS, open a db connection),
even on pages that don't actually need them
Minimal amount of computation
• Use the right data structures
– E.g., if you need to look up values by key, use an
associative array rather than searching a list
• Use the right APIs
– E.g., There is an AJAX API for retrieving JSON as an
object – don't try to write such an API yourself
• Your version will be buggy and slow!
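Both bullets can be illustrated in a few lines. The slide refers to an AJAX API in the browser; the sketch below uses Python's standard `json.loads` as an analogous "right API", and a `dict` as the associative array. The function names are illustrative.

```python
import json

# Course list as a JSON string, as it might arrive from a server
payload = '[{"n":"CS361","t":"Intro to SE"},{"n":"CS494","t":"Web development"}]'

# Right API: the standard library parser, not a hand-written one
courses = json.loads(payload)

def title_by_scan(number):
    """Linear scan: checks each course in turn until one matches."""
    for course in courses:
        if course["n"] == number:
            return course["t"]
    return None

# Right data structure: an associative array (dict) keyed by course number
titles = {course["n"]: course["t"] for course in courses}

def title_by_key(number):
    """Direct lookup by key; no loop over the whole list."""
    return titles.get(number)

print(title_by_scan("CS494"), "|", title_by_key("CS494"))
```

Both functions return the same answer; the difference only shows up as the list grows, because the scan touches every element in the worst case.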
Minimal amount of computation
• Retrieve only the data you need
– E.g., if you need one row, use a WHERE clause in
SQL (rather than retrieving all rows & looping)
• Looping just creates unnecessary computation!
• Use SQL aggregate functions when practical
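The difference between looping over all rows and letting the database do the work can be seen with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `courses` table, its columns, and the enrollment numbers are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (num TEXT PRIMARY KEY, title TEXT, enrolled INTEGER)")
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?)",
    [("CS361", "Intro to SE", 120),            # enrollment figures are made up
     ("CS494", "Web development", 45),
     ("CS496", "Cloud+Mobile development", 60)],
)

def title_by_looping(num):
    """Wasteful: retrieves every row, then searches in application code."""
    for row_num, title, _ in conn.execute("SELECT num, title, enrolled FROM courses"):
        if row_num == num:
            return title
    return None

def title_by_where(num):
    """Better: the WHERE clause makes the database return only the row needed."""
    row = conn.execute("SELECT title FROM courses WHERE num = ?", (num,)).fetchone()
    return row[0] if row else None

# Aggregate function: the database adds up the column and returns one number
total_enrolled = conn.execute("SELECT SUM(enrolled) FROM courses").fetchone()[0]

print(title_by_where("CS494"), total_enrolled)
```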
Local computation
• If a computation uses a very large amount of
data, then move the computation to the data,
instead of the data to the computation.
• Example: Find city with maximal rainfall in US
• Option #1:
– Server sends rainfall for 4500 cities to browser
– Browser loops through cities to choose maximum
• Option #2:
– Server loops through cities to choose maximum
– Server sends just the maximum to the browser
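The two options can be compared by measuring what would travel over the network. The sketch below assumes randomly generated rainfall figures for 4500 placeholder cities; the function names and the JSON payload shapes are illustrative.

```python
import json
import random

rng = random.Random(0)  # fixed seed so the made-up data is repeatable
# Hypothetical data: annual rainfall in inches for 4500 placeholder cities
rainfall = {f"City{i}": round(rng.uniform(5.0, 80.0), 1) for i in range(4500)}

def option1_payload() -> str:
    """Option #1: the server ships every city; the browser must find the maximum."""
    return json.dumps(rainfall)

def option2_payload() -> str:
    """Option #2: the server finds the maximum and ships only that."""
    city = max(rainfall, key=rainfall.get)
    return json.dumps({"city": city, "inches": rainfall[city]})

# What the browser would do with the Option #1 payload
received = json.loads(option1_payload())
browser_max = max(received, key=received.get)

print(len(option1_payload()), "bytes for option 1")
print(len(option2_payload()), "bytes for option 2")
```

Both options find the same city, but the second sends a message thousands of times smaller.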
Replication
• Make copies of your computation and data
[Figure: seven web pages spread their requests (2 to 10 per minute each) across several replicated web servers. Throughput: 32/minute]
Replication
• You also can replicate your database
[Figure: three web servers and three databases]
Databases can be configured to automatically "mirror" contents
Shopping for a hosting service
• When leasing space from a "hosting service"
– You pay them $X per month
– They let you use Y machines
• If you want replication, look for…
– Load balancing: automatic routing of traffic evenly
across the machines you lease
– Mirroring: automatic copying of data updates from
one server to another ("master/slave")
– Failover: automatic routing (and restart) around
machines that crash
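Load balancing and failover can be pictured with a toy round-robin router. This is a simplified sketch, not how any particular hosting service implements it; the `LoadBalancer` class and the server names are hypothetical, and real systems detect crashes through health checks rather than manual calls.

```python
from itertools import cycle
from typing import Iterable, List

class LoadBalancer:
    """Toy round-robin router that skips servers marked as down."""

    def __init__(self, servers: Iterable[str]) -> None:
        self.servers: List[str] = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server: str) -> None:
        """Failover: stop routing traffic to a crashed machine."""
        self.healthy.discard(server)

    def mark_up(self, server: str) -> None:
        """Resume routing to a machine once it has restarted."""
        if server in self.servers:
            self.healthy.add(server)

    def route(self) -> str:
        """Load balancing: hand each request to the next healthy server in turn."""
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")

balancer = LoadBalancer(["web1", "web2", "web3"])
before_crash = [balancer.route() for _ in range(6)]
balancer.mark_down("web2")
after_crash = [balancer.route() for _ in range(4)]
print(before_crash, after_crash)
```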
Learning about replication
• If you really want to get your hands dirty with
the details of replication…
– CS496: Mobile + Cloud Software Development
– CS440: (Advanced) Database Management
Aggressive caching & indexing
• Caching: If a computation or transmission is
expensive, then do it once, save the result,
and reuse the result later
• Indexing: If you have lots of data, create a data
structure that makes it easier to find the data
• These will each be covered by a whole lecture
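As a small preview, both ideas fit in a few lines of Python. This is a minimal sketch under assumptions: the course table is made up (the last row is a placeholder), caching uses the standard `functools.lru_cache`, and the index is a plain dictionary from professor to courses.

```python
from collections import defaultdict
from functools import lru_cache

# Made-up course table: (number, title, professor); the last row is a placeholder
COURSES = [
    ("CS361", "Intro to SE", "cscaffid"),
    ("CS494", "Web development", "cscaffid"),
    ("CS496", "Cloud+Mobile development", "cscaffid"),
    ("CS000", "Placeholder course", "someone_else"),
]

@lru_cache(maxsize=None)
def courses_by_prof_cached(prof: str) -> tuple:
    """Caching: scan the table once per professor, then reuse the saved result."""
    return tuple(c for c in COURSES if c[2] == prof)

# Indexing: build a professor -> courses structure once, then look up directly
index = defaultdict(list)
for course in COURSES:
    index[course[2]].append(course)

first = courses_by_prof_cached("cscaffid")
second = courses_by_prof_cached("cscaffid")  # served from the cache, no rescan
print(courses_by_prof_cached.cache_info())
print(len(index["cscaffid"]), "courses found through the index")
```

A cache like this must be cleared when the underlying data changes (for `lru_cache`, by calling `cache_clear()`), and an index must be updated along with the table.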