Hardware Layouts for LAMP Installations
October 18-21, 2005
San Francisco, CA
John Allspaw, Flickr Plumbr
Flickr (Yahoo)
[email protected]
October 18, 2005
Hyatt Regency San Francisco Airport, Burlingame, CA
Hardware planning for LAMP installs is:
o A decent amount about the actual hardware (“in-box” stuff)
o A bit more about the hardware architecture
o Which should complement the application architecture
What we’ll talk about here:
o Database (MySQL) layouts and considerations
o Some miscellaneous/esoteric stuff (lessons learned)
o Caching content and considerations
• Growing Up, the “One Box” solution
   Basic web application (discussion board, etc.)
   Low traffic
   Apache/PHP/MySQL on one machine
   Bottlenecks will start showing up:
   • Most likely the database before Apache/PHP
   • Disk I/O (InnoDB) or locking wait states (MyISAM)
   • Context switching between memory work (Apache) and CPU work (MySQL)
[Diagram: ONE BOX]
• Growing Up, the “Two Box” solution
   Higher-traffic application (more demand)
   Apache/PHP on box A, MySQL on box B
   Same network = bad (*or is it?), separate network = good
   Bottlenecks will start to be:
   • Disk I/O on the MySQL machine (InnoDB)
   • Locking on MyISAM tables
   • Network I/O
[Diagram: TWO BOX]
• Growing Up, the “Many Boxes with Replication” solution
• Yet even higher traffic
• Writes are separated from reads (master gets IN/UP/DEL, slaves get SELECTs)
• Diminishes network bottlenecks, disk I/O, and other “in-box” issues
• SELECTs vs. IN/UP/DEL can be specified within the application (see the sketch below),
• OR….
• Load-balancing can be used
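As an illustration of the application-side split, here is a minimal Python sketch; the host names and the crude SELECT test are assumptions for illustration, not Flickr’s actual code, and the real connection would go through whatever MySQL driver you already use:

    import random

    MASTER = "db-master.example.com"              # gets all IN/UP/DEL
    SLAVES = ["db-slave1.example.com",            # get all SELECTs
              "db-slave2.example.com",
              "db-slave3.example.com"]

    def pick_host(sql):
        """Send reads to a randomly chosen slave, everything else to the master."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return random.choice(SLAVES) if is_read else MASTER

    print(pick_host("SELECT * FROM photos WHERE id = 42"))    # one of the slaves
    print(pick_host("UPDATE photos SET views = views + 1"))   # the master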
[Diagram: MANY BOX]
Slave Lag
• When slaves can’t keep up with replication
• They’re too busy:
   • Reading (production traffic)
   • Writing (replication)
• Manifests as:
   • Comments/photos/any user-entered data doesn’t show up on the site right away
   • So users will repeat the action, thinking that it didn’t “take” the first time, which makes the situation worse
Insert funny photo here about slave lag*
*slave lag isn’t funny
Hardware Load Balancing MySQL
How It’s Usually Done
• Standard MySQL master/slave replication
• All writes (inserts/updates/deletes) from the application go to the Master
• All reads (selects) from the application go to a load-balanced VIP (virtual IP), spreading the load across all slaves (sketch below)
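For contrast with the earlier application-side sketch, this is roughly all the application needs to know once the load balancer owns slave selection (host names again hypothetical):

    MASTER   = "db-master.example.com"   # all inserts/updates/deletes
    READ_VIP = "db-read.example.com"     # load-balanced VIP; the LB picks the slave

    def pick_host(sql):
        """Reads go to the VIP, everything else to the master."""
        return READ_VIP if sql.lstrip().upper().startswith("SELECT") else MASTER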
What Is Good About Load Balancing
• you can add/remove slaves without affecting the application, since queries are atomic (sorta/kinda)
• an additional monitoring point and some automatic failure handling
• you can treat your whole slave pool as one resource, which makes capacity planning a lot easier if you know the ceiling of each slave
• How do you know the ceiling (maximum QPS capacity) of each slave?
• First make a guess based on benchmarking (or look up some bench results from Tom’s Hardware, anandtech.com, etc.)
• Then get more machines than that :)
• Scary: in production, during a lull in traffic, remove machines from the pool until you detect lag
• The QPS you saw right before slave lag set in: THAT is your ceiling (worked example below)
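A back-of-the-envelope sketch of how that ceiling feeds capacity planning; every number here is made up for illustration:

    import math

    ceiling_qps_per_slave = 3000    # QPS one slave handled right before lag set in
    peak_read_qps         = 20000   # peak SELECT rate the whole pool must absorb
    headroom              = 0.30    # spare capacity so losing a slave doesn't cause lag

    usable_qps_per_slave = ceiling_qps_per_slave * (1 - headroom)
    slaves_needed = math.ceil(peak_read_qps / usable_qps_per_slave)
    print(slaves_needed)            # -> 10 slaves for this made-up workload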
What Can Be Bad/Tough About Load Balancing:
• not all load balancers are created equal, and not all load-balancer vendors expect this use of their product, so vendor support may still be thin
• not that many people are doing it in high-volume situations yet, so support from the community isn’t large either
• Gotchas:
   • port exhaustion,
   • health checks,
   • and balancing algorithms
Port Exhaustion
PROBLEM:
• The LB is basically a traffic cop, nothing more
• Side effect of having a lot of connections: only ~64,511 source ports per IP (VIP) to use
• 64,511 ports / 120 sec of TIME_WAIT per port…
• ~535 new connections per second per IP, sustained* (arithmetic below)
* Not really, but close to it: see tcp_tw_recycle and tcp_tw_reuse
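The arithmetic behind that number, sketched out; 120 seconds is the classic TIME_WAIT hold time, and your kernel or LB may use something different:

    usable_ports   = 65535 - 1024     # ports above the reserved range = 64,511
    time_wait_secs = 120              # each used source port sits in TIME_WAIT this long

    conns_per_sec_per_ip = usable_ports / time_wait_secs
    print(int(conns_per_sec_per_ip))       # 537, i.e. the slide's "~535"

    # adding more source IPs on the LB side scales this linearly (see the next slide):
    print(int(4 * conns_per_sec_per_ip))   # ~2150 with a pool of four source IPs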
Port Exhaustion (cont’d)
SOLUTION:
• Use a pool of IPs on the database slave/farm side (Netscaler calls these “subnet IPs”, Alteon calls them “PiPs”)
• Monitor port/connection usage so you know when it’s time to add more (see the sketch below)
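One way to do that monitoring, as a rough sketch: count sockets stuck in TIME_WAIT by reading /proc/net/tcp on a Linux box (state 06 is TIME_WAIT in the kernel’s hex encoding):

    def count_time_wait(path="/proc/net/tcp"):
        """Return how many sockets are currently in TIME_WAIT."""
        with open(path) as f:
            next(f)                                     # skip the header line
            return sum(1 for line in f if line.split()[3] == "06")

    print("sockets in TIME_WAIT:", count_time_wait())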
Health checks
• The LB won’t know anything about how well each MySQL slave is doing, and will pass traffic as long as port 3306 is answering
• Load balancers don’t talk SQL, only things like plain old TCP, HTTP/S, maybe FTP
Health checks (cont’d)
• Two options:
1. Dirty, but workable:
   Have each server monitor itself, and shut off/firewall its own port 3306, even if MySQL is still running
Health checks (cont’d)
2. Cleaner, but a bit more work:
   • Have each server monitor itself, and serve a check via xinetd (for example, a Nagios plugin)
   • The LB tickles that port and expects back an “OK” string. If it doesn’t get one, it automatically takes that server out of the pool
   • Good for detecting isolated incidents of slave lag and handling them automatically (see the sketch below)
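A minimal sketch of what that xinetd-launched check could look like, assuming the pymysql driver and a local “monitor” account; the names, credentials, and 30-second threshold are all placeholders:

    import sys
    import pymysql.cursors

    MAX_LAG_SECONDS = 30    # beyond this, pretend to be down so the LB pulls us

    def slave_is_healthy():
        conn = pymysql.connect(host="127.0.0.1", user="monitor", password="secret",
                               cursorclass=pymysql.cursors.DictCursor)
        try:
            with conn.cursor() as cur:
                cur.execute("SHOW SLAVE STATUS")
                status = cur.fetchone()
        finally:
            conn.close()
        if not status:
            return False                         # not running as a slave at all
        lag = status.get("Seconds_Behind_Master")
        return lag is not None and lag <= MAX_LAG_SECONDS   # None => replication broken

    if slave_is_healthy():
        sys.stdout.write("OK\n")    # the LB sees "OK" and keeps this slave in the pool
    # anything else (or no answer) and the LB takes the slave out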
[Diagram: Health Checks]
Balancing Algorithms
• Load balancers know HTTP, FTP, basic TCP, but not SQL
• Two things to care about:
   • Should the server still be in the pool? (health checks)
   • How should load get balanced?
• “least connections” or “least bandwidth” or “least anything” = BAD
   Because not all SQL queries are created equal
• Use “round-robin” or “random” (sketch below)
• What happens if you don’t: Evil Favoritism™
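For what it’s worth, the “dumb” policies are trivial to picture; a tiny sketch of round-robin and random selection over the currently healthy slaves (hypothetical names):

    import itertools
    import random

    healthy_slaves = ["db-slave1", "db-slave2", "db-slave3"]
    _ring = itertools.cycle(healthy_slaves)

    def round_robin():
        return next(_ring)

    def random_pick():
        return random.choice(healthy_slaves)

    print([round_robin() for _ in range(4)])   # cycles: slave1, slave2, slave3, slave1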
[Image: Evil Favoritism]
• Meanwhile…. “in-the-box” considerations
   Interleaving memory *does* make a difference
   Always RAID10 (or RAID0 if you’re crazy*), but NEVER RAID5 (for InnoDB, anyway)
   RAID10 has much more read capacity, and a write penalty, but not as big a one as RAID5’s
   Always have battery backup for HW RAID write caching
   Or don’t use write caching at all
• “IN-THE-BOX” considerations (cont’d)
   Always have proper monitoring (Nagios, etc.) for failed/rebuilding drives
   SATA or SCSI? SCSI! It’s worth it!
   10k or 15k RPM SCSI? 15k! It’s worth it! (~20% performance increase when you’re disk-bound)
   For 64-bit Linux (AMD64 or EM64T):
   • Crank up the RAM for InnoDB’s buffer pool
   • Swapping = very, very bad. Either:
      • Turn it off (slightly scary)
      • Leave it on and set /proc/sys/vm/swappiness = 0 (see the sketch below)
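A trivial sketch of that second option, assuming Linux and root; it does the same thing as echo 0 > /proc/sys/vm/swappiness (put vm.swappiness = 0 in /etc/sysctl.conf to make it stick across reboots):

    SWAPPINESS = "/proc/sys/vm/swappiness"

    with open(SWAPPINESS) as f:
        current = int(f.read().strip())

    if current != 0:
        with open(SWAPPINESS, "w") as f:
            f.write("0\n")
        print("swappiness lowered from", current, "to 0")
    else:
        print("swappiness already 0")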
• 10k versus 15k drives?
• Does it really matter that much?
• Some in-the-wild proof….
[Graph: Slave lag in production, 10K drives vs. 15K drives]
• Using MySQL with a SAN (Storage Area Network)
• Do lay out storage the same as if it were local
• Do make sure that the HBA (Fibre Channel card) driver is well supported by Linux
• Don’t share volumes across databases
• Don’t forget to correctly tune the Queue Depth Size, which should increase going from server HBA -> switch -> storage
Caching your static content
• Caching Static Content
   SQUID = good
   Relieve your front-end PHP machines from looking up data that will never (or rarely) change
   Generate static pages, and cache them in Squid along with your images
• Caching Static Content (cont’d)
   Use SQUID to accelerate plain-old origin webservers, also known as “reverse-proxy” HTTP acceleration
   Described here and elsewhere:
   http://www.squid-cache.org/Doc/FAQ/FAQ-20.html
Basic SQUID layout
• Squid accepts requests on port 80
• passes cache misses on to Apache on port 81
• Apache uses an NFS-mounted dir as its docroot
• the NFS mount should be on a local subnet, or a dedicated net (see the sketch below)
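A quick way to sanity-check that layout, sketched in Python: fetch the same object through Squid on 80 and straight from the Apache origin on 81, then compare. The host name and path are hypothetical; Squid normally stamps an X-Cache header that the origin response won’t have:

    import urllib.request

    HOST, PATH = "static.example.com", "/img/logo.png"

    def fetch(port):
        with urllib.request.urlopen("http://%s:%d%s" % (HOST, port, PATH)) as resp:
            return resp.status, resp.headers

    for port in (80, 81):                       # 80 = Squid, 81 = origin Apache
        status, headers = fetch(port)
        print(port, status, headers.get("X-Cache"))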
• Good HW layout for high-volume SQUIDing
   Do use SCSI, and many spindles for disk cache dirs
   Don’t use RAID
   Do use network-attached storage, or place the origin servers on separate machines
   Do use ext3 with noatime for disk cache dirs
   Do monitor Squid stats
Flickr: How We Roll
• Yummy SQUID stats:
• >2800 images/sec, ~75-80% are cache hits
• ~10 million photos cached at any time
• 1.5 million cached in memory
The End