Hardware Layouts for LAMP Installations


October 18-21, 2005
San Francisco, CA
John Allspaw, Flickr Plumbr
Flickr (Yahoo)
[email protected]
October 18, 2005
Hyatt Regency San Francisco Airport, Burlingame, CA
Hardware requirements for LAMP installs have to do with:
• A decent amount about the actual hardware (“in-box” stuff)
• A bit more about the hardware architecture
• Which should complement the application architecture
What we’ll talk about here:
• Database (MySQL) layouts and considerations
• Some miscellaneous/esoteric stuff (lessons learned)
• Caching content and considerations
• Growing Up: the “One Box” solution
  - Basic web application (discussion board, etc.)
  - Low traffic
  - Apache/PHP/MySQL on one machine
  - Bottlenecks will start showing up (a quick way to spot them is sketched below):
    • Most likely the database before Apache/PHP
    • Disk I/O (InnoDB) or locking wait states (MyISAM)
    • Context switching between memory work (Apache) and CPU work (MySQL)
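A quick way to see which of those is biting on a single box, assuming a Linux host with the usual procps/sysstat tools (the interpretation notes are mine, not from the talk):

    vmstat 1        # watch "wa" (I/O wait) and "cs" (context switches):
                    #   high wa -> likely disk-bound (InnoDB flushes, MyISAM table scans)
                    #   a large, rising cs -> Apache and MySQL fighting over the same CPUs
    iostat -x 5     # per-device utilization, to confirm which disk is saturated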
[Diagram: ONE BOX]
• Growing Up: the “Two Box” solution
  - Higher-traffic application (more demand)
  - Apache/PHP on box A, MySQL on box B
  - Same network = bad (*or is it?), separate network = good
  - Bottlenecks will start to be:
    • Disk I/O on the MySQL machine (InnoDB)
    • Locking on MyISAM tables
    • Network I/O
[Diagram: TWO BOX]
• Growing Up: the “Many Boxes with Replication” solution
  - Yet even higher traffic
  - Writes are separated from reads (the master gets INSERT/UPDATE/DELETE, the slaves get SELECTs)
  - Diminishes network bottlenecks, disk I/O, and other “in-box” issues
  - SELECTs vs. INSERT/UPDATE/DELETE can be specified within the application (see the sketch below),
  - OR…
  - Load-balancing can be used
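A minimal sketch of the “specified within the application” option, in LAMP-era PHP with the old mysql_* API; the hostnames, credentials, and database name are made up, and real code would add error handling:

    <?php
    // Writes always go to the master
    $master = mysql_connect('db-master.internal', 'app', 'secret');
    mysql_select_db('app_db', $master);

    // Reads go to the load-balanced slave VIP (or a randomly chosen slave)
    $slaves = mysql_connect('db-slaves-vip.internal', 'app', 'secret');
    mysql_select_db('app_db', $slaves);

    // INSERT/UPDATE/DELETE -> master
    mysql_query("INSERT INTO comments (photo_id, body) VALUES (42, 'nice!')", $master);

    // SELECT -> slave pool (may lag slightly behind the master; see “Slave Lag”)
    $result = mysql_query("SELECT body FROM comments WHERE photo_id = 42", $slaves);
    ?>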
[Diagram: MANY BOX]
Slave Lag
• When slaves can’t keep up with replication
• They’re too busy:
  - Reading (production traffic)
  - Writing (replication)
• Manifests as:
  - Comments/photos/any user-entered data doesn’t show up on the site right away
  - So users repeat the action, thinking it didn’t “take” the first time, which makes the situation worse
Insert funny photo here about slave lag*
*slave lag isn’t funny
Hardware Load Balancing MySQL
How It’s Usually Done
• Standard MySQL master/slave replication
• All writes (inserts/updates/deletes) from the application go to the master
• All reads (selects) from the application go to a load-balanced VIP (virtual IP), spreading the load across all slaves
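For reference, a minimal sketch of the my.cnf settings this layout assumes (MySQL 4.x/5.0-era option names; server IDs and log names are illustrative):

    # master
    [mysqld]
    server-id = 1
    log-bin   = mysql-bin      # binary log the slaves replicate from

    # each slave (unique server-id per slave)
    [mysqld]
    server-id = 2
    read_only = 1              # keep stray application writes off the slaves
    relay-log = relay-bin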
What Is Good About Load Balancing
• You can add/remove slaves without affecting the application, since queries are atomic (sorta/kinda)
• An additional monitoring point and some automatic failure handling
• You can treat your whole slave pool as one resource, which makes capacity planning a lot easier if you know the ceiling of each slave
• How do you know the ceiling (maximum QPS capacity) of each slave?
  - First make a guess based on benchmarking (or look up some bench results from Tom’s Hardware, anandtech.com, etc.)
  - Then get more machines than that :)
  - Scary: in production, during a lull in traffic, remove machines from the pool until you detect lag
  - The QPS you saw right before slave lag set in: THAT is your ceiling
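As a worked example with made-up numbers (not Flickr’s): if each slave tops out around 3,000 SELECTs/sec before lag sets in, and peak read traffic is 20,000 SELECTs/sec, you need at least 7 slaves just to cover the peak, plus headroom for failed slaves, replication writes, and growth.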
What Can Be Bad/Tough About Load Balancing
• Not all load balancers are created equal, and not all load-balancing companies expect this use of their product, so vendor support may still be thin
• Not that many people are doing this in high-volume situations yet, so support from the community isn’t large either
• Gotchas:
  - Port exhaustion
  - Health checks
  - Balancing algorithms
Port Exhaustion
PROBLEM:
• The LB is basically a traffic cop, nothing more
• Side effect of having a lot of connections: only ~64,511 ports per IP (VIP) to use
• 64,511 ports ÷ ~120 seconds each port stays tied up (TIME_WAIT)…
• ≈535 new connections per second, per IP*
* Not really, but close to it: tcp_tw_recycle and tcp_tw_reuse
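The footnote refers to the Linux TCP knobs that relax this on whichever box is opening the connections; a hedged sketch (tcp_tw_recycle in particular can misbehave when clients sit behind NAT):

    # /etc/sysctl.conf, then run `sysctl -p`
    net.ipv4.tcp_tw_reuse   = 1   # reuse TIME_WAIT sockets for new outbound connections
    net.ipv4.tcp_tw_recycle = 1   # faster TIME_WAIT recycling; use with care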
Port Exhaustion (cont’d)
SOLUTION:
• Use a pool of IPs on the database slave/farm side (Netscaler calls these “subnet IPs”, Alteon calls them “PiPs”)
• Monitor port/connection usage, and know when it’s time to add more
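One low-tech way to keep an eye on it from a slave (the load balancer’s own counters are better if you have them):

    # connections to MySQL, grouped by source IP (the LB's subnet IPs/PiPs)
    netstat -an | grep ':3306' | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -rn
    # if one source IP is heading toward tens of thousands of entries, add another IP to the pool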
Health checks
• The LB won’t know anything about how well each MySQL slave is doing, and will pass traffic as long as port 3306 is answering
• Load balancers don’t talk SQL, only things like plain old TCP, HTTP/S, maybe FTP
Health checks (cont’d)
• Two options:
1. Dirty, but workable:
  - Have each server monitor itself, and shut off/firewall its own port 3306 when it’s unhealthy, even if MySQL is still running (sketched below)
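A hedged sketch of option 1, assuming iptables; the check itself and the lag threshold are up to you:

    # run from a local monitor when the slave is deemed unhealthy
    # (e.g. Seconds_Behind_Master from `mysql -e "SHOW SLAVE STATUS\G"` is too high)
    iptables -I INPUT -p tcp --dport 3306 -j REJECT    # LB's TCP check now fails

    # ...and once the slave has caught up again:
    iptables -D INPUT -p tcp --dport 3306 -j REJECT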
Health checks (cont’d)
2. Cleaner, but a bit more work:
  - Have each server monitor itself, and run a check via xinetd (for example, a Nagios monitor)
  - The LB can then tickle that port and expect back an “OK” string; if it doesn’t get one, it automatically takes that server out of the pool
  - Good for detecting isolated incidents of “slave lag” and counteracting them automatically
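A hedged sketch of the xinetd approach; the port, service name, lag threshold, and script path are made up for illustration, and the LB would be configured to expect the literal string “OK” from this port:

    # /etc/xinetd.d/mysqlchk
    service mysqlchk
    {
        type        = UNLISTED
        port        = 9200
        socket_type = stream
        protocol    = tcp
        wait        = no
        user        = nobody
        server      = /usr/local/bin/mysqlchk.sh
        disable     = no
    }

    # /usr/local/bin/mysqlchk.sh
    #!/bin/sh
    # print "OK" only if replication is running and lag is under 30 seconds
    LAG=$(mysql -e 'SHOW SLAVE STATUS\G' | awk '/Seconds_Behind_Master/ {print $2}')
    if [ -n "$LAG" ] && [ "$LAG" -lt 30 ] 2>/dev/null; then echo "OK"; else echo "FAIL"; fi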
[Diagram: health checks]
Balancing Algorithms
• Load balancers know HTTP, FTP, basic TCP, but not SQL
• Two things to care about:
  - Should the server still be in the pool? (health checks)
  - How should load get balanced?
    • “least connections” or “least bandwidth” or “least anything” = BAD
    • Because not all SQL queries are created equal
    • Use “round-robin” or “random”
    • What happens if you don’t: Evil Favoritism™
[Image: Evil Favoritism]
• Meanwhile… for “in-the-box” considerations:
  - Interleaving memory *does* make a difference
  - Always RAID10 (or RAID0 if you’re crazy*), but NEVER RAID5 (for InnoDB, anyway)
  - RAID10 has much more read capacity, and a write penalty, but not as much as RAID5
  - Always have battery backup for HW RAID write caching
  - Or don’t use write caching at all
• “In-the-box” considerations (cont’d)
  - Always have proper monitoring (Nagios, etc.) for failed/rebuilding drives
  - SATA or SCSI? SCSI! It’s worth it!
  - 10K or 15K RPM SCSI? 15K! It’s worth it! (~20% performance increase when you’re disk-bound)
  - For 64-bit Linux (AMD64 or EM64T):
    • Crank up the RAM for InnoDB’s buffer pool
    • Swapping = very, very bad. Either:
      - Turn it off (slightly scary)
      - Leave it on and set /proc/sys/vm/swappiness = 0
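A hedged sketch of those two knobs; the buffer pool size is illustrative and depends on the RAM in the box:

    # my.cnf, on a dedicated 64-bit DB box with (say) 16 GB of RAM
    [mysqld]
    innodb_buffer_pool_size = 12288M

    # OS side: discourage swapping (or turn swap off entirely)
    #   echo 0 > /proc/sys/vm/swappiness      # immediately
    #   vm.swappiness = 0                     # persistently, in /etc/sysctl.conf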
• 10K versus 15K drives?
• Does it really matter that much?
• Some in-the-wild proof…
[Graph: slave lag in production, 10K drives vs. 15K drives]
• Using MySQL with a SAN (Storage Area Network)
  - Do lay out storage the same as if it were local
  - Do make sure that the HBA (fiber card) driver is well supported by Linux
  - Don’t share volumes across databases
  - Don’t forget to correctly tune the queue depth, which should increase from server HBA -> switch -> storage
Caching your static content
• Caching Static Content
  - SQUID = good
  - Relieve your front-end PHP machines from looking up data that will never (or rarely) change
  - Generate static pages, and cache them in squid along with your images
• Caching Static Content (cont’d)
  - Use SQUID to accelerate plain-old origin webservers, also known as “reverse-proxy” HTTP acceleration
  - Described here and elsewhere:
    http://www.squid-cache.org/Doc/FAQ/FAQ-20.html
Basic SQUID layout
• squid accepts requests on port 80
• passes cache misses on to apache on port 81
• apache uses an NFS-mounted dir as its docroot
• the NFS mount should be on a local subnet, or a dedicated net
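A hedged sketch of that layout using Squid 2.5-era httpd-accelerator (reverse-proxy) directives; the values are illustrative:

    # squid.conf: listen on 80, accelerate the apache sitting on 81
    http_port 80
    httpd_accel_host 127.0.0.1          # the origin apache
    httpd_accel_port 81
    httpd_accel_single_host on
    httpd_accel_with_proxy off
    httpd_accel_uses_host_header on     # needed for name-based virtual hosts

    # httpd.conf on the same box: keep apache off port 80
    Listen 81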
• Good HW layout for high-volume SQUIDing
  - Do use SCSI, and many spindles for the disk cache dirs
  - Don’t use RAID
  - Do use network-attached storage, or place the origin servers on separate machines
  - Do use ext3 with noatime for the disk cache dirs (sketch below)
  - Do monitor squid stats
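A hedged sketch of the cache-dir side of that; mount points, sizes, and the aufs store type are illustrative (plain ufs works too):

    # /etc/fstab: mount each cache spindle with noatime
    /dev/sdb1   /cache1   ext3   defaults,noatime   0 0
    /dev/sdc1   /cache2   ext3   defaults,noatime   0 0

    # squid.conf: one cache_dir per spindle (size in MB, then L1/L2 subdirectory counts)
    cache_dir aufs /cache1/squid 60000 16 256
    cache_dir aufs /cache2/squid 60000 16 256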
Flickr: How We Roll
• Yummy SQUID stats:
  - >2800 images/sec, ~75-80% are cache hits
  - ~10 million photos cached at any time
  - 1.5 million cached in memory
The End