Application Architecture for the rest of us

Presented by
M N Islam Shihan
Introduction


Target Audience
What is Architecture?
• Architecture is the foundation of your application
• Applications are not like skyscrapers
• Enterprise vs. personal architecture

Why look ahead in Architecture?
• Adaptability with growth
• Maintainability
• Requirements never end
Enterprise Architecture (cont…)
• Security
• Responsiveness
• Extendibility
• Availability
• Load Management
• Distributed Computation
• Caching
• Scalability
Security
Security (cont…)
Think about Security first of all
• Network Security: Implement a firewall & reverse proxy for your network
• SQL Injection: Never forget to escape field values in your queries
• XSS (Cross-Site Scripting): Never trust user-provided data (or data grabbed from third-party sources), and never display it without sanitizing/escaping
• CSRF (Cross-Site Request Forgery): Never let your forms be submitted from third-party sites
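A minimal sketch of the SQL injection and XSS points, using only the Python standard library. The `users` table and the function names are made up for illustration. It uses bound query parameters rather than manual escaping, which is a swapped-in (and generally safer) way to reach the same goal.

```python
import html
import sqlite3


def find_user(conn, name):
    # The "?" placeholder lets the driver bind the value, so user input
    # is never spliced into the SQL text itself.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def render_comment(text):
    # Escape <, >, & and quotes before putting untrusted text into HTML.
    return "<p>" + html.escape(text) + "</p>"


# Hypothetical demo data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

attack = "x' OR '1'='1"
safe_rows = find_user(conn, attack)  # treated as a literal name, matches nothing
safe_html = render_comment("<script>alert(1)</script>")
```

With string concatenation the `attack` value would have matched every row; with a bound parameter it is just an odd-looking name.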
Security (cont…)
• DDoS (Distributed Denial of Service): Enable real-time monitoring of access to detect and prevent DDoS attacks
• Session fixation: Implement session key regeneration for every request
• Always hash your security tokens/cookies with new random salts on a per-request/per-session basis (or at an interval)
• Stay tuned and up to date with security news and releases for all the tools and technologies you use
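One possible way to sketch the session-regeneration and salted-token bullets with the standard library. `SERVER_KEY`, `new_session_id`, `issue_token` and `verify_token` are hypothetical names, and a real application would normally lean on its framework's session and CSRF support instead.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; in practice it would come from configuration.
SERVER_KEY = secrets.token_bytes(32)


def new_session_id():
    # Call this again whenever the session key should be regenerated.
    return secrets.token_urlsafe(32)


def _sign(salt, session_id):
    message = (salt + ":" + session_id).encode()
    return hmac.new(SERVER_KEY, message, hashlib.sha256).hexdigest()


def issue_token(session_id):
    # A fresh random salt per token, so two tokens for one session differ.
    salt = secrets.token_hex(16)
    return salt + ":" + _sign(salt, session_id)


def verify_token(session_id, token):
    salt, _, mac = token.partition(":")
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(mac, _sign(salt, session_id))


session_id = new_session_id()
token = issue_token(session_id)
```

A form would embed `token` in a hidden field, and the server would call `verify_token` before accepting the submission.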
Responsiveness
Responsiveness (cont…)
• Web applications should be as responsive as desktop applications
• Plan well and apply good use of JavaScript to achieve responsiveness
• Detect browsers and provide a separate response/interface depending on the detected browser type
• Implement unobtrusive use of JavaScript
• Implement optimal use of Ajax
• Use Comet programming instead of polling
• Implement deferred/asynchronous processing of large computations using a job queue
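The job queue idea can be sketched in-process with a queue and a worker thread. This is only an illustration: the job shape `(job_id, n)` is invented, and a production setup would more likely use a separate job server such as the ones listed in the tools references.

```python
import queue
import threading

jobs = queue.Queue()
results = {}


def worker():
    # Pull jobs until a None sentinel arrives.
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        job_id, n = job
        results[job_id] = sum(i * i for i in range(n))  # the "large computation"
        jobs.task_done()


def submit(job_id, n):
    # Returns immediately; the request handler does not wait for the result.
    jobs.put((job_id, n))
    return job_id


thread = threading.Thread(target=worker, daemon=True)
thread.start()

submit("report-1", 1000)
jobs.join()      # only the demo waits here; a web request would not
jobs.put(None)   # stop the worker
thread.join()
```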
Extendibility
• Implement and use a robust data access interface, so that it can be exposed easily via web services (like REST, SOAP, JSONP)
• Use architectural patterns & best practices
  • SOA (Service Oriented Architecture)
  • MVC (Model View Controller)
• Modular architecture with pluggability
• Allow hooks and overrides through events
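A small sketch of hooks and overrides through events. `EventBus` and the event name `render_title` are hypothetical; the point is only that plugins can register handlers without the core code knowing about them.

```python
from collections import defaultdict


class EventBus:
    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        # Plugins register handlers; the core never imports the plugin.
        self._handlers[event].append(handler)

    def emit(self, event, value):
        # Each handler may override the value passed along the chain.
        for handler in self._handlers[event]:
            value = handler(value)
        return value


bus = EventBus()
bus.on("render_title", str.strip)
bus.on("render_title", str.title)

title = bus.emit("render_title", "  hello world ")
```

An event with no registered handlers simply returns the value unchanged, so the core behaves the same with or without plugins.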
Availability
Availability (cont…)
• Implement a well-planned disaster recovery policy
• Use version control for your sources
• Use RAID for your storage devices
• Keep a hot standby fallback for each of your primary data/content servers
• Perform periodic backups of your source repository, files & data
• Implement periodic archiving of your old data
• Provide a mechanism for users to switch between current and archived data when possible
Load Management
Load Management (cont…)
• Monitor and benchmark your servers periodically and find the peak usage time
• Optimize to support at least 150% of peak-time load
• Use web servers with high I/O performance
• Introduce a load balancer to distribute load among multiple application servers
• Start with a software load balancer (a.k.a. reverse proxy), then grow to a hardware load balancer only if necessary
• Use CDNs to serve your static content
• Use public CDNs to serve open source JavaScript or CSS files when possible
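To make the 150% rule and the load balancer bullet concrete, here is a rough sketch. The request rates and backend names are invented numbers, and round-robin is just one simple distribution policy among several.

```python
import itertools
import math


def required_capacity(peak_rps, headroom=1.5):
    # Plan for at least 150% of the measured peak load.
    return peak_rps * headroom


def servers_needed(peak_rps, per_server_rps, headroom=1.5):
    return math.ceil(required_capacity(peak_rps, headroom) / per_server_rps)


class RoundRobinBalancer:
    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        # Hand out backends in turn so the load is spread evenly.
        return next(self._cycle)


# Hypothetical numbers: 400 req/s at peak, 250 req/s per application server.
count = servers_needed(400, 250)
balancer = RoundRobinBalancer(["app1", "app2", "app3"])
first_four = [balancer.pick() for _ in range(4)]
```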
Caching

To Cache Or Not to Cache?
• Analyze the nature of the content and responses generated by your application very well
• What to cache?
• Analyze and set a proper expiry time
• Invalidate the cache whenever content changes
• Partial caching will also bring you speed
• When is caching bad?
• Understand various types of web caches
  • Browser cache
  • Proxy cache
  • Gateway cache
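Expiry time and invalidation can be illustrated with a tiny in-process cache. `TTLCache` is a hypothetical class written for this sketch, not the API of Memcached, Redis or any other tool named in this deck.

```python
import time


class TTLCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped on read.
            del self._store[key]
            return None
        return value

    def invalidate(self, key):
        # Call this whenever the underlying content changes.
        self._store.pop(key, None)


cache = TTLCache(ttl_seconds=60)
cache.set("page:/home", "<html>home</html>")
cache.invalidate("page:/home")  # content was edited, so drop the stale copy
```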
Caching (cont…)
Implement server side caching
• Runtime in-memory cache
  • Per request: Global variables
  • Shared: Memcached
• Persistent cache
  • Per server: File based, APC
  • Shared: DB based, Redis
• Optimizers and accelerators: eAccelerator, XCache
• Reverse proxy/gateway cache
  • Varnish cache
Distributed Computing
Scalability



What the heck is this?
Scalability is the soul of enterprise architecture
Scalability pyramid
Scalability (cont…)
Vertical Scalability (scaling up)
Scalability (cont…)
Horizontal Scalability (scaling out)
Scalability (cont…)
Scalability

Scaling up (vertical) vs. Scaling out (horizontal)
Scalability

Database Scalability
• Vertical: Add resources to the server as needed. In most cases this produces a single point of failure
• Horizontal: Distribute/replicate data among multiple servers
• Cloud Services: Store your data in third-party data centers and pay according to your usage
Scalability (cont…)
Scaling Database
Scaling options
• Master/Slave
  • Master for write, slaves for read
• Cluster Computing
  • Single storage with multiple server nodes
• Table Partitioning
  • Large tables are split among partitions
• Federated Tables
  • Tables are shared among multiple servers
• Distributed Key Value Stores
• Distributed Object DB
• Database Sharding
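As an illustration of the master/slave option, a query router might look roughly like this. The class name and the server labels are hypothetical, and classifying a statement by its first keyword is a deliberate simplification.

```python
import itertools


class ReadWriteRouter:
    def __init__(self, master, slaves):
        self._master = master
        self._slaves = itertools.cycle(slaves)

    def route(self, sql):
        # Reads go to the slaves in turn; everything else goes to the master.
        first_word = sql.lstrip().split(None, 1)[0].upper()
        if first_word == "SELECT":
            return next(self._slaves)
        return self._master


router = ReadWriteRouter("master", ["slave1", "slave2"])
targets = [
    router.route("SELECT * FROM books"),
    router.route("  select id FROM authors"),
    router.route("UPDATE books SET title = 'x' WHERE id = 1"),
]
```

A real router also has to deal with replication lag, for example by sending reads that follow a write in the same session to the master.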
Scalability (cont…)
Database Sharding
• Smaller databases are easier to manage
• Smaller databases are faster
• Database sharding can reduce costs
• Needs one or more well-defined shard functions
• "Don't do it, if you don't need to!" (37signals.com)
• "Shard early and often!" (startuplessonslearned.blogspot.com)
Scalability (cont…)
Database Sharding
When appropriate?
• High-transaction database applications
• Mixed workload database usage
  • Frequent reads, including complex queries and joins
  • Write-intensive transactions (CRUD statements, including INSERT, UPDATE, DELETE)
  • Contention for common tables and/or rows
• General Business Reporting
  • Typical "repeating segment" report generation
  • Some data analysis (mixed with other workloads)
What to analyze?
• Identify all transaction-intensive tables in your schema.
• Determine the transaction volume your database is currently handling (or is expected to handle).
• Identify all common SQL statements (SELECT, INSERT, UPDATE, DELETE), and the volumes associated with each.
• Develop an understanding of the "table hierarchy" contained in your schema; in other words, the main parent-child relationships.
• Determine the "key distribution" for transactions on high-volume tables, to determine if they are evenly spread or are concentrated in narrow ranges.
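The "key distribution" check in the last bullet could be approximated as follows. The bucket size, the 50% threshold and the sample keys are assumptions chosen for illustration; in practice the keys would come from your own transaction logs.

```python
from collections import Counter


def key_distribution(keys, bucket_size):
    # Count how many transactions fall into each key range (bucket).
    return Counter(key // bucket_size for key in keys)


def is_concentrated(keys, bucket_size, threshold=0.5):
    # True when a single range receives more than `threshold` of all traffic.
    counts = key_distribution(keys, bucket_size)
    return max(counts.values()) / len(keys) > threshold


evenly_spread = list(range(1000))
hot_range = [5] * 900 + list(range(100, 200))

even_result = is_concentrated(evenly_spread, bucket_size=100)
hot_result = is_concentrated(hot_range, bucket_size=100)
```

A concentrated distribution suggests that sharding by key range would leave one shard doing most of the work.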
Scalability (cont…)
Database Sharding
Challenges
• Reliability
  • Automated backups
  • Database shard redundancy
  • Cost-effective hardware redundancy
  • Automated failover
  • Disaster recovery
• Distributed queries
  • Aggregation of statistics
  • Queries that support comprehensive reports
Scalability (cont…)
Database Sharding
Challenges (cont…)
• Avoidance of cross-shard joins
• Auto-increment key management
• Support for multiple shard schemes
  • Session-based sharding
  • Transaction-based sharding
  • Statement-based sharding
• Determine the optimum method for sharding the data
  • Shard by a primary key on a table
  • Shard by the modulus of a key value
  • Maintain a master shard index table
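Two of the sharding methods listed above, sketched in a few lines. `shard_by_modulus` and `ShardIndex` are illustrative names, and the in-memory dictionary merely stands in for a real master shard index table.

```python
def shard_by_modulus(key, shard_count):
    # Shard by the modulus of a key value: simple, but changing
    # shard_count later moves most keys to a different shard.
    return key % shard_count


class ShardIndex:
    """Stand-in for a master shard index table (key -> shard)."""

    def __init__(self):
        self._index = {}

    def assign(self, key, shard):
        self._index[key] = shard

    def lookup(self, key):
        return self._index[key]


shard_for_customer_10 = shard_by_modulus(10, shard_count=4)

index = ShardIndex()
index.assign("customer-42", "shard-b")
shard_for_customer_42 = index.lookup("customer-42")
```

The modulus function needs no lookup but is rigid; the index table costs an extra lookup but lets individual keys be moved between shards.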
Scalability (cont…)
Database Sharding
Example Bookstore schema showing how data is sharded
Tools
• Application framework
• Load balancer with multiple application servers
• Continuous integration
• Automated Testing
  • TDD (Test Driven Development)
  • BDD (Behavior Driven Development)
• Monitoring
  • Services
  • Servers
  • Error Logging
  • Access Logging
• Content Delivery Networks (CDN)
• FOSS
Think Ahead
Think Ahead (cont…)
• Understand the business model
• Analyze requirements in the greatest detail
• Plan for extendibility
• Be agile, do incremental architecture
• Create/use frameworks
• SQL or NoSQL?
• Sharding or clustering or both?
• Cloud services?
Guidelines
• Enrich your knowledge: Read, read & read. Read anything available, from jokes to religions.
• Follow patterns & best practices
• Mix technologies
• Don’t let your tools/technologies limit your vision
  • Invent/customize technology if required
• Use FOSS
• Don’t expect ready solutions
  • Find the closest match
  • Customize as needed

Guidelines (cont…)
Database Optimization
• Use established & proven solutions
  • MySQL
  • PostgreSQL
  • MongoDB
  • Redis
  • Memcached
  • CouchDB
• Understand and utilize indexing & full-text search
• Use optimized DB structure & algorithms
  • Modified Preorder Tree Traversal (MPTT)
  • Map Reduce
• ORM or not?
Guidelines (cont…)
Database Optimization
• Optimize your queries
  • One big query is faster than repetitive smaller queries
  • Never be lazy to write optimized queries
  • One Ring to Rule `em All
• Use Runtime In Memory Cache
  • Filtering an in-memory cached dataset is much faster than executing a query in the DB
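A sketch of "one big query" versus repetitive smaller queries, followed by filtering a cached result in memory. The `authors`/`books` schema is made up, SQLite is used only because it ships with Python, and no timing is claimed here; measure on your own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO books VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
""")


def titles_repetitive(conn):
    # One query for the authors, then one more query per author.
    out = []
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        out.extend((name, title) for (title,) in rows)
    return out


def titles_single(conn):
    # The same result in a single round trip.
    return conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id"
    ).fetchall()


cached = titles_single(conn)                          # runtime in-memory cache
anns_books = [row for row in cached if row[0] == "Ann"]  # filter without the DB
```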
Guidelines (cont…)
One Ring to Rule `em All
Perform Selection, then Projection, then Join
[Diagram: tables A (1,000 records), B (1,000,000 records) and C (1,000,000,000 records), linked by a_id]
A simple example
Write a standard SQL query to find all records with fields A.a1, B.b1 and C.c1 from tables A (id, a1, a2, a3, …, aP), B (id, a_id, b1, b2, b3, …, bQ), and C (id, b_id, c1, c2, c3, …, cR), given that A.aX, B.bY and C.cZ match the values 'X', 'Y' and 'Z' respectively.
Assume all tables A, B and C have primary keys defined by the id column, and that a_id and b_id are the foreign keys in B from A and in C from B respectively.
Guidelines
One Ring to Rule `em All (cont…)
Solution 1
SELECT A.a1, B.b1, C.c1
FROM A, B, C
WHERE A.id = B.a_id AND B.id = C.b_id
AND A.aX = 'X' AND B.bY = 'Y' AND C.cZ = 'Z'
Why does it suck?
• Remember the sizes of tables A, B and C?
• Cross products of tables are always memory intensive. Why?
• A x B x C will have 1,000 x 1,000,000 x 1,000,000,000 records with (P + 1) + (Q + 2) + (R + 2) fields
• Can you imagine the size of the in-memory result set of the joined tables?
• It will be HUGE
Guidelines
One Ring to Rule `em All (cont…)
Solution 2
SELECT A.a1, B.b1, C.c1
FROM A
INNER JOIN B ON A.id = B.a_id
INNER JOIN C ON B.id = C.b_id
WHERE A.aX = 'X' AND B.bY = 'Y' AND C.cZ = 'Z'
Why does it still suck?
• A ⋈ B ⋈ C will produce (1,000 x 1,000,000) records to perform A ⋈ B, then produce another (1,000 x 1,000,000,000) records to compute (A ⋈ B) ⋈ C, and only then filter the records as defined by the WHERE clause.
• The number of fields, that is P + 1 in A, Q + 2 in B and R + 2 in C, will also contribute to memory consumption.
• It is optimized but will still be HUGE with respect to memory consumption and computation
Guidelines
One Ring to Rule `em All (cont…)
Optimal Solution
SELECT A.a1, B.b1, C.c1
FROM (SELECT id, a1 FROM A WHERE aX = 'X') AS A
INNER JOIN (SELECT id, b1, a_id FROM B WHERE bY = 'Y') AS B ON A.id = B.a_id
INNER JOIN (SELECT id, c1, b_id FROM C WHERE cZ = 'Z') AS C ON B.id = C.b_id
Why does this solution outperform the others?
• Let’s keep the explanation as an exercise
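As a starting point for that exercise, the three queries can be run against a toy dataset to confirm that they return the same rows. This sketch assumes SQLite and a cut-down schema containing only the columns the queries touch. It says nothing about speed: many query planners push filters below joins on their own, so compare the execution plans on your own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, a1 TEXT, aX TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, b1 TEXT, bY TEXT);
    CREATE TABLE C (id INTEGER PRIMARY KEY, b_id INTEGER, c1 TEXT, cZ TEXT);
    INSERT INTO A VALUES (1, 'a-one', 'X'), (2, 'a-two', 'other');
    INSERT INTO B VALUES (1, 1, 'b-one', 'Y'), (2, 1, 'b-two', 'other'),
                         (3, 2, 'b-three', 'Y');
    INSERT INTO C VALUES (1, 1, 'c-one', 'Z'), (2, 1, 'c-two', 'other'),
                         (3, 3, 'c-three', 'Z');
""")

SOLUTION_1 = """
    SELECT A.a1, B.b1, C.c1 FROM A, B, C
    WHERE A.id = B.a_id AND B.id = C.b_id
      AND A.aX = 'X' AND B.bY = 'Y' AND C.cZ = 'Z'
"""

SOLUTION_2 = """
    SELECT A.a1, B.b1, C.c1 FROM A
    INNER JOIN B ON A.id = B.a_id
    INNER JOIN C ON B.id = C.b_id
    WHERE A.aX = 'X' AND B.bY = 'Y' AND C.cZ = 'Z'
"""

# Selection and projection happen inside each subquery, before the joins.
OPTIMAL = """
    SELECT A.a1, B.b1, C.c1
    FROM (SELECT id, a1 FROM A WHERE aX = 'X') AS A
    INNER JOIN (SELECT id, b1, a_id FROM B WHERE bY = 'Y') AS B ON A.id = B.a_id
    INNER JOIN (SELECT id, c1, b_id FROM C WHERE cZ = 'Z') AS C ON B.id = C.b_id
"""

rows = [conn.execute(sql).fetchall() for sql in (SOLUTION_1, SOLUTION_2, OPTIMAL)]
```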
Reference : Tools
Security
• Nmap: http://nmap.org/
• Nikto: http://cirt.net/Nikto2
• List of Tools: http://sectools.org/
Caching
• APC: http://php.net/manual/en/book.apc.php
• XCache: http://xcache.lighttpd.net/
• eAccelerator: http://sourceforge.net/projects/eaccelerator/
• Varnish Cache: https://www.varnish-cache.org/
• Memcached: http://memcached.org/
• Redis: http://redis.io/
Load Balancer
• HAProxy: http://haproxy.1wt.eu/
• Pound: http://www.apsis.ch/pound/
Reference : Tools (cont…)
NoSQL
• MongoDB: http://www.mongodb.org/
• CouchDB: http://couchdb.apache.org/
• A complete list: http://nosql-database.org/
Distributed Computing
• Gearman: http://gearman.org/
Message Queue/Job Server
• RabbitMQ: http://www.rabbitmq.com/
• ActiveMQ: http://activemq.apache.org/
Monitoring
• Nagios: http://www.nagios.org/
Testing
• Selenium: http://seleniumhq.org/
• Cucumber: http://cukes.info/
• Watir: http://watir.com/
• PHPUnit: http://www.phpunit.de/manual/3.7/en/
MPTT
• Shameless Promotion: https://github.com/mnishihan/phpMptt
Reference : Articles
Caching
• http://www.mnot.net/cache_docs/
• http://bit.ly/9cTJfA
• http://bit.ly/sMRyxC
Load Balancing
• http://en.wikipedia.org/wiki/Load_balancing_%28computing%29
• http://1wt.eu/articles/2006_lb/index.html
Database Sharding
• http://www.codefutures.com/database-sharding/
• http://bit.ly/Y3b3J
• http://www.startuplessonslearned.com/2009/01/sharding-for-startups.html
CDN
• http://bit.ly/16cKu
Scalability & Architecture
• http://www.diranieh.com/DistributedDesign_1/Scalability.htm
• http://www.infoq.com/presentations/Facebook-Software-Stack
• http://99designs.com/tech-blog/blog/2012/01/30/infrastructure-at-99designs/
MPTT
• http://www.sitepoint.com/hierarchical-data-database/
Thank You
Join phpXperts [http://bit.ly/phpxperts]
Follow me on Twitter [http://twitter.com/mnishihan]
Subscribe on Facebook [http://fb.me/mnishihan]
Questions?
I will be glad to answer