Cofax Scalability
Document Version 1.0

Scaling Cofax in General
The scalability of Cofax is directly related to the system software, hardware, and network environment in which it is installed.
The analysis did not find any scalability bottlenecks in the Cofax design or code itself. It confirmed that Cofax was designed to scale.
This process did result in a significant reliability and performance improvement to our server hosting facility and database setup.
Cofax was developed to allow the rapid deployment of new hardware resources according to demand.
Cofax is not dependent on any one system installation architecture. It scales by reconfiguring the environment to meet current needs.

Scaling Cofax at PNI
Description of Installation v2.0.atPNI
Strengths of Installation v2.0.atPNI
Weaknesses of Installation v2.0.atPNI
Description of Installation v3.0.atPNI
Strengths of Installation v3.0.atPNI
Addressing concerns about Installation v3.0.atPNI

Installation v2.0.atPNI
[Diagram: Installation v2.0.atPNI. Labeled components: HTTP Server, Cofax Server, and File Storage repeated across several machines; additional Cofax Servers; one File Server; one Database Server.]

Strengths of Installation v2.0.atPNI
HTTP Servers are distributed.
Cofax Application Servers are distributed.
Load is balanced between multiple servers.
Load can be distributed as necessary.
Once the hardware and OS are in place, additional servers can be manually configured and running in minutes.
Very suitable for an ISP that can automatically add/remove servers on the fly.
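
The idea of balancing load across servers that can be added or removed on the fly can be sketched as a round-robin pool. This is only an illustration under stated assumptions: the `ServerPool` class, its methods, and the host names are hypothetical and are not part of Cofax or of the PNI installation.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin pool of application servers (illustration only, not a Cofax class). */
class ServerPool {
    private final List<String> servers = new CopyOnWriteArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    /** Brings a new server into rotation. */
    void add(String host) {
        servers.add(host);
    }

    /** Takes a server out of rotation. */
    void remove(String host) {
        servers.remove(host);
    }

    /** Returns the next server in rotation. */
    String pick() {
        // Work on a snapshot so a concurrent add/remove cannot change the size mid-call.
        Object[] snapshot = servers.toArray();
        if (snapshot.length == 0) {
            throw new IllegalStateException("no servers available");
        }
        int index = Math.floorMod(next.getAndIncrement(), snapshot.length);
        return (String) snapshot[index];
    }

    public static void main(String[] args) {
        ServerPool pool = new ServerPool();
        pool.add("cofax-1"); // hypothetical host names
        pool.add("cofax-2");
        System.out.println(pool.pick());
        System.out.println(pool.pick());
        pool.add("cofax-3"); // a server added while the pool is in use
        System.out.println(pool.pick());
    }
}
```

Requests keep flowing while servers join or leave; the only shared state is the rotation counter and the server list.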

Weaknesses of Installation v2.0.atPNI
Database server is a single point of failure.
  – On the serving side, and
  – On the updating side.
There is a limit to how much hardware we can add to the single database server.
  – The incremental performance gain from adding more computing resources (CPUs, memory, disk space) to the single server starts to diminish at a point.
A single machine, no matter how powerful, can still fail from common types of problems (locked data in a table, runaway processes, memory leaks, etc.).

Weaknesses of Installation v2.0.atPNI
There is a practical limit to how much optimization we can do on a single machine.
  – There is also a limit to how much optimization people would want to do.
These optimizations change with time.
  – When the server setup changes.
  – When the usage patterns change.

Installation v3.0.atPNI
[Diagram: Installation v3.0.atPNI. Labeled components: HTTP Servers, Cofax Servers, and File Storage repeated across several machines; one File Server; four Serving Database Servers; two Editing Database Servers.]

Strengths of Installation v3.0.atPNI
The model is already tested.
Only reasonable optimizations are required.
Serving database is replicated across multiple physical servers.
There is no single point of failure on the serving side.
Data transformation is isolated from data retrieval.

Implementing the Distributed Database
Requires no design changes to the Cofax framework.
Requires no changes to the Java code or software application.
Requires configuration changes only.
Requires the addition of new hardware resources: database servers, Tomcat servers, web servers.

Upgrading from Database model to Distributed Database model
Separation of “editing” and “serving” databases.
Front-end database can be replicated across multiple physical servers.
Additional databases can be brought online as needed.
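
The separation of “editing” and “serving” databases can be sketched as a small router: content edits go to the editing database, page requests rotate across the serving replicas, and further replicas can be brought online at any time. This is a sketch under assumptions, not how Cofax is configured; the `DatabaseRouter` class and the database names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical read/write router (illustration only, not a Cofax class). */
class DatabaseRouter {
    private final String editingDb;
    private final List<String> servingDbs;
    private int cursor = 0;

    DatabaseRouter(String editingDb, List<String> servingDbs) {
        this.editingDb = editingDb;
        this.servingDbs = new ArrayList<>(servingDbs); // copy so replicas can be added later
    }

    /** Brings an additional serving database online. */
    synchronized void addServingDb(String name) {
        servingDbs.add(name);
    }

    /** Content edits always go to the editing database. */
    String routeWrite() {
        return editingDb;
    }

    /** Page requests rotate across the serving replicas. */
    synchronized String routeRead() {
        if (servingDbs.isEmpty()) {
            throw new IllegalStateException("no serving database available");
        }
        String target = servingDbs.get(cursor % servingDbs.size());
        cursor = (cursor + 1) % servingDbs.size();
        return target;
    }

    public static void main(String[] args) {
        // Hypothetical database names.
        DatabaseRouter router = new DatabaseRouter("editing-1", List.of("serving-1", "serving-2"));
        System.out.println("write -> " + router.routeWrite());
        System.out.println("read  -> " + router.routeRead());
        System.out.println("read  -> " + router.routeRead());
        router.addServingDb("serving-3");
        System.out.println("read  -> " + router.routeRead());
    }
}
```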

A proven model that is able to serve very high traffic
Additional database servers can be added to handle growing web site traffic.
  – E.g. 100 million or more dynamic page views a day.
Can house large amounts of content.
  – Disk storage continues to become cheaper.
  – 10 years’ worth of content from 100 daily newspapers.

Replication Issues Addressed
The database replication model is based on the knowledge that:
The number of “reads” from the data store outweighs the number of “writes”.
  – E.g. a data store that has 10 million records read from it in an hour is likely to have no more than 10 thousand records written to it.
The number of new records being added or deleted outweighs the number of current records being updated.
  – E.g. a data store that has 10 thousand new records added to it in an hour is likely to have between 1 hundred and 2 thousand existing records updated in that time.
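
The ratios implied by the two examples above can be worked out directly. The figures are the ones quoted on this slide; the `ReplicationRatios` class and its `ratio` helper are hypothetical names used only for this illustration.

```java
/** Works through the read/write figures quoted above (illustration only). */
class ReplicationRatios {
    /** Returns how many of the first kind of operation occur per one of the second kind. */
    static double ratio(long numerator, long denominator) {
        return (double) numerator / denominator;
    }

    public static void main(String[] args) {
        long readsPerHour = 10_000_000L;  // records read in an hour
        long writesPerHour = 10_000L;     // upper bound on records written in that hour
        long insertsPerHour = 10_000L;    // new records added in an hour
        long updatesLow = 100L;           // lower bound on existing records updated
        long updatesHigh = 2_000L;        // upper bound on existing records updated

        // At least this many reads happen for every write.
        System.out.println("reads per write: " + ratio(readsPerHour, writesPerHour));
        // Range of new records added for every existing record updated.
        System.out.println("inserts per update: " + ratio(insertsPerHour, updatesHigh)
                + " to " + ratio(insertsPerHour, updatesLow));
    }
}
```

With these figures there are at least 1,000 reads for every write, and between 5 and 100 new records for every updated one, which is why replicating a read-mostly serving database is attractive.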

Latency Issues Addressed
Updates from the Editing databases to the Serving databases are transactional. As transactions occur on the editing database, they are replicated on the serving machines.
The transactional model means almost no latency between editing and serving machines.
Data is de-normalized and optimized for fast serving on the Editing databases. These fast-access tables are sent to the Serving databases.
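
The transactional update path described above can be sketched as follows: each transaction committed on the editing side is forwarded as one unit to every serving replica, so the replicas trail the editing database by at most the transaction in flight. This is a simplified in-memory model under stated assumptions, not the replication mechanism of any particular database product; all class and key names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of transactional replication (illustration only). */
class TransactionalReplication {
    /** A serving replica, modelled as a key/value table. */
    static class Replica {
        final Map<String, String> rows = new HashMap<>();

        /** Applies every change in the transaction as one unit. */
        synchronized void apply(Map<String, String> transaction) {
            rows.putAll(transaction);
        }
    }

    private final Map<String, String> editingRows = new HashMap<>();
    private final List<Replica> replicas = new ArrayList<>();

    /** Brings a replica online, seeding it with the current editing state. */
    synchronized void addReplica(Replica replica) {
        replica.apply(editingRows);
        replicas.add(replica);
    }

    /** Commits on the editing side, then forwards the same transaction to each replica. */
    synchronized void commit(Map<String, String> transaction) {
        editingRows.putAll(transaction);
        for (Replica replica : replicas) {
            replica.apply(transaction);
        }
    }

    public static void main(String[] args) {
        TransactionalReplication editing = new TransactionalReplication();
        Replica serving = new Replica();
        editing.addReplica(serving);
        // Hypothetical de-normalized, fast-access row keyed by article id.
        editing.commit(Map.of("article:1", "Headline and body, ready to serve"));
        System.out.println(serving.rows.get("article:1"));
    }
}
```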

Conclusion
Because of its flexible framework, Cofax can scale to meet any demand.
Scaling requires only the addition of hardware resources and minor configuration changes.
The current installation changes took only a few days to implement and bring online.