Presentation_Valery_Pechatnikov

Transcript Presentation_Valery_Pechatnikov

The NoSQL Approach
To High Scalability
Valery Pechatnikov
Web Enhanced Information Management - Spring
2011
Web Scalability
The the amount of data accessible
via the web has grown
exponentially and continues to
expand at an incredible pace
A typical user of the web today has
a large digital footprint - expecting
to be able to store and quickly
retrieve personal journals, photos,
and videos within the cloud
•
Shared links: 1,000,000 every 20 minutes
•
Tagged photos: 1,323,000
•
Event invites sent out: 1,484,000
•
Wall Posts: 1,587,000
•
Status updates: 1,851,000
•
Friend requests accepted: 1,972,000
•
Photos uploaded: 2,716,000
•
Comments: 10,208,000
•
Message: 4,632,000
Examples: Facebook, Twitter
Facebook Messaging (as of 11/2010): Over 350
million users sending over 15 billion person-to-person
messages per month
Back in 2009, Facebook had over 30,000 servers.
Today, Facebook has decided to publish the
specifications of it’s latest custom data center as it
breaks ground for massive data storage (from a
hardware perspective)
Twitter is currently seeing 155 million tweets per day
(April 2011), and the numbers are growing weekly.
What is High Scalability?
Many Web 2.0 applications find themselves in need of scaling beyond their existing capabilities very
quickly.
Scaling approaches:
Vertical - use larger capacity, more powerful, faster hardware
Horizontal - use more hardware - distribute the workload across multiple servers
Vertical scaling quickly reaches a limit and can be expensive
Horizontal scalability is cheaper, but requires distributing data across multiple servers, which can be a
complex problem.
Scalability and Relational Database
Management Systems (RDBMS)
Traditional horizontal scaling - data must be
partitioned across multiple servers.
Possible partitioning schemes:
Master/Slave (Write/Read)
Horizontal Table partitioning
Sharding (share nothing)
RDBMS scaling problems
However, horizontal scaling with RDBMS requires giving up some of the
features that make relation datastores so popular in the first place.
Often for performance reasons, application developers give up
data normalization - on order to avoid joins, store data in a redundant
way
transactionality - distributed transactions are expensive
data constraints across partitioned tables
The
NoSQL
Approach
What is NoSQL?
The NoSQL space is not defined by any one specific database implementation, but by the notion of choice. Although these choices aren’t
meant to focus exclusively on web-scale performance, in the majority of cases the need for a new choice arose due to a previous
implementation’s inability to scale readily while maintaining the necessary features (and thus was really inspired by Web 2.0)
Generally, NoSQL alternatives focus on the ability to scale horizontally from the start, and thus there is a
relaxation of at least some of the typical ACID properties (Atomicity, Consistency, Isolation, Durability) and
often the complexity of relationships between entities is limited.
NoSQL Alternatives:
Key-Value Databases - simplest model, representing data as a massive hash table of key-value pairs. Examples: Voldemort (LinkedIn),
Dynamo (Amazon)
BigTable Databases - the core entity in this implementation is a large table with column “families” and a time dimension allowing data
versioning. Examples: BigTable (Google), HBase(Powerset --> Apache Hadoop).
Document databases - Data entities are represented as documents instead of rows, which are then grouped into collections (as opposed to
tables). A document in a collection doesn’t need to have a defined schema (like a row would for a table). Example: MongoDB (10gen)
Graph databases - Represent arbitrary data as nodes connected by edges, providing ability to model properties and relationships. The
natural query pattern for such a database is graph traversal (as opposed to joins). Example: neo4j
Evaluation Criteria

Ease of Scaling
 Does the implementation take care of distributing your data automatically?
 How much overheard is there in terms of performance once the data has been distributed?
 Does scaling out pose a considerable maintenance burden or does the data store solution provide configurable automated maintenance with a distributed
solution in mind?

ACID:

Data Consistency

Atomicity

Isolation

Durability

Query language ease of use and maturity

Data Modeling Flexibility

Migration/Adoption feasibility
Case Study: HBase
“…Hbase tables are like those in an RDBMS, only cells are versioned, rows are sorted, and columns can be
added on the fly by the client as long as the column family they belong to pre-exists.”
In the example diagram below, column family groups are defined by the prefix prior to the colon (i.e. “anchor”
and “contents” are sample family groups). The row has a key in the form of com.cnn.www (the convention
allows rows to be sorted in such a way that data for similar domains is stored close together). The third
dimension for each cell is a timestamp, such that the data is versioned.
How well does HBase scale?
Excellently. In the HBase implementation, rows are automatically grouped into “regions” and when
necessary tables are split into multiple regions, which are distributed across multiple servers seamlessly.
The location of a particular region is stored within a table on a special region server, such that any region
can be reached with at most 3 hops. The management of region splitting, region merging, and region
rebalancing all happens automatically in the HBase implementation and doesn’t need any extra coding from
the developer (just configuration of the relevant thresholds)
Case Study: HBase
In the image above, the Google version of the BigTable implementation demonstrates the structure of tablet location discovery (“region” location
in HBase).
A Chubby file (Zookeeper in HBase) is a file on the distributed filesystem lock service that stores the location of the first tablet (the root tablet).
The metadata tablets (regions) can be found from the root tablet location. (The root and metadata tablets are all part of one table, where the
topmost set of rows is the root).
Due to the implementation, this tree will never be allowed to become more than 3 levels deep
Due to the popularity of Hadoop and it’s seamless integration with HBase, many companies have taken the leap and have already adoped HBase.
With Facebook as a contributor to the project there is now even more reason to believe HBase will continue to evolve and to thrive.
Case Study: MongoDB
Data entities are represented as documents instead of rows,
which are then grouped into collections (as opposed to
tables). A document in a collection doesn’t need to have a
defined schema (like a row would for a table)
How well does MongoDB scale?
Excellently. MongoDB scales horizontally via an
auto-sharding.
Case Study: Voldemort
The data model for the Voldemort database is a simple key-value map. The keys and values being
stored can be almost arbitrarily complex objects.
How well does Voldemort scale?
Excellently. This is the core strength of the model and allows reads and writes to scale
horizontally transparently to the user (requiring only configuration) by partitioning the data
across the cluster (i.e. distributed database).
Consistent hashing is used to ensure that when a server is added or removed from the cluster,
rebalancing is efficient (i.e. only necessary amount of data is rebalanced, rather than all data).
The diagram shows a sample mapping of servers A, B, C, D to the universe of possible hashes
of keys. This assumes 3 servers will hold only one single key for redundancy/availability
Complex and ever-changing relationships are not modeled well via a simple key-value
datastore. The application reading and writing to the database would be required to check all
relationship constraints and interpret arbitrary key-value compositions. There are no triggers
and no constructs that allow an application to infer meaning into the data. purposes.
However, complex and ever-changing relationships are not modeled well via a simple keyvalue datastore. The application reading and writing to the database would be required to
check all relationship constraints and interpret arbitrary key-value compositions. There are no
triggers and no constructs that allow an application to infer meaning into the data.
Case Study: neo4j
Unlike the previously discussed data models, the graph database model thrives on relationships and associations.
Neo4j’s data model allows a user to model nodes and relationships (edges), with attributes (key-value pairs) on
both nodes and edges.
How well does Neo4j scale?
OK. Neo4j claims to be massively scalable, but at the same time concedes that it is the least scalable of the 4
types of NoSQL databases being discussed. The scalability of Neo4j is on the order of billions of nodes and
relationships on a single machine. As with relational databases, it’s possible to use sharding to scale to
multiple machines as well. Sharding a graph database is hard due to the nature of the graph data model.
Data Modeling Complexity vs Scalability (size)
Conclusion
Relational databases will not be made obsolete anytime soon by the NoSQL
alternatives.
However, for certain special use cases the features offered by NoSQL offerings
make the trade-offs worth it, and in some cases there is a clear win by going the
non-traditional route. As more types of data stores become available, with varied
APIs and even lower barriers to entry, the relational database will cease to be the
default choice for every application.
NoSQL has already proven to be a practical alternative for the needs of Web
giants such as Facebook and Twitter and should continue to expand in its scope
such that it will be more than just a passing fad.
REFERENCES
•
Hoff, Todd. "Facebook's New Real-time Messaging System: HBase to Store 135+ Billion Messages A Month." High Scalability. 16 Nov. 2010. Web. Feb.-Mar. 2011.
<http://highscalability.com/blog/2010/11/16/facebooks-new-real-time-messaging-system-hbase-to-store-135.html>.
•
Kreps, Jay. "Project Voldemort: Scaling Simple Storage at LinkedIn." The LinkedIn Blog. LinkedIn.com, 20 Mar. 2009. Web. 07 Mar. 2011.
<http://blog.linkedin.com/2009/03/20/project-voldemort-scaling-simple-storage-at-linkedin/>.
•
Lehnardt, Jan. "NoSQL Is About..." Blog.couchone.com. Couchone.com, 10 Apr. 2010. Web. 21 Feb. 2011. <http://blog.couchone.com/post/511008668/nosql-is-about>.
•
Metz, Cade. "HBase: Shops Swap MySQL for Open Source Google Mimic." Web log post. The Register: Sci/Tech News for the World. 19 Jan. 2011. Web. 07 Mar. 2011.
<http://www.theregister.co.uk/2011/01/19/hbase_on_the_rise/>.
•
Muthukkaruppan, Kannan. "The Underlying Technology of Messages." Web log post. Facebook. 15 Nov. 2010. Web. Feb.-Mar. 2011.
<http://www.facebook.com/note.php?note_id=454991608919>.

Presentation_Valery_Pechatnikov

Transcript Presentation_Valery_Pechatnikov

Directory