Cloud Computing
Clase 8 - NoSQL
Johnny Halife
@johnnyhalife
Matias Woloski
@woloski
Miguel Saez
@masaez
NoSQL
• What does it mean?
• RDBMS legacy and rise of NoSQL
• NoSQL classification
• Pros and Cons
• Possible use cases
• Real-world examples
• What next?
What does it mean?
• Movement, not a specification
• Subjective term (like Web 2.0)
– Originally used in 1998
– Reintroduced at Rackspace to refer to non-RDBMS
• NoSQL != No SQL
• NoSQL == Not Only SQL ?
NoSQL Comment
RDBMS Legacy
• Efficient data storage
• Powerful querying capabilities (SQL)
• Support ACID Transactions
• Mature, well supported
• Ubiquitous
• Bottom-up design

• Storage is cheap
• O/R Impedance
• Complex to manage
• Always the bottleneck
• Who really needs transactions?
Rise of NoSQL
• Internet
• Google
• 2006 Bigtable whitepaper (Google)
– “a sparse, distributed multi-dimensional sorted map”
• 2007 Dynamo whitepaper (Amazon)
• 2008 Cassandra released (Facebook)
– “a BigTable data model running on an Amazon Dynamo-like infrastructure”
• 2009 Voldemort released (LinkedIn)
– “a big, distributed, persistent, fault-tolerant hash table”
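To make the Bigtable description more concrete, here is a minimal single-process sketch of a sparse, multi-dimensional sorted map in Python. The class name `SparseSortedMap` and its methods are hypothetical, chosen only for illustration; the "distributed" part of the quote is deliberately not modeled.

```python
from bisect import bisect_left, insort


class SparseSortedMap:
    """Toy model of a (row, column, timestamp) -> value map (hypothetical)."""

    def __init__(self):
        # Keys are kept sorted; the timestamp is negated so that the
        # newest version of a cell sorts first.
        self._keys = []
        self._values = {}

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)
        if key not in self._values:
            insort(self._keys, key)
        self._values[key] = value

    def get(self, row, column):
        """Return the most recent value stored for a cell, or None."""
        i = bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None

    def scan_row(self, row):
        """Yield (column, timestamp, value) for one row, in key order."""
        i = bisect_left(self._keys, (row,))
        while i < len(self._keys) and self._keys[i][0] == row:
            _, column, neg_ts = self._keys[i]
            yield column, -neg_ts, self._values[self._keys[i]]
            i += 1
```

Only cells that were written take up space (sparse), and because keys are sorted, reading one row is a contiguous range scan.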
No-SQL Offering
Windows Azure
Rise of NoSQL – Amazon
“There are many services on Amazon’s platform
that only need primary-key access to a data store.
For many services, such as those that provide best
seller lists, shopping carts, customer preferences,
session management, sales rank, and product
catalog, the common pattern of using a relational
database would lead to inefficiencies and limit scale
and availability. Dynamo provides a simple primary-key only interface to meet the requirements of these
applications.”
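A primary-key only interface can be sketched in a few lines. The Python below is an illustration, not Dynamo's actual API: the class `PartitionedKeyValueStore` is hypothetical, and it picks a node with a simple hash-modulo rule, whereas the Dynamo paper describes consistent hashing with replication.

```python
import hashlib


class PartitionedKeyValueStore:
    """Toy key-value store spread over in-memory 'nodes' (hypothetical)."""

    def __init__(self, node_count=3):
        # Each dict stands in for one storage node.
        self._nodes = [dict() for _ in range(node_count)]

    def _node_for(self, key):
        # A stable hash of the key alone decides where the data lives,
        # so no cross-node coordination is needed for a lookup.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        index = int.from_bytes(digest, "big") % len(self._nodes)
        return self._nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)

    def delete(self, key):
        self._node_for(key).pop(key, None)
```

Because every operation touches exactly one key, a shopping cart or session record can be served by a single node, which is what makes this access pattern easy to scale out.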
NoSQL Data Store Classifications
• Key-Value store
– Amazon SimpleDB, Amazon Dynamo (Amazon), Tokyo Cabinet, Voldemort (Gilt Groupe)
• Wide-column (sparse) store
– Hadoop (Yahoo, eBay), Cassandra (Facebook), Bigtable (Google!), Azure Table Storage (MSFT), Excel(!)
• Document database
– MongoDB, CouchDB (BBC), RavenDB
• Graph database
– Neo4J, InfoGrid
• Object database
– Db4o, Versant, Perst, Cache
• Data Grids
– Infinispan, GigaSpaces, Terracotta
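To compare the first four categories, the sketch below models the same small data set (two users, one "follows" relation) in each shape using plain Python structures. All names and values are made up for illustration and do not reflect any particular product's API.

```python
import json

# Key-value store: the value is an opaque blob, reachable only by its key.
key_value = {"user:42": json.dumps({"name": "Ana", "follows": [7]})}

# Wide-column (sparse) store: each row holds only the columns it needs.
wide_column = {
    "user:42": {"profile:name": "Ana", "follows:7": "1"},
    "user:7": {"profile:name": "Luis"},
}

# Document database: nested records that can be queried by field.
documents = [
    {"_id": 42, "name": "Ana", "follows": [7]},
    {"_id": 7, "name": "Luis", "follows": []},
]

# Graph database: nodes plus labeled edges between them.
nodes = {42: {"name": "Ana"}, 7: {"name": "Luis"}}
edges = [(42, "FOLLOWS", 7)]


def followers_of(user_id):
    """Walk the edge list to find who follows the given user."""
    return [
        nodes[src]["name"]
        for src, label, dst in edges
        if label == "FOLLOWS" and dst == user_id
    ]
```

The same question ("who follows user 7?") is a blob decode in the key-value shape, a column scan in the wide-column shape, a field filter in the document shape, and an edge traversal in the graph shape.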
Why NoSQL
Good
• Flexible (schema-less)
• Very scalable
• Scales over cheap hardware
• Reduces the need for a DBA
• Simple to use and operate
• Eventually consistent
• Cheap
• Suited to Web applications
Bad
• Immature
• No common standards
• No support
• No standard
• Poor transaction support
• Poor query support
• New mindset required
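"Eventually consistent" is part of the new mindset. The Python sketch below shows the idea with two in-memory replicas that converge after a sync step. The names `Replica` and `sync` are hypothetical, and the conflict rule is a simple highest-version-wins assumption; real systems use other schemes (the Dynamo paper, for example, describes vector clocks).

```python
class Replica:
    """One copy of the data; accepts writes independently (hypothetical)."""

    def __init__(self):
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        # Keep a write only if it is newer than what this replica holds.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange entries in both directions so the replicas converge."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)
```

Between a write and the next sync, a read from the other replica may return stale data; the application has to tolerate that window.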
NoSQL Use Cases
Good Examples
• Logging data
• Shopping carts
• Favourites
• Preferences
• Session data
• Mock data providers
• Temporary / working data
• Variable schema data
Stick with RDBMS
• Transactions (orders etc.)
• LOB applications
• Anything involving $$$
• Business-critical data
• Reporting
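Session and temporary data fit a key-value store because each record is read and written by one key and can simply expire. A minimal in-memory sketch in Python follows; the `SessionStore` class, its `ttl_seconds` parameter, and the injectable `clock` are assumptions made for illustration.

```python
import time


class SessionStore:
    """Toy session store with per-entry expiry (hypothetical)."""

    def __init__(self, ttl_seconds=1800, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._items = {}     # session_id -> (expires_at, data)

    def save(self, session_id, data):
        self._items[session_id] = (self._clock() + self._ttl, data)

    def load(self, session_id):
        entry = self._items.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            # Expired sessions are dropped lazily on access.
            del self._items[session_id]
            return None
        return data
```

Losing an expired or stale session is cheap to recover from, which is why this kind of data tolerates weaker guarantees than orders or payments.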
Real-world Examples
“As I described in an earlier blog post, the new BBC homepage has
been built on a whole new technical architecture. Since launching
we’ve found an issue with the service we use to save users’
customisation settings. Although we ran a public beta for more than 2
months, this problem only became apparent when we moved the
whole audience across to the new site, increasing the load on the
platform 20 times. Despite thorough load testing before launch we
were unable to accurately predict the type and combination of
customisations that users would perform, and as a result we now
need to re-architect the way we save your homepage customisation
settings in a more efficient way.”
Summary
• NoSQL is not a replacement for RDBMS
• No two scenarios are the same
• Use the best tool for the job
• Experiment
Not only SQL