Transcript Column

NoSQL: what’s all the buzz about?
http://nosql-database.org/
Next generation databases are:
• Non-relational,
• Distributed,
• Open-source,
• Horizontally scalable
Often more characteristics apply:
schema-free, easy replication support, simple API,
eventually consistent / BASE (not ACID), a huge
amount of data
List of NoSQL databases [122+]
• Wide Column Store / Column Families
HBase, Cassandra, Hypertable, Cloudata, Cloudera, Amazon SimpleDB
• Document Stores
CouchDB, MongoDB, Terrastore, ThruDB, OrientDB, RavenDB, Citrusleaf, SisoDB
• Key Value / Tuple Store
Azure Table Storage, MEMBASE, Riak, Redis, Chordless, GenieDB, Scalaris, Tokyo Cabinet / Tyrant, Keyspace,
Berkeley DB, MemcacheDB, Faircom C-Tree, Mnesia, LightCloud, Hibari, HamsterDB, STSdb, Pincaster, RaptorDB
• Eventually Consistent Key Value Stores
Amazon Dynamo, Voldemort, Dynomite, KAI
• Graph Databases
Neo4J, Infinite Graph, Sones, InfoGrid, HyperGraphDB, Trinity, AllegroGraph, Bigdata, DEX, OpenLink Virtuoso,
VertexDB, FlockDB
• Object Databases
db4o, Versant, Objectivity, Gemstone, Progress, Starcounter, Perst, Caching, ZODB, NEO, PicoLisp, Sterling
• More and more databases
So what’s wrong with relational databases?
Main principles of RDBMS
• SQL
• ACID
• Atomic means “all or nothing”: either every operation
in a transaction takes effect or none does.
• Consistent means that data moves from one correct
state to another correct state, with no possibility that
readers could view different values that don’t make
sense together.
• Isolated means that transactions executing
concurrently will not become entangled with each
other.
• Durable means that once a transaction has succeeded,
the changes will not be lost.
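The "all or nothing" property can be illustrated with a small sketch using Python's built-in sqlite3 module. The accounts table, the names alice and bob, and the transfer helper are hypothetical and exist only for this illustration; the simulated failure shows the first update being rolled back.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate atomicity.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount, fail=False):
    """Move `amount` from src to dst inside one transaction."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        if fail:
            raise RuntimeError("simulated crash between the two updates")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )


def balances(conn):
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling transfer(conn, "alice", "bob", 30, fail=True) raises, and the balances stay as they were, because the debit is undone together with the missing credit.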
Shortcomings of RDBMS
• Transactions under heavy load
• Complexities of vertical scaling
• 2 phase commit (2PC) protocol
Sharding
If you can’t split it, you can’t scale it (Randy
Shoup, distinguished architect, eBay)
• Sharding approaches
• Feature-based shard or functional segmentation
• Key-based sharding
• Lookup table
• Shared-nothing or Cassandra-like sharding
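Two of the approaches listed above, key-based sharding and the lookup table, can be sketched roughly as follows. This is a minimal illustration, not how any particular database does it; the function and class names are hypothetical, and MD5 is used only as a stable hash, not for security.

```python
import hashlib


def key_based_shard(key: str, num_shards: int) -> int:
    """Key-based sharding: derive the shard number from a hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class LookupTableSharding:
    """Lookup-table sharding: an explicit key -> shard mapping (hypothetical helper)."""

    def __init__(self, default_shard: int):
        self.table = {}
        self.default_shard = default_shard

    def assign(self, key: str, shard: int) -> None:
        self.table[key] = shard

    def shard_for(self, key: str) -> int:
        # Keys without an explicit entry fall back to the default shard.
        return self.table.get(key, self.default_shard)
```

The trade-off in this sketch: the hash needs no extra state but moves most keys when num_shards changes, while the lookup table can move a single key at the cost of maintaining the table itself.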
The real question is not
“What’s wrong with relational
databases?” but rather,
“What problem do you have?”
Brewer’s CAP Theorem
• Consistency
• Availability
• Partition Tolerance
Brewer’s CAP Theorem
• Consistency + Availability
Relational: MySQL, Oracle, MSSQL
• Availability + Partition Tolerance
Amazon Dynamo derivatives: Cassandra, Voldemort, Riak, CouchDB
• Consistency + Partition Tolerance
Neo4j, Google Big Table and its derivatives: MongoDB, Redis, Hypertable
Cassandra in 50 words or less
Apache Cassandra is an open source, distributed,
decentralized, elastically scalable, highly available,
fault-tolerant, tuneably consistent, column-oriented
database that bases its distribution design on
Amazon’s Dynamo and its data model on Google’s
Bigtable. Created at Facebook, it is now used at
some of the most popular sites on the Web.
Cassandra case studies
Cassandra outlines
• BASE (Basically Available, Soft state, Eventual
consistency) rather than ACID (Atomicity,
Consistency, Isolation, Durability)
• Distributed and decentralized
• Elastic scalability
• High availability and fault tolerance
• Tunable consistency
Use cases for Cassandra
• Large deployments
• Lots of writes, statistics and analysis
• Geographical distribution
• Evolving applications
Writes
[Diagram: Write → Commit log and Memtable; at the threshold the Memtable is flushed to SSTables]
• No reads
• No seeks
• Fast
• Sequential disk access
• Atomic within a column family
• Any node
• Always writable (hinted handoff)
• ≈ 0.2 ms
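The write path on this slide can be sketched as a toy in-memory model: every write is appended to a commit log and applied to the memtable, and when the memtable reaches a threshold it is flushed as an immutable, sorted SSTable. The class name WritePath and the flush_threshold parameter are hypothetical; real Cassandra writes to disk and recycles the commit log, which this sketch does not model.

```python
class WritePath:
    """Toy model of the write path: commit log + memtable + flushed SSTables."""

    def __init__(self, flush_threshold: int = 3):
        self.commit_log = []      # append-only, stands in for the on-disk log
        self.memtable = {}        # in-memory key -> value
        self.sstables = []        # each entry is an immutable sorted tuple of rows
        self.flush_threshold = flush_threshold  # hypothetical knob for this sketch

    def write(self, key, value):
        # Sequential append only: nothing is read or looked up before writing.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # The memtable is written out sorted by key and then replaced.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
```

Because a write only appends and updates memory, it involves no reads and no seeks, which is the reason the slide gives for writes being fast.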
Reads
[Diagram: Read → Memtable and SSTables; each SSTable has a Bloom filter (Bf) and an index (Idx)]
• Bloom filter to determine whether a provided key is in the SSTable
• Index for quick reads
• Any node
• Read repair
• ≈ 15 ms
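A rough sketch of how a Bloom filter lets a read skip SSTables that cannot contain the key. All class and function names here are hypothetical, the dict named index merely stands in for the on-disk index, and a real read also merges columns by timestamp, which is left out.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for simplicity

    def _positions(self, key: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Toy SSTable: a Bloom filter in front of a key index."""

    def __init__(self, rows: dict):
        self.index = dict(rows)
        self.bloom = BloomFilter()
        for key in self.index:
            self.bloom.add(key)

    def get(self, key: str):
        if not self.bloom.might_contain(key):
            return None  # definitely absent, so the index is never consulted
        return self.index.get(key)


def read(key, memtable: dict, sstables: list):
    """Check the memtable first, then SSTables from newest to oldest."""
    if key in memtable:
        return memtable[key]
    for table in reversed(sstables):
        value = table.get(key)
        if value is not None:
            return value
    return None
```

A negative answer from the filter is always correct, so the only cost of a false positive is one unnecessary index lookup.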
The tenets of the column-oriented model
• Keyspace
Outer container that holds column families
(roughly like a database in the relational world)
• Column Family
Logical division that associates similar data
(very roughly analogous to a table in the
relational world)
• Column
Name/value pair (plus a client-supplied
timestamp of when it was last updated)
• Super Column Family
Container for super columns sorted by their
names
• Super Column
Structure with a name and a set of dependent
columns
Column Family\Column
Column
A name/value pair (it also contains a timestamp used for conflict resolution
on the server side)
name : byte[]
value : byte[]
timestamp : long
Column Family
A container for columns sorted by their names. Column Families are referenced and
sorted by row keys.
row key → column name 1 : column value 1, …, column name n : column value n
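The column structure above (name, value, timestamp) can be sketched as a small data class, together with timestamp-based conflict resolution where the newer write wins. The names Column, reconcile and sorted_row are hypothetical, and how exact timestamp ties are broken is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: bytes
    value: bytes
    timestamp: int  # client-supplied, e.g. microseconds since the epoch


def reconcile(a: Column, b: Column) -> Column:
    """Resolve two versions of the same column: the newer timestamp wins.

    Assumption: on an exact tie this sketch simply keeps `a`.
    """
    return a if a.timestamp >= b.timestamp else b


def sorted_row(columns):
    """Columns within a row are kept sorted by column name."""
    return sorted(columns, key=lambda c: c.name)
```

For example, reconciling an email column written at timestamp 1 with one written at timestamp 2 keeps the second version, whichever replica it came from.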
Super Column Family\Super Column
Super Column
A sorted associative array of columns.
super column name → column name 1 : column value 1, …, column name n : column value n
Super Column Family
A container for super columns sorted by their names. Like Column Families, Super Column Families
are referenced and sorted by row keys.
row key →
super column name 1 → column name 1 : column value 1, …, column name n1 : column value n1
…
super column name m → column name 1 : column value 1, …, column name nm : column value nm
Addressing Column Family
row key → column name 1 : column value 1, …, column name n : column value n
• Four-dimensional hash
• [Keyspace][ColumnFamily][Key][Column]
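The four-dimensional hash can be pictured as nested dictionaries, one level per dimension. This is only a mental model in plain Python, not a client API; the keyspace "Blog", the column family "Users" and the data are hypothetical.

```python
# Hypothetical sample data: [Keyspace][ColumnFamily][Key][Column] -> value
store = {
    "Blog": {                      # keyspace
        "Users": {                 # column family
            "alice": {             # row key
                "city": "Kyiv",    # column name -> column value
                "email": "alice@example.com",
            }
        }
    }
}


def get_column(store, keyspace, column_family, key, column):
    """Address a single column value through all four dimensions."""
    return store[keyspace][column_family][key][column]
```

Here get_column(store, "Blog", "Users", "alice", "email") walks keyspace, column family, row key and column name in that order.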
Addressing Super Column Family
row key →
super column name 1 → column name 1 : column value 1, …, column name n1 : column value n1
…
super column name m → column name 1 : column value 1, …, column name nm : column value nm
• Five-dimensional hash
• [Keyspace][ColumnFamily][Key][SuperColumn][SubColumn]
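With a super column family, one more level of nesting appears between the row key and the column. A minimal sketch with plain dictionaries follows; the keyspace "Blog", the super column family "Posts" and the helper names are hypothetical, and this is not a client API.

```python
# Hypothetical sample data:
# [Keyspace][ColumnFamily][Key][SuperColumn][SubColumn] -> value
store = {
    "Blog": {
        "Posts": {
            "alice": {
                "post-1": {"title": "Hello", "body": "First post"},
                "post-2": {"title": "Again", "body": "Second post"},
            }
        }
    }
}


def get_sub_column(store, keyspace, column_family, key, super_column, sub_column):
    """Address a single sub-column value through all five dimensions."""
    return store[keyspace][column_family][key][super_column][sub_column]


def get_super_column(store, keyspace, column_family, key, super_column):
    """Fetch a whole super column, with its sub-columns sorted by name."""
    columns = store[keyspace][column_family][key][super_column]
    return dict(sorted(columns.items()))
```

Stopping one level early, as get_super_column does, returns the whole set of dependent columns of a super column at once.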
Cassandra client options
Thrift (12 different languages)
Avro (data serialization system)
Java:
Hector: http://github.com/rantav/hector (abstraction over Thrift)
Pelops: http://github.com/s7/scale7-pelops (abstraction over Thrift)
CQL: SQL-like query language with a JDBC driver, available starting from Cassandra 0.8
Hector JPA: https://github.com/riptano/hector-jpa (ORM client)
Cassandrelle: http://demoiselle.sf.net/component/demoiselle-cassandra/ (documentation ???)
Kundera: http://code.google.com/p/kundera/ (buggy ???)
Python:
Pycassa, Telephus
Grails:
grails-cassandra
.NET:
Aquiles, FluentCassandra
Ruby:
Cassandra
PHP:
phpcassa, SimpleCassie
Cassandra\RDBMS query differences
• No update query
• Record-level atomicity on writes
• No duplicate keys
• Basic write properties: consistency level
(ZERO, ANY, ONE, QUORUM, ALL)
• Basic read properties: consistency level (ONE,
QUORUM, ALL)
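The consistency levels above translate into how many replicas must acknowledge an operation. A small sketch of the usual arithmetic: with replication factor N, QUORUM means N // 2 + 1 replicas, and a read is guaranteed to overlap the latest write when R + W > N. The function names are hypothetical, and treating ZERO and ANY as "no replica acknowledgement guaranteed" is a simplification made for this sketch.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas."""
    return replication_factor // 2 + 1


# Replica acknowledgements required per level, given replication factor n.
# Simplification: ZERO and ANY are modeled as needing no replica acknowledgement.
REPLICAS_REQUIRED = {
    "ZERO": lambda n: 0,
    "ANY": lambda n: 0,
    "ONE": lambda n: 1,
    "QUORUM": quorum,
    "ALL": lambda n: n,
}


def is_strongly_consistent(write_level: str, read_level: str, n: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    w = REPLICAS_REQUIRED[write_level](n)
    r = REPLICAS_REQUIRED[read_level](n)
    return r + w > n
```

With a replication factor of 3, QUORUM writes plus QUORUM reads overlap (2 + 2 > 3), while ONE plus ONE does not, which is the trade-off behind tunable consistency.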
Integrating with Hadoop
Hadoop (http://hadoop.apache.org) is a set of open source projects that deal with
large amounts of data in a distributed way.
• Hadoop Distributed File System (HDFS): a distributed file system that provides
high-throughput access to application data.
• Hadoop MapReduce: a software framework for distributed processing of large
data sets on compute clusters.
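The MapReduce model can be illustrated with a single-process word count. This sketch only mimics the map, shuffle and reduce phases in plain Python; it does not use the Hadoop API, and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(line: str):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

In a real cluster the map and reduce calls run in parallel on different machines; here they run sequentially, which keeps the data flow visible.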
Other Hadoop-related projects at Apache include:
• Cassandra™: a scalable multi-master database with no single points of failure.
• Hive™: a data warehouse infrastructure that provides data summarization and ad
hoc querying.
• Mahout™: a scalable machine learning and data mining library.
• Pig™: a high-level data-flow language and execution framework for parallel
computation.
The end
Questions?