Goodbye rows and tables, hello documents and collections
Lots of pretty pictures to fool you.
Noise
Introduction
MongoDB bridges the gap between key-value stores (which are fast and highly scalable) and traditional RDBMSs (which provide rich queries and deep functionality).
MongoDB is a document-oriented, schema-free, scalable, high-performance, open-source database written in C++.
Mongo is not a relational database like MySQL
Features
Document-oriented
Documents (objects) map nicely to programming language data types
Embedded documents and arrays reduce need for joins
No joins and no multi-document transactions for high performance and easy scalability
High performance
The absence of joins, together with embedding, makes reads and writes fast
Indexes including indexing of keys from embedded documents and arrays
High availability
Replicated servers with automatic master failover
Easy scalability
Automatic sharding (auto-partitioning of data across servers)
Reads and writes are distributed over shards
No joins or multi-document transactions make distributed queries easy and fast
Eventually-consistent reads can be distributed over replicated servers
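A minimal sketch of the auto-partitioning idea above, assuming range-based partitioning on a shard key. The split points, the shard names, and the helpers `shard_for` and `distribute` are hypothetical and are not MongoDB's API; a real cluster picks and rebalances the ranges itself.

```python
from bisect import bisect_right

# Hypothetical split points on the shard key (assumption for illustration):
#   key < "h"         -> shard0
#   "h" <= key < "p"  -> shard1
#   key >= "p"        -> shard2
SPLIT_POINTS = ["h", "p"]
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(key):
    """Return the shard whose key range contains `key`."""
    return SHARDS[bisect_right(SPLIT_POINTS, key)]


def distribute(docs, key_field):
    """Group documents by the shard their key falls into."""
    placement = {name: [] for name in SHARDS}
    for doc in docs:
        placement[shard_for(doc[key_field])].append(doc)
    return placement


docs = [{"name": "alice"}, {"name": "mike"}, {"name": "zoe"}, {"name": "bob"}]
placement = distribute(docs, "name")
```

Because each document lives on exactly one shard and no joins are needed, a read or write for a single key only has to reach the shard returned by `shard_for`.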
Why?
Cost - MongoDB is free
MongoDB is easy to install.
MongoDB supports various programming languages such as C, C++, Java, JavaScript, and PHP.
MongoDB is blazingly fast
MongoDB is schemaless
Ease of scale-out
If load increases, it can be distributed to other nodes across computer networks.
It's trivially easy to add more fields -- even
Limitations
Mongo is limited to a total data size of 2GB for all databases in 32-bit mode.
No referential integrity
Data size in MongoDB is typically higher.
At the moment Map/Reduce (e.g. for aggregations and data analysis) is OK, but not blisteringly fast.
Group By: limited to fewer than 10,000 keys.
For larger grouping operations without this limit, use map/reduce.
Lack of predefined schema is a double-edged sword
No support for joins or multi-document transactions
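The map/reduce approach to grouping mentioned in the list above can be sketched in plain Python. This is only an in-memory illustration of the technique, not MongoDB's implementation; `map_reduce`, `emit_total`, `sum_values`, and the sample `orders` data are hypothetical names made up for the example.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn):
    """Run map_fn over every document, then reduce the values per key."""
    groups = defaultdict(list)
    for doc in docs:
        # The map step emits zero or more (key, value) pairs per document.
        for key, value in map_fn(doc):
            groups[key].append(value)
    # The reduce step collapses all values emitted for a key into one result.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def emit_total(doc):
    """Map step: emit the order total under the customer's name."""
    yield doc["customer"], doc["total"]


def sum_values(key, values):
    """Reduce step: add up all totals for one customer."""
    return sum(values)


# Sample documents (assumed data, for illustration only).
orders = [
    {"customer": "ann", "total": 10},
    {"customer": "bob", "total": 5},
    {"customer": "ann", "total": 7},
]

totals = map_reduce(orders, emit_total, sum_values)
print(totals)  # {'ann': 17, 'bob': 5}
```

This is the equivalent of a SQL `GROUP BY customer` with `SUM(total)`, expressed as a map step and a reduce step.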
Benchmarking (MongoDB Vs. MySQL)
Record Structure
Field1 -> String, Indexed
Field2 -> String, Indexed
Field3 -> Date, Not Indexed
Field4 -> Integer, Indexed
[Chart: MySQL vs. MongoDB for Script 1 (Insert), Script 2 (Insert), and Script 3 (Select); vertical axis 0 to 25000]
Test machine configuration:
CPU : Intel Xeon 1.6 GHz - Quad Core, 64-bit
Memory : 8 GB RAM
OS : CentOS 5.2 - Kernel 2.6.18, 64-bit
Mongo data model
A Mongo system (see deployment above) holds a set of databases
A database holds a set of collections
A collection holds a set of documents
A document is a set of fields
A field is a key-value pair
A key is a name (string)
A value is a
basic type like string, integer, float, timestamp, binary, etc.,
a document, or
an array of values
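As a rough illustration of the data model above, a document can be pictured as a Python dictionary. The `post` document, its field names, and the `is_valid_value` / `is_valid_document` helpers are assumptions made up for this sketch; they only check the rules listed above (string keys; values that are basic types, documents, or arrays) and do not reflect how MongoDB validates BSON.

```python
import datetime

# A sample document: every field is a key-value pair with a string key.
post = {
    "title": "Intro to MongoDB",                     # string
    "views": 42,                                     # integer
    "rating": 4.5,                                   # float
    "created": datetime.datetime(2010, 1, 1),        # timestamp
    "author": {"name": "A. Writer"},                 # embedded document
    "tags": ["nosql", "mongodb"],                    # array of values
    "comments": [                                    # array of documents
        {"by": "reader1", "text": "Nice"},
        {"by": "reader2", "text": "Thanks"},
    ],
}

# Python stand-ins for the basic types named above (an assumption).
BASIC_TYPES = (str, int, float, bool, bytes, datetime.datetime, type(None))


def is_valid_value(value):
    """A value is a basic type, a document, or an array of values."""
    if isinstance(value, BASIC_TYPES):
        return True
    if isinstance(value, dict):
        return is_valid_document(value)
    if isinstance(value, list):
        return all(is_valid_value(item) for item in value)
    return False


def is_valid_document(doc):
    """A document is a set of fields whose keys are strings."""
    return all(
        isinstance(key, str) and is_valid_value(value)
        for key, value in doc.items()
    )
```

The embedded `author` document and the `comments` array are what would otherwise live in separate tables and be joined in a relational schema.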
MySQL Term -> Mongo Term
database -> database
table -> collection
index -> index
row -> BSON document
column -> BSON field
SQL to Mongo Mapping Chart
SQL Statement
Mongo Statement
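As one illustration of the SQL-to-Mongo mapping, a SQL `WHERE` clause becomes a query document. The sketch below evaluates such a query document against an in-memory list. The `matches` and `find` helpers and the `users` data are hypothetical; they mimic only equality and the `$gt`, `$gte`, `$lt`, `$lte` operators and are a simplification, not MongoDB's query engine.

```python
import operator

# SQL:   SELECT * FROM users WHERE age > 33 AND city = 'Pune'
# Mongo: db.users.find({age: {$gt: 33}, city: "Pune"})
query = {"age": {"$gt": 33}, "city": "Pune"}

# Comparison operators supported by this sketch.
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(doc, query):
    """Return True if every condition in the query document holds for doc."""
    for field, condition in query.items():
        if field not in doc:
            return False  # simplification: a missing field never matches
        value = doc[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gt": 33}
            if not all(OPERATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            # Plain form, e.g. "city": "Pune" means equality
            return False
    return True


def find(collection, query):
    """Return all documents in the list that match the query document."""
    return [doc for doc in collection if matches(doc, query)]


# Sample data (assumed, for illustration only).
users = [
    {"name": "a", "age": 40, "city": "Pune"},
    {"name": "b", "age": 25, "city": "Pune"},
    {"name": "c", "age": 50, "city": "Delhi"},
]

result = find(users, query)
```

Here `result` contains only user `a`, the one row the SQL statement in the comment would return.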
Replication / Sharding
Data Redundancy
Automated Failover
Distribute read load
Simplify maintenance (compared to "normal" master-slave)
Disaster recovery from user error
Automatic balancing for changes in load and data distribution
Easy addition of new machines
Scaling out to one thousand nodes
No single points of failure
Automatic failover
These slides are online:
http://amardeep.in/intro_to_mongodb.ppt