
Goodbye rows and tables, hello documents and collections
Lots of pretty pictures to fool you.
Noise
Introduction
MongoDB bridges the gap between key-value stores (which are fast and highly scalable) and traditional
RDBMS systems (which provide rich queries and deep functionality).
MongoDB is a document-oriented, schema-free, scalable, high-performance, open-source database written in C++.
Mongo is not a relational database like MySQL
Features
Document-oriented

Documents (objects) map nicely to programming language data types

Embedded documents and arrays reduce need for joins

No joins and no multi-document transactions for high performance and easy scalability
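
To illustrate how embedding reduces the need for joins, here is a minimal sketch in plain Python (no MongoDB server involved). The post and comment data, and the helper names, are hypothetical and exist only for illustration.

```python
# Relational-style layout: two row sets linked by a foreign key.
posts_rows = [{"id": 10, "title": "Hello"}]
comments_rows = [
    {"post_id": 10, "text": "Nice"},
    {"post_id": 10, "text": "Thanks"},
]


def load_post_with_join(post_id):
    """Assemble a post by scanning both row sets (a manual join)."""
    post = next(p for p in posts_rows if p["id"] == post_id)
    return {
        "title": post["title"],
        "comments": [c["text"] for c in comments_rows if c["post_id"] == post_id],
    }


# Document-style layout: the comments are embedded in the post itself.
post_document = {
    "_id": 10,
    "title": "Hello",
    "comments": [{"text": "Nice"}, {"text": "Thanks"}],
}


def load_post_embedded(doc):
    """Read the same information from a single document, with no join."""
    return {"title": doc["title"], "comments": [c["text"] for c in doc["comments"]]}


joined = load_post_with_join(10)
embedded = load_post_embedded(post_document)
print(joined == embedded)
```

Both functions return the same data; the embedded version reads it from one document instead of combining two row sets.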

High performance

Embedding and the absence of joins make reads and writes fast

Indexes, including indexes on keys from embedded documents and arrays
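
The idea of indexing keys inside embedded documents and arrays can be sketched with a toy in-memory index. This is not MongoDB's index implementation; `get_values`, `build_index`, and the sample documents are hypothetical names and data chosen for this sketch. A dotted path such as "address.city" is assumed to address a key in an embedded document.

```python
from collections import defaultdict


def get_values(doc, dotted_key):
    """Return every value reached by a dotted key path, fanning out over arrays."""
    current = [doc]
    for part in dotted_key.split("."):
        next_level = []
        for item in current:
            candidates = item if isinstance(item, list) else [item]
            for candidate in candidates:
                if isinstance(candidate, dict) and part in candidate:
                    next_level.append(candidate[part])
        current = next_level
    # Flatten arrays at the end so that each element gets its own index entry.
    values = []
    for value in current:
        if isinstance(value, list):
            values.extend(value)
        else:
            values.append(value)
    return values


def build_index(documents, dotted_key):
    """Map each indexed value to the _id of every document containing it."""
    index = defaultdict(list)
    for doc in documents:
        for value in get_values(doc, dotted_key):
            index[value].append(doc["_id"])
    return index


docs = [
    {"_id": 1, "address": {"city": "Pune"}, "tags": ["db", "nosql"]},
    {"_id": 2, "address": {"city": "Delhi"}, "tags": ["nosql"]},
]

city_index = build_index(docs, "address.city")
tag_index = build_index(docs, "tags")
print(city_index["Pune"], tag_index["nosql"])
```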
High availability

Replicated servers with automatic master failover
Easy scalability

Automatic sharding (auto-partitioning of data across servers)

Reads and writes are distributed over shards

The absence of joins and multi-document transactions makes distributed queries easy and fast

Eventually-consistent reads can be distributed over replicated servers
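
Auto-partitioning of data across servers can be pictured with a small range-based routing sketch. This is a simplified illustration, not MongoDB's balancer: the shard names, the split points, and the choice of "username" as shard key are all assumptions made for the example.

```python
import bisect

# Hypothetical split points on the shard key:
# keys below "h" go to shard0, keys from "h" up to (not including) "p" go to
# shard1, and everything from "p" upward goes to shard2.
SPLIT_POINTS = ["h", "p"]
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(key):
    """Pick the shard whose key range contains the given shard-key value."""
    return SHARDS[bisect.bisect_right(SPLIT_POINTS, key)]


def route_writes(documents, shard_key):
    """Distribute documents over shards according to their shard-key value."""
    placement = {name: [] for name in SHARDS}
    for doc in documents:
        placement[shard_for(doc[shard_key])].append(doc)
    return placement


users = [{"username": "alice"}, {"username": "mike"}, {"username": "zed"}]
placement = route_writes(users, "username")
for shard, docs_on_shard in placement.items():
    print(shard, [d["username"] for d in docs_on_shard])
```

A query on a single shard-key value only needs to contact the one shard returned by `shard_for`, which is how reads and writes end up spread over the shards.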

Why?

Cost - MongoDB is free

MongoDB is easy to install.

MongoDB supports various programming
languages, such as C, C++, Java, JavaScript, and PHP.

MongoDB is blazingly fast

MongoDB is schemaless

Ease of scale-out
If load increases, it can be distributed to other
nodes across the network.

It's trivially easy to add more fields -- even
Limitations
Mongo is limited to a total data size of 2GB for all databases in 32-bit
mode.


No referential integrity

Data size in MongoDB is typically higher than in an RDBMS, because every document stores its own field names.

At the moment Map/Reduce (e.g. to do aggregations/data analysis) is OK,
but not blisteringly fast.

Group By: limited to fewer than 10,000 keys.
For larger grouping operations without this limit, use map/reduce.
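
As a rough picture of how a grouping operation is expressed as map/reduce, here is a minimal pure-Python sketch. It mimics only the shape of the technique (a map step that emits key/value pairs and a reduce step that combines values per key); it does not call MongoDB, and the `orders` data and function names are hypothetical.

```python
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    """Group emitted values by key, then reduce each group to a single result."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


orders = [
    {"customer": "a", "amount": 10},
    {"customer": "b", "amount": 5},
    {"customer": "a", "amount": 7},
]


def emit_amount(doc):
    """Map step: emit one (customer, amount) pair per order."""
    yield doc["customer"], doc["amount"]


def sum_values(key, values):
    """Reduce step: total the amounts emitted for one customer."""
    return sum(values)


totals = map_reduce(orders, emit_amount, sum_values)
print(totals)
```

This is the equivalent of a SQL `GROUP BY customer` with `SUM(amount)`.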

Lack of predefined schema is a double-edged sword

No support for Joins & transactions
Benchmarking (MongoDB Vs. MySQL)
Record Structure
Field1 -> String, Indexed
Field2 -> String, Indexed
Field3 -> Date, Not Indexed
Field4 -> Integer, Indexed
[Chart: MySQL vs. MongoDB results for Script 1 (Insert), Script 2 (Insert), and Script 3 (Select); y-axis from 0 to 25000; plotted values not recoverable]
Test Machine configuration:
CPU: Intel Xeon 1.6 GHz, quad core, 64-bit
Memory: 8 GB RAM
OS: CentOS 5.2, kernel 2.6.18, 64-bit
Mongo data model
A Mongo system (see deployment above) holds a set of databases
A database holds a set of collections
A collection holds a set of documents
A document is a set of fields
A field is a key-value pair
A key is a name (string)
A value is a

basic type like string, integer, float, timestamp, binary, etc.,

a document, or

an array of values
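
The hierarchy above maps naturally onto nested Python structures. The following is a sketch under stated assumptions: the names "shop" and "products", the sample fields, and the `is_valid_value` helper are hypothetical, and the list of basic types is an approximation for illustration rather than the full BSON type set.

```python
from datetime import datetime

# A document is a set of fields; each field is a key-value pair.
document = {
    "name": "Widget",                # string
    "price": 9.99,                   # float
    "created": datetime(2010, 1, 1), # timestamp
    "dimensions": {"w": 2, "h": 3},  # embedded document
    "tags": ["tools", "small"],      # array of values
}

collection = [document]              # a collection holds a set of documents
database = {"products": collection}  # a database holds a set of collections
system = {"shop": database}          # a Mongo system holds a set of databases

BASIC_TYPES = (str, int, float, bool, bytes, datetime, type(None))


def is_valid_value(value):
    """Check the recursive rule: basic type, document, or array of values."""
    if isinstance(value, BASIC_TYPES):
        return True
    if isinstance(value, dict):
        # Keys must be names (strings); values follow the same rule.
        return all(isinstance(k, str) and is_valid_value(v) for k, v in value.items())
    if isinstance(value, list):
        return all(is_valid_value(v) for v in value)
    return False


print(is_valid_value(document))
```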

MySQL Term    Mongo Term
database      database
table         collection
index         index
row           BSON document
column        BSON field
SQL to Mongo Mapping Chart
SQL Statement
Mongo Statement
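
As a sketch of the kind of correspondence such a chart covers, the snippet below pairs a few simple SQL statements with query documents of the shape MongoDB uses for `find` (for example, `WHERE age > 33` corresponds to `{"age": {"$gt": 33}}`). The `matches` and `find` functions are a toy evaluator written for this example so it runs without a server; they are not MongoDB's query engine and support only equality and the four comparison operators shown. The `users` data is hypothetical.

```python
import operator

# Comparison operators supported by this toy evaluator.
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(doc, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if value is None or not OPERATORS[op](value, operand):
                    return False
        elif value != condition:
            return False
    return True


def find(collection, query=None):
    """Toy stand-in for a collection's find: filter documents by a query."""
    return [doc for doc in collection if matches(doc, query or {})]


users = [
    {"name": "ann", "age": 33},
    {"name": "bob", "age": 40},
    {"name": "cy", "age": 25},
]

# SELECT * FROM users
everyone = find(users)
# SELECT * FROM users WHERE age = 33
exactly_33 = find(users, {"age": 33})
# SELECT * FROM users WHERE age > 33
over_33 = find(users, {"age": {"$gt": 33}})

print(len(everyone), [u["name"] for u in exactly_33], [u["name"] for u in over_33])
```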
Replication / Sharding
Replication
Data redundancy
Automated failover
Distribute read load
Simplify maintenance (compared to "normal" master-slave)
Disaster recovery from user error

Sharding
Automatic balancing for changes in load and data distribution
Easy addition of new machines
Scaling out to one thousand nodes
No single points of failure
Automatic failover

These slides are online:
http://amardeep.in/intro_to_mongodb.ppt