Goodbye rows and tables, hello documents and collections
Lots of pretty pictures to fool you.

Introduction
- MongoDB bridges the gap between key-value stores (which are fast and highly scalable) and traditional RDBMSs (which provide rich queries and deep functionality).
- MongoDB is document-oriented, schema-free, scalable, high-performance, and open source.
- Written in C++.
- Mongo is not a relational database like MySQL.

Features
- Document-oriented
  - Documents (objects) map nicely to programming language data types
  - Embedded documents and arrays reduce the need for joins
  - No joins and no multi-document transactions, for high performance and easy scalability
- High performance
  - Embedding and the absence of joins make reads and writes fast
  - Indexes, including indexes on keys from embedded documents and arrays
- High availability
  - Replicated servers with automatic master failover
- Easy scalability
  - Automatic sharding (auto-partitioning of data across servers)
  - Reads and writes are distributed over shards
  - No joins or multi-document transactions make distributed queries easy and fast
  - Eventually-consistent reads can be distributed over replicated servers

Why?
- Cost: MongoDB is free
- MongoDB is easy to install
- MongoDB supports various programming languages, such as C, C++, Java, JavaScript, and PHP
- MongoDB is blazingly fast
- MongoDB is schemaless
- Ease of scale-out: if load increases, it can be distributed to other nodes across computer networks
- It's trivially easy to add more fields -- even

Limitations
- Mongo is limited to a total data size of 2 GB for all databases in 32-bit mode
- No referential integrity
- Stored data size is typically larger in MongoDB, since every document stores its own field names
- At the moment, map/reduce (e.g. for aggregations and data analysis) is OK, but not blisteringly fast
- Group By is limited to fewer than 10,000 keys; for larger grouping operations, use map/reduce
- Lack of a predefined schema is a double-edged sword
- No support for joins or multi-document transactions

Benchmarking (MongoDB vs. MySQL)
Record structure:
- Field1 -> String, indexed
- Field2 -> String, indexed
- Field3 -> Date, not indexed
- Field4 -> Integer, indexed
[Chart: MySQL vs. MongoDB on Script 1 (Insert), Script 2 (Insert), and Script 3 (Select); y-axis from 0 to 25000]
Test machine configuration:
- CPU: Intel Xeon 1.6 GHz, quad core, 64-bit
- Memory: 8 GB RAM
- OS: CentOS 5.2, kernel 2.6.18, 64-bit

Mongo data model
- A Mongo system (see deployment above) holds a set of databases
- A database holds a set of collections
- A collection holds a set of documents
- A document is a set of fields
- A field is a key-value pair
- A key is a name (string)
- A value is a basic type (string, integer, float, timestamp, binary, etc.), a document, or an array of values

MySQL term    Mongo term
database      database
table         collection
index         index
row           BSON document
column        BSON field

SQL to Mongo Mapping Chart
SQL Statement    Mongo Statement

Replication / Sharding
Replication:
- Data redundancy
- Automated failover
- Distribute read load
- Simplify maintenance (compared to "normal" master-slave)
- Disaster recovery from user error
Sharding:
- Automatic balancing for changes in load and data distribution
- Easy addition of new machines
- Scaling out to one thousand nodes
- No single points of failure
- Automatic failover

These slides are online: http://amardeep.in/intro_to_mongodb.ppt
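
Appendix: the data model as code
The data model described in these slides (system -> database -> collection -> document -> field) can be sketched with plain Python dictionaries and lists. This is only an illustration: it does not use MongoDB or any driver, and every name in it (system, blog, posts, find_equal, and the sample values) is hypothetical and not taken from the slides.

```python
# Pure-Python stand-in for the Mongo data model; no MongoDB server or driver is used.
# All names and sample values are hypothetical.

# A system holds databases; a database holds collections;
# a collection holds documents; a document is a set of key-value fields.
system = {
    "blog": {                                   # database
        "posts": [                              # collection
            {                                   # document
                "title": "Hello documents",     # field: key -> basic value
                "views": 42,
                "author": {"name": "alice"},    # value that is an embedded document
                "tags": ["mongodb", "nosql"],   # value that is an array
                "comments": [                   # array of embedded documents
                    {"by": "bob", "text": "Nice"},
                ],
            },
            {                                   # same collection, different fields
                "title": "Goodbye tables",
                "views": 7,
                "author": {"name": "alice"},
                "tags": ["sql"],
            },
        ]
    }
}


def find_equal(collection, key, value):
    """Return the documents whose top-level field `key` equals `value`."""
    return [doc for doc in collection if doc.get(key) == value]


posts = system["blog"]["posts"]
matches = find_equal(posts, "views", 42)
print(matches[0]["title"])
print(matches[0]["author"]["name"])
```

The author and comments live inside the post document, so reading a post needs no join. The second document has no comments field at all, which is what "schema-free" means in practice.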