in 10 minutes

Mohannad El Dafrawy
Sara Rodriguez
Lino Valdivia Jr

What is MongoDB?
• Document database
  o Data is structured as schema-less JSON documents
• One of the most popular NoSQL solutions
• Cross-platform and open source
  o Written in C++
  o Supports Windows, Linux, Mac OS X, Solaris

Features (I)
• Document-based storage and querying
  o Queries themselves are JSON documents
• Full index support
  o Allows indexing on any attribute, just like in a traditional SQL solution
• Replication and high availability
  o Supports mirroring of data for scalability

Features (II)
• Auto-sharding (horizontal scaling)
  o Large data sets can be divided and distributed over multiple shards
• Fast in-place updates
  o Update operations are atomic for contention-free performance
• Integrated map/reduce framework
  o Can perform map/reduce operations on top of the data

History
• First developed by 10gen (later MongoDB, Inc.) in 2007
• Name comes from "humongous"
• Became open source in 2009
• Latest stable release: 2.4.9 (Jan 2014)

Basic Ideas
• Collections of JSON objects
• Embed objects within a single document
• Flexible schema
• References

  {
    _id: 1234,
    author: { name: "Bob Jones", email: "b@b.com" },
    post: "In these troubled times I like to ...",
    date: { $date: "2014-03-12 13:23UTC" },
    location: [ -121.2322, 48.1223222 ],
    rating: 2.2,
    comments: [
      { user: "lalal@hotmail.com", upVotes: 22, downVotes: 14, text: "Great point! I agree" },
      { user: "pedro@gmail.com", upVotes: 421, downVotes: 22, text: "You are a..." }
    ],
    tags: [ "databases", "mongo" ]
  }

Query Example

  db.posts.find({ "author.name": "mike" })
  db.posts.find({ rating: { $gt: 2 } })
  db.posts.find({ tags: "software" })
  db.posts.find().sort({ date: -1 }).limit(10)

  // select * from posts where 'economy' in tags order by ts DESC limit 10
  db.posts.find({ tags: "economy" }).sort({ ts: -1 }).limit(10);

http://try.mongodb.org/

Note on internals
• Documents stored as BSON (Binary JSON), see http://bsonspec.org
• Memory-mapped files
• Indexes are B-Trees

  {_id: ObjectId(XXXXXXXXX), hello: "world"}

  \x27\x00\x00\x00 \x07 _id\x00 XXXXXXXXXXXX \x02 hello\x00 \x06\x00\x00\x00 world\x00 \x00

Cassandra (1.2) vs MongoDB (2.2)

Cassandra (1.2)
Best used:
• When you write more than you read (logging).
• If every component of the system must be in Java.
• If you require Availability + Partition Tolerance
For example: Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is data analysis.

MongoDB (2.2)
Best used:
• If you need dynamic queries.
• If you prefer to define indexes, not map/reduce functions.
• If you need good performance on a big DB.
• If you require Consistency + Partition Tolerance
For example: For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.

source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

Why (and why not) MongoDB?
Why:
• If you need dynamic queries
• If you need good performance on a big DB
• If you wanted CouchDB, but your data changes too much, filling up disks
• If you prefer to define indexes, not map/reduce functions

Why not:
• It doesn't support SQL
• It doesn't have any built-in revisioning like CouchDB
• It lacks transactions, so if you're a bank, don't use it
• It doesn't have real full text searching features

Production Users
• Archiving - Craigslist
• Content Management - MTV Networks
• E-Commerce - CustomInk
• Real-time Analytics - Intuit
• Social Networking - Foursquare

Long-term goals for MongoDB
To add new features such as:
• Natural language processing
• Full text search engine
• More real-time search in data

Personal conclusion
• Getting up to speed with MongoDB (document oriented and schema free)
• Advanced usage (tons of features)
• Administration (easy to admin, replication, sharding)
• Advanced usage (index and aggregation)
• BSON and memory-mapped files
• There are times where not all clients can read or write: CP (Consistency and Partition Tolerance)

References
• MongoDB.org (https://www.mongodb.org/)
• Wikipedia: MongoDB (http://en.wikipedia.org/wiki/MongoDB)
• DB-Engines Ranking (http://db-engines.com/en/ranking)
• Interview about the future of MongoDB (http://strata.oreilly.com/2012/11/the-future-of-mongodb.html)
• MongoDB Inside and Outside by Kyle Banker (http://vimeo.com/13211523)
• How This Web Site Uses MongoDB (http://www.businessinsider.com/how-we-use-mongodb-2009-11)
• Cassandra and MongoDB comparison (http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis)
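
Appendix sketch: how the Query Example filters behave

The queries on the Query Example slide rely on three behaviours: dotted paths into embedded documents, comparison operators such as $gt, and matching a single value against an array field. The following is a minimal in-memory sketch of those semantics in plain JavaScript. It is not how MongoDB implements matching; the helper names (getPath, matchesCondition, matches) and the sample posts are hypothetical and exist only for illustration.

```javascript
// Toy in-memory matcher, NOT MongoDB's implementation. It mimics three
// behaviours used on the Query Example slide: dotted field paths,
// the $gt operator, and matching a scalar against an array field.

// Resolve a dotted path such as "author.name" inside a document.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Check one field value against one query condition.
function matchesCondition(value, condition) {
  const isOperatorObject =
    condition !== null &&
    typeof condition === "object" &&
    !Array.isArray(condition) &&
    Object.keys(condition).some((k) => k.startsWith("$"));

  if (isOperatorObject) {
    return Object.entries(condition).every(([op, operand]) => {
      if (op === "$gt") return typeof value === typeof operand && value > operand;
      throw new Error("operator not supported in this sketch: " + op);
    });
  }
  // A scalar condition matches an array field if any element equals it.
  if (Array.isArray(value)) return value.includes(condition);
  return value === condition;
}

// A document matches when every field in the query matches.
function matches(doc, query) {
  return Object.entries(query).every(([path, condition]) =>
    matchesCondition(getPath(doc, path), condition)
  );
}

// Hypothetical sample data shaped like the post on the Basic Ideas slide.
const posts = [
  { _id: 1, author: { name: "mike" }, rating: 2.2, tags: ["databases", "mongo"] },
  { _id: 2, author: { name: "bob" }, rating: 1.5, tags: ["software"] },
  { _id: 3, author: { name: "mike" }, rating: 4.0, tags: ["software", "economy"] },
];

const byAuthor = posts.filter((p) => matches(p, { "author.name": "mike" }));
const highRated = posts.filter((p) => matches(p, { rating: { $gt: 2 } }));
const tagged = posts.filter((p) => matches(p, { tags: "software" }));

console.log(byAuthor.map((p) => p._id));
console.log(highRated.map((p) => p._id));
console.log(tagged.map((p) => p._id));
```

Note that the dotted key has to be quoted ("author.name") to be valid JavaScript, both here and in the MongoDB shell.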
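
Appendix sketch: the BSON byte layout

The byte string on the Note on internals slide follows the layout described at bsonspec.org: a little-endian int32 total length, a sequence of typed elements, and a closing zero byte. As a rough illustration, the sketch below hand-encodes a document whose values are all strings, using Node.js Buffer. It covers only the string type (0x02) and is not a replacement for a real BSON library; the function names (int32le, encodeStringElement, encodeDocument) are made up for this example.

```javascript
// Hand-rolled encoder for a tiny subset of BSON: a flat document whose
// values are all strings. Layout (per bsonspec.org):
//   document       := int32 total length, elements, 0x00
//   string element := 0x02, key as C string,
//                     int32 byte length (including the trailing NUL),
//                     UTF-8 bytes, 0x00

// Encode a number as a 4-byte little-endian signed integer.
function int32le(n) {
  const buf = Buffer.alloc(4);
  buf.writeInt32LE(n, 0);
  return buf;
}

// Encode one key/value pair as a BSON string element (type 0x02).
function encodeStringElement(key, value) {
  const keyBytes = Buffer.from(key + "\0", "utf8");
  const valueBytes = Buffer.from(value + "\0", "utf8");
  return Buffer.concat([
    Buffer.from([0x02]),
    keyBytes,
    int32le(valueBytes.length),
    valueBytes,
  ]);
}

// Encode a flat, string-only document.
function encodeDocument(doc) {
  const body = Buffer.concat(
    Object.entries(doc).map(([key, value]) => {
      if (typeof value !== "string") {
        throw new TypeError("only string values are supported in this sketch");
      }
      return encodeStringElement(key, value);
    })
  );
  // The length prefix counts itself (4 bytes) and the final 0x00.
  const total = 4 + body.length + 1;
  return Buffer.concat([int32le(total), body, Buffer.from([0x00])]);
}

const encoded = encodeDocument({ hello: "world" });
console.log(encoded.toString("hex"));
```

For { hello: "world" } this yields 22 bytes, so the length prefix is \x16. The slide's version also carries an _id element (type 0x07, the key "_id" plus a 12-byte ObjectId), which adds 17 bytes and gives the \x27 prefix shown there.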