WTT 2014 - WORKSHOP DE TENDÊNCIAS TECNOLÓGICAS 2014
Agenda
NoSQL Concepts
MongoDB Concepts
MongoDB Demos
NoSQL databases?
“NoSQL” = “No SQL” = not using a traditional relational DBMS
“No SQL” ≠ don’t use the SQL language
Alternative to traditional relational DBMS
+ Flexible schema
+ Quicker/cheaper to set up
+ Massive scalability
+ Relaxed consistency → higher performance & availability
– No declarative query language → more programming
– Relaxed consistency → fewer guarantees
NoSQL Systems
MapReduce framework: originally from Google; open-source implementation in Hadoop
Key-value stores: Google BigTable, Amazon Dynamo, Cassandra, HBase, …
Document Stores
• Data model: (key, document) pairs
• Document: JSON, XML, other semistructured formats
• CouchDB, MongoDB, SimpleDB, …
Graph database systems: Neo4j, FlockDB, Pregel, …
Big-Table Implementation: Teradata, Exadata, Greenplum, MonetDB, …
ACID versus BASE
ACID: Atomicity, Consistency, Isolation, Durability (traditional databases)
CAP: Strong Consistency + High Availability + Partition tolerance
The CAP theorem postulates that only two of these three aspects of scaling out can be fully achieved at the same time.
BASE: Basically Available, Soft state, Eventually consistent (many of the NoSQL databases)
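To make "eventually consistent" concrete, here is a minimal sketch in plain Python. It is a toy model and not how any particular NoSQL system replicates data: the `Replica` and `ReplicatedStore` classes and the explicit `replicate()` step are hypothetical names invented for illustration.

```python
class Replica:
    """One copy of the data (hypothetical toy class)."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class ReplicatedStore:
    """Writes go to the primary; the secondary catches up later."""

    def __init__(self):
        self.primary = Replica()
        self.secondary = Replica()
        self.pending = []  # writes not yet copied to the secondary

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Copy pending writes to the secondary, in order.
        for key, value in self.pending:
            self.secondary.data[key] = value
        self.pending.clear()


store = ReplicatedStore()
store.write("status", "A")
stale = store.secondary.read("status")  # secondary has not seen the write yet
store.replicate()
fresh = store.secondary.read("status")  # replicas have converged
```

Between `write()` and `replicate()` a reader on the secondary sees old data; that window is the price paid for accepting writes without waiting for every replica.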
Quiz. NoSQL Applications ?
[ ] Web Log Analysis URL, timestamp, number of accesses
[ ] Social-network graphs user1, user2, find friends of friends
[ ] Wikipedia Pages Large collections, structured and unstructured data
[ ] Twitter messages unstructured data
[ ] Blog maintenance unstructured data
[ ] Account credits and debits
MongoDB
MongoDB (from "humongous") is an open-source document database, and the leading NoSQL database. It is written in C++.
MongoDB features
• Document-Oriented Storage
• Querying
• Full Index Support
• Replication & High Availability
• Auto-Sharding
• Map/Reduce
• Geospatial support
• Text Search
DB-Engines Ranking
SQL to MongoDB Mapping Chart
JSON x SQL x BSON
JSON
JavaScript Object Notation. A human-readable, plain text format for expressing structured data with support in many programming languages.
SQL schema statement:
CREATE TABLE users (
    id INT NOT NULL AUTO_INCREMENT,
    user_id VARCHAR(30),
    age NUMBER,
    status CHAR(1),
    PRIMARY KEY (id)
)
MongoDB schema statement (the collection is created implicitly on the first insert):
db.users.insert( { user_id: "abc123", age: 55, status: "A" } )
BSON
A serialization format used to store documents and make remote procedure calls in
MongoDB. “BSON” is a portmanteau of the words “binary” and “JSON”.
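The contrast between the `users` table and the `users` document can be tried out with the Python standard library alone. This is a rough sketch: SQLite stands in for the relational side (so the column types differ slightly from the slide), and `json` stands in for the document side; it does not talk to MongoDB and does not produce BSON.

```python
import json
import sqlite3

# Relational side: the schema must exist before any row is inserted.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " user_id VARCHAR(30),"
    " age INTEGER,"
    " status CHAR(1))"
)
conn.execute(
    "INSERT INTO users (user_id, age, status) VALUES (?, ?, ?)",
    ("abc123", 55, "A"),
)
row = conn.execute("SELECT user_id, age, status FROM users").fetchone()

# Document side: the structure travels with the data itself.
user_doc = {"user_id": "abc123", "age": 55, "status": "A"}
user_json = json.dumps(user_doc)
```

Adding a field to `user_doc` needs no schema change, whereas the table would need an `ALTER TABLE` first.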
JSON Document Model
var p = {
  '_id': '3432',
  'author': DBRef('User', 2),
  'title': 'Introduction to MongoDB',
  'body': 'MongoDB is an open sources.. ',
  'timestamp': new Date('2012-04-01'),
  'tags': ['MongoDB', 'NoSQL'],
  'comments': [{'author': DBRef('User', 4),
                'date': new Date('2012-04-02'),
                'text': 'Did you see.. ',
                'upvotes': 7} /* , … */ ]
};
> db.posts.save(p);
Indexes
Create an index on any field in the document
// 1 means ascending, -1 means descending
> db.posts.ensureIndex({'author': 1});
// Index nested documents
> db.posts.ensureIndex({'comments.author': 1});
// Index on tags
> db.posts.ensureIndex({'tags': 1});
// Geospatial index
> db.posts.ensureIndex({'author.location': '2d'});
Queries?
// Find posts that have the 'MongoDB' tag.
> db.posts.find({tags: 'MongoDB'});
// Count posts that a given author has commented on.
> db.posts.find({'comments.author': DBRef('User', 2)}).count();
// Find posts written on or after 31 March 2012.
> db.posts.find({'timestamp': {'$gte': new Date('2012-03-31')}});
// Find posts written by authors near [22, 42].
> db.posts.find({'author.location': {'$near': [22, 42]}});
$gt, $lt, $gte, $lte, $ne, $all, $in, $nin, count, limit, skip, group, etc…
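To show what a query document means, the following sketch re-implements a few of the operators above over a plain Python list. It is an illustration of the semantics only, not MongoDB's implementation: `find`, `matches` and `field_matches` are hypothetical helpers, and the sketch ignores dotted paths, `$all`, and the way MongoDB treats missing fields.

```python
import operator

# A subset of the comparison operators, mapped to Python predicates.
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$ne": operator.ne,
    "$in": lambda value, allowed: value in allowed,
    "$nin": lambda value, banned: value not in banned,
}


def field_matches(value, condition):
    """Check one field value against a plain value or an operator document."""
    if isinstance(condition, dict):
        return all(OPERATORS[op](value, operand)
                   for op, operand in condition.items())
    if isinstance(value, list):
        # A plain value matches an array field if it is one of its elements.
        return condition in value
    return value == condition


def matches(doc, query):
    # Simplification: a document lacking a queried field never matches.
    return all(field in doc and field_matches(doc[field], condition)
               for field, condition in query.items())


def find(collection, query):
    return [doc for doc in collection if matches(doc, query)]


posts = [
    {"_id": 1, "tags": ["MongoDB", "NoSQL"], "upvotes": 7},
    {"_id": 2, "tags": ["SQL"], "upvotes": 2},
]
tagged = find(posts, {"tags": "MongoDB"})
mid_range = find(posts, {"upvotes": {"$gte": 2, "$lt": 7}})
popular = find(posts, {"upvotes": {"$in": [7, 9]}})
```

All conditions in a query document are combined with AND, which is why `mid_range` keeps only the post whose `upvotes` satisfies both bounds.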
Updates? Atomic Operations
> db.posts.update({_id: '3432'},
    {$set: {'title': 'Introduction to MongoDB (updated)',
            'text': 'Updated text'},
     $addToSet: {'tags': 'webinar'}});
$set, $unset
$push, $pull, $pop, $addToSet
$inc (a negative value decrements), many more…
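The effect of the update operators can be sketched on a plain Python dict. This is only an approximation of the semantics, under the assumption of top-level fields and single values; `apply_update` is a hypothetical helper, not part of MongoDB or pymongo, and it says nothing about atomicity.

```python
import copy


def apply_update(doc, update):
    """Return a copy of doc with a subset of update operators applied."""
    result = copy.deepcopy(doc)
    for field, value in update.get("$set", {}).items():
        result[field] = value
    for field in update.get("$unset", {}):
        result.pop(field, None)
    for field, amount in update.get("$inc", {}).items():
        result[field] = result.get(field, 0) + amount
    for field, value in update.get("$push", {}).items():
        result.setdefault(field, []).append(value)
    for field, value in update.get("$addToSet", {}).items():
        items = result.setdefault(field, [])
        if value not in items:  # only add when not already present
            items.append(value)
    for field, value in update.get("$pull", {}).items():
        if field in result:
            result[field] = [item for item in result[field] if item != value]
    return result


post = {"_id": "3432", "title": "Introduction to MongoDB",
        "tags": ["MongoDB", "NoSQL"], "upvotes": 7}
change = {
    "$set": {"title": "Introduction to MongoDB (updated)"},
    "$addToSet": {"tags": "webinar"},
    "$inc": {"upvotes": -1},  # a negative increment decrements
}
updated = apply_update(post, change)
updated_twice = apply_update(updated, {"$addToSet": {"tags": "webinar"}})
```

Applying `$addToSet` a second time leaves `tags` unchanged, which is what distinguishes it from `$push`.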
MongoDB (as of 2014) does not support multi-document TRANSACTIONS! Operations on a single document are atomic.
Some Cool features, but not in this lab
• Geo-spatial Indexes for Geo-spatial queries.
$near, $maxDistance, $geoWithin; bounded queries ($center for a circle, $box for a box)
• Map/Reduce
GROUP BY in SQL, map/reduce in MongoDB.
• GridFS
Stores Large Binary Files.
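The "GROUP BY in SQL, map/reduce in MongoDB" point above can be sketched in plain Python. This shows the general map/reduce pattern rather than MongoDB's `mapReduce` command; the sample posts and the `map_post`, `reduce_counts` and `map_reduce` names are made up for illustration. It answers the same question as `SELECT tag, COUNT(*) ... GROUP BY tag`.

```python
from collections import defaultdict

posts = [
    {"title": "Introduction to MongoDB", "tags": ["MongoDB", "NoSQL"]},
    {"title": "Hypothetical second post", "tags": ["NoSQL"]},
]


def map_post(post):
    """Map step: emit one (key, value) pair per tag."""
    for tag in post["tags"]:
        yield tag, 1


def reduce_counts(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(docs, mapper, reducer):
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            grouped[key].append(value)  # group emitted values by key
    return {key: reducer(key, values) for key, values in grouped.items()}


tag_counts = map_reduce(posts, map_post, reduce_counts)
```

The grouping step in the middle is what a framework distributes across machines; the map and reduce functions are the only parts the programmer writes.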
Demo: my bond girls database
Demo: mongodb install
http://www.mongodb.org/downloads
http://docs.mongodb.org/manual/tutorial/install-mongodb-on-windows/
C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data
Demo: mongodb as a Service https://mongolab.com/welcome/
Demo: pymongo
Demo: mongolab, create an account
Demo: after creating a collection, create a document
Demo: find( ) with Python
Demo: find( ) with mongo shell
Demo: load a .CSV collection with mongo shell
Demo: results of loading a .CSV collection
Demo: find( ) a document with mongo shell
Demo: find( ) a document with Python
Demo: find( ) select a field
References
The Little MongoDB Book, by Karl Seguin: http://openmymind.net/mongodb.pdf
MongoDB:
http://docs.mongodb.org/manual/reference/sql-comparison/
http://www.mongodb.org/downloads
http://docs.mongodb.org/manual/installation/
http://docs.mongodb.org/manual/tutorial/getting-started-with-the-mongo-shell/
http://docs.mongodb.org/manual/
https://www.mongodb.com/reference
https://university.mongodb.com
MongoDB-as-a-Service:
https://mongolab.com/welcome/
https://www.mongohq.com/
PyMongo, the MongoDB driver for Python:
http://api.mongodb.org/python/current/installation.html
http://api.mongodb.org/python/current/tutorial.html
Easy Install, to install PyMongo on Windows: https://pypi.python.org/pypi/setuptools
Conclusion: When to use MongoDB
Schema-less
Writes: db.createCollection('logs', {capped: true, size: 1048576})
NO TRANSACTIONS
Data Processing ~ MapReduce
Geospatial
Full Text Search
NoSQL + tools and maturity
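The capped `logs` collection in the list above behaves like a fixed-size buffer: once it is full, the oldest entries are overwritten. A loose analogy in plain Python uses a bounded `deque`. Note the assumptions: this sketch bounds the number of entries (`MAX_ENTRIES` is a hypothetical value), whereas the `size` option of a capped collection is a limit in bytes.

```python
from collections import deque

MAX_ENTRIES = 3  # hypothetical bound; a capped collection's size is in bytes

# A deque with maxlen silently drops the oldest item when it is full.
logs = deque(maxlen=MAX_ENTRIES)

for n in range(1, 6):
    logs.append({"seq": n, "msg": f"request {n}"})

kept = [entry["seq"] for entry in logs]  # entries still present, oldest first
```

This is why capped collections suit write-heavy logging: inserts stay cheap and old data ages out without a separate cleanup job.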
Prof. Dr. Rogério de Oliveira roger.oliveira@mackenzie.br
http://meusite.mackenzie.br/rogerio