MongoDB Introduction Presentation - agile

MongoDB
Introduction
© 2014 - Zoran Maksimovic www.agile-code.com
MongoDB is a scalable, high-performance, open source, schema-free, document-oriented database
History
• 2007 - First developed (by 10gen)
• 2009 - Became open source
• 2010 - Considered production ready (v1.4 and later)
• 2013 - MongoDB closes $150 million in funding
• 2014 - Latest stable version (v2.6)
• Today - More than $231 million in total investment since 2007
• MongoDB Inc. valued at $1.2B
NoSQL Breakdown
• NoSQL encompasses a wide variety of database technologies that were developed in response to a rise in the volume of data
• Document databases pair each key with a complex data structure known as a document (MongoDB, Couchbase Server, CouchDB)
• Key-value stores are the simplest NoSQL databases. Every single item in the database is stored as an attribute name (or "key"), together with its value (DynamoDB, Windows Azure Table Storage, Riak, Redis, LevelDB, Dynomite)
• Wide-column stores such as Cassandra and HBase are optimized for queries over
large datasets, and store columns of data together, instead of rows.
• Graph stores are used to store information about networks, such as social
connections. Graph stores include Neo4J and HyperGraphDB.
NoSQL made by big vendors
• Oracle NoSQL Database (Key-Value store)
• Microsoft Azure Table Storage (Key-Value store)
• Google: BigTable (proprietary)
• Google: LevelDB (Open Source key-value store)
• Amazon: SimpleDB (Wide Column store)
• Amazon: DynamoDB (Key-Value store)
• Apache: HBase (Wide column store), …
• Basho: Riak (Key-Value store)
• Facebook: Cassandra (Wide column store, now an Apache project)
MongoDB in a nutshell
• Document-Oriented Storage » JSON-style documents with dynamic schemas offer simplicity and power.
• Full Index Support » Index on any attribute, just like you're used to.
• Replication & High Availability » Mirror across LANs and WANs for scale and peace of mind.
• Auto-Sharding » Scale horizontally without compromising functionality.
• Querying » Rich, document-based queries.
• Fast In-Place Updates » Atomic modifiers for contention-free performance.
• Map/Reduce » Flexible aggregation and data processing.
• GridFS » Store files of any size without complicating your stack.
• MongoDB Management Service » Monitoring and backup designed for MongoDB.
• Professional Support by MongoDB » Enterprise class support, training, and consulting available.
MongoDB is a Document oriented database
• Think of “documents” as database records. No Schema!
• Documents are basically just JSON objects that Mongo stores in
binary (BSON) format
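To make the "No Schema" point concrete, here is a minimal sketch in plain JavaScript, with no MongoDB driver involved. The collection is simply an array, and the field names (name, tags, address) are invented for illustration.

```javascript
// Two "documents" that could live in the same collection.
// Field names are hypothetical; nothing forces them to match.
const people = [
  { _id: 1, name: "Tom", tags: ["admin", "dev"] },
  { _id: 2, name: "Nick", address: { city: "Zurich", zip: "8000" } }
];

// Each document carries its own structure, so code that reads the
// collection has to cope with fields that may be absent.
const fieldsPerDocument = people.map(doc => Object.keys(doc));
console.log(fieldsPerDocument);

// Documents are JSON-like: they round-trip through JSON text.
// (MongoDB itself stores BSON, a binary encoding with extra types.)
const roundTripped = JSON.parse(JSON.stringify(people[1]));
console.log(roundTripped.address.city);
```

The second document nests an object and the first nests an array, which is the kind of structure a relational row would need extra tables for.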
MongoDB database structure
Embedded Data Model
When to use:
• “Contains” relationships between entities.
• One-to-many relationships between entities. In these relationships the “many” or child documents always appear with or are viewed in the context of the “one” or parent documents.
• Retrieving data in one query
• Data redundancy.
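A possible shape for an embedded document, sketched in plain JavaScript. The post/comments example and all field names are assumptions made for illustration; the lookup function stands in for a single query against a collection.

```javascript
// Embedded model: the "many" side (comments) lives inside the "one" (post).
// All names here are hypothetical.
const post = {
  _id: 100,
  title: "MongoDB Introduction",
  comments: [
    { author: "Tom", text: "Nice overview" },
    { author: "Nick", text: "Thanks" }
  ]
};

// One lookup returns the parent together with all of its children.
function findPostWithComments(posts, id) {
  return posts.find(p => p._id === id);
}

const found = findPostWithComments([post], 100);
console.log(found.comments.length); // 2
```

The trade-off is the redundancy mentioned above: if the same comment author data were embedded in many posts, each copy would have to be updated separately.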
Document oriented database – Normalized data model
When to use:
• When embedding would result in duplication of data but would not
provide sufficient read performance advantages to outweigh the
implications of the duplication.
• To represent more complex many-to-many relationships.
• To model large hierarchical data sets.
• Multiple queries!
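The "Multiple queries!" cost can be sketched in plain JavaScript as follows. The books/publishers data and the publisherId field are hypothetical, and each array lookup stands in for one round trip to the database.

```javascript
// Normalized model: books reference a publisher by id instead of embedding it.
// All names and values are hypothetical.
const publishers = [{ _id: "pub1", name: "Example Press" }];
const books = [
  { _id: 1, title: "Book A", publisherId: "pub1" },
  { _id: 2, title: "Book B", publisherId: "pub1" }
];

function findBookWithPublisher(bookId) {
  // Query 1: fetch the book itself.
  const book = books.find(b => b._id === bookId);
  if (!book) return null;
  // Query 2: follow the reference to fetch the publisher.
  const publisher = publishers.find(p => p._id === book.publisherId);
  return { ...book, publisher };
}

const result = findBookWithPublisher(2);
console.log(result.title, "/", result.publisher.name);
```

The publisher is stored once, so there is no duplicated data to keep in sync, at the price of the second lookup.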
May, 14 2014
Zoran Maksimovic www.agile-code.com
Indexing
• All indexes in MongoDB are B-Tree indexes
• Index Types:
• Single field index
• Compound index: more than one field in the collection
• Multikey index: index on array fields
• Geospatial index and queries
• Text index: index on string content, for text search
• TTL (time to live) index: contains entries for a limited time
• Unique index: the entry in the field has to be unique
• Sparse index: stores an index entry only for entities with the given field
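The differences between single field, multikey, unique and sparse indexes can be illustrated with a toy index in plain JavaScript. This is a conceptual sketch only: it uses a Map rather than a B-tree, buildIndex is a hypothetical helper, and nothing here reflects MongoDB's internal implementation.

```javascript
// Toy index: maps a field value to the ids of the documents that hold it.
function buildIndex(docs, field, { sparse = false, unique = false } = {}) {
  const index = new Map();
  for (const doc of docs) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    // Sparse: documents without the field get no index entry at all.
    if (!hasField && sparse) continue;
    const value = hasField ? doc[field] : null;
    // Multikey: an array field yields one index entry per element.
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      // Unique: a second document with the same key is rejected.
      if (unique && index.has(key)) {
        throw new Error(`duplicate key: ${key}`);
      }
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

const docs = [
  { _id: 1, type: "food", tags: ["fruit", "fresh"] },
  { _id: 2, type: "snacks", tags: ["fresh"] },
  { _id: 3, type: "food" }
];

const byType = buildIndex(docs, "type");                 // single field
const byTag = buildIndex(docs, "tags", { sparse: true }); // multikey + sparse
console.log(byType.get("food")); // [ 1, 3 ]
console.log(byTag.get("fresh")); // [ 1, 2 ]
```

Document 3 has no tags field, so the sparse index simply does not mention it.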
Security
• Authentication:
• MongoDB’s default username/password authentication
• x509 certificate authentication
• LDAP proxy authentication
• Kerberos authentication
• Authorization
• Role based access control
Replication
• Replication provides redundancy and increases data availability
Sharding (Horizontal scaling)
• Sharding is a method for storing data across multiple machines
• When HDD, CPU or RAM limits are reached.
• Vertical Scaling vs Horizontal Scaling.
• Range based vs Hash based sharding
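Range based and hash based sharding differ in how a shard key is mapped to a machine. The sketch below shows the idea in plain JavaScript under simplifying assumptions: the shard names and ranges are invented, FNV-1a is a stand-in hash (MongoDB uses its own hashing), and picking a shard by modulo is a simplification of how hashed keys are actually distributed.

```javascript
// Range based: each shard owns a contiguous range of shard-key values.
// min is inclusive, max is exclusive. Ranges are hypothetical.
const ranges = [
  { shard: "shardA", min: 0, max: 1000 },
  { shard: "shardB", min: 1000, max: 2000 },
  { shard: "shardC", min: 2000, max: Infinity }
];

function rangeShard(key) {
  const match = ranges.find(r => key >= r.min && key < r.max);
  return match ? match.shard : null;
}

// Hash based: the shard is chosen from a hash of the key.
// FNV-1a (32-bit) is used here only as an example hash function.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return hash;
}

const shards = ["shardA", "shardB", "shardC"];

function hashShard(key) {
  return shards[fnv1a(String(key)) % shards.length];
}

// Neighbouring keys stay together with ranges...
console.log(rangeShard(1001), rangeShard(1002));
// ...while hashing decides each key independently of its neighbours.
console.log(hashShard(1001), hashShard(1002));
```

Range based sharding keeps range queries local to few shards; hash based sharding tends to spread writes more evenly when keys increase monotonically.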
How to access MongoDB?
Drivers: http://docs.mongodb.org/ecosystem/drivers/downloads
Administration interfaces: http://docs.mongodb.org/ecosystem/tools/administration-interfaces
C# code example
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

// Entity mapped to documents in the "entities" collection
public class Entity
{
    public ObjectId Id { get; set; }
    public string Name { get; set; }
}

// Connect and get the collection (1.x C# driver API)
var connectionString = "mongodb://localhost";
var client = new MongoClient(connectionString);
var server = client.GetServer();
var database = server.GetDatabase("test");
var collection = database.GetCollection<Entity>("entities");

// Insert a new entity
// Stored document: { _id: "13098098", Name: "Tom" }
var entity = new Entity { Name = "Tom" };
collection.Insert(entity);
var id = entity.Id; // the Id is populated by Insert

// Retrieve
var query = Query<Entity>.EQ(e => e.Id, id);
entity = collection.FindOne(query);

// Save (Update) -> Sends the full content of the entity to be updated.
// Document sent: { _id: "13098098", Name: "Nick" }
entity.Name = "Nick";
collection.Save(entity);

// Update -> Sends partial content of the entity to be updated.
// Document before the update: { _id: "13098098", Name: "Nick" }
var update = Update<Entity>.Set(e => e.Name, "Harry");
collection.Update(query, update);

// Deleting the entity
collection.Remove(query);
Some of the MongoDB Shell methods
• db.inventory.find( { type: "snacks" } )
• db.inventory.find( { type: 'food', price: { $lt: 9.95 } } )
• db.inventory.insert( { _id: 10, type: "misc", item: "card", qty: 15 } )
• db.inventory.find( { type: 'food' } ).explain()
{
  "cursor": "BtreeCursor type_1",
  "isMultiKey": false,
  "n": 5,
  "nscannedObjects": 5,
  "nscanned": 5,
  "nscannedObjectsAllPlans": 5,
  "nscannedAllPlans": 5,
  "scanAndOrder": false,
  "indexOnly": false,
  "nYields": 0,
  "nChunkSkips": 0,
  "millis": 0,
  "indexBounds": { "type": [ [ "food", "food" ] ] },
  "server": "mongodbo0.example.net:27017"
}
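How a query document such as { type: 'food', price: { $lt: 9.95 } } selects documents can be sketched with a toy matcher in plain JavaScript. This is an illustration only: matches and find are hypothetical helpers that support just equality and $lt, the inventory data is invented, and a real server would use indexes rather than scanning an array.

```javascript
// Toy matcher: every field in the query must match the document.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    if (condition !== null && typeof condition === "object") {
      // Operator form, e.g. { $lt: 9.95 }
      return Object.entries(condition).every(([op, operand]) => {
        if (op === "$lt") return doc[field] < operand;
        throw new Error(`unsupported operator: ${op}`);
      });
    }
    // Plain value: simple equality.
    return doc[field] === condition;
  });
}

function find(collection, query) {
  return collection.filter(doc => matches(doc, query));
}

// Hypothetical sample data.
const inventory = [
  { _id: 1, type: "food", item: "apple", price: 1.5 },
  { _id: 2, type: "food", item: "cake", price: 12.0 },
  { _id: 3, type: "snacks", item: "chips", price: 2.0 }
];

const cheapFood = find(inventory, { type: "food", price: { $lt: 9.95 } });
console.log(cheapFood.map(doc => doc._id)); // [ 1 ]
```

Conditions on several fields are combined with a logical AND, which is why the cake (food, but too expensive) and the chips (cheap, but not food) are both excluded.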
What is missing (from the RDBMS perspective)
• No JOIN support
• No complex transaction support
• No constraint support (constraints have to be implemented at the application level)
Where/When to use?
• Main drivers:
• Large amounts of data (Twitter: ~12TB of data per day!)
• Easier development (according to surveys): avoids the impedance mismatch problem
• In general:
• Content Management and Delivery: serve content, as well as the associated metadata (attachments, images, binary)
• Big Data: data too diverse, fast-changing, or massive… These include a wide variety of apps such as genomics, clickstream analysis, customer sentiment analysis, log data collection etc…
• Analytics and Reporting (data warehouse)
• Market Data Management
Problems
• Maturity!!!
• Skillset?
• Organizational change?
• What about the future?
Q&A