MongoDB and Spring Data
Prepared for: THE JAVA™ METROPLEX USERS GROUP
August 8th, 2012
icfi.com

ICF IRONWORKS
Integrated Services

Interactive
Developing creative ideas and engaging audiences through Web, Mobile, and Social Media
• Social Media + Monitoring
• User + Industry Research
• Web Analytics
• Digital Strategy + Planning
• Mobile Strategy + Execution
• Search Marketing
• Information Architecture + Usability
• Creative Design
• Rich Media Development

Portal + Content Management
Building Internet-based systems to share content, knowledge, and data
• Enterprise Content Management
• Custom Application Development
• Systems Integration
• Portal
• Search
• Cloud Services
• E-Commerce
• Application + Platform Management

Business + IT Alignment
Developing practical strategies to help clients improve business performance
• Program + Portfolio Management
• Business Process Improvement
• IT Strategy and Roadmap
• Governance
• Technology Selection
• Business Intelligence

ICF IRONWORKS
Partnerships and Platform Expertise
ICF Ironworks has experience in the following market-leading platforms:
• Microsoft
• Ektron
• Autonomy Interwoven
• Oracle UCM and WebLogic Portal
• Alfresco
• SiteCore
• Percussion
• IBM WebSphere
We leverage our strategic partnerships to enhance the services we provide to our clients and to build on our sales pipeline.
ICF Ironworks is one of 34 Microsoft National Systems Integrators (NSI).

ICF IRONWORKS
Healthcare
Mfg/Retail/Distribution
Non-Profit/Assn
Financial
Government
Energy

Who Am I?
Java Solutions Architect with ICF Ironworks
Adjunct Professor
Started with HTML and Lotus Notes in 1992
• In the interim there was C, C++, VB, LotusScript, Perl, LabVIEW, etc.
Not so much an Early Adopter as a Fast Follower of Java Technologies
• Learned Java 1.1 in 1997, J2EE in 1999
Alphabet Soup (MCSE, ICAAD, ICASA, SCJP, SCJD, PMP, CSM)
LinkedIn: http://www.linkedin.com/in/iamjimmyray
Blog: http://jimmyraywv.blogspot.com/ (Avoiding Tech-sand)

Tonight’s Agenda
Quick introduction to NoSQL and MongoDB
• Configuration
• MongoVUE
Introduction to Spring Data and MongoDB support
• Spring Data and MongoDB configuration
• Templates
• Repositories
• Query Method Conventions
• Custom Finders
• Customizing Repositories
• Metadata Mapping (including nested docs and DBRef)
• Aggregation Functions
• GridFS File Storage
• Indexes

What is NoSQL?
Official: Not Only SQL
• In reality, it may or may not use SQL, at least in its truest form
• Varies from the traditional RDBMS approach of the last few decades
• Not necessarily a replacement for RDBMS; more of a solution for specific needs where an RDBMS is not a great fit
• Content Management (including CDNs), document storage, object storage, graph, etc.
It means different things to different folks.
• It really comes down to a different way to view our data domains for more effective storage, retrieval, and analysis

From NoSQL-Database.org
“NoSQL DEFINITION: Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable. The original intention has been modern web-scale databases. The movement began early 2009 and is growing rapidly.
Often more characteristics apply such as: schema-free, easy replication support, simple API, eventually consistent / BASE (not ACID), a huge amount of data and more.”

Some NoSQL Flavors
Document Centric
• MongoDB
• Couchbase
Wide Column/Column Families
• Cassandra
• Hadoop HBase
Key/Value Stores
• Redis
Object
• db4o
Other
• Lotus Notes/Domino
XML
• MarkLogic
Graph
• Neo4j

Why MongoDB
Open Source (written in C++)
Multiple platforms (Linux, Windows, Solaris, Mac OS X) and language drivers
Explicitly de-normalized
Document-centric and schema-less
Fast (low latency)
• Fast access to data
• Low CPU overhead
Ease of scalability (replica sets), auto-sharding
Manages complex and polymorphic data
Great for CDN and document-based SOA solutions
Great for location-based and geospatial data solutions

Why MongoDB (more)
Because its schema-less approach is more flexible, MongoDB is intrinsically ready for iterative (Agile) projects.
Eliminates the “impedance mismatch” seen with typical RDBMS solutions
If you are already familiar with JavaScript and JSON, this is an easy database to understand.

What is schema-less?
A.K.A. schema-free
It means that MongoDB does not enforce a column data type on the fields within your document, nor does it confine your document to specific columns defined in a table definition.
The schema is actually controlled via the application API layers and is implied by the “shape” (content) of your documents.
This means that different documents in the same collection can have different fields.
• So the schema is flexible in that way
• Only the _id field is mandatory in all documents.
Requires more rigor on the application side.
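
To make “more rigor on the application side” concrete, here is a minimal sketch in plain Java. It does not use the MongoDB driver: ordinary maps stand in for documents, and the class name, the doc helper, and the employee rule (lastName must be a String) are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Conceptual model only: plain JDK maps stand in for MongoDB documents. */
class SchemaLessSketch {

    /** Builds a document from alternating field names and values. */
    static Map<String, Object> doc(Object... namesAndValues) {
        Map<String, Object> d = new LinkedHashMap<>();
        for (int i = 0; i < namesAndValues.length; i += 2) {
            d.put((String) namesAndValues[i], namesAndValues[i + 1]);
        }
        return d;
    }

    /**
     * Hypothetical application-side rule: an employee needs an _id and a
     * String lastName. Nothing in a schema-less store enforces this, so the
     * application has to.
     */
    static List<String> validateEmployee(Map<String, Object> d) {
        List<String> problems = new ArrayList<>();
        if (!d.containsKey("_id")) {
            problems.add("missing _id");
        }
        if (!(d.get("lastName") instanceof String)) {
            problems.add("lastName must be a String");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Two documents of different shapes living in the same "collection".
        List<Map<String, Object>> employees = Arrays.asList(
            doc("_id", 1, "lastName", "Smith", "firstName", "John"),
            doc("_id", 2, "lastName", 42, "badgeColor", "blue"));
        for (Map<String, Object> e : employees) {
            System.out.println(e.get("_id") + " -> " + validateEmployee(e));
        }
    }
}
```

Both documents would be accepted by a schema-less store; only the application-level check notices that the second one has the wrong type for lastName.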

Why Not MongoDB
High speed and deterministic transactions:
• Banking and accounting
Where SQL is absolutely required
• Where Joins are needed
Traditional non-real-time data warehousing ops
If your organization lacks the controls and rigor to place schema and document definition at the application level without compromising data integrity

MongoDB
Was designed to overcome some of the performance shortcomings of RDBMS
Some Features
• Fast Querying
• In-place updates
• Full Index support (including compound indexes)
• Replication/High Availability (see CAP Theorem)
• Auto Sharding for scalability
• Aggregation, MapReduce
• GridFS

CAP Theorem
Consistency
Availability
Partition Tolerance (network partition tolerance)
You can never have all three, so you plan for two and make the best of the third.
• For example: Perhaps “eventual consistency” is OK for a CDN application.

Container Models: RDBMS vs. MongoDB
RDBMS: Servers > Databases > Schemas > Tables > Rows
• Joins
MongoDB: Servers > Databases > Collections > Documents
• No Joins; instead DB References, Nested Documents, de-normalization
• Embedding and Linking

MongoDB Collections
Schema-less
Can have up to 24000 (according to 10gen)
• Cheap to resource
Contain documents (…of varying shapes)

MongoDB Documents
JSON (what you see)
• Actually BSON (Internal - Binary JSON - http://bsonspec.org/)
Elements are name/value pairs
16 MB maximum size
What you see is what is stored
• No default fields (columns)

Why BSON?
Adds data types that JSON does not support
Optimized for performance
Lightweight binary encoding (designed to keep space overhead low; not a compression scheme)

MongoDB Install
Extract MongoDB
Build config file, or use startup script
Start the mongod (daemon) process
Use the shell (mongo) to access your database
Use MongoVUE for GUI access and to learn shell commands

Mongo Shell
In Windows, mongo.exe
Command-line interface to MongoDB (sort of like SQL*Plus for Oracle)

MongoVUE
GUI around the MongoDB shell
Makes it easy to learn MongoDB shell commands
• db.employee.find({ "lastName" : "Smith", "firstName" : "John" }).limit(50);
• show collections
Demo…

Web Admin Interface
localhost:28017
Quick stats viewer
Run commands
Demo

Spring Data
Large Spring project with many subprojects
• Category: Document Stores, Subproject MongoDB
“…aims to provide a familiar and consistent Spring-based programming model…”
Like other Spring projects, Spring Data is POJO-oriented
Provides a high-level API and access to the low-level API for managing MongoDB documents
Provides annotation-driven meta-mapping
Lets you dig into the depths of the lower-level API if you choose to hang out there

Spring Data MongoDB Templates
MongoTemplate implements the MongoOperations (mongoOps) interface
• mongoOps defines the basic set of MongoDB operations for the Spring Data API.
• Wraps the lower-level MongoDB API
Provides access to the lower-level API

Spring Data MongoDB Templates - Configuration
See mongo-config.xml

Spring Data MongoDB Templates - Configuration
Or…see the config class

Spring Data Repositories
Convenience for data access
• Spring does ALL the work
Convention over configuration
Hides complexities of Spring Data templates and underlying API
Builds implementation for you based on interface design
• Implementation is built during Spring container load.
Is typed (parameterized via generics) to the model objects you want to store.
• When extending MongoRepository
• Otherwise uses @RepositoryDefinition

Spring Data Meta Mapping
Annotation-driven mapping of model object fields to Spring Data elements in specific database parlance.

MongoDB DBRef
Optional
Used instead of nesting documents
The “referenced” document has to be saved first, so that the DBRef exists before it is added to the “parent” document

MongoDB Custom Spring Data Repositories
Hooks into the Spring Data bean type hierarchy, allowing you to add functionality to repositories
Important: You must write the implementation for this custom repository, using the class name that Spring Data expects (by default, the repository interface name plus the Impl suffix)
And…your Spring Data repository interface must extend this custom interface
Demo

MongoDB Advanced Queries
http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24all
Demo - $in, $nin, $gt, $all

MongoDB Aggregation Functions
Aggregation Framework
Map/Reduce
Distinct - Demo
Group - Demo
• Similar to SQL Group By function
Count

MongoDB GridFS
“…specification for storing large files in MongoDB.”
As the name implies, “Grid” allows the storage of very large files divided across multiple MongoDB documents.
• Uses native BSON binary formats
16MB per document
• Will be higher in future
Large files added to GridFS get chunked and spread across multiple documents.

MongoDB Indexes
Similar to RDBMS Indexes
Can have many
Can be compound
Makes searches, aggregates, and group functions faster
Makes writes slower
Sparse = true
• Only include documents in this index that actually contain a value in the indexed field.
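
The effect of Sparse = true can be pictured with a small plain-Java model. This is a conceptual sketch under stated assumptions, not how MongoDB builds indexes internally: a sorted map stands in for the index, and the class name, the doc helper, the nickname field, and the "<null>" placeholder key are all made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Conceptual model only: shows which documents a sparse index would contain. */
class SparseIndexSketch {

    /** Placeholder key for documents that lack the indexed field (illustrative). */
    static final String MISSING = "<null>";

    /** Builds a document from alternating field names and values. */
    static Map<String, Object> doc(Object... namesAndValues) {
        Map<String, Object> d = new LinkedHashMap<>();
        for (int i = 0; i < namesAndValues.length; i += 2) {
            d.put((String) namesAndValues[i], namesAndValues[i + 1]);
        }
        return d;
    }

    /**
     * Maps each value of the field to the _ids of the documents holding it.
     * With sparse == true, documents lacking the field are skipped entirely;
     * otherwise they are indexed under the MISSING placeholder.
     */
    static TreeMap<String, List<Object>> buildIndex(
            List<Map<String, Object>> docs, String field, boolean sparse) {
        TreeMap<String, List<Object>> index = new TreeMap<>();
        for (Map<String, Object> d : docs) {
            boolean hasField = d.containsKey(field);
            if (sparse && !hasField) {
                continue; // a sparse index has no entry for this document
            }
            String key = hasField ? String.valueOf(d.get(field)) : MISSING;
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(d.get("_id"));
        }
        return index;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> employees = Arrays.asList(
            doc("_id", 1, "nickname", "JJ"),
            doc("_id", 2),
            doc("_id", 3, "nickname", "Al"));
        System.out.println("sparse:     " + buildIndex(employees, "nickname", true));
        System.out.println("non-sparse: " + buildIndex(employees, "nickname", false));
    }
}
```

The document without a nickname shows up only in the non-sparse index, so a query answered purely from the sparse index would never return it.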

MongoDB Security
Default is trusted mode, no security
--auth
--keyFile
• Replica sets require this option when authentication is enabled

MongoDB Encryption
MongoDB does not support data encryption, per se
Use application-level encryption and store encrypted data in BSON fields
Or…use TDE (Transparent Data Encryption) from Gazzang

Helpful Links
Spring Data MongoDB - Reference Documentation: http://static.springsource.org/spring-data/data-mongodb/docs/1.0.2.RELEASE/reference/html/
http://nosql-database.org/
www.mongodb.org
http://www.mongodb.org/display/DOCS/Java+Language+Center
http://www.mongodb.org/display/DOCS/Books
http://openmymind.net/2011/3/28/The-Little-MongoDB-Book/
http://jimmyraywv.blogspot.com/2012/05/mongodb-and-spring-data.html
http://jimmyraywv.blogspot.com/2012/04/mongodb-jongo-and-morphia.html
https://www.10gen.com/presentations/webinar/online-conference-deep-dive-mongodb

Questions