TriJUG.2013-01-21.v1 - the Triangle Java Users Group

MongoDB and Spring Data
Prepared for:
THE TRIANGLE JAVA USERS GROUP
January 21st, 2013
icfi.com |
1
ICF IRONWORKS
Integrated Services
Interactive
Developing creative ideas and engaging audiences through Web, Mobile, and Social Media
• Social Media + Monitoring
• User + Industry Research
• Web Analytics
• Digital Strategy + Planning
• Mobile Strategy + Execution
• Search Marketing
• Information Architecture + Usability
• Creative Design
• Rich Media Development
Portal + Content Management
Building Internet-based systems to share content, knowledge, and data
• Enterprise Content Management
• Custom Application Development
• Systems Integration
• Portal
• Search
• Cloud Services
• E-Commerce
• Application + Platform Management
Business + IT Alignment
Developing practical strategies to help clients improve business performance
• Program + Portfolio Management
• Business Process Improvement
• IT Strategy and Roadmap
• Governance
• Technology Selection
• Business Intelligence
ICF IRONWORKS
Partnerships and Platform Expertise
▪ ICF Ironworks has experience in the following market-leading platforms:
• Microsoft
• Ektron
• Autonomy Interwoven
• Oracle UCM and WebLogic Portal
• Alfresco
• SiteCore
• Percussion
• IBM WebSphere
▪ We leverage our strategic partnerships to enhance the services we provide to our clients and to build on our sales pipeline
▪ ICF Ironworks is one of 34 Microsoft National Systems Integrators (NSI)
ICF IRONWORKS
Healthcare
Mfg/Retail/Distribution
Non-Profit/Assn
Financial
Government
Energy
Who Am I?
▪ Solutions Architect with ICF Ironworks
▪ Part-time Adjunct Professor
▪ Started with HTML and Lotus Notes in 1992
• In the interim there was C, C++, VB, LotusScript, Perl, LabVIEW, etc.
▪ Not so much an Early Adopter as a Fast Follower of Java technologies
▪ Alphabet Soup (MCSE, ICAAD, ICASA, SCJP, SCJD, PMP, CSM)
▪ LinkedIn: http://www.linkedin.com/in/iamjimmyray
▪ Blog: http://jimmyraywv.blogspot.com/ (Avoiding Tech-sand)
MongoDB and Spring Data
Tonight’s Agenda
▪ Quick introduction to NoSQL and MongoDB
• Configuration
• MongoVUE
▪ Introduction to Spring Data and MongoDB support
• Spring Data and MongoDB configuration
• Templates
• Repositories
• Query Method Conventions
• Custom Finders
• Customizing Repositories
• Metadata Mapping (including nested docs and DBRef)
• Aggregation Functions
• GridFS File Storage
• Indexes
What is NoSQL?
▪ Official: Not Only SQL
• In reality, it may or may not use SQL*, at least in its truest form
• Varies from the traditional RDBMS approach of the last few decades
• Not necessarily a replacement for RDBMS; more of a solution for more specific needs where an RDBMS is not a great fit
• Content Management (including CDNs), document storage, object storage, graph, etc.
▪ It means different things to different folks.
• It really comes down to a different way to view our data domains for more effective storage, retrieval, and analysis
From NoSQL-Database.org
“NoSQL DEFINITION: Next Generation Databases mostly
addressing some of the points: being non-relational, distributed,
open-source and horizontally scalable. The original intention has
been modern web-scale databases. The movement began early
2009 and is growing rapidly. Often more characteristics apply such
as: schema-free, easy replication support, simple API, eventually
consistent / BASE (not ACID), a huge amount of data and more.”
Some NoSQL Flavors
▪ Document Centric
• MongoDB
• Couchbase
▪ Wide Column/Column Families
• Cassandra
• Hadoop HBase
▪ Key/Value Stores
• Redis
▪ Object
• DB4O
▪ Other
• Lotus Notes/Domino
▪ XML
• MarkLogic
▪ Graph
• Neo4j
Why MongoDB
▪ Open Source (written in C++)
▪ Multiple platforms (Linux, Win, Solaris, Apple) and Language Drivers
▪ Explicitly de-normalized
▪ Document-centric and Schema-less
▪ Fast (low latency)
• Fast access to data
• Low CPU overhead
▪ Ease of scalability (replica sets), auto-sharding
▪ Manages complex and polymorphic data
▪ Great for CDN and document-based SOA solutions
▪ Great for location-based and geospatial data solutions
Why MongoDB (more)
▪ Because its schema-less approach is more flexible, MongoDB is intrinsically ready for iterative (Agile) projects.
▪ Eliminates the “impedance mismatch” of typical RDBMS solutions
▪ “How do I model my application in 3NF?”
▪ If you are already familiar with JavaScript and JSON, this is an easy database to understand.
What is schema-less?
▪ A.K.A. schema-free
▪ It means that MongoDB does not enforce a column data type on the fields within your document, nor does it confine your document to specific columns defined in a table definition.
▪ The schema is actually controlled via the application API layers and is implied by the “shape” (content) of your documents.
▪ This means that different documents in the same collection can have different fields.
• So the schema is flexible in that way
• Only the _id field is mandatory in all documents.
▪ Requires more rigor on the application side.
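As a rough illustration of "schema implied by shape", here is a minimal Java sketch that uses plain maps as documents. It is a toy in-memory stand-in, not the MongoDB driver API: the class name SchemalessCollection and the sample fields are assumptions made up for this example, and rejecting a missing _id is a simplification (real drivers typically generate one for you).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy in-memory stand-in for one collection; this is not the MongoDB driver API.
final class SchemalessCollection {

    // Each "document" is a map of field names to values of any type.
    private final List<Map<String, Object>> docs = new ArrayList<>();

    // The only rule enforced here: every document must carry an _id.
    void insert(Map<String, Object> doc) {
        if (!doc.containsKey("_id")) {
            throw new IllegalArgumentException("_id is mandatory");
        }
        docs.add(doc);
    }

    // The "schema" is implied by the union of the shapes actually stored.
    Set<String> impliedFields() {
        Set<String> fields = new TreeSet<>();
        for (Map<String, Object> doc : docs) {
            fields.addAll(doc.keySet());
        }
        return fields;
    }

    public static void main(String[] args) {
        SchemalessCollection employees = new SchemalessCollection();

        Map<String, Object> first = new LinkedHashMap<>();
        first.put("_id", 1);
        first.put("lastName", "Smith");

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("_id", 2);
        second.put("lastName", "Jones");
        second.put("nickname", "JJ"); // a field the first document lacks

        employees.insert(first);
        employees.insert(second);
        System.out.println(employees.impliedFields());
    }
}
```

Nothing stops the two documents from disagreeing about their fields, which is why the extra rigor has to live in the application.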
Why Not MongoDB
▪ High speed and deterministic transactions:
• Banking and accounting
▪ Where SQL is absolutely required
• Where Joins are needed
▪ Traditional non-real-time data warehousing ops
▪ If your organization lacks the controls and rigor to place schema and document definition at the application level without compromising data integrity
MongoDB
▪ Was designed to overcome some of the performance shortcomings of RDBMS
▪ Some Features
• Fast Querying (atomic operations, embedded data)
• In place updates (physical writes lag in-memory changes)
• Full Index support (including compound indexes)
• Replication/High Availability (see CAP Theorem)
• Auto Sharding (range-based partitioning, based on shard key) for scalability
• Aggregation, MapReduce
• GridFS
MongoDB – In Place Updates
▪ Physical disk writes lag in-memory changes.
• Multiple writes in memory can occur before the object is updated on disk
▪ MongoDB uses an adaptive allocation algorithm for storing its objects.
• If an object changes and fits in its current location, it stays there.
• However, if it is now larger, it is moved to a new location. This moving is expensive for index updates
• MongoDB looks at collections and, based on how many times items grow within a collection, calculates a padding factor that tries to account for object growth
• This minimizes object relocation
MongoDB – A Word About Sharding…
▪ Need to choose the right key
• Easily divisible (“splittable”; see cardinality) so that MongoDB can distribute data among shards
• “all documents that have the same value in the state field must reside on the same shard” – 10gen
• Enables distributed write operations between cluster nodes
• Prevents single-shard bottlenecking
• Makes it possible for mongos to return most query operations from a single mongod instance
• “users will generally have a unique value for this field, MongoDB will be able to split as many chunks as needed” – 10gen
MongoDB – Cardinality…
▪ You want higher cardinality to allow chunks of data to be split among shards
• Example: Address data components
• State – Low Cardinality
• ZipCode – Potentially low or high, depending on population
• Phone Number – High Cardinality
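To make the address example concrete, the following Java sketch counts distinct values per field over a handful of rows. The class name CardinalitySketch and the sample addresses are invented for illustration; a real shard-key decision would look at the actual data and query patterns, not just a distinct count.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CardinalitySketch {

    // Cardinality here means the number of distinct values seen in one column.
    static int cardinality(List<String[]> rows, int column) {
        Set<String> distinct = new HashSet<>();
        for (String[] row : rows) {
            distinct.add(row[column]);
        }
        return distinct.size();
    }

    public static void main(String[] args) {
        // Hypothetical rows: { state, zipCode, phoneNumber }
        List<String[]> addresses = Arrays.asList(
                new String[] {"NC", "27601", "919-555-0101"},
                new String[] {"NC", "27601", "919-555-0102"},
                new String[] {"NC", "27513", "919-555-0103"},
                new String[] {"VA", "23219", "804-555-0104"});

        System.out.println("state:   " + cardinality(addresses, 0));
        System.out.println("zipCode: " + cardinality(addresses, 1));
        System.out.println("phone:   " + cardinality(addresses, 2));
    }
}
```

The fewer distinct values a candidate key has, the fewer places there are to split chunks, which is why a field like state makes a poor shard key on its own.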
CAP Theorem
▪ Consistency
▪ Availability
▪ Partition Tolerance (network partition tolerance)
▪ You can never have all three, so you plan for two and make the best of the third.
• For example: Perhaps “eventual consistency” is OK for a CDN application.
• For large scalability, you would need partitioning. That leaves C & A to choose from
• Would you ever choose consistency over availability?
Container Models: RDBMS vs. MongoDB
▪ RDBMS: Servers > Databases > Schemas > Tables > Rows
• Joins, Group By, ACID
▪ MongoDB: Servers > Databases > Collections > Documents
• No Joins
• Instead: Db References (Linking) and Nested Documents (Embedding)
MongoDB Collections
▪ Schema-less
▪ Can have up to 24000 (according to 10gen)
• Cheap to resource
▪ Contain documents (…of varying shapes)
• 100 nesting levels (version 2.2)
▪ Are namespaces, like indexes
▪ Can be “Capped”
• Limited in max size with rotating overwrites of oldest entries
• Example: oplog
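The "rotating overwrites" behavior of a capped collection can be sketched in Java with a bounded queue. This is only an analogy: the class CappedCollectionSketch is hypothetical, and it caps by document count, whereas the slide describes a cap on maximum size.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of a capped collection: fixed capacity, oldest entries dropped first.
final class CappedCollectionSketch<T> {

    private final int maxDocuments;
    private final Deque<T> docs = new ArrayDeque<>();

    CappedCollectionSketch(int maxDocuments) {
        if (maxDocuments <= 0) {
            throw new IllegalArgumentException("maxDocuments must be positive");
        }
        this.maxDocuments = maxDocuments;
    }

    // When full, the oldest document makes room for the newest one.
    void insert(T doc) {
        if (docs.size() == maxDocuments) {
            docs.removeFirst();
        }
        docs.addLast(doc);
    }

    // Documents in insertion order, oldest first.
    List<T> contents() {
        return new ArrayList<>(docs);
    }

    public static void main(String[] args) {
        CappedCollectionSketch<String> log = new CappedCollectionSketch<>(3);
        for (String op : new String[] {"op1", "op2", "op3", "op4", "op5"}) {
            log.insert(op);
        }
        System.out.println(log.contents()); // only the three newest remain
    }
}
```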
MongoDB Documents
▪ JSON (what you see)
• Actually BSON (Internal - Binary JSON - http://bsonspec.org/)
▪ Elements are name/value pairs
▪ 16 MB maximum size
▪ What you see is what is stored
• No default fields (columns)
Why BSON?
▪ Adds data types that JSON did not support
▪ Optimized for performance
▪ Adds compression
MongoDB Install
▪ Extract MongoDB
▪ Build config file, or use startup script
• Need dbpath configured
• Need REST configured for Web Admin tool
▪ Start the mongod (daemon) process
▪ Use Shell (mongo) to access your database
▪ Use MongoVUE for GUI access and to learn shell commands
Mongo Shell
▪ In Windows, mongo.exe
▪ Command-line interface to MongoDB (sort of like SQL*Plus for Oracle)
MongoVUE
▪ GUI around MongoDB Shell
▪ Makes it easy to learn MongoDB Shell commands
• db.employee.find({ "lastName" : "Smith", "firstName" : "John" }).limit(50);
• show collections
▪ Demo…
Web Admin Interface
▪ localhost:28017
▪ Quick stats viewer
▪ Run commands
▪ Demo
▪ There is also Sleepy Mongoose
• http://www.kchodorow.com/blog/2010/02/22/sleepy-mongoose-amongodb-rest-interface/
Spring Data
▪ Large Spring project with many subprojects
• Category: Document Stores, Subproject MongoDB
▪ “…aims to provide a familiar and consistent Spring-based programming model…”
▪ Like other Spring projects, Data is POJO Oriented
▪ For MongoDB, provides high-level API and access to low-level API for managing MongoDB documents.
▪ Provides annotation-driven meta-mapping
▪ Will allow you into bowels of API if you choose to hang out there
Spring Data MongoDB Templates
▪ Implements MongoOperations (mongoOps) interface
• mongoOps defines the basic set of MongoDB operations for the Spring Data API.
• Wraps the lower-level MongoDB API
▪ Provides access to the lower-level API
▪ Provides foundation for upper-level Repository API.
Spring Data MongoDB Templates - Configuration
▪ See mongo-config.xml
Spring Data MongoDB Templates - Configuration
▪ Or…see the config class
Spring Data Repositories
▪ Convenience for data access
• Spring does ALL the work (unless you customize)
▪ Convention over configuration
• Uses a method-naming convention that Spring interprets during implementation
▪ Hides complexities of Spring Data templates and underlying API
▪ Builds implementation for you based on interface design
• Implementation is built during Spring container load.
▪ Is typed (parameterized via generics) to the model objects you want to store.
• When extending MongoRepository
• Otherwise uses @RepositoryDefinition annotation
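To get a feel for how an implementation can be built from nothing but an interface and a naming convention, here is a toy Java sketch using a JDK dynamic proxy. It is not Spring Data's implementation and uses no Spring classes: PersonRepository, findByLastName, findByCity, and repositoryFor are hypothetical names, the "database" is an in-memory list, and only the simplest findBy<Field> form is understood. It assumes Java 9 or later for List.of and Map.of.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class DerivedQuerySketch {

    // Hypothetical repository interface: no implementation is written by hand.
    interface PersonRepository {
        List<Map<String, Object>> findByLastName(String lastName);
        List<Map<String, Object>> findByCity(String city);
    }

    // Builds an implementation at runtime by parsing "findBy<Field>" method names.
    @SuppressWarnings("unchecked")
    static <T> T repositoryFor(Class<T> type, List<Map<String, Object>> store) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (!name.startsWith("findBy") || name.length() == 6
                    || args == null || args.length != 1) {
                throw new UnsupportedOperationException(name);
            }
            // "findByLastName" -> property "LastName" -> field "lastName"
            String property = name.substring("findBy".length());
            String field = Character.toLowerCase(property.charAt(0)) + property.substring(1);
            List<Map<String, Object>> hits = new ArrayList<>();
            for (Map<String, Object> doc : store) {
                if (args[0].equals(doc.get(field))) {
                    hits.add(doc);
                }
            }
            return hits;
        };
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> store = List.of(
                Map.<String, Object>of("lastName", "Smith", "city", "Raleigh"),
                Map.<String, Object>of("lastName", "Jones", "city", "Durham"));

        PersonRepository people = repositoryFor(PersonRepository.class, store);
        System.out.println(people.findByLastName("Smith"));
        System.out.println(people.findByCity("Durham"));
    }
}
```

Adding a new finder to the interface is enough to make it work, which is the "convention over configuration" point of the slide in miniature.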
Spring Data Meta Mapping
▪ Annotation-driven mapping of model object fields to Spring Data elements in specific database dialect.
MongoDB DBRef
▪ Optional
▪ Instead of nesting documents
▪ Have to save the “referenced” document first, so that DBRef exists before adding it to the “parent” document
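The save-order requirement can be sketched in plain Java with maps standing in for collections. This is a toy model, not the driver or Spring Data API: DbRefSketch, refTo, and resolve are hypothetical names, and the reference is modeled as a small map with $ref (collection name) and $id fields.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy database: collection name -> (_id -> document).
final class DbRefSketch {

    private final Map<String, Map<Object, Map<String, Object>>> db = new HashMap<>();

    void save(String collection, Map<String, Object> doc) {
        db.computeIfAbsent(collection, k -> new HashMap<>()).put(doc.get("_id"), doc);
    }

    // Creates a link to a document; fails if the target has not been saved yet.
    Map<String, Object> refTo(String collection, Object id) {
        Map<Object, Map<String, Object>> docs = db.get(collection);
        if (docs == null || !docs.containsKey(id)) {
            throw new IllegalStateException("save the referenced document first");
        }
        Map<String, Object> ref = new LinkedHashMap<>();
        ref.put("$ref", collection);
        ref.put("$id", id);
        return ref;
    }

    // Follows a link back to the referenced document.
    Map<String, Object> resolve(Map<String, Object> ref) {
        return db.get(ref.get("$ref")).get(ref.get("$id"));
    }

    public static void main(String[] args) {
        DbRefSketch store = new DbRefSketch();

        // 1. Save the referenced document first.
        Map<String, Object> department = new HashMap<>();
        department.put("_id", 10);
        department.put("name", "Engineering");
        store.save("department", department);

        // 2. Then link to it from the parent instead of nesting it.
        Map<String, Object> ref = store.refTo("department", 10);
        Map<String, Object> employee = new HashMap<>();
        employee.put("_id", 1);
        employee.put("lastName", "Smith");
        employee.put("department", ref);
        store.save("employee", employee);

        System.out.println(store.resolve(ref).get("name"));
    }
}
```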
MongoDB Custom Spring Data Repositories
▪ Hooks into Spring Data bean type hierarchy that allows you to add functionality to repositories
▪ Important: You must write the implementation for part of this custom repository
▪ And…your Spring Data repository interface must extend this custom interface, along with the appropriate Spring Data repository
▪ Demo
Creating a Custom Repository
▪ Write an interface for the custom methods
▪ Write the implementation for that interface
▪ Write the traditional Spring Data Repository application interface, extending the appropriate Spring Data interface and the (above) custom interface
▪ When Spring starts, it will implement the Spring Data Repository normally, and include the custom implementation as well.
MongoDB Advanced Queries
▪ http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24all
▪ Demo - $in, $nin, $gt, $all
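For readers without the demo, this Java sketch approximates what the four operators test for, using predicates over map-based documents. It is an in-memory illustration only, not driver or Spring Data code: QueryOperatorSketch and the sample fields are assumptions, $gt is limited to integers, and $in is limited to scalar fields. It assumes Java 9 or later for List.of and Map.of.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class QueryOperatorSketch {

    // $in: the field's value is one of the listed values.
    static Predicate<Map<String, Object>> in(String field, Collection<?> values) {
        return doc -> doc.get(field) != null && values.contains(doc.get(field));
    }

    // $nin: the field's value is none of the listed values (or the field is absent).
    static Predicate<Map<String, Object>> nin(String field, Collection<?> values) {
        return in(field, values).negate();
    }

    // $gt: the field's value is strictly greater than the bound (integers only here).
    static Predicate<Map<String, Object>> gt(String field, int bound) {
        return doc -> doc.get(field) instanceof Integer && (Integer) doc.get(field) > bound;
    }

    // $all: the array field contains every one of the required values.
    static Predicate<Map<String, Object>> all(String field, Collection<?> required) {
        return doc -> doc.get(field) instanceof Collection
                && ((Collection<?>) doc.get(field)).containsAll(required);
    }

    public static void main(String[] args) {
        Map<String, Object> smith = Map.<String, Object>of(
                "lastName", "Smith", "age", 42, "skills", List.of("java", "mongodb"));
        Map<String, Object> jones = Map.<String, Object>of(
                "lastName", "Jones", "age", 29, "skills", List.of("java"));

        // Shell form: { lastName: { $in: ["Smith", "Ray"] } }
        System.out.println(in("lastName", List.of("Smith", "Ray")).test(smith));
        // Shell form: { lastName: { $nin: ["Smith", "Ray"] } }
        System.out.println(nin("lastName", List.of("Smith", "Ray")).test(jones));
        // Shell form: { age: { $gt: 30 } }
        System.out.println(gt("age", 30).test(jones));
        // Shell form: { skills: { $all: ["java", "mongodb"] } }
        System.out.println(all("skills", List.of("java", "mongodb")).test(smith));
    }
}
```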
MongoDB Aggregation Functions
▪ Aggregation Framework
▪ Map/Reduce
▪ Distinct - Demo
▪ Group - Demo
• Similar to SQL Group By function
▪ Count
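As a stand-in for the demos, the following Java sketch shows what distinct and group-with-count compute, using streams over in-memory documents. It only mirrors the semantics; it does not call MongoDB, and GroupSketch, countBy, distinct, and the dept field are names invented for this example. It assumes Java 9 or later for List.of and Map.of.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

final class GroupSketch {

    // Like SQL "SELECT field, COUNT(*) ... GROUP BY field".
    static Map<String, Long> countBy(List<Map<String, Object>> docs, String field) {
        return docs.stream().collect(Collectors.groupingBy(
                d -> String.valueOf(d.get(field)), TreeMap::new, Collectors.counting()));
    }

    // Like SQL "SELECT DISTINCT field".
    static Set<String> distinct(List<Map<String, Object>> docs, String field) {
        return docs.stream()
                .map(d -> String.valueOf(d.get(field)))
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> employees = List.of(
                Map.<String, Object>of("lastName", "Smith", "dept", "Eng"),
                Map.<String, Object>of("lastName", "Jones", "dept", "Eng"),
                Map.<String, Object>of("lastName", "Ray", "dept", "Sales"));

        System.out.println(distinct(employees, "dept"));
        System.out.println(countBy(employees, "dept"));
        System.out.println(employees.size()); // the plain count
    }
}
```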
MongoDB GridFS
▪ “…specification for storing large files in MongoDB.”
▪ As the name implies, “Grid” allows the storage of very large files divided across multiple MongoDB documents.
• Uses native BSON binary formats
▪ 16MB per document
• Will be higher in future
▪ Large files added to GridFS get chunked and spread across multiple documents.
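The chunking idea can be sketched in a few lines of Java: split a byte array into fixed-size pieces and reassemble them in order. This is a minimal illustration, not the GridFS API; ChunkingSketch is a hypothetical name and the 4-byte chunk size is chosen only to keep the example small.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkingSketch {

    // Splits a file into chunks of at most chunkSize bytes; the last may be shorter.
    static List<byte[]> split(byte[] file, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int from = 0; from < file.length; from += chunkSize) {
            int to = Math.min(from + chunkSize, file.length);
            chunks.add(Arrays.copyOfRange(file, from, to));
        }
        return chunks;
    }

    // Reassembles the original bytes by concatenating chunks in order.
    static byte[] join(List<byte[]> chunks) {
        int total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        byte[] file = new byte[total];
        int offset = 0;
        for (byte[] chunk : chunks) {
            System.arraycopy(chunk, 0, file, offset, chunk.length);
            offset += chunk.length;
        }
        return file;
    }

    public static void main(String[] args) {
        byte[] file = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        List<byte[]> chunks = split(file, 4);
        System.out.println(chunks.size() + " chunks");
        System.out.println(Arrays.equals(file, join(chunks)));
    }
}
```

Each chunk would be stored as its own document together with its position, so a file larger than the per-document limit can still be written and read back in order.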
MongoDB Indexes
▪ Similar to RDBMS Indexes
▪ Can have many
▪ Can be compound
• Including indexes of array fields in document
▪ Makes searches, aggregates, and group functions faster
▪ Makes writes slower
▪ Sparse = true
• Only include documents in this index that actually contain a value in the indexed field.
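A small Java sketch can show the difference the sparse flag makes. It builds a sorted map from field value to document ids over in-memory documents; SparseIndexSketch, build, and the "<null>" placeholder key are assumptions for illustration and say nothing about how MongoDB lays out its indexes internally. It assumes Java 9 or later for List.of and Map.of.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SparseIndexSketch {

    // Maps each value of the indexed field to the _ids of documents holding it.
    static TreeMap<String, List<Object>> build(
            List<Map<String, Object>> docs, String field, boolean sparse) {
        TreeMap<String, List<Object>> index = new TreeMap<>();
        for (Map<String, Object> doc : docs) {
            Object value = doc.get(field);
            if (value == null && sparse) {
                continue; // sparse: documents without the field are left out
            }
            String key = value == null ? "<null>" : value.toString();
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(doc.get("_id"));
        }
        return index;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> employees = List.of(
                Map.<String, Object>of("_id", 1, "nickname", "JJ"),
                Map.<String, Object>of("_id", 2), // no nickname field
                Map.<String, Object>of("_id", 3, "nickname", "Ace"));

        System.out.println(build(employees, "nickname", true));
        System.out.println(build(employees, "nickname", false));
    }
}
```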
MongoDB Security
▪ http://www.mongodb.org/display/DOCS/Security+and+Authentication
▪ Default is trusted mode, no security
▪ --auth
▪ --keyfile
• Replica sets require this option
MongoDB Encryption
▪ MongoDB does not support data encryption, per se
▪ Use application-level encryption and store encrypted data in BSON fields
▪ Or…use TDE (Transparent Data Encryption) from Gazzang
• http://www.gazzang.com/encrypt-mongodb
MongoDB 2.2
▪ Drop-in replacement for 1.8 and 2.0.x
▪ Aggregation without Map Reduce
▪ TTL Collections (alternative to Capped Collections)
▪ Tag-aware Sharding
▪ http://docs.mongodb.org/manual/release-notes/2.2/
Helpful Links
▪ Spring Data MongoDB - Reference Documentation: http://static.springsource.org/spring-data/datamongodb/docs/1.0.2.RELEASE/reference/html/
▪ http://nosql-database.org/
▪ www.mongodb.org
▪ http://www.mongodb.org/display/DOCS/Java+Language+Center
▪ http://www.mongodb.org/display/DOCS/Books
▪ http://openmymind.net/2011/3/28/The-Little-MongoDB-Book/
▪ http://jimmyraywv.blogspot.com/2012/05/mongodb-and-spring-data.html
▪ http://jimmyraywv.blogspot.com/2012/04/mongodb-jongo-and-morphia.html
▪ https://www.10gen.com/presentations/webinar/online-conference-deep-divemongodb
Questions