MarkLogic 8 Overview of New Features

© COPYRIGHT 2015 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.

MarkLogic / Enterprise NoSQL Database Platform

POWERFUL: Native JSON Store, Native XML Store, Native RDF Triple Store, Geospatial Support, Full-text Search, Flexible Indexes, Bitemporal, Real-time Alerting, Semantic Inference, Tiered Storage, Server-Side JavaScript, Fully Transactional

AGILE: Scalable and Elastic, Cloud Ready (AWS), Hadoop and HDFS, REST API, SQL Support, Multi-OS Support, Schema Agnostic, Samplestack, MarkLogic Content Pump, XA Transactions, Ad-hoc Queries, Index Across Data Types

TRUSTED: Performance at scale, LDAP and Kerberos Security, Security Certifications, Monitoring and Management, Configuration Management, 24/7 Engineering Support, ACID Transactions, Flexible Replication, Customizable Backup, Customizable Failover, Point-in-time Recovery, Atomic Forests

MarkLogic 8 / More Powerful, Easier to Use

Developer Experience: MarkLogic 8 is more powerful than ever, but remarkably easy to use
- JSON: Unified indexing and query for today’s web and SOA data
- Node.js Client API: Enterprise NoSQL database for Node.js applications
- Java Client API: NoSQL agility in a pure Java interface
- Server-Side JavaScript: JavaScript runtime inside MarkLogic using Google’s V8

Semantics: Enterprise triple store, document store, database combined

Bitemporal: Track information along two dimensions of time

MarkLogic 8 / More Powerful, Easier to Use

Additional New Features
- Management API: REST-based Management API to manage all MarkLogic capabilities, providing more programmatic control than ever before
- Incremental Backup: Faster backups that use less storage, only backing up changes since the previous incremental or full backup
- Flexible Replication: Customizable information sharing between systems, allowing for the easy and secure distribution of data
- Enhanced HTTP Server: Simple and fast client-server interactions out-of-the-box with a single interface

Samplestack: Learn MarkLogic with an end-to-end three-tiered sample app
Database +
Middle Tier + Front End

JSON
Unified indexing and query for today’s web and SOA data
- Speed up development with powerful built-in search, transformation, and alerting capabilities designed for JSON
- Reduce lost fidelity and functionality from data model translations and brittle ETL
- Simplify architecture with data, metadata, and relationships managed consistently and securely together
- Ease modern, end-to-end JavaScript development

    {
      "_id": 1,
      "name": "MarkLogic",
      "supports": [
        {
          "datatype": "XML",
          "year": 2003
        },
        {
          "datatype": "JSON",
          "year": 2015
        }
      ]
    }

Node.js Client API
Enterprise NoSQL database for Node.js applications
- Focus on application features rather than plumbing with out-of-the-box search, transactions, aggregates, alerting, geospatial, and more
- Move faster to production with proven reliability at scale
- Maximize performance and flexibility by bringing code to the data
- Enable modern end-to-end JavaScript development
- Always open source on GitHub. Participate. Contribute. Fork it.

Java Client API
NoSQL agility in a pure Java interface
- Faster development and less custom code with out-of-the-box data management, search, and alerting
- Pure Java query builder and conveniences for POJOs, JSON, XML, and binary I/O
- Built-in extensibility for moving performance-critical code to the database
- Always open source and developed on GitHub. Participate. Contribute. Fork it.
Server-Side JavaScript
JavaScript runtime inside MarkLogic using Google’s V8
- Run code near the data for unparalleled power and efficiency
- Build applications faster from a growing pool of skills and tools
- Reduce risk with proven performance and reliability
- Decrease brittle ETL and lost fidelity and functionality from JSON data conversions
- Pair with Node.js to ease full-stack JavaScript development
[Diagram: Front End + Middle Tier + Database Layer]

Samplestack
An end-to-end three-tiered application in Java and Node.js
- Encapsulates best practices and introduces key MarkLogic concepts
- Use sample code as a model for building applications more quickly
- Modern technology stack shows where MarkLogic fits in your environment
- Participate. Contribute. Fork it.
[Diagram: Front End, Middle Tier, Database Layer]

Semantics
Enterprise triple store, document store, database combined
- Store and query billions of facts and relationships; infer new facts
- Facts and relationships provide context for better search
- Flexible data modeling: integrate and link data from different sources
- Standards-based for ease of use and integration: RDF, SPARQL, and standard REST interfaces
New in MarkLogic 8: SPARQL 1.1, graph traversal, automatic inference using rule sets, and SPARQL from Server-Side JavaScript and Node.js
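As a rough illustration of what rule-based inference over a triple store means, the sketch below forward-chains a single rule over an in-memory set of triples: if John lives in London and London is in England, then John lives in England. It is plain Python, not MarkLogic's API. The predicate names (`livesIn`, `isIn`), the rule itself, and the `infer` function are assumptions invented for this example; in MarkLogic the equivalent would be expressed through SPARQL and rule sets.

```python
# Hypothetical sketch: forward-chaining inference over (subject, predicate, object) triples.
# The predicates and the rule are illustrative only, not a MarkLogic rule set.
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def infer(triples: Set[Triple]) -> Set[Triple]:
    """Apply (?x livesIn ?y) + (?y isIn ?z) => (?x livesIn ?z) until no new facts appear."""
    facts = set(triples)
    while True:
        new_facts = {
            (x, "livesIn", z)
            for (x, p1, y) in facts if p1 == "livesIn"
            for (y2, p2, z) in facts if p2 == "isIn" and y2 == y
        } - facts
        if not new_facts:
            return facts
        facts |= new_facts


stored = {
    ("John", "livesIn", "London"),
    ("London", "isIn", "England"),
}
# Facts that were derived by the rule rather than stored explicitly.
inferred = infer(stored) - stored
print(inferred)
```

The loop repeats until a pass adds nothing, so longer chains (a city in a region in a country) would be followed as well.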
Bitemporal
Timing is everything
- Rewind the information “as it actually was” in combination with “as it was recorded” at some point in time
- Gain increased insight into your business and mission
- Capture evolving schema as the shape of the data changes over time, a capability whose absence has kept relational bitemporal offerings from being widely adopted
- Critical for anyone in regulated industries
- Even better because of Tiered Storage and Semantics
[Figure: EVENT 1, EVENT 2, and EVENT 3 plotted against Valid Time and System Time]
Valid Time: real-world time, information “as it actually was”
System Time: time it was recorded to the database

Management API
REST-based API to manage all MarkLogic capabilities
- Increase efficiency and agility by automating time-consuming repetitive tasks across production, testing, and development
- Reduce setup time and admin error by orchestrating multi-step configurations and deployments
- Fit more seamlessly into IT environments by using REST interfaces rather than a CLI or proprietary APIs
- Perform automated testing and monitor performance using market tools that support REST
- Even better with Client REST API, Elasticity

Incremental Backup
Faster backups while using less storage
- Store only changes since the previous full or incremental backup
- Consume less storage for backup copies
- Reduce backup window
- Improve availability with multiple daily backups
- Work with Log Archiving to enable fine-grained point-in-time recovery
[Figure: backup schedule running Sunday through the following Sunday, labeled FULL and INCREMENTAL BACKUP (differential)]
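To make "store only changes since the previous full or incremental backup" concrete, here is a minimal sketch in plain Python. It models a database as a dictionary from document URI to content. The function names (`full_backup`, `incremental_backup`, `restore`) and the `DELETED` marker are hypothetical and do not correspond to MarkLogic's backup API or on-disk format.

```python
# Hypothetical sketch of incremental backup and restore; not MarkLogic's implementation.
from typing import Dict, List

DELETED = object()  # illustrative tombstone for a document removed since the last backup


def full_backup(db: Dict[str, str]) -> Dict[str, object]:
    """Copy every document."""
    return dict(db)


def incremental_backup(db: Dict[str, str], previous_state: Dict[str, str]) -> Dict[str, object]:
    """Record only documents added, changed, or deleted since previous_state."""
    delta: Dict[str, object] = {
        uri: content for uri, content in db.items()
        if previous_state.get(uri) != content
    }
    for uri in previous_state:
        if uri not in db:
            delta[uri] = DELETED
    return delta


def restore(full: Dict[str, object], incrementals: List[Dict[str, object]]) -> Dict[str, object]:
    """Rebuild the database by applying each incremental, oldest first, on top of the full backup."""
    db = dict(full)
    for delta in incrementals:
        for uri, content in delta.items():
            if content is DELETED:
                db.pop(uri, None)
            else:
                db[uri] = content
    return db


sunday = {"/a.json": "v1", "/b.json": "v1"}
full = full_backup(sunday)

monday = {"/a.json": "v2", "/b.json": "v1", "/c.json": "v1"}
inc_monday = incremental_backup(monday, sunday)    # holds /a.json and /c.json only

tuesday = {"/a.json": "v2", "/c.json": "v1"}       # /b.json was deleted
inc_tuesday = incremental_backup(tuesday, monday)  # holds only the tombstone for /b.json

restored = restore(full, [inc_monday, inc_tuesday])
```

Each incremental is small because unchanged documents are skipped, which is where the storage and backup-window savings come from. The trade-off is that a restore needs the full backup plus every incremental taken after it.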
Flexible Replication
Customizable information sharing between systems
- Enable content collaboration across numerous systems
- Support directly connected or mobile users
- Provide data that users need using simple configurable parameters or queries
- Ensure data consistency and security with simple workflows
- Even better with Bitemporal and Management API

Enhanced HTTP Server
Simple and fast client-server interactions out-of-the-box
- Use a single interface when employing the REST API, custom HTTP, or XCC/XDBC to connect to any database
- Delivers ease of use by removing the need to create extra ports
- Simplifies the out-of-the-box interaction and can improve the performance of client and server
- Provides an improved and more efficient developer experience with MarkLogic

APPENDIX

Continuous Innovation
- 2003, Cerisent XQE Server 1: ACID Transactions; Text Based Search; Backup and Restore; Linux Support; Web-based Protocols HTTP and XDBC; XQuery
- 2004, MarkLogic Server 2: Clustering; Role-based security w/BASIC authentication; Document Collections; Enhanced Search (stemming, thesaurus, wildcard); WebDAV support; Document locking; Enhanced XDBC support
- 2005, MarkLogic Server 3: Advanced Search Features; Content processing (including PDF, Word, Excel, PPT); HTTP calls; Failover; Support for Linux, Windows Server, .NET
- 2006, MarkLogic Server 3.1: Advanced Search Features; Wildcard queries; Directories; Forward Compatibility; Support for Sun Solaris; XML Contentbase Connector (XCC)
- 2008, MarkLogic Server 4: Alerting; Entity Enrichment; Geospatial; Analytics (co-occur., value lexicons, bucketing); Modular documents; Security auditing; HA: forest-level failover
- 2010, MarkLogic Server 4.1-2: Replication; Failover; Database Rollback; Compartment Security; Search Optimizations, API; Information Studio; Application Builder; REST capabilities; SSL support; Schema Validation; Japanese added
- 2011, MarkLogic Server 5: Complete Enterprise Roadmap (Database Replication; Multi-statement and distributed transactions; Point-in-time recovery); Start Hadoop Roadmap (Hadoop Connector)
- 2012, MarkLogic 6: Accessibility (SQL/BI; Java/REST/JSON; UDFs/Analytics); mlcp; Hadoop Distributions; HDFS Tech Preview
- 2013, MarkLogic 7: Semantics Foundation; Next-gen Infrastructure Support (Elasticity; Tiered Storage); Continue Hadoop Roadmap (Run on HDFS)
- 2015, MarkLogic 8: JSON Storage; Server-side JavaScript; Semantics; Bitemporal; Samplestack; Java Client API; Node.js Client API; Management API; Incremental Backup; Flexible Replication; Enhanced HTTP Server

MarkLogic / Enterprise NoSQL Database Platform

POWERFUL
Better answers from today’s data: MarkLogic is built to find answers in documents, relationships, and metadata

AGILE
Adaptive to every environment: MarkLogic runs well everywhere, while preserving the option to change hardware, data, and scale later

TRUSTED
Hardened, proven platform: MarkLogic has a proven track record of performance under all enterprise conditions

Simpler data integration: MarkLogic accelerates and simplifies data sharing across silos, cutting down on ETL and making agile development possible

Uncompromised data resiliency: MarkLogic will keep your data safe and whole, no matter what happens in your application or at your data center

The intelligent data layer: An intelligent data layer powers intelligent applications and makes them faster and more flexible than any alternative

// POWERFUL / Deliver more value, build better apps
- Native JSON Store: Store and manage data natively as JSON documents, speeding up development and reducing data transformation with a simplified architecture for end-to-end JavaScript development.
- Native XML Store: Store and manage data natively as XML documents, a hierarchical self-describing data type that is ideal for a wide variety of applications.
- Native RDF Triple Store: Store RDF triples and query them using SPARQL, providing context to your data and better search with a database that can handle a combination of documents, data, and triples.
- Geospatial Support: Store geospatial data such as GML, KML, and GeoRSS and run complex queries on the data alone or in combination with other data types. Also integrate with ESRI ArcGIS and Google Maps for visualization.
- Full-text Search: Built-in, lightning-fast search and query capabilities across hundreds of billions of documents, plus a full-featured UX with type-ahead suggestions, facets, snippeting, relevance ranking, and language support.
- Flexible Indexes: Rely on over 30 sophisticated, composable indexes including a universal index, range index, geospatial index, and triple index, all designed so that developers can ask harder questions and get faster responses.
- Bitemporal: Handle historical data along two different timelines, making it possible to rewind the information “as it actually was” in combination with “as it was recorded” at some point in time.
- Real-time Alerting: Create an unlimited number of real-time alerts by email or text using the alerting API and reverse indexes. Whenever a document is loaded that matches a specific query, you’ll know.
- Semantic Inference: Work with new data that didn’t exist before. For example, if John lives in London, and London is in England, then MarkLogic can infer that “John lives in England” and then add that new fact to your semantic search.
- Tiered Storage: Store and manage data in different tiers based on cost and performance trade-offs, and easily migrate between tiers without any ETL, additional software, or expensive infrastructure changes.
- Server-Side JavaScript: Live in JavaScript. Run JavaScript near the data for unparalleled power and efficiency with a high-performance JavaScript runtime inside MarkLogic using Google’s V8.
- Fully Transactional: Run complex distributed transactions across multiple documents and collections with no performance drop-offs at scale. Production applications run tens of thousands of transactions per second for tens of thousands of users.

// AGILE / Prepare for and respond to change
- Scalable and Elastic: Handle petabytes of data without over-provisioning, over-spending, or experiencing downtime, inconsistency, or risk of data loss.
- Cloud Ready (AWS): Use MarkLogic’s cloud templates to get up and running quickly on AWS or other cloud environments, starting with a three-node cluster or a large cluster with over a hundred nodes.
- Hadoop and HDFS: Make Hadoop better by connecting it to MarkLogic and using it as part of an infrastructure to handle both operational and analytic workloads.
- REST API: Configure and administer MarkLogic with a single REST-based API. This provides more programmatic control than ever before, giving DBAs the power and flexibility necessary to run a modern data center.
- SQL Support: Use a relational SQL data model within MarkLogic, connecting to SQL-based tools using the ODBC driver, or execute SQL commands against relational databases using the MLSAM open-source XQuery library.
- Multi-OS Support: Run MarkLogic on Windows, Linux, Solaris, or OS X. MarkLogic is easy to set up and run in your environment, whether in the cloud, virtualized, or on premises.
- Schema Agnostic: Use a schema only when you need it. Ingest all your data as-is, whether structured or unstructured, using the NoSQL document model rather than being forced to use a predefined schema.
- Samplestack: Get going fast on MarkLogic with Samplestack, an end-to-end three-tiered sample application designed to show developers how to implement a reference architecture using key MarkLogic concepts and sample code.
- MarkLogic Content Pump: MLCP is a command-line tool that makes it easy to quickly import or export documents and metadata from MarkLogic, or to copy from one database to another.
- XA Transactions: Run distributed transactions across a cluster using the XA (eXtended Architecture) standard, which ensures ACID properties for global transaction processing.
- Ad-hoc Queries: Don’t plan your queries in advance of ingesting your data. MarkLogic is designed for search and discovery so that you can run any query at any time and get real-time results.
- Index Across Data Types: Use multiple indexes in concert across multiple data types, giving you the power to search and query all of your data.

// TRUSTED / Enterprise-ready for mission-critical uses
- Performance at scale: Scales easily to handle hundreds of terabytes using a shared-nothing architecture in which data partitions are completely independent of each other.
- LDAP and Kerberos Security: Use third-party authentication from LDAP or Kerberos, making the most secure NoSQL database easier to manage.
- Security Certifications: Secure your data with government-grade security. MarkLogic has certified, granular security for modern data governance and to handle the increased complexity of today’s cyber threats.
- Monitoring and Management: Use the Management API for cluster management, process automation, access controls, database cloning, audit trails, and connections to third-party interfaces.
- Configuration Management: View and manage the configuration settings for MarkLogic databases, forests, application servers, groups, or hosts, and easily propagate changes across the entire cluster.
- 24/7 Engineering Support: Rely on the 24/7, all-engineer support staff, whether you need answers fast or just want some friendly tips on saving a few milliseconds on performance.
- ACID Transactions: Don’t settle for a BASE-ic database. Use ACID transactions to avoid data corruption, stale reads, and inconsistent data, all of which are unacceptable.
- Flexible Replication: Enable customizable information sharing between systems, allowing for the easy and secure distribution of portions of data even across disconnected, intermittent, and latent networks.
- Customizable Backup: Restore the database quickly with minimal downtime, relying on full and consistent backups, hot configuration changes, and automatic index optimization without shutting down the system.
- Customizable Failover: Have confidence that your data is always available, reducing risk and avoiding interruptions with automated local- or shared-disk failover made possible by a shared-nothing architecture.
- Point-in-time Recovery: Roll back to a specified point in time by replaying journal archives, an additional feature to ensure disaster recovery and ease of management.
- Atomic Forests: Manage data in forests, partition-like collections of documents that exist independently and enable scalability and elasticity, rebalancing, efficient operations, and easier data governance.
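The point-in-time recovery idea described above, rolling back to a chosen moment by replaying journal archives, can be sketched as follows. This is plain Python under simplifying assumptions: the `JournalEntry` shape, the integer timestamps, and the `recover_to` function are hypothetical, and MarkLogic's real journals and recovery procedure are not structured like this.

```python
# Hypothetical sketch of point-in-time recovery by journal replay; not MarkLogic's implementation.
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class JournalEntry:
    timestamp: int          # commit time of the change (illustrative integer clock)
    uri: str                # document affected
    content: Optional[str]  # new content, or None if the document was deleted


def recover_to(backup: Dict[str, str], journal: Iterable[JournalEntry], target_time: int) -> Dict[str, str]:
    """Start from a backup and replay journaled changes committed at or before target_time."""
    db = dict(backup)
    for entry in sorted(journal, key=lambda e: e.timestamp):
        if entry.timestamp > target_time:
            break  # changes after the target moment are not replayed
        if entry.content is None:
            db.pop(entry.uri, None)
        else:
            db[entry.uri] = entry.content
    return db


backup = {"/a.json": "v1"}
journal = [
    JournalEntry(10, "/a.json", "v2"),
    JournalEntry(20, "/b.json", "v1"),
    JournalEntry(30, "/a.json", None),  # for example, an accidental delete
]
# Recover to a moment just before the delete at timestamp 30.
state = recover_to(backup, journal, target_time=25)
```

Choosing a different `target_time` yields the state as of that moment, which is what makes the recovery fine-grained rather than limited to backup boundaries.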