Big Data Initiatives
An Enterprise Perspective
Kent Laursen, CTO, No Magic, Inc.
September 15, 2014

Agenda
• Big Data Characteristics
• Industry Trends and Uses
• Conceptual Modeling
• Weaving the Polyglot
• Process Execution
• No Magic Roadmap

© 2014 No Magic, Inc. Exclusively for No Magic Use

Big Data – What is it?
• Data sets that are too large and complex to manipulate or interrogate with standard methods or tools
• The V4C of Big Data
  • Volume – lots of it
  • Variety – many kinds, both structured and unstructured
  • Velocity – fast production and consumption
  • Variability – changes over time
  • Complexity – complicated composition and relations

Big Data – What about...?
• Veracity
  • Truth of data, pedigree, trust of source
• Quality
  • Validity, correctness, completeness and integrity
• Context
  • Business: alignment with portfolios and capabilities
  • Process: placement and use in operations
  • System: production, transport, transformation and consumption via automations
• Meaning
  • Understanding through conceptual description
  • Weaving the polyglot

Big Data Trends
• Hype becomes reality
• Not just Hadoop
• More than unstructured data
• Modeling and visualization become critical
  • Conceptual/Ontology
  • More sophisticated NoSQL
  • Fusion with process
  • Replacing legacy data management

Big Data Use – Internet of Things
• Drivers
  • Internet integration of devices
  • ...Many existing and emerging use cases...
• Data
  • Sensor observations
  • Commands
  • Monitoring

Big Data Use – Financial
• Drivers
  • Fraud Detection
  • Compliance
  • Risk Management
  • Integration
  • Customer Relationship Management, Product Tailoring
• Data
  • Transactions
  • Accounts
  • Financial Instruments

Big Data Use – Bio IT
• Drivers
  • Discovery Biology, Proteomics, Genomics
  • Clinical Data Analysis, Drug Research
  • Disease Control, Health, Epidemics
  • Environment
  • Food
• Data
  • Samples
  • Instrument Output, e.g. Mass Spectrometry
  • Experiments, Assays, Investigations

Big Data Use – Defense
• Drivers
  • Cybersecurity
  • Intelligence Analysis
  • Situational Awareness
  • Alerting
• Data
  • Human Observation
  • Sensor Data
  • Video
  • Network Monitoring Data

Conceptual Modeling Problem Statement
• It’s hard to get a new project under way
  • How does a forming team knit together the plethora of methodologies, profiles, and plug-ins?
  • It takes a long time to develop techniques and automation
• Business concepts get lost in technical detail
  • Many profiles are at the intricate technology level (e.g., DDL, XSD, AndroMDA)
  • Too many technical choices lead to inconsistent models
  • Technology concerns drag down the level of abstraction
• Many models are often necessary
  • How do we unify models of various data concerns across an enterprise?
  • It is too much work to align models, so we get disconnected silos
• Generating systems from abstract models should be easy by now!
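
To make the last point of the problem statement concrete, here is a minimal sketch of what generating a technology artifact from an abstract model might look like. It is an illustration only, not No Magic's generator: the `Concept` class, the `SQL_TYPES` mapping, the naming convention, and the "Customer Account" example are all assumptions introduced for this sketch. The idea is that the business model carries plain-English names and value types, and a fixed convention derives the DDL without any technology-level markup in the model.

```python
import re
from dataclasses import dataclass, field

# Hypothetical mapping from business-level value types to SQL column types.
SQL_TYPES = {
    "text": "VARCHAR(255)",
    "integer": "INTEGER",
    "decimal": "DECIMAL(18, 2)",
    "date": "DATE",
}


@dataclass
class Concept:
    """A business concept with plain-English property names (hypothetical)."""
    name: str
    properties: dict = field(default_factory=dict)  # property name -> value type


def to_identifier(name: str) -> str:
    """Convention: lower-case the business name and join its words with underscores."""
    return "_".join(re.findall(r"[A-Za-z0-9]+", name.lower()))


def to_ddl(concept: Concept) -> str:
    """Derive a CREATE TABLE statement from a concept purely by convention."""
    table = to_identifier(concept.name)
    # Convention: every table gets a surrogate key named <table>_id.
    columns = [f"    {table}_id INTEGER PRIMARY KEY"]
    for prop, value_type in concept.properties.items():
        columns.append(f"    {to_identifier(prop)} {SQL_TYPES[value_type]}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(columns) + "\n);"


account = Concept(
    "Customer Account",
    {"account number": "text", "opening date": "date", "current balance": "decimal"},
)
ddl = to_ddl(account)
print(ddl)
```

Because the convention lives in one place (`to_identifier` and `SQL_TYPES`), the same concept could be fed to other generators, for example one that emits XML Schema, without changing the model itself.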

Conceptual Modeling Vision
• A unifying business concept model
  • Represents the concepts and defining relations of the business
  • Understood and validated by business experts
  • Grounded by a subset of OWL
  • Used in business process models
  • Can be augmented with Alf to generate an entire system
• Connected to other models
  • Can generate a PIM from selected classes and properties
  • Can be traced to any UML model, such as NIEM-UML
  • Can provide a kind of “Rosetta Stone” for enterprise-level semantic integration
• That can generate code by convention
  • OWL for ontologies
  • DDL for databases
  • XML Schema or NIEM-UML for messages
• And keep models in sync
  • Concept model changes flag other models for resolution

Concept Modeling Features
• Abstract diagrams focus on the business
• Simpler alternative to ODM
  • Does not require full-fidelity OWL to be useful
  • Crossing lines is optional
  • Association class boxes are unnecessary
  • «Stereotype» markup is unnecessary (for most models)
  • NoTechieCamelCase for class or property names
  • Uses standard UML as intended
  • Encourages cleaner, hyperlinked micro-subject-area diagrams
• Supports a glossary with plain-English statements for business expert validation
• Generates OWL / Turtle that ontologists can augment:
  • Classes
  • Global properties
  • Per-class property restrictions
• Allows use of existing ontologies

Concept Modeling Features (Continued)
• Semantically integrates multiple UML models
  • Data at rest (e.g., relational DB, XML DB)
  • Data in motion (e.g., XML Schema, NIEM-UML)
• Ties with other UML models of:
  • Systems (e.g., UPDM, SysML)
  • Services (e.g., SoaML)
• Works with other standards (e.g., BPMN, SysML, UPDM)
• Works well for MDA (i.e., forward engineering):
  • Concept model plays the role of an OOA model for executable UML
  • UML activities can manipulate class properties
  • Concept model can generate schemas by convention rather than markup (e.g., XML Schema, DDL, RDFS / OWL)

Concept Modeling Features (Additional)
• Concept model provides a business-vocabulary basis for integrating master data sources (i.e., ETL)
  • A concept model is a semantic hub for multiple logical model spokes
  • Relationships between the hub and a spoke can forward generate views of data
  • Views can be used for loading data into or emulating a tuple store
• Supports use cases such as:
  • Risk analysis at a bank
  • Data pull from legacy systems
  • RDF data lakes integrating multiple data sources

FIBO Example – Before
[Diagram]

FIBO Example – After
[Diagram]

FIBO Example – Generated OWL
[Diagram]

Polyglot Weaving – Concept Modeling
• Concepts mapped to system constructs
• Provides some specification and scoping
• Multiple implementations and maintenance
[Diagram: Conceptual Model (aka Ontology); Business Layer; Data Layer; Internal System; DB; DB; External System]

Polyglot Weaving – Concept Generation
• Utilize forward generation from conceptual model
• Apply frameworks and APIs
• Higher degree of reuse with less implementation and maintenance
[Diagram: Business Layer; Data Layer; Internal System; DB; DB; External System]

Polyglot Weaving – Semantic Data Fusion
• Utilize semantic standards: OWL, RDF, SPARQL
• Both concept model and described data realized in RDF
• Polyglot data homogenized in RDF data lake
[Diagram: Business Layer (SPARQL); Data Layer (RDF); Internal System; DB; DB; External System]

Polyglot Weaving – Semantic Process
• Model-driven processes exercise semantic services
• Modeled logic combined with modeled data
• Migration from legacy to executable models
[Diagram: Process Layer (BPMN); Business Layer (SPARQL); Data Layer (RDF); Converge into Model Driven Ecosystem; Internal System; DB; DB; External System]

No Magic Roadmap
• Model the data
• Model the data configuration
• Model data fusion
• Model and data are scalable
• Model forward generates implementations
• Model integrates with W3C and tuple stores
• Modeled concepts traverse architecture

The Truth is in the Models

Thank You!
Questions and Dialog
Kent Laursen, CTO
klaursen@nomagic.com
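
As a supplementary illustration of the Semantic Data Fusion slide, the sketch below shows the basic move of homogenizing polyglot data: records from two differently shaped sources are re-expressed as subject–predicate–object triples under one shared vocabulary, then queried with a single pattern. It is a toy under stated assumptions, not an RDF store. The `EX` namespace, the field names, the `Account` concept, and the `match` helper are all hypothetical, and a real deployment would use an RDF triple store with SPARQL rather than a Python set.

```python
from typing import Iterable, Iterator

Triple = tuple[str, str, str]

# Hypothetical shared vocabulary standing in for terms from a concept model.
EX = "http://example.org/concept/"


def rows_to_triples(rows: Iterable[dict], id_key: str, concept: str,
                    mapping: dict[str, str]) -> Iterator[Triple]:
    """Re-express source records as (subject, predicate, object) triples.

    `mapping` ties each source field name to a concept-model property name.
    """
    for row in rows:
        subject = f"{EX}{concept}/{row[id_key]}"
        yield (subject, "rdf:type", EX + concept)
        for source_field, concept_property in mapping.items():
            if source_field in row:
                yield (subject, EX + concept_property, str(row[source_field]))


def match(lake: set[Triple], s=None, p=None, o=None) -> list[Triple]:
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in lake
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Two made-up sources that name the same business property differently.
relational_rows = [{"acct_no": "A-1", "bal": 250.0}]
document_records = [{"accountId": "A-2", "balance": 90.5}]

lake: set[Triple] = set()
lake.update(rows_to_triples(relational_rows, "acct_no", "Account",
                            {"bal": "currentBalance"}))
lake.update(rows_to_triples(document_records, "accountId", "Account",
                            {"balance": "currentBalance"}))

# One query pattern now spans both sources.
balances = match(lake, p=EX + "currentBalance")
for subject, _, value in balances:
    print(subject, value)
```

The per-source `mapping` dictionaries play the role of the hub-to-spoke relationships described earlier: once each source is tied to the shared vocabulary, consumers query the concept property rather than `bal` or `balance`.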