® IBM Software Group Linked Data as an Application Integration Architecture Martin Nally, IBM Fellow, Rational CTO © IBM Corporation IBM Software Group | Rational software My story 1. A case study – how we use linked data as an integration architecture 2. The gaps and problems we found, with proposed solutions 3. A call to action 2 IBM Software Group | Rational software Why do we need a new integration architecture? Case study - IBM Rational in 2004 A leading provider of Application Lifecycle Management Tools Source Code Management (ClearCase, #1 market share) Defect/Change Management (ClearQuest, #1 market share) Requirements Management (ReqPro, #2 market share). … Mature products Designed and built in the ’90s Evolving to the web 3 IBM Software Group | Rational software Times change – requirements change New Requirements Distribution of development teams around the world Want more tightly integrated end-to-end ALM solution Want broader solutions Need greater agility and lower cost of ownership Response Adapt existing products to the new environment Write glue code to integrate products Necessary, works, but has limitations 4 IBM Software Group | Rational software Traditional Tool Integration. Ouch. N2 possible point-to-point connections Limited coverage Closed APIs Vendor lock-in Tight Coupling Dependence on API details Lockstep upgrades Version incompatibilities No place for cross tool functions Global Query Account Mgmt Time for a new architecture … 5 IBM Software Group | Rational software What integration functions are really needed? Create a link between ‘artifacts’ in different tools E.g Test links to requirement. Development task links to requirement Within one tool, create an “artifact’ in another E.g. Within a Test tool, log a defect using test results Across tools, share common concepts People form a team to work on a project to produce the next release of a product Be able to query across information in multiple tools, e.g. Show me all the critical requirements that don’t have test cases Show me al the new test cases I should run on last night’s build 6 IBM Software Group | Rational software Alternative Architecture - 1 Ent Arch Requirements Bus Proc Model Software & Solution Architecture Development Test Repository 7 IBM Software Group | Rational software Alternative Architecture - 1 Standard Application Server Architecture – tried and proven IMS, CICS, JEE, Tuxedo, Encina, … Tried for tools many times before AD/Cycle, PCTE, … Pros: Effective when the function can be delivered and deployed in a highlycoordinated fashion Cons: Requires schema coordination across tools Does not provide a mechanism for integrating “existing ” or “foreign” tools Requires “central planning” by consumer (deploy 1 database) as well as provider (integrate tools to a single schema) Does not support an open ecosystem well 8 IBM Software Group | Rational software Alternative Architecture - 2 Ent Arch Bus Proc Model Development “ESB – Enterprise Service Bus” Requirements Software & Solution Architecture Test 9 IBM Software Group | Rational software Alternative Architecture - 2 Pros: Allows integration of existing applications Some of the process implemented in the “bus” Cons: Multiple copies of data in different tools Sensitive to changes in data model in any participating tool Sensitive to changes in the process Requires a lot of centralized implementation Many of the disadvantages of traditional approach Central point of control is bottleneck 10 IBM Software Group | Rational software Problem We’ve been stuck with these problems and solutions for 20 years Is Martin going to claim he’s invented a new way to write tools that fixes this? Yes, and No We think there is another way We didn’t invent it … …Al Gore did 11 IBM Software Group | Rational software Linked Lifecycle Data Data Integration for the 21st Century Inspired by Internet principles, implemented with Internet technologies: simple interfaces for exchange of resources If the entire Web can connect like this, would the same idea work for ALM? Loosely coupled: everything is a “resource” linked together with URLs Technology neutral: treats all implementations equally Minimalist: defines no more than necessary for exchange of resources Incremental: deliver value now, add more value over time Openly published standards: free to implement and irrevocable 12 IBM Software Group | Rational software Alternative Architecture - 3 Pros: Allows integration of existing applications Supports open, federated data model Does not require data copying Supports independent evolution of products Cons: Unproven (for our domain, at least) Big paradigm shift (for us) Lots of invention required The rest of this talk focuses on our experiences 13 IBM Software Group | Rational software OSLC and Open Community http://open-services.org Open Services for Lifecycle Collaboration open community. open interfaces. open possibilities Minimalist/additive approach Not a “complete” definition for a given area Scenario driven scope Co-evolve spec and implementations Identify Scenarios Call it a spec Iterate on working drafts Open participation around active core group Gain technical consensus, collect nonassert statements 14 IBM Software Group | Rational software OSLC Community Eleven workgroups operating at openservices.net Domain focused workgroups (e.g. CM, QM, RM) Common issues and patterns (Core) Solution oriented workgroups (e.g. PLM/ALM) Range of interests, expertise, involvement 400+ registered community members (up from 70 people at RSC 2009) Individuals from 34+ different companies have participated in OSLC workgroups (up from 5 companies at RSC 2009) Accenture APG Black Duck Boeing BSD Group Citigroup EADS Emphasys Group Empulsys Fokus Fraunhofer Galorath General Motors Health Care Services Corp IBM Institut TELECOM Integrate Systems Worthy of note PLM/ALM Workgroup Siemens leadership Open source Eclipse Lyo Project Proposal Eclipse Mylyn project restructuring and positioning of OSLC Mantis, Forges Customer and integrator involvement GM, Northrop Grumman, Tieto, Integrate Systems 15 Lender Processing Services Northrop Grumman Oracle QSM Rally Software Ravenflow Shell Siemens Sogeti SourceGear/Teamprise State Street Tasktop (Eclipse Mylyn) Thales Tieto TOPIC Embedded Systems UrbanCode WebLayers 15 IBM Software Group | Rational software Problem 0 - culture People have been brainwashed for years XML (don’t get me started) Operations, parameters SPARQL Protocol for RDF is a typical example of the mess Relational schemas and Object-oriented class models Let’s invent new media types Cling to closed world assumptions like to a security blanket Desktop user interfaces 16 IBM Software Group | Rational software Problem 1 – creating data on the web It seems obvious that you use POST to create, but what do you POST to? How do I find the things that already exist? 17 IBM Software Group | Rational software Proposed Solution 1 – “Atom Publishing for LD” A resource to which you POST to create new things, GET to find existing things Not really like APP, much simpler But APP saw the need, defined the use-cases, got traction We did not get this part right at http://open-services.net OSLC Observation – the simplest way to define a collection in RDF is a set of triples with common subject and common predicate That is all there is to it! 18 IBM Software Group | Rational software Solution 1 - APP for LD example (Trig format) @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix bp: <open-serives.org/bp/>. @prefix : <http://example.org/test/>. :testCases = { :testcases bp:commonSubject :proj1TestCases; bp:commonPredicate rdfs:member. :proj1TestCases rdfs:member :testCase1; rdfs:member :testCase2. } POST to this resource to create a resource -will also add a triple that points to the new resource GET to find [triples that reference] existing resources 19 IBM Software Group | Rational software Problem 2 - But it will get big; I need to paginate Yes, you need pagination APP has this Solution proposal: Define <URL>?firstPage as a resource Triples are a strict subset of those in URL (no change of subject) Define :nextPage as a predicate 20 IBM Software Group | Rational software Solution 2 - Pagination example @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix bp: <open-serives.org/bp/>. @prefix : <http://example.org/test/>. :testCases?firstPage = { :proj1Testcases rdfs:member :testCase1; rdfs:member :testCase2. :testCases?firstPage bp:nextPage :testCases?page2 }. :testCases?page2 = { :proj1Testcases rdfs:member :testCase3; rdfs:member :testCase4. :testCases?page2 bp:nextPage rdf:nil }. 21 IBM Software Group | Rational software Problem 3 - I only want the metaData @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix bp: <open-serives.org/bp/> @prefix : <http://example.org/test/>. :testCases?metaData = { :testcases bp:commonSubject :proj1TestCases; bp:commonPredicate rdfs:member. } 22 IBM Software Group | Rational software Basic Profile for LD Problem 2 - Now know where to put them and find them - what does a resource look like? 1. Use URIs as names for things. 2. Use HTTP URIs so that people can look up those names. 3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL). 4. Include links to other URIs so that people can discover more things. This is a terrific foundation, but … In practice, people have difficult interpreting this It doesn’t go far enough into detail for our purposes Proposed Solution 2 – the “Basic Profile for LD” Mostly just some elaboration on the 4 rules 23 IBM Software Group | Rational software Basic Profile – 12 rules (and counting) 1. BP Resources are HTTP resources that can be created, modified, deleted and read using standard HTTP methods. 2. BP Resources use RDF to define their states. 1. POST any resource to a BP collection, not just a BP resource 2. Use the null relative URI for the subject when POSTing 3. You can request an RDF/XML representation of any BP Resource. 1. Turtle, please. JSON, please. 4. BP clients use Optimistic Collision Detection during update. 5. BP Resources use standard media types. 24 IBM Software Group | Rational software Basic Profile – 12 rules (and counting) 6. BP Resources use standard vocabularies. Prefix Namespace URI dc http://purl.org/dc/terms/ rdf http://www.w3.org/1999/02/22-rdf-syntax-ns# rdfs http://www.w3.org/2000/01/rdf-schema# bp http://open-services.net/ns/basicProfile# xsd http://www.w3.org/2001/XMLSchema# 25 IBM Software Group | Rational software Basic Profile – 12 rules (and counting) 7. BP Resources use a restricted set of data types. Boolean: A Boolean type as specified by XSD Boolean http://www.w3.org/2001/XMLSchema#boolean, reference: XSD Datatypes. DateTime: A Date and Time type as specified by XSD dateTime http://www.w3.org/2001/XMLSchema#dateTime, reference: XSD Datatypes Decimal: A decimal number type as specified by XSD Decimal http://www.w3.org/2001/XMLSchema#decimal, reference: XSD Datatypes Double: A double floating point number type as specified by XSD Double http://www.w3.org/2001/XMLSchema#double, reference: XSD Datatypes. Float: A floating point number type as specified by XSD Float http://www.w3.org/2001/XMLSchema#float, reference: XSD Datatypes. Integer: An integer number type as specified by XSD Integer http://www.w3.org/2001/XMLSchema#integer, reference: XSD Datatypes. String: A string type as specified by XSD String http://www.w3.org/2001/XMLSchema#string, reference: XSD Datatypes. XMLLiteral: A literal XML value http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral 26 IBM Software Group | Rational software Basic Profile – 12 rules (and counting) 8. BP Resources set rdf:type explicitly. 9. BP clients expect to encounter unknown properties and content. 10.BP clients do not assume the type of a resource at the end of a link. 11.BP servers implement simple validations for Create and Update. 12.BP Resources use simple RDF predicates to represent links. 1. “Anchors” in addition, not instead 1. only if adding properties to links 2. Use reified RDF statements for anchors 27 IBM Software Group | Rational software Problem 4 - Shapes (Validation, constraints) RDFS and OWL do inferencing “If it walks like a duck, and quacks like a duck, it must be a duck” In software (as in real life) it is also useful to go the other way “If it wants to be a good duck, it must walk like one and quack like one” Everyone in software knows this model Database Schemas Object-oriented programming class definitions This model is actually very important for writing practical software Need to support more than one perspective on a duck It’s a food-source, a source of insulating materials, a hazard to aircraft, a bird Problem - this is a gap in RDF technologies 28 IBM Software Group | Rational software Proposed Solution 4 - Shapes @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix bp: <open-serives.org/bp/>. @prefix : <http://example.org/test/>. :teatCaseShape = { :testCaseShape a bp:Shape; bp:describedClass :TestCase; pb:propertyConstraint :testsRequirementConstraint }. :testsRequirementConstraint = { :testsRequirementConstraint a bp:propertyConstraint; bp:describedProperty :testsRequirement; bp:rangeShape :requirementShape }. 29 IBM Software Group | Rational software Problem 4 - Updating data on the web You use PUT, right? Observation – SQL has no equivalent of PUT Problem 4 – PUT is good for documents, not a good idea for data Read-only properties Independent perspectives Proposed solution 4 “Put” more energy into PATCH 30 IBM Software Group | Rational software Cross-product query Query server independent of storage, broader scope Schema-less query Exploit RDF XML and XQuery were tried, did not work as well 31 IBM Software Group | Rational software Implementing query on data not in a repository HTTP GET/PUT/POST/DELETE SPARQL RDF represent’n Jena RDF represent’n 32 IBM Software Group | Rational software UI Integration Problem 5: A test product wants to allow testers to open defects in another product Defects are customized – don’t know in advance what fields they have Defects have context (projects, components) and “meta-data” (defect states, defect types, …) Option 1 – teach test tool about defects, their types, constraints, contexts Option 2 – let the test tool use a UI component from the defect tool to create the defect allows much simpler interface, looser coupling Relies on human third-party Problem 6 – “hover help” on links Solution – UI preview protocol 33 IBM Software Group | Rational software Security Requirements Allow each product to have its own security administration and authentication Provide optional central place to configure users, integrate with LDAP etc Allow each product to manage it’s own access control lists, pre-conditions and post-actions Provide optional central federation of ACL administration Solutions For products that implement their own authentication, federate with OAuth For products that want to manage ACLs, provide federation Provide optional building blocks for new-construction that take over admin, authentication and ACLs 34 IBM Software Group | Rational software How do you write an Ajax UI? Attempt 1 – AJAX clients handle multiple-resources Pros: Responsive UI – no page refresh between resources Cons: Complex clients (1990s client-server) Closed system (can’t link to unexpected resources) Future attempt: Page-per-resource Ajax inside pages 35 IBM Software Group | Rational software Miscellaneous architecture/design Tags, not folders (still some controversy) Relationships/predicates, not collections (still some controversy) Exploit RDF, eschew WebDAV Programmable resources (but stay RESTful) Back links for inverse navigation? Discovery What is userID? Transactions 36 IBM Software Group | Rational software Need more help to shift to a web-centric design Old New Services REST Files Resources Repositories Linked open data J2EE apps Heterogeneous apps Tightly-coupled apps Loosely-coupled apps Relational/XML data/Objects RDF SQL/XQuery SPARQL authentication OAuth Folders Tags WebDAV RDF Collections RDF Predicates 37 IBM Software Group | Rational software Conclusion Linked Data could be huge in enterprise computing a very significant new application integration model Some significant gaps to close if we are to succeed Better definition and documentation of best practices, e.g. 12 practices of Basic Profile UI guidelines (resource-oriented UI) A few important specification gaps, e.g. POST-able collections Shapes/constraints/validation Security Let the W3C take up the challenge! 38