
Data Modeling is Underrated:
A Bright Future Ahead in the Grand Schema Things
Peter O’Kelly
Research Director
pokelly@burtongroup.com
www.burtongroup.com
pbokelly.blogspot.com
Thursday – November 30, 2006
All Contents © 2006 Burton Group. All rights reserved.
Data Modeling is Underrated
Agenda
• Synopsis
• ~7-minute summary
• Discussion
• Extended-play overview (for reference)
• Analysis
• Market snapshot
• Market trends
• Market impact
• Recommendations
2
Data Modeling is Underrated
Synopsis
• Data modeling used to be seen primarily as part of database
analysis and design -- for DBMS-nerds only
• There is now growing appreciation for the value of logical data
modeling in many domains, both technical and non-technical
• Historically, most data modeling techniques and tools have been
inadequate, and often focused more on implementation details than logical
analysis and design
• Pervasive use of XML and broader exploitation of metadata, along with
improved techniques and tools, is making data modeling more useful for all
information workers (as well as data-nerds)
• Data modeling is a critical success factor for XML – in SOA and elsewhere
• Data modeling is now
• A fundamental part of the back-to-basics trend in application development
• Key to effective exploitation of emerging applications and tools
• Essential to regulatory compliance (e.g., information disclosure tracking)
3
Data Modeling is Underrated
4
~7-minute summary
• Logical data modeling is often misunderstood and
underrated
• Models of real-world things (entities), attributes, relationships, and
identifiers
• Logical => technology-independent (not implementation models)
• Logical data modeling is not 1:1 with relational database design
• It’s as much about building contextual consensus among people as it is about
capturing model designs for software systems
• It’s also exceptionally useful for database design, however
• Some of the historical issues
• Costly, complex, and cumbersome tools/techniques
• Disproportionate focus on physical database design
Data Modeling is Underrated
~7-minute summary
• Logical data modeling is more relevant than ever before
• Entities, attributes, relationships, and identifiers
• None of the above are optional if you seek to
• Respect and accommodate real-world complexity
• Establish robust, shared context with other people
• Revenge of the DBMS nerds
• Not just for normalized “number-crunching” anymore…
• Native DBMS XML data model management => fundamental changes
• XQuery: relational calculus for XML
• SQL and XQuery have very strong synergy (see the sketch after this list)
• All of the capabilities that made DBMS useful in the first place apply to XML as well
as traditional database models
• DBMS price/performance and other equations have radically improved
• Logical modeling tools/techniques are more powerful and intuitive
• And less expensive
5
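To make the set-orientation point concrete, here is a minimal Python sketch (illustrative only; it is not XQuery and assumes no particular DBMS, and the element and column names are made up to match the deck's Customer/Dialogue example): the same "dialogues since a given date" selection is expressed once over relational-style rows and once over an XML representation, which is the sense in which SQL-style and XQuery-style processing complement each other.

# Illustrative sketch only: the same set-oriented selection over rows and over XML.
import xml.etree.ElementTree as ET

rows = [
    ("75912", "2005/06/18", "Data architecture"),
    ("91641", "2003/12/13", "SIP/SIMPLE"),
    ("017823", "2004/10/14", "Portal"),
]

# SQL-flavored: a declarative selection over a set of tuples
recent_rows = [r for r in rows if r[1] >= "2004/01/01"]

doc = ET.fromstring("""
<dialogues>
  <dialogue customer="75912" date="2005/06/18" topic="Data architecture"/>
  <dialogue customer="91641" date="2003/12/13" topic="SIP/SIMPLE"/>
  <dialogue customer="017823" date="2004/10/14" topic="Portal"/>
</dialogues>""")

# XQuery-flavored: the same selection, expressed over XML elements
recent_xml = [d.get("topic") for d in doc.findall("dialogue")
              if d.get("date") >= "2004/01/01"]

print(recent_rows)
print(recent_xml)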
Data Modeling is Underrated
6
~7-minute summary
• XML-based models are useful but insufficient
• Document-centric meta-meta-models are not substitutes for techniques
based on entities, attributes, relationships, and identifiers
• Some XML-centric techniques have a lot in common with pre-relational data
model types (hierarchical and network navigation) or mutant “object database”
models
• XML also has ambiguous aspects, much like the unfortunate “Entity-Relationship” (E-R) model
• Logical data modeling is not ideal for document-oriented scenarios
(involving narrative, hierarchy, and sequence; optimized for human
comprehension)
• But a very large percentage of XML today is data-centric rather than document-centric (the two are contrasted in the sketch after this list)
• And increasingly pervasive beyond-the-basics hypertext (with compound and
interactive document models) is often more data- than document-centric
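To illustrate the data-centric versus document-centric distinction, here is a small, hypothetical Python sketch (the markup is made up, not any standard vocabulary): the first fragment is regular and record-like and decomposes directly into attributes of a Customer entity; in the second, narrative, mixed content, hierarchy, and sequence carry the meaning, so flattening it into a record loses most of what matters.

# Hypothetical XML fragments, for illustration only.
import xml.etree.ElementTree as ET

data_centric = """
<customer id="75912">
  <name>NewBank.com</name>
  <industry>Financial services</industry>
  <renewalDate>2006/05/28</renewalDate>
</customer>"""

document_centric = """
<report>
  <title>Data Modeling is Underrated</title>
  <para>Logical data modeling is often <emphasis>misunderstood</emphasis>
  and underrated, as the sections that follow argue.</para>
</report>"""

c = ET.fromstring(data_centric)
record = {"CustomerID": c.get("id"), **{child.tag: child.text for child in c}}
print(record)   # maps cleanly onto entity/attribute/identifier concepts

# The document-centric fragment parses too, but order, nesting, and mixed
# content are the point; a flat record would discard most of its meaning.
ET.fromstring(document_centric)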
Data Modeling is Underrated
~7-minute summary
• Ontology is necessary but insufficient
• Categorization is obviously a useful organizing construct
• “Folksonomies” are also often very effective
• But…
• Categorization is just one facet of modeling
• Many related techniques encourage insufficient model detail, creating
ambiguity and unnecessary complexity (e.g., for model mapping)
• So…
• We’re now seeing microformats and other new words
• … that are fundamentally focused on logical data model concepts
• It’d be a lot simpler and more effective to start with logical data models
in the first place
7
Data Modeling is Underrated
Discussion
8
[Extended-play version] Analysis
Market snapshot
• Data modeling concepts
• Data modeling benefits
• Data modeling in the broader analysis/design landscape
• Why data modeling hasn’t been used more pervasively
9
Market Snapshot
10
Data modeling concepts: the joy of sets
• Core concepts
• Entity: a type of real-world thing of interest
• Anything about which we wish to capture descriptions
• More precisely, an entity is an arbitrarily defined but mutually agreed upon
classification of things in the real world
• Examples: customer, report, reservation, purchase
• Attribute: a descriptor (characteristic) of an entity
• A customer entity, for example, is likely to have attributes including
customer name, address, …
• Relationship: a bidirectional connection between two entities
• Composed of two links, each with a link label/descriptor
• Example: customer has dialogue; dialogue of customer
• Identifier: one or more descriptors (attributes and/or relationship links)
that together uniquely identify entity instances, e.g., CustomerID (see the sketch after this list)
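A minimal sketch of the four concepts in code, using Python dataclasses purely as notation (this is not any modeling tool's API; names follow the customer/dialogue example used throughout the deck): Customer and Dialogue are entities, their fields are attributes, the customer field of Dialogue carries the relationship, and the identifiers are called out in comments.

# Notation-only sketch of entity, attribute, relationship, and identifier.
from dataclasses import dataclass

@dataclass
class Customer:              # entity
    customer_id: str         # identifier
    name: str                # attributes...
    industry: str
    address: str
    renewal_date: str

@dataclass
class Dialogue:              # entity
    customer: Customer       # relationship link: "dialogue of customer"
    date: str                # attributes...
    topic: str
    analyst: str
    # identifier: the combination (customer, date) uniquely identifies a dialogue

acme = Customer("017823", "Acme Widgets", "Manufacturing", "123 Main Street...", "2005/10/14")
d = Dialogue(acme, "2004/10/14", "Portal", "Craig Roth")
print(d.customer.name, d.topic)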
Market Snapshot
11
Data modeling concepts: example data model fragment diagram
[Diagram: Customer entity (identifier: CustomerID; attributes: CustomerName, CustomerIndustry, CustomerAddress, CustomerRenewalDate) connected by a relationship to the Dialogue entity (attributes: DialogueDate, DialogueTopic, DialogueAnalyst)]
Following Carlis/Maguire (from their data modeling book):
• About each customer, we can remember its name, industry, address, renewal
date, and ID. Each customer is identified by its ID.
• About each dialogue, we can remember its customer and its date, topic, and
analyst. Each dialogue is identified by its customer and its date.
[Note: this model fragment is an example and is not very well-formed]
Market Snapshot
12
Data modeling concepts: example data model instance
Customer
CustomerID (PK1) | CustomerName | CustomerIndustry   | CustomerAddress  | CustomerRenewalDate
017823           | Acme Widgets | Manufacturing      | 123 Main Street… | 2005/10/14
75912            | NewBank.com  | Financial services | 456 Central…     | 2006/05/28
91641            | Degrees 4U   | Education          | P.O. Box 1642…   | 2004/12/31

Dialogue
CustomerID (PK1, FK1) | DialogueDate (PK1) | DialogueTopic     | DialogueAnalyst
75912                 | 2005/06/18         | Data architecture | Peter O’Kelly
91641                 | 2003/12/13         | SIP/SIMPLE        | Mike Gotta
017823                | 2004/10/14         | Portal            | Craig Roth

PKn: participates in primary key
FKn: participates in foreign key

Bonus: it’s very simple to create instance models (and thus relational database designs) from well-formed logical data models
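The instance data above can be checked mechanically against the model's identifiers. A small, illustrative Python check (rows copied from the tables above) confirms that (CustomerID, DialogueDate) uniquely identifies dialogue instances and that every dialogue's CustomerID resolves to a customer, which is the foreign-key role noted in the tables.

# Instance rows from the example tables above (illustration only).
customers = {
    "017823": ("Acme Widgets", "Manufacturing", "123 Main Street...", "2005/10/14"),
    "75912":  ("NewBank.com", "Financial services", "456 Central...", "2006/05/28"),
    "91641":  ("Degrees 4U", "Education", "P.O. Box 1642...", "2004/12/31"),
}
dialogues = [
    ("75912",  "2005/06/18", "Data architecture", "Peter O'Kelly"),
    ("91641",  "2003/12/13", "SIP/SIMPLE",        "Mike Gotta"),
    ("017823", "2004/10/14", "Portal",            "Craig Roth"),
]

# (CustomerID, DialogueDate) is the Dialogue identifier: no duplicates allowed.
keys = [(cust, date) for cust, date, _, _ in dialogues]
assert len(keys) == len(set(keys)), "duplicate dialogue identifier"

# CustomerID in Dialogue plays the foreign-key role: it must resolve to a customer.
assert all(cust in customers for cust, _, _, _ in dialogues), "dangling CustomerID"
print("instance data is consistent with the model's identifiers")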
Market Snapshot
13
Data modeling benefits
• Precision and consistency
• High fidelity models
• Which are easier to maintain in order to reflect real-world changes
• Improved
• Ability to analyze, visualize, communicate, collaborate, and build consensus
• Potential for data reuse
• A fundamental DBMS goal
• Easier to recognize common shapes and patterns
• Impact analysis (e.g., “what if” assessments for proposed changes)
• Exploitation of tools, servers, and services
• DBMSs and modern design tools/services assume well-formed data models
• “Being normal is not enough”…
• SOA, defined in terms of schemas, requires data model consensus
Market Snapshot
Data modeling in the broader analysis/design landscape
• Four dimensions to consider
• Data, process, and events
• Roles/concerns/views: strategic, operational, and technology
• Logical and physical
• Current/as-is and goal/to-be states
14
Market Snapshot
15
Data, process, and events
• Think of nouns, verbs, and state transitions (see the sketch after this list)
• Data: describes structure and state at a given point in time
• Process: algorithm for accomplishing work and state changes
• Event: trigger for data change and/or other action execution
• Integrated models are critically important
• Data modeling, for example, is guided by process and event analyses
• Otherwise scope creep is likely
• There is no clear right/wrong in data modeling
• Scope and detail are determined by the processes and events you wish
to support, and they often change over time
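A toy Python sketch of the nouns/verbs/state-transitions framing (names are made up for illustration): the dataclass is the data, the function is a process that accomplishes a state change, and the event handler is the trigger that invokes it.

from dataclasses import dataclass

# Data: structure and state at a given point in time
@dataclass
class Reservation:
    reservation_id: str
    status: str = "requested"

# Process: an algorithm for accomplishing work and a state change
def confirm(reservation: Reservation) -> None:
    reservation.status = "confirmed"

# Event: the trigger for the change (a direct call here, standing in for,
# say, a payment-received message)
def on_payment_received(reservation: Reservation) -> None:
    confirm(reservation)

r = Reservation("R-1001")
on_payment_received(r)
print(r.status)   # confirmed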
Market Snapshot
16
Market Snapshot
Roles/concerns/views
• Three key dimensions
• Strategic
• Organization mission, vision, goals, and strategy
• Operational
• Data, process, and event details to support the strategic view
• Technology
• Systems (applications, databases, and services) to execute operations
• Again pivotal to maintain integrated models
• Data modeling that’s not guided by higher-level goal modeling can
suffer from scope creep and become an academic exercise
17
Market Snapshot
Logical and physical
• Another take on operational/technology
• Logical: technology-independent data, process, and event models
• Examples:
• Entity-Relationship (ER) diagram
• Data flow diagram (process model)
• Physical: logical models defined in software (see the sketch after this list)
• (Doesn’t imply illogical…)
• Examples
• Data definition language statements for database definition, including details such
as indexing and table space management for performance and fault tolerance
• Class and program modules in a specific programming language
• Integration and alignment between logical and physical are key
• But are often far from ideal, in practice today
18
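To make the logical-versus-physical step concrete, here is a hedged sketch (hand-rolled Python, not any vendor's forward-engineering feature) that turns a small technology-independent description of the Customer entity into SQL data definition language text; column types, and later indexing and table space details, are physical decisions made at this stage rather than in the logical model.

# Illustrative logical-to-physical mapping: a logical entity description is
# turned into generated CREATE TABLE text (physical model).
logical_customer = {
    "entity": "Customer",
    "attributes": ["CustomerID", "CustomerName", "CustomerIndustry",
                   "CustomerAddress", "CustomerRenewalDate"],
    "identifier": ["CustomerID"],
}

def to_ddl(entity: dict, default_type: str = "VARCHAR(100)") -> str:
    # Physical choices (data types, keys, and later indexes/table spaces)
    # are introduced here, not in the logical model.
    cols = [f"  {a} {default_type}" for a in entity["attributes"]]
    cols.append(f"  PRIMARY KEY ({', '.join(entity['identifier'])})")
    return f"CREATE TABLE {entity['entity']} (\n" + ",\n".join(cols) + "\n);"

print(to_ddl(logical_customer))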
Market Snapshot
19
Current/as-is and goal/to-be states
• Combining as-is/to-be states and logical/physical
• Logical / current (as-is): technology-independent view of current systems
• Logical / goal state (to-be): real-world model unconstrained by current systems
• Physical / current (as-is): systems already in place; the stuff we need to live with…
• Physical / goal state (to-be): new system view with high-fidelity mapping to the logical goal state
Market Snapshot
Why data modeling hasn’t been used more pervasively
• So, why isn’t everybody doing this?...
• Data modeling is hard work
• Historically
• Disproportionate focus on physical modeling
• Inadequate techniques and tools
• Suboptimal “burden of knowledge” distribution
• Reduced “green field” application development
• Data modeling has a mixed reputation
20
Market Snapshot
Data modeling is hard work
• It’s straightforward to read well-formed data models, but
it’s often very difficult to create them
• Key challenges
• Capturing and accommodating real-world complexity
• Dealing with existing applications and systems
• Organizational issues
• Collaboration and consensus-building
• Role definitions and incentive systems that discourage designing for
reuse and working with other project teams
• Politics
21
Market Snapshot
Historically disproportionate focus on physical modeling
• Radical IT economic model shifts during recent years
• Design used to be optimized for scarce computing resources
including MIPs, disk space, and network bandwidth
• The “Y2K crisis” is a classic example of the consequences of placing
too much emphasis on physical modeling-related constraints
• Relatively stand-alone systems discouraged designing for reuse
• Now
• Applications are increasingly integrated, e.g., SOA
• Hardware and networking resources are abundant and inexpensive
• The ability to flexibly accommodate real-world changes is mission-critical
• Logical modeling is more important than ever before
22
Market Snapshot
Historically inadequate techniques and tools
• Tendency to focus on physical, often product-specific
(e.g., PeopleSoft or SAP) models
• Lack of robust repository offerings
• Making it very difficult to discover, explore, and share/reuse models
• Entity-Relationship (ER) “model”
• More of an ambiguous and incomplete diagramming technique, but
still the de facto standard for data modeling
23
Market Snapshot
24
Tangent: ER, what’s the matter?
• Entity Relationship deficiencies
• Per E. F. Codd [1990]
• “Only the structural aspects were described; neither the operators upon
those structures nor the integrity constraints were discussed.
Therefore, it was not a data model
• The distinction between entities and relationships was not, and is still
not, precisely defined. Consequently, one person’s entity is another
person’s relationship.
• Even if this distinction had been precisely defined, it would have added
complexity without adding power.”
Source: Codd, The Relational Model for Database Management, Version 2
Market Snapshot
Tangent: ER, what’s the matter?
• Many vendors have addressed some original ER limitations, but the
fact that ER is ambiguous and incomplete has led to considerable
problems
• The Logical Data Structure (LDS) technique is much more consistent
and concise, but it’s only supported by one tool vendor (Grandite)
• It’s possible to use the ER-based features in many tools in an LDS-centric approach, however
• Ultimately, diagramming techniques are simply views atop an
underlying meta-meta model
• The most useful tools now include
• Well designed and integrated meta-meta models
• Options for multiple view types, including data, process, and event logical
views, as well as assorted physical views
25
Market Snapshot
26
Historically inadequate techniques and tools
• Unfortunate detours such as overzealous object-oriented analysis
and design
• Class modeling is not a substitute for data modeling
• “Everything is an object” and system-assigned identifiers often mean
insufficient specificity and endless refactoring
• Fine to capture entity behaviors and to highlight generalization, but you still
need to be rigorous about entities, attributes, relationships, and identifiers
• No “Dummies’ Guide to Logical Data Modeling”
• E.g., normalization: a useful set of heuristics for assessing and fixing
poorly-formed data models (see the sketch after this list)
• But there has been a shortage of useful resources for people who seek to
develop data modeling skills – in order to create well-formed data models in the
first place
• Result: often intimidating levels of complexity…
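As a small example of the kind of fix those heuristics point to (made-up rows, and a deliberately poorly-formed starting point; the e-mail addresses are hypothetical identifiers invented for the illustration): analyst details are repeated in every dialogue row, so a correction has to be made in many places; treating the analyst as an entity with its own identifier records each fact exactly once.

# Poorly formed: analyst details repeated per dialogue row (update anomalies).
dialogues_flat = [
    ("75912",  "2005/06/18", "Data architecture", "Peter O'Kelly", "pokelly@example.com"),
    ("017823", "2004/10/14", "Portal",            "Craig Roth",    "croth@example.com"),
    ("75912",  "2006/01/09", "Repositories",      "Peter O'Kelly", "pokelly@example.com"),
]

# Better formed: Analyst becomes an entity with its own identifier, and each
# dialogue simply refers to it.
analysts = {
    "pokelly@example.com": "Peter O'Kelly",
    "croth@example.com":   "Craig Roth",
}
dialogues = [
    ("75912",  "2005/06/18", "Data architecture", "pokelly@example.com"),
    ("017823", "2004/10/14", "Portal",            "croth@example.com"),
    ("75912",  "2006/01/09", "Repositories",      "pokelly@example.com"),
]
print(analysts[dialogues[0][3]])   # analyst name stored once, referenced many times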
Market Snapshot
27
Historically inadequate tools and techniques
• An Object Role Modeling (ORM)
example
• Consistent and concise
• But also overwhelming
• Doesn’t scale well for more
complex modeling domains
• Useful for some designers
• But not as useful for
collaborative modeling with
subject matter experts who
don’t seek to master the
technique
Source: http://www.orm.net/pdf/ORMwhitePaper.pdf
Market Snapshot
Historically suboptimal “burden of knowledge” distribution
• Following Carlis: knowledge is generally captured in three places
• Resource managers/systems such as DBMSs
• Applications/programs
• People’s heads
• Universally-applicable data, process, and event details are ideally
captured in DBMSs
• Applications can be circumvented and are often cruelly complex
• People come and go (and take their knowledge with them)
• But in recent years, DBMSs have been relegated to reduced roles
• Suboptimal in many data modeling-related respects
• Often meant inappropriate distribution of the burden of knowledge
• DBMSs (and thus data modeling) are now resurgent, however
28
Market Snapshot
Reduced “green field” application development
• Following the enterprise shift toward purchased-and-customized applications
such as ERP and CRM
• Start with models supplied by vendor
• Usually with major penalties for extensive customization
• So we often see enterprises changing their operations to match purchased applications
instead of the other way around
• In many cases, packaged applications
• Follow least common denominator approaches in order to support multiple DBMS
types
• Capture universally-applicable data/process/event model facets at the application
tier instead of in DBMSs
• Far from ideal distribution of the burden of knowledge
• Trade off increased complexity for increased generality
• Good for application vendors; not always so good for customer organizations
• Overall, this has often resulted in
• Reduced incentives and utility for data modeling
• Many organizations deferring to application suppliers for data models, often with
undesirable results such as “lock-in” and endless consulting
29
Market Snapshot
Recap: data modeling has a mixed reputation
• Because of the historical challenges
• The return on data modeling time investment has been far from ideal
because of
• Lack of best practices, techniques, and tools
• Environmental dimensions that reduced the utility of data modeling
• Many enterprise data modeling projects became IT full-employment acts
• With endless scope creep, unclear milestones, completion criteria, and return
on investment
• As a result, enterprise data modeling endeavors have become scarcer
during recent years, with the relentless IT focus on ROI and TCO
• Obviously an untenable situation
• Both IT people and information workers are increasingly making decisions
when they literally don’t know what they’re talking about, due to the lack of
high quality and fidelity data models
30
Analysis
Market trends
• Back to data basics
• Broader and deeper data modeling applicability
• Availability of more and better data models
• Simpler and more effective techniques and tools
• Increasing data modeling utility, requirements, and risks
31
Market Trends
Back to data basics
• Growing appreciation for
• The reality that all bets are off if you’re not confident you have
established consensus about goals, nouns, verbs, and events
• Software development life cycle economic realities
• It’s much more disruptive and expensive to correct models as you go
through analysis, design, implementation, and maintenance phases
• Less expensive hardware and networking means the return on time
investment for logical modeling is increasing while the return for
physical modeling is decreasing
• Indeed, emerging model-driven tools increasingly make it possible for the
logical model to serve as the application specification, with penalties for
developers who insist on endlessly tweaking the generated physical
models (code)
32
Market Trends
Broader and deeper data modeling applicability
• SOA is one of the most significant data modeling-related developments of recent years
• All about services, but with a deep data model prerequisite
• Don Box: services share schema and contract, not class (see the sketch after this list)
• From a DBMS-centric world view, web services => pragmatic XML evolution
• Parameterized queries, as in DBMS stored procedures
• Structured and grouped query results
• SOA has also driven the need for web services repository (WSR) products
• Increasingly powerful tools for information workers have also expanded the
applicability of data modeling
• An early example: Business Objects – focused on making data useful for more
people through data model abstractions
• Similar capabilities are now available throughout products such as Microsoft Office
• Recent developments such as XQuery will dramatically advance the scope and
power of applied set theory
33
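A hedged illustration of “schema and contract, not class” (a Python sketch; in practice the message shapes would typically be XML Schema types behind a service contract, and the names here are made up): the service's public face is the message shape plus a parameterized, set-valued operation, with no implementation class exposed.

# Illustration only: the contract is the message shape, not an implementation class.
from typing import List, TypedDict   # TypedDict requires Python 3.8+

class DialogueSummary(TypedDict):    # stands in for a schema-defined message type
    customer_id: str
    dialogue_count: int

def get_dialogue_summaries(since: str) -> List[DialogueSummary]:
    """Parameterized, set-valued operation: schema-defined input and output."""
    dialogues = [("75912", "2005/06/18"), ("91641", "2003/12/13"), ("017823", "2004/10/14")]
    counts: dict = {}
    for cust, date in dialogues:
        if date >= since:
            counts[cust] = counts.get(cust, 0) + 1
    return [{"customer_id": c, "dialogue_count": n} for c, n in counts.items()]

print(get_dialogue_summaries("2004/01/01"))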
Market Trends
34
Availability of more and better models
• Resources such as books focused on the topic area,
e.g., Carlis/Maguire and David Hay’s Data Model
Patterns
• Products that include expansive data models, ranging
from ERP to recent data model-focused offerings such as
• NCR Teradata’s logical data model-based solutions
• “Universal model” resources from enterprise architecture tool
vendors such as Visible Systems
• Based on decades of in-market enterprise modeling experience
Market Trends
Availability of more and better models
• Standards groups and initiatives, such as
• ACCORD
• Open Application Group
• OASIS Universal Business Language
• Models developed by enterprises and government
agencies, e.g.,
• Canada’s Integrated Justice Information (IJI) initiative
• Provides a data model and context framework for all aspects of law
enforcement
• No magic: a multi-year effort with pragmatic hard work and governance
• Similar initiatives are now under way in the United States and
other countries
35
Market Trends
Simpler and more effective techniques and tools
• Most now include
• Cleaner separation of concerns and more intuitive user
experiences
• For data modeling: ER subsets/refinements that reduce
ambiguity and notational complexity
• And support view preferences with variable levels of detail
• Integrated meta-meta models and unified repositories
• Supporting enterprise architecture models such as the Zachman
Framework as navigational guides
• Although there’s still a perplexing lack of repository-related standards
36
Market Trends
Data modeling in the enterprise architecture landscape
• Relative to the Zachman Framework
Source: http://www.zifa.com/
37
Market Trends
38
Simpler and more effective techniques and tools
• Most now include (continued)
• Model-driven analysis and design tools
• Building on virtualization and application frameworks with declarative
services for transactions, security, and more
• Even more incentive to focus more on logical models and less on physical
models
• More powerful and robust forward- and reverse-engineering
capabilities
• To transform physical => logical as well as logical => physical
• Many are also available at much lower cost
• And some open source modeling tools have emerged
Market Trends
39
Increasing data modeling utility, requirements, and risks
• To recap: much more utility from effective data modeling
• Related trends and risks
• Regulatory compliance requirements, especially concerning
information disclosure
• Impossible to track what’s been disclosed (both by and to whom) if you
don’t know what you’re managing and who has access to it
• Increasing demand for reverse-engineering tools in order to better
understand existing systems and interactions
• “Cognitive overreach” – the potential for information workers to create
nonsensical queries based on poorly-designed data models
• The queries will often execute and return arbitrary results (see the sketch after this list)
• With which people will make equally arbitrary business decisions
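A contrived Python sketch of how cognitive overreach happens (all names and numbers invented): in a poorly-designed model, two unrelated things share a lookalike "region" column, so a join on that column still executes and returns plausible-looking but arbitrary numbers.

# Contrived, poorly-designed model: "region" in orders is a sales region,
# while "region" in support_tickets is a data-center region. Nothing in the
# model says the two columns mean different things.
orders = [("O-1", "East", 1200.0), ("O-2", "West", 800.0)]
support_tickets = [("T-7", "East", "outage"), ("T-8", "East", "latency")]

# An information worker "joins on region"; the query runs and returns numbers...
revenue_vs_tickets = {}
for _, region, amount in orders:
    matching = [t for t in support_tickets if t[1] == region]
    revenue_vs_tickets[region] = (amount, len(matching))

print(revenue_vs_tickets)   # ...but the result is arbitrary, because the two
                            # "region" columns describe different real-world things.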
Analysis
Market impact
• Pervasive data modeling and model-driven
analysis/design
• Vendor consolidation and superplatform alignment
• Potentially disruptive market dynamics
40
Market Impact
41
Pervasive data modeling and model-driven analysis/design
• No longer optional (never really was)
• Most of today’s software products assume effective data modeling
• Using a DBMS or an abstraction layer such as Microsoft’s ADO.NET with
poorly-designed data models results in significant penalties
• Often implicit, e.g., in
• Information worker-oriented tools such as the query and data manipulation
tools included in Microsoft Office
• Not a recent development – e.g., consider > $1B annual market for products
such as Apple Filemaker Pro and Microsoft Access – but rapidly expanding
• Future offerings such as Microsoft Vista and Microsoft Office 2007, which are
deeply data model- and schema-based
• For documents, messages, calendar entries, and more, all with extensible
schemas and tools for direct information worker metadata manipulation actions
Market Impact
Vendor consolidation and superplatform alignment
• A familiar pattern: commoditization, standardization, and
consolidation, resulting in
• Significant merger/acquisition activity
• Shifting product categories, in this context including
• Specialized/focused modeling tools
• Including widely-used products such as Microsoft Visio
• Enterprise architecture/application lifecycle management tool suites
• Essentially CASE++, with more and better integrated tools, deeper standards
support, and often with support for strategic views
• Examples: Borland, Embarcadero, Grandite, Telelogic, Visible
• Superplatform-aligned tool suites
• IBM, Microsoft, and Oracle, for example, all either already offer or plan to soon offer end-to-end model-driven tool suites
• IBM currently has a significant market lead, through its Rational acquisition
• Broader support for interoperability-focused standards initiatives such as
XMI (OMG’s XML Metadata Interchange specification)
42
Market Impact
43
Vendor consolidation and superplatform alignment
[Diagram: some CASE and modeling tool vendor merger/acquisition activity, e.g., Bachman, Cayenne, Cadre, KnowledgeWare, and Sterling into Computer Associates, along with Texas Instruments IEF, LogicWorks ERwin, and Platinum; SDP S-Designor into PowerSoft into Sybase (PowerDesigner); Visio into Microsoft; Popkin into Telelogic; TogetherSoft into Borland; Rational into IBM]
Some CASE and modeling tool vendor merger/acquisition activity
Market Impact
44
Potentially disruptive market dynamics
• Opportunities for new or refocused entrants, e.g.,
• Adobe: a potential leader in WSR following its acquisition of Yellow Dragon
Software
• Adobe doesn’t offer data modeling tools, but it has a broad suite of tools that
exploit XML and data models
• The urgent need for WSR products could result in SOA-centric repository
offerings expanding to encompass more traditional repository needs as well
• Altova: expanding into UML modeling from its XML mapping/modeling
franchise
• Microsoft: Visual Studio Team System (VSTS) is Microsoft’s first direct foray
into modeling tools
• It used to offer Visual Modeler, a little-used OEM’d version of Rational Rose
• VSTS won’t initially include data modeling tools, but they are part of the plan
for future releases
• MySQL AB: acquired an open source data modeling tool (DBDesigner 4)
and is preparing to reintroduce an expanded version (which will remain open
source)
Market Impact
Potentially disruptive market dynamics
• New challenges for UML, with significant implications
• UML is the most widely-used set of diagramming techniques today, but it’s
not particularly useful for data modeling, and it has some ambiguities and
limitations
• Microsoft and some other vendors believe domain-specific languages
(DSLs) are more effective than UML for many needs
• If UML falters, vendors that have placed strategic bets on UML (such as
Borland, IBM, and Oracle) will face major challenges
• Open source modeling initiatives
• Some examples
• Argo UML
• MySQL’s future Workbench tool set
• MyEclipse: $29.95 annual subscription for multifaceted tools with modeling
• These initiatives will accelerate modeling tool commoditization and
standardization
45
Market Impact
46
The U in UML stands for “unified,” not “universal”
• UML is in some ways ambiguous and is not a substitute
for data modeling
• Some tools include UML profiles for data modeling, however
• UML profiles are similar to domain specific languages in many respects
• It’s not clear that UML is ideal for meta-meta-meta
models
• UML represents unification of three leading diagramming techniques,
but it’s not universally applicable
• UML is much better than not using any modeling/diagramming tools,
but it’s not a panacea
• Although it’s getting more expressive and consistent, with UML v2
Analysis
Recommendations
• Think and work in models
• Build and use model repositories
• Create high-fidelity modeling abstractions for SOA
• Revisit modeling tool vendor assumptions and
alternatives
• Respect and accommodate inherent complexity
47
Recommendations
Think and work in models
• Develop skills and experience in
• Thinking at the type level of abstraction
• Using set-oriented query tools/services
• Data modeling utility now extends far beyond database analysis
and design
• Information workers who have effective data modeling skills will be much
more productive
• Use data modeling to analyze, visualize, communicate, and collaborate
• Provide guidance in
• Data modeling training and tools
• Selecting appropriate tools
• Don’t use ambiguous or incomplete diagramming techniques
• Making resources available in models
48
Recommendations
Build and use model repositories
• Do not
• Needlessly recreate/reinvent models
• Default to exclusively extrapolating models from existing XML
schemas or query results
• Reality check: that’s how most XML-oriented modeling is done today,
but it often propagates suboptimal designs and limits reuse
• This may seem familiar: it repeats an early DBMS pattern, when many
developers simply moved earlier file designs into DBMSs rather than
checking design assumptions/goals
• Ensure policies and incentive systems are in place to
encourage and reward model sharing via repositories
• Add to data governance strategy
49
Recommendations
Create high-fidelity modeling abstractions for SOA
• SOA is rapidly becoming a primary means of facilitating
inter-application integration
• Robust SOA schema design entails abstraction layers
• Exposing public interfaces to private systems otherwise often means
propagating suboptimal data model design decisions
• Sharing services with users whom you may never actually meet
• Making unambiguous and robust models more important than ever
• WSR is likely to become a key part of enterprise model
repository strategy
• Encompassing contexts and models that aren’t exclusively SOA-focused
50
Recommendations
51
Revisit modeling tool vendor assumptions and alternatives
• Think form-follows-function
• Whiteboards and pencil & paper often suffice for information worker contexts,
and are generally more conducive to productive modeling sessions
• Enterprise architecture-related modeling, in contrast, should be done with
integrated and repository-based tool suites
• Align with superplatform commitments, e.g.,
• If IBM-focused, for instance, Rational is an obvious candidate
• Microsoft-focused customers need tactical plans until Microsoft delivers a
more comprehensive VSTS
• Oracle customers should revisit Oracle Developer Suite 10g, which includes
Oracle Designer
• Organizations using a mix of DBMSs can benefit from using tools from
specialists such as Embarcadero, Telelogic, and Visible Systems
• Explore open source-related modeling initiatives
• And expect very rapid open source modeling initiative expansion/evolution
Recommendations
Respect and accommodate inherent complexity
• Modeling is and will remain hard work
• Modeling is simpler and more effective when people can work with common
techniques, tools, repositories, and collections of high-fidelity data models
• But the real world is increasingly complex and dynamic, and effective models
must reflect those realities
• Politics and other inter-personal communication challenges are also not going
away, especially in “virtual” organizations
• Neither over-simplify nor over-reach
• Suboptimal modeling and design decisions can cause much more damage
in today’s SOA-centric world
• Means sub-optimally shifting the burden of knowledge
• Information worker-oriented power tools mean the potential for cognitive
overreach is rapidly rising for people who (directly or indirectly) work with
ambiguous or otherwise poorly-designed models
52
Conclusion
Data modeling is not just for databases anymore
• Data modeling is pivotal for analysis, visualization,
communication, and collaboration
• Organizations that do incomplete or otherwise
inadequate data modeling
• Will fail to fully exploit today’s leading tools, servers, and services
• Will not be able to meet regulatory compliance requirements,
especially for information disclosure
• Data modeling is not easy but it has a very strong return
on time investment
• It’s not optional, so enterprises need to do it well
• The timing and tools have never been better
53
Resources
Burton Group Content
• Business Process Modeling: Adding Value or Overhead?
• http://www.burtongroup.com/Content/doc.aspx?cid=838
• Data Modeling: Not Just for Databases Anymore
• http://www.burtongroup.com/content/doc.aspx?cid=732
• XML Modeling and Mapping: Tumultuous Transformation in the Grand Schema
Things
• http://www.burtongroup.com/Content/doc.aspx?cid=122
• Model-Driven Development: Rethinking the Development Process
• http://www.burtongroup.com/Content/doc.aspx?cid=121
Related Resources
• John Carlis, Joseph Maguire. Mastering Data Modeling: A User-Driven
Approach. Addison-Wesley, 2001.
• Jack Greenfield, Keith Short. Software Factories: Assembling Applications with
Patterns, Models, Frameworks, and Tools. Wiley, 2004.
• David C. Hay. Data Model Patterns: Conventions of Thought. Dorset House
Publishing, 1995.
• Martin Fowler. UML Distilled (3rd ed.). Addison-Wesley, 2004.
54
Data Model Examples
Basic wiki model
[Diagram: entities Workspace (Workspace ID, Workspace title, E-mail post ID, …), Page (Page title, …), Page version (Version number, Date/time, Version content, …; From/To backlink relationship between versions), Page view (Date/time, …), Workspace person (Date/time added, …), and Person (E-mail ID, Password, First name, Last name, Time zone, …)]
55
Data Model Examples
56
Socialtext wiki model
[Diagram: entities Site (URL, …), Workspace (Workspace ID, Workspace title, Workspace name, Logo image file name, Logo image URL, E-mail post ID, Technorati key, E-mail notify setting, Web services proxy URL, Weblog sort order, Display in My Workspaces, …), Workspace person (Display My Favorites flag, Side pane position, Hyperlinks underlined, E-mail notification frequency, E-mail notification sort sequence, E-mail notification change types, …), Category (Category name), My Favorites page, Team favorites page, Workspace navigation page, Page (Page category, Page title, Deleted status, …), Attachment (File name, Size, Date uploaded, …), Page version (Version number, Date/time, Version content, …; From/To backlink relationship), Page view (Date/time, …), and Person (E-mail ID, Admin, Force password reset, First name, Last name, Wikiwyg editor enabled, Password, E-mail notification preferences, Time zone, …)]