Foreword
David L. Cohn
In its early days, information technology focused on the capture, processing, storage and
transfer of data. For each step, structures and standards were established and served as
the foundation for subsequent phases. IBM’s universal punched cards captured data in
volume, preparing it for processing. Electronics and programming languages established
mechanisms and disciplines for that processing. Databases and query languages
formalized data storage, and communication protocols led to widely accepted data
communication.
Classical information technology has focused on processing data. Indeed, when I was
young (which some are not sure was ever the case), the field was called Data Processing.
It has primarily dealt with the applications that did the processing (defined by Glushko
and McGrath as “software artifacts that present, collect, and manipulate information”).
We have a vast literature on modeling, creating, defining, testing and describing these
processes. They are important because, without them, nothing would happen.
However, as we move comfortably into the 21st century, information technology is
evolving into Business Informatics. This term recalls the dramatic transformation
information technology brought to biology through bioinformatics. We’ll likely see
a similar impact on business.
With Business Informatics, we deal directly with the very concepts of data: what it
means, how it is represented and which elements are related. These meanings,
representations and relationships are present when data is structured into documents.
Documents have long been important, but HTML and the World Wide Web dramatically
increase their value. They’ve accelerated document exchange and emphasized the need
for structures and discipline. These structures and disciplines are what Document
Engineering is about, and the document-centric view is where this book is leading us.
Applications are to information technology as verbs (the action words) are to human
language. But human language would be useless without nouns (the actor words). In
fact, nouns play a larger role in language than verbs. According to Princeton University’s
Cognitive Science Laboratory, the English language has 114,648 distinct nouns but only
11,306 verbs (see wordnet.princeton.edu for a neat online lexical reference system).
However, language depends on both and on their close relationship.
Glushko and McGrath understand the dualism of information technology’s nouns and
verbs. They note, “it is undeniable that documents and processes have an inseparable and
complementary relationship.” However, the evolution of information technology has not
supported this duality. If it had, we would have the tools to model, create, define, test
and describe documents, just as we do for processes. Where are they?
They are in Document Engineering.
Unfortunately, the problem of creating these tools is hard. Just as there are ten times as
many nouns in English as verbs, we seem to have ten times as many ways of representing
information as of processing it. Glushko and McGrath have laid down an organized
approach to identify the key documents, canonize their representations and leverage these
to solve the larger problem. They have begun to develop the structure that will lead us to
the needed tool set.
And there is good news along the way.
The document view of Business Informatics may be more natural than the process
view. Documents are concrete entities, and people are comfortable agreeing on their
description and meaning; processes are abstract, and consensus is difficult. In the work
described in this book, and in related efforts covered elsewhere, document-based analysis
is proving to be a powerful technique for designing, building and managing information
systems.
The journey is, indeed, the proverbial thousand miles; this book has begun it with well
more than the usual single step. Fortunately, we don’t have to reach the final destination
to reap substantial rewards.
David L. Cohn
Director, Business Informatics
IBM Research
Yorktown Heights, New York