Modeling Information Manufacturing
Systems to Determine Information
Product Quality
Donald Ballou • Richard Wang • Harold Pazer • Giri Kumar Tayi
Management Science and Information Systems, State University of New York at Albany, Albany, New York 12222 (Ballou, Pazer, Tayi)
Total Data Quality Management (TDQM) Research Program, Room E53-320, Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139 (Wang)
Many of the concepts and procedures of product quality control can be applied to the
problem of producing better quality information outputs. From this perspective, information outputs can be viewed as information products, and many information systems can be
modeled as information manufacturing systems. The use of information products is becoming
increasingly prevalent both within and across organizational boundaries.
This paper presents a set of ideas, concepts, models, and procedures appropriate to information manufacturing systems that can be used to determine the quality of information products
delivered, or transferred, to information customers. These systems produce information products on a regular or as-requested basis. The model systematically tracks relevant attributes of
the information product such as timeliness, accuracy and cost. This is facilitated through an
information manufacturing analysis matrix that relates data units and various system components. Measures of these attributes can then be used to analyze potential improvements to the
information manufacturing system under consideration.
An illustrative example is given to demonstrate the various features of the information manufacturing system and show how it can be used to analyze and improve the system. Following
that is an actual application, which, although not as involved as the illustrative example, does
demonstrate the applicability of the model and its associated concepts and procedures.
(Data Quality; Timeliness of Information; Information Product; Information Systems; Critical Path)
1. Introduction
Product quality in manufacturing systems has become
increasingly important. The current emphasis on Total
Quality Management (TQM) is a manifestation of this
trend. Although increasing competition has heightened
attention to quality, quality control in manufacturing
systems has a long tradition (Shewhart 1931, Deming
1986, Feigenbaum 1991). Quality-driven organizations
continually strive to improve their products in a variety
of ways. Some changes are major, others minor, but
taken together over an extended period of time such
changes can yield profound improvements in the product’s overall quality.
As in manufacturing systems, information quality in
computer-based systems is becoming increasingly critical to many organizations. The current efforts toward
information highways and networked organizations
underscore the importance of information quality. Organizations are relying more on the quality of the raw
data and the correctness of processing activities that ultimately determine the information outputs. They
would obviously prefer that their information outputs
be of the highest possible quality. As with product manufacturing, however, cost must be taken into consideration. A workable goal, then, is to achieve the highest
possible information quality at a reasonable cost.
1.1. Information Manufacturing Systems
Many of the concepts and procedures of product quality
control can be applied to the problem of producing better quality information outputs. Use of the term information manufacturing encourages researchers and practitioners alike to seek cross-disciplinary analogies that
can facilitate the transfer of knowledge from the field of
product quality to the less well-developed field of information quality. We use the term information manufacturing advisedly. For the purposes of this research,
we refer to information manufacturing as the process
that transforms a set of data units into information
products. In addition, we refer to information manufacturing systems as information systems that produce predefined information products. We use the term information product to emphasize the fact that the information output has value and is transferred to the
customer, who can be external or internal.
The systems we model have an analogy in manufacturing known as made-to-stock. Made-to-stock items
are typically inventoried or can be assembled upon demand. Requests for such products can be readily satisfied because the materials, procedures, and processes
needed for their manufacture are known in advance. In
the realm of information systems an example would be
a request by a client to his or her financial advisor for a
portfolio risk analysis. Although this would be requested on an ad hoc basis, the data and programs
needed to perform the analysis would be in place ready
to be used.
In our context, a predefined data unit could be, for
example, a number, a record, a file, a spreadsheet, or a
report. A predefined processing activity could be an arithmetical operation over a set of primitive data units
or an operation such as sorting a file. An information
product could be a sorted file or a corrected mailing list.
This information product, in turn, can be a predefined
data unit in another information manufacturing system.
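To make these notions concrete, the attributes tracked for a data unit can be pictured as a small record. The following Python sketch is illustrative only; the class and field names are ours, not part of the paper's notation:

```python
from dataclasses import dataclass

@dataclass
class DataUnit:
    """A data unit together with the product attributes tracked in this paper.

    Illustrative only: the class and field names are not the paper's notation.
    """
    name: str          # e.g., "DU6"
    timeliness: float  # on the 0-to-1 scale of Section 2.2.1
    quality: float     # "data quality": a placeholder for any relevant dimension
    cost: float        # accumulated cost of producing this unit

# An information product is itself a data unit, so it can serve as a
# predefined data unit in another information manufacturing system:
mailing_list = DataUnit("corrected_mailing_list", 0.9, 0.95, 120.0)
```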
Viewing information systems in the light of what is
known about producing high quality manufactured
goods can be very useful. An example of the potentially
fruitful cross-pollination between manufacturing and
information systems is the concept of critical path. Identifying the critical path is, of course, a standard activity
in manufacturing. As will be shown in the illustrative
example, if one wishes to produce a certain information
output sooner, one should first concentrate on those activities on the critical path.
Although much can be gained by incorporating concepts and techniques from product manufacturing into
the realm of information manufacturing, the analogies
between the two fields have important limitations.
These limitations arise from the nature of the raw material used in information manufacturing, namely the
original or raw input data. A significant feature of data
is that, although it is used, it does not get consumed.
One might think of a file or a data base as analogous to
in-process inventory. Yet such inventory gets depleted,
whereas stored data can be reused indefinitely. In a
sense, a database is more analogous to a tool crib than
to inventory. With a tool crib, tools are used and then
returned; they are not consumed. However, even this
analogy heightens the differences between the two
kinds of manufacturing. Tools are used to produce the
manufactured product and are not incorporated into the
product as is the case with data from a data base. A
related issue is that producing multiple copies of an information product is inexpensive, almost trivial when
compared to manufactured products.
We consider four attributes of information products in
this paper: timeliness, data quality, cost, and value. The
term ‘‘data quality’’ is used in a generic sense, that is,
‘‘data quality’’ is a place holder for whatever dimensions
of data quality are relevant. If one is interested solely in
the data’s completeness, then one would replace the term
‘‘data quality’’ wherever it appears in this work with the
word completeness. It should be noted that we use the term
data quality for intermediate data products (those that
experience additional processing) and reserve the terms
information quality and information product for the final
product that is delivered to the customer.
Timeliness is usually considered a dimension of data
quality; see, for example, (Ballou and Pazer 1985). The
need to treat timeliness separately can be best understood by considering one of the ultimate goals of
this research: to permit changes to the information
manufacturing system, ranging from fine tuning to
reengineering, in the context of customer concerns regarding the information products. Many information
products are time-sensitive, and thus any efforts directed toward improving these products must explicitly
factor timeliness into the analysis.
1.2. Purpose and Scope of Paper
In this paper, we present a set of ideas, concepts, models, and procedures that form some of the building
blocks upon which a theoretical foundation for information manufacturing systems can be established.
Based on these building blocks, we present a methodology for determining information product attribute
values. In addition, we illustrate how these building
blocks can be used to study options in improving the
information manufacturing system.
In our context, information products can be produced
on a regular basis, for example standardized regular
billing such as monthly credit card statements. In some
cases there are few quality problems with an information product. However, timeliness and quality may be
conflicting goals. For example, the sooner credit card
bills are delivered to customers, the sooner the issuing
company would be reimbursed. Also, the card issuer
would be able to identify nonpayment problems sooner.
Speeding up the production of the monthly statement,
however, could compromise quality. Our work is designed to provide tools for analyzing how changing the
information manufacturing system would affect tradeoffs such as this one.
As is also the case with traditional product manufacturing systems, there is a partial separation between the
strategic issues of product mix and pricing and the managerial issues related to the efficient manufacture of the
desired products. In both cases, those designing the requisite manufacturing systems can make important contributions to these strategic decisions by determining
the economic and technical feasibility of possible variants of the product mix. While many of the strategic
issues relating to product mix and pricing extend well
beyond the domain of production planning, an important contribution of the production sector is in transforming product specifications into the desired components of the product mix. The concepts, techniques,
and procedures presented in this paper permit the designers to assess the impact of various information manufacturing scenarios on timeliness, quality, and cost attributes of the information product. If necessary, modifications to individual products and/or the product
mix can be made in light of such an assessment.
Thus, this paper does not explicitly address the significant issue as to whether the information products
are appropriate although it does facilitate analysis of
revamped systems that produce different, presumably
more appropriate, information products. It is not explicitly concerned with issues such as what kinds of
data to use or what kinds of processing are required,
but it does allow the designer to test out various alternatives. Also excluded in our present model are ad hoc
queries. If such queries are requested frequently
enough, they could be included in the analysis. However, if they are that well-defined, we have in some
sense a made-to-stock situation.
1.3. Background and Related Research
Organizations are now better equipped than ever to develop systems that use raw data originating from a variety of sources. Unfortunately, most databases are not
error free, and some contain a surprisingly large number of errors.1 It has long been recognized that data
problems can cause computer-based systems to perform
poorly. The need to ensure data quality in computer
systems has been addressed by both researchers and
practitioners for some time. A growing body of literature has focused on data quality: what it is, how to
achieve it, and the consequences arising when it is inadequate (Wang et al. 1995). The dimensions of data
quality have been studied (Ballou and Pazer 1995, Wang and Strong 1996). A model for tracking errors
through a system to determine their impact on the information outputs has been developed by Ballou and
Pazer (1985). Procedures for achieving data quality
have also been presented (Morey 1982, Ballou and Tayi
1989). Deficiencies in data that affect individuals' lives
have also been formally examined by various researchers. Laudon (1986) determined that the records of many
of those involved with the criminal justice system contain potentially damaging errors. The impact of errors in information on the likelihood of making correct decisions was analyzed by Ballou and Pazer (1990).

1. ''Databases are Plagued by Reign of Error,'' The Wall Street Journal, May 26, 1992.
Research efforts on data quality presented in the existing literature have addressed issues from the information systems perspective, but no general mechanism
has been proposed to systematically track attributes of
data. Our methodology allows for the systematic tracking of timeliness, quality, and cost. This capability can
be used to analyze an information manufacturing system and, based on the analysis, to experiment with various options. The ideas, concepts, model, and procedures proposed in this paper would be useful in providing a common set of terms and thus supporting the
building of a cumulative body of research in this domain. A major outcome of the work described in this
paper is a model-based approach for studying information manufacturing systems.
In the following section, we introduce the foundation
of our model. This model incorporates the various components of the information manufacturing system and
key system parameters including timeliness and data
quality, as well as value to the customer and cost of
information products. In §3, we use the model to provide a methodology for analyzing the impact of system
modifications on information product attributes. In §4,
this methodology is exemplified through an illustrative
example. Specifically, we focus on explaining the mechanics of the proposed methodology. Next, in §5, we
present a real-life application called the Optiserve case, with the goal of demonstrating the methodology's usefulness and ease of implementation in improving an actual information manufacturing system. Toward this end, we highlight the modeling nuances needed to accommodate the realistic aspects of this case, and outline the methods for acquiring appropriate data. Concluding remarks are found in §6.
2. Foundation of the Information Manufacturing Model
As previously stated, the term information manufacturing refers to a predefined set of data units which undergo predefined processing activities to produce information products for internal or external customers, or both. We postulate that each information product has an intrinsic value for a given customer, and we assume that the product's potential value to the customer may be diminished if it is untimely or of poor quality. The value of the information products can be improved by making appropriate changes to the information manufacturing system. The importance of doing this is attested by Hammer (1990). We seek to determine the key parameter values that will help to identify those changes to the system.

2.1. Modeling of Information Manufacturing Systems
To evaluate various system configurations, data units must be tracked through the various stages or steps of the information manufacturing process. Any of these steps has the potential to affect timeliness and data quality for better or worse. For example, introduction of additional quality control would enhance the quality of the data units but with a concomitant degradation in the timeliness measure. Also, improving a processing activity could result in higher levels of data quality and improved timeliness but increase the cost. The various components of the information manufacturing system are displayed in Figure 1.

[Figure 1. Components of the Information Manufacturing System]

The data vendor block represents the various sources of input raw data. Each vendor block can be thought of
as a specialized processing block, one that does not have
a predecessor block. Thus, one vendor (internal or external) can potentially supply several different types of
raw data. The role of the processing block is to add
value by manipulating or combining appropriate data
units. The data storage block models the placement of
data units in files or data bases where they are available
as needed for additional processing. The quality block
enhances data quality so that the output stream has a
higher quality level than the input stream. The customer
block represents the output, or information product, of
the information manufacturing system. It is used to explicitly model the customer, which is appropriate, as the
ultimate judgment regarding the information products’
quality and timeliness is customer-based.
We envision that the modeling of the information
manufacturing system would take place at an appropriate level of detail. For example, the effect of a quality
block could be modeled by specifying the fraction of apparently defective units entering and the fraction leaving.
At a more detailed level, the quality block splits the incoming stream into apparently good and defective subsets. The apparently defective subset is examined and
undergoes corrective action as appropriate. Depending
on the nature of the defects identified, the apparently
defective items could be split into additional subsets,
each of which would undergo different, appropriate corrective action. Associated with each of these subsets are
probability values giving the likelihood of Type I and
Type II errors, which, together with knowledge of the
original fraction of defectives, yields the fraction of apparently correct and defective units arriving at the next
block. Information regarding how this applies in other
contexts can be found in Ballou and Pazer (1982) and
Morey (1982). For this paper we have chosen not to
model at this level of detail.
The nature of the activities performed by the quality
control blocks is context-dependent. This is true even
for the same data quality dimension. For example, suppose that an information product is dependent upon a
form with blanks filled in by various parties. A quality
control check in this case could be a scan of the form by
a knowledgeable individual to identify missing information. Another type of completeness quality control
could be a verification that all stores have reported their
sales for the most recent period. An accuracy check
could be a comparison of this period’s and last period’s
results with outliers flagged for verification.
Figure 2 displays a simple information manufacturing system but one which captures many of the potential components and interactions. This system will be used throughout the paper to illustrate concepts, components, and procedures developed for the information manufacturing model. In this system there are five primitive data units (DU1–DU5) supplied by three different vendors (VB1, VB2, VB3). There are three data units (DU6, DU8, DU10) that are formed by having passed through one of the three quality blocks (QB1–QB3). For example, DU6 represents the impact of QB1 on DU2. There are six processing blocks (PB1–PB6) and accordingly six data units that are the result or output of these processing blocks (DU7, DU9, DU11, DU12, DU13, DU14).
There is one storage block (SB1) in Figure 2. The storage block is used both as a pass-through block (DU6 enters and exits SB1 intact and is passed on to PB3) and as the source for database processing (DU1 and DU8 are jointly processed by PB4). Note that the autonomy of the data units need not be preserved: a new data unit DU11 that involves DU1 and DU8 is formed. The system has three customers (CB1–CB3), each of whom receives some subset of the information products. Also note that multiple copies of data can be produced. For example, two copies of DU6 are produced and used subsequently by PB1 and PB3. Note that the placement of a quality block following a vendor block (similar to acceptance sampling) indicates that the data supplied by vendors is in general deficient with regard to data quality. For our illustrative example, the data unit DU2 has historically exhibited quality deficiencies, thus necessitating the quality block QB1.
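For readers who prefer a concrete representation, the topology of a system such as Figure 2 can be captured as predecessor lists. The sketch below encodes only the connections stated in the text; the dictionary form is our own convenience, not the paper's notation:

```python
# Partial encoding of the Figure 2 topology as predecessor lists. Only
# connections stated explicitly in the text are included; the full figure
# is not reproduced here.
predecessors = {
    "QB1": ["DU2"],                # QB1 checks DU2 and emits DU6
    "SB1": ["DU1", "DU6", "DU8"],  # storage: pass-through and database source
    "PB1": ["DU6"],                # first copy of DU6
    "PB2": ["DU8", "DU5"],         # per Section 3: from QB2 and VB3
    "PB3": ["DU6"],                # second copy of DU6, routed through SB1
    "PB4": ["DU1", "DU8"],         # joint database processing, forming DU11
    "PB6": ["DU10"],               # from the quality block QB3
}
```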
This modeling is similar to the use of data flow diagrams (DFD). ‘‘Vendor’’ and ‘‘customer’’ blocks are
analogous to ‘‘external entities,’’ the ‘‘Process’’ block to
‘‘function,’’ and the ‘‘Data Storage’’ block to ‘‘data
store.’’ We have deliberately chosen not to use this terminology and notation primarily to emphasize in our
exposition the analogy with product manufacturing, the
theme of this paper. Also, the concept of quality block
does not have a direct analogue in the DFD technique.
However, those wishing to use DFD techniques to
model an information manufacturing process certainly
could do so. This would take advantage of the knowledge of CASE tools held by many information systems professionals.

[Figure 2. An Illustrative Information Manufacturing System]
As will be explained in greater depth in §3, the data
units have associated with them vectors of characteristics
or parameters whose components change as a result of
passing through the various stages of the information
manufacturing process. What constitutes a data unit is
context-dependent. For example, if all fields for all records of a certain file possess the same timeliness and data
quality characteristics, and if the entire contents of the
file are processed in the same manner, then that file could
be treated as a single data unit. In contrast, if the fields
within a record differ markedly in terms of their timeliness and data quality attributes, then it would be necessary to model them individually. By this we mean that
each field of each record would be treated as a different
data unit. Clearly in practice compromises would have
to be made to avoid an inordinate quantity of data units,
but in theory there is no limit regarding their number.
The Optiserve case described in §5 illustrates how to
convert an actual situation into an information manufacturing system model of the type displayed in Figure 2. That case examines the current system and provides a basis for reengineering the system.
2.2. Measurement of Key System Parameters
In this section we present various formulas to measure
timeliness, data quality, and value. To use the information manufacturing model, these factors must be
quantified. A discussion preceding each formula identifies properties that any measure of the quality in question must possess. Some of these measures can be justified on the basis of previous research. Thus these formulas build upon the accumulated knowledge and can
be applied in a wide range of situations. That being said,
the precise expression for these measures is not critical
for the information manufacturing model. If in a certain
case or situation those responsible for implementation
of the information manufacturing system feel that a different set of formulas would be more appropriate, then
the analysis would proceed using their formulas instead
of the ones used here.
2.2.1. Timeliness. The timeliness of a raw or primitive data unit is governed by two factors. The first,
currency, refers to the age of the primitive data units
used to produce the information products. The second,
volatility, refers to how long the item remains valid. The
age of some data, that is, its currency, does not matter.
The fact that George Washington was the first president
of the United States remains true no matter when that
fact entered the system. In contrast, currency matters
in the case of a market free fall, when yesterday’s stock
quotes may be woefully out of date.
The currency dimension is solely a characteristic of
the capture of the data; in no sense is it an intrinsic property. The volatility of the data is, however, an intrinsic
property unrelated to the data management process.
(We may choose to manage volatile data in such a way
that it is reasonably current, but such activities do not
affect in any way the underlying volatility.)
2.2.1.1. Timeliness Measure for Primitive Data
Units. The first step in developing a measure for timeliness of a primitive data unit is to quantify the currency
and volatility aspects of timeliness. Both currency and
volatility need to be measured in the same time units.
It is natural to use time tags to indicate when the data
item was obtained; see, for example, Wang, Kon, and
Madnick (1993). This information is used to determine
an appropriate currency measure. The currency measure is a function of several factors: when the information product is delivered to the customer (Delivery
Time); when the data unit is obtained (Input Time); and
how old the data unit is when received (Age). These
factors can be combined to yield the following definition of currency.
$$\text{Currency} = (\text{Delivery Time} - \text{Input Time}) + \text{Age}. \qquad (1)$$
Note that the term in parentheses represents how long the data have been in the system, and the last term represents the time difference between when the real-world
event occurred and when the data was entered (Wand
and Wang 1996).
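A direct transcription of Equation (1), assuming all three quantities are expressed in the same time units:

```python
def currency(delivery_time: float, input_time: float, age: float) -> float:
    """Equation (1): time spent in the system plus the data's age at entry.

    All arguments must be expressed in the same time units.
    """
    return (delivery_time - input_time) + age
```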
As will be shown in the Illustrative Example of §4,
volatility is captured in a way analogous to the shelf life
of a product. Perishable commodities such as food
products are sold at the regular, full price only during
specified periods of time. Degradation of the product
during that time is not deemed to be serious. Similarly,
suppliers of primitive or raw data units and/or data
managers would determine the length of time during
which the data in question remain valid. This number,
which we refer to as shelf life, is our measure of volatility. The shelf life of highly volatile data such as stock
quotes or currency conversion tables would be very
short. On the other hand the shelf life of data such as
the name of the first president of the United States
would be infinite. The shelf life would be determined
by the data quality manager in consultation with the
information product consumers and of necessity is
product-dependent. If the information product is designed for customers who are long-term investors in the
stock market, then quotes in today’s paper regarding
yesterday’s close are more than adequate. If the product
is for customers who are ‘‘in and out’’ traders, then the
most recent trading price is appropriate. In the former
case shelf life is in terms of one or more days. In the
second case it is minutes or even seconds.
Our approach postulates that the timeliness of an information product is dependent upon when the information product is delivered to the customer. Thus timeliness cannot be known until delivery. The purpose of
producing a timeliness measure is to have a metric that
can be used to gauge the effectiveness of improving the
information manufacturing system. For comparison purposes, it is important to have an absolute rather than a
relative scale for timeliness. With this in mind we measure timeliness on a continuous scale from 0 to 1. A value of 1 is appropriate for data that meet the strictest timeliness standard; a value of 0 for data that are unacceptable from the timeliness viewpoint. The currency or overall age of a primitive data unit is good or bad depending on the data unit's volatility (shelf life). A large value for currency is unimportant if the shelf life is infinite. On the other hand, even a small value for currency can be deleterious to quality if the shelf life is very short. This suggests that timeliness is a function of the ratio of currency and volatility. This consideration in turn motivates the following timeliness measure for primitive data units:

$$\text{Timeliness} = \{\max[(1 - \text{currency}/\text{volatility}),\ 0]\}^{s} \qquad (2)$$

In §§4 and 5, volatility is measured in terms of shelf life; thus

$$\text{Timeliness} = \{\max[(1 - \text{currency}/\text{shelf life}),\ 0]\}^{s} \qquad (2a)$$
The exponent s is a parameter that allows us to control the sensitivity of timeliness to the currency-volatility ratio. Note that for high volatility (i.e., short shelf life) the ratio is large, whereas for low volatility (i.e., long shelf life) the ratio is small. Clearly, having that ratio equal to or close to zero is desirable. As that ratio increases, is the timeliness affected relatively little (s = 0.5, say), a lot (s = 2, say), or neither (s = 1)? The appropriate value for s is context-dependent and of necessity involves judgement. In Equation (2), volatility and the exponent s are given as inputs to the model, while currency is computed, as will be presented in §3 and illustrated in §4.
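Equations (2) and (2a) are simple enough to state directly in code. The sketch below is our own, with illustrative numbers; it also shows how the choice of s changes the verdict on the same data:

```python
def timeliness(currency: float, shelf_life: float, s: float = 1.0) -> float:
    """Equation (2a): timeliness of a primitive data unit on a 0-to-1 scale.

    shelf_life is the volatility measure; s controls sensitivity to the
    currency/shelf-life ratio. An infinite shelf life (e.g., historical
    facts) always yields a timeliness of 1.
    """
    if shelf_life == float("inf"):
        return 1.0
    return max(1.0 - currency / shelf_life, 0.0) ** s

# Data that are 2 days old against a 10-day shelf life, judged under
# three different sensitivities:
for s in (0.5, 1.0, 2.0):
    print(s, timeliness(2.0, 10.0, s))   # 0.894..., 0.8, 0.64
```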
2.2.1.2. Timeliness Measure for Output of Processing Blocks. Our goal is to attach a timeliness measure
to each information output. Each such output is the result of certain processing and various inputs. Each input
in turn can be the result of other processing and inputs.
Potentially each information output is dependent upon
several stages of processing and a host of primitive data
units. This convolution is addressed by considering one
block at a time. First, we focus on those blocks that involve processing activities, both arithmetical and nonarithmetical. Quality and storage blocks are treated
next. It is important to keep in mind that a timeliness
value is computed and attached to each process output.
Timeliness is actually measured only for primitive data
units.
Arithmetical Operations. Even simple cases present problems. Suppose, for example, that output value y is the difference of input values x1 and x2, i.e., y = x1 − x2. Assume further that x1 has a very good measure for timeliness whereas x2 has a poor measure for timeliness. If x1 = 1000 and x2 = 10, then the timeliness value for y is very good. Conversely, should x1 = 10 and x2 = 1000, the timeliness value for y is poor. Clearly any composite timeliness value must take magnitudes into account. How the variables interact must also be accounted for. If, for example, x1 and x2 have the timeliness measures described above, and are of roughly equal magnitudes, then outputs y1 = x1 + x2 and y2 = x1 · x2 clearly differ in how the poor level of timeliness of x2 impacts the timeliness of the outputs. From the calculus we know
that given a function y = f(x1, x2, . . . , xn), where xi = xi(t), then

$$\frac{dy}{dt} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,\frac{dx_i}{dt}.$$
This expression captures how the dependent variable
is affected by changes in time t. More importantly from
our perspective, it accounts for the interaction among
the independent variables. We, of course, are not concerned with rates of change of the variables with respect
to time. Still the above can provide guidance regarding
a timeliness measure for a computed output. Ordinarily
one would expect that if the timeliness value for each
of the inputs were 1, then the timeliness value for the
output would be excellent, undoubtedly equal to 1 also.
Conversely, if all primitive data items possess a timeliness value of 0, one would expect the timeliness value
for any resulting output of the processing blocks to be
0 as well. Considerations such as these motivate our
definition for timeliness of the output of a processing
block that involves arithmetical computations. Let T(xi) denote the timeliness measure for xi and let y = f(x1, x2, . . . , xn) be an information output. Then we propose the following to represent or measure the timeliness of y:

$$T(y) = \frac{\sum_{i=1}^{n} w_i\,T(x_i)}{\sum_{i=1}^{n} w_i}, \quad \text{where } w_i = \left|\frac{\partial f}{\partial x_i}\right| \cdot |x_i|. \qquad (3)$$
Equation (3) is a weighted average of the T(xi). It is assumed that each of the terms above is evaluated using those values that determine the output value for y. (If y = x1 − x2 and x1 = 1000, x2 = 10, then these values will be used as appropriate in Equation (3).) Note that if T(xi) = 0 for all i, then T(y) = 0, and if T(xi) = 1 for all i, then T(y) = 1. The dependence of the timeliness of y on the interactions of the xi is captured in a manner analogous to the chain rule of the calculus. Finally, the need to involve the magnitudes of the values is explicitly modeled. The absolute values ensure that the range 0 to 1 is preserved and that positive and negative values do not cancel each other. As indicated, if a different formula would be more appropriate, then the analysis
would proceed in the same manner using that formula.
This is demonstrated in the Illustrative Example.
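Equation (3) can be sketched as follows; the finite-difference estimate of the partial derivatives is our own convenience (an implementation could equally supply them analytically), and the sample values reprise the difference example above:

```python
def output_timeliness(f, xs, ts, eps=1e-6):
    """Equation (3): timeliness of y = f(x1, ..., xn) as a weighted average
    of the input timeliness values ts, with weights |df/dxi| * |xi|.

    Partial derivatives are estimated numerically here purely for
    convenience. Assumes the weights do not all vanish at the point xs.
    """
    y0 = f(xs)
    weights = []
    for i, x in enumerate(xs):
        bumped = list(xs)
        bumped[i] = x + eps
        dfdx = (f(bumped) - y0) / eps      # numerical estimate of df/dxi
        weights.append(abs(dfdx) * abs(x))
    total = sum(weights)
    return sum(w * t for w, t in zip(weights, ts)) / total

# The difference example: y = x1 - x2, T(x1) = 0.9 (good), T(x2) = 0.2 (poor).
diff = lambda x: x[0] - x[1]
print(output_timeliness(diff, [1000.0, 10.0], [0.9, 0.2]))  # ~0.893: x1 dominates
print(output_timeliness(diff, [10.0, 1000.0], [0.9, 0.2]))  # ~0.207: x2 dominates
```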
It is important to note that because of the currency
component, timeliness measures cannot be stored.
Rather they must be determined at the time the information product is delivered to the customer. Delivering
the same information product to different customers at
different times would result in different timeliness values for these customers.
Nonarithmetical Operations. Data units can undergo processing that does not involve any arithmetical
operations. For some types of data the processing does
not change the timeliness value. For example, if the data
unit is a file and the process is to sort the file, then the
output data unit would have the same timeliness measure as the input data unit. Recall that the timeliness
measure ultimately depends upon the volatility of the
raw data and the time the customer receives the information product. Built into the latter value is the time
for, say, sorting the file. Also, if the activity should be
to extract a subset from a data unit, the resulting (subset) data unit would inherit the timeliness measure from
the (superset) data unit. Another situation would be
combining all or a portion of two or more data units.
For example, suppose two data units (files) are merged.
Then a natural timeliness measure for the resulting data
unit would be some weighted average of the timeliness
values for the original data units. This is consistent with
the timeliness value for computed outputs. (Recall that
Equation (3) is essentially a weighted average of the
timeliness measures of the input data units.) The
weights could reflect the size of the data units that are
merged, their importance or some combination of attributes. The example in §4 illustrates these concepts and
methodology using equal weights for the inputs to processing blocks.
2.2.1.3. Quality and Storage Blocks. The timeliness
measure for the output of a quality block is the same as
that for the input data unit. Again this is so even though
quality control consumes time, the justification being
that all timeliness measures ultimately depend upon
that point in time when the customer receives the product. Thus time for one specific activity is already incorporated. If only some of the units pass through a quality
control block, then the subsets would have to be modeled separately and different timeliness measures could
result. Analogously for storage activity, the timeliness
of a retrieved data unit is that of the stored data unit,
and for combinations of data units weighting is appropriate.
2.2.2. Data Quality. It is also important to be able
to assess, in an overall sense, the quality of the information products. These products, as discussed before,
are manufactured in multiple stages of processing and
are based on data that have various levels of quality.
For our model we need to determine how each type of
block affects the quality of the input stream. Some cases
are straightforward. For example, the storage of data
does not affect its quality. (This assumes there is no external event such as accidental erasure of data.) If the
incoming data to a storage block has a certain level of
incompleteness, then the outgoing data has the same
level. For the vendor block it is necessary to know the
quality of the primitive data units. Determining this precisely can be difficult and may require, for example, a
statistical analysis similar to that used by Morey (1982).
Alternatively, these values can be estimated by using
some combination of judgment based on past experience and quality control procedures such as information
audits. In any case, the quality estimations for the primitive data units are exogenous to the system being modeled.
For the quality block typically the output data quality
is better than the input data quality. The magnitude of
the improvement must be determined by the analyst or
furnished to the individual. As with timeliness, weighting or inheritance is appropriate for certain types of processing. (Should the processing block function be to sort
a file, then the value for quality out would be inherited
from the value of quality in.) The least straightforward
case is a processing block that involves arithmetical operations, which we now discuss.
Let DQ(xi ) denote a measure of the data quality of
data unit xi . As stated above, estimating the values for
the DQ(xi )s is an issue of concern only for the primitive
data units. Suppose all the inputs to some stage are
themselves outputs of other stages. Then the appropriate data quality measures have already been determined by applying at those previous stages the expression given below. As before, we use a scale from 0 to 1
as the domain for DQ(xi ) with 1 representing data with
no quality problems and 0 those with intolerable quality. If all data items should have a data quality measure
equal to 1 and if all processing is correct, then the output
quality measure should be 1 as well. Conversely, if the
quality of all inputs is 0, then the quality of the output
should be 0 as well.
Given this reasoning, we form a weighted average of
the DQ(xi ) values for the data quality of the output. Let
y be determined by data items x1, x2, . . . , xn, i.e., let y = f(x1, . . . , xn). Then the Data Component (DC), an estimate for the data quality of output y resulting solely from deficiencies in the input units, can be obtained from

$$DC = \frac{\sum_{i=1}^{n} w_i\,DQ(x_i)}{\sum_{i=1}^{n} w_i}, \quad \text{where } w_i = \left|\frac{\partial f}{\partial x_i}\right| \cdot |x_i|. \qquad (4)$$

Note that DC satisfies 0 ≤ DC ≤ 1; DC = 0 if, and only if, DQ(xi) = 0 for all i; and DC = 1 if, and only if, DQ(xi) = 1 for all i. Once again, DC involves the magnitude of the input values and the interactions among the data. Formulas analogous to (4) were used by Ballou and Pazer (1985).
Although it has been implicitly assumed that the processing activities are computerized, this need not be the
case. In most systems some of the processing activities,
such as data entry, have manual components. Especially
in this situation, and to a lesser degree with fully computerized systems, the processing itself can introduce
errors. Let PE be a measure of processing effectiveness. If PE = 1, then the processing never introduces errors. If PE = 0, then the processing corrupts the output to such a degree that the data quality measure for that output should be 0. Thus, the output quality of y, DQ(y), is determined by both input data quality and processing effectiveness, i.e.,

$$DQ(y) = f(DC, PE). \qquad (5)$$
There are various possibilities for this relationship. For
example, one such functional relationship is
$$DQ(y) = \sqrt{DC \cdot PE}. \qquad (6)$$

Note that DQ(y) = 1 if, and only if, both DC and PE equal 1. Also, DQ(y) = 0 if either DC or PE is 0. Finally, if DC = PE should hold, then DQ(y) has the same value as DC and PE.
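A short sketch of Equations (4) and (6) together, with illustrative weights and quality values of our own choosing:

```python
import math

def data_component(weights, dqs):
    """Equation (4): weighted average of the input quality values DQ(xi),
    using the same weights |df/dxi| * |xi| as in Equation (3)."""
    return sum(w * q for w, q in zip(weights, dqs)) / sum(weights)

def output_quality(dc, pe):
    """Equation (6), one candidate form of Equation (5): the geometric mean
    of the data component DC and the processing effectiveness PE."""
    return math.sqrt(dc * pe)

# Two inputs of quality 0.9 and 0.6 with weights 1000 and 10, passed through
# a processing block that is 95% effective (numbers are illustrative):
dc = data_component([1000.0, 10.0], [0.9, 0.6])   # ~0.897
print(output_quality(dc, 0.95))                    # ~0.923
```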
The data quality of the data items changes, of course,
as these values undergo a series of processing and quality control activities. The inputs for a given process may
well be outputs from other processes. Thus whenever a
data value undergoes processing or quality control, the
resulting quality measure of the output needs to be recorded so that the information is available for determining quality control of any subsequent outputs. If the
processing is complex, it may be necessary to substitute
subjectively derived quality response functions for the
calculus-based analysis. For example, this would be
necessary if the processing block involves a linear program. In such cases, one could specify that the output
quality is some function of the average input quality.
This function could be determined by simulation.
2.2.3. Cost of Information Product. To evaluate the
effectiveness of improving the system, it is necessary to
compare changes in value to the customer with changes
in cost. However, costing in a multi-input, multi-output
environment such as the one we model is difficult to do
and the approaches available are often controversial.
Implications of costing for multiple use systems have
been recognized and difficulties encountered in trying
to predict and track costs have been considered in Kraemer, Dutton, and Northrop (1981). Nevertheless, because of its importance, there is a substantial body of
literature dealing with the pricing of computer services;
see, for example, (Kriebel and Mikhail 1980).
In our methodology we adopt a cost accumulation
and pro rata allocation approach which, although ad
hoc, facilitates the estimation of the information product’s cost in a straightforward manner. As long as this
costing approach is used consistently in evaluating all
the possible options, it would not lead to erroneous decisions.
2.2.4. Value to the Customer. Ultimately, of course,
the measure that counts is the value of the product to
the consumer. This has been emphasized in both manufacturing and information systems environments. Our
approach is to hypothesize an ideal product, one with
100% customer satisfaction. Any actual product would
deviate from the ideal on several dimensions. Since our
concern is with evaluating alternative system designs
so as to improve either timeliness or data quality or
both, it is natural in this context to limit consideration
of the determinants of value to these dimensions. Thus
for each customer C, the actual value VA is a function of the intrinsic value VI, the timeliness T, and the data quality DQ, i.e.,

$$V_A = f_C(V_I, T, DQ). \qquad (7)$$

Given the above mechanism for measuring an information product's timeliness and data quality, a functional form for VA could be

$$V_A = V_I\,\bigl(w \cdot DQ^{\,a} + (1 - w) \cdot T^{\,b}\bigr). \qquad (8)$$
Here VI, w, a, and b are customer dependent. The weight w is a number between 0 and 1 (inclusive) and captures the relative importance to the customer of product quality and product timeliness. For example, w = 1 implies timeliness is of no concern to the customer, whereas w = 0.5 implies that quality and timeliness are equally important. The exponents a and b reflect the customer's sensitivity to changes in DQ and T. Variants of Equation (8) have been utilized in several previous research efforts; see, for example, (Ahituv 1980).
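Equation (8) in code, with default parameter values that are purely illustrative; in practice VI, w, a, and b would be elicited from the customer:

```python
def actual_value(vi, dq, t, w=0.5, a=1.0, b=1.0):
    """Equation (8): actual value as the intrinsic value scaled by a
    weighted mix of data quality and timeliness. The defaults are purely
    illustrative; w, a, and b are customer dependent."""
    return vi * (w * dq ** a + (1.0 - w) * t ** b)

# A customer who weights timeliness heavily (w = 0.2) and is quite
# sensitive to it (b = 2):
print(actual_value(vi=100.0, dq=0.9, t=0.8, w=0.2, a=1.0, b=2.0))  # 69.2
```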
3. A Methodology for Determining
Information Product Attributes
The concepts and methods described in the previous
section form the basis for the methodology introduced
in this section. The purpose of this methodology is to
determine information product attribute values that can
be used to suggest ways to improve the system.
Timeliness, quality, and cost attribute values provide
the producer with a means to assess the potential value
of the information product for customers. As with product manufacturing, however, the producer should analyze the information manufacturing process with the
goal of improving one or more of these values. Doing
this may well result in degradation of the other attribute
values but should in an overall sense enhance the value
of the information product for the customer. The producer would have to determine whether the tradeoff is
beneficial. To improve the timeliness value, for example, one needs to change the system. There are two ways
to accomplish this. Modifying data gathering procedures so as to obtain data which are more current is one
approach. The other is to modify the system so as to
process the data more rapidly. In both cases a mechanism is needed that can be used to determine what approach would produce the largest improvement in
timeliness in a cost effective manner. For this we present
the information manufacturing analysis matrix, a tabular structure.
The Information Manufacturing Analysis Matrix has
one row for each of the data units, primitive and computed. With the exception of those blocks representing
the data vendors, the matrix has one column for every
block. Should a particular data unit pass through an
activity block, then associated with the cell determined
by the appropriate row and column is a five-component
vector, the components of which are described below.
The Information Manufacturing Analysis Matrix for the
system displayed in Figure 2 is presented in Figure 4 as
part of the Illustrative Example found in §4. The entries
found in certain cells of the matrix shown in Figure 4
indicate that the data unit passes through that activity
block. Note that there is a five-component vector of parameters associated with that cell. Recall that SB1 is used
as a pass-through for data units DU1, DU6, and DU8.
In order to determine appropriate modifications to
the system, it is necessary to track time, data quality and
cost. The information needed for this is first described
in general terms and then followed by a discussion of
these parameters for each of the different types of activity blocks. Listed below are the five components of the
vector of parameters.
p: This specifies the predecessor or originating block.
For example, the predecessors of PB2 are QB2, the origin of DU8, and VB3, the origin of DU5.
t1: This represents the time when the data unit is available for the activity. For example, the value for t1 for the vector associated with (DU10, PB6) is that time when DU10 is ready for the processing block PB6. In the special case when p is a vendor block (VBi), then t1 = Input Time as given in Equation (1), the expression for currency.
t2: This is the time when the processing begins. Processing cannot start until all data units needed for that block are available. Also, processing may begin at a scheduled time tsch. Thus t2 is the larger of max{t1's} and tsch.
DQI: This is the quality of the incoming data unit for a particular activity. It is, of course, the same as the data quality of the output of the predecessor block. As mentioned above, we use the term ''data quality'' as a place holder for whatever dimension or dimensions of data quality are of concern to management (with the exception of timeliness).
CostI: This represents the cost of the incoming data unit. In essence, CostI is the prorated accumulated cost of all the previous activities that this data unit has undergone. (We assume that if a data unit is passed on to more than one activity, then CostI for each is determined in some prorated fashion. This implies that total cost is preserved.)
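The five-component vector and the rule for t2 can be sketched as follows; the field and function names are ours, and the sample numbers are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Cell:
    """The five-component vector attached to a (data unit, activity block)
    cell of the Information Manufacturing Analysis Matrix."""
    p: str           # predecessor or originating block
    t1: float        # time the data unit is available for the activity
    t2: float        # time the processing actually begins
    dq_in: float     # DQ_I: quality of the incoming data unit
    cost_in: float   # Cost_I: prorated accumulated cost so far

def start_time(t1_values, t_sch):
    """The t2 rule: processing starts once every required data unit is
    available, but no earlier than the scheduled time t_sch."""
    return max(max(t1_values), t_sch)

# Hypothetical cell for (DU10, PB6): DU10 is ready at t = 4,
# but PB6 is scheduled to run at t = 5.
cell = Cell(p="QB3", t1=4.0, t2=start_time([4.0], 5.0), dq_in=0.92, cost_in=35.0)
```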
We now examine in some detail the implications of
the parameters for each of the activities. For this it is
useful to use the notation DQO to refer to the output
quality resulting from some process and CostO as the
cost of the output. DQO is computed using the concepts
and expressions given in §2, and set equal to DQI of
successor blocks. Also CostO is the sum of all input CostI
plus the cost of the block.
Processing Block. For a processing block the interpretation for each of the parameters is straightforward.
Information regarding the cost, processing time, and
impact on quality for each processing block needs to be
furnished to the analyst. Assuming arithmetical operations are involved, the quality of the output is computed
using Equation (5). This is then the value of DQI for all
blocks that receive that output. The time when the processing is complete is used as the t1 value for all blocks
that receive that output. If there is a delay in getting the
data unit to the next activities, it may be necessary to
include a delay parameter as a descriptor of a processing block. This concept also applies to quality and storage blocks.
Quality Block. A quality block can operate on only
one data unit at a time. It may, however, process different
data units at different times. It is necessary that the t's reflect this. DQI is simply, of course, the DQO of the previous process. Determination of the DQO of the quality block would require something akin to the statistical analysis described by Morey (1982). If t2 > t1 should hold, then time may be needlessly wasted at this step. A positive value for (t2 − t1) would reflect a scheduling delay or a queuing delay. If an entire file is being
checked, we assume that the file is not ready to be passed
on until all corrections and changes have been made.
Storage Block. The value for t2 is that time at which
storing of the data unit commences. Assuming storage
of data cannot affect quality, DQI = DQO. If a certain
subsequent process should require a subset of some
data unit (part of a relational table, for example), then
that subset inherits the data quality properties of the
data unit. A data unit is modeled for each subset, even
if they should come from the same original data unit.
Information regarding the cost and storage time also
needs to be furnished to the analyst. Note that storage
time is how long it takes to store the data unit. The data
unit is available for subsequent processing any time af-
ter it is stored. The amount of time between when the
data unit is stored and when it is used does not affect
the overall timeliness value of information products delivered to customers unless the storage block should lie
on the critical path. In this case, timeliness is affected by
the Delivery Time component of Equation (1).
Customer Block. For the customer block, t1 represents the time the product is finished, t2 the time it is received. For on-line delivery systems t1 = t2 could hold. For this activity DQI has the value of DQO of the final process that generated the product. CostO = CostI, assuming the cost of delivery is negligible. If delivery affects cost or quality, then the impact can be modeled as an additional processing block.
3.1. Timeliness, Quality and Cost of Information
Products: Customer Perspective
The above structure provides the basis for making
changes to the information manufacturing system by
allowing one to quantify the parameters of timeliness,
quality and cost. We now discuss issues related to this
in the context of the customer’s needs. Determination of
customer needs regarding these parameters, especially
for external customers, can be made using market research techniques such as focus groups.
Timeliness. A value for the timeliness of an information product for a particular customer cannot be determined until the customer has received the information
product. This value can be determined by first computing the timeliness values T(xi ) for each of the primitive
data units provided by the vendors using Equation (2).
These values are then available as input to subsequent
activities. As previously discussed, sometimes the timeliness value for an output from an activity block differs
from the input values, sometimes they are the same.
Whenever arithmetical processing is involved, then
Equation (3) would be invoked. Activities such as quality control affect timeliness via the Delivery Time component in Equation (1). Note that the need to wait until
the time when the information product is delivered to
the customer necessitates a ‘‘second pass’’ through the
system in order to compute the timeliness values (to be
illustrated in §4). If timeliness is specified and measured
in terms of a contracted delivery date, then the second
pass through the system is not necessary, as the ‘‘delivery time’’ is prespecified. However, the timeliness
analysis is still important as it may be possible to deliver
the product sooner and hence gain a competitive advantage.
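To make these mechanics concrete, the sketch below implements the currency and timeliness calculations in Python under the definitions above: currency is delivery time minus input time plus age at input (the Delivery Time component of Equation (1)), and Equation (2a) maps currency to a timeliness score in [0, 1] with volatility exponent s. This is a minimal illustration; the function names are ours, chosen for exposition.

```python
def currency(delivery_time, input_time, age):
    # Currency of a data unit at delivery (Equation (1)): the time it has
    # spent in the system plus its age when it entered.
    return delivery_time - input_time + age

def timeliness(currency_value, shelf_life, s):
    # Timeliness (Equation (2a)): decays from 1 (perfectly current) to 0
    # at or beyond the shelf life; the exponent s shapes the rate of decay.
    return max(1.0 - currency_value / shelf_life, 0.0) ** s
```

These two functions are reused in the worked example of §4, where, for instance, a data unit delivered at time 13 with age 2 and shelf life 30 yields timeliness(15, 30, 2) = 0.25.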
The information product’s timeliness measure is, of
course, a number between 0 and 1. Whatever the number happens to be, its significance or interpretation is
customer dependent. Some customers might find the information product’s timeliness to be satisfactory, others
not. If it is determined that the product is not sufficiently
timely, then this timeliness value serves as a benchmark
to determine the impact on timeliness of various
changes that could be made to the information manufacturing system. Using the framework described in this
section, one can reengineer the system and re-compute
the timeliness, quality and cost values. For example, one
possible way to improve timeliness might be to eliminate a specific quality block, which could enhance the
ultimate timeliness at the expense of quality.
Quality. The quality parameter is simpler in that the quality measure of the product delivered to the customer is that associated with the output of the final activity block. Again, though, its meaning or significance is customer-dependent: the customer might find the quality totally satisfactory, completely unsatisfactory, or anything in between. The importance of the number is that it serves as a benchmark for gauging the magnitude of the quality improvement resulting from changes to the system. Since different customers will perceive the same quality differently, some finding it fine while others demand enhancement, the information producer must in some sense optimize total value across all products and all customers.
3.2. Cost and Value
The cost of the information product is of interest to the
producer. The customer is concerned with value received and price paid, the latter being an issue beyond
the scope of this paper. As discussed, both quality and
timeliness influence value, as do many other factors
which we deliberately have not modeled. To perform
the analysis, some functional expression relating these
quantities is required. Solely for the purposes of discussion we use Equation (8).
To maximize the net value (total value minus total cost) received by all customers for all information products, one must obtain information from the customers regarding each product's intrinsic and actual value, together with the customer's perceptions regarding timeliness and quality. Suppose there are M customers and N information products. Then for each customer i and product j, an expression of the form of Equation (8) applies, namely
$$V_A(i, j) = V_I(i, j)\left[\,w(i, j)\,DQ(j)^{a(i,j)} + (1 - w(i, j))\,T(i, j)^{b(i,j)}\right]. \qquad (9)$$
Many of the $V_I(i, j)$ values would be zero. The double subscript on T is necessary since the same product could be delivered to different customers at different times. Only a single subscript for product quality is required, as the computed quality measure of a particular product is the same for all customers. The customer's sensitivity to improvements in data quality and timeliness can be handled via the exponents $a(i, j)$ and $b(i, j)$, respectively.
The producer wishes to optimize net value. Assuming appropriate and consistent units, the problem is given by

$$\text{Maximize} \quad \sum_{i=1}^{M} \sum_{j=1}^{N} \left[V_A(i, j) - C(i, j)\right] \qquad (10)$$

$$\text{subject to} \quad 0 \le T(i, j) \le 1, \; 1 \le i \le M; \qquad 0 \le DQ(j) \le 1, \; 1 \le j \le N.$$
Given the structure of $V_A$ in Equation (9), this is a nonlinear optimization problem. Here $C(i, j)$ represents the portion of the cost of product j assigned to customer i. In practice the formulation is used to evaluate a small set of candidate information manufacturing system configurations rather than solved as a traditional optimization problem.
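A minimal sketch of how Equations (9) and (10) might be evaluated for one candidate configuration follows; the record layout is an illustrative assumption of ours, not part of the model.

```python
def actual_value(v_i, w, dq, t, a=1.0, b=1.0):
    # V_A per Equation (9): intrinsic value V_I scaled by a weighted
    # combination of data quality DQ^a and timeliness T^b.
    return v_i * (w * dq ** a + (1.0 - w) * t ** b)

def net_value(entries):
    # Objective of Equation (10): total actual value minus allocated cost,
    # summed over all (customer i, product j) pairs with nonzero V_I.
    return sum(
        actual_value(e["v_i"], e["w"], e["dq"], e["t"], e["a"], e["b"]) - e["cost"]
        for e in entries
    )
```

The constraints of (10) are satisfied automatically, since the DQ and T values produced by the model already lie in [0, 1]; in practice net_value would simply be compared across the handful of candidate configurations.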
We have presented a methodology for determining the timeliness, quality, and cost of information products. In the next section, an example illustrates some of the conceptual and computational issues encountered when the information manufacturing model is applied to real-world scenarios. It also illustrates the business impact of possible changes to the system.
4. Illustrative Example
For continuity, the system depicted in Figure 2 and described in the previous section will be used. Figure 3 presents the descriptive inputs required to compute the timeliness, quality, cost, and value characteristics of information products to be delivered to an array of customers.

[Figure 3: Data for Illustrative Example. (a) Descriptive inputs required for the primitive data units; (b) descriptive inputs required for the processing blocks; (c) descriptive inputs required for the quality blocks; (d) descriptive inputs required for the storage block.]
Figure 3(a) identifies the seven descriptive inputs required for each of the five primitive data units. For example, DU2 is obtained from the first vendor at a cost of 10, is of intermediate quality, and is already 2 time units old when it enters the system at the beginning of the information manufacturing process. It is highly volatile, with a shelf life of only 30 time units and a second-degree timeliness function (i.e., $s = 2$ in Equation (2a)).
By contrast, only four descriptive inputs are required for each of the six processing blocks, as shown in Figure 3(b). For example, PB2 has a cost of 30 and requires 4 time units to complete. As noted in the previous section, when processing is complex it may be necessary to substitute subjectively derived quality response functions for the calculus-based analysis; that is the approach followed in this example. For PB2, the output quality is equal to the square of the average (unweighted) quality of the two input data units (DU5 and DU8). The output of this block is available to the next block without delay.
Each quality block likewise requires only four descriptive inputs, shown in Figure 3(c). QB2 has a cost of 40 and requires 8 time units to complete. Once again a subjectively derived quality output function is employed: the effect of this quality block is to eliminate 75% of the difference between the quality of the input flow and the best achievable quality (i.e., $Q_{out} = 1.0$). This output is also available without delay.
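Both subjectively derived quality response functions can be stated compactly; a sketch (the function names are ours):

```python
def pb2_output_quality(q_du5, q_du8):
    # PB2: the square of the unweighted average of the two input qualities.
    return ((q_du5 + q_du8) / 2.0) ** 2

def qb2_output_quality(q_in):
    # QB2: eliminates 75% of the gap between the input quality and the
    # best achievable quality, Q_out = 1.0.
    return q_in + 0.75 * (1.0 - q_in)
```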
Figure 3(d) presents the three descriptive inputs required for the storage block. For simplicity only a fixed
cost of 5 units per input is assigned. The storage time
for SB1 , 1 time unit, is the time spent to store a data unit
and, as mentioned earlier, is unrelated to the time the
data unit actually spends in storage. No additional delay is encountered.
The only additional requirement for analyzing the
system is the specification of a cost accounting assumption relating to the allocation of input and processing
costs across multiple outputs. The simplifying assumption of equal allocation is made for this example.
We now proceed to explain the mechanics of applying the methodology, using the data found in Figure 3 to generate the Information Manufacturing Analysis Matrix presented in Figure 4. It may be instructive to view the column corresponding to PB6. We can observe that this block requires 10 time units and incurs a cost of 100. Its quality output function is of the second degree, and there is a delay of 1 time unit to deliver its information product to the final customers. This processing block requires three inputs, DU4, DU10, and DU11, which arrive from VB3, QB3, and PB4, respectively, at times 10, 31, and 20. Since processing will not begin until all inputs are available, processing starts at time $t = 31$. The (quality, cost) pairs of the three inputs are (0.9, 30), (0.9373, 73.75), and (0.9153, 77.6).
The output of this process block is DU13, represented by a row in the Information Manufacturing Analysis Matrix. It can be seen from this matrix, as well as from Figure 2, that DU13 is a final information product provided to three customers. It is created at time $t = 41$ by PB6 (31 + 10), but since there is a one-unit delay it is not received by the customers until time $t = 42$. Since $Q_I$, the average quality of the three inputs described above, is 0.9176, and since the quality output function is of the second degree, the data quality of DU13 is $0.9176^2 = 0.842$. The cost of the three inputs added to the processing cost yields a sum of 281.35. Since this is distributed equally over the three outputs, $C_i = 93.75$ for DU13.
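The PB6/DU13 row can be reproduced with a few lines; a sketch using the printed input values (minor rounding differences against the published figures are possible):

```python
# (arrival time, quality, cost) of the three inputs DU4, DU10, DU11.
inputs = [(10, 0.9, 30.0), (31, 0.9373, 73.75), (20, 0.9153, 77.6)]
processing_time, processing_cost, delivery_delay = 10, 100.0, 1

start = max(t for t, _, _ in inputs)        # waits for all inputs: 31
created = start + processing_time           # DU13 created at 41
received = created + delivery_delay         # customers receive it at 42

q_avg = sum(q for _, q, _ in inputs) / len(inputs)   # 0.9176
quality = q_avg ** 2                        # second-degree function: 0.842

total_cost = sum(c for _, _, c in inputs) + processing_cost
cost_share = total_cost / 3                 # equal allocation over 3 outputs
```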
Figure 5 provides the information required to evaluate the products received by the customers. The relevant cost and quality are obtained from the Information Manufacturing Analysis Matrix. As discussed in the previous section, determining the timeliness value requires a "second pass" through the system. For example, once it is determined that DU12 is delivered to the customer at time $t = 13$, the currency value of its primary data input (DU2) can be determined by Equation (1) as $13 - 0 + 2 = 15$. The timeliness value of DU2 then follows from Equation (2a) as $[1 - (15/30)]^2 = 0.25$. Since there is only one input, this is also the timeliness value of the information product, DU12.
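Using the sketch functions from §3.1 above (ours, not the authors' code), the second-pass computation for DU12 is:

```python
c = currency(delivery_time=13, input_time=0, age=2)  # 13 - 0 + 2 = 15
t_du12 = timeliness(c, shelf_life=30, s=2)           # (1 - 15/30)**2 = 0.25
```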
In a similar manner, once the delivery times for DU13 and DU14 are determined to be 42 and 37, the timeliness values of the various primary inputs can be determined by Equation (2a). Starting with these and employing equal weights (for simplicity) at each point of convergence, the timeliness values for DU13 and DU14 are determined to be 0.35 and 0.46.

[Figure 4: The Information Manufacturing Analysis Matrix for the Illustrative Example.]

[Figure 5: Information Required to Evaluate the Information Products.]
The right-hand portion of Figure 5 presents the customer-specific descriptive inputs required to determine the value of the information products to the three customers. For simplicity, the linear version of Equation (8) was utilized for all customers. In this example, marketing research has determined that Customer 1 finds data quality and timeliness equally important; timeliness is twice as important as data quality for Customer 2, while the reverse is true for Customer 3. The example also shows that the intrinsic value of the same information product can vary among customers.
Figure 6 presents the results of the Illustrative Example and highlights the "bottom line" of the entire analysis. The numbers in parentheses relate to a modified version and are discussed later in this section. Before the modification, in the aggregate, this information manufacturing process generates 64.69 in net value (the difference between total value to the customers and total cost to the firm). This is not the same as net profit, but it appears that, in the aggregate, it should be possible to negotiate prices that provide both a profit to the manufacturer and a net value to the customers.
Another picture emerges, however, when net value is viewed in the disaggregate, by looking at individual customers. The value of DU13 to Customers 2 and 3 is less than the cost of production. If either of these customers were to discontinue buying DU13, the consequences would be substantial, since revenues would decline but costs would remain unchanged (unless production of DU13 were terminated). Since the purpose of this framework is to provide a vehicle for improving information manufacturing systems, the example will be extended in that direction. By inspecting Figure 6, it can be determined that quality is near the top of the scale (recall that 0.842 is a point on a zero-one quality scale and does not imply that only 84.2% of the output is correct). On the other hand, timeliness is rather poor for all information products. This suggests a quality-timeliness tradeoff, achieved by eliminating a time-consuming quality block. Since DU13 is in a deficit position, the eliminated block should be one that affects this output; QB3 is such a block. The justification for this is found in Figures 2, 3, 4, and 6. It is further observed from Figure 4 that QB3 is on the critical path for all information products except DU12. A side benefit of this quality-timeliness tradeoff is the avoidance of the 50-unit cost of QB3.
[Figure 6: Evaluation of Information Products Before and (After) Improvement.]
The analysis is repeated for the modified version, and the results are shown in parentheses in Figure 6. Not only has the aggregate net value more than doubled, but each information product now has a positive net value for every customer. Given the parameters of this example, the avoidance of QB3 seems an excellent first step in improving this system. For a note of caution, see Chengalur-Smith, Ballou, and Pazer (1992), which shows that eliminating inspection may be detrimental when there is considerable variability in the input material.
5. Application of the Information Manufacturing Model: The Case of Optiserve
In this section we apply the model described in this paper to a mission-critical information manufacturing system found in a major optical products company, which
we will refer to as Optiserve (Lai 1993, Wang and Firth
1993). For expository purposes, we present the relevant
portion of the case with a simplified system topology.
The current system is discussed in some detail, and a
reengineered system designed to address some of the
current deficiencies is briefly presented. Since the most difficult aspect of implementing the model is obtaining estimates for the various parameters it requires, we concentrate on that task. The analysis, once the numbers are available, proceeds as demonstrated in the Illustrative Example.
5.1. Current System
Optiserve is a large optical products chain with 750 stores located throughout the country. It provides one-stop eye care in that each store provides eye exams, optical products, and fitting of the products. However, grinding of the lenses is carried out at four manufacturing laboratories. Our analysis focuses on the express orders, which are handled by the main laboratory. Optiserve strives to differentiate itself in several ways, one of the most important being customer service. This is a key factor in Optiserve's mission, which is to "create patients for life." At present, however, problems with data quality not only are costing the company over two million dollars annually but also are resulting in the loss of customers (Wang and Firth 1993).
Optiserve’s current information manufacturing system is displayed in Figure 7. Several types of data, modeled as DU1 , are obtained and entered onto the Spectacle
Rx form, the output of interest in this case. At this stage,
the Rx form has patient information (e.g., patient name
and telephone number), the prescription (provided by
the optometrist), information on the glasses themselves
(e.g., frame number and cost, provided by the optician),
and additional characteristics, such as distance between
the patient’s pupils, provided by the optician. These
forms are batched and entered by the optician (represented by PB1 ) into the store’s PC whenever the optician
has free time, often at the end of the day. Roughly 80%
of the data quality problems arise as a consequence of
the above process.
Normally twice a day the optician forwards the day’s
orders (DU2 ) to an IBM mainframe based at corporate
headquarters. In a process represented by PB2 , the IBM
queries the item master file (SB1 ) to determine if the
frame ordered is available (DU3 ). Updating of this file
is represented by VB2.

[Figure 7: Current Information Manufacturing System.]

The Rx ticket (DU4) is then checked by the IBM for completeness and correctness (QB1). Assuming no problems, it then forwards the Rx
ticket (DU5) to an HP computer also based at headquarters. That computer accesses the frame characteristic file (SB2) to obtain information (DU6) regarding the physical characteristics of the frames ordered (size, shape, etc.) and uses that information to generate the specifications the laboratory will use to actually grind the lenses (PB3). In some cases this cannot physically be done; for example, if the lens is large, the blanks may not be thick enough to accommodate a high curvature. (The process of checking the output of PB3, namely DU7, is modeled by QB2.) Assuming no problems, the Spectacle Rx ticket (DU8), now complete with grinding instructions, is returned to the IBM (PB4), which routes the Rx ticket (DU9) to the main laboratory (CB1).
It is important to keep in mind that the above scenario captures an information system that supports a manufacturing system. The information product, which is manufactured on a regular, repetitive basis, is the completed Spectacle Rx ticket (DU9), which is delivered to one of the laboratories. The customer for the information product is the laboratory, an internal customer (CB1).

The fraction of Rx tickets that exhibit some data quality problem at some stage is not excessive: 95% of the tickets are completely free of data quality problems. Of those with deficiencies, two-fifths relate to initial data gathering and entry, another two-fifths to frame-related errors, and one-fifth to the inability to match the frame chosen with the prescribed physical requirements for grinding. Errors in 2% of all Rx tickets are not detected until the patient receives the glasses. For the 3% of all Rx tickets where a problem is detected within the system, the optician has to contact the customer, who usually has to come back into the store. As mentioned, this results in a nontrivial financial loss to the company. More importantly, it violates Optiserve's desire to differentiate itself by service and results in the permanent loss of customers.
5.2. Reengineered System
As indicated, the purpose of our model is to provide information regarding the impact of proposed changes to the information manufacturing system. The Optiserve case illustrates substantial problems with all three dimensions tracked, i.e., quality, timeliness, and cost. A design alternative is proposed and discussed. It involves a reengineering of the system that substantially improves the timeliness and quality dimensions, and hence advances the company's goal of superior customer service, but at an increased cost in terms of hardware and software.
The reengineering option is a decentralized version of the current system. In each store the PC has been upgraded to include features such as a math coprocessor, and it now has responsibility for most of the computations and processing. The optician still enters the data directly into the PC. In addition, the optician checks for any significant changes in the prescription. (Such changes are verified with the prescribing optometrist.) Most importantly, the computation of the physical characteristics of the lenses, formerly performed by the HP computer residing at headquarters, is now performed by the PC in the store while the customer waits. Copies of the item master file and frame characteristic file reside on the PC and are updated periodically, as appropriate, by a server located at headquarters. In the reengineered system a new processing block, which can be labeled PB1*, essentially incorporates the activities of PB1, PB2, and PB3 of the current system (see Figure 7). If any problems are identified, they can be resolved immediately, as the patient has been asked to wait a moment while the computations are performed. A quality control block, say QB1*, would essentially combine the functions of QB1 and QB2. If there is no problem, the PC forwards the Rx ticket to the server at headquarters (which replaces the IBM mainframe), which in turn forwards the ticket to the appropriate laboratory. This results in a major improvement in patient service and would serve to further differentiate Optiserve on the service dimension.
In the reengineered system, the server performs the following functions: (1) It maintains the most current version of the item master file and the frame characteristic file; periodically, it updates the 750 PCs at the store level with the most current version. (2) It keeps track of the inventory levels of blank lenses, frames, etc. in the laboratories; each laboratory periodically reports its actual inventory level to the server for reconciliation purposes, which accounts for breakage and defective items. (3) It routes the Spectacle Rx ticket to the appropriate laboratory.
5.3. Data Requirements for the Optiserve Case:
Current System
In this subsection we describe how to obtain the data
required to implement the information manufacturing model in the case of Optiserve. Specifically, we present one set of input data values needed to compute the timeliness, quality, cost, and value characteristics of the Rx ticket, the information product. Although the other sets are not necessarily handled in
the same manner, similar procedures can be applied
to obtain the corresponding input values. Those desirous of the full set of input data should contact the
authors.
5.3.1. Primitive Data Units. Eight descriptors are required for each primitive data unit. As an example, we will consider DU1.
1. Vendor— The vendor is the patient who is the
source of the information collected by both the optician
and the optometrist.
2. Cost— The cost of securing the patient information has two major components: the cost of the optician's time, approximately $10, and the cost of the optometrist's time, approximately $20, yielding the estimate of $30.
3. Quality— The estimate of the quality of the input is constructed from three sources: the proportion of remakes due to data quality (2%), the proportion of erroneous orders detected at QB2 (1%), and the proportion of erroneous orders detected at QB1 that were attributable to vendor input (1%). Thus the proportion of "defects" originating from the vendor is closely approximated by 4%, the sum of these three error rates. As noted in §3, the modeling process can incorporate either relative or actual quality measures. In this case, since actual measures are available, the quality of DU1 is 0.96.
4. Input Time— The completion of the collection
of patient information is used as the point of reference for this analysis and is consequently set equal
to zero.
5. Age— The information is collected by the optician and optometrist over the period of an hour. On average, this information is one-half hour old at the completion of the collection process ($t = 0$). Since the analysis is based on a ten-hour day, one-half hour is represented as 0.05 day.
6. Volatility— At first it would appear that much of
the information concerning the patient would be of
rather low volatility. However, the volatility of the information in this analysis relates to how long the patient
will wait before canceling the order (shelf life). After
this point the information is useless. Most of the orders
are express orders, which implies that receiving the
glasses promptly is critical.
7. Shelf-Life— It was determined that unless the laboratory received the patient information within five
days it would become useless due to cancellation of the
order.
8. Timeliness Function— Since cancellations accelerate near the end of the five-day shelf-life, an exponent of less than one for the timeliness function is required. An exponent of 1/4, indicating that approximately 67% of the cancellations occur during the fifth day, proved satisfactory.
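A quick check of that exponent under Equation (2a), reading timeliness as the fraction of orders not yet cancelled (our interpretation):

```python
# Timeliness at the start of the fifth day (currency 4, shelf life 5):
remaining = (1 - 4 / 5) ** 0.25
print(round(remaining, 2))  # 0.67: about 67% of cancellations occur on day 5
```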
5.3.2. Processing Blocks. Four descriptors are required for each processing block. As an example, we will discuss PB2.
1. Cost— The cost of the IBM query of the item master file to determine frame availability was estimated to
be $4.00. This includes items such as personnel costs.
2. Time— The expected time required by this query
is very small compared to various delays in the system.
Such times are represented in the model as 0.001 day,
an upper limit on the time required.
3. Quality Function— Unlike the previous example, where the processing was based on aggregation and the corresponding quality function on an averaging of inputs, this process is based on a comparison. Consequently, the output is of acceptable quality if and only if both the information provided by the vendor and the frame availability information are correct. The corresponding quality function is the product of the qualities of DU2 and DU3.
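A sketch of this comparison-based quality function (the name is ours):

```python
def pb2_optiserve_quality(q_du2, q_du3):
    # The output is correct only if both the vendor-supplied order data
    # and the frame availability data are correct, so the qualities multiply.
    return q_du2 * q_du3
```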
4. Delay— This processing by the IBM is done at one-hour intervals; consequently, the average delay is one-half hour, or 0.05 day based on a ten-hour day.
5.3.3. Storage Block. For each storage block only
three descriptors are required. SB1 will be used as an
example.
1. Cost— The cost of storing master file data for the IBM is estimated to be $0.10, which includes the time to retrieve the data and the disk storage cost.
2. Time— This is primarily the time to retrieve the
record associated with a patient’s frame, which is
relatively short, and is estimated to be 0.0002 work
day.
3. Delay— The frame availability information is
available without delay once the query is received since
the database is on-line.
5.3.4. Quality Blocks. Each quality block requires four descriptors. QB1 is used as an example.
1. Cost— When the rework operation is included as part of the quality block, the cost is the sum of the actual cost of checking the information quality and a prorated cost to cover the rework operation for the proportion of the flow that is rejected. The check is performed by the IBM at a cost of $0.10.
2. Time— A similar argument holds for the time estimate. The time is quite small, and 0.0002 day is used as an upper limit.
3. Quality Function— At this point approximately 5% of the flow is in error (4% from the flow provided by the patient and 1% from errors in the item master file concerning frame availability). The quality check at QB1 focuses only on frame-related data; consequently, only about 2% of the flow (1% relating to errors from the patient flow and 1% from the master file) is detected to be in error and rejected. The remaining 3% of the initial 5% drives the modeling of the output quality, $Q_{out}$, for this block, via the parameter 0.60 (i.e., 0.03/0.05) in the quality function. The specific structure of the quality function is necessitated by the fact that $Q_{out}$ is the proportion of the flow that is good. Note that this structure is similar to that used in Figure 3(c).
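One quality function consistent with these figures, in which the parameter 0.60 is the fraction of incoming defects that survives the check, would be the following (our reading; the text does not state the formula explicitly):

```python
def qb1_output_quality(q_in, surviving_fraction=0.60):
    # Of the incoming defects (1 - q_in), 60% pass undetected, so
    # Q_out = 1 - 0.60 * (1 - q_in); with q_in = 0.95 this gives 0.97,
    # i.e., the remaining 3% error out of the initial 5%.
    return 1.0 - surviving_fraction * (1.0 - q_in)
```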
4. Delay— Since the quality check is made by the IBM as the last step in processing, the delay is 0 for this quality block.
5.3.5. Information Required to Evaluate Information Products. Of the eight descriptors required to
evaluate an information product, three are computed
while the other five must be specified.
1. Customer— The customer for the information
product is the laboratory.
2. Intrinsic Value— $V_I$ was set to 1.00 so that it can be scaled up or down as a function of spectacle value. This permits flexibility in the analysis.
3. Weighting Factor— Since it was estimated that for laboratory operations quality is approximately twice as important as timeliness, the weight, w, was set equal to 0.67.
4. Data Quality Exponent— Since the value of the information product declines substantially for relatively small data quality problems, an exponent greater than one is indicated. For example, moving from 0.95 quality to, say, 0.98 quality, although a small quality increase, would be a substantial improvement in the value of the information product. It was estimated that an error probability of 0.10 would reduce the value of the information product by half, while at an error probability of 0.50 the information product would have no appreciable value. An exponent of $a = 7.0$ provides a good approximation to these conditions.
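A quick check of those conditions against the quality term $DQ^a$ of Equation (9):

```python
print(0.90 ** 7)  # ~0.478: an error probability of 0.10 halves the value
print(0.50 ** 7)  # ~0.008: an error probability of 0.50 leaves almost none
```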
5. Timeliness Exponent— Since for this case the shelf-life used in the timeliness function was based on the patient's tolerance for delays, a linear function is used to convert timeliness to value, to avoid "double-counting" this factor in the model.
6. Conclusions
We have presented an information manufacturing model that can be used to determine the timeliness, quality, cost, and value of information products. The systems we model have a predefined set of data units that undergo predefined processing activities. Our work is customer-driven in that the value of the information products manufactured by the system is determined by the customers of those products.
The Optiserve case focused on a relatively small-scale information manufacturing system. For large-scale systems, which may contain hundreds of processes, data units, and so forth, a hierarchical modeling approach is required. Under this approach an analyst would initially model at a higher (macro) level, with each block possibly representing a large number of related activities. That macro model, which would contain a relatively small number of blocks, is then analyzed. Those blocks that for whatever reason require more specific analysis are then replaced with a detailed (micro) model.
One of the benefits of the Information Manufacturing Model is the ability to study the impact on an information system of a changed environment and the efficacy of various options for addressing those changes. For example, suppose that governmental regulations alter the frequency with which information products are required. Proposed changes to the current system can be simulated to verify whether the alterations can, in fact, enable the information product to be delivered when required. The model also provides insight into the information quality that would result, together with the associated costs.
This research is particularly timely in light of the industrial trend toward total quality management and business process reengineering. At the intersection of these driving forces is information quality. Ultimately, we need to deliver high-quality information products to the customer in a timely and cost-effective manner.²

² Work reported herein has been supported, in part, by MIT's Total Data Quality Management (TDQM) Research Program, MIT's International Financial Services Research Center (IFSRC), Fujitsu Personal Systems, Inc., Bull-HN, and the Advanced Research Projects Agency and US Naval Command, Control and Ocean Surveillance Center.
References
Ahituv, N., "A Systematic Approach Toward Assessing the Value of an Information System," MIS Quarterly, 4, 4 (1980), 61–75.
Ballou, D. P. and H. L. Pazer, "The Impact of Inspector Fallibility on the Inspection Policy in Serial Production Systems," Management Sci., 28, 4 (1982), 387–399.
Ballou, D. P. and H. L. Pazer, "Modeling Data and Process Quality in Multi-input, Multi-output Information Systems," Management Sci., 31, 2 (1985), 150–162.
Ballou, D. P. and H. L. Pazer, "A Framework for the Analysis of Error in Conjunctive, Multi-Criteria, Satisficing Decision Processes," J. of Decision Sciences Inst., 21, 4 (1990), 752–770.
Ballou, D. P. and H. L. Pazer, "Designing Information Systems to Optimize the Accuracy-Timeliness Tradeoff," Information Systems Res., 6, 1 (1995), 51–72.
Ballou, D. P. and G. K. Tayi, "Methodology for Allocating Resources for Data Quality Enhancement," Comm. ACM, 32, 3 (1989), 320–329.
Chengalur-Smith, I., D. Ballou, and H. Pazer, "Dynamically Determined Optimal Inspection Strategies for Serial Production Processes," International J. Production Res., 30, 1 (1992), 169–187.
Deming, E. W., Out of the Crisis, Center for Advanced Engineering Study, Massachusetts Institute of Technology, Cambridge, MA, 1986.
Feigenbaum, A. V., Total Quality Control, McGraw-Hill, New York, 1991.
Hammer, M., "Reengineering Work: Don't Automate, Obliterate," Harvard Business Rev., 90, 4 (1990), 104–112.
Kraemer, K. L., W. H. Dutton, and A. Northrop, The Management of Information Systems, Columbia University Press, New York, 1981.
Kriebel, C. H. and O. Mikhail, "Dynamic Pricing of Resources in Computer Networks," Logistics, 1980.
Lai, S. G., Data Quality Case Study: "Optiserv Limited," Master's Thesis, MIT Sloan School of Management, Cambridge, MA, 1993.
Laudon, K. C., "Data Quality and Due Process in Large Interorganizational Record Systems," Comm. ACM, 29, 1 (1986), 4–11.
Morey, R. C., "Estimating and Improving the Quality of Information in the MIS," Comm. ACM, 25, 5 (1982), 337–342.
Shewhart, W. A., Economic Control of Quality of Manufactured Product, Van Nostrand, New York, 1931.
Wand, Y. and R. Y. Wang, "Anchoring Data Quality Dimensions in Ontological Foundations," Comm. ACM, November (1996).
Wang, R. Y. and C. Firth, Using a Flow Model to Analyze the Business Impact of Data Quality, No. TDQM-93-08, Total Data Quality Management (TDQM) Research Program, MIT Sloan School of Management, Cambridge, MA, 1993.
Wang, R. Y., H. B. Kon, and S. E. Madnick, "Data Quality Requirements Analysis and Modeling," Proc. 9th International Conf. on Data Engineering, IEEE Computer Society Press, Vienna, 1993, 670–677.
Wang, R. Y., V. Storey, and C. P. Firth, "A Framework for Analysis of Data Quality Research," IEEE Trans. on Knowledge and Data Engineering, 7, 4 (1995), 623–640.
Wang, R. Y. and D. Strong, "Beyond Accuracy: What Data Quality Means to Data Consumers," J. Management Information Systems, 12, 4 (Spring 1996), 5–34.
Accepted by Abraham Seidmann; received June 1993. This paper has been with the authors 11 months for 3 revisions.