The Heartbeat of the Factory:
Understanding the Dynamics of Agile Manufacturing Enterprises¹
Van Parunak (vparunak@erim.org)
ERIM
Abstract
Manufacturing engineers often characterize the behavior of a manufacturing system with averages over time. This
approach is appropriate for a system that spends most of its time in a steady state. Manufacturing systems that face a
constantly changing demand for a changing array of products may never reach a stable equilibrium. Like a surfer
balanced on a wave, they must continually adjust themselves in a state of constant transition.
Averages and variances are not enough to understand such an adaptive system. The study of naturally occurring
adaptive systems (such as ecosystems, market economies, and various physical and chemical processes) is yielding a
new breed of analytical tools that can characterize the dynamics, or quantifiable time-varying behavior, of these
systems. Agile manufacturing requires the adaptation and application of these tools to manufacturing systems, first to
measure the baseline behavior of current systems, then to model the patterns we observe so that we can test our
understanding of how the systems operate, and finally to manage systems for greatest competitive advantage.
The paper has three parts. Section 1 develops the problem of unstable manufacturing systems in more detail, and
describes why we need to direct attention to their dynamic analysis. Section 2 illustrates the approach by presenting a
simple analysis of data from an actual automotive-like manufacturing plant, using the “time-delay plot,” one of the
tools of dynamic analysis. Section 3 outlines the broader questions that must be addressed to deliver the promise of
this approach.
1. The Problem
Manufacturing engineers often characterize the behavior of a manufacturing system with averages over time. Typical
measures include, “How many parts of which types did we make last month? What utilization did we achieve on the
various machines on the floor? What was our WIP level?” At the level of a supply chain, trading partners are
interested in such measures as aggregate release rates and expected flows of payment.
Averages over time are useful ways to characterize a system that is in a steady state, but they are less helpful if the
system is moving between states. For example, when a manufacturing system is shifting from one product to another,
it may show erratic transients that are not present once the new product is flowing at capacity. Traditionally,
engineers have been more concerned with the steady-state behavior of the system than with its transient behavior as it
changes state, under the assumption that the transition is short compared with the steady-state production run.
These assumptions may be appropriate for a manufacturing system that provides a single product to supply a
constant demand. Such systems are becoming scarce. Competitive pressures are reducing product life times and thus
the length of production runs, forcing systems to change frequently. Mass customization (as of fighter aircraft in the
JAST program or of missiles in the AM3 program) requires systems to assemble products in batches approaching
single items, and generates a highly variable demand for the components of these products. In such an environment
of continual, unexpected change, a manufacturing system may never reach a steady state, but must balance on the
crest of a constantly breaking wave.² Like a fly-by-wire fighter, its inherent instability permits it to react rapidly to
change, but requires that it leave the comfort of a stable steady state for a world of constant transition.
One way to understand the dynamics of such a complex system is to reason analytically from its detailed internal
structure. This approach has limits when applied to manufacturing systems. First, important components of the
system involve human decision-making and thus cannot (at the current state of the art) be modeled in detail. Second,
analytic solutions are unavailable for systems beyond even a very modest level of complexity (for example, the
equations of motion for three masses under gravitational forces).
¹ The latest version of this paper is available on-line at http://www.erim.org/~vparunak/heartbt.pdf .
² [Preiss 95] offers a useful discussion of the difference between static and dynamic manufacturing systems.
7/17/00 12:14 PM
Copyright © 1995, Industrial Technology Institute
All Rights Reserved
Table 1: Levels of Understanding
Level of Understanding | Example from Physics | Manufacturing Systems Parallel
New Perspective | Copernicus: Shift focus from earth-as-center to sun-as-center | Shift focus from steady-state to dynamics
Data Collection | Brahe: Detailed measurements | Existing shop-floor and supply-chain data; other data to be gathered
Generalizations and Patterns | Kepler: Solar “radius” sweeps out equal areas in equal time | Results of applying dynamic systems techniques to manufacturing data
Predictive Rules that enable Control | Newton: F = G m1 m2 / r^2 | ???
Mechanisms | Einstein: mass warps the space-time continuum | ???
An alternative approach is to begin by studying actual operating systems and attempting to generalize from their
behavior. As an example of this approach, consider the development of Newtonian physics (Table 1). Starting with
the Copernican model of planets orbiting the sun, Brahe gathered detailed observational data on the motion of
planets from which Kepler was able to develop important generalizations. These generalizations in turn led to
Newton’s laws. As manufacturing engineers, we need the manufacturing equivalent of Newton’s laws. The vision of
manufacturing systems that can thrive on change plays the role of the Copernican model. This new perspective tells
us, like Brahe, what kind of data we need to gather (in our case, data that has not had the time variation averaged out
of it). In turn, we may form generalizations at the level of Kepler’s laws that ultimately will permit us to understand
the physics of manufacturing in a manner comparable with Newton’s achievement.
Even before we reach the ultimate abstractions, this approach can yield important commercial benefit. In medicine,
physicians learn to recognize particular patterns in EKGs or brain wave recordings and the underlying conditions
that they reflect without understanding in detail the mechanisms by which these signals are generated. [Kempf &
Beaumariage 94] have derived some tantalizing patterns from a simplified simulation of a semiconductor fab. Our
vision here is to build a body of understanding that will permit us to use easily gathered data for diagnostic and
descriptive purposes. We can reasonably expect in less than five years to develop a crude tool that can give us the
EKG of a factory or a set of trading partners. Such a tool in turn will refine our understanding of manufacturing
systems.
2. A Real-World Example
This section illustrates the vision by applying a simple tool for analysis of dynamic systems (the time-delay plot) to
actual data from an operating manufacturing plant. It defines the manufacturing context, introduces the analytical
methodology, applies it to the data, and discusses the implications of the results. This exercise does not by any means
exhibit the breadth of the dynamical systems toolbox. We use a shop-floor example because we have that data
available, but the techniques should be applicable to interactions between firms as well. Our objective is to get a feel
for the kind of information that a dynamical perspective on manufacturing can yield. In particular, we will see how
fairly conventional data can distinguish between two different obstacles to the throughput of a plant: various degrees
of equipment shutdown, and dynamic congestion due to the real-time flow of material.
2.1 The Manufacturing Context
The data in this study come from an operating American factory that produces several different models of specialty
vehicles, specifically from a portion of that factory that is similar in its processes and structures to an automotive body and
paint shop. Parts are mounted on carriers, which in turn travel on a power-and-free conveyor system. Sensors placed
at critical points in the process record the time and identity of passing carriers.
Carriers pass sensors 12 and 17 in that order, and traffic jams sometimes form in the intervening segment of the
conveyor. We will examine the transit times for carriers moving between these sensors. That is, we calculate from
the raw data how long it takes each successive carrier to move from sensor 12 to sensor 17, and record these transit
times in the order in which the successive carriers reach sensor 17. Transit times are recorded in days. That is, a
transit time of 1.5 is one and a half days, or 36 hours. For convenient reference,
Table 2 gives the translation among fractional days, hours, and minutes.
Table 2: Translating Time Units
Days | Hours | Minutes
1.0000 | 24.000 | 1440.00
0.1000 | 2.400 | 144.00
0.0417 | 1.000 | 60.00
0.0100 | 0.240 | 14.40
0.0010 | 0.024 | 1.44
0.0007 | 0.017 | 1.00
0.0001 | 0.002 | 0.14
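The transit-time calculation described above can be sketched in a few lines of Python. The record layout is an assumption made for illustration, since the raw sensor format is not described here: each sensor log is taken to be a dictionary, keyed by a hypothetical carrier identifier, that holds the time at which the carrier passed the sensor, with one pass per carrier. The function names are likewise illustrative.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400.0

def transit_times_days(passed_12, passed_17):
    """Return transit times in fractional days, ordered by arrival at sensor 17.

    passed_12 / passed_17 map a carrier id to the datetime at which the carrier
    passed sensor 12 / sensor 17 (hypothetical layout, one pass per carrier).
    """
    arrivals = sorted(passed_17.items(), key=lambda item: item[1])
    return [
        (t17 - passed_12[carrier]).total_seconds() / SECONDS_PER_DAY
        for carrier, t17 in arrivals
        if carrier in passed_12  # skip carriers never seen at sensor 12
    ]

def days_to_hours_minutes(days):
    """Convert fractional days to (hours, minutes), as in Table 2."""
    return days * 24.0, days * 1440.0
```

For instance, days_to_hours_minutes(1.5) gives 36 hours and 2160 minutes, matching the conversion quoted in the text.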
Figure 1 shows the overall series of 1685 successive transit times over a two-week period. The vertical (logarithmic) axis indicates the transit time for a single carrier in days, and the horizontal axis counts the successive transits included in the data.
Summary statistics are not enough to understand this scenario, but they are a useful starting point. Most of the transits are well below 0.1 days (about 2.5 hours), with prominent peaks about every 200 readings. The largest peak, at index 1200, is Thanksgiving weekend, and the second largest peak is the previous weekend. The population on the segment between the two sensors ranges from 0 to 17, with an average of 4; the average transit time is .066 (about an hour and a half); the median transit time is .013 (about 20 minutes), and the shortest .0065 (about ten minutes). These numbers capture some basic information about the shop, but the complex structure of the plot shows that there is much more that they do not capture. In the following sections, we explore how one simple tool of dynamical systems analysis can uncover some of this information.
Figure 1: Basic Transit Time Data (Two-Week Period) [vertical axis: Transit Time (Days), logarithmic; horizontal axis: Time Series Index]
2.2 Analysis Method: Time-Delay Plots
We want to learn how successive transit times depend on one another. A useful tool for this purpose is the time-delay plot, which plots each value of a time series (on the Y axis) against the previous value (on the X axis). The justification for this procedure lies in Takens’ theorem [Takens 81], which shows that such plots capture the topology of the system’s underlying state space, to which we do not have direct access. For our purposes, the proof of the value of the technique will lie in the results we gain from it.
For example, consider the time series <1, 3, 5, 4, 3, 2>. Table 3 shows how this series is converted into a set of (x,y) pairs by shifting it relative to itself. The y column consists of the time series shifted up one position with respect to the x column, and the missing values at the top of the x column and bottom of the y column are then made up by duplicating the first or last value, respectively, of the series (marked with an asterisk in the Table). More sophisticated use of time delay plots includes different amounts of shift, and constructing triplets or higher-order data points by a similar mechanism, but the single shift is sufficient to illustrate the approach.
Table 3: Time-Shifting a Time Series
x = t(i-1) | y = t(i)
1* | 1
1 | 3
3 | 5
5 | 4
4 | 3
3 | 2
2 | 2*
Figure 2 shows the results of plotting these artificial pairs. The trajectory begins at the lower left corner of the plot with (1,1), and proceeds clockwise.
What would less-structured data look like? Figure 3 shows such a time-delay plot generated from 100 points randomly generated from a uniform distribution on the interval [0,1].
Figure 2: Time Delay Plot of Table 3 [horizontal axis: t(i-1); vertical axis: t(i)]
Figure 3: Plot of a Random Series [horizontal axis: t(i-1); vertical axis: t(i)]
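The construction in Table 3 can be sketched as a short Python function. This is a minimal illustration of the single-shift case only; the function name is an assumption, and the padding rule (duplicating the first and last values) follows the description of Table 3.

```python
def time_delay_pairs(series):
    """Build (t(i-1), t(i)) pairs for a single-shift time-delay plot.

    The x column is the series shifted down one position, padded at the top
    with a copy of the first value; the y column is the series itself, padded
    at the bottom with a copy of the last value.
    """
    values = list(series)
    if not values:
        return []
    x = [values[0]] + values   # t(i-1), first value duplicated
    y = values + [values[-1]]  # t(i), last value duplicated
    return list(zip(x, y))

pairs = time_delay_pairs([1, 3, 5, 4, 3, 2])
```

Applied to the series <1, 3, 5, 4, 3, 2>, the function returns the seven pairs listed in Table 3; plotting them in order with any charting tool traces the clockwise trajectory of Figure 2.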
2.3 Plotting the Transit Data
Contrast the disorder generated by only 100 points in Figure 3 with Figure 4, the time delay plot for all 1685 points of the transit data. The two weekend shutdowns produce large square trajectories that dominate the display, but there appears to be some orderly structure in the mass of points at the lower left. To explore this region further, we eliminate from the data all transit times greater than one day. The remaining 1660 points produce Figure 5.
Figure 4: Full Data Set [horizontal axis: t(i-1); vertical axis: t(i)]
Figure 5 shows several large squares reminiscent of the two in Figure 4. It also contains some “imperfect” squares, and a distinctive diagonal clump at the lower left corner. We can focus in on this latter feature by restricting ourselves to transits less than .07 day, or about an hour and a half (1610 points), in Figure 6.
Figure 5: Transits < 1 Day [horizontal axis: t(i-1); vertical axis: t(i)]
Figure 6 is still much simpler than the random plot in Figure 3, even though it contains more than sixteen times as many points. In addition to the familiar squares, the plot shows a strong, clearly defined diagonal band. Points within the diagonal band result from successive transits that are close to the same duration, while points off the diagonal result from successive transits that differ widely from one another in duration. This observation suggests that we compute the difference in transit between each successive pair of points, and observe how the resulting differences are distributed.
Figure 6: Transits < .07 Day [horizontal axis: t(i-1); vertical axis: t(i)]
Figure 7 plots the number of pairs of successive points differing by more than a threshold value, as a function of that threshold. For example, the point at (x,y) = (.001, 900) means that 900 pairs of successive points have transits differing by at least .001. The resulting curve has a sharp elbow at about .005. The right-hand leg of the curve (with pairs differing by more than .005 in transit time) produces the squares on Figure 6, while the left-hand leg of the curve produces the diagonal. The values on the left-hand leg are higher than those on the right-hand leg, corresponding to the higher density of points along the diagonal of Figure 6.
Figure 7: Number of Transit Differences > Threshold [horizontal axis: Threshold; vertical axis: # of High Values]
Figure 8 shows the result of fitting straight lines by least squares to the two legs of Figure 7. The fit to the left-hand portion is y = -233828x + 1117, and the fit to the right-hand portion is y = -12703x + 159. The straightness of the individual legs and their sharp elbow emphasize the distinctness of the two regions in Figure 6, and the location of the elbow shows where the edge of the diagonal band lies.
Figure 8: Straight-Line Fits [horizontal axis: Threshold; vertical axis: # of High Values]
The empirical differences between the squares and the diagonal on Figure 6 (shape, density of points, fit to different underlying straight lines) suggest that they result from different underlying causes.
These simple examples show the ability of the time-delay plot to call our attention to patterns in shop data that would
escape our notice if we averaged out the time component or worked only with trend charts. In the next section, we
will learn what these various patterns mean.
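The counting and line-fitting steps behind Figures 7 and 8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to produce the figures: the function names are hypothetical, differences are taken as absolute values, and the choice of which points belong to each leg of the curve is left to the analyst. The last function locates the elbow as the intersection of two fitted lines.

```python
def count_differences_above(transits, threshold):
    """Count successive pairs whose transit times differ by more than threshold."""
    return sum(
        1 for prev, cur in zip(transits, transits[1:])
        if abs(cur - prev) > threshold
    )

def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept to (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def elbow_threshold(left_leg, right_leg):
    """Threshold at which two fitted (slope, intercept) lines intersect."""
    (m1, b1), (m2, b2) = left_leg, right_leg
    return (b2 - b1) / (m1 - m2)

# Intersection of the two fits reported for Figure 8.
elbow = elbow_threshold((-233828, 1117), (-12703, 159))
```

With the two fits reported for Figure 8, the intersection is 958/221125, or about .0043, which is consistent with the elbow read from Figure 7 at about .005.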
2.4 Interpreting Patterns
The simple device of a time-delay plot graphically reveals two qualitatively different kinds of time-varying behavior
in our data. One manifests itself as squares of varying degrees of symmetry. The other generates a strong diagonal
band of points. By examining more carefully the data behind these plots, we can understand these behaviors in ways
that can help us manage a factory more efficiently.
2.4.1 Squares: Line Stoppages
It is helpful to walk through the events that generate the smaller square on Figure 4, corresponding to the weekend of
Nov. 18-19, 1989. Just before the line stops for the weekend, the transit times are small (about .01 day) compared
with the length of the weekend. When the last carrier clears sensor 17 before the weekend, its transit time, t(i = 580),
is close to that of the one before it, t(i-1 = 579), generating a point in the lower left corner of the plot. When the line
stops, fourteen carriers have passed sensor 12 but not yet reached sensor 17, so their transit time (about 2.25 days
each) includes the shutdown. For the first carrier to arrive at sensor 17 after the line restarts, t(i = 581) is large, but
t(i-1 = 580) is small (being the transit time of the last carrier to clear before the weekend), generating the upper left
corner of the plot. The other thirteen carriers that passed sensor 12 but not sensor 17 before the weekend all have
similar transit times to the first one, since the length of the weekend swamps any variation in their actual travel time.
Thus t(582) through t(594) are large and about the same as t(581), generating the upper right corner of the plot. At
the lower right corner, t(i-1 = 594) still records the transit time of the last weekend carrier, but t(i = 595) now records
the much lower transit time of the first carrier to pass sensor 12 after the shutdown. Since subsequent carriers also
have transit times much shorter than two days, subsequent points return to the lower left corner of the plot.
This analysis leads us to recognize other squares in the plots as resulting from similar conditions. A trajectory forms
a symmetrical square when all carriers have been in the segment for periods of time that are much larger than their
usual transit times, and when the differences between those periods are small (on the order of usual transits). This
condition could result from a downstream stoppage alone if the segment were full when the stoppage began, thus
keeping upstream processes from adding carriers. A square can also result if the segment is not full, as long as
upstream processes shut down before or concurrently with downstream ones, and stay down until the downstream
starts up again.
Four of the squares in Figure 5 have the same structure as the large squares in Figure 4: a one-step rise from a low
value to a high one (0.26 for t(789) through t(793), .2 for t(986) - t(990), .16 for t(167) and t(168), .12 for t(407) and
t(408)), followed by a one-step drop back to a low value. These squares suggest shutdown of the line for periods of
time comparable to their magnitudes, and in fact all occur between midnight and 8 AM, probably representing
maintenance shutdowns.
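The one-step rise, plateau, and drop that characterize these squares lend themselves to simple automatic detection. The following sketch is one possible approach under assumed thresholds, not a method taken from the study: the function name, the jump size of .05 day, and the flatness tolerance of .01 day are all illustrative choices, and indices are 0-based positions in the list rather than the t(i) indices used in the text.

```python
def find_plateaus(transits, jump=0.05, flat_tol=0.01):
    """Locate runs of unusually long transits (candidate line stoppages).

    A run starts where a transit exceeds its predecessor by more than `jump`
    and continues while transits stay more than `jump` above that predecessor.
    Returns (first_index, last_index, peak, is_flat) tuples with 0-based
    indices; is_flat is True when the run varies by no more than `flat_tol`.
    Both thresholds are in days and are illustrative assumptions.
    """
    plateaus = []
    i = 1
    while i < len(transits):
        baseline = transits[i - 1]
        if transits[i] - baseline > jump:
            j = i
            while j + 1 < len(transits) and transits[j + 1] - baseline > jump:
                j += 1
            run = transits[i:j + 1]
            plateaus.append((i, j, max(run), max(run) - min(run) <= flat_tol))
            i = j + 1  # resume scanning after the run
        else:
            i += 1
    return plateaus
```

A flat run corresponds to a symmetrical square on the time-delay plot, while a run flagged as not flat is a candidate for the imperfect squares discussed in the next section.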
2.4.2 Imperfect Squares: Interacting Stoppages
Figure 9 and Figure 10 show two imperfect squares that result from timing interactions between stoppages upstream and downstream of the 12-17 segment. To understand these patterns, we need to distinguish the relative timing of upstream and downstream stoppages, and whether or not the segment is full when the downstream process stops. The upstream process might not stop at all, or its stoppage might overlap the downstream stoppage in four different ways, illustrated in Figure 11. Thus we have a total of ten possible conditions to consider, which together generate four different patterns, as illustrated in Figure 12.
Figure 9: t(1206) - t(1226) [horizontal axis: t(i-1); vertical axis: t(i)]
As noted in the previous section, a perfect square (Figure 12a) results any time the segment is full to capacity when the downstream stoppage begins, whether or not upstream processes stop. This is a fairly uncommon circumstance, and the more likely cause of a perfect square is a stoppage of the upstream processes that includes the complete downstream stoppage.
When the segment still has room and the upstream stoppage overlaps the downstream one (that is, ends before the downstream stoppage ends), additional carriers that enter the segment do not need to wait as long as the carriers that were there when the downstream stoppage began. As before, the carriers in the segment for the full stoppage generate the upper right corner of the square, but the right side drops only to the age of the first carrier to enter the segment when the upstream stoppage ended (Figure 12b). The length of the vertical drop corresponds to the overlap between the upstream and downstream stoppages. Because the first carrier from the restarted upstream process still has to wait for part of the downstream stoppage, its transit is longer than normal, but not as long as the complete downstream stoppage. Successive carriers are delayed even less by the downstream stoppage, until the stoppage ends completely and normal transits resume. This situation gives rise to Figure 9.
Figure 10: t(411) - t(436) [horizontal axis: t(i-1); vertical axis: t(i)]
Figure 11: Timing Relations between Downstream and Upstream Stoppages [horizontal axis: Time; upstream stoppage relations shown: Overlaps, Overlapped, During, Includes]
When an upstream stoppage begins partway through a downstream stoppage, all of the carriers it supplied to the segment between the start of the downstream stoppage and the start of its own stoppage must wait in the segment for a length of time equal to the overlap between the two stoppages. Again, the right-hand corner of the square represents carriers in the segment when the downstream process stopped. Since the upstream process was still active at that point, the next several carriers have transit times only incrementally shorter, but all at least as long as the overlap. The final drop of the right-hand side corresponds to the size of this overlap (Figure 12c). This situation gives rise to Figure 10.
While we have not identified an example in the data, an upstream stoppage contained completely within a downstream stoppage would be expected to yield a combination of the previous two figures, with the large half-box on the right-hand side corresponding to the upstream stoppage (or perhaps starting earlier if the segment fills) (Figure 12d).
Figure 12: Varieties of Perfect and Imperfect Squares [rows: relation of upstream to downstream stoppage: (a) Includes, (b) Overlaps, (c) Overlapped, (d) During, and (None); columns: segment Full or Not Full when downstream stops]
The explanation we have just developed requires further refinement. The stairsteps on Figure 9 and Figure 10 are much larger than would be generated by the differences in arrival rates of carriers under normal conditions. The size of these normal rates can be seen in the fine detail at the lower-left and upper-right corners of Figure 10. The larger steps in fact reflect repeated intermittent stoppages of the upstream process (“upstream stuttering”) that group the late-arriving carriers into batches. All the carriers in a batch arrive at about the same time, and generate a single step. In Figure 9, these stoppages occur as the upstream processes restart, and reflect difficulty in bringing the system up after a prolonged shutdown. In Figure 10, the steps occur at the beginning of a system-wide shutdown rather than its end, and reflect a pattern of cascading system breakdown.
Figure 13 exhibits a new phenomenon: stairsteps up as well as down. The upward stairsteps result when one downstream stoppage is followed by another before the backlog from the first has cleared the segment. Just before the first stoppage, the segment fills with sixteen carriers. When the line restarts, only two are able to escape before the line stops again, generating the first step at about 0.1. Another carrier arrives during this time, leaving the segment with a population of fifteen carriers. Of these fifteen, fourteen are resident during both stoppages, while one is resident only for the second stoppage. When the line again restarts after the second stoppage, the first fourteen carriers to exit have transit times equal to the total time between the first stoppage and the second resumption, leading to the second upward step. The next carrier to exit has been in the segment through the second stoppage, so the plot drops down only part way. The remaining steps down are due to upstream stutter, as observed in Figure 9.
Figure 13: t(1421) - t(1445) [horizontal axis: t(i-1); vertical axis: t(i)]
Thus the large squares that pervade the data tell a detailed story of the patterns of equipment shutdown that
characterize a facility. We have extracted considerable information from the patterns in only a single segment of the
conveyor system. These patterns could easily be detected by automatic systems to isolate repeated correlations
between the failure of different processes, and thus could help focus maintenance activity more efficiently.
2.4.3 The Diagonal: Dynamic Congestion
Our data show not only squares of varying degrees of symmetry, but also a distinct diagonal band, in which transit times vary in small steps (less than .005 day, which is about 7 minutes). The elbow in Figure 7 suggests that these small steps may have a different origin than the big ones that generate squares. To explore further, we select a series of 96 points (t(1096) through t(1191)) within which there are no transitions greater than .005 day. Even without large shifts, the transit times vary by a factor of more than four, from a maximum of .029 to a minimum of .0068. Figure 14 shows how the lengths of these transits vary from one to the next. The first 30 or so transits successively shrink in length. The next 30 or so increase successively, and then there appears to be another period of reduction, though not as sharp as the first. The difference in clock time covered by the first sixty transits is about four hours.
Figure 14: Variation in t(1096) - t(1191) [horizontal axis: Index; vertical axis: Transit]
What is the origin of this periodicity? Figure 15 shows essentially the same pattern in the population of the 12-17 segment over the same set of transits. As the population of the segment builds up, newly-arrived carriers must wait for the earlier ones to be processed, and their transit time becomes greater than the transit time for a carrier that arrives when the segment is relatively empty. The oscillation of period 60 that generates the diagonal band in Figure 6 appears to be due to “traffic-jam dynamics,” the effect of crowding within the segment.
Figure 15: Segment Population in t(1096) - t(1191) [horizontal axis: Index; vertical axis: Population]
Further study is needed to determine the cause of this oscillation. It might be driven by the timing of neighboring processes. It might also result from the combination of a random impulse (such as a momentary line stoppage due to a machine failure) and the finite capacity of the 12-17 segment. Any physical segment has an upper limit to its population, which imposes a nonlinear term in the recurrence relation governing its population, and such a nonlinearity can lead to oscillation under the right circumstances.
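To make the last point concrete, the sketch below iterates the textbook logistic map, x(n+1) = r x(n) (1 - x(n)), where x stands for the fraction of a segment's capacity that is occupied. This is a generic illustration of how a capacity-limiting nonlinear term can produce sustained oscillation; it is not a model fitted to the 12-17 segment, and the function name, the growth parameter r, and the starting value are all assumptions chosen for the demonstration.

```python
def simulate_occupancy(r, x0=0.2, steps=500):
    """Iterate the logistic map x -> r * x * (1 - x).

    x is an occupancy fraction between 0 and 1; the factor (1 - x) is the
    nonlinear term that models a finite capacity. Purely illustrative.
    """
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(r * x * (1.0 - x))
    return xs

steady = simulate_occupancy(2.5)       # settles to a single value (0.6)
oscillating = simulate_occupancy(3.2)  # settles into a two-value cycle
```

With r = 2.5 the occupancy converges to a steady value, while with r = 3.2 it alternates indefinitely between a high and a low level, even though nothing in the recurrence changes over time.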
2.5 Discussion
Using only a single simple tool of dynamical systems analysis, and concentrating on only a single segment of a
complex material transport system, we can peel away layers of behavior from shop floor data and distinguish
different sorts of dynamic behavior, including
• system-wide shutdowns;
• timing relations between upstream and downstream stoppages;
• intermittent “stuttering” of selected processes, either when the line is started up or as a precursor to a shutdown;
• periodic fluctuations in traffic density at selected locations in the line (“traffic-jam dynamics”).
These phenomena can easily be tracked and monitored automatically, enabling factory managers to distinguish
between two important kinds of obstacles to throughput: those that result from machine failure, and those that result
from dynamic congestion. The remedies to these problems are quite different. Once we can measure the relative
impact of these two problems inexpensively, we can make responsible decisions about allocating resources to
improve throughput. For example, if unplanned machine shutdowns dominate the dynamics of a particular shop,
increased preventive maintenance may be in order. If slowdowns are due to dynamic congestion, machine
maintenance is of little help, and the shop needs to devote attention to improving methods of shop-floor scheduling
and control. Techniques such as these can support decisions about the relative budgets needed by the maintenance
and industrial engineering departments.
As interesting as these observations are, they are only examples of a single technique in a single context. The larger
point is that dynamical systems analysis is a tool with great potential for understanding and managing the operation
of complex manufacturing facilities, particularly those that must handle changing mixes of different products. The
lessons of this exercise can be generalized in two directions: techniques and application domains. There are many
more techniques whose applicability should be explored, and the domains to which they are relevant include not only
shop floor, but also supply chains and other closely-coupled networks of trading partners.
3. What Needs to be Done?
Analysis techniques for nonlinear systems hold considerable promise for manufacturing systems engineering, but to
date the research community has not devoted much attention to this application area. In this section we survey some
of the problems that need to be addressed, and suggest some next steps to advance our abilities in this domain.
3.1 Problems to be Solved³
Figure 16 proposes an overall roadmap of the techniques we need to develop to apply dynamical systems analysis to
manufacturing enterprises. First, we need ways to measure the state of the enterprise. Then we need to interpret the
resulting information to yield insights about what is right or wrong. Finally, we need to implement these insights by
taking action in the enterprise itself.
³ The insights in this section were developed by a discussion panel at the workshop on Enhanced Manufacturing Technologies, sponsored by Sandia National Laboratories, JAST, ARPA, and the Department of Energy in Albuquerque, NM, Oct 1995.
[Figure 16: A Roadmap for Manufacturing System Dynamics. Diagram labels: Enterprise, Measurement, Information,
Interpretation, Insight, Implementation.]
3.1.1 Measurement: Gathering Data from the Enterprise
Initially, the challenge in understanding manufacturing dynamics may not be lack of information, but how to select
from the abundance of information that is available. The example analyzed in this paper illustrates the kind of
information that can be routinely collected, and is often gathered now in order to compute time-averaged performance
statistics. To select from this jungle of information, we need some idea of what different kinds of measurements can
tell us about the system. For example, the series of transit times between sensors 12 and 17 yields useful patterns
that do not emerge as readily from the interarrival times at a single sensor. In some cases, theoretical approaches
may suggest that some measurements will be more “information-rich” than others, but in general, researchers need to
conduct the equivalent of a biologist’s collecting trip in the jungle of factory data to learn what sorts of
information will be most valuable.
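The distinction between transit times and interarrival times can be made concrete with a small sketch. It assumes a
hypothetical event log of (timestamp, sensor, part) records; that format, the function names, and the sample values
are illustrative assumptions and are not taken from the plant data analyzed in this paper. Only the sensor numbers 12
and 17 come from the text.

```python
from collections import defaultdict


def interarrival_times(events, sensor):
    """Gaps between successive arrivals at a single sensor.

    events: iterable of (timestamp, sensor_id, part_id) tuples
    (a hypothetical log format assumed for this sketch).
    """
    times = sorted(t for t, s, _ in events if s == sensor)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def transit_times(events, upstream, downstream):
    """Per-part travel time from one sensor to another.

    Results are ordered by the time each part passed the upstream sensor.
    Parts that were not seen at both sensors are ignored.
    """
    first_seen = defaultdict(dict)
    # Process events in time order so setdefault keeps the earliest sighting.
    for t, s, part in sorted(events, key=lambda e: e[0]):
        if s in (upstream, downstream):
            first_seen[part].setdefault(s, t)
    pairs = [
        (seen[upstream], seen[downstream] - seen[upstream])
        for seen in first_seen.values()
        if upstream in seen and downstream in seen
        and seen[downstream] >= seen[upstream]
    ]
    return [dt for _, dt in sorted(pairs)]


# Made-up sample log: three parts passing sensors 12 and 17.
sample_events = [
    (0.0, 12, "A"), (4.0, 12, "B"), (9.0, 12, "C"),
    (30.0, 17, "A"), (31.0, 17, "B"), (45.0, 17, "C"),
]

print(interarrival_times(sample_events, 12))  # [4.0, 5.0]
print(transit_times(sample_events, 12, 17))   # [30.0, 27.0, 36.0]
```

The first series needs only one sensor's timestamps, while the second requires matching each part across two sensors,
so it carries information about what happened to the part in between.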
The value of data will depend heavily on how it is organized, which in turn depends on the organizational structure
from which the numbers are gathered. [Packer 95] reports that useful management of the F-16 JAST case study
depends on partitioning the enterprise in such a way that the behavior of organizational units directly impacts the
metrics by which they are evaluated, a process dubbed “getting the beans in the box.” Different measurements will
be appropriate for different organizational functions (e.g., production vs. field service). It is an open question which
measurements useful at one level of a given function will be appropriate for other levels. Patterns of transit times are
likely to be informative both between stations on the factory floor and between successive partners in a supply chain,
but each of these domains probably has important metrics that are less relevant to the other.
While much useful data may be available essentially for free, important insights may depend on information that is
not currently being gathered, for which new collection mechanisms need to be established. Thus in addition to
learning to assess the value that can be derived from information of a given kind, we need to quantify the cost
associated with collecting each class of measurement.
3.1.2 Interpretation: Making Management Decisions
Like raw data, some analysis procedures for dynamical systems are available off the shelf, having been developed
for applications other than manufacturing. For example, delay plots, power spectra, the BDS statistic and the related
Savit delta statistic, phase space and Poincaré plots, and computation of Lyapunov exponents and various measures of
fractal dimension have been applied to many different kinds of data and can readily be adapted to manufacturing
information. Just as some data are more useful for our purposes than others, we need to learn which techniques are
most appropriate to the complex, noisy information available in manufacturing. In some cases, the unique challenges
of manufacturing may suggest the development of new methods.
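Two of the simpler techniques named above, the delay plot and the power spectrum, can be sketched in a few lines. The
code below is a minimal illustration under stated assumptions: the function names are hypothetical, the input is a
synthetic sinusoid rather than factory data, and the spectrum uses a direct (slow) discrete Fourier transform for
transparency where a production tool would use an FFT library.

```python
import cmath
import math


def delay_pairs(series, lag=1):
    """Points (x[t], x[t + lag]) for a two-dimensional time-delay plot."""
    return [(series[i], series[i + lag]) for i in range(len(series) - lag)]


def power_spectrum(series):
    """Periodogram of a mean-centered series via a direct DFT.

    Returns power at frequency indices 0 .. n // 2, where index k
    corresponds to k cycles over the whole series.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


# Synthetic series: mean 10, amplitude 3, period of 8 samples, 64 samples.
series = [10 + 3 * math.sin(2 * math.pi * i / 8) for i in range(64)]

spectrum = power_spectrum(series)
peak = max(range(len(spectrum)), key=spectrum.__getitem__)
print("dominant period in samples:", len(series) / peak)

# With a lag of a quarter period, the delay plot of a sinusoid is a circle.
points = delay_pairs(series, lag=2)
print("first delay-plot points:", points[:3])
```

For a strictly periodic series the delay plot closes on itself and the spectrum has a single sharp peak. Real
manufacturing series are noisy, so the corresponding plots would be expected to show smeared or more complex
structure, which is the kind of pattern these tools are meant to expose.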
It is not enough to characterize a manufacturing system in terms of its formal dynamics. Simply knowing that a
process has an attractor of a particular shape or dimensionality does not lead to business decisions. As we learn to
characterize the dynamics of manufacturing systems, we must correlate particular dynamical patterns with business
performance, thus forming the basis for dynamically based decision support tools. The problem is inherently multidimensional. We cannot usefully optimize any single dimension, but need to learn how to combine different
characterizations to reach useful recommendations. In general, observation of real-world operations will provide data
for only some of the conditions of interest, and we will rely on simulation to explore the effects of changing various
operating parameters.
The initial impetus for a dynamical approach to manufacturing measurements is the growing demand for enterprises
to offer a wide scope of products and the observation that such an enterprise may never reach a steady state. To be
useful in such a context, our metrics must be correlated with the scope of products that an enterprise can offer. We
want to be able to reason in two directions. First, what can a given set of observed dynamics tell us about the scope
that a given enterprise can support? Second, given a proposed scope, can we estimate the resulting dynamics and
thus determine whether we can meet the proposal?
3.1.3 Closing the Loop
Neither information nor abstract insights are sufficient to meet the demands of modern commercial life. We must
move from insight to actual changes in the enterprise as a result of our measurement activity. Making changes
requires identifying the drivers of complex dynamics, learning which levers to pull to move a system’s dynamics
toward a more desirable configuration. These levers are often counter-intuitive. For example, in some cases of
contention for constrained resources, providing decision information too rapidly may lead to instability [Kephart et
al. 89].
The human side of manufacturing presents a particularly challenging implementation issue. Unlike custom
manufacturing equipment, unique skills cannot be reliably replaced on the open market. The ability of multi-skilled
workers to integrate different processes in an enterprise is only one example of a characteristic of human capital that
can be critical to success but difficult to anticipate. The general US practice of “at will” employment
makes a company’s workforce especially liable to unexpected change and complicates our ability to project the
results of certain management decisions.
We need a new generation of management models that take account of the dynamics of changing organizations.
Developing these models will require extensive collection and analysis of data, development of theories, and
modeling and experimentation to validate these theories.
3.2 Next Steps
There is currently no formal program to support research and development in the application of dynamical systems
theory to manufacturing. ERIM has sketched out a research agenda in this domain,4 and is informally cultivating a
network of academic researchers and manufacturing professionals with interest in the area, but real progress depends
on support. A useful first step would be to convene a workshop of interested parties around a common data set such
as the one analyzed here, to illustrate what can be done with current techniques and to set priorities for ongoing
research. Such research will require an infrastructure that includes an archive of operating data from actual shops and
supply chains that researchers can explore for useful dynamical patterns, and validated simulation models of some of
those environments to test theories generated from data analysis.
3.3 Postscript (July 2000)
Since this paper was written, we have applied these techniques beyond the factory floor to the supply network in the
DASCh (Dynamical Analysis of Supply Chains) and SNAP (Supply Network Agility and Performance) projects.
Papers on the DASCh project are available at http://www.erim.org/cec/projects/dasch.htm.
References
[Kempf & Beaumariage 94]
K. Kempf and T. Beaumariage, “Chaotic Behavior in Manufacturing Systems.” Unpublished working paper.
[Kephart et al. 89]
J. O. Kephart, T. Hogg, and B. A. Huberman, “Dynamics of Computational Ecosystems.”
Physical Review A 40, 404-21.
[Packer 95]
M. Packer, Lockheed/Martin, “Cost Reductions Achieved on the F-16.” Presentation at Workshop
on Emerging Manufacturing Technologies, Albuquerque, 18 October 1995.
[Preiss 95]
K. Preiss, “Mass, Lean, and Agile as Static and Dynamic Systems.” Perspectives on Agility Series,
Vol. PA95-04. Agility Forum.
[Takens 81]
F. Takens, “Detecting strange attractors in turbulence.” Dynamical Systems and Turbulence,
Warwick 1980, Lecture Notes in Mathematics 898, Springer-Verlag, 366-81.
Footnote 4: A white paper on “Complexity Theory in Manufacturing Engineering: Conceptual Roles and Research
Opportunities” is available at http://www.erim.org/~vparunak/3roles.pdf.