Structured Analysis Data Flow Modelling

advertisement
ESE Module 3-3
Structured Analysis
Part 2: Data Flow Modeling
All software, whether it's used to produce paychecks
or control a supersonic aircraft, is an information transformer. That is, all software accepts input, transforms it
in some way, and produces output as a result. In
essence, data flow into a system, get changed, and
ultimately flow out.
As we create the analysis model, our challenge is
to understand this transformation. How are data
changed as they flow through a system? What functions are required to transform input into output?
These questions are answered during the second
analysis modeling activity-the creation of a data
flow model.
The data flow model kills two birds with one stone.
It indicates the flow of data and it also provides us
with a mechanism for functional decomposition.
Hence we address two important analysis principles:
(1) understanding the flow of data through a system,
and (2) partitioning (decomposing) function into progressively lower levels of abstraction. (See ESE Module
3-2 for details.)
Flow Modeling Notation
Data flow notation is surprisingly simple, and yet the
models that it produces can be quite sophisticated.
The basic data flow icons address only four things:
producers and consumers of data; processes that
transform data; the data flow itself; and places where
data are stored.
Armed with these icons, the software engineer
(analyst) can produce a picture of the software (and
its external environment) that establishes the basis for
architectural design.
Readings
The following excerpt has been adapted from
Software Engineering: A Practitioner's Approach and
discusses data flow notation. Extensions for real-time
software (not discussed in the video portion of this ESE
Module) are presented briefly.
Information is transformed as it flows through a computerbased system. The system accepts input in a variety of
forms; applies hardware, software and human elements to
transform input into output; and produces output in a variety of forms. Input may be a control signal transmitted by
a transducer, a series of numbers typed by a human operator, a packet of information transmitted on a network link
or a voluminous data file retrieved from secondary storage.
The transform(s) may comprise a single logical comparison, a complex numerical algorithm or rule-inference
approach of an expert system. Output may light a single
LED or produce a 200-page report. In effect, we can create
a flow model for any computer-based system, regardless of
size and complexity.
Structured analysis is an information flow and content
modeling technique. A computer-based system is represented as an information transform. Overall function of the
system is represented as a single information transform,
represented as a circle or bubble. One
or more inputs, shown as labeled arrows, originate from
external entities, represented as a box. The input drives the
transform to produce output information (also represented
as labeled arrows) that is passed to the external entities. It
should be noted that the model may be applied to the
entire system or to the software element only. The key is to
represent the information fed into and produced by the
transform.
Data Flow Diagrams
As information moves through software, it is modified by a
series of transformations. A data flow diagram (DFD) is a
graphical technique that depicts information flow and the
transforms that are applied as data move from input to
output. The DFD is also known as a data flow graph or
a bubble chart.
The data flow diagram may be used to represent a system or software at any level of abstraction. In fact, DFDs
may be partitioned into levels that represent increasing
information flow and functional detail. A level 0 DFD, also
called a fundamental system model or a context model,
represents the entire software element as a single bubble
with input and output data indicated by incoming and outgoing arrows, respectively. Additional processes (bubbles)
and information flow paths are represented as the level 0
DFD is partitioned to reveal more detail. For example, a
level 1 DFD might contain five or six bubbles with interconnecting arrows. Each of the processes represented at
level 1 are subfunctions of the overall system depicted in
the context model.
A rectangle is used to represent an external entity,
that is, a system element (e.g., hardware, a person,
another program) or another system that produces
information for transformation by the software or that
receives information produced by the software.
A circle represents a process or transform
that is applied to data (or control) items and changes them
in some way. An arrow represents one or more data items.
All arrows on a data flow diagram should be labeled. The
double line represents a data store--stored information that
is used by the software. The simplicity of DFD notation is
one reason why structured analysis techniques are the
most widely used.
3-3p2.2 ·· Essential Software Engineering
It is important to note that no explicit indication of the
sequence of processing is supplied by the diagram.
Procedure or sequence may be implicit in the diagram, but
explicit procedural representation is generally delayed
until software design.
processing narrative described the input to the bubble, the
algorithm that is applied to the input and the output that is
produced. In addition, the narrative indicates restrictions
and limitations imposed on the process, performance characteristics that are relevant to the process, and design constraints that may influence the way in which the process
will be implemented.
Extensions for Real-Time Systems
Many software applications are time dependent, and
process as much or more control-oriented information as
data. [For a detailed discussion of these real-time systems
see Chapter 15 of Software Engineering: A Practitioner's
Approach.] For now, suffice it to say that a real-time system must interact with the real world in a timeframe dictated by the real world. Aircraft avionics, manufacturing
process control, consumer products and industrial instrumentation are but a few of hundreds of real-time software
applications.
Ward and Mellor Extensions
As we noted earlier, each of the bubbles may be
refined or layered to depict more detail. Figure 1 illustrates
this concept. A fundamental model for system F indicates
the primary input is A and ultimate output is B. We refine
the F model into transforms f1to f7. Note that information
flow continuity must be maintained, that is, input and output to each refinement must remain the same. This concept, sometimes called balancing, is essential for the development of consistent models. Further refinement of f4
depicts detail in the form of transforms f41 to f45 Again,
the input (X,Y) and output (Z) remain unchanged.
The data flow diagram is a graphical tool that can be
very valuable during software requirements analysis.
However, the diagram can cause confusion if its function is
confused with the flowchart. A data flow diagram depicts
information flow without explicit representation of procedural logic (e.g., conditions or loops). It is not a flowchart
with rounded edges!
The basic notation used to develop a DFD is not in
itself sufficient to describe requirements for software. For
example, an arrow shown in a DFD represents a data item
that is input to or output from a process. A data store represents some organized collection of data. But what is the
content of the data implied by the arrow or depicted by the
store? If the arrow (or the store) represents a collection of
items, what are they? These questions are answered by
applying another component of the basic notation for
structured analysis--the data dictionary. [The format and
use of the data dictionary are presented in the third part of
this ESE Module.]
The graphical notation for DFDs must be augmented
with descriptive text. A processing narrative--a paragraph
that describes a process bubble--can be used to specify the
processing details implied by the bubble within a DFD. The
Ward and Mellor [1] extend basic structured analysis notation to accommodate the following demands imposed by a
real-time system:
 information flow that is gathered or produced on a
time-continuous basis;
 control information passed throughout the system and
associated control processing;
 multiple instances of the same transformation are
sometimes encountered in multitasking situations;
 system states and the mechanism that causes transition
between states.
In a significant percentage of real-time applications,
the system must monitor time-continuous information generated by some real world process. For example, a realtime test monitoring system for gas turbine engines might
be required to monitor turbine speed, combustor temperature, and a variety of pressure probes on a continuous
basis. Conventional data flow notation does not make a
distinction between discrete data and time-continuous
data. An extension to basic structured analysis notation,
shown in Figure 2, provides a mechanism for representing
time-continuous data flow. The double-headed arrow is
used to represent time-continuous flow; a single-headed
arrow is used to indicate discrete data flow. In the figure,
monitored temperature is measured continuously and a
single value for temperature set-point is also provided.
The process shown in the figure produces a time-continuous output, corrected value.
The distinction between discrete and time-continuous
data flow has important implications for both the system
engineer and the software designer. During the creation of
the system model, a system engineer will be better able to
isolate those processes that may be performance critical. (It
is often likely that the input and output of time-continuous
data will be performance sensitive.) As the physical or
implementation model is created, the designer must establish a mechanism for collection of time-continuous data.
Obviously, the digital system collects data in a quasi-continuous fashion using techniques such as high-speed
polling. The notation indicates where analog to digital
hardware will be required and which transforms are likely
to demand high-performance software.
In conventional data flow diagrams, control items or
event flows are not represented explicitly. In fact, the analyst is cautioned specifically to exclude the representation
of control flow from the data flow diagram. This exclusion
is overly restrictive when real-time applications are considered; for this reason, a specialized notation for representing
event flows and control processing has been developed.
Continuing the convention established for data flow diagrams, data flow is represented using a solid arrow.
Control flow, however, is represented using a dashed or
shaded arrow. A process that handles only control flows,
called a control process, is similarly represented using a
dashed bubble.
Structured Analysis: Data Flow Modeling .. 3-3p2.3
Control flow can be input directly to a conventional
process or into a control process. Figure 3 illustrates control
flow and processing as it would be represented using
Ward and Mellor notation. The figure illustrates a toplevel view of a data and control flow for a manufacturing
cell. [A manufacturing cell is used in factory automation
applications. It contains computers and automated
machines (e.g., robots, NC machines, specialized fixtures)
and performs one discrete manufacturing operation under
computer control.] As components to be assembled by a
robot are placed on fixtures, a status bit is set within a parts
status buffer (a control store) that indicates the presence or
absence of each component. Event information contained
within the parts status buffer is passed as a bit string to a
process, monitor fixture and operator interface. The
process will read operator commands only when the control information, bit string, indicates that all fixtures contain components. An event flag, start/stop flag, is sent to
robot initiation control, a control process that enables further command processing. Other data flows occur as a consequence of the process activate event that is sent to
process robot commands.
In some situations multiple instances of the same control or data transformation process may occur in a realtime system. This can occur in a multitasking environment
when tasks are spawned as a result of internal processing
or external events.
1] Ward, P.T., and S. Mellor, Structured Development for
Real-Time Systems (3 volumes), Yourdon Press, 1985. i~
Exercise 3-7
Data Flow Modeling
Recall the electronic checkbook problem introduced
in ESE Module 3-2. Assume that you work for a consumer products company that is about to build an
electronic checkbook, called ElectroChex. The product, about the size and shape of a standard checkbook will print checks that you insert into a slot at the
end. The product stores up to 256 payee names, categorizes payments, allows you to enter numeric and
alpha information via a qwerty keyboard and has
communication capabilities to PCs.
1. Develop Level 0, 1 and 2 data flow diagrams for
ElectroChex.
2. Be sure to identify all external entities.
3. Be sure to label all bubbles (transforms) and
3-3p2.4 ·· Essential Software Engineering
arrows in your DFDs.
4. Review your results with those developed by your
colleagues.
Hint: If you have trouble starting this exercise, think
back to our discussion of the grammatical parse in ESE
Module 3-3. part 1 . You'll recall that each of the
active verbs in the statement of scope represents a
potential function for the system. The functions translate into bubbles at level 1.
Creating DFDs
Although DFD notation is quite simple, the creation of
a DFD is more difficult than you might think. The reason: most software people have been trained to think
procedurally. Data flow modeling has a procedural
component to it, but it is a process of refinement
based on how data flows, not on how program logic
progresses. For this reason, it may seem a bit odd to
you if this is your first introduction.
Readings
The following excerpt has been adapted from
Software Engineering: A Practitioner's Approach and
discusses the mechanics for creating a data flow diagram.
The data flow diagram (DFD) enables the software engineer to develop models of the information domain and
functional domain at the same time. As the DFD is refined
into greater levels of detail, the analyst performs an implicit functional decomposition of the system. At the same
time, the DFD refinement results in a corresponding refinement of data as it moves through the processes that
embody the application.
A few simple guidelines can aid immeasurably during
derivation of a data flow diagram: (1) the level 0 data flow
diagram should depict the software/system as a single
bubble; (2) primary input and output should be carefully
noted; (3) refinement should begin by isolating candidate
processes, data items and stores to be represented at the
next level; (4) all arrows and bubbles should be labeled
with meaningful names; (5) information flow continuity
must be maintained from level to level; (6) one bubble at a
time should be refined. There is a natural tendency to overcomplicate the data flow diagram. This occurs when the
analyst attempts to show too much detail too early or represents procedural aspects of the software in lieu of infermation flow.
To illustrate the use of these basic guidelines, the
SafeHome security system example, introduced earlier,
will be used. A processing narrative for SafeHome is reproduced below:
SafeHome software enables the homeowner to configure the security system when it is installed, monitors all
sensors connected to the security system, and interacts
with the homeowner through a keypad and function keys
contained in the SafeHome control panel.
During installation, the SafeHome control panel is
used to program and configure the system. Each sensor is
assigned a number and type, a master password is programmed for arming and disarming the system, and telephone number(s) are input for dialing when a sensor event
occurs.
When software senses an event, it rings an audible
alarm attached to the system. After a specified delay time
set by the homeowner during system configuration activities, the software dials the telephone number of a monitoring service, provides information about the location and
the nature of the event that it detected. The software will
re-dial the number every 20 seconds until telephone connection is obtained.
All interaction with SafeHome is managed by a userinteraction subsystem that reads input provided through
the keypad and function keys and displays prompting
messages on the LCD display, displays system status infermation on the LCD display. Keyboard interaction takes the
following form ...
A level 0 DFD for SafeHome is shown in Figure 1. The primary external entities (boxes) produce information for use
by the system and consume information generated by the
system. The labeled arrows represent composite data
items, that is, a data item that is actually a collection of
many additional data items. For example, user commands
and data encompasses all configuration commands, all activation/deactivation commands, all miscellaneous interactions, and all data that are input to qualify or expand a
command.
The level 0 DFD is now expanded into a level 1 model.
But how do we proceed? A simple yet effective approach is
to perform a grammatical parse on the processing narrative
that described the context-level bubble. That is, we isolate
all nouns (and noun phrases) and verbs (and verb phrases)
in the narrative presented above. To illustrate, we again
reproduce the processing narrative, underlining the first
occurrence of all nouns and italicizing the first occurrence
of all verbs:
Structured Analysis: Data Flow Modeling .. 3-3p2.5
SafeHome software enables the homeowner to configure the security system when it is installed, monitors all sensors connected to the security system, and interacts with the
homeowner through a keypad and function keys contained
in the SafeHome control panel.
During installation, the SafeHome control panel is
used to program and configure the system. Each sensor is
assigned a number and type, a master password is programmed for arming and disarming the system, and telephone number(s) are input for dialing when a sensor event
occurs.
When software senses an event, it rings an audible
alarm attached to the system. After a specified delay time
set by the homeowner during system configuration activities, the software dials the telephone number of a monitorine service, provides information about the location and
the nature of the event that it detected. The software will
re-dialed the number every 2 seconds until telephone connection is obtained.
All interaction with SafeHome is managed by a userinteraction subsystem that reads input provided through
the keypad and function keys and displays prompting messages on the LCD display. displays system status information on the LCD display. Keyboard interaction takes the
following form ...
It should be noted that nouns and verbs that are synonyms
or have no direct bearing on the modeling process are
omitted.
Referring to the grammatical parse, a pattern begins to
emerge. All verbs are SafeHome processes; that is, they
may ultimately be represented as bubbles in a subsequent
DFD. All nouns are either external entities (boxes), data or
control items (arrows), or data stores (double lines). Note
further that nouns and verbs can be attached to one another (e.g., sensor is assigned number and ~. Therefore, by
performing a grammatical parse on the processing narrative for a bubble at any DFD level, we can generate much
useful information about how to proceed with the refinement to the next level. Using this information, a level 1
DFD is shown in Figure 2. The context-level process shown
in Figure 1 has been expanded into seven processes
derived from an examination of the grammatical parse.
Similarly, the information flow between processes at level 1
has been derived from the parse.
It should be noted that information flow continuity is
maintained between levels 0 and 1. Elaboration of the content of inputs and output at DFD levels 0 and 1 is postponed until later.
The processes represented at DFD level 1 can be further refined into lower levels. For example, the process
monitor sensors can be refined into a level 2 DFD as shown
in Figure 3. Note once again that information flow continuity has been maintained between levels.
The refinement of DFDs continues until each bubble
performs a simple function; that is, until the process represented by the bubble performs a function that would be
easily implemented as a program component. In [ESE
Module 3-4] we discuss a concept called cohesion that can
be used to assess the simplicity of a given function. For
now, we strive to refine DFDs until each bubble is singleminded.
Creating a Control Flow Model
For many types of data processing applications, data flow
modeling is all that is necessary to obtain meaningful
insight into software requirements. As we have already
noted, however, there exists a large class of applications
that are driven by events rather than data; that produce
control information rather than reports or displays; that
process information with heavy concern for time and performance. Such applications require the use of control flow
modeling in addition to data flow modeling.
To review the approach for creating a control flow diagram (CFD), a data flow model is stripped of all data flow
arrows. Events and control items (dashed arrows) are then
added to the diagram and a "window" (a vertical bar) into
the control specification is shown. But how are events
selected?
We have already noted that an event or control item is
implemented as a boolean value (e.g., true or false, on or
3-3p2.6 ·· Essential Software Engineering
off, 1 or 0) or a discrete list of conditions (empty, jammed,
full). To select potential candidate events, the following
guidelines are suggested:
 list all sensors that are read by the software;
 list all interrupt conditions;
 list all switches that are actuated by the operator
 list all data conditions
 recalling the noun-verb parse that was applied to the
processing narrative, review all control items as possible
control specification (CSPEC) inputs/outputs:
 describe the behavior of a system by identifying its
states; identify how each state is reached and define the
transitions between states
 focus on possible omissions--a very common error in
specifying control (e.g., ask: "Is there any other way I can
get to this state or exit from it?")
A level 1 CFD for SafeHome software is illustrated in
Figure 4. Among the events and control items noted are
sensor event (i.e., a sensor has been tripped), blink flag (a
signal to blink the LCD display) and start/stop switch (a
signal to turn the system on or off). When the event flows
into the CSPEC window from the outside world, it implies
that the CSPEC will activate one or more of the processes
shown in the CFD. When a control item emanates from a
process and flows into the CSPEC window, control and
activation of some other process or an outside entity is
implied.
Exercise 3-8,
Data Flow Diagrams
Read the following problem description and then perform the tasks required to build a model using structured analysis.
Problem Statement: The Department of Public Works
for a large city has decided to develop a computerized pothole tracking and repair system (PHTRS). As
potholes are reported, they are assigned an identifying number, stored by street address, size (on a scale
of 1 to 10), location (middle, curb, etc.), district (determined from street address), and repair priority (determined from the size and location of the pothole).
Work order data associated with each pothole
includes pothole location and size, repair crew identifying number, number of people on crew, equipment
assigned, hours applied to repair, hole status (work in
progress, repaired, temporary repair, not repaired),
amount of filler material used and cost of repair (computed from hours applied, number of people, material
and equipment used). Finally, a damage file is created to hold information about reported damage due
to the pothole and includes citizen's name, address,
phone number type of damage; and dollar amount
of damage. PHTRS is an on-line system; queries are to
be made interactively. The user interacts with the system using a city map (with appropriate zoom capabilities) that enables pothole information to be entered
and displayed. For example, all streets in a five-block
section of the city can be displayed with each pothole located.
1. Using the grammatical parse, make lists of potential external entities, data objects and data stores
(nouns) and system functions (verbs).
2. Build a paper prototype of the user interface for
PHTRS.
3. Develop a level 0 DFD.
4. Develop level 1 and level 2 DFDs.
5. Develop an ERD (ESE Module 3-3, part 1) for
PHTRS.
6. Review your results with those developed by a colleague
Use separate sheets of paper for your solution to this
exercise.
Structured Analysis: Data Flow Modeling .. 3-3p2.7
Post-test, Module 3-3, part 2
This post-test has been designed to help you assess
the degree to which you've absorbed the information
presented in this ESE module.
Data Flow Modeling
1. Every computer-based system is:
a. a data transform
b. a computational transform
c. a logic transform
d. a procedural transform
2. An external entity is represented in a DFD as a
box. It represents:
a. a producer or consumer of information
b. a part of the software to be built
c. the data dictionary
d. none of the above
3. Using the basic rules that were described for creating DFDs, which of the following would represent a
violation of the rules:
a. data flowing between two bubbles in the DFD
b. an labeled arrow
c. data flowing directly between a box and a
data store
d. a labelled bubble
e. all are valid according to the rules
4. Partitioning is accomplished in part by:
a. considering composite data items
b. representing data flow at a number of
different levels
c. using the data dictionary
d. all of the above contribute to partitioning
5. Balancing means:
a. that the data flow should not be visually
lopsided
b. that input and output flow at one level must
be maintained at the next level
c. that the same number of inputs and outputs
must always be present
d. none of the above
6. A level 1 data flow diagram (the first refinement of
the context level) would typically have the following
number of bubbles:
a. 1 or 2
b. 4 to 7
c. 7 to 14
d. more than 14
7. A data store could be:
a. a file
b. a buffer
c. a relational data base
d. all of the above
8. A single data item flows into a process bubble, is
transformed and then flows out of a process bubble, is
stored in a buffer, and then is extracted from the
buffer for use in another function, which produces a
single output item. The total number of DFD symbols
required to depict this situation (not counting labeling) is:
a. 3
b. 5
c. 7
d. 9
9. Most systems require from m to n levels of data
flow refinement, where m to n is:
a. 1 to 2
b. 3 to 7
c. 5 to 9
d. 7 to 10
10. What structured analysis notation provides infermation about the internal working of a process bubble?
a. PSPEC
b. statement of scope
c. data dictionary
d. ERD
11. The data flow diagram maps into:
a. a behavioral design
b. a procedural design
c. a data design
d. an architectural design
12. An arrow in a flow model can represent:
a. data
b. data and control
c. data and relationships
d. data and logical flow
Copyright 1995 R.S. Pressman & Associates, Inc.
No part of this material may be electronically copied,
transmitted, or posted without the express written
permission of R.S. Pressman & Associates, Inc.
These materials have been posted on the Local Web
Server of the Department of Computer Science with
permission of R.S. Pressman & Associates, Inc. and are
for use only by students registered for DCS/235 at
Queen Mary and Westfield College
Download