Part III: The Analysis Process
Lecture Note 9
Analyzing Systems
Using Data Dictionaries
Systems Analysis and Design
Kendall & Kendall
Sixth Edition
Major Topics
•
•
•
•
•
•
•
Data dictionary concepts
Defining data flow
Defining data structures
Defining elements
Defining data stores
Using the data dictionary
Data dictionary analysis
2
Data Dictionary
• Is a main method for analyzing the data flows and
data stores of Data-Oriented Systems.
• Is a reference work of data about data (that is ,
metadata), one that is compiled (編輯) by systems
analysts to guide them through analysis and design.
• As a document, It collects, coordinates (協調), and
confirms (確定) what a specific data term means to
different people in the organization.
• Objective is to provide clear, comprehensive (全面的,
綜合的) information about the data and processes that
make up the system.
• Data dictionaries contain:
Data Flow, Data Structures, Data Elements, Data Stores.
3
Reasons for Using a Data Dictionary
A set of DFDs products a logical model of the system, but
the details within those DFDs are documented
separately in a data dictionary, which is the second
component of structured analysis.
• The data dictionary may be used for the following
reasons:
–
–
–
–
–
–
Provide documentation.
Eliminate redundancy. (Clean data E.g. sex=“M”, / 1 / “Male” )
Validate the DFD for completeness and accuracy.
Provide a starting point for developing screens and reports.
Determine the contents of data stored in files.
To develop the logic for DFD processes.
4
The Repository
• Although the data dictionary
contains information about
data and procedure. A central
storehouse of large collection
of information about the
system’s data is called a Data
Repository.
• It includes:
– Information about system
data.
– Procedural logic.
– Screen and report design.
– Relationships between
entries.
– Project requirements and
deliverables (陳述).
– Project management
information.
5
How Data Dictionary relate to DFD
• A Data Element, also called a data item or field, is the smallest
piece of data that has meaning within an information system.
• Data element are combined into Record, also called Structures.
• A record is meaningful combination of related data elements that
is included in a data flow or retained in a data store.
6
An order form from
World’s Trend
Catalog Division
• This company clothing and
other items by mail order
using a toll-free phone order
system and via the Internet
using customized Web forms.
• This form gives some clues (綫
索) about what to enter into a
data dictionary.
•
•
•
First, you need to capture and
store the name, address and
telephone number of the person
placing the order.
Then you need to address the
details of the order: the item
description, size, colour, price,
quantity and so on.
The customer’s method of
payment must also be determined.
7
Defining the Data Flow
•
•
You must document all data flows in the data
dictionary. Data flow are usually the first
components to be defined.
Each data flow should be defined with
descriptive information and its composite
structure or elements.
8
Include the following information:
1.
2.
3.
4.
5.
6.
7.
8.
9.
ID – an optional identification number.
A unique descriptive name for this data flow. This name is the text that
should appear on the diagram and be referenced in all descriptions
using the data flow.
A general description of the data flow.
The source of the data flow. The source could be an external entity, a
process, or a data flow coming from a data store.
The destination (終點) of the data flow.
The status of data flow, either:
•
A record entering or leaving a file.
•
Containing a report, form, or screen.
•
Internal - used between processes.
The name of the data structure describing the elements found in this
data flow, it could be one or several elements.
The volume (容量) per unit of time.
• The data could be records per day or any other unit of time.
An area for further comments and notations about the data flow.
9
Data Flow Example1
Indicates that the flow represents an input screen. It could
be any screen, such as a mainframe, GUI or Web page.
Visible Analyst was used to
create the form.
10
Defining Data Structures
A data structure is a collection of symbols and elements
describing relationship between different data. A more
complete data structure is formed from the collection of
these structures.
• Data structures are a group of smaller structures and
elements.
• An algebraic notation is used to represent the data structure.
• This method allows the analyst to produce a view of the
elements that make up the data structure along with
information about those elements.
• The analyst define these structural records and elements
once and use them in many different applications.
11
Algebraic Notation
The symbols used are:
– Equal sign “=”, meaning “consists of”.
– Plus sign (+), meaning "and”.
– Braces {} meaning repetitive (重複) elements, a repeating
element, group of elements, or tables.
• A repeating group may be:
– A sub-form.
– A screen or form table.
– A program table, matrix, or array.
• There may be one repeating element or several within the group.
• The repeating group may have:
– Conditions.
– A fixed number of repetitions.
– Upper and lower limits for the number of repetitions.
– Brackets [ ] for an either/or situation.
• Either one element may be present or another, but not both. The
elements listed inside are mutually exclusive.
– Parentheses () for an optional element.
12
Data Structures Example 1
Customer Order =
This is a example of the data
structure for Adding a Customer
Order at World’s Trend Catalog
Division.
• Each NEW CUSTOMER screen
consists of the entries found on the
right side of the equal signs. Some
of the entries are elements, but
other, such as CUSTOMER
NAME, ADDRESS and
TELEPHONE, are groups of
elements or structural records. E.g.
customer name is made up of
FIRST NAME, MIDDLE INITIAL
and LAST NAME.
• Each structural record must be
defined until the entire set is
broken down into its component
elements.
Customer Name =
Address =
Telephone =
Available Order Items =
Method of Payment =
Credit Card Type =
Customer Number +
Customer Name +
Address +
Telephone +
Catalog Number +
Order Date +
{Available Order Items} +
Merchandise Total +
(Tax) +
Shipping and Handling +
Order Total +
Method of Payment +
(Credit Card Type) +
(Credit Card Number) +
(Expiration Date)
First Name +
(Middle Initial) +
Last Name
Street +
(Apartment) +
City +
State +
Zip +
(Zip Expansion) +
(Country)
Area Code +
Local Number
Quantity Ordered +
Item Number +
Item Description +
Size +
Color +
Price +
Item Total
[Check | Charge | Money Order]
[World's Trend | Amer. Express | Discover
| MasterCard | Visa]
Physical and Logical Data Structures
• Data structures may be either logical or physical.
– When data structures are first defined, only the data
elements that the user would see. This stage is the logical
design, showing what data the business needs for its dayto-day operations.
– Then analyst designs the physical data structures.
• Logical data structures indicate the composition of
the data familiar to the user.
• The physical data structures which include
additional elements necessary for implementing the
system.
14
• Additional physical elements include:
– Key fields used to locate records in the database table.
An example is an item number, which is not required for a business to
function but is necessary for identifying and locating computer record.
– Codes to indicate record status, such as if an employee is
active (currently employed) or inactive.
– Transaction codes are used to identify types of
records when a file contains different record types. An
example is a credit file containing records for returned items as
well as records of payment.
– Repeating group entries containing a count of how
many items are in the group.
– Limits on the number of items in a repeated group.
– A password used by a customer accessing a secure
Web site.
15
• A structure may consist of elements
or smaller structural records.
• These are a group of fields, such as:
– Customer Name.
– Address.
– Telephone.
• Each of these must be further defined
until only elements remain.
• Structural records and elements that
are used within many different
systems should be given a nonsystem-specific name, such as street,
city, and zip.
• The names do not reflect a functional
area.
• This allows the analyst to define
them once and use in many different
applications.
Structural Records
Customer Name =First Name +
(Middle Initial) +
Last Name
Address =
Street +
(Apartment) +
City +
State +
Zip +
(Zip Expansion) +
(Country)
Telephone =
Area code +
Local number
16
Data Structure Example 2
This is a example of the
data structure for a
CUSTOMER BILLING
STATEMENT, one showing
that the ORDER LINE is
both a repeating item and a
structure record. The
ORDER LINE limits are
from 1 to 5, indicating that
the customer may order
from one to five items on
this screen. Additional
items would appear on
subsequent orders.
17
Defining Elements
• Data elements should be defined with
descriptive information in the data dictionary,
length and type of data information,
validation criteria, and default values.
• Each element should be defined once in the
data dictionary.
• The objective is to provide clear,
comprehensive information about the data
and processes that make up the system.
18
•
Attributes of each element are:
1. Element ID. This is an optional entry that allows the
analyst to build automated data dictionary entries.
2. The name of the element, descriptive and unique
•
It should be what the element is commonly called in most
programs or by the major user of the element.
3. Aliases, which are synonyms (同義詞) or other names
for the element
4. A short description of the element. These are names
used by different users within different systems
Example, a Customer Number may be called a:
•
•
Receivable Account Number.
Client Number.
5. Whether the element is “Base” or “Derived”:
–
–
A base element is keyed into the system and it must be stored
on files.
A derived element is created by processes as the output of
calculations or logic.
19
6. The length of an element.
–
Some elements have standard lengths, such as a state
abbreviation, zip code, or telephone number.
– For other elements, the length may vary and the analyst
and user community must decide the final length.
The final length base on the following consideration:
1. Numeric amount lengths should be determined by
figuring the largest number the amount will probably
contain and then allowing reasonable room for
expansion (擴大).
2. Totals should be large enough to accommodate (容納) the
numbers accumulated (積累) into them.
3. It is often useful to sample historical data to determine a
suitable length.
20
7.
The type of data: numeric, date, alphabetic or
character, which is sometimes called alphanumeric
or text data.
8. The input and output formats should be included,
using special coding symbols to indicate how the
data should be presented. For example,
XXXXXXXX would be represented as X(8).
These may translate into masks used to define database
fields.
21
9. Validation (確認) criteria must be defined.
Validation criteria for ensuring that accurate data
are captured by the system.
Elements are either:
• A range of values is suitable for elements that contain
continuous (連續的) data with a smooth range of values.
Continuous elements are checked that the data is within
limits or ranges.
• A list of values is indicated if the data are discrete (不連接的).
– Discrete, meaning they have fixed values.
• Discrete elements are verified by checking the values within a
program.
• They may search a table of codes.
• A table of codes is suitable if the list of values is extensive (
大量的). E.g. telephone country code.
• For key or index elements, a check digit is included.
22
10. Any default value the element may have.
– The default value is displayed on entry screens and
is used to reduce the amount of keying that the
operator may have to do.
– Default values on GUI screens
• Initially display in drop-down lists.
• Are selected when a group of radio buttons are
used.
11. An additional comment / remarks area.
– This might be used to indicate the format of the
date, special validation that is required, the checkdigit method used, and so on.
23
Data Element Example 1
• Customer Number may be called Client Number. This variable can be as large as 999999,
but cannot be less than zero. Another kind of data element is an alphabetic (字母表) element.
BL: blue, WH: white, GR: green. When this element is implemented, a table will be need for
24
users to look up the meanings of these codes.
Data Element Example 2
Shows a sample screen that illustrates how the
Social Security Number data element might be
recorded in the Visible Analysis data dictionary.
1
2
3
4
5
6
8
7
1. Online or
manual
documentation
entries often
indicate which
system is
involved.
2. The data
element has a
standard label
that provides
consistency
throughout
the data
dictionary.
3. The data
element can
have an
alternative
name, or
alias.
4. This entry
indicates that
the data
element
consists of
nine numeric
characters.
5. Depending
on the data
element, strict
limits might
be placed on
acceptable
values.
6. The data
comes from
the
employee’s
job
application.
7. This entry
indicates that
only the payroll
department has
authority to
update or
change this
data.
8. Indicates the
individual or
department
responsible for
entering and
changing data.
Defining Data Stores
• Data stores contain a minimal of all base elements
as well as many derived elements.
• Data stores are created for each different data
entity; that is, each different person, place, or thing
being stored.
• Data flow base elements are grouped together and
a data store is created for each unique group.
• Since a data flow may only show part of the
collective data, called the user view, you may have
to examine many different data flow structures to
arrive at a complete data store description.
26
Data Store Definition
1.
2.
3.
4.
5.
6.
7.
8.
9.
The Data Store ID
The Data Store Name, descriptive and unique
An Alias for the file
A short description of the data store
The file type, either manual or computerized
If the file is computerized, the file format designates
whether the file is a database file or the format of a
traditional flat file.
The maximum and average number of records on the file
The growth per year. This helps the analyst to predict (預告)
the amount of disk space required.
The data set name specifies the table or file name, if
known. In the initial design stages, this may be left blank.
The data structure should use a name found in the data
27
dictionary.
Data Store Example 1
• Primary and secondary keys must
be elements found within the data
structure.
E.g:
• Customer Master File
Customer Number is the
primary key, which should be
unique.
• The Customer Name, Telephone,
and Zip Code are secondary keys.
28
Data Store Example 2
1. This data store has
an alternative name
or alias.
2. For consistency,
data flow names
are standardized
throughout the data
dictionary.
3. It is important
document these
estimates, because
they will affect
design decision in
subsequent SDLC
phases.
1
2
3
Visible Analyst screen that documents a data store named IN STOCK.
Defining the Entities
• By documenting all entities, the data dictionary can serve
as a complete documentation package.
• Typical characteristics of an entity include the following.
– Entity name: The entity name as appears in the DFDs.
– Description: Describe the entity and its purpose.
– Alternate name: Any aliases for the entity name.
– Input data flows: The standard DFD names for the
input data flows to the entity.
– Output data flows: The standard DFD names for the
data flows leaving the entity.
30
1
2
Diagram 0 for Order System.
1.
The external entity also can have
an alternative name, or alias, if
properly documented.
2.
For consistency, these data flow
names are standardized
throughout the data dictionary.
Visible Analyst screen that documents an
external entity named WAREHOUSE
31
Defining the Processes
• You must document every process. Following are
typical characteristics of a process:
– Process name or Label: The process name as it appears
on the DFDs.
– Description: A brief statement of the process’s purpose.
– Process Number: A reference number that identifies the
process and indicates relationships among various levels
in the system.
– Process description: This section includes the input and
output data flows. For functional primitives, the process
description also documents the processing steps and
business logic. (You will learn how to write process
descriptions in the next lecture)
32
1
2
Diagram 1 DFD shows details of the FILL
ORDER process in the order system.
Visible Analyst screen that describes a process
named VERIFY ORDER
1.
The process number identifies this process. Any sub-process are numbered 1.1,
1.2, 1.3 and so on.
2.
Process description includes data flows, logical rules and structured English
version of the policy.
Data Dictionary Report
• The data dictionary serves as the central storehouse of
documentation for an information system. In addition to
describing each data element, data flow, data store, data
structure, entity and process, the data dictionary documents
the relationships among these components. You can obtain
many valuable reports from a data dictionary, including the
following:
– An alphabetized list of all data elements by name.
– A report by user departments of data elements that must
be updated by each department.
– A report of all data flows and data stores that use a
particular data element.
– Detailed reports showing all characteristics of data
elements, records, data flows, processes, or any other
selected item stores in the data dictionary.
34
Data Dictionary and Data Flow Diagram Levels
• Data dictionary entries vary according to the level of
the corresponding data flow diagram.
• Data dictionaries are created in a top-down manner.
• Data dictionary entries may be used to validate parent
and child data flow diagram level balancing.
• Whole structures, such as the whole report or screen,
are used on the top level of the data flow diagram.
– Either the context level or diagram zero
• Data structures are used on intermediate-level data
flow diagram.
• Elements are used on lower-level data flow diagrams.
35
Creating Data Dictionaries
1.
Data Dictionary entries may be created after the DFD has
been completed.
–
–
–
2.
The analysts may create a Diagram 0 data flow after the first
interviews and
At the same time, make the preliminary data dictionary entries.
Typically, these entries consists of data flow name and their
corresponding data structure.
After several additional interviews have been conducted to learn
the details of the system, the analyst will expand the DFD and
create the child DFD.
Each level of a DFD should use data appropriate (恰當) for
the level. The data dictionary is then modified to include
new structural records and elements .
•
•
Diagram 0 should include only forms, screens, reports and records.
As child diagrams are created, the data flow into and out of the
processes becomes more and more detailed, including structural
records and elements.
36
• It is important that the data flow names in the child DFD are contained as elements
or structural records in the date flow on the parent process.
• It illustrates a
portion of two data
flow diagram levels
and corresponding
data dictionary
entries for producing
an employee pay
check. Process 5
found on Diagram 0,
is an overview of the
production of an
EMPLOYEE PAY
CHECK. The
corresponding data
dictionary entry for
EMPLOYEE
RECORD shows the
EMPLOYEE
NUMBER and four
structural records.
37
Analyzing Input and Output
An important step in creating the data dictionary is to identify and
categorize system input and output data flow. Input and output
analysis forms may be used to organize the information obtained
from interviews and document analysis. This form contains the
following commonly included fields:
• A name for the input and output.
• The user contact responsible (負責) for further details
clarification (澄清), design feedback, and final approval (批准).
• Whether the data is input and output.
• The format of the data flow.
• Elements indicating the sequence (次序) of the data on a
report and screen (perhaps in columns).
• A list of elements, including their names, lengths, base and
derived, and their editing criteria.
38
An Example of an input/output analysis form for World’s Trend
Catalog Division.
• Once the form has been
completed, each element
should be analyzed to
determine whether the
element repeats, is optional
or is mutually (相互的)
exclusive (排斥) of another
element.
• Elements that fall into a
group or that regularly
combine with several other
elements in many structures
should be placed together in
a structural record.
39
Developing Data Stores
• Another activity in
creating the data
dictionary is developing
data stores. Up to now,
we have determined what
data needs to flow from
one process to another.
This information is
described in data
structures. The
information may be
stored in numerous places
and in each place the data
store may be different.
• Data stores may be
determined by analyzing
data flows.
• Each data store should
consist of elements on the
data flows that are
logically related, meaning
they describe the same
entity.
Data stores derived from a pending order at
World’s Trend Catalog Division.
40
Using and Maintaining the Data Dictionary
• As the SA learns about the organization’s systems, data items
are added to the data dictionary.
• To have maximum power, the data dictionary should be tied (
束縛) into other programs in the system. So that when an item
is updated or deleted from the data dictionary, it is updated or
deleted from the database.
• Data dictionaries may be used to:
– Create reports, screens, and forms.
• Use the element definitions to create fields.
• Arrange the fields in an aesthetically pleasing screen, form, or
report, using design guidelines and common sense.
• Repeating groups become columns.
• Structural records are grouped together on the screen, report, or
form.
– Generate computer program source code.
– Analyze the system design for completion and to detect
design flaws.
41
42
Data Dictionary Analysis
• The data dictionary may be used in conjunction with the
data flow diagram to analyze the design, detecting flaws
and areas that need clarification.
• Some considerations for analysis are:
– All base elements on an output data flow must be
present on an input data flow to the process producing
the output.
– Base elements are keyed and should never be created
by a process.
– A derived element should be output from at least one
process that it is not input into.
– The elements that are present on a data flow into or
coming from a data store must be contained within the
data store.
43
Review Question
•
•
•
•
Define the term data dictionary.
What information is contained in the data
repository?
Describe the difference between base and
derived elements.
What are the main benefits of using a data dictionary?
44