ALFRED P. SLOAN SCHOOL OF MANAGEMENT

WORKING PAPER
Productivity Impacts of Software Complexity
and Developer Experience
Geoffrey K. Gill
Chris F. Kemerer
MIT Sloan School WP #3107-90
January 1990
MASSACHUSETTS
INSTITUTE OF TECHNOLOGY
50 MEMORIAL DRIVE
CAMBRIDGE, MASSACHUSETTS 02139
Productivity Impacts of Software Complexity
and Developer Experience

Geoffrey K. Gill
Chris F. Kemerer

MIT Sloan School of Management
Cambridge, MA

January 1990

Research support for the second author from the Center for Information Systems Research and the International Financial Services Research Center is gratefully acknowledged. Helpful comments on the first draft from W. Harrison, D. R. Jeffery, W. Orlikowski and D. Zweig are gratefully acknowledged.
Productivity Impacts of Software Complexity
and Developer Experience

Geoffrey K. Gill
Chris F. Kemerer

MIT Sloan School of Management
Cambridge, MA
Abstract
The high costs of developing and maintaining software have become widely recognized as major obstacles to the continued successful use of information technology. Unfortunately, against this backdrop of high and rising costs stands only a limited amount of research in measuring software development productivity and in understanding the effect of software complexity on variations in productivity. The current research is an in-depth study of a number of software development projects undertaken by an aerospace defense contracting firm in the late 1980s. These data allow for the analysis of the impact of software complexity and of varying levels of staff expertise on both new development and maintenance.

In terms of the main research questions, the data support the notion that more complex systems, as measured by size-adjusted McCabe cyclomatic complexity, require more effort to maintain. For this data set, the Myers and Hansen variations were found to be essentially equivalent to the original McCabe specification. Software engineers with more experience were more productive, but this labor production input factor revealed declining marginal returns to experience. The average net marginal impact of additional experience (minus its higher cost) was positive.
CR Categories and Subject Descriptors: D.2.7 [Software Engineering]: Distribution and Maintenance; D.2.8 [Software Engineering]: Metrics; D.2.9 [Software Engineering]: Management; F.2.3 [Analysis of Algorithms and Problem Complexity]: Tradeoffs among Complexity Measures; K.6.0 [Management of Computing and Information Systems]: General - Economics; K.6.1 [Management of Computing and Information Systems]: Project and People Management; K.6.3 [Management of Computing and Information Systems]: Software Management

General Terms: Management, Measurement, Performance.

Additional Key Words and Phrases: Software Productivity, Software Maintenance, Software Complexity, McCabe Metrics.
1 INTRODUCTION
The high costs of developing and maintaining software have become widely recognized as major obstacles to the continued successful use of information technology. For example, Barry Boehm estimated that $140 billion is spent annually worldwide on software [Boehm, 1987]. Therefore, even a small increase in productivity would have significant cost reduction effects. However, despite the practical importance of software productivity, there is only a limited amount of research in measuring software development productivity and in understanding the effect of software complexity on variations in productivity.
The purpose of this paper is to add to the body of research knowledge in the areas of software development productivity and software complexity. Traditionally, software development productivity has been measured as a simple output/input ratio, most typically source lines of code (SLOC) per work-month [Boehm, 1981, Conte, et al., 1986, Jones, 1986]. While this metric has a long research history, and is widely used by practitioners, criticisms of the metric have sparked new research questions. The numerator, SLOC, is only one measure of output. However, in the case of software maintenance, it does not take into consideration possible differences in the underlying complexity of the code being maintained. Since researchers such as Fjeldstad and Hamlen have noted that approximately 50% of the effort in maintenance is spent understanding the existing code, neglecting to account for initial code complexity may be a serious omission [Fjeldstad/Hamlen, 1979].

In terms of the denominator, work-months (often called man-months), this too may be only a rough measure of productivity. Numerous lab experiments and field observations have noted the wide variations in productivity across individual systems developers [Chrysler, 1978, Sackman, et al., 1968, Curtis, 1981, Dickey, 1981]. The idea is that who charges those work-months could be as important as how many work-months are charged. Yet, the usage of the aggregated work-months metric ignores this potential source of productivity variations.
Another concern with this metric is that it provides no information on how those hours were spent, in terms of which activities of the systems development life cycle were supported. In particular, some previous research [McKeen, 1983] has suggested that greater time spent in the design phase may result in better systems.

The current research is an in-depth study of a number of software development projects undertaken by an aerospace defense contracting firm in the late 1980s. The data collection effort went beyond collecting only aggregate SLOC/work-month data. In terms of the source code, data were collected on the complexity of the code written, as measured by the McCabe Cyclomatic Complexity measure [1976], and the variations proposed by Myers [1977] and Hansen [1978]. For maintenance projects, these data were also collected on the existing system code before the maintenance effort was begun. Code size and complexity were measured on approximately 100K non-comment SLOC of Pascal and Fortran source code in 465 modules. These data allow for the analysis of the impact of complexity on both new development and maintenance.

In terms of the work-month information, data were collected by person by week in terms of both the number of hours and the activities upon which those hours were spent. This information was obtained by collecting and processing approximately 1200 contemporaneously-written activity reports. Therefore, analysis of the impact of varying levels of staff expertise and varying styles of project management is possible.
In terms of the main research questions, the data support the notion that more complex systems, as measured by cyclomatic complexity, require more effort to maintain. The magnitude of this effect is that the productivity of maintaining the program drops an estimated one SLOC/day for every extra path per 1000 lines of existing code. For this data set, the Myers and Hansen variations were found to be essentially equivalent to the original McCabe specification. In terms of productivity, the data support the notion that more time spent in the design phases of the project is related to increased productivity, at a rate of over six additional SLOC/day for each percentage point of design activity. Software engineers with more experience were more productive, but there appeared to be declining marginal returns to experience. The average net marginal impact of additional experience (minus its higher cost) was positive. The statistical results are reported in full in Sections 4 and 5.

This paper is organized as follows. Section 2 describes the main research questions and compares and contrasts the current work with previous research. Section 3 describes the data site and the data collection methodology, and presents the data. Section 4 presents the analysis of the source code, and Section 5 presents the results of the productivity and complexity analysis. Finally, Section 6 offers some conclusions and suggestions for future research.
2 RESEARCH QUESTIONS AND PRIOR RESEARCH

A vast literature exists on software engineering metrics and modeling, and a comprehensive review is clearly beyond the scope of this paper.¹ Rather, in this section the current research questions will be developed within the framework of specific prior research on the following topics: a) software output metrics, b) new software development productivity, and c) software maintenance productivity.

¹ The interested reader is referred to [Conte, et al., 1986] and [Jones, 1986] for broad coverage of a number of related topics. Also, an earlier, lengthier version of this section appears in [Gill, 1989].

2.1 Software Metrics
There are many possible ways to measure the productivity of software development. The most widely used metric for size involves counting source lines of code (SLOC) in some manner.² The definition of a line of code adopted for this study is that proposed by Conte, Dunsmore and Shen [1986, p. 35]:

    A line of code is any line of program text that is not a comment or blank line, regardless of the number of statements or fragments of statements on the line. This specifically includes all lines containing program headers, declarations, and executable and non-executable statements.
This set of rules has several major benefits. It is very simple to calculate and it is very easy to translate for any computer language. Because of its simplicity, SLOC has become the most popular metric [Conte, et al. 1986, p. 35]. Through consistent use of this metric, results become comparable across different studies.
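As an illustration of how mechanical this definition is, the following sketch applies it to a list of source lines (our illustration, not the tooling used in this study; whole-line Pascal-style { } comments are assumed and multi-line comments are ignored):

    # Count source lines of code per Conte, Dunsmore and Shen [1986]: every
    # line that is not blank and not purely a comment counts exactly once,
    # no matter how many statements or fragments it holds.
    def count_sloc(lines):
        sloc = 0
        for line in lines:
            stripped = line.strip()
            if not stripped:
                continue      # blank lines are never counted
            if stripped.startswith('{') and stripped.endswith('}'):
                continue      # whole-line comment: not a line of code
            sloc += 1         # headers, declarations, executable and
                              # non-executable statements all count
        return sloc

    print(count_sloc(['program P;', '', '{ setup }', 'begin',
                      'x := 1; y := 2;', 'end.']))   # prints 4

Note that the line holding two statements still counts only once, exactly as the definition requires.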
One problem with almost all the counting rules for SLOC is that they ignore comments. While comments are not executable, they do provide a useful function and are therefore properly regarded as part of the output of the system development process. For this reason, comment lines were separately counted in the current research. This issue is discussed more thoroughly in the methodology section.

The most widely accepted metric for complexity is that proposed by McCabe [1976].³ He proposed that a measure of the complexity is the number of possible paths through which the software could execute. Since the number of paths in a program with a backward branch is infinite, he proposed that a reasonable measure would be the number of independent paths. After defining the program graph for a given program, the complexity calculation would be:
² As Boehm [1987] has described, this method has many deficiencies. In particular, SLOC are difficult to define consistently, as Jones [1986] has identified no fewer than eleven different algorithms for counting code. Also, SLOC do not necessarily measure the "value" of the code (in terms of functionality and quality). However, Boehm [1987] also points out that no other metric has a clear advantage over SLOC. Furthermore, SLOC are easy to measure, conceptually familiar to software developers, and are used in most productivity databases and cost estimation models. For these reasons, it was the metric adopted by this research.

³ Another option would have been to use one of the Halstead Software Science metrics in place of the McCabe metric [Halstead, 1977]. However, previous research has shown no advantage for these metrics over the McCabe metric [Curtis, et al., 1979; Lind and Vairavan, 1989], and more recent work has been critical of Software Science metrics [Shen, Conte, and Dunsmore 1983; Card and Agresti, 1987]. For these reasons, the McCabe metric was chosen.
    V(G) = e - n + 2

where
    V(G) = the cyclomatic complexity,
    e    = the number of edges,
    n    = the number of nodes.

Analytically, V(G) is the number of predicates plus 1 for each connected single-entry single-exit graph or subgraph.
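The calculation is simple enough to sketch in a few lines (ours, not the metrics programs used in this study), given a single-entry, single-exit control-flow graph as an adjacency list:

    # V(G) = e - n + 2 for one connected single-entry single-exit graph.
    def cyclomatic_complexity(cfg):
        n = len(cfg)                            # number of nodes
        e = sum(len(s) for s in cfg.values())   # number of edges
        return e - n + 2

    # A single IF-THEN-ELSE: the entry node branches to two arms
    # that rejoin at the exit node.
    cfg = {'entry': ['then', 'else'], 'then': ['exit'],
           'else': ['exit'], 'exit': []}
    print(cyclomatic_complexity(cfg))           # prints 2: one predicate plus 1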
Hansen [1978] gave four simple rules for calculating McCabe's Complexity:

1. Increment one for every IF, CASE or other alternate execution construct.
2. Increment one for every iterative DO, DO-WHILE or other repetitive construct.
3. Add two less than the number of logical alternatives in a CASE.
4. Add one for each logical operator (AND, OR) in an IF.

Myers [1977] demonstrated what he believed to be a flaw in McCabe's metric. Myers pointed out that an IF statement with logical operators was in some sense less complicated than two IF statements with one imbedded in the other, because there is the possibility of an extra ELSE clause in the second case. For example:

    IF (A AND B) THEN
    ELSE

is less complex than:

    IF (A) THEN
        IF (B) THEN
        ELSE
    ELSE
Myers [1977] notes that calculating V(G) by McCabe's rules is identical for these two cases. To remedy this problem, Myers suggested using a tuple (CYCMID, CYCMAX), where CYCMAX is McCabe's measure, V(G), counting individual conditions (as McCabe suggests, all four rules are used), and CYCMID uses only the first three. Using this complexity interval, Myers was able to resolve the anomalies in the complexity measure.
Hansen [1978] raised a conceptual objection to Myers's improvement. Hansen felt that because CYCMID and CYCMAX were so closely related, using a tuple to represent them was redundant and also did not provide enough information to warrant the difficulties of using a tuple. Instead, he proposed that what was really being measured by the second half of the tuple (CYCMAX) was just a measure of the complexity of the expressions; he felt that rule #3 above measured the complexity of the expression, not the intrinsic number of paths through the code. To provide a better metric, he suggested that a tuple consisting of (CYCMIN, operation count) be used, where CYCMIN is McCabe's complexity calculated using just the first two rules for counting.
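To make the differences among the three variants concrete, the following sketch (ours; the encoding of constructs is purely illustrative) applies the counting rules above to the decision constructs of a module:

    # CYCMIN uses rules 1-2 only, CYCMID rules 1-3, CYCMAX all four rules.
    def cyclomatic_variants(constructs):
        cycmin = cycmid = cycmax = 1        # straight-line code has V(G) = 1
        for c in constructs:
            if c in ('if', 'loop'):         # rules 1 and 2
                cycmin += 1; cycmid += 1; cycmax += 1
            elif isinstance(c, tuple) and c[0] == 'case':
                n_alts = c[1]               # rule 1, plus rule 3 for a CASE:
                cycmin += 1                 #   an n-way CASE adds n - 1 paths
                cycmid += n_alts - 1
                cycmax += n_alts - 1
            elif c == 'logical_op':         # rule 4: an AND/OR inside an IF
                cycmax += 1
        return cycmin, cycmid, cycmax

    # Myers's example: IF (A AND B) THEN ... ELSE
    print(cyclomatic_variants(['if', 'logical_op']))   # (2, 2, 3)
    # versus the nested form, which the interval ranks as more complex:
    print(cyclomatic_variants(['if', 'if']))           # (3, 3, 3)

Note that CYCMAX is 3 in both cases, reproducing the anomaly Myers observed, while CYCMID separates the two.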
All of the above tuple measures have a problem in that, while providing ordering of the complexity of programs, they do not provide a single number that can be used for quantitative comparisons. It is also an open research question whether choosing one or the other of these metrics makes any practical difference. While a number of authors have attempted validation of the CYCMAX metric (e.g. Li and Cheung, 1987; Lind and Vairavan, 1989) there does not appear to be any empirical validation of the practical significance of the Myers or Hansen variations. Therefore, one question investigated in this research is a validation of the CYCMIN, CYCMID, and CYCMAX complexity metrics.

2.2 Productivity of New Software Development

While the issue of software development productivity is of practical significance and has generated a considerable amount of research attention, what has often been lacking is detailed empirical data [Boehm, 1987]. Past research can generally be categorized into two broad areas. The first are termed "black box" or "macro" studies which collect data on gross inputs, gross outputs and limited descriptive data in order to build software production models. Examples of this type of work would include the classic work of Walston and Felix [1977] and Boehm [1981] and more recent work by Jeffery and Lawrence [1985] and Banker, et al. [1987].
A second type of research is the so-called "glass box" or "micro" models where data are collected at a detailed activity level, rather than the gross level of the "black box" models. Examples of this type of "process" research include Boehm, et al. [1984], and Harel and McLean [1984]. A typical methodology is a controlled experiment involving software developers doing a small task on which summary data are collected.

One possible criticism of the first type of research is that by examining high level summary data, potentially interesting research questions cannot be answered. For example, one concern is how best to allocate staffing over the phases of the system development lifecycle. Data collected only at the project level cannot be used to address this question. On the other hand, the "micro" level research is sometimes criticized for being artificial. In order to meet the demands of experimental design and their practical constraints, the researchers are forced either to use small samples or to abstract from reality (e.g. by using undergraduate students as subjects) to such an extent as to bring the external validity of their results into question [Conte, et al 1986].
The current research was designed to mitigate both of these potential shortcomings. The approach taken is of the "macro" production modelling type, but data were collected at an in-depth, detailed level. Input (effort) data were collected by person by week and mapped to contemporaneously collected project phase/task data. Output data included not only SLOC, but McCabe, Myers, and Hansen complexity measures as well.
Therefore these data allow the research to address the following detailed research questions:

    What are the productivity impacts of differing phase distributions of effort? Specifically, does a greater percent of time spent in the design phase improve overall productivity?

    How much does the productivity improve with personnel experience? Are there declining marginal returns to additional experience, and if so, what is their magnitude?

These questions can be answered on the basis of empirical data at real world sites and therefore this research avoids some of the potential external validity concerns of the classic experimental approach.⁴
2.3 Software Maintenance Productivity

The definition of software maintenance adopted by this research is that of Parikh and Zvegintzov [1983], namely "work done on a software system after it becomes operational." Lientz and Swanson [1980, pp. 4-5] include three tasks in their definition of maintenance: "...(1) corrective -- dealing with failures in processing, performance, or implementation, (2) adaptive -- responding to anticipated change in the data or processing environments; (3) perfective -- enhancing processing efficiency, performance, or system maintainability." The projects described below as maintenance work pertain to all of the three Lientz and Swanson types.
Software maintenance is an area of increasing practical significance. It is estimated that from 50-80% of information systems resources are spent on software maintenance [Elshoff, 1976]. However, according to Lientz and Swanson [1980] relatively little research has been done in this area. Macro-type studies have included Vessey and Weber [1983] and Banker, et al. [1987]. Micro-level studies have included Rombach [1987] and Gibson and Senn [1989].

⁴ Of course, the field study methodology is unable to provide many of the tight controls provided by the experimental approach and therefore requires its own set of caveats. Some of these are discussed in the conclusions section.
One area that is of considerable interest for research is the effect of existing system complexity on maintenance productivity. It is generally assumed that a) more complex systems are harder to maintain, and b) that systems suffer from "entropy", and therefore become more chaotic and complex as time goes on [Belady and Lehman, 1976]. These assumptions raise a number of research questions; in particular, when is a system sufficiently costly to maintain that it is more economic to replace (rewrite) than to repair (maintain) it? This is a question of some practical importance, as one type of tool in the Computer Aided Software Engineering (CASE) marketplace is the so-called "restructurer", designed to reduce existing system complexity by automatically modifying the existing source code [Parikh, 1986; GSA, 1987]. Knowledge that a metric such as CYCMAX accurately reflects the difficulty in maintaining a set of source code would allow management to make rational choices in the repair/replace decision, as well as aid in evaluating these CASE tools.

3 DATA COLLECTION AND ANALYSIS

3.1 Data Overview and Background of the Datasite
A prerequisite to addressing the research questions in this study was to collect detailed data about a number of completed software projects, including data on individual personnel. All the projects used in this study were undertaken in the period from 1984-1989, with most of the projects in the last three years of that period. All the data originated in one organization, a defense industry contractor hereafter referred to as Alpha Company. This approach meant that because all the projects were undertaken by one organization, neither inter-organizational differences nor industry differences should confound the results.
A possible drawback is that this concentration on one industry and one company may somewhat limit the applicability of the results. However, two factors mitigate this drawback. First, because the concepts studied here are widely applicable across applications, there is no a priori reason to believe that there will be significant differences in the trends across industries. Second, the defense industry is a very important industry for software⁵ and the results will clearly be of interest to researchers in that field.

⁵ Fortune magazine reported in its September 25, 1989 issue that the Pentagon spends $30 billion annually on software.

The projects at this site were small custom defense applications. The company considers the data contained in this study proprietary and therefore requested that its name not be used and any identifying data that could directly connect it with this study be withheld. The numeric data, however, have not been altered in any way. The company employed an average of about 30 professionals in the software department. The company produced state-of-the-art custom systems primarily for real-time defense-related applications.
At the start of this study, 21 potential projects were identified for possible study. Two of the projects were, however, eliminated as being non-representative of this sample. The first project was a laboratory experiment in which the major task of the chief software engineer was mathematical analysis. While a small amount of software was written for this project, there was no way to separate the analysis effort from the software development effort. The other project was nearly an order of magnitude larger than the next largest project and was believed to be unrepresentative of the Alpha population in its technical design. This project was the only project to employ an array processor, and a substantial fraction of the software developed on the project was micro-code, also non-representative of the Alpha environment.

The data set therefore consists of the remaining 19 software projects. These projects represented 20,134 hours (or roughly 11.2 work-years) of software engineering effort, and were selected because of the availability of detailed information provided by the engineers' weekly activity reports.
Table 3.1.1 is a summary of some of the gross statistics on the projects. DURATION is the number of weeks over which the project was undertaken; NCSLOC is the number of non-comment source lines of code; CSLOC is the number of source lines of code and includes comments; TOTALHRS is the total number of labor hours expended by software department personnel; NCPROD is the number of non-comment lines of code divided by the total number of labor hours.

[Table 3.1.1: Gross statistics on the projects -- mean, standard deviation, and median of DURATION, NCSLOC, CSLOC, TOTALHRS, and NCPROD.]
... questionnaire describing the project.

5. Engineer Profile Data Collection Forms -- A survey on the education and experience of every engineer who worked on one of the projects.
The first data source was the source code generated in each project. Of the code developed for the projects, 74.4% was written in Pascal, and the other 25.6% in a variety of other third generation languages. Following Boehm [1981], only deliverable code was included in this study, with deliverable defined as any code which was placed under a configuration management system. This definition eliminated any purely temporary code, but retained code which might not have been part of the delivered system, but was considered valuable enough to be saved and controlled. (Such code included programs for generating test data, procedures to recompile the entire system, utility routines, etc.)

The number of hours worked on the project and the person charging the project were obtained from the database. Because the hourly charges were used to bill the project to the United States Government Department of Defense (DOD) customer and were subject to audit, the engineers and managers used extreme care to insure proper allocation of charges. For this reason, these data are believed to be accurate representations of the actual effort expended on each project.
At Alpha Company, software engineers wrote a summary, called an activity report, of the work they had done on each project every week. The standard length of these descriptions was one or two pages for all the projects. If an engineer worked on only one project, the description of the work for that project would occupy the entire page. If several projects were charged, the length of the descriptions tended to be roughly proportional to the amount of time worked, with the entire report fitting on a page. An example of an activity report appears in Appendix A.
Project profile surveys developed for this research provided background data on each project. The surveys requested descriptions of the project, qualitative rankings of various characteristics of the development process (number of interruptions, number of changes in specifications, etc.), and ratings of the quality of the software developed. Unlike the data above, these data are subjective, and results based on them are interpreted cautiously. It is useful, however, to have some measure of these qualitative variables in order to obtain some feel for how the results may be affected. A copy of the survey form used to collect these data is included in Appendix B.
The engineer profile forms obtained data on each engineer's educational background, work experience and Alpha company level. If the employee in question was still with the company (24 out of the 34 engineers in the sample), he or she was asked to fill the form out personally. If the employee no longer worked for the company, personnel records were consulted. A copy of the engineer profile survey is included in Appendix C.

3.3 Categorization of Development Activities
Although unusual for this type of research, this study had access to the week-by-week description of what work was done by each engineer on each project. These descriptions were filled out by the individual engineers every week to inform their managers of the content of their work and also to provide documentation of the work charges in case of a DOD audit. Because these documents were written contemporaneously by everyone who was working on a project, they are believed to present an accurate picture of what happened during the software development process. Approximately 1200 reports were examined for this research.

These activity reports provided a large descriptive database; however, in order to test the hypotheses relating to productivity a key problem was to determine a way to quantify these data. The method used was to categorize the work done each week by each engineer into one of nine standard systems development lifecycle phases employed by Alpha. One advantage of this approach is that these categories corresponded quite closely to the "local vocabulary", i.e., the engineers at Alpha Company tended to organize their work in terms of the above categories. This connection proved very helpful when the reports were read, and the work categorized. The nine activity categories that were used are described in Appendix D.
To categorize a project's activity reports, the researcher selected an engineer and then read through all the engineer's activity reports sequentially. The work done on the project each week was categorized into one or more of the project phases. For the large majority of reports only one phase was charged per report; however, for a fraction of the activity reports the effort was split into two phases.⁶ To obtain consistency across projects and people, and because access to the data was limited to on-site inspection, only one researcher was used to do the categorization task for all the reports. Because the researcher read the activity reports in a sequential manner, he was able to follow the logic of the work, as well as getting the "snapshot" that any one report would represent. Finally, in two cases where the reports were unclear, the engineers were shown their own reports and asked to clarify what was actually done.⁷
⁶ On a few reports (less than 5%) a maximum of three phases were charged.

⁷ Of course a possible limitation of this single rater approach is an inability to check for a possible methods bias. Although a conscious research design, it does introduce a possible source of error.
In some cases, all of the activity reports were not available, typically because the report had been forgotten or lost. In addition, the reports had only been kept for the past three years so that they were generally not available prior to that time.⁸ For four of the nineteen projects studied, 25% or more of these data were missing, and no attempt was made to reduce the activity reports for those projects. To obtain the best breakdown of phases for the unassigned time on the remaining 15 projects, such time was categorized in proportion to the categorization of the rest of the project. The median amount of time categorized as unassigned was 9.9% (see Table 3.3.1).

[Table 3.3.1: Total hours and percent of hours unassigned, by project.]
Scientists:
    Level 1: Principal Scientist. Requires peer review to be promoted to this level.
    Level 2: Staff Scientist. Requires peer review to be promoted to this level.

Engineers:
    Level 3: Senior Engineer. Entry level for Ph.D's. No peer review required.
    Level 4: Engineer. Entry level for Master's degree engineers. No peer review required.
    Level 5: Junior Engineer. Entry level for bachelor's degree engineers.

A sixth level was defined to include support staff (secretary, etc.), part time college students in computer science, and other non-full time professionals. Because many of the projects studied had rather few people, the six levels of detail were not required. For this reason, the aggregate levels of scientist (levels 1 and 2), engineer (levels 3, 4, and 5) and support (other) were used. This simple division is also used at Alpha, in that to be promoted to the scientist level, a peer review is required. Non-engineering staff were kept in their own group (support staff) because they were only para-professional staff.⁹
Table 3.4.1 shows the number of people in each of the levels with each of their highest educational degree levels. All the engineers in this department had technical degrees (most in computer science) so that no attempt was made to separate the different fields.

[Table 3.4.1: Educational Degrees Categorized by Level -- counts of Scientist, Engineer, and Support staff by highest degree (< Bachelors, Bachelors, Graduate, Total).]

It is clear that the company levels are related to the highest level of education reached. The level is also closely related to the number of years of experience as shown in Table 3.4.2.

[Table 3.4.2: Average Years of Experience for Levels -- average number of years of total industry experience and of company experience, for the Scientist and Engineer levels.]

Use of these levels had many advantages. The first was that it provided a reasonable way to combine degree and length of work experience into a single measure, as these levels are good indicators of both experience and educational level. The engineer's compensation is also highly dependent on his or her company level, and therefore project cost is directly related to the use of staff of different levels.¹⁰ Finally, use of the "level" concept was the common practice at Alpha Company, and therefore provided a natural way to discuss the research with the participants.

⁹ Only 5.4% of the total labor hours for the projects in the dataset were charged by these para-professionals.

3.5 Statistics on Source Code

One of the advantages of this data site to the research was that all the actual source code was available, thus all statistics of interest about the code could be calculated. In addition to NCSLOC (non-comment, non-blank lines of source code), a second statistic called CSLOC was also collected. CSLOC is NCSLOC plus the number of lines of comments (still excluding blank lines).¹¹
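A small sketch of this counting convention (ours, assuming whole-line and trailing Pascal-style { } comments; footnote 11 below describes the same-line rule):

    def ncsloc_csloc(lines):
        ncsloc = csloc = 0
        for line in lines:
            stripped = line.strip()
            if not stripped:
                continue          # blank lines are excluded from both counts
            if not stripped.startswith('{'):
                ncsloc += 1       # a non-comment, non-blank source line
                csloc += 1
            if '{' in stripped:
                csloc += 1        # the comment is counted even when it shares
                                  # a line with code (see footnote 11)
        return ncsloc, csloc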
A potential problem with NCSLOC and CSLOC statistics is that they do not take into account the complexity of the code. As discussed earlier, the most popular measure of complexity was developed by McCabe, and both Myers and Hansen have offered extensions to McCabe's measure. All of these complexity measures were collected at the module level in order to provide empirical evidence for the relationship among the three measures.
The code studied was contained in a total of 465 modules, of which 435 were written in Pascal and 30 in FORTRAN. Collecting these data required complexity metrics programs for each language. Complexity statistics were collected for the vast majority of code, which was in Pascal or FORTRAN. No complexity results were generated for two projects (one written in C, the other in ADA). Table 3.5.1 shows the percent of code of each project for which the cyclomatic complexity metrics were generated. They were generated at the module level for 17 of 19 projects and, of those, 100% of the code was analyzed for 13 of the 17. In the remaining four projects, an average of 98.3% of the code was analyzed.¹²
¹⁰ There is little variation in the salaries of engineers at a given level. Because proposals for projects are costed using a standard cost for each level, but billed using actual costs, large deviations tend to cause client dissatisfaction and are avoided by Alpha Company.

¹¹ If a comment appeared on the same line as an executable line of code, both the comment and the line of code incremented CSLOC. If this method were not used, comments would be significantly undercounted. In particular, the measurement of comment density would be distorted.

¹² To adjust for the fact that complexity figures were not available for the utility routines for four of the seventeen projects, an average complexity ratio (cyclomatic complexity/NCSLOC) was defined for each project. By assuming that the complexity densities remained constant for the project, the total complexity (including non-Pascal or FORTRAN code) was calculated by multiplying the average complexity density by the total NCSLOC. This adjustment was done on only four of the projects, and for less than 2% of the code on average on those four projects. Therefore, this is not believed to be a major source of error.
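Expressed as a formula, with a purely hypothetical worked example (ours) to show the magnitudes involved: if 9,800 of a project's 10,000 total NCSLOC were measured for complexity and those modules contained 1,960 decision paths, then

    average complexity density = 1960 / 9800 = 0.20
    estimated total complexity = 0.20 * 10000 = 2000.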
[Table 3.5.1: Percent of code analyzed for cyclomatic complexity, by project.]
NCPROD is NCSLOC/TOTALHRS, or the productivity for non-comment lines of code. Not all the data used in the study were available for all the projects. The cyclomatic complexity data are missing from two of the projects and the phase data from four of the projects. However, for most analyses only a single statistic was required, and, in order to conserve degrees of freedom, all projects with valid data for that statistic were used. For example, if the analysis required complexity data, but not phase data, the statistics were generated on the seventeen projects with valid complexity data (including the projects without phase data). It is not believed that this approach introduced any systematic bias.
4 SOURCE CODE ANALYSIS

4.1 Comparison of Metrics

For this portion of the analysis it was possible to analyze the data at the module level, since no work-hours data were required. Table 4.1.1 is a summary of some of the basic statistics for all the modules.

[Table 4.1.1: Module-level summary statistics (variable, mean, standard deviation), including NCSLOC.]
... TOTALHRS can be assumed to be different with a high degree of confidence. Because all of these variables are directly related to the size of the project, this difference simply indicates that, on average, the maintenance projects were smaller than the new development projects. However, no significant difference in productivity was found. While the average productivity for the maintenance projects is less than that for the new development projects, the fact that this difference is not statistically significant means that we were unable to reject the null hypothesis that maintenance productivity is the same as that for new development. In addition, because the line count ignored changed lines and actually subtracted for deleted lines, the productivity of maintenance projects may be understated. Therefore, the maintenance projects are combined with the new development projects in later productivity analyses, excluding maintenance-specific analyses that require existing code.
5 RESULTS

Four major productivity effects were investigated, of which two applied to all of the projects. They were: (a) Are more experienced programmers more productive, and, if so, by how much? and (b) How much does the proportion of time spent in the design phase affect the productivity? The other two effects studied were appropriate for maintenance projects only, under the assumption that characteristics of the existing code have an effect upon maintenance productivity. As cited earlier, one of the reasons that McCabe developed his cyclomatic complexity was to attempt to quantify the complexity of code in order to determine the appropriate length of subroutines, so that the code would not be too complex to maintain. This study has examined this effect using the complexity density metric defined earlier. The second maintenance productivity issue analyzed was the impact of documentation in the form of comments.

5.1 Experience
It has long been thought, and several studies have shown, that the experience of the programmers is an important factor in productivity. Verifying this phenomenon is not always easy, in part because engineers with different levels of experience do different tasks. As described above, there were essentially three different categories of technical personnel. The highest category (scientist) included the top two company levels. The employees in this category were primarily project managers. The second category contained the remaining three professional levels, and the engineers in this category were primarily responsible for developing the software. The lowest level provided support. Because the scientists and the support staff were by and large not directly involved in the writing of the software, their effect on productivity is only indirect. Therefore, the research effort focused on the engineer (levels 3, 4, and 5) staff.

To test the hypothesis that the higher level (more education and more experience) engineers are more productive, the following model was created:
    Y = B0 + (B1 * P3) + (B2 * P4) + (B3 * P5)

where,
    Y  = NCPROD
    P3 = the percentage of the total development hours spent by level 03 engineers.
    P4 = the percentage of the total development hours spent by level 04 engineers.
    P5 = the percentage of the total development hours spent by level 05 engineers.

Development hours are only those hours worked on specifications, design, coding, or debugging.
The results of this regression were:

    NCPROD = -15.75 + 27.88*P3 + 25.97*P4 + 18.93*P5
             (-2.03)  (2.91)     (2.89)     (1.89)

    R² = 0.45    Adjusted R² = 0.30    F-Value = 3.04    n = 15

where NCPROD is measured in NCSLOC/day and the figures in parentheses are t-statistics.
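For readers wishing to replicate this form of analysis on their own project data, the model can be estimated with ordinary least squares along the following lines (a sketch only; the values shown are placeholders, not the Alpha Company data):

    import pandas as pd
    import statsmodels.api as sm

    # One row per project; placeholder values for illustration only.
    df = pd.DataFrame({
        'NCPROD': [2.1, 4.0, 3.2, 5.5, 1.8, 6.1],       # NCSLOC/day
        'P3':     [0.20, 0.50, 0.30, 0.60, 0.10, 0.70],  # fraction of
        'P4':     [0.40, 0.20, 0.30, 0.20, 0.30, 0.10],  #   development hours
        'P5':     [0.30, 0.15, 0.20, 0.15, 0.30, 0.10],  #   by engineer level
    })
    X = sm.add_constant(df[['P3', 'P4', 'P5']])          # adds the intercept B0
    fit = sm.OLS(df['NCPROD'], X).fit()
    print(fit.params)                                    # B0, B1, B2, B3
    print(fit.tvalues)                                   # t-statistics
    print(fit.rsquared, fit.rsquared_adj, fit.fvalue)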
The interpretation of these results is that adding an additional one percent more time by level 3, 4, and 5 engineers working on direct software tasks results in an increase in productivity of approximately 2.2, 2.1, and 1.5 NCSLOC/day respectively, ceteris paribus. (The results for the level 3 and 4 engineers are significant at the alpha = .01 level, and at the .10 level for the level 5 engineers.) These results are exactly as expected in that more experienced engineers are more productive.

What is interesting to note is that the difference in mean salaries between levels is only 15%. Starting at an index salary of 1 for level 1 scientists, this means that the average salary index for a level 3 engineer is .7225 (.85 squared), a level 4 is .6141, and a level 5 is .5220. By examining the data at this level of detail it is also possible to draw conclusions about the marginal productivity of additional experience, and its value to the firm. In particular, the data show a 37.19% increase in productivity between levels 4 and 5, and a 47.28% increase between levels 3 and 5. Therefore, the net benefit in using a level 3 engineer over a level 5 is approximately 9% (.4728 - ((.7225/.5220) - 1)), and the net benefit of employing a level 4 engineer over a level 5 is approximately 20%.
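As an arithmetic check on these figures (our decomposition: net benefit = productivity premium minus salary premium):

    level 3 vs. level 5:  .4728 - (.7225/.5220 - 1) = .4728 - .3841 = .0887, or about 9%
    level 4 vs. level 5:  .3719 - (.6141/.5220 - 1) = .3719 - .1764 = .1955, or about 20%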
Page 21
Productivity
Marginal
vs Average Years Experience
u
o
O
"T
2,5
3.5
3
4
4.5
5
5.5
6.5
6
Avg Years Experience Cfounded to
Figure
a level 4 engineer over a level 5
to software
managers.
commensurate with
relationship
is
The
A
to
senior staff
approximately 20%. This illustrates two points of potential importance
is
that
second
experience (see Figure
Lawrence, 1985].
However,
may be
tliis
more
given
yr^
is
more productive
The
managers should seek out higher
that
8.5
B
Marginal Productivity versus Experience
5.1:
increased productivity.
their
increased marginal cost.
returns
first
.5
7,5
1
result
5.1).
result
intuitive
is
is
consistent
may be confounded by
difficult aspects of the system.
to
be compensated
recommendation
ability staff, since their
from these data
This
do not seem
staff
that there
with
that
follows
at a level
from
this
marginal benefit exceeds their
seem
to
be declining marginal
some previous research
a difference in task
Further study, perhaps
[Jeffery
complexity,
in the
i.e.,
and
more
form of a lab
experiment controlling for task complexity, seems indicated.
5.2 Design Phase Impact

Because of the availability of phase data, it was possible to test the hypothesis developed by McKeen [1983] and Gaffney [1982] that by spending more time in design, the overall productivity of a project can be improved. Specifically, the more time that is spent analyzing and designing the software, the better and more detailed the design will be. This good design would be expected to lead to a faster and easier overall development process. To test this hypothesis, ANALFRAC was defined as the number of hours spent in the specification and design phases divided by the total number of project hours (see Appendix D). The results are plotted in Figure 5.2.
[Figure 5.2: Productivity versus Fraction of Time Spent in Analysis (x-axis: specification and design percentage).]
The result for a linear regression model of these data is:

    NCPROD = 3.97 + 21.48 * ANALFRAC
            (1.56)  (1.14)

    R² = 0.09    Adjusted R² = 0.02    F-Value = 1.30    n = 15
Therefore, these data do not provide support for the notion that more time in the specification and design phases has a measurable impact on source lines of code productivity. Of course, the limited number of data points reduces the power of this test.

5.3 Complexity
Complexity density (CYCMAX/NCSLOC) is the appropriate metric for evaluating the effects of code complexity because McCabe's complexity is so closely related to program length. If left unnormalized, the results would be dominated by the length effect, i.e., the tested hypothesis would actually be that maintenance productivity is related to program length, rather than the impact of complexity. While this is a possible hypothesis, there are two intuitive reasons why the length of the code may not have a large effect on maintenance productivity. The first reason is that managers typically do not have a great deal of control over the size of a program (which is intimately connected to the size of the problem). However, by measuring complexity and adopting complexity standards, and/or by using CASE restructuring tools, they can manage complexity. The second reason why program length is less interesting is that when an engineer maintains a program, each change will tend to be localized to a fairly small number of modules. He will, therefore, only need detailed knowledge of a small fraction of the code. The size of this fraction is probably unrelated to the total size of the program. Thus the length of the entire program is less relevant.¹⁴
Logically, the initial complexity density should only affect those phases that are directly related to writing the code. User support, system operation, and management functions should not be affected by the complexity of the code. For this reason, a new partial productivity measure was defined:

ALLDPROD: The total number of lines divided by the time spent in all directly non-overhead activities: specifying, designing, documenting, coding, testing and debugging.

To perform this analysis both the phase data and the cyclomatic complexity data for the maintenance projects were needed. Because all the data were required, only seven data samples were available. A regression of ALLDPROD was performed with the complexity density of the initial code (INITCOMP):¹⁵
'"In fact, the correlation of productivity with initial
-0.21;
alpha
= 0.61).
code length was not
significant
(Pearson coefficient,
    ALLDPROD = 28.7 - 0.134 * INITCOMP
               (3.53)  (-2.69)

    R² = 0.59    Adjusted R² = 0.51    F-Value = 7.22    n = 7

This model suggests that productivity declines with increasing complexity density (see Figure 5.3).
[Figure 5.3: Maintenance Productivity versus Complexity Density (y-axis: new SLOC/work-hour; actual values and linear prediction plotted).]
While it would be speculative to ascribe too much importance to the results, which are based on only seven data points, the results do seem sufficiently strong and the implications sufficiently important that further study is indicated.¹⁵ If these results continue to hold on a larger independent sample, then such results would provide strong support for the use of the complexity density measure as a quantitative method for analyzing software to determine its maintainability. It might also be used as an objective measure of this aspect of software quality.

¹⁵ A possible concern with this model might be multicollinearity between the independent regressor and the intercept term. To test for this, the Belsley-Kuh-Welsch test was run, and no significant multicollinearity was detected [Belsley, et al. 1980].
5.4 Source Code Comments

The number of comments in the code has also been suggested to be related to the maintainability of the code [Tenny, 1988]. A regression of ALLDPROD as a function of the percentage of the lines of code that were comments (PERCOMM) was also performed. (These data are plotted in Figure 5.4.) The results were:

    ALLDPROD = 4.96 + 0.05 * PERCOMM
              (1.04)  (0.36)

    R² = 0.02    Adjusted R² = -0.14    F-Value = 0.13    n = 8

The results indicate that the density of comments does not seem to affect the maintainability of the code. Because more difficult code may be commented more heavily, it is not clear whether heavily commented code will be easier to maintain. Therefore, another question is whether the programmers adjust their programming styles to the complexity of the code which is being written. For example, one would expect programmers to comment complex code more heavily, and to break complex code into more subroutines. If this were the case, the density of comments should increase with the complexity density. In fact, the Pearson correlation coefficient between those two metrics is -0.30 (alpha = 0.0001, n = 465). The interpretation of this negative correlation is that more difficult code gets commented less heavily, suggesting that comment density is simply a function of the time available, rather than being a function of the need for comments.¹⁶

¹⁶ Programming style may also affect productivity. For example, one would expect that by increasing the complexity density (CYCMAX/NCSLOC) of newly written code a programmer would require more time to write these more complex lines of code. The correlation of the complexity density with the productivity is -0.43 (alpha = 0.09), suggesting that productivity does decrease with increasing complexity density.
[Figure 5.4: Maintenance Productivity as a Function of Comment Density (x-axis: Comments/SLOC; actual values plotted).]

6 CONCLUSIONS
Through the availability of detailed data about the software projects developed and maintained at this data-site, this paper has been able to test a number of relationships between software complexity and other factors affecting development and maintenance productivity. In particular, increased software complexity, as measured by the McCabe cyclomatic complexity density metric, is associated with reduced maintenance productivity. Greater emphasis on the design phase is associated with increased productivity. And, systems developers with greater experience were found to be more productive, although at a declining marginal rate.
How can software development and maintenance managers make best use of these results? To begin with, the measurement of these effects was made possible only through the collection of metrics related to software development inputs and outputs. In particular, use of these data argues for the collection of detailed data beyond mere SLOC and work-hour measures. Based on these data, productivity can be improved through employment of higher experience staff and reduction in the cyclomatic complexity of existing code. Staff with higher experience were shown to be more productive, at an average rate greater than their increased cost. Finally, cost effective reduction in the cyclomatic complexity of existing code should be attempted. One approach may be by measuring the complexity of code as it is written, and requiring that it adhere to an existing standard established through on-site data collection. A second approach may be through restructuring of existing products, although, of course, the effectiveness of this latter solution is contingent upon the costs of acquiring and operating the restructurer, as well as the size of the existing software inventory. An additional caveat related to this solution is that these restructurers may impose one time additional maintenance costs in terms of initially reducing the maintainers' familiarity with the existing code structure. This likely effect on maintenance must also be taken into account.

In terms of future research, the software development and maintenance community would benefit from validation of these results on data from other firms, particularly from other industries. Also, the effects of complexity in software need to be replicated on a larger sample before accepting these results as guides for standard practice. Beyond these validation-type studies, further work could be devoted to the experience issue, in terms of attempting to determine the causes of the apparent declining marginal returns to experience shown by these data. Research on complexity impacts could examine the actual longitudinal impacts of the use of commercial restructurers. In general, more empirical studies involving detailed data collection will help to provide the insights necessary to help software development progress further towards the goal of software engineering.
7
BIBLIOGRAPHY
I
[Banker, et al. 1987]
Banker, Rajiv D.; Datar, Srikant M.; and Kemerer, Chris F.; "Factors Affecting Software Development
Productivity: An Exploratory Study"; in Proceedings of the Eighth International Conference on Information
Systems; December 6-9, 1987, pp.' 160-175.
[Basili. et al. 1983]
BasUi,
Victor R.; Selby, Richard
Across
FORTRAN
Projects"; in
W.
and Phillips, Tsai-Yun; "Metric Analysis and Data Validation
Transactions on Software Engineering; SE-9, (6):652-663, 1983.
Jr.;
IEEE
[Belady and Lehman, 1976]
Beladv, L. A. and Lehman, M. M.; "A
15, n.' 1, pp. 225-252, (1976).
Model
of Large
Program Development";
IBM
Systems Journal,
v.
[Beisley, et al. 1980]
ISelsley,
Kuh, E. and Welsch,
D.;
R.; Regression Diagnostics;
John Wiley and Sons, NY, NY, 1980.
[Boehm, 1981]
Boehm, Barry W.; Software Engineering Economics;
[Boehm,
Boehm,
Prentice-Hall,
Englewood
Cliffs,
NJ, 1981.
et al. 1984]
B. W.; Gray, T. E.;
IEEE
Transactions on Software Engineering;
Productivity";
Computer; September 1987; pp. 43-57.
and Seewaldt,
[Boehm, 1987]
Boehm, Barry W.; "Improving Software
T.;
May
1984.
[Card and Agresti, 1987]
Card, D. N. and Agresti. W. W.; "Resolving the Software Science Anomaly"; Journal of Systems and
Software; 7 (l):29-35, March, 1987.
[Chrysler, 1978]
Chrysler, Earl; "Some Basic Determinants
the ACM; 21 (6):472-483, June, 1978.
[Conte, et
Conte, S.
of
Computer Programming
Productivity";
Communications of
1986]
al.
Dunsmore, H. E.; and
Benjamin/Cunmiings PubUshing Company,
D.;
Shen,
Inc.,
V.
Y.;
Software
Engineering
Metrics
and
Models;
1986.
[Curtis, 1981]
Curtis, Bill; "Substantiating
[Curtis et
Programmer
Variability";
Proceedings of the
IEEE
69 (7):846, July, 1981.
1981]
al.,
"Measuring the Psychological Complexity of Software Maintenance tasks with the Halstead
and McCabe Metrics"; IEEE Transactions on Software Engineering; SE-5:99-104, March 1979.
Curtis, Bill et
al.;
[Dickey, 1981]
Dickey, Thomas E.; "Programmer Variability"; Proceedings of the
IEEE
69 (7):844-845,
July,
1981.
[Elshoff, 1976]
Elshoff,
J.
Engineering;
L.;
"An Analysis of Some Commercial PL/1 Programs"; IEEE Transactions on Software
SE2
(2):113-120, 1976.
[Fjeldstad and Hamlen, 1979]
Fjeldstad, R. K.
Proc of
GUIDE
and Hamlen, W. T.; "Application program maintenance
48; The Guide Corporation, Philadelphia, 1979. Also
study:
Report to our Respondents";
and Zvegintvov, 1983].
in [Parikh
Page 29
[Gaffney, 1982]
Gaffney, John E. Jr.; "A Microanalysis Methodology for Assessment of the Software Development Costs";
in R. Goldberg and H. Lorin, eds.; The Economics of Information Processing; New York, John Wiley and
Sons; 1982, v. 2, pp. 177-185.
[Gibson and Senn, 1989]
Gibson, V. and Senn, J.; "Systems Structure and Software Maintenance Performance"; Communications of
the
ACM;
March, 1989,
v.
32, n. 3, pp. 347-358.
[Gilb, 1977]
GiJb,
[Gill,
Thomas; Software Metrics; Winthrop
Press,
Cambridge,
MA
1977.
1989]
Geoffrey K.; "A Study of the Factors that Affect Software Development Productivity"; inpublished
Sloan Masters Thesis; 1989; Cambridge, MA.
Gill,
MIT
[Hansen, 1978]
Hansen, W. J.; "Measurement of Program Complexity By the Pair (Cyclomatic Number, Operator Count)";
ACM SIGPLAN
Notices; 13 (3):29-33,
March
1978.
[Harel and McLean, 1984]
McLean,
Harel, E. and
Productivity";
MIS
E.; "The Effects of Using a Nonprocedural
Quarterly; June 1985; pp. 109-120.
Computer Language on Programmer
and Lawrence, 1979]
D. R. and Lawrence, M. J.; "An Inter-Organizational Comparison of Programming Productivity";
Proceedings of the 4th International Conference on Software Engineering; pp. 369-377; 1979.
[JefTery
Jeffery,
[Jeffery
Jefferv,
and Lawrence, 1985]
D. R. and Lawrence, M.
J.;
"Managing Programming Productivity"; Journal of Systems and Software;
5:49-58, 1985.
[Jones, 1986]
Jones, Capers;
Programming
Productivity; McGraw-Hill,
New
York, 1986.
and Cheung, 1987]
H. F. and Cheung, W. K.; "An Empirical Study of Software Metrics";
Engineering SE-13 (6):697-708, June, 1987.
[Li
Li,
[Lientz
and Swanson, 1980]
and Swanson, E.
Lientz, B. P.
1980.
B.;
IEEE
Transactions cm Software
Software Maintenance Management: Addison-Wesley Pubhshing Company,
[Lind and Vairavan, 1989]
Lind, Randy K. and Vairavan, K.; "An Experimental Investigation of Software Metrics and Their
Relationship to Software Development Effort"; IEEE Transactions on Software Engineering; 15 (5):649653, 1989.
[McCabe, 1976]
McCabe, Thomas
J.;
"A Complexity Measure"; IEEE Transactions on Software Engineering; SE-2
(4):308-
320, 1976.
[McKeen, 1973]
McKeen, James
D.; "Successful
Development
Strategies for Business
Apphcation Systems"; MIS Quarterly.
7 (3):47-65; September, 1983.
[Myers, 1977]
Myers, G. J.; "An extension to the Cyclomatic Measure of Program Complexity".
(10):61-64, 1977.
SIGPLAN
Notices; 12
Page 30
[Parikh, 1986]
Parikh, Girish; "Restructuring your
1986.
COBOL
Programs"; Compulerworld Focus: 20 (7a):39-42; February
[Parikh and Zvegintzov, 1983]
Parikh, G. and Zvegintzov, N.; eds. Tutorial on Software Maintenance.
SUver Spring,
(1983).
MD
IEEE Computer
19,
Science Press,
[Rombach, 1987]
Rombach, Dieter; "A Controlled Experiment of the Impact of Software Structure on Maintainability"; IEEE Transactions on Software Engineering; v. SE-13, n. 3; March, 1987.
[Sackman et al., 1968]
Sackman, H.; Erikson, W. J.; and Grant, E. E.; "Exploratory Experimental Studies Comparing Online and Offline Programming Performance"; Communications of the ACM; 11 (1):3-11, January, 1968.
[Selby, 1988]
Selby, Richard W.; "Empirically Analyzing Software Reuse in a Production Environment"; in Tracz, W., ed.; Software Reuse - Emerging Technologies; IEEE Computer Society Press, New York, NY, 1988.
[Shen et al., 1983]
Shen, V.; Conte, S.; and Dunsmore, H.; "Software Science Revisited: A Critical Analysis of the Theory and Its Empirical Support"; IEEE Transactions on Software Engineering; v. SE-9, n. 2, pp. 155-165, March, 1983.
[Tenny, 1988]
Tenny, Ted; "Program Readability: Procedures vs. Comments"; IEEE Transactions on Software Engineering; v. 14, n. 9; pp. 1271-1279; September, 1988.
[US GSA, 1987]
Parallel Test and Evaluation of a Cobol Restructuring Tool; Federal Software Management Support Center; United States General Services Administration, Office of Software Development and Information Technology; Falls Church, Virginia; September, 1987.
[Vessey and Weber, 1983]
Vessey, I. and Weber, R.; "Some Factors Affecting Program Repair Maintenance: An Empirical Study"; Communications of the ACM; 26 (2):128-134, February, 1983.

[Walston and Felix, 1977]
Walston, C. E. and Felix, C. P.; "A Method of Programming Measurement and Estimation"; IBM Systems Journal; 16 (1):54-73, 1977.
Appendix A: Example of a Junior Software Engineer's Activity Report

Name                                        Weekly Report 4-1-1987

Project #1 (35 hrs.): Spent the whole time working with [Staff Scientist] and [Principal Scientist] trying to find the cause of the difference between [Baseline Runs] and [New Instrument Runs]. After a lot of head scratching it turns out that both were correct. We just had to play with the format of the data and units to get the two data sets to match to within [xxx] RMS. This is still above the [yyy] RMS limit, but the [baseline runs] and [new instrument runs] started off being [zzz] RMS apart. So, until the [new instrument] can be improved, my software can do no better than it is doing now.
Project #2 (1 hr.): I had very little time to work on this, but I did manage to start to finish the modify adhoc target routine.

Project #3 (9 hrs.): The automatic height input for the calibration routine is up and running. I also made some more modifications so that they can return to the old method if the IEEE board fails.

Project #4 (2 hrs.): [Staff Scientist] and I spent the two hours trying to get the microVAX up after we installed the [Special Boards]. We ended up having to pull the boards back out to reboot the machine. The boards were eventually installed, but there still seems to be something wrong.

Total = 45.0 hours
Appendix B: Project Data Collection Form

Project Profile Survey

Project Name:
Project Number:
Applicable Phase(s):
Brief Description of the Project:

Target System:
    Mainframe:       VAX:       HP mini:       PC:       Other: (please specify)

Can you think of any reasons that would make this software development exceptional (please specify estimate of net effect as well):
Application Characteristics

Was the code primarily generated from scratch for this particular project (Development Type) or did this project modify or extend an existing program (Maintenance Type):

    Development          Maintenance

Choose the best description of the application:

    Real Time:    The primary goal of this system was to control hardware in real time.
    Diagnostic:   The system was primarily designed to diagnose an independent hardware
                  system. The software was not an integral part of the working system.
    Analysis:     The software performed physical analysis on data.
    Utility:      The software performed utility functions (for example, configuration
                  management and general plotting programs).

Software Quality:
Choose the number that best describes the quality of the software resulting from this project compared to the average project on the following three dimensions (1=Poor, 3=Average, 5=Excellent):

    Correctness:      When the program ran without error, this dimension measures whether
                      the results matched the desired results.
    Reliability:      The likelihood of the program running without error.
    Maintainability:  The ease with which modifications and extensions can be made to the
                      software.

                        Poor                      Excellent
    Correctness:        1      2      3      4      5
    Reliability:        1      2      3      4      5
    Maintainability:    1      2      3      4      5
Management Characteristics

How the project was managed in comparison with similar projects in terms of the following dimensions:

    Time Pressure:      How much pressure there was to finish the project in a particular
                        calendar period.
    Cost Pressure:      How much pressure there was to keep the total software development
                        costs down.
    Interruptions:      How much work on the project was subject to interruptions from
                        external sources (other projects, vacations, sickness, etc.).
    Change in Specs:    How often the specifications of the software were altered as the
                        project progressed.
    Relative Priority:  Priority that the project was given by management relative to
                        other projects.

                        Low          Average          High
    Time Pressure:      1      2       3       4       5
    Cost Pressure:      1      2       3       4       5
    Interruptions:      1      2       3       4       5
    Change in Specs:    1      2       3       4       5
    Relative Priority:  1      2       3       4       5

Environment

Languages:
    Pascal       FORTRAN       PL/       ADA       Assembly

Was hardware being developed simultaneously?

                           Very Stable     Average Stability     Very Unstable
    Hardware Stability:
Appendix C: Engineer Profile Form

Engineer Profile Survey

Name:
Identification Number:
Months of Industry Experience:
Months of Alpha Company Experience:
Highest Degree Received:
    Field:
    Year:
Current Alpha Company Level (01, 02, 03, 04, 05, Co-op, etc.):
If promoted within the last four years, give the approximate date of the last promotion:
Appendix D: Lifecycle Phases

Specifications -- included any work involved in the understanding or writing of software specifications and requirements. If the engineer reported that he or she was working on writing software specifications, his or her hours were included in this category. Furthermore, this phase also included hours in which an engineer reported having in-depth discussions with the customer or other engineers about the software requirements.

Design -- included any time spent determining how to solve a pre-specified problem; time spent in writing up the design of a software module was also placed in this category. This phase included not only the writing up of a formal design, but also informal designs and discussions with other engineers about how the program should be written.
Coding — included time spent actually writing software code. This category also included time spent in
preliminary debugging prior to the point where a module was determined to be ready to test. (Debugging
tasks such as eliminating compile time errors were included in this category). As soon as testing started
on a module, the debug phase was charged.
Debug — included testing and debug time done after a module was determined to be ready for testing
and/or integration.
User Support — included time spent by software engineers helping users to run the program. Because many
of the software development projects were part of an embedded system, and because many of the
maintenance projects involved updating a system, software engineers were often required to help the users
run the system. This phase was charged when this support did not involve any of the "traditional" software
development tasks.
Documentation — included all the time that was spent actually writing software documentation such as
users' manuals and software maintenance manuals with the exception of software specifications and designs.
Management -- included management meetings, cost estimates, and other such management tasks.
Computer System Maintenance -- included all the charges to support tasks needed to keep the computer systems running properly. In many of the projects, dedicated computer systems were purchased for the projects. When this was the case, time spent backing up the disks or repairing the systems was charged to the project. (Note: on projects that used Alpha Company's general purpose computers, this work was charged to an overhead account and did not appear on the projects' charges.)
Training -- included work done solely in a training mode, i.e., if there was no attempt to produce useable output, this phase was charged. On a few of the projects, engineers were required to spend substantial time learning some aspect of the system (a new computer language, system maintenance documentation). For example, this phase was charged in one project when an engineer spent 20 hours learning Turbo Pascal (TM) by reading the manual and writing small programs unrelated to the project.
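To make the charging scheme concrete, the following sketch shows how an engineer's weekly charges could be tallied into the phases defined above. It is an illustration only, written in Python; the phase labels and the sample charges are hypothetical and are not taken from the study's data or tooling.

    # Illustrative only: tallying weekly activity-report hours into the
    # lifecycle phases defined in Appendix D. The sample charges below
    # are hypothetical, not study data.
    from collections import defaultdict

    PHASES = {
        "specifications", "design", "coding", "debug", "user support",
        "documentation", "management", "computer system maintenance", "training",
    }

    def tally_phase_hours(charges):
        """Sum (phase, hours) charges per phase, rejecting unknown phase labels."""
        totals = defaultdict(float)
        for phase, hours in charges:
            if phase not in PHASES:
                raise ValueError(f"unknown lifecycle phase: {phase!r}")
            totals[phase] += hours
        return dict(totals)

    # A hypothetical week: mostly debug work on one project, with smaller
    # design and management charges on others.
    week = [("debug", 35.0), ("coding", 1.0), ("design", 9.0), ("management", 2.0)]
    print(tally_phase_hours(week))   # e.g. {'debug': 35.0, 'coding': 1.0, ...}

Because every hour must carry exactly one phase label, per-phase totals sum to the engineer's weekly total, which is what allows project effort to be decomposed by lifecycle phase.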