CHAPTER 1: INTRODUCTION

Contractors use a variety of techniques to develop estimates for proposals submitted to the
Government. Some of the more frequently employed techniques include methods such as
analogy, bottom-up, and parametric estimating. A primary responsibility of a project cost
estimator is to select an estimating methodology that most reliably estimates program costs,
while making the most economical use of the organization's estimating resources. In certain
circumstances, parametric estimating techniques can provide reliable estimates that are
generated at a lower cost and shorter cycle time than other traditional estimating techniques.
Over the last several years, both Industry and Government have focused on ways to maximize
the use of historical data in the estimating process. During this same time frame, both have also
sought ways to reduce the costs associated with proposal preparation, evaluation, and
negotiation. Industry and Government representatives have offered parametric estimating as a
technique that, when implemented and used correctly, can produce reliable estimates for
proposals submitted to the Government, at significantly reduced cost and cycle time.
Data issues seem to be the greatest concern regarding the use of parametric estimating
methods, particularly in regard to the Truth in Negotiations Act (TINA). TINA requires that cost
or pricing data be certified as current, accurate, and complete as of the date of negotiation or
another agreed to date as close as practicable to the date of negotiation. TINA requires
contractors to provide the Government with all the facts available at the time of certification, or
an otherwise agreed to date. Properly calibrated and validated parametric techniques comply
with the requirements of TINA.
Parametric estimating is a technique that uses validated relationships between a project's
known technical, programmatic, and cost characteristics and known historical resources
consumed during the development, manufacture, and/or modification of an end item. A number
of parametric techniques exist that practitioners can use to estimate costs. These techniques
include cost estimating relationships (CERs) and parametric models. For the purpose of this
handbook, CERs are defined1 as mathematical expressions or formulas that are used to
estimate the cost of an item or activity as a function of one or more relevant independent
variables, also known as cost driver(s). Generally, companies use CERs to estimate costs
associated with low-dollar items. Typically, estimating these items using conventional
techniques is time and cost intensive. For example, companies often use CERs to estimate
costs associated with manufacturing support, travel, publications, or low-dollar material.
Companies with significant proposal activity have negotiated Forward Pricing Rate Agreements
(FPRAs)2 for certain CERs. Use of advance agreements such as FPRAs further streamlines the
acquisition process and helps reduce costs. Chapter 3, Cost Estimating Relationships, provides
examples, guidance, and best practices for implementing these techniques. Chapter 7,
Regulatory Compliance, more fully discusses the use of FPRAs and other advance agreements.
Parametric models are more complex than CERs because they incorporate many equations,
ground rules, assumptions, logic, and variables that describe and define the particular situation
being studied and estimated. Parametric models make extensive use of databases that catalog
program technical and cost history. Parametric models can be developed internally
by an organization for unique estimating needs, or they can be obtained commercially.
Typically, the databases are proprietary to the contractor or vendor; however, a vendor will likely
share a description of the data in the database in order to build confidence with its users.
Parametric models can be used to discretely estimate certain cost elements (e.g., labor hours
for software development, software sizes such as lines of code (LOC) or function points), or
they can be used to develop estimates for hardware (e.g., radar systems, space shuttle spare
parts), and/or software systems (e.g., software for air traffic control systems). When
implemented correctly and used appropriately, parametric models can be used as the primary
basis of estimate (BOE). The following table identifies the chapters containing specific
examples, guidance, and best practices related to the implementation of various types of
parametric estimating methods:
Chapter      Model Type
Chapter 3    Cost Estimating Relationships (CERs)
Chapter 4    Company-Developed Models
Chapter 5    Commercial Hardware Models
Chapter 6    Commercial Software Models
Parametric techniques have been accepted by Industry and Government organizations for
many years, for use in a variety of applications. For example, many organizations have
experienced parametricians on staff who regularly use parametrics to develop independent
estimates (e.g., comparative estimates or rough order of magnitude estimates) and life cycle
cost estimates (LCCEs). In addition, Industry and Government often use these techniques to
perform trade studies such as cost as an independent variable (CAIV) analyses and design-to-cost (DTC) analyses. Parametric techniques are also used to perform cost or price analyses. In
fact, the Federal Acquisition Regulation (FAR) identifies parametrics as an acceptable price
analysis technique in 15.404-1(b)(2)(iii). In addition, organizations used parametric estimating
techniques to develop estimates, or as secondary methodologies to serve as "sanity checks" on
the primary estimating methodology for proposals not requiring cost or pricing data. Chapter 11,
Other Parametric Applications, provides an overview of these and other uses. Until recently,
however, using parametric estimating techniques to develop estimates for proposals subject to
cost or pricing data3 was limited for a variety of reasons. These reasons included:

•  Cultural resistance, because many people in the acquisition community expressed greater
   comfort with the more traditional methods of estimating, including the associated BOE; and
•  Limited availability of guidance on how to prepare, evaluate, and negotiate proposals
   based on parametric techniques.
Nevertheless, many Industry and Government representatives recognized parametrics as a
practical estimating technique that can produce credible cost or price estimates. By making
broader use of these techniques they anticipated realizing some of the following benefits:

•  Improvement in the quality of estimates due to focusing more heavily on the use of
   historical data, and establishing greater consistency in the estimating process;
•  Streamlined data submission requirements, decreasing the cost associated with preparing
   supporting rationale for proposals;
•  Reduced proposal evaluation cost and cycle time; and
•  Decreased negotiation cost and cycle time through quicker proposal updates.
After achieving some success with the broader uses of parametric techniques (e.g.,
independent estimates, trade studies), Industry saw the need to team with the Government to
demonstrate that parametrics are an acceptable and reliable estimating technique. In December
1995, the Commander of the Defense Contract Management Command (DCMC) and the
Director of the Defense Contract Audit Agency (DCAA) sponsored the Parametric Estimating
Reinvention Laboratory. The purpose of the Reinvention Laboratory was to test the use of
parametric estimating techniques on proposals and recommend processes to enable others to
implement these techniques. The primary objectives of the Reinvention Laboratory included:

•  Identifying opportunities for using parametric techniques;
•  Testing parametric techniques on actual proposals submitted to the Government;
•  Developing case studies based on the best practices and lessons learned; and
•  Establishing formal guidance to be used by future teams involved in implementing,
   evaluating, and/or negotiating parametrically based estimating systems or proposals.
Thirteen Reinvention Laboratory teams (as referenced in the Preface) tested and/or
implemented the full spectrum of parametric techniques. The Industry and Government teams
used these techniques to develop estimates for a variety of proposals, including those for new
development, engineering change orders, and follow-on production efforts. The estimates
covered the range of use from specific elements of cost to major-assembly costs. The teams
generally found that using parametric techniques facilitated rapid development of more reliable
estimates while establishing a sound basis for estimating and negotiation. In addition, the teams
reported proposal preparation, evaluation, and negotiation cost savings of up to 80 percent; and
reduced cycle time of up to 80 percent. The contractors, with feedback from their Government
team members, updated or revised their estimating system policies and procedures to ensure
consistent production of valid data and maintenance of the tools employed. The Reinvention
Laboratory Closure Report 4 provides details on the best practices for implementing parametric
techniques and is included in this edition of the handbook as Appendix F. The lab results have
also been integrated throughout this handbook in the form of examples, best practices, and
lessons learned with respect to implementing, evaluating, and negotiating proposals based on
parametric techniques.
As an example, one of the overarching best practices demonstrated by the Reinvention
Laboratory is that parametric techniques (including CERs and models) should be implemented
as part of a company’s estimating system. In a 1997 report entitled "Defense Contract
Management" (report number GAO/HR-97-4), the General Accounting Office (GAO) stated that
"contractor cost estimating systems are a key safeguard for obtaining fair and reasonable
contract prices when market forces do not provide for such determinations." The DOD
estimating system requirements are set forth in the Department of Defense FAR Supplement
(DFARS) 215.811-70. As shown in Figure 1-1, a parametric estimating system includes:

•  Data on which the estimate is based (i.e., historical data to the maximum extent possible);
•  Guidance and controls to ensure a consistent and predictable system operation;
•  Procedures to enforce the consistency of system usage between calibration and forward
   estimating processes; and
•  Experienced/trained personnel.
Chapter 7, Regulatory Compliance, and Chapter 9, Auditing Parametrics, provide detailed
discussions on evaluating parametric estimating system requirements. Once a parametric
estimating system has been effectively implemented, use of these techniques on proposals can
result in significantly reduced proposal development, evaluation, and negotiation costs, and
associated cycle time reductions.
Figure 1-1: Parametric Estimating System Elements
The results of the Reinvention Laboratory also demonstrated that the use of integrated product
teams (IPTs) is a Best Practice for implementing, evaluating, and negotiating new parametric
techniques. Generally, each IPT included representatives from the contractor’s organization as
well as representatives from the contractor’s major buying activities, DCMC, and DCAA. Using
an IPT process, team members provided their feedback on a real-time basis on issues such as
the calibration and validation processes, estimating system disclosure requirements, and
Government evaluation criteria. Detailed Government evaluation criteria are included in this
Second Edition of the Parametric Estimating Handbook in Chapter 9, Auditing Parametrics, and
Chapter 10, Technical Evaluation of Parametrics. By using an IPT process, contractors were
able to address the concerns of Government representatives up-front, before incurring
significant costs associated with implementing an acceptable parametric estimating system or in
developing proposals based on appropriate techniques. The Reinvention Laboratory also
showed that when key customers participated with the IPT from the beginning, the collaboration
greatly facilitated their ability to negotiate fair and reasonable prices for proposals based on
parametric techniques.
The use of parametric estimating techniques as a BOE for proposals submitted to the
Government is expected to increase over the coming years for many reasons, including:

•  The results of the Reinvention Laboratory demonstrated that parametric estimating is a
   tool that can significantly streamline the processes associated with developing, evaluating,
   and negotiating proposals based on cost or pricing data.
•  Parametric estimating techniques can be used as a basis of estimate for proposals based
   on information other than cost or pricing data, thereby increasing the reusability of this
   estimating technique.
•  Parametric estimating is referenced in the FAR. FAR 15.404-1(c)(2)(i)(C) states that the
   Government may use various cost analysis techniques and procedures to ensure a fair and
   reasonable price, including verifying reasonableness of estimates generated by
   appropriately calibrated and validated parametric models or CERs.
•  The detailed guidance, case studies, and best practices contained in this handbook provide
   an understanding of the "how-tos" for parametric estimating. This handbook should help
   all those involved in the acquisition process overcome barriers related to their lack of
   familiarity with parametrics. However, it is also recognized that outside IPT training may
   be needed on implementation, maintenance, evaluation, and negotiation techniques for
   parametric-based estimating systems and proposals. Appendix E, Listing of Professional
   Societies/Web Sites/Educational Institutions, provides additional sources where
   information on parametrics (including available training courses) can be obtained.
This Handbook is designed to provide greater familiarization with parametric estimating
techniques, guidance on acceptable use of this estimating methodology, and methods for
evaluation. The organization of this Second Edition mirrors the process used in developing a
parametric estimating capability. Chapter 2 discusses data collection and normalization.
Chapters 3 through 6 discuss various parametric modeling techniques ranging from CER
development to more robust models, both proprietary and commercial. Chapter 7 addresses
regulatory issues, while Chapters 8 through 10 discuss the roles of various Government
organizations and functional specialists. The Handbook concludes with Chapter 11, which
discusses other uses of parametric estimating techniques. The Appendices provide
supplementary information including a glossary of terms, a listing of web site resources, and
other helpful information.
1 See Chapter 3 for complete definitions of the parametric terminology used in this handbook.
2 An FPRA is a structured agreement between the Government and a contractor to make
certain rates, factors, and estimating relationships available for pricing activities during a
specified period of time. See Chapter 7 for additional information on this topic.
3 In accordance with the FAR and accepted usage, the phrase "cost or pricing data," when used
in this specific combination, refers to data that is/will be subject to certification under TINA.
4 The Reinvention Laboratory Closure Report is an executive summary, which discusses the
criteria a company should use to determine if parametrics would be beneficial, and best
practices for implementing these techniques.
CHAPTER 2: DATA COLLECTION AND ANALYSIS
Chapter Summary
All parametric estimating techniques, including cost estimating relationships (CERs) and
complex models, require credible data before they can be used effectively. This chapter
provides an overview of the processes needed to collect and analyze data to be used in
parametric applications. The chapter also discusses data types, data sources, and data
adjustment techniques, including normalization.
Objective/Purpose

•  Discuss the importance of collecting historical cost and noncost (i.e., technical) data to
   support parametric estimating techniques.
•  Identify various sources of information that can be collected to support data analysis
   activities.
•  Describe the various methods of adjusting raw data so it is common (i.e., data
   normalization).
I. Generalizations
Parametric techniques require the collection of historical cost data (including labor hours) and
technical noncost data. Data should be collected and maintained in a manner that provides a
complete audit trail with expenditure dates so that dollar valued costs can be adjusted for
inflation. While many formats exist for collecting data, one commonly used by Industry is the
Work Breakdown Structure (WBS). The WBS provides for
uniform definitions and collection of cost and technical information. It is discussed in detail in
MIL-HDBK-881, DOD Handbook – WBS (Appendix B contains additional information). Other
data collection formats may follow the process cost models of an activity based costing (ABC)
system. Regardless of the method, a contractor’s data collection practices should be consistent
with the processes used in estimating, budgeting and executing the projects on which the data
was collected. If this is not the case, the data collection practices should contain procedures for
mapping the costs used in the database to the specific model elements.
The collecting point for cost data is generally the company’s management information system
(MIS), which in most instances contains the general ledger and other accounting data. All cost
data used in parametric techniques must be consistent with, and traceable back to, the original
collecting point. The data should also be consistent with the company’s accounting procedures
and cost accounting standards.
Technical noncost data describes the physical, performance, and engineering characteristics of
a system, sub-system or individual item. For example, weight is a common noncost variable
used in CERs and parametric estimating models. Other typical examples of cost driver variables
include horsepower, watts, thrust, and lines of code. A fundamental requirement for the
inclusion of a noncost variable in a CER would be that it is a significant predictor of cost (i.e., a
primary cost driver). Technical noncost data comes from a variety of sources including the MIS
(e.g., materials requirements planning (MRP) or enterprise resource planning (ERP) Systems),
engineering drawings, engineering specifications, certification documents, interviews with
technical personnel, and through direct experience (e.g., weighing an item). Schedule, quantity,
equivalent units, and similar information comes from Industrial Engineering, Operations
Departments, program files or other program intelligence.
Once collected, data need to be adjusted for items such as production rate, improvement curve,
and inflation. This is also referred to as the data normalization process. Relevant program data
including development and production schedules, quantities produced, production rates,
equivalent units, breaks in production, significant design changes, and anomalies such as
strikes, explosions, and natural disasters are also necessary to fully explain any significant
fluctuations in the historical data. Such historical information can generally be obtained through
interviews with knowledgeable program personnel or through examination of program records.
Any fluctuations may exhibit themselves in a profile of monthly cost accounting data. For
example, labor hour charging may show an unusual "spike" or "depression" in the level of
charged hours. Data analysis and normalization processes are described in further detail later in
the chapter. First, it is important to identify data sources.
II. Data Sources
Specifying an estimating methodology is an important early step in the estimating process. The
basic estimating methodologies (analogy, catalog prices, extrapolation, grassroots, and
parametric) are all data-driven. To use any of these methodologies, credible and timely data
inputs are required. If data required for a specific approach are not available, then that
estimating methodology cannot be used effectively. Because of this, it is critical that the
estimator identifies the best data sources. Figure 2-1 shows nine basic sources of data and
whether the data are considered a primary or secondary source of information. When preparing
a cost estimate, estimators should consider all credible data sources. However, primary sources
of data should be given the highest priority for use whenever feasible.
The table below uses the following definitions of primary and secondary sources of data for
classification purposes:

•  Primary data are obtained from the original source. Primary data are considered the best
   in quality, and ultimately the most reliable.
•  Secondary data are derived (possibly "sanitized") from primary data and are therefore not
   obtained directly from the source. Because secondary data are derived (actually changed)
   from the original data, they may be of lower overall quality and usefulness.
Sources of Data

Source                       Source Type
Basic Accounting Records     Primary
Cost Reports                 Either (Primary or Secondary)
Historical Databases         Either
Functional Specialist        Either
Technical Databases          Either
Other Information Systems    Either
Contracts                    Secondary
Cost Proposals               Secondary

Figure 2-1: Sources of Data
Collecting the necessary data to produce an estimate, and evaluating the data for
reasonableness, is a critical and often time-consuming step. As stated, it is important to obtain
cost information, technical information, and schedule information. The technical and schedule
characteristics of programs are important because they drive the cost. For example, assume the
cost of another program is available and a program engineer has been asked to relate the cost
of the program to that of some other program. If the engineer is not provided with specific
technical and schedule information that defines the similar program, the engineer is not going to
be able to accurately compare the programs, nor is he or she going to be able to respond to
questions a cost estimator may have regarding the product being estimated in comparison to
the historical data. The bottom line is that the cost analysts and estimators are not solely
concerned with cost data. They need to have technical and schedule information available in
order to adjust, interpret, and lend credence to the cost data being used for estimating
purposes.
A cost estimator has to know the standard sources where historical cost data exists. This
knowledge comes from experience and from people who are available to answer key
questions. A cost analyst or estimator should constantly search out new sources of data. A new
source might keep cost and technical data on some item of importance to the current estimate.
Internal contractor information may also include analyses such as private corporate inflation
studies, or "market basket" analyses. Market basket analysis is an examination of the price
changes in a specified group of products. Such information provides data specific to a
company's product line(s) that could be relevant to a generic segment of the economy as a
whole. Such specific analyses would normally be prepared as part of an exercise to benchmark
Government provided indices, such as the consumer price index, and to compare corporate
performance to broader standards.
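To make the idea concrete, the following is a minimal Python sketch of how a company-specific market basket index might be computed; the items, weights, and prices are hypothetical and would in practice come from the company's own purchase history.

# Illustrative sketch (hypothetical data): a weighted "market basket" price
# index for a company-specific group of purchased items.
basket = [
    # (item, dollar-weighted share, base-year price, current-year price)
    ("aluminum sheet", 0.40, 12.50, 13.40),
    ("fasteners",      0.25,  0.90,  0.95),
    ("connectors",     0.35,  4.20,  4.55),
]

def market_basket_index(items):
    """Weighted ratio of current to base prices, scaled so the base year = 100."""
    return 100.0 * sum(weight * (current / base)
                       for _, weight, base, current in items)

print(f"Company market basket index: {market_basket_index(basket):.1f}")
# A value above 100 indicates price growth for this basket since the base year;
# the result can then be benchmarked against a published index such as the CPI or PPI.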
In addition, some sources of data may be external. External data can include databases
containing pooled and normalized information from a variety of sources (other companies or
public record information). Although such information can often be useful, weaknesses of these
sources can relate to:

•  No knowledge of the manufacturing and/or software processes used and how they
   compare to the current scenario being estimated;
•  No knowledge of the procedures (i.e., accounting) used by the other contributors;
•  No knowledge of the treatment of anomalies (how they were handled) in the original
   data; and
•  The inability to accurately forecast future indices.
It is important to realize that sources of data are almost unlimited, and all relevant information
should be considered during data analysis, if practical. Although major sources are described
above, data sources should not be constrained to a specific list. Figure 2-2 highlights several
key points about data collection, evaluation, and normalization processes.
Data Collection, Evaluation and Normalization
•  Very Critical Step
•  Can Be Time-Consuming
•  Need Actual Historical Cost, Schedule, and Technical Information
•  Know Standard Sources
•  Search Out New Sources
•  Capture Historical Data
•  Provide Sufficient Resources
Figure 2-2: Data Collection, Evaluation & Normalization
III. Routine Data Normalization Adjustments
Data need to be adjusted for certain effects to make them homogeneous, or consistent. Developing
a consistent data set is generally performed through the data analysis and normalization
process. In nearly every data set, the analyst needs to examine the data to ensure the database
is free of the effects of:
•  The changing value of the dollar over time,
•  The effects of cost improvement as the organization improves its efficiency, and
•  The effects of various production rates during the period from which the data were
   collected.
Figure 2-3 provides a process flow related to data normalization. This process description is not
intended to be all-inclusive, but rather depicts the primary activities performed in normalizing a
data set.
Figure 2-3: Data Normalization Process Flow
Some data adjustments are routine in nature and relate to items such as inflation. Routine data
adjustments are discussed below. Other adjustments are more complex in nature, such as
those relating to anomalies. Section IV discusses significant data normalization adjustments.
A. Inflation
Inflation is defined as a rise in the general level of prices, without a rise in
output or productivity. There are no fixed ways to establish universal inflation
indices (past, present or future) that fit all possible situations. Inflation indices
generally include internal and external information as discussed in Section II.
Examples of external information include the Consumer Price Index (CPI),
Producer Price Index (PPI), and other forecasts of inflation from various
econometric models. Therefore, while generalized inflation indices may be
used, it may also be possible to tailor and negotiate indices used on an
individual basis to specific labor rate agreements (e.g., forward pricing rates)
and the actual materials used on the project. Inflation indices should be based
on the cost of materials and labor on a unit basis (piece, pounds, hour), and
should not include other considerations like changes in manpower loading or
the amount of materials used per unit of production. The key to inflation
adjustments is consistency. If cost is adjusted to a fixed reference date for
calibration purposes, the same type of inflation index must be used in
escalating the cost forward or backwards from the reference date and then to
the date of the estimate.
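As an illustration of the consistency point above, the following Python sketch shows one way historical costs might be brought to constant reference-year dollars for calibration and an estimate escalated back out to its period of performance; the index values, years, and costs are hypothetical.

# Minimal sketch (hypothetical index values and costs): normalize then-year
# costs to constant reference-year dollars, and escalate a constant-dollar
# estimate to the year of performance using the same index.
index = {1993: 0.925, 1994: 0.951, 1995: 0.976, 1996: 1.000, 1997: 1.028}

def to_constant_dollars(cost, cost_year, reference_year=1996):
    """Convert then-year dollars to constant reference-year dollars."""
    return cost * index[reference_year] / index[cost_year]

def to_then_year_dollars(constant_cost, target_year, reference_year=1996):
    """Escalate constant reference-year dollars to the target year."""
    return constant_cost * index[target_year] / index[reference_year]

historical = [(1993, 410_000.0), (1994, 385_000.0), (1995, 402_000.0)]
normalized = [round(to_constant_dollars(cost, year)) for year, cost in historical]
print(normalized)                                    # constant 1996 dollars
print(round(to_then_year_dollars(400_000.0, 1997)))  # estimate in 1997 dollars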
B. Cost Improvement Curve
When first applied, cost improvement was referred to as "Learning Curve"
theory. Learning curve theory states that as the quantity of a product produced
doubles, the manufacturing hours per unit expended producing the product
decreases by a constant percentage. The learning curve, as originally
conceived, analyzes labor hours over successive production units of a
manufactured item. The theory has been adapted to account for cost
improvement across the organization. Both cost improvement and the
traditional learning curve theory are defined by the following equation:

    Y = A * X^b

Where:
    Y = hours per unit (or constant dollars per unit)
    A = first unit hours (or constant dollars per unit)
    X = unit number
    b = slope of the curve related to learning.
In parametric models, the learning curve is often used to analyze the direct cost
of successively manufactured units. Direct cost equals the cost of both touch
labor and direct materials in fixed year dollars. Sometimes this is called an
improvement curve. The slope is calculated using hours or constant year
dollars. A more detailed explanation of improvement curve theory is presented
in Chapter 3 and Chapter 10.
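A minimal Python sketch of unit learning curve arithmetic follows; the first unit hours and the 90 percent slope are hypothetical, and the exponent b is derived from the curve slope as log(slope)/log(2).

import math

def unit_hours(first_unit_hours, unit_number, slope_pct):
    """Hours (or constant dollars) for a given unit under unit learning theory,
    Y = A * X**b, where b = log(slope)/log(2)."""
    b = math.log(slope_pct / 100.0) / math.log(2.0)
    return first_unit_hours * unit_number ** b

A = 1_000.0  # hypothetical first unit hours
for x in (1, 2, 4, 8):
    print(x, round(unit_hours(A, x, 90.0), 1))
# Each doubling of the unit number multiplies the unit hours by 0.90:
# 1000.0, 900.0, 810.0, 729.0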
C. Production Rate
The cost improvement curve theory has had many innovations since originally
conceived. One of the more popular innovations has been the addition of a variable in
the equation to capture the organization's production rate. The production rate is
defined as the number of items produced over a given time period. The following
equation modifies the general cost improvement formula to capture changes in the
production rate (Q^r) and organizational cost improvement (X^b):

    Y = A * X^b * Q^r

Where:
    Y = hours per unit (or constant dollars per unit)
    A = first unit hours (or constant dollars per unit)
    X = unit number
    b = slope of the curve related to learning
    Q = production rate (quantity produced during the period)
    r = slope of the curve related to the production rate.
The net effect of adding the production rate term (Q^r) is to adjust the first unit
hours or dollars (A) for various production rates throughout the life of the production effort. The
equation will also yield a rate-affected slope related to learning. The rate-affected
equation must be monitored for problems of multicollinearity (X and Q having a high
degree of correlation). If the model exhibits problems of multicollinearity, the analyst
should account for production rate effects using an alternative method. If possible, rate
effects should be derived from historical data program behavior patterns observed as
production rates change, while holding the learning slope coefficient constant. The rate
effect can vary considerably depending on what was required to effect the change. For
example, were new facilities required or did the change involve only a change in
manpower or overtime? Chapter 10 provides additional information on data adjustments
for inflation, learning, and production rate.
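The sketch below, with hypothetical lot data, shows one way the rate-adjusted curve might be fit by taking logarithms and solving a linear least-squares problem, along with a simple check of the correlation between ln(X) and ln(Q) that the multicollinearity caution above refers to.

import numpy as np

# Hypothetical lot data: midpoint unit numbers, production rates, and
# average hours per unit.
X = np.array([10, 30, 60, 100, 150, 210], dtype=float)
Q = np.array([5, 8, 10, 12, 12, 14], dtype=float)
Y = np.array([820, 700, 640, 600, 585, 560], dtype=float)

# ln(Y) = ln(A) + b*ln(X) + r*ln(Q) is linear in the unknown coefficients.
design = np.column_stack([np.ones_like(X), np.log(X), np.log(Q)])
(lnA, b, r), *_ = np.linalg.lstsq(design, np.log(Y), rcond=None)
print(f"A = {np.exp(lnA):.1f}, b = {b:.3f}, r = {r:.3f}")

# A high correlation between ln(X) and ln(Q) signals multicollinearity, in
# which case the rate effect should be derived by an alternative method.
print(f"corr(ln X, ln Q) = {np.corrcoef(np.log(X), np.log(Q))[0, 1]:.2f}")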
IV. Significant Data Normalization Adjustments
This section describes some of the more complex adjustments analysts make to historical cost
data used in parametric analysis.
A. Consistent Scope
Adjustments are appropriate for differences in program or product scope
between the historical data and the estimate being made. For example,
suppose the systems engineering department made a comparison of five
similar programs. After initial analysis, the organization realized that only two of
the five had design-to-cost (DTC) requirements. To normalize the data, the
DTC hours were deleted from the two programs to create a consistent systems
engineering scope.
B. Anomalies
Historical cost data should be adjusted for anomalies (unusual events) when it
is not reasonable to expect these unusual costs to be present in the new
projects. The adjustments and judgments used in preparing the historical data
for analysis should be fully documented. For example, suppose the development
test programs of five similar programs are compared, and it is observed that one
of the programs experienced a major test failure (e.g., qualification, ground test,
flight test). A considerable amount of labor resources was required to fact-find,
determine the root cause
of the failure, and develop an action plan for a solution. Should the hours for
this program be included in the database or not? This is an issue analysts must
consider and resolve. If an adjustment is made to this data point, then the
analyst must thoroughly document the actions taken to identify the anomalous
hours.
There are other changes for which data can be adjusted, such as changes in
technology. These changes must be accounted for in a contractor’s estimate.
Data normalization is one process typically used to make all such adjustments.
In certain applications, particularly if a commercial model is used, the model
inputs could be adjusted to account for certain improved technologies (see
discussion of commercial models in Chapters 5 and 6). In addition, some
contractors, instead of normalizing the data for technology changes, may
deduct estimated savings from the bottom-line estimate. Any adjustments made
by the analyst to account for a technology change in the data must be
adequately documented and disclosed.
For example, suppose electronic circuitry was originally designed with discrete
components, but the electronics now use application-specific integrated circuit
(ASIC) technology. Or, a hardware enclosure that was once made from aluminum
is now made of magnesium because of weight constraints. What is the impact on the
hours? Perfect historical data may not exist, but good judgment and analysis by
an experienced analyst should supply reasonable results.
For example, suppose there are four production lots of manufacturing hours
data that look like the following:
Lot      Total Hours     Units     Average Hours per Unit
Lot 1    256,000         300       853 hours/unit
Lot 2    332,000         450       738 hours/unit
Lot 3    361,760         380       952 hours/unit
Lot 4    207,000         300       690 hours/unit
Clearly, Lot 3's history should be investigated since the average hours per unit
appear peculiar. It is not acceptable to merely "throw out" Lot 3 and work with
the other three lots. A careful analysis should be performed on the data to
determine why it exhibited this behavior.
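As a simple screen for the kind of peculiarity shown above, the following Python sketch recomputes average hours per unit for the four lots and flags any lot that deviates markedly from the median; the 15 percent threshold is an arbitrary illustration, and a flag is a prompt to investigate, not a reason to discard the point.

lots = {"Lot 1": (256_000, 300), "Lot 2": (332_000, 450),
        "Lot 3": (361_760, 380), "Lot 4": (207_000, 300)}

averages = {name: hours / units for name, (hours, units) in lots.items()}
ordered = sorted(averages.values())
median = (ordered[1] + ordered[2]) / 2.0  # median of the four lot averages

for name, hours_per_unit in averages.items():
    deviation = (hours_per_unit - median) / median
    flag = "  <-- investigate" if abs(deviation) > 0.15 else ""
    print(f"{name}: {hours_per_unit:6.0f} hrs/unit ({deviation:+.0%}){flag}")
# Only Lot 3 exceeds the threshold, matching the observation above.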
C. Illustration of Data Adjustment Analysis
Based on the prior discussion, the following is an example to illustrate the data analysis
process. Suppose the information in the following table represents a company’s
historical data and that the prospective system is similar to one built several years ago.
Parameter              Historical System                   Prospective System
Date of Fabrication    Jul 89 - Jun 91                     Jul 95 - Dec 95
Production Quantity    500                                 750
Size - Weight          22 lb external case                 20 lb external case
                       5 lb internal chassis               5 lb internal chassis
                       8 lb electronic parts               10 lb electronic parts
Volume                 1 cu ft, roughly cubical,           0.75 cu ft, rectangular solid,
                       12.1 x 11.5 x 12.5 in               8 x 10 x 16.2 in
Other Prog Features    5% elec.;                           5% elec.;
                       additional spare parts              no spare parts
This data needs several adjustments. In this example, the inflation factors, the quantity
difference, the rate of production effect, and the added elements in the original program
(the spare parts) would require adjustment. The analyst must be careful when
normalizing the data. General inflation factors are usually not appropriate for most
situations. Ideally, the analyst will have a good index of costs specific to the industry
and will use labor cost adjustments specific to the company. The quantity and rate
adjustments will have to consider the quantity effects on the company's vendors and the
ratio of overhead and setup to the total production cost. Likewise, with rate factors each
labor element will have to be examined to determine how strongly the rate affects labor
costs. On the other hand, the physical parameters do not suggest that significant
adjustments are required.
The first order normalization of the historic data would consist of:
•  Material escalation using Industry or company material cost history.
•  Labor escalation using company history.
•  Material quantity price breaks using company history.
•  Possible production rate effects on touch labor (if any) and unit overhead costs.
•  Because both cases are single lot batches, and are within a factor of two in
   quantity, only a small learning curve or production rate adjustment would
   generally be required.
V. Evaluation Issues
The Defense Federal Acquisition Regulation Supplement (DFARS) 215.407-5, "Estimating
Systems," states that "contractors should use historical data whenever appropriate." The
DFARS also states "a contractor’s estimating system should provide for the identification of
source data and the estimating methods and rationale used to develop an estimate." Therefore,
all data, including any adjustments made, should be thoroughly documented by a contractor so
that a complete trail is available for verification purposes. Some key questions an evaluator may
ask during their review of data collection and analysis processes include:
•  Are sufficient data available to adequately develop parametric techniques?
•  Has the contractor established a methodology to obtain, on a routine basis, relevant
   data on completed projects?
•  Are cost, technical, and program data collected in a consistent format?
•  Will data be accumulated in a manner that will be consistent with the contractor's
   estimating practices?
•  Are procedures established to identify and examine any data anomalies?
•  Were the source data used as is, or did they require adjustment?
•  Are any adjustments made to the data points adequately documented to demonstrate
   that they are logical, reasonable, and defensible?
Chapter 9, Auditing Parametrics, and Chapter 10, Technical Evaluation of Parametrics, provide
additional information on Government evaluation criteria.
VI. Other Considerations
There are several other issues that need to be considered when performing data collection and
analysis. Some of these are highlighted below.
A. Resources
Data collection and analysis activities require adequate resources, and companies
should commit sufficient resources to perform them. In addition, formal processes
should be established describing data collection and analysis activities. Chapter 7,
Regulatory Compliance, provides
information on estimating system requirements and includes discussion on data
collection and analysis procedures.
B. Information in the Wrong Format
While the contractor may indeed possess a great deal of data, in many cases the data
is not in an appropriate format to support the parametric techniques being used. For
example, commercial parametric models may have a unique classification system for
cost accounts that differs from the one used by a company. As a result, companies using
these models would have to develop a process that compares their accounting
classifications to those used by the model (also known as "mapping"). In other
situations, legacy systems may have generated data to meet the needs for reporting
against organizational objectives, which did not directly translate into the needs of the
cost estimating and analysis function. For example, the orientation of a large number of
past and existing information systems may have focused on the input side with little or
no provision for making meaningful translations reflecting output data useful in CER
development or similar types of analysis. The growing use of ERP systems, which have
a common enterprise-wide database, should make this data disconnect less severe.
Most large organizations are implementing ERP systems or are reengineering their
existing Information Systems so that parametric estimating models can be interfaced
with these systems quite easily.
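As a sketch of the "mapping" idea, the following Python fragment rolls hypothetical company ledger accounts up into the cost categories a parametric model might expect; the account codes and category names are invented for illustration.

account_to_model_category = {
    "5100-DIRECT-LABOR-FAB":  "Manufacturing Labor",
    "5120-DIRECT-LABOR-ASSY": "Manufacturing Labor",
    "5200-MFG-SUPPORT":       "Manufacturing Support",
    "6100-DESIGN-ENGR":       "Engineering Labor",
    "7300-EXPENDABLE-MATL":   "Direct Material",
}

def roll_up(ledger_rows):
    """Aggregate ledger charges into the model's cost categories."""
    totals = {}
    for account, dollars in ledger_rows:
        category = account_to_model_category[account]
        totals[category] = totals.get(category, 0.0) + dollars
    return totals

ledger = [("5100-DIRECT-LABOR-FAB", 125_000.0),
          ("5120-DIRECT-LABOR-ASSY", 88_000.0),
          ("6100-DESIGN-ENGR", 64_000.0)]
print(roll_up(ledger))
# {'Manufacturing Labor': 213000.0, 'Engineering Labor': 64000.0}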
C. Differences in Definitions of Categories
Many problems occur when the analyst or the database fails to account for differences
in the definitions of the WBS elements across the projects included in the database.
Problems also occur when the definition of the content of cost categories fails to
correspond to the definition of analogous categories in existing databases. For
example, some analysts put engineering drawings into the data category while others
put engineering drawings into the engineering category. A properly defined WBS
product tree and dictionary can avoid or minimize these inconsistencies.
D. The Influence of Temporal Factors
Historical data are generated over time. This means that numerous dynamic factors will
influence data being collected in certain areas. For example, the definition of the
content of various cost categories being used to accumulate the historical data may
change as a system evolves. Similarly, inflation changes will occur and be reflected in
the cost data being collected over time. In addition, as the Department of Defense
(DOD) deals with a rapidly changing technical environment, both cost and noncost data
generated for a given era or class of technology are necessarily limited. Many analysts
would consider a data-gathering project a success if they could obtain five to ten good
data points for certain types of hardware.
E. Comparability Problems
Comparability problems include, but are not limited to, changes in a company's
department numbers, accounting systems, and disclosure statements. They also
include changes from indirect to direct charge personnel for a given function. When
developing a database, the analyst must normalize it to ensure the data are
comparable. For example, if building a database with cost data, the analyst must first
remove the effects of inflation so that all costs are displayed in constant dollars. The
analyst must also normalize the data for consistency in content. Normalizing for content
is the process of ensuring that a particular cost category has the same definition in
terms of content for all observations in the database. Normalizing cost data is a
challenging problem, but it must be resolved if a good database is to be constructed.
Resolving database problems so that an information system exists to meet user needs
is not easy. For example, cost analysis methodologies typically vary considerably from
one analysis or estimate to another. The requirements for CERs, such as data and
information requirements, are not constant over time. An analyst's determination of
data needs at one point in time is not the final determination for all time for that
system. Data needs must be reviewed periodically.
The routine maintenance and associated expense of updating the database must also
be considered. An outdated database may be of very little use in forecasting future
acquisition costs. The more the organization develops and relies on parametric
estimating methods, the more it will need to invest in data collection and analysis
activities. The contractor needs to balance this investment against the efficiency gains it
plans to achieve through use of parametric estimating techniques. If the contractor
moves towards an ERP system, the incremental cost to add a parametric estimating
capability may not be significant.
Good data underpins the quality of any estimating system or method. As the acquisition
community moves toward estimating methods that increase their reliance on the
historical costs of the contractor, the quality of the data cannot be taken for granted.
Industry and their Government customers should find methods to establish credible
databases that are relevant to the history of the contractor. From this, the contractor
should be in a better position to reliably predict future costs, and the Government would
be in a better position to evaluate proposals based on parametric techniques.
CHAPTER 3: COST ESTIMATING RELATIONSHIPS (CERs)
Chapter Summary
Many companies implement cost estimating relationships (CERs) to streamline the costs and
cycle time associated with proposal preparation, evaluation, and negotiation processes. Often
CERs are used to price low-cost items or services that take a significant amount of resources to
estimate using traditional techniques. Proper CER development and application depends
heavily on understanding certain mathematical and statistical techniques. This chapter explains
some of the easier and more widely used techniques. However, there are many other
techniques available, which are explained in standard statistical textbooks (Appendix E contains
a listing of statistical resources). The focus of the discussion in this chapter is designed to
permit an analyst to understand and apply the commonly used techniques. In addition, the
chapter provides "Rule-of-Thumb" guidelines for determining the merit of statistical regression
models, instructions for comparing models, examples of simple and complex CERs developed
and employed by some of the Parametric Estimating Reinvention Laboratory sites, and a
discussion of the differences between simple and complex models.
Objective/Purpose
The primary objective of this chapter is to provide general guidance for use in developing and
employing valid CERs. The chapter focuses on simple and complex CERs and provides
information on implementation, maintenance, and evaluation techniques. Specifically, this
chapter:
1. Discusses various techniques for implementing CERs, including linear regression using
the Least Squares Best Fit (LSBF) method.
2. Provides a framework for analyzing the quality or validity of a statistical model.
3. Recommends procedures for developing a broad-based CER estimating capability.
Key Assumptions
A number of quantitative applications can be used to analyze the strength of data relationships.
When applicable, statistical analysis is one of the most frequently used techniques. Therefore,
this chapter focuses on the use of statistical analysis as a tool for evaluating the significance of
data relationships. However, readers should be aware that other techniques are available.
I. Developing CERs
Before venturing into the details of how to develop CERs, an understanding of the definition is
necessary. Numerous cost estimating references and statistical texts may define CERs in a
number of different ways. In deciding on a standard set of definitions for this Handbook, the
Parametric Cost Estimating Initiative (PCEI) Working Group (WG) solicited feedback from a
number of sources. The PCEI WG decided upon the continuum represented in Figure 3-1 and
described below.
Figure 3-1: Continuum of CER Complexity
In short, CERs are mathematical expressions of varying degrees of complexity expressing cost
as a function of one or more cost driving variables. The relationship may utilize cost-to-cost
variables, such as manufacturing hours to quality assurance hours, or cost-to-noncost variables,
such as engineering hours to the number of engineering drawings. The continuum of CERs is
synonymous with the term parametric estimating methods. Parametric estimating methods are
defined as estimating techniques that rely on theoretical, known or proven relationships
between item characteristics and the associated item cost. Whether labeled a CER or a
parametric estimating method, the technique relies on a value, called a parameter, to estimate
the value of something else, typically cost. The estimating relationship can range in complexity
from something rather simple, such as a numerical expression of value or a ratio (typically
expressed as a percentage), to something more complex, such as a multi-variable
mathematical expression.
As the relationships increase in complexity, many analysts identify them as a cost model. A
model is a series of equations, ground rules, assumptions, relationships, constants, and
variables that describe and define the situation or condition being studied. If the model is
developed and sold to the public for broad application, it is typically referred to as a commercial
model. If the model is developed for the specific application needs of an organization, it is
typically referred to as a company-developed or proprietary model.
A. Definition of a CER
As previously stated, a CER is a mathematical expression relating cost as the
dependent variable to one or more independent cost-driving variables. An
example of a cost-to-cost CER may be represented by using manufacturing
costs to estimate quality assurance costs, or using manufacturing hours to
estimate the cost for expendable material such as rivets, primer, or sealant. The
key notion is that the cost of one element is used to estimate, or predict, the
cost of another element. When the relationship is described as a cost-to-noncost relationship, the reference is to a CER where a characteristic of an
item is used to predict the item’s cost. An example of a cost-to-noncost CER
may be to estimate manufacturing costs by using the weight of an item. Another
example is to use the number of engineering drawings to estimate design engineering costs. In the cost-to-noncost examples, both weight and the
number of engineering drawings are noncost variables.
For CERs to be valid, they must be developed using sound logical concepts.
The logic concept is one where experts in the field agree, as supported by
generally accepted theory, that one of the variables in the relationship (the
independent variable) causes or affects the behavior in another variable (the
dependent variable). Once valid CERs have been developed, parametric cost
modeling and estimating can proceed. This chapter discusses some of the
more commonly used statistical techniques for CER development.
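The two CER forms just described can be expressed in a few lines of Python; the factors below (a 12 percent quality assurance ratio and 45 hours per drawing) are hypothetical and would in practice be derived and validated from historical data.

def quality_assurance_hours(manufacturing_hours, qa_factor=0.12):
    """Cost-to-cost CER: QA hours estimated as a ratio of manufacturing hours."""
    return qa_factor * manufacturing_hours

def design_engineering_hours(drawing_count, hours_per_drawing=45.0):
    """Cost-to-noncost CER: design engineering hours driven by drawing count."""
    return hours_per_drawing * drawing_count

print(quality_assurance_hours(10_000))   # 1200.0
print(design_engineering_hours(250))     # 11250.0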
B. CER Development Process
CERs are a key tool used in estimating by the cost analyst and they may be
used at any time in the estimating process. For example, CERs may be used in
the concept or validation phase to estimate costs of a program when there is
insufficient system definition. CERs may also be used in later phases to
estimate program costs for use as a cross-check against estimates prepared
using other techniques. CERs are also used as a basis of estimate (BOE) for
proposals submitted to the Government or higher-tier contractors. Often before
implementing complex CERs or models, analysts begin with more rudimentary
CERs in order to gain the confidence of internal company and external
Government representatives. The CER development process is illustrated in
Figure 3-2.
The beginning of the CER development process is the identification of an
opportunity to improve the estimating process through the use of CERs. The
specific outcome of this step is a whitepaper describing the specific opportunity,
the data needs, the analysis tools, the CER acceptance criteria, and a planned
process for keeping the CER current. In undertaking this effort, an organization
will typically investigate a number of estimating relationships. Evaluating CER
opportunities one at a time is rather inefficient when it comes to data collection.
Therefore, the cost team will typically research the firm’s databases for data
supporting a number of opportunities simultaneously.
The value of a CER depends on the soundness of the database from which the
CER is developed and subsequently, how it is used in future estimates.
Determination of the "goodness" of a CER and its applicability to the system
being estimated requires a thorough analysis of the system and knowledge of
the database. It is possible, however, to make a few general observations about
CER development. CERs are analytical equations that relate various cost
categories (either in dollars or physical units) to cost drivers or explanatory
variables. CERs can take numerous forms, ranging from informal rules-of-thumb or simple analogies to formal mathematical functions derived from
statistical analysis of empirical data. Regardless of the degree of complexity,
developing a CER requires a concerted effort to assemble and refine the data
that constitutes its empirical basis. In deriving a CER, assembling a credible
database is especially important and, often, the most time-consuming activity.
Deriving CERs is a difficult task and the number of valid CERs is significantly
fewer than one might expect. While there are many reasons for the lack of valid
CERs, the number one reason is the lack of an appropriate database.
Figure 3-2: CER Development Process
When developing a CER, the analyst must first hypothesize and test logical
estimating relationships. For example, does it make sense to expect that costs
will increase as aircraft engine thrust requirements increase? Given that it does
make sense, the analyst will need to refine that hypothesis to determine
whether the relationship is linear or curvilinear. After developing a hypothetical
relationship, the analyst needs to organize the database to test the proposed
relationship(s).
Sometimes, when assembling a database, the analyst discovers that the raw
data are at least partially in the wrong format for analytical purposes, or that the
data displays irregularities and inconsistencies. Adjustments to the raw data,
therefore, almost always need to be made to ensure a reasonably consistent
and comparable database. It is important to note that no degree of
sophistication in the use of advanced mathematical statistics can compensate
for a seriously deficient database.
Since the data problem is fundamental, typically a considerable amount of time
is devoted to collecting data, adjusting that data to help ensure consistency and
comparability, and providing for proper storage of the information so that it can
be rapidly retrieved when needed. More effort is typically devoted to assembling
a quality database than to any other step in the process. Chapter 2, Data
Collection and Analysis, provides further information on this topic. Given the
appropriate information, however, the analytical task of deriving CER equations
is often relatively easy.
C. Testing a CER’s Logic
Complementing the issues of deriving a good database is the need to first
hypothesize, then test, the mathematical form of the CER. Some analysts
believe the hypothesis comes first, then the data search to build a good
database. Other analysts believe the data search comes first, and given the
availability of data, the subsequent determination of a logical relationship or
hypothesis occurs. Regardless of the position taken, the analyst must
determine and test a proposed logical estimating relationship. The analyst must
structure the forecasting model and formulate the hypothesis to be tested. The
work may take several forms depending upon forecasting needs. It involves
discussions with engineers to identify potential cost driving variables, scrutiny of
the technical and cost proposals, and identification of cost relationships. Only
with an understanding of estimating requirements can an analyst attempt to
hypothesize a forecasting model necessary to develop a CER. CERs do not
necessarily need robust statistical testing. Many firms use CERs and validate
them by evaluating how well they predicted the final cost of that portion of the
project they were designed to estimate. If the CER maintains some reasonable
level of consistency, the firm continues to use it. Consequently, statistical
measures are not the only way to measure a CER's validity. Regardless of the
validation method, application of the technique must adhere to the company's
estimating system policies and procedures. Chapters 7 through 10 provide
practical guidance on Government review and evaluation criteria.
D. The CER Model
Once the database is developed and a hypothesis determined, the analyst is
ready to mathematically model the CER. While this analysis can take several
forms, both linear and curvilinear, the chapter will initially consider one simple
model -- the Least Squares Best Fit (LSBF) model. A number of statistical
packages are available that generate the LSBF equation parameters. Most
statistical software programs use the linear regression analysis process.
However, the chapter will first review manual development of the LSBF
equation and the regression analysis process.
II. Curve Fitting
There are two standard methods of curve fitting. One method has the analyst plot the data and
fit a smooth curve that appears to best-fit the relationship in the data. This is known as the
graphical method. Although in many cases the "curve" will be a straight line, the vocabulary of
cost estimating and math identifies this technique as curve fitting. The other method uses
formulas to mathematically develop a line of "best-fit." This mathematical approach is termed
the LSBF method and provides the foundation for simple linear regression. Any of the
mathematical analysis techniques described in this section of the handbook will work with the
simplest CER (regression model) to estimate a straight line. The mathematical equation for a
straight line is expressed as: Y = A + B(X). The elements of this equation are discussed in the
next section. Although few relationships in cost estimating follow a pure linear relationship, the
linear model is sufficiently accurate in many cases over a specified range of the data.
A. Graphical Method
To apply the graphical method, the data must first be plotted on graph paper.
No attempt should be made to make the smooth curve actually pass through
the data points that have been plotted. Instead, the curve should pass between
the data points leaving approximately an equal number on either side of the
line. For linear data, a clear ruler or other straightedge may be used to fit the
curve. The objective is to "best-fit" the curve to the data points plotted; that is,
each data point plotted is equally important and the curve you fit must consider
each and every data point.
Although considered a rather outdated technique today, plotting the data is still
generally a good idea. Spreadsheets with integrated graphical capabilities
make this task rather routine. By plotting the data, we get a picture of the
relationship and can easily focus on those points that may require further
investigation. Before developing a forecasting rule or mathematical equation,
the analyst is advised in every case to plot the data and note any points that
may require further investigation.
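As a minimal sketch of this plotting step, the example below uses Python with matplotlib (one of many suitable tools) to produce a scatter plot of a candidate cost driver against observed cost; the data values are the small illustrative set analyzed later in Figure 3-5.

# Minimal sketch: plot a candidate cost driver against observed cost
# before fitting any equation. Data values are the Figure 3-5 example set.
import matplotlib.pyplot as plt

driver = [4, 11, 3, 9, 7, 2]      # candidate cost driver (X)
cost   = [10, 24, 8, 12, 9, 3]    # observed cost (Y)

plt.scatter(driver, cost)
plt.xlabel("Cost driver (X)")
plt.ylabel("Observed cost (Y)")
plt.title("Scatter plot of candidate CER data")
plt.show()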
B. LSBF Method
The purpose of the LSBF analysis is to improve our ability to predict the next
"real world" occurrence of our dependent variable. The LSBF technique is also
the root of regression analysis, which may be defined as the mathematical
nature of the association between two variables. This association is determined
in the form of a mathematical equation. Such an equation provides the ability to
predict one variable on the basis of the knowledge of the other variable. The
variable whose value is to be predicted is called the dependent variable. The
variable for which knowledge is available or can be obtained is called the
independent variable. In other words, the value of the dependent variable
depends on the value of the independent variable(s).
The relationship between variables may be linear or curvilinear. A linear
relationship means that the functional relationship can be described graphically
(on an ordinary X-Y coordinate system) by a straight line and mathematically by
the common form:
Y = A + B(X), where:
Y = the calculated value of Y (the dependent variable),
X = the independent variable,
B = the slope of the line (the change in Y divided by the change in X), and
A = the point at which the line intersects the vertical axis (Y-axis).
The bi-variate regression equation (the linear relationship of two variables)
consists of two distinctive parts, the functional part and the random part. The
equation for a bi-variate regression population is: Y = A + B(X) + E. The portion
of the equation given by "A + B(X)" is the functional part (a straight line), and E
(the error term) is the random part. A and B are parameters of the population
that exactly describe the intercept and slope of the relationship. The term "E"
represents the random part of the equation. The random part of the equation is
always present because of the errors of assigning value, measurement, and
observation. These types of errors always exist because of human limitations,
and the limitations associated with real world events.
Since it is practically impossible to capture data for an entire population, we
normally work with a representative sample from that population. We denote
that we are working with a sample by adjusting our equation to the form: Y = a
+ b(X) + e. Again, the term "a + b(X)" represents the functional part of the
equation and "e" represents the random part. The estimate of the true
population parameters "A" and "B" are represented in the sample equation by
"a" and "b", respectively. In this sense then, "a" and "b" are statistics. That is,
they are estimates of population parameters. As statistics, they are subject to
sampling errors. Consequently, a good random sampling plan is important.
The LSBF method specifies the one line that best fits the data set. The method
does this by minimizing the sum of the squared deviations between the observed
values of Y and the calculated values of Y. The observed value represents the
value actually recorded in the database; the calculated value of Y, identified
as Yc, is the value the equation predicts given the same value of X.
For example, suppose we estimated engineering hours based on the number of
drawings using the following linear equation: EngrHours = 467 + 3.65
(NumEngrDrawings). In this case "EngrHours" is the dependent, or Y-variable,
and "NumEngrDrawings" is the independent or X-variable. Suppose the
company’s database contained 525 hours for a program containing 15
engineering drawings. The 525 hours represents the observed value for Y when
X is equal to 15. The equation however would have predicted 521.75 hours (Yc
= 467 + 3.65(x) = 467+3.65(15) = 521.75). The "521.75" is the calculated value
of Y, or Yc. The difference between the observed and calculated value
represents the error ("e") in the equation (model). The LSBF technique
analyzes each (X,Y) pair in the database, refining the parameters for the slope
and intercept terms, until it finds the one equation for the line that minimizes the
sum of the squared error terms. To illustrate this process, assume the
measurement of the error term for four points: (Y1 - Yc1), (Y2 - Yc2), (Y3 - Yc3),
(Y4 - Yc4). The line that best fits the data, as shown in Figure 3-3, is the line that
minimizes the following summation:
Σ (Yi - Yci)², for i = 1 to 4;
where "i" is simply a counting scheme to denote that the technique minimizes
the squared distance for all elements in the data set. The data set starts with
observation number 1 and ends with the last one. In this case, the last
observation is number 4.
Figure 3-3: LSBF Graphical Estimation
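To make the observed-versus-calculated comparison concrete, the short sketch below evaluates the engineering-hours CER from the text (EngrHours = 467 + 3.65 x NumEngrDrawings) at X = 15 drawings and reports the resulting error term.

# Sketch: observed vs. calculated value for the engineering-hours CER
# EngrHours = 467 + 3.65 * NumEngrDrawings (coefficients from the example above).
a, b = 467.0, 3.65
observed_y = 525.0            # hours recorded in the database when X = 15
x = 15                        # number of engineering drawings
calculated_y = a + b * x      # Yc = 521.75
error = observed_y - calculated_y
print(f"Yc = {calculated_y:.2f}, e = {error:.2f}")   # Yc = 521.75, e = 3.25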
To calculate the LSBF line for a database of n-number of observations, the
analyst needs to find the "a + bX" which minimizes:
Σ (Yi - (a + bXi))², for i = 1 to n.
Fortunately, calculus shows that the LSBF "a + bX"
parameters minimize the squared error term when:
(1) ΣY = na + b ΣX, and
(2) ΣXY = a ΣX + b ΣX²
Equations (1) and (2) are called the normal equations of the LSBF line.
References contained in any comprehensive statistical textbook will illustrate
that these two equations do meet the requirements of the ordinary LSBF
regression. These properties are:
• The technique considers all points.
• The sum of the squared deviations between the line and the observed points is
the minimum value possible; that is, Σ(Y - Yc)² = Σe² = a minimum.
Similarities between these two properties and the arithmetic mean should also
be observed. The arithmetic mean of X is the sum of the values of the independent
variable divided by the number of observations, or X̄ = ΣX/n, and the mean of Y is
the sum of the "Ys" divided by the number of observations, or Ȳ = ΣY/n. It follows
that the point (X̄, Ȳ) falls on the LSBF line. To calculate the "a" and "b" for the
LSBF line, we need a spreadsheet format, as shown in Figure 3-4.
Computation Element   |  X  |  Y  |  X*Y   |  X²  |  Y²
                      |  X1 |  Y1 | X1*Y1  | X1²  | Y1²
                      |  X2 |  Y2 | X2*Y2  | X2²  | Y2²
                      |  X3 |  Y3 | X3*Y3  | X3²  | Y3²
                      |  -  |  -  |   -    |  -   |  -
Sum of the Column (Σ) |  ΣX |  ΣY | ΣX*Y   | ΣX²  | ΣY²
Figure 3-4: Generic LSBF Analysis Table
To illustrate the calculations for a line of best fit, suppose we collected data and
assembled it in the LSBF Analysis Table format, as shown in Figure 3-5.
Computation Element   |  X  |  Y  |  XY  |  X²  |  Y²
                      |  4  | 10  |  40  |  16  | 100
                      | 11  | 24  | 264  | 121  | 576
                      |  3  |  8  |  24  |   9  |  64
                      |  9  | 12  | 108  |  81  | 144
                      |  7  |  9  |  63  |  49  |  81
                      |  2  |  3  |   6  |   4  |   9
Sum of the Column (Σ) | 36  | 66  | 505  | 280  | 974
Figure 3-5: LSBF Analysis Example
From the "normal equations" for the LSBF technique, we can derive equations
to calculate "a" and "b" directly. The equations for "b" and "a" are given by:
(3) b = (nΣXY - ΣX ΣY) / (nΣX² - (ΣX)²), and
(4) a = Ȳ - b(X̄).
(Recall that once we know the slope "b," we can solve the general equation
Y = a + b(X) for "a" because we know that the point (X̄, Ȳ) must lie on the
line and can therefore directly solve the equation.)
Solving first for "b," we use the data from Figure 3-5 and substitute the values
into equation (3). Recall that X̄ = ΣX/n = 36/6 = 6 and Ȳ = ΣY/n = 66/6 = 11,
where n = the number of observations. Notice that the last row of the figure
contains the sum of values (Σ) for the element in bold in the top row. Solving
for "b", therefore, yields:
b = [6(505) - (36)(66)] / [6(280) - (36)²] = (3,030 - 2,376) / (1,680 - 1,296) = 654/384 ≈ 1.703
Solving for "a" yields:
a = Ȳ - b(X̄) = 11 - (1.703)(6) ≈ 0.78
Therefore, the LSBF equation for the line is Yc = 0.78 + 1.70(X) (where Yc is the
calculated value of Y).
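For readers who prefer to check the arithmetic in software, the sketch below reproduces the normal-equation solution for the Figure 3-5 data; any spreadsheet or statistics package should return the same slope and intercept.

# Sketch: least squares best fit (LSBF) for the Figure 3-5 data,
# using the normal-equation solutions for "b" and "a".
x = [4, 11, 3, 9, 7, 2]
y = [10, 24, 8, 12, 9, 3]
n = len(x)

sum_x  = sum(x)                                  # 36
sum_y  = sum(y)                                  # 66
sum_xy = sum(xi * yi for xi, yi in zip(x, y))    # 505
sum_x2 = sum(xi ** 2 for xi in x)                # 280

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # ~1.70
a = (sum_y / n) - b * (sum_x / n)                              # ~0.78
print(f"Yc = {a:.2f} + {b:.2f}(X)")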
C. Limitations, Errors and Caveats of LSBF Techniques
When working with the LSBF technique, there are a number of limitations,
errors and caveats to note. The following are some of the more obvious ones.
1) Assumptions of the LSBF Model
With the LSBF method, there are a number of critical
assumptions for the theory to work precisely. If any of the
assumptions are not valid, then theoretically the technique is
flawed. Many applied mathematicians, however, consider the
assumptions more as guidelines on when the technique will
work the best. If an assumption is violated, the next
question is how significant the violation is. If the violation is
relatively minor, or the data nearly comply, then the
technique is generally satisfactory for estimating. The size of
the error term and other statistical measures should provide
sufficient indication of the validity of the technique, even when
the data do not completely adhere to the assumptions identified
below:
• The values of the dependent variable are distributed by a normal
distribution function around the regression line.
• The mean value of each distribution lies on the regression line.
• The variance of each array of the independent variable is constant.
• The error term in any observation is independent of the error term in
all other observations. When this assumption is violated, the data are
said to be autocorrelated. This assumption requires the error term to
be a truly random variable.
• There are no errors in the values of the independent variables. The
regression model specifies that the independent variable be a fixed
number, and not a random variable.
• All causation in the model is one way; the causation must go from the
independent variable to the dependent variable. Causation, though
neither a statistical nor a mathematical requirement, is a highly
desirable attribute when using the regression model for forecasting.
Causation, of course, is what cost analysts are expected to determine
when they hypothesize the mathematical logic of a CER equation.
2) Extrapolation Beyond The Range of The Observed Data
An LSBF equation is theoretically valid only over the same range
of data from which the sample was initially taken. In forecasting
outside this range, the shape of the curve is less certain and
there is more estimating risk involved. Less credence is given
to forecasts made with data falling outside the range of the
original data. However, this does not mean that extrapolation
beyond the relevant range is always invalid. It may well be that
forecasting beyond the relevant range is the only suitable
alternative available. The analyst must keep in mind that
extrapolation assigns values using a relationship that has been
measured for circumstances that may differ from those used in
the forecast. It is the analyst’s job to make this determination,
in coordination with the technical and programmatic personnel
from both the company and the Government.
3) Cause And Effect
Regression and correlation analysis can in no way determine
cause and effect. It is up to the analyst to do a logic check,
determine an appropriate hypothesis, and analyze the
database so that an assessment can be made regarding cause
and effect. For example, assume a high degree of correlation
between the number of public telephones in a city and city
liquor sales. Clearly, there is no cause and effect involved here.
A variable with a more logical nexus, such as population, is a
more causal independent variable that drives both the number
of public telephones and liquor sales. Analysts must ensure
that they have chosen appropriately related data sets and that
real cause and effect is at work in their CERs.
4) Using Past Trends To Estimate Future Trends
It is very important to know that conditions change. If the
underlying population is no longer relevant due to changes in
technology, for example, then the LSBF equation may not be
the best forecasting tool to use. When using a CER, the analyst
needs to ensure that the conditions underlying the original
historical LSBF equation still apply to the current forecast.
D. Multiple Regression
In simple regression analysis, a single independent variable (X) is used to
estimate the dependent variable (Y), and the relationship is assumed to be
linear (a straight line). This is the most common form of regression analysis
used in CER development. However, there are more complex versions of the
regression equation that can be used that consider the effects of more than one
independent variable. Multiple regression analysis, for example, assumes that
the change in Y can be better explained by using more than one independent
variable. For example, automobile gasoline consumption may be largely
explained by the number of miles driven. However, we may postulate a better
explanation if we also considered factors such as the weight of the automobile.
In this case, the value of Y would be explained by two independent variables.
Yc = a + b1X1 + b2X2
where:
Yc = the calculated or estimated value for the dependent variable
a  = the Y intercept; the value of Y when all X-variables = 0
X1 = the first independent (explanatory) variable
b1 = the slope of the line related to the change in X1; the value by which Yc
     changes when X1 changes by one
X2 = the second independent variable
b2 = the slope of the line related to the change in X2; the value by which Yc
     changes when X2 changes by one
Finding the right combinations of explanatory variables is no easy task. Relying
on the general process flow in Figure 3-2, however, helps immeasurably.
Postulating the theory of which variables most significantly and independently
contribute towards explaining cost behavior is the first step. Many applied
statisticians then use a technique called step-wise regression to focus on the
most important cost driving variables. Step-wise regression is the process of
"introducing the X variables one at a time (stepwise forward regression) or by
including all the possible X variables in one multiple regression and rejecting
them one at a time (stepwise backward regression). The decision to add or
drop a variable is usually made on the basis of the contribution of that variable
to the ESS [error sum of squares], as judged by the F-test." 1 Stepwise
regression allows the analyst to add variables, or remove them, in search of the
best model to predict cost.
Stepwise regression, however, requires the analyst to carefully understand the
variables they are introducing to the model, to hypothesize the effect the
variables should have on the model, and to monitor for the effects of
multicollinearity. Multicollinearity occurs when two or more presumed
independent variables exhibit a high degree of correlation with each other. In
short, the explanatory variables are not making independent contributions
toward explaining the variance in the dependent variable. The mathematics of
regression analysis cannot separate or distinguish between the contributions
each variable is making. This prevents the analyst from determining which
variable is stronger or whether the sign on the parameter for that variable is
correct. The analyst must rely on the postulated theory and pair-wise correlation
to help solve this dilemma. Symptoms of multicollinearity include a high
explanatory power of the model, accompanied by insignificant or illogical
(incorrect sign) coefficient estimates. With multicollinearity, the math may still
produce a valid point estimate for Yc. The analyst, therefore, may still be able to
predict with the model. They must, however, use the entire equation, and can
only project point estimates. Multicollinearity does not allow the analyst to trust
the value or the sign of individual parameter coefficients. More detail on multiple
regression and stepwise regression is beyond the scope of this Handbook.
Please refer to Appendix E for references to web sites and other resources that
further address this topic.
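As a minimal sketch of these ideas, the example below fits Yc = a + b1X1 + b2X2 by ordinary least squares and prints the pair-wise correlation between the two explanatory variables as a simple multicollinearity check. The mileage, weight, and fuel figures are invented for illustration only, and a statistics package would add the diagnostic tests discussed later in the chapter.

# Sketch: two-variable multiple regression with a pair-wise correlation
# check for multicollinearity. All data values are illustrative only.
import numpy as np

x1 = np.array([800, 1200, 950, 1500, 700, 1300])      # miles driven
x2 = np.array([2900, 3400, 3100, 3600, 2800, 3300])   # vehicle weight (lb)
y  = np.array([32, 48, 38, 58, 27, 51])               # gallons consumed

# Design matrix with a column of ones for the intercept "a".
X = np.column_stack([np.ones(len(y)), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coef
print(f"Yc = {a:.2f} + {b1:.4f}*X1 + {b2:.4f}*X2")

# Pair-wise correlation between the explanatory variables; values near
# +/-1 warn that X1 and X2 are not making independent contributions.
print("corr(X1, X2) =", round(np.corrcoef(x1, x2)[0, 1], 3))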
E. Curvilinear Regression
In some cases, the relationship between the dependent and independent
variables may not be linear. Instead, a graph of the relationship on ordinary graph paper would depict
a curve. For example, improvement curve analysis uses a special form of
curvilinear regression. Except for the brief review of cost improvement curve
analysis that follows, curvilinear regression is beyond the scope of this
Handbook. As stated above, please refer to Appendix E for sources of
additional information.
F. The Cost Improvement Curve Analysis
The cost improvement curve form of analysis is a model frequently used in cost
estimating and analysis. Many of the commercial cost estimating models base
their methods on some form of the basic cost improvement curve. The basic
form of the "learning curve" equation is Y = a(X^b). Through a logarithmic
transformation of the data and the equation, the model appears intrinsically
linear: Ln(Y) = Ln(a) + b Ln(X). For both forms of the equation the following
conventions apply:
Y = Cost of Unit #X (or average for X units)
a = Cost of first unit
b = Learning curve coefficient
Note that the equation Ln(Y) = Ln(a) + b Ln(X) is of precisely the same form as
the linear equation Y = a + b(X). This means that the equation Ln(Y) = Ln(a) + b
Ln(X) can be graphed as a straight line, and all the regression formulae apply to
this equation just as they do to the equation Y = a + b(X). In order to derive a
cost improvement curve from cost data (units or lots) the regression equations
need to be used, whether the calculations are performed manually or by using a
computer based statistical package. In this sense, the cost improvement curve
equation is a special case of the LSBF technique. In cost improvement curve
methodologies the cost is assumed to decrease by a fixed proportion each time
quantity doubles. This constant improvement percentage is called the learning
curve "slope" (i.e., 90%). The slope is related to the learning curve coefficient
(b) through the equation:
. In applying the equation, the analyst
must use the decimal form of the slope percentage (i.e., 0.90). The divisor is
the Ln(2) because the theory is based on a constant percentage reduction in
cost each time the repetitions double.
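A hedged sketch of this log-log transformation follows. The unit costs shown are invented for illustration; the calculation simply regresses Ln(Y) on Ln(X) with the same LSBF machinery described above and converts the resulting coefficient b back into a slope percentage via slope = 2^b.

# Sketch: unit-cost learning curve fit via log-log (LSBF) regression.
# Y = a * X^b  ->  Ln(Y) = Ln(a) + b * Ln(X). Unit costs are illustrative.
import math
import numpy as np

units = np.array([1, 2, 4, 8, 16])                   # cumulative unit number (X)
cost  = np.array([100.0, 91.0, 80.5, 73.0, 66.0])    # cost of unit X (Y)

b, ln_a = np.polyfit(np.log(units), np.log(cost), 1)   # slope, then intercept
a = math.exp(ln_a)                 # theoretical first-unit cost
slope_pct = 2 ** b                 # constant reduction each time quantity doubles
print(f"Y = {a:.1f} * X^{b:.3f}, learning curve slope ~ {slope_pct:.0%}")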
Any good statistical package can perform all the calculations to derive the "a"
and "b" terms of the equation. A quality package will let you customize your
outputs, and calculate many statistics, including: frequency distributions,
percentiles, t-tests, variance tests, Pearson correlation and covariance,
regression, analysis of variance (ANOVA), factor analysis and more. Graphics
and tables such as scattergrams, line charts, pie charts, bar charts, histograms,
and percentiles are generally available to the user. Using these simple software
tools greatly simplifies the statistical analysis task.
III. Testing the Significance of the CER
After discussing the LSBF regression technique, the chapter next turns to evaluating the quality
of the CER. This answers the questions: How good is a CER equation and how good is the
CER likely to be for estimating the cost of specific items or services? What is the confidence
level of the estimate (i.e., how likely is the estimated cost to fall within a specified range of cost
outcomes)? Many analysts rely on two primary statistics to make this determination: the
coefficient of correlation (R) and the related coefficient of determination (R²). Both of these
measures simply indicate the degree of relatedness between the variables. Neither measure
indicates cause and effect. Cause and effect requires a check of logic and depends on the
acumen of the analyst.
There are a number of other statistics to evaluate to expand the knowledge and confidence in
the regression equation and the assurance of its forecasting capability. Figure 3-6 provides an
example of one possible list of items to examine when evaluating the quality of a CER. The
matrix categories are listed in order of precedence. The top portion of the matrix focuses on the
statistical validation of the CER or model, while the bottom portion of the matrix focuses on the
use of the CER or model in predicting future estimates. Figure 3-7 provides definitions of the
evaluation elements shown in the below matrix.
Figure 3-6: CER Quality Review Matrix
One caution is warranted when performing statistical analysis of a relationship. There is no one
statistic that disqualifies a CER or model, nor is there any one statistic that "validates" a CER or
model. The math modeling effort must be examined from a complete perspective, starting with
the data and logic of the relationship. For example, the matrix shown in Figure 3-6 requires an
analyst to provide a complete narrative explanation of the quality of the database and the logic
of the proposed model. Only after ensuring that the data and the logic of the relationship are
solid should the analyst begin evaluating the statistical quality of the model. Statistical
examination typically begins with an evaluation of the individual variables in the model. The t-stat for each explanatory variable is the most common method to evaluate the variable's
significance in the relationship. The next step is to assess the significance of the entire
equation. The F-stat is the most common statistic used to assess this quality of the entire
equation. Assuming the individual variable(s) and the entire equation have significance, the next
step is to judge the size and proportion of the equation’s estimating error. The standard error of
the estimate (SEE or SE) and coefficient of variation (CV) provide this insight. Finally, the typical
statistical analysis concludes with examining the value of the coefficient of determination (R²), or
Adjusted R² when comparing models with a different number of independent variables for each
model. The coefficient of determination measures the percentage of the variation in the
dependent variable explained by the independent variable(s).
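To make these measures concrete, the sketch below computes the standard error, coefficient of variation, and R² for the simple LSBF example fit earlier in the chapter (Yc = 0.78 + 1.70X). It uses the textbook formulas only; a full statistics package would also report the t-stats, F-stat, and p-values listed in the matrix.

# Sketch: basic quality statistics for the simple LSBF example.
# SE  = sqrt( sum(e^2) / (n - 2) )   (two estimated parameters: a and b)
# CV  = SE / mean(Y)
# R^2 = 1 - sum(e^2) / sum((Y - mean(Y))^2)
import math

x = [4, 11, 3, 9, 7, 2]
y = [10, 24, 8, 12, 9, 3]
a, b = 0.78125, 1.703125           # LSBF parameters from the earlier example
n = len(y)

yc = [a + b * xi for xi in x]
e2 = sum((yi - yci) ** 2 for yi, yci in zip(y, yc))
y_bar = sum(y) / n
tss = sum((yi - y_bar) ** 2 for yi in y)

se = math.sqrt(e2 / (n - 2))
cv = se / y_bar
r2 = 1 - e2 / tss
print(f"SE = {se:.2f}, CV = {cv:.1%}, R^2 = {r2:.2f}")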
The elements in the matrix below the double line focus on the geography of the data on which
the CER or model is built. Ideally, the analyst prefers a strong statistical model with a large
number of observations, using the fewest number of variables to formulate the equation. In
addition, the analyst would like to witness a small number of actual data points that the model
poorly predicts. Finally, a critical piece of any evaluation is to identify the range of the
independent values on which the model was built. Theoretically, the model is only valid over this
relevant range of the independent value data. In practice, use of the model is permissible
outside of this range so long as the hypothesized mathematical relationship remains valid. This
is likely to be only a small distance beyond the actual values of the data. The range of validity is a
judgment call and should rely on those knowledgeable in the element being estimated to help
establish the range over which the CER will provide reasonable predictions.
Because the LSBF model relies so heavily on the mean values for the dependent variable, the
matrix provides for recording the mean value of the dependent variable and its associated
statistics. The matrix allows the analyst to compare the statistics reported for the CER or model,
with the statistics of the mean of the dependent variable as a benchmark. Figure 3-7, on the
following page, provides a non-statistical interpretation of some of the statistics referred to in the
matrix. Appendix E provides several resources readers can use to obtain additional information.
There are no defined standards related to acceptable criteria for the various statistics. The
determination of acceptable criteria for a valid CER is based on discussions between the
contractor and its customers. There are no absolute thresholds. The analysis matrix; the
modeler’s data collection and normalization process; and the associated logic all form the basis
for accepting the CER as the basis for estimating. An example provided later in the chapter
uses a version of this matrix and analysis process. In order to keep reasonable statistical criteria
in the evaluation, the analyst must always ask: "If I reject this CER as the basis for estimating, is
the alternative method any better?"
• F-stat: Tests whether the entire equation, as a whole, is valid.
• t-stat: Tests whether the individual X-variable(s) is/are valid.
• Standard Error (SE): Average estimating error when using the equation as the
estimating rule.
• Coefficient of Variation (CV): SE divided by the mean of the Y-data; a relative
measure of estimating error.
• Coefficient of Determination (R²): Percent of the variation in the Y-data explained by
the X-data.
• Adjusted R²: R² adjusted for the number of X-variables used to explain the variation in
the Y-data.
• Degrees of Freedom (d.f.): Number of observations (N) less the number of estimated
parameters (# of X-variables + 1 for the constant term "a"). The concept of parsimony
applies in that a preferred model is one with high statistical significance using the least
number of variables.
• Outliers: Y-observations that the model predicts poorly. This is not always a valid
reason to discard the data.
• P-value: Probability level at which the statistical test would fail, suggesting the
relationship is not valid. P-values less than 0.10 are generally preferred (i.e., only a 10%
chance, or less, that the model is no good).
Figure 3-7: Non-statistical Interpretation of Statistical Indicators
IV. When to Use a CER
When a CER has been built from an assembled database based on a hypothesized logical
statistical relationship, and it meets acceptable evaluation standards, the CER is ready for
application. A CER may be used to forecast costs, or it may be used to cross check an estimate
developed using another estimating technique. For example, an analyst may have generated an
estimate using a grassroots approach (a detailed build up by hours and rates) and then used a
CER as a sanity check to test the reliability of the grassroots approach.
Generally, a CER built for a specific forecast may be used with far more confidence than a
generic CER. Care must be taken in using a generic CER when the characteristics of the
forecast universe are, or are likely to be, different from those reflected in the CER. Qualifying a
generic CER may be necessary to ensure that the database and the assumptions made for its
development are valid for the intended application. It may also be necessary to update the
database with data appropriate to the forecast.
When using a generic CER as a point-of-departure, the analyst may need to enhance or modify
the forecast in light of any other available supplementary information. This most likely will
involve several iterations before the final forecast is determined. It is important to carefully
document the iterations so that an audit trail exists explaining how the generic CER evolved to
become the final forecast. In order to apply good judgment in the use of CERs, the analyst
needs to be mindful of their strengths and weaknesses. Some of the more common strengths
and weaknesses are presented below:
A. Strengths
1. CERs can be excellent predictors when implemented
correctly, and they can be relied upon to produce quality
estimates when used appropriately.
2. Use of valid CERs can reduce proposal preparation,
evaluation, and negotiation costs, as well as cycle time, particularly in
regard to low-cost items that are time and cost intensive to
estimate using other techniques.
3. They are quick and easy to use. Given a CER equation and
the required input data, developing an estimate is a quick and
easy process.
4. A CER can be used with limited system information.
Consequently, CERs are especially useful in the research,
development, test and evaluation (RDT&E) phase of a
program.
B. Weaknesses
1. CERs are sometimes too simplistic to forecast costs. When
detailed information is available, the detail may be more
reliable for estimates than a CER.
2. Problems with the database may mean that a particular CER
should not be used. While the analyst developing a CER
should also validate both the CER and the database, it is the
responsibility of any user to validate the CER by reviewing the
source documentation. The user should read what the CER is
supposed to estimate, what data were used to build that CER,
how old the data are, and how they were normalized. Never use a
CER or cost model without reviewing its source documentation.
The next two sections of the chapter focus on the application of the CER
technique by providing examples from common CER applications to
applications by contractors who participated in the Parametric Estimating
Reinvention Laboratory.
V. Examples of CERs in Use
CERs reflect changes in prices or costs (in constant dollars) as some physical, performance or
other cost-driving parameter(s) changes. The same parameter(s) for a new item or service can
be input to the CER model and a new price or cost can be estimated. Such relationships may
be applied to a wide variety of items and services.
A. Construction
Many construction contractors use a rule of thumb that relates floor space to
building cost. Once a general structural design is determined, the contractor or
buyer can use this relationship to estimate total building price or cost, excluding
the cost of land. For example, when building a brick two-story house with a
basement, a builder may use $60/square foot (or whatever value is currently
reasonable for the application) to estimate the price of the house. Assume the
plans call for a 2,200 square foot home. The estimated build price, excluding
the price of the lot, would be: $132,000 ($60/sq. ft. x 2,200 sq. ft.).
B. Electronics
Manufacturers of certain electronic items have discovered that the cost of
completed items varies directly with the number of total electronic parts in the
item. Thus, the number of integrated circuits in a specific circuit
design may serve as an independent variable (cost driver) in a CER to predict
the cost of the completed item. Assume a CER analysis indicates that $57.00 is
required for set-up, plus an additional $1.10 per integrated circuit.
If evaluation of the drawing revealed that an item was designed to
contain 30 integrated circuits, substituting the 30 parts into the CER produces
the following estimated cost:
estimated item cost = $57.00 + ($1.10 per integrated circuit x number of integrated circuits)
                    = $57.00 + $1.10(30)
                    = $57.00 + $33.00
                    = $90.00
C. Weapons Procurement
In the purchase of an airplane, CERs are often used to estimate the cost of the
various parts of the aircraft. One item may be the price for a wing of a certain
type of airplane, such as a supersonic fighter. History may enable the analyst to
develop a CER relating wing surface area to cost. The analyst may find that
there is an estimated $40,000 of wing cost (for instance nonrecurring
engineering) not related to surface area, and another $1,000/square foot that is
related to surface area to build one wing. For a wing with 200 square feet of
surface area, we could estimate a price as:
estimated price = $40,000 + (200 sq. ft. x $1,000 per sq. ft.)
                = $40,000 + $200,000
                = $240,000
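Because these rule-of-thumb CERs are simple linear equations, they can be captured directly as small functions. The sketch below encodes the electronics and wing examples above; the dollar coefficients are the illustrative values from the text, not validated rates.

# Sketch: the example CERs above expressed as simple linear functions.
# Coefficients are the illustrative values used in the text.

def electronics_item_cost(num_integrated_circuits: int) -> float:
    """Estimated item cost = $57.00 set-up + $1.10 per integrated circuit."""
    return 57.00 + 1.10 * num_integrated_circuits

def wing_cost(surface_area_sq_ft: float) -> float:
    """Estimated wing cost = $40,000 fixed + $1,000 per square foot."""
    return 40_000 + 1_000 * surface_area_sq_ft

print(electronics_item_cost(30))   # 90.0
print(wing_cost(200))              # 240000.0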
VI. Examples of CER Development at Parametric Estimating Reinvention Laboratory Sites
Throughout this section, the description of the process used in CER development, data
requirements, validation, and documentation of simple CERs will rely on the experiences of
three Reinvention Laboratory sites. These sites included Boeing Aircraft & Missiles Systems (St.
Louis, MO.), Boeing Aircraft & Missiles Systems (Mesa, AZ.), and Lockheed Martin Tactical
Aircraft Engines (Ft. Worth, TX). Figure 3-8 provides examples of simple CERs implemented by
these lab sites.
CER Title: Panstock Material
Pool Description: Allocated panstock dollars charged.
Base Description: Manufacturing assembly "touch" direct labor hours charged.
Application: Panstock is piece-part materials consumed in the manufacturing
assembly organization. The panstock CER is applied to 100% of estimated direct
labor hours for manufacturing assembly effort.

CER Title: F/A-18 Software Design Support
Pool Description: Allocated effort required to perform software tool development
and support for computer & software engineering.
Base Description: Computer and software engineering direct labor hours charged.
Application: F/A-18 computer and software engineering support direct labor hours
estimated for tool development.

CER Title: Design Hours
Pool Description: Design engineering, including analysis and drafting, direct
labor hours charged.
Base Description: Number of design drawings associated with the pool direct
labor hours.
Application: The design hours per drawing CER is applied to the engineering tree
(an estimate of the drawings required for the proposed work).

CER Title: Systems Engineering
Pool Description: Systems engineering (including requirements analysis and
specification development) direct labor hours charged.
Base Description: Design engineering direct labor hours charged.
Application: The system engineering CER is applied to the estimated design
engineering direct labor hours.

CER Title: Tooling Material
Pool Description: Nonrecurring, in-house tooling raw material dollar costs charged.
Base Description: Tooling nonrecurring direct labor hours charged.
Application: The tooling material CER is applied to the estimated nonrecurring
tooling direct labor hours.

CER Title: Test/Equipment Material (dollars for avionics)
Pool Description: Material dollars (<$10k).
Base Description: Total avionics engineering procurement support group direct
labor hours charged.
Application: The test/equipment material dollars CER is applied to the estimated
avionics engineering procurement support group direct labor hours.
Figure 3-8: Examples of Simple CERs
A. Developing Simple CERs
For CERs to be valid, they must be developed and tested using the principles
previously discussed. Analysts rely on many forms of CERs in developing
estimates, and employ the use of CERs throughout the various phases of the
acquisition cycle. The value of a CER depends on the soundness of the
database from which it was developed, and the appropriateness of its
application in the next estimating task. Determination of the "goodness" of a
CER, and its applicability to the system being estimated, requires that the cost
analyst thoroughly understand both the CER and the product being estimated.
As a tool, CERs are analytical equations that relate various cost categories
(either in dollars or physical units) to cost drivers. In mathematical terms, the
cost drivers act as an equation’s explanatory variables. CERs can take
numerous forms ranging from informal "Rule-of-Thumb" or simple analogies to
formal mathematical functions derived from statistical analysis of empirical data.
When developing a CER, the analyst should focus on assembling and refining
the data that constitute the empirical basis for the CER.
1) Data Collection/Analysis
Sometimes, when assembling a database, the analyst
discovers the raw data are in the wrong format for analytical
purposes, or the data display irregularities and inconsistencies.
Therefore, adjustments to the raw data usually need to be
made to ensure a reasonably consistent and comparable
database. Not even the use of advanced mathematical
modeling techniques can overcome or compensate for a
seriously deficient database.
Typically, a considerable amount of time is devoted to
collecting data, normalizing (adjusting) the data to ensure
consistency and comparability, and providing proper
information storage so it can be rapidly retrieved. Indeed, more
effort is typically devoted to assembling a quality database than
to any other task in the process. When enough relevant data
has been collected, the analytical task of deriving CER
equations is often relatively easy. Data normalization is
essential for ensuring consistency and comparability. Chapter 2
discusses data collection and analysis activities in further
detail. As a general rule, normalizing data typically addresses
the following issues:
• Type of effort – such as non-recurring versus recurring, development
versus change proposals, and weapon systems versus ground support
equipment.
• Time frame – such as number of months/year to cover the period of
performance, and total cumulative data from inception to completion.
• Measurable milestones to collect data – such as first flight, drawing
release, program completion, and system compliance test completion.
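As a minimal, hedged illustration of one common normalization step described above, the sketch below adjusts historical then-year costs to constant-year dollars. The cost figures and inflation indices are hypothetical; actual adjustments would follow the organization's documented escalation indices and the data collection guidance in Chapter 2.

# Sketch: normalizing historical costs to constant-year (FY2000) dollars.
# The cost figures and inflation indices below are hypothetical.
raw_costs = {1996: 118_000, 1997: 131_000, 1998: 127_000, 1999: 140_000}
index     = {1996: 0.902, 1997: 0.925, 1998: 0.948, 1999: 0.974, 2000: 1.000}

base_year = 2000
normalized = {yr: cost * index[base_year] / index[yr]
              for yr, cost in raw_costs.items()}
for yr, cost in normalized.items():
    print(f"FY{yr}: {raw_costs[yr]:>9,} then-year -> {cost:>11,.0f} FY{base_year}$")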
2) Validation Requirements
CERs, like any other parametric estimating methodology, are of
value only if they can demonstrate, with some level of
confidence, that they produce results within an acceptable
range of accuracy. The CERs must also demonstrate reliability
for an acceptable number of trials, and they should be
representative of the database domain for which they are
applied. A process that adequately assures that the CERs and
estimating methodology meet these requirements is called
validation. Since both the developer and the customer must, at
some point, agree on the validation criteria, the Reinvention
Laboratory demonstrated that the use of Integrated Product
Teams (IPT) is a best practice for implementing CERs.
Preferably, IPTs should consist of members from the
contractor, buying activity, Defense Contract Management
Command (DCMC), and Defense Contract Audit Agency
(DCAA). Chapter 8 provides guidance relative to establishing
an implementation team.
One of the Parametric Estimating Reinvention Laboratory
teams developed a validation process flow, illustrated in Figure
3-9. This process is an adaptation of the testing methodology
described earlier in this chapter. The process depicted in
Figure 3-9 and described in Figure 3-10, is a formal company
procedure to develop and implement CERs. The company
developed this methodology with its customer, local DCMC,
and local DCAA. This procedure describes the activities and
criteria for validating Simple CERs, Complex CERs, and
models. Figure 3-11 provides the team's guidelines for
statistical validation criteria and is an adaptation of the CER
analysis matrix discussed earlier in this chapter. In this
example, an IPT was formed and officially designated the "Joint
Estimating Relationship Oversight Panel" (JEROP). Figure 3-12
provides a description of the JEROP membership. The JEROP
manages the processes associated with implementing,
maintaining, and documenting CERs.
Figure 3-9: ER Validation Process
Figure 3-10: Discussion of Activities
Figure 3-11: Summary of ER Report Card Criteria
JEROP Membership
Developer (Company Personnel)
• Group Manager-Estimating
• Principal Specialist-Estimating
• Manager-Contracts & Pricing-Spares
• Sr. Specialist-Accounting
DCAA
• Supervisory Auditor
DCMC
• Industrial Engineer
• Contract Price Analysts
In this case, the customer was not a full-time member of the IPT. However, the customer
provided feedback to the IPT on a routine basis.
Figure 3-12: JEROP Membership
It is important to note that in establishing this process, the IPT
uses the report card criteria as a starting point to evaluate the
strength of the CERs. The IPT does not use the statistical tests
as its sole criterion for accepting or rejecting the CERs. An
equally important factor in their assessment of the quality of the
CER is the non-quantitative information such as the materiality
of the effort and the quality of possible alternative estimating
methods. Their experience demonstrated that while statistical
analysis is a useful tool, it should not be the sole criterion for
accepting or rejecting CERs. The highest priority is determining
that the data relationships are logical; the data used are
credible; and adequate policies and procedures have been
established to ensure CERs are implemented, used, and
maintained appropriately.
3) Documentation
When implementing CERs, a company should develop
standard formats for documentation. Consistency in
documentation provides a clear understanding of how to apply
and maintain the CER. The documentation should evolve
during the development process. During each stage of
development, the team should maintain documentation on a
variety of items. This should include at minimum, all necessary
information for a third party to recreate or validate the CER.
This documentation should include:
• An adequate explanation of the effort to be estimated by the CER.
• Identification and explanation of the base, including rationale for the
base chosen when appropriate.
• Calculation and description of effort (hours, dollars, etc.) in the pool
and base.
• Application information.
• Complete actual cost information for all accounting data used. This
provides an audit trail that is necessary to adequately identify the
data used.
• Noncost information (technical data).
B. Lessons Learned from CER Implementation
Simple CERs are, by their very nature, straightforward in their logic and
application. Lessons learned from the accomplishment of the Parametric
Estimating Reinvention Laboratory demonstrated that IPTs are a best practice
for implementing broad use of CER-based parametric estimating techniques.
Perhaps one of the most valuable accomplishments of the Reinvention
Laboratory teams was the successful partnership established between the
contractor, customer, DCMC, and DCAA at each of the lab sites. Figure 3-13
summarizes the lessons learned from the IPTs that implemented CERs.
• Cultural Change – It takes time and effort to work together openly in an IPT
environment. It may take a while to build trust if the existing climate does not encourage
a collaborative environment with common goals.
• Empowering the IPTs – Team members should be empowered to make decisions.
Therefore, the teams should include people with decision-making authority.
• Joint Training – All team members should participate in training sessions together.
Joint IPT training provides a common understanding of terminology and techniques,
and it facilitates team-building.
• Strong Moderating – Teams should meet regularly and focus on the most significant
issues. This may require using a facilitator with strong moderating skills.
• Management Support – Without total commitment from management, IPTs may
question the value of their efforts. Management should provide support in terms of
resources, consultation, interest in the progress, resolution of stalemates, and feedback
through communication.
Figure 3-13: PCEI Lessons Learned
VII. Evaluating CERs
A. Government Evaluation Criteria
Contractors should implement the use of CERs as part of their estimating
system. Chapter 7, Regulatory Compliance, discusses estimating system
requirements in detail. In general, Government evaluators will focus on
evaluating and monitoring CERs to ensure they are reliable and credible cost
predictors. Specific Government evaluation criteria are discussed in Chapter 9,
Auditing Parametrics, and Chapter 10, Technical Evaluations of Parametrics.
This section provides a general overview of CER evaluation procedures that
can be used by anyone. Such evaluations generally include:
• Determining if the data relationships are logical;
• Verifying that the data used are adequate;
• Performing analytical tests to determine if strong data relationships exist; and
• Ensuring CERs are used consistently with established policies and procedures,
and that they comply with all Government procurement regulations.
B. Logical Data Relationships
CER development and implementation requires the use of analytical
techniques. When analyzing CERs, evaluators should be concerned with
ensuring that the data relationships used are logical. Potential cost drivers can
be identified through a number of sources, such as personal experience,
experience of others, or published sources of information.
As an example, during the Parametric Estimating Reinvention Laboratory, one
of the IPTs developed a process for identifying possible cost drivers. Using
brainstorming techniques, the IPT identified several alternatives for potential
cost drivers. The team then surveyed several experts to obtain their feedback
on the merits of each potential cost driver. Figure 3-14 contains an example of
their survey mechanism.
Figure 3-14: Sample Survey
Using this survey process, the IPT was able to identify the best cost driver
candidates for further analysis. Key questions the IPT considered in making its
determination, which should be important to evaluators, are:
• Does the CER appear logical (e.g., will the cost driver have a significant impact
on the cost of the item being estimated)?
• Does it appear the cost driver(s) will be a good predictor of cost?
• How accessible are the data (both cost and noncost data)?
• How much will it cost to obtain the necessary data (if not currently available)?
• How much will it cost to obtain the data in the future?
• Will there be a sufficient number of data points to implement and test the CER(s)?
• Have all potential cost drivers been considered?
• Were any outliers excluded, and if so, what was the rationale?
C. Credible Data
Contractors should use historical data whenever appropriate. As described in
Chapter 2, Data Collection and Analysis, parametric techniques generally
require the use of cost data, technical data, and programmatic data. Once
collected, a contractor will normalize the data so it is consistent. Through
normalization, data are adjusted to account for effects such as inflation, scope
of work, and anomalies. All data, including any adjustments made, should be
thoroughly documented by a contractor so a complete trail is established for
verification purposes.
All data used to support parametric estimates should be accurate and traceable
back to the source documentation. Evaluators should verify the integrity of the
data collected. This means an evaluator may want to verify selected data back
to the originating source. The evaluator may also want to evaluate adjustments
made as a result of data normalization to ensure all assumptions made by the
contractor are logical and reasonable.
Some key questions an evaluator may ask during the review of data collection
and normalization processes are:
• Are sufficient data available to adequately develop parametric techniques?
• Has the contractor established a methodology to obtain, on a routine basis,
relevant data on completed projects?
• Are cost, technical, and program data collected in a consistent format?
• Are procedures established to identify and examine any data anomalies?
D. Strength of Data Relationships
After determining data relationships are logical and the data used are credible,
the evaluation should next assess the strength of the relationships between the
cost being estimated and the independent cost driver(s). These relationships
can be tested through a number of quantitative techniques, such as simple ratio
analysis, ANOVA, and statistical analysis. The evaluation should consider the
associated risk of the cost and the number of data points available for testing
data relationships. Often, there are not a lot of current data available and
statistical techniques may not be the best quantitative method to use. This
would be the case when a company, out of convenience, establishes simple
factors, or ratios, based on prior program experience to estimate items of an
insignificant amount. Such factors would not lend themselves to regression
techniques, but could be verified using other analytical procedures, such as
comparisons to prior estimates. However, when there are sufficient data
available, and when the cost to be estimated is significant, statistical analysis is
a useful tool in evaluating the strength of CERs. When statistical analysis is
performed, refer to the matrix provided in Figure 3-6 as a method for evaluation.
E. CER Validation
CER validation is the process, or act, of demonstrating the technique’s ability to
function as a credible estimating tool. Validation includes ensuring contractors
have effective policies and procedures; data used are credible; CERs are
logical; and CER relationships are strong. Evaluators should test CERs to
determine if they can predict costs within a reasonable degree of accuracy. The
evaluators must use good judgment when establishing an acceptable range for
accuracy. Generally, CERs should estimate costs as accurately as other
estimating methods (e.g., bottoms-up estimates). This means when evaluating
the accuracy of CERs to predict costs, assessing the accuracy of the prior
estimating method is a key activity.
CER validation is an on-going process. The evaluation should determine
whether contractors using CERs on a routine basis have a proper monitoring
process established to ensure CERs remain reliable. A best practice is to
establish ranges of acceptability, or bands, to monitor the CERs. If problems
are identified during monitoring, contractors should have procedures in place to
perform further analysis activities. In addition, when a contractor expects to use
CERs repeatedly, the use of Forward Pricing Rate Agreements (FPRAs) should
be considered. FPRAs are discussed in Chapter 7, Regulatory Compliance.
F. Summary of CER Evaluation
CER analysis also requires addressing the questions:
• What is the proportion of the estimate directly affected by CERs?
• How much precision is appropriate to the estimate in total and to the part
affected by the CERs?
• Is there a rational relationship between the individual CER affected variables
and the underlying variables?
• Is the pattern of relationship functional or purely statistical?
• If functional, what is the functional relationship? And why?
• If statistical, is the history of the relationship extensive enough to provide the
needed confidence that it operates in the given case?
• Is the pattern of relationship statistically significant? And at what level of
confidence?
• What is the impact on the estimate of using reasonable variations of the CERs?
VIII. Conclusion
This chapter has presented the concept of CERs and the statistical underpinnings of CER
development and application. Basic mathematical relationships were described and examples
showing the use of CERs were also presented. The next chapter builds on this knowledge by
discussing the development and application of company-developed (proprietary) models.
Typically these models organize and relate organization specific CERs into an estimating
model.
1. Gujarati, Damodar. Basic Econometrics. New York: McGraw-Hill Book Company, 1978, p. 191.