Automation in Construction 19 (2010) 619–629
Estimate at Completion for construction projects using Evolutionary Support Vector
Machine Inference Model
Min-Yuan Cheng a, Hsien-Sheng Peng b,⁎, Yu-Wei Wu a, Te-Lin Chen a
a Department of Construction Engineering, National Taiwan University of Science and Technology, Taiwan
b Ecological and Hazard Mitigation Engineering Research Center, National Taiwan University of Science and Technology, Taiwan
Article info
Article history:
Accepted 24 February 2010
Keywords:
Estimate at Completion
Fast Messy Genetic Algorithms
Support Vector Machine
Abstract
Construction projects are influenced by a range of factors that impact upon final project cost. Estimate at
Completion (EAC) is an important approach used to estimate final project cost, which takes into
consideration probable project performance and risks. EAC helps project managers identify potential but
still unknown problems and adopt response strategies. This study constructed an evolutionary EAC model to
generate project cost estimates that proved significantly more reliable than estimates achievable using
currently prevailing formulae. The developed learning model fused two artificial intelligence approaches,
namely the fast messy genetic algorithm (fmGA) and Support Vector Machine (SVM), to create an
Evolutionary Support Vector Machine Inference Model (ESIM). The ESIM was then applied to estimate final
project costs for historical cases. Finally, using the EAC estimate, project cost influence indices, and project
cost diagrams, the discrepancy between estimate and practical values was examined to determine potential
problems in order to help project managers better control project costs. The learning results were validated
in real applications that showed good performance for training models. Providing project managers with reliable EAC trend estimates helps them control project costs effectively and take appropriate preemptive measures to handle potential problems.
© 2010 Elsevier B.V. All rights reserved.
1. Introduction
Engineering projects face myriad uncertainties attributable to
chosen construction method as well as environmental and process
factors [1,2]. Construction firms typically focus only on budget
planning during the initial project stage, which ignores engineering
cost changes, information updates and cost management during
construction and, in turn, prevents effective project cost control and
the identification of potential problems. Cost overruns are frequently
not discovered until later project stages, at which time it is typically too
late to take effective remedial measures.
Effective project management requires that plans be constantly
revised in accordance with actual project conditions. However, factors
that influence project cost are numerous, and it is difficult to consider
individually each factor of influence at each stage. Moreover, data on
construction cost are manifold and variability is great. In order to
update cost information item by item in a timely manner, management
must adopt an efficient approach to the issue and invest significant
time.
⁎ Corresponding author. #43, Sec. 4, Keelung Rd., Taipei, 106, Taiwan. Tel.: +886 2
27301212; fax: +886 2 27301074.
E-mail address: hspeng@mail.ntust.edu.tw (H.-S. Peng).
doi:10.1016/j.autcon.2010.02.008
The Estimate at Completion (EAC) is a quick and automatic formula
used by managers to assess the cost of work to complete schedule
activities [1]. Considerable research has already been done in this area, using a variety of methodologies.
Barraza et al. [3] used the concept of stochastic S curves (SS curves) to
determine forecasted project estimates as an alternative to using
deterministic S curves and traditional forecasting methods. A simulation
approach is used for generating the stochastic S curves, and it is based on
the defined variability in duration and cost of the individual activities
within the process. Lee [4] introduced a software tool, Stochastic Project Scheduling Simulation (SPSS), developed to measure the probability of completing a project within a time specified by the user. The SPSS finds the longest path in a network, runs the network a user-specified number of times, and calculates the stochastic probability of completing the project in the specified time. Lee and Arditi [5] described
a stochastic simulation-based scheduling system (S3) that integrates
the deterministic critical path method (CPM), the probabilistic program
evaluation and review technique (PERT), and the stochastic discrete
event simulation (DES) approaches into a single system. The system is
based on an earlier version of the system called Stochastic Project
Scheduling Simulation and makes use of all the capabilities of this
system. Kim and Reinschmidt [6] introduced a new probabilistic
forecasting method for schedule performance control and risk management of ongoing projects. The Bayesian beta S-curve method (BBM) is based on Bayesian inference and the beta distribution. The BBM provides confidence bounds on predictions, which can be used to determine the range of potential outcomes and the probability of success.
Fig. 1. ESIM framework.
Earned Value Management (EVM) is one of the theoretical methods
for determining EAC that was originally developed for cost management
and later adopted to forecast project duration. Kim et al. [7] indicated
that EVM is gaining wider acceptance due to increasing recognition of its
ability both to diminish EVM problems and improve utilities. A broader
approach was developed that considers the four-factor groups (i.e. EVM
users, EVM methodology, project environment, implementation process) together to improve significantly the acceptance and performance
of EVM in different types of organizations and projects. Cioffi [8]
presented a new formula and corresponding notation for earned value
analysis that makes earned value calculations more transparent and
flexible. Vandevoorde and Vanhoucke [9] compare the classic earned
value performance indicators SV and SPI with the newly developed
earned schedule performance indicators SV(t) and SPI(t), and then
present a generic schedule forecasting formula applicable in different
project situations and compare the three methods from literature to
forecast total project duration. Finally, the use of each method was
illustrated on a simple one activity example project and on real-life
project data. Vitner et al. [10] investigated the possibility of using the
data envelope analysis (DEA) approach to evaluate project performances in a multi-project environment, which evaluates projects using
earned value management system (EVMS) and multidimensional
control system (MPCS) methods. Vanhoucke and Vandevoorde [11]
extensively reviewed and evaluated earned value (EV)-based methods
to forecast the total project duration, and investigated the potential of a
recently developed method, the earned schedule method, which
improves the connection between EV metrics and the project duration
forecasts. Lipke et al. [12] provided a method to improve the capability of
project managers to make informed decisions by providing a reliable
method to forecast final cost and duration. Their method and its
evaluation made use of a well established project management method,
a recent technique developed to analyze schedule performance, and
statistical mathematics to develop EVM, earned schedule (ES) and
statistical prediction and testing methods. Plaza and Turetken [13] proposed an extended version of EVM (EVM/LC) that addresses the effect of learning on the performance of project teams. A spreadsheet-based decision support tool that automates calculations and analyses was presented in EVM/LC. Leu and Lin [14] attempted to refine and improve the performance of traditional EVM by introducing statistical control chart techniques. Individual control charts are used as tools to monitor project performance data in order to detect adverse changes in a timely manner.
Table 1
Influencing factors of construction cost.
Classification | Influencing factor | Index | Definition
Time now | Construction duration | Construction progress percentage | Duration to date/revised contract duration
Time now | Actual cost | ACP | Actual cost/budget at completion
Time now | Planned cost | EVP | Earned value/budget at completion
Construction management | Cost management | CPI | Earned value/actual cost
Construction management | Time management | SPI | Earned value/planned value
Construction management | Subcontractor management | Subcontractor billed index | Subcontractor billed amount/actual cost
Contract scope | Contract payment | Owner billed index | Owner billed amount/earned value
Contract scope | Change order | Change order index | Revised contract amount/budget at completion
External environment | Construction price fluctuation | CCI | Construction material price index of that month/construction material price index of initial stage
External environment | No. of rainy days | Climate effect index | (Revised project duration − no. of rainy days)/revised project duration
Fig. 2. EAC prediction model.
Table 2
Historical cases.
Project name | Total area (m2) | Underground floors | Ground floors | Buildings | Start date | Finish date | Duration (days) | Contract amount (NTD) | ESIM prediction periods | Note
A | 12,622 | 2 | 9 | 1 | 2003/12/1 | 2005/8/22 | 630 | 289,992,000 | 29 |
B | 4919 | 3 | 11 | 1 | 2003/12/13 | 2005/11/10 | 698 | 149,300,000 | 24 |
C | 19,205 | 5 | 8 | 1 | 2000/5/20 | 2002/5/19 | 729 | 332,800,000 | 20 |
D | 5358 | 3 | 9 | 1 | 2000/11/15 | 2002/11/14 | 729 | 199,600,000 | 25 |
E | 27,468 | 2 | 11 | 3 | 1999/12/16 | 2001/12/3 | 718 | 1,142,148,388 | 26 |
F | 31,797 | 2 | 9 | 4 | 2001/7/4 | 2003/3/31 | 635 | 530,000,000 | 20 |
G | 7707 | 2 | 14 | 1 | 2001/11/24 | 2003/10/20 | 695 | 153,500,000 | 22 |
H | 10,087 | 3 | 14 | 1 | 2002/6/18 | 2004/7/6 | 749 | 216,000,000 | 27 |
I | 3479 | 1 | 10 | 1 | 2003/6/2 | 2004/9/30 | 486 | 85,714,286 | 18 |
J | 6352 | 4 | 11 | 1 | 2004/3/5 | 2006/2/18 | 715 | 202,241,810 | 31 |
K | 4774 | 2 | 11 | 1 | 2004/2/21 | 2006/2/20 | 730 | 145,377,589 | 27 |
L | 7289 | 2 | 8 | 1 | 2005/6/15 | 2006/9/15 | 457 | 190,844,707 | 20 | Testing case
M | 3094 | 2 | 7 | 1 | 2005/10/1 | 2007/2/28 | 515 | 102,500,000 | 17 | Testing case
Total | | | | | | | | | 306 |
Subtotal (training cases) | | | | | | | | | 269 |
Subtotal (testing cases) | | | | | | | | | 37 |
At least eight common methods have already been put forward that use EVM to predict the EAC for construction projects [15,16]. Each has been applied to different special projects and achieved differing EAC error rates. When applied to different special projects, the predictions achieved
by the single method are extremely accurate for some and present
obvious errors for others. This has created confusion in the industry as to
which kind of prediction method should be chosen for particular project
types. Another issue with EVM is that it must be applied to each distinct
construction project process, with revisions conducted manually. Such
makes EVM both complicated and time consuming. Consequently,
computerization of the engineering management process is critical if
EVM is to be applied effectively to control construction costs. However,
most construction firms in Taiwan use computer systems powerful
enough only to analyze initial stage budgets. Systems are not equipped to
react to changes at each construction stage or use the EVM method to
predict construction project EAC.
Table 3
Input variables.
Training set name
and no
Construction progress
percentage
ACP
EVP
CPI
SPI
Subcontractor
billed index
Owner billed
index
Change order
index
CCI
Climate effect
index
Project-A-1
Project-A-2
Project-A-3
Project-A-4
Project-A-5
Project-A-6
Project-A-7
Project-A-8
Project-A-9
Project-A-10
Project-A-11
Project-A-12
Project-A-13
Project-A-14
Project-A-15
Project-A-16
Project-A-17
Project-A-18
Project-A-19
Project-A-20
Project-A-21
Project-A-22
Project-A-23
Project-A-24
Project-A-25
Project-A-26
Project-A-27
Project-A-28
Project-A-29
Project-B-1
Project-B-2
Project-B-3
Project-M-15
Project-M-16
Project-M-17
3.8%
7.9%
11.7%
15.8%
19.9%
23.8%
27.9%
31.8%
35.9%
40.0%
44.0%
48.0%
52.0%
56.1%
59.9%
67.9%
71.8%
75.9%
79.9%
84.0%
88.0%
92.0%
96.1%
100.0%
104.1%
107.9%
111.8%
115.9%
119.9%
2.4%
6.9%
0.0%
1.8%
3.4%
4.9%
9.0%
12.1%
13.8%
14.5%
18.2%
22.8%
26.9%
32.0%
37.4%
40.3%
42.1%
47.4%
53.1%
62.9%
71.2%
83.5%
92.0%
94.1%
97.1%
98.9%
100.2%
100.4%
100.6%
104.1%
108.7%
0.0%
2.5%
0.0%
3.6%
7.7%
10.1%
11.4%
12.4%
13.7%
15.2%
16.7%
16.7%
23.0%
30.3%
32.9%
39.1%
43.0%
57.0%
66.8%
76.1%
85.8%
97.3%
98.2%
98.5%
100.2%
100.2%
101.4%
101.4%
101.4%
101.4%
101.4%
0.0%
0.0%
1.00
2.01
2.24
2.05
1.27
1.02
1.00
1.04
0.92
0.73
0.86
0.95
0.88
0.97
1.02
1.20
1.26
1.21
1.20
1.16
1.07
1.05
1.03
1.01
1.01
1.01
1.01
0.97
0.93
1.00
1.00
1.00
0.69
0.89
0.99
0.99
1.00
0.99
1.00
1.02
1.02
0.93
1.02
0.94
0.95
0.95
0.94
0.95
0.96
0.97
0.99
0.99
0.99
0.99
0.99
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.67
1.33
0.73
1.07
1.06
1.14
0.91
0.68
0.64
0.71
0.69
0.74
0.81
0.84
0.86
0.87
0.88
0.97
0.88
0.95
0.92
0.91
1.01
1.01
1.01
0.97
0.93
1.00
1.16
1.00
1.00
0.75
0.65
0.57
1.05
1.06
1.09
0.99
0.93
0.75
0.75
0.78
0.77
0.79
0.70
0.68
0.72
0.73
0.83
0.82
0.91
0.89
0.89
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.00
1.01
1.01
1.01
1.01
1.01
1.01
1.01
1.00
1.00
1.00
1.03
1.09
1.12
1.11
1.10
1.10
1.12
1.13
1.13
1.14
1.13
1.12
1.12
1.12
1.13
1.12
1.11
1.11
1.11
1.12
1.12
1.12
1.12
1.12
1.13
1.14
1.16
1.20
1.00
1.03
1.00
0.99
0.98
0.97
0.96
0.95
0.95
0.94
0.92
0.91
0.90
0.90
0.90
0.89
0.88
0.87
0.86
0.84
0.83
0.82
0.81
0.80
0.80
0.79
0.79
0.79
0.78
0.76
0.75
1.00
0.99
105.8%
111.8%
84.4%
92.5%
99.9%
95.9%
1.18
1.04
1.00
1.00
1.08
1.04
0.92
1.00
1.00
0.96
1.13
1.13
0.76
0.75
Fig. 4. Predicted output and actual output of estimate to completion for case L.
Fig. 3. Predicted output and actual output of training data.
In light of the above, the development of a fast and effective system that considers the uncertainty and myriad problems involved in cost control over the course of a project while predicting construction project EAC is an important issue to be resolved. This research aims to resolve questions encountered in project cost management by collecting relevant papers in the literature and historical cases in order to identify the set of factors that significantly influence project cost. A project cost flow trend was then set up
using historical case data. The relationship between monthly costs
and project EAC was mapped based on past knowledge and
experience. An Evolutionary Support Vector Machine Inference
Model (ESIM) was established using historical case experience to
predict and control EAC variation tendency of the project during
construction. Finally, indices of project cost and project cost diagrams
were employed to identify the reason for the difference between
actual and ESIM-predicted values so that potential problems can be
found and effective control and management actions be implemented
at proper points in time.
2. Evolutionary Support Vector Machine Inference Model (ESIM)
2.1. Support Vector Machine (SVM)
Support Vector Machine (SVM) is a machine learning technique popularized in recent years. It is based on the statistical learning theory described by Vapnik [17]. Traditional training techniques usually focus on minimizing empirical risk, i.e., minimizing the classification error on the training data. SVM instead aims to minimize structural risk, i.e., a probable upper bound on the classification error [18]. This technique thus effectively minimizes the upper bound of theoretical error.
Data classification and regression, two critical components of
computer science, are being used in increasingly broad and general
applications. Traditional classification methods include neural networks
(NNs), decision trees and nearest neighbour method, among others.
SVM is a new method that has already proved its value through good
results in many applications. SVM has relatively firmer theoretical
foundations than NNs. Support Vector Classification (SVC) is founded on the principle of structural risk minimization. An important advantage of SVC is its ability to handle linearly inseparable problems. SVC trains on existing data and, by analyzing the training data, selects several support vectors to represent the whole data set. Some extreme values are eliminated in advance. Finally, the selected support vectors are packed into a model, and SVC is used to carry out classification on testing data. The concept of Support Vector Regression (SVR) is similar to that of SVC. It maps regression problems from a low-dimensional to a high-dimensional vector space and identifies the support vectors from which an appropriate linear regression equation can be obtained.
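The mapping to a high-dimensional space described above is usually carried out implicitly through a kernel function. The following is a minimal sketch, assuming a radial basis function (RBF) kernel with parameter γ; the paper does not name its kernel, and the support vectors, coefficients and bias below are hypothetical hand-picked values, not fitted results.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2) (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def svr_predict(x, support_vectors, coefficients, bias, gamma):
    """Kernel expansion f(x) = sum_i coef_i * K(sv_i, x) + bias over the support vectors."""
    return bias + sum(c * rbf_kernel(sv, x, gamma)
                      for sv, c in zip(support_vectors, coefficients))

# Hypothetical values purely for illustration (not taken from the paper).
support_vectors = [(0.0, 0.0), (1.0, 1.0)]
coefficients = [0.5, -0.25]
prediction = svr_predict((0.0, 0.0), support_vectors, coefficients,
                         bias=0.1, gamma=0.5)
```

The regression equation is linear in the kernel values, which is why only the support vectors need to be stored in the trained model.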
2.2. Fast Messy Genetic Algorithms (fmGA)
The Simple Genetic Algorithm (sGA), an efficient and accurate
algorithm, was first developed by Holland in 1975. Goldberg et al.
subsequently developed the Messy Genetic Algorithm (mGA) in 1989
in order to improve sGA shortcomings. Several experiments [17,19]
have since shown the mGA much better at solving permutation
problems than sGA. In 1993, Goldberg established the Fast Messy
Genetic Algorithm (fmGA) to reduce the high memory consumption
of operation processes [20].
The mGA resolved the problem that sGA does not consider logical
limitations amongst gene bunches during the optimization process.
There are four main differences in solving mechanisms between fmGA
and sGA [21,22]. The first is that chromosomes of variable length could
be adopted in fmGA. Secondly, simple cut and splice are used to replace
the sGA operator mechanism. Thirdly, the optimization process includes
a primordial and juxtapositional stage. Lastly, competitive templates are
adopted to retain the most outstanding gene building blocks in each
generation [23].
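The cut and splice operators on variable-length chromosomes mentioned above can be sketched as follows. This is an illustrative simplification, assuming genes stored as (locus, allele) pairs and a uniformly random cut point; the function names are hypothetical and the primordial stage, juxtapositional stage and competitive templates are not modelled.

```python
import random

def cut(chromosome, rng):
    """Cut a variable-length chromosome at a random interior point into two pieces."""
    if len(chromosome) < 2:
        return chromosome, []
    point = rng.randint(1, len(chromosome) - 1)
    return chromosome[:point], chromosome[point:]

def splice(left, right):
    """Splice two pieces end to end; the child's length can differ from both parents."""
    return left + right

rng = random.Random(42)
# Each gene is a (locus, allele) pair, so a string may under- or over-specify the problem.
parent_a = [(0, 1), (2, 0), (1, 1), (3, 0)]
parent_b = [(1, 0), (3, 1)]
a_head, a_tail = cut(parent_a, rng)
b_head, b_tail = cut(parent_b, rng)
child_1 = splice(a_head, b_tail)
child_2 = splice(b_head, a_tail)
```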
2.3. ESIM framework
The Evolutionary Support Vector Machine Inference Model (ESIM)
fuses SVM and fmGA [20,24,25]. In this model, SVM is used to capture the complicated relationship between input parameters and output parameters, while fmGA searches for the best parameters (C and γ) needed by SVM in order to improve SVM prediction accuracy. The framework of ESIM is shown in Fig. 1.
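The division of labour between the two components can be sketched as an outer search over (C, γ) wrapped around an inner model evaluation. The sketch below is not fmGA: it substitutes a simple mutation-based search, and `toy_error` is a hypothetical stand-in for the SVM training error, chosen only so that the example runs without an SVM library. Fitness is taken as the reciprocal of the objective, and the search starts from C = 1 and γ = 1/M.

```python
import math
import random

def fitness(objective_value):
    """Fitness is the reciprocal of the objective (an error measure), so larger is better."""
    return 1.0 / objective_value

def search_parameters(objective, n_inputs, generations=100, candidates=10, seed=0):
    """Search (C, gamma) starting from the defaults C = 1 and gamma = 1 / n_inputs."""
    rng = random.Random(seed)
    best = (1.0, 1.0 / n_inputs)
    best_fitness = fitness(objective(*best))
    for _ in range(generations):
        for _ in range(candidates):
            # Multiplicative log-normal mutation keeps both parameters positive.
            c = best[0] * math.exp(rng.gauss(0.0, 0.3))
            gamma = best[1] * math.exp(rng.gauss(0.0, 0.3))
            candidate_fitness = fitness(objective(c, gamma))
            if candidate_fitness > best_fitness:
                best, best_fitness = (c, gamma), candidate_fitness
    return best, best_fitness

def toy_error(c, gamma):
    """Hypothetical stand-in for the SVM error; smallest near C = 20, gamma = 0.1."""
    return 0.05 + math.log(c / 20.0) ** 2 + math.log(gamma / 0.1) ** 2

(best_c, best_gamma), best_fit = search_parameters(toy_error, n_inputs=10)
```

In a real setting the objective would train an SVM with the candidate parameters and return its prediction error, and the loop would stop on a convergence criterion rather than a fixed generation count.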
Steps are explained as follows:
Default C, γ: The values of C and γ may be set differently to reflect case and problem characteristics. By default, C and γ can be selected as 1 and 1/M respectively, where M is the number of parameters.
Table 4
Validation of training cases.
Project name | Number of periods | Average error of EACP | Qualified (<10%)
A | 29 | 8.7% | Yes
B | 24 | 5.6% | Yes
C | 20 | 4.1% | Yes
D | 25 | 5.6% | Yes
E | 26 | 6.2% | Yes
F | 20 | 7.6% | Yes
G | 22 | 3.6% | Yes
H | 27 | 14.0% | No
I | 18 | 5.0% | Yes
J | 31 | 4.2% | Yes
K | 27 | 3.7% | Yes
Total | 269 | | 91%
Table 5
(a) Testing results of case L. (b) Testing results of case M.
(a)
Training set name and no. | Predicted output | Actual output | EACP (predicted) | EACP (actual) | e (error of EACP)
Project L-1 | 0.76985 | 0.72710 | 89.76% | 84.18% | 6.63%
Project L-2 | 0.75566 | 0.72230 | 88.53% | 84.18% | 5.17%
Project L-3 | 0.69682 | 0.70360 | 83.28% | 84.18% | 1.06%
Project L-4 | 0.66103 | 0.63210 | 87.95% | 84.18% | 4.48%
Project L-5 | 0.62353 | 0.59440 | 87.98% | 84.18% | 4.52%
Project L-6 | 0.57243 | 0.54420 | 87.86% | 84.18% | 4.38%
Project L-7 | 0.52164 | 0.50640 | 86.17% | 84.18% | 2.37%
Project L-8 | 0.47307 | 0.44560 | 87.75% | 84.18% | 4.25%
Project L-9 | 0.43308 | 0.40210 | 88.22% | 84.18% | 4.80%
Project L-10 | 0.40248 | 0.37220 | 88.12% | 84.18% | 4.68%
Project L-11 | 0.42472 | 0.33140 | 96.34% | 84.18% | 14.45%
Project L-12 | 0.39262 | 0.29480 | 96.93% | 84.18% | 15.16%
Project L-13 | 0.37184 | 0.25570 | 99.32% | 84.18% | 18.00%
Project L-14 | 0.33271 | 0.22340 | 98.42% | 84.18% | 16.92%
Project L-15 | 0.30792 | 0.17890 | 101.00% | 84.18% | 19.98%
Project L-16 | 0.13018 | 0.15060 | 81.52% | 84.18% | 3.16%
Project L-17 | 0.11847 | 0.12780 | 82.96% | 84.18% | 1.44%
Project L-18 | 0.12773 | 0.10770 | 86.79% | 84.18% | 3.11%
Project L-19 | 0.09538 | 0.08770 | 85.18% | 84.18% | 1.19%
Project L-20 | 0.05293 | 0.08330 | 80.21% | 84.18% | 4.71%
Average error | | | | | 7.02%
(b)
Training set name and no. | Predicted output | Actual output | EACP (predicted) | EACP (actual) | e (error of EACP)
Project M-1 | 0.74012 | 0.72410 | 94.58% | 84.18% | 2.26%
Project M-2 | 0.65222 | 0.65710 | 91.85% | 84.18% | 0.69%
Project M-3 | 0.61894 | 0.62030 | 92.32% | 84.18% | 0.18%
Project M-4 | 0.57882 | 0.59120 | 90.88% | 84.18% | 1.74%
Project M-5 | 0.56265 | 0.61940 | 85.09% | 84.18% | 8.00%
Project M-6 | 0.51792 | 0.51410 | 92.99% | 84.18% | 0.55%
Project M-7 | 0.49309 | 0.48480 | 93.58% | 84.18% | 1.18%
Project M-8 | 0.45539 | 0.44820 | 93.42% | 84.18% | 1.01%
Project M-9 | 0.41551 | 0.41580 | 92.45% | 84.18% | 0.04%
Project M-10 | 0.39051 | 0.39570 | 91.81% | 84.18% | 0.73%
Project M-11 | 0.29915 | 0.31270 | 90.71% | 84.18% | 1.92%
Project M-12 | 0.26694 | 0.24910 | 94.82% | 84.18% | 2.53%
Project M-13 | 0.23630 | 0.22980 | 93.34% | 84.18% | 0.92%
Project M-14 | 0.18320 | 0.20250 | 89.97% | 84.18% | 2.72%
Project M-15 | 0.17857 | 0.19580 | 90.24% | 84.18% | 2.43%
Project M-16 | 0.15536 | 0.14510 | 93.83% | 84.18% | 1.46%
Project M-17 | 0.08462 | 0.08330 | 82.66% | 84.18% | 0.18%
Average error | | | | | 1.68%
Fig. 5. ESIM-predicted and actual EAC values of case L.
Training data set: Before executing the prediction model, patterns of influence must first be identified and entered into the system as training data, where they serve as prediction input parameters.
SVM training model: In this step, the user collects relevant historical cases for research. Case influence patterns serve as input parameters and the case decision serves as the output parameter. These input and output values form the training data set and are input into the model as initial training data. SVM regards the selected C and γ values as default patterns for the first training process.
Average accuracy: This step regards the reciprocal of the objective function as the fitness function. A larger value correlates to a superior model framework.
Termination criteria: The procedure operates continuously until certain conditions are satisfied, e.g., an appropriate fitness value is confirmed, or fitness shows no conspicuous improvement over several generations, demonstrating that convergence has already been reached.
Search for fmGA parameters: In this step, fmGA searches for relatively appropriate C and γ to serve as parameters for the next generation.
Optimized parameters: According to the above-mentioned optimization calculations, the best gene set will be retained. The optimum inference model is obtained after the gene set is decoded as the C and γ values for the SVM.
Table 6
Qualification of testing cases.
Project name | Number of periods | ESIM RMS | Average error of EACP | Qualified (<10%)
L | 20 | 0.0594 | 7.02% | Yes
M | 17 | 0.0173 | 1.68% | Yes
3. Constructing EAC prediction model using ESIM (EAC–ESIM)
During the construction phase of the project, planning objectives and
achieved percentage are affected by actual work conditions, design
changes and external environmental conditions. Adjustments made in response can generate differences between planned and actual completion costs.
The EAC prediction is generally based on the construction budget
developed based on initial project conditions (i.e., the construction
scope specified in the contract and environmental factors). EAC
factors of influence deduced from reference material are represented
as input and prospective project cost is the output. The cost database
of project cases was established accordingly. Database records were used to obtain planned and actual values for each month and to calculate the difference between the two. The mapping relationship between input
and output was found via case learning and ESIM training. Finally, the
prospective cost of a new project was predicted by inputting the
monthly cost of a project into the developed system in accordance
with training and testing results. The process is described in Fig. 2.
EACP is defined as Eq. (1) in this research.
$$\mathrm{EACP} = \mathrm{ACP} + \mathrm{ETCP} \tag{1}$$
where
EACP (estimate at completion percentage): EAC predicted in advance/total cost of original budget
ACP (actual cost percentage): actual cost at some specific moment/total cost of original budget
ETCP (estimate to completion percentage): estimate to completion at some specific moment/total cost of original budget; also the output value of ESIM
The processes of developing the model include “Identify Significant Factors of Influence”, “Historical Data Collection”, “Model Training” and “Model Testing”. The details of each process are described in the following subsections.
Table 7
General EAC prediction methods of EVM.
Item | Equation | Note
EAC1 | AC + BAC − EV |
EAC2 | BAC / CPI |
EAC3 | BAC / SPI |
EAC4 | AC + (BAC − EV) / CPI |
EAC5 | AC + (BAC − EV) / SPI |
EAC6 | AC + (BAC − EV) / SCI | SCI = CPI * SPI
EAC7 | AC + (BAC − EV) / (W1 * CPI + W2 * SPI) | W1 = 0.8, W2 = 0.2
EAC8 | Execution budget amount | Approved budget
Table 8
Error of EVM and ESIM prediction results.
Project name | EAC1 | EAC2 | EAC3 | EAC4 | EAC5 | EAC6 | EAC7 | EAC8 | ESIM prediction | Note
A | 10.0% | 17.1% | 6.9% | 16.8% | 9.7% | 15.9% | 15.7% | 5.2% | 8.7% |
B | 7.0% | 9.1% | 10.6% | 9.1% | 7.3% | 9.4% | 8.4% | 7.6% | 5.6% |
C | 27.0% | 3.2% | 15.9% | 3.2% | 7.3% | 3.2% | 3.3% | 15.9% | 4.1% |
D | 9.1% | 6.5% | 23.9% | 7.1% | 9.1% | 7.1% | 5.9% | 24.8% | 5.6% |
E | 9.7% | 35.0% | 5.6% | 13.7% | 9.7% | 13.7% | 13.0% | 4.3% | 6.2% |
F | 5.0% | 14.0% | 12.9% | 12.3% | 11.6% | 14.4% | 11.4% | 1.7% | 7.6% |
G | 3.5% | 14.1% | 17.5% | 13.7% | 14.8% | 40.0% | 13.3% | 6.1% | 3.6% |
H | 13.6% | 8.9% | 35.4% | 7.5% | 13.7% | 7.5% | 7.3% | 32.4% | 14.0% |
I | 8.7% | 23.1% | 19.4% | 23.1% | 7.6% | 21.7% | 20.7% | 14.2% | 5.0% |
J | 3.2% | 11.4% | 7.2% | 9.2% | 5.4% | 9.9% | 8.5% | 2.1% | 4.2% |
K | 3.3% | 6.7% | 8.6% | 7.4% | 3.9% | 7.0% | 6.1% | 6.8% | 3.7% |
Average | 9.1% | 13.6% | 14.9% | 11.2% | 9.1% | 13.6% | 10.3% | 11.0% | 6.2% |
Qualified percentage | 81.8% | 45.5% | 36.4% | 54.5% | 72.7% | 54.5% | 54.5% | 63.6% | 90.9% |
L | 7.6% | 21.0% | 23.5% | 19.4% | 12.1% | 19.3% | 18.7% | 18.0% | 7.0% | Testing case
M | 4.6% | 16.2% | 12.2% | 16.0% | 13.1% | 21.7% | 15.5% | 9.3% | 1.7% | Testing case
Average | 6.1% | 18.6% | 17.9% | 17.7% | 12.6% | 20.5% | 17.1% | 13.7% | 4.4% |
Qualified percentage | 100% | 0% | 0% | 0% | 0% | 0% | 0% | 50% | 100% |
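For reference, the EVM formulas EAC1 to EAC7 listed in Table 7 can be evaluated for a single reporting period as in the sketch below. CPI and SPI follow the definitions in Table 1 (earned value over actual cost, and earned value over planned value); the function name and the sample figures are hypothetical, and EAC8 is omitted because it is a budget amount rather than a formula.

```python
def eac_estimates(ac, bac, ev, pv, w1=0.8, w2=0.2):
    """Return EAC1..EAC7 from Table 7 for one reporting period."""
    cpi = ev / ac          # cost performance index
    spi = ev / pv          # schedule performance index
    sci = cpi * spi        # combined index used by EAC6
    remaining = bac - ev   # budgeted value of the work still to be done
    return {
        "EAC1": ac + remaining,
        "EAC2": bac / cpi,
        "EAC3": bac / spi,
        "EAC4": ac + remaining / cpi,
        "EAC5": ac + remaining / spi,
        "EAC6": ac + remaining / sci,
        "EAC7": ac + remaining / (w1 * cpi + w2 * spi),
    }

# Hypothetical period: budget 100, earned 40, spent 50, planned 45 (arbitrary units).
estimates = eac_estimates(ac=50.0, bac=100.0, ev=40.0, pv=45.0)
```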
3.1. Identify Significant Factors of Influence
This research identified factors that significantly affect the EAC of
construction projects using several relevant publications (listed in
Table 1). Input parameters of the model were obtained after time-dependent cost factors and performance management concepts were also considered.
Fig. 6. Processes of applying EAC–ESIM for cost management.
Fig. 7. Cost exception management based on prediction results.
Due to the variety of construction project categories and resultant
differences in data inputs, this research focused only on construction
projects done in reinforced concrete in order to control for potential
variance in results attributable to construction type. Ten significant
influence factors for EAC were calculated as input values based on
monthly construction cost data. Prospective cost was designated as
the output value.
3.2. Historical Data Collection
The 13 historical cases included in this research were all reinforced concrete projects executed between 2000 and 2007 by one engineering company located in Taipei City. Projects were located chiefly in northern Taiwan, with selection criteria considering data distribution and completeness. Building height ranged from 9 to 17 stories (inclusive of stories belowground). The value of contracts ranged from NT$80 million to NT$1.1 billion. Overall floor area of cases studied ranged from 3094 m2 to 31,797 m2. Construction durations ranged from 457 to 749 days. Relevant data from historical cases used are arranged in Table 2.
Monthly cost records for every case were collected and the ten identified factors of significant influence on EAC were calculated as input values in accordance with definition equations (see Table 3). The 13 historical cases were divided into training and testing groups. The 11 cases in the training group comprised 269 periods and the 2 cases in the testing group comprised 37 periods.
Output parameters were normalized by linear scaling. To keep the estimated cost within the range of 0 to 1, the normalization method was revised so that the maximum and minimum of the output parameters were each extended by 10% of the output range. The detailed process is shown in Eqs. (2)–(4).
Table 9
Cost influencing indices.
Index name | Definition | Criterion
Cost performance index | Earned value/actual cost | ≥ 1.05
Schedule performance index | Earned value/planned value | ≥ 1
Subcontractor billed index | Billed amount/actual cost | ≥ 1, ≤ 1.1
Owner billed index | Billed amount/earned value | ≥ 0.9
Change order index | Revised contract amount/budget at completion | ≤ 1.1, ≥ 0.9
Construction cost index | Construction material price index of that month/construction material price index of initial stage | Progress < 30%, ≤ 1.02; progress < 60%, ≤ 1.03
Climate effect index | (Revised project duration − no. of rainy days)/revised project duration | ≥ 1 − 0.2 × progress percentage
$$X_n = \frac{X_a - X_L}{X_U - X_L} \tag{2}$$
$$X_U = X_{\max} + X_{\mathrm{range}} \times 10\% \tag{3}$$
$$X_L = X_{\min} - X_{\mathrm{range}} \times 10\% \tag{4}$$
where
X_n: output parameter after normalization, in the range between 0 and 1
X_a: output parameter before normalization
X_U: upper bound of the output parameter
X_L: lower bound of the output parameter
X_max: maximum of the output parameter
X_min: minimum of the output parameter
X_range: difference between maximum and minimum
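The scaling in Eqs. (2)–(4) and its inverse can be sketched as follows. The function names and the sample output values are hypothetical; the 10% margin is the one stated in the equations.

```python
def output_bounds(values, margin=0.10):
    """Eqs. (3) and (4): widen [min, max] by a margin of the range on each side."""
    x_min, x_max = min(values), max(values)
    x_range = x_max - x_min
    return x_min - margin * x_range, x_max + margin * x_range

def normalize(x_a, x_lower, x_upper):
    """Eq. (2): linear scaling of a raw output into the widened interval."""
    return (x_a - x_lower) / (x_upper - x_lower)

def denormalize(x_n, x_lower, x_upper):
    """Inverse of Eq. (2): recover the raw output from a normalized value."""
    return x_lower + x_n * (x_upper - x_lower)

# Hypothetical raw output values, for illustration only.
raw_outputs = [0.2, 0.5, 1.0]
x_lower, x_upper = output_bounds(raw_outputs)
normalized = [normalize(x, x_lower, x_upper) for x in raw_outputs]
```

Because the bounds are widened on both sides, the smallest and largest training outputs map to 1/12 and 11/12 rather than to 0 and 1, which leaves headroom for predictions slightly outside the observed range.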
Table 10
Monthly cost information.
Item | Definition
Estimate at completion percentage (EACP) | Predicted EAC of ESIM/original budget amount
Actual cost percentage (ACP) | Actual paid of cost/original budget amount
Earned value percentage (EVP) | Budget cost at completion/original budget amount
Planned value percentage (PVP) | Budget cost of planning/original budget amount
Contract billed percentage (CBP) | Proprietor pricing amount/original contracted amount
Table 11
Comparison of cost variance.
Item | Definition | Illustration
Budget | Revised budget % (RBP) − estimate at completion % (EACP) | >0: cash balance of budget. <0: not enough in budget, exception management
Schedule | Earned value % (EVP) − planned value % (PVP) | >0: exceed in progress. <0: delay in progress, overtime planning
Cost | Earned value % (EVP) − actual cost % (ACP) | >0: cash balance of contract, not pricing by subcontractor. <0: overspend of contract, missed list, not transacting budget increase
Contract billed | Contract billed % (CBP) − actual cost % (ACP) | >0: cash balance of amount, large quantities of payment, subcontractor not valuating. <0: contract increased or constructed without valuation, contract decreased and not revised, over amount
3.3. Model Training
Training of the ESIM training module began after parameter selection. A total of 100 generations were searched over a period of 1.0167 min. The fault-tolerant parameter C was 20. The kernel function parameter γ equalled 0.1. The Root Mean Square (RMS) error, which describes how well the model obtained from the optimum chromosome fits the data, equalled 0.0559.
3.4. Model Testing
The ESIM performance assessment module was employed after completion of training. ESIM decoded the optimum chromosome into the EAC prediction model so that Model Testing could reveal prediction error and learning accuracy. Model Testing included testing of training error and case verification.
The model's rules were learned by training on historical case data. The model was established utilizing ESIM to search for training cases that possessed consistency between inference output and actual output. After training, model accuracy was tested by comparing differences between output results and actual values. Estimating criteria used in this research are shown in Eq. (5). The model was qualified when the error fulfilled selected requirements.
$$e_i = \frac{\left| \mathrm{EACP}_i^{\mathrm{ESIM}} - \mathrm{AEACP} \right|}{\mathrm{AEACP}} \times 100\% \tag{5}$$
where
e_i: error percentage of the ith period predicted by ESIM
EACP_i^ESIM: estimate at completion percentage of the ith period predicted by ESIM
AEACP: actual estimate at completion percentage of the case
3.4.1. Testing of training error
The 269 periods collected from 11 cases were input into the
database. An estimated value of prospective cost percentage for each
case could be obtained via the ESIM performance assessment module.
The estimated value is termed ‘Predicted output’. The actual value of
the prospective cost percentage for each case is termed ‘Actual
output’. Fig. 3 shows the relationship between Predicted output and
Actual output produced by ESIM training. EACP (see Eq. (2)) could be
obtained after de-normalizing the Predicted output and the training
error percentage could be calculated by making a comparison with the
corresponding percentage of actual cost at completion.
During construction, managers are most concerned with the
development trend and its influence on project decision making. To
determine model accuracy after training and meet the practical needs of
managers, this research selected a 10% average error as the
qualification ceiling. The qualified rate was 91% for the 11 training
cases in this research, as shown in Table 4.
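The qualification rule can be sketched as follows. The sketch assumes that a case qualifies when its average error is less than or equal to the 10% ceiling, and the eleven average errors are hypothetical rather than the values in Table 4. Ten qualifying cases out of eleven give roughly 91%, which is consistent with the reported rate.

```python
def average_error(period_errors):
    """Mean of the per-period error percentages of one case."""
    return sum(period_errors) / len(period_errors)


def qualified_rate(case_average_errors, ceiling=10.0):
    """Percentage of cases whose average error does not exceed the ceiling."""
    qualified = sum(1 for err in case_average_errors if err <= ceiling)
    return qualified / len(case_average_errors) * 100.0


# Hypothetical average errors (%) for 11 cases; exactly one exceeds 10%.
case_errors = [3.2, 4.1, 5.0, 2.7, 6.3, 7.8, 1.9, 8.4, 9.6, 4.4, 12.5]
rate = qualified_rate(case_errors)
print(round(rate))
```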
3.4.2. Verification of testing cases
A total of 269 periods collected from 11 cases were input into the
database. An estimated value of prospective cost percentage for each case
could be obtained using the ESIM performance assessment module. As
above, the estimated value is termed ‘Predicted output’, while the actual
value of prospective cost percentage for each case is termed ‘Actual
output’. Fig. 4 shows the relationship between Predicted output and
Actual output produced by ESIM training. After de-normalizing the
Predicted output, EACP (see Eq. (2)) could be obtained and the percentage
of the training error could be calculated through a comparison with the
percentage of actual cost at completion.
The main purpose of verification is to examine whether the model
trained by ESIM can be employed to infer or predict cases beyond the
training cases. Data for two testing cases (Cases L and M) were input
into the ESIM performance assessment module, with results shown in
Table 5. Comparing the predicted values of EAC with actual values,
average errors were 7.02% and 1.68%, respectively. Both results were
considered ‘qualified’ under the definition set above (Table 6).
Fig. 8. Time series of the cost influencing indices for case L.
Fig. 9. Cost management diagram of case L.
Figs. 4 and 5 were drawn to analyze case L further, since it
returned the comparatively larger error. From the 11th to the 15th
period, the error values, between 14.45% and 19.98%, were noticeably
large. A possible reason is the rapid revision of the project budget
during this period, which would make the data unstable and cause
larger error values.
Revisions and updates of the project budget are not rare in
practice for construction projects, with or without change orders. In
this study, the historical cases used for model training implicitly
included the uncertainty of change orders and replanning, which is why
the study uses AI to handle this uncertainty. To identify the impact of
change orders on EAC, a real case with a significant change order, i.e.
Project L, was selected for model testing. The results showed that the
prediction error increased for a period of time after the change order
but remained tolerable, and that the prediction converged to the actual
result in the end.
4. Comparison of EAC–ESIM prediction with 8 EVM methods
To demonstrate that the EAC prediction model proposed in this research
is feasible and reliable, EAC values were also calculated using eight
other EVM methods [16]. These values are compared with the predictions
generated by the ESIM (Table 7) to assess comparative accuracy. This
research selected a 10% error as the qualification criterion, and
calculated the average error and the qualified percentage for each
prediction method. Results of the comparison are shown in Table 8 and
discussed below:
1. No single EVM method predicts EAC at a consistently high level of
accuracy. Prediction results varied with differences in project
details. For the cases studied in this research, EAC1 and EAC5
attained better average errors and qualified rates.
2. ESIM predictions were less accurate during the initial 30% of the
project work schedule. This may be attributable to data instability
in the initial stages and the influence of design changes. However,
the prediction results attained by ESIM were more precise than those
of the EVM methods.
3. The predictive error of ESIM is comparatively steady. Qualified
rates of training and testing cases were 91% and 100%, respectively.
Both rates were higher than those of the EVM methods, which supports
the feasibility of EAC predictions using ESIM.
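For context, three EAC formulas commonly found in earned value management are sketched below in Python. The eight formulas that this study labels EAC1 to EAC8 are defined in Table 7; the three shown here are generic textbook variants and are not claimed to match that numbering. BAC, EV, AC and PV denote budget at completion, earned value, actual cost and planned value, and the sample figures are hypothetical.

```python
def eac_cpi(bac, ev, ac):
    """EAC if remaining work continues at the cumulative CPI: BAC / CPI."""
    cpi = ev / ac
    return bac / cpi


def eac_budget_rate(bac, ev, ac):
    """EAC if remaining work is completed at the budgeted rate: AC + (BAC - EV)."""
    return ac + (bac - ev)


def eac_cpi_spi(bac, ev, ac, pv):
    """EAC if remaining work is affected by both CPI and SPI."""
    cpi = ev / ac
    spi = ev / pv
    return ac + (bac - ev) / (cpi * spi)


# Hypothetical project status: over budget and behind schedule.
bac, ev, ac, pv = 100.0, 40.0, 50.0, 50.0
estimates = {
    "cpi": eac_cpi(bac, ev, ac),
    "budget_rate": eac_budget_rate(bac, ev, ac),
    "cpi_spi": eac_cpi_spi(bac, ev, ac, pv),
}
print(estimates)
```

With the same inputs the three formulas give clearly different estimates, which illustrates why the accuracy of index-based methods depends on project details.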
5. Applications of EAC–ESIM to construction management
5.1. Applying EAC–ESIM for cost management
After the EAC prediction model using ESIM was established, the
feasibility of the model needed to be proven using actual projects. A
prediction could be made once data had been collected and formatted to
model requirements.
As the prediction model must address real project conditions, the
procedure for applying the EAC prediction model was designed as shown
in Fig. 6, in accordance with the different situations at each
construction stage.
5.2. Cost exception management based on prediction results
Table 12
Cost influencing indices analysis of case L.
| Index name | Value | Illustration |
| --- | --- | --- |
| Cost performance index | 1.17 | ≧1.05, well cost management, but the ratio is slightly high. |
| Schedule performance index | 0.96 | ≦1, slightly delay in progress, and the trend is increasing. |
| Subcontractor billed index | 1.05 | ≧1, ≦1.1, normal. |
| Owner billed index | 0.90 | ≧0.9, normal. |
| Change order index | 1.03 | ≦1.1, normal. |
| Construction cost index | 1.09 | ≧1.03, slightly high, but the main decoration engineering items are contracted. |
| Climate effect index | 0.83 | ≦0.86, normal. |
An effective prediction model establishes a response system able to
identify factors that significantly influence the EAC at different project
stages. The predicted trend acquired from the ESIM can then be compared
with project cost influencing indices and project cost diagrams. Finally,
corrective proposals for deviations may be made and tracked
continuously in order to control costs effectively.
The EAC prediction provides a key reference for construction
managers by analyzing the cost information over the course of the
project. When an EAC prediction exceeds the approved budget, managers
may assess, period by period, the factors influencing cost and consider
the potential reasons for a cost overrun, allowing them to identify
potential problems and control costs.
Table 13
Cost diagrams analysis of case L.
| Item | RBP | EACP | ACP | EVP | PVP | CBP |
| --- | --- | --- | --- | --- | --- | --- |
| Value | 104% | 97% | 57% | 66% | 69% | 60% |

| Analysis | Difference | Illustration |
| --- | --- | --- |
| RBP > EACP | +7% | Possible to decrease budget. |
| EVP < PVP | −3% | Delay in progress, and the trend is increasing. |
| EVP > ACP | +9% | Cash balance of contract or not pricing by subcontractor. |
| CBP > ACP | +3% | Cash balance of amount, but the ratio reducing, indicates that the proprietor adds items without pricing. |

Reasons
1. Addendum of structural engineering revised in the eleventh month with an evaluated delay of three months.
2. Current contract cash balance: 5%, but not transacts the cash balance decrease while the budget increases in the eleventh month.
3. Items not yet priced by the subcontractor: approximately 2%.
4. Fits with estimation results.
Training and convergence testing demonstrated that the model is fit for
use. This research uses the Variance at Completion percentage (VACP,
defined in Eq. (6)) as the criterion for exception management. The
application procedure of EAC management is illustrated in Fig. 7.
$$VACP = RBP - EACP \qquad (6)$$
where
$RBP$: Revised Budget Percentage
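As a worked check, the Python sketch below takes Eq. (6) as RBP minus EACP and applies it, together with the three other differences used in the cost diagram analysis, to the case L values reported in Table 13. The function and variable names are hypothetical. Treating a negative VACP as the trigger for exception management is an assumption based on the "<0, not enough in budget, exception management" entry of the cost diagram criteria.

```python
def vacp(rbp, eacp):
    """Eq. (6): Variance at Completion percentage, RBP minus EACP."""
    return rbp - eacp


# Values reported for case L in Table 13, in percent.
case_l = {"RBP": 104, "EACP": 97, "ACP": 57, "EVP": 66, "PVP": 69, "CBP": 60}

differences = {
    "RBP - EACP": vacp(case_l["RBP"], case_l["EACP"]),
    "EVP - PVP": case_l["EVP"] - case_l["PVP"],
    "EVP - ACP": case_l["EVP"] - case_l["ACP"],
    "CBP - ACP": case_l["CBP"] - case_l["ACP"],
}

# Assumption: a negative VACP means the budget is insufficient.
needs_exception_management = differences["RBP - EACP"] < 0

print(differences, needs_exception_management)
```

The computed differences of +7, −3, +9 and +3 percentage points agree with the Difference column of Table 13.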
5.2.1. Project cost influencing indices analysis
Indices can monitor and identify project deviations effectively and
rapidly. However, the factors that influence project cost are complicated,
and the definitions and criteria of indices are not settled in the
related research (see the literature reviewed in Table 9).
5.2.2. Project cost diagrams analysis
Indices that influence project cost can be provided to managers to
help them control prospective costs and investigate potential cost
management problems pre-emptively. Nevertheless, overall construction
cost trends cannot be determined by analyzing project cost indices
alone, as cost problems may display abnormalities in some situations.
Thus, project cost diagram analysis is used as a supplementary method
in this research.
This research used original contract documents to define the scope
of comparative data. Project cost diagrams were drawn on a monthly
basis from EAC prediction results and the data items shown in
Table 10. Project schedule and cost tendencies may be obtained by
comparing and analyzing variations in cost information (Table 11).
Appropriate early warnings can therefore be issued and actions
taken in accordance with project characteristics. Furthermore,
comprehensive inspection and strategy direction may be provided.
5.2.3. Application of an actual example case
Example Case L: Time series for the cost influencing indices and the
cost management diagram are shown in Figs. 8 and 9, respectively. This
research selects the time point at which project progress reaches
71.16% (Project-L-12) to analyze the project cost indices (see Table 12)
and project cost diagrams (Table 13 and Fig. 10). Prediction results
were verified by comparing the analysis results with the project
completion report, leading us to conclude that the approach is feasible
for the effective management and control of project costs.
6. Conclusion
This research proposed an EAC prediction method using ESIM that
employs fmGA and SVM. Results obtained in this paper are
summarized as follows:
1. This research established an EAC prediction method utilizing ESIM
that identifies significant factors of influence on project cost and
performs predictions by collecting and arranging experience-based
rules from historical cases. ESIM results represent a significant
improvement over results obtainable using traditional EAC
prediction methods.
2. The EAC prediction model established in this research is
considerably accurate. After training and testing, the only inputs that
need to be entered into the model are the key factors influencing costs
during the current month. A significant disadvantage in traditional
construction projects, namely that cost tendencies cannot be
predicted in real time, has been effectively remedied.
3. ESIM prediction results were compared with eight common EVM
prediction methods. Values obtained by the EVM methods were
relatively unstable due to the wide variance in data among
projects. Conversely, the model developed in this research
generated comparatively steady prediction values. This verified
the feasibility of using ESIM to predict construction project EACs.
4. Prediction results were analyzed further using project cost
influencing indices and project cost diagrams. This helped identify
the causes underlying EAC differences and trends. The results help
managers control project costs in real time.
Fig. 10. Cost diagram of case L.
References
[1] D. Bolles, S. Fahrenkrog (Eds.), A Guide to the Project Management Body of
Knowledge (PMBOK Guide), 3rd Ed, Project Management Institute, Newtown
Square, 2004.
[2] G. Clifford, E. Larson, Project Management: the Complete Guide for Every
Manager, McGraw-Hill, New York, 2002.
[3] G.A. Barraza, W.E. Back, F. Mata, Probabilistic forecasting of project performance
using stochastic S curves, Journal of Construction Engineering and Management
130 (1) (2004) 25–32.
[4] D.E. Lee, Probability of project completion using stochastic project scheduling
simulation, Journal of Construction Engineering and Management 131 (3) (2005)
310–318.
[5] D.E. Lee, D. Arditi, Automated statistical analysis in stochastic project scheduling
simulation, Journal of Construction Engineering and Management 132 (3) (2006)
268–277.
[6] B.C. Kim, K.F. Reinschmidt, Probabilistic forecasting of project duration using
Bayesian inference and the beta distribution, Journal of Construction Engineering
and Management 135 (3) (2009) 178–186.
[7] E.H. Kim, W.G. Wells Jr., M.R. Duffey, A model for effective implementation of
earned value management methodology, International Journal of Project
Management 21 (5) (2003) 375–382.
[8] D.F. Cioffi, Designing project management: a scientific notation and an improved
formalism for earned value calculations, International Journal of Project
Management 24 (2) (2006) 136–144.
[9] S. Vandevoorde, M. Vanhoucke, A comparison of different project duration
forecasting methods using earned value metrics, International Journal of Project
Management 24 (4) (2006) 289–302.
[10] G. Vitner, S. Rozenes, S. Spraggett, Using data envelope analysis to compare
project efficiency in a multi-project environment, International Journal of Project
Management 24 (4) (2006) 323–329.
[11] M. Vanhoucke, S. Vandevoorde, A simulation and evaluation of earned value
metrics to forecast the project duration, Journal of the Operational Research
Society 58 (10) (2007) 1361–1374.
[12] W. Lipke, O. Zwikael, K. Henderson, F. Anbari, Prediction of project outcome: the
application of statistical methods to earned value management and earned
schedule performance indices, International Journal of Project Management 27
(4) (2009) 400–407.
[13] M. Plaza, O. Turetken, A model-based DSS for integrating the impact of learning in
project control, Decision Support Systems 47 (4) (2009) 488–499.
[14] S.S. Leu, Y.C. Lin, Project performance evaluation based on statistical process
control techniques, Journal of Construction Engineering and Management 134
(10) (2008) 813–819.
[15] H.L. Stephenson, Identifying risks and opportunities using EAC, Proc. 48th AACE
International Annual Meeting '04, AACE International Transactions, 2004,
pp. CSC.06.1–CSC.06.9.
[16] S. Alexander, Earned Value Management Systems (EVMS): Basic Concepts, Project
Management Institute, Washington DC, 2002.
[17] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, 1995.
[18] C.W. Hsu, C.J. Lin, A simple decomposition method for support vector machine,
Machine Learning 46 (1–3) (2002) 219–314.
[19] C.F. Lin, Fuzzy support vector machines, Ph.D. Thesis, Department of Electrical
Engineering, National Taiwan University, 2004.
[20] D.E. Goldberg, K. Deb, H. Kargupta, G. Harik, Rapid, accurate optimization of
difficult problems using fast messy genetic algorithms, Proc. 5th International
Conference on Genetic Algorithms '93, Morgan Kaufmann Pub. Inc, San Mateo,
1993, pp. 56–64.
[21] R. Day, J. Zydallis, G. Lamont, Competitive template analysis of the fast messy
genetic algorithm when applied to the protein structure prediction problem, Proc.
2nd ICCN '02, Computational Publications, Cambridge, 2002, pp. 36–39.
[22] C.W. Feng, H.T. Wu, Integrating fmGA and CYCLONE to optimize the schedule of
dispatching RMC trucks, Automation in Construction 15 (2) (2006) 186–199.
[23] H. Drucker, C.J.C. Burges, L. Kaufman, A. Smola, V. Vapnik, Support vector
regression machines, Proc. 10th Annual Conference on NIPS '96, Advances in
Neural Information Processing Systems, vol. 9, MIT Press, Cambridge, 1997,
pp. 155–161.
[24] C.C. Chang, C.J. Lin, Training nu-support vector classifiers: theory and algorithms,
Neural Computation 13 (9) (2001) 2119–2147.
[25] D. Knjazew, G.A. Ome, A Competent Genetic Algorithm for Solving Permutation
and Scheduling Problems, Kluwer Academic Publishers, Boston, 2003.