MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
Nguyen Ngoc Tuan
RISK MANAGEMENT IN SOFTWARE PROJECT SCHEDULING
USING BAYESIAN NETWORKS
PhD DISSERTATION ON SOFTWARE ENGINEERING
Hanoi – 2021
Major: Software Engineering
Code No.: 9480103
SUPERVISORS:
1. Assoc. Prof. Dr. Huynh Quyet Thang
2. Dr. Vu Thi Huong Giang
DECLARATION:
I certify that this thesis and the work presented in it are the product of my
own work, and that any ideas or quotations from other people’s work, published
or otherwise, are fully acknowledged in accordance with the standard referencing practices of the discipline.
This thesis has not been submitted for any degree or other purposes.
Hanoi, April 29, 2021
PhD STUDENT
Nguyễn Ngọc Tuấn
ON BEHALF OF SUPERVISORS
Assoc. Prof. Dr. Huỳnh Quyết Thắng
Acknowledgements
First of all, I would like to express my sincere gratitude to my first supervisor,
Assoc. Prof. Dr. Huynh Quyet Thang, for his invaluable guidance and support
throughout my research. Professor Thang has supported me all the way, all the time.
It is his patience that kept me committed to this research and to
reaching the end of my PhD studies. I am also very grateful to my second
supervisor, Dr. Vu Thi Huong Giang, whose insightful hints and expertise have
always been helpful to me.
My special thanks go to Ms. Vo Thi Huong, Ms. Bui Thi Quynh Nga, Mr.
Tran Trung Hieu, Mr. Tran The Anh, Mr. Tran Bao Ngoc and Mr. Cao Manh
Quyen, who were master’s and bachelor’s students at the School of ICT, Hanoi University
of Science and Technology, and who helped me build the tools and test
our models.
I am also indebted to Dr. Nguyen Thanh Nam (former CEO of FPT and
former President of FSOFT), Mr. Luu Quoc Tuan (Tinh Van Outsourcing Jsc.), Mr.
Ngo Quang Vinh (Evizi) and Mr. Nguyen Huy Binh (FIS), who provided real
software project data and valuable expert judgments on the data.
Finally, my greatest appreciation goes to my family, especially to my wife Tran
Thi Bich Ngoc and to my son Nguyen Minh Huy. Without their love, patience and
sacrifice, this achievement would never have been possible.
Summary
Software project management is the art and science of planning and leading
software projects. In the software industry, project managers mostly rely on their
experience and skills to manage their projects and lack scientific tools to support
them.
Risk management is a crucial part of software project management that helps
prevent software disasters. In this research, risks are defined as uncertain events or
conditions that, if they occur, would have a negative impact on one or more
software project outcomes (cost, time, quality). Identifying and dealing with risks or
uncertainty in the early phases of the software development life cycle reduces
long-term cost and improves the chance of project success. The most important part of
risk management is risk analysis, which assesses the risks and their impact on the
outcomes of the software project. To overcome subjective assessment based on
the development team’s experience, the team needs a quantitative risk analysis method.
Software project scheduling is one part of software project planning. Since, in
practice, most software projects are over budget and behind schedule, software
project scheduling deserves careful consideration. This leads to
the following questions:
• How can software projects be scheduled better?
• How can risks in software projects be managed better?
• How can risks be analysed quantitatively?
Previous research suggests that Bayesian Networks can be used to quantify uncertain
factors in (general) project scheduling and to improve project risk assessment and
analysis. Our research aims to bring those advantages of Bayesian Networks
into software project scheduling by addressing common software project features.
The research answers the above questions by providing probabilistic
approaches and tools to assess the impact of risk factors on software project
scheduling; by proposing a list of common risk factors and a Bayesian Network model of
these risk factors; and by proposing advanced scheduling methods that
incorporate Bayesian Networks into popular scheduling techniques such as CPM,
PERT and agile iteration scheduling. Bayesian Networks help quantify the
risk factors, and hence help manage them better and improve the predictability
of what happens in the project.
This research first reviews the literature on (general) project planning issues,
project scheduling techniques, project scheduling tools, uncertainty and risk
characteristics in software projects, risk management processes and project risk analysis,
in order to apply state-of-the-art techniques to software projects (Chapter 1).
After that, Bayesian Networks are applied to model and experiment with risk
factors in software project scheduling. The BRI (Bayes Risk-Impact) algorithm is
proposed to assess the impact of risk factors on software scheduling (Section 2.1). A
first set of five risk factors is examined using CKDY, a probabilistic tool built by the
author to analyse risks in software project scheduling (Section 2.2).
The research proposes an advanced algorithm for agile iteration scheduling
using Bayesian Networks. The advantage of this method is that it provides both a schedule
and the probability of finishing the agile iteration on time (Section 3.1). In addition, the
author goes further with a more refined list of 19 risk factors in software scheduling
and uses them in software scheduling methods. The research also incorporates
Bayesian Networks into the CPM and PERT scheduling techniques for traditional
software projects, together with the Bayesian Networks of common risk factors
(Section 3.2 and Section 3.3). The list of 19 risk factors in agile software
development is also examined in agile iteration scheduling (Section 3.4). The
experimental results show that our models are reliable and that our approaches have
practical implications, i.e. Bayesian Networks can be used to
model and quantify risks/uncertainty in software projects.
How to read this report?
The author highly recommends that you read this report from beginning to
end. However, if at any point you want to look at specific pieces of
information, the following guide may be helpful:
• For the motivation, the overview of related work, the objectives, the
scope, the hypothesis and the methodology of this research, please go to the
Introduction section.
• For an overview of software project scheduling and risk management in
software project scheduling, please go to Sections 1.1, 1.2 and 1.3.
• For an overview of Bayesian Networks, please go to Section 1.4.
• For details on the main contributions and key findings of the research, please
read Chapter 2 and Chapter 3.
• For information on common risk factors in software project scheduling,
please see Section 2.3.
• Chapter 2 covers building tools and running experiments on applying
Bayesian Networks to risk management in software project planning
(Section 2.1) and on some key risk factors (Section 2.2).
• Chapter 3 covers incorporating Bayesian Networks and common risk
factors into software project scheduling techniques such as CPM (Section
3.2), PERT (Section 3.3) and agile software development scheduling (Section
3.4).
• For the conclusions, the limitations and the further research
of the study in this PhD thesis, please read the Conclusion section.
Content
Acknowledgements
Summary
How to read this report?
List of symbols and abbreviations
List of tables
List of figures
Introduction
  Motivation
  Related work
  Research scope
  Research objectives
  Scientific and realistic meaning
  Research hypothesis and methodology
  Expected results
  Structure of the thesis
Chapter 1. Overview of software project scheduling and risk management
  1.1. Software project management and software project scheduling
    1.1.1. Software project management
    1.1.2. Software project scheduling
  1.2. Software project scheduling methods and techniques
    1.2.1. Overview
    1.2.2. Traditional scheduling methods and techniques
    1.2.3. Agile software project scheduling
  1.3. Risk management in software project scheduling
    1.3.1. Overview of project risk management
    1.3.2. Project risk analysis
    1.3.3. Unknown risks
    1.3.4. Risk aspects in software project scheduling
  1.4. Bayesian Networks
    1.4.1. Bayesian approach vs classical approach
    1.4.2. Probabilistic approach using Bayesian Networks
    1.4.3. Bayesian Inference
    1.4.4. Bayesian Networks and project risk management
  1.5. Chapter remarks
Chapter 2. Common risk factors and experiments on Bayesian Networks and software project scheduling
  2.1. Application of Bayesian Networks into schedule risk management in software project
    2.1.1. Common risk factors in software project management
    2.1.2. Bayesian Networks of risk factors
    2.1.3. Risk impact calculation
    2.1.4. Bayesian Risk Impact algorithm
    2.1.5. Tool and experiments
    2.1.6. Conclusion and contribution
  2.2. Experiments on common risk factors
    2.2.1. Discovering the top ranked risk factors
    2.2.2. Tool CKDY
    2.2.3. Experiments and analysis
    2.2.4. Conclusion and contribution
  2.3. Proposed common risk factors in software project scheduling
    2.3.1. The 19 common risk factors in traditional software project
    2.3.2. The 19 common risk factors in agile software project
    2.3.3. Conclusion and contribution
  2.4. Chapter remarks
Chapter 3. Incorporation of Bayesian Networks into software project scheduling techniques
  3.1. Applying Bayesian Networks into specific software project development
    3.1.1. Introduction
    3.1.2. Optimized Agile iteration scheduling
    3.1.3. Optimization model for Agile software iteration
    3.1.4. Tool and experimental results
    3.1.5. Conclusion and contribution
  3.2. Incorporation of Bayesian Networks into CPM
    3.2.1. The RBCPM Model
    3.2.2. The RBCPM Method
    3.2.3. Tool and experimental results
    3.2.4. Conclusion and contribution
  3.3. Incorporation of Bayesian Networks into PERT
    3.3.1. Proposed model
    3.3.2. Tool development and data collection
    3.3.3. Experimental results and analysis
    3.3.4. Conclusion and contribution
  3.4. Incorporation of Bayesian Networks into Agile software development scheduling
    3.4.1. Incorporation of risk model
    3.4.2. Tool and experimental results
    3.4.3. Conclusion and contribution
  3.5. Chapter remarks
Conclusion
  What has been done
  Main contributions
  Limitations
  Further research
List of scientific publications
References
Index
Appendix. Sub Bayesian Networks of the 24 risk factors
List of symbols and abbreviations
No.  Abbreviation  Description
1    AF            Assigned First
2    AISP          Agile Iteration Scheduling Problem
3    BAIS          Bayesian Agile Iteration Scheduling
4    BN            Bayesian Network
5    BRI           Bayes Risk-Impact
6    CMM           Capability Maturity Model
7    CMMI          Capability Maturity Model Integration
8    CPM           Critical Path Method
9    DAG           Directed Acyclic Graph
10   EVM           Earned Value Management
11   FDD           Feature-Driven Development
12   IDE           Integrated Development Environment
13   IGR           Internally Generated Risk
14   LPT           Longest Processing Time
15   MCS           Monte Carlo Simulation
16   NPT           Node Probability Table
17   PERT          Program Evaluation and Review Technique
18   PI            Probability-Impact
19   PMBOK         Project Management Body of Knowledge
20   PMI           Project Management Institute
21   PMP           Project Management Professional
22   PRAM          Project Risk Analysis and Management
23   PRM           Project Risk Management
24   PRMP          Project Risk Management Processes
25   PSPLIB        Project Scheduling Problem Library
26   RAMP          Risk Analysis and Management for Projects
27   RBCPM         Risk Bayesian Critical Path Method
28   RBPERT        Risk Bayesian PERT
29   RESCON        RESource CONstrained
30   RMP           Risk Management Processes
31   RUP           Rational Unified Process
32   SPT           Shortest Processing Time
33   XP            Extreme Programming
List of tables
Table 1.1. Basic mathematical notations used for CPM calculation
Table 1.2. The differences between waterfall and agile projects
Table 1.3. The differences between Bayesian and Frequentist approaches
Table 2.1. Hui and Liu’s common risk factors [9]
Table 2.2. Risk factors in the phases
Table 2.3. Risk factors, consequences and impact
Table 2.4. Examples of risk factors and probabilities
Table 2.5. Probability of risk factors in the whole project with data set 1
Table 2.6. Probability of risk factors in the whole project with data set 2
Table 2.7. Probability of the experimental risk factors to compare with MSBNx
Table 2.8. CKDY compared with MSBNx
Table 2.9. List of 19 common risk factors for software project scheduling
Table 2.10. List of 5 risk factors for software project scheduling in Section 2.2
Table 2.11. List of 19 risk factors in iteration scheduling
Table 3.1. The first data sample
Table 3.2. The probability table for tasks and resources
Table 3.3. Risk factors analysis
Table 3.4. Data sample 1
Table 3.5. Data sample 2
Table 3.6. Task attributes of the first data sample
Table 3.7. Task attributes of the second data sample
Table 3.8. Task attributes of the third data sample
Table 3.9. The result for the first data sample
List of figures
Figure 1.1. Activities of project management according to PMBOK Guide
Figure 1.2. CPM parameters in an activity
Figure 1.3. An example of BN which represents a simple case
Figure 2.1. A sub BN for the risk factor “Staff experience shortage”
Figure 2.2. A sub BN for the risk factor “Low productivity”
Figure 2.3. A sub BN for the risk factor “Lack of client support”
Figure 2.4. A sub BN for the risk factor “Inaccurate cost estimating”
Figure 2.5. A sub BN for the risk factor “Incapable project management”
Figure 2.6. A sub BN for the risk factor “Lack of senior management commitment”
Figure 2.7. A sub BN for the risk factor “Inadequate configuration control”
Figure 2.8. A sub BN for the risk factor “Inaccurate metrics”
Figure 2.9. A sub BN for risk factor “Excessive reliance on a single process improvement”
Figure 2.10. The overall BN for software risk factors
Figure 2.11. A simple example of Bayesian inference
Figure 2.12. The three nodes of a simple-chain BN
Figure 2.13. The graphical interface of the tool
Figure 2.14. Result of experiment 1
Figure 2.15. Results of the three experiments
Figure 2.16. Experimental results for Software Design phase
Figure 2.17. Sub BN 1
Figure 2.18. Sub BN 2
Figure 2.19. The overall BN model
Figure 2.20. Experiment with j30 with the early start schedule
Figure 2.21. Activity joint in the file j301_1.rcp
Figure 2.22. Diagram of probabilities of finishing phase by phase
Figure 3.1. Home GUI of tool BAIS
Figure 3.2. Gantt chart for SPT strategy
Figure 3.3. A part of a BN for 19 risk factors
Figure 3.4. Task’s parameters and connection to other tasks
Figure 3.5. A screenshot of RBCPM
Figure 3.6. A result for experiment with data sample 1
Figure 3.7. A result for experiment with data sample 2
Figure 3.8. Bayesian Network for each activity
Figure 3.9. Risk integration network model into PERT scheduling
Figure 3.10. Process in improved RBPERT Model
Figure 3.11. The input screen of the RBPERT tool
Figure 3.12. The input file type of the RBPERT tool
Figure 3.13. A result for the network provided by the RBPERT tool for the first data sample
Figure 3.14. A result for RBPERT network provided by the tool for the first data sample
Figure 3.15. A result for experiment with the third data sample (distribution of Total Duration of activity J)
Figure 3.16. A screenshot of tool BAIS
Figure 3.17. The result of the second experiment
Introduction
Motivation
Projects in general always involve risks, and risks are a constant worry for
project managers. In October 2008, the Hanoi Urban Railway Project Line
2A (Cat Linh-Ha Dong) was approved with a total budget of more
than 8,700 billion VND (552 million USD). Since then, the project’s investment has
grown by more than half, to 868 million USD. The line was scheduled to be put into service in 2013,
but at the time of writing the project remains incomplete¹.
Software projects also have schedule risks and, as a consequence, budget or cost
risks. For example, the project on the Vietnamese National Population Database²
was approved in 2015 and was planned to be finished in two years
(2016 and 2017). However, the system was only put into operation in February
2021. A similar example is the project on the Vietnamese National Public
Service Portal³, which was planned to go public in September 2016 but only
opened in December 2019. As a matter of fact, the majority of software projects
the author has experienced in Vietnam are behind schedule (some of these projects
will be examined in Chapter 2 and Chapter 3).
Even in developed countries, software projects face ongoing problems.
For example, the Universal Credit project, the welfare payment system owned by
the Central Government of the United Kingdom, started in 2013. The project
schedule has slipped, with the final delivery date now expected to be 2021, although
the system is gradually being introduced. In 2013, only one of four planned pilot
sites went live on the originally scheduled date, and the pilot was restricted to
extremely simple cases⁴.
Many software projects have suffered from significant budget overruns together
with a series of delays, which cause either temporary issues or permanent failures.
For example, the Queensland Health Payroll System was launched in 2013 in what
could be considered one of the most spectacularly over-budget projects in
Australian history, coming in at over 200 times the original budget. Moreover, despite
promises that the new system would be fully automated, it required
a considerable amount of manual operation [1]. Another example of permanent
failure is the e-Borders project, an advanced passenger
information programme which aimed to collect and store information on passengers
and crew entering and leaving the United Kingdom. Started in 2007, the project had
a series of delays and had to be cancelled in 2014 [2].
¹ VnExpress (2019), “Ministry of Transport admits the mistakes on the Cat Linh-Ha Dong urban
railway project”, available online (in Vietnamese) at: https://vnexpress.net/bo-giao-thong-van-tai-thua-nhan-sai-sot-trong-du-an-cat-linh-ha-dong-3988254.html
² Vietnamese Prime Minister (2015), “Decision regarding the approval of investment policy for the
project on the National population database”, Government of Vietnam, 2083/QĐ-TTg (26 November 2015)
³ Vietnamese Prime Minister (2015), “Resolutions on e-Government”, Government of Vietnam,
36a/NQ-CP (14 October 2015)
⁴ Wikipedia.org, “List of failed and over-budget custom software projects”, Retrieved 20 September
2019, available online at: https://en.wikipedia.org/wiki/List_of_failed_and_overbudget_custom_software_projects
Some studies have pointed out that most software projects (83.8%) are over
budget or behind schedule, and that 52.7% of software development projects deliver
software with fewer features than originally specified [3, 4]. Statistics also show
that 31.1% of development projects end up being cancelled or terminated
prematurely. Among the completed projects, only 61% satisfy the originally
specified features and functions [5]. In the software industry, one of the greatest
challenges that development teams constantly face is keeping projects under
control in terms of budget and schedule (development time frame). The activities of
a software project are influenced by factors internal and external to the project
organization that make it uncertain whether the project will achieve its objectives.
The effect that this uncertainty has on the project’s goals is called risk [6]. In
other words, risk is an event or an uncertain condition that, if it occurs, will have a
positive or negative effect on at least one of the project objectives [7]. In this thesis,
risks are defined as uncertain events or conditions that, if they occur, would
have a negative impact on one or more software project outcomes (cost, time, quality).
The above situation raises an important question: how can project risks be
managed better, so as to eliminate temporary issues and prevent
failure?
The purpose of project management is to lead the project to success. A
successful software project relies on many factors (e.g. following
appropriate processes and tasks, and managing risks properly). Since risks are
inevitable in projects, risk management has become an important part of project
management. Although many researchers, experts and writers have proposed a variety
of processes and techniques, project risk management (PRM) is still rapidly
evolving, and handling risks in general projects as well as in software projects remains
a challenge.
Concerning PRM, an important component is risk analysis which also known or
considered the same as risk quantification. Risk analysis attempts to measure risks
and their impacts on different project outcomes (i.e., time, cost, quality). Many
software projects fail since project managers mostly plan based on their experience
and there is a lack of scientific methods to support them. To overcome subjective
assessment based on the development team’s experience, the team needs a quantitative
risk analysis method. Although various studies have proposed and examined a
range of processes and techniques and software project risk management is
continuously evolving, handling uncertainty in more and more complex real-world
projects remains a challenge.
Aside from that, project scheduling (a part of project planning – an early phase
of the software development life cycle) is concerned with the techniques that can be
employed to manage the activities that need to be undertaken during the
development of a project. There are various techniques for project scheduling, from
simple and easily understandable ones such as Task List, Gantt Chart, Schedule
Network Analysis, to more complicated ones like Critical Path Method (CPM),
Program Evaluation and Review Technique (PERT), Monte-Carlo Simulation
(MCS) or Fuzzy Logic etc. [6, 8, 9, 10].
Traditional project scheduling under risk/uncertainty has attracted growing
research attention in the project management community. In some of the project
management literature of the 1990s, “risk analysis” was equivalent to “the analysis of
risk on project plan” [11]. This thesis focuses on modelling risks in software project
time management (of course, it is indirectly related to other project outcomes which
are cost and quality). In other words, this thesis concentrates on quantitative risk
analysis in software project scheduling.
The earliest studies incorporating uncertainty/risk in project scheduling were
conducted in the late 1950s by Malcolm et al. [12] and Miller [13]. Since then, a variety of
techniques have been introduced, several tools have been developed, and many of
them are widely used throughout different industries. However, they often fail to
capture uncertainty properly and/or produce inaccurate, inconsistent and unreliable
results, especially when applied to software projects, whose attributes differ
markedly from those of traditional projects.
Project uncertainty has several aspects, not all of which can be categorized and
treated as risks. Several authors such as Ward and Chapman [14] argued that project
risk management should focus on managing uncertainty and its various
sources rather than emphasizing a set of possible events that might have bad
impacts on project performance (i.e., it should attend more to uncertain aspects
than to a fixed set of defined risks). However, since this thesis is about software
projects, risks are considered and treated the same as uncertainty. Most
quantitative techniques and methods in the current practice of project risk
management are based on the “Probability Impact” concept, which has certain
shortcomings in terms of risk analysis in project scheduling. More sophisticated
methods and techniques are needed to address and manage important
sources of uncertainty/risk.
In the software industry, project scheduling also has to deal with the fact that
resources such as people, time, technology and money are not always predetermined [15]. There are always risks in software project scheduling as well. In
most projects, the activity times (from now on, an “activity” is considered the same as a “task”
in software projects) are not known for certain. Therefore, they may be
treated as random variables.
Furthermore, Bayesian Networks (BNs) have attracted a lot of attention in
different fields (construction, R&D etc.) as a powerful approach for decision
support under uncertainty. A BN is a graphical and mathematical model which
offers a powerful, general and flexible approach for modelling risk and uncertainty.
Its capability of modelling causality and conditional dependency between
variables makes it well suited to capturing uncertainty in projects. Yet, BNs
are rarely applied in project risk management in general as well as in software
project management and software project scheduling.
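As a minimal illustration of the conditional dependency that a BN encodes, the sketch below builds a two-node network by hand: a hypothetical risk factor (for example, staff turnover) that influences whether a task is delayed. The node names and all probabilities are assumptions chosen for illustration only; they are not taken from this thesis or from any cited study.

```python
# Two-node Bayesian network: Risk -> Delay.
# All numbers below are illustrative assumptions, not data from the thesis.
P_RISK = 0.3                              # prior P(risk occurs)
P_DELAY_GIVEN = {True: 0.8, False: 0.1}   # CPT: P(task delayed | risk state)


def prob_delay() -> float:
    """Marginal P(delay), obtained by summing over both risk states."""
    return P_RISK * P_DELAY_GIVEN[True] + (1 - P_RISK) * P_DELAY_GIVEN[False]


def prob_risk_given_delay() -> float:
    """Posterior P(risk | delay observed), by Bayes' rule."""
    return P_RISK * P_DELAY_GIVEN[True] / prob_delay()


print(round(prob_delay(), 3))             # forward (predictive) reasoning
print(round(prob_risk_given_delay(), 3))  # backward (diagnostic) reasoning
```

The same network answers both a predictive question (how likely is a delay?) and a diagnostic one (given a delay, how likely is it that the risk occurred?), which is the property that makes BNs attractive for schedule risk analysis.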
The author of this thesis strongly believes that if we can identify and control
risks at the early stages of a software development project, we can significantly increase
the project’s chance of success. Since it is not easy (and may be impossible) to control
all of the problems or factors, this thesis focuses only on time factors, which are related to
the software development schedule.
Therefore, this thesis aims at introducing an advanced approach as well as
finding a better model for incorporating and managing uncertainty/risks in software
project scheduling. The idea is to use BNs to perform well-known scheduling
techniques such as CPM and PERT, as well as to model risk factors in software
project scheduling. The proposed approach enriches the benefits of scheduling
techniques by incorporating uncertainty/risk factors and adding the strong analytical
power of BNs.
Related work
There have been various studies on applying BNs to general projects.
Khodakarami [15] applied BNs to general project scheduling with two case
studies of aircraft design and health and fitness center design and construction.
Erhan et al. [16] proposed a project control framework that integrates the project
uncertainty and associated risk factors into project control. Their framework is
based on earned value management (EVM), which is an effective and widely used
quantitative project control technique in practice. The framework uses hybrid BNs
to enhance EVM with the ability to compute the uncertainty associated with its
parameters and risk factors, making it practical for construction projects. Ali et al.
[17] combined Monte Carlo Simulation and Bayesian Networks methods to present
a structure for assessing the aggregated impact of risks on the completion time of a
construction project. Lee and Shin [18] applied BNs to the risk
management of shipbuilding projects and proposed 26 risks. Sharma and Chanda
[19] developed a BN model for predicting R&D project success, in which the
assessment is based on R&D project risk factors. Khodakarami et al. [20] also examined
an approach to generate project schedules that incorporates risk, uncertainty, and
causality using BNs. Their model empowered the traditional CPM to handle
uncertainty, and they also provided explanatory analysis to elicit, represent, and
manage different sources of uncertainty in project planning. Fenton and Neil [21]
introduced AgenaRisk as a probabilistic tool based on BNs; Chang, Yu, and Cheng
[22] proposed a risk-based Critical Path Scheduling Method based on 2 risk
categories and 7 risk levels, which was applied to construction projects.
Regarding risk factors in software projects, Hui and Liu [5] selected 24 risk
factors that may have potential impacts on the (whole) software project and applied
BN properties in the calculation of impact in their project risk model. Kumar and
Yadav [23] considered quantitative features and causal relationships among risk
factors in software projects. They introduced a probabilistic approach to assessing risks
in software projects and proposed a list of 27 risk factors (in software
projects). However, they analysed risks for whole software projects and did not
focus on the scheduling and planning phases, which decide the success of
projects. Adapting Kumar and Yadav’s method, this thesis proposes a list of the 5
most crucial risk factors and builds the tool CKDY to examine risks in
software scheduling (Section 2.2).
There have been some other studies on BNs and software risk analysis. Hu
et al. [24] studied causality analysis among risk factors and project outcomes for
software development projects. For this purpose, they proposed a modelling
framework based on BNs to deal with causality constraints in risk analysis. The
developed framework can be used for discovering new causal relationships and
validating existing relationships among risk factors and project outcomes. Anthony
et al. [25] proposed a risk assessment model for decision-making in software
management which consists of processes and components of risk assessment in three
groups: operational risks, technical risks and strategic risks. Rai et al. [26] believed
that managing projects is managing risks and identified 43 risk indicators in Agile
Software Development.
One notable study is Szoke Akos’ PhD dissertation in 2014, which
proposed an optimized algorithm for agile software project scheduling [27].
As can be seen from the literature review, much research on software risk analysis
focuses on finding the relationship between risk factors and software outcomes, but lacks
a quantitative approach and the causal relationships between risk factors [5, 23, 28,
29]. Other studies define a quantitative approach and the
causal relationships between risk factors and assess risks for the whole software
project [30, 31], but do not pay enough attention to modelling risk factors from the
scheduling (planning) phase, the phase that decides the later failure or success of the
project. To quantify uncertainty, Jefferson et al. [32] applied Action Research
to develop a model that takes into account the relationships of dependence and
interdependence that exist between the sources of risks and uncertainties in software
projects. As a result, their work contributes to the practice of risk and uncertainty
management in software projects.
J. Yong and Z. Zhigang [33] proposed a PERT Bayesian Network (PERTBN)
model, together with the modelling methodology and the conditional probability calculation
method for different kinds of procedure arrangement (single-chain, centralized,
distributed), and stated that the PERTBN model ensures the effectiveness of project
schedule control and optimization. However, the research did not
examine in depth the risk factors or other specific software features that
can have impacts on the project schedule.
In addition, there is always a need for proper schedule control in software
projects to determine the instant status of the schedule, to know if the schedule has
changed, and to embrace changes when they occur. In order to do that, influential
factors that cause schedule changes need to be carefully considered.
In summary, current research related to this thesis addresses either risk
management or assessment for the whole software project, or the scheduling of other kinds of
projects (construction, building, R&D etc.). There is a need for a probabilistic
method for risk management in software project scheduling, as well as for a deeper
examination of the risk attributes of software project scheduling.
Research scope
The research is about software projects (or software development projects),
which have both common features and specific features in comparison to other types of
projects (such as construction projects, R&D projects etc.). Unfortunately, there
have been only a few good studies on applying probabilistic methods to
software development projects. Therefore, this research first reviews the literature
on general projects to identify the approaches applied to them, and then
proposes an approach for software projects.
The scope of this research is on risk management in software project
scheduling. This is quantitative risk management, which concerns risks
affecting the project schedule (or project time frame). In terms of project scheduling
techniques, this thesis focuses on the most popular techniques such as CPM and PERT
for traditional software development projects, as well as Agile software project
scheduling.
Research objectives
The main objectives of this research are:
1) To find a quantitative method to better assess and analyse risks in
software project scheduling. In order to achieve this objective, the research has to
answer the following questions: what are the risk attributes of software project
scheduling? How can risks in software project scheduling be managed better?
In other words, the research aims at analyzing and modelling risks in software
project scheduling.
2) To find a probabilistic method to improve well-known software project
scheduling techniques, including both techniques for traditional software scheduling
and agile software scheduling.
The proposed methods and models would enhance the risk management process through
a quantitative assessment of the impact of risks on software project scheduling. If
this model and method are applied in practice, the author of this thesis expects that they
would help to predict and monitor the project schedule better and to make appropriate
decisions.
Scientific and practical significance
The proposed methods and model would enhance the risk management process through a
quantitative assessment of the impact of risks on software project scheduling.
If this model and method are applied in practice, they would help to predict and monitor
the project schedule better and to make appropriate decisions.
Research hypothesis and methodology
The hypothesis of this thesis is that it is possible to use BNs to quantify
uncertainty in software project scheduling and improve software project risk
assessment.
Since there is very limited research on this topic, the research methodology
comprises literature reviews of general project management to obtain relevant
ideas for software project management. Firstly, a literature review investigates
the current state of project scheduling under uncertainty, which determines the need,
scope and objectives of the new approach. Secondly, a literature review follows on
the background, theory and application of BNs. This provides the conceptual and
fundamental background for the new approach.
The research also examines the features of software projects, in both the waterfall
model and the agile software development model. In order to handle risks in software
project scheduling, the common risk factors also need to be examined.
Within the research, tools are built to validate the models and help software
project managers in assessing risks and making appropriate decisions.
Expected results
Following the above methodology, the author expects to:
1) Apply Bayesian Networks to develop an algorithm and tool to assess the
impacts of risks and hence propose common risk factors in software project
scheduling.
2) Apply Bayesian Networks to develop a probabilistic approach to enhance the
common scheduling techniques (for both traditional software development and agile
software development) in terms of risk management and predictability.
Structure of the thesis
An overview of the main chapters is as follows:
Chapter 1 briefly reviews software project scheduling and software project risk
management process and explores the currently popular techniques in project
scheduling.
Chapter 2 consists of initial attempts at applying BNs to risk management in
software project scheduling as well as experiments on common risk factors in
software project scheduling. 19 common risk factors for both traditional software
development projects and agile software projects are proposed.
Chapter 3 incorporates BNs into popular software project scheduling techniques, namely CPM, PERT and agile software scheduling. BNs are also applied in
examining the relationships among risk factors proposed in Chapter 2.
The last section, Conclusion, concludes the thesis and points the way forward for
future research.
The main contributions and results of the research: The research has
developed the algorithm BRI (Bayes Risk-Impact) and the tool CKDY to assess the
impacts of risks and hence propose common risk factors in software project
scheduling. Based on literature review and experiments, the research has come up
with 19 common risk factors in software project scheduling (for both agile
development style and traditional development style).
The research also proposes advanced scheduling methods for software project
development. The methods are based on incorporating Bayesian Networks and common
risk factor models into popular software scheduling techniques such as PERT,
CPM, and Agile software development, with an examination of the model of 19
common risk factors. Tools have been built to experiment with the proposed scheduling
methods and models. Experimental results show that the proposed methods and
models are reliable and provide practical value to software development
teams in analyzing, monitoring and predicting risks and the project’s chance of
success.
Chapter 1. Overview of software project scheduling
and risk management
1.1. Software project management and software project
scheduling
1.1.1. Software project management
Software project management is the art and science of planning and monitoring
software projects. It refers to the branch of project management dedicated to the
planning, scheduling, resource allocation, implementation, tracking and delivery of
software and web projects [34].
There are various types of projects (R&D projects, construction projects,
information system projects, software projects etc.) which are associated with
different styles of management. Software project management is quite distinct from
traditional or other project management. Firstly, software is developed, not
manufactured. Therefore, the product (working software) is intangible and uniquely
flexible. Secondly, software engineering is not recognized as an engineering
discipline with the same status as mechanical or electrical engineering. Moreover,
software projects have a unique lifecycle process that requires multiple rounds of
testing, updating, and customer feedback, and the software development process is not
standardized. Lastly, most software projects are “one-off” projects. A software
development team can only draw on similar experience, not on identical experience or a
repeated process.
Therefore, software project management is about the methodology to organize
all activities related to the software. We always need project management since
software projects always have constraints of budget and time frame.
Nowadays, most IT-related projects are managed in the agile style and software
is developed in groups, in order to keep up with the increasing pace of business and to
iterate based on customer and stakeholder feedback. Besides being used in IT-related
projects, the Agile style has also been increasingly used in other kinds of project
management.
The project manager leads the project team and often plays the central role
among the investors (or customers), the suppliers and the senior management of the
organization. He or she makes sure the project complies with the constraints and
delivers the product (software) on time. Software project managers may have
to do any of the following tasks [34]:
- Planning and scheduling: This means putting together the blueprint for the
entire project from ideation to fruition. It will define the scope, allocate necessary
resources, propose the timeline, delineate the plan for execution, lay out a
communication strategy, and indicate the steps necessary for testing and
maintenance.
- Leading: A software project manager will need to assemble and lead the
project team, which likely will consist of developers, analysts, testers, graphic
designers, and technical writers. This requires excellent communication, people and
leadership skills.
- Execution: The project manager will participate in and supervise the
successful execution of each stage of the project. This includes monitoring progress,
frequent team check-ins and creating status reports.
- Time management: Staying on schedule is crucial to the successful completion
of any project, but it is particularly challenging when it comes to managing software
projects because changes to the original plan are almost certain to occur as the
project evolves. Software project managers must be experts in risk management and
contingency planning to ensure forward progress when roadblocks or changes
occur.
- Budget: Like traditional project managers, software project managers are
tasked with creating a budget for a project, and then sticking to it as closely as
possible, moderating spend and re-allocating funds when necessary.
- Maintenance: Software project management typically encourages constant
product testing in order to discover and fix bugs early, adjust the end product to the
customer’s needs, and keep the project on target. The software project manager is
responsible for ensuring proper and consistent testing, evaluation and fixes are
being made.
Therefore, managers have diverse roles. Since software project management is
normally concerned with activities involved in ensuring that software is delivered
on time, on schedule and in accordance with the requirements of the organizations
developing and procuring the software, managers’ most significant activities are
planning, estimating and scheduling.
According to the Project Management Institute (PMI) in the Project Management Body
of Knowledge (PMBOK) Guide [7], project management includes five stages or
process groups: Initiating, Planning, Executing, Monitoring and Controlling, and
Closing (Figure 1.1).
In modern software project planning, the two essential tasks are project risk
management and project scheduling. They play crucial roles in making sure the
project is effectively and efficiently organized, including resource (hardware,
software, and network) allocation, task and personnel assignment, and monitoring
[7, 10]. Software projects are quite different from other projects: because software
requirements change continuously (during the software development life cycle),
software projects are often behind schedule and over budget. Moreover, in reality,
many software project managers either ignore risk management or do not carry it out
appropriately. This leads to project failure or to customer complaints about the quality, the
schedule or the budget overrun of the project. Some other project managers
are aware of risk management but rely only on their own team’s skills or
experience, even if they follow frameworks such as CMM/CMMI
(Capability Maturity Model Integration) or PMP (Project Management
Professional). As can be seen in Figure 1.1, risk management affects all the
processes in Process Groups. In addition, project teams could adjust or update the
planning process while they are executing, monitoring and controlling their
projects.
Figure 1.1. Activities of project management according to PMBOK Guide.
1.1.2. Software project scheduling
Software project scheduling is one of the most demanding tasks for software
project managers. It is all about resource allocation during the project life cycle. In
simple words, software project scheduling means splitting the whole project into smaller
tasks and estimating the time and resources required to complete each task. Software
development teams normally try to organize tasks concurrently to make optimal use
of the workforce and to minimize task dependencies, avoiding delays caused by
one task waiting for another to complete. In reality, software project scheduling is
dependent on project managers’ intuition and experience.
In real-life software projects, a schedule is represented as a set of activity
diagrams (Work Breakdown Structure, Activity Charts) which clarify the
dependencies between activities (tasks) and the personnel assignment.
1.2. Software project scheduling methods and techniques
1.2.1. Overview
There are many popular techniques for project scheduling, including:
- Graphical representations used to illustrate the project schedule, such as:
+ Work Breakdown Structure: shows the project breakdown into tasks.
+ Activity Charts: show task dependencies and the critical path.
+ Gantt Charts: bar charts that show the schedule against calendar time.
- Critical Path Method – CPM [10, 15, 20].
- Program Evaluation and Review Technique – PERT [12, 13, 15, 35].
Project scheduling (especially under uncertainty) is the most widely studied
area of risk quantification in project management. Producing a reasonable and
reliable project schedule is one of the crucial tasks of project managers. Moreover,
having a realistic schedule for the project is one of the most cited factors of project
success [36]. Several techniques are proposed for modelling risk and uncertainty in
project scheduling [10, 35, 37].
This section reviews some notable techniques. CPM and PERT are the classical
approaches to project scheduling. Simulation-based techniques are a more modern
approach that has been adopted by many project management software tools and that some
argue is the best practice available. Alternative approaches, namely the Critical Chain Method
and Fuzzy logic, will be reviewed briefly. Last but not least, scheduling techniques
and methods for agile software development will also be discussed.
1.2.2. Traditional scheduling methods and techniques
a) Critical Path Method (CPM)
Critical Path Method (CPM) is one of the most famous techniques in project
scheduling. Developed in 1957 by DuPont, CPM has become the standard technique
in project management and most project management tools support CPM
calculation [15]. According to Pollack-Johnson and Liberatore [38], almost 70% of
project managers or professionals use CPM. CPM calculation includes the
following steps:
- Specify the individual activities using a work breakdown structure.
- Determine the sequence of those activities and the dependencies between them.
- Draw a network diagram (that models the activities and their dependencies).
- Estimate the completion time (duration) for each activity.
- Identify the critical path (the longest-duration path through the network).
- Update the CPM diagram as the project progresses.
The basic mathematical notations used for CPM calculation are shown in
Table 1.1. In fact, the parameters D, ES, EF, LF, LS are commonly used in scheduling
techniques.
Table 1.1 Basic mathematical notations used for CPM calculation
No. | Notation | Description | Formula | Note
1 | aj | Activity j | |
2 | Dj | Duration of aj | |
3 | ES | Earliest start of aj | ESj = Max[ESi + Di] | i is one of the predecessor activities
4 | EF | Earliest finish of aj | EFj = ESj + Dj |
5 | LF | Latest finish of aj | LFj = Min[LFk – Dk] | k is one of the successor activities
6 | LS | Latest start of aj | LSj = LFj – Dj |
7 | TF | Total float of aj: the time by which the activity’s duration can be increased without increasing the overall project completion time | TFj = LSj – ESj = LFj – EFj |
A critical activity is one with no float time (TF = 0) and should receive
special attention, since a delay in a critical activity will delay the whole project.
Informally, the critical path is determined by performing forward and backward
passes through the project network. The forward pass computes the earliest start
(ES) and the earliest finish (EF) time for each activity. The backward pass computes
the latest start (LS) and the latest finish (LF) time for each activity. The total float
for each activity is the difference between the latest and earliest finish of that activity
[15]. The connections among these parameters in an activity are described in Figure
1.2.
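The forward and backward passes can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name cpm, the dictionary-based input format and the four-activity example network are hypothetical and are not part of the tools built in this thesis. The project start time is assumed to be 0.

```python
from collections import defaultdict


def cpm(durations, predecessors):
    """Return (ES, EF, LS, LF, TF) dictionaries for an acyclic activity network.

    durations: {activity: duration}
    predecessors: {activity: [predecessor, ...]} (activities without
    predecessors may be omitted).
    """
    successors = defaultdict(list)
    for act, preds in predecessors.items():
        for p in preds:
            successors[p].append(act)

    # Topological order (Kahn's algorithm) so predecessors are handled first.
    indegree = {a: len(predecessors.get(a, [])) for a in durations}
    ready = [a for a in durations if indegree[a] == 0]
    order = []
    while ready:
        a = ready.pop()
        order.append(a)
        for s in successors[a]:
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)

    # Forward pass: ESj = Max[ESi + Di], EFj = ESj + Dj.
    es, ef = {}, {}
    for a in order:
        es[a] = max((ef[p] for p in predecessors.get(a, [])), default=0)
        ef[a] = es[a] + durations[a]
    project_end = max(ef.values())

    # Backward pass: LFj = Min[LFk - Dk], LSj = LFj - Dj.
    ls, lf = {}, {}
    for a in reversed(order):
        lf[a] = min((ls[s] for s in successors[a]), default=project_end)
        ls[a] = lf[a] - durations[a]

    tf = {a: ls[a] - es[a] for a in durations}  # total float
    return es, ef, ls, lf, tf


# Hypothetical example: A precedes B and C, which both precede D.
durations = {"A": 3, "B": 2, "C": 4, "D": 2}
predecessors = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
es, ef, ls, lf, tf = cpm(durations, predecessors)
critical = [a for a in durations if tf[a] == 0]
print(critical)  # ['A', 'C', 'D']
```

In this example the longest path A, C, D takes 9 time units, and activity B has a total float of 2, so it can slip by up to 2 units without delaying the project.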
CPM is therefore a deterministic model which uses a fixed time estimate for each
activity. Although CPM (“pure deterministic in nature” [20]) was not developed to
handle or quantify uncertainty, it does provide very useful information about the
relations between activities, activity times and the overall project schedule (so that
project scheduling can be controlled).
Figure 1.2. CPM parameters in an activity
b) Program Evaluation and Review Technique (PERT)
PERT was introduced in 1957 by the US Navy as one of the earliest techniques
incorporating risk in project management [13, 15]. A special feature of PERT is its
ability to handle uncertainty in activity duration: if the time estimate of an
activity varies, the variation may affect the whole project. The PERT methodology was
developed to help complete the project successfully when the time estimates are not
definitive.
In order to do that, instead of the single estimate used in CPM, PERT assigns a beta
probability distribution to each project activity. Three time estimates (optimistic,
most likely, and pessimistic) are obtained and used to
estimate the expected time and the standard deviation for an activity i.
Optimistic time estimate is the estimate determined considering all favorable
conditions; i.e. in the best-case scenario or when everything goes right. In other
words, this is the shortest time in which the activity may be completed.
Most likely time estimate is the time duration where there is a high probability
of completing the activity within the given time duration. In other words, it is the
estimate in case of normal problems or opportunities.
Pessimistic time estimate is the estimate determined when we consider all
unfavourable conditions; i.e. in the worst case scenario or when everything goes
completely wrong. In other words, this is the longest time the activity might require
to complete.
- Expected time: μi = (Optimistic + 4 × Most likely + Pessimistic)/6
- Standard deviation: σi = (Pessimistic – Optimistic)/6
The critical path is the sequence of project activities that determines the earliest
time by which the project can be completed; its total duration determines the
completion date of the project. PERT assumes that only one path is the critical path
and that the path does not change. Therefore, managers using PERT are advised to
focus on these critical activities to ensure the project completion date remains
unchanged. The expected value of the critical path is the sum of the expected values
of its activities, and the variance of the critical path is the sum of the variances of
all activities in the path. Based on these values, the probability that the project
will be completed by a certain date can be calculated. PERT is therefore somewhat
similar to CPM. The main difference is that each activity in a PERT network has a
variance associated with its completion time. In other words, CPM is deterministic,
while PERT is to some extent probabilistic.
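The PERT calculation described above can be illustrated with a short sketch. The three-point estimates in the example are invented for illustration, and the function names are hypothetical. Treating the critical-path duration as normally distributed is the usual textbook approximation and is an assumption here, as is the independence of the activities on the path.

```python
from math import sqrt
from statistics import NormalDist


def pert_estimate(optimistic, most_likely, pessimistic):
    """Expected time and standard deviation of one activity (PERT formulas)."""
    mean = (optimistic + 4 * most_likely + pessimistic) / 6
    std = (pessimistic - optimistic) / 6
    return mean, std


def path_completion_probability(estimates, deadline):
    """P(critical path finishes by the deadline).

    Assumes independent activities and a normal approximation for the
    sum of their durations.
    """
    stats = [pert_estimate(*e) for e in estimates]
    path_mean = sum(mean for mean, _ in stats)
    path_var = sum(std ** 2 for _, std in stats)
    return NormalDist(path_mean, sqrt(path_var)).cdf(deadline)


# Hypothetical critical path with three activities:
# (optimistic, most likely, pessimistic) in days.
critical_path = [(2, 4, 6), (3, 5, 13), (1, 2, 3)]
print(path_completion_probability(critical_path, deadline=12))
print(path_completion_probability(critical_path, deadline=14))
```

For this example the expected path duration is 12 days, so the probability of finishing within 12 days is one half, and a deadline of 14 days raises it to roughly 86%.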
c) Simulation-based techniques
Monte Carlo Simulation (MCS) was first proposed for project scheduling in the
early 1960s [39]. However, it was not until the 1980s when sufficient computer
power became available that simulation became the dominant technique for
handling risk and uncertainty in projects [40, 41]. In its simplest approach, MCS
uses the project activity diagram.
The duration of each activity is described by its shortest, most likely and longest
durations and also by the shape of the distribution (such as Normal, Beta etc.). The
critical path calculation is then performed many times, each time using random values
drawn from the activities’ distribution functions.
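A minimal sketch of this procedure is given below. It assumes a triangular distribution for every activity (chosen only because it is defined directly by the shortest, most likely and longest durations), a hypothetical four-activity network, and activities listed in topological order; none of these choices comes from the cited tools.

```python
import random
from statistics import fmean


def simulate_project(activities, predecessors, runs=10_000, seed=0):
    """Monte Carlo simulation of the project duration.

    activities: {name: (shortest, most_likely, longest)}, listed in
    topological order (every predecessor appears before its successors).
    predecessors: {name: [predecessor, ...]}.
    Returns the list of simulated project durations.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        finish = {}
        for name, (low, mode, high) in activities.items():
            duration = rng.triangular(low, high, mode)  # sample one duration
            start = max((finish[p] for p in predecessors.get(name, [])),
                        default=0.0)
            finish[name] = start + duration
        totals.append(max(finish.values()))  # project end in this run
    return totals


# Hypothetical network: A precedes B and C, which both precede D.
activities = {"A": (2, 3, 5), "B": (1, 2, 4), "C": (3, 4, 8), "D": (1, 2, 3)}
predecessors = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
totals = simulate_project(activities, predecessors)
print("mean duration:", round(fmean(totals), 2))
print("P(duration <= 11):", sum(t <= 11 for t in totals) / len(totals))
```

Note that the simulated mean exceeds the 9 time units obtained from a deterministic calculation with the most likely durations, because the distributions are skewed to the right and the path through B and C takes the maximum of two random finish times.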
More advanced tools like PertMaster (Oracle Primavera Risk Analysis [42]) use a
simulation-based approach not only for handling uncertainty in duration and cost,
but also for providing a whole risk analysis process. They can link the project
schedule to the risk register and apply simulation-based techniques to carry out
probability impact analyses.
A survey by the Project Management Institute [43] showed that nearly 20% of
project management software packages support Monte Carlo Simulation. Another
survey by Pollack-Johnson and Liberatore in 2003 [44] found that 17% of project
managers used probabilistic analysis and/or simulation within project management
software.
However, simulation has its own drawbacks. One serious methodological flaw
in traditional MCS of project networks is the assumption of statistical independence
for individual activities which share risk factors in common with other activities
[38]. Most available simulation packages assume that the marginal distributions of
uncertainty for individual activities in the project completely define the multivariate
distribution for project schedule. It is intuitively obvious that this assumption is
highly suspect for many projects which involve multiple activities of a similar type
and/or have different activity types, which are influenced by common risk factors.
van Dorp and Duffey in 1999 [45] demonstrated that failure to model such types of
risk dependence during MCS can result in the underestimation of total uncertainty
in project schedule. The most effective way to deal with such statistical dependence
is to use a causal structure that explains it. MCS is not capable of modelling causal
structures.
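The effect of ignoring a shared risk factor can be illustrated with a toy experiment. The sketch below is an assumption-laden example (two activities of 10 days each, a 5-day overrun, and a risk probability of 0.3 are all hypothetical); it is not the model of van Dorp and Duffey [45]. Both variants use the same marginal distribution per activity and differ only in whether the risk is shared.

```python
import random
import statistics

def total_duration(rng, shared):
    """Sum of two activity durations: 10 days each, plus a 5-day overrun when a risk occurs (p = 0.3)."""
    if shared:
        # One common risk factor (e.g. an organizational issue) hits both activities together.
        hit_a = hit_b = rng.random() < 0.3
    else:
        # Same marginal probabilities, but the activities are treated as independent.
        hit_a = rng.random() < 0.3
        hit_b = rng.random() < 0.3
    return (10 + 5 * hit_a) + (10 + 5 * hit_b)

rng = random.Random(7)  # fixed seed so the run is repeatable
independent = [total_duration(rng, shared=False) for _ in range(20_000)]
dependent = [total_duration(rng, shared=True) for _ in range(20_000)]

var_independent = statistics.pvariance(independent)
var_dependent = statistics.pvariance(dependent)
```

The two samples have practically the same mean, but the variance under the shared risk factor is about twice that of the independent case, so a simulation that assumes independence understates the spread of the total duration.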
Another weakness of MCS explained by Williams [46] is the inability of
simulation to capture the actions taken by the managers to recover any slippage in
activity/project duration. MCS simply runs through a network assigning values to
random variables on each iteration. It ignores the fact that in reality if an activity
was running late, management would take actions to affect the activity duration.
Uncertainty in an activity is usually the result of a chain of causes (sources) and can
be affected by a chain of actions (controls).
Furthermore, MCS is only as good as the information that is fed into it. If the
duration distributions of the project activities are incorrect or inadequate, the
simulation results are erroneous and invalid. In reality, the durations of most activities are
estimated subjectively. In order to capture all aspects of uncertainty in activity
(project) duration various known and unknown sources of risk have to be addressed.
Therefore, MCS will not be applied as a scheduling technique in the scope of
this thesis.
d) Fuzzy logic
An alternative approach that has interested several researchers in the past two
decades [47, 48] is Fuzzy project-scheduling. The fuzzy set scheduling literature
recommends the use of imprecision rather than uncertainty, fuzzy numbers rather
than stochastic variables and membership functions rather than probability
distributions. The output of fuzzy scheduling is normally a fuzzy schedule,
which indicates fuzzy starting and ending times for the activities. This may be as
difficult to generate as probability distributions of activity duration, and there is
no generally accepted computational approach available. Therefore, fuzzy
project-scheduling approaches have remained largely in the academic sphere. A summary
of most of the published research works in fuzzy project scheduling can be found in
the work of Bonnal et al. in 2004 [49].
1.2.3. Agile software project scheduling
From the late 1990s, several methodologies such as RUP, XP, FDD and Scrum
began to attract increasing public attention and have become mainstream software
development methods, especially in Vietnam where most software vendors are
small and medium enterprises. These methods are representative of agile software
development.
Agile – denoting “the quality of being agile; readiness for motion; nimbleness,
activity, dexterity in motion” [50] – software development methods are attempting
to offer an answer to the eager business community asking for lighter weight along
with faster and nimbler software development processes. This is especially the case
with the rapidly growing and volatile Internet software industry as well as for the
emerging mobile application environment.
Agile development is a way of organizing the development process,
emphasizing direct and frequent communication – preferably face-to-face, frequent
deliveries of working software increments, short iterations, active customer
engagement throughout the whole development life-cycle and change
responsiveness rather than change avoidance [51]. Thus, agile software
development recognizes that software development is inherently a type of product
development and therefore a learning process. It is iterative, explorative and
designed to facilitate learning as quickly and efficiently as possible. Two of the
most significant characteristics of agile approaches are: 1) they can handle unstable
requirements throughout the development cycle; and 2) they deliver products in
shorter time-frames and under budget constraints when compared with traditional
development methods.
An agile approach can be seen as a contrast to (traditional) waterfall-like
processes [52, 53, 54] which pay attention to thorough and detailed planning and
design upfront and consecutive plan conformance. The waterfall model is the oldest
and the most mature software development model [53]. In practice, the waterfall
development model can be followed in a linear way, and an iteration in an agile
method can also be treated as a miniature waterfall life-cycle.
Agile approaches have been widely employed in domains with a low cost of failure
or a linearly incremental cost of failure [55]. Examples within such domains include
web-based applications, mobile applications [50], Internet commerce, social networking,
games development, and even some areas in government, finance and banking
software development.
Table 1.2 summarizes some of the differences between waterfall and agile
projects.
Table 1.2. The differences between waterfall and agile projects
Criteria | Waterfall | Agile
Product/scope | An often bloated product that is still missing features (i.e., rejected change requests or de-scoped to meet deadlines). | The best possible product according to customers’ own prioritization, incorporating learning from actual use (evolves with the increments).
Schedule/time | Deadlines are usually missed, and it is unlikely for a project to deliver early. | Very high probability of meeting fixed date commitments; can often deliver early with the highest value.
Quality | Defects must be tested extensively and expensively. | Quality is built in, and is the key to productivity (writing tests before writing code).
Return/value creation | Revenue earning and value creation are delayed until the lowest priority features are implemented and delivered. | Value is generated early, as soon as the minimum highest prioritized features are delivered. Greater return on investment.
Relationship to the customer | Contractual | Collaborative
Since agile software development is organized iteratively and incrementally,
agile software scheduling is in effect iteration scheduling. Iteration
scheduling aims at determining a feasible and precise plan that
schedules the implementation of selected features within an iteration (i.e.
assigning tasks to developers). Technical tasks (or Sprint backlog items in Scrum)
are the main concepts of iteration scheduling. Each task is a fundamental
unit of work accomplished by one developer, and requires an effort in working
hours that is estimated by the team. The aim of iteration scheduling
is to break down selected requirements into technical tasks and to assign them to
developers [56]. In that process, the development team also needs to take care of
task dependencies (sequencing) and time constraints. The problem of optimized
agile iteration scheduling will be discussed in detail in Section 3.1.
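To make the notions of tasks, effort estimates, dependencies and assignment concrete, the following is a minimal sketch of a greedy assignment. It is not the optimized scheduling approach of Section 3.1; the backlog, the developer names and the greedy rule (give each ready task to the developer who can start it earliest) are all hypothetical choices for illustration.

```python
# Hypothetical sprint backlog: task -> (estimated hours, prerequisite tasks).
TASKS = {
    "design_api": (4, []),
    "implement_api": (8, ["design_api"]),
    "build_ui": (6, ["design_api"]),
    "write_tests": (5, ["implement_api"]),
}
DEVELOPERS = ["dev1", "dev2"]

def schedule(tasks, developers):
    """Greedy list scheduling: returns a list of (task, developer, start, finish)."""
    free_at = {d: 0 for d in developers}  # hour at which each developer becomes free
    finish = {}                           # task -> finish hour
    plan = []
    remaining = dict(tasks)
    while remaining:
        # Tasks whose prerequisites have all been scheduled already.
        ready = [t for t, (_, deps) in remaining.items()
                 if all(p in finish for p in deps)]
        if not ready:
            raise ValueError("cyclic dependency between tasks")
        task = ready[0]
        hours, deps = remaining.pop(task)
        earliest = max((finish[p] for p in deps), default=0)
        # Pick the developer who can start this task soonest.
        dev = min(developers, key=lambda d: max(free_at[d], earliest))
        start = max(free_at[dev], earliest)
        finish[task] = start + hours
        free_at[dev] = finish[task]
        plan.append((task, dev, start, finish[task]))
    return plan

plan = schedule(TASKS, DEVELOPERS)
iteration_length = max(end for _, _, _, end in plan)
```

A greedy rule of this kind respects sequencing but gives no guarantee of an optimal iteration length, which is why the optimization problem is treated separately.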
1.3. Risk management in software project scheduling
1.3.1. Overview of project risk management
Risk management has become an important part of project management and has
attracted a wide range of research during the last two decades [11]. Since 1990
various Risk Management Processes (RMP) have been proposed. Probably the most
popular Project Risk Management Processes (PRMP) are those in Chapter 11 of the PMBOK
(Project Management Body of Knowledge) guide [7], the PRAM (Project Risk
Analysis and Management) guide [57] and the RAMP (Risk Analysis and
Management for Projects) guide [58]. Most organisations adopt one of these guides
or use them to develop their own process. This thesis does not intend to explore the
detailed differences between different guides since, apart from fundamental
differences in assumptions and methodologies [59], they all aim to capture risk and
uncertainty in the following three stages:
- Risk Identification
- Risk Analysis
- Risk Response
The Risk Identification stage attempts to discover the main sources of risk. This
stage is also known as qualitative risk management. By using various data gathering
techniques (e.g. interviewing, brainstorming, Delphi technique, checklists etc.) from
all parties involved in the projects, the possible risks that might affect the project are
identified.
The usual output of the risk identification stage is a document called the Risk
Register. Many authors have discussed risk registers in their works [60]. Williams
[61] stated two main roles for a risk register:
- A repository of a corpus of knowledge.
- To initiate the analysis and plans that flow from it.
Chapman and Ward [14] consider a risk register as documentation of the sources
of the risks, their responses and also risk classification. Ward [62] described the
purpose of a risk register as being “to help the project team review project risk on a regular
basis throughout the project”. Patterson and Neailey [63] presented a risk register
database system to aid managing project risk. Risk registers can be a good
management tool during the course of a project. However, it is not possible to
identify all risks and capture all aspects of them. There are always unknown (i.e.
undiscovered, unattended or immeasurable) risks that often are more important than
the identified risks in the risk register.
The Risk Analysis stage attempts to measure the risk and its impacts on different
project outputs (i.e. cost, time, and performance). This stage is also known as
quantitative risk management. The likelihood that each identified risk will occur
and also its possible impact on the project is estimated. The combination of the
risks, their probabilities and their impacts creates a ‘probability-impact’ (PI) matrix. This
matrix can be used to assign ranks to risks and then prioritise them. Most of the
available quantitative tools and techniques (simulation-based tools) use
PI values to quantify uncertainty in projects. However, the use of PI matrices has some
important shortcomings [11].
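A minimal sketch of PI-based ranking is shown below. It assumes the common convention of scoring each risk as probability multiplied by impact; the risk names, probabilities and impacts are hypothetical and are not drawn from any risk register in this thesis.

```python
# Hypothetical risk register entries: (name, probability of occurrence, impact in days of delay).
risks = [
    ("Requirements change", 0.6, 10),
    ("Key developer leaves", 0.1, 30),
    ("Third-party API undocumented", 0.3, 8),
]

def exposure(probability, impact):
    """Probability-impact score: the expected delay contributed by one risk."""
    return probability * impact

# Rank risks from highest to lowest exposure.
ranked = sorted(risks, key=lambda r: exposure(r[1], r[2]), reverse=True)
ranked_names = [name for name, _, _ in ranked]
```

The score treats every risk as an isolated event with a known probability and impact, which is one reason such matrices cannot express shared causes or unknown risks.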
The Risk Response stage attempts to formulate management responses to the
risk. Also known as “Risk Mitigation”, it uses the results of the analysis stage in
order to improve the chance of achieving the project objectives. “Risk Response” is
a decision making process. A number of alternative strategies are available when
planning risk responses, each of which falls under one of the following
categories [64]:
- Avoid - seeking to eliminate uncertainty by reducing either the probability or
the impact to zero.
- Transfer – seeking to transfer ownership and/or liability to a third party (e.g.
insurance).
- Mitigate – seeking to reduce the size of the risk exposure in order to make it
more acceptable to the project or organization.
- Accept – recognizing residual risks and responding either actively by
allocating appropriate contingency, or passively doing nothing except monitoring
the status of the risk.
There are several other publications with different perceptions of project risk
management processes. For example, Al-Bahar and Crandall [65], the UK Ministry
of Defence [66], del Caño and de la Cruz [67], Wideman [68], the British Standards
Institution (BSI) [69], NASA (Rosenberg et al. 1999) [70], the U.S. Department of
Defense [71], and the U.S. Department of Transportation [72] suggest the use of
processes with different stages or phases. Whichever risk management process is
adopted for managing risk/uncertainty, risk analysis always plays an important role
in the process.
1.3.2. Project risk analysis
The term risk analysis in the scope of this research is synonymous with quantitative
risk analysis and relates to risk measurement, as we focus on quantitative issues of
project risks. Project risk analysis is one stage of project risk management. In some
literature, risk analysis is even synonymous with risk management.
In fact, risk analysis usually starts with a qualitative analysis, and its
results support the decision making process in the Risk Response stage. It is a
continuous process that can be started at almost any stage in the life of a
project. However, it is best to apply risk analysis in the early stages of
projects (i.e. phases such as feasibility study and planning) and to update it continually
during the implementation phase. This can be done iteratively at intervals, which
also matches agile software development.
Risk analysis is the most “formal” aspect of the project risk management
process [64], often involving sophisticated techniques and usually requiring
computer software (or tools). Such techniques may be applied with various levels of
effort depending on the available resources for the analysis and on the level of detail required.
Risk analysis can bring certain benefits to software projects, including:
- Helping to make decisions and enabling more effective and efficient
risk management.
- Helping to make more feasible (realistic) plans, in terms of both duration and
costs.
- Helping to build statistical data on historical risks, which in turn benefits
the planning and implementation of future projects.
1.3.3. Unknown risks
One important category of uncertainty in projects is “Unknown Risks”. These
are important sources of uncertainty because their impact on a project may
outweigh all other sources of risks.
Although unknown risks are thoroughly acknowledged (perhaps with different
names) by several authors, none of the existing approaches for project scheduling is
able to model and quantify this type of risk. The conventional “probability impact”
approach at best is only capable of modelling “known risk”. Most of the current
quantitative techniques for risk analysis are event-oriented and more concerned
about ‘risk of something happening’. They assume that a list of events (conditions)
that may take place is known, the impact of each risk on activity duration is also
known and even the nature of the response to each risk is roughly known [15].
However, unknown risks are unpredictable and immeasurable (their impacts are
unknown or hard to quantify). Such risks require much effort to clarify. An
example of unknown risks is Internally Generated Risk (IGR) [73]. As the name
suggests, IGRs originate from within the project team or organization, from
rules, policies, regulations, structures, actions, behaviours or culture of the
organization. IGRs have the following features:
- Common, since organizational issues such as policies, processes, culture etc.
are widespread in most projects of the organization.
- Important, since they often have impact on more than one activity.
- Not well-managed in projects, as they are unpredictable (and rarely recorded in
documents or risk registers) and hard to quantify.
1.3.4. Risk aspects in software project scheduling
In different project management processes there are different aspects of
uncertainty/risk [20]. This thesis focuses on quantitative risk management, which
is concerned with risks affecting the project schedule (or project time frame), including
risks affecting project scheduling (a phase or a process in project planning). As can
be deduced from the previous sections, these risks cannot be completely separated
from risks of other processes or phases.
In project scheduling, the most obvious risk is in duration estimation for a
particular activity. Difficulty in this estimation can arise from a lack of knowledge
of what is involved as well as from the uncertain consequences of potential threats
or opportunities. Some sources of this uncertainty are:
- Level of available and required resources (including inexperienced or insufficiently
trained developers).
- Incomplete (or often changing) requirements.
- Tradeoff between resources and time.
- Possible occurrence of uncertain events (especially those with a bad impact, i.e.
risks).
- Challenges from technology (incompatible technology, built-in API without
sufficient documentation, insufficient architecture etc.).
- Causal factors and interdependencies including common causal factors that
affect more than one activity (such as organizational issues).
- Lack of previous experience and use of subjective instead of objective data.
- Incomplete or imprecise data, or lack of data.
- Uncertainty about the basis of subjective estimation (i.e. bias in estimation).
1.4. Bayesian Networks
1.4.1. Bayesian approach vs classical approach
The fields of statistics and data analysis are concerned with inferring the
probability of an uncertain event. The differences between the classical (also called
Frequentist) approach and the Bayesian approach are summarised in Table 1.3.
Table 1.3. The differences between Bayesian and Frequentist approaches
Criteria | Bayesian | Frequentist
Parameters/Variables | Uncertain | Random
Probability | Degree of belief (subjective) | Physical property (objective)
Inference | Bayes’ theorem | Confidence interval
Judgement | Depends on the person’s (subjective) opinions or beliefs | A fact, independent of the analyst’s opinions or beliefs
Samples/Observations | Any number of samples or observations | Large enough number of samples or observations
The fact about risks is that most uncertain events do not have much historical
data associated with them. The analyst does not have much data, although he or she
may hold a certain opinion or belief (prior probability). Moreover, even where
relevant historical data does exist, it must still usually be informed by subjective
judgements before it can be used for measuring uncertainty. In addition, the amount
of real-life software project data collected (samples/observations) may also be
limited. Therefore, we cannot rely on the classical approach to measure uncertainty,
and the Bayesian approach is the most suitable for risk analysis in software projects.
The Bayesian approach can also provide a rational way of revising our beliefs in
the light of new information (i.e. evidence) which will be explained in the next
section.
1.4.2. Probabilistic approach using Bayesian Networks
A Bayesian Network (BN, also known as a Bayesian Belief Network, Causal
Probabilistic Network, Probabilistic Cause-Effect Model, or Probabilistic
Influence Diagram) is a special type of graph that is associated with a set of
probability tables. A BN models the causal relationships of a system or dataset and
provides a graphical representation of this causal structure through the use of
a directed acyclic graph (DAG) with nodes and edges. The DAG representation
provides a framework for inference and prediction. The nodes represent random
variables with probability distributions, while the edges represent causal
relationships between the nodes, quantified by the probability tables. Each node
takes one of a finite set of mutually exclusive states, each with a certain
probability. A directed edge exists from a parent
to a child. Each child node A has a conditional probability table P(A|B1,…,Bn)
based on the values of its parents B1,…,Bn. If the node has no parents, then the table
reduces to the unconditional probabilities P(A) (i.e. prior probability).
BN is based on Bayes’ Theorem. The starting point is the well-known definition of
conditional probability in terms of the joint probability:
$$P(R|S) = \frac{P(R,S)}{P(S)} \qquad (1.1)$$
From this, the basic form of Bayes’ rule follows [23]:
$$P(R|S) = \frac{P(S|R)\,P(R)}{P(S)} \qquad (1.2)$$
The above Bayes’ rule is interpreted in terms of updating the belief about a
hypothesis R in the light of new evidence S (the updated belief is the posterior
probability of each possible state of a variable, that is, the state probabilities after
considering all the available evidence). So, the posterior belief P(R|S) is calculated
by multiplying the prior belief P(R) by the likelihood P(S|R) that S will occur if R is
true, and normalizing by P(S) (see more about
updating probability in Section 1.4.3).
We can re-arrange the formula for conditional probability to get the following
formula in the form of the product rule:
$$P(R,S) = P(R|S)\,P(S) \qquad (1.3)$$
We can extend the above product rule to three variables:
$$P(A,B,C) = P(A|B,C)\,P(B,C) = P(A|B,C)\,P(B|C)\,P(C) \qquad (1.4)$$
The generalization to n variables follows:
$$P(A_1,A_2,\dots,A_n) = P(A_1|A_2,\dots,A_n)\,P(A_2|A_3,\dots,A_n)\cdots P(A_{n-1}|A_n)\,P(A_n) \qquad (1.5)$$
Formulas 1.4 and 1.5 are often referred to as the “Chain Rule”. In a
BN, the full joint probability distribution is the product of all conditional
probabilities specified in the BN. These formulas are important for
BNs since they provide a means of calculating the full joint probability distribution
[5]. Many of the variables Ai will be conditionally independent, which means
that each factor only needs to be conditioned on the parents of the corresponding
node, so the formula can be simplified considerably.
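As a brief illustration of this simplification (a sketch under stated assumptions, not a formula from the cited sources), consider three variables A, B and C in which B and C have no parents and are both parents of A. This structure is hypothetical. Because B does not depend on C, the factor P(B|C) in Formula 1.4 reduces to P(B):

```latex
P(A,B,C) = P(A|B,C)\,P(B|C)\,P(C) = P(A|B,C)\,P(B)\,P(C)
```

The right-hand side contains exactly one table per node: two prior tables and one conditional probability table.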
A BN allows probability distributions to be assigned to individual
nodes. The initial probability distributions can simply be based on “expert
opinions”, surveys or other mathematical methods, i.e., the BN approach combines
expert opinions and mathematical calculations.
A BN consists of two parts: 1) the qualitative part represents the relationships
among variables by a directed acyclic graph, and 2) the quantitative part specifies the
probability distributions associated with every node of the model. Figure 1.3
shows a BN representing a simple case about the relationship between sub-contract,
(team) staff quality and the possibility of delay in a task [20].
In the BN in Figure 1.3, the qualitative part consists of three nodes (representing
uncertain variables) and two edges. Each node has a set of states. For example, the
node Staff Quality has two states: “Good” and “Poor”. The other part of the directed
graph – the edges – represents influential relationships between variables. For
instance, an observed event on Sub-contract and/or Staff Quality may lead to Delay
in Task.
For the quantitative part, there is a probability table associated with each node,
providing the probabilities of each state of the variable. For nodes without parents
(i.e., prior nodes), the associated tables are not conditioned on other variables and
are called prior probabilities or prior distributions that represent prior belief. For
example, for the node Staff Quality, P(“Good”) = 0.7 and P(“Poor”) = 0.3. For a
node with parents, the probability table has conditional probabilities for each
combination of the parents’ states (for example, see the table for the node Delay in
Task in Figure 1.3).
Figure 1.3. An example of BN which represents a simple case
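The two parts of such a network can be sketched in code as follows. This is an illustrative sketch only: the prior P(“Good”) = 0.7 is taken from the text and the on-time prior of 0.95 from the sub-contractor example in Section 1.4.3, but the four conditional probabilities for Delay in Task are hypothetical values chosen for the example and are not the values in Figure 1.3. All identifiers are likewise hypothetical.

```python
from itertools import product

# Prior tables for the two parentless nodes.
P_SUBCONTRACT = {"on_time": 0.95, "late": 0.05}
P_STAFF = {"good": 0.7, "poor": 0.3}

# Hypothetical conditional probability table P(Delay = yes | Sub-contract, Staff Quality).
# These four numbers are illustrative assumptions, not the values in Figure 1.3.
P_DELAY_YES = {
    ("on_time", "good"): 0.05,
    ("on_time", "poor"): 0.30,
    ("late", "good"): 0.70,
    ("late", "poor"): 0.95,
}

def joint(sub, staff, delay):
    """Chain rule for this DAG: P(sub) * P(staff) * P(delay | sub, staff)."""
    p_yes = P_DELAY_YES[(sub, staff)]
    p_delay = p_yes if delay == "yes" else 1.0 - p_yes
    return P_SUBCONTRACT[sub] * P_STAFF[staff] * p_delay

# Marginal probability of a delay: sum the joint over both parents.
p_delay = sum(joint(s, q, "yes") for s, q in product(P_SUBCONTRACT, P_STAFF))

# Diagnostic reasoning: probability that staff quality is poor given an observed delay.
p_poor_given_delay = sum(joint(s, "poor", "yes") for s in P_SUBCONTRACT) / p_delay
```

The dictionary keys encode the qualitative part (which nodes are parents of Delay in Task), while the numbers encode the quantitative part.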
1.4.3. Bayesian Inference
Bayesian inference is based on a conceptually simple collection of ideas. We
are uncertain about the value of a parameter. We can quantify our uncertainty
as subjective probabilities for the parameter (prior probability), and also conditional
probabilities for observations we might make given the true value of the parameter
(likelihood function). When data arrives, Bayes’ theorem tells us how to move from
our prior probabilities to the new conditional probabilities for the parameter
(posterior distribution) [74]. For example, in Figure 1.3, a project manager is
analyzing the cause of delay in a particular task in a project. A part of the task is
done by a sub-contractor. Based on previous experience and the good reputation of
the sub-contractor, the project manager believes that the chance of delivering the
sub-contract on time is 95 percent. There is an 80 percent chance of delay in the
task if the sub-contractor fails to deliver on time. Even if the sub-contractor delivers
on time, there is still a 10 percent chance that the task overruns its schedule (as a result of
other internal reasons). If the task is actually late, what is the probability that the
sub-contractor had failed to deliver on time?
Before knowing about this particular task, subjective estimation (e.g. the
sub-contractor’s reputation) reflects the prior probability of having the sub-contract
delivered on time (SC):
$$P(SC) = 0.95, \text{ therefore } P(\overline{SC}) = 0.05.$$
The likelihood function is the conditional probability of delay in the task
given the actual state of sub-contract delivery:
$$P(Delay|SC) = 0.1, \text{ hence } P(\overline{Delay}|SC) = 0.9.$$
$$P(Delay|\overline{SC}) = 0.8, \text{ hence } P(\overline{Delay}|\overline{SC}) = 0.2.$$
Using Bayes’ rule (Formula 1.2) to update the probability, the posterior
probability, or the chance of sub-contract being delivered on time given the task is
late, is:
$$P(SC|Delay) = \frac{P(Delay|SC)\,P(SC)}{P(Delay|SC)\,P(SC) + P(Delay|\overline{SC})\,P(\overline{SC})} = \frac{0.1 \times 0.95}{0.1 \times 0.95 + 0.8 \times 0.05} \approx 0.70.$$
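So the prior probability of 95 percent is revised to 70 percent as a result of the
evidence of a delay in the task. Equivalently, the probability that the sub-contractor
failed to deliver on time rises from the prior 5 percent to about 30 percent, which
answers the question posed above.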
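The same update can be reproduced numerically. The sketch below applies Bayes’ rule (Formula 1.2) to the figures of this example; the function and variable names are hypothetical.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a binary hypothesis H and evidence E: returns P(H | E)."""
    evidence = likelihood_if_true * prior + likelihood_if_false * (1.0 - prior)
    return likelihood_if_true * prior / evidence

p_sc = 0.95                  # P(SC): sub-contract delivered on time
p_delay_given_sc = 0.1       # P(Delay | SC)
p_delay_given_not_sc = 0.8   # P(Delay | not SC)

p_sc_given_delay = posterior(p_sc, p_delay_given_sc, p_delay_given_not_sc)
p_not_sc_given_delay = 1.0 - p_sc_given_delay
```

Here the denominator is P(Delay) expanded over the two states of SC, exactly as in the worked fraction above.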
Bayesian inference works well when there are only two variables
involved. It becomes much more complex when several variables with several
states are involved and a complex set of conditional dependencies exists between
them. BNs are built to overcome this problem.
1.4.4. Bayesian Networks and project risk management
BNs are a rigorous, normative method for modelling uncertainty and causality
which are already used for risk assessment in domains such as medicine and
finance, as well as critical systems generally [75]. Therefore, BNs are highly
suitable in the area of project risk analysis, with the following key benefits [15]:
- BNs provide a rigorous method to make formal use of subjective information.
BNs provide a visual and formal mechanism for observing and testing subjective
probabilities. This is a particularly attractive feature in project risk analysis, as in
most cases the only practical choice is the use of subjective judgments.
- BNs explicitly quantify uncertainty. Their causal framework provides a useful
and unambiguous approach for analyzing risk. This is in stark contrast with the
probability impact approach (as discussed in Section 1.3.1) where none of the
concepts has a clear unambiguous interpretation.
- Parameter learning. The probabilistic inference capability of BNs leads to
updating the posterior probability distribution in the light of observed values (i.e.
evidence). In particular, this offers a mechanism for updating the belief about unknown
factors, which are very difficult to measure and were assessed subjectively before
(see Section 1.4.2).
- Complex sensitivity analysis. BNs are capable of reasoning from effect to
cause as well as cause to effect. This can answer a wide range of ‘what-if?’
questions and offer a complex sensitivity analysis when several variables change
simultaneously.
- Make predictions with incomplete data.
BNs provide an ideal approach for modelling uncertainty in projects; however,
they are rarely used in project risk analysis. The first efforts to apply BNs in project
scheduling were conducted by McCabe [76] and Nasir et al. [77]. They developed a
BN to model the relationship between major risk variables that affect duration of
activities in a construction project. They identified ten risk categories specific to
building construction schedules (e.g. environment, geotechnical, owner, labor,
design, area, contractor, political, non-labor resources and material). Detailed risk
variables (in total 70 risks) in each category were identified. Eight activity groups
were identified to represent all types of activities in a construction project (e.g.
mobilization, demobilization, foundation/piling, labor intensive, equipment
intensive, technical/electrical, roof/external, demolition, and commissioning). In the
next step, by reviewing the literature and conducting a comprehensive expert
survey, the relationships between different risks and different activity types were
identified and subsequently quantified. For each activity group the output of the
model suggested a percent increase or decrease from the most likely duration to
define the pessimistic and optimistic durations. The most likely duration of
activities is assumed to be known and is used as a reference point. The result of the
BN model (in the form of upper and lower limits of activity durations) was
exported to an MCS model to incorporate the effect of risks on the project schedule.
The BN model provided a very flexible modelling environment. It was
validated with historical data from 17 case studies with very good results. However,
the model had the following limitations:
- The model was specific to building construction projects; therefore, it
cannot be applied to other industries and different types of projects.
- The BN model predicted the upper and lower bounds of activity duration as
percentage of the most likely duration. It assumes that the most likely duration is
already known and takes it as an input to the model.
- The output of the model (the upper and lower limits of activity durations)
needs another approach (i.e. MCS) to calculate decision making results such as the
expected project duration, the probability of delay/completion etc.
- The upper and lower bounds of activity duration were restricted to a few
pre-defined values. For example, on the pessimistic side the percent increase of
activity duration is limited to 10, 25, 50 and 100%.
- All the risk variables were binary types. Variables with more than two
states could not be modelled properly.
- The final BN model was overly complex. The graphical structure was
unorganized and difficult to follow and understand.
- Although it provided good predictive results, the most powerful feature of
BNs namely diagnostic analysis (e.g. reasoning from effect to cause, learning and
“what if?” type analysis) was not used.
Since many techniques of engineering project management are equally
applicable to software project management and technically complex engineering
systems tend to suffer from the same problems as software systems, in this thesis,
the author develops a BN approach to model and quantify risks/uncertainty in
software project scheduling.
1.5. Chapter remarks
This chapter has overviewed fundamental background on software project
management, software project scheduling, and risk management. The probabilistic
approach using BNs was also introduced, including Bayesian inference and how to
build up BNs. Bayesian features and BNs’ benefits make them the most suitable
approach for managing risks in software project scheduling.
As mentioned in the Introduction section, research on applying BNs to risk
management in software project scheduling is still limited.
Therefore, in addition to reviewing the literature on project management, project
scheduling and risk management to gather knowledge relevant to risk management in
software project scheduling, this thesis also considers risk factors
in software project scheduling as attributes specific to software projects.
This background will be used for the proposed approaches and experiments in
Chapter 2 and Chapter 3. Chapter 2 will present initial attempts at applying BNs
to risk management in software project scheduling as well as experiments on
common risk factors and their impacts on software project scheduling. Nineteen common
risk factors for both traditional software development projects and agile software
projects are proposed.
Chapter 3, in turn, will incorporate BNs into popular software project
scheduling techniques, namely CPM, PERT and agile software scheduling to
enhance the predictability of schedules using those techniques. BNs are also applied
in examining the relationships among risk factors proposed in Chapter 2.
Chapter 2. Common risk factors and experiments on
Bayesian Networks and software project scheduling
This chapter presents the author’s work on finding a quantitative method to
better assess and analyse risks in software project scheduling (to achieve the first
objective mentioned in the Introduction section). In order to achieve this objective,
the chapter has to answer the following questions: what are the attributes of risks in
software project scheduling? How can risks in software project scheduling be managed
better? The chapter covers building tools and deriving common risk factors in
software project scheduling. Experiments are carried out to test the tools and the
model of applying BNs and common risk factors.
As mentioned above, risk management has become crucial in software project
management since software development always involves uncertainty. The first
section of this chapter aims to provide a mathematical model and to show that
software teams can rely on the model to predict and quantify uncertainty and its
impact on the success of the project, right from the early phases of the project.
From the model, an algorithm and a tool can be developed to help software teams
understand and evaluate possible risks. Based on the calculation, the team can
make appropriate decisions and take actions to mitigate risks, and the project
manager can better keep track of the project budget and schedule. The author
proposes the BRI algorithm to calculate risks and impacts in a BN model. A
software tool has also been built for experiments on the proposed model and
algorithm.
The second section of this chapter examines a model and a probabilistic tool
CKDY using BNs to evaluate risk factors in software project scheduling.
2.1. Application of Bayesian Networks into schedule risk
management in software project
This section presents the work published in publication 2 [PUB2]. The goal of
this section is to introduce a mathematical model and an algorithm (BRI) that
assess the values critical to a project by calculating the associated risks and their
probabilities of occurrence, each with a weight factor, to derive their impact.
Experiments are carried out to show that software development teams can rely on
the model and the algorithm to predict and calculate the risks and their impacts on
the success of the project.
2.1.1. Common risk factors in software project management
In 2000, researchers at Arizona State University at Tempe conducted a study to
develop a model that can be used to assess the potential impacts of software risk
factors on a software development project. They came up with a model consisting
of 24 common risk factors in software projects [78].
Hui and Liu [5] built software to calculate the impact of these 24 risk factors
on the chance of success of software projects. Based on the software and the model,
they surveyed 29 IT specialists who had 5 to 25 years of experience in the IT
industry. Each specialist was interviewed and asked to refine the model by
adjusting the associated probabilities and weights. After collecting the survey
results, the research proposed the list of 24 risk factors together with their
associated probabilities of occurrence, as shown in Table 2.1.
It can be seen from Table 2.1 that, although the 24 risk factors apply to
software projects in general, they have a direct impact on project duration or
schedules. Therefore, the list can be used as the starting point for assessing risk
factors in software project scheduling.
Table 2.1. Hui and Liu’s common risk factors [9]
| No. | Group of Issues | Risk factor | Probability |
|-----|-----------------|-------------|-------------|
| 1 | Resources | Staff experience shortage | 0.30 |
| 2 | Resources | Reliance on few key person | 0.75 |
| 3 | Resources | Schedule pressure | 0.70 |
| 4 | Personnel | Low productivity | 0.22 |
| 5 | Personnel | Lack of staff commitment | 0.20 |
| 6 | Customer | Lack of client support | 0.35 |
| 7 | Customer | Lack of contact person competence | 0.15 |
| 8 | Research data | Lack of quantitative historical data | 0.50 |
| 9 | Research data | Inaccurate cost estimating | 0.50 |
| 10 | System | Large and complex external interface | 0.40 |
| 11 | System | Large and complex project | 0.45 |
| 12 | System | Unnecessary features | 0.30 |
| 13 | System | Creeping user requirement | 0.75 |
| 14 | System | Unreliable subproject delivery | 0.45 |
| 15 | Management | Incapable project management | 0.58 |
| 16 | Management | Lack of senior management commitment | 0.50 |
| 17 | Management | Lack of organization maturity | 0.25 |
| 18 | Technology | Immature technology | 0.46 |
| 19 | Technology | Inadequate configuration control | 0.45 |
| 20 | Technology | Excessive paperwork | 0.3 |
| 21 | Technology | Inaccurate metrics | 0.5 |
| 22 | Technology | Excessive reliance on a single process | 0.5 |
| 23 | Experience | Lack of experience with project environment | 0.625 |
| 24 | Experience | Lack of experience with project software | 0.42 |
2.1.2. Bayesian Networks of risk factors
From the list of 24 software risk factors above (Section 2.1.1), we have built
sub BNs for the 24 risk factors (Figures 2.1 to 2.9 show a selection; the remainder
are given in the Appendix) and an overall BN (Figure 2.10) for risk modelling in
software projects. The BNs also show the risk factors and their impacts and effects
at three weight levels (+ means level ONE, ++ means level TWO, and +++ means
level THREE; + is lighter than ++, and ++ is lighter than +++), which are used in
the calculation in Section 2.1.3.
For example, in Figure 2.1 the risk factor “Staff experience shortage” has a
level-ONE impact weight on staff_training and a level-ONE impact weight on
untrained_staff; staff_training in turn has a level-ONE impact weight on
project_schedule. The project_schedule effect is also related to the risk factor
“Low productivity” (Figure 2.2), the risk factor “Lack of senior management
commitment” (Figure 2.6) and the risk factor “Inadequate configuration control”
(Figure 2.7). It is also related to the risk factor “Schedule pressure”.
[Figure 2.1 node labels: staff_experience_shortage, +staff_training, +untrained_staff, +project_schedule]
Figure 2.1. A sub BN for the risk factor “Staff experience shortage”
As can be seen from Figure 2.3, the risk factor “Lack of client support” is
related to the risk factor “Creeping user requirements” and has a potential
impact on the software project schedule.
Figure 2.2. A sub BN for the risk factor “Low productivity”
[Figure 2.3 node labels: lack_of_client_support, +defect_rate, ++lack_of_client_input, +lack_of_staff_commitment, ++missed_requirement, +creeping_user_requirements]
Figure 2.3. A sub BN for the risk factor “Lack of client support”
[Figure 2.4 node labels: inaccurate_cost_estimating, +staff_experience_shortage, ++schedule_pressure]
Figure 2.4. A sub BN for the risk factor “Inaccurate cost estimating”
Figure 2.4 is another example of a sub BN of risk factors affecting the
software project schedule. This sub BN relates to risk factor number 17, “Lack of
organization maturity”, risk factor number 9, “Inaccurate cost estimating”, and
eventually risk factor number 3, “Schedule pressure”, in Table 2.1. The risk factors
“Inaccurate cost estimating” and “Schedule pressure” are also related to the risk
factor “Inaccurate metrics” (Figure 2.8) and the risk factor “Excessive reliance on
a single process” (Figure 2.9).
Figure 2.5 shows the sub BN of the risk factor “Incapable project management”,
which is reported in the literature to have a high level of impact on the project
schedule. The experiments in this thesis also confirm that. The risk factor relates
to “Lack of senior management commitment” (which is also a common risk factor
in all the lists examined in this thesis) and “Creeping user requirement”. According
to Hui and Liu [5], these three risk factors all have a high probability of affecting
software projects.
Figure 2.5. A sub BN for the risk factor “Incapable project management”
[Figure 2.6 node labels: lack_of_senior_management_commitment, +staff_experience_shortage, +low_moral, +schedule_pressure, ++project_schedule]
Figure 2.6. A sub BN for the risk factor “Lack of senior management commitment”
[Figure 2.7 node labels: inadequate_configuration_control, +rework, +productivity, ++defect_rate, +manual_efforts, +project_schedule]
Figure 2.7. A sub BN for the risk factor “Inadequate configuration control”
[Figure 2.8 node labels: inaccurate_metrics, +schedule_pressure, ++inaccurate_reporting, ++inaccurate_cost_estimating]
Figure 2.8. A sub BN for the risk factor “Inaccurate metrics”
[Figure 2.9 node labels: excessive_reliance_on_a_single_process_improvement, +inaccurate_cost_estimating, +schedule_pressure, +defect_rate]
Figure 2.9. A sub BN for risk factor “Excessive reliance on a single process improvement”
For further details on the sub BNs, please see the Appendix, “Sub Bayesian
Networks of the 24 risk factors”.
Figure 2.10 shows the overall BN model for software projects.
Figure 2.10. The overall BN for software risk factors
2.1.3. Risk impact calculation
As discussed in Section 1.4.2, BNs allow us to associate probability distribution
with each individual node. The initial probability distributions can be based on
expert opinions, surveys, or mathematical methods. The derived probabilities can be
calculated by Bayes rule, chain rules (as mentioned in Section 1.4.2) and Bayesian
inference (as described in Section 1.4.3).
We apply the following characteristics of BNs to calculate the impacts of risks
in software projects [37]:
- Expression of expert opinions, experiences or beliefs about the dependencies
between different factors.
- Consistent propagation of the impact of uncertain evidence on the
probabilities of outcomes.
- Calculation and revised calculation of probability when the evidence is
known.
Figure 2.11 illustrates how the above characteristics are applied. The figure
shows that the events x, y and z depend on each other, and that x is independent of
z given y.
Figure 2.11. A simple example of Bayesian inference
- Expert opinions, experiences, beliefs: z impacts y, and y impacts x.
- Propagation of the impact of evidence: suppose that the probability that z
happens is P(z) = 0.9, the conditional probability of y given that z happens is
P(y|z) = 0.7, and the conditional probability of x given that both y and z happen is
P(x|y,z) = 0.6. Then, by applying the chain rule, we can calculate the probability
P(x).
- First we calculate P(y):
$$P(y) = \sum_i P(y, z_i) = \sum_i P(y \mid z_i)\, P(z_i)$$
Since:
$$\sum_i P(y, z_i) = P(y, z) + P(y, \bar{z})$$
Therefore:
$$P(y) = P(y \mid z)\, P(z) + P(y \mid \bar{z})\, P(\bar{z})$$
Assume $P(y \mid \bar{z}) = 0.5$, then:
P(y) = 0.7 × 0.9 + 0.5 × 0.1 = 0.68
- Now we can calculate P(x). Because x is independent of z given y, conditioning
on z in addition to y does not change the probability of x, so:
$$P(x) = \sum_i P(x, y_i) = \sum_i P(x \mid y_i, z)\, P(y_i)$$
Since:
$$\sum_i P(x, y_i) = P(x, y) + P(x, \bar{y})$$
Therefore:
$$P(x) = P(x \mid y, z)\, P(y) + P(x \mid \bar{y}, z)\, P(\bar{y})$$
Assume $P(x \mid \bar{y}, z) = 0.5$, then:
P(x) = 0.6 × 0.68 + 0.5 × 0.32 = 0.568.
Given Figure 2.11, the formula to calculate P(x|z) is:
$$P(x \mid z) = \sum_{y_i} P(x \mid z, y_i)\, P(y_i \mid z) = \sum_{y_i} P(x \mid y_i)\, P(y_i \mid z) \qquad (2.1)$$
An important property of Bayes’ theorem is that the probability of a parent node
can be revised once the child is known to be true. Recall formula 1.2:
P(x|y) = P(y|x)*P(x)/P(y).
- Revised probability of y being true:
P(y|x) = (P(x|y)*P(y))/P(x)
= 0.6 * 0.68 / 0.568
= 0.7183.
- Revised probability of z being true:
P(z|y) = (P(y|z)*P(z))/P(y)
= 0.7 * 0.9 / 0.68
= 0.9265.
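The arithmetic of this worked example can be checked with a few lines of Python. This is only a minimal sketch of the chain in Figure 2.11, not the thesis tool; the function names are illustrative, and the two complement conditionals (0.5) are the values assumed in the text above.

```python
# Worked example for the chain z -> y -> x (Figure 2.11).
# The complement conditionals P(y | not z) and P(x | not y) are the 0.5
# values assumed in the text; function names are illustrative only.

def total_probability(p_child_given_parent, p_child_given_not_parent, p_parent):
    """Law of total probability for a binary parent node."""
    return (p_child_given_parent * p_parent
            + p_child_given_not_parent * (1.0 - p_parent))


def bayes_posterior(p_child_given_parent, p_parent, p_child):
    """Bayes' rule: revised P(parent | child is true)."""
    return p_child_given_parent * p_parent / p_child


p_z = 0.9
p_y = total_probability(0.7, 0.5, p_z)        # P(y)
p_x = total_probability(0.6, 0.5, p_y)        # P(x)
p_y_given_x = bayes_posterior(0.6, p_y, p_x)  # revised P(y | x)
p_z_given_y = bayes_posterior(0.7, p_z, p_y)  # revised P(z | y)

print(round(p_y, 4), round(p_x, 4), round(p_y_given_x, 4), round(p_z_given_y, 4))
```

Rounded to four decimals, the printed values should agree with the figures derived by hand above (0.68, 0.568, 0.7183 and 0.9265).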
Based on the above model, we developed an algorithm that calculates the impact
of risk factors, allowing project managers to make estimates and appropriate
decisions for the development team, with the aim of completing the software
project on time.
Figure 2.12. The three nodes of a simple-chain BN
In Figure 2.12:
- x: the examined risk;
- y: the risk directly generated from the examined risk;
- z: the risk generated under the condition that the two previous risks
occurred;
- P(y|x), P(z|xy): probabilities of the risks when the conditions are true (in
three weight levels: + (low) (p=0.3), ++ (medium) (p=0.6), +++ (high)
(p=0.9)).
2.1.4. Bayesian Risk Impact algorithm
We propose the algorithm BRI (Bayes Risk-Impact) to assess the impact of risk
factors.
* Input: Risk factors and probability (Table 2.1)
* Output: ImpactWeight(examined_risk) - the degree of the impact of the risk
factor on the fulfillment of a software project in the form of a vector of numerical
values. The higher the value, the greater the impact.
The algorithm BRI assesses the impact of risk factors:
Step 1. Based on the known probabilities, calculate the probabilities of the child
nodes in each sub BN.
Step 2. For each child node, recursively find ImpactWeight(child_node).
Find the sub BNs that start from the examined child node in the original
BN. Calculate ImpactWeight(child_node) with the probability calculated in Step 1.
If none is found, ImpactWeight(child_node) = P(child_node).
Step 3. ImpactWeight(examined_risk) = ∑ImpactWeight(child_node).
Step 4. Add ImpactWeight(examined_risk) to the impact vector.
Step 5. Repeat to examine the next risk.
Each risk factor in the BN might have child nodes in sub BNs. In the beginning,
we have known probabilities (or prior probabilities). Based on Bayes’ theorem and
Bayesian inference, the probabilities of the child nodes can be calculated.
Each child node, in turn, might belong to one or more sub BNs. Its
ImpactWeight value is initialised to its probability, and is summed up in each BN
associated with it.
The ImpactWeight value of the examined risk is set to the sum of all the
ImpactWeights of its child nodes.
For example, assume that risk factor 15, “incapable project management”, is
examined. As can be seen in Figure 2.5, the node “incapable project
management” has three child nodes. The child nodes are associated with other sub
BNs; in the figure they are related to risk factors 13 (“creeping user requirement”)
and 16 (“lack of senior management commitment”, associated with another BN in
Figure 2.6), which are also examined in some steps of the BRI algorithm. As a
result, the ImpactWeight of “incapable project management” will be higher than the
ImpactWeights of “creeping user requirement” and “lack of senior management
commitment”.
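The recursion in Steps 1 to 5 can be sketched in Python as follows. This is one possible reading of the BRI steps, not the author's C# implementation. In particular, the propagation rule P(child) = weight × P(parent) is an assumption made here for illustration (the text does not spell out the exact formula), and the cycle guard is added only so that mutually related sub BNs cannot recurse forever. The network used is the sub BN of Figure 2.1 with the prior from Table 2.1.

```python
# Weight levels from Section 2.1.3: + = 0.3, ++ = 0.6, +++ = 0.9.
WEIGHT = {"+": 0.3, "++": 0.6, "+++": 0.9}

# Sub BN of Figure 2.1 as an adjacency list: parent -> [(child, weight level)].
EDGES = {
    "staff_experience_shortage": [("staff_training", "+"), ("untrained_staff", "+")],
    "staff_training": [("project_schedule", "+")],
}


def impact_weight(node, p_node, edges, path=()):
    """Return ImpactWeight(node) given P(node).

    ASSUMPTION: P(child) = P(child | parent) * P(parent); this propagation
    rule is illustrative and is not stated explicitly in the text.
    Nodes already on the current path are skipped (cycle guard, also an
    addition of this sketch).
    """
    children = [(c, lvl) for c, lvl in edges.get(node, [])
                if c != node and c not in path]
    if not children:
        return p_node                        # Step 2, "none found" case
    total = 0.0
    for child, level in children:
        p_child = WEIGHT[level] * p_node     # Step 1
        total += impact_weight(child, p_child, edges, path + (node,))  # Step 2
    return total                             # Step 3


def impact_vector(priors, edges):
    """Steps 4 and 5: one ImpactWeight entry per examined risk factor."""
    return {risk: impact_weight(risk, p, edges) for risk, p in priors.items()}


vector = impact_vector({"staff_experience_shortage": 0.30}, EDGES)
print(vector)
```

Under these assumptions a risk factor whose children start further sub BNs accumulates the weights of all leaves reachable from it, which is why heavily connected factors end up with larger values in the impact vector.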
2.1.5. Tool and experiments
a) Building tool
The purpose of the tool is to simulate the above model and algorithm, helping
managers assess the level of impact of the risks on the ability to complete a software
project when the probabilities of occurrence of the risks are known in advance. The
software is built in the C# programming language with the MS .NET Framework 4.5
library, using Visual Studio 2012 as the integrated development environment (IDE).
The graphical interface of the software is shown in Figure 2.13.
To use this tool, the user only needs to input the initial probabilities of the risk
factors, or simply accept the default probabilities that were established in the tool
based on research results. The tool then automatically calculates the impact weight
of each risk factor as a numeric value.
At the different phases of a software project, managers and project teams can
assess the risks and their probabilities of occurrence, and make decisions in the
planning and management tasks. By comparing the metrics, the project team can
make decisions about the software project.
Figure 2.13. The graphical interface of the tool
b) Experiments
The sample data consist of the data of 2 real software projects and the results of
the research of Hui and Liu [9], which are shown in Table 2.1.
The test results (calculation of the impact levels of risk factors) with the data of
the research of Hui and Liu [9] (the risk factors and their initial probabilities are
shown in Table 2.1) are summarized in Figure 2.14. The results show that the two
factors with the highest impacts are “incapable project management” and “lack of
client support”. This result fits the fact that the sub BNs of these risk factors are
among the most complex ones and are related to several other key risk factors
(Figure 2.3 and Figure 2.5).
Figure 2.14. Result of experiment 1
For the two real-life projects, the project managers and secretaries drew on
their practical experience in those projects to help the authors estimate the initial
probabilities of the risk factors.
Project 1 is a social networking game project with a team of 8 people (1
admin, 1 tester, 5 developers and 1 designer). The project was expected to take 4.5
months but lasted 10 months. Some of the main problems encountered by the
project team were the large self-built framework, the many bugs generated,
workloads, and slow response.
Project 2 is an outsourcing software project for a Japanese telecommunications
company. This project was expected to be done in 5 months with 15 stages, but in
fact it lasted 10 months, like Project 1. The biggest problems encountered were
identifying customer requirements (in phase 1), assessing the complexity of each
module to allocate resources, and a long lead time for transfer and guidance to the
customer.
Table 2.2. Risk factors in the phases
| Phase | Risk factor | Impact (High, Medium, Low) | Probability |
|-------|-------------|----------------------------|-------------|
| Requirements Identification | Creeping user requirement | High | 0.5 |
| Requirements Identification | Incapable project management | High | 0.1 |
| Requirements Identification | Lack of client support | High | 0.2 |
| Requirements Identification | Staff experience shortage | High | 0.4 |
| Software Design | Staff experience shortage | High | 0.2 |
| Software Design | Immature technology | Medium | 0.1 |
| Software Design | Unnecessary features | Medium | 0.99 |
Experiment method: joint testing for all 3 data sets and testing in each phase of
each project. The general experimental results are shown in the comparison chart in
Figure 2.15. The experimental results with all three data sets show similar (all
high) levels of impact for several risk factors (e.g. incapable project management,
lack of client support, or excessive reliance on a single process).
With Project 2 data: since the project was well organized from the early phases,
the author analyses the impact of risk factors in each phase of the project. The
project manager provided the authors with estimates/assessments of possible risks
in some phases, as shown in Table 2.2. The project team rated the influence of the
factors incapable project management and lack of client support as high.
The experimental results of the software for the Requirements Identification
phase are clearer: for this phase, the risk factor with the greatest impact is
incapable project management. This calls for improving management skills and
quality, obtaining clear and specific commitments from project managers, avoiding
errors when making management requests, and accurately identifying user
requirements.
For the Software Design phase, the biggest risk impact comes from immature
technology (Figure 2.16). This calls for a strategy to avoid risks in the methods
used and for minimizing the risk of schedule pressure. According to the project
manager, this result is consistent with what happened in Project 2.
Figure 2.15. Results of the three experiments
Based on the vector of the impact levels of risks in each project period and in the
overall project, it can be seen that management-related risk affects the ability to
complete the project on time as well as project success. The bigger the project, the
more complex it is and the more risk factors it has, which demands a higher level
of skill, experience and capability from the project manager. Other important
factors are lack of commitment and support from clients, incapable project
management, and excessive reliance on a single process. This result confirms the
need for supportive tools that help the project team to estimate, evaluate and
promptly adjust. Project managers need to pay more attention to the risk factors
that have the greatest impact on the project to ensure the project is carried out
smoothly.
Figure 2.16. Experimental results for Software Design phase
2.1.6. Conclusion and contribution
The tool and the experimental results indicate that the BRI algorithm can assess
the impact of risk factors on the project schedule; for Project 2, the project manager
confirmed that the results are consistent with what happened in the project. The
algorithm helps the management team quickly foresee the impact weights of their
risk factors. Users of the experimental tool only need to input the initial
probabilities of the risk factors, or they can simply accept the default probabilities
that were established in the tool based on our research results. Project managers
need to pay more attention to the risk factors that cause the highest impacts in order
to keep their software development projects out of trouble, especially with regard
to time, cost and quality.
The proposed models, algorithms and tools, in addition to quantifying risks and
their consequences, can also help identify problems and potential risks in the first
phase of the project (project scheduling and project planning). The authors also
assert that identifying and controlling issues from the early phases of the project
can significantly increase the likelihood of the project's success. Although the
BN model generally provides an accurate picture of the risks of typical software
projects at an early phase, it still needs further development, especially when it is
used for other specific industries or at later phases of software development
projects.
Therefore, further research on this issue can focus on applying the BN technique
to modelling risks in project scheduling by incorporating BNs into different project
scheduling techniques (CPM, PERT, simulation, etc.), and then evaluating the
results and making better recommendations to the project team. The author would
also collect more empirical data sets and run the algorithm on them, to allow better
evaluation and analysis of the algorithm. An expert survey will also be carried out
so that, together with the test results of the tool, a list of risk factors that best suits
each type of software and each software development method can be compiled.
The authors will also study more closely and integrate more sources of risk into the
scheduling process, and study how to deal with other types of unforeseen factors
(such as unknown unknowns).
The final stage of this line of work is action management and decision-making
support for project managers when the project has a scheduling problem (some
phase or the whole project is delayed), by evaluating multiple schedule scenarios
during the project scheduling process.
2.2. Experiments on common risk factors
This section presents the work whose results were published in publication 3 [PUB3].
In reality, all the phases of the software development life cycle (SDLC) are
potential sources of uncertainty since they have to deal with hardware, software,
technology, people, cost, and processes. Current state-of-the-art scheduling
techniques are based on the assumption that every task, activity or phase of the
project is carried out exactly as planned, which almost never happens in real-life
projects. Recent research on risk management focuses on the relationships between
uncertainty (risk factors) and the outcomes of a project. This section examines a
model and a probabilistic tool, CKDY, which uses Bayesian Belief Networks to
evaluate risk factors in software project scheduling.
2.2.1. Discovering the top ranked risk factors
We apply the method proposed by Kumar and Yadav [23], which consists of 4
steps: (1) Selecting the top ranked risk factors in software project scheduling; (2)
Constructing causal relationships among the software risk factors; (3) Constructing
the node probability table (NPT) for each node (factor) of the model; (4)
Calculating the probability value of software risk factors for the project. In each
step, we choose a suitable solution to build into the tool CKDY. The following is a
detailed description of the options.
a) Selecting the top ranked risk factors
The risk factors in software project scheduling depend on various software
aspects such as project size, budget, human resources etc. Useful sources of
information for identifying risk factors include previous research and analysis,
historical data and lessons learned, system safety and reliability analysis, expert
interviews etc. A number of software risk prediction and risk assessment models
using software risk factors have been proposed [33, 79, 80]. Most of the existing
models evaluate a number of risk factors, although some risk factors are not
suitable for some types of projects, or are less important. However, assessing and
estimating software risk by taking all the risk factors into account has some
drawbacks, such as computational complexity and a higher processing cost.
Selecting the most important software risk factors that affect an entire project or
each phase of the project could increase the accuracy of the risk prediction and risk
assessment.
We synthesize a number of published risk factor lists, such as the SEI risk
classification [81] and the NASA NPD2820 risk classification [82], along with the
research results (24 risk factors) of Hui and Liu [5] and the 27 risk factors selected
by Kumar and Yadav [23], to select a set of risk factors in software project
scheduling for testing the method proposed by Kumar and Yadav [23]. The set of 5
top ranked risk factors in software project scheduling is shown in Table 2.3. These
risk factors have consequences for the software project and eventually lead to the
project running over schedule.
Table 2.3. Risk factors, consequences and impact
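| Component | Sub component |
|-----------|---------------|
| Risk Factors | Poor management skills and experience |
| Risk Factors | Pressure on the schedule |
| Risk Factors | Frequent changes in customer requirements |
| Risk Factors | Inappropriate process |
| Risk Factors | Inappropriate technology |
| Consequences | Incomplete mission |
| Consequences | Wasted resources |
| Consequences | Reliability |
| Impact | Over scheduled |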
b) Constructing causal relationships among the software risk factors
As mentioned in Section 1.4, causal relationships among the nodes in a BN can
be constructed from historical data, from experimental observation and with the
help of domain experts (expert opinions). In this case, constructing causal
relationships among risk factors means modelling the causal relationships among
Risk Factors, Consequences and the eventual Impact with the help of domain
experts.
Sub BNs of risk factors and consequences are illustrated in Figure 2.17 and
Figure 2.18, and the overall BN model is shown in Figure 2.19.
Figure 2.17. Sub BN 1
Figure 2.18. Sub BN 2
c) Constructing the node probability table (NPT) for each node of the model
Designing the NPT data is one of the fundamental issues associated with a BN.
Constructing the NPT for each node of the model requires project data. An
indispensable factor in applying BNs to project management is the evaluation and
judgment of experts, which helps the project manager build the model and
construct the NPTs more easily. We chose to use the built-in PSPLIB (Project
Scheduling Problem Library) data set (footnote 5) to construct NPTs for the model.
Table 2.4 below is an example of the probabilities of risk factors.
Figure 2.19. The overall BN model
Table 2.4. Examples of risk factors and probabilities
| Risk factor | Probability (if it happens) | Probability (if it does not happen) |
|-------------|-----------------------------|-------------------------------------|
| Poor management skills and experience | 0.575 | 0.425 |
| Pressure on the schedule | 0.6179 | 0.3821 |
| Frequent changes in customer requirements | 0.626 | 0.374 |
| Inappropriate process | 0.611 | 0.389 |
| Inappropriate technology | 0.55 | 0.45 |
d) Calculating the probability value of software risk factors for the project
Bayes formulas are applied to calculate the probability of each node, and finally
the probability of the success of the project. To make the calculations easier, a
support library, HuginExpert (footnote 6), is used. In addition to predicting the
probability of failure of risk management in project scheduling, we also recommend
a risk management sequence, which helps managers see clearly where the
problems in the project are, which issues to solve first and which later, and how to
prioritize resources. The essence of the risk management sequence is to monitor the
project schedule through each phase and to give timely alerts to the project
managers when the risk factors or consequences exceed the permissible threshold,
resulting in an impact on the whole schedule.
Footnote 5: PSPLIB, http://www.om-db.wi.tum.de/psplib/
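The node-probability calculation of step (4) can be illustrated without any BN library. The sketch below is plain Python and does not use or imitate the Hugin API. It marginalises a node probability table over two parent risk factors whose priors come from Table 2.4. The choice of parents for the node "Wasted resources", the NPT values, and the assumption that the parents are independent are all hypothetical and serve only to show the mechanics.

```python
from itertools import product


def marginal_true(parent_priors, npt):
    """P(child = True) for a child whose binary parents are independent.

    parent_priors: {parent name: P(parent = True)}
    npt: {tuple of parent states: P(child = True | those states)}
    """
    names = list(parent_priors)
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        p_states = 1.0
        for name, state in zip(names, states):
            p = parent_priors[name]
            p_states *= p if state else 1.0 - p
        total += npt[states] * p_states
    return total


# Priors taken from Table 2.4.
priors = {"pressure_on_schedule": 0.6179, "inappropriate_process": 0.611}

# HYPOTHETICAL NPT for the consequence node "Wasted resources";
# keys are (pressure_on_schedule, inappropriate_process) states.
npt_wasted_resources = {
    (True, True): 0.9,
    (True, False): 0.6,
    (False, True): 0.5,
    (False, False): 0.1,
}

p_wasted = marginal_true(priors, npt_wasted_resources)
print(round(p_wasted, 4))
```

A BN engine performs the same summation for every node, and additionally handles dependent parents and evidence, which is why a dedicated library is used in the tool.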
2.2.2. Tool CKDY
a) Building tool
The tool CKDY (footnote 7) builds on the Hugin classes, functions and APIs
(ParseListener, Domain, Compiler, etc.) that provide Bayesian analysis and
prediction for the Java programming language. The functions of the tool are shown
in Figure 2.13 and include: calculating and predicting the probability of risks in the
project phases; giving warnings to managers after each phase; ranking risk factors;
and providing a visual graph of probability variation for each period.
According to the IEEE Standard 1540 (2001) [83], the process of managing risks
consists of the following activities: a) Plan and implement risk management; b)
Manage the project risk profile; c) Perform risk analysis; d) Perform risk
monitoring; e) Perform risk treatment; f) Evaluate the risk management process.
Based on these guidelines, the author proposes a process for handling risks
using BNs in software projects that consists of the following 4 steps. This process
is implemented in the CKDY tool.
Step 1 - Initialize the BNs (for carrying out “Plan and implement risk
management”): Based on the common BN model, the specific risk factors that suit
each project are added. In this step, we identify the nodes that need to be monitored
and make assumptions about the status of each node.
Step 2 - Calculate and make predictions from the BNs (for “Manage the project
risk profile”): When the project starts, a loop that monitors the status of the nodes
also starts. Whenever new data arrive, they are added to the BNs to calculate and
update the probabilities and estimations. The data history of each week of the
project schedule is archived for later reference.
Footnote 6: HuginExpert, http://www.hugin.com/
Footnote 7: CKDY tool and data samples, http://bit.ly/2r4MsWb
Step 3 - Monitor and analyse risks, adjust resources (for “Perform risk analysis”,
“Perform risk monitoring”).
Step 3.1 - Monitoring and analyzing risks: In the general BN model, some nodes
are related to and directly affect the success of project scheduling, such as
“Incomplete mission”, “Wasted Resources” and “Reliability”. These nodes are
monitored for a certain period of time (the project manager specifies this period
depending on the project resources and the required accuracy).
If the probability that these nodes or their child nodes occur rises above the
threshold, the tool allows the Tracing function to be called to determine the cause.
Step 3.2 - Adjust project resources: Software activities/tasks always have a
saturation point. Beyond it, spending more resources on a task increases the cost,
but the quality does not improve significantly or the performance decreases.
Therefore, the Saturation function is called periodically (weekly or at longer
intervals, depending on the decision of the project manager) to check whether the
monitored nodes have reached the saturation point, so that appropriate decisions
can be made.
The Ranking function is also called in cycles to rank the efficiency of the nodes:
if there is quantitative data on the efficiency of the resource, the nodes can be
ranked easily (according to the operation cost of each node). If there is no
quantitative data for those nodes, the BN model is used for the efficiency
evaluation.
When a node is ranked higher, resources are reallocated to increase the
effectiveness of the project.
Step 4 - Perform risk treatment: Based on the analysis as well as data on project
risks, the project manager will choose appropriate measures to handle risks.
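The monitoring part of Step 3 can be sketched as a simple weekly check. The code below is only an illustration of the idea of comparing monitored nodes against a threshold and ordering the flagged nodes; it is not the Tracing, Ranking or Saturation function of CKDY. The function names, the weekly probabilities and the threshold of 0.6 are all hypothetical.

```python
def nodes_over_threshold(probabilities, threshold):
    """Return (node, probability) pairs above the threshold, highest first."""
    flagged = [(name, p) for name, p in probabilities.items() if p > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


def weekly_alerts(history, threshold):
    """Map week number (starting at 1) to the nodes flagged in that week."""
    alerts = {}
    for week, probabilities in enumerate(history, start=1):
        flagged = nodes_over_threshold(probabilities, threshold)
        if flagged:
            alerts[week] = flagged
    return alerts


# HYPOTHETICAL weekly probabilities of the three monitored consequence nodes.
history = [
    {"incomplete_mission": 0.40, "wasted_resources": 0.35, "reliability": 0.30},
    {"incomplete_mission": 0.65, "wasted_resources": 0.50, "reliability": 0.30},
    {"incomplete_mission": 0.70, "wasted_resources": 0.72, "reliability": 0.45},
]

alerts = weekly_alerts(history, threshold=0.6)
for week, flagged in alerts.items():
    print(f"week {week}: {flagged}")
```

In a real deployment the weekly probabilities would come from the updated BN rather than from a fixed list, and a flagged node would be the starting point for tracing its parent risk factors.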
b) The input structure
The tool’s input file is a ".net" file created by the tool HuginExpert.
HuginExpert is based on Bayesian network and influence diagram technology. It is
used to build causal relationship diagrams among nodes and to provide the NPT for
each node of the model based on available data sets and experts’ judgment.
c) Flow of processing
The tool first reads the input data set from files in the standard Hugin format.
The library provided by Hugin is then used to calculate the probabilities of the
network nodes from the imported data. The processing of the input data determines
the calculation results of the tool.
Once the calculation results are available, they are printed on the screen so
that managers can see the probability of each item. Based on the results, the project
managers can predict, revise, add or replace resources to meet the requirements
of the project schedule.
In addition, the tool allows scheduling of up to seven project phases. This
allows project managers to look in more detail at the planning and scheduling
process and to find issues that need to be addressed in the entire process. Every
week, the tool warns the project manager about network nodes that exceed the safe
threshold on the allowed time. This helps project managers manage their job more
easily. The basic Ranking and Saturation functions aim to raise warnings when the
threshold is exceeded.
d) Data sample
The necessary data are the probabilities of the selected risk factors in Table 2.3
(Poor management skills and experience, Pressure on the schedule, Frequent changes
in customer requirements, Inappropriate process, Inappropriate technology). The
PSPLIB (Project Scheduling Problem Library) dataset is used as a set of standard
cases for evaluating solutions to single-mode or multi-mode resource-constrained
project scheduling problems. RESCON software (RESource CONstrained)8 is used to
display the files in an intuitive interface. RESCON, developed at Katholieke
Universiteit Leuven (Belgium), is free and open source software for research on
resource-constrained project scheduling problems.
2.2.3. Experiments and analysis
a) Testing the model
Two data sets from PSPLIB were used. Each set has 7 files, one for each of the 7
phases allowed in the design of the tool, and each data set gives one test
scenario (with 7 phases). RESCON is used to model the files in the PSPLIB data
sets. Figure 2.20 is an example with the file j301_1.rcp in PSPLIB.
For example, with j30 in the first data set, the early start schedule often
violates the resource constraints, since it only takes into account the earliest
start times and the precedence among activities. Resource 1 uses up to 21 units
while only 12 units are available (vertical axis), and Resource 2 uses up to
25 units while only 13 units are available. In Figure 2.20, RESCON marks the
limit on each resource with a red line.
8 RESCON: http://feb.kuleuven.be/rescon/
Figure 2.20. Experiment with j30 with the early start schedule
Figure 2.21. Activity joint in the file j301_1.rcp
The tool calculates the probability of each risk factor in each phase, as well as
the probability of being behind schedule in each phase (see Figure 2.22). Based on
the probabilities and thresholds, the tool also reports the consequences at 6 alert
levels for each phase and each risk factor. Based on these parameters, the project
manager can consider a reasonable allocation of resources to meet the schedule
requirements in each project phase. The probability of a risk factor in the whole
project can be calculated by taking the average of its probabilities over the
phases (Table 2.5 and Table 2.6).
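The whole-project figure described above is a plain arithmetic mean over the
phases. The following is a minimal sketch only: the helper name
project_probability is hypothetical, and the seven per-phase values are
illustrative assumptions, not data from the experiments.

```python
from statistics import mean


def project_probability(phase_probabilities):
    """Average the per-phase probabilities of one risk factor."""
    if not phase_probabilities:
        raise ValueError("at least one phase is required")
    return mean(phase_probabilities)


# Hypothetical probabilities of one risk factor in the 7 phases (assumed values).
phases = [0.40, 0.45, 0.50, 0.55, 0.50, 0.60, 0.50]
whole_project = project_probability(phases)
print(round(whole_project, 4))
```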
Figure 2.22. Diagram of probabilities of finishing phase by phase
Tracking each phase of each test scenario, we find that if issues (risks) arise in
the very first tasks, the probability of failure of the project schedule increases
gradually. If the project team (project manager) does not intervene, the
probability rises beyond the allowed level and directly affects the whole
project.
Table 2.5. Probability of risk factors in the whole project with data set 1
Risk factor | Probability (if it happens) | Probability (if it does not happen)
Poor management skills and experience | 0.505 | 0.495
Pressure on the schedule | 0.7536 | 0.2464
Frequent changes in customer requirements | 0.643 | 0.357
Inappropriate process | 0.6625 | 0.3375
Inappropriate technology | 0.666 | 0.334
Table 2.6. Probability of risk factors in the whole project with data set 2
Risk factor | Probability (if it happens) | Probability (if it does not happen)
Poor management skills and experience | 0.575 | 0.425
Pressure on the schedule | 0.6179 | 0.3821
Frequent changes in customer requirements | 0.626 | 0.374
Inappropriate process | 0.611 | 0.389
Inappropriate technology | 0.55 | 0.45
b) Remarks on the model and the tool
The model proposed by Kumar and Yadav [23] addresses risk management for the
entire software project, but it is also effective when applied to project
scheduling, since it takes advantage of BNs to predict the probability of project
failure at each point in time, based on the highest-ranked risk factors during
project scheduling.
The tool and the experiments show that nodes can be monitored and warnings issued
when they reach saturation, and that the nodes can be ranked by effectiveness to
provide supporting information for reallocating resources. This helps project
managers keep the probability of project failure within the allowed limit (in the
experiments in this research, the limit is set at 0.5).
The CKDY tool has also been compared with Microsoft's MSBNx software9 for
building and computing BNs. MSBNx has been developed since 2001 and has been used
in many experiments. CKDY is compared with MSBNx using the proposed BN model and
the input data on the probabilities of the risk factors. Both tools calculate the
probabilities of the consequences and, eventually, of the impact (the probability
of schedule failure).
The results show that the two tools evaluate consequences and impacts similarly
under the proposed research model. For example, with the set of risk factor
probabilities shown in Table 2.7, Table 2.8 compares the results of the two
tools.
9 MSBNx: https://msbnx.azurewebsites.net/
Table 2.7. Probability of the experimental risk factors to compare with MSBNx
Risk factor | Probability (if it happens, %) | Probability (if it does not happen, %)
Poor management skills and experience | 18.39 | 81.61
Pressure on the schedule | 23.04 | 76.96
Frequent changes in customer requirements | 14.46 | 85.54
Inappropriate process | 13.93 | 86.07
Inappropriate technology | 12.32 | 87.68
Table 2.8. CKDY compared with MSBNx
Consequences / Impact | MSBNx (%) | CKDY (%)
Incomplete mission | 9.3 | 9.26
Wasted resources | 12.2 | 12.25
Reliability | 11.3 | 12.47
Over scheduled | 9.2 | 9.18
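The agreement between the two tools can be made concrete by computing the absolute
gap for each row of Table 2.8. This is a minimal sketch: the helper name
absolute_gaps is hypothetical, and reading the first value of each pair as MSBNx
and the second as CKDY is an assumption taken from the table header (the absolute
gap does not depend on that reading).

```python
# Values copied from Table 2.8 (percent). The (MSBNx, CKDY) order of each pair
# is an assumption; the absolute gap is the same either way.
results = {
    "Incomplete mission": (9.3, 9.26),
    "Wasted resources": (12.2, 12.25),
    "Reliability": (11.3, 12.47),
    "Over scheduled": (9.2, 9.18),
}


def absolute_gaps(table):
    """Absolute difference between the two tools, in percentage points."""
    return {name: round(abs(a - b), 2) for name, (a, b) in table.items()}


gaps = absolute_gaps(results)
largest = max(gaps, key=gaps.get)
print(gaps)
print("largest gap:", largest)
```

Under these values, three of the four rows differ by at most 0.05 percentage
points, and the Reliability row shows the largest gap.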
2.2.4. Conclusion and contribution
This section has applied BNs to a risk management model for software projects in
the early planning phase, namely software project scheduling. Through a
literature review, the study selected a set of risk factors that affect the
project scheduling process. The tool CKDY, using this set of risk factors, has
shown high accuracy and reliability in the experiments above. In line with the
first objective of the thesis, the model proposed in this section tries to give
an accurate picture of risks in software projects at an early stage and to help
project managers control risks early in the software project life cycle.
To further develop the model, the author will continue to analyse and review the
set of risk factors for the software project as a whole as well as for each phase
of the project. Modelling and quantifying risks at later phases of software
projects will also be considered. Another relevant research direction is to
integrate probabilistic models into common software project scheduling techniques
(CPM, PERT, Monte Carlo simulation, etc.).
The tool CKDY is still at the experimental research stage, so it is difficult for
non-professionals to use and its functions and features are still limited. The
tool needs a wider range of features and interfaces and a simpler input format so
that non-professionals can use it easily. The author also needs more expert
judgment to help build the input probabilities, and more real-case data samples
from the software industry.
2.3. Proposed common risk factors in software project scheduling
An implication of Section 2.1 and Section 2.2 is that selecting the most important
software risk factors could improve software risk assessment and estimation
accuracy. In this section, the author proposes lists of common risk factors that
are related to, and have an impact on, software project scheduling. Section 2.3.1
covers common risk factors in traditional software projects, and Section 2.3.2
covers common risk factors in agile software projects.
2.3.1. The 19 common risk factors in traditional software project
Wallace et al. [28] summarized all previous studies and defined 27 software
risks which are classified into 6 categories: Organizational Environment Risk (4
risks), User Risk (5 risks), Requirement Risk (4 risks), Project Complexity Risk (4
risks), Planning and Control Risk (7 risks), and Team Risk (3 risks).
In Section 2.2, the author of this thesis also examined a simple model of
schedule risks in software project development with 5 risk factors listed in Table
2.2. These risk factors can be considered to relate to user risks, requirement risks,
team risks and planning and control risks defined by Wallace et al. [28].
In addition, Rai et al. [26] pointed out the list of 43 risk factors in Agile
software projects. The 43 risk factors cover 6 categories in Agile software
development. They are Development Environment Risks, Process Issue Risks, Staff
Size and Experience, Technical Issue Risks, Technology Risks and Schedule Risks.
In our research, we examine only the common risk factors (those shared by Agile
software development and traditional software development) that affect software
scheduling and planning. These common risk factors are also regarded as specific
software risks that are common to many activities or tasks in a project. They can
mostly be derived from development environment, process issue, staff size and
experience, and schedule risks.
To identify them, the three lists of risk factors mentioned above are compared and
combined (with the planning/scheduling phase of software development in mind).
The comparison and combination were based on the descriptions of the risk factors,
even when the factors are not worded identically. For example, the risk factor
Continually changing system (in the 27 risks [28]) can be considered the same as
Frequent changes in customer requirements (in the 5 risk factors in Section 2.2)
and related to Customer not certain that the functionality requested is "do-able"
(in the 43 risk factors [26]). Likewise, Poor management skills and experience (in
the 5 risk factors in Section 2.2) is considered the same as Lack of management
experience (in the 43 risk factors [26]).
The author also sought advice on the risk factors from the experts (who are
working with the author in this research and providing data from real-life
projects). Table 2.9 lists the 19 common risk factors that directly or indirectly
influence the possibility of schedule success, combined from the three lists
mentioned above. Their relationships were formed based on the literature in which
the risk factors were described and on the experts’ opinions. For example, risk
factor (1) Large-scale, offshore and distributed would lead to risk factors (2)
Insufficient training and (19) Lack of management experience, because of the size
(based on modules, function points, number of staff or duration) and the
complexity of a project (based on the information or the opinion of the project
experts).
Table 2.9. List of 19 common risk factors for software project scheduling
No. | Risk factors
1 | Large-scale, offshore and distributed.
2 | Insufficient training.
3 | Excessive preparation/planning.
4 | Teams are not focused.
5 | Inappropriate process.
6 | The best people not available for self-organizing team.
7 | The skill level of people (team/developer).
8 | Staff is not committed for entire duration of the project.
9 | Ineffective communication.
10 | Staff does not receive necessary training.
11 | Lack of tools and methods.
12 | Software tools are not used to support software planning and tracking activities.
13 | Configuration management software tools are not used to control and track change activity throughout the software process.
14 | Incorrect scale.
15 | Inappropriate technology.
16 | Level of team/developer.
17 | Customer not certain that the functionality requested is "do-able".
18 | Lack of commitment of superior management.
19 | Lack of management experience.
2.3.2. The 19 common risk factors in agile software project
As mentioned in Section 2.3.1, Rai et al. [26] pointed out the list of 43 risk
factors in Agile software projects. The 43 risk factors cover 6 categories in Agile
software development. They are Development Environment Risks, Process Issue
Risks, Staff Size and Experience, Technical Issue Risks, Technology Risks and
Schedule Risks. In this section, only risk factors that affect iteration scheduling
and planning are examined. Based on the author’s experience and experts’ opinions,
these factors can mostly be derived from development environment, process issue,
staff size and experience, and schedule risks.
Table 2.10. List of 5 risk factors for software project scheduling in Section 2.2
No. | Risk factors
1 | Poor management skills and experience.
2 | Pressure on the schedule.
3 | Frequent changes in customer requirements.
4 | Inappropriate process.
5 | Inappropriate technology.
In addition, Section 2.2 also examines a simple model of schedule risks in
general software project development with the five risk factors listed in Table 2.10.
In this section, the two lists of risk factors above are compared and combined
with the features of Agile software projects in mind (in the same way as in
Section 2.3.1). Table 2.11 lists the 19 risk factors that directly or indirectly
influence the possibility of iteration success, combined from the two studies
above. These 19 risk factors are then modelled using BNs to examine their
relationships. Each risk factor is represented by a node with a node probability
table (NPT). In our research, each project team or project manager acted as an
expert to provide the values in the NPTs (based on his/her previous experience of
the project features and the team).
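To illustrate what an NPT encodes, the sketch below takes one relationship named
in Section 2.3.1, risk factor (1) Large-scale, offshore and distributed leading to
risk factor (2) Insufficient training, and computes the marginal probability of
the child node. All numbers and the helper name marginal are hypothetical
assumptions for illustration; they are not expert values from this research.

```python
# Hypothetical prior of the parent risk factor (1): P(parent state).
p_large_scale = {True: 0.3, False: 0.7}

# Hypothetical NPT of the child risk factor (2): P(child occurs | parent state).
npt_insufficient_training = {True: 0.6, False: 0.2}


def marginal(prior, npt):
    """P(child) = sum over parent states of P(child | parent) * P(parent)."""
    return sum(npt[state] * prior[state] for state in prior)


p_child = marginal(p_large_scale, npt_insufficient_training)
print(round(p_child, 2))
```

With more than one parent, the NPT has one row per combination of parent states,
which is why expert input is needed to fill it in.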
To obtain the most common risk factors in software project development, the list
of 19 risk factors in iteration scheduling (Table 2.11) differs from the list in
Section 2.3.1 (Table 2.9) only in risk factor 9 (Ineffective communication
versus Staff doesn’t attend to daily meeting).
Table 2.11. List of 19 risk factors in iteration scheduling
No. | Risk factors
1 | Large-scale, offshore and distributed.
2 | Insufficient training.
3 | Excessive preparation/planning.
4 | Teams are not focused.
5 | Inappropriate process.
6 | The best people not available for self-organizing team.
7 | The skill level of people (team/developer).
8 | Staff is not committed for entire duration of the project.
9 | Staff doesn’t attend to daily meeting.
10 | Staff does not receive necessary training.
11 | Lack of tools and methods.
12 | Software tools are not used to support software planning and tracking activities.
13 | Configuration management software tools are not used to control and track change activity throughout the software process.
14 | Incorrect scale.
15 | Inappropriate technology.
16 | Level of team/developer.
17 | Customer not certain that the functionality requested is "do-able".
18 | Lack of commitment of superior management.
19 | Lack of management experience.
2.3.3. Conclusion and contribution
This section has presented two lists of proposed common risk factors, one for
traditional software project scheduling and one for agile software project
scheduling. In real-life projects, software teams or practitioners may identify
other specific risk factors. However, the models and methods proposed in the rest
of this thesis examine these two lists.
2.4. Chapter remarks
This chapter has developed an algorithm (BRI) and a tool (CKDY) to assess the
impact of risk factors, and on that basis has proposed a set of risk factors in
software project scheduling. The algorithm BRI is a step towards a BN model for
analysing risks in software project scheduling, while the tool CKDY is built to
assess the impacts of risks in software project scheduling using a probabilistic
approach that includes BNs. Both show that Bayes' Theorem and BNs can be used to
model risks and to analyse them quantitatively in software projects.
The development of the algorithm BRI and the tool CKDY using the probabilistic
approach confirms that BNs always need experts’ opinions or judgment together with
mathematical calculation.
This chapter focuses on using BNs to model the relationships among risk factors
and on using Bayes' Theorem to calculate the impacts of risks on the software
project schedule. In doing so, the authors found that it is also important to
examine the common risk factors that affect software project scheduling in order
to keep better track of software schedules. Common risk factors in software
project scheduling are therefore proposed.
Therefore, the chapter is an initial effort to find a quantitative method to
better assess and analyse risks in software project scheduling, by identifying the
risk attributes of software project scheduling and the probabilistic methods that
manage these risks better. Project managers now have scientific methods to keep
track of risks in software project scheduling and no longer have to rely on their
experience alone.
This probabilistic approach could be improved further by applying BNs in
scheduling techniques as well as by examining the relationships among risk
factors. This will be discussed in Chapter 3.
Chapter 3. Incorporation of Bayesian Networks into software project scheduling techniques
Chapter 2 presented initial attempts at applying BNs to risk management in
software project scheduling, together with experiments on common risk factors and
their impacts on software project scheduling. 19 common risk factors were proposed
for both traditional software development projects and agile software projects.
Chapter 3 presents the author’s work on a probabilistic method to improve
well-known software project scheduling techniques, covering techniques for both
traditional software scheduling and agile software scheduling (the second
objective mentioned in the Introduction section).
This chapter focuses on incorporating Bayesian Networks into software project
scheduling techniques to predict the chance of project schedule success. Section 3.1
incorporates BNs into agile software scheduling to enhance iteration scheduling.
Sections 3.2 to 3.4 incorporate BNs into scheduling techniques and also apply BNs
to examine the 19 common risk factors in software scheduling (proposed in
Section 2.3).
3.1. Applying Bayesian Networks into specific software project
development
This section presents the work published in publication 1 [PUB1].
Agile software development methods have been widely adopted in the software
industry. Agile methods themselves can be considered to reduce project risks to a
certain degree. However, optimizing software project scheduling has always been a
big challenge in both practice and academia, since industrial software
development is a highly complex and dynamic process. There is also a need for a
probabilistic method that better models and predicts uncertainty in software
projects. This section proposes an enhanced method and algorithm that combine
optimized agile iteration scheduling with the ability of Bayesian Networks to
predict and handle risks in resource-constrained contexts. Based on the method,
software was developed as a support tool for managers to control their project
schedules. The tool also provides a reliable set of strategies for sequencing
tasks in agile iteration scheduling.
3.1.1. Introduction
As introduced in Section 1.2, traditional software development methods are often
characterized as predictive: they focus on envisioning and planning the future in
full detail. A predictive development team announces exactly what features are
planned for the entire duration of the development process. Agile methods, in
contrast, are adaptive. An adaptive team would have difficulty describing what
features are planned for the entire duration of the development process, but it
focuses on adapting quickly to changing realities. When the project changes, the
team adapts to the changes as well [50, 84, 85].
Agile methods break deliverables into small iterations (which reduces the overall
risk in realizing software features [56, 86]). Iterations are short time frames
(time boxes) that typically last from one to four weeks. Each iteration is a full
software development cycle which includes planning, requirements analysis, design,
coding, unit testing, and acceptance testing, in which working software is
demonstrated to users and/or customers. This minimizes overall risks and allows
quick adaptation to changes [87, 88, 89, 90].
Adopting Agile practices and processes brings certain benefits to organizations,
such as quicker return on investment, higher product quality, and better customer
satisfaction [91]. However, Agile methods lack sound methodological support for
planning (in contrast to the traditional plan-based approaches). VersionOne’s
survey [92] identified 26 principal factors, the second of which was iteration
planning. The survey also showed that three of the five most important concerns
(out of the 13 most commonly cited) about adopting agile within companies are
1) the loss of management control (36%), 2) the lack of upfront planning (33%) and
3) the lack of predictability (27%).
In addition, project managers have used a number of tools and techniques for
project scheduling (Fox & Spence [93], Pollack-Johnson [94]). Szoke A. [27]
presented a new approach to iteration scheduling in agile software projects. As
mentioned in previous sections, Khodakarami et al. [20] also presented an improved
approach to incorporating uncertainty in general project scheduling using BNs.
3.1.2. Optimized Agile iteration scheduling
Scheduling problems constitute an important part of combinatorial optimization
problems. Scheduling concerns the allocation of limited resources to tasks over
time. The goal is to optimize one or more objectives in a decision-making process
[27]. Software project scheduling has to deal with the fact that resources such as
people, time, technology and money are not always predetermined. Moreover, there
are always risks (uncertain events with adverse impacts) in software projects.
Technical tasks are the main concept of iteration scheduling. These tasks are the
fundamental working units accomplished by developers. The aim of iteration
scheduling is to break down selected requirements into technical tasks and to
assign them to developers (each task usually requires some working hours of
realization effort, as estimated by the team). In other words, iteration
scheduling aims at determining a feasible fine-grained plan for the development
that schedules the implementation of selected features within an iteration [52].
The optimized (Agile) iteration scheduling problem can be derived by selecting the
extreme-valued schedule from the potentially feasible alternatives. It can be
considered an optimization problem in which resource allocation consists of
assigning time intervals to the execution of the activities (realization tasks),
taking into account temporal constraints (precedence between tasks), resource
constraints (resource availability) and the objective of minimum execution time.
Although Agile software development represents a major approach to software
engineering, there is no well-established conceptual definition of, or sound
methodological support for, Agile iteration scheduling.
3.1.3. Optimization model for Agile software iteration
Let $R$ be the set of resources, and let the following typical scheduling
properties be defined on the technical tasks:
Effort: $w_j$ - the time estimate (in hours) associated with each task. It is
obtained by simple expert estimation (e.g. 2, 4, or 8 working hours (Wh)).
Pre-assignment: $a_j$ - in some cases resources are pre-assigned to tasks before
scheduling. The pre-assignment is used by the scheduler algorithm during resource
allocation.
Let the vector $S = \{S_0, S_1, \ldots, S_{n+1}\}$ contain the start times of the
tasks’ realizations, where $S_j \ge 0$ for $j \in A$ and $S_0 = 0$. The vector $S$
is called a schedule of development. In this definition, the elements $0$ and
$n+1$ are auxiliary and represent the beginning and the termination of the
iteration, respectively.
Dependencies between tasks $j$ and $j'$ can be defined as:
\[ S_j - S_{j'} + d_{j'} \ge P_{j',j} \quad : \; j', j \in A \tag{3.1} \]
where $S_j$ is the start time of the realization of task $j$, $S_{j'}$ is the start
time of task $j'$, $d_{j'}$ is the duration of task $j'$, $P_{j',j}$ represents the
precedence between the tasks, and $A$ is the set of tasks that need to be done in
the iteration.
Let $R_i \in \mathbb{N}$ be the capacities of the resources that have been assigned
to the project in an Agile iteration. The effort estimation yields resource
requirements $r_{j,i} \in \mathbb{Z}$ for each task $j$ and each resource $i$. Let
$S$ be some schedule and let $t$ be some point in time. Then let
$A^*(S,t) = \{ j \in A \mid S_j \le t \le S_j + w_j \}$ be the active set of tasks
in progress at time $t$. The corresponding requirement for resource $i \in R$ at
time $t$ is given by $r_i(S,t) = \sum_{j \in A^*(S,t)} r_{j,i}$. As a consequence,
the resource constraints can be written as follows:
\[ r_i(S,t) \le R_i \quad : \; i \in R \tag{3.2} \]
Thus, the optimization problem for iteration scheduling can be formulated as
follows:
\[
\begin{aligned}
\text{Minimize} \quad & z = S_{n+1} \\
\text{subject to} \quad & S_j - S_{j'} + d_{j'} \ge P_{j',j} && : \; j, j' \in A \\
& r_i(S,t) \le R_i && : \; i \in R \\
& S_{n+1} \le l_I
\end{aligned}
\tag{3.3}
\]
where $l_I$ is the length of the iteration.
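The resource constraint (3.2) can be checked mechanically for a candidate
schedule. The sketch below is a minimal illustration under stated assumptions:
the function names, the dictionary-based data layout, the resource label "dev"
and the restriction to integer time points are all hypothetical choices, and the
example numbers are invented.

```python
def active_tasks(start, effort, t):
    """A*(S, t): tasks j with S_j <= t <= S_j + w_j."""
    return [j for j in start if start[j] <= t <= start[j] + effort[j]]


def resource_usage(start, effort, demand, i, t):
    """r_i(S, t): total demand for resource i among the active tasks."""
    return sum(demand[j][i] for j in active_tasks(start, effort, t))


def satisfies_capacity(start, effort, demand, capacity, horizon):
    """Check r_i(S, t) <= R_i for every resource i and integer time t."""
    return all(
        resource_usage(start, effort, demand, i, t) <= capacity[i]
        for i in capacity
        for t in range(horizon + 1)
    )


# Hypothetical example: two tasks that each need the single developer "dev".
effort = {1: 4, 2: 3}
demand = {1: {"dev": 1}, 2: {"dev": 1}}
capacity = {"dev": 1}

sequential = {1: 0, 2: 5}   # task 2 starts after task 1 has finished
parallel = {1: 0, 2: 0}     # both tasks start at time 0

print(satisfies_capacity(sequential, effort, demand, capacity, horizon=8))
print(satisfies_capacity(parallel, effort, demand, capacity, horizon=8))
```

Because the active set uses a closed interval, a task still counts as active at
the instant S_j + w_j, which is why the sequential example starts task 2 at
time 5 rather than 4.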
a) Solving the optimization problem for iteration scheduling
The vector $r$ indicates the available resources (developers) in the iteration.
Each $w_j$ is the planned effort (duration) for technical task $j$, covering both
development and defect correction. Every element $a_j$ of the vector $a$ contains
a reference to a resource index ($a_j \in \{1..|r|\}$), which indicates the
resource pre-assigned to task $j$; $a_j = 0$ means that task $j$ is not
pre-assigned. In that case, the algorithm will find the best resource for its
realization. Precedence between tasks can be represented by a precedence matrix in
which $P_{j,j'} = 1$ means that task $j$ precedes task $j'$, and $P_{j,j'} = 0$
otherwise. The two conditions $P_{j,j} = 0$ (no loop) and $P$ being a directed
acyclic graph (DAG) ensure that the temporal constraints are not trivially
unsatisfiable. The iteration time-box is given by the variable $l_I$. It is used
as an upper bound in resource allocation to prevent resources from being
overloaded. The result of the algorithm is a schedule matrix $S$ in which rows
represent resources and columns give the order of task execution. Thus
$S_{i,p} = j$ means that task $j$ is assigned to resource $i$ at position $p$. The
ensure section prescribes the post-condition on the return value ($S$): every
task $j$ has to be assigned to exactly one resource $i$.
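The two conditions on the precedence matrix (no self-loop, and P is a DAG) can be
verified before scheduling. The sketch below uses Kahn's algorithm, which is a
standard technique and not necessarily the check used in [27]; the function name
is_valid_precedence and the 0-based list-of-lists layout are assumptions.

```python
def is_valid_precedence(P):
    """Return True if P has no self-loop and encodes a DAG (Kahn's algorithm).

    P[a][b] == 1 means task a precedes task b.
    """
    n = len(P)
    if any(P[j][j] for j in range(n)):
        return False                       # P_{j,j} = 0 is violated
    indegree = [sum(P[a][b] for a in range(n)) for b in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    visited = 0
    while ready:
        a = ready.pop()
        visited += 1
        for b in range(n):
            if P[a][b]:
                indegree[b] -= 1
                if indegree[b] == 0:
                    ready.append(b)
    return visited == n                    # every task was reachable: no cycle


chain = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # task 0 -> task 1 -> task 2
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # task 2 -> task 0 closes a cycle
print(is_valid_precedence(chain), is_valid_precedence(cycle))
```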
An algorithm for iteration scheduling was proposed by Szoke A. [27]. An enhanced
algorithm that builds on it and incorporates BNs is discussed in the next
section.
b) Incorporating the optimization problem with BNs
The input of the above algorithm consists of the resources, the planned duration
of each task, the task precedence and the length of each iteration. It is assumed
that, given the resources, each task will be done within its planned duration.
However, real projects always carry risks, such as those related to personnel or
technology. These uncertainties can hardly be predicted, which leads to the need
for a probabilistic model that can quantify the uncertain issues and address the
most important concerns. BNs are regarded as a good probabilistic approach for
modelling uncertainty in projects and for helping project managers make
decisions [20].
c) Using BNs to enhance Agile iteration scheduling
The authors propose adding the following factors to the algorithm proposed by
Szoke A. [27]:
a) A matrix which represents the relationship between each task duration and the
assigned resources:
\[ [B]_{n+2,|r|} = \{ b_{ij} \in [0,1] \mid i = 0..n+1,\ j = 1..|r| \} \tag{3.5} \]
i.e. $b_{ij}$ is the probability that task $i$ is done within the time $w_i$ when
it is allocated resource $j$.
b) When a schedule is created, its probability of completion is examined. An
array is introduced to represent the weights of the tasks in an Agile iteration:
\[ [M]_{n+2} = \{ m_i \mid i = 0..n+1 \} \tag{3.6} \]
Each task has an impact on the schedule for the iteration. Therefore, we can
have an array $T$ of those impacts:
\[ [T]_{n+2} = \{ t_i \mid i = 0..n+1 \} \tag{3.7} \]
In this research, we propose the following formula for $t_i$. Let $D_i$ be the set
of resources allocated to task $i$ in the iteration; then
\[ t_i = \min\{ b_{ij} \mid j \in D_i \} \tag{3.8} \]
The probability that the schedule is completed successfully is:
\[ p = \frac{\sum_{i=0}^{n+1} t_i \, m_i}{\sum_{i=0}^{n+1} m_i} \tag{3.9} \]
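Equations (3.8) and (3.9) amount to a minimum over the allocated resources
followed by a weighted average. The following sketch works through them on
invented numbers; the function names, the 0-based indices and all values of B,
the allocation sets D_i and the weights M are hypothetical assumptions.

```python
def task_impacts(B, allocation):
    """t_i = min{ b_ij | j in D_i }  (Eq. 3.8)."""
    return [min(B[i][j] for j in D_i) for i, D_i in enumerate(allocation)]


def schedule_success_probability(T, M):
    """p = sum(t_i * m_i) / sum(m_i)  (Eq. 3.9)."""
    return sum(t * m for t, m in zip(T, M)) / sum(M)


# Hypothetical data: 3 tasks, 2 resources (indices start at 0 in this sketch).
B = [[0.9, 0.8],
     [0.7, 0.6],
     [0.5, 0.95]]
allocation = [{0}, {0, 1}, {1}]   # D_i: resources allocated to task i
M = [1, 2, 1]                      # task weights m_i

T = task_impacts(B, allocation)
p = schedule_success_probability(T, M)
print(T, round(p, 4))
```

Here the second task is shared by both resources, so its impact is the smaller of
its two completion probabilities, and its weight of 2 makes it count twice in the
average.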
c) The algorithm with BNs
The algorithm for calculating T:
Input: S, B
for i = 0 to n + 1 do
    t_i = 1                                          /* initiate the default t */
    for j = 1 to |r| do
        if ∃p ∈ P′: S_{i,p} != 0 ∧ b_{i,j} < t_i then    /* if allocated */
            t_i = b_{i,j}                            /* update */
        endif
    endfor
endfor
return T
Thus, the algorithm proposed by Szoke A. [27] is enhanced with BNs. The enhanced
algorithm, named Szoke-BN, is formulated as follows:
Input:
    r_i ∈ N, l_I ∈ N                                 /* resources and length of each iteration */
    a_j ∈ N : a_j ∈ {1..|r|}, w_j ∈ R                /* pre-assignments and durations of tasks */
    P_{j,j′} ∈ {0,1} ∧ P_{j,j} = 0 ∧ P is DAG        /* precedences */
    [B]_{n+2,|r|} = {b_{i,j} ∈ [0,1], i = 0..n+1, j = 1..|r|}   /* matrix of completion probability */
    [M]_{n+2} = {m_i | i = 0..n+1}                   /* weights of tasks in iteration */
    [T]_{n+2} = {t_i | i = 0..n+1}                   /* impacts of tasks in iteration */
Ensure:
    S_{i,j} ∈ {0,1} ∧ ∀j ∃!i S_{i,j} = 1
m ⇐ length(r), n ⇐ length(d)                         /* number of resources and tasks */
S ⇐ [0]_{m,n}                                        /* initial set of resources */
rlist ⇐ Ø, slist ⇐ Ø, P′ ⇐ Ø                         /* initiate ‘ready list’, ‘scheduled list’ and P′ list */
for j = 0 to n do
    pot ⇐ findNotPrecedentedTasks(P)                 /* find potential task */
    rlist ⇐ pot \ slist                              /* create ready list */
    if rlist = Ø then
        return Ø                                     /* no schedulable task */
    endif
    j ⇐ max{a_j} : j ∈ rlist                         /* select a task */
    if a_j = 0 then
        i ⇐ selectMinLoadedResourceandMaxPro(S)      /* without assignment */
    else
        i ⇐ a_j                                      /* with assignment */
    endif
    l ⇐ sum(S_{i,{1..n}})                            /* calculate load of resource i */
    if (l + w_j) > l_I then                          /* overloaded iteration */
        return Ø
    endif
    p ⇐ findNextPos(S, i)                            /* the next task */
    P′ ⇐ P′ ⋃ {p}                                    /* add the index of the next task into P′ */
    S_{i,p} ⇐ j                                      /* assigned task j with resource i at position p */
    slist ⇐ slist ⋃ {j}                              /* add task j into slist */
    P_{{1..n},j} = 0                                 /* delete precedence related to scheduled task */
endfor
T ⇐ computingfrom(S, B)                              /* calculating matrix T from B */
x = computingProFrom(S, T, M)                        /* calculating x from S, T, M */
return S, x, P′
Where:
- findNotPrecedentedTasks(P) - finds the tasks that have no precedence
constraints, based on the precedence matrix P.
- selectMinLoadedResourceandMaxPro(S) - selects the resource with the minimum load
and the highest probability of success (the two criteria being load and
probability).
Output: the schedule S of the tasks, and the probability that S succeeds.
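The following Python sketch shows one way the steps of Szoke-BN could be put
together. It is a simplified illustration under stated assumptions, not the
implementation used in this research: the function name szoke_bn and its data
layout are hypothetical, indices are 0-based, None marks a task without
pre-assignment, each task is given to exactly one resource (so t_j reduces to a
single entry of B), ties are broken by task index, and precedence is released by
clearing the row of the scheduled task under the convention that P[a][b] = 1
means task a precedes task b.

```python
def szoke_bn(durations, preassigned, P, B, M, n_resources, iteration_length):
    """Greedy list scheduling in the spirit of Szoke-BN (simplified sketch).

    durations[j]   : planned effort w_j of task j
    preassigned[j] : resource index, or None when task j is not pre-assigned
    P[a][b] == 1   : task a must precede task b
    B[j][i]        : probability that resource i finishes task j in time
    M[j]           : weight of task j
    Returns (schedule, probability), or None when no feasible schedule is found.
    """
    n = len(durations)
    P = [row[:] for row in P]                  # work on a copy of the matrix
    schedule = [[] for _ in range(n_resources)]
    load = [0] * n_resources
    assigned_to = {}
    scheduled = set()
    for _ in range(n):
        # Ready list: unscheduled tasks with no remaining predecessor.
        ready = [j for j in range(n)
                 if j not in scheduled and not any(P[a][j] for a in range(n))]
        if not ready:
            return None                        # no schedulable task
        # Pre-assigned tasks are selected first; ties go to the lower index.
        j = min(ready, key=lambda t: (preassigned[t] is None, t))
        if preassigned[j] is None:
            # Minimum load first, then highest completion probability.
            i = min(range(n_resources), key=lambda r: (load[r], -B[j][r]))
        else:
            i = preassigned[j]
        if load[i] + durations[j] > iteration_length:
            return None                        # overloaded iteration
        schedule[i].append(j)                  # next position of resource i
        load[i] += durations[j]
        assigned_to[j] = i
        scheduled.add(j)
        for b in range(n):
            P[j][b] = 0                        # release the successors of task j
    T = [B[j][assigned_to[j]] for j in range(n)]
    p = sum(t * m for t, m in zip(T, M)) / sum(M)
    return schedule, p


# Hypothetical example: 3 tasks, 2 resources, task 2 must precede task 0.
durations = [4, 3, 2]
preassigned = [None, 1, None]
P = [[0, 0, 0], [0, 0, 0], [1, 0, 0]]
B = [[0.9, 0.8], [0.7, 0.6], [0.5, 0.95]]
M = [1, 1, 1]
result = szoke_bn(durations, preassigned, P, B, M, 2, 10)
print(result)
```

The sketch orders tasks by position on each resource, as the schedule matrix S
does, and does not compute start times.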
3.1.4. Tool and experimental results
a) Building tool
To demonstrate the proposed model and algorithm, we built the tool BAIS
(Bayesian Agile Iteration Scheduling) in the Java programming language. The tool
allows users to enter the number of resources (developers), the number of tasks,
the length of iterations, and the precedence, pre-assignments and durations of
tasks. The tool also requires a table of the probabilities that each resource
finishes its assigned task in time (Figure 3.1).
BAIS implements four strategies for selecting tasks in scheduling (or scheduling
rules):
- SPT (Shortest processing time first): sequences the tasks in increasing order of
their processing time.
- LPT (Longest processing time first): sequences the tasks in decreasing order of
their processing time.
- AF (Assigned First): sequences the tasks based on team pre-assignments.
- AF+LPT: the combination of AF and LPT.
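The four rules above can be read as sort keys over the task list. The sketch
below applies them to the task durations and the single pre-assignment of the
first data sample (Table 3.1). It only illustrates the ordering: the function
names are hypothetical, ties are assumed to be broken by task number, and
precedence constraints and resource loads are ignored, so the sequences produced
by BAIS may differ.

```python
# Task durations (hours) and pre-assignment from the first data sample.
durations = {1: 4, 2: 3, 3: 5, 4: 4, 5: 2, 6: 5, 7: 4, 8: 3}
preassigned = {4: 2}   # task 4 is pre-assigned to resource 2


def spt(tasks):
    """Shortest processing time first."""
    return sorted(tasks, key=lambda j: durations[j])


def lpt(tasks):
    """Longest processing time first."""
    return sorted(tasks, key=lambda j: -durations[j])


def af(tasks):
    """Assigned first: pre-assigned tasks come before the others."""
    return sorted(tasks, key=lambda j: j not in preassigned)


def af_lpt(tasks):
    """Assigned first, then longest processing time first."""
    return sorted(tasks, key=lambda j: (j not in preassigned, -durations[j]))


for rule in (spt, lpt, af, af_lpt):
    print(rule.__name__, rule(durations))
```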
Figure 3.1. Home GUI of tool BAIS
b) Experimental results and analysis
The authors use two data samples in two experiments.
The first sample is randomly generated. It describes an iteration with two
resources and a length of 20 working hours (unbound). The iteration has 8 tasks
and only one precedence constraint: the 5th task needs to be done before the 3rd
one. In Table 3.1, T1 and T2 are the probabilities that resource 1 and resource 2
complete their tasks successfully (i.e., finish in time as scheduled) if the
previous tasks were done in time, and D1 and D2 are the probabilities that they
finish their tasks in time if the previous tasks were over schedule.
Table 3.1. The first data sample
Task  Time  Preassignment  T1    T2    D1    D2
1     4     -              0.92  0.83  0.38  0.34
2     3     -              0.74  0.72  0.27  0.22
3     5     -              0.93  0.85  0.22  0.34
4     4     2              0.94  0.84  0.43  0.36
5     2     -              0.83  0.73  0.26  0.42
6     5     -              0.82  0.96  0.17  0.23
7     4     -              0.73  0.94  0.32  0.52
8     3     -              0.68  0.73  0.27  0.13
The results of the four task-selection strategies SPT, LPT, AF, and AF+LPT are:
- SPT: the makespan (the shortest time in which all resources finish the
iteration) is 16 hours. Resource 1 is assigned tasks 5, 2, 7, 3 and Resource 2 is
assigned tasks 8, 1, 4, 6, in that order. The probability of success is 68.83%.
- LPT: the makespan is 17. Resource 1 is assigned tasks 6, 7, 8, 3 and Resource 2
is assigned tasks 1, 4, 2, 5, in that order. The probability of success is 60.42%.
- AF: the makespan is 15. Resource 1 is assigned tasks 1, 2, 6, 8 and Resource 2
is assigned tasks 4, 5, 3, 7, in that order. The probability of success is 66.77%.
- AF+LPT: the makespan is 17. Resource 1 is assigned tasks 6, 7, 8, 3 and
Resource 2 is assigned tasks 4, 1, 2, 5, in that order. The probability of success
is 60.37%.
According to these results, if the team treats minimizing the makespan as the
primary optimization criterion, the AF strategy should be chosen. If the team wants
the highest probability of success, it should take the SPT strategy. Figure 3.2
shows the Gantt chart produced by BAIS for the SPT strategy.
Figure 3.2. Gantt chart for SPT strategy
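The makespans reported for the first data sample can be reproduced from the task times in Table 3.1. The following is a minimal sketch and not the BAIS implementation. It assumes that each resource works through its tasks back to back, so the makespan is the largest total load of any resource, and it does not model the precedence constraint or the probabilities. The names DURATION, ASSIGNMENTS, resource_load and makespan are illustrative.

```python
# Task durations (hours) taken from Table 3.1.
DURATION = {1: 4, 2: 3, 3: 5, 4: 4, 5: 2, 6: 5, 7: 4, 8: 3}

# Task sequences per resource, as reported for each strategy.
ASSIGNMENTS = {
    "SPT": {1: [5, 2, 7, 3], 2: [8, 1, 4, 6]},
    "LPT": {1: [6, 7, 8, 3], 2: [1, 4, 2, 5]},
    "AF": {1: [1, 2, 6, 8], 2: [4, 5, 3, 7]},
    "AF+LPT": {1: [6, 7, 8, 3], 2: [4, 1, 2, 5]},
}


def resource_load(tasks):
    """Total working time of one resource (assumption: no idle time)."""
    return sum(DURATION[t] for t in tasks)


def makespan(assignment):
    """Time at which the most loaded resource finishes."""
    return max(resource_load(tasks) for tasks in assignment.values())


for name, assignment in ASSIGNMENTS.items():
    print(name, makespan(assignment))
```

Under these assumptions the sketch yields 16, 17, 15 and 17 hours for SPT, LPT, AF and AF+LPT, which agrees with the makespans listed above.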
The second experiment was carried out with real-life project data from Company A.
Three developers need to finish 15 tasks in the project, and each iteration lasts
40 hours. The authors asked the project manager and experts in the company to
provide pre-estimated probabilities of finishing the tasks, based on each
developer’s experience. The calculation from the tool BAIS gives the probability
table shown in Table 3.2.
Table 3.2. The probability table for tasks and resources
Columns: Task; Time; Probability when previous tasks are in time (Re. 1, Re. 2,
Re. 3); Probability when previous tasks are over scheduled (Re. 1, Re. 2, Re. 3)
1
6
1.00 0.96 0.65 0.88
0.8
0.49
2
9
1.00 1.00 0.9
0.85
0.68
3
5
1.00 1.00 0.87 0.7
0.81
0.65
4
6
1.00 1.00 0.83 0.88
0.74
0.75
0.96
Task  Time  Previous tasks in time   Previous tasks over scheduled
            Re. 1  Re. 2  Re. 3      Re. 1  Re. 2  Re. 3
5     9     1.00   1.00   0.89       0.9    0.77   0.66
6     6     1.00   1.00   0.96       0.98   0.9    0.82
7     2     1.00   1.00   0.8        0.85   0.75   0.75
8     6     1.00   1.00   0.77       0.95   0.75   0.68
9     6     1.00   1.00   0.75       0.96   0.9    0.75
10    4     1.00   1.00   0.89       0.85   0.72   0.67
11    8     1.00   1.00   0.89       0.8    0.9    0.67
12    7     1.00   0.94   0.75       0.9    0.74   0.68
13    4     1.00   1.00   0.82       0.84   0.75   0.62
14    3     1.00   0.86   0.75       1.00   0.84   0.67
15    4     1.00   1.00   0.76       0.83   0.9    0.67
The results for makespan and overall probability are as follows. SPT: the makespan
is 30 and the probability of success is 97.98%. LPT and AF+LPT: the makespan is 28
and the probability of success is 46.98%. AF: the makespan is also 28 and the
probability of success is 88.89%. According to these results, the project manager
should pay more attention to task assignments and adjust the iteration lengths
during the project.
The results of both experiments show that the tool can support decisions in agile
software development planning: by altering constraints, capacities and priorities,
the team can tailor the plan to the specific project context and to user and/or
customer feedback. Within a single project, the manager can also use the tool to
predict the schedule of the next phases or iterations and to better understand how
the failure of a phase can impact the whole project.
3.1.5. Conclusion and contribution
This section has developed an algorithm for agile iteration scheduling that
incorporates Bayesian Networks to help software teams analyse schedules and predict
their chance of success. The method improves the quality of agile software
development planning and lowers the level of risk by considering all major planning
factors (e.g. dependencies, capacities) in a mathematical optimization model.
The results of experiments on the available data sets indicate that the approach
can provide practical value as a decision support tool for agile iteration planning.
To further affirm this, more representative real-life data sets are needed, and
some case studies can be carried out.
This section can be considered as a step towards a conceptual model of agile
iteration planning and scheduling. Since the research gives better insight into
resource-constrained project scheduling problems, this may suggest a new
optimization problem on agile iteration scheduling.
The developed tool provides a choice of scheduling rules which make it possible to
compute an optimal active schedule for a single resource or for the overall
project. A further upgrade could incorporate BNs for representing and analysing
causal models involving uncertainty. Such a version could even provide a set of
tools for constructing probabilistic inference and decision support systems on BNs,
and thus assist software project managers in making scheduling and planning
decisions for all kinds of software projects.
3.2. Incorporation of Bayesian Networks into CPM
This section presents the work published in publication 5 [PUB5].
Although project managers nowadays can use a range of tools and techniques to
develop, monitor and control project schedules, creating a project schedule is
often very difficult because it requires planning under uncertainty. Popular
project scheduling techniques are based on the assumption that projects are carried
out as planned or scheduled, which hardly ever happens. This section takes
advantage of BNs in modelling uncertainty and incorporates them into the Critical
Path Method (CPM), one of the most popular means of monitoring project schedules.
The section also examines common risk factors in project scheduling and proposes a
model of 19 common risk factors. A tool was also built, and experiments were
carried out to validate the model.
3.2.1. The RBCPM Model
In this section, the idea is to use BNs to perform the well-known CPM
calculation. In other words, CPM is incorporated into BNs.
As described in Section 1.2.2, the main components of the CPM calculation are
activities. Since this research concerns software projects, the term “task” is used
for “activity”. Tasks are linked together to represent dependencies.
To map a CPM network onto a BN, we first need to map a single task. Each of the
activity parameters formulated in Section 1.2.2 is represented as a node in the BN.
Figure 3.4 shows a schematic model of the partial BN associated with a task. The
figure also shows the relationships between the parameters of a task as well as its
connections with other tasks, based on the CPM algorithm and its incorporation with
BNs.
Figure 3.3. A part of a BN for 19 risk factors
To form the overall CPM network (or the overall BN), in which a task is a node
(and also a variable in the BN), we define the connections between dependent
variables. Predecessor node i and successor node j are connected by:
- Connecting EF of i with ES of j;
- Connecting LS of j with LF of i.
In the directed CPM graph, each task is associated with the parameters D, LS, LF,
ES, and EF. In our model, each task is also affected by a general risk which
represents the set of 19 risks.
A second BN models the 19 common risk factors and a general risk (Figure 3.3). As
mentioned above, their relationships are analysed based on literature review and
project managers’ experience.
Table 3.3 shows the relationships between the 19 risk factors and the general
risk. Each risk factor is represented by a node which may have parent node(s)
and/or child node(s). For example, risk factor (node) 3 has parent node 19 and
child nodes 4, 8, 9.
Table 3.3. Risk factors analysis
No.
Risk factors
Parents
Children
1
Large-scale, offshore and
distributed.
2
Insufficient training.
1
7,10
3
Excessive
preparation/planning.
19
4,8,9
4
Teams are not focused.
3,6,9,18
8
5
Inappropriate process
1,6,12
20
19
4,5
6
The best people not
available for self-organizing team.
7
The skill level of people
(team/developer)
2,11
20
3,12,14
20
8
Staff is not committed for
entire duration of the
project.
9
Ineffective
communication
3,12
4,14
10
Staff does not receive
necessary training.
2,11
20
11
Lack of tools and methods
12
Software tools are not
used to support software
planning and tracking
activities.
2,19
7,10,12,13
11
5,8
11
14,17
13
Configuration
management software
tools are not used to
control and track change
activity throughout the
software process
14
Incorrect scale
9, 13, 16
8
15
Inappropriate technology
7, 19
20
16
Level of team/developer
17
Customer not certain that
the functionality
requested is "do-able".
18
Lack of commitment of
superior management
19
Lack of management
experience
1
20
General Risk
5,7,8,10,15,17
14,17
12,13,16,18
20
4,17
3,15
In addition, each risk factor (node) is associated with a node probability table
(NPT). In our research, each project team or project manager acted as an expert and
provided the values in the NPTs (based on his/her previous experience of the
project features and the team).
Since risks may have an impact on each task, the estimated time for the task
(ED - Estimate D) is no longer a single value but a range of values.
The probability of each task is calculated from the NPTs of D and ES. From the
NPTs of D and ES we obtain the NPT of EF and, from it, the NPT of ES of the
successor task.
$ES + D \rightarrow EF$
$$EF(k) = \sum_{m} \sum_{n} D(m) \cdot ES(n), \quad \text{with } k = m + n \tag{3.10}$$
In the beginning (t = 0), P(ES) is initialized to 1 (100%). Let Pi(m) be the
probability of finishing task i in m days. If m differs from ED - 1, ED, and
ED + 1, then Pi(m) = 0.
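Equation (3.10) is a discrete convolution of the distributions of D and ES. The following is a minimal sketch of that calculation, not the tool's code. It assumes that ES and D are independent and that each task has a single predecessor, and the probability values in d_first and d_second are hypothetical. The names add_distributions, es_first, ef_first and ef_second are illustrative.

```python
def add_distributions(es, d):
    """Distribution of EF = ES + D for discrete ES and D, as in Eq. (3.10).

    Assumption: ES and D are independent, so P(EF = k) sums D(m) * ES(n)
    over all pairs with m + n = k.
    """
    ef = {}
    for n, p_es in es.items():
        for m, p_d in d.items():
            ef[m + n] = ef.get(m + n, 0.0) + p_d * p_es
    return ef


# The first task starts at t = 0 with probability 1.
es_first = {0: 1.0}

# Hypothetical duration tables: mass only on ED - 1, ED and ED + 1.
d_first = {4: 0.2, 5: 0.6, 6: 0.2}   # ED = 5 days
d_second = {2: 0.1, 3: 0.7, 4: 0.2}  # ED = 3 days

ef_first = add_distributions(es_first, d_first)
# With a single predecessor, ES of the successor equals EF of the predecessor.
ef_second = add_distributions(ef_first, d_second)

for day in sorted(ef_second):
    print(day, round(ef_second[day], 4))
```

The resulting table for the second task spreads over 6 to 10 days, which shows how the range of possible finish times widens along a chain of tasks.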
Figure 3.4. Task’s parameters and connection to other tasks.
3.2.2. The RBCPM Method
The CPM calculation can now be adapted into the RBCPM (Risk Bayesian CPM)
procedure for software projects, with the following steps:
Step 1. Specify the individual tasks using a work breakdown structure.
Step 2. Specify the common risk factors (in our research, they are 19 risk
factors).
Step 3. Determine the sequence of those tasks and dependency between them.
Step 4. Determine the relationship between the common risk factors on the
project basis (i.e., context based).
Step 5. Form the BN for each task (that models the task parameters D, LS, LF,
ES, and EF).
Step 6. Form the CPM diagram in form of BN for all the tasks (that models the
tasks and their dependency).
Step 7. Calculate the general risk affecting each task and estimate the
completion time (duration) for each task.
Step 8. Identify the critical path (the longest-duration path through the network).
Step 9. Update the CPM diagram as the project progresses.
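The deterministic core of Steps 3 and 8 is the classical CPM forward and backward pass. The following is a minimal sketch with fixed durations, which leaves out the risk BN and the probability tables. The four-task network in TASKS is hypothetical, and the function name cpm is illustrative.

```python
# Hypothetical four-task network: name -> (duration, predecessors).
# Tasks are listed so that every task appears after its predecessors.
TASKS = {
    "A": (3, []),
    "B": (4, ["A"]),
    "C": (2, ["A"]),
    "D": (5, ["B", "C"]),
}


def cpm(tasks):
    """Return the ES, EF, LS and LF values of every task."""
    es, ef = {}, {}
    # Forward pass: ES of a task is the largest EF of its predecessors.
    for name, (duration, preds) in tasks.items():
        es[name] = max((ef[p] for p in preds), default=0)
        ef[name] = es[name] + duration
    project_end = max(ef.values())
    ls, lf = {}, {}
    # Backward pass: LF of a task is the smallest LS of its successors.
    for name in reversed(list(tasks)):
        successors = [s for s, (_, preds) in tasks.items() if name in preds]
        lf[name] = min((ls[s] for s in successors), default=project_end)
        ls[name] = lf[name] - tasks[name][0]
    return es, ef, ls, lf


es, ef, ls, lf = cpm(TASKS)
# Tasks with zero slack (ES equal to LS) form the critical path.
critical_path = [name for name in TASKS if es[name] == ls[name]]
print(critical_path, max(ef.values()))
```

For this hypothetical network the critical path is A, B, D with a length of 12 time units, and task C has a slack of 2.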
3.2.3. Tool and experimental results
a) The tool RBCPM
The tool RBCPM (Risk Analysis based on Bayesian CPM) was built in the Java
programming language to test the RBCPM model and the procedure described above, as
well as to support project managers.
The tool has the following main functions:
- Model and calculate a project schedule (using the CPM algorithm described in
Sections 1.2.2 and 3.2.1).
- Alert project managers to tasks that could be over-scheduled (and could
therefore affect the whole project).
- Calculate the probability of the schedule (that is, the chance that the project
will be finished on time).
- Calculate the probability of each task (i.e., the probability of finishing it
on time).
- Model the network of risks (as a BN visualization) that have impacts on each
task.
Figure 3.5 is a screenshot of the tool which shows task information and the
probability of finishing the task on time.
Figure 3.5. A screenshot of RBCPM
b) Experimental results and analysis
Data samples used in the experiments can be categorized as:
- Data samples used for the CPM algorithm: sets of tasks with start time,
duration and precedence constraints.
- Data samples used for BNs: NPTs of nodes, which are often provided by the
project manager or a project expert.
Data samples for the CPM algorithm: two samples from real-life projects, as shown
in Table 3.4 and Table 3.5.
In the first data sample there were 13 tasks, and the project was planned to be
finished in 80 days. The initial time allocations and task precedences are shown in
Table 3.4. Similarly, the second project (data sample 2) had 24 tasks and was
planned to be finished in 112 days.
Table 3.4. Data sample 1
No.  Task  Duration  Predecessor
1    A     5         -
2    B     7         A
3    C     4         A
4    D     5         B
5    E     10        B
6    F     7         CD
7    G     6         F
8    H     6         EG
9    I     3         H
10   K     7         F
11   L     8         I
12   M     5         K
13   N     9         LM
Data samples for BNs: nodes in the BN are risks (or uncertainties) for each task
of the project. The authors asked project managers and some other key people in the
projects to judge the NPTs and the relationships between risks based on their
experience in their projects. As initial input they provided a particular set of
values for each node's parent variables, and the NPT of the variable represented by
the node. The project managers provided 7 sets of NPTs for the authors’ BNs of 19
common risks. Each task is impacted by one of these 7 sets of BNs.
For the first experiment, the probability of finishing on time (in the planned 80
days) is 67.56% (Figure 3.6). In fact, the project was completed in 95 days, for
which the tool calculates a probability of 88.34%.
Figure 3.6 and Figure 3.7 are graphical illustrations from the tool that help
users see the probability of finishing the software project by a certain point in
time.
Table 3.5. Data sample 2
No.  Task  Duration  Predecessor
1    A     3         -
2    B     4         A
3    C     4         B
4    D     6         A
5    E     6         CD
6    F     7         E
7    G     10        E
8    H     3         E
9    I     3         H
10   J     4         I
11   K     12        F
12   L     3         GJ
13   M     4         K
14   N     2         KL
15   O     3         M
16   P     10        O
17   Q     4         P
18   R     6         P
19   S     18        N
20   T     7         SR
21   U     6         T
22   V     15        SR
23   X     3         UV
24   Y     3         XQ
For the second experiment, the probability of finishing on time (in the planned
112 days) is 69.11% (Figure 3.7). In fact, the project was completed in 132 days,
for which the tool calculates a probability of over 90%.
Figure 3.6. A result for experiment with data sample 1
The results indicate that the model and the tool are reliable, since the
calculation is consistent with what happened in the real-life projects. However,
the reliability of the proposed model depends on the BN, i.e. the 19 common risk
factors, their relations and their NPTs. Therefore, the feedback from experts and
project managers is crucial to the results. The same is true of any other system
based on BNs.
Results from the RBCPM algorithm also confirm that tasks on the critical path have
an important impact on the overall project. Therefore, project managers need to pay
particular attention to these tasks.
Figure 3.7. A result for experiment with data sample 2
3.2.4. Conclusion and contribution
The research has improved CPM-based scheduling by incorporating BNs, helping
software teams analyse schedules and predict their chance of success in a way that
fully characterizes uncertainty.
The approach makes it possible to capture different sources of risk and use them
to analyse software project scheduling. It also expresses the uncertainty about the
duration of each task and of the whole project as a full probability distribution.
The method provides a good way to deal with uncertainty that cannot be handled in
traditional ways, such as the uncertainty caused by the correlation between
activities and risk factors. This also improves the quality of software development
scheduling and lowers the level of risk by considering all major planning factors
(e.g. dependencies, capacities) in a mathematical model.
Therefore, this is an effective approach for software project scheduling. The
authors plan to carry out further research on incorporating BNs into other
scheduling algorithms (e.g. PERT). Other extensions include incorporating
additional sources of uncertainty into the model and further handling of common
causal risks (which affect more than one task).
The results of experiments on the available data sets indicate that the approach
can provide practical value as a decision support tool for software scheduling and
planning. To further affirm this, more representative real-life data sets are
needed, and some case studies can be carried out.
3.3. Incorporation of Bayesian Networks into PERT
This section presents the work published in publication 6 [PUB6]. It takes
advantage of BNs (including related mathematical calculations) in modelling and
assessing uncertainty and incorporates them into software project scheduling with
the Program Evaluation and Review Technique (PERT) when the level of uncertainty is
high. Common risk factors in project scheduling are also examined, and the model of
19 common risk factors and their causal relationships proposed in Section 2.3.1 is
confirmed. The research also adapts categories and levels of risk from construction
projects to software projects. A tool was also built to experiment with and
validate the proposed model.
3.3.1. Proposed model
Since PERT is similar to CPM, this research proposes a model that follows the
model proposed in Section 3.2, with two main differences: (i) the PERT scheduling
technique is used instead of CPM, and (ii) risk factors are analysed in more depth
using risk categories and levels adapted from construction projects.
a) Common risk factors
In this section, the list of 19 common risk factors (shown in Table 2.9) that
directly or indirectly influence the probability of schedule success is examined in
the same way as in Section 3.2.
To model the risk factors using BNs, each risk factor is represented by a network
node (which is also an event that happens with a certain probability). Each node
may have parent node(s) and/or child node(s). For example, risk factor (node) 3 has
parent node 19 and child nodes 4, 8, 9. A general (total) risk factor represents
the impact of the common risk factors on an activity at a point in time. Table 3.3
analyses each risk factor (network node) and the relationships between the 20 nodes
(19 risk factor nodes and one general risk node).
In addition, each risk factor (node) is associated with a node probability table
(NPT). In our research, each project team or project manager acted as an expert and
provided the values in the NPTs (based on his/her previous experience of the
project features and the team).
b) Proposed Risk Bayesian PERT method
To form the overall PERT network (or the overall BN), in which an activity is a
node (and also a variable in the BN), we define the connections between dependent
variables. Predecessor node i and successor node j are connected by:
- Connecting tEF of i with tES of j;
- Connecting tLS of j with tLF of i.
In the Bayesian PERT network, each activity is associated with the parameters t
(total duration), tLS, tLF, tES, and tEF, as shown in Figure 3.8.
Figure 3.8. Bayesian Network for each activity
In the authors’ model, each activity is also affected by a general (total) risk
which represents the set of 19 risks (Figure 3.9). The above Bayesian PERT network
represents the time nodes of each activity as well as the relationships between
activity nodes. However, as mentioned in Section 3.1, risk factors always have an
impact on each activity, so this impact needs to be brought into the model. The
Bayesian PERT model links t(i) to the tEF of each activity, and the tEF of a
predecessor activity to the tES of the next activity, through the Bayesian Network.
Thus, if the total duration node t(i) is affected by the risks, the risks also
indirectly affect the tES and tEF nodes of the activity. Thanks to this, the amount
of calculation can be greatly reduced while the impact of risks on all time nodes
is still taken into account.
Another BN is formed to model the 19 common risk factors and the general risk as
described in Section 3.1. This BN ensures that all risks have an impact on each
activity node and makes it easy to integrate the impact of risk into the activity
node. Details of the model are shown in Figure 3.9.
It can be seen from Figure 3.9 that the "total duration" node represents the
execution time of the activity node after being affected by risk. The risk model is
represented by a "total risk" node that stands for the whole risk model. The
estimated duration node t(i) is the estimated execution time of the activity in the
scheduling process. The above model is the Risk Bayesian PERT (RBPERT) Model.
Figure 3.9. Risk integration network model into PERT scheduling
c) The improved RBPERT Model
Chang et al. [22] categorized construction project risks into 2 categories and 7
levels. The 2 categories are:
- Risks due to the physical environment.
- Resource risks, which include 5 subcategories: people (the availability of
different skilled or unskilled laborers to perform an activity), machine (the
availability of required equipment to perform an activity), materials (the
availability of required materials to complete an activity), methods (the
availability of appropriate methods to perform an activity), and money (the
availability of required financial arrangements to conduct an activity).
Adapted to software development projects, in this research the machine category is
treated as technological risks, and the materials category as support tools.
Considering all the above risk factors, seven risk levels are defined for
estimating the duration of an activity:
Level 0: No risks
Level 1: There is a risk of physical environment
Level 2: There is a risk of physical environment + 1 of 5 resource risks
Level 3: There is a risk of physical environment + 2 of 5 resource risks
Level 4: There is a risk of physical environment + 3 of 5 resource risks
Level 5: There is a risk of physical environment + 4 of 5 resource risks
Level 6: There is a risk of physical environment + all 5 resource risks
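The seven levels above amount to a simple counting rule. The following is a minimal sketch of that rule, not the RBPERT implementation. The level list does not say how to treat resource risks that occur without a physical environment risk, so the sketch rejects that combination as an explicit assumption. The names RESOURCE_RISKS and risk_level are illustrative.

```python
# The five resource risk subcategories listed in the text.
RESOURCE_RISKS = {"people", "machine", "materials", "methods", "money"}


def risk_level(physical_environment, resource_risks):
    """Map the risks present on an activity to a level from 0 to 6."""
    present = set(resource_risks)
    unknown = present - RESOURCE_RISKS
    if unknown:
        raise ValueError(f"unknown resource risks: {sorted(unknown)}")
    if not physical_environment:
        if present:
            # Assumption: the listed levels do not cover this combination.
            raise ValueError("resource risks without a physical environment risk")
        return 0
    # Level 1 is the physical environment risk alone; each resource risk adds one.
    return 1 + len(present)


print(risk_level(True, ["people", "money"]))
```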
The process of scheduling and risk management in the improved model consists of
the following 8 steps, which are also shown in Figure 3.10.
Figure 3.10. Process in improved RBPERT Model
Step 1: Project creation (initialize the basic information of the project).
Step 2: Activity definition (determine the activities needed to be done in the
project and estimate the activity durations).
Step 3: Relationship connection (draw relationships between activities using
BNs).
Step 4: Risk item breakdown (analyse possible risk factors in the project and
estimate their impacts on the activities).
Step 5: Risk allocation (allocate risk factors to the activities).
Step 6: Information check (check the accuracy of relationships between
activities, and risk factors information).
Step 7: Duration calculation (calculate the total duration of each activity node
by applying the 7-risk-level method).
Step 8: PERT calculation (calculate the overall project duration and critical path
for the 7 risk levels, with PERT technique).
3.3.2. Tool development and data collection
a) The tool RBPERT
The tool RBPERT (see footnote 10) was built in the Java programming language and
includes the following main functions:
- Calculates the start and end times of each task (activity) using the PERT
scheduling technique.
- Calculates and provides a chart of the cumulative probability distribution of
the project completion time.
- Provides the RBPERT model of the project.
Figure 3.11 shows the input screen of the tool, which simply lets the user choose
an input file and produces the PERT schedule as well as the PERT Bayesian Network
associated with the schedule.
Figure 3.11. The input screen of the RBPERT tool
The input file of the tool is an XLS file which contains the list of the tasks in
the software project (Figure 3.12). The task attributes are the id, the PERT time
estimates (optimistic, most likely, pessimistic), the ids of the predecessor tasks
and, optionally, the name of the task.
10 RBPERT (2019). The RBPERT source code and sample data. Available at
https://github.com/tuanmasu/RBPERT/
Figure 3.12. The input file type of the RBPERT tool
The tool processes the input data in the following flow:
First, it reads data from the input file to create the task sequence and build the
predecessor relationships between tasks, and calculates the Duration node and Total
Duration node of each task.
After initializing the tasks, it calculates the parameters of each task: the
earliest start time tES, the earliest finish time tEF, the latest start time tLS,
and the latest finish time tLF.
It then initializes the BNs of each task and between tasks as described in Section
3.2.
Finally, it calculates the cumulative probability of the earliest start time tES
and the earliest finish time tEF of each task.
After the calculation, the tool provides the chart of the cumulative probability
of the project completion time and the RBPERT network of the project (Figure 3.13).
b) Data collection
The authors collected data from 3 real-life projects from a large software
company in Vietnam. The sample data were published on GitHub together with the
source code of the RBPERT tool.
The first software project had 10 tasks (Table 3.6) and was planned to finish in
79 days. In fact, it was completed in 88 days.
Table 3.6. Task attributes of the first data sample
No  id  Optimistic  Most likely  Pessimistic  Predecessor
1   A   2           3            5            -
2   B   24          30           35           A
3   C   15          21           27           B
4   D   1.5         2            2            BC
5   E   1           2            2.5          D
6   F   2           3            4            E
7   G   1.5         2            3            AF
8   H   5           7            10           G
9   I   1.5         2            5            H
10  J   5           7            10           IJ
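The deterministic part of the PERT calculation on the first data sample can be sketched as follows. This is an illustration, not the RBPERT code: it uses the classical PERT three-point estimate (o + 4m + p) / 6, which is assumed here rather than taken from the tool, and it leaves out the risk BN. Task J is taken to depend on task I only, which is also an assumption. The names TASKS, expected_time and earliest_finish are illustrative.

```python
# (id, optimistic, most likely, pessimistic, predecessor ids) from the first
# data sample. Assumption: task J is taken to depend on task I only.
TASKS = [
    ("A", 2, 3, 5, ""),
    ("B", 24, 30, 35, "A"),
    ("C", 15, 21, 27, "B"),
    ("D", 1.5, 2, 2, "BC"),
    ("E", 1, 2, 2.5, "D"),
    ("F", 2, 3, 4, "E"),
    ("G", 1.5, 2, 3, "AF"),
    ("H", 5, 7, 10, "G"),
    ("I", 1.5, 2, 5, "H"),
    ("J", 5, 7, 10, "I"),
]


def expected_time(o, m, p):
    """Classical PERT three-point estimate of a task duration."""
    return (o + 4 * m + p) / 6


def earliest_finish(tasks, duration_of):
    """Forward pass; tasks must be listed after their predecessors."""
    finish = {}
    for task_id, o, m, p, preds in tasks:
        # Each character of the predecessor string is one task id.
        start = max((finish[q] for q in preds), default=0.0)
        finish[task_id] = start + duration_of(o, m, p)
    return finish


planned = max(earliest_finish(TASKS, lambda o, m, p: m).values())
expected = max(earliest_finish(TASKS, expected_time).values())
print(planned, round(expected, 2))
```

With the most likely durations the forward pass gives 79 days, which matches the planned duration of the first project. The three-point estimate gives a slightly longer expected duration because several tasks have a long pessimistic tail.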
The second software project consisted of 9 tasks (Table 3.7) and was expected to
finish in 95 days. In reality, the project was completed in 104 days.
Table 3.7. Task attributes of the second data sample
No  id  Optimistic  Most likely  Pessimistic  Predecessor
1   A   2           3            5            -
2   B   10          14           18           A
3   C   16          21           25           BC
4   D   4           5            7            C
5   E   4           5            7            D
6   F   25          30           34           E
7   G   4           5            7            DEF
8   H   5           7            10           G
9   I   4           5            7            H
The third data sample is from a project of 15 tasks (Table 3.8). The project
lasted almost 900 days because each of the last two tasks took 1 year; these tasks
covered the assessment and confirmation of system security and the supervision of
the whole project. The project was planned to be completed in 892 days but in fact
lasted 907 days.
Figure 3.13. A result for the network provided by the RBPERT tool for the first data
sample
Table 3.8. Task attributes of the third data sample
No  id  Optimistic  Most likely  Pessimistic  Predecessor
1   A   5           7            10           -
2   B   5           7            10           A
3   C   5           7            10           BC
4   D   2           3            4            ABC
5   E   5           7            10           D
6   F   5           7            10           E
7   G   2.5         3            5            F
8   H   4           5            7            G
9   I   10          14           15           H
10  J   50          60           70           I
11  K   6           7            9            J
12  L   12          14           18           K
13  M   16          21           25           L
14  N   360         365          370          M
15  P   360         365          370          N
3.3.3. Experimental results and analysis
Results from the experiments with the RBPERT tool:
- For the first data sample: the probability that the project finishes on time
(79 days) is 69%. The probability that it finishes in 88 days (as actually
happened) is 86%.
- For the second data sample: the probability that the project finishes on time
(95 days) is 68%. The probability that it finishes in 104 days (as actually
happened) is 89%.
- For the third data sample: the probability that the project finishes on time
(892 days) is 71%. The probability that it finishes in 907 days is 81%.
The tool shows quantitatively that the chance of the projects succeeding within
the planned time is not very high. The projects therefore needed more time, and
indeed they lasted longer than planned. However, the probabilities of completing
the projects within their actual durations are not 100% (they are 86%, 89% and 81%
in the experiments on the real-life data samples). This difference can be explained
by the limitations of the PERT technique, which the tool also reflects. Firstly,
PERT requires subjective time estimates for the tasks, so the accuracy of a PERT
schedule relies on these estimates; i.e., the schedule is affected if the people
who provide the estimates are not focused, lack experience or are biased. Secondly,
our research assumes that the critical path remains the critical path throughout
the whole project, which is not always guaranteed in real-life use of PERT.
Besides, like CPM, the PERT technique implemented in this research also assumes
that all resources are available during the whole project (which might not be the
case in a real-life project).
Figure 3.14. A result for RBPERT network provided by the tool for the first data sample
Figure 3.15. A result for experiment with the third data sample (distribution of Total
Duration of activity J)
Another explanation for the differences between the results from the tool and the
real-life projects is that, in real life, project teams can work overtime in order
to finish the project faster (so the real total duration can be longer than
reported).
3.3.4. Conclusion and contribution
The section proposed a probabilistic approach for handling a high level of
uncertainty/risk in software projects by applying BNs to both the scheduling
technique (PERT) and the common risks in software scheduling.
The proposed approach also enriches the benefits of PERT in handling uncertainty
by incorporating risk factors together with the powerful analytical capability of
BNs. It provides a quantitative calculation and analysis of the factors that
influence software schedules and their success, so that software teams can better
analyse PERT-based schedules and predict their chance of success. To better control
project schedules, the approach also helps examine the levels of risk in the
project, so that project managers can capture different sources of risk and use
them to analyse software project scheduling. The classification of risk factors
into groups and levels helps assess the impact of each risk factor on the success
of the project, as well as its impact within each group and level.
To further confirm this approach and to refine the risk categories and levels
adapted from construction projects, more literature study and more representative
real-life data sets are needed, and more case studies could be carried out. The
risk categories could also be applied in a similar model that incorporates CPM and
BNs (as an extension of the results of Section 3.2). Other extensions include
incorporating additional sources of uncertainty into the model and into the risk
categories and levels, and further handling of common causal risks (which affect
more than one task). All of these involve giving more consideration to the specific
attributes of risks related to software scheduling.
3.4. Incorporation of Bayesian Networks into Agile software
development scheduling
This section presents the work published in publication 4 [PUB4]. It proposes
Bayesian Networks to model risk factors in Agile software projects and to manage
risks in Agile iteration scheduling. The section also addresses 19 common risk
factors that affect iteration scheduling. Based on the method, a software tool was
developed to support managers in controlling their project schedules, as it can
assess the probability of success of each schedule.
3.4.1. Incorporation of risk model
In this section, the list of 19 common risk factors (shown in Table 2.11) directly
or indirectly influence the possibility of a schedule success is also examined
similarly to the Section 3.2.1.
Each risk factor is represented by a node with an NPT. Each project team or
project manager acts as an expert to provide the values in the NPTs (based on
his/her previous experience with the project features and the team). Moreover, the
relationships among the risk factors are analysed based on the literature review
and the project managers’ experience.
Each task in an iteration may, at a certain point in time, be impacted by a
general risk which represents the 19 risk factors.
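As a minimal illustration of how a node with an NPT feeds into a general risk, the sketch below marginalises a hypothetical "general risk" node over two parent risk factors by direct enumeration. The factor names, the two-parent structure, the assumption that the parents are independent, and all probability values are invented for illustration; the model in this section uses 19 risk factors whose NPT values come from the project managers.

```python
from itertools import product

# Hypothetical prior probabilities that each risk factor occurs (assumed values).
priors = {"staff_shortage": 0.30, "requirement_change": 0.40}

# Hypothetical NPT: P(general_risk = True | staff_shortage, requirement_change).
npt_general_risk = {
    (False, False): 0.05,
    (False, True): 0.35,
    (True, False): 0.45,
    (True, True): 0.80,
}


def marginal_general_risk(priors, npt):
    """Return P(general_risk = True) by summing over all parent states.

    Parents are treated as independent, which is an assumption of this sketch.
    """
    names = list(priors)
    total = 0.0
    for states in product([False, True], repeat=len(names)):
        weight = 1.0
        for name, state in zip(names, states):
            weight *= priors[name] if state else 1.0 - priors[name]
        total += weight * npt[states]
    return total


p_risk = marginal_general_risk(priors, npt_general_risk)
print(f"P(general risk) = {p_risk:.3f}")
```

Changing an entry of `priors` or `npt_general_risk` and recomputing mimics, in a very reduced form, how a manager might examine a different scenario by adjusting NPT values.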
3.4.2. Tool and experimental results
a) Building tool
To elaborate the proposed model and algorithm, we built the tool BAIS
(Bayesian Agile Iteration Scheduling) using the Java programming language. The
tool allows users to enter the number of resources (developers), the number of
tasks, the length of iterations, and the tasks’ precedences, pre-assignments and
durations. The tool also requires as input a table of the probabilities that each
resource finishes its assigned task on time.
Figure 3.16. A screenshot of tool BAIS
The tool has the following main functions:
- Providing an iteration schedule based on the input data
- Providing the possibilities for each task, resource and the whole iteration
- Displaying BNs for resources. There are also options for adjusting the nodes’
NPTs so that project managers can examine the projects in different scenarios.
Figure 3.16 shows a screenshot of the tool in which the pop-up message shows
an example of the possibility (69.95%) for finishing a schedule.
b) Experimental results and analysis
Two real data samples are used in two experiments:
a) The first sample is an e-commerce project using the Scrum method. The sample
is a sprint (an iteration) with 7 resources and a length of 460 working hours. It
has 43 tasks and five precedence constraints: P[1] = 3, P[2] = 3, P[3] = 4, P[5] =
6, P[6] = 7, where P[i] = j means task j has to be done before task i.
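The precedence notation P[i] = j can be read as a simple mapping from a task to its predecessor. The sketch below stores the five constraints of the first data sample in that form and checks whether a given task order respects them. The function name and the two example orders are hypothetical and are not taken from BAIS; a real iteration schedule also involves resources and durations, which this check ignores.

```python
# Precedence constraints of the first data sample:
# P[i] = j means task j has to be done before task i.
P = {1: 3, 2: 3, 3: 4, 5: 6, 6: 7}


def violates_precedence(order, precedence):
    """Return the (task, predecessor) pairs that the given task order breaks."""
    position = {task: index for index, task in enumerate(order)}
    return [
        (task, pred)
        for task, pred in precedence.items()
        if position[pred] > position[task]
    ]


# Hypothetical orders of the seven constrained tasks (illustration only).
valid_order = [4, 3, 1, 2, 7, 6, 5]
invalid_order = [1, 3, 4, 2, 7, 6, 5]

print(violates_precedence(valid_order, P))
print(violates_precedence(invalid_order, P))
```

The first order satisfies every constraint, while the second starts task 1 before task 3 and task 3 before task 4, so both pairs are reported.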
Table 3.9. The result for the first data sample
Resource  Assigned tasks with completion possibilities
R1        T4: 85%, T8: 76%, T13: 70%, T17: 67%, T22: 65%, T27: 64%, T30: 63%, T40: 62%
R2        T3: 85%, T9: 76%, T14: 70%, T31: 66%, T35: 65%, T41: 63%
R3        T1: 81%, T18: 69%, T23: 62%, T28: 57%, T34: 55%, T37: 53%, T42: 52%
R4        T2: 85%, T10: 76%, T29: 70%
R5        T7: 81%, T11: 70%, T15: 63%, T19: 59%, T24: 56%, T38: 55%
R6        T6: 81%, T20: 70%, T25: 63%, T32: 59%, T36: 56%, T39: 55%, T43: 54%
R7        T5: 84%, T12: 75%, T16: 70%, T21: 66%, T26: 64%, T33: 63%
Table 3.9 shows the result for the first data sample: each resource is listed with
its assigned tasks and their completion possibilities. The overall completion
possibility is 59.85%. In this case, if the sprint is adjusted to 450 hours, the
overall completion possibility is calculated to be 60%. These numbers are reliable
when compared with the real situation of the project.
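As a consistency check on the reported figures, the sketch below averages the completion possibility of the last task assigned to each resource in the first data sample. The text does not state how BAIS aggregates task possibilities into the overall value, so the averaging rule used here is an assumption; it is shown only because its result is close to the reported 59.85%.

```python
# Completion possibility (%) of the last task assigned to each resource,
# copied from the result table of the first data sample.
last_task_possibility = {
    "R1": 62, "R2": 63, "R3": 52, "R4": 70, "R5": 55, "R6": 54, "R7": 63,
}

# Assumed aggregation rule (not confirmed by the text): arithmetic mean.
mean_last = sum(last_task_possibility.values()) / len(last_task_possibility)
print(f"Mean of last-task possibilities: {mean_last:.2f}%")
```

The mean is 419/7, roughly 59.86%, which agrees with the reported overall possibility to within rounding.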
b) The second experiment was carried out with data from a more complex real-life
software project in the education domain. There are 17 developers who need to
finish 85 tasks in 9 Scrum sprints of the project. There are 54 tasks in the first
sprint (10 working days) and 35 tasks in the last sprint (10 working days). The
calculation from the tool BAIS gives the following (as also shown in Figure 3.17):
- The overall probability for the first sprint is 67.12%. In fact, only 35 tasks
were done as scheduled, which corresponds to about 64.8% (35/54). There is a
difference of about 2.3 percentage points between the tool calculation and the real
situation.
- The overall probability for the last sprint is 73.31%. In fact, the sprint had 9
over-scheduled tasks, meaning that 74.29% of its tasks were done as scheduled, which
is almost the same as calculated by the tool (a difference of 0.98 percentage points).
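The comparison between the tool's prediction and the observed outcome is plain arithmetic, and can be reproduced as in the sketch below. The figures are those quoted for the two sprints; the function and variable names are hypothetical.

```python
def actual_rate(done, total):
    """Share of tasks finished as scheduled, in percent."""
    return 100.0 * done / total


# (predicted %, tasks done as scheduled, tasks in the sprint), from the text.
sprints = {
    "first": (67.12, 35, 54),
    "last": (73.31, 35 - 9, 35),  # 9 of the 35 tasks were over-scheduled
}

gaps = {}
for name, (predicted, done, total) in sprints.items():
    actual = actual_rate(done, total)
    gaps[name] = abs(predicted - actual)
    print(
        f"{name} sprint: predicted {predicted:.2f}%, "
        f"actual {actual:.2f}%, gap {gaps[name]:.2f} points"
    )
```

Expressing the gap in percentage points, rather than as a percentage, avoids ambiguity about whether the difference is relative or absolute.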
The results of the second case can be explained by the complexity of the
project: its many tasks, together with many constraints (precedences), increase the
possibility of over-scheduled sprints.
Figure 3.17. The result of the second experiment
The results of both experiments show that the tool can support decisions in
Agile software development planning, helping to tailor the best plan for the
specific project context and for users’ and/or customers’ feedback by altering
constraints, capacities and priorities. In a single project, the manager can also
use the tool to predict the schedule of the next phases or iterations and to better
understand how the failure of a phase can impact the whole project.
3.4.3. Conclusion and contribution
The section has developed an algorithm for Agile iteration scheduling that
incorporates BNs to help software teams analyse their schedules and predict their
chance of success. The method improves the quality of Agile software development
planning and lowers the level of risk by considering all major planning factors
(e.g. dependencies, capacities) in a mathematical optimization model.
The results of experiments on the available data sets indicate that the approach
can provide practical value as a decision support tool for Agile iteration planning.
To further affirm this, more representative real-life data sets are needed, and
case studies can be carried out. The proposed 19 common risk factors in Agile
iteration scheduling can also be further examined and refined by means of literature
review and case studies.
Despite its limitations, this section can be considered a step towards a
conceptual model of Agile iteration planning and scheduling. Since the research
gives better insight into resource-constrained project scheduling problems, it may
suggest a new optimization problem on Agile iteration scheduling.
An upgraded version of the tool BAIS could be developed that incorporates BNs
for representing and analyzing causal models involving uncertainty. Such a version
could even provide a set of tools for constructing probabilistic inference and
decision support systems on BNs, and thus assist software project managers in
making scheduling and planning decisions for all kinds of software projects.
3.5. Chapter remarks
This chapter aims at finding a better model for handling uncertainty and risks in
software project scheduling by considering both scheduling techniques and common
risks in software scheduling. The chapter proposes an improved scheduling method
based on the integration of BNs and risk factors in software project scheduling
with the following techniques: PERT, CPM, and Agile software development
scheduling. The experimental results confirm that the model proposed for managing
common risks in software project scheduling, with 19 common risk factors, works
well with both CPM and PERT in cases of high uncertainty. With these results, the
thesis achieves the second objective (mentioned in the Introduction section).
Conclusion
What has been done
The research has completed the following tasks:
- Carrying out a literature review on project scheduling techniques, software
project scheduling techniques, and risk factors in software project scheduling.
- Proposing scientific models for risk management in software project
scheduling that examine specific features of software project scheduling, such as
common risk factors and their impacts on software project schedules, and that
support software project managers in keeping track of their projects’ schedules.
The models are based on a probabilistic approach using Bayesian Networks and are
applied to both traditional waterfall and agile software development.
- Validating the proposed models by building tools and carrying out experiments
with data from the real world.
The two objectives of the thesis (mentioned in the Introduction section) have
been achieved. The results have been published in 6 conference and journal papers
(see List of scientific publications section for more details).
Main contributions
The research has developed the algorithm BRI (Bayes Risk-Impact) and the tool
CKDY to assess the impacts of risks, and hence proposes common risk factors in
software project scheduling. Based on literature review and experiments, the
research has come up with 19 common risk factors in software project scheduling
(for both agile and traditional development styles).
The research also proposes advanced scheduling methods in software project
development. The methods are based on incorporating Bayesian Networks and common
risk factor models into popular software scheduling techniques such as PERT,
CPM, and Agile software development. Tools have been built to experiment with the
proposed scheduling methods and models. Experimental results show that the
proposed methods and models are reliable and provide practical value to
software development teams in analyzing, monitoring and predicting risks and the
chance of success of the project.
Limitations
Since the thesis aims at solving different pieces of the puzzle of risk management
in software project scheduling (in terms of the software development approach and
the set of risk factors), a different data set is provided for each piece. As a
consequence, there is no consistent data set for all the work presented in the
two main chapters (Chapter 2 and Chapter 3) of the thesis. Moreover, although the
author tried to obtain real software project data from well-known software companies
in Vietnam, his approaches have not yet been applied by those companies to real
ongoing software projects. All the experiments have been done with finished
projects and with the judgements of the projects’ people (especially the projects’
managers).
The definition of empirical evaluation criteria is another limitation of the
thesis, since there is not enough data and information from real software projects
to act as the basis for evaluating against such criteria. The evaluation of the
experimental results in the thesis is currently based on the information provided by
the project teams, who also act as experts in judging the validity of the
experimental results.
In addition, although the research seeks optimization algorithms for software
project scheduling, it has not yet proposed a brand new algorithm. The research
has only improved the existing methods and algorithms using probabilistic
approaches.
Further research
The results of experiments on the available data sets indicate that the
approach proposed in the research can provide practical value as a decision support
tool for software scheduling and planning. To further affirm this, more
representative real-life data sets are needed, and case studies can be carried out
in real ongoing software projects. The author intends to build consistent data sets
for both traditional and agile software project development; these data sets could
also be contributed to the research community.
Further research can incorporate additional uncertainty sources into the
model, or further handle common causal risks (which affect more than one task).
The list of 19 common risk factors in software scheduling could be further refined
by case studies or surveys.
The research will also go further in seeking a software scheduling
optimization algorithm using Bayesian Networks.
List of scientific publications
PUB1: Nguyễn Ngọc Tuấn, Huỳnh Quyết Thắng (2017), “Iteration scheduling
using Bayesian networks in Agile Software Development”, Proceedings of the 10th
National Conference on Fundamental and Applied Information Technology Research
(FAIR’10), Đà Nẵng, 17-18 August 2017, pp. 300-308, ISBN: 978-604-913-614-6.
PUB2: Nguyễn Ngọc Tuấn, Võ Thị Hường, Huỳnh Quyết Thắng (2017), “Hướng
tới mô hình mạng Bayes để đánh giá rủi ro trong lập lịch dự án phần mềm”
[Towards a Bayesian network model for assessing risks in software project
scheduling], Proceedings of the 10th National Conference on Fundamental and
Applied Information Technology Research (FAIR’10), Đà Nẵng, 17-18 August 2017,
pp. 275-282, ISBN: 978-604-913-614-6.
PUB3: Nguyễn Ngọc Tuấn, Trần Trung Hiếu, Huỳnh Quyết Thắng (2017),
“Phương pháp xác suất cải tiến sử dụng mạng bayes đánh giá rủi ro trong lập lịch
dự án phần mềm” [An improved probabilistic method using Bayesian networks for
assessing risks in software project scheduling], Section on Information and
Communication Technology, Journal of Science and Technique, Military Technical
Academy (Học viện KTQS), No. 184 (06-2017), pp. 45-61, ISSN: 1859-0209.
PUB4: Nguyen Ngoc Tuan, Huynh Quyet Thang (2018), “Risk management in
Agile software project iteration scheduling using Bayesian Networks”, New Trends
in Intelligent Software Methodologies, Tools and Techniques, Volume 303, 2018,
pp. 596-606 (SOMET 2018), ISBN: 978-1-61499-899-0, DOI:
10.3233/978-1-61499-900-3-596, SCOPUS Indexed.
PUB5: Ngoc-Tuan Nguyen, Quyet-Thang Huynh, Thi-Huong-Giang Vu (2018), “A
Bayesian Critical Path Method for Managing Common Risks in Software Project
Scheduling”, SoICT 2018 Proceedings of the 9th International Symposium on
Information and Communication Technology, Danang City, Viet Nam, December
06-07, 2018, ISBN: 978-1-4503-6539-0, pp. 382-388, DOI:
10.1145/3287921.3287962.
PUB6: Quyet-Thang Huynh, Ngoc-Tuan Nguyen (2020), “Probabilistic Method for
Managing Common Risks in Software Project Scheduling Based on Program
Evaluation Review Technique”, International Journal of Information Technology
Project Management, Volume 11(3), pp. 77-94, ISSN: 1938-0232, DOI:
10.4018/IJITPM.2020070105.
119
References
[1] Moore T. (2018), “Worst failure of public administration in this nation: payroll
system”, The Sydney Morning Herald, Retrieved 24 July 2018, available online.
[2] Glick B. (2014), “Government finally scraps e-Borders programme”,
ComputerWeekly.com, Retrieved 24 July 2018, available online.
[3] Boehm B.W. (1991), “Software Risk Management: Principles and Practices”,
IEEE Software, 8(1), pp. 32–41.
[4] Dedolph M. (2003), “The Neglected Management Activity: Software Risk
Management”, Bell Labs Technical Journal, 8(3), pp. 91–95.
[5] Hui A.K.T. and Liu D.B. (2004), “A Bayesian Belief Network model and tool to
evaluate risk and impact in software development projects”, Reliability and
Maintainability, 2004 Annual Symposium – RAMS, pp. 297-301.
[6] Karollay G. O. V., Carlos E. S. S., Sandra M. N. (2020), “Risk Management in
Software Development Projects: Systematic Review of the State of the Art
Literature”, International Journal of Open Source Software and Processes (IJOSSP)
11(1), pp. 1-22.
[7] PMI (2017), “A Guide to the Project Management Body of Knowledge (PMBOK
Guide)”, 6th Edition, Project Management Institute.
[8] Rao B.H., Gandhy A. & Rathod R.R. (2013). “A Brief View of Project
Scheduling Techniques”, International Journal of Engineering Research &
Technology, 2(12), pp. 1555-1559.
[9] Jun-yan J. (2012), “Schedule Uncertainty Control: A Literature review”,
Physics Procedia, Volume 33, pp. 1842 – 1848.
[10] Kaur R. et al. (2013), “A review of various software project scheduling
techniques”, International Journal of Computer Science & Engineering Technology,
4(7), pp. 877-882.
[11] Williams T. (1995), “A Classified Bibliography of Recent Research Relating to
Project Risk Management”, European Journal of Operational Resarch, 85(1), pp.
18-38.
[12] Malcolm et al. (1959), “Application of a Technique for Research and
Development Program Evaluation”, Operations Research, 7(5), pp. 646-669.
120
[13] Miller R.W. (1962), “How to plan and control with PERT”, Harvard Business
Review, pp. 93-104.
[14] Ward S. and Chapman C. (2003), “Transforming project risk management into
project uncertainty management”, International Journal of Project Management, 21,
pp. 97-105.
[15] Khodakarami V. (2009), “Applying Bayesian Networks to model uncertainty in
project scheduling”, PhD dissertation, Queen Mary, University of London.
[16] Erhan P., Yasemin S. and Barbaros Y. (2020), “Integrating Risk into Project
Control Using Bayesian Networks”, International Journal of Information
Technology & Decision Making, 19(5), pp. 1327-1352.
[17] Ali N., Siamak H. Y., Vahidreza Y. and Jolanta T. (2019), “Combining Monte
Carlo Simulation and Bayesian Networks Methods for Assessing Completion Time
of Projects under Risk”, International Journal of Environmental Research and
Public Health, 16, 5024; doi:10.3390/ijerph16245024.
[18] Lee, Y. P. and Shin J. G. (2009), “Large Engineering Project Risk
Management Using a Bayesian Belief Network”, Expert Systems with Applications,
vol. 36(3), pp. 5880–5887.
[19] Sharma S.K. and Chanda U. (2017), “Developing a Bayesian Belief Network
model for prediction of R&D project success”, Journal of Management Analytics,
vol. 4 (2), pp.1-24.
[20] Khodakarami V., Fenton N. and Neil M. (2007), “Project Scheduling:
Improved Approach to Incoporate Uncertainty using Bayesian Networks”, Project
Management Journal, 38(2), pp. 39-49.
[21] Fenton N.E. and Neil M. (2014), “Decision support software for probabilistic
risk assessment using Bayesian Networks”, IEEE Software, 31(2), pp. 21-26.
[22] Chang H.K, Yu W.D. and Cheng S.T. (2017), “A Risk-based Critical Path
Scheduling Method (I): Model and Prototype Application System”, Proceedings of
34th International Symposium on Automation and Robotics in Construction ISARC.
[23] Kumar, C. & Yadav, D.K. (2015), “A Probabilistic Software Risk Assessment
and Estimation Model for Software Projects”, Procedia Computer Science, 54, pp.
353–361.
[24] Hu Y. et al. (2013), “Software Project Risk Analysis Using Bayesian Networks
with Causality Constraints”, Decision Support Systems, vol. 56, pp. 439–449.
121
[25] Anthony B.J. et al. (2016), “A Proposed Risk Assessment Model for Decision
Making in Software Management”, Journal of Soft Computing and Decision
Support Systems, vol. 3 (5), pp. 31-43.
[26] Rai A. K., Agrawal S. and Khaliq M. (2017), “Identification of Agile Software
Risk Indicators and Evaluation of Agile Software Development Project Risk
Occurrence Probability”, Proceedings of 7th International Conference on
Engineering Technology, Science and Management Innovation (ICETSMI-2017),
pp. 489-494.
[27] Szoke A. (2014), “Models and Algorithms for Integrated Agile Software
Planning and Scheduling”, PhD Dissertation.
[28] Wallace L., Keil M. and Rai A. (2004), “How software project risk affects
project performance: an investigation of the dimensions of risk and an exploratory
model”, Decision Sciences, 35(2), pp. 289-321.
[29] J. Menezes Jr., Gusmao C. and Moura H. (2013), “Defining Indicators for Risk
Assessment in Software Development Projects”, CLEI Electronic Journal, 16(1).
[30] Sadiq M. and Shahid M. (2013), “A Systematic Approach for the Estimation of
Software Risk and Cost using EsrcTool”, CSIT, vol. 1(3): 243–252.
[31] Kumar C. and Yadav D. K. (2015), “A Bayesian Approach of Software Risk
Assessment”, International Journal of Applied Engineering Research (IJAER), 10,
pp. 2366-2371.
[32] Jefferson F.B., Hermano P.d.M, Marcelo L.M.M (2020), “Towards a
Quantitative Model to Deal with Uncertainty Management in Software Projects”,
The XI Brazilian Software Congress: Theory and Practice.
[33] Yong J. & Zhigang Z. (2011), “The Project Schedule Management Model
Based on the Program Evaluation and Review Technique and Bayesian Network”,
Proceedings of the IEEE International Conference on Automation and Logistics,
Chongqing, China, pp. 379-383.
[34] Wrike blog (2019), “What Is Software Project Management?”, Retrieved 3
September 2019, available online.
[35] Moder J. (1988), “Network Techniques in Project Management”, Project
Management Handbook, New York, Van Nostrand Reinhold.
122
[36] Fortune J. and White D. (2006), "Framing of Project Critical Success Factors
by a Systems Model", International Journal of Project Management, 24(1), pp. 5365.
[37] Fenton N. and Neil M. (2013), “Risk Assessment and Decision Analysis with
Bayesian Networks”, Reading book, CRC Press.
[38] Pollack-Johnson B. and Liberatore M.J. (2005), “Project Planning under
Uncertainty Using Scenario Analysis”, Project Management Journal, 36(1), pp. 1526.
[39] Van Slyke R.M. (1963), “Monte Carlo Methods and the Pert Problem”,
Operations Research, 11(5), pp. 839-860.
[40] Fishman G.S. (1986). A Monte Carlo Sampling Plan for Estimating Network
Reliability. Operations Research, 34(4), pp. 581-594.
[41] Ragsdale C. (1989), “The current state of network simulation in project
management theory and practice”, Omaga, 17(1), pp. 21-25.
[42] Oracle (2018), “Oracle Primavery Risk Analysis (Pertmaster®)”, Emerald
Associates, available online.
[43] PMI. (1999), “Project Management Software Survey”, Newtown Square, PA:
Project Management Institute.
[44] Pollack-Johnson B. and Liberatore M.J. (2003), "Analytical Techniques in
Project Planning and Control: Current Practice and Future Research Directions",
Unpublished manuscript, Villanova University.
[45] Van Dorp J. R., Duffey M. R. (1999), “Modelling statistical dependence in
risk analysis for project networks”, International Journal of Production Economics,
58, pp. 17-29.
[46] Williams T. (2004), “Why Monte Carlo Simulations of Project Networks Can
Mislead”, Project Management Journal, 35(3), pp. 53-61.
[47] Liberatore M.J. (2002), “Project Schedule Uncertainty Analysis Using Fuzzy
Logic”, Project Management Journal, 33(4), pp. 15-22.
[48] Kuchta D. (2001), “Use of Fuzzy Numbers in Project Risk Assessment”,
International Journal of Project Management, 19(5), pp. 305-310.
[49] Bonnal P. et al. (2004), “Where do we stand with Fuzzy project scheduling?”,
Journal of Construction Engineering & Management, 130(1), pp. 114-123.
123
[50] Abrahamsson P. et al. (2002), “Agile Software Development methods: Review
and Analysis”, VTT Publications 478, pp. 3-107.
[51] Stalhane T. and Hanssen G. K. (2008), “The application of ISO 9001 to Agile
Software Development”, PROFES 2008, pp. 371-385.
[52] Schwaber K. (1995), “The Scrum development process”, In OOPSLA ’95
Workshop on Business Object Design and Implementation, Austin, Texas, USA,
October 1995. ACM Press.
[53] Huo M., Verner J., Zhu L., Babar M.A. (2004), “Software quality and Agile
methods”, Proceedings of COMPSAC’04, pp. 520-525.
[54] Wailgum T. (2007), “From Here to Agility”, CIO.com, Retrieved June 2018,
available online.
[55] Glazer H., Dalton J., Anderson D., Konrad M., Shrum S. (2008), “CMMI or
Agile: Why not embrace both!”, Technical Note CMU/SEI-2008-TN-003, Software
Engineering Institute, Carnegie Mellon University.
[56] Cohn M. (2005), “Agile Estimating and Planning”, NJ, USA: Prentice Hall
PTR, ISBN: 0131479415.
[57] PRAM (2004), “Project Risk Analysis and Management Guide”, High
Wycomb, Association for Project Management (APM).
[58] RAMP (2005), “Risk Analysis and Management for Projects”, London
Institute of Civil Engineering and the Faculty and Institute of Actuaries, Thomas
Telford.
[59] Chapman C. (2006), “Key Point of Contention in Framing Assumptions for
Risk and Uncertainty Management”, International Journal of Project Management,
24(4), pp. 303-313.
[60] Barry, J. B. (1995), “Assessing Risk Systematically”, Risk Management, 42,
pp. 12-15.
[61] Williams T. M. (1994), “Using a Risk Register to Integrate Risk Management
in Project Definition”, International Journal of Project Management, 12(1), pp. 1722.
[62] Ward S.C. (1999), “Assessing and Managing Important Risks”, International
Journal of Project Management, 17(6), pp. 331-336.
124
[63] Patterson F.D. and Neailey K. (2002), “A Risk Register Database System to
Aid the Management of Project Risk”, International Journal of Project Management,
20(5), pp. 365-374.
[64] Hillson D. (1999), “Developing Effective Risk Responses”, Proceedings of the
30th Annual Project Management Institute Seminars and Symposium, Philadelphia
USA.
[65] Al-Bahar J. and Crandall K.C. (1990), “Systematic Risk Management
Approach for Construction Projects”, Journal of Construction Engineering and
Management, 116(3), pp. 533-546.
[66] UK Ministry of Defence (1991), “Risk Management in Defence Procurement”,
Ministry of Defence, Whitehall, London.
[67] del Caano A. and de la Cruz M.P (2002), “Integrated Methodology for Project
Risk Management”, Journal of Construction Engineering and Management, 128(6),
pp. 473-485.
[68] Wideman R.M. (1992), “Project and Program Risk Management”, Newtown
Square, PA, USA, Project Management Institute.
[69] BSI (1999), “Guide to Project Management”, London, British Standard.
[70] Rosenberg L.H. et al. (1999), “Continuous Risk Management at NASA”,
NASA, available online.
[71] Defense Systems Management College (2000), “Risk Management Guide for
Dod Acquisition”, USA, Department of Defense.
[72] US Department of Transportation (2000), “Project Management in the
Department of Transportation”.
[73] Baber R.B. (2005), “Understanding Internally Generated Risks in Projects”,
International Journal of Project Management, 23(8): 584-590.
[74] Goldstein M. (2006), “Subjective Bayesian analysis: Principle and practice”,
Bayesian Analysis, 1(3), pp. 403-420.
[75] Joshua H., Martin N., Norman E. F. (2020), “Product risk assessment: a
Bayesian network approach”, Proceedings of the 2020 ACM Southeast Conference,
April 2020, pp. 34–38.
125
[76] McCabe B. (1998), “Belief Networks in Construction Simulation”,
Proceedings of the 30th Conference on Winter simulation, IEEE Computer Society
Press.
[77] Nasir D., McCabe B. et al. (2003), "Evaluating Risk in Construction-Schedule
Model (Eric-S): Construction Schedule Risk Model", Journal of Construction
Engineering & Management, 129(5), pp. 518-827.
[78] Houston D. (2000), “Survey on potential effects of major development risk
factors”, Arizona State University Research Project.
[79] Cortellessa V. et al. (2005), “Model-Based Performance Risk Analysis”, IEEE
Transactions on Software Engineering, 31(1): 3–20.
[80] Islam S. (2012), “Software Development Risk Management Model - A GoalDriven Approach”, Technical Report.
[81] Alberts C.J. and Dorofee A.J. (2010), “Risk management framework”, SEI
Technical Report.
[82] NASA Policy Detective (2005), NPD 2820.1A NASA Software Policies.
[83] IEEE Computer Society (2001), “IEEE Standard for Software Life Cycle
Processes - Risk Management”.
[84] Tore D. and Torgeir D. (2008), “Empirical studies of agile software
development: A systematic review”, Information and Software Technology 50.9-10,
pp. 833–859.
[85] Augustine S. (2005), “Managing Agile Projects”, Upper Saddle River, NJ,
USA: Prentice Hall PTR.
[86] Tsun Chow and Dac-Buu Cao (2008), “A survey study of critical success
factors in agile software projects”, Journal of System and Software 81(6), pp. 961–
971.
[87] Schwaber K. and Beedle M. (2001), “Agile Software Development with
Scrum”.
[88] Martin R.C. (2002), “Agile Software Development, Principles, Patterns and
Practices”.
[89] Miller A. (2008), “Distributed Agile Development at Microsoft patterns and
practices”.
126
[90] Agile Alliance, “Manifesto for agile software development”, [Online]
Retrieved 14 May 2017. Available at: http://agilemanifesto.org
[91] Nguyen N.T. and Huynh Q.T. (2013), “Combining Maturity and Agility –
Lessons Learnt From A Case Study”, Proceedings of the 4th International
Symposium on ICT SoICT 2013, pp. 267-274.
[92] VersionOne, 7th Annual Survey (2013), “The State of Agile Development”,
Full Data Report.
[93] Fox T. L. and Spence J. W. (1998), “Tools of the trade: a survey of project
management tools”, Project Management Journal, 29, pp. 20-28.
[94] Pollack-Johnson B. (1998), “Project management software usage patterns and
suggested research directions for future development”, Project Management
Journal, 29, pp. 19-29.
127
Appendix. Sub Bayesian Networks of the 24 risk factors
This appendix presents in detail the sub BNs associated with the 24 risk factors
which were examined in Section 2.1.2.
[Figure nodes: risk factor staff_experience_shortage; connected: +untrained_staff, +staff_training, +project_schedule]
Figure 1. A sub BN for the risk factor “Staff experience shortage”
[Figure nodes: risk factor reliance_on_a_few_person; connected: +decision_make_delay, +productivity, +low_moral]
Figure 2. A sub BN for the risk factor “Reliance on few key person”
Figure 3. A sub BN for the risk factor “Schedule pressure”
Figure 4. A sub BN for the risk factor “Low productivity”
[Figure nodes: risk factor lack_of_staff_commitment; connected: ++productivity, +loss_of_staff, +staff_experience_shortage]
Figure 5. A sub BN for the risk factor “Lack of staff commitment”
[Figure nodes: risk factor lack_of_client_support; connected: +defect_rate, ++lack_of_client_input, +lack_of_staff_commitment, ++missed_requirement, +creeping_user_requirements]
Figure 6. A sub BN for the risk factor “Lack of client support”
[Figure nodes: risk factor lack_of_contact_person_competence; connected: ++decision_making_delay, ++low_moral, +rework, +++schedule_pressure, +communication_overhead, ++missed requirement, +creeping_user_requirements]
Figure 7. A sub BN for the risk factor “Lack of contact person competence”
[Figure nodes: risk factor lack_of_quantitative_historical_data; connected: ++inaccuring_cost_estimating]
Figure 8. A sub BN for the risk factor “Lack of quantitative historical data”
[Figure nodes: risk factor inaccurate_cost_estimating; connected: +staff_experience_shortage, ++schedule_pressure]
Figure 9. A sub BN for the risk factor “Inaccurate cost estimating”
[Figure nodes: risk factor large_and_complex_external_interface; connected: ++large_and_complex_project]
Figure 10. A sub BN for the risk factor “Large and complex external interface”
[Figure nodes: risk factor large_and_complex_project; connected: +communication_overhead, +++defect_rate]
Figure 11. A sub BN for the risk factor “Large and complex project”
[Figure nodes: risk factor unnecessary_features; connected: ++project_size]
Figure 12. A sub BN for the risk factor “Unnecessary features”
[Figure nodes: risk factor creeping_user_requirement; connected: ++rework, ++project_size]
Figure 13. A sub BN for the risk factor “Creeping user requirement”
[Figure nodes: risk factor unreliable_subproject_delivery; connected: +defect_rate, ++schedule_delay]
Figure 14. A sub BN for the risk factor “Unreliable subproject delivery”
Figure 15. A sub BN for the risk factor “Incapable project management”
[Figure nodes: risk factor lack_of_senior_management_commitment; connected: +staff_experience_shortage, +low_moral, ++project_schedule, +schedule_pressure]
Figure 16. A sub BN for the risk factor “Lack of senior management commitment”
[Figure nodes: risk factor lack_of_organization_maturity; connected: ++incurate_cost_estimating, ++inadequate_process_method, ++schedule_pressure]
Figure 17. A sub BN for the risk factor “Lack of organization maturity”
[Figure nodes: risk factor immature_technology; connected: +rework, +schedule_pressure, ++inadequate_process_method, +productivity, +defect_rate]
Figure 18. A sub BN for the risk factor “Immature technology”
[Figure nodes: risk factor inadequate_configuration_control; connected: +rework, +productivity, ++defect_rate, +manual_efforts, +project_schedule]
Figure 19. A sub BN for the risk factor “Inadequate configuration control”
[Figure nodes: risk factor excessive_paper_work; connected: +defect_rate, +productivity, ++low_moral]
Figure 20. A sub BN for the risk factor “Excessive paperwork”
[Figure nodes: risk factor inaccurate_metrics; connected: +schedule_pressure, ++inaccurate_reporting, ++inaccurate_cost_estimating]
Figure 21. A sub BN for the risk factor “Inaccurate metrics”
[Figure nodes: risk factor excessive_reliance_on_a_single_process_improvement; connected: +inaccurate_cost_estimating, +schedule_pressure, +defect_rate]
Figure 22. A sub BN for the risk factor “Excessive reliance on a single process improvement”
[Figure nodes: risk factor lack_of_experience_with_project_environment; connected: +communication_overhead, ++productivity, ++staff_training]
Figure 23. A sub BN for the risk factor “Lack of experience with project environment”
[Figure nodes: risk factor lack_of_experience_with_project_software; connected: +defect_rate, +communication_overhead, +productivity, +staff_training]
Figure 24. A sub BN for the risk factor “Lack of experience with project software”