Software Dependencies - EEL6883 - Software Engr II

Presented by: Abirami Poonkundran

This paper is a case study on the impact of
◦ Syntactic Dependencies,
◦ Logical Dependencies and
◦ Work Dependencies
on a software development project, and
identifies which dependencies have the
highest impact on fault proneness


Introduction
Software Dependencies
◦ Syntactic Dependencies
◦ Logical Dependencies
◦ Work Dependencies





Data Collection
Measuring Failure
Results
Conclusion
Pros and Cons


Research has shown that software faults are
caused by violation of dependencies
Dependencies could be:
◦ Software Dependencies
 Technical
 Caused by developers
◦ Work Dependencies
 Organizational
 Caused by how work is organized

This paper examines the relative impact that
each of these dependencies has on the fault
proneness of the software system

Software Dependencies could be:
◦ Syntactic
◦ Logical



Syntactic dependencies focus on control-flow and
data-flow relationships
Dependencies are discovered by analysis of
source code or from an intermediate
representation like byte code or syntax trees
These dependencies could be:
◦ Data Related Dependency - e.g., a particular data
structure modified by a function and used in
another function
◦ Functional Dependency – e.g., Method A calls
Method B
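
A minimal sketch of how both kinds of syntactic dependency could be recovered from source code. The paper analyzes C code; Python's standard `ast` module is used here only for illustration, and the sample source and its names (`shared_config`, `load_settings`, `run`) are hypothetical.

```python
import ast

# Hypothetical sample: load_settings modifies shared_config, run reads it
# (data-related dependency), and run calls load_settings (functional dependency).
SOURCE = '''
shared_config = {}

def load_settings():
    shared_config["mode"] = "fast"

def run():
    load_settings()
    return shared_config["mode"]
'''

def functional_edges(source):
    """Return (caller, callee) pairs for calls to functions defined in source."""
    tree = ast.parse(source)
    defined = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    edges = set()
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id in defined):
                edges.add((func.name, node.func.id))
    return edges

def data_edges(source):
    """Return (function, variable) pairs for module-level variables a function references."""
    tree = ast.parse(source)
    module_vars = {
        target.id
        for n in tree.body if isinstance(n, ast.Assign)
        for target in n.targets if isinstance(target, ast.Name)
    }
    edges = set()
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if isinstance(node, ast.Name) and node.id in module_vars:
                edges.add((func.name, node.id))
    return edges
```

Two functions that reference the same variable share a data-related dependency, while a (caller, callee) pair is a functional dependency.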



Logical dependencies are dependencies between the
source code files of a system that are changed
together as part of software development
Often Logical Dependencies provide more
valuable information than Syntactic
Dependencies (e.g., in Remote Procedure
Calls)
They can identify important dependencies
that are not visible in Syntactic Code analysis



Only recently has research started shedding light
on the impact of human and organizational
factors on the failure proneness of software
systems
Failures related to work dependencies are caused by a
lack of proper communication and coordination
between developers
Research has shown that identifying and
managing work dependencies is a major
challenge

The study examined two large software development
projects:
◦ Project A:
 Complex distributed system
 Data cover 3 years of development activity
 114 developers grouped into 8 development teams
across 3 development locations
 ≃ 5 million lines of C code distributed across 7,737
source code files
◦ Project B:
 Embedded software system
 40 developers in the project over a period of 5 years
 1.2 million lines of code in C and C++



In both projects, every change to source code
was controlled by Modification Requests (MR)
Every change made to the source code had to be
committed to the version control system
Information used for this analysis:
◦ A total of 8,257 MRs for Project A and 3,372 MRs
for Project B
◦ Version control system data from both projects
◦ The source code itself from both projects


The goal is to investigate failure proneness at the
file level
File Buggyness – indicates whether a file has
been modified in the course of resolving a
defect
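
A small sketch of how the file buggyness measure could be derived from MR data. The record layout (`is_defect_fix`, `files`) and the file names are assumptions made for illustration, not the format used in the paper.

```python
def buggy_files(modification_requests):
    """Return the set of files touched by at least one defect-resolving MR."""
    buggy = set()
    for mr in modification_requests:
        if mr["is_defect_fix"]:
            buggy.update(mr["files"])
    return buggy

# Hypothetical MR records
mrs = [
    {"id": "MR-1", "is_defect_fix": True, "files": ["net.c", "util.c"]},
    {"id": "MR-2", "is_defect_fix": False, "files": ["ui.c"]},
]

# One yes/no value per file: was it modified while resolving a defect?
all_files = ["net.c", "util.c", "ui.c"]
buggyness = {name: name in buggy_files(mrs) for name in all_files}
```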




The C-REX tool was used to identify programming
language tokens and references in each entity
of each source-code file
A source code snapshot was taken every
quarter
Syntactic dependency analysis was done for
each source code snapshot
Syntactic dependencies between source code
files were identified by data, function and
method references




Logical dependencies relate source-code files that
are modified together as part of an MR
If only one file was changed for an MR, then
there are no dependencies
Using the commit information from the
version control system, a logical dependency
matrix (LDM) was created
LDM is a symmetric matrix of source-code
files where Cij represents the sum, across all
releases, of the number of times files i and j
were changed together as part of an MR
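
The LDM construction described above can be sketched as follows. This is an illustration under assumptions: the MR identifiers and file names are hypothetical, and the symmetric matrix is stored sparsely as counts keyed by a sorted file pair.

```python
from collections import Counter
from itertools import combinations

def logical_dependency_matrix(mr_files):
    """Count, for each pair of files, how many MRs changed both files."""
    counts = Counter()
    for files in mr_files.values():
        # An MR that changed a single file yields no pairs, hence no dependency.
        for a, b in combinations(sorted(set(files)), 2):
            counts[(a, b)] += 1
    return counts

def co_changes(ldm, i, j):
    """Look up Cij; sorting the pair makes the lookup symmetric (Cij == Cji)."""
    return ldm[tuple(sorted((i, j)))]

# Hypothetical commit data: MR id -> files changed under that MR
mr_files = {
    "MR-1": ["a.c", "b.c"],
    "MR-2": ["a.c", "b.c", "c.c"],
    "MR-3": ["c.c"],
}
ldm = logical_dependency_matrix(mr_files)
```

With this data, `co_changes(ldm, "a.c", "b.c")` is 2 because both MR-1 and MR-2 changed the two files together, and MR-3 contributes nothing.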

Work dependencies were captured using two measures:
◦ Workflow Dependencies
 Capture the temporal aspects of the development
effort
 Two developers i and j are said to be interdependent if
an MR was transferred from developer i to
developer j at some point during that MR
◦ Coordination Requirements
 Capture the intradeveloper coordination requirements
 Use two matrices:
 Task Assignment Matrix – Developer to file matrix
 Task Dependency Matrix – File to file matrix
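
A rough sketch of both measures. The slides name the two matrices but not how they are combined; the product Task Assignment × Task Dependency × (Task Assignment transposed) used below is an assumption, as are the developer names, the MR ownership histories, and the matrix values.

```python
def workflow_pairs(mr_owner_history):
    """Return developer pairs linked by an MR being handed from one to the other."""
    pairs = set()
    for owners in mr_owner_history.values():
        for previous, current in zip(owners, owners[1:]):
            if previous != current:
                pairs.add(frozenset((previous, current)))
    return pairs

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def coordination_requirements(task_assignment, task_dependency):
    """Developer-by-developer matrix (assumed form: TA x TD x TA^T)."""
    return matmul(matmul(task_assignment, task_dependency),
                  transpose(task_assignment))

# Hypothetical data: MR-1 moved from alice to bob; MR-2 stayed with carol.
history = {"MR-1": ["alice", "bob"], "MR-2": ["carol"]}

# Rows: alice, bob. Columns: files f0, f1, f2 (1 = developer changed the file).
task_assignment = [[1, 0, 0],
                   [0, 1, 1]]
# File-to-file dependencies: f0-f1 and f1-f2.
task_dependency = [[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]]
cr = coordination_requirements(task_assignment, task_dependency)
```

Here `cr[0][1]` is 1: alice and bob need to coordinate because alice's file f0 depends on bob's file f1.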

The analysis consists of two stages:
◦ First Stage: Focused on examining the relative impact
of each dependency type on failure proneness of
source-code files
◦ Second Stage: Verified the consistency of the initial
results by conducting a number of confirmatory
analyses

Constructed several logistic regression
models



If the odds ratio is larger than 1, there is a positive
relationship between the independent and
dependent variables
If the odds ratio is less than 1, there is a negative
relationship
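
In logistic regression, the odds ratio of a predictor is the exponential of its fitted coefficient, so the rule above is equivalent to checking the sign of the coefficient. A short sketch, with hypothetical coefficient values that are not results from the paper:

```python
import math

def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in a logistic regression predictor."""
    return math.exp(coefficient)

def direction(coefficient):
    """Classify the relationship between predictor and outcome by odds ratio."""
    ratio = odds_ratio(coefficient)
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "negative"
    return "none"

# Hypothetical coefficients, for illustration only
examples = {name: direction(beta) for name, beta in
            [("predictor_up", 0.7), ("predictor_down", -0.7), ("predictor_flat", 0.0)]}
```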
Model I:
◦ Based on LOC and Average Lines Changed
◦ LOC is positively associated with failure proneness
◦ Average lines changed is also positively associated
with defects

Model II:
◦ Introduces syntactic dependency measures:
 Inflow Data
 Has a significant impact on failure proneness
 Inflow Functional
 This type of syntactic dependency has less impact on
failure proneness

Model III:
◦ A higher number of logical dependencies is related to
an increase in the likelihood of failure

Model IV:
◦ Workflow dependencies do increase the likelihood
of defects

Model V:
◦ Coordination requirements have a higher impact in
Project A and a lesser impact in Project B


All dependencies increase fault proneness
Logical dependencies have the highest impact,
followed by workflow dependencies and then
syntactic dependencies


Analysis is based on data collection from 2
projects
Logical dependencies have the highest impact
when compared to the other 2 dependencies
Weaknesses:



Data were collected from only 2 projects
The paper does not discuss dependencies other than
software and work dependencies
It does not provide a method for resolving the errors
caused by the dependencies



The paper needs to provide a method for resolving the
errors caused by the dependencies
Other dependencies should be discussed
General concepts should be introduced