Slides - faculty.sutd.edu.sg

50.530: Software Engineering
Sun Jun
SUTD
Week 1: Introduction
ABOUT THIS COURSE
sunjun@sutd.edu.sg
level 3, room 9
Facebook: sunjunhqq
weChat: sunjunProf
Course Communication
• All class materials are on the course website
– Lecture slides
– Course project
• Q&A
– Email/WeChat/Facebook
• sunjun@sutd.edu.sg
• WeChat: sunjunprof
• Facebook: sunjunhqq
Course Structure
• Cohort class
– every Monday 10-12; every Tuesday 10-11
• Course Project: (50%)
• Problem Sets: (20%)
• Final Exam: Dec 15 (30%)
INTRODUCTION TO SOFTWARE
ENGINEERING
Software Engineering
User Requirements
the magical programming machine
System Implementation
***The synthesis problem (i.e., synthesizing a program from a specification
automatically) is undecidable.
Software Engineering
User Requirements
The species we called programmers
System Implementation
A Programmer’s Life
Staged Approach
Are we getting the right requirements?
User Requirements
Specification is equivalent to requirements?
System Specification
Design satisfies the specification?
System Design
The design is correctly implemented?
System Implementation
***The verification problem (i.e., verifying whether a program satisfies a certain
property) is undecidable too, but easier than the synthesis problem.
Requirements
• During the requirements workflow, the primary
activities include
– Listing candidate requirements
– Understanding the system context through domain
modelling and business modelling
– Capturing functional as well as non-functional
Requirements
• Requirements should be captured in the language of
the user.
– Use cases help distil the essence of requirements as sets of
action-response transactions between the user and the
system.
Analysis
• A key theme of the analysis workflow is to
understand how and where requirements
interact and what it means for the system.
• Analysis also involves
– Detecting and removing ambiguities and
inconsistencies amongst requirements
– Developing an internal view of the system
– Identifying the analysis classes and their
collaborations
• Analysis classes are preliminary placeholders of
functionality
Related Research
• Proposing formal specification languages
– The Z language, VDM, the B language, etc.
– CSP, CCS, etc.
• Providing facilities for programmers to write
specification
– Java modeling language
So far nothing has been working.
Design
• Deciding on the collaboration between
components lies at the heart of software
design.
– A component fulfils its own responsibility through
the code it contains.
– A component exchanges information by calling
methods on other components, or when other
components call its own methods.
Design
• The design workflow involves
– Considering specific technologies
– Decomposing the system into implementation
units,
– Engaging in high-level and low-level designs
Implementation
• A large part of implementation is
programming.
• Implementation also involves
– Unit testing
– Planning system integrations
– Devising the deployment model
Testing
• The primary activities of the test workflow
include
– Creating test cases,
– Running test procedures, and analysing test
results.
• Due to its very nature, testing is never
complete.
Real-World Bugs
http://en.wikipedia.org/wiki/List_of_software_bugs
Staged Approach
User Requirements: Ad Hoc
System Specification: Missing
System Design: Missing
System Implementation: BUGGY
Research Questions
• How do we help users write the
specification?
• How do we help users to formally document
system designs?
People tried and people failed
Research Questions
• How do we help programmers debug?
• How do we verify a given program?
The course is about debugging and verification, and
many smaller questions that are related.
Exercise 1
• Debug the program here.
A Big View
[Figure: the space of all program behaviors, containing a region A, the behaviors we wanted]
The synthesis problem: How do we find a program to cover (part of) A?
A Big View
[Figure: the space of all program behaviors, with regions labelled A, B and C and the captions "the behaviors we wanted" (next to A) and "the behaviors we have"]
The verification problem: Is C empty?
A Big View
[Figure: the space of all program behaviors, with regions labelled A, B and C and the captions "the behaviors we wanted" (next to A) and "the behaviors we have"]
The debugging problem: how to find where the problem is and change the program so that C is empty?
COURSE PLANNING
Course Outline
Date            Topic
Sep 14/15       Introduction
Sep 21/22       Automatic Testing
Sep 28/29       Delta Debugging
Oct 5/6         Bug Localization
Oct 12/13       Specification Mining
Oct 19/20       Race Detection
Nov 2/3         Research Idea Presentation
Nov 9           Hoare Logic and Termination Checking
Nov 16/17       Invariant Generation
Nov 23/24       Symbolic Execution
Nov 30/Dec 1    Software Model Checking
Dec 8/9         Assume Guarantee Reasoning
Dec 19          Final Exam
Remarks
Debugging
Verification
Class Format
• I will introduce one or two approaches
proposed (for the topic that week) in the
literature.
– In class exercises will be there
• We will discuss:
– When the approaches work
– When they do not work
– How to make them better
Project
• Pick one of the topics covered in the following
10 classes;
• Conduct a survey on related work on that
topic;
• Propose an improved approach;
• Write a research paper;
Research Paper
• Title/Abstract
– catchy, to the point, not too abstract or detailed
• Section 1: Introduction
– Start with motivation
– Explain your approach at a high level intuitively
• Section 2: A Running Example
– Use an interesting example to illustrate your approach step-by-step
• Section 3: Detailed Approach
– Explain how each step of the approach is done; highlight the technical
challenges and remedies
• Section 4: Evaluation
– Show evidence on how the proposed approach would work on real-world
programs
– (Optional) Implementation of your approach
• Section 5: Related Work
– Survey related work and make a fair comparison with the proposed one
Real-world Examples
• For debugging,
– http://sir.unl.edu/content/sir.php
• For verification,
– http://sv-comp.sosy-lab.org/2015/
• For some other topics,
– http://find-your-own.com
Project Due Dec 12
UNDERSTANDING PROGRAMMING
Programs
p(i) = o
program
input
output
Programs
Java Programs
Bytecode
JVM
Physical Machine
Motivational Example
The NSA has intercepted an RSA-encrypted
secret message that gives the location of a
terrorist act. We believe that the act is going to
happen one week from now, and we need your help
in decrypting the message.
Task: Write a Java program to factor a number as the product of two
prime numbers.
Task Breakdown
• Requirements/Specification
– given a semi-prime (pre-condition), your program outputs its
prime factors (post-condition) within a certain time (non-functional requirement)
Correctness: pre-condition => post-condition
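The implication above can be written as an executable check. This is a minimal sketch, not code from the course: the class name SpecCheck and its methods are hypothetical, and primality is tested with BigInteger.isProbablePrime, which is probabilistic for large values.

```java
import java.math.BigInteger;

// Hypothetical helper: states the factoring task as pre- and post-conditions.
class SpecCheck {
    // Certainty for isProbablePrime: a composite passes with probability at most 2^-50.
    static final int CERTAINTY = 50;

    // Post-condition: p and q are both (probably) prime and their product is n.
    static boolean postCondition(BigInteger n, BigInteger p, BigInteger q) {
        return p.isProbablePrime(CERTAINTY)
                && q.isProbablePrime(CERTAINTY)
                && p.multiply(q).equals(n);
    }

    // Correctness as an implication: pre-condition => post-condition.
    // The caller states whether the pre-condition (n is a semiprime) holds.
    static boolean correct(boolean preCondition, BigInteger n, BigInteger p, BigInteger q) {
        return !preCondition || postCondition(n, p, q);
    }

    public static void main(String[] args) {
        BigInteger n = new BigInteger("4294967297");
        System.out.println(correct(true, n, BigInteger.valueOf(641), BigInteger.valueOf(6700417)));
    }
}
```

The non-functional requirement (the time bound) is not captured here; it would need a separate measurement.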
Task Breakdown
• Design
– Use the trial division method
– Read: http://en.wikipedia.org/wiki/Trial_division
– More:
http://en.wikipedia.org/wiki/Integer_factorization
• Implementation
– “Enough talk, let’s fight” (Kung Fu Panda)
Exercise 2
Write a Java program that, given a semiprime, outputs its prime factors.
Hint: You need to use the BigInteger class.
FactorPrime.java
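One possible shape for the exercise, using trial division over BigInteger as the design step suggests. This is a sketch under assumptions, not the course's FactorPrime.java: the class name FactorPrimeSketch and the method smallestFactor are made up for illustration, and the input is assumed to satisfy the pre-condition (it is a semiprime).

```java
import java.math.BigInteger;

// Sketch of trial division for a semiprime; names are hypothetical.
class FactorPrimeSketch {
    // Returns the smallest prime factor p of n, assuming n = p * q is a semiprime.
    static BigInteger smallestFactor(BigInteger n) {
        BigInteger two = BigInteger.valueOf(2);
        if (n.mod(two).signum() == 0) {
            return two;
        }
        // Only odd candidates d with d * d <= n need to be tried.
        for (BigInteger d = BigInteger.valueOf(3);
             d.multiply(d).compareTo(n) <= 0;
             d = d.add(two)) {
            if (n.mod(d).signum() == 0) {
                return d;
            }
        }
        // No divisor up to sqrt(n): the pre-condition was violated.
        throw new IllegalArgumentException(n + " is not a semiprime");
    }

    public static void main(String[] args) {
        BigInteger n = new BigInteger("4294967297");
        BigInteger p = smallestFactor(n);
        BigInteger q = n.divide(p);
        System.out.println(n + " = " + p + " * " + q);
    }
}
```

The loop performs on the order of sqrt(n) divisions in the worst case, so it handles small inputs quickly but is not a practical attack on an input with hundreds of digits.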
Task Breakdown
• Testing
– 4294967297 (famous Fermat Number)
– 1127451830576035879
–
160731047637009729259688920385507056726966793490579598495689711866432421212774967029895340327
197901756096014299132623454583177072050452755510701340673282385647899694083881316194642417451
570483466327782135730575564856185546487053034404560063433614723836456790266457438831626375556
854133866958349817172727462462516466898479574402841071703909138062456567624565784254101568378
407242273207660892036869708190688033351601539401621576507964841597205952722487750670904522932
328731530640706457382162644738538813247139315456213401586618820517823576427094125197001270350
087878270889717445401145792231674098948416888868250143592026973853973785120217077951766546939
577520897245392186547279572494177680291506578508962707934879124914880885500726439625033021936
728949277390185399024276547035995915648938170415663757378637207011391538009596833354107737156
273037494727858302028663366296943925008647348769272035532265048049709827275179381252898675965
528510619258376779171030556482884535728812916216625430187039533668677528079544176897647303445
153643525354817413650848544778690688201005274443717680593899
• Verification: how to show it always works?
Understanding Sequential Programs
“A program consisted of a sequence
of instructions (and a memory),
where each instruction executed
one after the other (to modify the
memory, etc.). It ran from start to
finish on a single processor.”
“The sequential paradigm has the
following two characteristics: the
textual order of statements specifies
their order of execution; successive
statements must be executed
without any overlap (in time) with
one another.”
int previousMax;
public int max (int[] list) {
    int max = list[0];
    for (int i = 1; i < list.length; i++) {
        if (max < list[i]) {
            max = list[i];
        }
    }
    previousMax = max;
    return max;
}
The Illusion
int previousMax;
0.  public int max (int[] list) {
1.      int max = list[0];
2.      for (int i = 1; 3. i < list.length; 4. i++) {
5.          if (max < list[i]) {
6.              max = list[i];
7.          }
8.      }
9.      previousMax = max;
10.     return max;
11. }
[Figure: control flow graph over nodes 0 to 11, with edges labelled list = …, max = list[0], i=1, i < list.length, i >= list.length, max < list[i], max >= list[i], max = list[i], i++, previous=max, return max]
Control Flow Graph
[Figure: the control flow graph for max (nodes 0 to 11), shown beside a memory panel holding previousMax and the input]
System Execution
Input [2,4]. Successive slides show the control flow graph for max beside a memory panel. Throughout, previousMax reads 0… and input reads [2,4]; the remaining entries change as follows ("-" means not yet present):
Slides    list     max    i
1         -        -      -
2         [2,4]    -      -
3         [2,4]    2      -
4-6       [2,4]    2      1
7-9       [2,4]    4      1
10-11     [2,4]    4      2
The Trace
• With input = [2,4]
0 -> 1 -> 2 -> 3 -> 5 -> 6 -> 7 -> 8 -> 4 -> 3 -> 9 -> 10 -> 11
Each node i is a configuration of the program with control at line i.
The Trace
• With input = [4,2]
0 -> 1 -> 2 -> 3 -> 5 -> 7 -> 8 -> 4 -> 3 -> 9 -> 10 -> 11
Each node i is a configuration of the program with control at line i.
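The two traces can be reproduced by instrumenting max so that it records each control point it passes. This is an illustrative sketch: the class TraceSketch and the method trace are hypothetical, and the for loop is unrolled into a while loop so that points 2, 3 and 4 can be logged separately.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical instrumented version of max that records the control points it visits.
class TraceSketch {
    static int previousMax;

    static List<Integer> trace(int[] list) {
        List<Integer> t = new ArrayList<>();
        t.add(0);                       // 0: method entry
        t.add(1);
        int max = list[0];              // 1: max = list[0]
        t.add(2);
        int i = 1;                      // 2: loop initialisation
        while (true) {
            t.add(3);                   // 3: loop test
            if (!(i < list.length)) {
                break;
            }
            t.add(5);                   // 5: if test
            if (max < list[i]) {
                t.add(6);
                max = list[i];          // 6: update max
            }
            t.add(7);                   // 7: end of if
            t.add(8);                   // 8: end of loop body
            t.add(4);
            i++;                        // 4: loop increment
        }
        t.add(9);
        previousMax = max;              // 9: store result
        t.add(10);                      // 10: return statement
        t.add(11);                      // 11: method exit
        return t;
    }

    public static void main(String[] args) {
        System.out.println(trace(new int[] {2, 4})); // [0, 1, 2, 3, 5, 6, 7, 8, 4, 3, 9, 10, 11]
        System.out.println(trace(new int[] {4, 2})); // [0, 1, 2, 3, 5, 7, 8, 4, 3, 9, 10, 11]
    }
}
```

The only difference between the two runs is whether point 6 is visited, which is decided entirely by the input.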
Sequential Programming is Easy
• It is deterministic: for each input, there is exactly one
path through the control flow graph
[Figure: five inputs, input1 to input5, each following its own path 0, 1, 2, 3, … through the control flow graph]
Testing is to find the ‘right’ input
Concurrent Programs
p(i, sc) = o
program
input
output
scheduling
Concurrency: Benefit
• Better resource utilization
– With k processors, ideally we can be k times faster,
if the task can be broken into k independent pieces and if we ignore the cost of task decomposition and communication between the processors
[Figure: timeline of one processor running Read file A, Process A, Read file B, Process B one after the other]
We can factorize the semi-prime faster with multiple computers or cores
[Figure: timeline with two processors; Processor 2 runs Read file A then Read file B, Processor 1 runs Process A then Process B]
• Can we get better performance with 1 processor
only?
[Figure: timeline for a single processor with the steps Read file A, Read file B, Process A, Process B]
Concurrency: Cost
• More complex design, implementation, testing, and
verification
Will the exception occur?
public class Holder {
    private int n;
    public Holder(int n) { this.n = n; }
    public void assertSanity() {
        if (n != n)
            throw new AssertionError("This statement is false.");
    }
}
• Overhead in task decomposition, communication,
context switch
• Increased resource consumption
Distributed Systems
[Figure: several nodes, each with its own CPU and Memory, exchanging messages over a Network]
• Each process has its own memory and
processes communicate through messaging.
Multi-core Processors
[Figure: several CPUs, each with its own Cache, all connected to one Memory]
• Each thread has its own cache and threads
communicate through a shared memory.
Multi-core Computer: More Like This
Multi-Threaded Program
• Write a program such that N threads
concurrently increment a static variable
(initially 0) by 1. Set N to be 2 and see what is
the value of the variable after all threads are
done.
FirstBlood.java
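A possible starting point for this exercise is sketched below. It is not the course's FirstBlood.java: the class name FirstBloodSketch, the method run, and the extra AtomicInteger counter are assumptions added for comparison. Each thread also increments many times rather than once, so that lost updates have a realistic chance to show up.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only; names and the repeated increments are assumptions.
class FirstBloodSketch {
    static int count = 0;                                          // unsynchronised shared counter
    static final AtomicInteger safeCount = new AtomicInteger(0);   // atomic counter for comparison

    // Starts n threads; each increments both counters perThread times.
    static void run(int n, int perThread) {
        count = 0;
        safeCount.set(0);
        Thread[] threads = new Thread[n];
        for (int t = 0; t < n; t++) {
            threads[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    count++;                      // read, increment, write: not atomic
                    safeCount.incrementAndGet();  // atomic read-modify-write
                }
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            try {
                th.join();                        // wait until all threads are done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        run(2, 100_000);
        System.out.println("unsynchronised: " + count);
        System.out.println("atomic:         " + safeCount.get());
    }
}
```

The atomic counter always ends at n * perThread. The unsynchronised one may end lower, and the value can differ from run to run.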
Scheduling
[Figure: threads Thread1, Thread2, Thread3 and Thread4 feeding into a Scheduler]
The scheduler is ‘unpredictable’
Scheduling/Interleaving
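[Figure: thread1 and thread2 each step through local states 0, 1, 2, 3; their interleavings form a grid of global states 00, 01, 02, 03, 10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33]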
There are exponentially many sequences.
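To give the claim a number, the schedules of two threads can be counted as paths through the grid of global states. The sketch below is illustrative; the class InterleavingCount and the method count are hypothetical names.

```java
// Hypothetical helper: counts the interleavings of two threads.
class InterleavingCount {
    // Number of ways to interleave a thread with m atomic steps and a thread with n atomic steps.
    // paths[a][b] is the number of schedules that reach the global state
    // in which thread1 has taken a steps and thread2 has taken b steps.
    static long count(int m, int n) {
        long[][] paths = new long[m + 1][n + 1];
        for (int a = 0; a <= m; a++) {
            for (int b = 0; b <= n; b++) {
                if (a == 0 || b == 0) {
                    paths[a][b] = 1;  // only one thread has moved: a single schedule
                } else {
                    // the last step was taken either by thread1 or by thread2
                    paths[a][b] = paths[a - 1][b] + paths[a][b - 1];
                }
            }
        }
        return paths[m][n];
    }

    public static void main(String[] args) {
        System.out.println(count(3, 3));    // 20
        System.out.println(count(10, 10));  // 184756
    }
}
```

With three steps per thread there are already 20 schedules from state 00 to state 33, and the count grows as the binomial coefficient C(m + n, m).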
Is This Real?
[Figure: Thread1 and Thread2 each execute count++ as a single step from local state 0 to 1; global states: 00 with count = 0, 01 with count = 1, 10 with count = 1, 11 with count = 2]
This is assuming that count++ is one step. Or is it?
Reality is Messy
Java Programs
Bytecode
JVM
Physical Machine
What are the atomic steps?
What is the order of execution?
What and where are the variable values?
What Really Happened?
Thread1 and Thread2 each take the same three steps:
0 -> 1: read value of Count and assign it to a register
1 -> 2: increment the register
2 -> 3: write the register value back to Count
For double type, even read/write is not atomic!
What Really Happened?
[Figure: Thread1 steps r1, i1, w1 and Thread2 steps r2, i2, w2 (read, increment, write), each moving its thread through local states 0 to 3; the interleavings form a grid of global states 00 to 33]
What Really Happened?
[Figure: the grid of global states 00 to 33 with edges r1, i1, w1, r2, i2, w2, annotated count=1]
Is this correct?
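The grid can be explored mechanically. The sketch below enumerates every interleaving of the read, increment and write steps of two threads and tallies the final value of count. It is a simplified model that assumes each of the three steps is atomic and each thread has one private register; the class LostUpdateSketch and its methods are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model checker for two threads running count++ as read / increment / write.
class LostUpdateSketch {
    // Returns a map: final value of count -> number of interleavings producing it.
    static Map<Integer, Integer> explore() {
        Map<Integer, Integer> outcomes = new TreeMap<>();
        step(0, new int[] {0, 0}, new int[] {0, 0}, outcomes);
        return outcomes;
    }

    // count: shared variable; pc[t]: local state (0..3) of thread t; reg[t]: its register.
    private static void step(int count, int[] pc, int[] reg, Map<Integer, Integer> outcomes) {
        if (pc[0] == 3 && pc[1] == 3) {           // global state 33: both threads are done
            outcomes.merge(count, 1, Integer::sum);
            return;
        }
        for (int t = 0; t < 2; t++) {             // the scheduler may pick either thread
            if (pc[t] == 3) {
                continue;                         // this thread has finished
            }
            int[] nextPc = pc.clone();
            int[] nextReg = reg.clone();
            int nextCount = count;
            switch (pc[t]) {
                case 0:  nextReg[t] = count;      break;  // r: read count into the register
                case 1:  nextReg[t] = reg[t] + 1; break;  // i: increment the register
                default: nextCount = reg[t];      break;  // w: write the register back
            }
            nextPc[t]++;
            step(nextCount, nextPc, nextReg, outcomes);
        }
    }

    public static void main(String[] args) {
        System.out.println(explore());  // {1=18, 2=2}
    }
}
```

Under this model only the two schedules in which one thread finishes before the other starts end with count = 2; the other 18 lose an update.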
Concurrency is Hard
• Heisenbug
– is a computer programming jargon term for a
software bug that seems to disappear or alter its
behavior when one attempts to study it.
• How do we find bugs in a multi-threaded
program or show that there is no bug?
Course Outline
Date      Topic                         Remarks
Sep 15    Introduction
Sep 22    Automatic Testing
Sep 29    Delta Debugging
Oct 13    Bug Localization
Oct 20    Specification Mining
Nov 3     Race Detection
Nov 10    Hoare Logic and Proving
Nov 17    Invariant Generation
Nov 24    Symbolic Execution
Dec 1     Software Model Checking
Dec 8     Assume Guarantee Reasoning
Concurrency*
Dec 19
Final Exam
Project Due Dec 18
Concurrency*
Exercise 3
• Write a multi-threaded program to factor a
semi-prime. Argue that it is correct.
FactorThread.java
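One way to approach the exercise is to split the trial-division candidates among threads. The sketch below is not the course's FactorThread.java: the class FactorThreadSketch, the method findFactor and the strided partitioning are assumptions chosen for illustration.

```java
import java.math.BigInteger;
import java.util.concurrent.atomic.AtomicReference;

// Sketch only; names and the partitioning scheme are assumptions.
class FactorThreadSketch {
    // Finds a non-trivial factor d of n with d * d <= n, or returns null if there is none.
    // Worker t tries the candidates 2 + t, 2 + t + workers, 2 + t + 2 * workers, ...
    static BigInteger findFactor(BigInteger n, int workers) {
        AtomicReference<BigInteger> found = new AtomicReference<>(null);
        BigInteger stride = BigInteger.valueOf(workers);
        Thread[] threads = new Thread[workers];
        for (int t = 0; t < workers; t++) {
            BigInteger start = BigInteger.valueOf(2L + t);
            threads[t] = new Thread(() -> {
                // Stop when this worker runs out of candidates or any worker has succeeded.
                for (BigInteger d = start;
                     found.get() == null && d.multiply(d).compareTo(n) <= 0;
                     d = d.add(stride)) {
                    if (n.mod(d).signum() == 0) {
                        found.compareAndSet(null, d);  // keep the first factor reported
                    }
                }
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            try {
                th.join();                             // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return found.get();
    }

    public static void main(String[] args) {
        BigInteger n = new BigInteger("4294967297");
        BigInteger p = findFactor(n, 4);
        System.out.println(n + " = " + p + " * " + n.divide(p));
    }
}
```

A correctness argument for this sketch can rest on three observations: every candidate between 2 and sqrt(n) belongs to exactly one worker, the only shared mutable state is the AtomicReference, and the result is read only after all workers have been joined. For a semiprime, the only divisor in that range is its smaller prime factor.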
Reading Materials
• References:
– “Checking a Large Routine” by Turing
– “The Humble Programmer” by Dijkstra
– “No Silver Bullet: Essence and Accidents of
Software Engineering” by Brooks