reading & understanding code

• experts are better at code comprehension
because they focus on higher level patterns
– patterns can be considered “discourse rules”
– naming conventions, design patterns, schemas
• experts work significantly better when reading
& writing code according to these patterns
reading & understanding code
program comprehension
expertise effects
mental models
tools
outline
• mental models
– types
– models
• conventions & “discourse rules”
• expertise effects
• tool implications
• interesting tools
mental model
• an explanation of someone’s thought process
when carrying out a task
– our someone: programmers
– our task: program comprehension
• several models exist
mental model classes
• bottom-up
– read code statement by statement then ascend for
a higher-level picture
• top-down
– start with a high-level picture of what the code is
doing then descend into code
• mixed
– incorporate elements from both, based on the
situation
bottom-up mental models
• 1st: read code statements
• 2nd: chunking: group statements as abstractions
• 3rd: repeat
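A minimal sketch of the chunking steps, in Python. The function and its names are hypothetical illustrations, not material from the cited studies: a reader first takes in the individual statements, then groups them into labelled chunks, then groups those chunks again.

```python
def average_reading(readings):
    # Chunk A: these three statements read together as "sum the readings".
    total = 0.0
    for value in readings:
        total += value

    # Chunk B: "count the readings".
    count = len(readings)

    # Chunks A and B combine into one higher-level chunk: "compute the mean".
    return total / count if count else 0.0
```

Once the higher-level chunk is formed, the reader can treat the whole body as "mean of the readings" without holding each statement in memory.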
chunking
[figure: a sequence divided into chunks 1 to n, each chunk grouping elements 1 to k; modified from wikipedia]
chunking
• program model
– reasoning about the order of computation, how
control moves throughout a program
– “control flow”
• situation model
– reasoning about how data moves through atomic
models
– “data flow”
N. Pennington
Stimulus Structures and Mental Representations in
Expert Comprehension of Computer Programs
Cognitive Psychology, 1987
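A small sketch of the two views, in Python. The function, the statement labels S1 to S5 and both tables are hypothetical and written by hand for illustration; they are not taken from Pennington's materials. The same code is described once by the order in which statements can run (program model) and once by which values feed which (situation model).

```python
def net_price(price, quantity, discount_rate):
    subtotal = price * quantity                 # S1
    if subtotal > 100:                          # S2
        discount = subtotal * discount_rate     # S3
    else:
        discount = 0.0                          # S4
    return subtotal - discount                  # S5

# Program model ("control flow"): which statement can execute after which.
CONTROL_FLOW = {"S1": ["S2"], "S2": ["S3", "S4"], "S3": ["S5"], "S4": ["S5"], "S5": []}

# Situation model ("data flow"): which values each computed value is built from.
DATA_FLOW = {
    "subtotal": ["price", "quantity"],
    "discount": ["subtotal", "discount_rate"],
    "result": ["subtotal", "discount"],
}

def inputs_affecting(name):
    """Follow data-flow edges backwards until only parameters remain."""
    sources = DATA_FLOW.get(name)
    if sources is None:
        return {name}  # a parameter: nothing upstream of it
    found = set()
    for source in sources:
        found |= inputs_affecting(source)
    return found
```

A control-flow question would be "what runs after S2?"; a data-flow question would be "which inputs can change the result?", answered by `inputs_affecting("result")`.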
program & situation model studies
• participants first primed for either control flow or
data flow
– shown a piece of code, asked to recall another piece
of code which is related through either control flow or
data flow
• participants then asked a question that relates to
either control or data flow
• participants primed to think about control flow
answered other control-flow questions faster,
same with data flow
N. Pennington
Stimulus Structures and Mental Representations in
Expert Comprehension of Computer Programs
Cognitive Psychology, 1987
types of programmer knowledge
• semantic: general programming concepts
– low-level knowledge, e.g. what a=1 means
– high-level knowledge, e.g. sorting algorithms
• syntactic: language detail
– overlaps between languages
• stylistic: programming conventions
– “discourse rules”
B. Shneiderman and R. Mayer
Syntactic/Semantic Interactions in Programmer
Behavior: A Model and Experimental Results
Journal of Computer & Information Sciences, 1979
E. Soloway, K. Ehrlich
Empirical Studies of Programming Knowledge
IEEE Transactions on Software Engineering, 1984
[figure: Shneiderman & Mayer’s model, with labels: problem statement; short term memory; working memory (internal semantics of the program, from high level to low level concepts); long term memory (semantic knowledge: high level and low level concepts; syntactic knowledge: COBOL, FORTRAN, PL/I, LISP)]
B. Shneiderman and R. Mayer
Syntactic/Semantic Interactions in Programmer Behavior: A Model and Experimental Results
Journal of Computer & Information Sciences, 1979
evidence for
semantic & syntactic knowledge
• lab studies using FORTRAN
– participants: programmers and non-programmers
– asked to perform tasks that used one type of
knowledge
– six studies (will describe two)
B. Shneiderman and R. Mayer
Syntactic/Semantic Interactions in Programmer
Behavior: A Model and Experimental Results
Journal of Computer & Information Sciences, 1979
program memorization
• study
– two subject types: non-programmers & programmers
– two program versions: normal & shuffled
– participants asked to memorize a program
• results
– non-programmers performed equally poorly with normal &
shuffled programs
– programmers performed poorly with shuffled program, well
with normal
• were able to remember semantic details with syntactic variations
• conclusion
– programmers were not memorizing the program, but internal
semantics to represent its function
B. Shneiderman and R. Mayer
Syntactic/Semantic Interactions in Programmer
Behavior: A Model and Experimental Results
Journal of Computer & Information Sciences, 1979
commenting
• study
– two program versions
• 5-line high-level block comment at top
• numerous interspersed low-level comments
– participants asked to make modifications to program & memorize
program
• result
– high-level comment participants performed better
– strong correlation between ability to make modifications and ability
to memorize
• conclusion
– memorization is a strong correlate of comprehension
– hierarchical chunking, which organizes statements into a unit, facilitates
the comprehension process
B. Shneiderman and R. Mayer
Syntactic/Semantic Interactions in Programmer
Behavior: A Model and Experimental Results
Journal of Computer & Information Sciences, 1979
top-down models
• 1st: develop hypotheses about the program
• 2nd: evaluate and refine hypotheses
– with the help of beacons
• 3rd: repeat
• a process of “reconstructing knowledge”
beacons
• “indexes into existing knowledge”
• recognizable features in code that are cues to the
presence of certain structures
• e.g., looking for a listener pattern
M. Storey
Theories, Methods, and Tools in Program
Comprehension: Past, Present, and Future
IEEE Workshop on Program Comprehension, 2005
R. Brooks
Towards a theory of the comprehension of
computer programs
International J. on Man-Machine Studies, 1981
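A rough sketch of a beacon, in Python. It assumes a different beacon from the slide's listener example: an element swap inside a loop, which readers often take as a cue that the code is sorting. The helper `has_swap_beacon` and the sample are hypothetical; the helper uses the standard `ast` module and needs Python 3.9 or later for `ast.unparse`.

```python
import ast

def has_swap_beacon(source):
    """Return True if some loop in `source` contains a swap such as `a, b = b, a`."""
    for loop in ast.walk(ast.parse(source)):
        if not isinstance(loop, (ast.For, ast.While)):
            continue
        for node in ast.walk(loop):
            if not (isinstance(node, ast.Assign) and len(node.targets) == 1):
                continue
            target, value = node.targets[0], node.value
            if not (isinstance(target, ast.Tuple) and isinstance(value, ast.Tuple)):
                continue
            # Compare the two sides as source text, ignoring load/store context.
            left = [ast.unparse(element) for element in target.elts]
            right = [ast.unparse(element) for element in value.elts]
            if len(left) == 2 and left == right[::-1] and left[0] != left[1]:
                return True
    return False

SAMPLE = '''
def arrange(xs):
    for end in range(len(xs) - 1, 0, -1):
        for i in range(end):
            if xs[i] > xs[i + 1]:
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return xs
'''
```

The function name `arrange` says little, yet the swap alone lets a reader hypothesize "this sorts" and then verify it, which is the role the slide assigns to beacons.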
beacon types
• semantic knowledge “plans”
– reusable generic program fragments
– high-level or low-level
• programming discourse conventions
– “rules” that make program comprehension easier
– found across programmers
E. Soloway, K. Ehrlich
Empirical Studies of Programming Knowledge
IEEE Transactions on Software Engineering, 1984
brooks’ model
[figure: the external representation (problem, requirement, design document, documentation, program code) is matched through beacons against the internal representation (hypotheses and subgoals), drawing on syntactic and semantic knowledge; the internal schema is verified against the external representation]
R. Brooks
Towards a theory of the comprehension of
computer programs
International J. on Man-Machine Studies, 1981
modified from Jonathan I. Maletic’s slides:
An Overview of Mental Models for Program
Understanding
opportunistic & systematic strategies
• programmers enhancing existing program
• two strategies:
– systematically read code in detail, tracing through
control and data flow manually
• developed control and data flow knowledge
– focus only on code relevant to a task
• developed only control flow knowledge, resulted in a
weaker understanding
Margaret-Anne Storey
Theories, Methods, and Tools in Program
Comprehension: Past, Present, and Future
Int. Workshop on Program Comprehension, 2005
integrated model
• maintainers switch between top-down and
bottom-up comprehension
– top-down if code or code type is familiar
– program model (control-flow) when code is
completely unfamiliar
– situation model (data-flow) after a partial data-flow understanding is developed through top-down or program model methods
– knowledge base: information from previous three
models
A. von Mayrhauser and A.M. Vans
From Program Comprehension to Tool Requirements
for an Industrial Environment
IEEE Workshop on Program Comprehension, 1993
Margaret-Anne Storey
Theories, Methods, and Tools in Program
Comprehension: Past, Present, and Future
Int. Workshop on Program Comprehension, 2005
validating the integrated model
• taped professional maintenance programmers
– worked with a large code base
– classified as domain and language experts
• tape transcriptions classified into model types
• one of few studies with real world tasks
programming discourse rules
• specify the conventions of programming
– e.g., a variable’s name should reflect its function
– e.g., don’t include code that won’t be used
• similar to writing discourse rules, as outlined
in books like Elements of Style
– e.g., you expect to find the description for fig. 7
between those for fig. 6 and fig. 8
E. Soloway, K. Ehrlich
Empirical Studies of Programming Knowledge
IEEE Transactions on Software Engineering, 1984
rules of programming discourse
1. variable names should reflect function
2. don’t include code that won’t be used
a. if there is a test for a condition, then the condition must
have the potential of being true
3. a variable that is initialized via an assignment
statement should be updated via an assignment
statement
4. don’t do double duty with code in a non-obvious way
5. an if should be used when a statement body is
guaranteed to be executed only once, and a while
used when a statement body may need to be
repeatedly executed
E. Soloway, K. Ehrlich
Empirical Studies of Programming Knowledge
IEEE Transactions on Software Engineering, 1984
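A short sketch of rule 5, in Python. Both functions are hypothetical examples, not programs from Soloway and Ehrlich's study. They compute the same result, but the first uses a loop whose body can run at most once, so a reader expecting repetition is misled.

```python
def clamp_unplanlike(value, limit):
    # Breaks rule 5: after one pass value == limit, so the body never repeats.
    while value > limit:
        value = limit
    return value

def clamp_planlike(value, limit):
    # Follows rule 5: the body executes at most once, so `if` states the intent.
    if value > limit:
        value = limit
    return value
```

The behavior is identical; only the signal sent to the reader differs, which is why the rules are about discourse rather than correctness.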
testing discourse rules
• lab study with expert & novice programmers
• two program types
– α (plan-like): obeyed discourse rules
– β (un-plan-like): disobeyed discourse rules
• participants given either α or β code, with one
blank
• task: fill the blank with what seems “natural”
– participants were not told about α or β code
• conclusion: experts fared best with α code
why have un-plan-like (β) code?
• machine limitations
– limited memory, processing, bandwidth, etc.
• language limitations
– less common: bugs, efficiency issues, etc.
• programmer limitations
– does not have full mastery of discourse
• historical traces
– resistance to changing legacy code, permanent
“temporary” code
source:
The Psychology of
Computer Programming
XXX: PROCEDURE OPTIONS(MAIN);
  DECLARE B(1000) FIXED(7,2),
          C FIXED(11,2),
          (I, J) FIXED BINARY;
  C = 0;
  DO I = 1 TO 10;
    GET LIST((B(J) DO J = 1 TO 1000));
    DO J = 1 TO 1000;
      C = C + B(J);
    END;
  END;
  PUT LIST('RESULT IS ', C);
END XXX;
modified from The Psychology of
Computer Programming
XXX: PROCEDURE OPTIONS(MAIN);
  DECLARE A(10000) FIXED(7,2),
          C FIXED(11,2),
          I FIXED BINARY;
  C = 0;
  GET LIST((A(I) DO I = 1 TO 10000));
  DO I = 1 TO 10000;
    C = C + A(I);
  END;
  PUT LIST('RESULT IS ', C);
END XXX;
modified from
The Psychology of
Computer Programming
naming conventions
• meaningful names
– variable naming reflects cognitive structure
• grammatical sensibility
– interact with language spec. to form expressions
• containers & paths
– objects & pointers
• polysemy, homonymy, & overloading
– operators, name sharing
B. Liblit, A. Begel, and E. Sweetser
Cognitive Perspectives on the Role of Naming in Computer Programs
Psychology of Programming Interest Group, 2006
meaningful names
• metaphors for domain tasks
– e.g. pushing objects onto a stack
• keywords for grouping
– e.g. common prefixes & suffixes
• informative names
– balanced with name length
A. Blackwell
Metaphor or analogy: how should we see
programming abstractions?
Psychology of Programming Interest Group, 1996
B. Liblit, A. Begel, and E. Sweetser
Cognitive Perspectives on the Role of Naming in
Computer Programs
Psychology of Programming Interest Group, 2006
name length
• length harms readability and recall ability
• idioms and memory ties improve readability
and recall ability
• takeaway: variable names with consistent and
abbreviated vocabulary are optimal
– (variable names that concisely express a metaphor)
D. Binkley, D. Lawrie, S. Maex, and C. Morrell
Identifier length and limited programmer memory
Science of Computer Programming, 2009
grammatical sensibility
• names as phrase fragments
– methods as actions (change state of program)
• e.g. addElement, setSize, removeAll
– methods as mathematical functions (compute result,
don’t alter state)
• e.g. true/false: contains, equals, isEmpty
• e.g. data: capacity, indexOf, size
• valence cues (phrase fragments w/ open slot)
– e.g. roster.contains(player)
– smalltalk makes use of this extensively:
• roster insert: player at: position
B. Liblit, A. Begel, and E. Sweetser
Cognitive Perspectives on the Role of Naming in Computer Programs
Psychology of Programming Interest Group, 2006
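A minimal sketch of these naming patterns, in Python. The `Roster` class and its method names are hypothetical, adapted to Python's snake_case from the slide's examples: verb phrases name methods that change state, while predicates and noun phrases name methods that only report.

```python
class Roster:
    def __init__(self):
        self._players = []

    # Actions: verb phrases, each one changes the roster's state.
    def add_player(self, player):
        self._players.append(player)

    def remove_all(self):
        self._players.clear()

    # Functions: true/false predicates, no state change.
    def contains(self, player):
        return player in self._players

    def is_empty(self):
        return not self._players

    # Functions: noun phrases that return data, no state change.
    def size(self):
        return len(self._players)

    def index_of(self, player):
        return self._players.index(player)
```

A call such as `roster.contains("ada")` reads as subject, verb, object: the open slot in "contains ..." is filled by the argument, which is the valence cue the slide describes.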
20:1 programmer performance
• Sackman et al.: best programmers are 20x
better than worst programmers @ bug fixing
– study originally meant to evaluate the
effectiveness of time-shared systems
H. Sackman, W. J. Erikson, and E. E. Grant
Exploratory experimental studies comparing
online and offline programming performance
Communications of the ACM, 1968
10:1 programmer performance
• there are substantial programmer efficiency
differences, but not as dramatic as initially
reported
• what makes experts so much better at
understanding code?
testing discourse rules
• lab study with expert & novice programmers
• two program types
– α (plan-like): obeyed discourse rules
– β (un-plan-like): disobeyed discourse rules
• participants given either α or β code, with one
blank
• task: fill the blank with what seems “natural”
– participants were not told about α or β code
α problem
PROGRAM Magenta(input, output);
VAR Max, I, Num: INTEGER;
BEGIN
  Max := 0;
  FOR I := 1 TO 10 DO
    BEGIN
      READLN(Num);
      IF Num ? Max THEN Max := Num
    END;
  WRITELN(Max)
END.
E. Soloway, K. Ehrlich
Empirical Studies of Programming Knowledge
IEEE Transactions on Software Engineering, 1984
α solution
PROGRAM Magenta(input, output);
VAR Max, I, Num: INTEGER;
BEGIN
  Max := 0;
  FOR I := 1 TO 10 DO
    BEGIN
      READLN(Num);
      IF Num > Max THEN Max := Num
    END;
  WRITELN(Max)
END.
E. Soloway, K. Ehrlich
Empirical Studies of Programming Knowledge
IEEE Transactions on Software Engineering, 1984
β problem
PROGRAM Magenta(input, output);
VAR Max, I, Num: INTEGER;
BEGIN
  Max := 999999;
  FOR I := 1 TO 10 DO
    BEGIN
      READLN(Num);
      IF Num ? Max THEN Max := Num
    END;
  WRITELN(Max)
END.
E. Soloway, K. Ehrlich
Empirical Studies of Programming Knowledge
IEEE Transactions on Software Engineering, 1984
β solution
PROGRAM Magenta(input, output);
VAR Max, I, Num: INTEGER;
BEGIN
  Max := 999999;
  FOR I := 1 TO 10 DO
    BEGIN
      READLN(Num);
      IF Num < Max THEN Max := Num
    END;
  WRITELN(Max)
END.
E. Soloway, K. Ehrlich
Empirical Studies of Programming Knowledge
IEEE Transactions on Software Engineering, 1984
percentage of correct responses
[figure: bar chart of the percentage of correct responses (axis from 0% to 100%) for novice and advanced programmers on alpha and beta programs]
E. Soloway, K. Ehrlich
Empirical Studies of Programming Knowledge
IEEE Transactions on Software Engineering, 1984
debugging differences between
novices and experts
• experts: situation-dependent
problem solvers
• novices: situation-independent
problem solvers
I. Vessey
Expertise in Debugging Computer Programs: An
analysis of the Content of Verbal Protocols
IEEE Trans on Systems, Man, Cybernetics, 1986
tool implications
• browsing support
– browse from high to low level and low to high level
• searching
– looking for snippets by analogy
• multiple views
– show orthogonal object relationships
• context-driven views
– determine best view based on context
• additional cognitive support
– external devices to support cognitive tasks needed
Margaret-Anne Storey
Theories, Methods, and Tools in Program
Comprehension: Past, Present, and Future
Int. Workshop on Program Comprehension, 2005
browsing support
• traverse control and data flow paths
• switching between top-down and bottom-up
models
• breadth-first and depth-first
searching
• search for code snippets
– not just by text
• example: query the role of a variable, when a
function is called
• useful for top-down hypothesis testing
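A small sketch of searching by role rather than by text, in Python. It uses the standard `ast` module; the helper `variable_roles` and the sample are hypothetical, and "role" is reduced here to just "written" versus "read". A plain text search for `total` would return every occurrence, while this query separates the lines that assign the variable from the lines that use it.

```python
import ast

def variable_roles(source, name):
    """Return the line numbers where `name` is written and where it is read."""
    roles = {"written": [], "read": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id == name:
            if isinstance(node.ctx, ast.Store):
                roles["written"].append(node.lineno)
            elif isinstance(node.ctx, ast.Load):
                roles["read"].append(node.lineno)
    return {role: sorted(lines) for role, lines in roles.items()}

SAMPLE = """total = 0
for value in [1, 2, 3]:
    total = total + value
print(total)
"""
```

A reader testing the hypothesis "total is an accumulator" can check it directly: the variable is written on lines 1 and 3 and read on lines 3 and 4.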
multiple views
• multiple ways of viewing programs
– call graph
– object hierarchy
– etc.
• different views are applicable for different
tasks
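A minimal sketch of two views derived from the same source, in Python. It uses the standard `ast` module; `call_graph`, `class_hierarchy` and the sample program are hypothetical, and both helpers only look at top-level definitions and simple names.

```python
import ast

def call_graph(tree):
    """View 1: for each top-level function, the plain names it calls."""
    graph = {}
    for func in tree.body:
        if isinstance(func, ast.FunctionDef):
            callees = {
                node.func.id
                for node in ast.walk(func)
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            }
            graph[func.name] = sorted(callees)
    return graph

def class_hierarchy(tree):
    """View 2: for each top-level class, the named classes it inherits from."""
    return {
        cls.name: [base.id for base in cls.bases if isinstance(base, ast.Name)]
        for cls in tree.body
        if isinstance(cls, ast.ClassDef)
    }

SAMPLE = """
class Shape: pass
class Circle(Shape): pass

def area(shape):
    return measure(shape)

def measure(shape):
    return 0

def report(shape):
    print(area(shape))
"""
TREE = ast.parse(SAMPLE)
```

The call graph suits a task like "what is affected if `measure` changes?", while the hierarchy suits "what kinds of shape exist?"; neither view answers the other's question.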
context-driven views
• alter views based on program metrics
– size of program
– interdependence of modules
– flatness of hierarchy
– etc.
additional cognitive support
• experts:
– tools to support cognitive tasks
• external devices
• scratchpads
• novices
– pedagogical support
• programming language
• task domain
structured editors
• reduce burden of memorizing syntax
– focus on semantics
[figure: excerpt from the Citrus paper, showing Figure 5 (a partial specification of the Java programming language’s class declaration construct) and Figure 6 (a prototype of a Java editor)]
A. Ko and B. Myers
Citrus: A Language and Toolkit for Simplifying the
Creation of Structured Editors for Code and Data
UIST, 2005
literate programming
• source code interwoven with exposition of
logic, like an essay
• allows programmers to work top-down or
bottom-up
D. Knuth
Literate Programming
The Computer Journal, 1984
The purpose of wc is to count lines, words, and/or
characters in a list of files. The
number of lines in a file is ......../more explanations/
Here, then, is an overview of the file wc.c that is
defined by the noweb program wc.nw:
<<*>>=
<<Header files to include>>
<<Definitions>>
<<Global variables>>
<<Functions>>
<<The main program>>
@
We must include the standard I/O definitions, since we
want to send formatted output
to stdout and stderr.
<<Header files to include>>=
#include <stdio.h>
@
D. Knuth
Literate Programming
The Computer Journal, 1984
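A simplified sketch of how the named chunks above could be assembled into a source file ("tangling"), in Python. This is an assumption-laden illustration, not the noweb tool itself: `parse_chunks` and `tangle` are hypothetical helpers, a reference must sit alone on its line, undefined chunks expand to nothing, and there is no check for circular references.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+)>>=\s*$")      # e.g. <<Header files to include>>=
CHUNK_REF = re.compile(r"^(\s*)<<(.+)>>\s*$")  # e.g. <<Header files to include>>

def parse_chunks(text):
    """Collect the code lines of each named chunk; prose between chunks is skipped."""
    chunks, current = {}, None
    for line in text.splitlines():
        definition = CHUNK_DEF.match(line)
        if definition:
            # Repeated definitions of one name are appended to the same chunk.
            current = chunks.setdefault(definition.group(1), [])
        elif line == "@" or line.startswith("@ "):
            current = None  # '@' ends the chunk and returns to prose
        elif current is not None:
            current.append(line)
    return chunks

def tangle(chunks, name="*"):
    """Expand chunk `name`, replacing each reference line by that chunk's code."""
    output = []
    for line in chunks.get(name, []):
        reference = CHUNK_REF.match(line)
        if reference:
            indent, target = reference.group(1), reference.group(2)
            output.extend(indent + sub for sub in tangle(chunks, target))
        else:
            output.append(line)
    return output

SAMPLE = """Prose about the program.
<<*>>=
<<Header files to include>>
<<The main program>>
@
We need standard I/O.
<<Header files to include>>=
#include <stdio.h>
@
<<The main program>>=
int main(void) { return 0; }
@
"""
```

The root chunk `*` gives the top-down outline, while each named chunk can be written and read bottom-up next to its explanation, which is the property the slide highlights.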
conclusion
• beginners start off with an incomplete mental
model for how code works
• experts are better at code comprehension
because they focus on higher level patterns
– patterns can be considered “discourse rules”
– naming conventions, design patterns, schemas
• experts work significantly better when reading
& writing code according to these patterns
discussion
• what other discourse rules can you think of?
• do these mental models resonate with your
style of understanding code?
• what are some other tool implications of
these models?
references - 1
H. Sackman, W. J. Erikson, and E. E. Grant
Exploratory experimental studies comparing online
and offline programming performance
Communications of the ACM, 1968
B. Shneiderman and R. Mayer
Syntactic/Semantic Interactions in Programmer
Behavior: A Model and Experimental Results
Journal of Computer & Information Sciences, 1979
B. Liblit, A. Begel, and E. Sweetser
Cognitive Perspectives on the Role of Naming in
Computer Programs
Psychology of Programming Interest Group, 2006
R. Brooks
Towards a theory of the comprehension of
computer programs
International J. on Man-Machine Studies, 1981
A. Blackwell
Metaphor or analogy: how should we see
programming abstractions?
Psychology of Programming Interest Group, 1996
N. Pennington
Stimulus Structures and Mental Representations in
Expert Comprehension of Computer Programs
Cognitive Psychology, 1987
E. Soloway, K. Ehrlich
Empirical Studies of Programming Knowledge
IEEE Transactions on Software Engineering, 1984
D. Knuth
Literate Programming
The Computer Journal, 1984
references - 2
A. von Mayrhauser and A.M. Vans
From Program Comprehension to Tool
Requirements for an Industrial Environment
IEEE Workshop on Program Comprehension, 1993
I. Vessey
Expertise in Debugging Computer Programs: An
analysis of the Content of Verbal Protocols
IEEE Trans on Systems, Man, Cybernetics, 1986
Margaret-Anne Storey
Theories, Methods, and Tools in Program
Comprehension: Past, Present, and Future
Int. Workshop on Program Comprehension, 2005
A. Ko and B. Myers
Citrus: A Language and Toolkit for Simplifying the
Creation of Structured Editors for Code and Data
UIST, 2005
does visual programming help?
• significant result: 46%
• significant result, but contribution of AV uncertain: 8%
• significant result in wrong direction: 4%
• non-significant result: 42%
C. Hundhausen, S. Douglas, J. Stasko
A meta-study of algorithm
visualization effectiveness
Journal of Visual Languages & Computing, 2002
underlying questions
• how do programmers read and come to
understand unfamiliar code?
• what kinds of mental models do programmers
create to think about code?
• why are experts significantly better than
novices when looking at unfamiliar code?
– hint: experts aren’t as good as you might expect!
why does it matter?
• reading code is done when:
– searching for relevant code
– re-acquainting oneself with a project
– reading someone else’s code
– refactoring
–…
the gist of the talk
• beginners start off with an incomplete mental
model for how code works
• experts are better at code comprehension
because they focus on higher level patterns
– patterns can be considered “discourse rules”
– naming conventions, design patterns, schemas
• experts work significantly better when reading
& writing code according to these patterns
var Dict = function() {
  this.keys = [];
  this.values = [];
};

Dict.prototype.set = function(key, value) {
  var keyIndex = this.keys.indexOf(key);
  if (keyIndex < 0) {
    this.keys.push(key);
    this.values.push(value);
  } else {
    this.values[keyIndex] = value;
  }
};

Dict.prototype.get = function(key) {
  var keyIndex = this.keys.indexOf(key);
  if (keyIndex >= 0) return this.values[keyIndex];
  return undefined;
};
mental models
top-down models
• 1st: hypothesize about code
• 2nd: check hypotheses
• start on a high level, dig in
bottom-up models
• 1st: read code statements
• 2nd: mental chunking
• start on a low level, ascend
hybrid models
• incorporate elements from
both, based on the situation
shneiderman & mayer’s model
• semantic knowledge: general programming
concepts
– low-level knowledge, e.g. what assignments do
– high-level knowledge, e.g. algorithms
• syntactic knowledge: programming language
details
– sometimes overlaps across programming langs.
B. Shneiderman and R. Mayer
Syntactic/Semantic Interactions in Programmer
Behavior: A Model and Experimental Results
Journal of Computer & Information Sciences, 1979
brooks’ model
• “top-down”
– analyze code on a high level, then look at specifics
• argues that programmers form a series of
hypotheses
• beacons help verify or reject these hypotheses
R. Brooks
Towards a theory of the comprehension of
computer programs
International J. on Man-Machine Studies, 1981
containers & paths
B. Liblit, A. Begel, and E. Sweetser
Cognitive Perspectives on the Role of Naming in Computer Programs
Psychology of Programming Interest Group, 2006
polysemy, homonymy, & overloading
B. Liblit, A. Begel, and E. Sweetser
Cognitive Perspectives on the Role of Naming in Computer Programs
Psychology of Programming Interest Group, 2006