Uploaded by Frank Westland

TEACHING STRATEGIES FOR PROGRAMMING LANGUAGES

Teaching Strategies for Programming Languages:
Explicit vs. Implicit Learning
Frank Westland
ANR: 278126
HAIT Master Thesis series nr. [THESIS SERIES NUMBER]
THESIS SUBMITTED IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF ARTS IN COMMUNICATION AND INFORMATION SCIENCES,
MASTER TRACK HUMAN ASPECTS OF INFORMATION TECHNOLOGY,
AT THE SCHOOL OF HUMANITIES
OF TILBURG UNIVERSITY
Thesis committee:
dr. A. Alishahi
dr. M.M. van Zaanen
Tilburg University
School of Humanities
Department of Communication and Information Sciences
Tilburg center for Cognition and Communication (TiCC)
Tilburg, The Netherlands
September, 2013
Table of Contents
Chapter 1: Introduction
1.1 Relevance
1.2 Decomposing a programming task
1.3 Useful ideas from second language learning models
1.4 Research question
1.5 Research approach
Chapter 2: Background information
2.1 Didactic and pedagogical developments
2.1.1 Pedagogical patterns
2.1.2 Constructivism
2.2 Computer science course
2.2.1 Programming skills
2.2.2 Paradigm parade
2.2.3 Models for teaching programming (Kaasbøll)
2.3 Didactic ideas for programming
2.3.1 Programming language as second language
2.3.2 A step by step approach to programming tasks
Chapter 3: Experimental design
3.1 Participants
3.2 Procedure
3.3 Material
3.4 Metrics and measurements
Chapter 4: Results
4.1 Distribution of completed tasks
4.2 Data analysis
4.3 Time spent on each task
Chapter 5: Discussion
5.1 Possible impact of time
5.2 The missing learning experience
5.3 Measuring results with the variables structure and quality
5.4 Distance between second-language learning and learn to program
5.5 Importance for Computer science
Chapter 6: Conclusion and future work
6.1 Findings and indications
6.2 Answers to the research question
6.3 Future work
Bibliography
Appendix 1: Three didactic approaches to programming (Kaasbøll, J. J.)
Appendix 2: An illustrative task example used in the experiment
Appendix 3: Screen-shots of the experiment
Chapter 1: Introduction
This research focuses on the steps taken by programmers to translate a given problem into
source code. Section 1.1 describes the need for more knowledge about programming education
and section 1.2 explains how a programming task can be divided into single translation steps.
Next, in section 1.3, we look into a few theories from second language acquisition (SLA)
that offer insight into these translation steps. Section 1.4 presents the research question
and section 1.5 describes the research approach.
1.1 Relevance
There has always been much discussion about the content of programming education. Often
the focus is on the choice of programming tools, languages and appropriate tutorials, but the
question of which didactic model fits best with the current approaches in programming education
is less frequently a subject of debate. Kaasbøll (1998), associated with the Department of
Informatics at Oslo University, states in his paper on didactic models for teaching programming
that models do exist in literature, but interviews with teachers reveal that teachers do not
normally relate to such models. Although his paper is more than ten years old, the findings are
still relevant (Kaasbøll, 1998, p 1).
A possible explanation for this contradiction between theory and practice can be found in the
historical development of programming languages. This development is influenced by the rapid
changes in hardware and the way computers are used. New programming paradigms follow each
other in rapid succession and teachers struggle to keep up. This means that there is not much
time for reflection on the didactic model in use.
However, some models are of great importance to the development of didactics in
programming. For example, the constructivist approach to learning is best visible in modules
on software engineering. This didactic approach gives students, who work in a group, the
opportunity to create (construct) a solution around an open challenge in which the students can
choose their own strategies. In addition, course developers for computer science (CS) make use
of general didactic principles known as pedagogical patterns. Examples of patterns are learning
from mistakes or different forms of collaboration. Such patterns will be described in more detail
in section 2.1.1.
The way students learn is changing, which increases the need for didactic and pedagogical
knowledge. Ginat (2003) points out that students are over-reliant on intuition rather than rigor. In
particular, he noticed a repeated erroneous trend of turning to intuitive, but inadequate greedy
algorithmic solutions. As Ginat states, we might capitalize on student errors by influencing their
attitude and beliefs regarding intuition and rigor. Perhaps the approach to learning, such as
gathering skills and knowledge, differs from the traditional ways teachers believe the learning
process takes place. This illustrates that perhaps the question is not how students should learn,
but how they do learn (Ginat, 2003, p 11).
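To illustrate the kind of intuitive but inadequate greedy solution Ginat refers to, the following minimal sketch compares a greedy strategy with a more rigorous one. The coin-change task, the coin values and the function names are assumptions chosen for illustration; they are not taken from Ginat (2003).

```python
def greedy_change(amount, coins):
    """Intuitive strategy: always take the largest coin that still fits."""
    used = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            used.append(coin)
    return used if amount == 0 else None


def optimal_change(amount, coins):
    """Rigorous strategy: dynamic programming over all amounts up to the target."""
    best = {0: []}  # best[a] = shortest list of coins summing to a
    for a in range(1, amount + 1):
        candidates = [best[a - c] + [c] for c in coins if a - c in best]
        if candidates:
            best[a] = min(candidates, key=len)
    return best.get(amount)


coins = [1, 3, 4]  # hypothetical coin system chosen to expose the flaw
print(greedy_change(6, coins))   # [4, 1, 1]
print(optimal_change(6, coins))  # [3, 3]
```

The greedy answer looks plausible and is found quickly, yet it uses three coins where two suffice, which is the sort of error that only a more rigorous analysis reveals.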
Kak (2008) states in his publication about teaching programming that, during the last decade,
the world of computing and programming has become so diverse that it is becoming more and
more difficult to define what constitutes core programming skills. Many students, not only those
studying for a science or engineering degree but also those pursuing other types of degrees,
learn to program computers in one form or another. It is rather common to encounter graduating
engineers and scientists whose programming competence is limited to scripting in Matlab and
other such languages (Kak, 2008, p 2).
Because of this diversity as mentioned by Kak (2008), it seems wise to focus our research
mainly on what these different ways of learning to program have in common. One of the
most striking similarities is the use of basic imperative programming instructions. These
instructions return in various forms in higher programming languages as summarized in section
2.2.2 about programming paradigms.
1.2 Decomposing a programming task
In short, programming is writing source code to solve a problem in such a way that a
computer can execute the coding instructions. The process of programming contains several
steps, which are defined as translation steps in this research. An example of such a process is the
following: Initially, the problem is described in natural language. Based on this information, the
problem is translated into a more formalized pseudo code or graphical representation. Finally, a
translation step into source code takes place. See section 2.3.2 for more information.
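As a minimal sketch of these translation steps, consider the example below. The task itself is hypothetical and is not one of the tasks used in the experiment; the natural-language description and the pseudo code are given as comments, and the last step is written in Python, which is an assumption made for illustration only.

```python
# Step 1, natural language (hypothetical task, not from the experiment):
#   "Compute the average of the positive numbers in a list."
#
# Step 2, pseudo code:
#   total <- 0, count <- 0
#   for each number in the list:
#       if the number is greater than 0: add it to total, increase count
#   if count is 0: report that there is no average
#   otherwise: the average is total divided by count

# Step 3, source code:
def average_of_positives(numbers):
    total = 0
    count = 0
    for number in numbers:
        if number > 0:
            total += number
            count += 1
    if count == 0:
        return None  # no positive numbers, so no average exists
    return total / count


print(average_of_positives([4, -2, 6, 0]))  # 5.0
```

Each step narrows the description: the pseudo code fixes the order of the operations, and the source code fixes the exact syntax the computer can execute.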
1.3 Useful ideas from second language learning models
Learning how to conduct these translation steps has similarities with learning a second
language. Computer science students might benefit from the didactic experience gained in
second language education so we should look into second language acquisition (SLA) in more
detail.
Hulstijn (2005) provides us with the terms explicit and implicit learning. He warns that the
definitions are still a subject of debate. Explicit learning is input processing with the conscious
intention to find out whether the input information contains regularities and, if so, to work out
the concepts and rules with which these regularities can be captured. Implicit learning, however,
is input processing without such an intention, taking place unconsciously (Reber et al., 1999,
cited by Hulstijn, 2005, p 3).
Another useful theoretical distinction is that between deductive and inductive learning. DeKeyser
(1993) explains that deductive learning takes place when rules are presented before examples are provided;
inductive learning takes place when examples are given before rules are presented (DeKeyser,
1993, p. 380). The terms deductive and inductive learning are used in an instructional context.
By definition, deductive and inductive learning are part of explicit instruction because the correct
rule is always given at some point.
Figure 1 shows the combinations labeled as so-called learning dimensions (DeKeyser, 1993,
p. 380, cited by Hulstijn, 2005). Each combination prescribes its own learning setting. In an
explicit-deductive setting the learning and understanding of the rules is a primary goal. This
setting looks similar to traditional instructional teaching. In the implicit-deductive setting the
goals are the same but without the intention of pinpointing concepts and rules in the first place.
In this setting the student can be provided with examples of solutions instead of rules to solve. In
the inductive settings the implicit and explicit variants occur in the same way as in the deductive
setting but knowledge and skills are learned mainly by discovery and exploration instead of
instruction. See section 2.3.1 for more details.
Figure 1: Learning dimensions (DeKeyser, 1993).
Explicit, deductive: rules are lectured, instruction-based
Explicit, inductive: rules are discovered, keyword: noticing
Implicit, deductive: using parameters, example-based
Implicit, inductive: learning from input, as in natural language learning
1.4 Research question
There are many different ways in which information science, including aspects of
programming language acquisition (PLA), can be taught. However, all learners of programming
skills make use of the translation steps described in sections 1.2 and 2.3.2.
Not all translation steps are always applicable. This depends on the school-specific
curriculum or didactic choices. Such choices include, for example, the programming language or
the use of an educational programming tool. In any case, each translation step can be performed in
one of the four learning settings presented in figure 1.
For this research we investigate the differences in fulfilling a programming task between an
instruction-based explicit-deductive setting, an example-based implicit-deductive setting and a
mixture of settings in which the Internet may be used freely as an information source to accomplish a
programming task. See sections 1.3 and 2.3.1 for more details. This comparison addresses the
following question:
What are the differences between the results for programming tasks fulfilled by
learners when conducted by means of textual instruction, using close-to-solution
examples only or by allowing free use of the Internet?
Speaking with students, we hear that they often choose the free Internet option. On the other
hand, teachers prefer one or even both of the two deductive learning options. Perhaps teachers
want to stay in control. We expect the instruction-based setting to offer the best results for
beginning students, compared to the other learning settings. The instruction is more guiding and
students, presumably, need to make fewer difficult decisions on their own.
1.5 Research approach
First, we explore the existing literature on programming education and on
programming skills in general. We want to know what research has been done to draw lessons
from. The starting point of this study can be seen as an exploration in the field of general
didactics and pedagogy, but also in the knowledge area of programming. We will also look at
some publications that combine these areas of interest. Fairly new is the introduction of a
language learning point of view into the didactics of programming. The basic idea is to consider a
programming language as a natural language for writing and interpreting.
To compare results of programming tasks with each other statistically, we want to express
them in quantitative values. Therefore we make use of the way programming tasks are assessed in
computer science (CS) courses for pre-university students in the Netherlands. To maintain
objectivity during the assessment, we automate this process with scripts. Because of their
complexity, section 3.4 is dedicated to explaining those scripts in more detail. To improve the
quality of the experiment, participants are recruited who have approximately the same age and
level of experience in programming.
In chapter 2 the background information is discussed, most of which is found in other
academic publications. Chapter 3 explains the experimental design, followed by chapter 4 in
which an analysis of the experimental results is given. In chapter 5, we discuss interesting
aspects arising from this study and finally, chapter 6 presents conclusions and recommendations
for future work.
Chapter 2: Background information
In this chapter, two current didactic and pedagogical developments in general are introduced
which are decisive for important choices within the computer programming lessons, followed by
a brief description of programming skills in the computer science course.
2.1 Didactic and pedagogical developments
In recent years, in the development of didactics and pedagogy in general, two
notable topics have shaped the discussion about computer programming education (CPE). Section
2.1.1 introduces pedagogical patterns. Two examples of these patterns that play a role in CPE are
discussed separately, namely “Learning from mistakes” and “Round and deep”, followed by a
model for teaching Object Oriented programming in which multiple learning patterns are
integrated. In section 2.1.2 the second topic, constructivism, is discussed as used in Object
Oriented programming, especially in a learning sequence called software engineering.
2.1.1 Pedagogical patterns
According to the website of the Pedagogical Pattern Editorial Board the so-called “patterns”
are designed to capture best practice in a specific domain concerned with the practice of teaching
and learning (Bergin et al., 2012). In essence, a pattern solves a problem. This problem is one
that recurs in different contexts. In teaching there are many problems, such as motivating
students, choosing and sequencing materials, evaluating students, etc.
The patterns have a form similar to the one used by Alexander in his book “A Pattern
Language” (Alexander et al., as cited by Bergin, 2012). Jerinić (2012) tells us that both the
experienced and the novice could benefit from the ideas contained in patterns.
As an example pattern, we discuss the “Round and Deep” pattern. “Round and Deep” is the
name of a pattern in which the whole class benefits from the experiences of individual students.
Sharp (as cited by Bergin, 2012), who has revised this pattern, explains that an
experienced student is likely to gain a deep understanding of a complex concept by relating it to
his or her own experience. But the same experience which results in a deep understanding may
also limit it because a comprehensive understanding of a complex concept can only be achieved
by considering different perspectives. According to Sharp, in order to gain a deep understanding
of a complex concept, the students need to consider it from many different perspectives, but their
own experience is limited and a classroom exercise is too simple to cover adequately the deep
issues surrounding the concept. Experienced students will relate a new concept to their own
real-world experiences, and will form a deep understanding of it, but if their experience does not
validate the concept, then its significance may be lost, and if their experience does validate the
concept, then, although understanding may appear deep, it may also be narrow. Sharp concludes
that we therefore need to exploit the students' own experiences in order to deepen their own
understanding of the concept and to provide alternative perspectives from other students (Sharp
as cited by Bergin, 2012). Applying a pattern like “Round and deep” has immediate
consequences for the learning activities. Whereas the learning programmer previously worked mainly
individually, interaction between learners is now inevitable.
Another example is the pattern “Learning from mistakes”, which is very powerful in teaching
programming. Students are asked to create an artifact such as a program or design that contains a
specific error. Use of this pattern explicitly teaches students how to recognize and fix errors by
asking them to cause them deliberately and then examine the consequences. When applying this
pattern, one should consider how to deal with features such as an in-line syntax correction system
in a programming environment, because those features correct typical errors automatically,
sometimes even before the student has a chance to examine the consequences.
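A minimal sketch of an artifact for the “Learning from mistakes” pattern could look as follows. The task, the planted error and the function names are hypothetical and serve only as an illustration; Python is assumed as the programming language.

```python
def sum_up_to_buggy(n):
    """Deliberately wrong: range(1, n) stops at n - 1 (off-by-one error)."""
    total = 0
    for i in range(1, n):
        total += i
    return total


def sum_up_to_fixed(n):
    """Corrected version: range(1, n + 1) includes n."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


print(sum_up_to_buggy(5))  # 10, the consequence of the planted error
print(sum_up_to_fixed(5))  # 15, the intended result
```

The planted error is syntactically valid, so the program runs without complaint and the student has to examine the output to notice and explain the difference.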
In a brief description of designing a special class of Pedagogical Patterns for teaching
elementary programming, Jerinić gives an example of the use of less intentional errors. It is a
group of patterns for learning through trial and error. These patterns fit in an active learning
approach where students, due to the already mentioned enriched code editors, have immediate
feedback when errors occur (Jerinić, 2012, slide 12).
The model of Eckstein (2000) shows how learning patterns can be used in the teaching of
object oriented programming (Eckstein, 2000, p 1). The patterns were recognized in industrial
training settings, so it is not clear how well they can be applied in an academic (or educational)
environment. Nevertheless, Eckstein claims: “I realized that I had wandered into a wealth of
industrial training related patterns. You could say I found the first few nuggets in an unknown
and vast gold field”.
The structure of the patterns she found is similar to that introduced by Rueping (1999).
As shown in figure 2, the problem section formulates an issue as a question. The forces are the
considerations that lead to the solution. The solution answers the question of the problem section.
In the discussion section, some examples are presented or drawbacks of the solution are
discussed (Rueping, 1999, p 197).
According to Eckstein these patterns have to be regarded as a work in progress towards a
pattern language, but she concludes that there is a lot to be learned from industrial training.
Figure 2, learning patterns used in Object Oriented programming (Rueping, 1999).
2.1.2 Constructivism
Boyle (2000) introduces the term constructivism, in which the central tenet is that knowledge
of the world is constructed by the individual. Through interacting with the world, the person
constructs, tests and refines cognitive representations in order to make sense of the world.
Constructive learning rather than instruction becomes the focal issue (Boyle, 2000). Table 1
shows the main characteristics that underlie the constructivist learning environments
proposed by Jonassen (1994, p 5, 7-12, 31).
Table 1, characteristics of the constructivist learning environment (Jonassen, 1994, p 5, 7-12, 31).
Description
1 A constructivist learning environment uses authentic learning tasks meaningful for the
students.
2 Interaction is viewed as the primary source material for the cognitive constructions that
people build to make sense of the world.
3 A constructivist learning environment encourages voice and ownership in the learning
process. Students should be allowed to choose the problems they will work on. The teacher
should serve as a consultant to help students to generate problems which are relevant and
interesting to them.
4 The experience is interleaved with knowledge construction. The emphasis on authentic
tasks and rich interaction provides a base for experience with the knowledge construction
process.
5 Meta-cognition is the ultimate goal of a constructivist approach. Problem solving involves
the processes of reflecting on problems and searching for solutions.
As an example, the software engineering (SE) module, part of many CS curricula, gives the
opportunity to embed many of these characteristics. Programming skills are embedded in a range
of activities like designing solutions, documenting, reflecting, communicating, planning, scaling,
organizing and collaborating. These activities affect the way learning takes place. Consider once
again the learning settings as explained in section 1.3. With this learning approach, it seems that
instruction-based explicit-deductive learning has become less important. The learning activities
mainly lie with the students and seem more in line with learning by discovery and exploration.
2.2 Computer science course
In the curriculum of the computer science course, programming skills are explicitly
mentioned. Section 2.2.1 explores the meaning of programming skills. Next, section 2.2.2 gives a
brief classification of the best-known programming paradigms and choices made in
computer science education. Finally, section 2.2.3 describes aspects of the didactic models for
teaching introductory programming studied by Kaasbøll (1998, p 1). He discovered that
teachers do not normally relate to such models in daily life.
2.2.1 Programming skills
There are basically two meanings of the term programming skills; one which involves
solving some sort of computational problem, and another which can be seen as the craft of
coding and documenting.
In the early 1980s Pea and Kurland (1983) defined "computer programming" as the set of
activities involved in developing a reusable product, consisting of a series of written instructions
that make a computer accomplish some task (Pea & Kurland, 1983, p 149). They also discovered
a change in the nature of programming activities. In the early days of programming, for example,
the programmer needed to know the details of the computer hardware in order to write code
that actually worked. They argued in 1983 that this was no longer true. Regarding the set of
activities that constitute programming, they stated that the "cognitive demands" made by computer
programming need specification at the level of programming subtasks, or component activities.
Table 2 defines at least four levels of ability.
Table 2, levels of ability in accomplishing computational activities (Pea & Kurland, 1983, p 149).
Level: Description
1 Program user: Before learning how to program, one typically learns to execute already written
programs such as games, demonstrations, or computer-assisted instruction (CAI) lessons.
2 Code generator: At this level students know the syntax and semantics of the more common
commands in a language. Users can read someone else's program and know what
each line accomplishes. They can locate bugs that prevent commands from being
executed, and can load and save program files. There is no effort to optimize the
coding, use error traps, or make the program user-friendly and crash resistant.
3 Program generator: At this level, students have mastered the basic commands and are thinking in
terms of higher level units. Sequences of commands that accomplish program
goals are known (e.g., locate and verify a keyboard input, sort a list of names or
numbers, read data into a program from a separate file). Students can now read a
program and say what the goal of the program is, what functions different parts of
the program serve, and how the different parts are linked together.
4 Software developer: At this level, students are ready to write programs that are both complex and
intended to be used by others. Students now know several languages and have a
full understanding of all their features and how the languages interact with the
host computer (e.g. how memory is allocated, how graphic buffers can be
protected from being overwritten, how peripheral devices can be controlled by
the program).
This list is just an example of the detailed level description by Pea and Kurland. It says a lot
about what tasks students could master. Unfortunately, no explicit guidelines about how these
levels should be reached from a pedagogical perspective are available.
2.2.2 Paradigm parade
From a historical perspective, we can draw parallels between the development of computers,
the way we use them and the development of programming languages. Based on these parallels
Bellaachia (2012) classifies examples of well-known programming languages in four main
computer paradigms as shown in table 3 (Bellaachia, 2012, p 6). The first column contains
examples of imperative programming languages. The imperative paradigm is based on
commands that update variables in storage. The language provides statements, such as
assignment statements, which explicitly change the state of the memory of the computer. This
model closely matches the actual execution of a computer and usually has high execution
efficiency. In the second column the functional programming paradigm expresses computations
as the evaluation of mathematical functions. Functional programming paradigms treat values as
single entities. Unlike variables, values are never modified. Instead, values are transformed into
new values. The third column is meant for languages designed with the logic programming
paradigm. In this paradigm we express computation exclusively in terms of mathematical logic.
The logic paradigm focuses on predicate logic, in which the basic concept is a relation. A
computation is initiated by running a query over one or more relations. Finally, the
object-oriented paradigm organizes programs as objects: data structures consisting of data-fields and
methods together with their interactions. Objects communicate with one another via message
passing (Bellaachia, 2012, p 8).
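The differences between the paradigms can be sketched by solving one small task, summing the squares of a list of numbers, in three styles. This is an illustration under stated assumptions: Python is used only because it supports several styles in one language, the names are hypothetical, and the logic paradigm is left out because Python offers no direct counterpart.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Imperative style: assignment statements update a variable in storage.
total = 0
for n in numbers:
    total = total + n * n

# Functional style: values are transformed into new values, never modified.
total_functional = reduce(lambda acc, n: acc + n * n, numbers, 0)


# Object-oriented style: data fields and methods are bundled in an object.
class SquareSummer:
    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(v * v for v in self.values)


total_oo = SquareSummer(numbers).total()

print(total, total_functional, total_oo)  # 30 30 30
```

All three variants produce the same result; they differ in whether the computation is expressed as state changes, as a transformation of values, or as a request sent to an object.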
Table 3, four main paradigms with example programming languages.
Imperative/Algorithmic    Declarative: Functional    Declarative: Logic    Object-Oriented
Algol                     Lisp                       Prolog                Smalltalk
Cobol                     Haskell                                          Simula
PL/1                      ML                                               C++
Ada                       Miranda                                          Java
C                         APL
Modula-3
Views on what language to use in CSE vary widely. According to Simha (2003), arguments
in favor of the imperative languages are the simplicity, the small size of code and the absence of a
graphical interface that could distract from fundamentals. Nevertheless, some teachers
prefer the use of a graphical user interface (GUI); moreover, they even let students start building such
an interface. In most cases they use an object-oriented language because all GUI elements are
constructed as objects themselves. A strong argument in favor of the GUI approach is the
positive effect in motivating students. They are going to use GUIs anyway, as does the real
world. If GUIs are a good way to get students interested early, what's wrong with using this
approach to achieve the end objective of learning the fundamentals? (Simha, 2003)
In New Zealand, Robins, Rountree and Rountree (2003) published their findings about the
choice of object orientation (OO). In contrast to the sometimes enthusiastic comments about OO
from the field of education, they report a critical comment: “In Object Oriented courses it may be
necessary, particularly for weaker students, to devote particular attention to procedural concepts,
flow of control, flow of data and design” (Robins et al., 2003, p 162).
2.2.3 Models for teaching programming (Kaasbøll)
Kaasbøll (1998) starts his publication on models for teaching programming with the remark
that although models exist in the literature, interviews with teachers reveal that teachers do not
normally relate to such models (Kaasbøll, 1998, p 1). In his study Kaasbøll aims at developing a
didactic model for teaching introductory programming as described in table 4.
Table 4, The didactic model aims (Kaasbøll, 1998, p 1).
Description
1 A meta-model to be taught to learners, so that they can verbalize more of their learning.
2 A model for teaching to be taught to the tutors in a course, such that they can align their
activities with the lecturers. In this context, tutors are often assisting senior students.
3 A basis for formulating research questions for further studies of programming teaching.
Kaasbøll mentions three main didactic teaching strategies found in the literature before 1998.
In summary, in strategy 1, the “Semiotic ladder”, learning starts with syntactical knowledge,
which precedes learning the meaning of the language constructs. Strategy 2, the “Cognitive
objectives taxonomy”, which resembles Bloom’s “taxonomy of cognitive objectives”, starts with
running a program, followed by reading a program and then changing a program, and can
ultimately reach the level of creating a program. Strategy 3 is called “Problem solving”. Through
solving problems, the students should extend their experience and the basis for the process is the
knowledge structure of the field of programming. This is more a model of learning than a
teaching strategy. Appendix 1 presents the models by Kaasbøll in more detail.
Teachers were interviewed and asked which of the three approaches suited them best.
Kaasbøll says that the main lesson from the interviews was that none of the three suggested
models had significant advantages, and that a fourth model based on the software development
process should be developed. This corresponds to ideas mentioned by Eckstein (Eckstein, 2000).
The approach of problem solving was already revisited by Barnes, Fincher and Thompson (1997)
and transformed into a cycle of problem-solving, which included steps from software
development involved with, for example, industrial design (Barnes, Fincher and Thompson,
1997, p 3). See table 5 for the four stages of problem solving.
Table 5, the four stages of problem-solving (Barnes, Fincher and Thompson, 1997, p 3).
Stages: Activities
1 Understanding: • Structuring and dividing • Clarifying
2 Design: • Finding related problems and solutions • Checking against in- and output
3 Writing: • Completing • Adaptation to problem
4 Review: • Testing • Summarizing, lessons learned
With respect to learning methods of programming, software development models have
iterative phases, repeatedly taken and periodically improved. The goals are to complete the task
totally in the first iteration and improve the achievements in every following iteration. This model
emerged in response to the waterfall model, where one department (e.g. programming) had to
wait for the results of another department (e.g. the global design). When a software company
consists of several departments and the passing of semi-finished products can only take place
after a department has finished its work completely, the process diagram can be shaped as a
waterfall.
The need for real problem-solving approaches in CSE remains valid even today, but without
iterations of the phases this approach resembles a waterfall model too much (Kaasbøll, 1998, p
5).
2.3 Didactic ideas for programming
This section describes two ideas regarding programming. Section 2.3.1 introduces different
learning strategies for second (natural) language learning (SLA). These strategies can be useful
in learning a programming language. The second idea is discussed in section 2.3.2 and divides a
programming task into translation steps.
2.3.1 Programming language as second language
If a programming language is a language with grammar, vocabulary, syntax and semantics, it
is useful to look into the way a natural language can be learned. Do second-language learners
use the same approach for acquiring a natural language as learners of a programming language
do? The answer is not clear but in both classes of languages, one benefits from already knowing
another natural language used in daily life.
De Oliveira e Paiva (2009) explains the influence from a known (native) language in her
theoretical introduction about Second Language Acquisition (SLA) by mental representations
and information processing from a connectionist view. She cites Ellis et al. (1998, as cited in de
Oliveira e Paiva, 2007): “Our neural apparatus is highly plastic in its initial state, but the initial
state of SLA is no longer a plastic system. It is one that is already tuned and committed to the
first language (L1). It is possibility that in the second language, forms of low salience may be
blocked by prior first language experience, and all the extra input in the world may not result in
advancement” (Ellis et al., 1998, as cited in de Oliveira e Paiva, 2007, p 82). If we look for
resemblances between learning a programming language and learning a second language, we
need to keep in mind that a first language could have a negative influence on learning a
programming language too.
To gain more insight into the similarities between learning a natural language and a
programming language, we return to the publication of Hulstijn (2005) mentioned in section 1.3.
Recall figure 1, which presents the learning settings: deductive-explicit, deductive-implicit,
inductive-explicit and inductive-implicit. From this perspective, it is interesting to explore
how learners in programming language acquisition (PLA) become aware of regularities, such as
grammar rules and syntactic rules. The question is what the role of explicit-deductive instruction
is and what can be learned implicitly from examples. As we already noticed in section 1.3,
according to DeKeyser (1993), deductive and inductive learning are by definition part of explicit
instruction, because the correct rules, such as grammar rules, are always given at some point
(DeKeyser, 1993, p. 380). This suggests that grammatical knowledge as such is part of the
learning goals.
The role of grammatical knowledge was the subject of intensive debate during the
Conference of Teachers in Natural Languages in 2006 in Arnhem, the Netherlands (NaB-MVT,
2006). Besides the question of at which didactic point grammatical knowledge plays a role, the
question of whether such knowledge is needed at all was raised. Earlier, at the University of Delft in
the Netherlands, the Delft-method was developed to help foreign students learn the Dutch
language as quickly as possible. In this method grammar rules are only implicitly explained. As the
developers state: “Talking about Grammar rules only distracts and slows down from the real
goals. Grammatical rules are explicitly explained with examples and can be learned implicitly,
explicitly or in a mix from text fragments. The learner decides!” (Montens & Sciarone, 1984, p
11). This corresponds to teaching in an example-based implicit-deductive setting.
2.3.2 A step by step approach to programming tasks
When we look closer at a programming task, a number of subtasks become visible. With
reference to these subtasks a phased plan can be derived. We want to classify most of these
steps as translation steps in order to emphasize the linguistic aspects of programming. By
translating we mean converting to or interpreting and displaying content in a different form of
expression. In programming, we think translating should in most cases be understood in a
broader context. Take for example the step where a problem statement is translated into a
description of a solution. This step certainly implies more than just a conversion.
In the step sequence, after the problem statement, a translation step into an intermediate form
or directly into source code takes place. Source code is a collection of instructions for a
machine. It can be read and edited by a programmer, and dedicated software can translate it into
specific machine instructions. Compiling is the technical term for this translation step from
source code into machine code. Ultimately the machine code consists of a sequence of ones and
zeros. Typically, a machine can only perform this translation if the code is syntactically correct.
Figure 3 shows an overview of the most common steps. The blue arrows indicate the translations
which are applicable for this research. One can notice that there are multiple paths to reach the
source code phase. The choice of path is determined by the chosen programming language, a
GUI-approach or the use of a specific educational integrated development environment (IDE). See
section 2.2.2 for more details.
Figure 3. Translation map for programming.
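The requirement that source code must be syntactically correct before it can be translated can be illustrated with a small sketch. It relies on Python's built-in compile(), which translates source text into bytecode for the Python virtual machine rather than into machine code, so it is only an analogy for the compile step described above. The helper name is_translatable is hypothetical and not part of the experiment.

```python
def is_translatable(source: str) -> bool:
    """Return True if the source text is syntactically correct Python."""
    try:
        # compile() performs the translation step; it raises SyntaxError
        # when the text does not follow the grammar of the language.
        compile(source, "<learner answer>", "exec")
    except SyntaxError:
        return False
    return True


print(is_translatable("x = 3 + 1"))  # a complete statement
print(is_translatable("x = 3 +"))    # an incomplete expression
```

A text that passes this check is only syntactically correct; whether it solves the problem statement is a separate question.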
An intermediate form is meant as an aid for structuring and/or documenting solutions and
appears between the step of the solution description and the source code. Intermediate forms as
used in secondary schools and pre-university education can vary. They are often composed of
phrases in natural language, where the positioning of these phrases often indicates a structure
with a fixed order. See figures 4, 5 and 6 for some examples.
Figure 4, a task in natural language.
“Write the word 'even' on the screen if the input is an even
number. If the input is an odd number the word 'odd' has to
appear on the screen”
Figure 5, a task in pseudo code.
Begin
    Ask for input
    Read the keyboard
    If input/2 gives a whole number
        write 'even'
    Otherwise
        write 'odd'
End
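As an illustration of the last translation step, the pseudo code of figure 5 could be translated into source code roughly as follows. This is a sketch in Python 3 and not material from the experiment; the function names parity and main are assumptions made for this example.

```python
def parity(number: int) -> str:
    """Translate the decision rule of figure 5: even if number/2 is whole."""
    if number % 2 == 0:
        return "even"
    return "odd"


def main() -> None:
    # "Ask for input" and "Read the keyboard" from the pseudo code.
    number = int(input("Enter a whole number: "))
    # "write 'even'" or "write 'odd'".
    print(parity(number))
```

Calling main() asks for a number on the keyboard and writes the result on the screen; parity() isolates the decision rule so that it can be checked without keyboard input.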
Another example of intermediate code is the Program Structure Diagram (PSD), in which
ordered language elements containing statements, decision rules and iteration rules visualize the
pseudo code. In figure 6 we see a PSD in which it is not yet specified how to decide whether the
input is 'even' or not, although the structure of the code is already in the correct form.
Figure 6, a task in a program structure diagram.
Sometimes we see learners make up their own notes and sketches, which can be seen as
an intermediate form, but in those cases the meaning is not always clear to others. The examples
above are useful for basic imperative language elements. For more complex engineering, UML
diagrams can be chosen (Hoogenboom, 2004).
Chapter 3: Experimental design
To find out how the performance of learners on programming tasks is influenced by a chosen
learning strategy, we looked into the way they perform translations as part of a programming
assignment. The task outcome is assessed on structure and quality. This experiment contains one
independent variable, the learning setting, and two dependent variables: structure and quality.
See section 3.4 for details on the measurements.
The experiment is conducted in computer labs at schools for secondary education in the
Netherlands. It is embedded in an on-line questionnaire, supervised by the CS teacher. The
questions are formulated in such a way that they can be used for examination purposes as well.
The test is conducted in Dutch. Ten schools are invited to participate in the experiment. Teachers
have to supply some configuration settings in advance so the test will be configured to suit the
educational situation at a particular school.
3.1 Participants
The group of participants who perform the tasks consists of learners aged 15 to 18
years. At the moment of investigation (May-July, 2013) they are learners with beginning
programming skills. This means they have spent approximately 8 hours in total on learning
programming tasks. When the total learning investment exceeds 16 hours we assume
they have passed the “novice state” and their data are removed from this research.
Because it is not about the learning itself but about the way tasks are presented and
successfully performed in one of the three learning settings (LS's) as mentioned in section 1.4, no
specific pre-tests were planned for measuring what is learned so far. The preceding 8 hours of
training should be sufficient to allow the participants to deal with the difficulty of the tasks
given.
Initially the response of the CS teachers in the ten selected schools was very positive, but
eventually only two schools participated and 46 learners took part in the experiment. The
average level of experience of the participating learners, expressed in hours of programming,
was estimated by the teachers at 8. The learners, however, thought differently. Their estimate of
the average number of hours was lower: M = 5.6 (sd = 3.50). The gender distribution among the
participants was fairly skewed. Only 7 female and 39 male participants took part in the
experiment. Due to the low number of female participants, it is not clear what the influence of
gender is in our experiment. The distribution of the age in years was M = 17.02 (sd = .42).
3.2 Procedure
The experiment takes 45 minutes, divided over 3 equal periods of time. In every period,
participants have to perform translation tasks at their own pace. After each period of 15 minutes
the system changes the learning setting as shown in table 6. In period 1 the tasks are accompanied
by an explicit explanation in which no examples are provided, and the learner is not allowed to
use other resources. In period 2 the tasks are given without instruction about how to fulfill the
translation, but learners can look at three didactically well-chosen examples which are closely
related to the solution, meaning they differ from the ideal translation on only a few points. In the
third and final period the participants are allowed to solve the tasks by using the Internet, where
many examples and instructions can be found. In this final period the participants are expected to
submit the URLs of the consulted web-pages. Appendix 3 shows screen-shots taken from all three
periods. After 45 minutes the test will close automatically.
Table 6, the three periods of different learning settings.
Elapsed time (t in minutes)    Setting
Period 1 ( t=0 to t=15 )       Explicit Deductive instruction is available in (natural) language
Period 2 ( t=15 to t=30 )      Implicit Deductive information is available in parameterized
                               examples. Examples are close variants of the ideal solution.
Period 3 ( t=30 to t=45 )      The LS is decided by the learner and depends on the way
                               Internet provides solutions as (close) examples & instruction
In the database for this experiment 18 different tasks are available in different
presentation forms matching the three learning settings. For practical reasons the participants
unfortunately have to sit next to each other during the experiment and might cheat by
looking at their neighbor's screen, so tasks need to appear to them in a different order. To achieve
this, we made three series of tasks. In each experimental session no two participants with the
same series are sitting next to each other. To minimize the side effects of this difference in
ordering, the grade of difficulty of the tasks may not vary too much. In table 7, a
schedule shows which task is presented to a student in one of the three time slots:
Table 7, the order of tasks per period and learning setting over the programs.
Series   Instruction-based setting   Example-based setting   Free use of the Internet
         in period 1                 in period 2             in period 3
A        Task 1 to 6                 Task 7 to 12            Task 13 to 18
B        Task 7 to 12                Task 13 to 18           Task 1 to 6
C        Task 13 to 18               Task 1 to 6             Task 7 to 12
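The rotation in table 7 can be expressed compactly. The following sketch is not the software used in the experiment; the names tasks_for and assign_series are hypothetical, and the seating rule assumes that participants sit in a single row, so that only the seats directly to the left and right count as neighbors.

```python
# The 18 tasks are grouped in three blocks of six, as in table 7.
TASK_BLOCKS = [range(1, 7), range(7, 13), range(13, 19)]
# Each series starts at a different block and then rotates.
START_BLOCK = {"A": 0, "B": 1, "C": 2}


def tasks_for(series: str, period: int) -> list:
    """Return the task numbers for a series ('A', 'B' or 'C') in period 1, 2 or 3."""
    block = (START_BLOCK[series] + period - 1) % 3
    return list(TASK_BLOCKS[block])


def assign_series(number_of_seats: int) -> list:
    """Hand out series A, B, C in turn so that adjacent seats never share a series."""
    return ["ABC"[seat % 3] for seat in range(number_of_seats)]


print(tasks_for("B", 3))
print(assign_series(5))
```

Because every series meets every block of tasks in a different period, each task appears in each learning setting for one of the three series.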
All answers are collected on-line in a browser-based questionnaire, except when learners are
asked to draw diagrams or sketches. For those tasks paper and pencils are available. All data
input during the experiment is saved with a time registration, a program id, a task number and
a learning setting id. Participants are not allowed to navigate backwards after submitting an
answer. To disable navigating back to a previous task, a session value keeps track of the passed
tasks. Software for logging the keystrokes ran in the background during the experiment for
possible evaluation or monitoring purposes.
3.3 Material
In our experiment, the tasks contain only three basic imperative language elements. This way
we can maximize the number of students involved. Even if a school teaches some other
programming paradigm, these basic elements are almost always present. Another reason for this
choice of elements is to prevent undesired learning effects. A task at hand should contain known
code components only, so experience and information gained from already conducted tasks will
be of no value for the current task. Table 8 shows three basic imperative language
elements which are set out in this experiment.
Table 8, the selected basic imperative language elements.
     Description           Example
1    A simple statement    x=3+y
2    A decision-rule       IF-statement
3    An iteration-rule     WHILE or FOR construction
As shown in Figure 3 in section 2.3.2, each arrow represents one step of translation in
programming activities. For practical reasons, for instance the limited experimental time, the
CS teacher has to select three translations in advance which are part of the course program or,
in the teacher's opinion, suit the experiment best. In practice only the
steps shown in table 9 were chosen. Other possibilities were left out.
Table 9, the selection of chosen translation steps by the CS teacher.
     from                                       to
1    Solution description in natural language   source-code
2    An intermediate form in PSD                source-code
3    Solution description in natural language   An intermediate form in PSD
This is an illustrative example of a task from the experiment, originally formulated in Dutch:
Task:
Show for all angles on the interval [1,91) that the following equation is true:
sin(x)*sin(x)+cos(x)*cos(x)=1
For your convenience the library math has already been added.
The deductive explicit instruction:
Insert a finite loop from 1 to 91 containing:
    calculate variable y = pow(math.sin(x),2)+pow(math.cos(x),2)
    write in one line the angle followed by the sin of x, the cos of x and the value of y
Examples for the deductive implicit learning:
# import the math library
import math
# calculate the sin of an angle of 21 degrees and place it in variable x
x = math.sin(21)
# y is the square of sin of x
y = pow(math.sin(x),2)
_______________________________________
# make a loop on the range of 1 to 10
# and within this loop write x 10 times
for x in range(1,11) :
    print x
_______________________________________
import math
for y in range(1,31) :
    # calculate x = the square of the sin of y
    x = pow(math.sin(y),2)
    print "Angle = ",x," sin=",math.sin(x)," cos=",math.cos(x)," cos(y)^2-sin(y)^2=",y
Gold Standard array
Array('for', 'in range', 'pow(math.sin(x),2)+pow(math.cos(x),2)','print' )
(This array is used for measurements as explained in section 3.4)
PSD
Repeat for all integer angle values between 1 and 91
calculate variable y = sin(x)*sin(x) + cos(x)*cos(x)
write in one line the angle , sin(x), cos(x), y
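For readers who want to run the task themselves, the following sketch shows one possible solution in Python 3. It is not taken from the experiment: the material above uses Python 2 print statements and passes the integer angle directly to math.sin, which interprets its argument as radians. Converting the angle from degrees with math.radians is an assumption made here; the identity holds in either case. The function name check_identity is hypothetical.

```python
import math


def check_identity(first_angle: int = 1, last_angle: int = 90) -> list:
    """Return (angle, sin, cos, y) for every whole angle on [first_angle, last_angle]."""
    rows = []
    for angle in range(first_angle, last_angle + 1):
        x = math.radians(angle)  # assumption: angles are given in degrees
        y = pow(math.sin(x), 2) + pow(math.cos(x), 2)
        rows.append((angle, math.sin(x), math.cos(x), y))
    return rows


# Write in one line the angle followed by sin(x), cos(x) and the value of y.
for angle, sine, cosine, y in check_identity():
    print(angle, sine, cosine, y)
```

Because of floating-point rounding the printed value of y can differ from 1 in the last decimal places.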
3.4 Metrics and measurements
The independent variable is the learning setting (LS) mentioned in figure 1, section 1.3. The
submissions will be compared with each other as displayed in table 10.
Table 10, combinations of learning settings to be compared.
Combinations
Instruction-based explicit-deductive setting with example-based implicit-deductive setting
Example-based implicit-deductive setting with the free Internet setting
Instruction-based explicit-deductive setting with the free Internet setting
To measure the performance on a task, two dependent variables, namely structure and quality,
will be evaluated.
The structure of the given answer will be compared with the structure of a predefined Gold
Standard. This Gold Standard consists of an indexed list (an array) of compulsory code words.
The order in which these code words appear is used as the most important criterion for structure.
Text between these compulsory code words is ignored.
The other dependent variable, quality, defines a balance between completeness and efficiency.
Completeness is measured with the Gold Standard but without taking the order of the code words
into account. Efficiency is defined as the ratio between the number of lines required and the total
number of lines in an answer. Calculations for these variables are described in more detail in
figures 7, 8 and 9.
Figure 7, function for structure evaluation.
Input:
The GoldStandard array containing compulsory words, in the right order
The AnswerText given by the participant.
Process:
• The AnswerText is filtered for tabs and line breaks resulting in one long sentence.
• All words in the GoldStandard array are searched for in the AnswerText individually as
  search key. This search process is iteratively repeated in concatenated pairs, triples etc.
  of successive key words with the regular expression /.*/ as glue.
• When a regular expression is matched the value of the score will be increased by 1.
  Pairs, triples etc. of successive keywords will increase the score for every matching
  iteration. (This way longer concatenated key words will result in a heavier weight in
  the score value.)
• To normalize the score between tasks, the value will be divided by the number of
  words in the GoldStandard.
Output:
A positive real score value.
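The procedure of figure 7 can be sketched in Python as follows. This is an interpretation and not the original script: "pairs, triples etc. of successive key words" is read here as every contiguous run of entries from the Gold Standard, and the key words are escaped so that characters such as parentheses are matched literally. The names flatten and structure_score are hypothetical.

```python
import re


def flatten(answer_text: str) -> str:
    """Replace tabs and line breaks by spaces, giving one long sentence."""
    return re.sub(r"[\t\r\n]+", " ", answer_text)


def structure_score(gold_standard: list, answer_text: str) -> float:
    """Count matching runs of successive key words, normalized by the number of key words."""
    text = flatten(answer_text)
    n = len(gold_standard)
    score = 0
    # Run length 1 checks single key words, 2 checks pairs, 3 checks triples, etc.
    for length in range(1, n + 1):
        for start in range(n - length + 1):
            run = gold_standard[start:start + length]
            # Glue the key words with .* so that text in between is ignored.
            pattern = ".*".join(re.escape(word) for word in run)
            if re.search(pattern, text):
                score += 1
    return score / n


gold = ["for", "in range", "pow(math.sin(x),2)+pow(math.cos(x),2)", "print"]
answer = "for x in range(1,91):\n\ty = pow(math.sin(x),2)+pow(math.cos(x),2)\n\tprint(x, y)"
print(structure_score(gold, answer))
```

Under this reading, an answer that contains all four key words in the right order matches 4 single words, 3 pairs, 2 triples and 1 quadruple, so its score is (4 + 3 + 2 + 1) / 4 = 2.5. An answer with the same key words in reverse order only matches the single words.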
Figure 8, function for completeness (used by the function for quality).
Input:
The GoldStandard array containing compulsory words, in the right order
The AnswerText given by the participant.
Process:
• The AnswerText is filtered for tabs and line breaks resulting in one long sentence.
• All words in the GoldStandard array are processed individually as search key.
• If a regular expression is matched the score will be increased by 1.
• To normalize the score between tasks, the value will be divided by the number of
  words in the GoldStandard.
Output:
A positive real score value.
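A minimal sketch of the completeness function of figure 8, under the assumption that each key word is searched for literally; the name completeness_score is hypothetical.

```python
import re


def completeness_score(gold_standard: list, answer_text: str) -> float:
    """Fraction of Gold Standard key words found in the answer, regardless of order."""
    # Filter tabs and line breaks, resulting in one long sentence.
    text = re.sub(r"[\t\r\n]+", " ", answer_text)
    hits = sum(1 for word in gold_standard if re.search(re.escape(word), text))
    return hits / len(gold_standard)


print(completeness_score(["for", "if", "print"], "print if for"))
print(completeness_score(["for", "if", "print"], "for x in y:"))
```

In contrast to the structure score, the order of the key words has no influence here, so the score always lies between 0 and 1.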
Figure 9, function for quality evaluation.
Input:
The GoldStandard array containing compulsory words, in the right order
The AnswerText given by the participant.
The score returned from the function completeness.
Process:
• The AnswerText is filtered for tabs and line breaks resulting in one long sentence.
• The number of lines in The GoldStandard array is divided by the number of lines in
  the AnswerText.
• The result is multiplied by the score from the function completeness.
Output:
A positive real score value.
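The quality function of figure 9 can be sketched as follows. Two points are assumptions, because the figure leaves them open: the number of required lines is taken to be the number of entries in the Gold Standard unless another value is passed, and the lines of the answer are counted before tabs and line breaks are filtered out, ignoring empty lines. The names completeness_score and quality_score are hypothetical.

```python
import re


def completeness_score(gold_standard: list, answer_text: str) -> float:
    """Fraction of Gold Standard key words found in the answer, regardless of order."""
    text = re.sub(r"[\t\r\n]+", " ", answer_text)
    hits = sum(1 for word in gold_standard if re.search(re.escape(word), text))
    return hits / len(gold_standard)


def quality_score(gold_standard: list, answer_text: str, required_lines=None) -> float:
    """Efficiency (required lines / lines in the answer) multiplied by completeness."""
    if required_lines is None:
        # Assumption: one required line per Gold Standard entry.
        required_lines = len(gold_standard)
    answer_lines = [line for line in answer_text.splitlines() if line.strip()]
    if not answer_lines:
        return 0.0  # an empty answer has no quality
    efficiency = required_lines / len(answer_lines)
    return efficiency * completeness_score(gold_standard, answer_text)


gold = ["for", "if", "print"]
compact = "for x in y:\n    if x:\n        print(x)"
padded = "# a\n# b\n# c\n" + compact
print(quality_score(gold, compact))
print(quality_score(gold, padded))
```

In this sketch a complete answer that uses twice as many lines as required receives half the score of a complete answer of the required length. An answer shorter than the required number of lines can score above 1.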
As stated before, participants should know the three basic imperative language elements as
mentioned in section 3.3 (table 8). Therefore, if an average score value for structure or quality
over all the submissions made by a particular participant is less than 20% (0.2) of the maximum
score value, their contributions will be skipped during the statistical analysis. Typographical
errors are not considered a huge problem because they are often directly corrected by automated
error detecting features in the development software. Structural errors, however, can influence
the meaning of the code. This can be critical because some unintended changes in meaning are
not always noticed and won't be corrected automatically by development software.
Scores are also normalized between participants. One could actually classify this research
design as a “between task” design. The influence of the differences between the participants
based on their talent is corrected by dividing each individual score by the average
score of all the tasks submitted by this particular participant.
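The exclusion rule and the normalization described above can be sketched as follows. This is an illustration and not the analysis script of the study; the function name filter_and_normalize and the parameter max_score are assumptions, since the text does not state how the maximum score value is determined.

```python
from statistics import mean


def filter_and_normalize(scores_by_participant: dict, max_score: float,
                         threshold: float = 0.2) -> dict:
    """Drop participants below the threshold and divide each score by the participant's mean."""
    normalized = {}
    for participant, scores in scores_by_participant.items():
        average = mean(scores)
        # Participants with an average below 20% of the maximum score are skipped.
        if average < threshold * max_score:
            continue
        normalized[participant] = [score / average for score in scores]
    return normalized


scores = {"p1": [1.0, 2.0, 3.0], "p2": [0.1, 0.1]}
print(filter_and_normalize(scores, max_score=2.5))
```

After this step every remaining participant has an average normalized score of 1, so differences between scores reflect differences between tasks and settings rather than between participants.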
Chapter 4: Results
Because of some remarkable findings, we start the presentation of the results by discussing the
frequencies of the eighteen tasks accomplished in different learning settings in section 4.1. In
section 4.2 we give the results of the one-way MANOVA conducted in our analysis. Because
time probably had more influence on the results than we initially expected, in section 4.3 we
show the differences between the average time used per task with regard to the learning setting.
4.1 Distribution of completed tasks
In total, 685 tasks were submitted. After filtering out the empty answers, 259 remained in three LS's.
Some tasks were solved more often than others. Out of 46 participants the results of 8 learners
were removed because their individual average score did not reach the required minimum of .2.
See section 3.1 for more details about the characteristics of beginning programming skills. Three
submissions were skipped because the answers were not filled in seriously and added unnecessary
noise. Eventually 177 tasks were evaluated. The distribution of completed tasks by participants,
normalized for every LS, is shown in figure 10.
Figure 10, the distribution of completed tasks over the three learning settings.
[Bar chart: normalized frequency (vertical axis, 0 to 0.35) per task number (horizontal axis, 1 to 18) for the instruction-based, example-based and free Internet settings.]
4.2 Data analysis
We have one categorical independent variable, the learning setting, and two dependent
variables: structure and quality. To examine the influence of using different learning settings, a
one-way MANOVA test seems to be the right choice. For this test we have to consider and
test several assumptions.
The dependent variables should be measured at the interval or ratio level.
Structure and quality should be normally distributed. As presented in table 11, the Shapiro-Wilk
test shows non-significant results (p > .05) in all cases but one: quality in the
implicit-deductive setting resulted in a lower p-value (p = .036).
Table 11, results for Normality with the Shapiro-Wilk test.
Variable    Learning setting                                Statistic   df   Sig
structure   Instruction-based explicit-deductive setting    .979        68   .313
structure   Example-based implicit-deductive setting        .985        56   .702
structure   Free Internet setting                           .973        56   .243
quality     Instruction-based explicit-deductive setting    .980        68   .348
quality     Example-based implicit-deductive setting        .955        56   .036
quality     Free Internet setting                           .967        56   .128
Because of the robustness of parametric tests in general, we assume both dependent
variables are normally distributed.
The dependent variables can be treated as continuous.
The independent variable contains three independent groups. The independence of
observations in each group is met by making them independent from the participant as
mentioned earlier. There is an adequate sample size. Using a Z-score cut-off of 3.29, one
univariate outlier is found for structure and is removed. For multivariate outliers we
calculate the probabilities for Mahalanobis D². Two cases have p < .001 and are removed. The
removed items were already nominated to be skipped based on earlier demonstrated incapacity.
To test for multivariate normality, West, Finch, & Curran (1995) recommend concern if
skewness > 2 and kurtosis > 7. For Mardia's test, a normality estimate > 3 indicates
non-normality. Testing this with the SPSS script from DeCarlo (1997) gives the following
results: b2p (Mardia's estimate of multivariate kurtosis) = 8.796, N(b2p) = 1.312 and
p = .4497. This indicates that the multivariate kurtosis is small enough to retain the multivariate
normality assumption. This also supports the earlier decision to treat both dependent
variables individually as normally distributed.
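The two outlier checks can be sketched in plain Python for the case of two dependent variables. This is an illustration and not the SPSS procedure used in the study; all function names are hypothetical. For two variables the squared Mahalanobis distance follows from the inverse of the 2x2 sample covariance matrix, and the upper-tail probability of a chi-square distribution with 2 degrees of freedom is exp(-D²/2). The sketch assumes that the two variables are not perfectly correlated, otherwise the determinant is zero.

```python
import math
from statistics import mean, stdev


def univariate_outliers(values: list, cutoff: float = 3.29) -> list:
    """Return the indices of values whose absolute Z-score exceeds the cut-off."""
    m, s = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs((v - m) / s) > cutoff]


def mahalanobis_sq(points: list) -> list:
    """Squared Mahalanobis distance of each (structure, quality) pair to the centroid."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the covariance matrix
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def p_value_df2(d_squared: float) -> float:
    """Upper-tail probability of a chi-square distribution with 2 degrees of freedom."""
    return math.exp(-d_squared / 2)


def multivariate_outliers(points: list, alpha: float = 0.001) -> list:
    """Return the indices of cases with p < alpha for Mahalanobis D²."""
    return [i for i, d2 in enumerate(mahalanobis_sq(points)) if p_value_df2(d2) < alpha]
```

With two variables, p < .001 corresponds to a D² of roughly 13.8 or more.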
As shown in figures 11, 12 and 13, there is a weak linear relationship between
structure and quality for each learning setting.
Figure 11, the linearity between the dependent variables in the explicit-deductive setting.
Figure 12, the linearity between the dependent variables in the implicit-deductive setting.
Figure 13, The linearity between the dependent variables in the free Internet setting.
The homogeneity of variance-covariance is tested with the Box's M test of equality of
covariance. The test shows M = 12.61, F = 2.067 with a p-value of .054; therefore the result was
not significant. Although this is very close to the significance threshold, the assumption of equal
group covariance matrices cannot be rejected.
The assumption of no multicollinearity is tested by looking at the correlation between
structure and quality. The Pearson correlation value is r=.633 which is less than .8 which
indicates no multicollinearity.
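The multicollinearity check can be sketched as follows. The function names pearson_r and multicollinear are hypothetical, and the data in the example are made up for illustration and are not scores from the experiment.

```python
import math


def pearson_r(xs: list, ys: list) -> float:
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def multicollinear(xs: list, ys: list, cutoff: float = 0.8) -> bool:
    """Flag multicollinearity when the absolute correlation reaches the cut-off."""
    return abs(pearson_r(xs, ys)) >= cutoff


structure = [1, 2, 3, 4]  # illustrative values only
quality = [2, 1, 4, 3]    # illustrative values only
print(pearson_r(structure, quality), multicollinear(structure, quality))
```

With the cut-off of .8 used in the text, the reported correlation of r = .633 stays below the threshold.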
As it seems, all the important assumptions needed for using a MANOVA test are met
sufficiently. The multivariate tests between the three learning settings with 95% confidence
interval are shown in Table 12.
Table 12, results of the multivariate tests.
                      Value   F       Hyp. df   Error df   Sig.
Pillai's Trace        .067    3.325   4.000     386.000    .011
Wilks' Lambda         .934    3.311   4.000     384.000    .011
Hotelling's Trace     .069    3.297   4.000     382.000    .011
Roy's Largest Root    .043    4.114   2.000     193.000    .018
All p-values in the last column are below .05, which shows statistically significant
differences between the means. A Post Hoc test is needed to show in which direction they differ.
A Levene test gives no indication that the error variances of the dependent variables structure
and quality differ across the learning settings: structure has a significance value of .381 and
quality of .102. As part of the Post Hoc tests, an LSD calculation (Fisher's least significant
difference test) makes it possible to draw some conclusions. The calculations of the
LSD's are displayed in Table 13.
Table 13, results of the LSD tests.
Variable    Compared learning settings                        Mean         Std.     Sig.    CI lower   CI upper
                                                              Difference   Error            bound      bound
structure   Instruction-based with example-based setting      .0011        .1160    .992    -.2276     .2298
structure   Instruction-based with free Internet setting      .2519        .1179    .034    .0193      .4844
structure   Example-based with free Internet setting          -.2508       .1200    .038    .0141      .4875
quality     Instruction-based with example-based setting      .2423        .1154    .037    -.0092     .4488
quality     Instruction-based with free Internet setting      .2266        .1174    .055    -.0072     .4603
quality     Example-based with free Internet setting          -.0158       .1194    .895    -.2302     .2438
If we focus only on the column Mean Difference, we notice for structure a small
difference between the instruction-based setting and the example-based setting. However, the
difference between these deductive settings and the free Internet setting is much larger. For
quality, the mean in the instruction-based setting clearly differs from the means in the
example-based setting and the free Internet setting, while the means of the example-based setting
and the free Internet setting differ little. However, when we take the confidence intervals (CI)
into account, as shown in figure 14, we have to be careful drawing any conclusions.
Figure 14, The means in the three learning setting for structure and quality with error bars.
The error bars indicate how much the value of the mean may vary. Unfortunately they all
overlap each other. The hypothesis in this study can be neither confirmed nor rejected.
4.3 Time spent on each task
The amount of time spent on each task could play an important role. There were six tasks
available for every condition. In practice, no learner managed to complete all six in a period of 15
minutes. In figure 15 the average number of seconds needed per task is shown for every learning
setting.
Figure 15, The average seconds used per task in an instruction-based setting.
A lot of task submissions were not useful for evaluation because they were empty. Several
items contained noise or were marked as outliers and were removed.
Chapter 5: Discussion
Because of the possible impact of time-related issues, we discuss our view on this matter in
section 5.1. Section 5.2 disentangles the obscure feature of our experiment on learning settings
without the intention to learn. In section 5.3, the measurement of structure and quality is
compared to performance as assessment criteria in programming. Next, the distance between
second-language learning and learning to program is the subject of section 5.4, followed by
section 5.5 about the need for a serious approach to computer science.
5.1 Possible impact of time
Informal feedback from participants after the experiment revealed that they chose to skip a task
when they thought the next one could lead to more success. The system was configured to
submit every task when leaving the web-page towards the next task. This could explain the large
quantity of empty submissions. Because of possible undesirable learning effects as described in
section 3.2, participants could not navigate to a previous task. Unfortunately, they only became
aware of this limitation during the experiment. We expect the adverse effects to be minimal, as
there were plenty of tasks left. It might be more complex to discover submissions with an
at-least-try-something answer, because these items are hard to detect with scripted algorithms.
Fortunately, guessing the right answer is not a programming skill, so it is not a problem that
such submissions are evaluated as submissions with a low score.
By dividing time into periods of 15 minutes, as mentioned in section 3.2, the same number of
tasks was offered in every learning setting. Completing all 18 available tasks was not
obligatory and the participants could work at their own pace. However, some of them
mentioned that, in trying to perform well, they felt time pressure and uncertainty, which by
itself might have had an inhibitory effect on the pace of work.
Performing under time pressure seems a reasonable proficiency requirement, especially
for programmers working in a professional environment. Experienced programmers surely know
strategies to deal with time pressure issues.
5.2 The missing learning experience
Although the learning settings as shown in figure 1 are described and developed within a
framework of learning, the learning itself is not measured in this experiment. Participants have
already acquired the three basic imperative language elements as mentioned in table 8, section 3.3,
otherwise their submissions were filtered out from the analysis. Besides, measuring how
conceptual knowledge, such as the underlying concepts and rules mentioned by Hulstijn
(2005) in section 1.3, is learned is probably very hard to accomplish with an experiment like
this.
While preparing the experimental material, we became aware of the fact that searching for
examples that are close to a certain task, meaning they differ from the ideal solution on only a
few points, was very time consuming. Fortunately we had decided to concentrate on novice
programmers. When a learner gains more skills, programming tasks will become more complex
and suitable matching examples will be harder to find. At an advanced level, learners probably
need more individually tailored material to learn from.
The free Internet setting is a normal situation for many programmers, because it provides
solutions and background information for the most regular programming tasks 24 hours a day.
However, in our experiment the results for this setting were poor. This is probably related to time
issues. For beginners, finding the right information about programming on the web proved to be
a real challenge.
5.3 Measuring results with the variables structure and quality
The quality of the results of the learners is harder to measure than expected. We searched for
guidelines for evaluating programming skills in literature about CSE. Auffarth et al. (2008) and
several other authors base their assessment criteria on the performance of the code when it is
executed by a computer, not on its textual composition. Perhaps many assessors believe that there
is only one correct composition of code per task. See table 14 for an example of
performance-oriented assessment criteria (Auffarth et al., 2008 p 2).
Table 14, An example of assessment criteria for programming code (Auffarth et al., 2008).
Type             Performance check
1 Execution      Does the code work at all?
2 Verification   Does the code work given a set of inputs?
3 Validation     Given a set of inputs does the code output the expected results?
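The three performance checks in table 14 can be illustrated with a minimal Python 3 sketch. It is not the system described by Auffarth et al. (2008). The function name assess and the convention that a submission defines a function called solve are hypothetical and introduced only for this illustration.

```python
def assess(source, inputs, expected):
    """Apply the three checks of table 14 to a submission given as source text.

    Assumption: the submission defines a function named `solve` (hypothetical).
    """
    result = {"execution": False, "verification": False, "validation": False}
    namespace = {}
    try:
        # 1. Execution: does the code work at all?
        exec(source, namespace)
        solve = namespace["solve"]
        result["execution"] = True
    except Exception:
        return result
    try:
        # 2. Verification: does the code work given a set of inputs?
        outputs = [solve(value) for value in inputs]
        result["verification"] = True
    except Exception:
        return result
    # 3. Validation: does the code output the expected results?
    result["validation"] = outputs == list(expected)
    return result


if __name__ == "__main__":
    submission = "def solve(x):\n    return x * 2\n"
    print(assess(submission, [1, 2], [2, 4]))
```

Note that all three checks look only at behaviour when the code is executed. None of them inspects the textual composition of the submission, which is the aspect this study measures with structure and quality.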
Because of the difference in approach between our research and that of the performance-oriented
assessors, we asked the CS teachers of the schools participating in our experiment to give their
opinion. Their ideas are interpreted in table 15.
Table 15, Ideas from CS teachers about better programming results.
Error rate              Description
1 (Minor error)         Spelling and punctuation errors
2                       Lack of efficiency
3                       Structure errors and incompleteness
4 (Significant error)   Semantic errors
Spelling and punctuation errors are not very harmful, because smart editors detect them
immediately. Structural errors and incompleteness are more of a problem, because these kinds of
deficiencies are decisive for reaching the right goals and are sometimes not detected by the
editing software. The semantic error is the most harmful and occurs right at the beginning,
when the problem is translated into a solution. Some of the teachers rate efficiency as less
important: it does not yet have a big impact on simple tasks, but it becomes more influential in
complex tasks.
Between efficiency and completeness the right balance had to be found. For this experiment
we attempted to find such balance for every single task by checking which code fragments were
indispensable. See section 3.4 for more details about the Gold Standard used.
5.4 Distance between second-language learning and learn to program
The possible similarities between learning how to conduct translation steps in programming
and performing equivalent exercises in learning a second language vary per school, in the same
way that schools differ in their school-specific curriculum or didactic choices.
In many textbooks for modern foreign languages, examples and instructions are present.
Exercises for which the use of the Internet is prescribed also appear more frequently. Thumbing
through schoolbooks for second language learning, we notice an increase in the different ways
learning material is presented, including explicit and implicit deductive material. Remarkable,
though, is the discussion about what to start with: implicit text examples, explicit
theory or both. These choices, with respect to computer programming, have not been
identified explicitly in this study.
As mentioned in the introduction, section 1.1, Kak (2008) and Ginat (2003) indicate the
considerable influence of the attitude of students on how they learn. Perhaps students should
not only be allowed to choose the language they need to learn, but also which
learning setting they prefer and in which order the knowledge is obtained.
5.5 Importance for Computer science
In this study, we focused on secondary schools in the Netherlands. At some schools computer
science has a poor image. Often one can hear about large differences between individual school
programs and also about large differences between scores of individual learners. Learners are not
always challenged by the topics in the course.
A serious approach to CSE is vital for its opportunities in the near future. We think many
recognize its importance, so curriculum developers have to leave the pioneering phase behind
and stop changing priorities between the areas of computing science too often. Paying
more attention to something as basic as writing or designing computational solutions seems a
far better choice to us.
Chapter 6: Conclusion and future work
In this chapter we conclude our work. In section 6.1 findings and indications are derived
from the results of our experiment. Section 6.2 presents answers to our research question. In
section 6.3 we discuss some recommendations for future work.
6.1 Findings and indications
Three basic imperative language elements were used to test how participants made
translations in a programming task. The use of an instruction-based explicit-deductive setting, an
example-based implicit-deductive setting and what we have called a free Internet setting showed
varied results. Looking back at these results, the separation of structure and quality gives us
plausible information. The free Internet setting proves to be not only time
consuming; it sometimes also lacks the guidance needed for accomplishing this sort of translation
task. Students also lack directions to where the best information can be found. The
implicit-deductive setting, where solutions to similar problems are offered in order to accomplish the
tasks, provides helpful information on structure but less helpful information on quality. The
explicit-deductive setting, where instruction provides the matter, seems the best choice for
quality. Structure and quality of the task submissions were evaluated and statistically tested for
differences.
The results of a MANOVA test were used together with a Post Hoc LSD calculation.
Because the confidence intervals in the Post Hoc tests are relatively large, they overlap for all three
conditions. If this experiment is repeated with a larger group of participants, we expect that the
results in the instruction-based explicit-deductive setting and the example-based
implicit-deductive setting will still not differ very much and that the results for the free Internet
setting will remain at a great distance. Looking at structure separately, the example-based
implicit-deductive setting shows better results than the instruction-based explicit-deductive
setting and far better results than the free Internet setting. If the confidence intervals can be
reduced in size, we expect that quality will be best guaranteed when an instruction-based
explicit-deductive setting is applied. The example-based implicit-deductive setting and the free
Internet setting will both give low scores.
6.2 Answers to the research question
With this information we can answer our research question about the differences between
the results for programming tasks completed by learners when guided by textual
instruction, by close-to-solution examples only, or by free use of the Internet.
Depending on which assessment criteria are at stake, and based upon the ideas of CS
teachers, as mentioned in tables 14 and 15, our research reveals that an instruction-based
explicit-deductive setting gives better results if quality, defined as a combination of completeness and
efficiency, is the main criterion. When structure is more important than quality, the
example-based implicit-deductive setting is a better choice. In our research, the free Internet
setting gives weaker results for each assessment criterion mentioned in tables 14 and 15.
6.3 Future work
In this research in the domain of PLA (Program Language Acquisition), the opinion of the
participants about which learning setting works best was not collected. In particular, the possible
difference in opinion before and after the experiment could contribute much to our knowledge of
how the learners perceive the learning process. For future research we recommend
listening to the learner more often. In combination with the data analysis from an experiment like
this, we might have more clues for explaining which learning setting or combination of settings
gives better results.
During computer programming classes where learners discuss code issues, problems or
solutions, they train themselves implicitly in translational steps. Sometimes I hear a conversation
where two learners use code fragments to communicate. Also in written exams about computer
programming these translational activities become visible, often written in natural language
accompanied by examples of intermediate forms or real source code. Mixed forms appear quite
frequently. These implicit translational utterances from a natural language into an intermediate
form or into source code, and vice versa, could tell us something about how successfully the
learning process in programming progresses. This might be an interesting area for future
research.
Finally, more attention should be given to the time factor. It seems a plausible common
denominator for many of the low scores and empty replies in our experiment. However, dealing
with time can be viewed as a standard skill in software engineering. Depending on the objectives
of the present CS courses, an experiment with time as an independent variable could lead to more
insight into the way translations are made by programmers. This is a challenge that should be
tackled in future research and perhaps an issue that should be dealt with in a broader sense,
because in education a lot of testing and examining is done without concentrating on the impact
of time when it comes to results that matter.
Bibliography
Alexander, C. Ishikawa, S. Silverstein, M. (1977). A Pattern Language. Oxford University Press
Auffarth, B. Lopez-Sanchez, M. Campos i Miralles, J. Puig, A. (2008). System for Automated
Assistance in Correction of Programming Exercises (SAC)
Department of Applied Math and Analysis, University of Barcelona
Barnes, D. Fincher, S. Thompson, S. (1997). Introductory Problem Solving in Computer Science
Computing Laboratory, University of Kent at Canterbury, Kent, CT2 7NF, England
Bellaachia, A. (2006). Advanced Software Paradigms. George Washington University (US).
Online Powerpoint, first created at 12 September 2005
<http://www.seas.gwu.edu/~bell/csci210/lectures/programming_paradigms.pdf>
Bergin, J. Eckstein, J. Völter, M. Sipos, M. Wallingford, E. Marquardt, K. Chandler, J. Sharp, H.
& Lynn Manns M. (Eds.) (2012). Pedagogical Patterns: Advice for Educators. Joseph Bergin
Software Tools (Pedagogical Pattern Editorial Board)
Boyle, T. (2000). Constructivism: a Suitable Pedagogy for Information and Computing
Sciences? University of North London. doi=10.1.1.176.8153
Carey, S. & Bartlett, E. (1978). Acquiring a Single New Word. Papers and Reports on Child
Language Development, Vol. 15, pp. 17-29. Massachusetts Institute of Technology, Rockefeller University.
Cunningham, D. J. Duffy T. M. & Knuth R. (1993). The textbook of the future. Cited in
McKnight, C. Dillon A. & Richardson, J. (eds) Hypertext: a psychological perspective.
Ellis Horwood.
DeKeyser, R. M. (2003). Implicit and explicit learning. In Doughty, C. J. & Long, M. H. (Eds).
The Handbook of Second Language Acquisition (pp.313-348). Oxford: Blackwell.
Duffy, T. M. & Jonassen, D. H. (1991) Constructivism: new implications for educational
technology? Journal, Educational Technology, 31, 5, 7-12.
Eckstein, J. (2000). Learning to Teach and Learning to Learn, Running a Course, Objects in
Action. Germany (EuroPLoP)
Ellis, N. C. (2007). Cognitive Perspectives on SLA: The Associative-Cognitive Creed.
AILA Review, Volume 19 (Themes in SLA Research), pp. 100-121.
Fee, S. B. & Holland-Minkley, A. M. (2012). Teaching Computer Science through Problems,
not Solutions. Information Technology Leadership, Washington & Jefferson College,
Washington, PA, US
Gelman, S. A. (2007). Two Insights about Naming in the Preschool Child, in The Innate Mind:
Structure and Contents. Publisher: Oxford University Press
Ginat, D. (2003). The Greedy Trap and Learning From Mistakes. CS Group, Science Education
Department Tel-Aviv University, Israel
Gómez-Albarrán, M. Jiménez-Díaz, G. López Fernández, M. Gómez-Martín M. A.
Díaz-Esteban, A. Hernández-Yañez, L. & Ruiz-Iniesta, A. (2009). Example-supported learning of
programming concepts: from free-access to knowledge-controlled routing in repositories
deployed in a Virtual Campus. Paper for the XI International Symposium on Computers in
Education (SIIE 2009). Coimbra (Portugal)
Hoogendoorn, S. (2004) Practisch modelleren met UML 2. ISBN10 9043006521, ISBN13
9789043006521
Hulstijn, J. H. (2005). Theoretical and Empirical Issues in the Study of Implicit and Explicit
Second-Language Learning, SSLA, 27,129-140. Printed in the United States of America. doi:
10.1017/S0272263105050084
Jonassen, D. H. (2000) Toward a Design Theory of Problem Solving. Journal, Educational
Technology, Design and Development, p. 48 (4).
Kaasbøll, J. J. (1998). Exploring didactic models for programming, p. 195–203
Department of Informatics, University of Oslo.
Kak, A. (2008). Teaching Programming. First posted: July 2008; Revised (minor corrections):
October 2012. doi 310.1019/654.tp.2008.01.11. Purdue University
Ljubomir, J. (2012) Pedagogical Patterns For Learning, Programming By Mistakes.
University of Novi Sad. Faculty of Sciences, Department of Mathematics and Informatics, Serbia
<http://www.academia.edu/2356192/ ljubomir.jerinic@dmi.uns.ac.rs>
Mardia, K. V. (1980). Tests of univariate and multivariate normality. In P. R. Krishnaiah (Ed.),
Handbook of statistics (Vol. 1, pp. 279-320). Amsterdam: North-Holland.
Montens, F., Sciarone, A. G., (1984) Hoe leer je een taal? De Delftse methode
Naigles, L. (1990). Children use syntax to learn verb meanings, Journal, Child Language. 17,
357-374. Yale University, New Haven, CT
Nationaal Bureau Moderne Vreemde Talen (NaB-MVT). (2006). Enschede, Netherlands “De zin
en onzin van grammatica onderwijs! ” Conference documentation.
Neve, P. Hunter, G. Livingstone, D. & Orwell, J. (2012). NoobLab: An Intelligent Learning
Environment for Teaching Programming. Kingston University London, UK
Oliveira e Paiva, V. L. M. de (2009). Second Language Acquisition: from Main Theories to
Complexity.
Pea, R. D. Kurland, D. M. (1983). On the Cognitive Prerequisites of Learning Computer
Programming. Technical Report No. 18.
Pucher, R. Tesar, M. Mandl, T. Holweg, G. & Schmöllebeck, F. (2011) Improving Didactics in
Computer Science – The Example of the GEMIS and the QUADRO Projects, Journal,
International journal of education and information technologies. Issue 1, Volume 5
Robins, A. Rountree, J. & Rountree, N. (2003). Learning and Teaching Programming: A Review
and Discussion. Journal: Computer Science Education, Vol. 13, No. 2, pp. 137–172. University of
Otago, Dunedin, New Zealand
Rueping, A. (1999a). Project Documentation Management. In Proceedings of the 4th European
Conference on Pattern Languages of Programming and Computing 1999. Universitätsverlag
Konstanz.
Rueping, A. (2003). Agile Documentation: A Pattern Guide to Producing Lightweight
Documents for Software Projects
Simha, R. (2003) website: High school, computer science: a resource. Department of Computer
Science at The George Washington University
<http://www.csteachers.gwu.edu/Teachers/javaoptions.html>
Appendix 1: Three didactic approaches to programming (Kaasbøll, 1998)
These didactic models for introductory teaching of programming have been found in
literature.
1. Semiotic ladder
Source: Koffman’s (1986) introductory book on programming is an example.
This teaching and learning sequence starts out from syntax and proceeds to the semantics and
pragmatics of the language-like tools. Syntactical knowledge precedes the learning of the meaning of
the language constructs.
3 Pragmatics
2 Semantics
1 Syntax
2. Cognitive objectives taxonomy
Source: Kirkerud (1996) & Reinfelds (1995)
These resemble Bloom’s taxonomy of cognitive objectives. The sequence of instruction
comprises using an application program, reading the program, and changing the program.
Creating a program may also be added.
4 Create a program
3 Change a program
2 Read a program
1 Run a program
3. Problem solving
Source: Rogalski & Samurçay (1990)
Through solving problems, the students should extend their experience and the basis for the
process is the knowledge structure of the field of programming. Compared to the previously
mentioned approaches, this one stresses the input and outcome of the learning process in terms of
knowledge and personal experience. It is therefore more a model of learning than of a teaching
strategy.
Appendix 2: An illustrative example of a task from the experiment
Task:
Show for all angles on the interval [1,91> that the following equation is true:
sin(x)*sin(x)+cos(x)*cos(x)=1
For your convenience the library math has already been added.
The deductive explicit instruction:
Insert a finite loop from 1 to 91 containing
    calculate variable y = pow(math.sin(x),2)+pow(math.cos(x),2)
    write in one line the angle followed by the sin of x, the cos of x and the value of y
Examples for the deductive implicit learning:
# import the math library
import math
# calculate the sin of an angle of 21 degree and place it in variable x
x = math.sin(21)
# y is the square of sin of x
y =pow(math.sin(x),2)
_______________________________________
# make a loop on the range of 1 to 10
# and within this loop write x (10 times)
for x in range(1,11) :
    print x
_______________________________________
import math
for y in range(1,31) :
    # calculate x = the square of the sin of y
    x = pow(math.sin(y),2)
    print "Angle = ",x," sin=",math.sin(x)," cos=",math.cos(x)," cos(y)^2 sin(y)^2=",y
Gold Standard array
Array('for', 'in range', 'pow(math.sin(x),2)+pow(math.cos(x),2)','print' )
(This array is used for measurements as explained in section 3.4)
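How such an array could be compared with a submission is sketched below in Python 3. This is only an illustration under stated assumptions, not the measurement procedure of section 3.4. The names GOLD_STANDARD, strip_spaces and gold_standard_coverage are hypothetical, and whitespace-insensitive substring matching is an assumed simplification.

```python
import re

# The Gold Standard fragments of this task, copied from the array above.
GOLD_STANDARD = [
    "for",
    "in range",
    "pow(math.sin(x),2)+pow(math.cos(x),2)",
    "print",
]


def strip_spaces(text):
    """Remove all whitespace so that layout differences do not matter."""
    return re.sub(r"\s+", "", text)


def gold_standard_coverage(submission, gold=GOLD_STANDARD):
    """Return the fraction of fragments found in the submission and the list found.

    Assumption: a fragment counts as present when it occurs as a substring
    after whitespace has been removed from both fragment and submission.
    """
    compact = strip_spaces(submission)
    found = [fragment for fragment in gold if strip_spaces(fragment) in compact]
    return len(found) / len(gold), found


if __name__ == "__main__":
    submission = (
        "import math\n"
        "for x in range(1, 91):\n"
        "    y = pow(math.sin(x), 2) + pow(math.cos(x), 2)\n"
        "    print(x, y)\n"
    )
    print(gold_standard_coverage(submission))
```

Plain substring matching is crude: a fragment such as 'for' would also match inside an unrelated identifier, so a real measurement would probably need token-level comparison.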
PSD
Repeat for all integer angle values between 1 and 91
    calculate variable y = sin(x)*sin(x) + cos(x)*cos(x)
    write in one line the angle, sin(x), cos(x), y
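The PSD above can be sketched as a small Python 3 program. This is an illustrative sketch, not the reference solution used in the experiment. The function name identity_rows is hypothetical, and print is used as a Python 3 function whereas the experimental material uses the print statement. The conversion with math.radians is an added assumption, since math.sin and math.cos expect radians. The identity holds for any real x, so y is 1 (up to rounding) with or without the conversion.

```python
import math


def identity_rows(start=1, stop=91):
    """Return (angle, sin, cos, y) for every integer angle in [start, stop)."""
    rows = []
    for angle in range(start, stop):
        # Assumption: angles are given in degrees, so convert to radians first.
        x = math.radians(angle)
        y = pow(math.sin(x), 2) + pow(math.cos(x), 2)
        rows.append((angle, math.sin(x), math.cos(x), y))
    return rows


if __name__ == "__main__":
    # Write in one line the angle, sin(x), cos(x) and y.
    for angle, sin_x, cos_x, y in identity_rows():
        print(angle, sin_x, cos_x, y)
```

The half-open interval [1,91> corresponds to range(1, 91), which yields the 90 integer angles from 1 up to and including 90.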
Appendix 3, Screen-shot of the experiment in the explicit-deductive learning setting
Appendix 3, Screen-shot of the experiment in the implicit-deductive learning setting
Appendix 3, Screen-shot of the experiment in the free Internet learning setting