Survey of Automated
Assessment Approaches for
Programming Assignments
Gayathri Subramanian
Spring, 2012 - Reinventing etextbook - Virginia Tech – Computer Science

Reference Papers
1. ‘A Survey of Automated Assessment Approaches for Programming Assignments’ by Kirsti M. Ala-Mutka (1995 – 2005).
2. ‘Review of Recent Systems for Automatic Assessment of Programming Assignments’ (2005 – 2010).

Outline
• Introduction and Motivation
• Static and Dynamic Assessment Techniques
• Features of a Good Automated Assessment System
• Automatic vs. Semi-Automatic Approaches
• Summative vs. Formative Approaches
• Conclusion

Motivation
• Programming courses are an integral part of Computer Science and Software Engineering curricula.
• Proficiency in a programming language comes with practice.
• Programming courses are large and impose a heavy workload on teachers.
• Even small programs typically have a large number of possible execution paths.
• Research suggests that it is not possible to grade students’ programs consistently and thoroughly without automated assistance.
• Programs can be automatically assessed!

Motivation (contd.)
• New automated systems are being created every year.
• Many systems share common features.
• Systems exist that satisfy most assessment needs.
• Far fewer systems are widely adopted than there are papers about them.
• A literature survey helps teachers identify the tool they are looking for.

Which features of a program can be automatically assessed, and which tools support them?

Static & Dynamic Assessment of the Code
• Programs follow a defined syntax and semantics, which makes automatic assessment feasible.
• Extract some kind of measurement value (justified by the teaching goals) from a program and compare it against the given requirements.
• Some features require execution of the program; others are evaluated statically.
• Functionality, efficiency, and testing skills are assessed dynamically.
• Coding style, programming errors, software metrics, and design can be assessed statically.

What Features of a Program Can Be Automatically Assessed?
Dynamic Assessment:
• Functionality

Functionality (Dynamic Assessment)
• Running the program against test cases [Course Marker, HOGG, BOSS, Online Judge]
• Success depends on test case design and the model solution.
• Coverage analysis (function, statement, decision) measures how effective the test cases are.
• Correlated test cases: a test case is defined with a planned relationship to the program state created by the previous test input. [Quiver]
• There should be a certain degree of freedom in representing the model solution. [Assyst uses pattern matching; Course Marker uses regular expressions]
• Course Marker checks the return status of the program.
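
The test-case approach above can be sketched roughly as follows. This is a minimal illustration, not the implementation of any tool named on the slide: the helper `run_test_case`, the inline "student" program, and the test patterns are all hypothetical. It runs a program on an input, checks the return status, and matches the output against a regular expression so that harmless formatting differences are tolerated.

```python
import re
import subprocess
import sys


def run_test_case(command, stdin_text, expected_pattern, timeout=5.0):
    """Return True if the program exits with status 0 and its output matches."""
    try:
        result = subprocess.run(
            command, input=stdin_text, capture_output=True,
            text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False  # treat a hanging program as a failed test
    if result.returncode != 0:
        return False  # a non-zero return status counts as a failure
    # A regular expression tolerates variation in wording and spacing.
    return re.fullmatch(expected_pattern, result.stdout.strip()) is not None


# Hypothetical submission: reads two integers and prints their sum.
STUDENT_SOURCE = "a, b = map(int, input().split()); print('Sum =', a + b)"
COMMAND = [sys.executable, "-c", STUDENT_SOURCE]

# Each test case pairs an input with a pattern for acceptable output.
TEST_CASES = [
    ("2 3\n", r"(?i)sum\s*[:=]?\s*5"),
    ("-1 1\n", r"(?i)sum\s*[:=]?\s*0"),
]
results = [run_test_case(COMMAND, text, pattern) for text, pattern in TEST_CASES]
print(results)
```

The quality of such a check still depends on the test cases: a pattern that is too loose accepts wrong answers, and one that is too strict rejects correct programs over formatting.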

Functionality (contd.)
• Assessing the functionality of a single function or method [Quiver for Java, Scheme-robo for Scheme]
• Assessing the functionality of a program with a GUI requires a means to monitor the actions and responses communicated through the user interface. [JEWL, a library for Java, provides students with a GUI and lets teachers manage the program's events and its output actions]

What Features of a Program Can Be Automatically Assessed?
Dynamic Assessment:
• Testing Skills

Testing Skills (Dynamic Assessment)
• Testing is an essential phase of program development.
• Students submit test data sets along with the programming assignment.
• Assyst was the first tool to assess student test data. The assessment was based on code coverage analysis.
• Chen (2004) assesses the student test suite by running a set of buggy instructor programs against it.
• Web-CAT: when a student submits a test data set, it assesses how well the tests cover the different execution paths.
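
The idea of grading a test suite by running it against deliberately buggy programs can be sketched as below. This is an illustrative toy, not Chen's system: the reference function, the two buggy variants, and the scoring rule (fraction of buggy variants that the suite catches) are assumptions made for the example.

```python
def reference_max(values):
    """Correct instructor solution."""
    return max(values)


def buggy_ignores_last(values):
    """Seeded bug: never looks at the final element."""
    return max(values[:-1]) if len(values) > 1 else values[0]


def buggy_returns_first(values):
    """Seeded bug: assumes the first element is the largest."""
    return values[0]


def suite_passes(implementation, tests):
    """A suite passes if every (input, expected) pair holds."""
    return all(implementation(list(data)) == expected for data, expected in tests)


def bug_detection_score(tests, reference, buggy_versions):
    """Fraction of buggy versions on which at least one test fails."""
    if not suite_passes(reference, tests):
        return 0.0  # the tests themselves are wrong
    caught = sum(1 for buggy in buggy_versions if not suite_passes(buggy, tests))
    return caught / len(buggy_versions)


BUGGY = [buggy_ignores_last, buggy_returns_first]
strong_suite = [([1, 2, 3], 3), ([3, 1, 2], 3)]
weak_suite = [([3, 1, 2], 3)]

strong_score = bug_detection_score(strong_suite, reference_max, BUGGY)
weak_score = bug_detection_score(weak_suite, reference_max, BUGGY)
print(strong_score, weak_score)
```

The weak suite only uses an input whose maximum comes first, so neither seeded bug is exposed; the strong suite adds an input where the maximum comes last.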

What Features of a Program Can Be Automatically Assessed?
Dynamic Assessment:
• Efficiency

Efficiency (Dynamic Assessment)
• A simple efficiency measurement is the running time of the program, measured either as wall-clock time or as CPU time used. [Assyst, Online Judge]
• Efficiency measurements can be distorted by different implementations of input/output actions.
• A simple solution is to offer a common input/output module for use in assignments. [Hansen and Ruuska]
• Efficiency can also be assessed by studying the execution behavior of different structures inside the program.
• This is done by counting how many times certain blocks and statements are executed and comparing the results with the values obtained from the model solution. [Assyst, Course Marker]
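
Both measurements can be sketched together. In this hypothetical example the loop bodies are instrumented by hand with a `Counter` object, CPU time is taken with `time.process_time`, and the student's step count is compared with the model solution's; the tolerance factor of 4 is an arbitrary assumption, and real tools would instrument the code automatically.

```python
import time


class Counter:
    """Counts how many times an instrumented block is executed."""

    def __init__(self):
        self.count = 0

    def tick(self):
        self.count += 1


def model_contains(sorted_values, target, counter):
    """Model solution: binary search."""
    lo, hi = 0, len(sorted_values)
    while lo < hi:
        counter.tick()
        mid = (lo + hi) // 2
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(sorted_values) and sorted_values[lo] == target


def student_contains(sorted_values, target, counter):
    """Hypothetical submission: correct, but a linear scan."""
    for value in sorted_values:
        counter.tick()
        if value == target:
            return True
    return False


def measure(func, *args):
    """Return (result, executed steps, CPU seconds) for one call."""
    counter = Counter()
    start = time.process_time()
    result = func(*args, counter)
    cpu_seconds = time.process_time() - start
    return result, counter.count, cpu_seconds


data = list(range(1024))
model_result, model_steps, _ = measure(model_contains, data, 1023)
student_result, student_steps, _ = measure(student_contains, data, 1023)

# Assumed rule: accept up to 4 times the model solution's step count.
within_budget = student_steps <= 4 * model_steps
print(model_steps, student_steps, within_budget)
```

Step counts are deterministic, which makes them easier to grade against than raw timings that vary with machine load.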

What Features of a Program Can Be Automatically Assessed?
Dynamic Assessment:
• Language-Specific Features

Language-Specific Features (Dynamic Assessment)
• Language-specific implementation issues can be difficult to learn and assess.
• Students often misuse memory management, for example by failing to deallocate all reserved memory blocks.
• [Tutnew] A C++ library that overrides the normal memory management methods and can thus provide runtime assessment of a program's memory usage.
• The test cases affect the coverage of this assessment, since they define the execution paths taken by the program.
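
The bookkeeping idea behind such a library can be illustrated with a toy allocator. The slides do not describe how Tutnew works internally, so the class and method names below are hypothetical and the sketch only shows the principle: record every reservation and release, then report what is still reserved when the run ends.

```python
class TrackingAllocator:
    """Toy allocator that records which blocks are still reserved."""

    def __init__(self):
        self._next_id = 0
        self._live = {}  # block id -> size in bytes

    def allocate(self, size):
        block_id = self._next_id
        self._next_id += 1
        self._live[block_id] = size
        return block_id

    def release(self, block_id):
        if block_id not in self._live:
            raise ValueError(f"block {block_id} released twice or never allocated")
        del self._live[block_id]

    def leaked_bytes(self):
        """Total size of blocks that were never released."""
        return sum(self._live.values())


def student_routine(allocator):
    """Hypothetical submission that forgets to release one block."""
    first = allocator.allocate(64)
    second = allocator.allocate(128)
    allocator.release(first)
    # 'second' is never released, so it shows up as a leak.


allocator = TrackingAllocator()
student_routine(allocator)
leak = allocator.leaked_bytes()
print(leak)
```

Only the code paths that the test cases actually execute are checked, which is why test design affects the coverage of this kind of assessment.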

What Features of a Program Can Be Automatically Assessed?
Static Assessment:
• Coding Style

Coding Style (Static Assessment)
• Programming style (coding style) and its connection to readability, maintainability, etc.
  • Typographical: e.g. indentation, placement of parentheses, maximum line length.
  • Syntactic: e.g. every switch statement should have a default branch, and each case branch should end with a break statement.
  • Semantic: e.g. class names begin with a capital letter, and each declared variable should be used in the program.
  • Logical: issues related to the logical structure of the program, e.g. loops should not be nested too deeply, methods should not have a huge number of parameters, and global variables should not be used as method parameters.
• Making use of compilers and their warning capabilities
• The GCC compiler can provide feedback on unused variables, implicit type conversions, and language features that do not follow the language standard, among other things.

Coding Style (contd.)
• Checkstyle is open-source software for checking Java programs and can be integrated into several programming environments. It checks:
  • Comments for classes, attributes, and methods
  • Naming conventions for variables and methods
  • Number of parameters passed to a function
  • Duplicated code sections
  • Good practices of class construction
  • Complexity measurements of expressions
• Style++ is another tool, developed for assessing quality factors of C++ programs.
• An automatic system, PASS, has been implemented to assess these issues in programs written in Ada, C, and Java.
http://en.wikipedia.org/wiki/Checkstyle
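
Two of the style rules mentioned above (a typographical one and a naming one) can be checked with very little code. The sketch below is a line-based approximation, not how Checkstyle, Style++, or PASS work: the 80-character limit and the regular expression are assumptions, and a real checker would parse the program rather than scan lines.

```python
import re

MAX_LINE_LENGTH = 80  # assumed limit for this example
CLASS_DECLARATION = re.compile(r"\bclass\s+([A-Za-z_]\w*)")


def check_style(source):
    """Return (line number, message) pairs for each style violation found."""
    warnings = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Typographical rule: maximum line length.
        if len(line) > MAX_LINE_LENGTH:
            warnings.append((number, "line too long"))
        # Naming rule: class names begin with a capital letter.
        match = CLASS_DECLARATION.search(line)
        if match and not match.group(1)[0].isupper():
            warnings.append((number, "class name should start with a capital letter"))
    return warnings


# Hypothetical Java-like submission.
SAMPLE = "class account {\n    int balance;\n}\nclass Bank {\n}\n"
warnings = check_style(SAMPLE)
print(warnings)
```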

What Features of a Program Can Be Automatically Assessed?
Static Assessment:
• Programming Errors

Programming Errors (Static Assessment)
• Some errors and suspicious code fragments can be recognized statically.
• Static checks can recognize several typical error types made by students, e.g. mistakes in updating a loop control variable or inconsistencies between a parameter's type and its usage.
• Xie and Engler (2002) used code redundancies to detect errors. By implementing a tool that detects idempotent operations, redundant assignments, dead code, and redundant conditionals, they were able to find several errors in the well-known Linux source code.
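
Two of the checks named above can be sketched with Python's standard `ast` module, here applied to Python source for convenience. This is a simplified illustration and not the tool of Xie and Engler: it flags self-assignments (an idempotent operation) and `while` loops whose condition variables are never assigned in the loop body. The sample program is hypothetical.

```python
import ast


def find_suspicious(source):
    """Return sorted (line number, message) pairs for suspicious constructs."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Idempotent operation: a variable assigned to itself.
        if (isinstance(node, ast.Assign)
                and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)
                and isinstance(node.value, ast.Name)
                and node.targets[0].id == node.value.id):
            findings.append((node.lineno, "self-assignment"))
        # Loop control variable that is never updated inside the loop.
        if isinstance(node, ast.While):
            condition_names = {
                n.id for n in ast.walk(node.test) if isinstance(n, ast.Name)
            }
            assigned_names = {
                n.id
                for statement in node.body
                for n in ast.walk(statement)
                if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
            }
            if condition_names and not (condition_names & assigned_names):
                findings.append((node.lineno, "loop variable never updated"))
    return sorted(findings)


SAMPLE = "\n".join([
    "i = 0",
    "total = 0",
    "while i < 10:",
    "    total = total + i",
    "total = total",
])
findings = find_suspicious(SAMPLE)
print(findings)
```

Checks like these produce warnings rather than verdicts: a flagged loop may still terminate through a `break`, so the findings are best shown to the student or teacher as hints.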

What Features of a Program Can Be Automatically Assessed?
Static Assessment:
• Software Metrics

Software Metrics (Static Assessment)
• Software metrics are general measures that characterize the program.
• Hung, Kwok and Chan (1993) studied different metrics on programming assignments and concluded that the number of code lines was a good measure of students’ programming skill.
• Metrics are obtained by counting different attributes, such as the number of operators and operands in a program, or its control structures.
• Metrics can serve as clear indicators of student performance and also as possible indicators of needs for instructional development.
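
Counting such attributes is straightforward once the program is parsed. The sketch below uses Python's `ast` module on Python source; which node types count as "operators", "operands", and "control structures" is a simplifying assumption made for this example and not a standard metric definition.

```python
import ast


def compute_metrics(source):
    """Count a few simple attributes of a program."""
    nodes = list(ast.walk(ast.parse(source)))
    return {
        # Non-blank lines of code.
        "code_lines": sum(1 for line in source.splitlines() if line.strip()),
        # Expression nodes that apply an operator.
        "operators": sum(
            isinstance(n, (ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.Compare))
            for n in nodes
        ),
        # Names and literal constants.
        "operands": sum(isinstance(n, (ast.Name, ast.Constant)) for n in nodes),
        # Branching and looping statements.
        "control_structures": sum(
            isinstance(n, (ast.If, ast.For, ast.While)) for n in nodes
        ),
    }


SAMPLE = "\n".join([
    "total = 0",
    "for i in range(10):",
    "    if i % 2 == 0:",
    "        total = total + i",
])
metrics = compute_metrics(SAMPLE)
print(metrics)
```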

What Features of a Program Can Be Automatically Assessed?
Static Assessment:
• Design

Design (Static Assessment)
• Teachers often need to assess whether submitted programs conform to given interface or structural requirements.
• Thorburn and Rowe (1997) implemented a system that automatically recognizes the functional structure of a C program. They call it the ‘‘solution plan’’ of the program and compare it with the solution plan of the model program, or with a set of possible plans.
• Truong, Roe and Bancroft (2004) implemented a structural similarity analysis that transforms a student’s program into an XML representation and compares it with a set of model solutions.
• MacNish (2000) used Java reflection to analyze whether class interfaces and method signatures in students’ Java programs met the given requirements.
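
A reflection-based interface check can be sketched in Python with the standard `inspect` module, as an analogue of the Java reflection approach rather than a reproduction of MacNish's system. The required interface and the `StudentStack` class are hypothetical.

```python
import inspect

# Assumed assignment specification: method name -> parameter names.
REQUIRED_METHODS = {
    "push": ["self", "item"],
    "pop": ["self"],
    "is_empty": ["self"],
}


class StudentStack:
    """Hypothetical submission with one wrong signature and one missing method."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self, default=None):
        return self._items.pop() if self._items else default


def check_interface(cls, required):
    """Return a list of mismatches between a class and the required interface."""
    problems = []
    for name, expected in required.items():
        method = getattr(cls, name, None)
        if not callable(method):
            problems.append(f"missing method: {name}")
            continue
        actual = list(inspect.signature(method).parameters)
        if actual != expected:
            problems.append(f"{name}: expected parameters {expected}, found {actual}")
    return problems


problems = check_interface(StudentStack, REQUIRED_METHODS)
for problem in problems:
    print(problem)
```

Checking the interface first is useful because functional tests written against the specified signatures would otherwise fail with errors that say little to the student.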

What Features of a Program Can Be Automatically Assessed?
Static Assessment:
• Language-Specific Features

Language-Specific Features
• Search for specific keywords chosen according to the teaching goal.
• In Scheme, assess whether the program structure is purely functional by searching for the primitives set!, set-car!, and set-cdr!.
• A more flexible approach has been implemented in Ceilidh (Foxley, 1999): regular expressions are defined and searched for in the student’s program code.
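
The Scheme check above amounts to a text search, which can be sketched with a regular expression. The pattern and the sample programs are assumptions for illustration; a plain text search like this would also match occurrences inside comments or strings, which a real checker would need to exclude.

```python
import re

# Matches "(set! ...", "(set-car! ..." and "(set-cdr! ..." forms.
MUTATOR_CALL = re.compile(r"\(\s*(set-car!|set-cdr!|set!)(?=\s)")


def find_mutators(scheme_source):
    """Return the mutating primitives used in the source, in order of appearance."""
    return MUTATOR_CALL.findall(scheme_source)


# Hypothetical submissions.
IMPURE = "(define (bump! counter)\n  (set! counter (+ counter 1)))"
PURE = "(define (square x) (* x x))"

impure_hits = find_mutators(IMPURE)
pure_hits = find_mutators(PURE)
print(impure_hits, pure_hits)
```

A submission passes the "purely functional" requirement in this sketch when the returned list is empty.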

Essential Features of an Automatic Assessment Tool for a Programming Course

Automated Administration
• AA is a means of administering submission, grading, and general information delivery.
• Benefits of automated administration:
  • An efficient way to track student progress and to recognize needs for improvement in the course.
  • Peer reviewing becomes feasible: students comment on each other’s programs.

Plagiarism Detection
• Computer programs are text files that are easy to copy.
• Detection from the structural information of the program:
  • MOSS is based on document fingerprinting.
  • JPLAG uses string tokenization with substring pattern matching.
• Attribute-counting mechanisms:
  • Verco and Wise (1996) compared automated tools based on attribute-counting mechanisms.
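
The structural approach can be illustrated with a much-simplified sketch: identifiers and numbers are replaced by placeholder tokens so that renaming variables has no effect, and two programs are compared by the overlap of their token k-grams. This is neither the MOSS nor the JPLAG algorithm; the keyword list, the k-gram length of 4, and the Jaccard similarity measure are assumptions chosen for the example.

```python
import re

KEYWORDS = {"int", "for", "while", "if", "else", "return"}  # illustrative subset


def normalize(source):
    """Tokenize and replace identifiers and numbers with placeholders."""
    tokens = []
    for token in re.findall(r"[A-Za-z_]\w*|\d+|\S", source):
        if token in KEYWORDS:
            tokens.append(token)
        elif token[0].isalpha() or token[0] == "_":
            tokens.append("ID")
        elif token.isdigit():
            tokens.append("NUM")
        else:
            tokens.append(token)
    return tokens


def k_grams(source, k=4):
    """Set of all runs of k consecutive normalized tokens."""
    tokens = normalize(source)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def similarity(first, second, k=4):
    """Jaccard similarity of the two k-gram sets (0.0 to 1.0)."""
    grams_first, grams_second = k_grams(first, k), k_grams(second, k)
    union = grams_first | grams_second
    return len(grams_first & grams_second) / len(union) if union else 0.0


ORIGINAL = "int sum = 0; for (int i = 0; i < n; i++) sum += i;"
RENAMED = "int total = 0; for (int k = 0; k < count; k++) total += k;"
UNRELATED = "while (x > 0) { x = x / 2; }"

copy_score = similarity(ORIGINAL, RENAMED)
other_score = similarity(ORIGINAL, UNRELATED)
print(copy_score, other_score)
```

A high score only marks a pair for human review; short assignments with one obvious solution naturally produce similar submissions.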

Resubmission Policies
• Resubmissions are needed so that students can improve their answers.
• The resubmission policy should prevent the trial-and-error strategy used by some students:
  • Limit the number of submissions
  • Limit the amount of feedback
  • Compulsory time penalty
  • Make each exercise slightly different
  • Programming contest approach [Mooshak]
  • Combination of limited and unlimited submissions, based on test cases
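
Two of these policies can be combined in a few lines. The sketch below is hypothetical: it interprets the "compulsory time penalty" as a minimum waiting time between submissions, which is an assumption, and the limits of 3 submissions and 600 seconds are arbitrary example values.

```python
from dataclasses import dataclass


@dataclass
class ResubmissionPolicy:
    """Limits the number of submissions and enforces a wait between them."""

    max_submissions: int = 3         # assumed limit
    min_wait_seconds: float = 600.0  # assumed waiting time

    def may_submit(self, previous_times, now):
        """previous_times: ascending timestamps (seconds) of earlier submissions."""
        if len(previous_times) >= self.max_submissions:
            return False  # submission limit reached
        if previous_times and now - previous_times[-1] < self.min_wait_seconds:
            return False  # still inside the compulsory waiting time
        return True


policy = ResubmissionPolicy()
first_attempt = policy.may_submit([], now=0)
too_soon = policy.may_submit([0], now=100)
after_wait = policy.may_submit([0], now=700)
limit_reached = policy.may_submit([0, 700, 1400], now=5000)
print(first_attempt, too_soon, after_wait, limit_reached)
```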

Sandboxing
• Programming assignments are graded by running the code on a server, so it is important to protect the server from malicious code and from unintended bugs and flaws.
  • Use existing approaches such as the Linux security model, chroot, or the Java security policy to run code securely
  • Use static analysis to filter out malicious code
  • Grade on the client side
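
The static-analysis option can be sketched as a scan for imports and calls that a submission should not need. The block lists below are assumptions for illustration, and the check targets Python submissions. A filter like this can be bypassed, for example through dynamically constructed imports, so it should be treated as a first screen in front of real isolation rather than as a sandbox.

```python
import ast

BLOCKED_MODULES = {"os", "subprocess", "socket", "shutil"}  # assumed block list
BLOCKED_CALLS = {"eval", "exec", "__import__", "open"}      # assumed block list


def find_violations(source):
    """Return sorted (line number, message) pairs for blocked imports and calls."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            # Compare only the top-level package, e.g. "os" for "os.path".
            if name.split(".")[0] in BLOCKED_MODULES:
                violations.append((node.lineno, f"import of {name}"))
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BLOCKED_CALLS):
            violations.append((node.lineno, f"call to {node.func.id}"))
    return sorted(violations)


# Hypothetical submission; it is only parsed here, never executed.
SUBMISSION = "import os\ndata = eval(input())\nprint(data)\n"
violations = find_violations(SUBMISSION)
print(violations)
```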

Open Sourcing
• A survey by Pears et al. (2007) reported that tools were the single largest group among the papers; the other categories were curricula, pedagogy, and programming languages.
• Many of these systems share common features, and systems exist that fulfil most assessment needs.
• New automatic assessment systems are being created every year.

Automatic vs. Semi-automatic Assessment
• The quality of automatic feedback may not be as high as that of feedback given by an instructor.
• Not all issues related to good programming can be assessed automatically.
• A hybrid approach uses automation alone for small assignments and combines manual and automatic assessment for larger assignments.
• [Advantages] It gives teachers more time to concentrate on the demanding assessment tasks and also makes it possible to double-check the results of the automatic assessment.

Formative vs. Summative Assessment
• Formative assessment: allows resubmission to help students improve their answers based on feedback. [A complete program must be submitted on the first attempt, except in Web-CAT]
• Summative assessment: [BOSS] can be used in homework assignments and online examinations.

Conclusion
• Benefits are numerous!
  • Immediate feedback to students
  • 24-hour availability
  • Objectivity and consistency of the evaluation
  • More practice for students
• Some features of a program can only be assessed automatically, and some cannot be assessed automatically at all. A hybrid approach may be useful.
• Tool-specific issues
  • Setting up configuration files may be time-consuming.
  • Specifications should be unambiguous.
  • Effectiveness also depends on the test cases.
• If similar tool approaches are used, good assignments and their assessment routines can be stored and reused.
• Tools should be made widely available!
Thank You