Report from the 2013 IAEA Conference
Tel Aviv, October 2013
IAEA
International Association for Educational Assessment
The IAEA is committed to improving the quality of
education on a global basis through sharing
• Professional expertise
• Research
• Training
They also produce a research journal.
The theme of the 2013 conference was
Educational Assessment 2.0: Technology in Educational
Assessment
2013 Conference Themes
• The use of technology in large scale assessment
• Assessment design
• How technology can be used to enhance the validity of
assessment
• Trends in the use of technology in assessment
• Test development and measurement
• Automated Essay Scoring
• Digital gaming
2013 Conference Themes
This presentation focuses on two keynote speakers
(Richard Luecht and Randy Bennett), covering four of
those key themes:
• Technology in large scale assessment – computer
based examinations (CBE)
• Assessment design
• Automated Essay Scoring
• Digital gaming
Assessment Design
Richard Luecht
Richard Luecht is a Professor of Educational Research Methodology at the
University of North Carolina at Greensboro. He focused on evidence-based
assessment design for formative assessments, technology-enhanced test design,
computer-based test design, and cognitive science in assessment. He used the
Common Core Standards in the US to illustrate his points.
Some key points
• The student should be at the centre of assessment – not
technology
• Instructionally sensitive assessment – assessment that
should respond to the student’s learning
• Assessment should be on demand with immediate feedback
• Assessment should be engaging and intrinsically motivating
“Just because we can use technology does
not mean we have to or that it makes the
assessment better” Richard Luecht
Computer Based Examinations
• Cito in the Netherlands have been using CBE for over
10 years
• In 2014 it is expected that 30% of their examinations
will be e-assessments
• Developed to encourage more schools to use
technology
• Have introduced a rigorous 6 year programme that
every subject is required to go through in order to
become an e-assessment
Computer Based Examinations
The Cito 6 year programme used a testing and piloting
approach:
• A proof of concept is developed
• The examination is tested in some schools
• Dual assessment is offered
• Finally they move to computer based assessments only – no
paper exams
At each stage there is a go/no go decision.
Each stage also focuses on whether the examination
provides sufficient added value for the student.
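The staged approach above can be pictured as a simple gate that an examination must pass at every step. The following is only an illustrative sketch, not Cito's actual process: the names `StageReview` and `furthest_stage_reached` are hypothetical, and treating "go" and "adds value for the student" as a combined gate is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# The four stages listed on the slide, in order.
STAGES = [
    "proof of concept",
    "tested in some schools",
    "dual assessment offered",
    "computer based only",
]


@dataclass
class StageReview:
    """Outcome of reviewing one stage (hypothetical structure)."""
    stage: str
    adds_value_for_student: bool  # the question asked at every stage
    go: bool                      # the go/no go decision


def furthest_stage_reached(reviews: List[StageReview]) -> Optional[str]:
    """Return the last stage passed, stopping at the first no-go.

    Assumption: a stage is passed only if the decision is "go" and the
    examination is judged to add value for the student.
    """
    reached = None
    for expected, review in zip(STAGES, reviews):
        if review.stage != expected:
            raise ValueError(f"expected review of {expected!r}, got {review.stage!r}")
        if not (review.go and review.adds_value_for_student):
            break
        reached = expected
    return reached
```

An examination that receives a no-go at the dual assessment stage would be reported as having reached "tested in some schools".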
Cito have four conditions for an examination to be digitalised
1. Digitalising the examination adds sufficient value for the
student
2. There is an adequate level of support
3. There is sufficient funding and it is technically feasible
4. There is sufficient security built in
Cito have found that over time, digital examinations enable
alignment to the student’s real world through contextualisation of
the examination and using a medium they prefer.
Cito are also looking at testing skills such as listening and
viewing skills in languages.
Security
A note about security:
• A fallback position is needed in case of network failure
• Items can be encrypted
• The examination can block other items
• User authorisation is needed
• Can introduce a timelock on content
• Must be scalable
CBE - Denmark
Denmark has online assessment and allows candidates to
access the internet (but not for all examinations)
Questions require the candidate to demonstrate:
• in-depth analysis skills,
• presenting a supported perspective,
• critical thinking skills, and
• use of resources.
For example, the Economics examination enables downloading
of examination resources 6 days prior to the examination.
CBE - Israel
Israel’s examinations are matriculation (end of school) examinations.
They introduced Computer Based Examinations in 1999.
• In 2013 online matriculation examinations were offered in English,
History, Bible, Chemistry, Biology, Geography and Bio-technology
• Moved to web based examinations in 2011
• The Bagrut examinations reflect contemporary teaching and
learning methods
• Their examinations make extensive use of simulation based
assessment
Simulation Based Assessment (SBA)
• Context rich
• Allows the candidate to control the stimulus
• Enables analysis of dynamic phenomena
• Enables data development and data processing
Simulation Based Assessment (SBA)
The following slide provides an example of a Physics
question.
• The candidate manipulates the skateboarder to start their
ride from various points on the half pipe. The graph records
the skateboarder’s kinetic energy and potential energy.
Subsequent questions require the candidate to analyse the
results and compare results when other variables are
introduced.
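The physics behind a question of this kind can be sketched in a few lines. This is a worked illustration only, not the simulation used in the examination: it assumes a frictionless half pipe and a skateboarder starting from rest, so that the total mechanical energy stays constant. The function name `energies` and the value of `G` are choices made for the example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def energies(mass_kg: float, start_height_m: float, height_m: float):
    """Return (potential J, kinetic J, speed m/s) at a given height.

    Assumes no friction and a start from rest at start_height_m, so
    potential + kinetic always equals mass * G * start_height_m.
    """
    if not 0 <= height_m <= start_height_m:
        raise ValueError("height must lie between 0 and the start height")
    potential = mass_kg * G * height_m
    kinetic = mass_kg * G * (start_height_m - height_m)
    speed = math.sqrt(2 * G * (start_height_m - height_m))
    return potential, kinetic, speed
```

Starting higher on the half pipe raises the total energy, which is the kind of relationship a candidate would be expected to read off the graph.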
Simulation Based Assessment (SBA)
The following slide provides an example of a
Geography question.
• Candidates were asked to provide a rationale for specific natural disasters that
occurred in Mexico.
• They can use the two animations on the left and open up a series of maps.
Both animations can be altered as the student introduces different
conditions or over time.
• What the slide does not show is that Q3 asks the students to develop a means
of preventing some of the disasters and the question has a series of links to a
series of resources which in turn have links as well. The experience uses the
internet through a secure extranet.
• There is no right or wrong way to answer the question and the candidate has to
use the resources to develop their thesis.
• Data about candidate response time and keystrokes are also gathered
and used by examiners to improve the simulations.
Simulation Based Assessment (SBA)
Some examples of how technology can support the
identification of cognitive processes
• Tracking the steps a student used to get an answer
• Measuring the response time for certain behaviours
• The strategies a student uses when they respond to a
task can also indicate their level of understanding
All of these are measurable through SBA.
SBA enables the examiner to track and assess the thinking
process rather than just the outcome of these processes.
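As a rough illustration of what tracking steps and response times could look like in data terms, the sketch below summarises a log of timestamped candidate actions. It is hypothetical: the `Event` structure, the action names and the summary fields are assumptions for the example, not the format used by any examination system mentioned here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Event:
    """One logged candidate action (hypothetical structure)."""
    timestamp_s: float  # seconds since the task was opened
    action: str         # e.g. "open_map", "change_variable", "submit"


def summarise(events: List[Event]) -> Dict[str, object]:
    """Return the ordered steps, total time and longest pause between steps."""
    ordered = sorted(events, key=lambda e: e.timestamp_s)
    gaps = [b.timestamp_s - a.timestamp_s for a, b in zip(ordered, ordered[1:])]
    total = ordered[-1].timestamp_s - ordered[0].timestamp_s if ordered else 0.0
    return {
        "steps": [e.action for e in ordered],
        "total_time_s": total,
        "longest_pause_s": max(gaps, default=0.0),
    }
```

A summary like this records the path a candidate took as well as the final answer, which is the distinction the slide draws between process and outcome.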
“Embedding a task in realistic scenarios may help
students make the connections between targeted
skills and conditions of use in real-world problems –
this is assessment as learning”
Moshe Decalo (2013), Israel Centre for Educational Technology
Lessons Learned from introducing CBE
• Requires change at all levels
• Adapting existing systems originally designed for paper
based systems can be problematic
• Running dual systems can be cumbersome and
produces some duplication
• Some do not regard Computer Based Examination as
a ‘serious’ examination
Lessons Learned from introducing CBE
• Cito found the support needs of students, markers
etc. decrease rapidly over time.
• Israel and the Netherlands developed their own
software, and found it challenging to continuously
adapt it to meet changing technologies.
• Israel used a geography examination to test whether
paper based examinations (PBE) or CBE provided students
with a distinct advantage and found the difference was
not statistically significant.
Automated Essay Scoring
Automated Essay Scoring (AES) is about the computer marking
an essay or extended prose. The computer ‘learns’ what to
look for in a particular essay.
Automated Essay Scoring (AES)
• There are a number of AES programmes available
• Earlier versions counted words
• Programmes now use sophisticated sets of algorithms
to determine trends and usage of pre-determined
features or traits such as:
o Grammatical conventions – measures error rate, usage,
spelling and capitalisation errors
o Usage – compares the vocabulary usage with that of high or
low quality essays written on the same topic
o Fluency and organisation – measures essay organisation,
discourse elements and the relationship of these discourse
elements, style and sentence variety
o Content – measures vocabulary level and essay length
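To give a feel for what "features" means in this context, here is a deliberately crude sketch that extracts a few surface measures from an essay. It is not how any AES product works: the function name `essay_features` and the chosen measures (length, sentence length, vocabulary diversity) are assumptions for illustration, and real systems use far richer linguistic analysis.

```python
import re
from typing import Dict


def essay_features(text: str) -> Dict[str, float]:
    """Extract a few toy surface features from an essay."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lower-case word tokens (letters and apostrophes only).
    words = re.findall(r"[A-Za-z']+", text.lower())
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "mean_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        # Share of distinct words: a rough proxy for vocabulary range.
        "vocabulary_diversity": len(set(words)) / len(words) if words else 0.0,
    }
```

A scoring model would then be trained to relate measures like these to marks given by human markers, which is the sense in which the computer "learns" what to look for.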
Automated Essay Scoring (AES)
• Research indicates there is a high correlation between
the scores of a human marker and a computer; however, most
of the research has been undertaken by the AES
vendors.
• A computer score is, at best, a prediction. Questions
are raised about the “hidden judgement”, such as
whether the marker ‘likes’ the essay or not. Is
objectivity the best way to assess an essay?
• To date, there have been no significant studies across
different population groups.
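For readers unfamiliar with the statistic, the sketch below shows how a correlation between human and computer scores could be computed. It is a generic Pearson correlation written out by hand; the function name `pearson` and the sample scores in the note that follows are illustrative, not data from any study.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / (spread_x * spread_y)
```

Note that a high correlation does not by itself mean the two markers award the same marks: human scores of 1, 2, 3 and computer scores of 3, 4, 5 correlate perfectly even though every essay is scored two marks apart.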
Automated Essay Scoring (AES)
There are three different models to introduce AES:
1. A human marks, then the computer undertakes quality
assurance
2. Both a human and the computer mark, and then
scores are compared
3. A computer marks, then a human undertakes quality
assurance
Automated Essay Scoring (AES)
• In America, Utah and Louisiana use AES, and Florida
is looking to introduce it this year for a new state wide
writing test.
o Florida will use model 2 (from the previous slide), and use a
second human marker only where there is a significant
difference in the marks given by the first marker and the
computer.
• Most jurisdictions that use AES are not relying solely
on the computer to mark student material.
• The longer term vision is that AES will replace the need
for any writing test as the computer will constantly
assess a student’s writing and can be modified to
provide formative feedback.
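The approach described for Florida, where a human and the computer both mark and a second human is brought in only when their marks differ significantly, can be sketched as a simple routing rule. This is an assumption-laden illustration: the source does not say what counts as a significant difference, so the `threshold` of 2 marks is hypothetical, as are the function names.

```python
from typing import Dict, List, Tuple


def needs_second_marker(human: float, computer: float, threshold: float = 2.0) -> bool:
    """True when the two marks differ by at least the (hypothetical) threshold."""
    return abs(human - computer) >= threshold


def essays_to_remark(
    scores: Dict[str, Tuple[float, float]], threshold: float = 2.0
) -> List[str]:
    """Return the ids of essays whose (human, computer) marks disagree too much."""
    return [
        essay_id
        for essay_id, (human, computer) in scores.items()
        if needs_second_marker(human, computer, threshold)
    ]
```

How the final mark is then decided (for example, by the second human marker alone or by some combination) is not stated in the source, so the sketch stops at identifying which essays need a second look.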
Digital Gaming
Digital Gaming
• Digital games provide students with a natural learning
experience within an informal context.
• Research indicates that students are able to learn
important cognitive and social skills through gaming
• Games engage students for long periods of time
Digital Gaming
What the research has found
• Failing up – games provide a safe environment for
students to fail. Students tend to continue trying until
they succeed. Failure is used as a learning process.
• Games provide the notion of an epic win – they create
a sense of urgency. Stressful but provides a deep
focus on a challenging problem.
• Students are willing to work hard if given the right
challenge
Digital Gaming
What the research has found
• Games provide students with an opportunity to think
“outside the box” – they creatively solve problems.
• Games empower students – they inherently trust the
gaming environment.
• Gamers make sense of their experience together –
collaborative problem solving
Digital Gaming - Examples
Sim City – A player builds a
city, takes on the role of mayor
and has to balance economics
with the happiness of citizens.
They need to ensure the city
has enough power, water,
roads and services (eg, police,
health), and attract businesses
and tourism. They manage
taxes and trade and outgoings
as well as the cost of new
developments.
Digital Gaming - Examples
Civilisation – A player starts
with a basic village and builds
a civilisation of many cities.
They manage resources, fight
off barbarians, maintain
happiness and organise trade,
diplomacy and alliances with
other civilisations, keeping their
civilisation safe, as well as
conquering other civilisations.
Digital Gaming - Examples
Minecraft – allows a player to
build constructions out of textured
cubes in a 3D procedurally
generated world. Other activities
include exploration, gathering
resources, crafting and combat.
Players must find their own building
supplies and food, and find
resources to craft tools while
avoiding moving creatures such as
zombies or giant spiders.
Minecraft is now part of the Swedish
curriculum and is being used as part
of teaching programmes in the UK and
Israel.
Where to from here – Randy Bennett
Randy Bennett is the Norman O. Frederiksen Chair in Assessment
Innovation in the Research and Development Division at Educational
Testing Service. Since the early 1980s his research has focused on
integrating advances in cognitive science, technology and measurement
to create new approaches to assessment.
Key points:
• Education must remain relevant.
• It is changing to include the development of new skills
and to allow individuals to personalise their experience.
• Education is happening any time and anywhere.
Where to from here – Randy Bennett
Randy Bennett outlined 13 considerations:
1. Education assessment must provide meaningful
information to the following groups:
o Education policy makers for effectiveness of the process, and
preparedness of the populations – because they are
accountable for the process.
o Teachers and students for feedback and to plan further
instruction.
o Parents so they understand their children’s progress.
Where to from here – Randy Bennett
2. Must satisfy multiple purposes:
o An assessment built for one purpose won’t necessarily be suited
to other purposes
o Multiple purposes might best be served by different related
assessments designed to work in synergistic ways.
3. Need to use modern conceptions of competency as a
design basis:
o 21st century skills
o Using technology for domain-based problem solving
Where to from here – Randy Bennett
4. Align test and task designs, scoring and interpretation,
with modern conceptions
o Simulations – better replicate the real world contexts under
which the integrated competencies need to be demonstrated
o Discrete tasks
o Automated Essay Scoring in conjunction with human markers
Where to from here – Randy Bennett
5. Adopt modern methods for designing and interpreting
complex assessments
o Creating opportunities to observe performance
o Connecting the observations to meaningful characterisations
6. Tests of the future will take better account of context
o Tests currently ignore the social learning and teaching environment in
an attempt to produce inferences that are generalisable across contexts.
o How a student performs, and the score achieved, is a fact. Why
the student performed that way is an interpretation requiring
knowledge of context.
o If the assessment is embedded into learning, then account of the
context becomes unavoidable.
Where to from here – Randy Bennett
7. Design for fairness and accessibility
o Equal opportunity for individuals is a social value.
o That social value has been reflected in standardized tests to
varying degrees.
o As long as assessments are used, fairness will be an issue.
8. Design for positive impact
o Assessment should be designed to be a valuable learning
experience – assessment as learning – preparing for it,
experiencing it.
o Assessment should model good teaching and learning.
Where to from here – Randy Bennett
9. Educational assessment must be designed for
engagement
o Assessment results are more likely to be meaningful if students
give maximum effort.
o Engagement should be enhanced by posing problems, giving
motivational feedback, using hardware that students prefer, and
using multimedia and gaming elements.
o Embedding assessment into ‘games’.
Where to from here – Randy Bennett
10. Respect privacy
o Assessment information can be gathered ubiquitously, continuously, and by
stealth.
o Individuals must know when they are being assessed and for what purpose.
o Care is needed – could negatively affect teaching and learning if every action
has a consequence
11. Incorporate information from multiple sources
o A single test cannot measure a competency domain with sufficient breadth
or depth
o All assessments have limits
o Results from multiple assessments, using multiple methods, and integrated,
are more likely to provide meaningful characterisation of individuals and
institutions.
Where to from here – Randy Bennett
12. Gather and share validity evidence
o Legitimacy is granted to a consequential assessment programme by the
user community and the scientific community connected to it
13. Use technology to achieve substantive goals
o Technology is used to enhance assessment and through that teaching and
learning
o To do what can’t be done as well (or at all) using traditional testing
methods, such as:
  - measuring existing competencies more effectively and efficiently
  - measuring new competencies, including ones that require use of technology
  - having a positive impact
In conclusion
The conference provided much to consider for the
introduction of computer based assessments:
• Importance of not taking a big bang approach – not all levels
or all subjects at one time
• The need to ensure there is a good infrastructure for
examination development, delivery and marking
• e-Assessment needs to reflect what is happening in the
classroom