Evaluating Online and Blended Learning Environments
What does evaluation mean to you?
• Analysis
• Critique
• Judgment
• Feedback
• Audit
• Reflection
• Improvement
• Client perspective
• Satisfaction
Agenda
1. Clarify the challenges of evaluating online and blended learning.
2. Introduce an evaluation model.
3. Present case studies.
4. Engage in a planning exercise.
Planning Exercise Part 1
• Sketch an evaluation plan for a new online course titled "21st Century Communication Skills."
• The course jointly enrolls students from around the globe.
• It is designed with participants from multiple cultures and various fields of study.
"It would be very surprising if even 10 percent of organizations using e-learning actually conducted any well-structured and executed evaluations."
– http://www.alleninteractions.com/
"An evaluation can first and foremost determine whether the distance learning version worked as effectively as, or better than, the standard instructional approach – teaching students face to face."
– The ASTD Distance Learning Handbook
We already know that online learning works as well as face-to-face instruction.
"Despite 50 years of 'no significant differences' between media, people persist in trying to find them."
– Dr. Ken Allen, NYU
Allen, K., Galvis, D., & Katz, R. (2004). Evaluation of CDs and chewing gum in teaching dental anatomy. Journal of Dental Research, 83.
Evaluation Is the Key to Successful Online Learning Initiatives
A major evaluation challenge is determining what your stakeholders will regard as credible evidence.
Evaluation Paradigms
• People hold different evaluation paradigms.
• We should recognize our own and others'.
• Try to avoid paradigm wars.
Experimental (Quantitative) Paradigm
• There are facts with an objective reality that exist regardless of our beliefs.
• The goal of evaluation is to detect the causes of changes in phenomena through measurement and quantitative analysis.
• Experimental designs are best because they reduce "error" which hides the truth.
• Detachment is the ideal state.
Interpretive (Qualitative) Paradigm
• Reality is socially constructed through collective definitions of phenomena.
• The goal of evaluation is to interpret phenomena from multiple perspectives.
• Ethnographic methods such as observation and interviews are best because they provide the basis for sharing interpretations.
• Immersion is the ideal state.
Postmodern (Critical) Paradigm
• Reality is individually constructed based upon experience, gender, culture, etc.
• The goal of evaluation is to improve the status of under-privileged minorities.
• Critical theory that deconstructs phenomena is best because it reveals the "hidden curriculum" or other power agendas in technological innovations.
• Political engagement is the ideal state.
Pragmatic (Eclectic) Paradigm
• Reality is complex, and many phenomena are chaotic and unpredictable.
• The goal of evaluation is to provide decision-makers with the information they need to make better decisions.
• Methods and tools should be selected on the basis of their potential for enhancing the quality of decision-making.
• Wonder and skepticism are the ideal states.
Experimental Evaluation Flaws
• There is over-reliance on comparative designs using inadequate measures.
• "No significant differences" is the most common result in media comparisons.
• Learning is difficult to measure in most cases, especially in higher education.
• We don't know enough about the outcomes of teaching and learning in higher education.
• It is convenient for everyone involved to pretend that high-quality, relevant teaching and learning are occurring.
"Quality" ratings of universities & colleges by commercial entities have enormous impact in the USA today. The criteria used for these rankings are surprisingly dubious.
Film Clip from "Declining by Degrees" by John Merrow and Learning Matters
Interpretive Evaluation Flaws
• Administrators often express disdain for "anecdotal evidence."
• Observations and interviews can be expensive and time-consuming.
• Qualitative interpretations are open to bias.
The Failure of Educational Research
– Vast resources going into education research are wasted.
– They [educational researchers] employ weak research methods, write turgid prose, and issue contradictory findings.
The Failure of Educational Research
– Too much useless work is done under the banner of qualitative research.
– Qualitative research… [yields] …little that can be generalized beyond the classrooms in which it is conducted.
Postmodern Evaluation Flaws
• It is easier to criticize than to propose solutions.
• Extreme subjectivity is not widely accepted, especially outside higher education.
• Whose power agenda should be given precedence?
The Trouble with Postmodernists
• Write critiques in a language inaccessible to decision makers.
• Regard technologies as inherently evil.
Pragmatic Evaluation Flaws
• Requires a larger commitment of resources to the evaluation enterprise.
• Mixed methods can be expensive and time-consuming.
• Sometimes, decision-makers ignore even the best evidence.
The Lesson of the Vasa
http://www.vasamuseet.se/
So what are some better ideas about evaluating online and blended learning courses?
Three core starting points:
• Plan up front.
• Align anticipated decisions with evaluation questions.
• Use multiple criteria and multiple data collection methods.
Decision-Making and Evaluation
• We must make decisions about how we go about designing and using e-learning.
• Information from evaluation is a better basis for decision-making than habit, intuition, superstition, politics, prejudice, or just plain ignorance.
Planning is the key!
• A major challenge is getting stakeholders to identify the decisions they face.
• Clear decisions drive the rest of the planning.
• Evaluation questions emerge from decisions.
• Methods emerge from questions.
Conducting Evaluations - Step 1
• Identify decisions that must be made about e-learning:
  – adopt
  – expand
  – improve
  – abandon
  – reallocate funding
Conducting Evaluations - Step 2
• Clarify questions that must be addressed to guide decisions:
  – Who is enrolled in e-learning and why?
  – How can the CMS (course management system) be improved?
  – What is the impact on access?
  – What is the impact on performance?
Conducting Evaluations - Step 3
• Select methods:
  – Observations
  – Interviews
  – Focus groups
  – Questionnaires
  – Data log analysis (see the sketch below)
  – Expert review
  – Usability studies
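Of these methods, data log analysis is the most readily automated. A minimal sketch, assuming a hypothetical CSV export (activity_log.csv with student_id, activity, and timestamp columns) from a course management system; the file name and column names are illustrative, not part of the original presentation:

```python
import csv
from collections import Counter

def summarize_activity(log_path):
    """Tally logged events per student and per activity type.

    Assumes a hypothetical CSV export with columns:
    student_id, activity, timestamp.
    """
    per_student = Counter()
    per_activity = Counter()
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):
            per_student[row["student_id"]] += 1
            per_activity[row["activity"]] += 1
    return per_student, per_activity

# Example use: flag low-activity students for follow-up interviews.
# students, activities = summarize_activity("activity_log.csv")
# quiet = [s for s, n in students.items() if n < 5]
```

Such counts are most useful when triangulated with interviews or observations, as Step 4 suggests.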
Conducting Evaluations - Step 4
• Collect the data:
  – Triangulate.
  – Revise data collection strategies as needed.
  – Accept limitations, but aim for quality.
  – Be sensitive to demands on all participants.
Conducting Evaluations - Step 5
• Report findings so that they influence decisions in time:
  – Report early and often.
  – Use multiple formats.
  – Engage stakeholders in focus groups.
  – Don't hide complexity.
Selective Criteria for Evaluation
Learning
Consistency
Economy
Safety
Flexibility
Efficiency
Learning (rated Low to High)
In comparison to traditional instructor-led instructional methods, e-learning may show statistically significant, but modest, learning gains as measured by most standardized tests. Developing reliable, valid measures of the most important outcomes is difficult and expensive.
Who do we want our learners to become?
• Better problem-solvers and communicators
• Capable of working collaboratively as well as independently
• Knowledgeable
• Highly skilled
Who do we want our learners to become?
• Experts who possess robust mental models specific to the professions in which they work
• Lifelong learners who value personal and professional development
Developing reliable and valid online tests is expensive. In addition, many, if not most, important outcomes are difficult to assess with traditional measures.
Consistency (rated Low to High)
In comparison to traditional instructor-dependent instructional methods, e-learning can be more consistent, providing each learner with equivalent exposure to content, interactions, and assessment strategies, all of which can be reliably documented.
Economy (rated Low to High)
In comparison to traditional classroom instruction, e-learning can be more economical. Unfortunately, valid examples of ROI evaluations for e-learning are still rare, especially in higher education.
Safety (rated Low to High)
In comparison to many types of laboratory or field activities, e-learning can be safer for both people and equipment. Safety is an increasingly important criterion in higher education as well as in business and industry.
Flexibility (rated Low to High)
In comparison to traditional instructional approaches, e-learning can be more flexible for both instructors and learners. This is clearly an important advantage in business and industry, and increasingly in many higher education contexts as well.
Efficiency (rated Low to High)
In comparison to traditional instructional approaches, e-learning can yield savings of 25 percent or more in the time needed to achieve a given set of objectives. This factor alone justifies its adoption in any organization concerned with efficiency.
[Chart: Approach A and Approach B rated from Low to High on the six criteria: Learning, Consistency, Economy, Safety, Flexibility, and Efficiency; see the sketch below]
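A profile like this can also be captured in a simple data structure for side-by-side comparison. A minimal sketch; the 1 (low) to 5 (high) ratings below are made up purely for illustration, not taken from the presentation:

```python
# Six evaluation criteria from the slides above.
CRITERIA = ["Learning", "Consistency", "Economy", "Safety", "Flexibility", "Efficiency"]

# Hypothetical ratings (1 = low, 5 = high) for two instructional approaches.
ratings = {
    "Approach A": [3, 2, 2, 3, 2, 3],
    "Approach B": [3, 4, 4, 5, 5, 4],
}

for approach, scores in ratings.items():
    profile = ", ".join(f"{c}={s}" for c, s in zip(CRITERIA, scores))
    print(f"{approach}: {profile}")
```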
Bad News
There is no single, easy to administer, inexpensive, reliable, and valid approach to evaluating e-learning.
Oh...no!
Good News
There are practical strategies for documenting the development and use of e-learning, improving it, and building a case for its effectiveness and impact.
Thank goodness!
Many models:
• Objectives-based
• Accreditation
• Kirkpatrick's 4 Levels
• Countenance
• Goal-Free
• Reform
• Naturalistic
• Adversary
• Connoisseurship
• Theory-based
• Fourth Generation
Kirkpatrick's Four Levels
http://coe.sdsu.edu/eet/Articles/k4levels/index.htm
Level 4 (Results) – detect the impact on outcomes
Level 3 (Transfer) – find out if behavior changed
Level 2 (Learning) – assess their learning
Level 1 (Reaction) – measure how they liked it
Phillips's Fifth Level (ROI)
http://www.roiinstitute.net/
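Phillips's fifth level converts program benefits into monetary terms so that a return on investment can be computed. A minimal sketch of the standard formula, ROI(%) = (net program benefits / program costs) × 100, using hypothetical figures rather than any data from the presentation:

```python
def phillips_roi(program_benefits, program_costs):
    """Return ROI as a percentage: (benefits - costs) / costs * 100."""
    net_benefits = program_benefits - program_costs
    return 100.0 * net_benefits / program_costs

# Hypothetical example: a course costing $50,000 that produces
# $80,000 in measured benefits yields a 60% ROI.
print(phillips_roi(80_000, 50_000))  # 60.0
```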
ASTD Survey: percentage of organizations evaluating at each level
• Reaction: 92%
• Learning: 34%
• Transfer: 11%
• Results: 2%
• ROI: ?
Reeves/Hedberg Instructional
Product Evaluation Model (2003)
• Views evaluation as a process that:
– focuses on supporting decision-making
– adopts procedures and tools from many different
fields, e.g., usability testing
– involves innovative reporting approaches
– engages six facets (review, needs assessment,
formative, effectiveness, impact, and maintenance)
• Evaluation functions are keyed to Instructional
Design functions.
Evaluation Function → Development Activity
Review → Conceptualization
Needs Assessment → Design
Formative Evaluation → Development
Effectiveness Evaluation → Implementation
Impact Evaluation → Institutionalization
Maintenance Evaluation → Re-conceptualization
Review
• Ensure that the development team is well-informed about previous work done in the area during the early stages of course or program conceptualization.
• Avoid reinventing the wheel.
Review
• Review related literature.
• Examine competing e-learning courses and programs.
"I can do better than this!"
Needs Assessment
• Identify the critical needs that an e-learning program is supposed to meet.
• Provide essential information to guide the design phase of the development process.
Needs Assessment
• Primary methods: task analysis, job analysis, and learner analysis.
• Yields a list of specific goals and objectives that learners will accomplish through engaging in e-learning.
Formative Evaluation
• Collect information that can be used for making decisions about improving e-learning programs.
• Formative evaluation should be continuous.
Formative Evaluation
• Provided the results are used, formative evaluation usually provides the biggest payoff of all evaluation activities.
• Faculty or sponsors may be reluctant to accept the results of formative evaluation.
Effectiveness Evaluation
• Estimate short-term effectiveness in meeting objectives.
• A necessary, but insufficient, approach to determining outcomes.
Effectiveness Evaluation
• Evaluating implementation is as important as evaluating outcomes.
• You must understand how e-learning programs were actually implemented to interpret results.
"A connection with the server could not be established?"
Impact Evaluation
• Estimate the long-term impact on performance, both intended and unintended.
• It is extremely difficult to evaluate the impact of e-learning courses and programs, but increasingly important.
Impact Evaluation
• Evaluating impact is increasingly critical because of increased emphasis on the bottom line.
• Some managers expect impact evaluation to include "return-on-investment" (ROI) approaches.
Maintenance Evaluation
• Ensure the viability of e-learning courses and programs over time.
• Maintenance is one of the weakest aspects of e-learning, especially in higher education.
Maintenance Evaluation
• Methods include document analysis, interviews, observations, and automated data collection.
• Very few organizations currently engage in serious maintenance evaluation of e-learning initiatives.
USAFA Case Study
• Decisions
• Questions
• Methods
• Results
• Recommendations
Engineering Education
• Problem: cadets were not achieving higher-order outcomes
• Critical outcomes for 21st century graduates of the US Air Force Academy:
  – Frame and resolve ill-defined problems
  – Exhibit intellectual curiosity
  – Communicate with multiple media
  – Enrich mental models of engineering
• Solution: a new ENGR 110 "Introduction to Engineering" blended learning course was developed
• The course was intended to be a showcase for alternative pedagogical dimensions
• The course was intended to take maximum advantage of the technological infrastructure available at USAFA
Pedagogical Dimensions of BL
• Task-oriented: cadets were given three tasks during the semester:
  – Get to Mars
  – Build a research site on Mars
  – Develop a power source on Mars
• Constructionist: cadets created knowledge representations of solutions
• Conversational: cadets joined listservs and other forums to discuss tasks
• Collaborative: cadets worked in teams throughout the course
Pedagogical Dimensions of BL
• Challenging: there were no "correct" solutions to tasks, but lots of wrong ones
• Responsive: faculty and external experts provided multiple levels of guidance and feedback
• Reflective: cadets kept electronic journals and participated in focus groups
• Formative: cadets developed prototypes and refined them over time
• The Web provided rich resources about Mars, space travel, engineering, the Air Force, etc.
• Web tools enabled cadets to collaborate.
• E-mail supported consultation with experts.
• PowerPoint was used to construct knowledge representations.
• Excel, Stella, and other tools afforded problem-solving and modeling.
• Decisions had to be made:
  – After a three-year beta test, should the new course become part of the "core"?
  – How could this type of course be supported after the faculty who created it were gone?
• Evaluation questions:
  – Did students achieve higher-order outcomes?
  – What were the logistical requirements for implementation?
  – How could the blended learning course be improved?
Evaluation methods can be represented using a matrix (a sketch follows below):

[Matrix: rows list the evaluation questions (How is the course implemented? Learner reactions? Enhancements? What learning occurs?); columns list the methods (Observations, Interviews, Problem-Solving Measure)]
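During planning, such a matrix can be kept as a simple mapping from each question to the methods that will address it. A minimal sketch; the original slide's checkmarks are not recoverable, so these question-method pairings are illustrative only:

```python
# Hypothetical assignments of methods to evaluation questions;
# the pairings below are illustrative, not from the USAFA study.
evaluation_matrix = {
    "How is the course implemented?": ["Observations", "Interviews"],
    "Learner reactions?": ["Interviews"],
    "Enhancements?": ["Observations", "Interviews"],
    "What learning occurs?": ["PS Measure"],
}

for question, methods in evaluation_matrix.items():
    print(f"{question:32} -> {', '.join(methods)}")
```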
• A comparative evaluation was conducted using two experimental classes and two control classes with a range of measures:
  – Standardized problem-solving instrument
  – Concept maps
  – Questionnaires
• Interviews and focus groups were employed.
• Intensive observations were conducted.
[Chart: the traditional Engr Mech course contrasted with the new Engr 110 course on the eight pedagogical dimensions: Task-Oriented, Challenging, Collaborative, Constructionist, Conversational, Responsive, Reflective, and Formative]
• Educationally significant differences were found on a standardized measure of problem-solving.
• Concept maps revealed little.
• Observations indicated that the course was very demanding on both cadets and faculty.
Pre- and Post-Course Results
• No pre-course differences between cadets in the new course and those in the control course
• Significant post-course differences between cadets in the new course and those in the control course
• Cadets in the new course improved by a whole standard deviation (a 1-sigma difference; see the sketch below)

[Chart: pre- and post-course scores plotted on a grade scale running from D through S to E]
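The "whole standard deviation" finding corresponds to a standardized mean difference (effect size) of roughly 1.0. A minimal sketch of that computation; the scores below are made up to illustrate the calculation and are not the USAFA data:

```python
from statistics import mean, stdev

def effect_size(treatment, control):
    """Standardized mean difference, using the control group's SD
    as the standardizer (Glass's delta); a pooled SD (Cohen's d)
    is a common alternative."""
    return (mean(treatment) - mean(control)) / stdev(control)

# Hypothetical post-course problem-solving scores (not the USAFA data).
new_course = [74, 80, 76, 82, 78, 72]
control = [60, 72, 68, 80, 74, 66]
print(round(effect_size(new_course, control), 2))  # 1.01
```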
• Other benefits found included:
  – richer mental models
  – improved communication skills
  – enhanced research skills
  – better team skills
• Recommendations:
  – Continue to support the course for two more years
  – Explore extensions of the pedagogical dimensions into other courses
  – Provide more faculty release time
Planning Exercise Part 2
Plan an evaluation of the "21st Century Communication Skills" course by identifying the following:
• Decisions
• Questions
• Methods
Recommendation 1
Embrace the complexity of e-learning, describing it in many ways. And don't yield to the temptation to oversimplify.
Recommendation 2
Render judgment with great care. Whereas description preserves complexity, judgment forces decisions of acceptance or rejection.
Recommendation 3
Keep before you the image of multiple publics ready to eat you alive! Evaluating is always a political activity.
Recommendation 4
Remember that data don't make decisions. People do! Your task is to help people (including yourself) make decisions based on sound information. Evaluate before you decide.
Recommendation 5
Don't confuse measuring with evaluating. And avoid the empirical swamp of traditional media comparisons.
Recommendation 6
Don't ask: Which test should we use?
Ask: What can we count as evidence that learning has occurred?
Recommendation 7
Lastly, prepare to work far into the night. The evaluator's labors are long and difficult.
Thank You!
Professor Emeritus Tom Reeves
The University of Georgia, Instructional Technology
604 Aderhold Hall
Athens, GA 30602-7144 USA
treeves@uga.edu
http://it.coe.uga.edu/~treeves