Teaching and Learning Research Design Guide: First Edition

Version Date:
August 6, 2015
Authors:
Gregory Hum
PhD Candidate, Faculty of Education, Simon Fraser University
ghum@sfu.ca
Jack Davis
PhD Candidate, Department of Statistics and Actuarial Science, Simon Fraser University
jackd@sfu.ca
Creative commons license
This work is licensed under the Creative Commons Attribution-NonCommercial 4.0
International License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-nc/4.0/.
Suggested citation (APA)
Hum, G., & Davis, J. (2015). Teaching and Learning Research Design Guide: First
Edition. Retrieved from
http://www.researchprism.com/roadmap/tlr_guide_1st_edition.pdf
Page 1 of 35
Author’s notes
This document originates from our work at the Institute for the Study of Teaching and
Learning in the Disciplines (http://www.sfu.ca/istld.html) as a means of documenting
what we learned while advising Teaching and Learning projects that were part of the
Simon Fraser University Teaching and Learning Development Grants
(http://www.sfu.ca/tlgrants.html). As this informal documentation grew, we found
significant interest in a more formal version, both from our immediate colleagues and
from others with whom we discussed our work.
Acknowledgements
The further development of this document was supported in part by funding through the
Institute for the Study of Teaching and Learning in the Disciplines. We would also like to
acknowledge and thank those who provided feedback: Cheryl Amundsen, Lannie
Kanevsky, Cindy Xin, Andrew Wylie, Angela McLean, and Veronica Hotton.
This work continues through the following efforts:
Teaching and Learning Research Roadmap: A practical guide to conducting
applied research on learning and instruction (Gregory Hum, Jack Davis)
This documentation continues and expands the original work into a broader
conceptualisation of project designs as well as linking design with analytic procedures
through revised documentation and an interactive website.
www.researchprism.com/roadmap/
This interactive website is a supplement to the aforementioned documentation and aims
to help researchers in a variety of contexts design projects to study learning and
instruction.
Teaching and Learning Research handout (Institute for the Study of Teaching
and Learning in the Disciplines)
This document is an adaptation of the work described here. Its content is tailored for,
and will continue to be developed by, the members of the Institute for the Study of
Teaching and Learning in the Disciplines to support the Teaching and Learning
Development Grants.
Introduction
The overall goal of this document is to help you approach questions related to teaching
and learning, with a focus on evaluating student learning. It is primarily aimed at those
who are already experienced in research but who may not have experience with the
issues and methods of teaching and learning research (TLR). The aim is to provide a
user-friendly guide that helps you connect your existing research and disciplinary
knowledge with the knowledge needed for TLR.
This guide provides a range of possible methods, data sources, and analyses to help
you consider how your existing knowledge might be applied to TLR questions, as well
as some potential new options to consider. It aims to help you make informed and
practical choices about the design of your project, so that you get useful results to
inform your own work and to share with others.
This document is informed both by our own experiences with this work and by the
documented experiences of previous Teaching and Learning Development grant
recipients; where possible, we have provided examples from these projects. This is an
ongoing work that we intend to expand, and we would be happy to hear about your
experiences and insights. Our contact information appears at the beginning of this
document.
General Guidelines
There are two overlapping purposes for Teaching and Learning Research (TLR). You
should consider your relative interests/emphases for each research question and your
project as a whole.
Exploratory purpose questions are open and detailed. These are questions pertaining to
how or why something worked, with few prior assumptions or guesses, allowing for
“emergent” findings. For example: What worked or did not work about the discussion
activity you designed? What might be helpful or surprising for someone else adopting
this activity to know, or what would you change next time? This purpose is most often
associated with formative evaluation and qualitative research.
This type of research rarely aims for generalisability. Instead, findings from exploratory
research speak primarily to your particular course or situation. Even without full
generalisability, in-depth information about a particular subject of study can nonetheless
be informative for others to consider and/or adapt to their own work/contexts,
particularly if their own work or contexts share similarities with your own descriptions.
Testing, by contrast, is narrower: it aims to determine whether something “worked”,
often by comparison (e.g., did students who did a discussion activity learn more than
those who only had a lecture?) or by determining whether a pre-established hypothesis
or idea was correct (e.g., watching videos increases student learning). This purpose is
most associated with summative evaluation and quantitative research.
Testing research is most effective after substantial prior information has been gathered,
so that the most useful questions are being asked. A testing project may emerge from,
and be informed by, the results of an exploratory project. The ultimate goal for some
with this intent is “generalisability”, that is, the ability to claim that your findings apply to
groups beyond those you studied (e.g., math students at other universities, all students
in Canada, all instructors in the world). In practice, generalisability is often difficult to
realise within teaching and learning research projects due to constraints discussed in
this document.
Many projects will use some combination of the testing and exploratory approaches,
but the relative emphasis may differ based on the project’s focus, your intentions, and
practical constraints. While it is possible to have an exclusively testing-oriented project,
it is recommended that you incorporate an exploratory element into any testing project
to help provide detail to your findings.
General challenges to consider:
In most cases, you will be conducting research on a classroom. Much research,
especially quantitative research and controlled experimentation, emphasises controlled
and simplified contexts/treatments. However, classrooms are complex, “natural”
environments, not laboratories where goals and tasks can be simplified and conditions
well controlled. The following are just some of the ways that “natural” learning
environments differ from controlled laboratory settings. Keep in mind how they may
influence/bias your results and how you might account for them in your design and/or
interpretations.
• Individual students and class cohorts (e.g., year to year or term to term) can differ
radically; some have more or less prior knowledge and/or motivation, for example.
• Instructors heavily influence teaching and learning. You may influence the results
through your own expectations or changed level of motivation. Different instructors
will also invariably teach differently in subtle and not-so-subtle ways that can
influence outcomes (e.g., are different TAs teaching different tutorial sections? In
what ways are the TAs and the tutorial sections different?).
• The classroom is not a laboratory. Many things influence teaching and learning in
the classroom, most of which are not easily accounted or controlled for (e.g., time of
the course, other courses taken in the same semester, lighting, student cohort
differences, etc.).
• To learn more, you can read about the Hawthorne effect, the placebo effect, or the
John Henry effect, which are known versions of these issues:
http://en.wikipedia.org/wiki/John_Henry_effect
http://en.wikipedia.org/wiki/Hawthorne_effect
Note that there are statistical and research-design means of mitigating these effects.
Some of these are described later in this document.
Choosing data sources
Each data source is best for certain research methodologies and purposes and has
particular challenges. To inform your selection of the types of data you will collect, think
carefully about how each data source contributes to one of your research questions and
how/whether the findings will be useful. Think about what the findings might “look like”,
what different results might mean, and whether they would be meaningful. Some
questions to consider:
• How will the data I collect answer/inform my specific questions?
• What are some ways I expect the findings to “turn out”, and what will different
results mean?
• How will the findings support improvement in this course or in other courses?
• How will the findings influence my approach to teaching and my understanding of
student learning in this course or other courses?
• How can the findings inform future decisions and/or designs?
• Who will read the findings, and what would I like them to take away?
• What data sources have been used in similar projects in the past?
Some things to think about:
Sample size
Some data sources work best with large numbers, especially when the intent is to
summarise attributes or study larger patterns (for example, quantitative surveys or tests
intended to compare measures between groups). The definition of a “large” sample is
subjective, but 30 subjects in each group of interest is a good minimum cut-off for
statistical conclusions to be reasonably accurate; quantitative analyses of smaller
samples can sometimes be deceptive.
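To make the small-sample caution concrete, here is a minimal Python sketch (the class data is invented for illustration) comparing how much group averages fluctuate at n = 5 versus n = 30:

```python
# Hypothetical illustration (not from the guide): simulate why roughly 30
# subjects per group is a common minimum for comparing group means. Small
# samples produce group averages that swing widely from sample to sample.
import random

random.seed(1)

def sample_mean(population, n):
    """Average of a random sample of size n."""
    return sum(random.sample(population, n)) / n

# Imaginary class of 200 students with grades between 50 and 100.
population = [random.uniform(50, 100) for _ in range(200)]
true_mean = sum(population) / len(population)

def spread(n, reps=1000):
    """Range (max - min) of 1000 sample means for samples of size n."""
    means = [sample_mean(population, n) for _ in range(reps)]
    return max(means) - min(means)

print(f"True class mean: {true_mean:.1f}")
print(f"Range of sample means, n=5:  {spread(5):.1f}")
print(f"Range of sample means, n=30: {spread(30):.1f}")
```

With the larger sample size, repeated samples give averages that cluster much more tightly around the true class mean, which is why comparisons based on very small groups can be deceptive.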
Data sources that work well with small numbers tend to emphasise depth or richness of
data: for example, short-answer questionnaires, interviews, focus groups, and think-aloud
protocols. From a practical standpoint, the transcription often required to analyse this
data limits the maximum number of participants that can be properly analysed.
Evaluating student learning
Grades alone will usually be inadequate to assess student learning. Grades are one
measure of learning, but they do not capture all information about students’ thinking,
and the nature of the assignment and/or grading criteria will deeply affect what this
measure means. To address this, make sure the grading is meaningful, and gather
additional information if necessary (a detailed marking rubric can provide more detailed
information).
Take time at the beginning of your project to consider what learning outcomes you are
hoping to produce, and how these outcomes would be assessed in your students. You
may want to collect additional information on students’ thinking by doing interviews or
having them do specialised activities like concept maps.
In assessing student learning, keep in mind that students will inevitably learn something
regardless of the course or program, so this information alone is not helpful. Having a
basis for comparison for testing (learn better than whom or what?) or an exploratory
component is important.
Practical considerations and limitations
Think about the practical issue of whether you possess or have access to the
knowledge and/or resources to make sense of the data you choose to collect. Can you
reasonably collect this data given constraints and likely participation rates (e.g., trying to
find students from past semesters can be very challenging)?
While it is a good idea to collect multiple data sources to capitalise on their relative
strengths, and ability to access different kinds of information, try to avoid collecting more
data and kinds of data than needed to answer the questions you pose. It can be
tempting to collect as many kinds of data as you can think of, but you will likely run out
of time and resources to analyse them all and asking too much of participants can
negatively affect the quality of their responses or participation (e.g., each subsequent
survey in a semester is likely to have lower and lower participation).
Data sources
A brief overview of some of the major data sources/methods follows. Each data source
has relative pros and cons. You will likely want to select more than one for your project
and/or each research question.
Personal observations (reflections, notes)
These are easy to collect and your thoughts and anecdotes can give useful detail and
insight. This is good for exploratory projects. However, this kind of data is fundamentally
subjective, and thus not appropriate for testing. It also provides no information on
students’ perspectives, so it shouldn’t be your primary, or only, data source.
Pros: easy to collect, may have deep contextualised insights
Cons: subjective, limited perspective
Student performance/thinking (grades, projects, concept maps)
Grades are always collected as part of a course and are easily understood by most.
This is a common source of data for testing projects. However, grades can mean
fundamentally different things based on the activity being evaluated and the evaluation
method itself. They also provide little detailed information on student learning or
thinking. For instance, asking a student to answer fact-based questions on a quiz
assesses a different level of understanding than asking them to complete an “authentic”
task (e.g., write a computer program; write a short story).
You can design specific assignments and/or activities to assess student thinking.
Concept mapping is one such method; another would be a complex project with a
detailed grading rubric. In either case, consider what kind of learning the task
demonstrates (is it what students know? What they can do? Is the task like a
“real-world” task?).
This kind of more detailed analysis of student thinking can help serve either testing or
exploratory purposes. However, this is more time consuming and difficult to analyse in
general than “grades” alone.
In all cases, it should be noted that student performance and thinking can be difficult to
attribute to a specific course or intervention, since students have individual differences
(e.g., prior knowledge, time available). This issue should be considered when designing
your study.
In general, simpler, numerical measures such as grades work well for testing purposes
whereas more complex evaluations such as detailed evaluations of concept maps or
projects work well for exploratory purposes.
Pros: may already be part of a course, easy to analyse statistically, many purposes can
be served or assessed by changing the evaluation
Cons: grades do not tell the whole story about student thinking/performance, can be
challenging to create an accurate measure, can be difficult to separate from individual
differences
Surveys and survey responses (quantitative rating surveys and/or short answer
surveys)
Surveys are a flexible tool that can quickly gather a large amount of data and be quickly
analysed. Many collect relatively undetailed information through simple response items,
so they are best suited to a testing purpose; focusing instead on short-answer questions
can support an exploratory purpose (a combination is, of course, possible as well).
Surveys have many uses, including gathering general attitudes/opinions, comparing
responses between subgroups, and looking at the relative average responses for
individual questions.
Put another way, surveys can be used to collect a variety of data. Quantitative data
such as rating scales work best with relatively large sample sizes and having small
samples can make numerical results less reliable. Qualitative data such as short written
answers can work with smaller samples since more detailed and unconstrained
information can be gained from each response. It is of course possible to qualitatively
survey large populations, but analysing the data may be very time consuming.
Survey research designs may include single surveys, pre-post designs (two different
time points to assess change), post-pre designs (one time point, but asking for
retrospective responses for the “pre” component), and multiple longitudinal surveys
(more than two time points).
Some things to note:
• Low response rates are the norm (e.g., 20-30%).
• Answered surveys may not necessarily provide quality information, as some people
will answer without adequate consideration of each question (e.g., choosing the
middle response throughout). It is important to look for this issue during analysis.
• While pre-post designs are commonly considered, in practice they can be
challenging to implement and to get meaningful results from.
o Firstly, these designs work best when individual students’ pre-responses can
be linked to their post-responses. Attrition and maintenance of confidentiality
are thus common issues.
o A second common difficulty is a shift in how individuals interpret questions
between pre and post. For instance, it is common for people to self-rate their
knowledge of a topic lower after learning more about it, as they now know
what they did not know!
o A post-pre survey design (where respondents are surveyed only once but
asked what their “pre” response would have been on each question) can help
address this issue. Validated instruments are another approach.
• Typically, “best practice”, especially for quantitative research designs, requires a
“validated” instrument whose questions have been statistically tested with many
participants and refined over time. However, validation is extremely time consuming
and out of the scope of most teaching and learning research projects. Already-validated
tests can be found, but they can be difficult and/or expensive to gain access to. They
also tend to be relatively abstract in their wording and thus may not work well for
specific projects; for these reasons they are not generally recommended. Adapting
validated instruments is possible, but strictly speaking any modifications/adaptations
would require a re-validation process. Thus, while validation is ideal for some purposes,
for most TLR projects a carefully designed, non-validated survey is the recommended
approach. Survey support and guides are available to Teaching and Learning
Development grant recipients.
• Another survey research design is to administer multiple surveys (generally three or
more) over time to track changes in individuals’ responses. Note that the concerns
with pre-post designs also apply, and attrition is an even larger danger. This design,
however, has the potential to produce very in-depth results not possible under other
designs, such as how opinions and/or learning change over time.
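The record-linking step in pre-post designs can be sketched as follows; the anonymous codes and ratings below are invented for illustration:

```python
# Hypothetical sketch: linking pre- and post-survey responses by an anonymous
# student code so change can be computed per student. Attrition shows up as
# codes present at "pre" but missing at "post".

pre = {"a1": 2, "b2": 3, "c3": 4, "d4": 3}   # self-rated knowledge, 1-5, start of term
post = {"a1": 4, "b2": 3, "d4": 5}           # same scale, end of term; "c3" dropped out

# Keep only students who answered both surveys.
linked = {code: (pre[code], post[code]) for code in pre if code in post}
attrition = sorted(set(pre) - set(post))

changes = [after - before for before, after in linked.values()]
avg_change = sum(changes) / len(changes)

print(f"Linked students: {len(linked)}, lost to attrition: {attrition}")
print(f"Average change in self-rating: {avg_change:+.2f}")
```

In practice the anonymous code (e.g., a self-generated ID) lets you link responses without storing names, which addresses the confidentiality concern noted above.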
Qualitative surveys or survey questions are generally less complex. Analysis, however,
may be difficult and time consuming depending on the nature of the responses
received.
Pros: collect many kinds of information quickly and efficiently, quantitative data
collected this way is easy and quick to analyse, qualitative data collected this way
allows for diverse and unconstrained responses.
Cons: a good survey is difficult to design, quantitative surveys collect undetailed
information; quantitative surveys require large samples for reliable findings; low
response rates, attrition and low quality responses can make it difficult to interpret
results; qualitative surveys can be time consuming to analyse
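One simple low-quality-response check noted above, flagging respondents who give the same answer throughout (“straight-lining”), can be sketched as follows (all data invented):

```python
# Hypothetical sketch: flagging "straight-lined" survey responses, i.e.
# respondents who gave the identical answer to every question, which often
# signals low-effort participation and can distort results.
responses = {
    "s01": [3, 3, 3, 3, 3],   # same middle answer throughout: suspect
    "s02": [4, 2, 5, 3, 4],
    "s03": [1, 1, 1, 1, 1],   # same answer throughout: suspect
    "s04": [5, 4, 4, 5, 3],
}

def straight_lined(answers):
    """True if every answer on the 1-5 scale is identical."""
    return len(set(answers)) == 1

flagged = sorted(sid for sid, ans in responses.items() if straight_lined(ans))
print(f"Flagged for review: {flagged}")
```

Flagged responses need not be discarded automatically, but they are worth inspecting before analysis.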
Interviews
Interviews can serve a variety of primarily exploratory purposes (some are described
below). They are good for collecting highly detailed information on a person’s thinking
and perspective. There is of course an inherent subjectivity that should be accounted for
when interpreting/analysing them. They can also be time consuming to collect and
transcribe.
When conducting interviews, it is common to construct an interview protocol that
outlines the questions to be asked. This is especially important if there are multiple
interviewers. Interviews can be structured, semi-structured, or unstructured, depending
on how much they allow deviation from the protocol to “probe”, and on whether they use
a protocol at all. Semi-structured is arguably the most common form, in which most
protocol questions are asked but probes are also common.
Some potential uses for interviews:
• Interviews with students can help gain insight into their course experiences and/or
their needs.
• Interviews with TAs or instructors can help gain insight into their instructional
perspective.
• Interviews with experts (e.g., subject matter experts, other instructors) can provide a
variety of insights and may be particularly helpful at the outset of a project to inform
a design.
• The focus group method is an interview variant that can efficiently access multiple
opinions and perspectives. If considering a focus group, also see: Nominal Group
Technique or Delphi Technique.
Some variants to consider:
• The “think-aloud protocol” is a methodology common in social psychology and
usability testing, where individuals describe their thinking as they work through a
task. This can help uncover how people think through problems and/or uncover
misconceptions.
Pros: collects detailed information that can serve a variety of purposes, can access an
individual’s thinking and perspective, multiple interviews can help develop common
themes or consensus around issues.
Cons: individual perspectives are subjective, interviews are time consuming to collect
and transcribe
Observations
Observations can be useful for gathering information on classroom dynamics, instructor
or student behaviour, etc. They can be adapted to either testing or exploratory
purposes: for the former, a structured observation checklist used by an “objective”
observer is ideal, whereas the latter can be done by taking detailed notes. This kind of
data can be highly convincing and is considered relatively objective. However, it is by its
very nature time consuming and resource intensive, as it requires a dedicated observer.
It may help, and may be important, to have a knowledgeable or trained observer such
as an expert in the field.
Pros: collects “objective” evidence
Cons: time intensive to collect, needs a dedicated observer
Trace data and records
This is a large and general category created for this guide; it includes course
evaluations, attendance, Canvas records, clicker responses, and documents.

These data sources are highly varied, and the purposes they serve depend on the
specific data. Generally speaking, they do not provide much detail on their own, but
they can be widely collected/collated and can serve exploratory or testing purposes
depending on what the data is and how it is used.
In some cases these overlap with and/or substitute for student grades/performance
(e.g., Canvas records or clicker responses). This data is easy to collect once the
method for collecting it is established.
Courses with an online component can allow trace data such as page views, video
views, and the time of use to be collected automatically.
In general, university records such as course evaluations or documents are relatively
easy to locate and collect. However, in most cases these will be difficult to repurpose for
the specific questions of a particular project.
Pros: can be easy to collect large amounts of data
Cons: can be difficult to analyse, may need to be adapted for particular purposes
Participant selection/recruitment
Who is participating, or will participate, in your research is an important consideration.
While in many cases your choices will be largely determined by the constraints of your
project, it is important to consider the trade-offs and relative pros/cons of the different
methods.
Some points to consider:
• Aim to recruit as many and as diverse participants as possible (e.g., through
reminders and/or incentives). The more students participate, the better your findings
will reflect the group as a whole.
• Consider whether the number of participants you have is appropriate for the data
sources you are using (or vice versa).
• Consider who is actually participating in your study and how this might
affect/influence results.
o What are the attributes of your sample? Are the demographics of your sample
somehow unusual relative to the total “population” of your whole class or
program (whatever level your study is looking at)?
o Is your current “population” (e.g., class) the same as or different from other
potentially similar populations (e.g., classes from previous years)?
• Consider who isn’t participating in your study and what you might not know because
of this.
o Dropout over time can change who is not participating. Are certain kinds of
students more likely to drop out?
o Students may not be participating to their full potential or effort. Students might,
for instance, fill out a survey but give the same neutral response throughout
without reading the questions.
o In many research designs, especially in quantitative research, the ideal is that
participants are randomly selected from a larger population, all of whom have
an equal chance of being selected. This randomly selected “sample” can then
be used to “generalise” to the entire population, even those who did not
participate in the study. However, random selection is rarely possible in TLR,
so you should consider how this more “limited” sample may affect your
results. Examine who is actually participating in your study and how their
differences from a larger population of interest might impact results. For
instance, if you study your own class, how are those students the same as or
different from other students at SFU or in Canada?
• Random selection is seldom used in qualitative research, but it is still important to
consider who your participants are and how their specific attributes may influence
your findings.
• When developing a new method or tool, you want to select students (and perhaps
faculty) who will actually use what you are developing, or who are “like” those who
will be using it (e.g., students in the same course in which you plan to eventually use
what you are developing/building).
• While in the building and designing phase of a new method or tool, a fast
turn-around time for feedback is more important than gathering lots of feedback at
once. Plan for multiple rounds of feedback, for example in the form of short surveys,
expert interviews, or focus groups, so you can quickly use the feedback to make
improvements and correct problems early.
• If conducting a survey for feedback, convenience sampling (described below) should
be sufficient.
o Convenience sampling may lead to some students in a group being
under-represented (e.g., EAL students) because of their reluctance to
volunteer information.
o However, your goal is to find the most pressing issues, if any, with the
approach, tool, or method you are building, and to deal with those. Most of
these issues will be found by taking a convenience sample.
• Expert interviews may be helpful to inform your design in terms of content or
pedagogical practices or techniques.
o For example, you may want to interview another instructor who has used a
similar method, or an industry professional or content expert in the field.
o A technique such as the Delphi technique can be used to aggregate opinions
and build consensus.
• If you want more detailed information, or wish to have a chance to ask further
questions, a focus group is recommended.
o In a focus group, 4-10 volunteers give feedback together as a panel in an
open discussion for a fixed time, usually 30-90 minutes.
o To prepare for a focus group, have a set of open questions and short
discussion activities, such as having the group collectively make a mind map.
o Be prepared to offer an honorarium or gift certificate for people’s time.
o To find subjects for a focus group, purposive sampling (described below) is
ideal.
o One potential problem with focus groups is that, because the discussion is
open, certain members of the group may dominate the conversation or drown
out dissenting opinions. Try to use question formats and activities that
encourage less dominant voices to contribute.
Participant selection/recruitment methods
Convenience/non-random sampling:
“Convenience sampling” is a term for the method of sampling whoever is most available
or convenient to get answers from. In the context of TLR, this will often be students in
your class who volunteer to participate in your research.

In many projects, convenience sampling is the default because your choice of sample
is obvious or a given: your participants are your students, or the people you have
access to (e.g., experts you know). The main alternative, purely random selection, is
usually not possible for TLR projects (though random grouping may be; see below for
details). You cannot, especially in a large class, intentionally choose who will
participate, and you cannot usually compel all students to participate (although offering
incentives can help increase participation rates). As noted above, you should consider
who actually participated.

There are two variations/alternatives to convenience/non-random sampling, which are
discussed subsequently (purposive and snowball sampling).
Qualitative
For qualitative research such as interviews, you should keep in mind who may or may
not participate, consider how this might influence the perspectives you hear, and think
about how you might entice a more diverse sample to participate. For example,
interviews are typically time consuming and optional, so it is likely that you will hear
from highly motivated students and those who have particularly strong opinions to
share. You should also consider students who are less likely to participate and whose
perspectives will therefore be missing: for example, given the highly language-intensive
nature of interviews, participants who are not confident in their English language skills.
You may also have trouble finding students to participate more generally.
You can try to mitigate some of these issues by attempting to recruit more students
through snowball sampling and/or targeting your sampling through purposive sampling
(see below).
Quantitative
Voluntary class surveys are always a convenience sample, because you will only be
getting information from the people who are willing to take the time to provide it. This
includes the student surveys given at the end of each course.

This is the easiest way to collect information, but the results have some major
limitations. When taking or analysing a convenience sample, it is important to be aware
of the kinds of students who are more likely to answer a voluntary survey or be available
to sample (e.g., highly engaged students, or those without language difficulties). These
kinds of students will be over-represented in a convenience sample, which makes
generalisation to a group beyond the sample unreliable. That is, it is hard to tell whether
the survey responses represent your class as a whole, and it is even more difficult (if
not impossible) to say how the survey results could apply to other classes or students
more generally.
However, if your goal is to collect whatever feedback people will volunteer, or if you are
not worried about generalising the findings beyond the sample you obtained (i.e.,
claiming your findings/conclusions apply to others), then a convenience sample is
sufficient. These limited surveys can be informative so long as you recognise the
general demographics of who actually answered (e.g., “these results tell me what
strong and motivated students think”). Most projects will draw on these types of
samples.
It is important to note that the degree of the problem of convenience sampling within a
natural group such as a classroom varies based on your “response rate”: the more
participants you have, the more representative your sample is of your class (and if
everyone responds, it is perfectly representative!). Incentives to participate can help
increase the response rate (though they cannot guarantee honest effort from all
students). The problem of generalisability, however, remains, as your class is not a
random assembly of all possible students (see natural/existing grouping below for more
information).
If generalisation to a wider group is important, a random sample within a larger
“population” (e.g., a class or cohort) is typically needed (see below). It should be noted
that it is possible to “adjust” for the biases introduced by non-random sampling through
advanced statistical techniques such as propensity score matching, but this is beyond
the scope or requirements of most projects.
Pros: Easy to collect, the “default” sampling method.
Cons: Not an ideal means of sampling, as you have no control over who participates.
Some “kinds” of students may participate at far greater rates than others. This is
especially problematic for testing purposes and some quantitative analyses, particularly
if the goal is to generalise beyond your sample.
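To make the over-representation problem concrete, here is a minimal Python sketch using an entirely made-up roster, in which “engaged” students are simply assumed to respond more often. The class size, engagement share, and response probabilities are illustrative, not empirical:

```python
import random

# Hypothetical class roster: each student has a (made-up) engagement flag.
random.seed(42)
roster = [{"id": i, "engaged": random.random() < 0.4} for i in range(100)]

# Simulate a voluntary survey: engaged students are assumed (for
# illustration only) to respond far more often than other students.
respondents = [s for s in roster
               if random.random() < (0.8 if s["engaged"] else 0.2)]

response_rate = len(respondents) / len(roster)
engaged_in_class = sum(s["engaged"] for s in roster) / len(roster)
engaged_in_sample = sum(s["engaged"] for s in respondents) / len(respondents)

print(f"Response rate: {response_rate:.0%}")
print(f"Engaged share of class:  {engaged_in_class:.0%}")
print(f"Engaged share of sample: {engaged_in_sample:.0%}")
```

Running the sketch shows the engaged share of the sample exceeding the engaged share of the class, which is exactly the kind of skew you should watch for when interpreting voluntary survey results.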
Purposive and snowball sampling
These are non-random techniques for recruiting a sample for TLR. Their particularities
are discussed below. Because you can tailor your invitations to particular individuals,
both approaches tend to have a higher recruitment-per-invitation ratio than purely
convenience or random methods.
Purposive sampling
Purposive sampling is primarily a qualitative and exploratory technique/term. It refers to
intentionally selecting a certain participant or participants for specific reasons (e.g.,
demographics, expertise, a unique perspective).
You should carefully consider what group(s) may have insights that would be helpful.
You can purposively sample individuals from multiple groups. Possible sampling
methods include:
• Multiple perspectives: you may want to collect a variety of specific views on the
same phenomenon. For example, you may want an instructor, TA, high-achieving
student, and struggling student to share their views on the same issue or
method.
• Experts: you seek out members of a community of interest with the most valuable
insights. These members could include instructors, subject matter experts, or
administrators.
• Extreme cases: you seek out notable cases, such as students at the top of the
class or students at the highest risk of failing a course.
• Critical cases: you identify specific cases which together will be revealing for
some aspect of study. For example, if threshold concepts are of interest,
students who are barely passing and may be missing those concepts could be
good candidates for critical cases.
One major advantage of purposive sampling is that it focuses strongly on a particular
group tailored to the intent of the research, and it aligns well with both qualitative and
exploratory forms of research. You will likely know these individuals, and/or your
requests for participation can be tailored, so response rates may be higher (though
typically relatively few participants are sought in these designs). However, purposive
sampling is not typically appropriate for quantitative designs, due to its emphasis on
individual researcher judgement and its lack of scale or randomisation. Further, this
sampling method is inherently limited by the researcher’s awareness, so there may be
some important individuals that you are not aware of and thus will not sample.
Pros: Can provide strong and significant qualitative insights if targeted properly;
targeted recruitment is more likely to yield good participation rates
Cons: Success relies on being able to identify and convince specific individuals to
participate; not generally appropriate for quantitative designs
Snowball sampling
The term "snowball sampling" comes from the way that a snowball being rolled on the
ground can collect more snow into itself. In a sense, the snow in the ball is being used
to “recruit” snow from the ground. Likewise, a snowball sample is one in which members
already in the sample are used to recruit additional members.
To conduct a snowball sample, a small part of the target sample is first found, preferably
by random selection or, pragmatically, by purposive selection. Each person added to
the sample is then asked to recruit people they know (sometimes with specific
instructions to recruit those who fit the sample of interest). In general, people are more
likely to participate when asked by someone they know personally (i.e., another student),
so this will often yield a higher response rate.
This technique is commonly used in qualitative designs and can be used in quantitative
designs as well. Snowball sampling may be used to raise interest in and response to
research, as blanket requests for participation sometimes attract few participants.
Additionally, snowball sampling is used in situations where specific individuals are rare
or hard to recruit with random sampling, such as additional-language speakers, students
with mental health issues, or students with particular interests. This method can also
access and identify networks of individuals, such as a support network of students
working together and tutoring each other. In these situations, where a certain part of the
population is hard to find, snowball sampling can produce a much larger sample than
random or purely convenience methods.
More generally, this method often gets higher participation than purely convenience or
random methods and is commonly used to find participants.
For qualitative research, this is a commonly used and accepted technique. Note,
however, that it is not as targeted as purposive sampling, and, as is always the case,
you will want to carefully study and understand the attributes of the participants who
actually take part.
The major quantitative drawback to snowball sampling is that it isn't random. Members
of the population who are easier to find directly, or who are relatively well connected in
their community, are more likely to be included in the sample than other groups. Thus,
as with convenience sampling, statistical analyses are often limited in terms of their
generalisability. As previously noted, there are statistical means to correct for the
potential bias from such limited samples, but this is out of the scope of most projects.
Pros: May facilitate higher response rates and access to difficult-to-identify/recruit
subsamples or individuals
Cons: Produces an inherently limited sample whose participants cannot be fully
anticipated; challenging for quantitative designs
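As an illustration only, the recruitment process described above can be sketched in a few lines of Python over a hypothetical friendship network. The network, the 0.7 acceptance probability, and the cap of eight participants are all invented for the example:

```python
import random

random.seed(3)

# Hypothetical "who knows whom" network among 12 students (IDs 0-11).
friends = {
    0: [1, 2], 1: [0, 3], 2: [0, 4, 5], 3: [1], 4: [2, 6],
    5: [2, 7], 6: [4], 7: [5, 8], 8: [7, 9], 9: [8, 10],
    10: [9, 11], 11: [10],
}

def snowball_sample(seeds, accept_prob=0.7, max_size=8):
    """Grow a sample by asking each recruit to invite the people they know."""
    sample, frontier = set(seeds), list(seeds)
    while frontier and len(sample) < max_size:
        person = frontier.pop(0)
        for friend in friends[person]:
            # Each invited friend accepts with some probability.
            if friend not in sample and random.random() < accept_prob:
                sample.add(friend)
                frontier.append(friend)
    return sample

sample = snowball_sample(seeds=[0])
print(sorted(sample))
```

Note how the final sample can only ever contain people reachable from the initial seeds through the network, which is exactly the non-randomness drawback discussed above.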
Grouping participants:
The goal is not always to simply sample/study all available “students” or “experts”
similarly or at once. Oftentimes there is a need to create groups or subgroups in TLR,
especially for quantitative testing designs. Most commonly, you need groups in order to
assess differences or make comparisons. The prototypical example is an experimental
design such as a drug study, where one group gets a placebo (i.e., a sugar pill; the
“control” condition) and another group gets an actual treatment (the new drug). The
goal is to see if there is a difference between the groups: you would expect the health
outcomes in the treatment group to be better. Since the groups are assumed to be
largely similar (including receiving a pill; see below for variations), the change can be
attributed to the treatment. In experimental designs these are typically called “control”
and “treatment” groups.
In a classroom you wouldn’t be testing drugs, of course, but the logic remains the same,
and the treatment can just as easily be a new instructional tool or method. An important
point to note is that comparisons need not be between no treatment and a new
treatment. It is valid (as is frequently done in drug studies as well) to compare “old”
treatments to “new” ones; it’s often not interesting/important to know if something is
simply better than nothing! Some question the ethics of giving a group what might be
an “inferior” treatment. However, the new treatment might in fact be the inferior one: in
most cases the old treatment is known to produce at least a mildly positive outcome,
and it’s possible the new treatment doesn’t work at all or is even negative! Another way
to address this potential problem is to allow all students to receive all “conditions”
through counterbalancing (see below).
Note that the classroom is not a laboratory. Thus, TLR has particular challenges which
you should keep in mind:
• How will the various people (including researchers and participants) involved in
the project directly or indirectly influence my results?
• What are some elements which are beyond my control but that I should keep in
mind? (e.g., time of course, students’ prior knowledge)
Grouping is typically a concern with quantitative research rather than qualitative
research. While you can form groups in qualitative research, this does not typically
significantly affect the analysis or how you approach findings beyond those mentioned
in data sources above. In quantitative designs, however, the requirements and
assumptions of statistical analyses require a careful consideration of how students are
grouped. Thus the sections below primarily refer to quantitative research.
Natural/existing grouping
Most projects will have to work with existing “natural” groupings, such as tutorial or
class sections. For instance, you can try different treatments with different tutorial or
class sections. The disadvantage of this approach over a purely random one, from a
research standpoint, is that the samples are likely to be biased. For instance, students
who choose a morning tutorial likely differ in a variety of important ways from those
who choose an evening one.
All the challenges previously noted as inherent in non-random/convenience sampling
apply here as well. It is, as always, important to understand the attributes of the
samples/participants you have.
Note that natural groups are not always less desirable than randomly assigned groups.
Sometimes natural groups may be intentionally selected: for instance, a class that is
streamed into two sections, one where students have certain pre-requisite courses and
another where students lack them. These natural groups have different attributes
(pre-requisites), and a variety of designs can be applied to them. For example, you may
want to see how differently these groups respond to a new instructional technique
(does it work better with struggling students?). Alternately, you may want to try a new
program meant to support students who lack the pre-requisites and historically do
poorly in the course, aiming to “close” the performance gap between the two groups. In
these cases the usual caveats regarding non-random samples apply, but you will likely
have more knowledge about the factors which influence your results than you might
otherwise.
Another design that uses natural or existing groupings is one that groups students
based on demographic attributes. Normally this only requires identification of the groups
students belong to at the data collection phase and may not necessarily involve treating
the students differently during the study. Groupings will then be handled at the analysis
stage. For example you may ask students to identify their major and then later compare
student performance between majors, or compare the effects of different treatments on
students in different majors.
Note that a natural group doesn’t have to receive only one treatment; in a
counterbalancing design (see below), for example, each group receives multiple
treatments in sequence.
Pros: Simple to group and manage; natural groupings can be desirable/interesting in
themselves
Cons: A non-random sample has inherent biases, and generalisation beyond the
sample is unreliable
Counterbalancing
Counterbalancing designs are not a means of grouping, but a way of assigning
different conditions to groups. This design involves subgroups that typically alternate
treatments (and/or control). For example, if you were trying to compare two instructional
methods, such as a new one (N) and a traditional one (T), you could design an
experiment like so:
- Split the student body into two groups. This may already be done by way of class or
lab sections.
- Teach the first group with traditional instruction in weeks 1-6 before the midterm, and
use the new method in weeks 7-12.
- Teach the second group with the new method in weeks 1-6, and use the traditional
method in weeks 7-12.
This type of design serves to address several common issues:
It addresses the concern about a portion of students not receiving the “new” treatment,
since it is possible for all students to experience all conditions (e.g., both instructional
methods).
Counterbalancing also allows you to effectively “double” your sample size, since each
individual student contributes data to both the “control” and “experimental/treatment”
conditions.
Having students experience every “order” of the treatments also serves to address
“order” effects. That is, results may change depending on the order in which the
treatments are received. By having participants cover every order/combination possible,
you can later statistically “adjust” for differences based on order.
Note that unless students are randomised into these subgroups, the general problems
with non-random samples will apply (e.g., groups may be different in attributes such as
background knowledge). It may be helpful to establish a baseline either with a pre-test
or by using an indirect measure such as cGPA to try to determine how/if the groups are
different and/or potentially correct for this difference statistically later.
In the example given at the beginning of this section, the counterbalanced design is
“fairer” than having one group receive the same method for all 12 weeks, because the
order of the methods is counterbalanced between the groups. At analysis, you can
compare the two methods using the midterm grades of the two groups, and you can
look for order effects by comparing the final grades. Assuming the students in each
group have the same background skill at the beginning of the course, if there is a
difference in the final exam scores between the two groups, you have evidence that the
order in which the two teaching methods were delivered made a difference.
Pros: This design allows all participants to experience all conditions, which may benefit
participants, and can provide more data for the respective conditions. It can also offset
“order effects”, where the order of conditions affects results (e.g., receiving A first
improves performance in B, but not vice versa).
Cons: If subgroups are not randomly assigned, the same challenges as natural/existing
groupings apply; can be complex in terms of organisation and analysis
For more details, see
http://www.unc.edu/courses/2008spring/psyc/270/001/counterbalancing.html
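The two-group, two-period design described above can be sketched in a few lines of Python. The class size and the week split are hypothetical, and the random shuffle simply stands in for however you actually divide the class:

```python
import random

random.seed(0)
students = [f"S{i:02d}" for i in range(20)]  # made-up roster of 20 students

# Randomly split the class into two subgroups, then assign each subgroup
# the two methods in opposite orders across the two halves of the term.
random.shuffle(students)
half = len(students) // 2
group_a, group_b = students[:half], students[half:]

schedule = {
    "weeks 1-6":  {"group A": "Traditional", "group B": "New"},
    "weeks 7-12": {"group A": "New",         "group B": "Traditional"},
}

for period, assignment in schedule.items():
    print(period, assignment)
```

Because every student experiences both methods (just in opposite orders), the midterm grades compare the methods directly, while the final grades let you look for order effects.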
Random grouping
Random sampling is especially important in quantitative research, particularly in testing
designs (it is rarely done or considered in qualitative designs, by contrast). This is
because many statistical analyses require, for results to be generalisable and valid
(especially to groups beyond the sample at hand), that the data be comprised of
samples drawn randomly, with each individual student having an equal chance of being
in each group. In part, this is needed because random sampling typically “averages out”
factors other than the one you are interested in studying that might otherwise influence
your results.
For example, suppose you have two different instructional methods: a new one you
think will increase test scores, and an old one you expect to produce relatively lower
test scores. In an ideal situation you would randomly assign students into two equal
groups, one that gets the new instruction and one that gets the old. The assumption
here is that the only difference between the groups is the instructional method (which
thus accounts for any differences in test scores you find). That is, you assume that,
thanks to the random assignment, each group should have similar gender ratios,
academic ability, language ability, level of wakefulness in class, etc., other than getting
a different instructional method.
Practically speaking, it is often difficult to randomly assign participants in a TLR project,
since students often by necessity make choices about the various groups they join (e.g.,
teams, course sections, TA tutorial sections). Thus samples are commonly
natural/existing groupings (see above).
It is possible to create randomised subgroups around different treatments within a class
through counterbalancing (see above).
Sometimes the goal is not only to randomly select, but to produce a particular
distribution of participants with this random selection; random selection alone is
sometimes not enough to make reasonable generalisations about smaller subgroups.
Stratified sampling (see below) can be used to address this.
Pros: Ideal for statistical research, needed to make generalisable claims
Cons: can be difficult to implement in TLR as most projects have to rely on existing
natural groupings
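As a minimal sketch of random assignment (with a made-up roster), shuffling the class list and splitting it in half gives every student an equal chance of landing in either condition:

```python
import random

random.seed(7)
students = ["Ann", "Ben", "Caz", "Dev", "Eli", "Fay", "Gus", "Hal"]

# Shuffle a copy of the roster, then split it in half so each student has
# an equal chance of being assigned to either group.
shuffled = students[:]          # copy so the original roster is untouched
random.shuffle(shuffled)
mid = len(shuffled) // 2
new_method, old_method = shuffled[:mid], shuffled[mid:]

print("New method:", new_method)
print("Old method:", old_method)
```

Fixing the seed makes the split reproducible for record-keeping; in a real project you would also record which student ended up in which group.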
Stratified grouping
Stratified sampling, or two-stage sampling, is a method used when your population of
interest naturally separates into groups or 'strata'. The aim of this technique is to create
groups with specific attributes which pure random assignment is unlikely or unable to
achieve (for instance, if you want one of your groups to be made up of students with a
relatively rare attribute, such as a specific major). Stratified samples are useful when
there are many groups and you would prefer to have lots of information from specific
“kinds” of individuals or groups rather than sparse information on everyone generally.
The first stage of a stratified sample is to identify the attributes you would like to use to
define the strata, and to sort students into these sampling “pools” (e.g., by major). The
second stage is to predetermine the attributes of each group and then randomly assign
students from each pool into the groups until these quotas are met (e.g., group 1: arts;
group 2: engineering).
The attributes could also be natural groupings to ensure a good representation around
certain subgroups. For example, if a large classroom was broken into dozens of small
study groups, you could select 10 of the groups (at random) and look at the exam
results from two students of each group.
A major advantage of stratified sampling is that, when done properly, each student with
a specific attribute has an equal chance of representation, regardless of how common
or rare that attribute is in the whole group, which can facilitate generalisability for certain
subgroups.
One major drawback to stratified sampling is its complexity: it involves identifying strata
and performing two rounds of random selection. Also, because the sample is not truly
“random”, the ability to generalise from it relies heavily on knowing what the population
you want to “generalise to” is like, in order to make valid statistical inferences and
analyses.
Pros: Can address some weaknesses in random sampling, can help study smaller
subgroups that may not be properly represented otherwise
Cons: Is complex to organise and track
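The two stages described above can be sketched as follows; the roster, the majors, and the quota of three students per stratum are all hypothetical:

```python
import random

random.seed(3)

# Hypothetical roster tagged by major; engineering is deliberately rare.
roster = ([("arts", f"A{i}") for i in range(30)]
          + [("eng", f"E{i}") for i in range(6)])

def stratified_sample(roster, per_stratum=3):
    """Stage 1: sort students into strata by attribute.
    Stage 2: randomly draw a fixed quota from each stratum."""
    strata = {}
    for major, student in roster:
        strata.setdefault(major, []).append(student)
    return {major: random.sample(members, per_stratum)
            for major, members in strata.items()}

sample = stratified_sample(roster)
print(sample)
```

Note that even though engineering students make up only 6 of 36 students, they contribute as many sampled students as the arts stratum, which is the point of stratifying.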
Teaching and Learning Project Types and Examples
The following process was developed for the Institute for the Study for Teaching and
Learning in the Disciplines projects. It is divided into four steps which help guide those
conducting Teaching and Learning projects towards the general elements of their
designs and the suggested methods.
Overview of the process:
Steps 1 and 2 will help you get started by helping you identify the purpose(s) and
general type of inquiry you will be conducting. In turn each has recommended data
sources (step 3) and participant recruitment/grouping methods (step 4) for given project
types.
Step 1: Identifying your purpose(s)
There are two overlapping purposes for Teaching and Learning Research (TLR). You should
consider your relative interests/emphases for each research question and your project as a
whole.
Exploratory
• Gathering in-depth information on why/how something worked (or didn’t)
• Typically qualitative and exploratory
• Typically has a formative intent (aim to inform change, what worked or didn’t work? Why
did this happen?)
• Findings speak primarily to your particular class or situation
Testing:
• Determining if or how well something “worked” (or did not)
• Typically quantitative and hypothesis driven
• Typically a summative intent (aim to inform about what happened)
• May aim to “generalise” findings to other classes/situations beyond your study
• Significant prior information is usually needed to do this. A testing project for example,
may emerge and be informed from the results of an exploratory project.
Step 2: Identifying your project type
Project Type 1: Designing/building a new instructional approach, tool or method
Project Type 2: Evaluating an existing instructional approach, tool or method
Project Type 3: Evaluating a complete course design or redesign
Project Type 4: Program evaluation
Follow the flow chart below to find the type of project most likely to describe your own.
Project Type 1: Designing/building a new instructional approach, tool or method
Project focuses on developing or adapting something new and usually has a primarily
exploratory purpose.
Project focus:
• I want to develop a new active learning activity for my lecture
• I want to create a new technology/simulation that will support students’ learning
• I want to develop a new way of evaluating my students
The focus here is on developing something for your course. Note that this type of
project will often overlap with the subsequent type (evaluation). The relative emphasis
between these purposes will depend on how much effort and need there is for design
versus evaluation (for example, whether something already exists and/or how much
adaptation will be needed for your course).
In the design phase, it is likely that you will not have research data to work with, as your
design will be relatively new and untested for your course. You can, however, try to
inform your design in a variety of ways:
• You can perform a literature search to find similar designs and pick out the
features of those designs you think might be helpful and/or relevant.
• You can consult with experts in a structured (Delphi technique) or
semi-structured (interviews) way. These could be experts on the material, to
determine what content might be relevant, or experts on the type of instruction,
to determine what features and/or adaptations may be needed for your course.
The “regular” process for a design-focused project is to build-pilot-revise. That is, you
design your new tool/method/evaluation, implement/test/explore the implementation,
then make revisions (or recommendations for revisions) based on this. This cycle can
be repeated multiple times, though most projects, especially single-phase ones, will
likely only complete the cycle once. For details on how to develop research insights into
your pilot, see the following project type. Note that, for the most part, design-focused
projects should give the “exploratory” research purpose more emphasis than testing, as
it will provide the detail needed to tweak the design.
See the next project type (Project Type 2: Evaluating an existing instructional approach,
tool or method) for design examples and data sources for performing research on the
implementation.
Project examples
Title of Project: Future of the Book: Pedagogical Tool for English Literature Students
Grant recipient: Margaret Linley, Department of English
Title of Project: Using Digital Humanities to Teach How Historians Think
Grant recipient: Elise Chenier, Department of History
Project Type 2: Evaluating an existing instructional approach, tool or method
Project focuses on evaluation (whether and/or how well something worked). May be a
part of the previous project type. Evaluations may have exploratory or testing purposes
or foci.
Project Focus:
• I want to incorporate and evaluate an active learning activity into my lecture
• I want to integrate and evaluate an existing technology/simulation that will
support students’ learning
• I want to assess the effectiveness of a new way of evaluating my students
The focus is replacing or introducing something in an existing course. When designing
this, prior to the evaluation phase you want to begin to identify what key features of your
design you want to assess. What about this tool/method/evaluation are you most
interested in? What makes it special/different? What does “success” look like? What
kind(s) of student learning are emphasised?
Exploratory: This can be used in addition to, or instead of, testing. It is especially
important if you plan to refine or improve the tool/method/evaluation in future semesters,
or if other instructors will use it. You will want to identify and focus on the features or
elements that you feel are most important and gather information about these
specifically. For example, if you are developing a video, what is it about the video that is
especially helpful or unhelpful to students?
Data source suggestions (exploratory):
• Short answer surveys
• Qualitative student performance/thinking measures (e.g., concept maps,
reflections/diaries, evaluating complex assignments such as portfolios or
projects)
• Observations by an external evaluator (e.g., instructor, expert in the field)
• Interviews and/or focus groups (students, TAs)
• Think-aloud interview protocols (ask students to describe their thinking process
as they work)
Testing: This lends itself particularly well to testing, because oftentimes what you
replace/introduce is the only thing that has changed, so you can compare to a previous
semester or even to different sections/groups/cohorts within the course (if too many
things change, it can be hard to account for changes in performance). In industrial
settings, such as when Google wishes to try a new interface, such block-and-compare
methods are called “A/B testing”. In traditional statistics, they are referred to as
“experimental design”. Keep in mind that one of these groups doesn’t necessarily have
to be a “control” in the sense of getting “nothing” or a “placebo”. The comparison can be
a different treatment, or even just the old method that you already know is somewhat
effective. This purpose can be difficult to achieve, however, if you have no basis of
comparison.
Design examples:
Comparing one group of students within a class with another one (e.g., one group gets
one method and another group gets a different/old method)
• Students can get different methods at different points of the course and be
evaluated at those points
• If you have a large number of students quantitative (e.g., survey methods) may
be best
• If you have a smaller number of students, qualitative (e.g., interviews, short
answer surveys) may be best
Comparing students from this offering of the course to a previous offering
• If it is a past class you can use grades or assignments. It is important/ideal that
the course has only “changed” in terms of what you have developed/added
• If this is a multi-semester project you can collect data such as surveys or
interviews and compare or track data over time
Data source suggestions (testing):
• Surveys
• Quantitative student performance/thinking measures (e.g., scored or graded
assignments relevant to project focus)
• Trace data and records (attendance, Canvas, clickers)
Project Example(s):
Exploratory:
Title of project: Development of instructional videos to improve students’ techniques in
General Chemistry Laboratory courses
Faculty Investigator: Sophie Lavieri (Chemistry)
Title of project: Mapping Expatriate Paris, 1800–1960
Faculty Investigator(s): Colette Colligan and Michelle Levy (English)
Title of project: Student Response to Instructor Feedback on Writing
Faculty Investigator(s): Marti Sevier, English for Academic Success/Linguistics
Testing:
Title of project: Comparing regular session, i>clicker and online-tutorials: Exploring
student experiences and learning outcomes
Faculty Investigator: Sheri Fabian and Barry Cartwright (Criminology)
Project Type 3: Evaluating a complete course design or redesign
Project is similar to both designing and evaluating an instructional tool/method, but
evaluates multiple changes, complete redesigns, or even new courses. The purpose is
likely exploratory.
Project focus:
• I am designing a new W course
• I am updating a course to better meet student needs
• I am updating a core course in my program
• I want to implement a flipped classroom
These often involve making multiple and/or fundamental changes to a course, or
designing a completely new course. In designing this, keep in mind what the overall
goal(s) of the course is, and how you see the various elements contributing to that goal
or those goals. How do the different modules relate to each other? Which elements are
most critical or likely problematic? What are their common goal(s)? What are the
anticipated learning outcomes of the course? What do you want students to be able to
do after the course? This type of project shares much in common with the design of an
instructional tool/method/evaluation and many of the notes there apply here, the
difference being mainly in complexity and scope.
Exploratory: This purpose is arguably the most appropriate, since the complexity and
lack of precedent for a complete redesign makes it difficult or impossible to determine
the “effectiveness” of individual elements. You will want to gather formative information
and focus on observations (your own, your TAs’) and student feedback. You want to
collect data and information that will support you in knowing how to
improve/change/remove/add elements of the course design going forward. Did students
see the relevance of and/or link between two activities? Did TAs find certain
activities/modules too difficult or impractical?
Data source suggestions (exploratory):
• See exploratory data source suggestions for project type 2 above
You may want to conduct research prior to the design phase to help inform your design.
A literature search, survey of previous students, or consultation with experts (e.g.,
subject matter, instructors) may be helpful for this.
Testing: This purpose is relatively limited, as there is usually little to meaningfully
compare to if the course is effectively new. It is not recommended that this be the only
or main focus of your inquiry. Course redesigns often involve making many changes to
an existing course at once, so inferences about specific aspects of the course are
difficult or impossible to make. However, comparisons can be important, particularly in
the long term: for example, if you want to establish that the new offering of the course is
“effective” or valuable enough to continue to be offered. In this case, you want to
consider what effectiveness means to you (e.g., grades, student interest, scalability,
cost, student retention, performance in future related classes).
Data source suggestions (testing):
• Collecting and comparing course evaluations
• Collecting and comparing grade distributions from previous course offerings
• Collecting subsequent student outcomes (e.g., enrolment, grades in the next course)
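If you do collect grades from more than one offering, a simple way to check whether an observed difference is larger than chance alone would explain is a permutation test on the difference in mean grades. The sketch below is a minimal illustration using only the Python standard library; the grade lists are entirely hypothetical and are not drawn from any of the projects listed in this guide.

```python
# A minimal sketch: permutation test on the difference in mean grades
# between a previous offering and a redesigned offering (hypothetical data).
import random
from statistics import mean

def permutation_test(old, new, n_perm=5000, seed=42):
    """Approximate two-sided p-value for the observed difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(new) - mean(old))
    pooled = old + new
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Re-split the shuffled pool into groups of the original sizes.
        perm_old = pooled[:len(old)]
        perm_new = pooled[len(old):]
        if abs(mean(perm_new) - mean(perm_old)) >= observed:
            count += 1
    return count / n_perm

# Hypothetical final-grade percentages from two offerings:
grades_previous = [62, 71, 55, 68, 74, 60, 66, 58, 73, 65]
grades_redesign = [70, 75, 64, 72, 80, 69, 74, 66, 78, 71]

p = permutation_test(grades_previous, grades_redesign)
print(f"Difference in means: {mean(grades_redesign) - mean(grades_previous):.1f}")
print(f"Approximate p-value: {p:.3f}")
```

A small p-value suggests the difference between offerings is unlikely to be due to chance alone, though with intact (non-randomised) classes it cannot, on its own, attribute that difference to the redesign rather than to differences in the student cohorts.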
Project Example(s):
Has exploratory focus:
Title of project: Increasing Opportunities for the Development of Case-Based Knowledge in High Enrolment Design/Production Courses
Faculty Investigator: Michael Filimowicz (SIAT)
Has testing and exploratory focus:
Title of project: Development of a New Course: ENSC 180 Introduction to Engineering
Analysis Tools
Faculty Investigator: Ivan V. Bajić, School of Engineering Science
Title of project: Flip the Classroom: An Investigation of the Use of Pre-Recorded Video
Lectures and Its Impact on Student and Instructor Experience in Two First-Year
Calculus Courses
Principal Investigators: Veselin Jungić and Jamie Mulholland, Department of
Mathematics, Cindy Xin, Teaching and Learning Centre (TLC)
Project team: Harpreet Kaur, research assistant
Project Type 4: Program evaluation
Project focus:
• We are redesigning the course sequence in our program
• We are determining if the program meets student needs
• We are matching learning outcomes to assessment points
These are the largest scale projects and involve the largest number of stakeholders.
They do not focus on individual elements of courses or individual courses. The main
emphasis is on collecting and organising a large amount of disparate information.
Exploratory: This is the likely focus. The range of possible questions is large and will depend heavily on your goals and context. Some potential areas of inquiry include:
• What have other similar programs done and what ideas can I incorporate?
• Are we meeting our stated program goals?
• What parts of this program are working and which ones need improvement or replacement?
• How do the courses relate to one another?
• What factors are associated with student success in this program?
• What do we anticipate students should be able to do or should know by the end of the program?
• What do students think of the program and what do they do afterwards?
Testing: This is unlikely to be applicable to these kinds of projects, as there are far too many elements to compare and likely no reasonable comparison is possible.
Data source suggestions:
• Surveys of recent graduates
• Surveys of current students
• Focus groups
• Interviews with stakeholders (e.g., faculty, students, professional organisations)
• Examining course evaluations across courses and time
• Examining program documentation/reports for similar programs
Example(s):
Title of project: Evaluation of Student Perspectives on Their Learning Experiences in
Biomedical Physiology and Kinesiology
Principal Investigator: Victoria Claydon, Department of Biomedical Physiology and
Kinesiology (BPK)
Title of project: Transitioning to Outcome Based Education: Optimizing the Mapping of
Graduate Attribute Indicators to the Curriculum of the School of Engineering Science
Principal Investigator: Michael Sjoerdsma, School of Engineering Science
Title of project: The Academic Enhancement Program: Evaluation of Expansion to the
School of Engineering Science
Principal Investigators: Diana Cukierman, School of Computing Science, and Donna
McGee Thompson, Student Learning Commons