Uploaded by Iqra Pashtana

combined - module 1

advertisement
Empiricism and the Scientific Method
Transcript
Okay. So to start the course off, it's important that we're all on the exact same page about what is truth in
psychology? How do we go about acquiring knowledge in a way that is reliable and accurately reflects the world
around us? So we'll start off by talking about empiricism, which is this knowledge, framework, this philosophy that
guides psychological science, and then the scientific method, which is our means of actually seeking out and
acquiring this knowledge. Okay. So empiricism is the belief that in order to be true and reliable, knowledge must
come from systematic observations that are documented or recorded as data. Okay. So we're trying to turn our
own subjective experiences into something more objective, something that you can reproduce, that you can clearly
document and show to other people as evidence of your finding.
So as important as it is to talk about what empiricism is, it's important to think about what empiricism isn't. So what
isn't a valid means of acquiring knowledge and psychology? Throughout your life you might encounter people
saying, "I think that..." where they share their opinion, they share their thoughts, they hypothesize about how the
world works. As an empiricist, it's important to counter that by saying, "Okay, prove it. Give me some evidence.
What have you documented that you can show me beyond your own opinion or your own self-reflection that
proves what you're saying?"
Oftentimes we'll hear about esteemed names like Sigmund Freud, who theorized one thing or another. Now as
esteemed of a name as Freud is, as an empiricist, we would have to ask, "Okay, Freud, where's your evidence?
What have you documented that proves what you're saying is true?" We can't just rely on one person's thoughts or
reflections to tell us anything truthful about the world. It's common knowledge. This is a common thing you'll hear
in day to day life and often people will make these kind of appeals to folk wisdom and saying that, "We all know X,"
or. "We all know Y." But as an empiricist, we have to really check our assumptions and say, "Okay, show me the
data. Prove to me that this is common knowledge. Prove to me that this common knowledge is actually true and
people aren't being commonly misled."
And finally, in my opinion, now, all research questions are going to start with our own ideas and our own thoughts
about how the world might work. But as an empiricist, it's important to remember that we need objectivity. Opinions
are subjective, and in order to back them up and to see if they hold any water or they actually reflect some truth in
the world, we need to strive for objectivity. We need to strive to document something concrete about our world, that
we can show to other people and say, "This supports my belief." So as a psychologist, it's always important to hold
yourself to hold yourself to this gold standard of knowledge. So it's not enough just to think something or to
assume something about the world is true. You actually need to go out and either find somebody who has
documented or document it yourself. So how do we go about actually acquiring this knowledge? We use the
scientific method. This is the empirical approach to testing our beliefs that involves choosing a question,
formulating hypothesis, testing that hypothesis, and finally drawing a conclusion from your data.
Now already I've introduced some terminology like, "Okay, what is a hypothesis? How is that different from a
research question?" So really quickly, I want to go over some of these key definitions and highlight some of the
differences between them. So a theory is the overarching system of interrelated ideas that are used to explain and
unify a wide set of observations and to guide future research. So here we're not talking about any one particular
finding. We are integrating a whole bunch of different findings into an overall idea or explanation for how at a very
broad and high level, how the world works. So here's an example. A child's social development is uniquely molded
by the interactions between their temperament and their self-regulation skills. Okay? So this broad theory is
describing how specific aspects of a child's personality or their self-regulation works together to govern how they
end up developing in their social world.
This is a very broad idea. We're not talking about a very specific study. And in fact, there's lots of different studies,
you could imagine, would feed into this idea and could be tested from this idea. It's very broad, it's condensing a
lot of ideas into one higher order organizing system of ideas that we're going to call a theory. A research question
is more specifically the question you are trying to address with a particular study. So this can be conceptual and it
doesn't need to mention specific measures, but it's a lot more specific than a theory. A research question might
emerge from a theory or test a theory. So to stick with our example of a child's social development, and this theory
of that, our research question in this domain might be, how does shyness relate to children's ability to control their
impulses over prolonged periods of time?
So this is a research question that emerges from one component of our broader theory. So we're interested in how
shyness, an aspect of kids' temperament relates to their ability to control their impulses, an aspect of their selfregulation skills. So our theory suggests that these two things should be intimately related. Our research question
says, "Okay, let's put that idea into practice. How does this aspect of temperament relate to this aspect of selfregulation?" Now, when we actually go to test this research question, we're going to generate a hypothesis.
So hypothesis is a statement that specifies a relationship between two or more measurable variables that can be
proved or disproved within the bounds of your study. So we have a broader question. So how does shyness relate
to how children can control their impulses over a prolonged period of time? Now, our hypothesis is our hunch, is
our idea, it's guided by past work to say like, "Well, based on past work, we think these two aspects of personality
or these two factors of the child will be related in this particular way." So an example of hypothesis here might be
children scoring higher in a shyness questionnaire will be able to resist eating a marshmallow for longer than non
shy children. Right? So by testing this very specific hypothesis, we'll be able to answer our research question, and
in doing so will inform our theory. So we can see that these three ideas are kind of hierarchically organized. So at
the broadest level, you have your theory, which governs a wide set of phenomena. From that theory merges a very
specific research question.
So, okay, how would this theory play out in this particular set of circumstances? Based on past work, you'll
generate a hypothesis about how you think the world would work. You'd measure what you're measuring in your
study, and your findings would either support or fail to support that hypothesis. And then that has implications back
up the chain. So if our hypothesis was not supported, and that addresses our research question in that way, this is
going to have implications for our theory.
Now, one hypothesis that isn't going to totally cause a theory to crumble, but if enough evidence of accumulates
over a series of different studies, maybe we'll have to update our theory or shift to a new theory that may provide
an alternative interpretation at a broad level of how the world works. The really cool studies, rather than just testing
one theory, actually kind of pit two different theories against each other. So then its findings can actually put them
to the test and will either support one theory while failing to support the other. This is a really good way of
scientifically evaluating these different high order ideas and finding what's the best way of explaining what we
observe in the world.
Cross-Sectional Designs Transcript
Okay. Now we have terminology out of the way. We've talked about some of the different contexts and methods,
which we can use to study children. Now, let's really talk about how we're organizing the designs of our studies.
How are we choosing which groups of children to compare? Are we using the same children or different children?
And we'll start off today by talking about cross sectional designs.
A cross sectional design is when you use different children of different ages and you compare their performance.
Each participant is usually only tested once, and we use this to highlight developmental trends in children's
performance. You'd have a group of three year olds, a group of four year olds, a group of five year olds, and you'd
test them all independently, and you'd see, "Hmm. How did the three year olds perform differently than the four
year olds? Or the four year olds from the five year olds?" Et cetera.
Okay. Let's look at one of my favorite examples of a cross-sectional study. This is a study by Shtulman and Carey
from the year 2007. They were really interested in how children think about the possibility of events from mid-early
into later childhood. How do children reason about what can happen versus what, no, no, no, it's impossible. It
couldn't happen? They showed children of different age groups, three different kinds of stimuli. First, they showed
them totally ordinary events, or they asked them about ordinary events. They would say, "Could you drink orange
juice? Or could somebody drink orange juice?" It's pretty ordinary. They would ask them about improbable events,
like, "Could somebody drink onion juice?" I mean, I've never had onion juice myself, but I know that onions
produce juice, so you could, theoretically, create onion juice.
And then, they ask them about totally impossible events, "Could you drink lightning juice?" No. As adults, we know
that you can't turn lightning into juice. This is an example of an impossible event. But they're interested in how
children would reason about these three different kinds of events, as they got older. When would they start thinking
that impossible events were impossible, or that improbable events were improbable, but still possible, et cetera.
All right. I'm going to show you a graph of their data, really, really cool stuff. On the Y axis, going up on the left
hand side, is the average number of events judged possible. The higher point is, for any particular group, the more
possible that group judged the event to be. On the X axis, right along the bottom, we have our four separate age
groups. Remember, this is a cross sectional design. That means we have four distinct groups that were tested. We
have four year olds, six year olds, eight year olds, and adults. Now, let me show you some of their findings. Here's
all their data. First along the top, where we have these circular dots, this is how children responded to ordinary
events. Even from the age of four years olds, onwards, everybody understands that these ordinary events, like
drinking orange juice are totally possible, even young children understand this.
If we look down at the bottom, we can see how people responded to impossible events like drinking lighting juice.
Four year olds, just like adults, understand that lighting juice just doesn't exist, so no, you can't drink lighting juice.
Now, what's really interesting, is what happens in the case of these improbable events. What we see here is that,
early in childhood, around the age of four, children thought that drinking onion juice was impossible. It wasn't just
improbable, it was impossible. You couldn't do it. On the far top right corner, we can see that adults realize that, as
weird as it is to drink onions juice, there's nothing stopping you from doing it. It's totally a possible thing that you
could do. And in the middle, we can see that this kind of understanding that these improbable or weird events, are
nonetheless possible. This emerges over time. It's not like you either know it, or you don't. It seems like as children
were getting older, they increasingly realized that improbable events could, nonetheless, still occur.
I really love this study and I think it really perfectly demonstrates the power of cross-sectional designs and what
they're really useful for. Here we have a perfect example where at one point in development, at the age of four,
children were thinking about things a certain way. Now, by getting these little snapshot images of how children at
different stages, all the way up until adulthood, think about these kind of problems, we can see how children's
thinking changes as they develop. This gives us great snapshots that we can compare across different stages of
development.
Cross-sectional designs have their drawbacks too. One of the major drawbacks is that, because all of these four
groups featured different children, it doesn't give us any insight into the stability or patterns of change within
individual children. We don't know exactly what individual differences might promote kids' thinking about one thing
or the other. Maybe within each age group, there's some variability, and some kids are thinking a certain way, but
other kids aren't, we don't really get to look at any of that, because we're just grouping all children of the same age
in together. What it do well, it allows us to map note these changes across broad developmental time. What it
doesn't allow us to do, is to hone in on the individual and to see, "Okay, what is it that's unique about that
individual, that predicts how they develop."
Correlational Designs Transcript
Okay, so now that we've talked about experimental designs, we'll talk about another empirical research design,
which is known as correlational designs. So, in correlational studies, you examine how variables are related to or
associated with one another. Okay? So, this is really, really useful when we're looking at variables that can't be
directly manipulated by the experimenter. This is things like age, someone's, gender, their personality traits, their
socioeconomic status. As an experimenter, we can't assign someone, or randomly assign someone an age, or
randomly assign someone socioeconomic status. These are things that occur in the person, and there's not really
anything we can do as an outside influence to assign someone, one of these characteristics. So the best we can
do to empirically study these topics or these factors is to look at their relationship with other factors.
Importantly, however, there's some caveats we need to consider in how we interpret correlational studies. One is
that we need to be really careful, as some correlations could be totally coincidental. Consider this totally made up
data I'm going to present on the left hand side of the screen. So, let's imagine this first line represents the average
number of diagnoses of autism spectrum disorder each year in Canada, over a 10 year period. The second line,
let's consider this to be the average number of vegetables eaten by a family every day, or for over a 10 year
period. So for both of these two lines, they're both increasing gradually over this period. Now, if you were to
eyeball it, you might say, hmm, it seems like these two are associated with one another, but importantly, we have
absolutely no theoretical reason for thinking that there be a relationship between these two variables, right?
Just because we may perceive there to be an association between the two is not enough to say that necessarily
they're related to one another, right? It could be total coincidence. So, when we are proposing that there is an
association between two variables, it's important that we really reflect on, does this make sense with it, in our
broader understanding of the world. Is this rooted in previous empirical observations or existing theories, right?
Does it make sense for us to infer an association between these two?
Another important consideration is that we cannot determine causality or the direction of associations between
two correlated variables, okay? So, just because two things are associated with one another, we cannot just base
solely on correlational studies, say that one thing caused the other, or that one thing is influencing the other, and
not the other way around.
I'll give you an example of that in just a second, but you've probably heard the mantra before, correlation is not
causation. And this is really something to take to heart. Correlational studies can be really, really important. They
can be really illustrative and can give us a lot of insight into these, the associations between these variables or
these factors that we can't manipulate ourselves, but, we just need to recognize the limitations of this particular
kind of empirical study.
Okay, so let's consider an example. A lot of people are really interested in the relationship between children's
screen time, so how much the time they spend on computers or iPads, cell phones, et cetera, and how that relates
to their behaviors. For example, their hyperactivity or their impulsivity. Some people might argue that screen time
act actually promotes hyperactivity or impulsivity. So let's imagine we saw an association between these two
variables, kids who spent more time on their phones also tended to be more hyperactive or more impulsive.
It could be that maybe screen time is actually directly promoting or causing this hyperactivity. But, on the other
hand, based on the exact same correlational data, it could be just that more impulsive high energy children
naturally gravitate towards screens. There's something about screens they find really compelling. And so this
relationship between more screen time and higher impulsivity, it could be the opposite direction, right? Or, as a
third possibility, it may be that there's some third variable that's actually explaining the relationship between these
altogether, right? Maybe children who have a particular diet, for example, one that's higher in sugar is influencing
both of these things. Maybe that's what's resulting in their hyperactivity and also resulting in their increased screen
time. And it's this third variable that's actually misleading us into thinking that these things are directly related to
one another, right?
Keep in mind, I just made this up for the purpose of this example. There's lots of really cool research around both
these topics, if you find them interesting, I'd encourage you to go take a look. But anyway, the big takeaway here is
that correlational research, really important, can help us understand lots of really cool domains of psychology, but
they're only one piece of the broader research puzzle. And there's the limitations of them we really need to be
aware of, right? So, they're great when we can't directly manipulate the characteristics of interest, right? Things
like age, SES, but we can't employ, we can't determine causality and we can't strongly imply a directionality just
from correlations either, right? That's all they are, is associations between these two, and on their own, they can
only tell us so much. Okay?
Important Terminology in Empirical
Research Transcript
All right now that we've defined empiricism and the scientific method, we can jump into some other important
terminology and empirical research. Okay. So, let's start off by looking at our measures. An operational definition is
exactly what is meant by each variable in the context of a study. This really governs how you're going to turn these
abstract ideas or concepts that you're interested in into something more concrete and measurable. So, let's
imagine we're interested in studying shyness. Shyness is an aspect of someone's personality. It's not really
something you can put your finger on or measure on its own. So, the way we might get at, what is shyness is
perhaps by, giving someone a questionnaire and looking at their score on a variety of items that are assessing
their experience of shyness in day to day life, right? So, by turning this abstract aspect of personality and
operationally defining it as their score in response to these questions, now we have something palpable that we
can measure.
Another example might be someone's reading ability. You might give them a comprehension test where you ask
them to read a passage or a couple passages, and then answer questions on those, right? And you'd say, okay,
we will define how good someone is at reading based on how well they can answer these comprehension
questions. Or maybe you're looking at something like, someone's memory. You might give them a long string of
numbers and only give them a very short period to look through them and then say, okay, what's the longest string
of numbers that you could recall after this short presentation, right? All right. So, the key idea here is we're taking
something abstract, like an aspect of personality or an aspect of someone's cognition, and we're boiling it down
into something that's actually measurable. Something that we can reliably and objectively measure. And we'll use
this to address our research questions.
Okay. So, now that measure are out of the way, let's talk about what actually we mean when we're talking about an
experiment. So, an experiment is a situation in which the investigator holds all things constant, except for one
aspect, which is strategically varied across conditions. The thing that we actually change or manipulate across
these conditions is what we'll call, an independent variable. Okay? So all things are going to be held entirely
consistent in this experiment with the exception of one thing, which we're going to change. An example might be
you have two groups, your independent variable is the kind of music that is being played to the people in each of
these groups. Maybe in one group, they're listening to classical music and the other they're listening to rock, right?
So the key here is that literally everything else about the environment of these two groups is identical.
The only thing that we're changing is the music being played to them. And in this way, any differences that we
observe between these two groups, we can totally chalk up to that thing we manipulated. The dependent variable
on the other hand, is whatever you're measuring as your outcome. This is like maybe the measures that we
identified in the previous slide. So, for example, in this study, we might be interested in how the music being
played impacts people's scores on a math test. So we have of people in these two groups, one listening to
classical, the other, listening to rock, and they are studying or completing a math test. Everything else is exactly
the same. That way, whatever differences we see in the two groups' scores on these math tests, we can totally
chalk up to the kind of music that was being played for them.
Then you might be thinking, wait a minute, there's one other thing that could be influencing the results here. That's
the fact that some people are better at math than others, right? And this could be influencing our results. Maybe
there are just more people in the piano group who are better at math than people in the rock group. And that's
going to throw off the results of our study. Luckily, there's a solution to this problem. And this is called, random
assignment. The central idea behind random assignment is that when you have your full sample and you're sorting
people into the two groups you need to do so entirely at random, right? So half of the people will go in one group in
our classical music group and the other half will go into the rock music group. This is totally done at random using
like a random number generator or something of that sort.
There's absolutely no decision making, putting people into either group based on any particular characteristic,
right? It is totally at random. The key idea here is that because we are randomly sorting people into these two
groups, all traits, for example, math skill in this case, should be equally distributed across the two conditions. So if
we turned out that we have three people in our classical condition, that are excellent at math, chances are there
would likely be just as many people who are really good at math in the rock music condition as well. So random
assignment controls for this variability or this individual variability in math skill. Okay. So, obviously random
assignment is central to the underlying logic of experimental design, but it doesn't always work out like that in the
real world. So, it's important to consider what can go wrong or what we need to do to ensure that random
assignment does its job.
One is that we need to make sure we have enough people going into both of our groups. With an insufficient
sample size, it's more likely that you're going to have a couple math whizzes in one group versus the other. That's
totally going to throw off your results, right? You need to make sure that there's enough people that you're sorting
into these two groups to ensure that everything averages out. The other thing is that it's good just to measure any
variable, you're worried about throwing off your results. What we might call an extraneous variable. So if we were
really concerned that people's intrinsic math skills was going to throw off the results of the math quiz, we gave
them under these specific conditions. Maybe we want to have some other measurement of their intrinsic math skill,
like maybe their grades in a math class, right? That way we can double check.
Hmm. Do we get as many math whizzes in this group, as in the other one? Is the average math skill intrinsic, math
skill of our two groups equal. If we didn't have this measurement of this extrinsic or extraneous variable, we
wouldn't even be able to identify if there was a problem. So if you're designing a study, an experiment and you
identify that there's maybe some intrinsic characteristic of your participants that could throw off your results, it's a
good idea to measure them, to make sure that you're, when you do your random assignment, that it does
adequately sort them equally into these two groups.
Okay.
Experimental Designs Transcript
In this video, we're going to do a deeper dive into experimental designs. We're also going to revisit an example
from the correlational designs video and interpret it again, this time through an experimental lens.
So the important distinction between experimental and correlational designs is that unlike correlational studies,
experimental studies can infer causation. The reason we can do that is that everything is kept constant except for
the variables of interest, right? So the variable that we manipulate is known as the independent variables, right?
This is the one factor that we change between our different treatment groups, and therefore, any differences we
see between those groups, we can only chalk up to that experimental manipulation. We can isolate the causal
factor as being that thing that we changed, that independent variable. The dependent variable, by contrast is, what
we are measuring, right? So this is the change we're looking to document as a function of the manipulation of our
independent variable.
Again, experimental studies really rely on random assignment so that participants are randomly assigned into
either conditions as a means of ruling out whether any individual characteristics could influence our results. In our
previous video on correlational designs, which talked about this potential association between screen time and
impulsivity in children. Now, remember this was a made-up hypothetical situation, but let's imagine we wanted to
explore that exact same question in an experimental design, or use an experimental design rather than a
correlational design.
Here's an example of how we go about doing that. So let's imagine we had our two groups, okay? We split our
sample into two groups. Each group is going to be given... Of children is going to be given a certain amount of
screen time, or time playing on an iPad, okay? So one group is going to be given an iPad to play with for five
minutes. The other group is going to be given an iPad to play with for a period of 30 minutes.
So that's our independent variable out of the way. Now for our dependent variable, we're going to give kids a
choice. After they're done playing with their iPad, they're going to be offered two options. One, they can either
have a candy cane now, or if they're willing just to sit and wait a little bit longer, we'll give them two candy canes
later on, right? So our dependent variable is going to be, which of these two options do children select? And
children in each of our two treatment groups, group one and group two, are both going to have the same choice to
make. Now, if screen time causes an increase in impulsivity, what we would predict is that children in the first
group, who only had five minutes of iPad time, would say, "No, I can wait a little bit," and they would pick the
bigger reward later, while kids in the second group, who only who had 30 minutes of iPad time, would say, "No, no,
no. I'm more impulsive. I want my reward now," and they would pick the only one candy cane right now.
So this would be in line with our prediction that screen time caught causes an increase in impulsivity. So from this,
you should be able to contrast the distinction between experimental and correlational research. While the first
correlational study only allowed us to look at the association between these two variables, the experimental study
allowed us to hone in on the causal relationship between the two. One of those factors was causing the other, and
that's the power of experimental research.
Microgenetic Designs Transcript
Microgenetic design sounds like it's straight out of a biology textbook, but it's actually another design that's used in
some niche psychology studies. So the specialty of microgenetic designs, almost as a counterpoint to longitudinal
designs, which look across very long periods of time, microgenetic designs really try and pull apart the single
processes and try to understand how people think in very particular circumstances.
Microgenetic designs track small scale developments in children's cognitive and behavioral processes. So usually
this means having a couple or several observations of the same child over a short period of time, so whether it's a
couple different sessions throughout the same day or over a couple weeks, and what you're really trying to do in
these microgenetic studies is you're trying to observe how a child's cognition or behavior is changing with
prolonged experience in a particular context.
So here's an example. Imagine you're trying to observe how a child learns to solve a complex puzzle over the
course of several sessions. You're really trying to identify what was the child doing right before their eureka
moment when they suddenly could figure out how to solve it. So what microgenetic designs offer that other
designs in developmental psychology don't, is a really detailed look at how specific processes develop, so much
more fine grain than the time scale that other designs would allow you to look at.
If you were really interested in what specific patterns of behavior led to a child having a new kind of insight or
having a cognitive breakthrough, that's when a microgenetic design would be really, really useful. So this is a very
specific function. This would be really useful for someone who is interested in learning processes, or maybe in
certain applications in clinical psychology.
That being said, though, it's not that common in most other areas of developmental psychology, and it's not
something you'll encounter very often in modern developmental research. For most of the research questions that
we're going to be asking, you'll encounter cross-sectional research, occasionally short-term longitudinal or longterm longitudinal occasionally. Those are the real bread and butter of modern psychology. Microgenetic designs
are definitely super cool, but not something you're going to encounter all the time.
Interviews and Questionnaires Transcript
There are lots of different contexts and methods by which we can acquire data from children in developmental
studies. So in this first video, we'll start talking about interviews and questionnaires. So we'll start off by looking at
structured interviews. So structured interview occurs when the experimenter asks a participant a series of
predetermined questions and records their responses. So an example might be experimenter asking "If you had to
choose between these two options, which would you pick?"
Then after the child provided their like dichotomous choice of which one they would prefer, the experimenter might
in a very predetermined fashion follow up with the question, "Why?" The responses the child gives to this
structured interview would be recorded at the time of data collection and then later on would be analyzed and
coded. So when we're talking about coding, what we mean is that their answers are going to be interpreted and
categorized by a trained coder. So coders will often work in teams to ensure that there's a lot of inter rater
reliability, which you'll recall all means that different people who are interpreting the same answers from the child
would categorize them in very similar ways.
Oftentimes, they'll also convert different aspects of children's responses or their behavior into scales, so following
predetermined coding criteria. So there might be certain ways of interpreting child's responses being either very
negative or very positive and coders in an effort to make things more objective might turn their response from like
a transcript or a sentence into just a numerical code of this answer was very positive or very negative or
somewhere in between, etc. So let's look at a couple examples of studies that you use structured interviews of
varying complexity.
So here's a study by Golding Accounts and Friedman from the year 2019. So in this study, they were really
interested in whether children's beliefs about their preferences when they later became adults will be different from
their preferences than as children. So do children think that the things they like now they're going to like forever. In
this study, they asked very straightforward questions. They would give children the choice between two options. So
here a sippy cup and a coffee cup, and they would say, "Which one will you like when you grow up?"
A very straightforward a question with these stimuli accompanying it to which children could just give a simple
answer in the form of either a pointing finger or by naming the object. So they would say, "I think when I grow up, I
would like a sippy cup," for example. So this very straightforward. Coding children's responses here is very easy
because it's just an answer to a dichotomous or two option choice. All right. So now let's look at a slightly more
involved example. This is a study by Slaughter and Griffiths from 2007 in which they used a structured interview to
examine children's thinking about death.
So here's an example in this table of some of the questions and the scoring criteria they used in their study. The
first column on the far left talks about some of the sub components that they were interested in. The middle
column looks at some of the interview questions they used and the right most column looks at their scoring criteria.
So if we look at some of their interview questions, they ask questions like tell me some things that die. If children
did not name people as some things that die, there was a planned follow up where the experiments asked, "Do
people die?"
So the important thing here in is that while there was a contingency in place where if they didn't mention people in
their answer, the experimenter was planned to follow up. This wasn't just a random diversion that the experimenter
chose to follow up on. In a structured interview, everything needs to be planned out and you need to follow the
exact same structure with all participants in your study. You can't just shoot from the hip. Now, when it came to
actually scoring children's responses to these, there were like numerical values given to their responses.
So for example, to that first question, tell me some things that don't die. If the kid did not mention that people died
and when asked if people died, they said, "No, people don't die," they were given a score or of zero. By contrast, if
children would volunteer that "Yes, people die," they would actually be given a score of two. So in this way, tallying
their scores across all the items would probably give you a total score that would say how much does this child
know about the nature of death. So this is an example of a structured interview that can be used to build a really
interesting measure of children's comprehension of death.
So an alternative to a structured interview that's similar in its content, but a little different in how it's administered is
a questionnaire. So questionnaires give participants a pen and paper and allow them to read through items and to
respond using multiple choice or scales, etc. This is great if your participants can read, because it's a fast and
efficient alternative to doing more lengthy drawn out interviews. So for example, if you wanted to collect data from
an entire school or an entire grade of children, giving out questionnaires to everybody would be much more
efficient than interviewing them all one on one.
So here's an example of a questionnaire that yeah, you might use in child research. So this is the screen for child
anxiety related disorders, AKA the SCARED. It was developed by Birmaher and Associates in 1999. So basically
it's a questionnaire. You can see it on the left hand side of the screen. You have a series of questions. This is the
first page. There will be a lot more with questions like when I feel frightened, it is hard for me to breathe or I am
nervous or people tell me that I look nervous. To any of these items, participants could circle the answer that best
applies to them.
So they could say not true or hardly ever true, which would give them a score of zero, somewhat true or
sometimes true, which would give them a score of one, or very true or very often true, which give them a score of
two. So basically at the end when the participant has completed this questionnaire, they could hand it in, a
researcher could very quickly tell you their score and you'd have a total score for their anxiety with a higher score
meaning more anxious-like symptoms. So very fast, very efficient.
But again, this is all with the assumption that the participants in your study, the children are able to read through
the item by themselves and answer honestly without any kind of confused or question about whether or not they'd
be able to complete the survey accurately. So another type of interview we'll talk about is called the clinical
interview. So in a clinical interview, questions can branch off from the pre-planned questions to follow up on the
answers that the interviewee provides. So imagine the child is responding and says something that the interviewer
based on their training thinks they want a little bit more insight into, or they would like a little bit more of an
explanation.
They could follow their intuition and ask a different follow up question. So this is very unlike the structured
interview in which the structure of the interview is set from the beginning. So this is really useful for like clinicians
or like child psychologists for obtaining in depth info about a particular child. So if they were trying to diagnose a
clinical disorder or a learning stability, this is the kind of follow up question that they would likely ask. However, for
larger scale psychological studies, when we're not trying to learn about an individual child, but rather we're trying
to learn about children in general and generalize to a broader population, this type of interview is not really what
you would use.
The reason being that in order to ensure you have high internal validity in your studies, that you can actually
generalize your findings to the real world, all the participants in your study need to be presented with the exact
same questions and prompts. If you don't present the exact same questions and prompts like you would in a
clinical or in a structured interview, you wouldn't be able to generalize your findings because every child's interview
would be different. So now you should have a clear understanding of the distinction between structured interviews,
questionnaires and clinical interviews and in what kind of context you might use each of them.
Longitudinal Designs Transcript
So longitudinal designs are great because they give us snapshot glimpses of different stages of development. So
we can see developmental trends over time, or we can capture when there's suddenly a eureka moment, where
suddenly children at a certain age grasp a new concept that they wouldn't have gotten earlier. Right? So we can
really map out these protracted changes across development. But one thing cross-sectional designs don't do well
is give us insight to the individual. And this is where longitudinal designs really shine.
So longitudinal designs look at changes within the same children over significant periods of development. This
could be months, years, decades, et cetera. Right? So what question you could ask with the longitudinal design
might be, within a single child, how does their self-esteem change across adolescents? Right? So in this graph I've
made over here, again, this is all invented data. We have self-esteem on the Y axis. So higher points on this graph
mean greater self-esteem. And on the X axis along the bottom, we have age in years going from age 11 all the
way up to age 18.
So as we can see with this first line, starting at age 11, it's relatively high on the scale. It increases gradually up
until around about the age of 15, at which point, which we see a big plummet in self-esteem all the way down,
continuing to go all the way down to around the age of 17, and then stabilizing somewhat or starting to rebound.
Right? So this could be the pattern that emerged in one particular child. Now, if we looked at a different child, we
might see a different trajectory altogether, right? So this child started off slightly lower at age 11, but their selfesteem seemed to maintain relatively similar levels across adolescence, and in fact, increasing ever so slightly. So
whereas the first child plummeted at age 15, this child did not. Right?
If we were just looking at this from a cross-sectional perspective, both of these children would've been grouped in
together at each of these developmental stages, right? First of all, we would've only looked at them once, but they
also would've been averaged out. Right? So all of these differences in their unique developmental trajectories
would've been lost in a cross-sectional design, but not in a longitudinal design. And so when we see these different
trajectories, we can ask followup questions like, "Okay. What individual factors are actually impacting these
different trajectories?" And it's really only by looking at these changes within the individual and these individual
factors can we really answer questions like this. You wouldn't be able to a answer question about these individual
factors using a cross-sectional design.
That being said, there are also some problems, or not problems, but some difficulties that come with longitudinal
designs. So as rewarding as they are, they're very difficult to pull off for a couple of reasons. One is it's a really big
investment. So it costs a lot of time and a lot of money, right? You need to be able to facilitate people coming back
to participate in the study over long periods of time. But of course, people move and people will fall out of the
study. So you need to find ways of really incentivizing them to stay in the study, to facilitate them coming to
participate in the study, even if that means buying them a bus ticket or buying them a flight. So basically, you need
a lot of money and a lot of planning to pull off a longitudinal study.
I just alluded to this. Many participants are likely to drop out, right? So over the example I gave here was an eight
year period. A lot of your participants are likely to drop out, right? Maybe they move away and they don't want to
participate anymore, or maybe they had a bad experience, or maybe they've become shyer. There's any number
of reasons why someone might not want to participate in the study over time. But for that reason, you tend to lose
people over time, which makes them a high risk, but high reward endeavor. So really, oftentimes, cross-sectional
designs are just a lot more practical because you can get a bunch of people at single time points. They only have
to come in and be studied a single time, making it a lot more practical to answer most, but not all developmental
research questions.
When people think about developmental psychology, I think they usually think about these longitudinal studies. But
in reality, these are quite rare. You will not see near as many longitudinal studies due to these difficulties we just
discussed as you would see cross-sectional studies. What really makes them special is their ability to look within
the individual and to see how individual differences in people influence their developmental trajectory across time.
That's the real key question that a longitudinal design can answer that other designs just can't.
Reliability Transcript
For a study to provide worthwhile and meaningful results, we need to have good measures, right? And so in this
video we're going to be talking about the reliability of the measures we use in psych studies.
When we talk about reliability, we're really talking about the consistency of a measure. We're asking, does our
measure produce similar results when we expect that it should? If not, we got a problem. The first form of reliability
we'll talk about is called test-retest reliability. If a trait being measured is supposed to be relatively stable across
time, our measurements of that trait should be relatively stable across time too. The thing we're trying to measure
isn't supposed to be changing from day-to-day. Maybe it's thought to be a permanent aspect of someone's
cognition or personality, or maybe it's something that changes over a much slower and longer time scale.
If we're getting day-to-day changes, we got a problem, right? If the thing you're supposed to be measuring is
supposed to be relatively stable across time but your measurements aren't relatively stable across time, you have
a problem with test-retest reliability. Here's an example. Imagine we're developing a new test of children's abstract
reasoning. Okay? On the first day, Timmy completes the task and is told that he's highly gifted. Like, wow. He
scored really, really well. That's awesome. Timmy gets the exact same test four days later only this time his results
indicate that he's only about average. Given that his abstract reasoning is conceived to be a relatively stable trait
about his cognition, it couldn't have changed internally in about four days of time, right? So we know that
something's up with our measure. If our measure is not getting consistent results for this relatively stable trait, we
have a problem.
Another kind of reliability is known as inter-rater reliability. In this case you have two different experimenters who
are both using the exact same measure and they're scoring the same participant so you'd expect them to get
similar results. The question is, if two different experimenters are scoring a measure, do they get similar results?
Let's generate a hypothetical example. Let's imagine the University of Waterloo developed this new scale for
evaluating how cute a baby is. The Waterloo Baby Cuteness Scale. We have two different experimenters who are
using this scale to evaluate the cuteness of a baby. Experimenter one says, this is the cutest baby I've ever seen,
15 out of 15, very cute. Experimenter two on the other hand using the exact same scale, came up with the finding
that, eh, it's a pretty standard baby, only giving it an 8 out of 15. So these are two people who are supposed to be
using the same objective measure of cuteness and they have dramatically different results. This is an example of
poor inter-rater reliability.
If our goal is to have an objective measure of cuteness, then our measure is currently not reliable. Okay, so let's try
and improve on this. Imagine we have a different inventory now. This is the Wilson Cuteness Inventory. This one's
broken down a little bit more specifically into subscales so it has specific criteria, so cheek size. Both of our raters
would say, okay, I've been trained on how to evaluate cheek size and this is a 4 out of 5. Head roundness, okay,
this is 4 out of 5. Nose stubbiness, very stubby, 5 out of 5. Total score, 13 out of 15. Following these very strict
criteria, both of our evaluators landed on the exact same score and this would indicate for us that now this new
scale, the WCI, is much more reliable than our previous scale because two people, two independent coders are
able to land on the exact same results for the same person. So that's inter-rater reliability.
One important thing to remember is that some things that would be measuring do vary naturally, right? For
example, mood. Your mood on Monday could be totally different from your mood on Tuesday, which could be
totally different from your mood on Wednesday, et cetera. If we had a mood scale that varied dramatically across
these three days, that's totally fine because we also expected the underlying construct that we're measuring,
mood, to vary as well. The important thing about reliability in this case is that your measurements should only vary
as much as the thing you're measuring varies. That's the key point to take away here.
Observational Research Transcript
Okay. Now that we've talked a little bit about interviews and questionnaires, let's take a little look at observational
research. There are two main kinds of observational studies that we're going to look at. The first is, naturalistic
observations. This occurs when you examine children in their usual environments, you might be examining how
children and their parents interact at home, or how children interact at school, in the classroom, or on the
playground. What the researcher is doing in this context, is that they're not in control of what the child does at all.
They've chosen their context in which they're going to observe the child's behavior, and they just code whatever
naturally elicited behaviors occur. For example, they might code or enumerate how often a child engages with a
peer, how much time they spend engaged in a particular type of play, for example, in the classroom, how much of
their time do they spend playing with blocks?
They might code how often or how much time is spent smiling, or laughing, or expressing happiness, as opposed
to neutral emotion, or a sad, or angry emotion. Think back for a second to our discussion about validity. When it
comes to external validity, you can't really get better than naturalistic observations. This is a... People out in the
real world are interacting in very natural environments. And so, this is about as valid as it gets when it comes to
extending the findings from our research to real world context. However, what you sacrifice there is the ability to
control what's going on. So, if we were studying children on a playground, we have no means of controlling how
many children are there. What they're doing on the playground, who may enter the scene or leave the scene, what
people might say to each other. We have zero control at all. It's always a trade off like that.
Now, you can recoup some of that control in what is known as a structured observation. A structured observation
would involve each child being placed in the exact same controlled situation, while their behavior is recorded. Each
child will get the exact same environment, the exact same instructions, and the exact same proceeding
interactions. Imagine you took two children, you brought them into an observation room and you said, "Okay, I
need the two of you to complete this Lego model. It's a really complex Lego model. I'm going to give you five
minutes, have at it." And then you left the room, you had a camera recording, and you observe their behaviors as
they interacted with each other to try and complete the Lego model.
By tightly controlling all these environmental factors, the exact same room, the exact same number of children, the
exact same instructions, researchers can then identify what specific factors are influencing children's behavior. For
example, you could vary some of the different task characteristics. The different roles in the game. You could bring
two children in to do this Lego model. One of them could be the planner and the other one could be the builder.
And you could observe how these different roles influence the communications they engage in, in each of these
roles. You could vary different environmental factors. Maybe in one study you look at two children per room,
whereas in a different study, you look at four children per room, and you see how the dynamic changes, or how
their observed behavior changes, as a function of how many kids are in the room.
Or, you could even look at individual differences. You could purposefully seek out children with certain personality
characteristics, such as higher, low shy, children. And you could pair them in groups as a function of that, and
observe how their communications differ as a function of these individual differences. The key here is that, it's still
an observation, so you're still leaving children to their own devices and observing their behavior, but rather than
being out in the wild, you're in a much more predictable environment, and you can control for external influence to
an extent. What the children do when they're in the room is still up to them, but you've provided them a closed
system, in which to interact.
Validity Transcript
Okay. So we've talked about reliability, which is how consistent a measure is. Now we're going to look at validity,
it's counterpart.
So validity focuses on the accuracy of a measure. Or it asks, are we actually measuring what we meant to
measure? A measure is only as good as it's telling us what we need it to tell us. So questioning whether or not our
measure is accurate, or whether it's actually getting at the kind of conceptual content that we want to be
measuring, is really, really important.
There's two components, or two aspects, of validity that we're going to talk about. And in any study, you're really
trying to balance these two aspects, because sometimes what will increase one might decrease the other. So it's
always a balancing act.
First there's internal validity. Internal validity asks, are we sure that the effects in our study are caused by the
manipulations we made? Or could other factors be influencing the outcome?
By contrast, external validity asks, are we sure that the finding of our study will generalize to other groups, for
example, the children from another culture, or to real world situations outside of the lab? For example, in real world
social interactions.
So the thing about designing a study with internal and external validity in mind, is that oftentimes you have to make
trade offs when making decisions about how you're going to design your study. So the decisions that might
optimize your internal validity, give you lots of control and the power to really clearly see what's influencing what,
might at the same time kind of reduce how much your experimental situation really resembles the real world.
By contrast, when you design a study that really closely resembles the real world, or maybe even takes place in a
real world context, you end up losing the power to tightly manipulate and control every single variable that's going
on in the environment. So we're going to look at an example now of two different studies and how they exemplify
this trade off between internal and external validity.
Let's imagine we were really interested in examining the research question, are shy children more sociable in the
presence or absence of an authority figure? Okay? So here's one approach to designing a study to investigate this
question. Let's imagine we put kids in, they brought them into the lab, and we gave them a task, in which they
were supposed interact with another child over the computer, right? So secretly, this child they're interacting with is
all pre-programmed by the computer to make sure that every kid who participates in this study has the exact same
interaction, right? In one condition, you leave the kid to have this interaction with this unknown peer. In the second
condition, it's the exact same setup, but in this time you have a teacher or an authority figure who is monitoring the
situation on the computer. So this is again going to be a very tightly controlled, like popup window, where it says
"The authority figure is watching what you're doing right now." Right?
In this case, what you'd be looking at is whether the child's social behavior, the kind of messages they type to this
peer they're interacting with, would be more social in condition one, when there is no authority figure, or in
condition two, when there is an authority figure. And whether that varies as a function of the kid's shyness.
So the positive aspect of this study is that it has really high internal validity. So we've got a really tightly controlled
environment here, right? So we have every kid in the exact same situation. Their peer is going to have the exact
same communications. In the authority figure condition, that authority figure is going to have the exact same
communication. So the only difference between these two conditions is the presence of the authority figure.
Everything else is very, very highly controlled.
The weird thing about this study though, is going to be that, if what we're really trying to learn about is how shy
kids socialize with other kids, this doesn't really look a whole lot like normal socialization context, right? This is very
contrived and it doesn't really reflect the kind of socialization that they might do on the average day-to day basis,
right? So while we have a tightly controlled situation, what we don't have is something that might necessarily
extend to all circumstances.
Okay. Let's take a slightly different approach. Now let's consider actually going out into the world and observing
kids' behavior in a playground context, or in a schoolyard context. So we go out and we start observing children as
they're interacting with each other in the school yard. And we make note of two different circumstances. One, how
children are behaving when they're all alone, it's just the kids. Versus two, how the kids are behaving when their
teacher is nearby. Okay?
So we're coding their behavior, and we're making note of whether their teacher is there or not there. The pro of this
design is that we have much higher external validity, right? We are not controlling anything at all. We have not
made up some weird lab context. This is the real world context that we're trying to learn about. We're observing
kids in their real world environment. But the thing we're sacrificing is, we have no control about any other factors,
right? We don't know what games they're playing. We don't know what kind of external things might be going on,
what they might have been doing immediately before they were being observed or after. What kind of interruptions
may occur. We don't know how often their teacher is going to be around, or how their teacher is going to be
interacting with them. So while we have the high external validity of observing them in this very real world natural
context, what we don't have is high internal validity, because we have very little control over what's actually going
on in this environment.
So is one of these studies better than the other? Not really. They're both really important actually. And they really
compliment each other. So really, if you had a theory about how shy kids were interacting with their peers in the
presence of a authority figure, you really would want both of these studies to really work together to inform your
theory, right? The point is that, when you have certain considerations to maximize one kind of validity, you might
actually be making sacrifices about the other. That's the real takeaway point here. So hopefully that's nice and
clear.
Download