IQ Test Issues

Testing 461 3 – 15 – 99
Brief lecture
 Cognitive tests are tests of achievement and aptitude. They are less
expensive and are easy to score.
 Group tests are less expensive and easier to score, and anyone can
administer them. Their drawback is that test quality depends on the
individual administering the test. They cannot discriminate as finely as an
individual test.
 Group tests – There is a “lower ceiling” (a floor) and a “higher ceiling”. We
cannot differentiate among the very low scores or among the very high
scores. So GROUP TESTS ARE GOOD FOR PEOPLE WHO SCORE IN THE MIDDLE.
A person’s IQ is dependent on genetic endowment & past experiences. There is
a complex interaction between the two. An IQ score reflects a general level of
intellectual functioning. IQ tests measure the same underlying trait. Intelligence
tests predict future behavior. Know what the “Jangle Fallacy” is.
Testing 461 3 – 22 – 99
Second lecture for test two
Intelligence is one of the most researched topics in psychology, but its
definition remains elusive. An OPERATIONAL DEFINITION defines a concept in
terms of the way it is measured. “It tests what it tests.” Intelligence tests were
invented to measure intelligence, not to define it. A REAL DEFINITION is one
that seeks to tell us the true nature of the thing being defined. So we ask
experts in the field.
Experts broadly agree that intelligence is the capacity to learn from
experience and the capacity to adapt to one’s environment. That learning and
adaptation are both crucial to intelligence stands out in certain cases where
mentally disabled persons fail to possess one or the other capacity in sufficient
degree.
Theorists and contributors to the concept of intelligence.
 Francis Galton (1869)
 Binet & Simon (1905)
 Charles Spearman (1904, 1923, 1927)
 L. L. Thurstone (1931)
 Wechsler (1939) The aggregate or global capacity of the individual to act
purposefully, to think rationally and to deal effectively with the environment.
 H. Gardner (1983, 1993)
 Raymond Cattell
 R. Sternberg (1985, 1986)
The STANFORD – BINET TEST (4TH EDITION)
Why give IQ tests if genetics set your destiny in stone? So we can identify
those who have impairments and intervene in time to help them.
Binet and Simon introduced the first IQ test in 1905 to identify kids who
would do well in education; those who did not do well would be placed in a
special education program. There were 30 items on the test. They were
representative of what kids were going to be asked. The aim was not
measurement but classification.
It was a brief and practical test. It directly measured what Binet & Simon
regarded as the essential factor in IQ, practical judgment, rather than wasting
time on lower level abilities involving sensory, motor and perceptual elements.
They took a pragmatic view of IQ. Much of the test was weighted toward verbal
skills, to get away from Galton’s tradition.
The major innovation of the 1908 scale was the introduction of the concept of
mental level. The tests had been standardized on about 300 normal children
between the ages of 3 and 13 years of age. This allowed Binet & Simon to order
the tests according to the age level at which they were typically passed.
Stern corrected a flaw in their work: a 5-year-old functioning at
the 2-year-old level is more impaired than a 13-year-old at the 10-year-old
level, even though the age difference is 3 years in each case.
William Stern came up with IQ = (mental age / chronological age) x 100.
We do not use ratio IQ anymore; we use deviation IQ. Deviation IQ indicates the
degree to which an individual deviates from the expected performance for a
person of their age. We are compared to people of our same age.
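The two scoring schemes above can be sketched in a few lines. This is only an illustration: the function names are made up, and the mean of 100 and SD of 15 for the deviation scale are assumptions (different tests use slightly different SDs).

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age over chronological age, times 100."""
    return mental_age * 100 / chronological_age


def deviation_iq(raw_score, age_group_mean, age_group_sd, mean=100, sd=15):
    """Deviation IQ: distance from the same-age average, rescaled.

    mean=100 and sd=15 are assumed scale values, not taken from the notes.
    """
    z = (raw_score - age_group_mean) / age_group_sd
    return mean + sd * z


# The two children from the notes: both are 3 years behind.
younger = ratio_iq(2, 5)    # 5-year-old at the 2-year-old level
older = ratio_iq(10, 13)    # 13-year-old at the 10-year-old level
print(round(younger), round(older))  # 40 77

# Hypothetical raw score one SD above the same-age average.
print(deviation_iq(raw_score=60, age_group_mean=50, age_group_sd=10))  # 115.0
```

The same 3-year lag gives a much lower ratio IQ for the younger child, which is the point of dividing by chronological age.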
In 1916, Lewis Terman and his associates at Stanford revised and translated
the Binet & Simon scales, producing the Stanford – Binet test. Terman was the
one to introduce the abbreviation “IQ”. The test is in its 4th edition. It is one of
two individual IQ tests; the other is the Wechsler test (WPPSI – R).
David Wechsler’s Preschool and Primary Scale of Intelligence – Revised
looks at both verbal and nonverbal scores. There is a Verbal IQ and a
Performance IQ. They combine to make a Full Scale IQ.
 The psychometric view looks at IQ as a multifactorial trait that involves both
general cognitive ability and the sum total of knowledge that someone has
acquired. It captures both aptitude and achievement.
 p. 156 Remember Galton said IQ was hereditary. He did a lot of
psychophysical tests. For him, IQ was determined by sensory ability; his ideas
relate to the “speed of processing” tasks on IQ tests. This is all inherited
according to him. This differs from Binet & Simon’s emphasis on verbal ability.

Spearman and the g factor. 2 Factor Theory of Intelligence
p. 156 Charles Spearman and the g factor. Spearman proposed that IQ
consisted of two kinds of factors: a single general factor, g, and numerous
specific factors, s1, s2, … Spearman also helped invent FACTOR ANALYSIS to
aid his investigation of the nature of IQ.
In Spearman’s view, an examinee’s performance on any homogeneous test of
intellectual ability was determined mainly by two influences: g, the general
factor, and s, a factor specific to that test. He likened g to “energy” or
“power” in the cortex. And s, the specific factor, was thought to have a
physiological substrate localized in the group of neurons serving the particular
kind of mental operation demanded by the test.
Spearman believed that individual differences in g were most directly
reflected in the ability to use three principles of cognition: apprehension of
experience, eduction of relations and eduction of correlates. “Eduction”
refers to the process of figuring things out.
He reasoned that some tests were loaded with the g factor and some
concentrated more on the s factors. Two tests that are both heavily loaded
with g should correlate strongly. In contrast, tests not saturated with g
should show minimal correlation with one another. This is referred to as the
2-factor theory of intelligence.
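A small simulation can illustrate why g-saturated tests correlate and g-poor tests do not. This is a sketch under assumptions that are not in the lecture: each score is modeled as loading * g + sqrt(1 - loading^2) * s, with g and s drawn as independent standard normal values, and the loadings 0.9 and 0.2 are made up.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate_test(g_scores, g_loading, rng):
    """One test: shared g weighted by its loading, plus an s unique to this test."""
    s_weight = math.sqrt(1 - g_loading ** 2)
    return [g_loading * g + s_weight * rng.gauss(0, 1) for g in g_scores]


rng = random.Random(0)
g = [rng.gauss(0, 1) for _ in range(5000)]  # one g value per simulated examinee

# Two tests saturated with g (hypothetical loading 0.9).
r_high = pearson_r(simulate_test(g, 0.9, rng), simulate_test(g, 0.9, rng))
# Two tests that depend mostly on their own s factors (hypothetical loading 0.2).
r_low = pearson_r(simulate_test(g, 0.2, rng), simulate_test(g, 0.2, rng))

print(f"two g-saturated tests: r = {r_high:.2f}")
print(f"two tests low on g:    r = {r_low:.2f}")
```

Under this model the expected correlation between two tests is the product of their g loadings, so the first pair should land near 0.81 and the second near 0.04.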
THURSTONE and the Primary Mental Abilities. p.158
He didn’t agree with Spearman’s g factor. He said there are several broad
group factors that best explain the empirical results from tests, so he came up
with another factor theory. Of the 12 factors he proposed, 7 have been
corroborated. They have been designated as primary mental abilities.
They are verbal comprehension, word fluency, number (math), space (visualizing
a 3 dimensional object in your head), associative memory, perceptual speed
and inductive reasoning (finding a rule, as in a number sequence completion test).
Thurstone viewed intelligence as the ability to be flexible and to modify
behavior in a meaningful way.
Raymond Cattell and the Fluid/Crystallized Distinction
Cattell said that g could be subdivided further. Cattell proposed an influential
theory of the structure of intelligence that has been revised and extended by
Horn (1968, 1985). Instead of finding a single general factor or a half dozen
group factors, Cattell and Horn identified 2 major factors, which they labeled
fluid intelligence, gf, and crystallized intelligence, gc.
Fluid intelligence is a largely nonverbal and relatively culture – reduced
form of mental efficiency. It is related to a person’s inherent capacity to learn
and solve problems. So it is used during adaptation to new situations. By
contrast, crystallized intelligence represents what one has already learned
through the investment of fluid intelligence in cultural settings.
Crystallized intelligence is highly culturally/experience dependent and is
used for tasks which require a learned or habitual response. Since crystallized
intelligence arises when fluid intelligence is applied to cultural products, we
would expect these two kinds of intelligence to be correlated.
GARDNER and the Theory of Multiple Intelligences
Howard Gardner had a much broader theory of intelligence. He did his work
in the 1980’s. He proposed a theory of multiple intelligences based loosely upon
the study of brain – behavior relationships. There are several relatively
independent human intelligences. He says six natural intelligences have been
confirmed: linguistic, musical, logical/math, spatial, bodily/kinesthetic and
personal intelligence. So someone like a ballet dancer has intelligence. For
Gardner, intelligence is the ability to solve problems or create products that
are valued within one or more cultural settings.
Sternberg and the Triarchic Theory of Intelligence p.168
In addition to proposing that certain mental mechanisms are required for
intelligent behavior, Sternberg emphasizes that intelligence involves adaptation
to the real – world environment. So he valued “practical intelligence”.
Sternberg’s theory is called Triarchic because it deals with three aspects of
intelligence. He said IQ tests look only at memory, not at the fit between the
person and the culture/environment. Truly intelligent people adapt themselves
to the environment or change the environment to fit themselves. Sternberg
(1986) has made it clear that intelligence has too many components to be
measured by any single test.
1. Componential intelligence – Consists of the internal mechanisms that are
responsible for intelligent behavior.
2. Experiential intelligence – A person with this intelligence is able to deal
effectively with novel tasks.
3. Contextual intelligence – Defined as “mental activity involved in purposive
adaptation to, shaping of and selection of real world environments relevant to
one’s life.” So it is a kind of “niche finding.” We either shape or select the
environment that fits our needs.
Testing Class 4 – 5 – 99
IQ Test Issues
 Validity – Do the tests measure what they are supposed to measure? They
do well at predicting for groups, but not at individual prediction. Better
predictions come from grades, study habits, motivation and home and family
variables.
 Stability of IQ – IQ is a fairly stable trait. If you are really young, age 5 for
example, then one’s IQ can change greatly, but the older you get the more
stable it becomes. By 17 your IQ is set in stone. But individual variation can be
great due to dramatic changes in health and living conditions.
 Correlates of IQ – Family size and birth order. The oldest/first born tends
to score highest when the family is small. Socioeconomic status: the richer
the family, the higher the IQ, because there are more resources for learning,
compared with a poor working class family who do not have the time and
resources to devote to their children’s learning. Poor kids score 10 – 20 points
lower than middle class kids. It is not that the poor are less intelligent, but
the opportunity to learn is not there, and the helplessness of living in poor
neighborhoods doesn’t motivate them to do well in school.
 Ethnicity – On average, Asian Americans score 10 – 20 points higher than
whites. Blacks/Hispanics/Native Americans score 10 – 15 points lower than
whites on average. But 15% – 25% of blacks score above the white average.
Asians in Asia score higher than anyone else.
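One reading of the 15% – 25% figure above is that it follows from the size of the average gap alone, because the two score distributions overlap. A minimal sketch, assuming both groups' scores are normally distributed with the same SD of 15 (an assumption, not something stated in lecture); the function name is made up:

```python
from statistics import NormalDist


def share_above_reference_mean(gap, sd=15.0, reference_mean=100.0):
    """Share of a group averaging `gap` points lower that still scores above reference_mean."""
    lower_group = NormalDist(mu=reference_mean - gap, sigma=sd)
    return 1 - lower_group.cdf(reference_mean)


for gap in (10, 15):
    print(gap, round(share_above_reference_mean(gap), 2))
# 10 0.25
# 15 0.16
```

So a 10 – 15 point difference in averages still leaves roughly a sixth to a quarter of the lower-scoring group above the other group's average, which is why group averages say little about any one individual.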
TEST BIAS Possible explanations for group differences in IQ.
Test bias Hypothesis – Middle class whites were made the norm so of
course the test will reflect their way of life and not the life of a poor kid. So these
white kids have more experience with the test already because they “lived it” in
a sense. They were the target for the assessment.
The definition of test bias is: Differential Validity for different sub – populations.
Fairness, by contrast, is the appropriateness of the actions taken based on those
results.
However, some culture fair tests show some of the same biases among races.
The differences may be due to MOTIVATION, which is culturally influenced.
 The real issue is BIAS vs. FAIRNESS. A test is biased if it is differentially
valid for different groups.
 Fairness is the appropriateness of action based on those results.
 2nd hypothesis – The genetic hypothesis was proposed by Arthur Jensen. No
great research to prove this. He said the differences are in abstract
reasoning.
 Last Hypothesis – Environmental hypothesis, growing up impoverished with
poor nutrition, poor education and fewer resources to learn. They may
become less motivated to do well in school. They won’t see an education as
a ticket out of their destructive lives. If there is no hope then “why try”
becomes their attitude.
 Support for the environmental hypothesis comes from adoption studies. Poor
blacks raised in rich white families scored higher on IQ tests than blacks
raised in their poor neighborhoods.
 The BELL CURVE book says IQ is inherited, so a task force was formed by
the APA to respond to it. As a group, when people reach adolescence their IQs
remain stable. But longitudinal research, which looks at individuals, shows
that individual scores can change by 18 points.
 The higher your IQ, the more you can pick out a particular stimulus.
 Nerve conduction – People who are intelligent have higher nerve conduction
rates.
 Heritability is .45 for children and .75 for adults.
The Shipley test we took was used as a general full scale IQ test. It was first
used to detect dementia. It uses T – corrected scores. Good concurrent
validity of .6 to .9.
Culture reduced tests try to minimize the effects of culture. Some cultures
place value on speedy performance, language and familiarity in IQ tests.
3 factors:
1. Minimize language on culture free tests by using non – verbal test items.
2. Second factor is SPEED
3. TEST CONTENT – Items on the test are equally familiar or equally
unfamiliar.
 Culture fair IQ test. Non – verbal group IQ test. CATTELL divided Spearman’s g
factor into crystallized (culture) and fluid intelligence.
 Cattell – create a test of pure “FLUID” intelligence. It has to be
accomplished.
The Stanford – Binet IQ test was the standard IQ test for a long time. Then in
1939 David Wechsler came along and created his “Wechsler Intelligence
Scales.” Wechsler said the Stanford – Binet concentrated too much on verbal
abilities. The Wechsler test overtook the Stanford – Binet in the 1970’s. The
current adult version, the WAIS – III, came out in 1997.
5% of kids in school have some kind of disability. Achievement tests – the
students have to read them. They are not given orally like an individual test.
 A BASELINE is established when the kid answers the required # of
questions correctly in a row. The assumption is that everything up to that
point is easy for the kid.
 A CEILING is reached when the kid answers a string of questions incorrectly
in a row. The questions have become too hard.
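The baseline and ceiling rules above can be sketched as a search for runs of answers. This is only an illustration: the run length of 3 is a made-up parameter (each real test sets its own rule), and the function names are hypothetical.

```python
def first_run(responses, target, run_length, start=0):
    """Index where the first run of `run_length` consecutive `target` values ends, else None."""
    streak = 0
    for i in range(start, len(responses)):
        streak = streak + 1 if responses[i] == target else 0
        if streak == run_length:
            return i
    return None


def basal_and_ceiling(responses, run_length=3):
    """Return (baseline_index, ceiling_index) for a list of True/False answers.

    Items are assumed to be ordered from easy to hard. The baseline is where
    the first run of correct answers completes; the ceiling is where the first
    run of misses after the baseline completes (testing would stop there).
    """
    basal_end = first_run(responses, True, run_length)
    if basal_end is None:
        return None, None
    ceiling_end = first_run(responses, False, run_length, start=basal_end + 1)
    return basal_end, ceiling_end


# Hypothetical answer record for one child (True = correct).
answers = [True, True, True, True, False, True, False, False, False, True]
print(basal_and_ceiling(answers))  # (2, 8)
```

The tester would not give the last item in this record, since the ceiling was already reached at index 8.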
WISC is for kids.
 Coding subtest – we give them a test booklet and a pencil with no eraser.
Another subtest is Similarities. Ex. how are a wheel and a ball alike? They’re
round, they roll, etc.
 Rearrange pictures to put them in the order of a sequence.
 The Block Design test is the single best performance measure of IQ.
 Single most important subtest is Vocabulary. Verbal skills are emphasized –
just define words.
 Object – Assembly test. A picture puzzle for everyone. Everyone gets the same
presentation. Points are awarded for junctures – they don’t have to solve the
entire puzzle. This is the least reliable of the subtests.
Common sense tests look at practical everyday problems.
4 – 12 – 99/Last lecture before 2nd test
Continue the Stanford – Binet & Wechsler subtests lecture from last week.
Remember the list we came up with one month ago of the constructs
of intelligence. This is the list she typed up for us.
 Creativity – Not really assessed on mainstream tests today
 Adaptation – coding sequence ability & block design test.
 Speed of Processing – All timed tests assess this.
 Logic – Picture story sequence and block design.
 Breadth of knowledge – This is covered in those tests.
 Motivation – Well, just the fact they are taking an IQ test says something
about this. Higher scores would mean higher motivation.
 Interpersonal Skills – Not really assessed either.
So creativity and interpersonal skills are NOT ASSESSED on IQ tests.
Sternberg did a study on his own:
Expert list on IQ is:
1. Verbal ability IQ
2. Problem – solving IQ
3. Practical IQ
Public list of IQ was:
1. Social Competence – This was on the test; public chose this as #1.
2. Verbal ability
From last lecture:
The Stanford – Binet test was very popular. It was the only test used for a
long time; then Wechsler’s test took over in the early 1970’s.
Continue w/S.B. Test
The Stanford – Binet has been criticized for its lack of uniformity. However,
its psychometric properties are good. Test – retest reliability is really good for
the S.B. The Wechsler test became very popular because of its use of both
verbal & non – verbal subtests.
The S.B. is better than the Wechsler if we want to differentiate people in the
lower or higher part of the percentiles. The S.B. can spot finer differences in
scores there. It can identify mentally retarded people better because the
Wechsler test bottoms out in the mid 50’s for IQ regardless of which Wechsler
test is used.
 90 – 110 Average. 50% of the population will fall into this range.
 110 – 120 Above average
 130 and up is Very Superior
Then:
 80 – 90 low average
 70 – 80 Borderline
 70 and below is Mentally Retarded.
 55 – 70 mildly retarded
 20 – 40 moderately retarded
 20 – 25 Profoundly retarded
Developmental Tests – 4 Scales/categories
These scales are used to assess infants and pre – schoolers, who are much more
difficult to assess than school age kids because they cannot talk and don’t keep
their attention on you. And they don’t have any motivation yet. They don’t care
what you are doing. They all have to be tested individually. Mothers are good
informants about their baby’s behavior. These tests have lower reliability and
lower validity than tests for school age kids.
So the purpose of developmental tests is to Identify At Risk Children for
Developmental Delays. So Concurrent Validity is going to be important.
We want to know that our assessment is consistent with what mom is telling
us about their babies, with what pre –school teachers are saying. The goal is to
get early intervention. When we spot a developmentally delayed baby then the
earlier the intervention the better off he will be.
These tests lack predictive validity. They will not predict cognitive ability. We
are looking at sensorimotor abilities only, so they won’t predict cognition.
There is little or no correlation between sensorimotor abilities and cognition.
BAYLEY II Test.
The NORM group was babies from 2 months to 42 months old – Bayley II
scales. We give this test to babies we think are NOT developing normally. This
test will pick out the delayed kids. The manual is very specific and the tester
must know the test cold. You cannot keep looking back at the manual for
instructions. The tester must “engage” the baby with novel toys – bells and
sugar pills. You must first look at the baby’s behavior. Is he teething or not?
The test has a mean of 100 and SD of 15, but it is not an IQ test.
Denver II Developmental Screening Test
Most babies will get this test. The tester doesn’t have to be trained as for the
Bayley test. It is quicker to give. Works for both healthy and “at risk” kids. A
delay is when a kid does not pass an item that 90% of other kids their age
pass. A kid is rated abnormal when he has 2 or more delays in the area being
tested.
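The Denver II decision rule above is simple enough to write out. This is only a sketch of the rule as stated in the notes; the data layout, function names, and example pass rates are all hypothetical.

```python
def is_delay(child_passed, share_of_age_peers_passing, threshold=0.90):
    """A delay: the child fails an item that at least 90% of same-age kids pass."""
    return (not child_passed) and share_of_age_peers_passing >= threshold


def area_is_abnormal(items, min_delays=2):
    """`items` holds (child_passed, share_of_age_peers_passing) pairs for one area."""
    delays = sum(is_delay(passed, share) for passed, share in items)
    return delays >= min_delays


# Hypothetical results for one child in one area.
language_items = [(True, 0.95), (False, 0.92), (False, 0.97), (False, 0.60)]
print(area_is_abnormal(language_items))  # True
```

The last failed item does not count as a delay, because only 60% of same-age kids pass it; the two failures on 90%+ items are what trigger the abnormal rating.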
It is an individually given test. It is just a screening instrument. It rarely
yields false positives – 5% of kids turn out normal when the test said they were
abnormal. But it has a huge problem with false negatives – it doesn’t flag
kids who should have been flagged. 80% turned out to have problems when the
test said they were okay. But when it does “red flag” kids, those kids are likely
to have developmental problems.
Vineland Test
Today we have a law passed by Congress that defines mental retardation as
not only an IQ that falls two standard deviations below the mean but also
an adaptive behavior measure that falls 2 SD’s below the mean. Self
sufficiency counts; some kids who were once classified as mentally retarded
were not, they scored low because of language barriers.
The Vineland test identifies kids who are not self – sufficient. It can be done
over the phone.
Factor Analysis
We divide the “g” factor up into distinct mental components. We use
subtests to identify the “g” factors. Take a Global score then use factor
analysis on a group of items to find “g”. Possible to make multiple aptitude test
batteries. Ex. BAT, Differential Aptitude test, the vocational aptitude test.
GRE Test
GRE is now done on the computer. It is easier to take now, plus the results
come back immediately. It has been around since 1949. But you cannot go back
to change an answer. You can cancel the test before leaving the room. It
doesn’t really predict future GPA when the range is restricted, but it does when
the range is not restricted. You must wait 60 days before taking the test again.
You start out with a moderately difficult question – then if you answer
correctly you get harder questions, and if you answer incorrectly you get
easier questions. Sorta “interactive”. The abstract reasoning part was added in
the 1970’s. Different programs look at different parts of the test. The entire
purpose of the test is to predict graduate performance. GRE scores
are correlated with first year grades.
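The "harder after a right answer, easier after a wrong one" idea can be sketched as a simple staircase. This is an assumption-laden toy: the 1 to 5 difficulty scale, the one-step rule, and the function names are made up, and the notes do not describe how the real test picks or scores items.

```python
def next_difficulty(current, answered_correctly, lowest=1, highest=5):
    """Move one difficulty step up after a correct answer, one step down after a miss."""
    step = 1 if answered_correctly else -1
    return max(lowest, min(highest, current + step))


def run_adaptive_session(answers, start=3):
    """Return the difficulty level presented for each question, given True/False answers."""
    level = start
    presented = []
    for correct in answers:
        presented.append(level)
        level = next_difficulty(level, correct)
    return presented


# Start at moderate difficulty (3), then two right answers and three wrong ones.
print(run_adaptive_session([True, True, False, False, False]))  # [3, 4, 5, 4, 3]
```

Because each question depends on the previous answer, going back to change an answer would not make sense in this kind of test.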
You are not invited to come back to grad school if you have a “C” grade, so
the range is restricted. But when the norm group was taken regardless of
grades and everyone was admitted, there is good validity with the GRE.
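The effect of a restricted range on a validity coefficient can be shown with simulated data. A sketch under assumptions that are not from the lecture: the "true" GRE to GPA correlation of 0.6, the z-score cutoff of 1.0 for admission, and the normal distributions are all made up for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(1)
true_r = 0.6  # hypothetical correlation in an unselected group
gre = [rng.gauss(0, 1) for _ in range(20000)]
gpa = [true_r * x + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1) for x in gre]

r_everyone = pearson_r(gre, gpa)

# Keep only applicants with GRE z-scores above 1.0 (the "admitted" group).
admitted = [(x, y) for x, y in zip(gre, gpa) if x > 1.0]
r_admitted = pearson_r([x for x, _ in admitted], [y for _, y in admitted])

print(f"everyone:      r = {r_everyone:.2f}")
print(f"admitted only: r = {r_admitted:.2f}")
```

The correlation in the admitted group comes out clearly lower than in the full group even though the underlying relationship is the same, which is why validity looks weak when only selected students are studied.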
CREATIVE TEST (not in text)
p.100 in workbook.
J.P. Guilford said creative people come up with new ideas and products which
are novel. This comes from Divergent thinking. Convergent thinking is what we
are trained to do; we get a single correct response to a question.
Paul Torrance came up with a test using 4 factors Guilford found. Divergent
thinking starts with a period of more common, unoriginal thinking; then
later we get more creative. These 4 factors are:
1. Fluency – Total # of ideas
2. Flexibility – Total # of categories those ideas fall into. More categories, more
creative
3. Originality – Novelty of response.
4. Elaboration – How many details were we able to come up with.
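The four factors above can be turned into simple counts. This is a hypothetical scoring sketch, not the actual Torrance scoring procedure: the data layout, the function name, and the idea of judging originality against a fixed list of common answers are all assumptions.

```python
def score_divergent_thinking(responses, common_ideas):
    """Score one examinee's ideas on the four factors.

    responses: list of dicts with keys "idea", "category", and "details".
    common_ideas: set of ideas treated as unoriginal (hypothetical list).
    """
    return {
        "fluency": len(responses),
        "flexibility": len({r["category"] for r in responses}),
        "originality": sum(1 for r in responses if r["idea"] not in common_ideas),
        "elaboration": sum(len(r["details"]) for r in responses),
    }


# Hypothetical answers to "list uses for a brick".
brick_uses = [
    {"idea": "doorstop", "category": "holding things", "details": []},
    {"idea": "paperweight", "category": "holding things", "details": []},
    {"idea": "grind into pigment", "category": "art",
     "details": ["red tint", "mix with oil"]},
]
scores = score_divergent_thinking(brick_uses, common_ideas={"doorstop", "paperweight"})
print(scores)
# {'fluency': 3, 'flexibility': 2, 'originality': 1, 'elaboration': 2}
```

Note how the first two ideas add to fluency but not to flexibility or originality, since they are common and fall in the same category.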
 Bias in Content Validity – Bias in content validity is probably the most
common criticism of those who denounce the use of standardized tests with
minorities. The items ask for information that minority or disadvantaged
children have not had equal opportunity to learn.
 Bias in Predictive or Criterion – Related Validity – A test is considered
biased with respect to predictive validity when the inference drawn from the
test score is not made with the smallest feasible random error or if there is
constant error in an inference or prediction as a function of membership in a
particular group. According to this viewpoint, a test is unbiased if the results
for all relevant subpopulations cluster equally well around a single regression
line. For example, an unbiased SAT will predict the future academic
performance of both blacks and whites with near – identical accuracy.
 Bias in Construct Validity – Since this is such a broad concept, the
definition of bias in construct validity requires a general statement amenable
to research from a variety of viewpoints with a broad range of methods. If a
test is non – biased then comparisons across relevant subpopulations should
reveal a high degree of similarity for the factorial structure of the test & the
rank order of item difficulties within the test.
In general, ability and aptitude tests fare quite well by these criteria.
FAIRNESS – Social Values & Test Fairness
The concept of test fairness incorporates social values and philosophies of test
use.
1. Unqualified Individualism – In the open market American tradition, the ethical
stance of unqualified individualism dictates that, without exception, the best
qualified candidates should be selected for employment, admission or other
privilege.
2. Quotas – The ethical stance of quotas acknowledges that many
bureaucracies and educational institutions owe their very existence to the city
or state in which they function. The city exists at the will of the people so
these institutions may be ethically bound to represent their city by hiring a
proportionate number of each race into their employment.
3. Qualified individualism – is a radical variant of individualism. This is when
one is hired regardless of race and gender and solely upon tested abilities.
With respect to selection ratios, the practical impact of qualified individualism is
therefore midway between quotas and unqualified individualism.
Vandenberg & Vogler concluded that a substantial genetic component to IQ
has been proved by decades of adoption studies, familial research, and twin
projects. The genetic contribution to human IQ is usually measured in terms of
a HERITABILITY INDEX which can vary from 0 to 1.0. The heritability index is
an estimate of how much of the total variance in a given trait is due to genetic
factors. 0 means no genetic factors make a contribution while 1.0 means that
genetic factors are exclusively responsible for the variance in a trait. Of course,
heritability is somewhere between the 2 extremes.
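The heritability index is a ratio of variances, which can be sketched directly. This is a simplified illustration: it assumes total variance is just genetic variance plus environmental variance (no interaction between the two), and the variance numbers and function name are made up.

```python
def heritability_index(genetic_variance, environmental_variance):
    """Share of total trait variance due to genetic factors (0 to 1.0).

    Assumes total variance = genetic + environmental, with no interaction term.
    """
    total = genetic_variance + environmental_variance
    return genetic_variance / total


# Hypothetical variance splits on a scale with total variance 225 (SD of 15).
print(heritability_index(0, 225))            # 0.0
print(heritability_index(225, 0))            # 1.0
print(heritability_index(101.25, 123.75))    # 0.45
```

The index describes variance in a population, so it says how much of the spread in scores goes with genetic differences, not how much of any one person's IQ is genetic.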