Types of assessments

Examining Early Child Development in Low-Income Countries: A Toolkit for the
Assessment of Children in the First Five
Years of Life
Lia Fernald, Ph.D.
Patricia Kariger, Ph.D.
Patrice Engle, Ph.D.
Abbie Raikes, Ph.D.
Acknowledgements
• Inspiration & funding from the World Bank
– Barbara Bruns, Sophie Naudeau, Harold Alderman, Ariel Fitzbein
• External reviewers
– Frances Aboud, McGill University
– Santiago Cueto, Catholic University, Peru
– Ed Frongillo, University of South Carolina
– Jane Kvalsvig, University of Kwa-Zulu Natal, South Africa
– Ann Weber, University of California, Berkeley
– Paul Wassenich, University of California, Berkeley
– Michelle Neuman, The World Bank
– Mary Eming Young, The World Bank
• Collaborators
– Emanuela Galasso, The World Bank
– Lisy Ratsifandrihamanana, Madagascar
– Lourdes Schnaas, Mexican Institute of Perinatology
• Research assistants
– Robin Dean (UC Berkeley), Kallista Bley (UC Berkeley), Melissa Hidrobo (UC
Berkeley), Anna Moore (Cal Poly)
• Photo credits for photographs included in presentation
– Lia Fernald, Emanuela Galasso, Lisy Ratsifandrihamanana, Ann Weber, Tricia Kariger
Today
• Importance of measuring child development
• Domains of development to be measured
• Theoretical decisions in selecting instruments
• Modification, adaptation and standardization of existing tests
• Creation of new tests
• Training and quality control
• Conclusions and recommendations
Introduction: Why measure child development?
>200 million disadvantaged children worldwide
Percentage of disadvantaged children under 5 years old by country in 2004
Grantham-McGregor et al., Lancet (2007)
Ecological model of child development
Adapted representation of Bronfenbrenner’s ecological model of child development (Wortham, 2007)
Conceptual framework
Direct and indirect effects
From Walker et al., Lancet, 2007
• Environmental factors
– Psychosocial risks: harsh disciplinary techniques, maternal depression
– Biological risks: malnutrition and infectious diseases
• Poverty and socio-cultural factors increase likelihood of both
types of risks
Timeline of development
Timing of human brain development, from Grantham-McGregor, et al., 2007
• Early childhood is characterized by developmental spurts
and plateaus
• Skills emerge at different rates and ages
Differential risk and vulnerability
• Children’s development from
0-5 is dependent on quality of
early environments and
relationship with caregiver.
• Young children growing up
in poverty are
disproportionately exposed
to a wide range of risk
factors:
• Poor nutrition
• Less stimulating learning
environments
• Poor sanitation
• Stressful life events
• Exposure to environmental
risks
Poverty and cumulative risk
• Number of risk factors
increases over time.
– Cumulative effect of risk factors
becomes more evident as
children get older
• Higher cumulative levels of
risk are associated with:
– Poorer cognitive development
– Psychological distress and
behavioral problems
– Slower and lower quality
communicative development
Cultural norms and development
• Cultures have a wide range of
values for when and how skills and
abilities develop in children.
• As school becomes more universal,
however, the necessary skills
become more consistent across
cultures.
• Through modification and
adaptation, every effort must be
made to ensure that tests are fair
for all children assessed.
Domains of development
• Domains of development:
– Cognitive
– Language
– Motor
– Executive function/self-regulatory
– Social/emotional
• Domains are overlapping
and mutually influencing
• Every effort should be made
to include multiple domains
when assessing children’s
development
Cognitive skills
• Cognitive skills include: Analytical
skills, mental problem-solving,
memory, and early mathematical
abilities
• Indicators:
– Children near school age: knowledge
of letters and numbers, ability to retain
information in short term memory,
knowledge of key personal information
– Children in school: knowledge of
letters and numbers, reasoning,
problem-solving, memory, and
mathematical abilities
Executive function
• Defined as fluid abilities or processes
that are engaged when a person is
confronted with a novel situation,
problem or stimulus
• Both cognitive and emotional processes are
involved
– Cognitive: remembering arbitrary rules and
other non-emotional aspects of the task
– Emotional: inhibition or delayed gratification
• Indicators:
– Working memory
– Inhibition of behavior or responses as
demanded by the task (e.g. not opening a box
until a bell rings)
– Sustaining attention as required or ability to
switch attention as necessary (e.g. Shifting
focus from the color of a test stimulus to the
shape of the stimulus)
[Figure: example executive function task with “NIGHT” and “DAY” labels]
Language development
• Early indicators (infancy): babbling,
pointing, and gesturing. Use
maternal report during this period.
• Later indicators (preschool years):
production and understanding of
words, ability to tell stories, identify
letters, comfort and familiarity with
books. Can use direct assessment.
• Quality and speed of development
highly dependent on quality of
caregiving environment
Motor skills
• Large motor: acquisition of movements that promote
an individual’s mobility (useful to measure in young
children)
– Contributing factors: brain and neuromuscular maturation,
physical growth, caregiving practices, opportunities to
practice emerging skills
• Fine motor: involves hand-eye coordination and muscle
control (e.g. drawing, holding utensils) (more
relevant for older children)
Socio-emotional development
• First two years: relationships
with caregivers, attachment,
trust, and early strategies for
dealing with negative feelings
• Preschool years: social
competence, behavior
management, social
perception, self-regulatory
abilities
STEP 1: Define purpose of assessment
For example:
1. To plan interventions or services;
2. To monitor programs;
3. To conduct impact evaluations;
4. To investigate the effect of interventions or programs on specific outcomes of interest;
5. To design a curriculum for a particular child; or
6. To diagnose and assess child progress
STEP 2: Determine type of assessment
• Screening: Brief assessment; identifies children likely to have problems based on cutoffs derived in test population. Does not yield continuous scores. Useful for examples 1-4 above.
• Abilities: Detailed assessment of child’s maximum skill level for age. Provides continuous scores that allow comparisons within and across children/groups. Suitable for all examples above.
STEP 3: Determine mode of assessment
• Screening: Direct, Ratings/Reports, Observation
• Abilities: Direct, Ratings/Reports, Observation
STEP 4: Determine which assessment to use (examples below)
• Screening
– Direct: Denver (DDST II)
– Ratings/Reports: Ages and Stages Questionnaires
– Observation: Naturalistic sample or structured sampling
• Abilities
– Direct: Bayley Scales III, Woodcock-Johnson, WPPSI, Stanford-Binet, Kaufman-ABC, Executive function tasks
– Ratings/Reports: MacArthur Communicative Inventories
– Observation: Naturalistic sample or structured sampling (see IEA’s Child Coding System)
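The four steps above can be sketched as a small lookup, which may help when shortlisting instruments for a study. This is only an illustration: the grouping of instruments by type and mode is inferred from the layout of the diagram, and the names EXAMPLES, SUITABLE_PURPOSES and candidate_instruments are hypothetical, not part of the toolkit.

```python
# Illustrative lookup of Steps 2-4: (type, mode) -> example instruments.
# The grouping is an assumption read off the slide diagram.
EXAMPLES = {
    ("screening", "direct"): ["Denver (DDST II)"],
    ("screening", "report"): ["Ages and Stages Questionnaires"],
    ("screening", "observation"): ["Naturalistic sample or structured sampling"],
    ("abilities", "direct"): [
        "Bayley Scales III", "Woodcock-Johnson", "WPPSI",
        "Stanford-Binet", "Kaufman-ABC", "Executive function tasks",
    ],
    ("abilities", "report"): ["MacArthur Communicative Inventories"],
    ("abilities", "observation"): [
        "Naturalistic sample or structured sampling (see IEA's Child Coding System)"
    ],
}

# Step 1 purposes (numbered 1-6 as on the slide) that each type is said to suit.
SUITABLE_PURPOSES = {
    "screening": {1, 2, 3, 4},
    "abilities": {1, 2, 3, 4, 5, 6},
}

MODES = {"direct", "report", "observation"}


def candidate_instruments(purpose, mode):
    """Return {assessment type: example instruments} for a purpose number and mode."""
    if purpose not in range(1, 7):
        raise ValueError("purpose must be a number from 1 to 6")
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    return {
        kind: EXAMPLES[(kind, mode)]
        for kind, purposes in SUITABLE_PURPOSES.items()
        if purpose in purposes
    }


# Purpose 6 (diagnose and assess child progress) is listed only for ability tests.
shortlist = candidate_instruments(6, "report")
```

The lookup only narrows the field; budget, copyright, training capacity and cultural fit still have to be weighed before a test is chosen.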
Key Questions in Selecting Instruments
• What are the goals of the assessment/evaluation?
• What dimensions of child’s development do you expect to be
affected by the intervention?
– What developmental systems are most vulnerable at a given age range?
– What are immediate outcomes and longer term outcomes?
• What are the mechanisms at work?
– What physiologic processes are influenced by iodine/iron/poverty?
• What are key elements of context that must be considered in
selecting the test?
– Urban/rural, level of poverty, parent education.
• At what level will effect be measured?
– Individual? Household? Population (then consider test such as EDI)?
• How will the sample be selected?
– Population sample? Sub-sample?
• What is the analytic plan?
– Are norms relevant and/or available? Will a cut-off score be used?
Step 2: Determine type of assessment
Screening versus ability test
• Screening tests: brief assessments to identify
children who are at risk of having development
problems
– Inexpensive, quick, and relatively easy to administer
– Classify children into categories
• Cutoffs used in one population to classify children should
not be applied to another population!
• Ability tests: longer tests that assess the
maximum skill level for a child at any given age
– Continuous scores that can be used to compare
children’s developmental levels with more precision
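The contrast between a screening classification and a continuous score, and the warning about borrowed cutoffs, can be illustrated with a short sketch. All scores below are made up, and the cutoff rule (reference mean minus two standard deviations) is an assumption chosen for illustration, not a rule stated in the toolkit.

```python
from statistics import mean, stdev


def derive_cutoff(reference_scores, sd_below=2.0):
    """Cutoff = reference mean minus sd_below standard deviations (assumed rule)."""
    return mean(reference_scores) - sd_below * stdev(reference_scores)


def screen(score, cutoff):
    """Screening output: a category, not a continuous score."""
    return "at risk" if score < cutoff else "not at risk"


def z_score(score, reference_scores):
    """Ability-style output: a continuous score relative to a reference group."""
    return (score - mean(reference_scores)) / stdev(reference_scores)


# Hypothetical raw scores from two populations on the same test.
population_a = [38, 40, 42, 44, 46, 48, 50, 52, 54, 56]  # where the test was developed
population_b = [24, 26, 28, 30, 32, 34, 36, 38, 40, 42]  # where the test is now used

child_score = 33
cutoff_a = derive_cutoff(population_a)
cutoff_b = derive_cutoff(population_b)

borrowed = screen(child_score, cutoff_a)  # cutoff taken from population A
local = screen(child_score, cutoff_b)     # cutoff derived in population B
relative_position = z_score(child_score, population_b)
```

In this made-up case a child who is exactly average for population B is flagged as "at risk" under the cutoff borrowed from population A, which is the misclassification the slide warns against.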
Step 3: Determine mode of assessment
Types of assessments: Direct tests
• Pros:
– Data are gathered first hand
– Data can be less biased than
parental reports
– Potentially wider range of
outcomes can be assessed
– Many of the “cons” can be
overcome with careful planning
and preparation
• Cons:
– Young children can be difficult
to test (sleeping, hungry)
– Testers need a lot of training
and oversight
– Accuracy depends on testing
demands and child must be
familiar with parameters (e.g.
best v. worst)
Types of assessments: Parent report
• Pros
– Easy to administer and require
minimal training and instruction
– Often are quick and easy to
complete and to score
– Parents can become involved and
express concerns
– Often correlate well with direct
assessments
– Teachers can be an additional
source of information as children get
older
• Cons
– Parents and teachers may artificially
inflate scores
– Parents may not accurately report
abilities
– Parents and teachers may have
different interpretations of items in
different cultures
Types of assessments: Observation
Types of observation: Naturalistic observation, Sampled observation, Structured situation
• Pros:
– Highly valid
– Measures behavior in an
identified context
– Can provide additional or
confirmatory information for other
types of assessments
• Cons:
– Requires a lot of time and
training
– Need to identify if culturally
appropriate
– Difficult coding since
observational codes and
definitions are not always clearly
defined
Step 4: Determine which assessment to use
Other constraints to consider
• Budget: Tests can be very expensive (e.g. $1000 for
Bayley); administration time is a budget issue, too.
• Copyright issues: Must obtain permission for most tests.
• Time allocated for testing: Direct assessment v. parent report.
• Training: Capacity for administration.
• Test setting: Set-up, lighting, noise, observers
• Capacity of respondent: Education/knowledge of parent
• Language and cultural differences: Words used in testing
materials, approach used for testing (e.g. speedy
response)
• Materials: Must be familiar and/or available (e.g. mirror,
ball)
Ethical risks and responsibilities
• All assessment protocols must
be reviewed and approved by an
ethical review board
• Accuracy and validity are
extremely important especially if
test scores are being used to
identify children “with delays”
• Follow-up (e.g. referrals for at-risk children) should be
mandatory even in the context of
a developing country.
Instruments: Modifying and adapting
• No test is “culture free”
– Construct bias (e.g. test doesn’t
measure “intelligence” the same way in
both cultures)
– Method bias (e.g. procedures are
unfamiliar and differentially affect
responses)
– Item bias (e.g. individual test items do
not translate well)
• Existing tests that are reliable and
valid can be used across different
cultures but they must be modified
and adapted to achieve:
– Linguistic equivalence
– Functional equivalence
– Cultural equivalence
– Metric equivalence (level of difficulty)
Preparatory work for test adaptation
• Involve local professionals to
gather information relating to
linguistic, cultural and technical
details that could be relevant.
– Psychologists, community health
workers
– Early childhood educators
• Produce an accurate
translation
– Translation and back-translation
– Review, comparison, correction
• Pilot translated version to
explore possible areas of
confusion
Steps for test adaptation
• Adapt test content to local
context
– Make as many changes as
necessary while maintaining the
intended “meaning” of the item
– Examples
Example: Modifying Peabody test
Change from dollars
to Ariary
Remove stairs from
bannister
Example: Modifying Peabody test
Modify tractor, smaller
Replace skiing child with
sledding/skating child
Example: Modifying Stanford Binet
• Most materials could be
used as intended
• Description of picture
included automatic washing
machine – changed to
traditional wash board
Example: Modifying Leiter test
Replace car with tractor
Replace ram with pig
Replace flag
Replace straight hair with curly
Replace reindeer with ox
Example: Modifying ASQ
• When in front of a large mirror, does your baby smile or
coo at herself?
Example: Modifying ASQ
• When in front of a large mirror, does your baby reach
out to pat the mirror?
Example: Modifying Motor tests
Steps for test adaptation, cont’d
• Adapt administration
procedures
– Tester (e.g. affect, responsivity,
sensitivity, development of
rapport, willingness to change
environment)
– Test environment (e.g. materials,
table, chair, lighting, sound,
observers, other distractions)
– Test procedures (e.g. accuracy of
parent response, clarity of
instructions)
Example: Modifying ASQ
Will caregivers make accurate assessments of
their children’s development?
– We added 5 demonstration items to
• Provide children a chance to demonstrate behaviors
that may not be easily observed (looking at pictures in
a book; looking in a mirror)
• Act as a validity check of parent responses
Example: Modifying ASQ
Without showing him first, does your child point to the correct
picture when you say, “Show me the kitty” or ask, “Where is the
dog?”
GIVE THE PICTURE TO THE CAREGIVER AND ASK HER TO
SHOW IT TO HER CHILD. SAY TO THE CAREGIVER: “I know
children do not always do what they are asked, but let’s see if he
will do this for us today. Go ahead and ask [CHILD] to show the
kitty, dog, ball or shoes.” INSTRUCT THE CAREGIVER NOT
TO POINT TO ANY PICTURES. YOU CAN ALLOW ABOUT
ONE MINUTE FOR THE CHILD TO DEMONSTRATE THE
BEHAVIOR.
Example: Modifying ASQ
Can we adapt the majority of items across all cultures?
– We added clarifications where items seemed ambiguous
• Does your baby get into a crawling position by getting up on her
hands and knees? [BABY DOES NOT HAVE TO CRAWL, BUT
MUST BE ABLE TO MAINTAIN SELF ON HANDS AND
KNEES.]
• Does your child drink without help from a cup or glass, putting it
down again with little spilling? [CHILD CAN DRINK ALONE
FROM A CUP WITHOUT SPILLING TOO MUCH.]
Requirements for creating a new test
• Involve an inter-disciplinary research team
• Use a representative sample for testing items and test
cohesion
• Conduct a detailed analysis of the instrument’s
psychometric properties
• Develop norms or standards that represent typical
development in the population under study
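One commonly reported psychometric property is internal consistency. The sketch below computes Cronbach's alpha from item-level scores. It is offered as one possible check under stated assumptions: the slides do not say which statistics the authors used, and the pilot data are invented.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-child rows, one score per item."""
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two children")
    if any(len(row) != n_items for row in item_scores):
        raise ValueError("every child must have a score for every item")

    # Sample variance of each item across children.
    item_variances = [variance(column) for column in zip(*item_scores)]
    # Sample variance of each child's total score.
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical pilot data: 6 children, 3 pass/fail items (1 = pass, 0 = fail).
pilot = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
alpha = cronbach_alpha(pilot)
```

Values closer to 1 indicate that the items tend to rise and fall together. A real analysis would use a representative sample and would also examine validity and test-retest reliability.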
Examples of new tests
• Africa
• Kilifi Developmental Inventory: assesses psychomotor development in a resource-limited setting
• Grover-Counter Scale of Cognitive Development: developed in South Africa to assess the level of cognitive functioning of children 3-10 years with impaired verbal skills
• Asia
• Cambodian Development Assessment Test: measures level of cognitive, social, motor, and academic development based on country-specific standards
• Latin America
• Test de Desarrollo Psicomotor: developed in Chile, it evaluates child development in motor function, coordination, and language
• Escala de Evaluación del Desarrollo Psicomotor: screening measure of language, social, coordination, and gross motor skills. Norms and cutoffs developed for Chile.
Using the “Standards” approach
• How to develop a set of Standards (example from Vietnam for children 5-6 years old)
– Define domains
– Within each domain, define a set of standards or goals
– For each standard, outline the specific objectives and indicators for each age level
• Pros of Standards approach
– Culturally appropriate
– Process can be informative
• Cons of Standards approach
– Time-intensive and requires long
term follow-up
– Indicators are not necessarily
translated into a test
– Needs to be done slowly and
carefully
NOTE: UNICEF has worked with over 40 countries
to develop Standards
Steps for training
• Involve local psychologists
• Establish “gold standard”
interviewer
• Test for inter-rater reliability
• Test for rater accuracy
Reliability and accuracy
Inter-rater reliability
Rater accuracy
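The two checks named above can be sketched as follows. The slides do not specify which statistics were used, so this example assumes Cohen's kappa for agreement between two trainees and simple percent agreement with the "gold standard" interviewer for rater accuracy. All item scores are invented.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Share of items on which two raters gave the same score."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(ratings_a, ratings_b)) / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters (Cohen's kappa)."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(ratings_a) | set(ratings_b)
    # Agreement expected by chance, from each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical pass/fail scores (1 = pass) for ten items scored on the same child.
gold_standard = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
trainee_a = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
trainee_b = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]

accuracy_a = percent_agreement(trainee_a, gold_standard)  # rater accuracy
accuracy_b = percent_agreement(trainee_b, gold_standard)
kappa_ab = cohens_kappa(trainee_a, trainee_b)             # inter-rater reliability
```

Kappa is used here because raw agreement can look high by chance alone when most items are passed or most are failed. Any threshold for acceptable agreement would need to be set by the study team.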
Broad recommendations
• Assess characteristics of the child that the intervention is intending
to affect.
– Make sure to measure variables that could also be contributing to the
outcomes (e.g. maternal responsiveness, home environment)
• Decide on the type of outcome measure that is appropriate for the
evaluation.
• Rely upon multiple measures of children’s development.
– Include assessments of executive function and socio-emotional development
• Consider the cultural context and how it may affect children’s
development and school readiness
– Always work with local collaborators!
• Look for national level tests where possible and use parent/teacher
report when possible.
• Begin following children early in life.
Criteria for being recommended
• Psychometrically adequate, valid and reliable;
• Balanced in terms of the number of items at the lower end,
to avoid floor effects (many children clustered at the lowest scores);
• Enjoyable for children to take (e.g. interactive, colorful
materials);
• Relatively easy to adapt to various cultures;
• Easy to use in low-resource settings, e.g. not requiring
much material;
• Not too difficult to obtain or too expensive;
• Able to be used in a wide age range.
Specific recommendations: 0-36 mo.
• Continuous measure, direct assessment
– Bayley Scales of Infant Development
– Nationally adapted test (e.g. Indian version of Bayley II)
– Kilifi Executive Function Tasks
• Continuous measure, maternal report
– MacArthur Communicative Development Inventories
– Nationally adapted test (e.g. Turkish Guide for Monitoring
Child Development)
• Screening test, direct assessment
– Denver Developmental Screening test
– Nationally developed test (e.g. EEDP from Chile)
• Screening test, maternal report
– Ages and Stages Questionnaire (ASQ)
Specific recommendations: 3-5 y.
• Cognitive development
– Stanford Binet
– British Ability Scales II Early Years
– Wechsler Preschool and Primary
Scales of Intelligence (WPPSI)
Specific recommendations: 3-5 y.
• Language development
– Peabody Picture Vocabulary Test (PPVT) or Spanish
version: Test de Vocabulario en Imágenes Peabody
– Reynell Developmental Language Scale
Specific recommendations: 3-5 y.
• Executive function
– Leiter Examiner Scale
– Day/Night Task and
Backward Digit Task
– BRIEF-P
(Parent/teacher report)
• Social and behavioral
development
– Strengths and
Difficulties Questionnaire
– Achenbach Child
Behavior Checklist
Contact info and further reading
CONTACT INFORMATION:
Lia Fernald: fernald@berkeley.edu
Patricia Kariger: patriciakariger@gmail.com
Patrice Engle: pengle@calpoly.edu
FURTHER READING:
Peña, E. D. (2007). Lost in translation: Methodological considerations in cross-cultural research. Child Development, 78(4), 1255-1264.
Snow, C. E. and Van Hemel, S. B. (Eds.) (2008). Early Childhood Assessment: Why, What, and How. Washington, D.C.: The National Academies Press.
Young, M. E. and Richardson, L. M. (Eds.) (2007). Early Child Development: From Measurement to Action. Washington, D.C.: The World Bank.