Delaware's State Test format - Council of State Science Supervisors

SECOND SCIENCE
ASSESSMENT WEBINAR
Performance Assessments
in Science at the State
Level
October 2012
WEBINAR AGENDA
• Introduction
• WestEd and three states will each give a 10-minute overview of what they are doing in performance assessments at the state level in science:
  • WestEd (Edys S. Quellmalz and Matt Silberglitt)
  • Ohio (Lauren V. Monowar-Jones)
  • Vermont (Gail Hall)
  • Connecticut (Liz Buttner)
• Open discussion
WESTED
PERFORMANCE ASSESSMENTS FOR
SCIENCE LEARNING
Edys S. Quellmalz and Matt Silberglitt
WestEd
Presented to the Council of State Science Supervisors
October 24, 2012
GOALS
• Needs for performance assessment
• Limitations of current science assessments
• Advantages of technology for science assessment
• Types of performance assessment
• Design principles for the assessment of iSTEM learning outcomes
• Promising innovative approaches
• Needed research and development
PERFORMANCE ASSESSMENT FEATURES
• Assessment targets specified are difficult or impossible to measure well with conventional item formats
• Students construct responses, solutions, or products
• Tasks represent significant, recurring, realistic problems
• Criteria for evaluating performances are specified and communicated to examinees
• Performances represent both science and engineering practices in progress as well as culminating solutions or products
LIMITATIONS OF CURRENT ASSESSMENTS
• Emphasis on disconnected, declarative knowledge
• Neglect of integrated knowledge structures in fundamental science systems
• Emphasis on procedural algorithms and skills
• Neglect of strategic inquiry practices in authentic problems
RELEVANCE TO CURRENT ASSESSMENT PROGRAMS
Need innovative, technology-enhanced assessments that align with new frameworks that focus on
• Fewer, deeper, more integrated core knowledge targets
  • E.g., the new Framework for Science Education and next generation national science standards
• Models as structures for understanding and studying science systems (model-based learning)
• Science practices for using knowledge and inquiry in significant, recurring authentic tasks
RELEVANCE TO CURRENT ASSESSMENT PROGRAMS
Need innovative, technology-enhanced assessments that
• Target 21st century skills within STEM
• Use technology to engage students in use of “tools of the trade”
• Provide evidence supporting technology-enhanced performance assessment
  • For summative and formative purposes
  • Called for in the assessment consortia
ADVANTAGES OF TECHNOLOGY FOR SCIENCE ASSESSMENT
• Present authentic, rich, dynamic environments
• Support access to collections of information, expertise
• Present phenomena difficult or impossible to observe and manipulate in classrooms
• Represent temporal, causal, dynamic relationships “in action”
• Allow multiple representations of stimuli and their simultaneous interactions (e.g., data generated during a process)
• Allow overlays of representations, symbols
• Allow student manipulations/investigations, multiple trials
• Allow student control of pacing, replay, and reiteration
• Capture student responses during research, design, problem solving
• Allow use of simulations of a range of tools (internet, productivity, domain-based)
ADVANTAGES OF TECHNOLOGY FOR SCIENCE ASSESSMENT
• Online search for assessments aligned with standards
• Digital collections of assessments
• Access to innovative technology-based prototypes and collections
• Tools to support online assessment delivery and scoring
• Online guidelines for scoring by teachers and students
• Online guidelines for interpretation of scores and implications for instruction
• Online professional development on assessment literacy
COGNITIVELY-PRINCIPLED ASSESSMENT DESIGN
• Learning science research (e.g., How People Learn)
• Measurement theory and research on measuring learning (e.g., Knowing What Students Know)
• Assessment argument linking claims of learning, to evidence of learning, to tasks eliciting the evidence
EVIDENCE-CENTERED DESIGN
• Student Model: What complex of knowledge, skills, or other attributes should be assessed?
• Evidence Model: What behaviors or performances should reveal the relevant knowledge and skills described in the student model?
• Task Model: What tasks or situations should elicit the behaviors or performances described in the evidence model?
Messick, 1993; Mislevy, Almond, & Lukas, 2004
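One way to picture the assessment argument behind these three models is as a chain of links that can be checked for gaps. The sketch below is illustrative only: the class name, field names, and example entries are hypothetical and are not taken from the presentation or from any evidence-centered design tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AssessmentDesign:
    """Hypothetical container for the three models named above."""

    # Student model: the knowledge or skill targets to be assessed.
    targets: set[str] = field(default_factory=set)
    # Evidence model: observable behavior -> targets it gives evidence for.
    evidence: dict[str, set[str]] = field(default_factory=dict)
    # Task model: task -> observable behaviors it is meant to elicit.
    tasks: dict[str, set[str]] = field(default_factory=dict)

    def gaps(self) -> dict[str, set[str]]:
        """Return targets with no evidence and evidence no task elicits."""
        evidenced = set().union(*self.evidence.values())
        elicited = set().union(*self.tasks.values())
        return {
            "targets_without_evidence": self.targets - evidenced,
            "evidence_without_task": set(self.evidence) - elicited,
        }


# Made-up example: one target has no evidence linked to it yet.
design = AssessmentDesign(
    targets={"controls variables", "interprets data"},
    evidence={"changes one variable per trial": {"controls variables"}},
    tasks={"simulation investigation": {"changes one variable per trial"}},
)
print(design.gaps())
```

A check like this only confirms that every claim is connected to some evidence and every piece of evidence to some task; it says nothing about whether the evidence is valid.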
SCIENCE CONSTRUCTS (STUDENT MODEL-ASSESSMENT TARGETS)
• From national frameworks and standards for science, mathematics, engineering, technology
• Cross-cutting concepts, e.g., Systems and System Models (Next Generation Science Standards)
• Cross-cutting practices, e.g., problem solving, communication, collaboration (NAEP Framework for Technology and Engineering Literacy)
TASK MODELS
• Integrated applications to natural and designed world
• Applied, significant, recurring problems in the natural and designed world
• Scenario-based tasks building across a problem solving/inquiry/design sequence
EVIDENCE MODEL
• What evidence is collected
  • Explicit responses
  • Logged processes for technology-delivered tasks
• How the evidence is evaluated and summarized
  • Scoring
  • Rubrics
• How the evidence is reported for intended purposes and users
LIMITATIONS OF PERFORMANCE ASSESSMENTS
• Little information or documentation available on measures (descriptive or technical quality)
• Lack of attention to alignment of outcome measures to science standards
• Few descriptions of coverage/balance
• Outcome measures tend to emphasize content, declarative knowledge
• Little attention to application of practices
LIMITATIONS OF SCIENCE ASSESSMENTS
• Practices are not measured well by static, conventional formats
• Few measures during learning (processes, formative) vs. at the end (summative)
• Little measurement of collaboration and communication
• Lack of deliberate design to measure for transfer of cross-cutting concepts and practices
• Little attention to establishing/documenting technical quality
CHALLENGES FOR DESIGNING ISTEM ASSESSMENTS
• Specification of desired learning outcomes
• Need for focus, coherence of knowledge and processes and whether situated in a domain and/or in integrated problems
• Coverage: What is the balance of assessment targets?
• What is the balance and coherence of classroom curriculum-embedded assessments for formative purposes and district and state summative tests?
CHALLENGES FOR DESIGNING SCIENCE ASSESSMENTS
Need to tailor assessment design to assessment purpose (intended use of the data)
• Formative/summative
• Embedded to monitor (use of feedback and coaching) and adjust vs. culminating to report proficiency status
Duration, scope, time
• More extended, spread over multiple classes/periods
• Embedded vs. external
Documentation of measures
• Descriptions, technical quality (validity of interpretation, reliability)
PROMISING NEW ASSESSMENT DESIGNS ENABLED BY TECHNOLOGY
• Alignments
• Access to resources and expertise
• Network with collaborators, experts
• Collections
• Delivery
• Entry of rubrics, ratings, work in progress, final artifacts
• Scoring: automated scoring, online training and scoring, moderated rating sessions
• Reporting: customized to users
PROMISING NEW ASSESSMENT DESIGNS FOR SCIENCE
USING TECHNOLOGY TO SUPPORT HANDS-ON PROJECTS AND ASSESSMENTS
• Blended model of equipment and technology
• Entry of rubric ratings, calibrated training sessions
• Annotated postings of designs, prototypes, tests
• Embedded tasks to test knowledge and skills during projects
• Electronic science notebooks
• Electronic portfolios
• Juried exhibitions posted, streamed, archived
Promising New Assessment Designs for
Interactive Task Design Features
• Dynamic presentations of spatial, causal, temporal phenomena in a system
• Multiple overlapping representations
• Interactivity
• Supports iterative, active inquiry and design
• Multiple response formats
• Reduce reliance on text
• Rapid, customized interaction, feedback, reporting
RESEARCH ON LEARNING IN SCIENCE SIMULATIONS
• Facilitate formation of organized mental models of system components, interactions, and emergent behaviors
• Facilitate transfer
• Facilitate use of systematic problem solving & inquiry
• Situate in authentic, significant, recurring problems in the natural and designed world
• Highly engaging
NAEP 2014 FRAMEWORK AND SPECIFICATIONS FOR TECHNOLOGY AND ENGINEERING LITERACY
• SimScientists: Force and Motion-Fire Rescue
• PISA: Reactor
• http://www.nagb.org/publications/frameworks/tech2014-framework/ch_toc/index.html
SIMSCIENTISTS: TEST EFFECTS OF POLLUTION ON CELLS
SIMSCIENTISTS: TEST EFFECTS OF CALORIES ON ACTIVITY LEVEL
RESEARCH NEEDS
• Analysis of extant assessments: large scale and classroom, formative and summative
• Analyses of performance assessment opportunities
• Review of promising exemplars
• Formulation and testing of different purposes, designs, and evidence collection strategies
• Pilot studies of performance assessment design models for established and new genres of technology-enhanced learning environments
• Documenting technical quality with alternative psychometric methods
Contact Information
equellm@wested.org
msilberg@wested.org
http://simscientists.org
OHIO
Lauren V. Monowar-Jones, PhD
Project Coordinator
Ohio Performance Assessment Pilot Project
Ohio Department of Education
Office of Assessment
Lauren.Monowar-Jones@education.ohio.gov
A Look Into the Future of Ohio’s Science
Assessments
10/24/2012
THE OHIO
PERFORMANCE
ASSESSMENT PILOT
PROJECT
ALWAYS DO WHAT YOU ARE AFRAID TO DO.
THE TASK DYAD LEARNING SYSTEM
• Learning Task
• Curriculum embedded
• Assessment Task
OHIO’S TASK DYAD LEARNING SYSTEM
THE DYAD SYSTEM
OHIO’S NEXT GENERATION ASSESSMENTS
PARCC-Developed Assessments
• English language arts: Grades 3-8 and high school
• Mathematics: Grades 3-8 and high school
• Operational school year 2014-15
State-Developed Assessments
• Science: Grades 5, 8 and high school
• Social Studies: Grades 4, 6 and high school
• Operational school year 2014-15
A SNEAK PEEK
PILOT: TEACHERS’ ROLES
• Coaches for students.
• Scorers.
• Developers.
• Reviewers.
OPAPP PARTICIPANTS: TEACHERS
• Cohort 1: Sep 2008 - May 2012
  • 15 LEAs
  • HS: ELA, Math, Science
• Cohort 2: Sep 2011 - Dec 2013
  • 7 LEAs
  • HS: ELA, Math, Science, SS, Career Tech
• Cohort 3: Jan 2012 - June 2014
  • 6 LEAs
  • ES: ELA, Math, Science, SS
• Cohort 4: Nov 2012 - December 2013
  • 15 LEAs
  • HS: ELA, Math, Science, Social Studies, Career Tech
• Cohort 5: July 2013 - May 2014
  • Recruiting in March
  • ES: ELA, Math, Science, SS
OPAPP PARTICIPANTS
COACHES

Cohort 1:


Cohort 2:


Grade 3: 3 coaches,
Grade 4: 3 coaches,
Grade 5: 2 coaches
Cohort 4:


2 ELA, 3 Math, 2
Science, 2 SS
Cohort 3:


2 ELA, 3 Math, 2
Science
4-5 online coaches
Cohort 5:

4-5 online coaches
OPAPP PARTICIPANTS: HIGHER ED
• Cohort 1: 3
• Cohort 2: up to 20
• Cohort 3: up to 15
• Cohort 4: none*
• Cohort 5: none*
Purpose of HE involvement is
• To influence HE teaching
• To influence teacher preparation
• To provide content expertise
LESSONS LEARNED
Task Writing:
• It is hard to write to a non-native delivery system.
• It is hard for assessment contractors to learn to write good curriculum.
• It is hard to develop good rubrics for Learning Tasks.
• It is hard to align Learning and Assessment Tasks well.
Online Delivery System:
• Schools are not always “teched up” enough for this model.
• School firewalls can be problematic for learning tasks.
• Internal internet access may be more of a problem than previously thought.
LESSONS LEARNED
Teachers:
• Not all teachers are ready to use technology in their classrooms or labs.
• Professional Development needs to be low impact on time and high impact on practice.
Scoring/Reporting:
• Need method for identifying student work for re-score (that does not put the state in the position of qualifying teachers to score).
• Need more data and information about how to present results to teachers so they make sense (both to psychometricians and to teachers).

VERMONT
VERMONT
STATE SCIENCE
ASSESSMENT
An Overview
Gail Hall and Kathy Renfrew
Science Assessment Coordinators
VERMONT’S JOURNEY
A Winding Trail…
State Performance Assessments, since 2000
• Vermont PASS Assessment: Partnership for Assessment of Standards-based Science
• NECAP Assessment: New England Common Assessment Program
  • Collaboration with RI and NH
THE DETAILS...
• Spring Assessment: Grades 4, 8, 11
• Content Domains
  • Life Science: 24%
  • Physical Science: 24%
  • Earth/Space Science: 24%
  • Inquiry Task: 28%
• Data from Inquiry Performance Task investigations are collected by student partners. Scored items are answered individually.
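As a quick arithmetic check, the four reporting categories above account for the whole test: 24 + 24 + 24 + 28 = 100. The sketch below verifies that and shows how the percentages would translate into points for a hypothetical raw-score total. The 50-point total and the function name are assumptions for illustration, not NECAP figures.

```python
# Percent of the test in each reporting category, as listed on the slide.
weights = {
    "Life Science": 24,
    "Physical Science": 24,
    "Earth/Space Science": 24,
    "Inquiry Task": 28,
}


def points_by_domain(total_points, weights):
    """Split a (hypothetical) total point count according to percent weights."""
    return {domain: total_points * pct / 100 for domain, pct in weights.items()}


# The category weights should cover the whole test.
assert sum(weights.values()) == 100

# total_points=50 is an illustrative assumption only.
allocation = points_by_domain(50, weights)
print(allocation)
```

Under that assumed total, the Inquiry Task would carry slightly more points than any single content domain, which matches the emphasis the percentages suggest.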
Spring in Vermont
NECAP SCIENCE TEST DESIGN
Session 3: Grades 4 & 8
• Estimated time needed: 75 minutes (Schedule 120 minutes)
• 7 or 8 Inquiry Task Questions
• 2-point Short Answer & 3-point Constructed Response
NECAP SCIENCE TEST DESIGN
Session 3: Grade 11
• Estimated time needed: 45-60 minutes (Schedule 60 minutes)
• The High School Task is always a Data Analysis Task.
• 7 or 8 Inquiry Task Questions based on a variety of data sets.
• 2-point Short Answer & 3-point Constructed Response
2008 GRADE 8
INQUIRY PERFORMANCE TASK
The Scenario:
2008 GRADE 8
INQUIRY PERFORMANCE TASK
The Set-up:
2008 GRADE 8
INQUIRY PERFORMANCE TASK
Prediction: Posed jointly by working partners
• Using Ethan’s experience and your understanding of force and the motion of objects, predict how the mass of a parked car will affect the distance the parked car moves when hit. Explain your answer. Write your prediction and explanation in the box below.
• Using Ethan’s experience and your understanding of force and the motion of objects, predict how the slope of a hill will affect the distance moved by a car that gets hit. Explain your answer. Write your prediction and explanation in the box below.
2008 GRADE 8
INQUIRY PERFORMANCE TASK
Next Steps:
• Materials for the Investigation
• Investigation Directions Provided
• Varying Slope: using block of wood
• Varying Mass: 1-3 washers in cup
2008 GRADE 8
INQUIRY PERFORMANCE TASK
• Collect data
• Measure distance cup (with washers) moves
2008 GRADE 8
INQUIRY PERFORMANCE TASK
Scorable Task Items
• Construct a graph
• Explanation of the effect of mass on movement of parked car, with evidence
• Explanation of the effect of slope on movement of parked car, with evidence
• How well do data support prediction?
• Predict movement in a different situation (flat, dry surface).
• Identify and explain variables.
• Design a new investigation.
LESSONS LEARNED…
• Measure of Critical Thinking
• Student Progress
• Challenging to Construct
• Time
• Outstanding PD Opportunity
• Collaboration
ADDITIONAL VERMONT INQUIRY PERFORMANCE TASKS
http://education.vermont.gov/new/html/pgm_assessment/necap/resources/released_items.html
Grade 4 (All are Performance Tasks)
• Playground Trash
• Magnetism
• Birds, Beaks & Survival
• Natural Selection
• Sled Pull
• Force & Motion
• Sand Movers
• Erosion
• Soil and Water
• Conductors and Insulators
Grade 8
• Rainy Morning (PT)
• Colliding Plates (PT)
• Plate Tectonics
• Pond weeds (Data Analysis)
• Aquatic Ecology
• Mass and Matter (PT)
• Conservation of Mass
• Fox and Rabbits (PT)
• Predator/Prey
• Ocean Currents (Data Analysis)
Grade 11 (All are Data Analysis)
• Acid Lakes
• Driver’s Education
• Force & Motion
• Location
• Earthquakes
• Cod on Georges Bank
• Human Impact on Ecosystems
• Antifreeze
• Properties of Matter
• Mercury in Fish
Guidelines for the Development of Science Inquiry Tasks
http://education.vermont.gov/new/pdfdoc/pgm_assessment/necap/other_resources/science/guidelines_inquiry_tasks_021508.pdf
Thank you for attending.
For further information:
Gail Hall—Middle and High School Science
Assessment Coordinator,
Vermont Department of Education
gail.hall@state.vt.us

Kathy Renfrew—Elementary Science Assessment
Coordinator,
Vermont Department of Education
kathy.renfrew@state.vt.us

CONNECTICUT
CONNECTICUT
CURRICULUM-EMBEDDED SCIENCE
PERFORMANCE TASKS:
Teaching Tools Linked
to State Assessments
FAST FACTS
• Extended, open-ended investigations related to a concept within a learning standard (5-7 class sessions).
• Models of activities that engage students in using inquiry standards to learn content in state standards.
• Teachers decide when (and how) to EMBED the task within a standards-based learning unit.
• Teacher Manuals: pacing guide, materials list, pedagogy notes, safety, resources
• Descriptive inquiry feedback rubrics (Gr. 3-8 only)
• One task for each grade (Gr. 3-8)
• Five high school tasks (one for each strand of standards)
SUPPLIES
• In 2005, SDE provided “durable” equipment kits to all elementary and middle schools ($250,000):
  • Graduated cylinders, hand lenses, droppers, stethoscopes, wires, bulb holders, soil sieves, etc.
• Districts are expected to provide consumable materials and replace equipment
• Materials kits for Gr. 3-8 embedded tasks are available for purchase through SK Boreal Labs.
ANATOMY OF A GR. 3-8 CURRICULUM-EMBEDDED PERFORMANCE TASK LEARNING CYCLE
“Mini” instructional units, each including:
• Experiment 1 (guided inquiry)
• Formative feedback to student (state rubric)
• Research Through Reading and Writing
• Experiment 2 (independent inquiry)
• Summative writing assignment
• Summative feedback to student (state rubric)
ANATOMY OF A HIGH SCHOOL CURRICULUM-EMBEDDED PERFORMANCE TASK
• Five embedded tasks
• Each includes a laboratory activity and a Science, Technology and Society (STS) research investigation.
BASIC INFORMATION
• Curriculum-embedded Performance Tasks are suggested models for instruction and not mandated exercises.
• Instructional materials for use within the classroom during the course of the normal instructional day and within the appropriate instructional context (teacher determined).
• Questions assessing INQUIRY skills on state tests reference embedded task “scenarios”
  • Elementary test: 6 out of 18 SR questions are related to tasks for Gr. 3, 4 and 5
  • Middle grades test: All 3 CR questions are task-related (6 points out of 21 inquiry points are related to tasks for Gr. 6, 7, and 8)
  • High school test: All 5 CR questions are task-related (15 out of
MORE BASIC INFO
Science Content Areas Addressed:
Gr. 3 – properties of matter (absorbency)
Gr. 4 – electric circuits, conductors/insulators
Gr. 5 – central nervous system; reaction time
Gr. 6 – soil porosity and permeability
Gr. 7 – cardiovascular system; pulse rate
Gr. 8 – friction
High school tasks…
EVEN MORE BASIC INFO
Strand I: Energy Transformation
-Solar Cooker, Laboratory Investigation
-Connecticut Energy Use, STS Activity
Strand II: Chemical Structures and Properties
-Synthetic Polymers, Laboratory Investigation
-Plastics Controversy, STS Activity
Strand III: Global Interdependence
-Acid Rain, Laboratory Investigation
-Connecticut Brownfield Sites, STS Activity
Strand IV: Cell Chemistry and Biotechnology
-Enzyme, Laboratory Activity
-Labeling Genetically Altered Foods, STS Activity
Strand V: Genetics, Evolution and Biodiversity
-Yeast Population Dynamics, Laboratory Investigation
-Human Population Dynamics, STS Activity
FINAL BASIC INFO
• Assess knowledge and inquiry skills
• STS research depends on doing internet data searches and working with Excel spreadsheets
• Student work on embedded tasks is assessed by teachers using state-developed rubrics for inquiry and for lab reports.
LINKS TO EMBEDDED TASKS
• ELEMENTARY AND MIDDLE GRADES TASKS: http://www.sde.ct.gov/sde/cwp/view.asp?a=2618&q=320890
• HIGH SCHOOL TASKS: http://www.sde.ct.gov/sde/cwp/view.asp?a=2618&q=320892
STATE ASSESSMENT SAMPLE ITEM – ELEMENTARY
Some students did an experiment to find out which type of paper holds the most water. They followed these steps:
1. Fill a container with 25 milliliters of water.
2. Dip pieces of paper towel into the water until all the water is absorbed.
3. Count how many pieces of paper towel were used to absorb all the water.
4. Repeat with tissues and napkins.
If another group of students wanted to repeat this experiment, which information would be most important for them to know?
(a) The size of the water container
(b) The size of the paper pieces *
(c) When the experiment was done
(d) How many students were in the group
STATE ASSESSMENT SAMPLE ITEM –
HIGH SCHOOL
STATE ASSESSES INQUIRY ABILITIES
REFERENCING TASK SCENARIOS
NOTE THAT THE CMT QUESTIONS DO NOT
ASSESS A CORRECT “OUTCOME” OF A
PERFORMANCE TASK OR STUDENTS’
RECOLLECTION OF THE DETAILS OF THE
PERFORMANCE TASK.
Students who have had numerous opportunities to
make observations, design experiments, collect data and
form evidence-based conclusions are likely to be able to
answer the task-related CMT questions correctly, even
if they have not done the state-developed performance
tasks.
However, familiarity with the context referred to in the
test question may make it easier for students to answer
the question correctly.
INQUIRY STANDARDS - ELEMENTARY
Embedded Tasks engage students in using all the inquiry skills defined in the 2004 CT Science Framework. The embedded tasks for Grades 3-5 feature the following Expected Performances for Scientific Inquiry, Literacy and Numeracy:
1. Make observations and ask questions about objects, organisms and the environment.
2. Seek relevant information in books, magazines and electronic media.
3. Design and conduct simple investigations.
4. Employ simple equipment and measuring tools to gather data and extend the senses.
5. Use data to construct reasonable explanations.
6. Analyze, critique and communicate investigations using words, graphs and drawings.
7. Read and write a variety of science-related fiction and nonfiction texts.
8. Search the Web and locate relevant science information.
9. Use measurement tools and standard units (e.g., centimeters, meters, grams, kilograms) to describe objects and materials.
10. Use mathematics to analyze, interpret and present data.
INQUIRY STANDARDS – MIDDLE SCHOOL
The embedded tasks for Grades 6-8 feature the following Expected
Performances for Scientific Inquiry, Literacy and Numeracy:
1. Identify questions that can be answered through scientific
investigation.
2. Read, interpret and examine the credibility of scientific claims in
different sources of information.
3. Design and conduct appropriate types of scientific investigations to
answer different questions.
4. Identify independent and dependent variables, and those variables
that are kept constant, when designing an experiment.
5. Use appropriate tools and techniques to make observations and
gather data.
6. Use mathematical operations to analyze and interpret data.
7. Identify and present relationships between variables in appropriate
graphs.
8. Draw conclusions and identify sources of error.
9. Provide explanations to investigated problems or questions.
10. Communicate about science in different formats, using relevant
science vocabulary, supporting evidence and clear logic.
TECHNICALITIES
• The performance tasks are instructional materials modifiable by teachers to accommodate student needs and interests; not rated for reliability or validity.
• Inquiry Feedback Rubrics were taken through an inter-rater reliability study led by TERC (2009).
• No data collected on student work; data comes from item stats on CMT/CAPT task-related questions.
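The slides do not say which statistic the inter-rater reliability study used. As an illustration only, the sketch below computes two common agreement measures, percent agreement and Cohen's kappa, for two raters scoring the same student work with a rubric. The function name and the sample scores are hypothetical.

```python
from collections import Counter


def agreement_stats(rater_a, rater_b):
    """Return (percent agreement, Cohen's kappa) for two equal-length score lists."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and the same length")
    n = len(rater_a)
    # Observed agreement: share of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category at random,
    # given each rater's own distribution of scores.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return observed, 1.0  # both raters used one identical category throughout
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Made-up rubric scores (1-4) for eight pieces of student work.
scores_a = [1, 2, 3, 4, 2, 3, 3, 1]
scores_b = [1, 2, 3, 3, 2, 3, 4, 1]
observed, kappa = agreement_stats(scores_a, scores_b)
print(f"agreement={observed:.2f}, kappa={kappa:.2f}")
```

Kappa is lower than raw agreement because it discounts the matches two raters would produce by chance.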

LESSONS LEARNED
• Teachers appreciate the flexibility, but often don’t use the tasks effectively (short-cuts)
• Teachers don’t use the tasks when they’re teaching the content (connection between inquiry and content is missed)
• On-going PD is needed
• PD providers need to be trained
• District administrators don’t reinforce the important learning value of the tasks.
• District administrators don’t purchase supplies for the tasks.
ELIZABETH BUTTNER
SCIENCE CONSULTANT
CT STATE DEPARTMENT OF EDUCATION
860-713-6849
ELIZABETH.BUTTNER@CT.GOV