Dr. Joni M. Lakin
Dr. Margaret Ross
Dr. Yi Han
http://www.auburn.edu/~jml0035/
(Under “Conference materials and resources” at the bottom of the page)
• How many of you primarily use SPSS for data analysis?
• How many are comfortable with using syntax (in SPSS or other programs)?
• How many already have plans to use a specific dataset?
• How many are just curious about what’s available?
Dr. Yi Han
• NCES
• Restricted use licenses http://nces.ed.gov/nationsreportcard/researchcenter/license.aspx
• PISA, PIAAC
Dr. Margaret Ross
See PDFs
Dr. Joni Lakin
1. Statistical weighting in SPSS
2. Practical significance and large samples
3. Matrix sampling
4. Plausible values
SPSS skills that make working with large datasets easier:
5. Keeping and managing syntax
6. Merging datasets
7. Checking for duplicate cases
8. Missing data imputation
• Weights allow us to better approximate the full population
• If African American students are 18% of population but 9% of my sample, I could weight each AA student 2.0 (so each observation is included twice in analyses) to get results that better reflect population-level effects.
• Types of weights
• Scale weights = multiply observations so that the weighted sample is the same size as the population
• Proportional weights = may be below 1, so that the weighted sample stays the same size as the actual sample
• Note
• When reporting results, you can report the weighted sample size, but you should also report the unweighted sample size
These “weight” values are already in large datasets
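As a rough illustration of the two weight types, the sketch below turns the 18% vs. 9% example into numbers. The group counts, the population size, and the function name `proportional_weights` are hypothetical and chosen only for this example; real datasets ship with precomputed weight variables that should be used instead.

```python
def proportional_weights(pop_share, sample_share):
    """Weight for each group = population share / sample share."""
    return {g: pop_share[g] / sample_share[g] for g in pop_share}

# Hypothetical two-group sample built around the 18% vs. 9% example.
pop_share = {"AA": 0.18, "other": 0.82}
sample_share = {"AA": 0.09, "other": 0.91}
sample_counts = {"AA": 90, "other": 910}   # hypothetical n = 1,000
population_size = 50_000                    # hypothetical N

# Proportional weights: the weighted sample keeps the original sample size.
prop_w = proportional_weights(pop_share, sample_share)
weighted_n = sum(prop_w[g] * sample_counts[g] for g in prop_w)

# Scale weights: rescale so the weighted sample sums to the population size.
scale_factor = population_size / sum(sample_counts.values())
scale_w = {g: w * scale_factor for g, w in prop_w.items()}
weighted_N = sum(scale_w[g] * sample_counts[g] for g in scale_w)

print(prop_w)
print(weighted_n, weighted_N)
```

Here the underrepresented group gets a weight of 2.0 and the overrepresented group gets a weight below 1, which is what keeps the proportionally weighted sample at its original size.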
ELS:2002 Race (UNWEIGHTED)

Category                        Freq.       %
Amer. Indian/Alaska Native        130     0.8
Asian, Hawaii/Pac. Islander      1460     9.0
Black or African American        2020    12.5
Hispanic, no race specified       996     6.1
Hispanic, race specified         1221     7.5
More than one race                735     4.5
White, non-Hispanic              8682    53.6
Total                           16197   100.0

[Pie chart, unweighted: Amer. Indian/Alaska Native 1%; Asian, Hawaii/Pac. Islander 10%; Black or African American 13%; Hispanic, no race specified 6%; Hispanic, race specified 8%; More than one race 5%; White, non-Hispanic 57%]
ELS:2002 Race (WEIGHTED)

Category                          Freq.       %
Amer. Indian/Alaska Native        32781     1.0
Asian, Hawaii/Pac. Islander      142518     4.2
Black or African American        491321    14.4
Hispanic, no race specified      243607     7.1
Hispanic, race specified         298648     8.8
More than one race               147896     4.3
White, non-Hispanic             2054103    60.2
Total                           3410873   100.0

[Pie chart, weighted: Amer. Indian/Alaska Native 1%; Asian, Hawaii/Pac. Islander 4%; Black or African American 15%; Hispanic, no race specified 7%; Hispanic, race specified 9%; More than one race 4%; White, non-Hispanic 60%]
• Because of the large sample size, many negligible effects (and virtually all correlations) will be statistically significant
• Must consider effect sizes and practical significance
Independent Samples Test

ELS:2002 variables                 t       df     Sig.
Math test score                  8.71    8593    <.001
Reading test score              -4.14    8593    <.001
Mathematics self-efficacy       14.65    8593    <.001
English self-efficacy scale     -2.19    8593     .029
• Actually negligible differences for reading and small differences for math
Independent Samples Test

ELS:2002 variables                 t       df     Sig.    Cohen’s d
Math test score                  8.71    8593    <.001       0.19
Reading test score              -4.14    8593    <.001      -0.09
Mathematics self-efficacy       14.65    8593    <.001       0.32
English self-efficacy scale     -2.19    8593     .029      -0.05
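The slides do not say how Cohen’s d was computed. One common approximation, assuming two groups of roughly equal size, is d ≈ 2t / √df. The sketch below applies it to the t values in the table; the function name `d_from_t` is made up for this example, and the equal-group-size assumption is not confirmed by the source.

```python
import math

def d_from_t(t, df):
    """Approximate Cohen's d from an independent-samples t statistic.

    Assumes the two groups are roughly equal in size (an assumption,
    not something stated in the slides).
    """
    return 2.0 * t / math.sqrt(df)

# t statistics and df taken from the table above.
t_values = {
    "Math test score": 8.71,
    "Reading test score": -4.14,
    "Mathematics self-efficacy": 14.65,
    "English self-efficacy scale": -2.19,
}
df = 8593

effect_sizes = {name: d_from_t(t, df) for name, t in t_values.items()}
for name, d in effect_sizes.items():
    print(f"{name}: d = {d:.2f}")
```

Under that assumption the results round to the same d values reported in the table, which shows how a highly significant t can still correspond to a small or negligible effect when df is large.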
• Used in large-scale assessments when
  • Large domain being sampled (e.g., world history)
  • Need to cover many topics in limited time
  • Individual estimates of the constructs are less important than aggregate estimates (state level achievement)
• Usually requires IRT (item response theory) scoring methods to allow for comparable scores across examinees completing different items
Table from von Davier et al., http://www.ierinstitute.org/fileadmin/Documents/IERI_Monograph/IERI_Monograph_Volume_02_Chapter_01.pdf
• Can result from matrix sampling (with IRT models), bootstrapping, and missing data imputation
• In matrix sampling, individual estimates of skills are less reliable and plausible values better capture this error variance compared to single scores
• Results in multiple estimates of the student’s true score on the construct (will appear as multiple variables)
• Poor practice = averaging plausible values before analysis
• Produces biased estimates (von Davier et al., see notes)
• Better practice = using methods that analyze the different estimates together and produce standard error bars
• Refer to von Davier et al. link in notes
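One widely used way to analyze the estimates together is to run the analysis once per plausible value and then pool the results with multiple-imputation combining rules (Rubin’s rules). The sketch below is a minimal version under stated assumptions: the five estimates and their sampling variances are hypothetical numbers, the function name is invented for this example, and in practice the sampling variances would come from the dataset’s own variance-estimation procedure.

```python
import math
from statistics import mean, variance

def combine_plausible_values(estimates, sampling_variances):
    """Pool one statistic computed separately on each plausible value.

    estimates: the statistic (e.g., a group mean) from each plausible value
    sampling_variances: the squared standard error from each analysis
    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(sampling_variances)   # average sampling variance
    between = variance(estimates)       # variance across plausible values
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)

# Hypothetical results from running the same analysis on 5 plausible values.
estimates = [50.2, 50.8, 49.9, 50.5, 50.1]
sampling_variances = [0.36, 0.38, 0.35, 0.37, 0.36]

pooled, se = combine_plausible_values(estimates, sampling_variances)
print(f"pooled estimate = {pooled:.2f}, SE = {se:.3f}")
```

The between-value term is the part that is lost when plausible values are averaged before analysis, so the pooled standard error is larger than any single-analysis standard error.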
• From any dialog window, you can select “Paste” to save the command as syntax
• Makes sure analyses start with the same data selections: sample weights, split files, selecting relevant cases
• Good for keeping a record of computed and recoded variables
• Add cases = add more participants’ data
• Add variables = add variables for same participants from another dataset
• Have to exclude duplicate variables from one dataset
• Check that values are really identical (if not, change variable name)
• Use Key Variables to match cases
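For readers who think in code, the “Add variables” merge can be sketched as a key-matched join. This is an illustration in Python rather than SPSS syntax; the function name `add_variables`, the key name `student_id`, and the records are all hypothetical.

```python
def add_variables(left, right, key="student_id"):
    """Add variables from `right` to the matching cases in `left`.

    Cases are matched on `key`. When a variable name appears in both
    files, the value from `left` is kept, which mirrors excluding the
    duplicate variable from one dataset. This sketch assumes each key
    appears only once in `right`.
    """
    right_by_key = {rec[key]: rec for rec in right}
    merged = []
    for rec in left:
        extra = right_by_key.get(rec[key], {})
        merged.append({**extra, **rec})
    return merged

# Hypothetical files: test scores and survey responses for the same students.
scores = [{"student_id": 1, "math": 52.1}, {"student_id": 2, "math": 47.3}]
survey = [{"student_id": 2, "efficacy": 3.5}, {"student_id": 1, "efficacy": 2.8}]

merged = add_variables(scores, survey)
print(merged)
```

Matching on the key rather than on row order is what protects against the two files being sorted differently.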
• Merging datasets incorrectly can result in duplicates
• Duplicate cases will be flagged in a new variable “PrimaryLast”
• Will need to decide how to handle on a case-by-case basis
• If the duplicate cases’ values are identical, delete one
• If the values are different, check that identification variables are correct
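The decision rule for duplicates can be sketched as a small check that separates exact duplicates from cases that share an ID but disagree on other values. This is an illustration in Python, not the SPSS procedure; the function name, the key `student_id`, and the records are hypothetical.

```python
from collections import defaultdict

def find_duplicates(records, key="student_id"):
    """Return (identical, conflicting) lists of IDs that occur more than once.

    identical: every record with that ID has exactly the same values,
               so all but one copy can be deleted.
    conflicting: records share the ID but differ, so the ID variable
                 should be checked before deleting anything.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)

    identical, conflicting = [], []
    for case_id, recs in groups.items():
        if len(recs) > 1:
            if all(r == recs[0] for r in recs):
                identical.append(case_id)
            else:
                conflicting.append(case_id)
    return identical, conflicting

# Hypothetical merged file with one exact duplicate and one conflicting ID.
records = [
    {"student_id": 1, "math": 52.1},
    {"student_id": 1, "math": 52.1},
    {"student_id": 2, "math": 47.3},
    {"student_id": 3, "math": 60.0},
    {"student_id": 3, "math": 41.5},
]

identical, conflicting = find_duplicates(records)
print("safe to drop one copy:", identical)
print("check ID variables:", conflicting)
```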
• Methods that bias results:
• Mean substitution, listwise or pairwise deletion
• Methods that can provide less biased estimates
  • Single imputation regression (better than above, but restricts variability)
  • Expectation-maximization (EM): best of the SPSS options, works well when data are missing at random
  • Analyze > Missing Value Analysis
• Be sure to read up on “missing completely at random”, “missing at random”, and “missing not at random”
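To see why simple substitution methods restrict variability, the sketch below fills missing values with the observed mean and compares the variance before and after. The scores are hypothetical and chosen so the arithmetic is easy to follow.

```python
from statistics import mean, variance

# Hypothetical scores; None marks a missing value.
scores = [12.0, 15.0, None, 18.0, None, 21.0, 24.0]

observed = [x for x in scores if x is not None]
observed_mean = mean(observed)

# Mean substitution: every missing value is replaced by the observed mean.
filled = [x if x is not None else observed_mean for x in scores]

var_observed = variance(observed)   # variance of the cases actually observed
var_filled = variance(filled)       # variance after mean substitution

print(f"observed variance = {var_observed}, after substitution = {var_filled}")
```

The filled-in values add cases without adding any spread, so the variance shrinks and standard errors computed from the filled data look smaller than they should.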
Dr. Lakin
• “The program seeks to stimulate research on U.S. education issues using data from the large-scale, national and international data sets supported by the National Center for Education Statistics (NCES), NSF, and other federal agencies, and to increase the number of education researchers using these data sets.”
Suggestions based on personal observations and the RFP:
• Must use a strong quasi-experimental design (Schneider et al., Estimating Causal Effects: Using Experimental and Observational Designs)
  • Regression discontinuity, propensity score matching, etc.
  • Bringing in new quantitative approaches from other fields is also very appealing (economics, epidemiology, etc.)
• Check past grants to see which datasets are “neglected” (more recent datasets are better)
• Prefer ideas that involve multiple datasets in meaningful research
• Analyses of recent international datasets have been more successful
• IES Research Grants do fund secondary data analyses with Exploration grant goals (any subject area): http://ies.ed.gov/funding/
• IES data training workshops http://ies.ed.gov/whatsnew/conferences/?cid=2
• AERA annual meeting usually has data training events:
• PDC02: Analyzing NAEP Assessment Data with Plausible Values…
• PDC13: Advanced Analysis using Adult International Large Scale Assessment Databases
• PDC16: Using NAEP Data on the Web for Educational Policy Research
• Several on quantitative methods (including propensity scores)
• AERA Institute on Statistical Analysis for Education Policy (summer)
• IES/NCES hosts STATS-DC conferences and summer institutes to train researchers in using specific datasets
Presentation files are available from http://www.auburn.edu/~jml0035/
(Under “Conference materials and resources”)