After all the coding is done ...
Harry Ganzeboom
Center for Survey Research –
Academia Sinica
July 24-25 2008
Scaling occupations
• Detailed occupation codes have various uses, but for most
applications they are condensed again into social status
scales.
• There is a great variety of national and international social
status scales and ways they are constructed.
• Main division:
– Nominal categories: EGP (Goldthorpe), Wright, Esping-Andersen.
– Continuous scales: Prestige, Socio-economic Index [SEI]
• Each of these has its own theoretical background.
• The varieties of social status scales can only be compared
when you have access to detailed occupations (and more).
Analysing occupational information
Tools for ISCO-88
• http://home.fsw.vu.nl/HBG.Ganzeboom/ISMF
• This webpage contains several useful [SPSS] tools to work with
ISCO-88 codes:
– ADD VALUE LABELS for all occupations
– RECODE for EGP social classes
– RECODE for SIOPS [Treiman’s] prestige scale
– RECODE for ISEI [Ganzeboom et al.’s] SEI scale
• Note that the tools will work (A) for multiple occupations,
and (B) for all levels of detail of coding (providing you
have used trailing zeroes).
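The trailing-zero convention can be illustrated with a minimal Python sketch. This is not one of the SPSS tools on the webpage; it assumes 4-digit codes stored as integers, and the function name `truncate_isco` is hypothetical.

```python
def truncate_isco(code: int, digits: int) -> int:
    """Keep the first `digits` digits of a 4-digit code and pad with trailing zeroes."""
    if not 1 <= digits <= 4:
        raise ValueError("digits must be between 1 and 4")
    if not 0 <= code <= 9999:
        raise ValueError("code must fit in 4 digits")
    factor = 10 ** (4 - digits)
    return (code // factor) * factor


# Code 1211 is taken from the text; each line shows one level of detail.
for d in (1, 2, 3, 4):
    print(d, truncate_isco(1211, d))
```

Because coarser codes keep the same 4-digit width, a single recode table can serve occupations coded at any level of detail.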
• There are also tools for ISCO-68, and there will be tools for ISCO-08.
ISEI (1)
• A SEI [socio-economic index or Duncan] score
scales occupation by averaging status
characteristics of job holders, most often their
education and earnings.
• Often the criterion information is taken from
census data.
• ISEI was created for ISCO-88 using criterion
information on educational and earnings ranks in
a ‘world-wide’ sample of 70,000 men from 17
countries.
ISEI (2)
• ISEI was constructed as an optimal scaling of
(detailed) occupations as an intervening variable
between education and earnings: “Occupation is
what you do to convert your qualifications into
income”.
• The metric runs from 10 to 90, but this is entirely
arbitrary.
• ISEI was originally developed for ISCO-68, but
its second generation version (for ISCO-88) has
become widely used, also outside sociology.
Prestige
• Prestige: popular evaluation of occupational status,
i.e. you ask respondents to value occupations.
• Many local versions have been integrated by
Treiman (1977) into the Standard International
Occupational Prestige Score [SIOPS], related to
ISCO-68.
• The version on my website is a mapping of the
original SIOPS to ISCO-88.
EGP
• EGP class typology combines detailed
occupation codes with measures of self-employment and supervising status.
• This leads to a nominal (partly ordered) set
of distinctions: 12-10-7-5 categories.
• EGP has become the de facto standard for
stratification research. Much used.
Relationships EGP, ISEI, SIOPS
• All these measures are strongly associated. You
need a lot of data if you are going to argue about
the differences.
• EGP and ISEI resemble each other more than
either resembles SIOPS.
• SIOPS [prestige] is theoretically the best idea, but
it does not work well in practice.
• I prefer to use ISEI for my further discussion here.
Checks to be run ...
• Use value labels to see whether the coders
have indeed entered only valid codes.
• It is surprising to learn how often this check
has not been run!
• It is even more surprising to learn how often
this is the only check ever run!!!
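A minimal sketch of this validity check in Python, rather than via SPSS value labels. The function name `find_invalid_codes` is hypothetical, and the "valid" set is only a handful of codes mentioned in this text, not the full ISCO-88 list.

```python
from collections import Counter


def find_invalid_codes(codes, valid_codes):
    """Count entered codes that do not appear in the set of valid codes."""
    valid = set(valid_codes)
    return Counter(c for c in codes if c not in valid)


# Hypothetical illustration: a tiny subset of codes, NOT the full classification.
valid_subset = {1211, 1311, 6100, 6200, 9200}
entered = [1211, 6100, 6010, 9200, 6010, 1131]

invalid = find_invalid_codes(entered, valid_subset)
for code, n in sorted(invalid.items()):
    print(code, n)
```

Listing the invalid codes with their frequencies makes it easy to distinguish a one-off typing error from a systematic coder mistake.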
MTMM-models
• Multi-Trait Multi-Method models were developed
in psychometrics to estimate the reliability and
validity of attitude items.
• The idea is that you can learn about reliability and
validity (both!!) when you apply multiple methods
(e.g. response formats) to multiple [related] traits
(e.g. personality characteristics).
• Remember:
– Reliability: lack of random errors
– Validity: lack of systematic error
MTMM model
[Figure: path diagram of the MTMM model, with ROCC and FOCC and the indicators FISEI1, FISEI2, RISEI1 and RISEI2.]
Estimating MTMM for two coders
• The elementary MTMM model for two traits (occupations)
and two methods (coders) has 7 parameters.
• The data generate only 6 degrees of freedom.
• However, by constraining (equalizing) the parameters, we
can find the following interesting information:
– How much random error each coder has introduced relative to the other.
– Whether FOCC and ROCC differ in the amount of random error.
– How much systematic bias each coder has added to their codes.
– Degree of attenuation brought about by the coding unreliability;
corrected (disattenuated) correlation between FOCC and ROCC.
TWO ITALIAN CODERS, N=1800 OCCUPATIONS
          fisei1   fisei2   isei1    isei2
fisei1    1.000
fisei2    0.772    1.000
isei1     0.352    0.332    1.000
isei2     0.321    0.322    0.811    1.000
What are we learning by staring
at these correlations?
• Cross-coder correlation for the same occupation is at best 0.81. This means
an index of reliability of 0.90 (the square root of 0.81).
• Coders agree slightly less on father’s occupation than on
respondent’s. Loss is around 0.97.
• Within- and cross-coder intergenerational
correlations are around 0.33 and fairly
homogeneous.
• Coder 1 has created slightly more consistency
between father and respondent.
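The figures quoted in these bullets can be retraced from the correlation table with a few lines of Python. This is a sketch of one plausible reading: the index of reliability is the square root of the cross-coder correlation, and the "loss" for fathers is the square root of the ratio of the two cross-coder correlations. The variable names are illustrative.

```python
import math

# Correlations from the two-coder Italian example (N=1800).
r = {
    ("fisei1", "fisei2"): 0.772,  # father's occupation, coder 1 vs coder 2
    ("isei1", "isei2"): 0.811,    # respondent's occupation, coder 1 vs coder 2
    ("fisei1", "isei1"): 0.352,   # intergenerational, both by coder 1
    ("fisei2", "isei2"): 0.322,   # intergenerational, both by coder 2
    ("fisei2", "isei1"): 0.332,   # intergenerational, cross-coder
    ("fisei1", "isei2"): 0.321,   # intergenerational, cross-coder
}

# Index of reliability: square root of the best cross-coder correlation.
index_reliability = math.sqrt(r[("isei1", "isei2")])

# Relative loss for fathers compared with respondents (assumed reading).
father_loss = math.sqrt(r[("fisei1", "fisei2")] / r[("isei1", "isei2")])

# Average of the four intergenerational correlations.
intergen = [r[k] for k in r if k[0].startswith("f") and not k[1].startswith("f")]
mean_intergen = sum(intergen) / len(intergen)

print(round(index_reliability, 2), round(father_loss, 3), round(mean_intergen, 3))
```

The within-coder intergenerational correlation is 0.352 for coder 1 against 0.322 for coder 2, which is the "slightly more consistency" noted in the last bullet.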
MTMM assumptions
• Coders are equally reliable for fathers and
respondents.
• However, fathers’ occupations may be
coded less reliably than
respondents’ occupations.
• Systematic error is the same for all coders.
If estimated by SEM (Lisrel), we
learn:
• Reliability coder 1/coder 2: 0.915 / 0.886 (NS).
• Reliability FOCC/ROCC: 0.975 / 1.000 (NS).
• Coder unique consistency: 0.015 (significant).
• Corrected intergenerational correlation: 0.413.
The interesting conclusion for this (Italian) example is
clearly the corrected intergenerational correlation.
Note that this is even so with high coder reliability!
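As a back-of-the-envelope check (not the Lisrel estimation itself), the corrected correlation can be approximated by hand. The sketch below assumes that the reported reliabilities are standardized loadings and that the cross-coder intergenerational correlations are free of coder-specific consistency; both are interpretations, not statements from the source.

```python
# Estimates reported above, read here as standardized loadings (assumption).
coder1, coder2 = 0.915, 0.886
focc, rocc = 0.975, 1.000

# Cross-coder intergenerational correlations from the correlation table
# (father coded by one coder, respondent by the other).
cross_coder = [0.321, 0.332]
observed = sum(cross_coder) / len(cross_coder)

# Each cross-coder correlation is attenuated by one loading per coder
# and one loading per generation.
attenuation = coder1 * coder2 * focc * rocc
corrected = observed / attenuation

print(round(attenuation, 3), round(corrected, 3))
```

Under these assumptions the result comes out close to the reported 0.413, so the correction is essentially the observed correlation divided by the product of the loadings.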
Conclusions
• Even if coders do a decent and honest job,
they introduce random and systematic error.
• These errors arise in the coding process, not
in the data collection!
• If coders introduce only 10% error, they
bring down the intergenerational correlation
by 20%!
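The arithmetic behind the last point can be sketched as follows. It assumes that "10% error" means a reliability index (loading) of 0.9 for each of the two coded occupations, an interpretation the slide does not spell out; the symbols are illustrative.

```latex
r_{\text{observed}} = \lambda_{F}\,\lambda_{R}\,\rho = 0.9 \times 0.9 \times \rho = 0.81\,\rho
```

Here $\rho$ is the true intergenerational correlation and $\lambda_{F}$, $\lambda_{R}$ are the loadings for father's and respondent's occupation, so the observed correlation is roughly 20% lower than the true one.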
More sources of measurement
problems .. and their repairs
• It is important to see that coder errors are just one
single source of bad measurement.
• It might be true that even bigger trouble is created
by what the respondents say.
• If you want to assess measurement error at the
respondent level, you need to ask the question
twice:
– Within the same interview.
– From different sources (e.g. spouses about each other).
– At different interviews, e.g. in panel designs.
Another source of error: the
respondent.
• Note that all of the above is about errors generated
in the coding process.
• Occupational measures also contain other errors,
most prominently generated by the respondent /
interviewer.
• This type of error can only be estimated by asking
the question again:
– In the same interview.
– From a different source (e.g. spouses about each other).
– In a different interview (panel).
Can you ask the occupation question
again in the same interview?
• Yes, an acceptable way for respondents is to ask
an open question (see above) and a closed
question.
• Closed questions may not be as valid and flexible
as open questions, but they may be more reliable.
At least they do not suffer from coding error...
• This type of multiple measurement has been tried
in ISSP87 for four countries and six Dutch
surveys. It will be replicated in ISSP09.
Main conclusions on double
measurement
• Crude closed questions are slightly more reliable than
detailed open questions.
• Crude questions suffer slightly more from systematic error
than detailed questions:
– Correlated error (‘echo effects’)
– Education bias.
• However, the main boost comes from using multiple
indicators, which leads to disattenuation. Estimates from
ISSP and Dutch data suggest measurement relationships of
around 0.85. This would suggest that coding error is the
major source of random error.
ISCO 2008
• ILO has recently revised ISCO to ISCO-08.
• Current situation is that the new
classification has been fixed and published.
• However, there are no definitions or
manuals available yet.
• For previous versions it took 1-2 years
before these became available.
Stated goals of ISCO-08
• Bring occupational classification in line with
changed technologies and division of labor (e.g.
ICT/IT).
• Make ISCO applicable in a wider range of
countries and economies.
• Mend often-noted problems with the
application of ISCO-88.
• Produce a minor revision, not a totally different
classification.
Problems with ISCO-88 (1)
• Unlike its predecessor (ISCO-68), ISCO-88
is primarily skill oriented. However, in
practice the major group differentiation
does not closely correspond to major
ISCED (education) levels.
• ISCO-68 was more sensitive to employment
status (self-employment) and industry.
Problems with ISCO-88 (2)
• Despite its stated principles, it is hard to do
justice to skill level differentiation in manual
work. ISCO-88 differentiates between (7000)
Craft Workers and (8000) Machine Operators,
which is similar to, but not the same as, Skilled
versus Semi-skilled Manual Workers.
• In addition, many occupations occur both in the
7000 and 8000 categories.
Problems with ISCO-88 (3)
• ISCO-88 argued that occupation and
employment status are different things and
need to be measured separately.
• As a consequence, some employers became
classified with their employees; in particular,
there is no distinction between managing
proprietors and managers, nor between
working proprietors and their employees.
Problems with ISCO-88 (4)
• Managers were organized into three levels:
– Corporate managers
– Department managers [Production, Support]
– General [Small enterprise] managers.
• The primary distinction here is the number of
managers in an organisation, which is not often
available in data.
• It is somewhat hard to classify work supervisors
[Foremen] in ISCO-88.
Problems with ISCO-88 (5)
• Farmers are hard to classify in ISCO-88, because
they appear in 5 places:
– Operations Department Manager (1211)
– Small Establishment Manager (1311)
– Skilled Agricultural Worker (6100)
– Subsistence Farmer (6200)
– Farm Laborer (9200)
• None of this corresponds closely to distinctions
made in farm work in national classifications.
Problems with ISCO-88 (6)
• ISCO-88 is overly broad in (5000) Service
and Sales Occupations.
• In particular (5200) Sales Workers is very
undifferentiated.
Problems with ISCO-88 (7)
• It is hard to find fitting codes for ‘crude’
occupations: factory worker, skilled worker,
foreman, semi-skilled worker, apprentice.
• However, in some instances, there is no
problem if one uses major and sub-major
group codes: e.g. (9000) for Unskilled
Worker.
ISCO-08 versus ISCO-88
             ISCO-08 groups   ISCO-88 groups
Major              10               10
Sub-major          34               28
Minor             120              115
Unit              403              363
Total             567              516
Mergers and Splits
• Mergers: Many-to-one recodes.
• Splits: One-to-many recodes.
• Mergers & splits: Many-to-many recodes.
• All of these occur when comparing ISCO-88 to
ISCO-08.
• When we crosswalk from 88 to 08 (and have no
further information), only mergers are relevant.
• When we have ISCO-88 and further information (like
the original verbatim information or the original source
classification), we also need to consider splits.
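The three kinds of recode can be told apart mechanically from a crosswalk table. The sketch below labels each (old, new) pair; the function name `classify_crosswalk` and the codes are hypothetical and are not actual ISCO-88/ISCO-08 correspondences.

```python
from collections import defaultdict


def classify_crosswalk(pairs):
    """Label each (old, new) pair as one-to-one, merger, split, or merger & split."""
    new_for_old = defaultdict(set)
    old_for_new = defaultdict(set)
    for old, new in pairs:
        new_for_old[old].add(new)
        old_for_new[new].add(old)

    labels = {}
    for old, new in pairs:
        splits = len(new_for_old[old]) > 1  # old code goes to several new codes
        merges = len(old_for_new[new]) > 1  # new code receives several old codes
        if splits and merges:
            labels[(old, new)] = "merger & split"
        elif merges:
            labels[(old, new)] = "merger"
        elif splits:
            labels[(old, new)] = "split"
        else:
            labels[(old, new)] = "one-to-one"
    return labels


# Hypothetical codes for illustration only.
pairs = [
    ("A1", "X1"),                # one-to-one
    ("B1", "Y1"), ("B2", "Y1"),  # merger: many-to-one
    ("C1", "Z1"), ("C1", "Z2"),  # split: one-to-many
    ("D1", "W2"), ("D2", "W2"), ("D1", "W3"),  # D1 is both split and merged
]
labels = classify_crosswalk(pairs)
for pair in pairs:
    print(pair, labels[pair])
```

The labels are assigned per pair, so within a many-to-many cluster some pairs can still come out as a plain merger or a plain split.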
Mergers
Table X2: Mergers that occurred to occupation codes when transferring ISCO-88 into
ISCO-08, by number of digits of ISCO-08. Empty cells are shown as "-".
            MERGER
DIGIT08       1      2      3      4      5     5+   Total
1            10      -      -      -      -      -      10
2            30      4      -      -      -      -      34
3           105     14      1      -      -      -     120
4           317     64     13      5      2      2     403
Total       462     82     14      5      2      2     567
Splits
Table X1: Splits that occurred to occupation codes when transferring ISCO-88 into
ISCO-08, by number of digits of ISCO-88. Empty cells are shown as "-".
            SPLITS
DIGIT88       1      2      3      4      5     5+   Total
1            10      -      -      -      -      -      10
2            24      1      3      -      -      -      28
3            92     14      7      1      -      1     115
4           274     56     21      4      1      7     363
Total       400     71     31      5      1      8     516
Major groups
• 10 major groups: Essentially unchanged, with
minor changes of titles.
• However: If minor groups have been moved
between major groups (see below), this de facto
changes major groups too!
• The major group that is likely most affected by
such shifts is (5000), and in particular (5200) Sales
Workers, which now contains a number of
Elementary Sales Occupations.
Sub-major groups (2 digits)
• 34 sub-major groups: expanded from 28 sub-major groups.
• Truly NEW:
– (0100, 0200, 0300) Army ranks (3x)
– (9400) Food Preparation Workers
• Other ‘new’ sub-major groups are ‘upgraded’ or ‘merged’
minor groups. Roughly speaking, about half of the
sub-major groups have remained the same; the other
half have a different composition than in 1988.
ICT occupations
• Altogether, ISCO-08 distinguishes ca. 20 ICT
occupations, which occur at several levels:
– (2500) ICT Professionals (11x)
– (3500) ICT Technicians (5x)
– (1330) ICT Service Manager (1x)
– (2356) ICT Teachers (1x)
– (2434) ICT Sales Professionals (1x)
• Neither (2500) nor (3500) is new – actually both
existed already in ISCO-68!
Problem 1: Imperfect skill orientation
• Some ambiguities between (7000) Craft
Workers, and (8000) Machine Operators
have been removed.
• A NEW feature is the distinction between
(8100) Stationary Machine Operators, and
(3130) Process Control Technicians, which
probably refers to the complexity of the
process / machine controlled / operated.
Problem 2: Employment status
• Although somewhat indirect, ISCO-08 has
better fitting codes for Large Entrepreneurs
and Foremen.
• There is an ambiguous distinction between
(1420) Retail and Wholesale Trade
Managers, and (5221) Shop Keepers.
Problem 3: Managers
• The implicit reference to firm size (i.e. number of
departments) has disappeared; the same things are
now referred to by main activity.
• At the sub-major group level Corporate Managers
are no longer grouped with department
managers, but with (high) Government Officials.
• Major changes occur at the 3-digit and 4-digit
levels.
– (1330) ICT Services Managers
– (1340) Professional Services Managers (9x)
Problem 4: Farmers
• Self-employed farmers can still be coded
as (1310) Managers in Agriculture etc.
• However, it also remains possible to code
them with (6100) Market-oriented Skilled
Agricultural Workers.
• Interestingly, a NEW feature is that (6200)
Subsistence Farmers now has four minor
groups.
Problem 5: Crude Sales / Service
• Salespersons are split:
– (5221) Shop Keepers
– (5222) Shop Supervisors
– (5223) Shop Sales Assistants
This is an improvement.
• Also, more levels and locations of sales (market,
stall, cashiers) have been regrouped in the sub-major group (5200).
This has made the sub-major group even more
heterogeneous than it was.
Interesting ..
• Cooks are now split up into
– (3434) Chef [a “Culinary Associate
Professional”]
– (5120) Cooks
– (9400) Food Preparation Workers
• (9411) Fast Food Preparers
• (9412) Kitchen Helper
• I am very happy with this...
Problem 6: Crude occupations
• Some of the new features mend this
problem:
– “Foreman” can now be classified as (3120)
Production Supervisor.
– “Shop keeper” can go in two places.
– “Skilled Worker” can be more conveniently
coded as (7000).
Interesting ...
• Specialized Secretaries and Office Managers are now in
(3000) Associate Professionals.
• Some new occupations:
– (2230) Traditional and Complementary Health Professional
– (5245) Service Station Attendant
– (7234) Bicycle Repairman
– (9334) Shelf Filler
– (9412) Kitchen Helper
• Disappeared:
– (2121) Mathematician, Statistician
– (6142) Charcoal Burner
How can we reclassify existing
data?
• A simple conversion of ISCO-88 into ISCO-08 is
not possible.
• A conversion tool will become available that will
do two things at the same time:
– Straight recode of ISCO-88 into ISCO-08 (‘best fit’).
Truncate trailing decimals, if this is the only thing that
you want or can do.
– Trailing decimals suggest the number of alternatives
(splits). You will have to consult a separate document to
list these options. For this to be useful you will need
original strings or classifications.
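Since the conversion tool is not yet available, its output format is unknown. The sketch below only illustrates the idea of separating the best-fit code from the trailing decimals; the function name `split_crosswalk_value`, the string input, and the example value are assumptions, and the decimals are returned uninterpreted.

```python
from decimal import Decimal


def split_crosswalk_value(value: str):
    """Split a crosswalk value into (best-fit code, trailing decimal digits)."""
    number = Decimal(value)  # Decimal avoids binary floating-point surprises
    code = int(number)       # truncation drops the trailing decimals
    _, _, decimals = value.partition(".")
    return code, decimals


# Hypothetical values; the real tool may encode alternatives differently.
print(split_crosswalk_value("5221.2"))
print(split_crosswalk_value("5221"))
```

Keeping the decimals as a separate field lets you use the straight recode immediately and return to the flagged cases once the original strings or source classifications are at hand.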