Establishment of Common Framework final report

IDB/ CARICOM REGIONAL PUBLIC GOOD
COMMON FRAMEWORK FOR A LITERACY SURVEY
PHASE 1:
Establishment of Common Framework
FINAL REPORT
May 2013
Preface
This report was reviewed by the CARICOM Advisory Group on Statistics (AGS) and the
Secretariat and was revised to enhance clarity.
The Secretariat is indebted to the Consultant, Scott Murray, for his methodological inputs
and guidance.
Table of Contents
Acronyms ......................................................................................................................... 6
Executive Summary ..................................................................................................... 7
INTRODUCTION TO THE PROJECT ..................................................................... 15
Background ............................................................................................................... 15
Objectives ..................................................................................................................... 15
Scope of work/Expected output/Results to be achieved of the Project ........................ 16
CHAPTER 1: BACKGROUND TO LARGE-SCALE LITERACY
ASSESSMENTS, REVIEW OPTIONS AND EVALUATION CRITERIA .......... 20
1.1. A Brief History of Large-Scale Literacy Assessment .......................... 20
1.2. The Assessment Options ............................................................................ 30
1.3. Approach to Implementing Household-Based Literacy Skills
Assessments ............................................................................................................. 35
1.4. The Evaluation Criteria and Regional Constraints ............................ 40
CHAPTER 2:
A REVIEW OF OPTIONS .............................................................. 46
2.1. Program for International Assessment of Adult Competencies
(PIAAC) - Common Assessment .......................................................................... 47
2.2 Program for International Assessment of Adult Competencies
(PIAAC)- Full Assessment ..................................................................................... 49
2.3. Literacy Assessment and Monitoring Program (LAMP) .................... 55
2.4. Saint Lucian Instruments- Common Assessment ............................. 58
2.5. Saint Lucian Instruments- Full Assessment ....................................... 63
2.6. Bow Valley Web-based Assessment- Common Assessment ........... 63
2.7. Bow Valley Web-based Assessment- Full Assessment ..................... 67
2.8. Summary ......................................................................................................... 68
CHAPTER 3: ANALYSIS OF THE LITERACY SURVEY EXPERIENCE IN
THE REGION ………………………………………………………………………..70
3.1. Bermuda’s Experience ................................................................................ 70
3.2. Saint Lucia’s Experience ............................................................................ 74
3.3. Proposed Work in Dominica ...................................................................... 78
CHAPTER 4: FEEDBACK FROM THE CARICOM ADVISORY GROUP ON
STATISTICS (AGS)....................................................................................................... 79
CHAPTER 5: ANALYSIS OF THE INDIVIDUAL COUNTRY CAPACITY
ASSESSMENTS ............................................................................................................ 84
CHAPTER 6: DETAILS ON THE OPTION RECOMMENDED BY THE
AGS- FULL BOW VALLEY ASSESSMENT ......................................................... 117
6.1. Detailed Methodological Approach of the Full Bow Valley Web-Based Assessment .................................................. 117
6.2. Recommendations to Inform the Use of the Bow Valley Full Web-Based Assessment .................................................. 119
6.3. Adjustments Required ............................................................................... 122
CHAPTER 7: RESULTS OF THE TWO REGIONAL TRAINING
WORKSHOPS CONDUCTED UNDER PHASE 1 .............................................. 124
7.1 The First Regional Training Workshop ................................................ 124
7.2 The Second Regional Training Workshop ........................................... 125
CHAPTER 8: COMMON FRAMEWORK WITH THE PLAN OF ACTION ... 126
CHAPTER 9: SUMMARY AND CONCLUSION ................................................. 151
9.1 Activities Completed Under Phase I ....................................................... 157
LIST OF REFERENCES ........................................................................................... 158
ANNEX A: TERMS OF REFERENCE ................................................................... 159
ANNEX B: COUNTRY ASSESSMENT QUESTIONNAIRE ........................... 167
ANNEX C: COSTING FOR SAINT LUCIA’S LITERACY SURVEY PILOT .. 174
ANNEX D: SMALL AREA ESTIMATION .............................................................. 177
ANNEX E: REPORTS ON THE CARICOM TECHNICAL WORKSHOPS ON THE
COMMON FRAMEWORK FOR A LITERACY SURVEY
ANNEX EI: First Workshop Report
ANNEX EII: Second Workshop Report
ANNEX F: INCEPTION REPORT ................................................................ 183
ANNEX G: COMMON FRAMEWORK WITH PLAN OF ACTION .......................... 184
Acronyms
AGS - Advisory Group on Statistics
ALLS - Adult Literacy and Life Skills Survey
BPC - Board of Participating Countries
CAPI - Computer-assisted Personal Interviewing
CARICOM - Caribbean Community
CCL - Canadian Council on Learning
CSME - CARICOM Single Market and Economy
DEELSA - Directorate for Employment, Education, Labour and Social Affairs
DeSeCo - Definition and Selection of Competencies
ETS - Educational Testing Service
GDP - Gross Domestic Product
HRSDC - Human Resources and Skills Development Canada
IALS - International Adult Literacy Survey
IDB - Inter-American Development Bank
IRT - Item Response Theory
ISRS - International Survey of Reading Skills
LAMP - Literacy Assessment and Monitoring Program
LSUDA - Literacy Skills Used in Daily Activities
MOE - Ministry of Education
NALS - National Adult Literacy Survey
NCES - National Center for Education Statistics
NSO - National Statistics Office
OECD - Organization for Economic Cooperation and Development
PIAAC - Program for International Assessment of Adult Competencies
PISA - Programme for International Student Assessment
PoA - Plan of Action
SALNA - Saint Lucia Adult Literacy and Numeracy Assessment
TAG - Technical Advisory Group
TOR - Terms of Reference
UIS - UNESCO Institute for Statistics
UNESCO - United Nations Educational, Scientific and Cultural Organization
USA - United States of America
YALS - Young Adult Literacy Survey
Executive Summary
The following is a summary of the Final Report of Phase I:
Options Reviewed
1. The review of options examined four distinct assessments. In addition, for three of the options, both a full sample size and a reduced sample size were considered. Therefore, a total of seven detailed options were identified and evaluated.
2. It was noted that the methodology underpinning all of the options reviewed was based on the same foundation, namely the International Survey of Reading Skills (ISRS).
3. Each option was evaluated in terms of information yield, cost, operational burden, technical burden and risk.
4. The assessment provided an analysis of the origins of large-scale literacy measurement, including the ISRS, the Adult Literacy and Life Skills Survey (ALLS) and the Young Adult Literacy Survey (YALS).
5. The review of the various literacy assessment approaches conducted by the Consultant suggests that all of the options would satisfy the Region’s information needs and could provide similar levels of reliable estimates. Further, as indicated in (2), all the options have common methodological underpinnings and may differ only in the data collection method employed. Some countries may opt for paper and pencil data collection while others may opt for some form of electronic/web-based data collection.
6. Therefore, one can conclude that the recommendation of the AGS is primarily a choice of data collection approach, based on the evaluation criteria, rather than a choice among theoretical differences.
Regional Experience
7. The evaluation reflects the needs and constraints facing the countries of the Region.
8. With the exception of Saint Lucia and Bermuda, countries have limited or no experience in the conduct of household-based skills assessments; limited access to financial resources to support an assessment; national data collection infrastructures on which a household-based skills assessment would impose a heavy operational burden; and limited technical capability to support the adaptation and implementation of household-based skills assessments.
9. Generally, the results of the country assessment suggest that most countries in the Region have limited capacity to administer any of the full-scale assessment options, including the full Bow Valley web-based assessment. Most countries would need to greatly enhance their collection and processing capacity and most would need assistance and support to complete the technical aspects of implementation, including sample selection.
Key Issues Arising Out of Evaluation and Feedback from Countries
10. The evaluation suggests that all of the options reviewed would satisfy the Region’s information needs.
11. Member States indicated a preference for the use of paper and pencil as well as electronic data collection, including web-based procedures.
12. It was recommended that each country be considered a separate domain and that the sample size should be proportionate to the size of the population of the country.
13. It was noted that while a very high response rate is usually difficult to achieve for this particular type of survey, countries should strive to achieve the required response rate of about 75 to 80 percent. Measures should be implemented to adjust for non-response bias.
14. It was noted that the use of a web-based assessment might prove challenging for non-computer users. The meeting was advised that in such cases there are two options: (i) a specially designed tutorial could be taken in advance of the assessment to acquaint non-computer users with basic mouse operations and the response types; or (ii) the interviewer could input the responses into the data collection device at the direction of the respondent.
Recommendations from Framework
15. Sample Size
In general, the size of the sample should depend on country-specific policy requirements, subject to cost and the budget available to conduct the assessment in the respective countries.
In order for the point estimates to be reliably estimated according to
selected characteristics, the Consultant has indicated that a minimum of
600 cases is required per category for each characteristic.
However, this figure of 600 cases is based on a desired level of
precision (margin of error) and level of confidence. It is possible,
therefore, that if the tolerable level of confidence is, say, 90 percent
and the margin of error is, say, 5 percent, then the number of cases
level of confidence that they would normally use in their household
surveys to derive reliable estimates at the sub-national level and for
sub-groups of the population. This issue will be discussed further in
the guidelines for sample design that will be prepared under Phase
II.
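To illustrate the arithmetic behind this recommendation, the sketch below computes the number of cases needed to estimate a proportion under a standard normal approximation. It is an illustration only, not part of the framework; the worst-case proportion of 0.5 and the design effect values are assumptions, and the Phase II guidelines would govern actual designs.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(margin_of_error, confidence, p=0.5, deff=1.0):
    """Cases needed to estimate a proportion p to within the given margin
    of error at the given confidence level, inflated by the design effect."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # two-sided critical value
    n_srs = (z ** 2) * p * (1 - p) / margin_of_error ** 2  # simple random sample
    return ceil(n_srs * deff)

# The 90 percent confidence / 5 percent margin of error case from the text
print(required_sample_size(0.05, 0.90))            # 271 cases under SRS
print(required_sample_size(0.05, 0.90, deff=2.0))  # 542 with a design effect of 2
```

As the output shows, under these looser tolerances the per-category requirement can indeed fall below the 600-case rule of thumb.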
16. Adequate communication infrastructure including internet access
If the preferred data collection method is web-based, it is recommended
that countries identify at an early stage, areas where internet coverage or
access are inadequate.
In the absence of the internet, the Bow Valley assessment tools allow for
the use of the 3G network or related networks. The assessment tools also allow for off-line data collection with the use of large cache memory, in which case a delayed data download is employed after the interviews. In addition, central locations can be identified where internet access is available and where respondents could be interviewed.
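The mechanics of Bow Valley's off-line mode are not documented here; the following is a minimal sketch of the general pattern of caching completed interviews locally and deferring the upload until connectivity is available. The cache file name and upload endpoint are hypothetical.

```python
import json, os, urllib.request

CACHE_FILE = "pending_responses.jsonl"            # local cache (the "cache memory")
UPLOAD_URL = "https://example.org/api/responses"  # hypothetical collection endpoint

def save_response(record):
    """Append a completed interview to the local cache, online or not."""
    with open(CACHE_FILE, "a") as f:
        f.write(json.dumps(record) + "\n")

def upload_pending():
    """Attempt the delayed upload; keep records that fail for a later retry."""
    if not os.path.exists(CACHE_FILE):
        return
    remaining = []
    with open(CACHE_FILE) as f:
        for line in f:
            try:
                req = urllib.request.Request(
                    UPLOAD_URL, data=line.encode(),
                    headers={"Content-Type": "application/json"})
                urllib.request.urlopen(req, timeout=10)
            except OSError:   # no connectivity: retain for the next attempt
                remaining.append(line)
    with open(CACHE_FILE, "w") as f:
        f.writelines(remaining)
```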
17. Sharing of equipment to conduct survey across countries
It is recommended that there be sharing of equipment across countries to
make it feasible for all countries to participate in the survey.
This approach is possible since countries are not likely to execute the
survey at the same time. This approach will result in a considerable
reduction in the overall survey cost per country. Countries can therefore
contribute to the purchasing of the equipment, mainly laptops, tablets
and other similar devices. This approach would satisfy concerns raised
by some countries relative to the cost of acquiring equipment.
18. Respondents with limited or no computer technology knowledge
The recommendation is for interviewers to input responses on the device
as directed by the respondents since the assessment tools allow for this.
Additionally, tutorial sessions on the use of the devices to respond to the
questions, should be made available to the respondents prior to the test.
19. Adequate human resources
Since the majority of the countries currently lack the operational and
technical capacity to conduct a literacy survey in general and specifically
the Full Bow Valley Web-Based assessment, it is recommended that
countries/ Region consider this limitation when preparing the budget for
their assessment.
High-level technical experts should be engaged to provide the training and to bridge the gap.
20. Method of selection, age and number of respondents per household
It is recommended that one adult, aged 15 years and over, be selected per household using the Kish selection method.
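For illustration, a simplified version of Kish-style selection is sketched below. Real Kish tables use fixed letter codes (A, B1, B2, ...) assigned to households in prescribed proportions; the arithmetic rule here is a stand-in that merely preserves the idea of a pre-assigned table row determining which person on an ordered roster is interviewed.

```python
import random

def kish_select(adults, table_row):
    """Select one respondent from an ordered household roster of persons
    aged 15 and over. 'adults' must be ordered by a fixed rule (e.g. males
    oldest to youngest, then females oldest to youngest); 'table_row' is
    the row of the selection table assigned to the household in advance."""
    size = min(len(adults), 8)
    # Simplified stand-in for a Kish table: row and roster size -> position
    position = (table_row * size) // 8
    return adults[position]

# Roster ordered males then females, oldest first; row pre-assigned at sampling
household = ["M/52", "M/19", "F/47", "F/15"]
print(kish_select(household, table_row=random.randrange(8)))
```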
21. Relevance of assessment instruments
There must be country involvement in the development or refinement of test
items, questionnaires and corresponding documents including manuals for
training, interviewing and tutorials for respondents to ensure suitability to
the respective countries.
22. Generation of synthetic estimates
It is recommended that synthetic estimates be generated by applying
the national survey estimates obtained (using a sub-sample of 1,000 cases
of the determined country sample) to the data of the Population and
Housing Census.
This approach is said to be a useful method to obtain estimates for a
broader range of characteristics to satisfy policy requirements and
utilizes applied statistical methods. This approach does not imply that
countries will be utilising the recommendation of the Consultant of
considering the Region as a domain and the countries as sub-domains
with the sample size of 1,000. The sample size will be selected in
accordance with Recommendation 1 and a sub-sample of this sample
can still be applied to the Census data to produce synthetic estimates in
addition to those that would be obtained from the survey.
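A minimal sketch of this cell-based approach to synthetic estimation follows, assuming pandas is available; the column names and toy data are illustrative, not the actual survey variables. It estimates literacy-level shares within background cells from the survey sub-sample and applies them to census counts for the same cells.

```python
import pandas as pd  # pandas assumed available

# Toy stand-ins: a scored survey sub-sample and census records that share
# the same background variable (names are illustrative only)
survey = pd.DataFrame({
    "age_group": ["15-24", "15-24", "25-44", "25-44", "25-44", "45+"],
    "literacy_level": [1, 2, 2, 3, 3, 1],
})
census = pd.DataFrame({"age_group": ["15-24"] * 500 + ["25-44"] * 900 + ["45+"] * 600})

# Step 1: estimate the share at each literacy level within each background cell
rates = (survey.groupby("age_group")["literacy_level"]
               .value_counts(normalize=True).rename("share").reset_index())

# Step 2: apply the survey-based shares to census counts for the same cells
counts = census.groupby("age_group").size().rename("persons").reset_index()
synthetic = counts.merge(rates, on="age_group")
synthetic["estimated_persons"] = synthetic["persons"] * synthetic["share"]

# Step 3: aggregate to any domain the census supports
print(synthetic.groupby("literacy_level")["estimated_persons"].sum())
```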
23. Pretesting/ piloting must be done in each participating country
This is necessary to ensure that all the tools are applicable for the
respective countries. However, the sample (100 to 500 cases) used in the pilot should be selected from the main survey sample so that the data collected during the pilot exercise can be utilized should there be no need for any major modification to the tools.
24. Translation of the common framework to Haitian French and
Surinamese Dutch
The translation of the framework including the test items should be done
by linguists who are familiar with the framework. The translation of the
test items, for example those included in the filter booklet, location booklet and the main booklet, should be done in such a way as to ensure that the psychometric performance of the items remains unaltered. This is necessary to ensure that the test items remain identical in psychometric terms so as to ensure comparability among countries.
25. Duration of training of field staff
The length of training of field staff would depend on the quality of field
staff and would vary by country.
Main comments received from Member States relative to the
Framework
26. Countries have indicated that the cost of the exercise will be a
major concern. The absence of technical expertise for conducting
the survey will also pose a problem. This includes expertise
in the areas of survey sampling, data editing, scoring and
weighting of the results, variance estimation and statistical
quality control of the operations.
27. It was also indicated that the respective Ministries of Education
may not have the necessary managerial capacity to undertake the
Literacy Survey and that the respective statistical agencies may
also lack the pedagogical skills to work on the instruments.
Therefore, collaboration between the two agencies would be
required and should be possible.
28. It was observed by one country that the paper and pencil-based
environment is more familiar than the web-based
one and that a significant culture shift would be required in the
case of the latter. Ensuring suitability of the web-based approach
at the country level should be taken seriously. Actual devices
should be tested and there is also concern about the location and
security of the data set.
29. It was also stated that the paper and pencil-based approach
carries credibility and ownership, since the scoring is done by
persons such as teachers in the country trained to score the
completed test booklets. Therefore, especially with the active
involvement of these persons in scoring, there will be more buy-in
to the process.
30. With respect to the web-based approach, it is very important that
mechanisms be set up to ensure that the validity of the process is
well understood and thoroughly tested.
31. It must be possible for the assigned score on each case (which is
assigned by the web-based system) to be seen, validated and
verified. The process of doing this in a web-based system is not
obvious and will need to be thoroughly tested.
32. It was indicated by another country, relative to a One Laptop Per
Family (OLPF) project, that this country might be at an advantage
if a web-based methodology is used. This country viewed
favourably the proposed solutions for areas within countries that
do not have internet connectivity as well as for persons that were
not computer literate.
33. One country stated that oversampling of the 15-24 age group
(which includes the population just completing secondary school)
might be necessary since the literacy survey may have, as one of
its main objectives, the assessment of the education system in
providing an education relevant to the present-day realities of the
job market. It was further stated that this group is of special
interest since it has the greatest demand for jobs, requires new
skills in some cases, is the most adaptable to retraining, and is a
significant age cohort within the population.
34. Of the 16 countries that responded to enquiries relative to the data
collection approach they are likely to use for the conduct of a
National Literacy Survey in their respective countries, seven
indicated the paper and pencil-based approach while nine
indicated the electronic approach. However, it should be noted that
it is not likely that all the countries indicating electronic would opt
for the Bow Valley web-based option. However, the theoretical
underpinning of the literacy testing framework will be the same regardless of the approach used (i.e. paper-based versus electronic/web-based).
Achievements of the Technical Workshops
35. Two technical workshops were conducted under Phase I. The main
achievements of these workshops were that participants gained a
better understanding of what is involved in the planning and
execution of a large-scale literacy assessment; the various literacy
measurement approaches; the issues pertaining to costing; sample
size estimation; the background questionnaire and assessment/test
booklet; and the Plan of Action relative to the recommendations and
actions that inform the common framework.
INTRODUCTION TO THE PROJECT
Background
Consistent, reliable and comparable statistical information is an
important ingredient for planning, monitoring and evaluation of policy
decisions and the lack thereof has long been considered as hampering
the effectiveness of public policy in the Caribbean Region. Consequently,
noting the lack and/or poor quality of literacy statistics in the Region,
and the perceived importance of this information for the Region’s
economic and social development in light of the advent of the CARICOM
Single Market and Economy (CSME), Statisticians of the Caribbean
Community (CARICOM) decided to pursue the development of a
CARICOM programme for the collection and analysis of social and gender
statistics including educational statistics.
Aware of these challenges and shortcomings of the existing approaches
to measuring literacy, the CARICOM Advisory Group on Statistics (AGS),
is attempting to develop a common framework for the production,
collection and analysis of Literacy data. The AGS recruited a consultant
to facilitate the development of a common framework for a Literacy
Survey for the Region.
Objectives
This project is designed to create a Common Framework involving a
regional approach to the conduct of literacy assessments,
the development of literacy assessment instruments and the provision of
technical assistance for the development of national implementation
plans. The project is entitled “Common Framework for a Literacy Survey”
(Project ATN/OC-11810-RG under the Regional Public Good Facility)
and is being funded from resources of the Inter-American Development
Bank (IDB). The executing agency is the CARICOM Secretariat under its
Regional Statistics Programme.
The program of work set out for the IDB-financed public good project
consists of three components/ phases as follows:
Component/Phase I: Establishment of a regional framework for
conducting and adapting literacy assessment models to facilitate
a regional assessment for the execution of the
Literacy Survey, treating each country as a sub-population;
Component/Phase II: Development and adaptation of instruments,
such as survey instruments (questionnaires), training manuals,
and related materials, to inform about the survey, documentation
on the concepts and definitions, scoring of the assessment,
sampling approach, data dissemination/tabulation format, etc. as
part of the common framework; and
Component/Phase III: Development of a template for the national
implementation plans using a common questionnaire, field test
procedures for establishing the psychometric and measurement
property of the survey instrument, and confirmation of key aspects
of survey cost and quality.
As an initial step in the preparation of this framework, reviews were
undertaken of the methodologies used in LAMP, the ISRS, the Saint Lucia
assessment, the Bow Valley web-based assessment and PIAAC.
The purpose of this report is to document the comprehensive review, the
consultations with the AGS and the Secretariat, the responses to the
country assessment questionnaire, the detailed methodological approach
with recommendations, adjustments required, actions to be taken and
the actual methodology to be utilized.
Scope of work/Expected output/Results to be achieved of the Project
Component/Phase I Activities:
As the initial step of Phase I and of the Project as a whole, the
Consultant was required to engage in a Briefing Meeting with the
Regional Statistics Programme, CARICOM Secretariat to discuss the
scope of work of the project. The meeting was held on 20 May 2011
at the CARICOM Secretariat, Georgetown, Guyana and was attended by
the Regional Statistics Programme as well as officers from other
key directorates within the Secretariat (see Annex G for the Report on
the Briefing Meeting).
The meeting clarified the scope of the project, the assessment options to
be reviewed, the risks and constraints that needed to be taken into
consideration, the expected outputs, the project timeline, reporting
requirements and logistics.
The meeting also made it clear that the consultancy was not to focus on
any aspect of implementation but rather provide CARICOM Member
States with options for consideration.
Following the Briefing meeting, an inception report was prepared to
document the project/ consultancy execution methodology and the
detailed draft work plan/ implementation schedule for the lifetime of the
project/consultancy. This Inception Report, a copy of which is included
as Annex F, reflected input (improvements/adjustments) received from
the Briefing Meeting held at the Secretariat on 20 May 2011 and from
the 8th Advisory Group on Statistics (AGS) meeting held in Kingston,
Jamaica, from 27 to 29 June 2011.
The inception report stressed the importance of skills to economic growth
and competitiveness and the need for reliable comparative data to inform
public policy. The report also stressed the high cost and the technical and
operational burden of household survey-based skills assessments, and the
risk of these burdens overwhelming the limited capacities of Member
States’ statistical systems.
The project description, the project scope, the expected outputs, the
reports to be delivered as part of the project, and the risks and
assumptions relative to the project were all elaborated on in the
Inception Report. An initial draft schedule of activities was also included
(see Annex F).
As an initial step in the preparation of the common framework,
comprehensive reviews will be undertaken to assess the methodologies
used in the LAMP, International Survey of Reading Skills (ISRS), Saint
Lucia Adult Literacy and Numeracy Assessment (SALNA), the Program for
International Assessment of Adult Competencies (PIAAC) and the Bow
Valley web-based assessment.
These comprehensive methodological reviews will also identify any
problems in application to CARICOM and adjustments that would be
required in adapting a common approach considering the constraints
facing Member States and Associate Member States. In addition, detailed
reviews will be undertaken of Bermuda’s experience with implementing the
Adult Literacy and Life Skills Survey (ALLS) study and, prospectively, of
the assessment planned in Dominica.
It is expected that the Secretariat and the AGS will review this report and
will make recommendations on an option to serve as a common
framework for literacy assessment in the CARICOM area.
A Plan of Action (PoA) of the agreed approach will be prepared and
submitted. This PoA will outline the estimated cost and the sequence of
steps needed to implement the recommended option.
Support would then be given to the Secretariat and the AGS in the
preparation of the Final Common Framework for a Literacy Survey.
Component/Phase II Activities:
Based on the Common Framework developed in Phase I, all instruments
and related documents will be prepared in this phase. These include,
common instruments for pilot-testing and screening, sampling frame,
instructions for measuring/scoring the levels of literacy, training manual
(that speaks to the instruments to be used; training frequency and efforts
required; quality insurance mechanisms to be used during data
collection and subsequent data validation), guidelines for data processing
and format for data analysis and subsequent publication/dissemination.
The Member States will be consulted as per TOR (Phase II activities item
(a) x).
A draft report and a final report of Phase II activities will be prepared. The
former will include adjustments required based on feedback from the
AGS, Member States, the workshop and the Secretariat. The latter will
incorporate all comments/outputs from meetings/workshops, the
Secretariat and other stakeholders.
Component/Phase III Activities:
This component will provide technical assistance during and after a third
regional workshop to develop a template for the national implementation
plans and its adaptation to the national realities. The template for the
preparation of the national implementation plan will include a list of all
activities to be undertaken as contained in the detailed common literacy
framework and documentation to be obtained as per the survey
instruments including pilot-test and quality assurance. The draft
template will also include the cost estimates (budget) of all activities,
together with timelines, the number of interviewers, supervisors and
scorers required, training requirements, resources required for subsequent
data analysis, a social marketing effort to inform the general public, and a
sustainability component.
During this phase, a cost estimate for the conduct of two Literacy
Assessments in each CARICOM country will be prepared as per TOR
Phase III/Component III, item (c). Additionally, information will be
collected from member countries with regard to resources available at the
national level, staffing, budget, collaborating partners (Statistical Office,
Ministry of Education and other relevant agencies) and other relevant
information related to capacity availability/constraints that can inform
the template to be prepared. A final draft national implementation plan
will be produced to reflect feedback from the Secretariat and the AGS.
A draft report and a final report of Phase III activities will be prepared
and submitted to the Secretariat. Both reports will include the national
implementation plan template, but the latter will incorporate all
comments/outputs from meetings, the Secretariat and other
stakeholders.
CHAPTER 1: BACKGROUND TO LARGE-SCALE
LITERACY ASSESSMENTS, REVIEW
OPTIONS AND EVALUATION CRITERIA
This chapter describes the options that were selected for evaluation and
sets out the criteria that were applied in the review. The chapter begins
however, with a brief overview of the history of literacy assessments. This
will provide background on how the options being evaluated relate to one
another.
1.1. A Brief History of Large-Scale Literacy Assessment
1.1.1. Young Adult Literacy Survey (YALS)
The initial large-scale comparative assessments of adult literacy and
numeracy can be traced back to the Young Adult Literacy Survey (YALS)
conducted by the Educational Testing Service (ETS) on behalf of the
United States Departments of Labor and Education.
The YALS study was enabled by scientific advances in five key areas as
follows:
(a) The first advance involved the development of the theory to explain
the relative difficulty of reading and numeracy tasks. Developed by
Irwin Kirsch and Peter Mosenthal (Kirsch and Mosenthal, 1994),
the models explained a sufficiently high proportion of the observed
variance in item difficulty to provide a means to develop tests that
systematically sampled the cognitive domains of interest. The
initial models explained roughly 83 percent of observed variance in
item difficulty, enough to allow the results of the assessment to be
interpreted as reliable indications of generalized proficiency. This
innovation also provided a means to describe proficiency in an
effective way, namely to identify what respondents at different levels
could and could not do.
(b) The second advance involved the development of statistical
procedures to provide reliable summaries of both item difficulty
and individual and group proficiency. Referred to as Item
Response Theory (IRT), these statistical procedures extracted
reliable estimates of both item difficulty and proficiency out of
complex vectors of test results that themselves included a
significant amount of item-level non-response (see the sketch after
this list).
(c) The third advance involved the development of statistical methods
to support the estimation of reliable estimates for population
sub-groups, complete with unbiased estimates of standard errors that
include the error associated with the fact that one was drawing
representative samples in two dimensions, i.e. of people and of the
cognitive domains of reading and numeracy.
(d) The fourth advance was simply the development of procedures to
support the administration of the assessments within the context
of a household survey. These procedures included methods
designed both to maximize response rates and to minimize the
impact of partial non-response on the estimates of item difficulty
and proficiency.
(e) The fifth and final advance involved the development of procedures
to control the error in the scoring of open-ended test item
responses. Scoring error of this sort translates directly into bias in
the proficiency estimates, so procedures had to reduce these errors
to negligible levels. The approach developed involved the
adaptation of standard statistical quality control processes
commonly used in statistical coding operations by national
statistics offices.
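As a concrete illustration of the IRT models referred to in (b), the sketch below implements the two-parameter logistic (2PL) item response function. This is a textbook form of the model family used in these studies, not the specific scaling software any of them employed.

```python
from math import exp

def p_correct(theta, b, a=1.0):
    """Two-parameter logistic (2PL) IRT model: probability that a person
    of proficiency theta answers an item of difficulty b correctly.
    a is the item's discrimination; values are on a logit scale."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))

# A more proficient respondent has a higher chance on the same item
print(round(p_correct(theta=1.0, b=0.0), 2))   # 0.73
print(round(p_correct(theta=-1.0, b=0.0), 2))  # 0.27
```

Fitting such a model jointly over all respondents and items is what allows the studies to recover stable difficulty and proficiency estimates despite item-level non-response.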
1.1.2. Survey of Literacy Skills Used in Daily Activities (LSUDA) and National Adult Literacy Survey (NALS)
The conduct of the YALS study in the United States of America
precipitated the conduct of two national assessments in Canada. The
Southam study of 1989 applied the YALS approaches to measurement
and data collection but failed to apply the statistical methods used to
summarize item difficulty and proficiency. The 1989 Survey of Literacy
Skills Used in Daily Activities (LSUDA), conducted by Statistics Canada
was the first study to apply the full YALS methodology in two languages
i.e. Canadian English and Canadian French. The following year, ETS
fielded the 1990 National Adult Literacy Survey (NALS) on a large
representative sample of US adults.
1.1.3. International Adult Literacy Survey (IALS)
Comparison of results by Kirsch and Murray at a meeting organized by
the then UNESCO Institute of Education in Hamburg in November
1989 [1] led to the design and implementation of the International Adult
Literacy Survey (IALS) using a combination of NALS, LSUDA and newly
developed items. IALS was implemented by a consortium that involved
Statistics Canada, the US National Centre for Education Statistics
(NCES), ETS and the Organization for Economic Cooperation and
Development (OECD).
Statistics Canada provided overall project
management and household survey expertise related to sampling, data
collection, data processing, weighting, variance estimation, coding and
data analysis. ETS assumed responsibility for item development, test
design, scoring and related psychometric tasks. The OECD provided the
means for the rapid dissemination of results to policy makers. NCES
provided technical advice and funding to support development and
implementation. Three large-scale rounds of IALS data collection between
1994 and 1998 ensued, involving some 25 population/language sub-groups. The IALS data revealed several key facts:
(a) The existence of much larger differences in the level and
distribution of literacy and numeracy skills than expected
(b) Literacy and numeracy skills had a marked impact on a
range of individual labour market, health, social and
educational outcomes
(c) The impact of literacy and numeracy on outcomes varied
significantly by country in response to differences in the
relative balance of skills supply and the social and economic
demand for skills
(d) The observed differences in the level and distribution of
literacy skills by country explained over half of the
differences in long-term rates of growth in Gross Domestic
Product (GDP) and labour productivity
IALS also confirmed the need for more quality assurance related to
sampling and the control of data collection. Additional quality assurance
methods were added with each successive round of IALS data collection [2].

[1] Organized by Paul Belanger, the then Director of the UNESCO Institute of Education, and co-sponsored by the OECD, the workshop brought together assessment experts and policy makers from several countries to discuss functional literacy. Irwin Kirsch attended as a US expert, having worked on the 1985 U.S. Young Adult Literacy Survey (YALS) and the 1990 U.S. National Adult Literacy Survey (NALS). Scott Murray attended as a Canadian expert responsible for the conduct of the 1989 Canadian Survey of Literacy Skills Used in Daily Activities (LSUDA).
[2] The Consultant served as the International Study Director for all three rounds of data collection and analysis.
1.1.4. Adult Literacy and Life Skills Survey (ALLS)
Starting in 2000, the US NCES and Statistics Canada initiated a program
of work designed to inform the design of a new round of international
comparative assessment. The program included fundamental work on
the definition and selection of competencies (DeSeCo) that was jointly
funded by Statistics Canada, the NCES and the Swiss government.
Known as DeSeCo, this element identified several additional skills
domains that might be included. Subsequently, frameworks and
associated measures were developed for numeracy, teamwork, problem
solving and information and communication technologies. Testing in
multiple countries [3] revealed that only the initial prose literacy, document
literacy and the new numeracy and problem solving measures met the
demanding standards set for inclusion. The background questionnaires
were also refined and extended. Two rounds of data collection, involving
some 11 countries, were undertaken of what was known as the Adult
Literacy and Life Skills Survey (ALLS). Analysis of the data [4] using
synthetic cohort methods provided the first clear evidence of the impact
of skills loss on the supply of literacy skills. Bermuda participated in the
ALLS in 2003.
[3] Canada, United States, Spain, Netherlands.
[4] Done by the Consultant and a colleague.

1.1.5. International Survey of Reading Skills (ISRS)
Having established the nature of adult literacy and numeracy problems
in OECD economies and their impact, Statistics Canada, the NCES and
the ETS jointly developed a study to shed light on what might be done to
help low-level readers improve their skills. The subsequent study, fielded
in 2005 and known as the International Survey of Reading Skills (ISRS),
was the first international comparative study to test the component
reading skills of adults with a battery of clinical reading tests and an oral
fluency test. The testing, undertaken in Canadian English, American
English and Canadian French, showed that the component reading
measures played a significant role in explaining the emergence of general
reading proficiency. Moreover, analysis of the data identified several
groups of learners each of whom shared distinct patterns of strength and
weakness on the component measures [5].
The ISRS was designed to correct what was perceived to be a failure in
the market for literacy services i.e. that the instructional offerings of
programs were not well matched to the specific needs of different kinds of
learners with the result that the overall efficiency, effectiveness and levels
of learner satisfaction were well below what was possible. The ISRS
provided a means to classify potential learners into groups based on
patterns of test-takers’ strengths and weaknesses in the component
reading measures. This information allowed programs to create
homogeneous groups of learners and to tailor instruction to each group’s
specific needs. The availability of the ISRS data also allowed for a
nuanced analysis of the numbers of each group in the adult population,
what best practice would do to address their learning needs and an
analysis of the associated costs and benefits – all information needed by
policy makers and programs to target their resources more efficiently.
1.1.6. Literacy Assessment and Monitoring Programme (LAMP)
In 2005, the UNESCO Institute of Statistics commenced the adaptation
of the IALS/ALLS/ISRS methods to meet the needs of a broader range of
countries [6]. After the Institute had spent two years and a fair bit of money
to develop an approach known as the Literacy Assessment and
Monitoring Programme (LAMP), a partnership was negotiated with ETS
and Statistics Canada to adapt the ALLS and ISRS measures and
methods. Development of LAMP was completed in early 2007. The
resulting design included a background questionnaire, an assessment
that measured prose literacy, document literacy, numeracy and, for
readers in Levels 1 and 2, reading component measures based on the
ISRS model and measures. Pilot assessments were organized in several
countries and languages including Niger, El Salvador, Palestine,
Mongolia and Morocco. Full-scale collection was only undertaken in
Palestine. At this point, the Consultant left the Institute and the new
director abrogated the agreements with Statistics Canada and ETS to
support implementation. Unfortunately, at the time of writing nothing
has been published on the psychometric performance of the measures,
the difficulties encountered during implementation, nor on the substantive
results, i.e. on the tangible results.

[5] The Consultant has used these patterns to define best practice instructional responses for each group and as a basis for an associated series of cost/benefit analyses (CCL, 2007; DataAngel, 2009; DataAngel, 2010).
[6] The Consultant was recruited to assist with the adaptation.
1.1.7. Programme for the International Assessment of Adult Competencies (PIAAC)
In 2008, the OECD began development of instruments for a new round of
international comparative assessment. The ETS won the bid to manage
development and implementation of the first round of the Program for the
International Assessment of Adult Competencies (PIAAC) collection. This
Program borrows heavily from the IALS/ALLS/ISRS design. The PIAAC
retains the ALLS prose literacy and document literacy frameworks but
combines their items into one reading measure, uses the ALLS numeracy
framework and measures, administers a variant of the ISRS reading
component measures to low-level readers and includes a new measure of
problem solving in technology-rich environments. Importantly, PIAAC was
the first international study to use computer delivery of the assessment.
The PIAAC background questionnaires were also re-developed and, apart
from including much of what was collected in the ALLS study, include an
interesting job requirements questionnaire designed to get at skills
demand. Pilot testing in twenty-three (23) countries has established the
psychometric integrity of the measures including the ability to link to the
IALS and ALLS scales for the analysis of trend. Data collection is now
underway in the 23 countries [7] with another nine (9) countries scheduled
to participate in a second round of collection this year. Main data
collection is underway for the first round of countries.

[7] Countries include Australia, Austria, Belgium, Canada, the Czech Republic, Denmark, Estonia, Finland, France, Germany, Hungary, Ireland, Italy, Japan, Korea, the Netherlands, Norway, Poland, the Russian Federation, the Slovak Republic, Spain, Sweden, the United Kingdom and the United States.
1.1.8. Saint Lucia Adult Literacy and Numeracy Assessment (SALNA) - Pilot
In 2008, the Consultant began the development of the Saint Lucia Adult
Literacy and Numeracy Assessment (SALNA). Working with staff from
Statistics Canada and the National Statistics Office in Saint Lucia, the
Consultant adapted the ALLS and ISRS measures and methods for use in
Saint Lucia. The Saint Lucia assessment included an adapted background
questionnaire, measures of IALS/ALLS prose literacy, document literacy
and numeracy, and, for low-level readers, a battery of clinical reading
assessments that had been
carried in the ISRS. A pilot study was carried out and an analysis of
these data was published in March 2009. The data, even though from
only a pilot study, demonstrated that the prose, document and numeracy
measures performed as well in Saint Lucia as they had in other countries
and that the reading component measures worked as they had in
Canada and the US. The recession caused the government to postpone
collection of the main assessment data. The National Statistics Office
(NSO) is currently seeking international funding to support
implementation.
1.1.9. Bow Valley Web-Based Assessment
Following the release of the ISRS data, Human Resources and Skills
Development Canada (HRSDC) funded Bow Valley College in
Calgary, Alberta to develop a web-based assessment based on the theory
and assessment methods deployed in the IALS, ALLS and ISRS. The goal
was to reduce the cost, operational burden and test duration to a level
that would allow for use in a wide variety of settings, including
instructional programs. The tool includes a number of innovative
features, including:
(a) An adaptive algorithm that greatly reduces test duration while reducing standard errors around the proficiency estimates (a generic sketch of adaptive item selection follows this list).
(b) The ability to choose any combination of skills domains and the ability to choose among four precision levels that support different uses, i.e. program triage [8], formative or summative assessment, pre- and post-assessment that supports reliable estimates of score gain, and certification for employment linked to Canada’s system of occupational skills standards.
(c)
A pair of score reports that provide diagnostic information for
the learner, and their instructor, and a third score report
that identifies the benefits that would be expected to accrue
to the learner should the prescribed training be undertaken.
Real time algorithmic scoring improves scoring reliability and
allows score reports to be generated in real time.
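As a generic illustration of the adaptive principle in (a), the sketch below always administers the unused item whose difficulty lies closest to the current proficiency estimate and nudges the estimate after each response. This is not Bow Valley's proprietary algorithm; operational adaptive tests use likelihood-based ability estimation rather than the fixed step rule shown here.

```python
from math import exp

def p_correct(theta, b):
    """Rasch-style probability that a person of proficiency theta
    answers an item of difficulty b correctly (logit scale)."""
    return 1.0 / (1.0 + exp(-(theta - b)))

def adaptive_test(answer, item_bank, n_items=10):
    """Greedy adaptive testing: administer the unused item whose difficulty
    is closest to the current proficiency estimate, then nudge the estimate
    up or down according to the response. 'answer' is a callable that takes
    an item difficulty and returns True for a correct response."""
    theta, step = 0.0, 1.0
    unused = sorted(item_bank)
    for _ in range(min(n_items, len(unused))):
        item = min(unused, key=lambda b: abs(b - theta))
        unused.remove(item)
        theta += step if answer(item) else -step
        step *= 0.8  # shrink adjustments as the estimate stabilises
    return theta

# Simulate a respondent whose true proficiency is 0.8 on the logit scale
estimate = adaptive_test(lambda b: p_correct(0.8, b) > 0.5, [-2, -1, 0, 1, 2])
print(round(estimate, 2))
```

Because every item administered is maximally informative about the current estimate, far fewer items are needed than in a fixed booklet, which is the source of the reduced test durations cited below.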
Bow Valley’s Web-based assessment and instructional suite of web-based
products includes:
Focus – an adaptive assessment of prose literacy, document
literacy and numeracy with four levels of precision
Foundation – an adaptive assessment of prose literacy and the
reading components assessed in the ISRS
Scaffold – an adaptive instructional system that includes the ISRS
reading component measures
Oral fluency – a technology-based assessment of oral fluency in
English, French, Spanish or Arabic
The investment by HRSDC also included the development of a web-based instructional system, the world’s first such system to be based
upon the ALLS and ISRS frameworks. To date CAN$4.8M has been
invested in system development and validation. A number of large-scale
trials are taking place in colleges, workplaces and literacy programs. The
assessment and instructional programs will be made available
commercially at a fraction of the cost of equivalent paper and pencil
assessments. Importantly for current purposes, the tools are also
available in French and Spanish [9].

[8] Program triage involves the process of determining learner objectives and learning needs so that an individual learning plan can be formulated. The process of program triage is central to the implementation of efficient and effective programs.
[9] DataAngel Policy Research distributes the Bow Valley products outside of Canada in Canadian English, American English, Canadian French, Mexican Spanish, Chilean Spanish, France French, Brazilian Portuguese and a number of other languages.
The Bow Valley tools have been administered to 1600 adults in Canadian
English and French in educational contexts and to 300 adults in
employment programming. Thirteen thousand (13,000) additional
administrations are scheduled for this year. Validation trials are
scheduled for the Chilean Spanish version in August 2012. To date, the
tool has not yet been used in a national assessment. The implementation
of the PIAAC demonstrates that computer-based assessment is viable.
These administrations confirm several things, including that:
(a) The test produces reliable, comparable and interpretable proficiency estimates.
(b) As predicted, the adaptive algorithms reduce test durations by roughly 40 percent.
(c) The test taker tutorial eliminates any issues related to unfamiliarity with computer use. The response types are intuitive.
(d) The score reports are useful for both instructors and test takers.
(e) Test takers like the clean, uncluttered look and feel of the test.

1.1.10. Summary of Historical Development of Literacy/Options
By way of summary, all of the options evaluated under this review share
a number of fundamental characteristics. Among other things, these
characteristics include:
(a) The IALS, ALLS, LAMP, Saint Lucia and PIAAC assessments
of literacy are all based on the same framework developed by
Kirsch and Mosenthal and initially applied in YALS and
NALS. This framework defines the variables that underlie the
relative difficulty of reading tasks.
(b) The ALLS, PIAAC, Saint Lucia and LAMP assessments of
numeracy are all based on an extension of the quantitative
literacy framework developed by Kirsch and Mosenthal and
applied in YALS, NALS and IALS. The refined framework,
developed by a team of international experts led by Iddo Gal
and paid for by NCES and Statistics Canada, defines the
variables that underlie the relative difficulty of numeracy
tasks.
(c) YALS, NALS, IALS, ALLS, LAMP, PIAAC and Saint Lucia all
use a common set of methods to summarize proficiency, i.e.
Item Response Theory-based models.
(d) YALS, NALS, IALS, ALLS, LAMP, PIAAC and Saint Lucia all
report results on a common 500-point scale and, notionally,
the same proficiency levels. Each study uses a slightly
different approach to scale linking and relies on linking items
for which Statistics Canada holds the copyright.
(e) The ISRS, LAMP and PIAAC all incorporate a set of reading
components, the mastery of which has been shown to
underlie the emergence of the fluid and automatic reading
characterized by proficiency at Level 3 on the international
scales. The approach was initially developed by John
Strucker at Harvard and applied in the ISRS, and was
subsequently refined by John Sabatini for application in
PIAAC and LAMP. The ISRS was administered in Canada to
a sub-sample of ALLS respondents and in the US to a new
representative sample plus a sample of program participants.
(f) The International Survey of Reading Skills (ISRS) was the
first international comparative study to test the component
reading skills of adults with a battery of clinical reading tests
and an oral fluency test.
(g) PIAAC, LAMP and Saint Lucia all use background
questionnaires that are largely based upon the questionnaire
developed for the ALLS study. These questionnaires serve to
support analysis and to improve the reliability and precision
of the associated proficiency estimates.
(h) The Bow Valley suite of assessment and instructional
products was developed to reduce the barriers to the use of
assessments in research and program settings; LAMP, PIAAC
and the Saint Lucia approach are costly and operationally and
technically burdensome. The Bow Valley assessment suite is
being used to support several national-level programs
sponsored by the Government of Canada.
The general conclusion is that all of the studies rely on the same theory
and methods for summarizing and reporting proficiency. Thus, the key
differences among the studies have more to do with cost, how the
methods are implemented and how much effort is devoted to quality
assurance. A proposal on how the ALLS, LAMP, PIAAC and ISRS
methods might be adapted to the needs and the financial and operational
realities of small island states was prepared (DataAngel, 2008).
1.2. The Assessment Options
The assessment options to be evaluated flow directly from the studies
enumerated above. The options to be evaluated in accordance with the
Terms of Reference (Annex A) of the consultancy and in agreement with
the members of the Advisory Group on Statistics (AGS) at the eighth AGS
meeting are as follows:
(a) International Survey of Reading Skills (ISRS)
(b) The OECD’s PIAAC program
(c) UIS’s LAMP study
(d) Saint Lucia’s Literacy and Numeracy Assessment
(e) Bow Valley Web-Based assessment
As mentioned earlier, it is important to acknowledge that all of these
options are based upon the same science, evidence and experience base.
The distinguishing features of these options are practical considerations
related to cost, technical burden, operational burden and risk.
The options can also be distinguished by their information yield. At the
simplest of levels, the Region’s countries share a pressing need for
objective comparative data on the level and social distribution of
economically and socially important skills for policy purposes. In
statistical terms, the countries need comparative data on:
(a) Average skill levels
(b) The distribution of skills by proficiency level for key sub-populations including youth, employed workers and the workforce as a whole
(c) The relationship of skills to outcomes
(d) The relationship of skills to determinants
These data can be obtained through the following:
1. The conduct of a household survey that includes
demographic characteristics and a skills assessment on a
sample that is large enough to support estimates of
population characteristics directly or
2. The conduct of a household survey that includes
demographic characteristics and a skills assessment on a
sample that is large enough to provide estimates of the
relationship of skills to background characteristics.
In the latter case, estimates of population characteristics, including skills
distributions, are derived by applying the observed relationships to an
existing data source, such as the Census of Population, through a process
of imputation. This approach yields estimates that are “good enough”
for most policy purposes without imposing large operational burdens or
financial investments.
In each case, two distinct implementation options are considered for
each choice of assessment program being evaluated.
The first option would involve full participation in the regular study at
sample sizes to fulfill country specific policy requirements.
The second option would see CARICOM Member States field a common
assessment in which each country is treated as sub-population.
Pursuing this option would greatly reduce the financial and operational
burden associated with implementation without sacrificing much of the
value of the data for policy purposes.
It is important to understand what these two options imply for the utility
of the resultant data.
The rationale that underlies the conduct of a common assessment is to
reduce the operational and financial burden of fielding a comparative
skills assessment without sacrificing too much of the associated
information yield [10].

[10] Further information is detailed in the document, ‘The adaptation of DataAngel’s Literacy and Numeracy Assessment to the needs of Small Island States’ (DataAngel, 2008).

A more detailed summary of this rationale is set out below.
Official statistics of the sort to be collected by the proposed assessment
serve five purposes:
1. Knowledge generation: understanding cause and effect,
multivariate impacts on outcomes, relative risk,
attributable risk and what they imply for policy
2. Policy and program planning: Preparation to act to influence
outcomes or their social distribution
3. Monitoring: Tracking trend in key outcomes to determine if
the world is unfolding as expected and to identify new
emerging trends
4. Evaluation: The formal analysis of whether policies and
programs are meeting their objectives and offering value for
money
5. Administration: The process of making decisions about
specific individuals or institutional units such as programs
or regions
Studies generate two types of information: (i) point estimates of the
numbers of individuals sharing particular combinations of
characteristics and (ii) estimates of the relationships among variables,
including estimates of the strength of the relationship between skills
and background variables.
Point estimates are estimates of the numbers of individuals in the
sampled population who share a common attribute or characteristic.
Producing point estimates is very demanding in terms of sample sizes.
Conventional practice requires 400 cases to be allocated to each cell
where reliable estimates are required by design using random samples.
Estimating relationships among variables is far less demanding with 30
cases required to support the production of reliable estimates using
random sampling. In both cases, these numbers of cases must be
multiplied by the design effect to reflect the degree to which the sample
design departs from a simple random sample. Experience suggests that
well-designed national assessments require roughly 600 cases to support
point estimates and 60 respondents per cell to support multivariate
analyses.
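The arithmetic implied by these figures is simply the base case requirement inflated by the design effect; the snippet below reproduces the text's rule-of-thumb numbers, with the design effect values (1.5 and 2.0) inferred from those figures rather than stated in the source.

```python
def cases_needed(srs_cases, design_effect):
    """Inflate a simple-random-sample case requirement by the design effect
    of the actual (e.g. clustered, stratified) sample design."""
    return srs_cases * design_effect

print(cases_needed(400, 1.5))  # 600.0 cases per cell for point estimates
print(cases_needed(30, 2.0))   # 60.0 cases per cell for multivariate analyses
```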
The fundamental idea underlying a common assessment is that the
estimates of relationships between skills and background characteristics
are far more important than estimates of the numbers of adults in
particular groups. The key innovations in comparative skills assessments
are the skills measures themselves. Much of the sample size fielded in a
skills assessment comes from re-estimating characteristics that are
already available from other sources, including the Census of Population.
These include estimates of the distributions of adults in specific age, sex
and education groups. Work in Canada shows that reliable estimates of
the distribution of skills can be derived by applying the relationships
between skills and background variables observed in the common
assessment to Census records.
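A minimal sketch of this synthetic-estimation step is shown below; the
variable names, coefficients and the cut-point used in the printout are
purely illustrative, and the models used in the Canadian work are
considerably more elaborate.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical survey: dummy-coded background variables (age group,
    # sex, tertiary education) plus a directly measured skill score.
    X_survey = rng.integers(0, 2, size=(1000, 3)).astype(float)
    skill = 150 + X_survey @ np.array([40.0, 5.0, 60.0]) + rng.normal(0, 25, 1000)

    # Estimate the skills/background relationships on the survey file.
    X = np.column_stack([np.ones(len(X_survey)), X_survey])
    beta, *_ = np.linalg.lstsq(X, skill, rcond=None)

    # Apply the fitted relationships to every Census record to impute
    # a synthetic skill score for the full population.
    X_census = rng.integers(0, 2, size=(100_000, 3)).astype(float)
    imputed = np.column_stack([np.ones(len(X_census)), X_census]) @ beta
    print("synthetic share scoring below 276:", (imputed < 276).mean())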
Thus, the common assessment options involve determining the minimum
sample size that can support reliable estimates of the relationships
among variables. As the likely sample design and measures of all of the
available options are essentially the same, the key factor that
distinguishes them is the sample size, as it will determine the number of
point estimates and relationships each supports. The larger sample size
of the full PIAAC options means that it will support the most analysis.
Table 1 provides a sense of the number of national point estimates that
each option will support assuming a design effect of two (2).
Table 1: Notional Information Yield of Assessment Options by Use

Option             | Use             | Knowledge generation | Policy and program planning | Monitoring | Evaluation11 | Administration
PIAAC full         | Point estimates | 6    | 6    | 3    | 1  | -
PIAAC full         | Multivariate    | 167  | 167  | 167  | 1  | -
PIAAC common       | Point estimates | 1.25 | 1.25 | 1.25 | -  | -
PIAAC common       | Multivariate    | 33   | 33   | 33   | 33 | -
LAMP full          | Point estimates | 3.5  | 3.5  | 3.5  | 1  | -
LAMP full          | Multivariate    | 100  | 100  | 100  | 30 | -
Saint Lucia full   | Point estimates | 3.5  | 3.5  | 3.5  | 1  | -
Saint Lucia full   | Multivariate    | 100  | 100  | 100  | 30 | -
Saint Lucia common | Point estimates | 1.25 | 1.25 | 1.25 | -  | -
Saint Lucia common | Multivariate    | 33   | 33   | 33   | 30 | -
Bow Valley full    | Point estimates | 4.5  | 4.5  | 4.5  | 1  | 4
Bow Valley full    | Multivariate    | 140  | 140  | 140  | 30 | 4
Bow Valley common  | Point estimates | 3    | 3    | 3    | 1  | 4
Bow Valley common  | Multivariate    | 120  | 120  | 120  | 30 | 4
The table above reveals that the full PIAAC option has the highest
information yield i.e. it supports the estimation of the largest number of
point estimates and multivariate analyses.
The LAMP and full Saint Lucia options have almost the same information
yield as full PIAAC.
11 Assuming that a sufficiently large sample of literacy program participants was included.
The Bow Valley option yields more analytic power than the Saint Lucia
and the LAMP options, in either the full or common assessment variant,
because of the adaptive nature of the assessment: the tool yields more
reliable proficiency estimates for a given test duration and sample size
than equivalent paper and pencil options.
Only the Bow Valley options provide individual proficiency estimates that
are sufficiently reliable for making administrative decisions with respect
to individual learners.
1.3. Approach to Implementing Household-Based Literacy Skills
Assessments
All household-based skills assessments, whether paper and pencil-based
or technology-based, are implemented in five distinct phases:
1.3.1. An adaptation and preparation phase
1.3.2. A data collection phase
1.3.3. A data processing phase
1.3.4. A data analysis phase
1.3.5. A data dissemination phase
As noted earlier, the technology-based options greatly reduce the
operational and technical burden associated with implementation. Each
phase includes a number of distinct activities as outlined below:
1.3.1. The adaptation and preparation phase
The first step of the adaptation and preparation phase is the
production of a national planning report that:
(a) Specifies the objectives to be met through assessment
(b) Identifies an appropriate sample frame
(c) Proposes a sample design and size that responds to the objectives
(d) Identifies the adaptations that need to be made
(e) Identifies the institutional consortium that will implement the assessment
(f) Identifies the expected products, services and dissemination mechanisms
(g) Identifies where technical assistance will be needed
The national planning reports serve several functions. They ensure that:
(a) Funders know exactly what they are buying
(b) Implementing agencies know exactly what they are expected to produce
(c) International study managers know that the implementing agency is capable of implementing the study to specification
Once the national study has been funded, the national teams:
(a) Select the sample of households and design a Kish grid for the final stage of selection of individual respondents (a sketch of the Kish grid logic follows this list)
(b) Divide the sample into interviewer assignments
(c) Adapt the background questionnaire
(d) Adjust the procedures and manuals
(e) Adapt the training materials
(f) Purchase the equipment needed for the reading components, i.e. tape recorders, timers and batteries
(g) Recruit and train interviewers to administer the test and background questionnaire
(h) Validate the test items psychometrically
(i) Print manuals, questionnaires, test booklets, the sheets used to capture scores and re-scores, and codes for open-ended background questions
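Item (a) refers to a Kish grid. The Python sketch below illustrates the
selection logic only; the grid row shown is a single hypothetical row,
whereas production grids rotate pre-assigned rows across questionnaires
so that every eligible adult has a known selection probability.

    # One hypothetical Kish grid row: household size -> roster line to select.
    KISH_ROW = {1: 1, 2: 2, 3: 2, 4: 3, 5: 4, 6: 5}

    def kish_select(roster):
        """roster: eligible adults listed in a fixed order (e.g. oldest first).
        Returns the member chosen by this questionnaire's grid row."""
        size = min(len(roster), max(KISH_ROW))  # households of 6+ treated as 6
        return roster[KISH_ROW[size] - 1]

    print(kish_select(["Ann", "Ben", "Cy"]))  # selects "Ben" in a 3-adult household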
In a paper and pencil-based implementation there is a need to typeset,
print and bundle test booklets, the background questionnaire and the
associated manuals, training materials and forms.
In a computer-based implementation there is a need to:
(a) Adapt the background questionnaire application and test it
(b) Adapt the item pool and validate the test
(c) Purchase the required hardware
(d) Install the required software, including the Opera browser12, the background questionnaire application and the test application
(e) Set up the required internet access

12 The recommended web browser for the Bow Valley tool is Opera as it provides the most control over what the user can do with the keyboard.
In both cases, the implementing agency must seek formal approval for
its implementation and any proposed adaptations. The process of
formal approval assures that the scientific integrity of the study will
be respected, a necessary condition for assuring that the study will
generate reliable and comparable results.
Experience suggests that it normally takes 6 to 8 months to complete the
preparation phase.
1.3.2. The data collection phase
In a paper and pencil-based implementation interviewers:
(a) Visit selected households
(b) Complete/update the household roster
(c) Select one adult member using a Kish grid (unless a list frame of individuals is being used)
(d) Administer the background questionnaire
(e) Administer and score the locator test
(f) Administer the main booklet or the low-level test booklet and reading components
(g) Edit the completed documents for completeness and correctness
(h) Revisit non-responding households to try to convert them to respondents
(i) Bundle and ship completed forms and recordings for processing
The duration of the data collection phase depends on the size of the
sample, the number of interviewers and the average number of
completions per day. Collection normally takes 4 to 8 weeks to complete.
In a computer-based implementation, interviewers:
(a) Visit selected households
(b) Complete/update the household roster
(c) Allow the system to select one adult member using a Kish grid (unless a list frame of individuals is being used)
(d) Administer the background questionnaire
(e) Administer the locator test
(f) Administer the main booklet or the low-level test booklet and reading components
(g) Revisit non-responding households to solicit cooperation
(h) Bundle and ship completed forms and recordings for processing
1.3.3. The data processing phase
In a paper and pencil-based implementation there is a need to:
(a) Specify, program and test data capture applications for:
    (i) The background questionnaire
    (ii) The score sheets
    (iii) The coding sheets
(b) Specify, program and test an edit program for the background questionnaire
(c) Train scorers
(d) Score test booklets and reading components
(e) Re-score, compute inter-rater reliabilities and re-score items as needed (a sketch of the agreement check follows this list)
(f) Code open-ended fields, e.g. industry, occupation, field of study, other specifies
(g) Capture the background questionnaire
(h) Capture scores
(i) Capture codes
(j) Edit the background questionnaire
(k) Merge the background questionnaire, scores and codes
(l) Scale the assessment results, link them to the international proficiency scales and compute error estimates
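Step (e) turns on an agreement check between scorers and re-scorers.
The sketch below shows the simplest such check, percent agreement over
a block of re-scored items; the 95 percent threshold in the comment is
illustrative, not a standard quoted in this report.

    def percent_agreement(scores_a, scores_b):
        """Share of items on which scorer and re-scorer agree."""
        assert len(scores_a) == len(scores_b)
        return sum(a == b for a, b in zip(scores_a, scores_b)) / len(scores_a)

    original = [1, 0, 1, 1, 0, 1, 1, 1]   # first scoring of one item block
    rescored = [1, 0, 1, 0, 0, 1, 1, 1]   # independent re-scoring
    rate = percent_agreement(original, rescored)
    print(f"agreement = {rate:.1%}")       # e.g. blocks below ~95% get re-scored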
In a computer-based implementation one needs to:
(a) Adjust the background questionnaire application and associated edits
(b) Specify, program and test data capture applications for the coding sheets
In a paper and pencil-based implementation, the data processing phase
can take from 4 to 16 months depending on the availability of staff, the
sample size and available funds.
The next step is to weight the data file to provide a mechanism for
generating unbiased estimates of population characteristics, proficiency
scores and proportions at each proficiency level. This step also includes
the creation of replicate weights that serve as a mechanism for
computing standard errors/error variances that include the additional
error associated with the fact that one has sampled the content domain
as well as the population.
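The replicate-weight mechanism can be illustrated with a
delete-one-group jackknife, sketched below; the actual studies in this
family use their own replication schemes, so this is only a
demonstration of how replicate weights turn into a standard error.

    import numpy as np

    def jackknife_se(values, weights, groups):
        """JK1 standard error of a weighted mean from delete-one-group
        replicate weights (illustrative scheme only)."""
        full = np.average(values, weights=weights)
        ids = np.unique(groups)
        reps = []
        for g in ids:
            w = weights.astype(float)
            w[groups == g] = 0.0                         # drop one group
            w[groups != g] *= len(ids) / (len(ids) - 1)  # reweight the rest
            reps.append(np.average(values, weights=w))
        reps = np.asarray(reps)
        return np.sqrt((len(ids) - 1) / len(ids) * np.sum((reps - full) ** 2))

    rng = np.random.default_rng(1)
    scores = rng.normal(270, 50, 600)       # hypothetical proficiency scores
    weights = np.ones(600)
    groups = np.repeat(np.arange(30), 20)   # 30 replicate groups of 20 cases
    print(jackknife_se(scores, weights, groups))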
In a computer-based implementation, the other data processing steps
listed above are taken care of by the system, i.e. the system captures,
scores and edits the data in real time. Relative to paper and pencil,
computer-assisted collection therefore allows the data processing phase
to be greatly compressed. It is generally possible to have a clean,
weighted data file within weeks of receipt of the last completed case.
1.3.4. The data analysis phase
The data analysis phase is identical for both the paper and pencil and
computer-based options save for timing i.e. the computer-based options
should allow the process to begin 3 to 4 months earlier. The output of
the data processing phase is a weighted, documented data file. The
international study team uses this file to generate an international
report that compares average scores and score distributions for the
total adult population and for key population sub-groups, together with
analysis of the factors that underlie observed differences in skills,
the impact that skills have on individual and macro outcomes, and the
associated implications for policy. The national study team will use the
same data file to produce a national report that draws out implications
for national policy. These reports depend largely on simple descriptive
analysis and a small amount of simple multivariate analysis.
The production of an international report and associated national reports
generally takes 4 to 6 months.
1.3.5. The data dissemination phase
The transformation of the raw data through analysis into information is
a necessary but insufficient condition for realizing a maximum return on
the investment in the assessment. In order to realize the full potential of
the study, countries will have to devise and implement an integrated
dissemination and communication strategy that ensures that key
findings reach key users and that data are available for secondary
analysis.
The only operational differences between the common assessment and
full assessment options are the size of the sample and, by extension, the
number of interviewers and the duration of collection. The 1,000 case
common assessment implies roughly 500 interviewer days of collection,
the 3,500 case full assessment implies about 1,750 interviewer days.
Assuming a two month/40 day collection window, then these options
imply a need for 15 and 45 interviewers respectively.
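The staffing arithmetic above can be reproduced directly. The sketch
below assumes, as the text does, two completed cases per interviewer day
and a 40-day window; the raw minima it prints (13 and 44 interviewers)
sit just below the 15 and 45 quoted above, which evidently include a
small margin.

    import math

    def staffing(cases, completions_per_day=2, window_days=40):
        interviewer_days = cases / completions_per_day
        return interviewer_days, math.ceil(interviewer_days / window_days)

    for cases in (1000, 3500):
        days, minimum = staffing(cases)
        print(f"{cases} cases: {days:.0f} interviewer days, "
              f"minimum {minimum} interviewers in a 40-day window")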
The data dissemination phase continues until data from the next round
of assessment becomes available. Rates of change in skills suggest a
frequency of 5 to 10 years between assessments.
Each of the seven options will be evaluated against an additional
standard set of criteria set out below.
1.4. The Evaluation Criteria and Regional Constraints
Having established the statistical and policy goals of a regional skills
assessment each assessment option is then evaluated against the
following five criteria:
1.4.1. Cost
1.4.2. Operational burden
1.4.3. Technical burden
1.4.4. Risk
1.4.5. Information yield
Each of these criteria is described in more detail below.
1.4.1. Cost
The assessment of adult skills within the context of a household survey
is relatively expensive when judged on a per case basis.
For paper and pencil assessments one must train large numbers of
interviewers, print the background questionnaires and test booklets,
travel to selected households, conduct interviews that average 90
minutes in length, code open-ended responses, score and re-score the
test booklets and the component tests, capture the questionnaires and
scores and edit and weight the survey file. As an example, in the Saint
Lucia assessment the total domestic cost per case was US$124. This cost
includes all variable staff and out-of-pocket costs of fielding the
assessment including:
(a) Preparation and printing of the questionnaire and assessment booklets
(b) Training of the interviewers and supervisors
(c) Selection of the sample
(d) Data collection
(e) Scoring and re-scoring
(f) Data capture
(g) Data editing and coding
Fixed international costs for Saint Lucia were US$145,000. This amount
covers:
(a) Adaptation of survey instruments
(b) Preparation of interviewer and procedures manuals and scoring guides
(c) Interviewer and supervisor training
(d) Data analysis and scaling
While these amounts are manageable for some CARICOM Member
States, they could exceed the financial capacity of many of the other
Member States. As such, the financial capacity of the countries must be
considered in assessing the feasibility of conducting the assessment.
Costs for computer-supported data collection are lower but not uniformly
so. One must acquire the hardware, adapt the items and background
questionnaires and train interviewers in the use of the technology and
pay network usage fees. On the positive side, one saves on printing,
editing, scoring, data capture and scaling as the software performs all
of these steps and delivers results in real time. The cost estimates
presented include the acquisition costs for a sufficient number of
suitable computers. The only cost element that has not been estimated is
the cost of internet use, as this is highly variable from country to
country and with usage.
As noted above, it is difficult at this stage to derive anything more than a
first order approximation of the cost of any of the options under
consideration. The cost of fielding any assessment depends on several
factors including:
(a) The fixed international costs associated with participation. These cover tasks such as training in key activities, general project management, quality assurance and troubleshooting. These costs will vary somewhat with the expertise of the participating countries, how much support is required, how many activities are centralized and, most importantly, how many countries field the assessment simultaneously. The larger the number of countries, the smaller the average overhead cost per country. It is impossible, in the absence of a decision on how many and which countries will field an assessment on a common schedule, to estimate international overheads. The cost estimates presented below assume a flat charge of US$150,000 per participating country.
(b) The fixed national costs associated with having the project team in place and undertaking key tasks. These costs include item and background questionnaire adaptation, preparation of national planning reports, sample design and selection, management of data collection, data processing and analysis. The longer the study takes to complete, the higher the fixed national cost. The cost estimates below assume an 18-month implementation with a two-month data collection window.
(c) The variable costs associated with data collection, which themselves depend on the number of interviewers, the number of people to be assessed and their characteristics, the average number of interviews completed per day and the license fees for using component measures. The cost estimates are based on a two-month collection window for the full assessment options and a six-week collection window for the common assessment options.
(d) For paper and pencil collections, the numbers of manuals, background questionnaires and test booklets to be printed, the number of recorders and timers to be purchased, the number of items to be scored and the number of items to be coded.
(e) For computer-based collections, the numbers of computers and software required, including replacement units, and internet access.
Costs will vary from country to country depending on the sample size, on
differences in the rates for different types of personnel and the degree to
which their work can be covered under existing budgets.
Thus, the cost estimates presented below must be considered to be
indicative of the approximate relative magnitude of cost among the
options reviewed.
The cost estimates used in the Saint Lucia pilot are found in Annex C.
Readers are encouraged to review these carefully to gain an appreciation
of the cost elements and how they interact.
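The three cost layers in (a) through (c) combine into a simple additive
model. The sketch below shows the structure only; the parameter values
are illustrative placeholders (the US$150,000 flat international charge
is the assumption quoted above, while the fixed national figure is
hypothetical).

    def country_cost(cases, variable_cost_per_case,
                     fixed_national, fixed_international=150_000):
        """Total cost for one country: flat international charge plus
        fixed national project costs plus variable collection costs."""
        return fixed_international + fixed_national + cases * variable_cost_per_case

    # Illustrative only: a 1,000-case collection at US$124/case of domestic
    # variable cost and a hypothetical US$50,000 in fixed national costs.
    one_country = country_cost(1000, 124, 50_000)
    print(one_country, 15 * one_country)    # one country; a 15-country region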
1.4.2. Operational burden
Household-based paper and pencil assessments are among the most
complex forms of social science research. Interviews generally include:
(a) The completion of a household roster
(b) The selection of one adult per household using a Kish procedure
(c) The completion of a background questionnaire that averages 30 minutes
(d) The completion and scoring of a filter test
(e) The completion of a main test booklet for skilled respondents, or the completion of a locator booklet and a reading components test for low-skilled adults. These latter testing phases average 60 minutes but there is huge variation around this average depending on the average skills and the characteristics of sampled respondents.
The use of a computer-based assessment can reduce the average
duration of each interview by a significant amount because the
application includes adaptive algorithms that focus the available testing
time around the skills level of the selected respondent. Computer-based
administration eliminates the need for most of the printing – of
procedures manuals, training manuals, questionnaires, test booklets,
scoring sheets and coding sheets. Computer-based testing can also, in
some systems, obviate the need for data capture, editing and scoring – all
operationally demanding and ultimately expensive, error-prone manual
processes. Obviously, computer-based collection systems require the
acquisition of a sufficient number of computers and the required
software, and entail internet usage costs.
On balance, computer-administered assessments save a minimum of 40
percent of the cost, and 60 percent of the operational burden, of
equivalent paper and pencil based assessments.
Given the average length of interview and the anticipated sample sizes
that are needed to get baseline point estimates of the levels and
distributions of skills, household-based skills assessments can impose a
significant operational burden on national collection and processing
capacity. The experience in both Bermuda and Saint Lucia suggests that
this burden can be managed with proper planning but the opportunity
cost may be too high in some smaller Member States.
1.4.3. Technical burden
Many of the tasks associated with implementation of a national
assessment demand the mobilization of scarce technical resources.
While it can be expected that most of the National Statistics Offices
either have, or have access to, all of the requisite skills, experience
in the Region suggests that this would, in many cases, tax the available
technical infrastructure. In particular, the demands of sampling, training
of interviewers, the preparation and printing of booklets, questionnaires
and manuals, data capture, editing, weighting and variance estimation
and data analysis have the potential to overwhelm smaller systems.
1.4.4. Risk
The combination of cost, operational burden and technical burden
associated with fielding a large-scale adult skills assessment implies a
non-trivial risk that things can and will go wrong. Experience with
IALS, ALLS, PIAAC, the Saint Lucia assessment and the ISRS suggests that
these risks can be attenuated if the right measures are taken. Among other
things, experience suggests a need for:
(a) A highly skilled and experienced national project manager in each country
(b) Sufficient budget to do a good job
(c) Sufficient flexibility to hire local technical assistance as needed
(d) Implementation of an extensive quality assurance regime, one that includes a mix of active and passive measures. Passive measures include things such as the availability of detailed and unambiguous specifications and standards, extensive training and a management process that ensures that decisions that arise during the course of implementation are dealt with in a way that does not impair the integrity of the measures.
(e) An international consortium that has extensive experience in the design, implementation and analysis of the assessment data.
These are the minimum necessary and sufficient conditions for success.
A failure to meet any of these conditions implies unacceptably high levels
of risk to the individuals, institutions and governments involved.
CHAPTER 2: A REVIEW OF OPTIONS
This chapter provides details on the review of the options identified in
consultation with the CARICOM Secretariat, the Member States and the
AGS.
The Terms of Reference (TOR) called for a review of the following options:
(a) International Survey of Reading Skills (ISRS)
(b) The United Nations Educational, Scientific and Cultural Organization (UNESCO) Institute for Statistics' (UIS') Literacy Assessment and Monitoring Program (LAMP) paper and pencil assessments of prose literacy, document literacy, numeracy and reading components on a sample of 3,000 adults per country
Following consultations with the CARICOM Secretariat, its Member
States and the AGS members, the list was further modified in terms of
sample size and data collection methods (paper and pencil and
web-based). The following other options were added:
(a) The Organisation for Economic Cooperation and Development's (OECD's) Program for the International Assessment of Adult Competencies (PIAAC) paper and pencil reading, numeracy and reading components assessments on a sample of 3,500 adults per country - Full Assessment
(b) The Organisation for Economic Cooperation and Development's (OECD's) Program for the International Assessment of Adult Competencies (PIAAC) paper and pencil reading, numeracy and reading components assessments on a sample of 1,000 adults per country - Common Assessment
(c) The Saint Lucian paper and pencil prose literacy, document literacy, numeracy and reading components assessments on a sample of 1,000 adults per country - Common Assessment
(d) The Saint Lucian paper and pencil prose literacy, document literacy, numeracy and reading components assessments on a sample of 3,500 adults per country - Full Assessment
(e) The Bow Valley web-based prose literacy, document literacy, numeracy and reading components assessments on a sample of 1,000 adults per country - Common Assessment
(f) The Bow Valley web-based prose literacy, document literacy, numeracy and reading components assessments on a sample of 3,500 adults per country - Full Assessment
2.1. Program for International Assessment of Adult Competencies
(PIAAC) - Common Assessment
The possibility of implementing a common Caribbean assessment using
the PIAAC instruments was evaluated. This would involve each
CARICOM Member State fielding roughly 1,000 cases in an assessment
that would use identical instruments. As described above, this design
would allow for the production of a limited number of national point
estimates, i.e. average score and the distribution of proficiency by
level, and a database to support multivariate analysis of the
relationship between skills and background characteristics. These covariance data can also
be combined with data from other sources, including the Population and
Housing Census, to create synthetic estimates of literacy, an approach
that has proven useful for policymaking when applied in Canada. Conducting a
common assessment also improves the comparative dimensions of the
study as it reduces the impacts of adaptation and implementation errors
on the comparability of the results.
2.1.1. Cost of a PIAAC common assessment
The reduction in sample size by several thousand cases would reduce
interviewer training, printing, data collection and scoring costs by a
significant margin when compared to full PIAAC, LAMP or the Saint
Lucia options. The total cost of a 1,000 case common PIAAC assessment
is difficult to estimate because of ETS's insistence that some key
tasks, including scoring, be centralized, and because it is unclear
whether the OECD would insist on each country paying full international
overheads. It is estimated that a common PIAAC assessment would cost
US$650,000, or roughly US$650
per case. A 15 country common PIAAC assessment would require
US$9,750,000.
2.1.2. Operational burden of a PIAAC common assessment
The reduction in sample size would have a similar positive impact on the
operational burden associated with fielding the study. The collection
window could be reduced from three months to one month and the
interviewer complement reduced from roughly 45 to 24. These reductions
would have a material impact on the national statistics offices and their
ability to take on other work. The literacy assessment would not, under
these assumptions, crowd out other important work.
Conducting a common assessment would also reduce the fixed design
overheads by a significant amount, since only one set of test
instruments, associated training materials and scoring guides would
need to be developed and printed.
It is worth noting that individual countries could choose to increase their
own sample size to match their information needs, level of political
interest, data collection capacity and funding envelope.
2.1.3. Technical burden of a PIAAC common assessment
The technical burden would remain the same but could be concentrated
in a single team drawn from the participating CARICOM Member States.
2.1.4. Other considerations
Participation in a PIAAC common assessment would be similar to full
PIAAC participation. However, the PIAAC consortium has stipulated
some conditions that would need to be met in implementing a PIAAC
common assessment.
First, they would require that one organization (e.g. one of the National
Statistics offices) assume responsibility for managing all survey
operations for all the participating countries. This would serve to reduce
deviations in how the tools are administered but would subordinate
national statistics offices in a way that many governments would not
accept.
Second, they would require that one organization assume responsibility
for managing the scoring operation. This would serve to improve the
inter-rater reliabilities and thereby reduce scoring error. This is probably
manageable but implies the physical shipment of bulky boxes of test
booklets.
2.2. Program for International Assessment of Adult Competencies (PIAAC) - Full Assessment
The evaluation starts with PIAAC as it arguably represents the
international gold standard with respect to the assessment of adult
skills.
The PIAAC design includes a lengthy background questionnaire that
includes a job requirements module, a combined prose literacy/document
literacy reading measure, a numeracy assessment, an assessment of
problem solving in technology-rich environments and, for low-level
readers, an assessment of reading components.
The OECD and the prime contractor for PIAAC, ETS, provided a great
deal of technical information on PIAAC, including copies of presentations
to their Board of Participating Countries (BPC) and Technical Advisory
Group (TAG).
2.2.1. Cost of Full PIAAC participation
This option would see each interested Member State field PIAAC as
regular participants. The costs associated with full PIAAC participation
are high by almost any standard. Many of these costs are directly
attributable to what PIAAC is trying to accomplish. Measuring multiple
domains and the associated covariance matrix with sufficient sample to
support the production of point estimates, such as average score and
percent at each proficiency level, for key population subgroups, makes
for an expensive and demanding design.
In addition, the first round of PIAAC is technology-based i.e. it involves
the collection of the background questionnaires using Computer-Assisted
Personal Interviewing (CAPI) followed by either computer-based or paper
and pencil-based assessment. Scoring and scaling are performed by ETS
post-hoc i.e. after collection has been completed.
The average duration of a PIAAC interview is approximately 100 minutes
but this average obscures huge variation in interview length associated
with differences in individual characteristics and skills levels. Interviewer
productivity for PIAAC in Canada averages 1.5 completed cases per day.
The cost estimates assume 2 completed interviews per day for a case load
of 3,500 cases.
Participating countries need to buy standardized laptops at a cost of
roughly $600/unit and to adapt and load the data collection
applications. Interviewers need to be trained not only in how to
administer the measures but also in using the basic functionality of the
computer software.
PIAAC is also expensive because it imposes a demanding regime of
quality assurance standards. Key among these standards is a
requirement to field a sample that yields roughly 4,500 to 5,000
completed cases. The estimated total cost of the PIAAC full option
assumes 3,500 cases as these former amounts are beyond what most
Member States could afford or manage. The study also imposes
demanding response rate targets that translate into a need for costly
non-response follow-up. The Quality Assurance (QA) regime
also includes a number of active measures that require a significant
amount of analysis work by national teams.
PIAAC participating countries are also required to cover the costs of
attending meetings of the Board of Participating Countries and of
National Project teams. These costs add roughly US$30,000 to the
annual cost of participation.
PIAAC is also expensive because participating countries are required to
contribute 84,000 Euros towards the international costs of the program.
These contributions cover international design costs, project
management costs, implementation of the quality assurance regime and
an ambitious program of analysis and dissemination.
The total cost of fielding a full PIAAC assessment is roughly
US$1,034,000 per country, or roughly US$295 per case. Thus a budget of
roughly US$15,510,000 would be required for a 15 country assessment.
2.2.2. Operational Burden of full PIAAC participation
Full PIAAC participation is operationally demanding. Assuming a three-month collection window and completion of two interviews per day per
interviewer, statistical offices would have to recruit and train a minimum
of roughly 45 interviewers and an additional 5 senior interviewers to
provide support and to do non-response follow-up. Considering the
experience in Saint Lucia, finding a venue to train this number of
interviewers may be problematic in some countries.
In several National Statistics Offices (NSOs), PIAAC participation would
involve their first experience with computer-supported distributed data
collection. These NSOs would likely have to invest in the development of
the technical infrastructure and personnel needed to install software and
to maintain and repair the hardware.
The use of CAPI obviates the need for capturing the background
questionnaires data but test item scores need to be captured. With
sufficient advance planning, the associated volumes would be
manageable in most offices.
2.2.3. Technical burden of full PIAAC participation
Full PIAAC participation is also likely to impose a significant technical
burden on participating countries. The PIAAC sample design
requirements, and the associated weighting and variance estimation, are
very demanding. While most of the NSOs have access to sampling
expertise it might prove less costly to hire external consultants to
undertake these tasks. The cost of doing so would have to be borne by
the participating country.
As noted above, NSOs would likely have to invest in the development of
the technical infrastructure and personnel needed to install software and
to maintain and repair the hardware.
2.2.4. Risks associated with full PIAAC participation
The risk of things going wrong in PIAAC is minimal because the design
is based upon the lessons learnt through the conduct of IALS, ALLS and
the ISRS. Similarly, the quality assurance regime is based on containing
the key sources of error and bias revealed during the conduct of IALS,
ALLS and ISRS. Finally, the consortium that is implementing PIAAC
includes ETS and the OECD, two of the three partner institutions
responsible for the design and implementation of IALS and ALLS.
Statistics Canada is no longer playing a role in sampling or project
management, having been replaced in these roles by Westat13. Westat has
considerable skills and experience in managing sampling in international
comparative studies, having managed this function for the Program for
International Student Assessment (PISA). They were also responsible for
managing the US data collection for IALS and ALLS so they are familiar
with the methods and measures. The international consortium also
includes several key individuals from the IALS and ALLS teams.

13 Westat is a for-profit Washington-based statistics agency with a reputation for quality design and implementation.
Thus, it is expected that the key risk associated with PIAAC participation
is its complexity. Given the high cost of full participation, expectations
will be high. Even with the full Quality Assurance regime in place, there
is a non-negligible risk that the operational and technical demands will
overwhelm the team in a subset of the countries.
2.2.5. Other considerations
A large number of countries will participate in the first round of PIAAC
data collection; these include Australia, Austria, Belgium, Canada, Czech
Republic, Denmark, Estonia, Finland, France, Germany, Hungary,
Ireland, Italy, Japan, Korea, Netherlands, Norway, Poland, Russian
Federation (Non-member economy), Slovak Republic, Spain, Sweden,
United Kingdom and the United States of America.
A second group of nine countries will field a variant of PIAAC in 2013.
This alone constitutes a strong endorsement of the PIAAC design,
expected information yield and organization.
A number of other considerations weigh on the evaluation; one has to do
with the PIAAC experience14 as reported by participating national study
managers and their teams.

14 The Consultant was a member of the PIAAC design team.

The overwhelming majority of PIAAC national study managers were IALS
and/or ALLS national study managers. Conversations with multiple
national project managers confirm several things including:
(a) The PIAAC project is well organized. Meetings, which are held frequently, are well structured with clear objectives, agendas and decision processes.
(b) All the important PIAAC documentation is available to national study managers on a password-protected SharePoint site.
(c) The governance structure for PIAAC is clearly set out. A Board of Participating Countries (BPC) governs implementation. The international consortium is responsible for holding regular meetings. The BPC reports to a joint committee of the OECD's Directorate for Education, Employment, Labour and Social Affairs (DEELSA) and the Education Committee.
This being said, the PIAAC process involves so many countries and is so
political that many countries have some misgivings about the level of
decision-making authority that is vested in the OECD and/or the
implementing consortium. An example of this latter concern is the debate
about the response probability value that is to be applied to create
proficiency levels. The feeling has been that decisions in this regard
are being overly driven by vested institutional interests rather than
scientific considerations. Put differently, it is difficult, within the
context of such a complex undertaking, to give individual countries, or
groups of countries, much power.
PIAAC has also proven to be extraordinarily demanding in terms of time,
resources and effort. Several countries, including Portugal and
Slovenia, have recently withdrawn from the study because of the
financial, operational and technical demands of participation. In small
systems, the demands on the time of what is invariably scarce technical
staff are particularly acute.
It is reasonable to assume, as has been the case with PISA, that the
PIAAC skills measures will become the de-facto international standard.
There is value, therefore, in using PIAAC to benchmark Caribbean skills
profiles against the best economies in the world, including Canada, the
US and the UK.
It is also clear that CARICOM Member States would benefit greatly from
the enormous investments made in IALS, ALLS and PIAAC development.
They would be, in economic terms, free riders in this sense. The
CARICOM Member States would also get significant value out of the PIAAC
analysis program.
Fortunately, it is believed that other options afford many of these benefits
at much lower cost and complexity.
On balance, therefore, full PIAAC participation is not recommended.
2.3. Literacy Assessment and Monitoring Program (LAMP)
Difficulty was experienced in evaluating participation in the LAMP
program due to the unavailability of data on the technical performance
of the test items as well as on any other aspect of the LAMP pilot
studies. To
date, no information has been made publicly available on the LAMP
measures or results.
The review that follows is therefore based upon the experts’ knowledge of
the instruments and methods, knowledge gained from having developed
them. The review also benefits from consultation with several members of
the LAMP technical advisory committee.
The design of LAMP is almost identical to the ISRS. The study includes a
background questionnaire, an assessment of prose literacy, document
literacy, numeracy and, for low-skilled readers, a battery of clinical
reading tests. Thus, LAMP is designed to provide point estimates of skills
for key sub-populations on the IALS/ALLS scales and to place low-level
learners in groups sharing common patterns of strength and weakness
on the reading components.
The Consultant was engaged in an activity with UNESCO Kabul to assist
the Government of Afghanistan in the design of a common assessment that
would yield reliable estimates of the distribution of literacy skills in
a situation where security concerns were paramount. However, the
activity was cancelled by UIS.
2.3.1. Cost of implementing LAMP
Participation in LAMP would be less costly than participation in the full
PIAAC in large measure because the minimum sample size required by
UIS is in the 3,000 case range. The costs are also lower because UIS has
indicated that participants are not required to contribute towards the
international overheads associated with implementation.
Implementing LAMP would also reduce the travel costs associated with
PIAAC participation. Participating countries are required to attend
approximately 12 meetings over the course of a cycle and these meetings
are held in PIAAC participating countries, mostly in Europe. The
associated costs of travel are high. All LAMP meetings could be held in
the Caribbean.
Assuming the Saint Lucian costs with sample sizes of 3,000 cases per
participating country, LAMP could be implemented for roughly US$133 per
case, or about US$400,000 per country, an amount that implies that a 15
country Caribbean assessment would cost roughly US$6,000,000.
2.3.2. Operational Burden of implementing LAMP
The operational burden associated with implementing LAMP is slightly
lower than that of PIAAC because the minimum permissible sample sizes
are lower than for PIAAC. The two studies are roughly equivalent in most
other operational respects.
2.3.3. Technical burden of implementing LAMP
The technical burden associated with LAMP is lower than PIAAC because
LAMP uses paper and pencil methods rather than computer-assisted
interviewing. LAMP also imposes a less stringent quality assurance
regime and reporting requirements.
2.3.4. Risks associated with implementing LAMP
As noted above, the LAMP instruments were developed in 2006 and are
based directly upon the measures that were fielded in the ISRS study.
Piloting of the LAMP instruments began in 2007 in Palestine but the UIS
has yet to publish any evidence on the psychometric performance of the
measures or on what the measures reveal for policy.
Also as noted above, the comparative assessment of skills represents
the most complex type of survey-based research. The only other studies
that approach skills assessments in complexity are the US health
interview surveys. It was not possible to obtain technical information
on the performance of the LAMP instruments.
Based on these considerations, the LAMP program of work is not
recommended.
2.3.5. Other considerations
LAMP pilots were conducted in El Salvador, Mongolia, Morocco, Niger
and Palestine. The main survey was implemented in Palestine15. The
implementation followed the proven open coordination approaches
developed in IALS and ALLS and national study managers all expressed
appreciation for the open, professional and flexible manner in which the
project was run. Since the Consultant’s departure from UIS, the project
has reverted to an approach wherein countries are fielded on a bilateral
basis. Contact with the Palestinian national project manager suggests
that they were happy with the service that they got from UNESCO but, in
the absence of any data, it was not possible to assess the quality of the
output.
In terms of governance, the UIS has maintained a Technical Advisory
Group (TAG) to provide advice and guidance on technical matters.
Consultations with two of the TAG members suggest that the TAG is not
fulfilling the same function as it did in the ALLS, PIAAC and PISA
programs. More specifically, the TAG has neither been provided with
access to data from the LAMP pilots nor from the main implementation.
To date, UIS has not published any information on the technical
performance of the instruments and implementation, or any results.
To date very few additional countries have taken the decision to field the
LAMP assessment. The LAMP website suggests that Palestine, Mongolia,
Jordan, and Paraguay have completed assessments.
The Ministry of Education in Jamaica was approached by the UIS
regarding the conduct of LAMP, but a discussion with the Chief
Statistician of the Statistical Institute of Jamaica confirmed
reservations about UIS's technical and operational capacity.
15 The Consultant was responsible for the initiation of this work.
2.4. Saint Lucian Instruments- Common Assessment
The Government of Saint Lucia, with technical support from Statistics
Canada16, adapted the ISRS instruments and methods for use in Saint
Lucia. The design includes a background questionnaire, a locator test,
an assessment of prose literacy, document literacy and numeracy for
high-skilled adults and a test of prose literacy, document literacy,
numeracy and reading components for low-skilled adults. The instruments
were successfully piloted in 2009.

16 The Consultant, as a member of Statistics Canada's team, played an active role in this exercise.
The proposal would be to apply a variant of the Saint Lucian instruments
in a common assessment in CARICOM Member States.
2.4.1. Cost of applying the Saint Lucian instruments in a common
assessment
Because both LAMP and the Saint Lucia study are based on the
instruments and methods developed by Statistics Canada and ETS for the
ISRS study, the domestic cost of implementing the LAMP and the Saint
Lucia study would be roughly the same.
The international overheads for the Saint Lucia pilot study amounted to
US$145,000.
This is a reasonable estimate of what it would cost to implement a
common assessment using the Saint Lucian instruments.
The costs associated with adapting the Saint Lucian instruments for use
in other Caribbean countries would be lower than those of adapting
LAMP because they have already been shown to work well in Saint Lucia.
Using the Saint Lucian unit costs as a guide, a 15 country common
assessment based on the Saint Lucian instruments and 1,000 cases per
country would cost roughly US$133/case or US$133,000 per country for
a total cost of US$2,000,000 for 15 countries.
2.4.2. Operational burden of a common assessment using the Saint Lucian
instruments
The operational burden of conducting a common assessment based upon
the Saint Lucian approach would be similar to fielding LAMP or PIAAC.
Interview durations could be expected to run roughly 100 minutes per
case, 1700 interview hours per country or approximately 27,000
interviewer hours in total.
2.4.3. Technical burden of a common assessment using the Saint Lucian
instruments
The technical burden of conducting a common assessment using the
Saint Lucia assessment is slightly lower than for LAMP or the full PIAAC
option. This is because most of the technical tasks would be undertaken
by one team of international experts supported by staff from Bermuda
and Saint Lucia. ETS is concerned that having multiple teams
independently undertake operational tasks, such as item adaptation,
scoring, editing, weighting and variance estimation, would increase the
level of non-sampling error beyond acceptable limits. Experience with
small statistics offices is that they are far more likely to adhere to
the operational guidelines and associated quality assurance procedures
than large statistical offices, where the guidelines may conflict with
their standard ways of doing business.
2.4.4. Risks associated with a common assessment using the Saint Lucian
instruments
The risk associated with the conduct of a common assessment using the
Saint Lucian instruments is low. The measures have been shown to
provide useful results for policy in Canada, the US, the Mexican State of
Nuevo Leon and Bermuda. The approach provides most of the same
measures to be carried in PIAAC, and these have been shown to function
psychometrically in the Caribbean context. All of the associated
training material has been developed, and the team that undertook the
Saint Lucian pilot has a wealth of experience in the design, adaptation,
implementation and analysis of data from skills assessments, so the risk
of errors being introduced inadvertently is low.
The team includes a project manager, a sampling expert, a
psychometrician and a data collection expert all with IALS, ALLS, ISRS
and PIAAC experience.
2.4.5. Other considerations
The design of Saint Lucia’s literacy and numeracy assessment are based
upon the instruments developed for the International Adult Literacy
Survey (IALS) and the Adult Literacy and Life Skills Survey (ALLS).
IALS/ALLS has been fielded by a large number of countries as listed
below:
IALS 1994
Canada (with two separate studies, the Ontario Immigrant Literacy Study
and the Ontario Survey of the Deaf and Hard of Hearing), Germany,
Netherlands, Poland, Sweden, Switzerland, United States of America and
Vanuatu
IALS 1996
Australia, New Zealand and the United Kingdom
IALS 1998
Belgium, Czech Republic, Denmark, Finland, Hungary, Ireland, Norway
and Portugal
ALLS 2003
Bermuda, Canada, Italy, Norway, Nuevo Leon (Mexico), Switzerland and
the United States of America
ALLS 2005
Australia, Hungary, Netherlands and New Zealand
The IALS and ALLS studies have spawned five international comparative
reports:
a. Literacy, Economy and Society: Results of the first
International Adult Literacy Survey, Statistics Canada and
OECD, 1995
b. Literacy skills for the Knowledge Society: Further Results of
the International Adult Literacy Survey, OECD and HRSDC,
1997
c. Literacy in the Information Age: Final report of the
International Adult Literacy Survey, Statistics Canada and
OECD, 2000
d. Learning a Living: First results of the Adult Literacy and Life
Skills Survey, OECD and Statistics Canada, 2005
e. Literacy for Life: Further results of the Adult Literacy and
Life Skills Survey, OECD and Statistics Canada, 2011
Technical reports have been produced for the IALS and ALLS studies and
the datasets have been used to generate an impressive list of research
monographs.
The reading components measures were based on the International
Survey of Reading Skills (ISRS). The ISRS had only been fielded in two
countries prior to being fielded in Saint Lucia. The measures have since
been incorporated into both LAMP and PIAAC. Analysis of ISRS data for
Canada and the US has revealed interesting findings. See for example,
Learning Literacy in Canada: Evidence from the International Survey of
Reading Skills, Statistics Canada, 2008; Reading the Future: Planning
for Canada’s Future Literacy Needs, Canadian Council on Learning,
2008.
ISRS 2005
Canada, the United States of America and PIAAC countries
Clearly the IALS and ALLS studies were of high quality and rapidly
produced technically defensible, policy-relevant results.
Implementation of the Saint Lucian pilot has equipped the National
Statistics Office in Saint Lucia with first-hand experience in applying
the methods within a tight time frame. These individuals are available to
assist with implementation of the approach in other countries. Piloting of
the PIAAC instruments shows that the psychometric performance of the
IALS, ALLS and ISRS assessment items is unchanged by the move to a
computer-based administration.
2.5. Saint Lucian Instruments- Full Assessment
As noted above, the Saint Lucian design, background questionnaires and
assessment instruments were based on the ALLS and ISRS studies. The
methods and measures were piloted in Saint Lucia and shown to be both
valid and reliable.
The Saint Lucian approach could easily be adapted for use in other
Caribbean countries at a cost of roughly US$124 per case, US$434,000
for 3,500 cases per country or US$6,510,000 for the 15 countries.
Virtually all of the development costs have been absorbed by Saint
Lucia, so only minor modifications need be made to the background
questionnaires and training materials.
Actual costs would vary depending on the sample size and the number of
interviewers trained and equipped.
2.6. Bow Valley Web-based Assessment- Common Assessment
As described briefly above, the Government of Canada has recently
funded the development and validation of a web-based assessment and
instructional system that embodies the science that enabled IALS, ALLS,
ISRS, LAMP and PIAAC. The goal of this investment was to
simultaneously improve the reliability of skills estimates while reducing
the cost, technical burden and operational burden of the assessment
process. The development also sought to create a suite of assessment
tools that support the full range of needs from program triage to
certification in real time. The current tool kit includes assessments of
prose literacy, document literacy, numeracy, oral fluency and, for low
level readers, a computer-based variant of the reading components
carried in the ISRS, LAMP and PIAAC assessments. The tools deliver
reliable estimates of proficiency in real time.
It should be noted that nothing would prevent specific countries from
expanding their sample to support the direct estimation of more point
estimates at the national level and more reliable relationships.
2.6.1. Cost of a common assessment using the Bow Valley tools
The adaptive algorithms in the Bow Valley assessment allow the average
interview duration to be reduced significantly (the average length of an
interview is 70 minutes) so unit collection costs would drop from US$124
to US$83. To this amount, one must add the license fees of $20 per case
for using the Bow Valley tests, yielding a cost per case of US$103.
Adding US$10 per case for the cost of the hardware and US$10 per case in
overheads yields a total cost per case of roughly US$123, resulting in a
cost per country of US$123,000 with a sample size of 1,000 cases. As noted
previously, this cost estimate includes allowances for all staff and out of
pocket costs including the acquisition of the required hardware and
software. The only cost that is excluded is the cost of internet access.
This cost could be reduced significantly if the means can be found to
have individuals tested in groups.
This estimate includes an allowance for the purchase of an average of
24 tablet computers per country; this hardware would require an
additional US$192,000, raising total costs to roughly US$1,845,000.
Tablet prices are falling rapidly, so this is a maximum amount. As
suggested by the AGS, hardware bought for one country could be used in
other countries provided the countries do not all execute the survey at
the same time.
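The per-case arithmetic in the preceding paragraphs can be checked
directly; all figures below are as quoted above, and internet access
costs remain excluded.

    collection = 83     # US$ per case after adaptive shortening of the interview
    license_fee = 20    # Bow Valley test license per case
    hardware = 10       # per-case hardware allowance
    overhead = 10       # per-case overheads
    per_case = collection + license_fee + hardware + overhead
    per_country = per_case * 1000           # 1,000-case common assessment
    print(per_case, per_country, 15 * per_country)
    # -> 123 123000 1845000, matching the US$123, US$123,000 and
    #    US$1,845,000 quoted above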
2.6.2. Operational burden of a common assessment using the Bow Valley
tools
The operational burden associated with applying the Bow Valley tools is
significantly lower than that imposed by any of the other options. The
technology manages all of the burden of collecting the background
information, administering the tests, scoring and scaling the data. This
reduces the amount of training needed by interviewers from 3 days to a
day. The adaptive algorithms that have been built into the application
significantly reduce the number of items that are needed to reach the
desired precision levels. This reduction translates into shorter interview
durations.
2.6.3. Technical burden of a common assessment using the Bow Valley
tools
The technical burden associated with fielding the Bow Valley tools is
much lower than that associated with any of the other options.
Participating countries would still need to select a probability sample of
the adult population but the assessment system handles all of the
associated technical burden. The methods associated with analyzing the
relationship between skills and background characteristics and imputing
these relationships onto Census records are well established and
straightforward for those with experience in such work.
2.6.4. Risks of applying the Bow Valley tools in a common assessment
The conduct of the ALLS study in Bermuda and an ISRS-based
assessment in Saint Lucia demonstrates that the approach to measurement
works and yields results that are interesting for policy. Thus, there is
very little risk of the approach not yielding results of acceptable
quality.
The key risk is whether the available communications infrastructure has
the bandwidth and stability to support the delivery of the assessment.
The Bow Valley tools are designed to support delivery on tablet
computers, on laptops or on standalone computers with internet access.
A survey of Member States indicates that internet access is highly
variable with large proportions of the population remaining without
coverage. What remains unknown is the proportion of the population
without 3G mobile access. In cases where internet access is problematic
the Bow Valley assessment can use the 3G wireless network or can be
completed and uploaded at a location where internet access is available.
A second potential risk is the unfamiliarity of the adult population with
computer technology. The Bow Valley tools have been designed to place
minimal demands on the test taker. The administration protocol also
provides for the test administrator to take over the actual system input
in cases where the test taker lacks the technical skills to use the
technology themselves.
2.6.5. Other considerations
The main other consideration related to this option is that the Bow
Valley tool that would be used to assess skills at the population level
is part of a suite of web-based assessment and instructional products
and services that could be used for other purposes. For example, the
suite of tools can be used:
a. For triage at literacy program intake to identify if individuals
are in need of literacy or numeracy training
b. At literacy program intake for formative assessment and
diagnosis of individual learning needs
c. At literacy program exit for summative assessment of
learning at program exit
d. At educational program exit to certify skills level for
employment
e. At literacy program intake and exit to estimate learning gain
f. At the point of hiring for selection
g. In research to provide a skills measure
The Bow Valley tools have yet to be used in the context of a national
assessment but have been used in a number of large national research
studies in Canada involving workers, college students and literacy
program participants. Users report that the tools are easy to use and
that they provide useful results for a variety of purposes. No significant
challenges have been encountered in using the tools in Canada. Recent
experience in using the tools in China suggests that government firewalls
prevent access to the software. This kind of problem is not expected in
the Caribbean.
2.7. Bow Valley Web-based Assessment- Full Assessment
The conduct of a full assessment using the Bow Valley tools would have
all the same attributes as the common assessment save the cost and the
amount of time that would be required to complete the data collection.
Total cost of this option would depend on the sample size fielded at a rate
of approximately US$83 per case. Actual costs will vary depending on the
number of interviewers trained and equipped. Again this cost estimate
includes all cost elements save for the cost of internet access.
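As a rough illustration only: taking the US$83 per-case rate from the text above and a purely hypothetical per-interviewer cost for training and equipment, the total cost scales as follows.

```python
# Back-of-the-envelope cost sketch. The US$83 per-case rate comes from the
# text above; the per-interviewer figure is a purely illustrative assumption
# used to show why totals vary with the number of interviewers fielded.
PER_CASE_USD = 83
ASSUMED_PER_INTERVIEWER_USD = 1200   # hypothetical training + equipment

def total_cost(sample_size, interviewers):
    return (sample_size * PER_CASE_USD
            + interviewers * ASSUMED_PER_INTERVIEWER_USD)

for n, staff in [(1000, 10), (3500, 10), (3500, 25)]:
    print(f"n={n}, interviewers={staff}: US${total_cost(n, staff):,}")
```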
2.7.1. Cost of a full assessment using the Bow Valley tools
This option is the same as the common Bow Valley assessment except that
the total cost of the study would be less than three and a half times that
of the common option, even though the sample is 3.5 times larger,
because the fixed costs of design, implementation, processing and
analysis would be amortized over a larger sample.
2.7.2. Operational burden of a full assessment using the Bow Valley tools
The operational burden associated with applying the Bow Valley tools in
a Full assessment is the same as with the Common Bow Valley
assessment.
2.7.3. Technical burden of a full assessment using the Bow Valley tools
The Technical burden associated with applying the Bow Valley tools in a
Full assessment is the same as with the Common Bow Valley
assessment.
2.7.4. Risks of applying the Bow Valley tools in a full assessment
The key risks of conducting a full assessment using the Bow Valley tools
are the same as for the common assessment.
2.7.5. Other considerations
The main additional consideration for this option is the same as for the
common assessment.
2.8. Summary
In terms of information yield, each of the seven options reviewed would
yield information in support of knowledge generation, policy and
planning, and monitoring.
Options with larger sample sizes would support an average of five reliable
point estimates, although this number would vary depending on the
actual distribution of the sample and of skills in the population.
All of the options evaluated would support the generation of synthetic
estimates. Options with larger sample sizes would generate more reliable
synthetic estimates.
None of the options would directly support program evaluation. However,
the Bow Valley tool has a variant that provides scores reliable enough to
generate estimates of score gain, which makes it well suited to program
evaluation purposes.
The following table summarizes the evaluation of options on the other
dimensions:
Option            Cost per    Cost per     Cost for 15   Operational   Technical   Risk
                  case US$    country      countries     burden        burden
                              US$000       US$000
PIAAC 3500        $295        $1,034       $15,510       Very high     Very high   Very high
PIAAC 1000        $650        $650         $9,750        Moderate      Very high   High
LAMP 3000         $133        $400         $6,000        High          High        High
Saint Lucia 1000  $133        $133         $2,000        Moderate      High        Moderate
Saint Lucia 3500  $124        $434        $6,500        High          High        Moderate
Bow Valley 1000   $123        $123         $1,845        Low           Low         Low
Bow Valley 3500   $115        $402.5       $6,037        Low           Low         Low
The table reveals significant variation in cost. Per case costs range from a
low of $115 for the full Bow Valley to a high of $650 for the 1,000 case
PIAAC. Per country costs range from a low of $123,000 for the Bow
Valley 1,000 case option to a high of $1,034,000 for the 3,500 PIAAC
option. The estimated costs of a 15-country study range from a low of
$1,845,000 for the 1,000 case Bow Valley option to a high of
$15,510,000 for the 3,500 case PIAAC option.
The table also documents significant variation in the operational burden
associated with each option. PIAAC imposes the highest operational
burden in large measure because of the demanding quality standards it
imposes. The Bow Valley options impose the least operational burden
because the design of the tool reduces average test durations and
eliminates the need for most data capture and data cleaning and
eliminates manual scoring of the items.
The table also documents significant differences in the technical burden
imposed by the various options. The most demanding options are the
PIAAC options (Full and Common) because of the mix of technology used
and the quality assurance procedures that are imposed. The least
technically demanding options are the Bow Valley options (Full and
Common), a result that can be traced to the fact that the international
team does most of the technical activities.
Finally, the table suggests that there are significant differences in the
amount of risk associated with the various options. The evaluation found
that the LAMP and PIAAC options carry the highest risks. The LAMP is
risky because of the unavailability of published information on the
implementations that began in 2007. The PIAAC options (Full and
Common) are risky because of the exacting quality standards imposed on
participating countries. Countries that fail to meet these standards have
to spend additional money to meet the standards or are excluded from
the international comparisons.
The Consultant recommends the 1,000-case Bow Valley option, and the
AGS recommends the 3,500-case Bow Valley option. Both options are
viable. The difference in options can be traced back to differences in what
the two groups judge to be prudent and good enough to improve policy
making in the Region. Please see Chapter 8 Section 6 for a
qualification on the recommendation of the Bow Valley option.
CHAPTER 3: ANALYSIS OF THE LITERACY SURVEY
EXPERIENCE IN THE REGION
This chapter reviews Bermuda’s experience in fielding the ALLS study,
Saint Lucia’s experience in fielding a variant of the ISRS survey, and
Dominica’s proposed implementation.
3.1. Bermuda’s Experience
The following overview of Bermuda’s experience in the 2003 ALLS survey
was taken from their administrative report and from the Consultant’s
involvement in the international study team that provided consultancy
services throughout the life of the ALLS survey project in Bermuda.
This assessment of Bermuda’s experience will be evaluated against
the following criteria: cost, operational burden, technical burden and
risk. In addition, the lessons learnt and the challenges will be
highlighted.
3.1.1. Cost
The total cost for the survey amounted to approximately BDA$830,500
(BDA$1=US$1). This included the development cost (training material,
cost for international trainers, printing cost, etc.), data collection cost,
data processing cost and data dissemination cost.
Interviewers were paid a minimum fee of fifteen hundred dollars ($1,500),
less tax. This fee was dependent on their completing 30 surveys.
Additional work was made available, after 30 completions, for those who
were interested in earning extra monies.
Incentive pay was paid during the month of May 2003. Interviewers
were paid $60 for completing a survey (screener, BQ and Main task
booklet) plus $100 for travel. An additional bonus of $300 or $500 was
paid to those who completed 30 households or 40 households,
respectively, by May 30.
Supervisors were paid a fee of five thousand dollars ($5,000), less tax.
This fee was dependent on the successful completion of the
assignment.
3.1.2. Operational burden
Data collection
One hundred Interviewers and twenty-seven supervisors were dispatched
into the field to complete 4,000 surveys. Each interviewer was given an
Assignment Control List with 40 household addresses, with the
expectation that they would complete 30 over a ten-week period. The
average interview lasted for approximately two hours, with each selected
respondent answering a background questionnaire and a psychometric
assessment booklet.
Although the data collection phase officially closed on August 31 2003,
in-office interviews continued until the end of September in an effort to
meet the required target. These in-office interviews resulted from two
publicity initiatives informing the public that the survey was still in
progress. In addition, the list of addresses of households that were yet to
be visited was placed in the daily newspapers. The listed householders
were asked to call the Survey Department to arrange suitable times for
an interview. Additionally, letters were sent to persons who had totally
refused to participate in the survey reminding them that the Survey was
mandatory by law.
Response Rate
The Bermuda ALLS study was scheduled to run for 3 months. However,
fieldwork was extended for an additional 3 months in order to achieve
the desired response targets.
Initially Bermuda was expected to complete 4,000 cases for international
comparison since the survey was an international initiative and there
was no real experience conducting a survey of this type with small
population countries. This sample size was determined to ensure validity
of data captured. However, due to many factors that inhibited the survey
progress, Bermuda, in consultation with Statistics Canada, decided to
reduce the number of completed surveys to 3,000 cases. It was reasoned
that 3,000 completed cases would still provide the accuracy of data
needed for international comparative purposes.
At the end of the interviewing period (i.e. October 31, 2003), a total
of 4,049 households had actually been visited, with 2,696 completed
cases obtained. Based on the number of homes contacted (3,025) and
those visited but for which no contact was made (304), the response rate
from the study was 82 percent, the highest of all the participating
countries.
Non-Response
The ALLS took several precautions against non-response bias, as
specified in the ALLS Administration Guidelines. Interviewers were
specifically instructed to return several times to non-response
households in order to obtain as many responses as possible. In
addition, questions were preaddressed and interviewers were given
proper maps to assist in identifying households.
Other initiatives were introduced by the Department of Statistics to
encourage participation in the survey. Civil servants who were selected
were given time off to complete the survey.
Bermuda was tasked with completing a debriefing questionnaire after the
Main study in order to demonstrate that the guidelines had been
followed, as well as to identify any collection problems they had
encountered.
3.1.3. Technical burden
The complexity of the collection procedures presented somewhat of a
challenge to most experienced interviewers, who were used to the
traditional way of completing interviews.
Bermuda reports that the ALLS study is scientifically rigorous and
implementation included a significant amount of training in various
aspects of the study design, the related quality assurance procedures
and how the data could be used in policy analysis. However, the study
was professionally managed, the approach to governance was open, and
issues were addressed in a measured and thoughtful way.
Nevertheless, Bermuda required significant levels of technical and
operational assistance during the implementation of their study. The
National Statistics Office required assistance in selecting the sample,
training interviewers, supervising data collection, and in weighting
and variance calculation. Staff were detached from Statistics Canada to
perform these tasks.
3.1.4. Risk
During the first few weeks of the data collection phase, interviewers faced
very poor weather conditions, which slowed the progress of the fieldwork.
By the end of March 2003, many interviewers had still not commenced
field activities. Thus, the expectation of four (4) completed surveys per
interviewer per week had not been fulfilled.
At the end of June 2003, a total of 2,045 households were visited and
there was the need to extend the Survey period until the end of August.
At that time, many interviewers opted to end their employment with the
Department. Out of a total of 75 interviewers, 45 willingly remained in
the field to assist the Department in reaching the target of 4,000 surveys.
Unfortunately, for the remainder of the data collection phase, the Survey
experienced two major external shocks, which set back the volume of
cases that could be completed even within the extended time frame. In
July, a general political election was suddenly announced.
All focus was steered towards the up-coming general election and
political candidates. Many households dismissed the visits of the survey
interviewers, which paralleled the canvassing of the political hopefuls.
In addition, during the latter part of August, Bermuda was ‘embraced’ by
Hurricane Fabian. Again, households were distracted as attention was
drawn to the extreme damage caused to the Island and more specifically
to individual homes. Several pre-scheduled interviews with individuals
during the first week in September were cancelled. As a result, the total
number of completed surveys did not measure up to the expected target.
3.1.5. Lessons Learnt
• Need for an aggressive publicity campaign to encourage full participation of households.
• Need to compensate interviewers to prevent fatigue. Interviewers must be properly compensated to encourage good quality work.
• Visual, Plan, and Execute. Office and field staff must be adequately trained. The ALLS study is unlike any other survey undertaken; the level of work is comparable with the Census, and the level of technical material to cover is enormous.
• Supervision – Office staff must regularly monitor the work of field staff.
3.1.6. Challenges
• Getting residents to participate in a survey that lasted two hours or more.
• A higher level of resistance compared to other surveys conducted in the past.
• A general election called during the fieldwork, with both political candidates and interviewers calling on some of the same households.
• Hurricane Fabian, which struck on September 5, 2003, the worst hurricane in Bermuda’s recent history.
3.2. Saint Lucia’s Experience
The following is an overview of Saint Lucia’s experience in piloting a
variant of the ISRS study. The recession precluded Saint Lucia from
fielding the main assessment after having completed a large-scale pilot.
Analysis of the pilot data demonstrated that the psychometrics were
stable and that the background questionnaire functioned as expected.
The assessment will be evaluated against the following criteria: cost,
operational burden, and technical burden. In addition, a summary and
recommendations for the main assessment will be outlined. Most of the
information used in this overview was taken from the document ‘A
National Literacy and Numeracy Assessment for Saint Lucia: A National
Planning Report’ (Saint Lucia CSO, 2008).
3.2.1. Cost
Experience in the ISRS study suggests that interviewers are able to
complete between 1.5 and 2 interviews per day depending on the
demographics of the enumeration district in which they are working.
The Saint Lucia pilot confirmed that interviewers were able to complete
an average of two interviews per day, well within the design tolerances.
The cost of overheads amounted to some US$145,000 and the data
collection cost almost US$75,000.
The sample size for the main survey is planned for 3,000 cases which is
estimated at a cost of US$434,000.
3.2.2. Operational burden
This study targeted a purposive sample of 400 adults aged 16-65 years.
Approximately 50 staff (including interviewers, collection supervisors and
CSO staff) were trained for 5 days during the week of January 24, 2009.
Six (6) members of the scoring unit were trained for 3 days during the
week of February 1, 2009.
Pilot data collection was undertaken during the months of February and
March, 2009. Data collection indicated that interviewers were able to
implement the pilot assessment more or less as specified. The
inexperienced interviewers appeared to have more difficulty in managing
the interview process. The interview lengths were much longer than
expected, a fact that reduced interviewer productivity and increased unit
collection costs.
3.2.3. Technical burden
Saint Lucia required significant levels of technical and operational
assistance during the implementation of their study. Assistance was
required in the training of interviewers and in weighting and estimation.
Experienced staff were detached from Statistics Canada to undertake
these tasks. Many of the more technically demanding tasks, such as
building the response database, were undertaken by the Chief Statistician
himself, a fact that placed a great burden on the system.
3.2.4. Risks
The pilot filter threshold proved to be too low, resulting in many
relatively low-skilled respondents being asked the more difficult
questions (test items). Together these problems led to higher levels of
respondent annoyance.
Response rates did not suffer much, but item non-response levels rose,
particularly for items with high reading loads. Respondents with very
low skill levels found the locator items too difficult, and this caused
higher levels of non-response.
3.2.5. Summary and Recommendations for the Main Assessment in Saint
Lucia
The analysis of the Saint Lucia pilot data made recommendations for
changes that must be introduced in the implementation of the main
assessment. The recommendations are meant to correct serious
deficiencies in the design, ones that, if not corrected, would jeopardize
the integrity of the assessment and preclude meeting most of the study
objectives.
Analysis of the pilot results confirms much of the informal feedback
obtained from interviewers during the course of administering the pilot
and during the pilot training.
Three fundamental issues that were identified are as follows:
(a) Interview length: Average interview durations were roughly double the expected length of 90 minutes per household. Such interview durations reduce response rates and lead to higher levels of item-level non-response. These changes increase the risk of bias in the proficiency estimates. In formulating recommendations for the main assessment, efforts should focus on reducing the number of items in the filter, locator and main assessment booklets by some 40 percent.
(b) The pass/fail threshold in the filter booklet: For the pilot, the filter threshold was set based on the distribution of proficiency observed in other countries. This threshold saw those respondents with at least high school education taking the more difficult main assessment booklet. It appears that many high school graduates in Saint Lucia have literacy skill levels below those needed to complete items of the difficulty levels found in the main assessment booklet. For the main assessment, the filter threshold should be set empirically, at a level that routes only respondents with a high probability of scoring at prose literacy level 3 or above to the more difficult side of the design (a sketch of this approach is given after this list). Imposition of a more demanding filter threshold will reduce the response burden by ensuring that respondents are assigned items appropriate to their proficiency levels. Imposing a more demanding filter threshold also forces a reallocation of the main sample towards more educated areas. This reallocation will ensure that the design yields a sufficient number of high-skilled adults.
(c) The administrative burden on the interviewers: Administration of the assessment involves a significant amount of paper handling. In order to keep the interview moving along at the expected pace, interviewers must be well organized and experienced. Many of the pilot interviewers lacked the requisite levels of skill and experience, and the pilot training did not provide sufficient practice time to impart the necessary basic interviewing skills. Several changes will be introduced to try to reduce the administrative burden for the interviewers, including a reduction in the number of documents involved in the main assessment.
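As a sketch of the empirical threshold-setting described in item (b), the following assumes hypothetical pilot data, a logistic model and an arbitrary 0.7 probability cutoff; the real calibration would be done on the actual pilot scores.

```python
# A minimal sketch of setting the filter threshold empirically, as item (b)
# recommends: route a respondent to the harder booklet only if their filter
# score implies a high probability of prose literacy level 3+. The data,
# the 0.7 cutoff and the logistic form are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical pilot data: filter booklet score (0-20) and whether the
# respondent ultimately scored at prose level 3 or above.
filter_score = rng.integers(0, 21, 400).reshape(-1, 1)
p_true = 1 / (1 + np.exp(-(filter_score.ravel() - 12) / 2))
level3_plus = (rng.random(400) < p_true).astype(int)

model = LogisticRegression().fit(filter_score, level3_plus)

# The threshold is the lowest filter score whose predicted probability of
# level 3+ clears the chosen cutoff.
CUTOFF = 0.7
probs = model.predict_proba(np.arange(21).reshape(-1, 1))[:, 1]
threshold = int(np.argmax(probs >= CUTOFF))
print(f"route filter scores >= {threshold} to the main assessment booklet")
```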
Other recommended changes and additions
(a) Interviewers’ compensation: Interviewers should be paid for each case returned rather than differential amounts for the level of completion. Experience suggests that differential compensation elicits bad behaviour on the part of interviewers. Specifically, they adopt practices that serve to maximize their earnings rather than data quality.
(b) Training for main implementation:
(i) Administration of the literacy assessment is very demanding for interviewers. Many of the interviewers trained for the Saint Lucia assessment did not have any previous interview experience. It is recommended that the data collection window for the main assessment be extended to 9 weeks – a period long enough for the interviewers used in the pilot to complete the main assessment. Using experienced interviewers would allow for a reduction in the duration of interviewers’ training, in which case more focus should be placed on mock interviews.
(ii) Class size should be limited to 12-15 trainees: Class sizes above this level reduce the level of interaction between the instructor and the trainees and allow weak interviewers to hide.
(iii) If a significant number of new interviewers were to be recruited for the main assessment, then survey training should be extended by 3 days to allow for practice of mock interviews and for training in basic interview techniques.
(c) Re-score requirements: Move to a sampling strategy for re-scoring of locator and main assessment booklets, one in which 15 percent of booklets are re-scored. Adjusted procedures should be provided as required.
3.3. Proposed Work in Dominica
Dominica has indicated an interest in fielding a national literacy
assessment. A review of their capacity questionnaire suggests that they
have quite limited operational and technical capacity. Thus, any of the
conventional paper-and-pencil options would stretch the Dominican
statistical system beyond the breaking point. Even if the Dominican
system managed to cope, the opportunity costs associated with devoting
such a high proportion of available capacity would be high.
CHAPTER 4: FEEDBACK FROM THE CARICOM
ADVISORY GROUP ON STATISTICS
(AGS)
The AGS played an integral role in the development of the Common
Framework for a Literacy Survey for the Region. The advancement of the
Common Framework for a Literacy Survey Project was therefore on the
agendas of the following four AGS meetings, at which the Consultant
attended and made presentations:
• Eighth Meeting of the AGS, Jamaica, 27 June – 1 July 2011
• Ninth Meeting of the AGS, Belize, 20-22 October 2011
• Tenth Meeting of the AGS, Suriname, 18-22 June 2012
• Eleventh Meeting of the AGS, Grenada, 25-28 October 2012.
Discussions and decisions that came out of the above-mentioned
meetings focused on several related issues. As documented in the
respective AGS Reports under agenda item: Advancement of the
Common Framework for a Literacy Survey Project, the issues discussed
include:
1. Proposed sample size of 1,000 households per country, treating the CARICOM Region as one domain
Recommendations/ Discussions:
• The Consultant indicated that the data from the survey using the Bow Valley common assessment of 1,000 households per country could be used to provide detailed estimates by applying the national survey estimates to the census or other survey data. This approach was not accepted by the meeting: it was pointed out that the countries will need literacy statistics by small geographic areas, as well as by variables such as age, educational attainment and sex, so as to facilitate the use of the data for policy and decision making at local levels. The Consultant confirmed that varying sample sizes could be used, so it was agreed that the full assessment would be considered instead of the common assessment.
• The meeting recommended that each country be considered as a separate domain instead of treating the Region as one domain with each country as a sub-domain.
2. Sample size
Recommendation:
• The meeting agreed that in determining the sample size for the survey, countries should take into consideration their respective literacy data policy/disaggregation needs, weighed against the funds available and the acceptable level of reliability required. The sample size to be used by countries must be able to provide reliable estimates when applied to the populations of the respective countries. For each country, the sample size should be proportionate to the size of the population of the country. A worked illustration follows.
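As a purely illustrative aid to this recommendation, the sketch below applies the textbook sample-size formula for estimating a proportion, inflated by an assumed design effect for a clustered household sample and adjusted by a finite population correction for small Member States; every input shown is an assumption.

```python
# A standard sample-size calculation of the kind this recommendation
# implies: choose n so that a proportion is estimated within a given margin
# of error, inflated by a design effect for the clustered sample. All inputs
# (p, margin, deff, population) are illustrative assumptions.
import math

def required_sample(p=0.5, margin=0.05, z=1.96, deff=1.5, population=None):
    n = deff * (z ** 2) * p * (1 - p) / margin ** 2
    if population:                        # finite population correction
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)

print(required_sample())                      # very large population: 577
print(required_sample(population=60_000))     # a small Member State: 571
```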
3. Number of adults to be targeted per household using the Bow Valley web-based assessment
Recommendation:
• One adult respondent would be selected from each household in the sample, using the Kish selection method (sketched below).
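For illustration, the sketch below shows the spirit of Kish selection with simplified stand-in selection tables; an actual implementation would use the published Kish grids and the standard household serial assignment rules.

```python
# A minimal sketch of Kish-style respondent selection: each household gets a
# pre-assigned selection table, and the table maps the count of eligible
# adults (listed in a fixed roster order) to the one adult to interview.
# The tables below are a simplified stand-in for the published Kish grids.
KISH_TABLES = [           # table[k-1] = which adult to pick among k adults
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 2, 2, 3],
    [1, 1, 2, 2, 3, 3],
    [1, 2, 2, 3, 4, 4],
    [1, 2, 3, 4, 4, 5],
    [1, 2, 3, 4, 5, 6],
]

def select_respondent(roster, household_serial):
    """roster: eligible adults sorted by a fixed ordering rule."""
    table = KISH_TABLES[household_serial % len(KISH_TABLES)]
    k = min(len(roster), 6)              # tables cover up to six adults
    return roster[table[k - 1] - 1]

adults = ["Ann (45)", "Ben (38)", "Cy (19)"]   # hypothetical roster
print(select_respondent(adults, household_serial=4))
```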
4. Acceptable response rate for literacy assessments
Recommendation:
• A very high response rate is usually difficult to achieve for this particular type of survey, but countries should strive to achieve a response rate of about 75-80 percent. Measures should be implemented to adjust for response rate bias (one such measure is sketched below).
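One common such measure is a weighting-class non-response adjustment; the classes and weights in the sketch below are hypothetical and are not a prescription from the meeting.

```python
# One common measure against response-rate bias is a weighting-class
# adjustment: inflate respondents' design weights by the inverse of the
# response rate within their class. Classes and weights are illustrative.
from collections import defaultdict

sample = [  # (weighting class, design weight, responded?)
    ("urban", 50.0, True), ("urban", 50.0, False), ("urban", 50.0, True),
    ("rural", 80.0, True), ("rural", 80.0, False), ("rural", 80.0, False),
]

wt_all, wt_resp = defaultdict(float), defaultdict(float)
for cls, w, responded in sample:
    wt_all[cls] += w
    if responded:
        wt_resp[cls] += w

# Respondents carry the weight of the non-respondents in their class.
adjusted = [(cls, w * wt_all[cls] / wt_resp[cls])
            for cls, w, responded in sample if responded]
print(adjusted)   # urban weights x 1.5, the rural weight x 3.0
```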
5. Internet access
Recommendations:
• There may be areas in some countries where there is little or no internet access. The meeting was advised that in such cases there are two options: (i) a number of programmes/software will be preloaded onto the data collection device, allowing for off-line entries, in which case a very large cache memory and large download capacity will be used (the store-and-forward pattern sketched below); (ii) countries could establish central internet access points, in which case large cache memory and large download capacity would not be required.
• There may be areas in some countries where there is no internet access but mobile network service (such as 3G) is available. In such cases, the software could be set up to run on any standard laptop or wireless tablet.
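The report does not specify the caching mechanism. The sketch below shows the generic store-and-forward pattern that option (i) implies; the endpoint URL and payloads are hypothetical.

```python
# A minimal sketch of the store-and-forward pattern option (i) describes:
# completed interviews are written to a local cache on the device and
# uploaded whenever connectivity (fixed internet or 3G) is available.
# The endpoint URL and payloads are hypothetical.
import json
import os
import urllib.request

CACHE = "pending_interviews.jsonl"
UPLOAD_URL = "https://example.org/literacy/upload"   # hypothetical endpoint

def save_locally(record):
    with open(CACHE, "a") as f:
        f.write(json.dumps(record) + "\n")

def try_upload():
    if not os.path.exists(CACHE):
        return
    remaining = []
    with open(CACHE) as f:
        for line in f:
            req = urllib.request.Request(
                UPLOAD_URL, data=line.encode(),
                headers={"Content-Type": "application/json"})
            try:
                urllib.request.urlopen(req, timeout=10)
            except OSError:               # offline: keep the record cached
                remaining.append(line)
    with open(CACHE, "w") as f:
        f.writelines(remaining)

save_locally({"household": "A-017", "responses": {"q1": 3}})
try_upload()   # safe to call repeatedly; uploads whatever it can
```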
6. Inability of respondents to use the data collection device to respond to the survey
Recommendation:
• The use of a web-based assessment might prove to be challenging for non-computer users. The meeting was advised that in such cases, the application allows for the interviewers to input the responses into the data collection device as directed by the respondents.
7. Cost of the survey
Recommendation/ Discussions:
• Cost estimates to compare the web-based and paper-based data collection methodologies should be done for each country, including the full cost of infrastructure/equipment.
• Relative to the cost of data collection devices and the cost of the survey, the meeting was informed that there would be no need for countries to invest in a large number of data collection devices and field staff, since the fieldwork could be done over an extensive period. It was elaborated that literacy levels in a population do not change at a fast rate over time and, therefore, an extended fieldwork period would not affect the results of the survey.
• In an effort to minimize the cost of the survey in the Region, countries could share/loan data collection devices/hardware (e.g. laptops) with each other, since all the countries in the Region are not likely to execute the survey at the same time. A review of the estimated cost relative to the sharing of hardware among countries should be conducted.
8. License fee for accessing the computer-/web-based literacy assessment test
Recommendation/ Discussions:
• The meeting was advised that the cost of the license for the two required tests is charged on a per-use basis. However, if a large number of the countries in the Region decide to execute the survey within a specific period, a full volume discount of not more than 70% will apply.
9. Concerns about the preferred option versus the other options relative to the science of measuring literacy
Recommendation/ Discussions:
• All five options considered, including the Bow Valley Web-based, utilise the same science of assessment of literacy, the difference being the method of data collection, i.e. electronic/web-based versus paper-based.
• The web-based assessment methodology is similar to the LAMP’s and PIAAC’s but more user-friendly, and the implementation is less costly.
• The IALS was developed using the ALLS and ISRS.
• The PIAAC generally uses the IALS methodology, but a reading component measure based on the ISRS was added.
• The LAMP is a less complicated version of the PIAAC.
• The PIAAC methodology set the standard for the Saint Lucia approach.
• The Bow Valley Web-based Assessment will produce the same results as the others but with fewer complexities.
10. Technical capacity building at the national level
Recommendation:
• Even though most of the data processing will be done automatically and in real time using the web-based option, the survey documents will include details on the concepts, definitions and procedures involved in the survey process.
11. Involvement/ Input from other stakeholders at the national level
Recommendation/ Discussion:
• The technical workshops will target representatives from the National Statistics Offices and the Ministries of Education in the respective countries. The countries thereafter should form national Literacy Survey teams, which should include representatives from the Ministry of Labour and the Ministry of Finance.
12. Major Risks
Recommendation/ Discussion:
• The major risks in any literacy assessment include non-response biases, performance of test items and scoring errors. In order to control these risks, adequate proactive and reactive quality control and assurance checks must be done. Further, a skilled and experienced team should be employed to manage the survey.
CHAPTER 5: ANALYSIS OF THE INDIVIDUAL
COUNTRY CAPACITY ASSESSMENTS
As noted earlier in this Report, household-based skills assessments are
among the most costly, technically demanding, operationally taxing and
error-prone of all social science surveys. In order to gauge the readiness
of Member States and Associate Members to cope, and to identify what
types of support they might need, countries were asked to complete a
questionnaire designed to assess them in this regard (see Annex B for the
questionnaire).
The questionnaire sought to identify, for each country:
1. Needs and priorities with respect to literacy and numeracy data
2. Operational, financial and technical capacity and the need for support
3. Costs associated with carrying out a household survey
The questionnaire was also designed to assist countries in assessing
their national information needs and their capacity to field a
household-based skills assessment, knowledge that will inform the
completion of their respective National Planning Reports.
This Chapter of the report summarizes the responses to three Sections of
the questionnaire, namely: A. Identification; B. Data needs and priorities;
and C. Operational Capacity, and draws out their implications for the
common framework.
It should be noted that the numbering used for the Tables and Charts in
this Section corresponds with the numbering of the questions in the
questionnaire.
Of the twenty (20) countries in the Region, seventeen (17) responded to
the questionnaire yielding a response rate of 85 percent. Barbados, Haiti
and Trinidad and Tobago did not respond to the questionnaire.
Major Findings
A: Identifying information
This Section covers particulars on the respondents. It was found that,
generally, senior officers from either the MOE or the NSO or both
completed the questionnaires. Of the 17 countries that responded, three
(Bahamas, Suriname and Bermuda) submitted two independent
questionnaires each, one from the NSO and the other from the MOE. In
these cases, the responses from the NSOs were considered for all the
Sections except for Section B, where the MOE’s responses were
considered since this Section reflected policy issues. The analysis that
follows is therefore based on the 17 responses.
B. Data needs and priorities
Adult skills assessments can be designed to serve a range of purposes
including knowledge generation; policy and program planning;
monitoring; evaluation; and program administration. Knowledge
generation involves generating new scientific insights, including
understanding cause and effect. Monitoring implies the collection of
repeated measures in order to see if the world is evolving as expected.
The design of any assessment must be adapted to support each of these
purposes and each of these uses implies a different set of technical
attributes that must be met if the system is to produce data that are
judged to be fit for purpose.
B1: Purposes of acquiring adult literacy and numeracy data
Countries were asked to indicate if any of the following purposes of
acquiring adult literacy and numeracy data were applicable to their
respective countries. They were also asked to rank the purposes that are
applicable (in order of importance to their respective country on a scale
of 1 to 5 with 1 being most important and 5 being least important):
o knowledge generation;
o policy and planning;
o monitoring;
o evaluation; and
o program administration.
It should be noted that all the targeted purposes were reportedly
important to all 17 countries that responded, except for Montserrat,
which indicated only one purpose – policy and program planning – and
gave it a ranking of one.
Of the 16 countries that indicated that all the targeted purposes were
important, one (Saint Lucia) did not provide a ranking for any of the
purposes, and this is reflected in Table B1 and Charts B1.1 and B1.2.
Table B1 and Charts B1.1 and B1.2 reveal considerable variations in
reported purpose across countries. The variations are not problematic in
and of themselves, as the proposed assessment option’s sample size and
design support the first three purposes. With the generation of small-area
estimates, all the options would serve the program administration use.
The Bow Valley tool would support program evaluation, as it can yield
reliable estimates of score gain.
Table B1: Purpose of literacy data by ranking and percent of countries

Rank       Knowledge    *Policy and   Monitoring   Evaluation   Program
           generation   program                                 administration
                        planning
1          31.3         52.9          12.5         6.3          6.3
2          18.8         35.3          6.3          6.3          18.8
3          6.3          0.0           31.3         18.8         43.8
4          6.3          0.0           31.3         37.5         12.5
5          31.3         5.9           12.5         25.0         12.5
NS         6.3          5.9           6.3          6.3          6.3
Total (%)  100.0        100.0         100.0        100.0        100.0
Total (#)  16           17            16           16           16

*Note: One country reported this as the only purpose that is important.
NS – Not Stated
[Chart B1: Purpose of literacy data by ranking and percent of countries – bar chart of percent of countries by rank (1-5) for each purpose]
Table B1 and Chart B1 show that the main purpose of adult literacy and
numeracy data is policy and program planning, reported by all 17
countries. Almost 53 percent of the countries ranked this purpose as
most important. The second most important purpose reported is
knowledge generation, with over 31 percent of the countries ranking it
number 1 in importance to their countries.
B2: Policy departments that require adult literacy and numeracy data
Countries were asked to indicate if any of the following policy
departments require adult literacy and numeracy data and they were
also asked to rank the departments that are applicable, in order of
importance to their respective countries, on a scale of 1 to 8 with 1 being
most important and 8 being least important:
1. Kindergarten to Grade 12 education
2. Adult education
3. Labour
4. Finance/Treasury
5. Language and culture
6. Social
7. Prime Minister’s Office
8. Other
Table B2: Policy departments that require literacy data by ranking and percent of countries

Rank       K-Grade 12   Adult       Labour   Finance/   Language      Social   Prime        Other
           education    education            Treasury   and culture            Minister’s
                                             (F/T)      (L&C)                  Office
1          6.7          33.3        50.0     0.0        0.0           0.0      10.0         14.3
2          26.7         26.7        7.1      10.0       0.0           7.1      20.0         14.3
3          6.7          26.7        21.4     10.0       9.1           21.4     10.0         0.0
4          6.7          0.0         14.3     30.0       27.3          21.4     0.0          0.0
5          6.7          0.0         0.0      10.0       36.4          21.4     10.0         28.6
6          20.0         0.0         0.0      10.0       27.3          7.1      20.0         0.0
7          13.3         6.7         0.0      20.0       0.0           14.3     20.0         0.0
8          6.7          0.0         0.0      10.0       0.0           0.0      10.0         42.9
NS         6.7          6.7         7.1      0.0        0.0           7.1      0.0          0.0
Total (%)  100.0        100.0       100.0    100.0      100.0         100.0    100.0        100.0
Total (#)  15           15          14       10         11            14       10           7

NS: Not Stated
As shown in Table B2, there are significant variations in literacy data
importance among the departments of government. This implies a need
for the comparative analysis to reflect a broad range of policy issues.
All 17 countries responded to this question, and it was found that, of all
the policy departments examined, ‘Kindergarten to Grade 12 Education’
and ‘Adult Education’ are the main departments that require adult
literacy and numeracy data, with 15 countries each. On the other hand,
the Finance/Treasury department and the Prime Minister’s Office are the
least likely, compared with the other departments, to require literacy
data, with only 10 countries each indicating so (Table B2).
[Chart B2: Rank by policy departments that require literacy data and percent of countries – stacked bars by rank (1-8) for each department. Note: for the asterisked departments (K-Grade 12, Adult education, Labour, Social), one case was excluded where no ranking was given.]
Chart B2 reveals that the policy departments ranked as most important
by the largest proportions of countries are the Labour department and
the Adult education department, with over 45 percent and about 35
percent respectively. None of the countries identified the
Finance/Treasury, Language and Culture, or Social departments as the
most important policy department requiring adult literacy and
numeracy data.
B3: Policy issues that require adult literacy and numeracy data
Countries were asked to indicate if the following policy issues require
adult literacy and numeracy data and they were also asked to rank the
selected issues in order of importance to their respective countries, on a
scale of 1 to 15 with 1 being most important and 15 being least
important:
(a) Improving the quantity of primary and secondary education
(b) Improving the quality of primary and secondary education (initial education)
(c) Improving the equity of primary and secondary education (initial education)
(d) Improving the efficiency and effectiveness of primary and secondary education (initial education)
(e) Improving the quantity of tertiary education
(f) Improving the quality of tertiary education
(g) Improving the equity of tertiary education
(h) Improving the efficiency and effectiveness of tertiary education
(i) Improving the quantity of adult education
(j) Improving the quality of adult education
(k) Improving the equity of adult education
(l) Improving the efficiency and effectiveness of adult education
(m) Reducing social and economic inequality
(n) Improving labour productivity and competitiveness
(o) Improving health
For analysis purposes, the 15 ranks have been grouped into five
categories as follows:
1. Most important (Rank 1-3)
2. Above average (Rank 4-6)
3. Average (Rank 7-9)
4. Below average (Rank 10-12)
5. Least important (Rank 13-15)
Additionally, the 15 policy issues targeted are analyzed in four groups as
follows:
1. Improving primary and secondary education (policy issues (a) to
(d))
2. Improving tertiary education (policy issues (e) to (h))
3. Improving adult education (policy issues (i) to (l))
4. Socio-economic concerns (policy issues (m) to (o))
Sixteen out of the 17 countries responded to this question.
Table B3.1: Policy issues that require literacy data relative to improving primary and secondary education, by level of importance (rank)

Rank                      Quantity of   Quality of    Equity of     Efficiency and
                          primary and   primary and   primary and   effectiveness of
                          secondary     secondary     secondary     primary and
                          education     education     education     secondary education
Most important (1-3)      10.0          35.7          36.4          30.8
Above average (4-6)       0.0           21.4          18.2          15.4
Average (7-9)             30.0          7.1           9.1           15.4
Below average (10-12)     10.0          21.4          18.2          30.8
Least important (13-15)   50.0          7.1           9.1           0.0
Not Stated                0.0           7.1           9.1           7.7
Total (%)                 100.0         100.0         100.0         100.0
Total (#)                 10            14            11            13
As shown in Table B3.1, the policy issue most often reported as requiring
adult literacy and numeracy data is ‘improving the quality of primary and
secondary education’, indicated by 14 countries. Of these, over 57 percent
reported that this policy issue is most important or above average in
importance to their respective countries.
[Chart B3.1: Ranking of policy issues relative to improving primary and secondary education – stacked bars by importance category for quantity, quality, equity, and efficiency and effectiveness]
Chart B3.1 shows that the most important policy issues reported are the
equity and the quality of primary and secondary education, with about
one-third of the countries indicating each.
Table B3.2: Policy issues that require literacy data relative to improving tertiary education, by level of importance (rank)

Rank                      Quantity of   Quality of   Equity of   Efficiency and
                          tertiary      tertiary     tertiary    effectiveness of
                          education     education    education   tertiary education
Most important (1-3)      0.0           15.4         8.3         16.7
Above average (4-6)       27.3          38.5         16.7        25.0
Average (7-9)             27.3          15.4         25.0        41.7
Below average (10-12)     36.4          15.4         33.3        16.7
Least important (13-15)   9.1           7.7          8.3         0.0
Not Stated                0.0           7.7          8.3         0.0
Total (%)                 100.0         100.0        100.0       100.0
Total (#)                 11            13           12          12
As it relates to improving tertiary education, 13 of the 16 countries that
responded to this question reported a need for adult literacy and
numeracy data to improve the quality of tertiary education. Of these,
almost 54 percent indicated that this policy issue is most important or
above average in importance to their respective countries (Table B3.2).
[Chart B3.2: Ranking of policy issues relative to improving tertiary education – stacked bars by importance category for quantity, quality, equity, and efficiency and effectiveness]
Chart B3.2 demonstrates that the most important policy issues requiring
adult literacy and numeracy data relative to tertiary education are
improving the quality of tertiary education and improving the efficiency
and effectiveness of tertiary education, with around 41 percent and 38
percent respectively.
Table B3.3: Policy issues that require literacy data relative to improving adult education, by level of importance (rank)

Rank                      Quantity of   Quality of   Equity of   Efficiency and
                          adult         adult        adult       effectiveness of
                          education     education    education   adult education
Most important (1-3)      41.7          46.7         41.7        30.8
Above average (4-6)       33.3          26.7         16.7        30.8
Average (7-9)             8.3           20.0         8.3         23.1
Below average (10-12)     0.0           0.0          16.7        15.4
Least important (13-15)   8.3           0.0          8.3         0.0
Not Stated                8.3           6.7          8.3         0.0
Total (%)                 100.0         100.0        100.0       100.0
Total (#)                 12            15           12          13
All four policy issues relating to improving adult education were found to
be most important or above average in importance, with over 58 percent
in each case.
Fifteen of the 16 countries that responded to this question indicated
that adult literacy and numeracy data are required in order to improve
the quality of adult education. Of these, almost 47 percent ranked this
policy issue as most important to their respective countries (see Table
B3.3).
[Chart B3.3: Ranking of policy issues relative to improving adult education – stacked bars by importance category for quantity, quality, equity, and efficiency and effectiveness]
As shown in Chart B3.3, all the policy issues relative to adult education
were found to be very important to countries, with relatively small
variations in the proportion of countries across issues. The largest
proportion of countries reported that improving the quality of adult
education is most important, with about 30 percent indicating this.
Table B3.4: Policy issues that require literacy data relative to socio-economic concerns, by level of importance (rank)

Rank                      Reducing social   Improving labour   Improving
                          and economic      productivity and   health
                          inequality        competitiveness
Most important (1-3)      42.9              46.7               50.0
Above average (4-6)       28.6              20.0               25.0
Average (7-9)             7.1               13.3               25.0
Below average (10-12)     0.0               6.7                0.0
Least important (13-15)   14.3              6.7                0.0
Not Stated                7.1               6.7                0.0
Total (%)                 100.0             100.0              100.0
Total (#)                 14                15                 8
As shown in Table B3.4, the majority of countries indicated that data on
adult literacy and numeracy are needed to reduce social and economic
inequality and to improve labour productivity and competitiveness with
14 and 15 countries respectively. It was found that the issue of
improving labour productivity and competitiveness was most prevalent
with about 47 percent of the countries indicating same followed by the
issue of reducing social and economic inequality with about 43 percent.
[Chart B3.4: Ranking of policy issues relative to socio-economic concerns – stacked bars by importance category for reducing social and economic inequality, improving labour productivity and competitiveness, and improving health]
Chart B3.4 indicates that there is hardly any variation among the
socio-economic policy issues in their importance to countries.
Approximately the same proportion of countries ranked each policy issue
as most important.
B4: Possible funding source(s) identified
Countries were asked if possible source(s) of funding have been identified
to support the implementation of a national literacy and numeracy
assessment.
Table B4: Funding Source Identified

Funding Source Identified   Number   Percent
Yes                         3        17.6
No                          13       76.5
Not sure                    1        5.9
Total                       17       100.0
As shown in Table B4, all 17 countries responded to this question. The
majority of the countries have not yet identified a source of funding. In
fact, only 3 out of the 17 countries indicated that funding was identified.
One country was not sure whether or not any source of funding had been
identified.
C. Operational Capacity
The implementation of adult skills assessments places significant
demands on the operational capacity of NSOs. This section attempts to
evaluate whether the countries have the relevant capacity to undertake
an assessment.
C1: Number of staff with Literacy Survey Experience
All 17 countries responded to the question on the number of staff with
literacy survey experience. Chart C1 and Table C1 reveal that the
overwhelming majority of countries have no experience in conducting
literacy surveys with almost 59 percent (10 countries) with no literacy
survey experience. The countries that reported having staff members
with such experience are limited to only one or two. Only seven countries
reported having experience in literacy surveys.
[Chart C1: Number of staff members with literacy survey experience – number of countries by number of staff members]
Table C1: Number of staff with Literacy Survey Experience

Number of staff   Frequency   Percent
0                 10          58.8
1                 2           11.8
2                 2           11.8
4                 1           5.9
5                 1           5.9
6                 1           5.9
Total             17          100.0
Experience suggests that this lack of experience in the Region is not a
serious barrier to implementation, provided that the implementation
process includes a significant amount of theoretical and procedural
training. Without such training, the risk of inadvertent error is very high.
C2: Specific literacy assessment experience
The seven countries which indicated that they have staff members with
literacy survey experience were asked to indicate the areas in which they
have experience.
Table C2: Experience in selected areas in Literacy Survey

Survey Experience          Yes (No.)   Yes (%)   Total
Planning                   7           100.0     7
Sampling                   6           85.7      7
Data collection            6           85.7      7
Data entry/ data capture   6           85.7      7
Coding                     6           85.7      7
Editing                    6           85.7      7
Data analysis              5           71.4      7
Other                      3           42.9      7
As shown in Table C2, all seven countries that reported experience in
literacy surveys have experience in planning literacy surveys, but only
five have data analysis experience. Six countries each have experience in
sampling, data collection, data capture, coding and editing.
C3-C12: Human capacity in survey phases
Countries were asked about their capacities in the various survey
phases: data collection, data capture, data coding, data editing and data
analysis.
Interviewer capacity
Table C3: Number of trained Interviewers

No. of trained interviewers on staff   No. of countries   Percent
0                                      5                  29.4
2                                      1                  5.9
5                                      3                  17.6
6                                      1                  5.9
8                                      2                  11.8
10                                     1                  5.9
15                                     1                  5.9
50                                     1                  5.9
64                                     1                  5.9
70                                     1                  5.9
Total                                  17                 100.0
Household-based skills assessments have average interview durations of
some 90 minutes plus travel time. Even with relatively small sample
sizes, such projects consume very large numbers of interviewer hours.
Table C3 shows the size of the available interviewer workforce. There are
significant variations in its size, which ranges from 0 to 70 trained
interviewers. Only three countries reported having access to large
numbers of interviewers (50-70). The remaining countries have very
limited numbers of interviewers, too few to support implementation of
an assessment with a large sample size. Only 12 countries reported
having trained interviewers on staff at all.
Almost 30 percent (5) of the countries that responded do not have any
trained interviewers on staff, more than 53 percent have 15 or fewer, and
almost 18 percent have 50 to 70 interviewers. However, countries
reported that they usually recruit the required number of interviewers in
an ad hoc manner depending on the surveys to be executed.
Table C4: Number of monthly interview hours

Interview hours   No. of countries
15                2
56                1
500               2
600               1
792               1
NS                5
NA                5
Total             17

NS: Not stated; NA: Not applicable (no interviewers on staff)
When asked about total monthly collection capacity in terms of the
number of interview hours, only seven of the 12 countries that reported
having trained interviewers on staff provided estimates. These estimates
ranged from only 15 interview hours per month to 792 interview hours
per month (see Table C4).
Field supervision capacity
An essential element of quality assurance in household-based skills
assessments involves supervision of interviewers throughout the field
operation. Table C5 reveals that only two countries reported having an
appreciable number of field supervisors (20 and 40 respectively) and that
more than 35 percent of the countries do not have any field supervisors
on staff.
Table C5: Number of Field Supervisors on staff

No. of Field Supervisors   No. of countries   Percent
0                          6                  35.3
1                          1                  5.9
3                          2                  11.8
4                          1                  5.9
5                          3                  17.6
7                          1                  5.9
8                          1                  5.9
20                         1                  5.9
40                         1                  5.9
Total                      17                 100.0
Data Entry and Data Coding Capacities
While over 70 percent of the countries reported having data entry clerks
and data coding clerks respectively on staff, the number of these clerks
per country ranged from 2 to 16 and 2 to 9 respectively (see Tables C7
and C9).
Table C7: Number of Data Entry Clerks on staff

No. of Data Entry Clerks   No. of countries   Percent
0                          5                  29.4
2                          1                  5.9
3                          2                  11.8
5                          2                  11.8
6                          4                  23.5
7                          1                  5.9
8                          1                  5.9
16                         1                  5.9
Total                      17                 100.0
Table C9: Number of Coding Clerks on staff

No. of Coding Clerks   No. of countries   Percent
0                      5                  29.4
2                      4                  23.5
3                      3                  17.6
4                      1                  5.9
5                      1                  5.9
8                      1                  5.9
9                      2                  11.8
Total                  17                 100.0
Programming capacity
As shown in Table C10, countries have access to very few programmers
on staff; in fact, eight countries (over 47 percent of those that responded)
report having no programmers at all.
Table C10: Number of Programmers on staff

No. of Programmers   No. of countries   Percent
0                    8                  47.1
1                    3                  17.6
2                    5                  29.4
3                    1                  5.9
Total                17                 100.0
Field editing capability
Again, as shown in Table C11, only nine countries have field editors. The
number of field editors ranges from 2 to 11; only one country reported
having 11 field editors, the largest number reported.
Table C11: Number of Field Editors on staff

No. of Field Editors   No. of countries   Percent
0                      8                  47.1
2                      3                  17.6
5                      1                  5.9
6                      2                  11.8
10                     2                  11.8
11                     1                  5.9
Total                  17                 100.0
Statistical analysis capacity
Extracting full value from the assessment results and associated
background information requires the transformation of the raw data into
information through a process of statistical analysis. Table C12 reveals a
large variation in the available statistical analysis capacity: two countries
reported having no analysis capacity, while one country reports having
14 analysts on strength.
Table C12: Number of experienced Statistical Analysts on staff

No. of experienced Statistical Analysts   No. of countries   Percent
0                                         2                  11.8
1                                         3                  17.6
2                                         1                  5.9
3                                         2                  11.8
4                                         3                  17.6
5                                         2                  11.8
7                                         1                  5.9
8                                         1                  5.9
12                                        1                  5.9
14                                        1                  5.9
Total                                     17                 100.0
C13: Analysis Tool
Analysis of assessment results is generally undertaken using a small
number of common analytic software packages, i.e. SAS, SPSS and Excel.
When asked whether staff had working experience in SAS, SPSS and
Excel, the majority of the 17 countries reported experience in Excel and
SPSS, with 94 percent and 88 percent respectively. However, only about
12 percent of the countries reported experience in SAS (see Chart C13
and Table C13.1).
[Chart C13: Experience in selected analysis tools – percent of countries: SAS 11.8, SPSS 88.2, Excel 94.1]
Table C13.1: Experience in selected analysis tools

           SAS     SPSS    Excel
Yes        11.8    88.2    94.1
No         88.2    11.8    5.9
Total (%)  100.0   100.0   100.0
Total (#)  17      17      17
As shown in Table C13.2, only Jamaica and Grenada have experience in
SAS. However, experience in SPSS and Excel is quite prevalent in the
Region: only Montserrat and the Bahamas reported having no experience
in SPSS, while only Montserrat reported no experience in Excel.
Table C13.2: Experience in selected analysis tools by country

Country                          SAS   SPSS   Excel
Anguilla                         No    Yes    Yes
Antigua and Barbuda              No    Yes    Yes
Belize                           No    Yes    Yes
Bermuda                          No    Yes    Yes
British Virgin Islands           No    Yes    Yes
Cayman Islands                   No    Yes    Yes
Dominica                         No    Yes    Yes
Jamaica                          Yes   Yes    Yes
Grenada                          Yes   Yes    Yes
Montserrat                       No    No     No
Saint Kitts and Nevis            No    Yes    Yes
Saint Lucia                      No    Yes    Yes
Bahamas                          No    No     Yes
Guyana                           No    Yes    Yes
Suriname                         No    Yes    Yes
Turks and Caicos Islands         No    Yes    Yes
St. Vincent and the Grenadines   No    Yes    Yes
C14: Advanced analysis capacity
Analysis of data from skills assessments depends on a small range of
statistical techniques including tabulations (to profile the social
distribution of skills), simple regressions (to reveal the factors that have
the largest impact on observed skills levels and the impacts that skills
has on outcomes) and multi-level/multi-variate analysis (to reveal
relationships that are not confounded).
Chart C14 reveals that the large majority of countries (88 percent) have
the capability to produce tables, while about 59 percent can undertake
simple regressions and only about 53 percent can undertake
multi-level/multi-variate regressions. As such, there will be a need for
suitably qualified and experienced personnel to provide these services in
most of the countries, and/or suitably designed training would be
required for nationals in each country. Table C14 shows the countries
with and without the targeted capabilities.
[Chart C14: Type of analysis – percent of countries: Tables 88.2, Simple Regressions 58.8, Multi-level/multi-variate regressions 52.9]
Table C14: Type of analysis by country

Country                          Tables   Simple        Multi-level/multi-
                                          Regressions   variate regressions
Anguilla                         Yes      Yes           Yes
Antigua and Barbuda              Yes      No            No
Belize                           Yes      Yes           Yes
Bermuda                          Yes      No            No
British Virgin Islands           Yes      Yes           No
Cayman Islands                   Yes      Yes           Yes
Dominica                         Yes      No            No
Jamaica                          Yes      Yes           Yes
Grenada                          Yes      Yes           Yes
Montserrat                       No       No            No
Saint Kitts and Nevis            Yes      No            No
Saint Lucia                      Yes      Yes           Yes
Bahamas                          No       No            No
Guyana                           Yes      Yes           Yes
Suriname                         Yes      Yes           Yes
Turks and Caicos Islands         Yes      No            No
St. Vincent and the Grenadines   Yes      Yes           Yes
Technical capacity and infrastructure
C15: Sampling capability
All of the options reviewed require the selection of a multi-stage,
stratified probability sample by a skilled and experienced sampling
statistician. As shown in Chart C15, only seven of the seventeen
responding countries have a sampling statistician on staff.
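For readers unfamiliar with such designs, the following simplified sketch draws enumeration districts with probability proportional to size within strata (with replacement, for simplicity) and then households within districts. The frame is hypothetical; a real design would be specified by a sampling statistician.

```python
# A simplified sketch of a two-stage stratified design: within each stratum,
# draw enumeration districts with probability proportional to size (PPS,
# with replacement for simplicity), then draw a fixed number of households
# within each selected district. The frame below is hypothetical.
import random
random.seed(42)

frame = {  # stratum -> {district: number of households}
    "urban": {"ED01": 420, "ED02": 310, "ED03": 500},
    "rural": {"ED10": 150, "ED11": 220, "ED12": 90},
}

def draw_sample(frame, districts_per_stratum=2, households_per_district=25):
    sample = []
    for stratum, eds in frame.items():
        names, sizes = list(eds), list(eds.values())
        chosen = random.choices(names, weights=sizes, k=districts_per_stratum)
        for ed in chosen:
            hh = random.sample(range(1, eds[ed] + 1), households_per_district)
            sample.extend((stratum, ed, h) for h in hh)
    return sample

print(len(draw_sample(frame)), "households selected")
```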
[Chart C15: Countries with sampling statistician on staff – Yes: 7 countries; No: 10 countries]
C16- C20: Statistical Capacity
Questions 16 to 20 sought to determine whether or not the countries have the capabilities to perform the following:
• Select a multi-stage, stratified probability sample
• Weight survey records
• Calculate variance estimates based on complex survey designs
• Calculate variance estimates using replicate weights
• Use In-design, the software used to generate the test booklets
As shown in Chart C16-C20, the most prevalent capability is the weighting of survey records, with almost 65 percent of the countries having this capability, followed by the selection of multi-stage, stratified probability samples, with almost 59 percent. Less than 24 percent of the countries have the capability to calculate variance estimates using replicate weights or to use In-design, the software used to generate the test booklets.
Chart C16-C20: Country capacity in selected statistical capacities (percent Yes/No: Select a multi-stage, stratified probability sample 58.8/41.2; Weight survey records 64.7/35.3; Calculate variance estimates based on complex survey designs 29.4/70.6; Calculate variance estimates using replicate weights 23.5/76.5; Use In-design 23.5/76.5)
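For readers unfamiliar with the replicate-weight technique named above, the following is a minimal sketch of its general form: the statistic of interest is recomputed once per set of replicate weights, and the squared deviations from the full-sample estimate are combined with a design-dependent scale factor. The delete-one-group jackknife setup and all numbers below are illustrative assumptions, not the procedure of any specific option reviewed.

    # Generic replicate-weight variance sketch (delete-one-group jackknife).
    # The scale factor (G-1)/G and the toy data are assumptions.
    import numpy as np

    def weighted_mean(y, w):
        return np.sum(w * y) / np.sum(w)

    def replicate_variance(y, w, rep_weights, scale):
        theta = weighted_mean(y, w)                  # full-sample estimate
        reps = np.array([weighted_mean(y, rw) for rw in rep_weights])
        return scale * np.sum((reps - theta) ** 2)   # design-scaled variance

    y = np.array([250.0, 270.0, 310.0, 290.0, 260.0, 300.0])  # toy scores
    w = np.ones(6)                                   # toy full-sample weights
    G = 3                                            # jackknife groups
    rep_w = [np.where(np.arange(6) // 2 == g, 0.0, G / (G - 1)) for g in range(G)]
    print(replicate_variance(y, w, rep_w, scale=(G - 1) / G))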
C21: Access to high-speed internet
The computer-based option reviewed in this Report requires access to high-speed internet, a 3G wireless network, or a system that provides the option of occasional upload of results.
When asked about the proportion of the country that has high-speed internet access, only 13 countries responded. Of these, only two reported having 100 percent coverage and five reported having less than 50 percent coverage (see Chart C21). This finding suggests a need for multiple implementation protocols, such as the use of 3G service where available; the use of large cache memory to facilitate delayed uploading of data; and the use of central internet access points where respondents from areas with no internet service could participate in the Survey.
Chart C21: Number of responding countries by proportion of the country with high-speed internet access (less than 50 percent: 5 countries; 50-70 and 71-99 percent: 6 countries combined; 100 percent: 2 countries)
C22: Interviewers with the ability to perform simple tasks on the computer
The computer-based assessments require that interviewers have the ability to undertake simple tasks on a computer. Chart C22 reveals that of the 12 countries that reported having interviewers on staff (see Question C3), only eight responded to this question. Of these, all reported having interviewers with such skills. However, only two reported having between 16 and 50 such interviewers, while four have fewer than 10 interviewers with such skills. This finding suggests a need for focused recruitment strategies and some basic training.
Chart C22: Interviewers who can do simple tasks on the computer (number of countries by number of such interviewers: 0-9: 4; 10-15: 2; 16-50: 2)
C23: Interviewers with experience in computer-assisted personal
interviewing (CAPI)
When asked about the number of interviewers with computer-assisted personal interviewing (CAPI) experience, only 5 of the 8 countries that have computer-literate interviewers reported having such experience (Chart C23). Only two countries have 10 to 50 such interviewers.
Chart C23: Interviewers with CAPI experience (number of countries by number of such interviewers: 0: 3; 1-10: 3; 10-50: 2)
CHAPTER 6: DETAILS ON THE OPTION
RECOMMENDED BY THE AGS- FULL BOW
VALLEY ASSESSMENT
Based on the Consultant's comprehensive review of the various literacy assessment options, the CARICOM Advisory Group on Statistics (AGS) recommended the use of the Full Bow Valley web-based assessment for the Region.
6.1. Detailed Methodological Approach of the Full Bow Valley Web-Based
Assessment
Each assessment option was evaluated against four criteria, namely Cost,
Technical burden, Operational burden and Risk. The evaluation recommended
the use of the Saint Lucian (paper and pencil) or the Bow Valley instruments in
a common (sample size of 1,000 households per country) regional assessment.
Both options, it was stated, would impose a manageable financial, technical
and operational burden on National Statistics Offices. In the pilot conducted in
Saint Lucia, it was reported that implementation of the paper and pencil
instrument placed a heavy burden on interviewers both “literally and
figuratively”. The weight of the test booklets and component measures was
taxing. The Bow Valley option would be slightly less costly and less
operationally burdensome but would be slightly more technically demanding
because of its reliance on computer technology.
The AGS was of the view that the Bow Valley full (sample size of at least 3,000 cases) instrument would provide the data needed by the countries for policy purposes, and that countries in the Region could shoulder the technical, operational and financial burdens associated with conducting this assessment with acceptable levels of risk.
The web-based data collection method is preferred over the paper and pencil method since, among other advantages, it limits the possibility of manual errors during the data collection phase and reduces the time taken for data collection, compilation and analysis. This will ultimately translate into a reduction in the overall cost of the survey.
The full Bow Valley web-based assessment embodies the same science that is
contained in the IALS, ALLS, ISRS, LAMP and PIAAC. Compared to the other
assessments, the Bow Valley web-based assessment provides improved
reliability of skills estimates while reducing the cost, technical burden and
operational burden of the assessment process. It utilizes a suite of assessment
tools that support the full range of needs from program triage (i.e. the process
of determining learner objectives and learning needs) to certification in real
time. The assessment measures prose literacy, document literacy, numeracy,
oral fluency and, for low-level readers, a computer-based variant of the reading
components carried in the ISRS, LAMP and PIAAC assessments.
The full Bow Valley web-based assessment will allow countries to use sample sizes that support the direct estimation of a larger number of point estimates at the national level and of more reliable relationships.
The assessment of the Bow Valley web-based option relative to the four criteria is summarized as follows:
6.1.1. Cost
The adaptive algorithms in the Bow Valley web-based assessment allow the average interview duration to be significantly shorter than in the other options reviewed, resulting in the lowest unit cost for data collection, even with allowances for licensing fees and hardware costs.
6.1.2. Operational burden
The preferred option uses computer technology to manage all of the burden of
collecting the background information, administering the tests, scoring and
scaling the data. The adaptive algorithms that have been built into the
application significantly reduce the number of items that are needed to reach
the desired precision levels. This reduction translates into shorter interview
durations. As such, the operational burden associated with the Bow Valley
web-based assessment is significantly lower than that imposed by any of the
other options.
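The mechanics behind such a reduction can be illustrated with a deliberately simplified adaptive loop: administer the unused item whose difficulty is closest to the current ability estimate, update the estimate, and stop once the accumulated test information meets a precision target. Everything below (the Rasch-type model, the update step, the stopping rule and the item bank) is an illustrative assumption; it is not the Bow Valley algorithm itself.

    # Simplified adaptive testing loop; all parameters are assumptions.
    import math, random

    def adaptive_test(difficulties, answer, se_target=0.5, max_items=20):
        ability, info = 0.0, 0.0
        remaining = list(range(len(difficulties)))
        for _ in range(max_items):
            # Next item: difficulty closest to the current ability estimate
            j = min(remaining, key=lambda k: abs(difficulties[k] - ability))
            remaining.remove(j)
            p = 1 / (1 + math.exp(-(ability - difficulties[j])))  # Rasch-type
            ability += (answer(j) - p) / max(p * (1 - p), 0.1)    # crude update
            info += p * (1 - p)                                   # test information
            if info > 0 and 1 / math.sqrt(info) < se_target:
                break                                # precision target reached
        return ability

    bank = [b / 10 for b in range(-20, 21)]          # hypothetical item bank
    respondent = lambda j: int(random.random() < 1 / (1 + math.exp(-(1.0 - bank[j]))))
    print(adaptive_test(bank, respondent))           # estimate near true ability 1.0

Because each administered item is maximally informative for the respondent at hand, far fewer items are needed than in a fixed booklet, which is the source of the shorter interview durations described above.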
6.1.3. Technical burden
Compared to the other options reviewed, the technical burden associated with
the Bow Valley web-based assessment is much lower. This is mainly because
the assessment is designed to handle all of the associated technical burden
almost automatically. In cases where countries are interested in synthetic estimates for small areas using the Population and Housing Census data, the associated methods are relatively easy to follow for those who are suitably experienced.
6.1.4. Risks
The key risk is the unavailability of suitable communication infrastructure, such as adequate bandwidth, to support the delivery of the assessment. However, the Bow Valley web-based assessment tools are designed to support delivery on tablet computers, on laptops or on standalone computers, with or without internet access. In cases where internet access is problematic, the Bow Valley web-based assessment can use a 3G (third-generation) wireless network, or it can be completed off-line and uploaded at a location where internet access is available.
Another potential risk is the unfamiliarity of the respondents (test takers) with
computer technology. However, the Bow Valley assessment tools are designed
to place minimal demands on the respondent. The administration protocol of
the assessment provides for the interviewer to input the test responses as
directed by the respondent, in cases where the respondents lack the technical
skills to use the technology themselves.
6.2. Recommendations to Inform the use of Bow-Valley Full Web-Based
Assessment
The recommendations for the use of the Full Bow Valley Web-Based
assessment are as follows:
Recommendation 1- Sample Size: In general, the size of the sample should be dependent on the country-specific policy requirements and subject to cost and the budget available to conduct the assessment in the respective countries.
In order for the point estimates to be reliably estimated according to
selected characteristics, a minimum of 600 cases is required per category
for each characteristic. Considering that educational attainment explains
almost 70 percent of a person’s literacy level and in an effort to minimize
the cost and maximize the reliability of the survey, the recommended
minimum characteristics and categories for small countries or for
countries that have budget constraints are as follows:
Characteristic            Categories (minimum cases)
Sex                       Male (600 cases); Female (600 cases)
Educational attainment    Primary (600 cases); Secondary (600 cases); Tertiary (600 cases)
Therefore, the total sample size to provide reliable point estimates for sex and education, with five categories, is 3,000 cases (5*600).
Similarly, for larger countries or for countries with less severe budget
constraints, the recommended minimum characteristics and categories
are as follows:
Characteristic            Categories (minimum cases)
Sex                       Male (600 cases); Female (600 cases)
Educational attainment    Primary (600 cases); Secondary (600 cases); Tertiary (600 cases)
Age groups                15-44 (600 cases); 45+ (600 cases)
In this case, the total sample size to provide reliable point estimates for
sex, education and age with seven categories is 4,200 cases (7*600).
Therefore, the recommended minimum sample size is 3,000 cases for
small countries and 4,200 cases for larger countries. Allowances can be
made for non-response, which will bring the recommended sample sizes to
3,600 and 4,800 respectively. Countries are free to increase the number of
characteristics starting with these recommended proposals.
However, this figure of 600 cases is based on a desired level of precision, or margin of error, and a level of confidence. It is possible, therefore, that if the tolerable level of confidence is, say, 90 percent and the margin of error is, say, 5 percent, then the number of cases may be less than 600. Countries can use the margin of error and level of confidence that they would normally use in their household surveys to derive reliable estimates at the sub-national level and for sub-groups of the population. This issue will be discussed further in the guidelines for sample design that will be prepared under Phase II.
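The arithmetic behind this can be made concrete. Assuming the simplest case of estimating a proportion under simple random sampling, and ignoring design effects and non-response, the required cases per category is z^2 * p * (1 - p) / m^2, where z is the critical value for the chosen confidence level, m the margin of error and p = 0.5 the most conservative proportion. A minimal sketch under those assumptions:

    # Cases per category for a proportion; simple-random-sampling
    # approximation (design effects and non-response ignored).
    from statistics import NormalDist

    def cases_per_category(confidence, margin, p=0.5):
        z = NormalDist().inv_cdf(0.5 + confidence / 2)   # two-sided z value
        return z ** 2 * p * (1 - p) / margin ** 2

    print(round(cases_per_category(0.95, 0.04)))   # about 600 cases
    print(round(cases_per_category(0.90, 0.05)))   # about 271, i.e. less than 600

Under these simplifying assumptions, roughly 600 cases corresponds to a 95 percent confidence level with a 4 percent margin of error, while the 90 percent confidence / 5 percent margin example in the text requires only about 271 cases; the Phase II sample design guidelines are expected to treat this more fully, including design effects.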
Recommendation 2- Adequate Communication Infrastructure: Since
the preferred data collection method is web-based, it is recommended
that countries identify, at an early stage, areas where internet coverage or access is inadequate.
In the absence of the internet, the Bow Valley assessment tools allow for the use of 3G networks. The assessment tools also allow for off-line data collection with the use of large cache memory, in which case delayed data upload will be employed after the interviews. In addition, central locations can be identified where internet access is available and where respondents could be interviewed.
Recommendation 3- Sharing of equipment to conduct survey across
countries: It is recommended that there be sharing of equipment
across countries to make it feasible for all countries to participate in
the survey.
This approach is possible since countries do not have to execute the survey
at the same time. This approach will result in a considerable reduction in
the overall survey cost per country. Countries can therefore contribute to
the purchasing of the equipment, mainly laptops and other hand-held
devices.
Recommendation 4- Respondents with limited or no computer
technology knowledge: The recommendation is for interviewers to
input responses on the device as directed by the respondents since
the assessment tools allow for this. Additionally, tutorial sessions on the use of the devices to respond to the questions should be made available to the respondents prior to the test.
Recommendation 5- Adequate human resources: Since the majority of
the countries currently lack the operational and technical capacity to
conduct a literacy survey in general and specifically the Full Bow
Valley Web-Based assessment, it is recommended that countries/
Region consider this limitation when preparing the budget for their
assessment.
High-level technical experts should be utilized to provide the training and to bridge the gap.
Recommendation 6- Method of selection, age and number of respondents per household: It is recommended that one adult, 15 years old and over, be selected per household using the Kish selection method.
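A minimal sketch of this within-household selection step follows. The classic Kish method assigns the respondent from a pre-printed selection table once eligible household members are listed in a fixed order; the version below substitutes a reproducible equal-probability random draw for the table, which is a simplifying assumption rather than the documented field procedure.

    # Within-household respondent selection in the spirit of the Kish method.
    # The seeded random draw stands in for the pre-assigned Kish tables.
    import random

    def select_respondent(household_id, members):
        eligible = [m for m in members if m["age"] >= 15]   # adults 15+
        if not eligible:
            return None
        # Fixed roster ordering (males first, oldest first), as on a Kish grid
        eligible.sort(key=lambda m: (m["sex"] != "M", -m["age"]))
        rng = random.Random(household_id)    # deterministic per household
        return eligible[rng.randrange(len(eligible))]

    household = [
        {"name": "A", "sex": "M", "age": 47},
        {"name": "B", "sex": "F", "age": 44},
        {"name": "C", "sex": "F", "age": 16},
        {"name": "D", "sex": "M", "age": 9},   # not eligible
    ]
    print(select_respondent(1042, household))

Selecting exactly one adult per household with a known selection probability (one divided by the number of eligible adults) is what allows the design weights used elsewhere in this report to correct for household size.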
Recommendation 7- Relevance of Assessment Instruments: There must
be country involvement in the development or refinement of test
items, questionnaires and corresponding documents including
manuals for training, interviewing and tutorials for respondents to
ensure suitability to the respective countries.
Recommendation 8- Generation of synthetic estimates for a broader range of characteristics: It is recommended that synthetic estimates be generated by applying the national survey estimates obtained (using a sub-sample of 1,000 cases) to the data of the population and housing census.
This approach is said to be a useful method to obtain estimates for a range
of characteristics to satisfy policy requirements and utilizes applied
statistical methods which are explained in Annex D.
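The mechanics of the synthetic approach reduce to applying national rates per demographic cell to census counts for the same cells. The sketch below is illustrative only; the cells, rates and counts are hypothetical numbers, not survey results, and Annex D remains the reference for the applied methods.

    # Synthetic estimation sketch: national survey rates by cell (sex x
    # education) applied to census counts for one small area. All numbers
    # are hypothetical.
    national_rate = {("M", "primary"): 0.62, ("M", "secondary"): 0.35,
                     ("F", "primary"): 0.55, ("F", "secondary"): 0.28}

    census_counts = {("M", "primary"): 1200, ("M", "secondary"): 2100,
                     ("F", "primary"): 1350, ("F", "secondary"): 2300}

    low_skill = sum(census_counts[c] * national_rate[c] for c in census_counts)
    total = sum(census_counts.values())
    print(f"Synthetic share at low literacy: {low_skill / total:.1%}")

The reliability of such estimates rests on the assumption that the national cell rates hold in each small area, which is why the census cells should match the characteristics for which the survey itself is reliable.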
6.3. Adjustments Required
The recommended option requires adaptations to make it country-specific. The
assessment comprises the following instruments/ documents:
(a) Screening component
(b) Background Questionnaire
(c) Core task test items
(d) Main task test items
(e) Exit component
The adaptations required to these documents are as follows:
6.3.1. Adaptations to the background questionnaire
The questionnaire used in Saint Lucia’s pilot is, to some extent, region-specific
and can be used in the web-based assessment. The Saint Lucia questionnaire
can therefore be reviewed in the context of Recommendation 7 above.
Questions in the background questionnaire that were used in Saint Lucia need to be reviewed to ensure that the coding structure captures key variables adequately. More specifically, the questions that relate to educational qualifications, industry and occupation, and adult education and training need to be reviewed by each country and adjusted as appropriate. The background questionnaire is used in analysis, but it also serves as a means to adjust for non-response and to improve the reliability of the proficiency estimates, so all changes need to be reviewed and approved by the regional study manager.
6.3.2. Review of the test items
The Saint Lucian test items can work in the Caribbean context with limited
adaptations. Countries would be required to review the items to ensure that the
items have the appropriate level of face validity.
The Bow Valley tools rely on an item pool that includes all of the items that were included in the Saint Lucia assessment booklets, plus a much larger number of items that have been shown to work in English and French populations.
Countries should review items in the Bow Valley pool to ensure that they have
the appropriate level of face validity. The adaptive design of the Bow Valley
assessment affords much greater coverage of the assessed constructs, a feature
that yields more reliable proficiency estimates.
6.3.3. The need for off-line completion
Several countries indicated that there are areas of the country where access to high-speed internet or 3G networks is not available. Bow Valley has indicated
that a version of the test software is available that provides for standalone
completion and subsequent uploading. This adaptation would need to be
provided where required. This has been incorporated in Recommendation 2.
CHAPTER 7: RESULTS OF THE TWO REGIONAL
TRAINING WORKSHOPS CONDUCTED
UNDER PHASE 1
Two (2) regional technical training workshops were conducted under this Phase
targeting statisticians from the National Statistical Offices and Officers from
the Ministries of Education from the Member States and Associate States.
These workshops served to introduce participants to the various methodologies
that exist relative to reliably measuring literacy as well as to inform on the
option proposed by the AGS.
7.1 The First Regional Training Workshop
This first workshop was held in Trinidad and Tobago from 30 November to 2 December 2011 (see Annex E for the workshop reports).
The main objectives of this first workshop were to familiarise Member States
with the following:
(a) the theory that is used to build and interpret literacy assessments;
(b) the practical issues pertaining to the conduct of a literacy assessment; and
(c) the potential pitfalls and the use of skill assessment data in informing policy.
The main achievements of this workshop were that participants gained a better understanding of what is involved in the planning and execution of a large-scale literacy assessment. The participants were exposed to the science that underpins large-scale literacy assessments, including the following:
(a) the pragmatics of implementing a literacy survey;
(b) costing templates - data collection for paper-based and web-based approaches;
(c) the components of a National Planning Report;
(d) the basic considerations in developing questions to measure literacy;
(e) the methods to identify measurement goals; and
(f) the considerations with regards to the building of literacy assessments.

7.2 The Second Regional Training Workshop
This second regional training workshop was held in Suriname from 21 to 22 June 2012 (see Annex F for the workshop reports).
The main objectives of this second workshop were as follows:
(a) to provide an overview of the various literacy options reviewed, including the AGS' recommended option;
(b) to familiarise participants with the AGS recommended approach;
(c) to provide an overview of the proposed Plan of Action;
(d) to inform and obtain feedback on the technical requirements; and
(e) to inform on the activities that are to commence under Phase II of the project.
The main achievements of this workshop were as follows:
(a) Participants were given a better understanding of (i) the various literacy measurement approaches, and specifically the AGS proposed web-based literacy assessment; and (ii) the issues pertaining to costing, sample size estimation, the background questionnaire and the assessment/test booklet.
(b) Participants were familiarized with the Plan of Action relative to the recommendations and actions that inform the common framework.
(c) Participants were informed about the general findings of the individual country capacity assessment relative to the human and technical requirements needed in the conduct of literacy surveys.
CHAPTER 8: COMMON FRAMEWORK WITH THE PLAN
OF ACTION
A draft Common framework was developed and submitted to countries for
comments and feedback. See Annex G for detailed document.
1. Evaluation of the options
At the Ninth meeting of the CARICOM Advisory Group on Statistics (AGS) held
20-22 October 2011 in Belize, the Consultant gave a presentation on the
findings of the evaluation of the assessment options. The evaluation suggested
that all the options evaluated would satisfy the Region’s information needs.
However, the AGS recommended the use of the full Bow Valley web-based
assessment in the Region. This option allows for each country to be considered
a separate domain with the respective sample size being proportionate to the
size of the population of the country. This decision of the AGS was in the
context of obtaining meaningful estimates at the country level relative to age,
sex, educational attainment and geographic area. In other words, countries
will need literacy statistics by small geographic areas as well as by variables
such as age, educational attainment, and sex to facilitate the use of the data
for policy and decision making at local levels.
In addition, the Bow Valley option would be slightly less costly and less
operationally burdensome but would be slightly more technically demanding
because of its reliance on computer technology. The Bow Valley tool has the
unique advantages of providing immediate results and supporting a range of
other assessment purposes including the evaluation and administration of
literacy programs. These assessment tools have proven particularly useful for placing students and for evaluating learning gain and program efficiency.
2. Some issues raised at CARICOM’s Advisory Group on Statistics (AGS)
meetings
1. Sample size: The meeting was advised by the Consultant that the data from
the survey of 1,000 households per country could be used to provide detailed
estimates by applying the national survey estimates to the census or other
survey data. In this approach, the Region was to be viewed as one sampling
domain and the countries as sub-domains in which the sample size for each
country was to be 1,000. This approach (relative to the domain/sub-domain structure and the 1,000 sample size per country) was not accepted by the meeting because it was pointed out that the countries will need literacy statistics by small geographic areas as well as by variables such as age, educational attainment, and sex so as to facilitate the use of the data for policy and decision making at local levels. Therefore, the sample size should be dependent on the policy issues to be addressed and hence the data disaggregation needs
the respective countries. The sample size to be used by countries must be able
to provide reliable estimates when applied to the populations of the respective
countries. For each country, the sample size should be proportionate to the size
of the population of the country.
2. Response rate: It was noted that while a very high response rate is usually difficult to achieve for this particular type of survey, countries should strive to achieve a response rate of about 75-80 percent. Measures should be implemented to adjust for response rate bias; an illustrative sketch of one such adjustment follows at the end of this list.
3. Internet access: There may be areas in some countries where there is little or no internet access. The meeting was advised that in such cases there are two options: (i) a number of programmes/software could be preloaded onto the data collection device, allowing for off-line entries, in which case a very large cache memory and large download capacity will be required; or (ii) countries could establish centralised internet access, in which case large cache memory and large download capacity would not be required.
Further, there may be areas in some countries where there is no internet
access but mobile network service (such as 3G) is available. In such cases, the
software could be set up to run on any standard laptop or wireless tablet.
4. Respondents' level of knowledge of computers and their operation: It was noted that the use of a web-based assessment might prove to be challenging for non-computer users. The meeting was advised that in such cases there are two options: (i) a specially designed tutorial could be taken, in advance of the assessment, to acquaint non-computer users with basic mouse operations and the response types; or (ii) the interviewers could input the responses into the laptops, tablets and other similar devices at the direction of the respondent.
5. Comparative cost of data collection: It was agreed that estimates to compare
cost of both the web-based and paper and pencil-based data collection
methodologies should be done for each country including full cost for the
infrastructure/ equipment.
6. Number of field staff needed versus number of data collection devices required: In relation to the cost of laptops, tablets and other similar devices and the cost of the survey, the meeting was informed that there will be no need for a large number of field staff, and hence of these devices, since the fieldwork could be done over an extended period. It was elaborated that literacy levels in a population change at a very slow rate over time and, therefore, an extended fieldwork period would not affect the results of the survey.
7. Need for technical workshops: Even though most of the data processing will be
done automatically using the preferred option, the technical workshops will
include details on concepts, procedures, scoring methodology, costing and
psychometric training.
8. Stakeholders' involvement: The technical workshops will target representatives from the National Statistics Office and the Ministries of Education, after which a national team should be formed comprising representatives of other government ministries, such as the Ministry of Labour and the Ministry of Finance.
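As flagged under issue 2 above, a common measure against response-rate bias is a weighting-class adjustment: within adjustment cells, the design weights of respondents are inflated so that they also represent the non-respondents in the same cell. The sketch below is generic, with hypothetical cells, weights and records; it is not a prescribed procedure for the Survey.

    # Weighting-class non-response adjustment sketch; data are hypothetical.
    from collections import defaultdict

    sample = [  # (adjustment cell, design weight, responded?)
        ("F 15-44", 50.0, True), ("F 15-44", 50.0, False),
        ("M 45+", 80.0, True), ("M 45+", 80.0, True), ("M 45+", 80.0, False),
    ]

    total_w = defaultdict(float)   # all sampled cases, by cell
    resp_w = defaultdict(float)    # responding cases, by cell
    for cell, w, responded in sample:
        total_w[cell] += w
        if responded:
            resp_w[cell] += w

    for cell, w, responded in sample:
        if responded:                           # inflate respondents' weights
            print(cell, w * total_w[cell] / resp_w[cell])
    # F 15-44 -> 100.0 (factor 2.0); M 45+ -> 120.0 (factor 1.5)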
3. Recommended option of the AGS
The AGS played an integral role in recommending the Full Bow Valley Web-Based assessment for use in the CARICOM Region. This web-based literacy assessment was developed by Bow Valley College in Calgary, Alberta and was funded by the Government of Canada's Human Resources and Skills Development Ministry. This assessment is based on the theory and assessment methods deployed in the IALS, ALLS, ISRS, PIAAC and LAMP.
The goal was to reduce the cost, operational burden and test duration to a level
that would allow for use in a wide variety of settings including instructional
programs. The tool includes a number of innovative features as follows:
(a) An adaptive algorithm that greatly reduces test duration while reducing standard errors around the proficiency estimates.
(b) The ability to choose any combination of skills domains; the ability to choose among four precision levels that support different uses, i.e. program triage[17], formative or summative assessment, pre- and post-assessment that supports reliable estimates of score gain, and a certification for employment linked to Canada's system of occupation skills standards.
(c) A pair of score reports that provide diagnostic information for the learner and their instructor, and a third score report that identifies the benefits that would be expected to accrue to the learner should the prescribed training be undertaken. Real-time algorithmic scoring improves scoring reliability and allows score reports to be generated in real time.

[17] Program triage involves the process of determining learner objectives and learning needs so that an individual learning plan can be formulated. The process of program triage is central to the implementation of efficient and effective programs.
4. Main comments received from Member States
Approximately ten countries provided feedback on the draft framework as
follows:
a) Survey cost, availability of technical expertise and managerial capacity
Countries have indicated that the cost of the exercise will be a major concern.
The absence of technical expertise for the conduct of the survey will also pose a problem. This includes expertise that would be required in the areas of survey sampling, data editing, scoring and weighting of the results, variance estimation and statistical quality control of the operations.
It was also indicated that the respective Ministries of Education may not have
the necessary managerial capacity to undertake the Literacy Survey and that
the respective statistical agencies may also lack the pedagogical skills to work
on the instruments. Therefore, collaboration between the two agencies would
be required and should be possible.
b) Suitability of web-based approach
It was observed by one country that the paper and pencil-based environment is
a more familiar environment than the web-based one and that a significant
culture shift would be required in the case of the latter. Ensuring suitability of
the web-based approach at the country level should be taken seriously. Actual
devices should be tested and there is also concern about the location and
security of the data set.
It was also stated that the paper and pencil-based approach had credibility and ownership within it, since the scoring is done by persons in the country, such as teachers, trained to score the completed test booklets. Therefore, especially with the active involvement of these persons in scoring, there will be more buy-in to the process.
With respect to the web-based approach, it is very important that mechanisms be set up to ensure that the validity of the process is well understood. It must be possible for the assigned score on each case (which is assigned by the web-based system) to be seen, validated and verified. The process of doing this in a web-based system is not obvious and will need to be thoroughly tested.
It was indicated by another country, relative to a One Laptop Per Family (OLPF)
project, that this country might be at an advantage if a web-based methodology
is used. This country viewed favourably the proposed solutions for areas within
countries that do not have internet connectivity as well as for persons that
were not computer literate.
c) Oversampling of specific population groups
One country stated that oversampling of the 15-24 age group (which includes
the population just completing secondary school) might be necessary since the
literacy survey may have, as one of its main objectives, the assessment of the
education system in providing an education relevant to the present day
realities of the job market. It was further stated that this group is of special interest since it has the greatest demand for jobs, requires new skills in some cases, is the most adaptable to retraining, and is a significant age cohort within the population.
5. Proposed data collection approach indicated by countries
Of the 16 countries that responded to enquiries relative to the data collection
approach they are likely to use for the conduct of a National Literacy Survey in
their respective countries, seven indicated the paper and pencil-based
approach while nine indicated the electronic approach. It should be noted, however, that it is not likely that all the countries indicating electronic would opt for the Bow Valley web-based option. Nevertheless, the theoretical underpinning of the literacy testing framework will be the same regardless of the approach used (i.e. paper-based versus electronic/web-based). The breakdown is presented in the following table:
Table 1: Proposed approach to collecting data by country

Country                          Approach to collecting data
Antigua and Barbuda              Electronic
The Bahamas                      Electronic
Barbados                         Electronic
Belize                           Paper and pencil
Dominica                         Paper and pencil
Grenada                          Electronic
Jamaica                          Paper and pencil
Montserrat                       Paper and pencil
St. Kitts and Nevis              Paper and pencil
St. Lucia                        Paper and pencil
St. Vincent and the Grenadines   Electronic
Suriname                         Electronic
Trinidad and Tobago              Electronic
Bermuda                          Electronic
British Virgin Islands           Paper and pencil
Cayman Islands                   Electronic
6. Consideration of the AGS' recommendation relative to the use of the Bow Valley web-based approach in the Region
The review of the various literacy assessment approaches conducted by the Consultant suggests that all of the options would satisfy the Region's information needs and could provide similar levels of reliable estimates.
Further, all the options have common methodological underpinnings and
may differ only by the data collection method employed. Some countries
may opt for paper and pencil data collection while others may opt for
some form of electronic data collection/ web-based approach.
Therefore, one can conclude that the recommendation of the AGS is primarily one about the approach to data collection, based on the evaluation criteria, and not about theoretical differences among the options.
7. Recommendations[18]
1. Sample Size
In general, the size of the sample should be dependent on the country-specific policy requirements and subject to cost and the budget available to conduct the assessment in the respective countries.
In order for the point estimates to be reliably estimated according to selected
characteristics, the Consultant has indicated that a minimum of 600 cases is
required per category for each characteristic. Considering that educational
attainment explains almost 70 percent of a person’s literacy level and in an
effort to minimize the cost and maximize the reliability of the survey, the
recommended minimum characteristics for small countries or those that have
budget constraints are as follows:
 Sex: Male (600 cases), Female (600 cases);
 Educational attainment: Primary (600 cases), Secondary (600 cases),
Tertiary (600 cases)
This gives a total of two characteristics and five categories for making reliable
point estimates, resulting in a minimum sample size of 3,000 cases (5*600).
For larger countries or those with less severe budget constraints, the
recommended minimum characteristics are as follows:
 Sex: Male (600 cases), Female (600 cases),
 Educational Attainment: Primary (600 cases), Secondary (600 cases),
Tertiary (600 cases)
 Age Groups: 15-44 (600 cases), 45+ (600 cases)
This will provide for three characteristics with a total of seven categories for reliable point estimates, resulting in a sample size of 4,200 cases (7*600). Therefore, the recommended minimum sample size is 3,000 cases for
small countries and 4,200 cases for larger countries. Allowances can be made
for non-response, which will bring the recommended sample sizes to 3,600 and
4,800 respectively. Countries are free to increase the number of characteristics
starting with these recommended proposals. However, this figure of 600 cases is based on a desired level of precision, or margin of error, and a level of confidence. It is possible, therefore, that if the tolerable level of confidence is, say, 90 percent and the margin of error is, say, 5 percent, then the number of cases may be less than 600. Countries can use the margin of error and level of confidence that they would normally use in their household surveys to derive reliable estimates at the sub-national level and for sub-groups of the population. This issue will be discussed further in the guidelines for sample design that will be prepared under Phase II.

[18] The recommendations that follow relate to both the web-based data collection approach and the paper-based approach, but may include recommendations specific to electronic data collection.
2. Adequate communication infrastructure
If the preferred data collection method is web-based, it is recommended that countries identify, at an early stage, areas where internet coverage or access is inadequate.
In the absence of the internet, the Bow Valley assessment tools allow for the use of the 3G network or related networks. The assessment tools also allow for off-line data collection with the use of large cache memory, in which case delayed data upload will be employed after the interviews. In addition, central locations can be identified where internet access is available and where respondents could be interviewed.
3. Sharing of equipment to conduct survey across countries
It is recommended that there be sharing of equipment across countries to make it
feasible for all countries to participate in the survey.
This approach is possible since countries do not have to execute the survey at
the same time. This approach will result in a considerable reduction in the
overall survey cost per country. Countries can therefore contribute to the
purchasing of the equipment, mainly laptops, tablets and other similar devices.
This approach would satisfy concerns raised by some countries relative to the
cost of acquiring equipment.
4. Respondents with limited or no computer technology knowledge
The recommendation is for interviewers to input responses on the device as
directed by the respondents since the assessment tools allow for this.
Additionally, tutorial sessions on the use of the devices to respond to the questions should be made available to the respondents prior to the test.
5. Adequate human resources
Since the majority of the countries currently lack the operational and technical
capacity to conduct a literacy survey in general and specifically the Full Bow
Valley Web-Based assessment, it is recommended that countries/ Region
consider this limitation when preparing the budget for their assessment.
High-level technical experts should be utilized to provide the training and to bridge the gap.
6. Method of selection, age and number of respondents per household
It is recommended that one adult, 15 years old and over, be selected per household using the Kish selection method.
7. Relevance of assessment instruments
There must be country involvement in the development or refinement of test
items, questionnaires and corresponding documents including manuals for
training, interviewing and tutorials for respondents to ensure suitability to the
respective countries.
8. Generation of synthetic estimates
It is recommended that synthetic estimates be generated by applying the national survey estimates obtained (using a sub-sample of 1,000 cases of the determined country sample) to the data of the Population and Housing Census.
This approach is said to be a useful method to obtain estimates for a broader range of characteristics to satisfy policy requirements, and it utilizes applied statistical methods. This approach does not imply that countries will be utilising the recommendation of the Consultant of considering the Region as a domain and the countries as sub-domains with a sample size of 1,000. The sample size will be selected in accordance with Recommendation 1, and a sub-sample of this sample can still be applied to the Census data to produce synthetic estimates in addition to those that would be obtained from the survey.
9. Pretesting/ piloting must be done in each participating country
This is necessary to ensure that all the tools are applicable for the respective
countries. However, the sample (100-500 cases) used in the pilot should be selected from the main survey sample so that the data collected during the pilot exercise could be utilized should there be no need for any major modification to the tools.
134
10. Translation of the common framework to Haitian French and Surinamese
Dutch
The translation of the framework, including the test items, should be done by linguists who are familiar with the framework. The translation of the test items (for example, those included in the filter booklet, location booklet and the main booklet) should be done in such a way as to ensure that the psychometric performance of the items remains unaltered. This is necessary to ensure that the test items remain identical in psychometric terms so as to ensure comparability among countries.
11. Duration of training of field staff
The length of training of field staff would depend on the quality of field staff
and would vary by country.
PLAN OF ACTION
COMMON FRAMEWORK FOR A LITERACY SURVEY IN CARICOM

(For each activity below: estimated time frame; responsibility; associated recommendation; and remarks, where given.)

A: Preparatory Phase for Survey Readiness

1. Establish/continue communication with the MOE, MOF and other relevant stakeholders in preparation for the conduct of a Literacy Survey
   Time frame: At least 24 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Not Applicable.

2. Identify/determine the approach (paper and pencil-based; electronic, e.g. web-based; any other approach or combination of approaches) to be used to conduct the Literacy Survey
   Time frame: At least 24 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Not Applicable.
   Remarks: Initial review of questionnaires and manuals re: output of regional project; discussion with proprietor of Bow Valley.

3. Prepare survey proposal inclusive of a budget for the identification of funds to conduct the activity (will include preliminary estimates of some activities under Section B, such as the sample size)
   Time frame: At least 24 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Not Applicable.
   Remarks: The budget should be based on an estimated sample size, the method of data collection (paper-based versus web-based or any other electronic approach), the infrastructural/human resources required and other relevant information, using the generic costing template prepared under the CARICOM Project. Draft questionnaires and other survey instruments relative to printing/procurement cost should be part of the initial proposal.

4. Schedule a time frame for the conduct of the Survey
   Time frame: At least 24 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Availability and sharing of equipment to conduct survey across countries.
   Remarks: Some information would already have been available out of the CARICOM Project through the preparation of the National Implementation Plan (National Planning Report, NPR) that would help countries. This information should be shared with the Secretariat as early as possible, to facilitate the development of a timetable to enable the possible sharing of survey equipment among countries.

5. Develop publicity and communication materials
   Time frame: At least 6 months before start of Survey. Responsibility: Survey executing agency.
   Remarks: Some work should have been put in place under the CARICOM Project.

B: Commencement of substantive work relative to the Conduct of the Survey

1. Establish a National Literacy Survey Committee (NLSC)
   Time frame: At least 18 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Not Applicable.
   Remarks: The NLSC should include representatives from the NSO, MOE and Ministries of Labour. This Committee should meet, as necessary, to discuss all the survey-related activities.

2. Identify Focal Point
   Time frame: At least 24 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Not Applicable.
   Remarks: This person should ideally be a professional within the executing agency.

3. Determine the required characteristics and categories (such as age, sex and area) to inform country-specific policies. The number of categories has implications on the sample size, which in turn will determine the degree of reliability of the point estimates.
   Time frame: At least 18 months before start of Survey. Responsibility: The NLSC. Associated recommendation: Reliable sample size.
   Remarks: This activity should be one of the initial considerations of the NLSC.

4. Determine the sample size based on the characteristics and categories identified in (3) above.
   Time frame: At least 18 months before start of Survey. Responsibility: The NLSC. Associated recommendation: Reliable sample size.
   Remarks: The sample design guidelines prepared under the CARICOM Project should be used. Most countries should have an up-to-date sample frame from the 2010 Round of Population and Housing Census. Consultant should be hired to support the process.

5a. Review and adapt survey questionnaires, test booklets, and reading components measures as per guidelines. 5b. Engage data processing personnel in the review of the questionnaire.
   Time frame: At least 18 months before start of Survey. Responsibility: The NLSC. Associated recommendation: Relevance of assessment instruments.
   Remarks: Work commenced at the project proposal phase. Some work should have been put in place under the CARICOM Project and also at the project proposal phase. Consultant should be hired to support the process.

6a. Review and finalise all survey documents including training manuals, instruction manuals, debriefing questionnaire, guidelines, control forms and scoring sheets. 6b. Engage data processing personnel in the review of the questionnaire.
   Time frame: At least 24 months before start of Survey. Responsibility: The NLSC. Associated recommendation: Relevance of assessment instruments.
   Remarks: Training materials prepared under the CARICOM Project should be used as a base.

7. Identify training needs relative to the conduct of the Survey.
   Time frame: At least 18 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Adequate human resources.

8. Relative to the proposal already prepared in Section A(3) above, review the availability of equipment and infrastructure for the conduct of the Survey.
   Time frame: At least 18 months before start of Survey. Responsibility: Survey executing agency. Associated recommendations: 1. Availability and sharing of equipment to conduct survey across countries; 2. Adequate communication infrastructure.
   Remarks: Applicable to both electronic and paper and pencil.

9. Adapt software to allow for delayed uploading of data in cases where internet and/or other related connections such as 3G are not available, if necessary.
   Time frame: At least 18 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Adequate communication infrastructure.
   Remarks: A generic software prepared under the CARICOM Project should be used. For electronic data capture including the web-based approach. Consultant should be hired to support the process.

10. Adapt tutorials on the use of the devices to respond to the Survey.
   Time frame: At least 18 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Adequate communication infrastructure.
   Remarks: Generic tutorial documents prepared under the CARICOM Project should be used. For electronic data capture including the web-based approach.

11. Adapt the tabulation plan.
   Time frame: At least 18 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Not Applicable.
   Remarks: A tabulation plan prepared under the CARICOM Project should be used.

12. Apply an extensive regime of proactive vetting of materials and plans to detect and prevent errors.
   Time frame: At least 12 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Adequate human resources.
   Remarks: Consultant should be hired to support the process.

13. Based on all of the above, review and finalise the survey design.
   Time frame: At least 12 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Not Applicable.
   Remarks: The finalised National Implementation Plan (National Planning Report, NPR).

C: Acquisition of Resources for the Conduct of the Survey

1. Recruit Project/Survey Coordinator and other project staff.
   Time frame: At least 12 months before start of Survey; for the Project/Survey Coordinator, recruitment should be done at least 24 months before the start of Survey due to the complex nature of the Survey. Responsibility: Survey executing agency. Associated recommendation: Adequate human resources.
   Remarks: The Project/Survey Coordinator should ideally be a professional within the executing agency or a National Consultant who is able to dedicate her/himself full-time for the duration of the survey.

2. Acquire all survey materials.
   For paper and pencil-based: (i) print questionnaires, test booklets, manuals, survey forms, etc.; (ii) procure timers, tape recorders, batteries, pencils, erasers, sharpeners, data capture and data processing equipment, etc.
   For web-based: (i) enter into an agreement with the web-based approach proprietor or other proprietor, depending on the approach to be used; (ii) procure data collection devices specific to the web-based approach.
   Time frame: At least 6 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Not Applicable.
   Remarks: For other electronic data collection approaches, procurement of data collection and data processing devices will be required.

3. Identify and recruit high-level technical support personnel for the conduct of the Survey.
   Time frame: At least 2 months before the specific activity is expected to commence. Responsibility: Survey executing agency. Associated recommendation: Adequate human resources.

D: Data Collection, Data Processing and Related Activities

(i) Undertake Pilot Survey

1. Launch of Survey publicity programme.
   Time frame: At least 6 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Not Applicable.
   Remarks: The media should be involved from this stage to provide continued media coverage until the conclusion of the survey fieldwork. Publicity and communication should continue throughout the survey. Generic publicity materials prepared under the CARICOM Project should be used.

2. Train survey managers and relevant personnel for the pilot survey in the execution of key implementation steps, e.g. sampling; survey planning; data collection; data processing including scoring and weighting; and data analysis.
   Time frame: At least 1 to 6 months before start of survey. Responsibility: Survey executing agency. Associated recommendation: Adequate human resources.
   Remarks: The specially designed guidelines prepared under the CARICOM Project should guide this process. Consultant should be hired to support the process.

3. Commence the process of identifying and recruiting field staff for the conduct of the pilot.
   Time frame: At least 9 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Adequate human resources.
   Remarks: Contracts will be of a short-term duration.

4. Select sample for pilot Survey.
   Time frame: At least 9 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Pretesting/piloting must be done.

5. Train pilot Survey staff using relevant training materials.
   Time frame: At least 9 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Pretesting/piloting must be done.

6. Conduct pilot fieldwork.
   Time frame: At least 9 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Pretesting/piloting must be done.

7. Process pilot data.
   For paper and pencil-based: (i) score test booklets and reading components; (ii) code open-ended fields such as industry, occupation, other specify; (iii) edit household and background questionnaires; (iv) capture household questionnaires, background questionnaires, and scores; (v) merge background questionnaire, scores and codes; (vi) scale the assessment results, link them to the international proficiency scales and compute error estimates; (vii) weight the data file and compute replicate weights.
   For the web-based approach: (i) code open-ended fields, e.g. industry, occupation, field of study, other specifies, and merge onto the analysis file; (ii) weight the data file and compute replicate weights.
   Time frame: At least 6 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Pretesting/piloting must be done.

8. Evaluate pilot and apply an extensive regime of retroactive review of operational results to detect and correct errors.
   Time frame: At least 6 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Pretesting/piloting must be done.
   Remarks: Consultant should be hired to support the process. This would include the preparation of a tabulation plan and generation of data to be analysed.

9. Revise all survey materials based on the pilot survey.
   Time frame: At least 2 months before start of Survey. Responsibility: Survey executing agency.

(ii) Conduct Main Survey

1. Commence publicity activities for main survey.

2. Select sample.
   Time frame: At least 6 months before start of Survey. Responsibility: Survey executing agency. Associated recommendation: Reliable sample size.
   Remarks: Consultant should be hired to support the process.

3. Identify and recruit field staff and other Survey staff for the conduct of the main survey.
   Time frame: At least 3 months before start of Survey (recruitment should be done in time for the training sessions). Responsibility: Survey executing agency. Associated recommendation: Adequate human resources.

4. Train Survey staff using relevant training materials.
   Time frame: At least 6 weeks before start of Survey. Responsibility: Survey executing agency. Associated recommendations: Adequate human resources; schedule time/duration.

5. Conduct fieldwork.
   Responsibility: Survey executing agency. Associated recommendation: Not Applicable.

6. Process data.
   For paper and pencil-based: (i) score test booklets and reading components; (ii) code open-ended fields such as industry, occupation, other specify; (iii) edit household and background questionnaires; (iv) capture household questionnaires, background questionnaires, and scores; (v) merge background questionnaire, scores and codes; (vi) scale the assessment results, link them to the international proficiency scales and compute error estimates; (vii) weight the data file and compute replicate weights.
   For the web-based approach: (i) code open-ended fields, e.g. industry, occupation, field of study, other specifies, and merge onto the analysis file; (ii) weight the data file and compute replicate weights.
   Time frame: for paper and pencil-based, processing should commence at least 1 week after the start of Survey; for the web-based approach, processing would commence at the start of Survey. Responsibility: Survey executing agency. Associated recommendation: Not Applicable.

7. Generate synthetic estimates.
   Time frame: To be done if required by country. Associated recommendation: Generation of synthetic estimates for a broader range of characteristics.
   Remarks: An additional international cost will be attached to this activity. Consultant should be hired to support the process.

E: Data Analysis and Dissemination

1. Prepare reports (preliminary and final) and disseminate widely.
   Time frame: Approximately 6-9 months after the conclusion of Survey. Responsibility: Survey executing agency. Associated recommendation: Not Applicable.
   Remarks: Consultant should be hired to support the process. A tabulation plan should form part of the process in the pilot survey, which would allow for some data analysis.
CHAPTER 9: SUMMARY AND CONCLUSION
Literacy and numeracy skills have been shown to be among the most important determinants of rates of social and economic progress over the long term and of national competitiveness, and to be one of the principal determinants of social inequality in valued outcomes, including income, employment, health and social engagement.
Consistent, reliable and comparable statistical information is an important
ingredient for planning, monitoring and evaluation of policy decisions and the
lack thereof has long been considered as hampering the effectiveness of public
policy in the Caribbean Region. Consequently, noting the lack and/or poor
quality of literacy statistics in the Region, and the perceived importance of this
information for the Region’s economic and social development in light of the
advent of the Caribbean Single Market and Economy (CSME), Statisticians of the
Caribbean Community (CARICOM) decided to pursue the development of a
common strategy and methodology for the collection and analysis of social and
gender statistics including educational statistics.
Aware of these challenges and of the shortcomings of the existing approaches to measuring literacy, the CARICOM Advisory Group on Statistics (AGS) is attempting to develop a common framework for the production, collection and analysis of literacy data.
Following consultation with CARICOM and its Member States, the following
seven detailed options were identified and evaluated:
1. The Organisation for Economic Cooperation and Development’s (OECD’s)
Program for the International Assessment of Adult Competencies (PIAAC)
paper and pencil reading, numeracy and reading components
assessments- Full Assessment;
2. The Organisation for Economic Cooperation and Development’s (OECD’s)
Program for the International Assessment of Adult Competencies (PIAAC)
paper and pencil reading, numeracy and reading components
assessments- Common Assessment;
3. The United Nations Educational, Scientific and Cultural Organization's (UNESCO's) Institute for Statistics' (UIS') Literacy Assessment and Monitoring Programme (LAMP) paper and pencil assessments of prose literacy, document literacy, numeracy and reading components on a sample of 3,000 adults per country;
4. Saint Lucia's paper and pencil prose literacy, document literacy, numeracy and reading components assessments on a sample of 3,500 adults per country- Full Assessment;
5. Saint Lucia's paper and pencil prose literacy, document literacy, numeracy and reading components assessments on a sample of 1,000 adults per country- Common Assessment;
6. Bow Valley's web-based prose literacy, document literacy, numeracy and reading components assessments on a sample of 3,500 adults per country- Full Assessment;
7. Bow Valley's web-based prose literacy, document literacy, numeracy and reading components assessments on a sample of 1,000 adults per country- Common Assessment.
The TOR required that this review include an assessment of the International
Survey of Reading Skills (ISRS). It should be noted, however, that the ISRS
methodology underpins all of the other options reviewed; essentially, the Saint
Lucia assessment applied the ISRS measures and methods with minor
adaptation.

The assessment provided an analysis of the origins of large-scale literacy
measurement, including the International Survey of Reading Skills (ISRS), the
Adult Literacy and Life Skills Survey (ALLS) and the Young Adults Literacy
Survey (YALS).
Each option was evaluated in terms of information yield, cost, operational
burden, technical burden and risk.
Country Needs and Constraints
The evaluation reflects the needs and constraints facing the countries of the
Region: the countries have a pressing need for objective comparative data
on the level and social distribution of economically and socially important skills
for policy purposes. In statistical terms, the countries need comparative data on
average skill levels, the distribution of skills by proficiency level for key
sub-populations including youth, employed workers and the workforce as a whole,
the relationship of skills to outcomes and the relationship of skills to
determinants.

With the exception of Saint Lucia and Bermuda, countries have limited or no
experience in the conduct of household-based skills assessments. Access to
financial resources to support an assessment is limited, as are the operational
and technical expertise needed for household-based skills assessment.
Key Issues Arising Out of Evaluation
The evaluation suggests that all of the options reviewed would satisfy the
Region’s information needs. Input from the countries suggests that they have
very limited operational and technical capability, to the point that a paper and
pencil-based national skills assessment would overwhelm them.
The evaluation recommends against participation in either of the two PIAAC
options: PIAAC in any guise is too costly and too technically and operationally
demanding given the constraints facing countries in the Region.

The evaluation also recommends against participation in UIS's LAMP program.
While it is less costly and less operationally burdensome, it is almost as
technically demanding as PIAAC.

The evaluation also recommends against fielding a full-scale study using the
Saint Lucian instruments. As with PIAAC and LAMP, the financial, technical and
operational burden of this option would overwhelm many of the statistics offices.
The evaluation recommends using either the Saint Lucian or the Bow Valley
instruments in a common regional assessment. Both options would impose a
manageable financial, technical and operational burden on national statistics
offices. Both Canada and the USA have implemented the Saint Lucian design at
a national scale equivalent to what has been proposed for CARICOM Member
States, and the Saint Lucian pilot survey was of a sufficient scale to suggest
that implementation on the proposed scale is manageable. The Saint Lucian
implementation team reports that implementation placed a heavy burden on
interviewers, both literally and figuratively: the weight of the test booklets and
component measures was taxing. The Bow Valley option would be slightly less
costly and less operationally burdensome but slightly more technically
demanding because of its reliance on computer technology. The Bow Valley tools
have been validated on a national scale involving some 2,000 test takers. The
Bow Valley tool has the unique advantages of providing immediate results and
supporting a range of other assessment purposes, including the evaluation and
administration of literacy programs. The assessment tools have proven
particularly useful for placing students and for evaluating learning gain and
program efficiency.
Issues Raised by CARICOM's Advisory Group on Statistics (AGS) at the Eighth,
Ninth, Tenth and Eleventh AGS Meetings (i.e. the four meetings where the
advancement of the Literacy Project was discussed)

Discussion and decisions at the above-mentioned meetings focused on several
related issues, including:
1. The use of a sample size of 1,000 households per country, treating the
CARICOM Region as one domain
2. Sample size
3. Number of adults to be targeted per household using the Bow Valley
web-based assessment
4. Acceptable response rate for literacy assessments
5. Internet access
6. Inability of respondents to use the data collection device to respond to
the survey
7. Cost of the survey
8. License fee for accessing the computer-/web-based literacy assessment
test
9. Concerns about the preferred option versus the other options relative to
the science of measuring literacy
10. Technical capacity building at the national level
11. Involvement/input from other stakeholders at the national level
12. Major risks
Conclusion on Findings of the Country Capacity Assessment
Generally, these results reveal considerable heterogeneity among
countries in the key uses to be served by the assessment. The fact that
countries indicated a need for data to help with program administration
provides support for the preferred assessment option.
Collectively, the results of the capacity survey suggest that the
administration of any of the full-scale assessment options would be
beyond the capacity and capability of most of the countries. Most
countries would need to greatly enhance their collection and processing
capacity and most would need assistance and support to complete the
technical aspects of implementation including sample selection, editing,
scoring, weighting, variance estimation and analysis.
The assessment reveals that the overwhelming majority of countries
currently lack the operational and technical capacity to safely field the
recommended national assessment option (the Full Bow Valley Web-Based
assessment).
Such a finding does not preclude implementation; rather, the weakness in
the Region's technical and operational infrastructure implies a need for
higher expenditures on:
1. The recruitment and basic training of interviewers
2. The recruitment and training of coders, programmers and data analysts
3. The provision of higher levels of technical support for sample selection,
weighting and variance estimation
4. The training of national teams in the execution of key implementation
steps
5. The implementation of an extensive regime of proactive vetting of
materials and plans to detect and prevent errors
6. The implementation of an extensive regime of retroactive review of
operational results to detect and correct errors
7. The specification and execution of a broad range of information products
and services that serve to reduce the analysis burden on individual
countries.
The actual size of the expenditures that will be required, the nature of
the technical support required and the implied quality assurance regime
can only be established once countries have completed their national
planning reports and agreed to a common implementation schedule. As
noted earlier in this report, the amounts will depend not only on the
number of countries that decide to participate, but also on which countries
decide to participate.
The assessment also reveals that high-speed internet coverage is limited
in most countries. However, there are recommendations that would allow
for data collection where internet access is lacking or limited, e.g. using
data collection devices with large cache memory and uploading the
responses later, using central points with internet access that respondents
could visit to complete the survey, and using 3G networks where possible.
A minimal illustrative sketch of this store-and-forward approach follows.
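The following is a minimal sketch of such a store-and-forward collection cache, assuming a hypothetical upload endpoint (survey.example.org), a local SQLite file and a JSON payload; these names are illustrative assumptions and not part of the recommended tooling.

```python
# Minimal sketch of store-and-forward data collection: responses are cached
# locally on the data collection device and uploaded whenever connectivity
# (a central internet point or a 3G signal) becomes available.
import json
import sqlite3
import urllib.request

DB_PATH = "responses_cache.db"                       # local cache on the device
UPLOAD_URL = "https://survey.example.org/responses"  # hypothetical endpoint


def init_cache(conn: sqlite3.Connection) -> None:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS responses (
               id INTEGER PRIMARY KEY,
               payload TEXT NOT NULL,
               uploaded INTEGER NOT NULL DEFAULT 0
           )"""
    )
    conn.commit()


def save_response(conn: sqlite3.Connection, response: dict) -> None:
    # Called at the end of each interview, with or without connectivity.
    conn.execute("INSERT INTO responses (payload) VALUES (?)",
                 (json.dumps(response),))
    conn.commit()


def upload_pending(conn: sqlite3.Connection) -> int:
    # Attempted opportunistically; failed uploads stay cached for retry.
    uploaded = 0
    rows = conn.execute(
        "SELECT id, payload FROM responses WHERE uploaded = 0").fetchall()
    for row_id, payload in rows:
        req = urllib.request.Request(
            UPLOAD_URL, data=payload.encode("utf-8"),
            headers={"Content-Type": "application/json"}, method="POST")
        try:
            with urllib.request.urlopen(req, timeout=10):
                conn.execute(
                    "UPDATE responses SET uploaded = 1 WHERE id = ?", (row_id,))
                conn.commit()
                uploaded += 1
        except OSError:
            break  # no connectivity yet; keep remaining responses cached
    return uploaded


if __name__ == "__main__":
    conn = sqlite3.connect(DB_PATH)
    init_cache(conn)
    save_response(conn, {"respondent": "0001", "prose_score": 251})
    print(f"Uploaded {upload_pending(conn)} cached response(s)")
```

The design point that matters is that the interview never blocks on the network: responses are captured and cached locally, and uploading is a separate, retryable step.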
9.1 Activities Completed Under Phase I
The table below shows the activities completed under Phase I by
completion date and method of verification:
Activities Completed | Completion Date | Method of Verification
Conduct of Briefing Meeting | May 2011 | Inception Report [1]
Preparation of an Inception Report | Aug 2011 | Inception Report [1]
Conduct of CARICOM First Technical Workshop on a Common Framework for a Literacy Survey | Dec 2011 | First Workshop Report [2]
Conduct of CARICOM Second Technical Workshop on a Common Framework for a Literacy Survey | Jun 2012 | Second Workshop Report [3]
Review of Literacy Assessment Options (ISRS, LAMP, PIAAC, Bow Valley Web-Based option) | Oct 2012 | Phase I Draft Report, Chapter 2 [4]
Review of country experiences relative to the ISRS, LAMP, PIAAC and Bow Valley Web-Based options | Oct 2012 | Phase I Draft Report, Chapter 2 [4]
Review of the work undertaken in the conduct of the Adult Literacy and Life Skills Survey in Bermuda, the work undertaken in Saint Lucia and the work proposed in Dominica in the area of Literacy Assessment | Oct 2012 | Phase I Draft Report, Chapter 3 [4]
Assessment of the survey capacity of countries | Oct 2012 | Phase I Draft Report, Chapter 5 [4]
Preparation of Draft Report on Phase I activities, which includes all above-mentioned activities | Oct 2012 | Phase I Draft Report [4]
Preparation of Plan of Action | Oct 2012 | Common Framework for a Literacy Survey (including the Plan of Action) [5]
Preparation of documentation on the Common Framework for a Literacy Survey | Apr 2013 | Common Framework for a Literacy Survey (including the Plan of Action) [5]

[1] See Annex F
[2] See Annex EI
[3] See Annex EII
[4] Same as in this Report (Final Report)
[5] See Annex G
LIST OF REFERENCES
DataAngel Policy Research (2008). The Adaptation of DataAngel's Literacy and
Numeracy Assessment to the Needs of Small Island States.

DataAngel Policy Research (2008). A National Literacy and Numeracy
Assessment for Saint Lucia: A National Planning Report.

Kirsch, I.S., and Mosenthal, P.B. (1994). Interpreting the IEA Reading Literacy
Scales. In M. Binkley, K. Rust, and M. Winglee (Eds.), Methodological Issues
in Comparative Educational Studies: The Case of the IEA Reading Literacy
Study. Washington, DC: National Center for Education Statistics, United
States Department of Education.

OECD and HRSDC (1997). Literacy Skills for the Knowledge Society: Further
Results of the International Adult Literacy Survey.

OECD and Statistics Canada (2005). Learning a Living: First Results of the
Adult Literacy and Life Skills Survey.

OECD and Statistics Canada (2011). Literacy for Life: Further Results of the
Adult Literacy and Life Skills Survey.

Statistics Canada and OECD (1995). Literacy, Economy and Society: Results of
the First International Adult Literacy Survey.

Statistics Canada and OECD (2000). Literacy in the Information Age: Final
Report of the International Adult Literacy Survey.

UNESCO Institute for Education (1990). A Cross-National Symposium on
Functional Literacy. Hamburg.
ANNEX A: TERMS OF REFERENCE
1. BACKGROUND
Consistent, reliable and comparable statistical information is an
important ingredient for planning, monitoring and evaluation of policy
decisions and the lack thereof has long been considered as hampering
the effectiveness of public policy in the Caribbean Region. Consequently,
noting the lack and/or poor quality of literacy statistics in the Region,
and the perceived importance of this information for the Region’s
economic and social development in light of the advent of the Caribbean
Single Market and Economy (CSME), Statisticians of the Caribbean
Community (CARICOM) decided to pursue the development of a common
strategy and methodology for the collection and analysis of social and
gender statistics including educational statistics.
Aware of these challenges and of the shortcomings of the existing approaches
to measuring literacy, the CARICOM Advisory Group on Statistics (AGS)
is attempting to develop a common framework for the production,
collection and analysis of literacy data. The AGS is charged with the
mandate to guide the improvement in the range and quality of statistics
and statistical infrastructure in the Region; it is comprised of the
Directors of the Statistical Offices of eight CARICOM Member States
which participate on a rotating basis and two members of the CARICOM
Secretariat. The AGS recommended the use of the methodology in the
“Literacy Assessment and Monitoring Program (LAMP)” developed by
UNESCO which has been tested in a number of countries. In 2006,
during the Thirty-First Meeting of the Standing Committee of Caribbean
Statisticians (SCCS), the decision was taken to develop a regional
strategy to assist Members States in its implementation. The LAMP is a
recent tool developed by UNESCO with support from Statistics Canada
and the Educational Testing Service (ETS) to measure functional
literacy.19
The objectives of the LAMP are "to develop a methodology for providing
data on the distribution of the literacy skills of adults and young people
in developing countries; to obtain high quality literacy data in
participating countries and to promote its effective use in formulating
national policy, in monitoring and in designing appropriate programme
interventions to improve literacy levels; and to build national capacities
in the measurement of literacy, to develop and use valid and reliable
LAMP data and methodologies".20 In sum, employing the LAMP and the
International Survey of Reading Skills (ISRS) approaches to
measurement would enable the production of reliable and comparable
data on literacy levels across all CARICOM Member States and inform
decision-makers as to the interventions and requirements needed to
improve literacy.

19 The LAMP approach is based on the International Survey of Reading Skills (ISRS), which was
developed and fielded by Statistics Canada and the United States National Centre for Education Statistics.
This initiative is consistent with other CARICOM efforts to
harmonize regional statistics including the development of a common
framework for statistics production in CARICOM, a Common Census
Framework and the development of a Strategic Framework for Statistics.
To achieve the project’s objectives, the initiative has the following three
components described in this consultancy as phases: (i) establishment of
a regional framework for conducting and adapting Literacy assessment
models for the facilitation of a regional assessment for the execution of
the Literacy Survey treating each country as a sub-population;
(ii) development and adaptation of LAMP instruments, such as survey
instruments (questionnaires), training manuals, and related materials, to
inform about the survey, documentation on the concepts and definitions,
scoring of the assessment and on the sampling approach, data
dissemination/tabulation format, as part of the common framework; and
(iii) development of a template for the national implementation plans
using a common questionnaire, field test procedures for establishing the
psychometric and measurement property of the survey instrument, and
confirmation of key aspects of survey cost and quality.
2. OBJECTIVES
The general objective of this consultancy is to establish a common
framework involving a regional approach for conducting the literacy
assessment, the development of literacy assessment instruments and
the provision of technical assistance for the development of national
implementation plans for the conduct of literacy assessments, based on
the agreed common framework, instruments and other documents. The
goal is to support informed policy making to meet the Education for All
and the MDG goals, as well as to support the integration process that is
part of the Caribbean Single Market and Economy (CSME).
The specific objectives can be divided into three phases as follows:
20 Literacy Assessment design: the design of the study would have six phases, including the (i) development
and approval of a national implementation plan; (ii) development and certification of the content and design of
survey documents in relevant languages; (iii) conduct of a field test to establish the psychometric and
measurement properties of the survey instruments and to confirm key aspects of survey cost and quality;
(iv) processing and analysis of field test results; (v) administration of the final instruments to a probability
sample of the adult population (16 years and up); and (vi) processing, analysis and reporting of the main
assessment results.
Phase I:
To undertake a review of the ISRS and LAMP approaches and to apply
these approaches to a regional context through the production of a
relevant draft methodological framework for use in CARICOM.
Phase II:
To undertake the preparation of all survey instruments, sample design,
training materials etc. required to apply the literacy assessment in a
regional/national context in CARICOM.
Phase III:
To provide support to CARICOM member countries in the identification
of requirements, timelines etc. as part of a strategic action plan to
undertake at least two literacy assessment surveys.
3.
SCOPE OF WORK
The Consultant will be required to undertake the following activities to
satisfy the objectives of the assignment:
Phase I Activities:
a. Engage in a Briefing Meeting with the Statistics Sub-Programme, CARICOM
Secretariat to discuss the scope of work of the project.
b. Review the ISRS and LAMP methodologies and identify any problems
in application to CARICOM and adjustments that would be required
including the following:
i. Review the ISRS and LAMP methodologies in detail;
ii. Review the experience of countries in which the ISRS and
LAMP have been conducted/pilot-tested;
iii. Review specifically the work undertaken in the conduct of the
Adult Literacy and Life Skills Survey in Bermuda, the work
undertaken in Saint Lucia and the work proposed in Dominica
in the area of Literacy Assessment;
iv. Collect the requisite information to draft recommendations for
a common approach to the conduct of a literacy survey in
CARICOM (actions required, common content, sampling
approach, data collection, data processing, literacy scoring,
weighting, coding hardware/software requirements, publicity,
training requirements etc.);
v. Consult with Member States to determine the major
considerations to be taken on board in (iv);
c. Prepare, on the basis of the comprehensive review, a detailed draft
report comprising the assessment of the reviews undertaken on the
ISRS and the LAMP, the individual country assessments, information
obtained from Member States and a detailed methodological
approach with recommendations, adjustments required, actions to be
taken and the actual methodology to be utilised;
d. Prepare a plan of action outlining the sequence of steps required to
achieve the recommendations in (c) and the estimate of costs
involved;
e. Present findings and recommendations of the Phase I project
activities to the meeting of the Advisory Group on Statistics and to
workshops organised in the course of the project to discuss the
findings;
f. Prepare and submit a final report on Phase I to the Secretariat
incorporating all comments/outputs from the meetings, the
Secretariat and other stakeholders;
g. Support the Secretariat and the AGS in the preparation of the final
Common Framework for Literacy.
Phase II Activities:
a. Prepare, on the basis of the Common Framework for Literacy Assessment
in CARICOM produced in Phase I, all instruments, related documents and
frameworks required, and any adjustments that would be required,
including the following:
i. Literacy Survey Questionnaires, including the questionnaire for
pilot-testing, a screening questionnaire (to select respondent(s)
in the household if required), the main questionnaire (for
assessment), scoring sheets etc.;
ii. Methodology Guide containing concepts and guidelines on
literacy assessment domains, data collection procedures,
timelines for data collection etc.;
iii. Training Manuals for Interviewers, Supervisors and Trainers;
iv. Detailed sample design, sampling frame, target population,
detailed sampling method for households and respondents etc.;
v. Guidelines for scoring and weighting results and for the
treatment of non-responses;
vi. Guidelines for data processing, including data capture, data
editing and coding, data verification, data dictionary, data
entry forms, editing rules etc.;
vii. Tabulations to be produced;
viii. Analytical guidelines and dissemination;
ix. Publicity material;
x. Consult with Member States to determine the major
considerations to be taken on board in (i)-(ix);
c. Prepare, on the basis of the documents/materials produced in (a), a
draft report of the Phase II activities including adjustments required
based on feedback from the AGS, Member States and the Secretariat;
d. Prepare and submit a final report of Phase II to the Secretariat
incorporating all comments/outputs from the meetings, the
Secretariat and other stakeholders;
e. Support the Secretariat and the AGS in the finalisation of any of the
documentation for use in the creation of the Common Framework for
Literacy.
Phase III Activities:
a. Review the agreed Common Framework for Literacy Assessment in
CARICOM and the instruments and scope of work required in the
undertaking of the literacy assessment;
b. Design a draft template for the preparation of the National
Implementation Plan to include the following:
i. List of all activities to be undertaken as contained in the
detailed common literacy framework and documentation to be
obtained as per the survey instruments, including pilot-test
and quality assurance;
ii. Cost estimates of all activities;
 Staffing requirements (no. of interviewers, supervisors,
scorers);
 Training requirements, specifically scientific
measurement/data collection;
 Estimated timeframe (start dates, end dates/scheduling
of all activities);
 Procurement plans (goods and consultancy services);
 Responsible parties;
 Comprehensive Publicity Programme;
 Approach to sustainability of the process.
c. Prepare a cost estimate for the conduct of two Literacy Assessments
in each CARICOM country, which should include the following
considerations:
i. Survey costs (fees/stipends for interviewers, supervisors,
trainers, scorers, editors, coders etc.);
ii. Corresponding travel costs;
iii. Printing/production of questionnaires and all relevant
manuals and documentation;
iv. Advertisements/publicity;
v. Data processing, tabulation, analysis and dissemination;
vi. Other related costs for conducting the exercise.
d. Collect information from member countries with regard to resources
available at the national level, staffing, budget, collaborating partners
(Statistical Office, Ministry of Education and other relevant agencies)
and other relevant information related to capacity
availability/constraints that can inform the template to be prepared;
e. Provide support to countries, through workshops or otherwise, in the
preparation of the implementation plans;
f. Prepare, on the basis of (a) to (e), a Draft National Implementation
Plan with the major components required for implementation of the
Literacy Assessment survey;
g. Prepare a draft report of the Phase III activities including the national
implementation plan template;
h. Prepare and submit a final report of Phase III to the Secretariat
incorporating all comments/outputs from the meetings, the
Secretariat and other stakeholders;
i. Prepare an overall summary report of all phases of the Consultancy.
4. EXPECTED OUTPUTS
The expected outputs for each phase are as follows:
Phase 1:
a. A Draft Report of the findings of the Phase 1 activities. The report will
include all assessments undertaken, including individual country
assessments, and recommendations on what actions and adjustments
are required to the ISRS and LAMP methodology;
b. A Plan of Action outlining the sequence of steps required to achieve
the recommendations of the Phase 1 activities, item (e), and
approximate costs involved;
c. A Final Report of Phase 1 incorporating all comments/outputs
from meetings, the Secretariat and other stakeholders and the
activities as implemented under Phase I.
Phase 2:
a. All documents and material as contained in Phase 2 activities, item
(a) above;
b. A Draft Report of the Phase 2 activities describing work put in place
and adjustments required;
c. A Final Report of the Phase 2 activities incorporating all
comments/outputs from meetings, the Secretariat and other
stakeholders.
Phase 3:
a. National Implementation Plans/templates;
b. A Draft Report of the Phase 3 activities describing work put in place
and adjustments required;
c. A Final Report of the Phase 3 activities incorporating all
comments/outputs from meetings, the Secretariat and other
stakeholders;
d. A summary report of all three phases of the project.
5. TIMELINE
The overall duration of the entire activity is 140 days, comprising 25 days
for Phase 1, 60 days for Phase 2 and 55 days for Phase 3. The
breakdown for each phase is given below.
Phase 1:
The expected time for this phase is approximately 25 person days over a
period of four months. The timetable is as follows:
a. Preparatory period (3 days);
b. Travel to the CARICOM Secretariat in Georgetown, Guyana to
attend an initial briefing meeting, and to selected CARICOM
countries (2 days);
c. Conduct of the review of the ISRS and LAMP (10 days);
d. Preparation of the draft report with recommendations (4 days);
e. Attendance at meetings/workshops as required (3 days);
f. Support to the Secretariat/AGS in the drafting of the final common
framework (3 days).
Phase 2:
The expected time for this phase is approximately 60 person days over a
period of six months, with at least 23 days being spent in the beneficiary
countries. The timetable is as follows:
a. Preparatory period (5 days);
b. Travel to the CARICOM Secretariat in Georgetown, Guyana to attend an
initial briefing meeting, and to selected CARICOM countries (5 days);
c. Production of draft copies of all instruments and materials required to
conduct the literacy assessment, including the sampling design (30 days);
d. Preparation of the draft report with recommendations;
e. Preparation of final versions of the instruments as required based on
feedback from the Secretariat and all stakeholders (10 days);
f. Preparation of a final consultancy report (5 days).
Phase 3:
The expected time for this phase is approximately 55 person days
over a period of four months. The timetable is as follows:
a. Preparatory period (5 days);
b. Travel to the CARICOM Secretariat in Georgetown, Guyana to
attend an initial briefing meeting, and to selected CARICOM
countries (5 days);
c. Production of draft National Implementation Plans/Templates (30
days);
d. Preparation of the draft report with recommendations as required
(5 days);
e. Preparation of final National Implementation Plans/Templates (5
days);
f. Preparation of a final consultancy report (3 days);
g. Preparation of an overall summary report on all phases (2 days).
6. QUALIFICATIONS OF THE CONSULTANT
The Consultant should possess a Master's Degree in the Social Sciences
in areas such as Sociology, Economics, Statistics or any other relevant
discipline. The Consultant should have at least 6 years of experience
in statistics (educational statistics); 8+ years of experience in the planning,
execution and analysis of surveys; familiarity with the LAMP and ISRS
methodologies (preferred); excellent written and oral English communication
skills, with a demonstrated ability to assess complex situations in order
to concisely and clearly filter critical issues and draw conclusions; and
excellent facilitation skills.
ANNEX B: COUNTRY ASSESSMENT
QUESTIONNAIRE
CARICOM Regional Public Good Common Literacy Framework
Project
Questionnaire for Member States
CONFIDENTIAL When Completed
The following questionnaire has been developed to collect information in
support of the IDB-financed CARICOM Regional Public Good Common
Literacy Framework Project.
The questionnaire seeks to identify Member States':
(i) Needs and priorities with respect to literacy and numeracy data;
(ii) Operational, financial and technical capacity and need for support.
The questionnaire will assist Member States in thinking about their
national information needs and their capacity to field a household-based
skills assessment, knowledge that will inform the completion of a
national planning report.
Questionnaires will also help identify what type of technical assistance
Member States might need.
A. Identification:
Name:
__________________________
Designation: _________________________
Organizational affiliation:________________________
Address:
_________________________
_________________________
Telephone: ____________________________
Email:
___________________________
B. Data needs and priorities
Adult skills assessments can be designed to serve a range of purposes,
including knowledge generation, policy and planning, monitoring,
evaluation and program administration. Knowledge generation involves
generating new scientific insights, including understanding cause and
effect. Monitoring implies the collection of repeated measures over time.
The design of any assessment must be adapted to support each of
these purposes.
B1. For which of the following purposes does your country require
adult literacy and numeracy data? Indicate all that apply and
rank in order of importance, i.e. 1 = most important and
5 = least important.
__ Knowledge generation
__ Policy and program planning
__ Monitoring
__ Evaluation
__ Program administration
Adult skills assessments can be designed to serve a range of policy
departments.
B2. Which of the following policy departments require adult
literacy and numeracy data? Indicate all that apply and rank
in order of importance, i.e. 1 = most important and 8 = least
important.
__ Kindergarten to Grade 12 education
__ Adult education
__ Labour
__ Finance/Treasury
__ Language and culture
__ Social
__ Prime Minister's Office
__ Other (specify) ______________________________
Adult skills assessments can be used to address a wide range of
policy issues.
B3. Which of the following policy issues require adult
literacy and numeracy data? Indicate all that apply and
rank in order of importance, i.e. 1 = most important and
15 = least important.
___ Improving the quantity of primary and secondary education
___ Improving the quality of initial education
___ Improving the equity of initial education
___ Improving the efficiency and effectiveness of initial education
___ Improving the quantity of tertiary education
___ Improving the quality of tertiary education
___ Improving the equity of tertiary education
___ Improving the efficiency and effectiveness of tertiary education
___ Improving the quantity of adult education
___ Improving the quality of adult education
___ Improving the equity of adult education
___ Improving the efficiency and effectiveness of adult education
___ Reducing social and economic inequality
___ Improving labour productivity and competitiveness
___ Improving health
B4. Has a source(s) of funding been identified to support the
implementation of a national literacy and numeracy
assessment?
O Yes    O No
C. Operational Capacity
The implementation of adult skills assessments places significant
demands on the operational capacity of NSOs. The following questions
will help evaluate whether Member States have the capacity to undertake
an assessment.
Experience in executing a literacy survey
C1. How many staff members have experience in literacy surveys?
Number: I__I__I
If zero, skip to C3.
C2. In what areas do they have experience? MARK ALL THAT
APPLY
o Planning
o Sampling
o Data collection
o Data entry/data capture
o Coding
o Editing
o Data analysis
o Other (specify) ________________
Collection capacity
C3. How many trained interviewers do you have on staff?
Number: I__I__I
C4. What is the total monthly collection capacity (in total
number of interview hours)?
Number of hours: I__I__I
C4a. What proportion of this capacity is being utilized by the
collection of the regular statistical program?
Enter percent: I__I__I
C5. How many field supervisors do you have on staff?
Number: I__I__I
C6. What is the average daily training fee paid to
interviewers, field supervisors and senior interviewers?
Interviewers: $_________
Field supervisors: $_________
Senior interviewers: $_________
Data capture capacity
C7. How many data entry clerks do you have on staff?
Number: I__I__I
C8. What is the average capacity of your data entry clerks in
terms of number of keystrokes per day?
I__I__I__I__I__I__I__I__I__I keystrokes/day
Coding capacity
C9. How many statistical coding clerks do you currently
have on staff?
Number: I__I__I
Editing
C10. How many programmers do you have on staff?
Number: I__I__I
C11. How many field editors do you have on staff?
Number: I__I__I
Analysis
C12. How many staff members with statistical analysis
experience do you have on staff?
Number: I__I__I
C13. Which of the following analysis tools does your staff have
working experience in? MARK ALL THAT APPLY
O SAS
O SPSS
O Excel
O Other (specify) _________________
C14. What types of analysis can you support?
O Tables
O Simple regressions
O Multi-level/multivariate regressions
Technical Capacity
Sampling
C15. Do you have a sampling statistician on staff?
O Yes    O No
C16. Does your office have the capacity to select a multi-stage,
stratified probability sample?
O Yes    O No
Weighting
C17. Does your office have the capacity to weight survey
records?
O Yes    O No
Variance estimation
C18. Does your office have the capacity to calculate variance
estimates based on complex survey designs?
O Yes    O No
C19. Does your office have the capacity to calculate variance
estimates using replicate weights?
O Yes    O No
Graphics
C20. Does your office have the capacity to use InDesign, the
software used to generate the test booklets?
O Yes    O No
Computers in collection
C21. What proportion of the country has access to high-speed
internet?
I__I__I__I%
C22. How many of your interviewers are able to do simple
tasks on a computer (e.g. create a Word document, buy
products on a website, create an Excel spreadsheet, send
email)?
Number of interviewers: I__I__I__I
C23. How many of your interviewers have experience with
computer-assisted personal interviewing?
Number of interviewers: I__I__I__I
ANNEX C: COSTING FOR SAINT LUCIA'S LITERACY
SURVEY PILOT
The following spreadsheet provides an overview of how the Saint Lucia
assessment was costed. This template was used to derive the cost
estimates for the paper and pencil options presented in this report.
Actual costs will vary depending on national wage structure. The
computer-based options include hardware and software acquisition costs
and license fees but exclude internet usage fees.
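For illustration, the arithmetic behind the template can be sketched as follows: each line item is a quantity multiplied by a unit rate, line items roll up into category subtotals, and a 10 percent margin of error is applied to the project total. The line items below are a small illustrative subset of the figures in the spreadsheet that follows; this is a sketch of the template's mechanics, not the official costing tool.

```python
# Sketch of the costing template's arithmetic: quantity x unit rate per line
# item, category subtotals, a grand total, and a 10% margin of error.
cost_template = {
    "Core project team": [
        # (line item, quantity, unit rate in $)
        ("Project manager", 60, 125.0),        # 60 days @ $125
        ("Sampling statistician", 20, 100.0),  # 20 days @ $100
        ("Programmer", 20, 200.0),             # 20 days @ $200
    ],
    "Pilot collection": [
        ("Interviewer training", 15, 200.0),   # 15 interviewers @ $200
        ("Data collection", 450, 100.0),       # 450 cases @ $100/case
    ],
}

MARGIN = 0.10  # 10% margin of error, as applied to the project total

subtotals = {
    category: sum(quantity * rate for _, quantity, rate in items)
    for category, items in cost_template.items()
}
total = sum(subtotals.values())

for category, subtotal in subtotals.items():
    print(f"{category}: ${subtotal:,.2f}")
print(f"Total: ${total:,.2f}")
print(f"Total including {MARGIN:.0%} margin of error: ${total * (1 + MARGIN):,.2f}")
```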
Saint Lucia Cost Estimates

Item | Resources | Cost

Core project team
Project manager | 60 days @ $125 | $7,500.00
Sampling statistician | 20 days @ $100 | $2,000.00
Programmer | 20 days @ $200 | $4,000.00
InDesign | 5 days @ $250 | $1,250.00
Revise collection manuals | 5 days @ $100 | $500.00
Revise BQ | 5 days @ $100 | $500.00
Prepare for pilot training | 5 days @ $100 | $500.00
Give pilot training | 5 days @ $150 | $750.00
Prepare for main training | 3 days @ $100 | $300.00
Give main training | 5 days @ $150 | $750.00
Attend framework/adaptation training | 3 days @ $100 x 3 | $900.00
Attend task admin/scoring training | 6 days @ $125 x 3 | $2,250.00
Supervise pilot and main data collection | 150 days @ $100 | $15,000.00
Total core team | | $36,200.00

Pilot collection
Interviewer training | 15 @ $200 | $3,000.00
Print interviewer manuals | 200 pages x $.50/page x 50 | $5,000.00
Batch assignments | 7 hours | $100.00
Data collection | 450 cases @ $100/case | $45,000.00
Scoring | 15 minutes/case | $900.00
Print booklets | 500 x $5 | $2,500.00
Calculators | 20 @ $15 | $300.00
Recorders | 20 @ $50 | $1,000.00
Timers | 20 @ $5 | $100.00
Print flip books | 100 @ $15 | $1,500.00
Print score sheets | 500 @ $.50 | $250.00
Re-score 100% | 15 minutes/case | $900.00
Coding ISIC ISCO ISCED | | $1,500.00
Edit | 20 days @ $125 | $2,500.00
Batteries | | $250.00
Data capture scores | | $100.00
Data capture codes | | $50.00
Data capture BQ | | $750.00
Print BQ | 500 @ $10 | $5,000.00
Clerical support | 20 days @ $75 | $1,500.00
Admin support | 20 days @ $75 | $1,500.00
Print task admin guide | 20 @ $10 | $200.00
Total pilot | | $73,900.00

Main collection
Interviewer training | 5 days @ $200/interviewer, 45 interviewers | $9,000.00
Print interviewer manuals | 200 pages x $.50/page x 50 | $5,000.00
Data collection | 4,500 cases @ $100/case | $450,000.00
Scoring | 15 minutes/case, 4,500 cases | $11,250.00
Print booklets | 5,000 @ $5 | $25,000.00
Calculators | 35 @ $15 | $525.00
Recorders | 35 @ $50 | $1,750.00
Timers | 35 @ $15 | $525.00
Print flip books | 100 @ $15 | $1,500.00
Print score sheets | 500 @ $.50 | $2,500.00
Re-score | 100% | $11,250.00
Coding ISIC ISCO ISCED | 15 minutes/case, 4,500 cases | $5,625.00
Batteries | | $2,500.00
Print BQ | 5,000 @ $10 | $50,000.00
Print task admin guide | 50 @ $10 | $500.00
Data capture scores | 4,500 score sheets | $1,000.00
Data capture codes | 4,500 coding sheets | $500.00
Data capture BQ | | $15,000.00
Edit | 20 days @ $125 | $2,500.00
Map to international record layout | 3 days @ $125 | $375.00
International rescore | | $5,000.00
Weighting and variance estimation | 5 days @ $125 | $625.00
Total main collection | | $601,925.00

International overheads
Review NPR (national planning report) | | $1,500.00
Training frameworks/adaptation | | $10,000.00
Training task admin/scoring | 3 experts | $12,000.00
Quality assurance adaptation | | $1,500.00
Psychometrics pilot | | $10,000.00
Psychometrics main | | $20,000.00
Pilot meeting | | $10,000.00
Small area estimates | | $25,000.00
Sampling | vet design, weighting, reps | $5,000.00
Analysis and reporting | | $25,000.00
General management | | $15,000.00
Main meeting | | $10,000.00
Total international overheads | | $145,000.00

Total project | in US$ | $393,950.50
Total project | in EC$ | $1,105,975.50
Total project including 10% margin of error | in EC$ | $1,216,573.05
Total available budget | in EC$ | $1,300,000.00
ANNEX D: SMALL AREA ESTIMATION
The methods employed in common assessments
The implementation of a common assessment is predicated on the
assumption that the cost and operational burden associated with a full
assessment is too high for the majority of Member States to bear. The
common assessment is designed to reduce the cost and operational
burden of the assessment by reducing the sample size to 1,000 cases
without sacrificing the scientific integrity of the skill measures. Data
from the 1,000 cases are then used in three ways.
First, the proficiency data are used to generate a small number of point
estimates (e.g. the average prose literacy, document literacy and
numeracy scores, the score distributions, and the distributions by
proficiency level). The number of reliable point estimates will depend on
the sample size and the actual distributions of characteristics.
Second, the data are used in a multivariate analysis that explores the
link between assessed skills and background variables, i.e. what variables
explain the observed differences in skill level and what impact skill
differences have on individual outcomes.
Third, a variant of the multivariate analysis is used to provide a set of
regression parameters that can be used to impute skill scores on to the
most recent Census.
Two forms of regression are used to generate the imputed values for each
skill domain and for membership in market segments:
o Logistic regression, where the dependent variable is the skill
level (1-5).
o Ordinary Least Squares (OLS) regression, where the dependent
variable is the skill score.
The regression variables used in this analysis are restricted to those that
have been shown to have an impact on skill and that are available on
both the assessment background questionnaire and the Census file. For
example, the recent Canadian application of this approach used the
following variables in the imputation:
o Gender
o Education level: 5 categories
o Age group
o Mother tongue: English, French, Multiple and Other
o Province
o Labour force status: Employed, Unemployed, Not in Labour
Force
o Occupation: 10 categories
o Aboriginal: Yes/No
o Immigrant: Yes/No
To impute actual score values within skill levels, the percentiles of
actual scores were mapped onto percentiles of predicted skill values:
o Using the survey data, the actual scores are compared to
predicted values (based on the OLS regression).
o This is done within each skill level in each domain, so one
can compare the percentiles of the actual scores with the
percentiles of the predicted values.
The imputation procedures for each individual on the Census microdata
file are as follows:
(i) A skill level (1-5) is imputed based on the logistic regression
coefficients. The imputed value is random, using not only the
coefficients but also the variance/covariance matrix.
(ii) A preliminary score is imputed based on the OLS regression.
This score may not be in the appropriate range for the imputed
proficiency level.
(iii) This preliminary score is converted into a final score as follows:
o The preliminary score is converted into a percentile of
predicted scores (based on the common assessment analysis)
within the imputed level.
o This percentile is used to pick an actual score from the
common assessment at this percentile, within the
proficiency level.
o This actual score is the imputed score.
(iv) This imputation is repeated 10 times so that the variance in the
various literacy scores and levels can be estimated.
A minimal illustrative sketch of this procedure appears below.
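The following is a minimal, self-contained sketch of this procedure in Python. The coefficient vectors, variance/covariance matrices and survey reference tables are randomly generated placeholders standing in for the fitted logistic and OLS regressions from the common assessment; the sketch illustrates the mechanics of the imputation only and is not the production code used in the Canadian application.

```python
# Sketch of the two-stage census imputation with percentile mapping.
import numpy as np

rng = np.random.default_rng(42)
N_LEVELS = 5        # skill levels 1-5
N_IMPUTATIONS = 10  # repeated 10 times to support variance estimation
P = 4               # number of background predictors (placeholder)

# Placeholder regression outputs (in practice, fitted on the survey data):
logit_beta = rng.normal(size=(N_LEVELS, P))   # level-model coefficients
logit_cov = np.eye(N_LEVELS * P) * 0.01       # their variance/covariance
ols_gamma = rng.normal(size=P)                # OLS score coefficients
ols_cov = np.eye(P) * 0.01

# Per-level reference tables from the common assessment: sorted predicted
# and actual scores of survey respondents, used for the percentile mapping.
survey_pred = {k: np.sort(rng.normal(200 + 25 * k, 20, 300))
               for k in range(1, N_LEVELS + 1)}
survey_actual = {k: np.sort(rng.normal(200 + 25 * k, 15, 300))
                 for k in range(1, N_LEVELS + 1)}


def impute_once(x):
    # Step (i): impute a level, drawing coefficients from their
    # variance/covariance matrix so estimation uncertainty is reflected.
    b = rng.multivariate_normal(logit_beta.ravel(), logit_cov).reshape(N_LEVELS, P)
    u = b @ x
    p = np.exp(u - u.max())
    p /= p.sum()
    level = int(rng.choice(np.arange(1, N_LEVELS + 1), p=p))

    # Step (ii): impute a preliminary score from the OLS regression.
    g = rng.multivariate_normal(ols_gamma, ols_cov)
    prelim = 200.0 + g @ x  # intercept folded in as a placeholder

    # Step (iii): percentile of the preliminary score among predicted
    # scores within the level, then the actual score at that percentile.
    pred = survey_pred[level]
    pct = np.searchsorted(pred, prelim) / len(pred)
    actual = survey_actual[level]
    idx = min(int(pct * len(actual)), len(actual) - 1)
    return level, actual[idx]


# Step (iv): repeat for one census record so variances can be estimated.
x = rng.normal(size=P)  # one record's background variables (placeholder)
scores = np.array([impute_once(x)[1] for _ in range(N_IMPUTATIONS)])
print(f"Imputed scores: mean={scores.mean():.1f}, sd={scores.std(ddof=1):.1f}")
```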
Experience suggests that the variables that are available on both the
Census and the common assessment capture roughly 70 percent of the
variance in skill score. Thus, individual scores are quite error-prone.
Application of these methods in Canada confirms that they faithfully
reproduce the true skill distributions and yield useful information. The
associated errors fall rapidly as proficiency results are accumulated by
population subgroup: once the size of a given target population subgroup
exceeds 500, the results meet all standard tests for reliability. Estimates
for smaller subgroups are more error-prone but may still be useful for
policy.
ANNEX E: REPORTS ON THE CARICOM TECHNICAL
WORKSHOPS ON THE COMMON FRAMEWORK FOR A
LITERACY SURVEY
ANNEX EI: FIRST WORKSHOP REPORT
ANNEX EII: SECOND WORKSHOP REPORT
ANNEX F: INCEPTION REPORT
ANNEX G: COMMON FRAMEWORK WITH PLAN OF ACTION