Youth Literacy in Canada: Comparisons with the Past and Expectations for the Future

David A. Green and W. Craig Riddell
Department of Economics
University of British Columbia
June 2007

Human capital has increasingly come to be seen as a key determinant of an economy's success. In a world with rapidly changing technologies, it is argued, having a workforce that is both skilled and flexible enough to adopt new technologies is key. Thus, the best predictor of how an economy will perform in the future may well be the skill level of its youth. Or, to reverse the argument, a country that is not doing a good job of creating skills among its new generations is likely to have trouble in the future. In this paper, we examine a set of direct measures of youth skills in Canada: the scores of a representative sample of youth on tests designed to measure their literacy, numeracy and problem solving abilities in a number of dimensions. The availability of direct measures of literacy skills is a huge advantage in a realm where indirect measures of skills (such as years of schooling) are usually all that are available. Literacy skill measures also have the advantage of measuring something that is of direct interest in and of itself (as opposed to being of interest only because of its role in promoting economic growth). Sen (1999) argues that individuals need a set of capabilities in order to function as equal members of society. Key among these capabilities is literacy since it opens up opportunities to take part in political discourse as well as opportunities in many other fora. Thus, we are interested in characterizing youth literacy and inequality in youth literacy because it is a direct measure of social equity. A key issue in discussing literacy among the current generation of youth in Canada will be establishing a benchmark.
The literacy scores we report, and even the degree of inequality in them, will have little meaning in themselves.1 We use two main types of benchmarks in this paper. The first consists of earlier generations of Canadians. Our prime group of interest will be what we will call the current generation of youth, consisting of individuals aged 16 to 25 in 2003. We will compare their literacy outcomes to those of the directly preceding generation of youth (people who were aged 16 to 25 approximately a decade earlier) and to generations just before that. A key feature of our investigation will be our attempt to separate differences across cohorts from the effects of ageing on literacy levels. We could compare the literacy of current youth to that of the directly preceding cohort by examining the literacy of youth in 2003 and the literacy of 26 to 35 year olds in 2003, since the previous cohort of youth is observed in the latter age range in 2003. However, such a comparison will reflect both differences across cohorts and the effects of ageing on literacy. That is, we would not know whether the literacy of 26 to 35 year olds in 2003 is different because literacy levels are permanently different across cohorts or because this group is older when we observe them. To address this problem, we use a combination of datasets, one from 1994 and one from 2003. In this way, we can compare literacy levels of the current and previous generations at times when they were both youth. Moreover, we can follow the literacy outcomes of the previous generation across the two datasets, allowing us to establish the impact of ageing on literacy. Our second benchmark consists of literacy outcomes for youth in two other countries: Norway and the US. In part, this choice of comparator countries is dictated by data availability, but it also has some fortuitous elements. In particular, it is well known that the Nordic countries fare particularly well both in terms of literacy levels and literacy equality. Thus, a comparison with Norway sets a high standard. The US typically does not fare as well in international comparisons but is Canada's main economic partner and a point of constant comparison for Canadian outcomes of all sorts. Given concerns about the international comparability of test data, particularly tests given in different languages (Blum et al (2001)), we do not place a heavy emphasis on comparisons across the three countries in a particular year. Instead, we are interested in whether the patterns we observe across time in Canada are also seen over time within the other two countries. Data holds a place of central importance in our investigation. We make use of the unique International Adult Literacy surveys (IALS) which combine extensive survey questions on respondent backgrounds and behaviours with scores on four broad literacy tests. Particularly important for our investigation is the fact that the IALS literacy tests have been specifically designed to be comparable over time and across countries.

[Footnote 1: The literacy data used here do include indicators for a set of 5 literacy "levels" or ranges of literacy scores. These are then given an interpretation, e.g., "Level 3 is the desired threshold for coping with the rapidly changing skill demands of a knowledge-based economy and society" (The Daily, November 30, 2005). We view these characterizations of the associated ranges of literacy scores as having limited empirical basis (see Blum et al (2001)) and choose to focus on examining the whole literacy score distribution rather than artificially generated subsets of it.]
Thus, we are able to use direct comparisons of scores from the 1994 and 2003 versions of the IALS for Canada to examine cohort and ageing effects, and to use comparisons to the 1994 and 2003 IALS for the US and the 1998 and 2003 IALS for Norway to construct consistent international comparisons. As we mentioned earlier, though, there is some degree of contention about the cross-country comparability of these data, a point we discuss in more detail in the paper. Our investigation generates the following set of key conclusions. Canada's current youth have generally lower literacy levels than previous generations of Canadians. More precisely, the probability that current Canadian youth suffer low levels of literacy is either no different from or slightly lower than that for previous generations. However, the probability that they attain high levels of literacy is decidedly lower than for previous generations, and this disparity increases as we move higher in the literacy distribution. This relatively inferior performance seems to us to be a cause for concern. A second key conclusion is that literacy as measured on these tests declines with age after leaving school. In some ways this is not surprising. Many parents have had the experience of having their children spout facts or mathematical calculations about which they have vague recollections from their school days but can no longer truly remember. This may reflect a "use it or lose it" model of literacy in which literacy skills obtained during school atrophy with lack of use after leaving school. Whether the result is expected or not, though, it implies that if current youth are at relatively low levels of literacy today, they are only going to move to even lower levels over time. In terms of international comparisons, Canada falls about midway between Norway and the US both in terms of literacy levels and the extent of inequality in their literacy distributions.
Thus, there is potentially much to learn from the Norwegians, but we do appear to have an advantage over the Americans. Interestingly, all three countries show the same pattern of literacy loss with age. Thus, whatever Norway is doing better, it seems not to have to do with institutions and opportunities associated with maintaining literacy levels after leaving school. Or, to put it in the current policy vernacular, there is no reason to think, based on literacy test scores, that Norway is better at "lifelong learning" than Canada. In terms of cross-cohort patterns, the US shows much the same pattern as Canada while the Norwegian data do not show any particular pattern of differences across cohorts. Thus, whatever Norway is doing right, it has been doing it consistently for a while. Both Canada and the US, on the other hand, appear to face a growing problem with each successive generation. Our investigation proceeds in sections. In section 2, we provide a brief overview of the data. In section 3, we characterize the distribution of youth literacy in Canada and provide direct comparisons with older age groups in 2003. In section 4, we break our comparisons down into dimensions related to permanent differences across cohorts and ageing effects. In section 5, we examine the role of education and introduce regression-based examinations of the key patterns. In section 6, we present the literacy distributions for Norway and the US and make direct comparisons with Canada.

2) Data

Our data come from the International Adult Literacy and Skills Survey (IALS03): a combined survey and skills assessment carried out in several countries in 2003.2 We also use the International Adult Literacy Survey (IALS94), an earlier survey of literacy skills also carried out in a series of countries but in differing years for different countries. For Canada the earlier IALS was carried out in 1994 (hence our use of the abbreviation IALS94).
This is also the year of the comparable US dataset. However, Norway carried out its earlier version of the IALS in 1998. The IALS03 includes standard questions on demographics, labour force status and earnings, but it also attempts to measure literacy and related cognitive skills in four broad areas: Prose, Document, Numeracy, and Problem Solving (the latter is not included in the earlier IALS). Perhaps of most importance for our discussion, both the IALS03 and the earlier version of the IALS attempted to go beyond measuring basic abilities in math and reading to try to assess capabilities in applying skills to situations found in everyday life. Thus, the Prose questions in the survey assess skills ranging from items such as identifying recommended dosages of aspirin from the instructions on an aspirin bottle to using "an announcement from a personnel department to answer a question that uses different phrasing from that used in the text." The Document questions, which are intended to assess capabilities to locate and use information in various forms, range from identifying percentages in categories in a pictorial graph to assessing an average price by combining several pieces of information. The Numeracy component ranges from simple addition of pieces of information on an order form to calculating the percentage of calories coming from fat in a Big Mac based on a table. In part of the work that follows, we use comparisons between the 1994 and 2003 data to examine issues related to changes in literacy with age and differences across birth cohorts. Unfortunately, the Numeracy component of the tests changed substantially between the earlier version of the IALS and the 2003 survey and, as a result, we cannot make comparisons in this dimension.

[Footnote 2: The other countries participating in this first round of the IALSS03 were Bermuda, Italy, Mexico, Norway, Switzerland and the U.S. The earlier IALS survey was carried out in over 20 countries during the period 1994 to 1998.]
In contrast, the Document and Prose tests have substantial overlap in the two survey years, with approximately 45% of the questions being identical across years. Statistics Canada also renormalized test results from the remaining 55% of the questions in 2003 so that the overall average test scores from 2003 bore the same relationship to the overall average in 1994 as do the averages on the questions that are identical between the two years. Even with this, there is potential room for non-comparability of the test results across the years. As Osberg (2000) points out, the literacy scores in the IALS are constructed based on Item Response Theory. Essentially, in this approach, questions are rated on their level of difficulty on a scale with a maximum value of 500. The reported literacy score is a calculation for the individual of the level of difficulty a respondent is capable of answering correctly 80% of the time (with the accompanying assumption that they will get questions of lower levels of difficulty correct more often and questions with higher levels of difficulty correct less often). This calculation is based on answers to a set of literacy test questions but also involves some amount of imputation based on the individual's observable characteristics. Osberg states that this imputation can imply assigned literacy scores that are actually outside the range of difficulty of the test questions that are asked.3 Given this, even with questions that are well matched across surveys, we could observe changes in the literacy distribution between years arising simply because of a change in imputation methods. To the best of our knowledge, no such change in methods occurred. Moreover, while we document substantial changes in the literacy distribution between the two surveys for younger individuals, there are no such changes for individuals in the 46 to 55 age group (Green and Riddell (2006)).
If there were general changes in procedures or question difficulty, however, we would expect to see them for all age groups. Based on this, in the analysis that follows, we treat the Prose and Document test scores as perfectly comparable between the two survey years. The Canadian IALS94 sample contains observations on 5660 individuals while the Canadian IALS03 is substantially larger at 23038 individuals. More importantly for our purposes, the samples of youth (whom we define as individuals aged 16 to 25 in the sample year) contain 3574 individuals in 2003 and 1193 in 1994. We include both males and females throughout, dividing the analysis along gender lines in some places. Finally, we use the sample weights provided with the data in all tables and estimation. Literacy scores within each of the four domains are reported as 5 plausible values for each person. While documentation associated with the IALS recommends first calculating a given statistic for each plausible value and then averaging across the resulting statistics, much of our investigation focuses on averages and regressions, where the results are identical whether we first estimate the relevant statistic for each plausible score separately and then average, or we first average the scores and then estimate. Thus, we will focus on the average of the five plausible values for each individual throughout.

[Footnote 3: Though, this appears to be more of a problem in the lower than the upper tail of the distribution. Osberg (2000) presents evidence that 26% of Canadians in the 1994 IALS had imputed scores below the lowest difficulty rating of any question on one part of the test but only 0.5% had imputed scores above the highest difficulty rating of any question. This is important in our case since much of the movement we will discuss occurs in the upper tail.]

3) Youth Literacy in Canada

The first, immediate consideration in an examination of literacy among youth is to establish a benchmark.
One might imagine arguing that we would like it to be the case that everyone in society has the highest possible level of literacy. That is, our goal is to have everyone achieving the top score on literacy tests. This would both promote equality in social domains where literacy is important (such as participation in political discourse) and maximize the skill set available for use in production in the economy. In considering youth, our question would be whether we will be able to achieve this goal from this generation forward, and our measure of the job our education system (and our society, more generally) is doing in generating literacy is how far short of this goal we are falling. The main difficulty with trying to use this absolute benchmark as our measure of success is that it is difficult to know how to aggregate the various individual shortfalls relative to the benchmark. For example, consider a society of three people in which two are at the maximal literacy level while one falls 200 points short of the top literacy test score, and another society in which one person is at the maximal level and two others are each 100 points short of the top score. Should we view these societies differently in terms of literacy achievement? This is more than idle speculation since it is related to how we should spend societal resources in trying to meet literacy goals. If we feel that falling below some critical literacy level is a tremendous disadvantage in society while falling a bit short of the top level leads only to small inconveniences, then we would rate the first society as being much lower in terms of literacy success. We would also tend to focus our resources on raising the literacy of people below the critical threshold. Put another way, the distribution of literacy (as opposed to just the average literacy level) is important for our deliberations on the literacy success of a particular generation.
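The aggregation point can be made concrete with a short calculation. In the sketch below, the top score of 500 is taken from the IALS difficulty scale described in the data section, and the squared penalty is just one illustrative choice of a convex loss standing in for the view that falling far below a critical level is especially costly; both choices are assumptions for illustration only.

```python
# Two three-person societies with the same total shortfall from the top
# score, ranked differently depending on how shortfalls are aggregated.
TOP = 500  # maximum value on the IALS difficulty scale

society_a = [500, 500, 300]   # one person 200 points short
society_b = [500, 400, 400]   # two people 100 points short each

def mean_shortfall(scores, top=TOP):
    # Linear aggregation: only the average shortfall matters.
    return sum(top - s for s in scores) / len(scores)

def squared_shortfall(scores, top=TOP):
    # Convex aggregation: one large shortfall weighs more than
    # several small ones summing to the same total.
    return sum((top - s) ** 2 for s in scores) / len(scores)

print(mean_shortfall(society_a), mean_shortfall(society_b))        # equal
print(squared_shortfall(society_a), squared_shortfall(society_b))  # A penalized more
```

Under the linear measure the two societies are tied; under any convex penalty the first society, with its one person far below the benchmark, ranks lower.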
In the empirical work that follows, we will present results in terms of features of the whole distribution. While establishing an absolute measure of the literacy success of a generation is a daunting task, we can at least examine whether recent generations are obtaining higher levels of literacy than earlier generations. If this is the case then there is reason for optimism and we can investigate what has been working well in literacy creation. If it is not the case then there is reason for alarm and we need to go back to figuring out what was done better in previous generations. An additional benchmark comes from comparisons with other countries. A particular advantage of the IALS datasets is the attention that has been paid to making them internationally comparable. Thus, we can see whether youth in other countries fare better in terms of literacy and, if so, set out to investigate how literacy creation mechanisms differ in those countries. We will pursue both comparison strategies in this paper, comparing literacy outcomes of youth in Canada to outcomes for previous generations of Canadians and to literacy outcomes of youth in the United States and Norway. It is worth noting, though, that comparison of test scores from the IALS across countries is not without controversy. Blum et al (2001) argue forcefully that IALS test scores are not comparable because of problems in translation and the cultural specificity of questions. They provide direct examples of questions in the IALS for the UK and France which are intended to be the same but are, in fact, less clear in the French version. It is noteworthy that French respondents recorded substantially lower literacy scores than those from other countries in the early 1990s version of the IALS. Indeed, their shortfall is very large. This view, however, is not uniformly held. Tuijnman and Boudard (2001) quote independent reviews of the IALS showing "adequate robustness" of the data.
Overall, the arguments in Blum et al (2001) suggest caution about the comparisons between Canada and Norway in the tables that follow. The comparisons between Canada and the US, though, are likely more reliable since they are conducted in the same language and within a relatively similar cultural frame of reference. Moreover, we will focus, to some extent, on the question of whether changes over time in the literacy distribution are similar within each of the three countries. These types of comparisons are likely to be much more reliable than simple comparisons of literacy levels.

3.1) Characterizing the Distribution of Youth Literacy in Canada

As stated in the data section earlier, we define youth as being between the ages of 16 and 25. However, we would view a given level of literacy for a 17 year old differently if that person has left school (and therefore is less likely to improve his or her literacy further) than if he or she is still in school. We will also present some results in which we eliminate from the sample all those who are still in school to see whether this consideration is likely to make a difference. In addition, we will present breakdowns with and without immigrants. As we will see, immigrant versus non-immigrant differences are important for understanding the overall literacy distribution. Moreover, if we want to examine the impacts of the Canadian education system we clearly need to remove immigrants who were not educated in Canada. In Table 1, we present basic statistics characterizing literacy distributions in 2003 for all 16 to 25 year olds in Canada. Each column of the table corresponds to one of the literacy tests (prose, document, numeracy, problem solving). Without a benchmark, it is difficult to know what to make of the level and spread of this distribution. There is clearly a substantial difference in performance on the tests between the top and bottom of the distribution.
We will wait until we present the distributions for various comparison groups before we discuss this distribution further. Instead, we start by looking at the distribution within specific sub-groups of the population. Table 2 presents literacy score distributions for all youth who are currently in school (i.e., who respond that their current work situation is "in school"). The distributions are extremely similar to the ones displayed in Table 1, indicating that we do not need to be concerned about the fact that we observe some individuals before they have completed their schooling. For the remainder of the paper, we will use all youth (whether they are in school or not) in order to maintain a larger sample size.4 In Tables 3 and 4, we examine literacy for sub-groups who might be expected to have lower literacy levels. The first of these is immigrants who arrived in Canada after age 11 and who, as a result, must have had at least their early education outside Canada. The literacy outcomes for this group are substantially below the overall figures from Table 2. In fact, as we will see, the 5th percentile for older age at arrival immigrants is lower than that for non-immigrant, non-aboriginal youth by the equivalent of the standard deviation of the latter distribution. This shortfall is evident in all of the literacy dimensions. In all cases, the disadvantage of older age at arrival immigrants is much smaller (though still substantial) in the upper part of the distribution. This is consistent with language difficulties in answering the test for those in the bottom part of the distribution.

[Footnote 4: The one exception to this will arise when we present the cohort based analyses broken down by education. There we will need to focus only on individuals not in school in order to make sure we have a consistent cohort definition across datasets.]
Indeed, among immigrant youth who arrived at an older age, only 19% of those with prose literacy scores below the 25th percentile list English or French as the language most often spoken at home, while among those with literacy scores above the 75th percentile, 42% report English or French as the language spoken at home. Ferrar et al (2005) find strong evidence of language effects on literacy score outcomes among immigrants. They argue that it is difficult, if not impossible, to disentangle literacy from language skills, and that it is not clear we want to. If we are interested in literacy as a measure of the skills needed to succeed both at work and in society more broadly in Canada (as opposed to a measure of intelligence) then those skills should be measured in terms of their levels in either English or French. In comparison, the literacy outcomes for aboriginal youth in Table 4 show better scores than those for immigrants who arrived at older ages at the lower end of the distribution (the 5th and 25th percentiles) but lower scores at the 50th percentile and above. Broadly speaking, though, both of these groups have similarly low results relative to other Canadians. Table 5 contains the literacy distribution results for immigrants who arrived in Canada before age 11. This group would both have completed their education in Canada and have had a substantial amount of time to acclimatize to Canada. Their outcomes are actually similar to what we observe for the overall distribution in Table 1 in the lower tail and slightly worse than the overall distribution in the upper tail. This implies that this group is "average" for the youth population as a whole, performing better than aboriginals and older age at arrival immigrants but worse than other Canadians. It is important to emphasize that these tables are constructed without conditioning on education.
Thus, the standard finding that children of first generation immigrants (whether born in Canada or abroad) obtain higher levels of education (e.g., Bonikowska (2007)) is not enough to generate higher levels of literacy for this group.5 Table 6 contains descriptive statistics for literacy score distributions for youth who are non-immigrant and non-aboriginal. This distribution dominates that for any of the other groups examined, including immigrants who arrived at a young age. They are also the group whose literacy will most reflect the impacts of the main education system and, for this reason, we will use them as our baseline case. In Tables 7 and 8, we investigate the gender dimension of literacy among youth. In these tables, we focus attention on non-immigrant, non-aboriginal youth in order to make sure we do not confound gender effects with differences in these dimensions. Interestingly, gender differences vary by the type of literacy test. Females have higher prose test scores across the whole distribution, with differences ranging from over 25 points at the bottom to approximately 13 points at the top (i.e., between half and a third of a standard deviation). In document literacy, on the other hand, females obtain higher scores by approximately a third of a standard deviation at the bottom of the distribution, but this advantage declines as we move up the distribution and by the 95th percentile males and females have the same score. In numeracy, males actually have superior scores across the distribution, though the difference is not as large as the female advantage in prose literacy.

[Footnote 5: In a regression of a dummy variable corresponding to being currently in school on age and a dummy variable corresponding to immigrants who arrived before the age of 10, the latter variable has a coefficient of .11 with a standard error of .034. Thus, immigrants who arrived as young children are approximately 11 percentage points more likely to be in school and thus end up with higher education.]
Finally, in problem solving females again have an advantage. In some ways, this seems surprising since one might have expected that the same type of logic present in math questions would be a key element in problem solving.6 The gender results presented here echo those in earlier papers (e.g., Willms (1998)).

3.2) Comparisons With Other Age Groups

As we discussed earlier, one way to benchmark the literacy levels of youth is to compare them to those of older individuals. In this way, we can see whether Canada is making progress in terms of literacy across generations. In Table 9, we present statistics on the literacy distribution for the sample of non-immigrant, non-aboriginal individuals who are aged 26 to 35. We drop immigrants and aboriginals in order to be sure that differences across age groups are not arising just because of composition shifts in these dimensions.7 A comparison of Table 9 with Table 6 reveals that the older age group has slightly better literacy score distributions. For example, the prose literacy distribution of the older group has values at the various percentiles that are generally on the order of 4 to 7 points higher than those for the youth distribution. The same description applies to the numeracy distributions. These differences are not large (they are typically on the order of one-sixth of a standard deviation) but they do not point to declines in literacy across successive generations.

[Footnote 6: It is worth noting that these same patterns (i.e., that females have higher prose, document and problem solving scores but lower numeracy scores) are replicated when we control for age and education by regressing the individual types of scores on a female dummy, age, years of schooling and a dummy corresponding to being in school.]

[Footnote 7: We also examined this broken down by gender. The cross-age group patterns are very similar for males and females separately, so we present results for males and females combined in the paper for presentational simplicity.]
A comparison with the literacy distributions for 36 to 45 year olds8 shown in Table 10, on the other hand, is more mixed. The 36 to 45 year olds have worse prose literacy outcomes at the very bottom of the distribution than the youth but quite similar prose literacy scores at the 25th percentile and above. This, of course, also means that their literacy is inferior to that of 26 to 35 year olds. A key complication in these comparisons is differences in schooling. Many youth have not yet completed their education and, given that schooling and literacy are positively related, have not yet attained their highest literacy. To generate a cleaner comparison, in Table 11 we present regression results, separately by literacy type, in which we regress literacy scores on years of schooling, a dummy variable for whether the individual is currently in school, and dummy variables corresponding to 26 to 35 year olds and to 36 to 45 year olds (the base group is youth). The samples used in these regressions are again restricted to non-immigrants and non-aboriginals. The regressions for all literacy types show strong effects of schooling, with coefficients on the years of schooling variable that are both substantial in size and statistically significant. The schooling coefficients indicate that an extra year of school is associated with 7 or 8 extra literacy score points, implying that increasing years of education from 12 to 16 would be associated with an increase in literacy approximately equal to that from moving from the 50th to the 75th percentile of the youth literacy distribution. In addition, being in school at the time of the survey is also associated with higher literacy scores.

[Footnote 8: Again, the results shown are for a non-immigrant, non-aboriginal sample.]
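The specification just described can be sketched in a few lines. The data below are synthetic, and the coefficient values used to generate them are assumptions chosen only to mimic the pattern reported in the text (7 to 8 points per year of schooling, youth as the base group); nothing here reproduces the actual Table 11 estimates.

```python
import numpy as np

# Synthetic version of the Table 11 specification: literacy score regressed
# on years of schooling, an in-school dummy, and two age-group dummies.
rng = np.random.default_rng(0)
n = 2000
years_school = rng.integers(8, 21, n).astype(float)
in_school = rng.integers(0, 2, n).astype(float)
age_26_35 = rng.integers(0, 2, n).astype(float)
# Mutually exclusive age groups; the omitted category is youth (16-25).
age_36_45 = np.where(age_26_35 == 0, rng.integers(0, 2, n), 0).astype(float)

# Assumed data-generating process: 7.5 points per year of schooling and
# negative age-group effects, echoing the pattern described in the text.
score = (180 + 7.5 * years_school + 10 * in_school
         - 7 * age_26_35 - 13 * age_36_45 + rng.normal(0, 40, n))

X = np.column_stack([np.ones(n), years_school, in_school, age_26_35, age_36_45])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
print(dict(zip(["const", "years_school", "in_school", "age26_35", "age36_45"],
               beta.round(2))))
```

With this sample size, OLS recovers the assumed schooling coefficient closely and shows the older age groups with lower conditional scores, which is the shape of the result discussed next.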
Once we control for schooling, the youth are the highest-literacy age group, with average literacy scores approximately 7 points higher than those for 26 to 35 year olds and 12 to 14 points higher than those for 36 to 45 year olds. Thus, our surmise that youth are placed at a disadvantage by the simple comparisons in the previous tables appears to be true.

4) Cohort and Ageing Dimensions

4.1) Cohort Effects

The results from the comparisons of Tables 6, 9 and 10 and from the regressions in Table 11 can be interpreted in different ways. One possible interpretation is that the literacy experience of older workers in the same survey can be used to predict what we should expect as the current youth get older. Under this interpretation (controlling for schooling) we should expect youth to lose literacy as they age. Alternatively, we could see differences across the age groups as reflecting differences in their schooling and experiences: literacy may be lower for older workers (conditional on years of schooling) because schools were not as good at teaching literacy when they were younger or, instead, because the jobs they held when younger did not require as much use of literacy as do the jobs held by today's youth. In reality, the differences we observe in our single cross-section are likely some combination of the two factors. Older workers may have different literacy levels because of some combination of changes in literacy that happen to everyone as they age and differences in literacy levels across successive cohorts of individuals. There is no way to untangle these factors in a single cross-section. Solving the problem of untangling ageing and cohort effects requires some sort of panel data. Thus, suppose we could follow a sample of individuals over time, re-testing their literacy at various points in their lives. Based on that, we could establish how literacy varies with age.
If we could do that with successive cohorts (say, start following a new sample of 16 to 25 year olds every 10 years), we could also investigate whether literacy levels are changing across cohorts by comparing literacy levels at the same age for different cohorts.9 We, unfortunately, do not have a pure panel with literacy test scores (indeed, as far as we are aware, such a dataset does not exist in any country). However, we can obtain consistent estimates of cohort and ageing effects from a series of cross-sectional datasets under a specific set of assumptions. In our case, we have data from the 1994 IALS and the 2003 IALSS for Canada.10 The idea behind "pseudo-panel" techniques constructed from a series of cross-sectional datasets can be understood by considering the complete population of youth aged 16 to 25 in 1994. We will call this cohort A. As we stated above, our preference would be to obtain literacy test scores for all of these individuals in 1994 and then obtain a new set of test scores for them in 2003. Given that we cannot do that, consider, instead, drawing a random sample from that population in 1994 and obtaining literacy test scores for them. Then, imagine drawing a different random sample from that same population (now aged 25 to 34) in 2003 and, again, obtaining literacy test scores. Since each sample (the one taken in 1994 and the one taken in 2003) is representative of the overall population, each can provide consistent estimates of characteristics of the literacy distribution for the entire population of cohort A. Thus, by examining those characteristics from each sample, we obtain estimates of how the literacy distribution for cohort A evolved as the members of the cohort aged from 16-25 to 25-34. As a concrete example, we can obtain a consistent estimate of average prose literacy for cohort A when it is aged 16-25 using the 1994 data. We can also obtain a consistent estimate of average prose literacy for cohort A when it is aged 25 to 34 using the 2003 data. Comparing these yields an estimate of the impact of ageing for this cohort. The same approach can be used to examine the evolution of other characteristics of the literacy distribution, such as the median or the standard deviation. The conditions under which this approach yields consistent estimates of the evolution of the literacy distribution with age for a given cohort follow naturally from this example. We essentially need the two surveys to be representative draws from the same population observed at different ages. This would be violated if the population from which we are drawing changes over time.
9 Thus, for example, we could compare the literacy of 16 to 25 year olds in 2005 with that of 16 to 25 year olds in 1995. As is well known from the panel and pseudo-panel literature, literacy levels of these two groups could differ because of cohort effects (permanent differences in literacy levels across different generations) or because of year effects (differences in literacy levels across all age groups in the two years). In this case, year effects would amount to the claim that literacy levels trend up or down for everyone in the society (regardless of their age). This seems unlikely to us and so we assume that year effects do not exist. This allows us to summarize all patterns over time as a combination of cohort and ageing effects. It is worth noting that the assumption that there are no year effects cannot be tested because of perfect collinearity among ageing, year and cohort effects. It can only be justified as the most reasonable assumption in the circumstances.
10 Later in the paper, we will make use of the same type of data for the US and Norway in order to provide benchmarks for our Canadian results.
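The pseudo-panel comparison described above can be sketched on simulated data as follows. The population, its mean literacy, and the amount of ageing drift are all made up for illustration; the point is only that two independent representative samples of the same cohort, taken nine years apart, consistently estimate the change in the cohort's literacy distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated population of cohort A (aged 16-25 in 1994); illustrative only.
# True mean literacy 280 at ages 16-25, drifting down 6 points by 2003.
pop_1994 = rng.normal(280, 50, 100_000)
pop_2003 = pop_1994 - 6 + rng.normal(0, 5, 100_000)   # same people, aged 9 years

# Two independent random samples - one per survey - each representative
# of the same underlying cohort population.
s94 = rng.choice(pop_1994, 3000, replace=False)
s03 = rng.choice(pop_2003, 3000, replace=False)

# Each sample consistently estimates its year's cohort mean, so the
# difference estimates the ageing effect for cohort A.
ageing_effect = s03.mean() - s94.mean()
print(round(ageing_effect, 1))   # close to the true drift of -6
```

The same construction works for the median, the standard deviation, or any other distributional characteristic mentioned in the text.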
Thus, if we examined average literacy for everyone in Canada aged 16 to 25 in 1994 and for everyone in Canada aged 25 to 34 in 2003, we would have problems because of immigration. Some of the people present in the 25 to 34 year old age group in 2003 would be immigrants who had arrived since 1994. Suppose, for the moment, that these new arrivals had quite low literacy levels. In that case, we would see a lower average literacy level in the sample of 25 to 34 year olds in 2003 than in the sample of 16 to 25 year olds in 1994 even if the literacy of those in the original sample did not change at all over time. To avoid this, we will focus on non-immigrants in our cohort-based investigations. We also exclude observations from the Territories in 2003 to make the data comparable to the 1994 data. Further, we require that the samples at each point in time can be regarded as representative of the overall cohort population. To ensure this, we make use of the sample weights provided with the IALS surveys. Finally, we require that the literacy tests are comparable over time. If, for example, the 2003 test were harder than the 1994 test, this would have obvious impacts on our attempts to make inferences about the impact of ageing on literacy scores. In fact, the Prose and Document tests were designed to be comparable over time. Approximately 45% of the test questions in these two areas are identical across the two years, and those common questions were used as the basis of a renormalization designed to ensure that the overall average test scores in the two surveys bear the same relationship to one another as the averages on the common subset of questions. In addition, in Green and Riddell (2006), we argue that the fact that differences in scores between the two surveys differ by age group is evidence in favour of comparability.
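The renormalization logic can be illustrated with a deliberately stylized calculation. The actual IALS/IALSS scaling is done with Item Response Theory, not the simple ratio adjustment below, and every number here is invented; the sketch only shows what it means for the overall averages to "bear the same relationship to one another as the averages on the common subset of questions".

```python
import numpy as np

rng = np.random.default_rng(2)

# Stylized item-level data: shares correct on the roughly 45% of questions
# common to both years, plus one year's raw overall scale score.
common_1994 = rng.normal(0.62, 0.01, 200).mean()   # share correct, 1994 sample
common_2003 = rng.normal(0.56, 0.01, 200).mean()   # share correct, 2003 sample
overall_2003_raw = 270.0                           # invented raw 2003 average

# Rescale the 2003 average so the two years' overall averages stand in the
# same ratio as their averages on the common questions.
overall_2003_adj = overall_2003_raw * (common_1994 / common_2003)
print(round(overall_2003_adj, 1))
```

In the real surveys this anchoring is performed item by item within the IRT model rather than on the averages directly.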
If one test were simply more difficult than the other, then one would expect to see uniformly lower scores for all age groups on the more difficult test. Finally, it is worth noting that the Numeracy test changed substantially between the 1994 and 2003 surveys and so cannot be used for the type of comparisons we are considering. Further, the 2003 survey includes a new test - problem solving - that was not implemented in 1994. For this reason, we focus our attention in this section on the Prose and Document scores. To this point, we have discussed how to use multiple cross-section datasets to obtain a consistent picture of the evolution of literacy with age for a given cohort. We can also use these datasets to answer questions about how literacy differs across cohorts. This is particularly interesting for us since it will allow us to investigate whether current youth literacy can be viewed as an improvement on that of earlier cohorts. The key to making comparisons across cohorts is to ensure that the comparison is not contaminated by differences in age. Thus, in our earlier discussion of what we can learn from a single cross-section, we argued that comparing 16-25 year olds to 26-35 year olds in, say, 2003 does not provide a clear picture of cohort differences because any observed differences may partly reflect the fact that one group has aged more than the other. To get around this, we need to compare the two cohorts at the same age. Thus, comparing literacy scores for 16 to 25 year olds in 1994 to those for 16 to 25 year olds in 2003 is effectively comparing literacy across successive cohorts. As before, a key assumption is that the literacy tests are comparable across the two years. We begin with the question of differences in literacy levels across cohorts. We will label cohorts by the mid-point of the birth years that compose the given cohort.
Thus, we will call the cohort consisting of 16 to 25 year olds in 2003 the 1983 birth cohort; the cohort consisting of 16 to 25 year olds in 1994 (who are the 25 to 34 year olds in 2003) the 1974 cohort; the cohort consisting of 26 to 35 year olds in 1994 (and 35 to 44 year olds in 2003) the 1964 cohort; the cohort consisting of 36 to 45 year olds in 1994 (and 45 to 54 year olds in 2003) the 1954 cohort; and the cohort consisting of 46 to 55 year olds in 1994 (and 55 to 64 year olds in 2003) the 1944 cohort. Table 12 contains characteristics of document literacy distributions for pairs of cohorts observed at the same age. We also provide an all-encompassing representation of the information in these tables by presenting figures containing kernel density plots for the literacy distributions for pairs of cohorts in Figures 1, 2 and 3. It is worth pausing for a moment to consider what is being presented in the kernel density figures. One way to depict the distribution of a continuous variable (like literacy) is with a histogram. Histograms are constructed by first dividing the range of the literacy variable into subsegments (e.g., from 200 to 220, 220 to 240, etc.) and then calculating the proportion of the total sample with observations within each segment. With this, we can see where in the total range most of the observations are concentrated. The downside of this depiction is that it tends to be very "jumpy", with high values in one segment next to low values in neighbouring segments. This jumpiness can be caused by the arbitrariness of where the edges of the segments are placed. Kernel densities can be seen as smoothed histograms which overcome this jumpiness. Thus, as with histograms, ranges of literacy scores where the kernel density line is high represent scores where there is considerable concentration of individuals.
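The histogram-versus-kernel-density contrast described above can be sketched as follows, using simulated scores rather than the survey data. The kernel density places a small Gaussian "bump" at each observation and sums them, which is what smooths away the segment-edge jumpiness of the histogram.

```python
import numpy as np

rng = np.random.default_rng(3)
scores = rng.normal(280, 50, 1500)            # simulated literacy scores

# Histogram: proportion of the sample in each 20-point segment.
edges = np.arange(100, 461, 20)
counts, _ = np.histogram(scores, bins=edges)
hist_props = counts / counts.sum()

# Kernel density: sum of one Gaussian bump per observation.
def gaussian_kde(grid, data, bw):
    z = (grid[:, None] - data[None, :]) / bw
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (len(data) * bw * np.sqrt(2 * np.pi))

bw = 1.06 * scores.std() * len(scores) ** -0.2    # Silverman's rule of thumb
grid = np.arange(100, 461, 5.0)
density = gaussian_kde(grid, scores, bw)

# Both representations concentrate mass near the true mean of 280.
print(edges[hist_props.argmax()], grid[density.argmax()])
```

Bandwidth plays the role that segment width plays for the histogram: wider bandwidths give smoother but less detailed density lines.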
With this in mind, we turn to examining the plots of the document literacy densities for 16 to 25 year olds in 1994 and for the same age group in 2003, shown in Figure 1. As we stated earlier, this comparison shows the difference between literacy scores for current youth (what we call the 1983 Cohort) and the previous cohort (the 1974 Cohort) at the same point in the life cycle. The two densities are very similar in the lower range of scores (what is typically called the left or lower tail of the distribution), implying that individuals in the two cohorts have roughly the same probability of possessing low document literacy skills. However, the two distributions part company in the upper range (the right tail). In particular, the most recent youth cohort has a noticeably lower density in the right tail, implying that youth in the 1983 Cohort are less likely to have high literacy scores. This is reflected in the statistics associated with the two distributions presented in Table 12: while the 25th and 50th percentiles are very similar for the two distributions, the 90th percentile of the 1983 Cohort distribution is 12 points lower and the 95th percentile is almost 20 points lower than the comparable percentiles of the 1974 Cohort distribution. Put a different way, approximately 5% of individuals in the 1983 Cohort sample have document literacy scores above 355 while over 10% of 1974 Cohort sample members have scores above this value. Thus, the troubling implication of Figure 1 and the first columns of Table 12 is that the most recent cohort of youth have lower literacy at the upper end of the literacy score range.11 Figure 2 presents density plots for document literacy scores for the 1974 Cohort (the cohort which was aged 16 to 25 in 1994) and the 1964 Cohort (the cohort who would have been in the youth age range - 16 to 25 - in 1984). We can compare these two cohorts in the common age range, 26 to 35.
We observe the 1974 Cohort in this age range in 2003 and the 1964 Cohort in this age range in 1994. The figure shows that the 1974 Cohort has a lower probability of having very low literacy scores but also lower probabilities of having high literacy scores. Thus, the 95th percentile of the 1974 Cohort document literacy distribution is approximately 30 points lower than that for the 1964 Cohort. Similarly, in Figure 3, the 1964 Cohort has a similar median to the 1954 Cohort but much lower 90th and 95th percentiles. Thus, the overall picture constructed from Figures 1 through 3 is one of declining literacy across cohorts in the upper end of the distribution, with the lower tail of the distribution changing very little across cohorts. Given that more recent cohorts have more education (17% of the 1974 Cohort are university graduates compared to 13% of the 1954 Cohort), this is both surprising and disturbing. It is worth recalling, at this point, criticisms that have been levelled at the Item Response Theory approach used to calculate the literacy scores we are presenting. Based on numbers presented in Osberg (2000), imputation appears to be greater and, potentially, more problematic in the lower than the upper tail of the distribution. If this is true, then the observation that the lower tails of the literacy score distributions are similar across cohorts could just be a reflection of the people in this part of the distribution being imputed similar scores rather than a reflection of a true lack of change. We currently have no means of assessing this possibility and it should be kept in mind in the discussion that follows. In Figures 4 through 6, we repeat our cross-cohort comparisons using prose literacy. These are backed up by statistics for the distributions in Table 13.
11 Note that we will provide standard errors on estimated cohort differences in order to assess whether the observed differences are statistically significant in the regressions presented below.
Both Figure 4 and the statistics indicate that the 1983 and 1974 Cohorts have virtually identical prose literacy distributions. The differences between the 1974 and 1964 Cohorts are similar to what we saw in the document score distributions (though with smaller absolute differences): the 1974 Cohort has both lower probabilities of very low prose literacy scores and lower probabilities of high literacy scores. The comparison between the 1964 and 1954 Cohorts is also similar to what was observed with document literacy, with similar lower tails of the distribution but a noticeably lower probability of observing high scores in the 1964 Cohort. Thus, there is again evidence that literacy at the top of the distribution is lower in recent cohorts than in earlier ones. This difference, though, is smaller in Prose than in Document literacy and there is no difference across the most recent cohorts in Prose literacy. Thus, the most recent cohort of youth are similar to the directly preceding cohort in reading literacy but worse at the upper end of the range in literacy tasks related to finding and interpreting information in documents.
4.2) Effects of Ageing
We turn, next, to the question of how literacy levels change as an individual ages. As we discussed earlier, the best way to track this would be with a true panel of individuals undergoing literacy testing at different points in their lifecycle. Not having that, we employ the next best option: following a cohort of individuals across successive cross-sectional datasets. With two such datasets, we can only follow a given cohort through one 9-year period of their life. Observing what happens to different cohorts (ageing through different lifecycle periods) then allows us to piece together the complete ageing profile under the assumption that the profile has the same shape for each cohort.
Thus, our approach assumes that each cohort can have a different overall literacy level (captured in the differences in literacy at a given age described in the previous subsection) but the same shape for the profile showing how literacy changes with age. In Figure 7, we plot the kernel densities for document literacy for the 1974 Cohort at the two age ranges in which we observe them: 16 to 25 (their age in IALS94) and 25 to 34 (their age in IALS03). This provides our best guess at what will happen to the literacy of today's youth as they age through the next 9 years of their lifecycle. The figure indicates that the lower tail of the distribution was relatively unchanged across this period but there was a noticeable deterioration in the right tail. In other words, as this cohort of youth grew older, the literacy skills at the bottom of the distribution did not change but those at the top atrophied. The results in Table 14, where we present characteristics of the two distributions, support this conclusion: the 25th percentile is quite similar across the two age ranges but the 95th percentile is 18 points lower for this Cohort observed at age 25 to 34 than it was for the same cohort observed at age 16 to 25. Put another way, at age 16 to 25, over 10% of the people in the 1974 Cohort had a literacy score of 356 or above, while by the time they were age 25 to 34, only 5% of the people in this Cohort had scores that exceeded this mark. It is worth pointing out that since we do not have a true panel, we cannot know whether the people in each tail of the distribution are the same in the two age range distributions (i.e., whether a person observed at the 5th percentile in the younger age distribution is likely to also be observed at about the same percentile in the older age distribution). It is possible (though not at all likely) that as people age, those with the lowest literacy at the younger age become the ones with the highest literacy at the older age and vice versa.
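The kind of percentile comparison reported in Table 14 can be sketched as follows. The scores are simulated, with an unchanged lower tail and an atrophied upper tail built in by construction, so the output mirrors the qualitative pattern of Figure 7 rather than the actual survey numbers.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated document scores for one cohort at two ages (illustrative only).
young = rng.normal(285, 50, 4000)                            # ages 16-25
base = rng.normal(285, 50, 4000)
# Compress only the upper half: scores above the median atrophy by 30%.
old = np.where(base > 285, 285 + 0.7 * (base - 285), base)   # ages 25-34

# Percentile comparison in the style of Table 14: the 25th percentile is
# nearly unchanged while the 95th percentile falls noticeably with age.
for q in (25, 50, 95):
    print(q, round(np.percentile(young, q), 1),
          round(np.percentile(old, q), 1))
```

Because the two age ranges come from independent samples in the real data, such a comparison describes the distributions, not the trajectories of particular individuals, exactly as the text cautions.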
However, the most probable scenario is that the low literacy types at the younger age are also the low literacy types at the older age. If that is true, then one would interpret this figure as saying that those with the lowest literacy levels do not lose their literacy as they age but the individuals with the highest literacy levels do lose some of their skills over time. This would fit with a particular form of what is sometimes referred to as a "use it or lose it" model of literacy, in which basic literacy skills are not lost over time (either because they are easier to retain or because they are actually used by virtually everyone on a day-to-day basis) but higher level literacy skills are lost to some extent. In Figure 8, we repeat this exercise but for the 1964 Cohort, for whom we observe literacy scores at ages 26 to 35 (in 1994) and 35 to 44 (in 2003). The pattern is much the same as in Figure 7, with the older age distribution having a similar left tail to what was observed for the same cohort 9 years earlier but a noticeably lower right tail. The main difference relative to Figure 7 is that the differences between the two distributions start at a much lower point in the distribution for the 1964 Cohort. Thus, for the 1974 Cohort, the median document literacy score (shown in Table 14) is very similar at the two ages at which we observe them, while for the 1964 Cohort, the median is 5 points lower at the older age. By the 95th percentile, the older age distribution is a full 27 points (or the equivalent of two-thirds of a standard deviation of the overall literacy distribution) lower. While almost 10% of individuals in this Cohort have scores above 365 when they are aged 26 to 35, only 1% have scores above this level when they are 35 to 44. The same picture arises with the 1954 Cohort, observed at ages 36 to 45 (in 1994) and 45 to 54 (in 2003).
The two densities plotted in Figure 9 and the statistics presented in Table 14 show a deterioration with age across virtually the whole literacy range. Thus, putting the results from the ageing of these three cohorts together, one arrives at a picture in which document literacy skills are at their peak levels in the youth (16 to 25) age range. In the ensuing years, these skills deteriorate, with the higher level skills deteriorating first and to the greatest extent, while the lower levels are initially unaffected by ageing but eventually they, too, deteriorate. It is possible that this fits with a "use it or lose it" scenario, i.e., one in which literacy obtained during formal schooling deteriorates if it is not used on the job or in day-to-day activities later in life. We will return to this point later. In Figures 10 through 12 and Table 15, we repeat the exercise of examining ageing effects using Prose literacy. In contrast to what we observed for document literacy, the prose literacy distributions are virtually identical at ages 16 to 25 and 25 to 34 for the 1974 Cohort. In other words, while upper levels of document literacy begin to deteriorate as soon as people leave formal schooling, the same is not true of prose literacy. This might be because prose skills are more essential in the sense of being more widely used in day-to-day life and jobs. For ageing beyond age 25, though, the same picture as we observed for document literacy emerges. In particular, the 95th percentile for the 1964 Cohort deteriorates by 20 points between ages 26-35 and 35-44. This is smaller than the deterioration observed in the right tail of the document literacy distribution but is still substantial. Similarly, the entire prose literacy distribution deteriorates between ages 36-45 and 45-54 for the 1954 Cohort, but to a smaller degree than what we witnessed for document literacy.
Thus, the same general pattern of deterioration of literacy skills with age is evident for prose as for document literacy, but to a lesser degree and arising later in life. It is interesting to compare these cohort and ageing patterns with what we observed in the 2003 cross-sectional data. Recall that in comparing literacy levels across age groups (not conditioning on education), we found that 26 to 35 year olds had slightly better prose and document literacy, particularly toward the top of the distributions. The results here indicate that this arises because the 1974 Cohort (who are roughly aged 26 to 35 in 2003) has superior literacy outcomes relative to the 1983 Cohort (Youth in 2003), but that superiority has been reduced to some extent by the effects of ageing.
5) Introducing the Education Dimension
5.1) Differences Within Education Groups Over Time
As we discussed earlier, differences in literacy across cohorts may be a reflection of differences in education levels across those cohorts and, for Youth, may partly reflect incomplete schooling. As a first step toward understanding the education dimension, we present kernel density plots for four education groups, showing the 1994 and 2003 densities in the same figure. The four education categories are based on highest level of education attained and consist of: 1) less than high school graduation; 2) high school graduation; 3) a post-secondary degree or diploma below a BA; and 4) a BA or post-graduate degree. Figure 13 contains the kernel density plots of document literacy for drop-outs of all ages in both years. The plots indicate an improvement over time, with a lower probability of obtaining quite low literacy scores offset by higher probabilities of obtaining what were above average literacy scores for this group in 1994. More specifically, the 25th percentile of this distribution increases from 125 in 1994 to 166 in 2003 and the 75th percentile increases from 281 to 294.
This is a heartening improvement. In contrast, though, Figure 14 shows declines over time for high school graduates at all percentiles above the 25th. The 95th percentile for this group declines from 363 to 352 between 1994 and 2003. A similar pattern of stronger declines in the upper part of the distribution is also evident in the plots for the Some Post-Secondary group in Figure 15. For that group, the 95th percentile of the document literacy distribution declined from 387 to 350 between the two years - a very large decline, indeed. Finally, university graduates show a similar pattern, with little difference between the two years in the lower half of the distribution but strong declines in the probability of observing high literacy scores, offset by increased probabilities of seeing scores that were above average but still somewhat mediocre in the 1994 distribution. This is reflected in a decline of the 90th percentile from 392 to 365. Figures 17 to 20 contain the same set of educational breakdowns but for prose rather than document literacy. The patterns portrayed in these figures are in broad agreement with what we observed for document literacy. In particular, drop-outs experienced literacy improvements (particularly at the bottom of their distribution) while all other educational groups experienced declines (particularly at the top). The main difference is that the declines over time for the university educated are more pervasive across the distribution of prose literacy than of document literacy.
5.2) Regression Results
One way to proceed in understanding how these within-education-group movements relate to the cross-cohort differences would be to provide a complete set of breakdowns into groups defined by education and cohort. The trouble with this approach is that many of the education x cohort groups are quite small, making inference based on them difficult. Examining such a large number of groups is also cumbersome.
Instead, we adopt the approach of presenting both mean and quantile regressions in which we pool the data from 1994 and 2003. With the pooled data, we can identify cohort effects (average literacy scores for members of a cohort, whichever sample they are observed in, holding all else constant) and ageing effects (which are identified in the same way as in our discussion in the previous subsection, i.e., by the changes across time within a given cohort). We estimate these effects holding constant (i.e., including as additional covariates) dummy variables corresponding to the various education levels and to gender. With the inclusion of these education controls, the estimated cohort effects show differences across cohorts while holding education composition constant, i.e., abstracting from the fact that the different cohorts have different educational compositions. For those who are not interested in the details of the regressions, we provide a summary of the key findings at the end of the section. We begin with a simple mean regression in which we regress the log of the individual's literacy score on dummy variables corresponding to 5 cohorts: 1) the cohort who were 16-25 in 2003; 2) the cohort who were 16-25 in 1994; 3) the cohort who were 26-35 in 1994; 4) the cohort who were 36-45 in 1994; and 5) the cohort who were 46-55 in 1994. The first of these cohorts (the 1983 Cohort) is the omitted group, so all cohort dummy coefficients can be read as showing a difference relative to that cohort. In addition, we include a series of age dummy variables corresponding to: 1) 16 to 25; 2) 26 to 35; 3) 36 to 45; 4) 46 to 55; and 5) 56 to 65. The first age group is the omitted group in the regressions. We control for education using dummies corresponding to the high school graduate, some post-secondary and university categories mentioned earlier, with drop-outs forming the base group.
We also include a set of dummy variables for mother's education and a separate set for father's education. These pertain to the same categories as for the respondent's own education, plus an additional category corresponding to an answer that the respondent does not know his or her parent's education. Finally, we include a gender dummy variable. We experimented with further specifications in which we controlled for whether each parent was an immigrant, regional dummies, parental occupation, and interactions of parental occupation and education, but these did not yield substantial changes in the results, so we present the simpler specification to aid the readability of the results. The first column of Table 16 contains the estimated coefficients and their associated standard errors from this specification with the log of document literacy as the dependent variable. The effect of education is as one might expect: literacy increases with education, with university educated individuals having literacy scores that are approximately 11% higher than high school grads and 25% higher than high school drop-outs. Having either a mother or a father who was a high school drop-out is associated with about 3% lower literacy than if that parent was high school educated, and these effects are statistically significant at any conventional significance level. Interestingly, though, whether the parent has a high school, some post-secondary or university education does not matter. Further, not knowing the education of a parent has a negative effect on average literacy that is on the order of 5% and is statistically significant. This may be picking up something about parenting and how close the child is to his or her parents. Neither parental immigrant status nor gender has a substantial or statistically significant effect.
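The identification logic of this pooled specification can be sketched on simulated data. The key point is that because each cohort is observed in two surveys nine years apart, cohort dummies and age-group dummies are not collinear once year effects are assumed away. All effect sizes below are invented for illustration; we simplify by dropping the education, parental education and gender controls.

```python
import numpy as np

rng = np.random.default_rng(5)

# True effects used to simulate the data (illustrative numbers only):
# log-score declines with age within cohort, as the paper finds.
true_age = np.array([0.0, -0.02, -0.05, -0.09, -0.13])
true_cohort = np.array([0.0, 0.01, 0.01, 0.02, 0.02])

rows, y = [], []
for year in (1994, 2003):
    # Cohort 0 (the 1983 Cohort) is only observed in 2003; in 1994 each
    # cohort sits one age bracket younger than in 2003.
    for c in (range(1, 5) if year == 1994 else range(5)):
        a = c - 1 if year == 1994 else c
        for _ in range(800):
            rows.append((c, a))
            y.append(np.log(290) + true_cohort[c] + true_age[a]
                     + rng.normal(0, 0.15))

rows, y = np.array(rows), np.array(y)

# Design matrix: constant + cohort dummies (base: 1983 Cohort)
#                         + age dummies (base: 16 to 25).
X = np.column_stack(
    [np.ones(len(y))]
    + [(rows[:, 0] == c).astype(float) for c in range(1, 5)]
    + [(rows[:, 1] == a).astype(float) for a in range(1, 5)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("age effects:", beta[5:].round(3))   # should be near true_age[1:]
```

With only one cross-section, the cohort and age dummies would be perfectly collinear and this regression could not be run; pooling the two years is what breaks the collinearity.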
Thus, these results point to formal schooling being particularly important in literacy skill formation, with some negative impact of having a very low educated parent, but with parental impact being otherwise much smaller than that of schools. Our main interest in this table is in the cohort and ageing effects. The coefficients on the cohort dummy variables rise to some extent with the cohort number (which would fit with older cohorts having higher average literacy levels), but these effects are neither large nor statistically significant at the 5% level. This is perhaps not surprising since the figures we examined in previous sections indicate that cross-cohort differences tend to be focussed in the top end of the distribution rather than being pervasive across the distribution, implying that mean literacy scores (which is what is being examined in these regressions) will move much less than the tails of the distribution. On the other hand, a negative ageing effect is evident in a form that indicates a relatively constant loss of literacy skills with age. These age effects are economically substantial (with 46 to 55 year olds having 9% lower document literacy than 16 to 25 year olds) and statistically significant at any conventional significance level. This fits with the robust picture of declining literacy with age within cohorts seen in the earlier figures. The second column in Table 16 shows results from the same regression specification but using the log of prose literacy as the dependent variable. The results are broadly the same as those obtained using document literacy except that most effects are somewhat smaller, fitting with earlier results showing that prose literacy tends to have less variation across a number of dimensions. The other main difference is that the gender effect is larger and statistically significant, implying that females have somewhat higher prose literacy even though there is no difference in document literacy across the genders.
The cohort effect estimates do not point to any consistent cross-cohort patterns, while the age effect coefficients again point to declining literacy with age, though with smaller declines than we witnessed for document literacy. In the third and fourth columns of the Table, we present results from a re-estimation of the regressions with the inclusion of occupation dummy variables. The main patterns in our estimates are not changed by the inclusion of these variables. Cohort effects are again muted and there is still strong evidence of a decline in literacy with age. This implies that these patterns are not due to shifts in occupational composition across cohorts or with age. Thus, explanations for the types of patterns we are observing lie elsewhere: with factors such as school quality for the cohort effects and with factors operating within occupations over time for the ageing effects. We are interested, in part, in how these results vary with education. To investigate this, we estimated an additional specification in which we fully interacted the cohort dummy variables with the education variables, allowing different cohort effects for each education group. The results from that estimation are presented in Table 17. The effects of other variables, such as parental education, are not affected by the inclusion of these interactions, so we do not report their coefficients, for the sake of brevity. The first column contains results for document literacy. The simple cohort dummy coefficients in this table correspond to the cohort effects for the base education group (high school drop-outs). For that group, the cohort effects are negative and, at times, statistically significant, implying that the current cohort of Youth have higher average document literacy than earlier cohorts of drop-outs. This fits with the improvement (particularly at the bottom of the distribution) that we saw in the kernel density plots for drop-outs in the previous section.
It seems very likely that this reflects rising years of education among drop-outs. Simple tabulations from the 2001 Census show that of those with 10 or fewer years of education, 72% of 20 to 24 year olds had 9 or 10 years of education (with the remainder having completed fewer years of school) but only 60% of those aged 45 to 54 had 9 or 10 years of schooling. The next set of coefficients shows the difference in the cohort effect for the second cohort for the other education groups relative to drop-outs. These coefficients should be added to the simple 1974 Cohort effect and to the coefficient associated with their specific education level (e.g., the coefficient on the simple hs dummy, which shows the literacy of a high school graduate relative to a high school drop-out in the first cohort) to get the total effect for the 1974 Cohort for each of the education groups. The fact that this total effect is positive for all three education groups indicates that, in contrast to drop-outs, youth literacy in these other education categories is lower than the literacy level of the cohort that directly precedes the current youth. This is even more evident as we move to examining earlier cohorts. Thus, these results imply that the near zero cohort effects we witnessed in Table 16 are actually due to offsetting negative effects for drop-outs combined with positive effects for the other education levels. The age effects continue to show strongly declining literacy with age. The results in the second column of Table 17 show that much the same set of patterns exists for prose literacy. Once again, the results imply that the current cohort of youth drop-outs are better than previous cohorts of drop-outs in their average literacy but the reverse is true for all other education groups.
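The mean-regression specifications discussed above can be illustrated with a minimal sketch. The data are simulated and all variable codings are hypothetical; the actual estimation used survey-weighted IALS microdata with additional controls (own and parental education, gender, and so on).

```python
import numpy as np

# Minimal sketch (simulated data, hypothetical codings) of the Table 16
# mean regression: log document literacy regressed on cohort dummies and
# age-group dummies, with current youth (the 1983 Cohort, aged 16-25) as
# the omitted base category.
rng = np.random.default_rng(0)
n = 400
cohort = rng.integers(0, 4, n)  # 0 = 1983, 1 = 1974, 2 = 1964, 3 = 1954
age = rng.integers(0, 4, n)     # 0 = 16-25, 1 = 26-35, 2 = 36-45, 3 = 46-55
score = np.clip(rng.normal(280, 50, n), 100, 500)

def dummies(idx, k):
    """One-hot columns for categories 1..k-1 (category 0 is the base)."""
    return np.column_stack([(idx == j).astype(float) for j in range(1, k)])

# Design matrix: constant, three cohort dummies, three age dummies.
# The Table 17 variant would add cohort-by-education interaction columns.
X = np.column_stack([np.ones(n), dummies(cohort, 4), dummies(age, 4)])
y = np.log(score)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta[1:4]: cohort effects relative to current youth;
# beta[4:7]: ageing effects, e.g. beta[6] compares 46-55 to 16-25 year olds.
print(np.round(beta, 3))
```

Because the base categories are current youth and the 16-25 age group, each coefficient reads directly as a log-point (approximately percentage) gap relative to current youth.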
Because much of what we observed in terms of cross-cohort and ageing effects in the figures in the last section was unevenly distributed across the distribution, we turn next to investigations using quantile regressions. In particular, we present results from quantile regressions run for the 10th, 50th and 90th quantiles. This allows us to see what is happening in each tail of the distribution as well as in the middle. It is important to clarify the interpretation of coefficients in these regressions. The coefficient on, for example, the 1974 Cohort dummy variable in the quantile regression corresponding to the 10th percentile shows the difference in the 10th percentiles between the 1974 Cohort and the 1983 Cohort (the base group), holding constant the effects of all other covariates in the regression.12 In Table 18, we present the estimated coefficients on the cohort, age, education, gender and parental education variables for the 10th, 50th and 90th quantile regressions for the log of document literacy.13 The patterns in the 10th percentile quantile regression point to effects of the respondent's own education that are stronger than was observed in the mean regression. Moreover, while literacy declines as both age and cohort number increase, the effects are generally small and statistically insignificant. The only exception to this is for the oldest (age 56-65) age group, whose 10th percentile is much lower than that of the other age groups. Overall, though, the conclusion from this regression is that the lower tail of the document literacy distribution is 12 Standard errors reported in this table and all other quantile regression tables are bootstrap standard errors. We performed our estimation in Stata. However, Stata does not allow both weighting and bootstrap standard errors. To get around this, we first created a weighted "fake" dataset in which we turned each observation into multiple observations according to its weight.
For example, an observation with a weight of 2.8 is viewed as representing 3 actual observations (after rounding) so we simply replicated this observation twice more so that it is reflected a total of 3 times in our final "fake" dataset. It is worth noting that we first normalize the weights so they sum to the actual sample size. This way we do not create a dataset that seems much larger (and thus yields more precision) than the actual dataset. Once we have created this weighted dataset, we can use it in all of our estimations along with the bootstrap command. 13 We do not include region or parental immigration status variables in order to simplify the exposition. Their inclusion does not alter our main results. relatively constant across cohorts and relatively immune to ageing effects. In contrast to what is observed for the 10th percentile, the third column shows that the median reflects both cohort and age effects that are statistically significant. The cohort effects, though, are not large, with the 1954 Cohort having a median document literacy score that is approximately 4% higher than that of current youth (the 1983 Cohort). The ageing effects are similar in magnitude to what was observed in the mean regressions: a negative, relatively constantly declining profile. It is in the 90th percentile results that we see truly sizeable effects. Again, the results imply that the current cohort of youth has the lowest 90th percentile of all the cohort literacy distributions. The 1954 Cohort (the cohort which was 36-45 in 1994 and, thus, would themselves have been youth in the early 1970s) has a 90th percentile for its literacy distribution which is 14% above that of current youth, holding constant education, gender and parental education. Similarly, the ageing effects are much stronger at the 90th percentile than what we observed at the median and in the lower tail.
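The reweighting workaround described in the footnote can be sketched as follows. The data here are simulated and the original estimation was carried out in Stata; this is an illustration of the replication logic only.

```python
import numpy as np

# Sketch of the footnote's weighting workaround for bootstrap standard
# errors: normalize the survey weights to sum to the actual sample size,
# then replicate each observation round(weight) times to build a
# self-weighted "fake" dataset.
rng = np.random.default_rng(1)
scores = rng.normal(280, 50, 200)     # simulated literacy scores
weights = rng.uniform(0.5, 4.0, 200)  # simulated survey weights

# Normalizing first keeps the replicated dataset close to the true
# sample size, avoiding spurious precision in the bootstrap.
w = weights * len(weights) / weights.sum()

# An observation with (normalized) weight 2.8 represents 3 actual
# observations after rounding, so it appears 3 times in the fake data.
reps = np.rint(w).astype(int)
fake = np.repeat(scores, reps)

print(len(fake), round(float(fake.mean()), 1))
```

The replicated dataset can then be fed to an unweighted bootstrap routine, since the weighting is now built into the row counts.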
The implication is that whatever is different across cohorts, it is focussed on the creation of very top level literacy skills. Similarly, the ageing process seems largely to have to do with losing higher level literacy skills. It is worth noting, on the other hand, that the effects of education decline strongly across the distribution: being a high school drop-out is associated with much lower minimum levels of literacy than what is observed for high school graduates but has a smaller (though still substantial) effect at the top end of the distribution. Put another way, the lowest literacy values observed in the distribution for the university educated are well above those for drop-outs while the highest literacy values observed for the university educated exceed those for the less educated groups but not by as much. Table 19 contains the estimates from quantile regressions for the log of prose literacy. The results at the 10th percentile differ from those for document literacy in that they show clearer evidence of cohort effects, with those effects again pointing to current youth having the lowest literacy values. As with document literacy, the main ageing effect on this part of the distribution is a substantially lower value for the oldest age group we consider. As we have seen in several situations, the estimated effects on the median are similar in pattern to those for document literacy but are somewhat smaller in magnitude. The same is true at the 90th percentile. Prose literacy scores show the same pattern of a substantially lower 90th percentile for current youth relative to earlier cohorts and the 90th percentile shows a sharply declining pattern with respect to age. These effects are slightly smaller than what we observe for document literacy but are still substantial and still point to significant concern about what has been happening in the upper literacy ranges.
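The way these quantile-regression coefficients are read can be made concrete with a stripped-down example. In the special case with no covariates, the cohort coefficient at a given quantile reduces to the simple gap between that cohort's percentile and the base group's percentile; the paper's estimates additionally hold education, gender and parental education constant. Scores here are simulated.

```python
import numpy as np

# Illustration only: with a single group dummy and no other covariates,
# the quantile-regression cohort coefficient equals the gap in sample
# percentiles between the cohort and the base group (the 1983 Cohort).
rng = np.random.default_rng(2)
youth = rng.normal(270, 55, 1000)  # stand-in for the 1983 Cohort
older = rng.normal(285, 45, 1000)  # stand-in for an earlier cohort

for q in (10, 50, 90):
    gap = np.percentile(older, q) - np.percentile(youth, q)
    print(f"q{q}: cohort gap of about {gap:.1f} points")
```

Because the simulated older cohort has a higher mean but lower dispersion, the gap is largest at the 10th percentile and smallest at the 90th, showing how effects can differ sharply across quantiles even within one comparison.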
We consider the role of education more closely by re-estimating our quantile regression specifications for two education sub-groups: those whose highest education level is high school graduation or less (combining the high school drop-outs and the high school graduates in order to obtain more substantial sample sizes), and the university educated. We present results from the 10th, 50th and 90th percentile quantile regressions for document literacy for both groups in Table 20. The results at the 10th percentile for the lower educated group point to the current youth cohort (the 1983 Cohort) having the highest scores at this point in the distribution, though most of the estimated cohort effects are not statistically significant. The ageing pattern at the 10th percentile is neither consistent nor statistically significant. At the median, there is no consistent cohort pattern discernible but there is a clear pattern of declining literacy with age. At the 90th percentile, on the other hand, we observe a strong cross-cohort pattern with the current youth having the lowest values in this part of the distribution. We again see evidence of a substantial decline in literacy with age at the top of the distribution. Thus, even the lowest educated face a deterioration with age, and a decline across successive cohorts, in the literacy skills that are, for them, relatively high. For the university educated, on the other hand, the patterns of declining literacy scores with age and across cohorts are evident in both tails of the distribution and are quite strong. In Table 21, we repeat the education break-down exercise using prose literacy. The main patterns are similar to those observed for document literacy. The main differences are that cohort and ageing patterns are not in evidence even up to the median for the low educated group and that, for the university educated, the estimates point to the 26-35 age group having the highest literacy level.
The latter would fit with many of the youngest (16-25) age group not having completed their education and, as a result, still having increases in literacy yet to come. In general, the patterns evident in the prose literacy results are more muted than with document literacy, though this is not always the case. The results from the various regression exercises point to several key conclusions that back up what we observed in the figures in the previous section. First, much of the movement in the distributions, both across cohorts and with age, occurs in the upper tails of the literacy distributions. In contrast, the lowest levels of literacy are relatively constant across cohorts and age groups, though there is some (relatively weak) evidence of improvements in the lower tail of the distribution for current youth relative to earlier cohorts. Second, the cohort pattern that emerges in the upper tail (and to some extent in the middle of the distribution) is one in which literacy levels are lowest for current youth and are increasing the farther back in terms of cohorts we go. Third, the age pattern that emerges both in the upper tail and in the middle of the distribution is one of strongly declining literacy with age. Fourth, the declines with age and across cohorts are particularly strong for more educated individuals. This fits with the fact that we observe these movements mainly in the upper tail of the distribution when we do not control for education. However, we also observe these trends in the upper tails of the distributions for the least educated. For the latter group, there is some evidence of improvements in the minimum literacy scores observed across successive cohorts, with the current youth having better low-end literacy scores than previous generations. Fifth, education is a key determinant of literacy skills and its importance is particularly strong in raising minimum literacy levels (as opposed to generating very high literacy values).
This is not meant to imply that formal schooling does not have an impact in generating high literacy values - far from it - but its strongest effects are in raising minimum levels. Sixth, parental education has an impact on literacy but mainly in the form of worse literacy outcomes for those whose parents are high school drop-outs. Differences across individuals whose parents are high school graduates, post-secondary graduates and university graduates are small. The impact of parental education, in general, is much smaller than that of the respondent's own education. Unfortunately, in these surveys, we do not know anything about reading and literacy in the home when the respondent was young, so we cannot make definitive statements about the role of home versus school in literacy generation, but comparing the relatively small impacts of parental education to the sizeable impacts of own education points toward a conclusion that formal schooling is the most important venue for literacy generation. Seventh, the patterns we observe tend to be more muted for prose than document literacy. This may be because prose literacy is more commonly used - that it is, in some sense, a more basic skill. As a result, it tends to deteriorate less with age and has less variation across cohorts. Overall, the picture that emerges is one in which literacy skills are generated in school but higher end literacy skills start to deteriorate as soon as people leave school. Further, current youth are not suffering minimum literacy levels as low as those of previous cohorts but are also not attaining top end literacy levels as high as those of previous cohorts. Thus, current youth can be characterized as having relatively lower levels of top-end literacy compared to previous generations and there is every reason to expect their literacy levels will decline from here, following the standard ageing pattern observed in earlier generations.
6) Comparisons to Other Countries

Another potential benchmark for the literacy of Canada's youth is the literacy of youth in other countries. We use Norway and the US as benchmarks in this paper. We chose these countries because in both cases we have access to consistent data from both rounds of the IALS, allowing us to make the same kind of cohort comparisons we carry out for Canada. In addition, Norway provides an interesting benchmark because the Nordic countries tend to perform well in these literacy comparisons. Thus, a comparison with Norway allows us to see how well Canada compares to the "gold standard" in terms of what is attainable. The US is interesting since it is the closest competitor for our workforce. The Norwegian and US samples are smaller than those for Canada, with 2,522 observations in our usable 2003 sample for Norway and 1,486 in our 2003 sample for the US. In Table 22, we recreate our statistics for document literacy from our overall Canadian youth sample along with the same statistics from the 2003 Norwegian and American samples. It is worth re-iterating that while the IALS surveys were explicitly designed to allow direct comparability of literacy levels across countries and over time, there is reason to be cautious, particularly in the comparisons with Norway. As we expected, the Norwegian document literacy distribution for youth in 2003 dominates the youth distribution for Canada in the same year. The 5th percentile is 15 points higher in the Norwegian than the Canadian distribution and both the 50th and 95th percentiles are at least 12 points higher. The extent of inequality in literacy among youth is also somewhat lower in the Norwegian sample, with a ratio of the 95th to the 5th percentile of 1.64 for Norway and 1.69 for Canada. In comparison, the 5th percentile of the US youth document literacy distribution is 9 points worse and the median is 14 points worse than that for Canada.
On the other hand, the 95th percentiles for Canada and the US only differ by 3 points. Thus, Canada's performance is comparable to the US at upper literacy levels but superior in the lower half of the distribution. The obvious implication is that inequality in literacy is lower in Canada than in the US. A reasonable summary of this evidence is that Canada sits between Norway, which has both superior literacy levels and less inequality in literacy, and the US, which has generally lower literacy levels and higher literacy inequality. This fits with evidence on literacy levels and inequality across a set of countries in Willms(1998). Inequality in literacy is important both because it will lead to inequality in other outcomes such as earnings and health (Green and Riddell(2006)) and because literacy is valuable in its own right. Sen argues that literacy is a key determinant of full social inclusion and thus something we should focus on for its own sake (Sen(1999)). In that case, greater inequality in literacy, and particularly low minimum levels of literacy, should be viewed as bad. Indeed, while some argument can be made for positive effects from earnings inequality stemming from incentive effects, no such argument can be made about inequality in literacy. Inequality in literacy is simply bad. In this sense, Canada is doing better than the US but has much to learn from Norway. In Table 23, we repeat the cross-country comparison for youth using prose rather than document literacy. The comparison of Canada with Norway for prose literacy again points to superior literacy in Norway. Interestingly, though, while Norway dominates Canada in the lower end of the literacy distribution to roughly the same extent we observed in document literacy, the 95th percentile of the Canadian distribution is only 3 points below that for the Norwegian distribution.
Once again, this points to greater inequality in the Canadian distribution but it at least indicates that Canadian youth have, at the top end, comparable prose literacy relative to a country that is known to perform well in literacy. Relative to the US, the Canadian prose literacy distribution is again dominant in the lower half of the distribution but the two have similar 95th percentiles. Thus, all three countries attain similar levels of prose literacy at the top end of the distribution but quite different levels at the low end. Finally, in Table 24, we repeat the comparisons for numeracy scores. Once again, the ordering runs from Norway as the best to the US as the worst in terms of literacy scores at the bottom of the distribution. However, all three countries have identical scores at the 95th percentile of the distribution, pointing to the Norwegian superiority in reducing inequality in literacy scores. The fact that Canada performs as well as Norway - a known high-achieving country in literacy - on numeracy in the top half of the distribution is heartening. Note that we do not present results for problem solving because the US data does not include scores for this test and the Norwegian problem solving scores take on odd values.

6.1) Norway

We turn next to a closer examination of outcomes in the two comparison countries, with a particular emphasis on the question of whether the cross-cohort patterns and ageing patterns are similar to those we observed for Canada. In this subsection, we examine the results from Norway. Recall that we can only examine prose and document literacy in the cross-cohort comparisons because only those scores are comparable across surveys. For Norway, we have the 2003 survey plus an earlier survey from 1998 to use in our comparisons. We will only present figures for prose literacy to save on space since the document and prose outcomes are similar. We will, however, present tables based on both measures.
Figure 21 contains the kernel density plots for 16 to 25 year olds in 1998 and 2003, allowing a cross-cohort comparison between the current youth and youth from approximately a half cohort earlier. Underneath the figure, we present a summary table showing characteristics of each distribution. The two distributions are essentially identical, implying no substantial, cross-cohort changes. Figure 22 plots the two distributions corresponding to a comparison of what we have called the 1974 Cohort with the 1964 Cohort. In this case, the newer cohort (who are 26 to 35 in 2003) have a similar distribution in the lower tail but a superior distribution in the upper tail (or, in other words, people in the newer cohort have a greater probability of attaining high literacy scores). The same is true, though to a smaller extent, in a comparison of the 1964 Cohort with the 1954 Cohort in Figure 23. Thus, these plots suggest that Norwegian literacy has been moving in the opposite direction relative to Canada, with improvements rather than declines across successive cohorts. As in the Canadian data, the changes are occurring mainly in the upper tail of the distribution. Given that the upper tails of the current Canadian and Norwegian distributions of prose literacy for youth are similar, the implication is that Canada had superior prose literacy distributions to Norway in the past but the two have since converged. If these patterns continue, Canada would be projected to fall behind at the top end of the distribution (as it already is in the lower end) within a generation or two. In Figure 24, we plot the prose literacy densities for individuals in the 16 to 25 year age group in 1998 both in 1998 and 5 years later. This allows us to see the impact of ageing on literacy for a given cohort, though we are only able to follow them for 5 years rather than the 9 that was possible for Canadians.
The figure and the table of statistics beneath it indicate that there is actually an improvement over time for this cohort, with the largest improvements at the bottom of the distribution. Particularly since we only follow this group for 5 years, we may just be picking up the direct effects of ongoing schooling for this group. Thus, the next figure (25), which shows the ageing effects for the people who were age 26 to 35 in 1998, is more likely to reveal pure ageing effects. This figure shows little change at the very bottom of the distribution but improvements with age over the rest. Following those who were age 36 to 45 in 1998 (Figure 26) shows a decline in literacy with ageing at the bottom of the distribution but no effect at the top. Overall, these pictures suggest the opposite pattern to that observed in the Canadian data: literacy levels that are either relatively unchanging or generally increasing with age. To examine these implications more systematically, in Table 25, we present the estimated coefficients from our standard OLS specification plus quantile regressions for the 10th, 50th and 90th percentiles. The cohort effects are generally insignificant and small in these regressions. On the other hand, the age effects reveal a strongly declining pattern with age across the whole distribution. Indeed, they show a pattern similar in magnitude to that seen in the Canadian data. The fact that the estimation and the plotted figures show such different results stems from the short time between cross-sectional observations. Because these observations are only 5 years apart, a person who is 16 to 25 in the first dataset might still be in the same category in the second dataset. In this case, when we simply plot literacy distributions for 16 to 25 year olds in 1998 and in 2003, the latter group will include some people from the initial cohort (i.e., the people who are 16 to 25 in 1998).
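The overlap problem just described can be made concrete with a small worked example; birth years are treated as exact here for illustration, abstracting from survey timing within the year.

```python
# Why the plotted 1998 and 2003 youth densities mix cohorts: with the
# Norwegian surveys only 5 years apart, anyone aged 16 to 20 in 1998 is
# still inside the 16-25 band in 2003.
cohort_1998 = set(range(1973, 1983))  # born 1973-82: aged 16-25 in 1998
band_2003 = set(range(1978, 1988))    # born 1978-87: aged 16-25 in 2003
overlap = cohort_1998 & band_2003

print(sorted(overlap))                # birth years counted in both plots
print(len(overlap) / len(band_2003))  # half of the 2003 "youth" density
```

Half of the birth years in the 2003 youth band already belong to the 1998 youth cohort, which is why defining cohorts by age group in 1998 (as in the regressions) separates cohort and ageing effects more cleanly than comparing the two plotted age bands.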
When we estimate, on the other hand, we are able to be precise about who belongs to which cohort since we define cohorts based on age groups in 1998. This means that both our cohort and ageing effects are more accurate in the regression analysis. Thus, we conclude that Norway does not show any consistent pattern in literacy across cohorts but does show a strong decline in literacy with age, much like in the Canadian data. Similarly, education shows effects that are similar in magnitude to those observed for Canada and which, again, decline across the literacy distribution. The main difference relative to Canada in this dimension is that the penalties in literacy to having less than a high school education are smaller in Norway. This is likely what is behind the superior left-hand tail of the Norwegian prose literacy distribution relative to that for Canada. Table 26 presents the same types of results but for document literacy. The pattern of results is very similar to that for prose literacy but, as we often observed in the Canadian data, the estimated effects are typically larger in magnitude than those for prose literacy.

6.2) United States

Next, we repeat our analysis of literacy data for the US samples. Figure 27 plots the kernel densities for prose literacy for 16 to 25 year olds in the 1994 and 2003 data.14 Thus, this depiction of the differences between the 1984 Cohort (current youth) and the 1974 Cohort is very similar to what we created from the Canadian data. As in the Canadian data, this figure shows a deterioration of literacy in the upper half of the distribution between the two cohorts. The same is true in the comparison of the 1974 Cohort and the 1964 Cohort in Figure 28: the younger cohort (the 1974 Cohort in this case) has a superior literacy distribution. This is particularly the case at the top end of the distribution, with the 95th percentile of the 1974 Cohort distribution being 12 points higher than that for the 1964 Cohort.
This same pattern is even clearer in the comparison of the 1964 and 1954 Cohorts in Figure 29. The 95th percentile of the 1964 Cohort distribution is nearly 30 points lower than that in the 1954 Cohort distribution. Figures 30 through 32 show the impact of ageing on prose literacy for the 1974, 1964 and 1954 Cohorts, respectively. Figure 30 shows some deterioration in literacy scores in the middle of the distribution as the 1974 Cohort ages from 16-24 in 1994 to 25-33 in 2003. For the 1964 Cohort, ageing 9 years is associated with declines in all percentiles above the 10th, with their 95th percentile being nearly 20 points lower in 2003 than in 1994. A similar ageing pattern is evident for the 1954 Cohort, except that the declines at the very top of the distribution are even greater. We, again, confirm these patterns using OLS and quantile regressions, presented in Table 27 for prose literacy and Table 28 for document literacy. The prose literacy regressions show the same kind of pattern that we have seen for the Canadian data and, to some extent, the Norwegian data. 14 Note that the age groupings in the US data dictate a slightly different set of groups than in the Canadian data (i.e., the youngest age group is 16 to 24 rather than 16 to 25). In particular, the estimated coefficients point to the current cohort of youth having the lowest literacy level, with this pattern being particularly strong at the top of the distribution. The results also match those from the Canadian data in showing strong declines with age that are, again, strongest at the top of the distribution. The main difference relative to the Canadian results is that these cohort and ageing patterns are evident even at the bottom of the distribution for the US, which was not the case for Canada. This is not the case, though, for the US document literacy results in Table 28, where the 10th percentile quantile regression does not show a consistent pattern either across cohorts or with age.
Interestingly, the patterns at the 90th percentile are also not as strong for US document literacy. Finally, education effects are important for the US, having similar magnitudes to what is observed in the Canadian data. Overall, comparisons to both Norway and the US indicate that the ageing pattern we identified for Canada is also present in countries with quite different literacy levels. In all three countries, literacy declines strongly with age for any given cohort. This may indicate that literacy has a "use it or lose it" nature in all of these economies. The countries are also similar in the importance of formal schooling for generating literacy and in the fact that schooling plays a particularly strong role in raising minimum literacy levels. In contrast, there are differences across countries in the cohort patterns of literacy. Both the US and Canada have experienced a pattern of declining literacy across successive generations, with particularly strong declines at the top end of the literacy distributions. Norway, on the other hand, has not experienced any clear pattern: recent cohorts have all attained similar literacy levels (conditional on their schooling levels and the schooling levels of their parents). The fact that the ageing patterns are similar across the three economies suggests that the impact of post-schooling institutions is similar in all three. That is, none of the countries has established a superior system in terms of maintaining post-schooling literacy levels. Cohort effects, on the other hand, are related to "permanent" differences associated with people who were born and went through schooling at different times. Differences in cohort patterns are thus reflections of institutions which have persistent effects on literacy, with differences in the efficacy of formal schooling being the most likely candidate.
Under this interpretation, Norway is not only doing something better with its schooling (in that it is generating both higher overall literacy levels and less literacy inequality), it has maintained its schooling effectiveness over time. In contrast, cohort patterns in Canada and the US may indicate a reduction in the efficacy of schooling over time, particularly in terms of generating high end literacy.

7) Exploring Cohort Effects: Why Have Current Canadian Youth Fallen Behind in Literacy?

We turn, next, to trying to understand the emerging differences in literacy across cohorts in Canada. As a first step, we examine differences in literacy correlates between generations in the IALS03. The survey includes questions about literacy use at work. The literacy use at work questions ask about the frequency of performing reading, writing and mathematical tasks. Thus, for reading, questions are asked about 5 tasks and there are also questions on 5 writing tasks and 5 math tasks. We construct dummy variables equalling 1 if the individual responded that he or she performed 4 or 5 of the reading related tasks at least once a week and similar variables for the writing and math tasks. We also constructed dummy variables corresponding to performing one to three of the tasks at least once a week for each of reading, writing and math. Finally, we constructed a dummy variable corresponding to individuals who answered that they performed all of the tasks in a given area (e.g., reading) "rarely". In Table 29, we show the proportions in each of these categories for the 1983 Cohort (those aged 16-25) and the 1974 Cohort (those aged 26-35) (note there is an omitted category which includes all other possible responses).
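The dummy-variable construction just described can be sketched as follows for one skill area. The responses are simulated and the frequency coding is hypothetical; the real IALS03 item names and response scales differ.

```python
import numpy as np

# Sketch of the literacy-use-at-work dummies for one area (reading):
# 5 tasks per area; freq[i, j] codes how often respondent i performs
# task j: 0 = "rarely", 1 = sometimes, 2 = at least once a week.
rng = np.random.default_rng(3)
n = 100
freq = rng.integers(0, 3, size=(n, 5))

n_weekly = (freq == 2).sum(axis=1)                      # tasks done weekly
d_frequent = (n_weekly >= 4).astype(int)                # 4 or 5 tasks weekly
d_some = ((n_weekly >= 1) & (n_weekly <= 3)).astype(int)  # 1-3 tasks weekly
d_rarely = (freq == 0).all(axis=1).astype(int)          # all 5 tasks "rarely"

# The omitted category collects everyone else (e.g. no weekly tasks but
# not all tasks answered "rarely").
print(d_frequent.mean(), d_some.mean(), d_rarely.mean())
```

By construction the three dummies are mutually exclusive, so their cohort-by-cohort means (as in Table 29) partition respondents into frequency-of-use categories plus the omitted remainder.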
For this part of the table, we restrict our attention to individuals who are either high school drop-outs or high school graduates; otherwise the results will naturally favour the 1974 Cohort since all the university graduates in that cohort will be at work while at least some of the university graduates in the 1983 Cohort will still be in school. This would make it appear that the 1974 Cohort has more university-related jobs. The patterns of literacy and numeracy use at work are extremely different between the two groups. In particular, the 1974 Cohort is over twice as likely to claim that they use reading skills frequently at work and is, similarly, much more likely to claim that they use writing and math skills in their work.15 The 1974 Cohort is also more likely to read a newspaper or a book at least once a week and less likely to agree with the statement that they read only when they have to. Thus, the 1974 Cohort appears to use literacy related skills much more often both on and off the job (though, interestingly, their tendency to watch TV is not very different from the 1983 Cohort). Whether this is a result of their higher literacy levels or a cause of literacy differentials between the two cohorts cannot be discerned from this data. The differences we observe may, also, be a function of ageing rather than true differences across cohorts (i.e., people may read more as they move beyond the youth years, and we are observing the 1983 Cohort in the youth years and the 15 As a side point, note that people are generally more likely to claim they use reading skills on their job than the other two types of literacy/numeracy skills. This fits with our conjectures, earlier, that prose literacy tends to vary less across cohorts and age groups because it is a more commonly used type of literacy. 1974 Cohort in a later age range). Unfortunately, differences in questions between the IALS94 and IALS03 make it difficult to compare the two cohorts at the same age.
8) Conclusions

In this paper, we use data from the International Adult Literacy Surveys (IALS) for pairs of years for Canada, Norway and the US. Our focus is on the literacy levels and extent of literacy inequality among Canadian youth (individuals aged 16 to 25). We find that Canada’s current youth have generally lower literacy levels than previous generations of Canadians. More precisely, the probability that current Canadian youth suffer low levels of literacy is either no different from or slightly lower than for previous generations. However, the probability that they attain high levels of literacy is decidedly lower than for previous generations, and this disparity increases as we move higher in the literacy distribution. This relatively inferior performance seems to us to be a cause for concern. A second key conclusion is that literacy as measured on these tests declines with age after leaving school. This may reflect a “use it or lose it” model of literacy in which literacy skills obtained during school atrophy with lack of use after leaving school. Importantly, this implies that if current youth are at relatively low levels of literacy today, they are only going to move to even lower levels over time. In terms of international comparisons, Canada falls about midway between Norway and the US both in terms of literacy levels and the extent of inequality in their literacy distributions. Thus, there is potentially much to learn from the Norwegians, but we do appear to have an advantage over the Americans. Interestingly, all three countries show the same pattern of literacy loss with age. Thus, whatever Norway is doing better, it seems not to have to do with institutions and opportunities associated with maintaining literacy levels after leaving school. Or, to put it in the current policy vernacular, there is no reason to think, based on literacy test scores, that Norway is better at “lifelong learning” than Canada.
In terms of cross-cohort patterns, the US shows much the same pattern as Canada, while the Norwegian data do not show any particular pattern of differences across cohorts. Thus, whatever Norway is doing right, it has been doing it consistently for some time. Both Canada and the US, on the other hand, appear to face a growing problem with each successive generation. Taken together, comparisons across generations within Canada indicate that we are at least doing no worse, and may be improving our performance, in terms of raising literacy levels at the low end. That is, literacy policies aimed at basic literacy seem to be working to some extent. However, a comparison with Norway indicates that we can still do much better in this regard. It is at the other end - the top - that we see real declines for current youth relative to earlier generations. This is something that would need to be addressed with a different type of policy. In particular, it focuses attention on the efficacy of post-secondary education. The next step in an investigation of these patterns would be to use the fact that we know the province where individuals lived during their high school years in the IALS to try to relate patterns to differences in education policy across provinces and over time. Thus, we could investigate whether fiscal problems in various provinces in the 1990s are at the heart of the cross-cohort declines. On the face of it, this is unlikely to provide the main explanation, since the declines we observe are evident for the cohort which attended school in the 1980s relative to the previous cohorts, as well as for the most recent cohort relative to all previous cohorts. Whether the decline reflects a greater presence of ESL students and other students requiring extra help in the classroom, changes in educational philosophy toward a focus on helping the weakest students, or some other change will be interesting to investigate but is beyond the scope of the current data.
References

Blum, A., H. Goldstein, and F. Guerin-Pace (2001). “International Adult Literacy Survey (IALS): An Analysis of International Comparisons of Adult Literacy,” Assessment in Education, 8(2), pp. 225-246.

Bownikowska, A. (2007). “Explaining the Education Gap Between Children of Immigrants and the Native Born: Allocation of Human Capital Investments in Immigrant Families,” Department of Economics, University of British Columbia.

Buchinsky, M. (1997). “Recent Advances in Quantile Regression Models: A Practical Guideline for Empirical Research,” The Journal of Human Resources, 33(1), pp. 88-126.

Canada. Statistics Canada. Census of Canada, 2001: “Total, Average and Median Years of Schooling.” Ottawa, Ont.: Statistics Canada (97F0017XCB2001008).

Crompton, S. (1996). “The Marginally Literate Workforce,” Perspectives on Labour and Income, Statistics Canada, Summer.

Ferrer, A., D.A. Green and W.C. Riddell (2006). “The Effect of Literacy on Immigrant Earnings,” Journal of Human Resources, Spring, pp. 380-410.

Green, D.A. and W.C. Riddell (2003). “Literacy and Earnings: An Investigation of the Interaction of Cognitive and Unobserved Skills in Earnings Generation,” Labour Economics, 10 (April), pp. 165-84.

Green, D.A. and W.C. Riddell (2006). “Literacy and the Labour Market: The Generation of Literacy and its Impact on Earnings,” Report prepared for Statistics Canada.

Osberg, L. (2000). “Schooling, Literacy and Individual Earnings,” Statistics Canada, Catalogue no. 89F0120XIE.

Sen, A. (1999). Development as Freedom. New York: Anchor Books.

Tuijnman, A. and E. Boudard (2001). “Adult Education Participation in North America: International Perspectives,” Adult Education and Literacy Monograph Series, Ottawa.

Willms, D. (1998). “Inequalities in Literacy Skills Among Youth in Canada and in the United States,” Statistics Canada.

Table 1.
All Youth, 2003: Distributions of Literacy Scores

                        Prose      Document   Numeracy   Problem solving
5th percentile          210.9077   208.9329   199.0734   210.6055
25th percentile         258.5269   265.1976   248.7537   258.0384
50th percentile         291.7701   292.7859   283.012    287.251
75th percentile         318.5801   320.1128   312.5327   312.0545
95th percentile         349.4128   353.5218   349.9275   343.921
Mean                    287.1755   290.3056   279.1742   284.0523
Standard deviation      42.96934   42.71123   46.87337   40.7942
Number of observations  3574       3574       3574       3574

Table 2. All Youth not currently in school, 2003: Distributions of Literacy Scores

                        Prose      Document   Numeracy   Problem solving
5th percentile          211.7186   212.5642   195.9011   209.2827
25th percentile         257.4533   265.1878   248.0647   257.4167
50th percentile         291.8838   291.9083   281.2377   285.5068
75th percentile         317.1489   319.575    310.5335   311.6965
95th percentile         348.2553   351.1317   345.2617   342.6533
Mean                    286.5293   289.7571   277.2776   283.1576
Standard deviation      42.72131   42.31721   46.9206    40.9399
Number of observations  2384       2384       2384       2384

Table 3. Immigrant Youth whose age at immigration was over 10, 2003: Distributions of Literacy Scores

                        Prose      Document   Numeracy   Problem solving
5th percentile          175.0877   176.6836   146.4709   177.3325
25th percentile         217.891    228.2136   217.4049   226.5175
50th percentile         255.0026   268.5102   255.9726   253.8476
75th percentile         292.7519   294.395    295.1372   287.3729
95th percentile         329.9397   342.2929   331.4167   320.53
Mean                    255.3356   263.5364   253.6294   252.6483
Standard deviation      50.56662   51.17622   52.54771   47.3772
Number of observations  254        254        254        254

Table 4.
Aboriginal youth, 2003: Distributions of Literacy Scores

                        Prose      Document   Numeracy   Problem solving
5th percentile          191.0428   192.8186   170.7711   194.5074
25th percentile         230.6976   227.7727   216.2931   225.8704
50th percentile         258.7505   264.3837   245.1619   255.3114
75th percentile         289.7625   290.8277   275.7117   284.3848
95th percentile         323.5422   330.8159   314.8333   317.4493
Mean                    258.5063   260.5814   244.0102   255.9818
Standard deviation      42.1194    42.85846   45.87844   39.50166
Number of observations  304        304        304        304

Table 5. Immigrant Youth whose age at immigration was 10 or under, 2003: Distributions of Literacy Scores

                        Prose      Document   Numeracy   Problem solving
5th percentile          188.1311   190.4963   185.9063   189.6137
25th percentile         237.3374   251.2311   234.447    239.4582
50th percentile         275.4353   283.9462   266.4867   272.4555
75th percentile         303.7608   312.5289   308.5161   305.2515
95th percentile         340.1601   346.874    349.5515   333.1534
Mean                    270.5121   278.1956   268.436    268.4988
Standard deviation      46.24143   46.82592   48.74372   43.66114
Number of observations  186        186        186        186

Table 6. Non-immigrant, non-aboriginal youth, 2003: Distributions of Literacy Scores

                        Prose      Document   Numeracy   Problem solving
5th percentile          220.8727   222.7078   203.2564   224.4111
25th percentile         266.8139   269.9175   256.0006   264.6787
50th percentile         295.119    295.8677   286.8163   290.1752
75th percentile         321.9458   323.9247   315.8259   315.7706
95th percentile         351.2164   354.8656   350.284    346.3427
Mean                    292.2965   294.5618   283.179    289.1168
Standard deviation      40.21067   40.34996   45.29129   37.98856
Number of observations  2470       2470       2470       2470

Table 7.
Female, non-immigrant, non-aboriginal youth, 2003: Distributions of Literacy Scores

                        Prose      Document   Numeracy   Problem solving
5th percentile          229.5333   217.3028   203.2564   228.3848
25th percentile         272.5641   270.6931   250.7837   266.3936
50th percentile         302.7395   296.0048   283.012    293.3438
75th percentile         328.3363   327.5212   312.0671   319.4827
95th percentile         357.2094   355.2672   342.4908   346.3427
Mean                    298.9263   295.9567   279.8304   291.5042
Standard deviation      39.9142    40.87085   43.55226   37.75761
Number of observations  1235       1235       1235       1235

Table 8. Male, non-immigrant, non-aboriginal youth, 2003: Distributions of Literacy Scores

                        Prose      Document   Numeracy   Problem solving
5th percentile          212.2286   223.3568   204.6953   218.6879
25th percentile         261.1498   269.8997   258.6088   264.272
50th percentile         290.8956   295.8677   290.5585   288.692
75th percentile         311.9637   319.3583   318.4716   311.6142
95th percentile         343.9744   352.607    354.8716   346.8442
Mean                    285.9816   293.2331   286.3686   286.8427
Standard deviation      39.48408   39.81818   46.68208   38.08315
Number of observations  1235       1235       1235       1235

Table 9. Non-immigrant, non-aboriginal 26 to 35 year olds, 2003: Distributions of Literacy Scores

                        Prose      Document   Numeracy   Problem solving
5th percentile          228.6403   227.0782   216.9315   224.4427
25th percentile         275.4052   276.4366   263.8186   269.6664
50th percentile         303.6976   304.2396   293.3011   296.1187
75th percentile         328.1208   330.8532   322.1561   322.3264
95th percentile         360.3748   361.0248   363.8924   352.6783
Mean                    300.3751   301.4102   291.7783   293.7479
Standard deviation      40.16061   41.9833    44.34693   39.25177
Number of observations  2180       2180       2180       2180

Table 10.
Non-immigrant, non-aboriginal 36 to 45 year olds: Distributions of Literacy Scores

                        Prose      Document   Numeracy   Problem solving
5th percentile          209.4869   203.9987   194.6071   205.9367
25th percentile         263.6475   262.931    251.4567   257.7397
50th percentile         294.1848   292.3627   282.9394   285.4711
75th percentile         323.0555   324.7113   316.6167   313.3466
95th percentile         355.7727   359.4344   353.1294   355.7798
Mean                    290.6646   289.769    280.9003   283.6141
Standard deviation      45.20643   47.76224   49.01289   44.7549
Number of observations  3215       3215       3215       3215

Table 11: Regressions of Literacy Scores on Age, Education and Gender, Non-Immigrant, Non-Aboriginals, 16 to 45 Years Old

                     Prose                 Document              Numeracy              Problem solving
Female               7.0865*** (1.8127)    -3.1131* (1.7568)     -13.9275*** (1.8398)  -0.0477 (1.4094)
26-35                -4.2735** (1.9097)    -6.6429*** (1.858)    -4.4201* (2.2272)     -7.2048*** (2.1806)
36-45                -9.6365*** (2.3937)   -13.4596*** (2.4507)  -9.9657*** (3.1532)   -12.9801*** (2.5493)
Years of schooling   6.9525*** (0.2563)    7.3192*** (0.274)     7.6200*** (0.2869)    6.6899*** (0.3109)
Currently in school  1.5452 (2.7576)       -0.1641 (3.1492)      3.6946 (3.3944)       1.6923 (2.5832)
Constant             201.5889*** (3.5695)  204.7470*** (3.9655)  193.7657*** (5.0406)  205.1251*** (4.4923)
Observations         7865                  7865                  7865                  7865
R-squared            0.26                  0.25                  0.26                  0.24
Standard errors in parentheses. * significant at 10%; ** significant at 5%; *** significant at 1%

Table 12: Cross-Cohort Comparisons, Document Literacy Percentiles

                 Age 16-25             Age 26-35             Age 36-45
                 1983      1974        1974      1964        1964      1954
                 Cohort    Cohort      Cohort    Cohort      Cohort    Cohort
5th percentile   223       213         227       188         204       208
25th percentile  270       270         276       266         263       259
50th percentile  297       303         304       299         292       292
75th percentile  324       334         331       339         325       333
95th percentile  355       374         361       387         359       380
Observations     2662      1193        2180      937         3215      932

Table 13: Cross-Cohort Comparisons, Prose Literacy Percentiles

                 Age 16-25             Age 26-35             Age 36-45
                 1983      1974        1974      1964        1964      1954
                 Cohort    Cohort      Cohort    Cohort      Cohort    Cohort
5th percentile   224       219         229       216         209       207
25th percentile  267       268         275       262         264       268
50th percentile  296       296         304       294         294       298
75th percentile  323       321         328       327         323       328
95th percentile  352       360         360       366         356       372
Observations     2662      1193        2180      937         3215      932

Table 14: Effects of Ageing for Specific Cohorts, Document Literacy Percentiles

                 1974 Cohort           1964 Cohort           1954 Cohort
                 16-25     25-34       26-35     35-44       36-45     45-54
5th percentile   213       231         188       206         208       199
25th percentile  270       277         266       263         259       256
50th percentile  303       305         299       294         292       287
75th percentile  334       331         339       325         333       317
95th percentile  374       361         387       360         380       354
Observations     1193      2108        937       3132        932       3314

Table 15: Effects of Ageing for Specific Cohorts, Prose Literacy Percentiles

                 1974 Cohort           1964 Cohort           1954 Cohort
                 16-25     25-34       26-35     35-44       36-45     45-54
5th percentile   219       230         216       213         207       202
25th percentile  268       276         262       264         268       262
50th percentile  296       306         294       295         298       291
75th percentile  321       328         327       324         328       318
95th percentile  360       360         366       356         372       353
Observations     1193      2108        937       3132        932       3314

Table 16: OLS Regressions With Cohort Effects Document Prose Document literacy literacy literacy High school graduates 0.1417*** 0.1316*** 0.1302*** (0.0099) (0.0188) (0.0113) Some postsecondary 0.1767*** 0.1717*** 0.1577*** (0.0088) (0.0169) (0.0093) University graduates 0.2474*** 0.2340*** 0.2119*** (0.0189) (0.0198) (0.0158) Female -0.0006 0.0377*** 0.0021 (0.0054) (0.0082) (0.006) Age groups 26-35 36-45 46-55 56-65 Mother’s education Less than high school Post-secondary Not reported Father’s education Less than high school Post-secondary Not reported 1974 Cohort 1964 Cohort 1954 Cohort 1944 Cohort Prose literacy 0.1229*** (0.0205) 0.1563*** (0.017) 0.2013*** (0.017) 0.0374*** (0.008) -0.0299*** (0.0104) -0.0564*** (0.0166) -0.0861*** (0.0293) -0.1190*** (0.0315) -0.0189** (0.009) -0.0331 (0.0239) -0.0516* (0.0302) -0.0651 (0.049) -0.0433*** (0.0107) -0.0756*** (0.0178) -0.1114***
(0.0302) -0.1332*** (0.0295) -0.0291*** (0.0103) -0.0469* (0.0276) -0.0710* (0.035) -0.0763 (0.0493) -0.0424*** (0.0048) -0.0355*** (0.0051) -0.0374*** (0.0041) -0.0316*** (0.0052) -0.0024 (0.0065) -0.0599*** (0.0189) 0.0051 (0.0086) -0.0487*** (0.0167) 0.0001 (0.0067) -0.0573** (0.021) 0.0064 (0.009) -0.0467** (0.0189) -0.0381** (0.0172) -0.0257 (0.0168) -0.0370** (0.0173) -0.0247 (0.0175) 0.009 (0.0188) -0.0522*** (0.0117) 0.0082 (0.0125) 0.0192 (0.0207) 0.042 (0.0329) 0.0281 0.0127 (0.0175) -0.0286** (0.0134) 0.0016 (0.0139) 0.0042 (0.03) 0.0258 (0.0369) -0.002 0.0104 (0.0185) -0.0457*** (0.0124) 0.0076 (0.0131) 0.0189 (0.0218) 0.0469 (0.0346) 0.0384 0.0138 (0.0173) -0.0236* (0.0135) 0.0015 (0.0141) 0.0035 (0.0317) 0.0293 (0.0404) 0.0056 61 (0.0334) Occupation Constant (0.0548) (0.0324) Yes 5.6456*** (0.021) 16924 0.32 5.5851*** 5.5550*** (0.0176) (0.0215) Observations 16924 16924 R-squared 0.3 0.27 Standard errors in parentheses * significant at 10%; ** significant at 5%; *** significant at 1% 62 (0.0564) Yes 5.6053*** (0.0207) 16924 0.28 Table 17: OLS Regressions with Schooling-Cohort Interactions Education High school graduates Som e post-secondary University graduates 1974 Cohort 1964 Cohort 1954 Cohort 1944 Cohort 1974 Cohort, high school graduates Docum ent Prose 0.0651*** (0.0144) 0.0920*** (0.0135) 0.1629*** (0.0187) -0.0212 (0.0237) -0.0749 (0.0525) -0.0314 (0.0374) -0.0566 (0.0433) 0.0635*** (0.0149) 0.0984*** (0.0137) 0.1449*** (0.0161) -0.0146 (0.0212) -0.0796 (0.0724) -0.0401 (0.0424) -0.091 (0.0833) 0.0527* (0.0269) 0.03 (0.0268) 0.0497* (0.0276) 0.0321 (0.028) 0.0435 (0.0294) 0.039 (0.0241) 0.1148*** (0.0393) 0.1011* (0.056) 0.1292*** (0.0464) 0.1122* (0.0584) 0.1412*** (0.0489) 0.1300** (0.05) 0.0985*** (0.0284) 0.0863*** (0.0238) 0.1132*** (0.0262) 0.0873*** (0.0231) 0.0984*** (0.0353) 0.1069*** (0.0257) 0.1293*** (0.0253) 0.1299** (0.0514) 0.1168*** 0.1223*** 1974 Cohort, som e postsecondary 1974 Cohort, university graduates 1964 Cohort, high 
school graduates, 1964 Cohort, some postsecondary, 1964 Cohort, university graduates, 1954 Cohort, high school graduates, 1954 Cohort, some postsecondary, 1954 Cohort, university graduates, 1944 Cohort, high school graduates, 1944 Cohort, some postsecondary, 1944 Cohort, university graduates, Female, Age groups 26-35 (0.0289) (0.0363) 0.1291*** (0.0286) -0.0009 (0.0054) 0.1506*** (0.0454) 0.0374*** (0.0083) -0.0226** (0.0109) 36-45 -0.0526*** (0.0183) 46-55 -0.0833** (0.0307) 56-65 -0.1167*** (0.0331) Constant 5.6337*** (0.0172) Observations 16924 R-squared 0.32 -0.011 (0.011) -0.0279 (0.0245) -0.047 (0.0315) -0.0628 (0.0525) 5.5982*** (0.0173) 16924 0.28 Standard errors in parentheses. * significant at 10%; ** significant at 5%; *** significant at 1%

Table 18: Document Literacy, Quantile Regressions

                           10th percentile       50th percentile       90th percentile
High school graduates      0.2140*** (0.0115)    0.1155*** (0.0036)    0.0784*** (0.004)
Some post-secondary        0.2480*** (0.0092)    0.1487*** (0.004)     0.1100*** (0.0044)
University graduates       0.3567*** (0.0111)    0.2087*** (0.0035)    0.1682*** (0.0044)
Female                     0.0039 (0.0061)       -0.0050** (0.0023)    -0.0073** (0.0033)
Age 26-35                  -0.0117 (0.0144)      -0.0207*** (0.0052)   -0.0526*** (0.0071)
Age 36-45                  -0.0189 (0.0227)      -0.0472*** (0.0064)   -0.1071*** (0.0141)
Age 46-55                  -0.012 (0.0293)       -0.0737*** (0.0088)   -0.1875*** (0.0149)
Age 56-65                  -0.0813** (0.0361)    -0.1317*** (0.0165)   -0.2264*** (0.0161)
Mother: less than HS       -0.0514*** (0.0117)   -0.0388*** (0.0038)   -0.0168*** (0.004)
Mother: post-secondary     -0.0011 (0.0085)      0.0059** (0.0028)     0.0171*** (0.0046)
Mother: not reported       -0.1186*** (0.0144)   -0.0562*** (0.01)     -0.0333*** (0.0054)
Father: less than HS       -0.0452*** (0.0075)   -0.0389*** (0.0041)   -0.0256*** (0.0039)
Father: post-secondary     -0.0005 (0.0073)      0.0064 (0.0051)       0.005 (0.0045)
Father: not reported       -0.0496*** (0.0131)   -0.0540*** (0.0072)   -0.0283*** (0.005)
1974 Cohort                -0.019 (0.0155)       0.0152** (0.0065)     0.0315*** (0.0098)
1964 Cohort                -0.0345 (0.0217)      0.0297*** (0.008)     0.0916*** (0.0141)
1954 Cohort                -0.0296 (0.0284)      0.0468*** (0.0101)    0.1532*** (0.0162)
1944 Cohort                -0.0414 (0.0331)      0.0653*** (0.0174)    0.1631*** (0.0168)
Constant                   5.3609*** (0.0134)    5.6116*** (0.0059)    5.7702*** (0.0062)
Observations               16072                 16072                 16072
* significant at 10%; ** significant at 5%; *** significant at 1%

Table 19: Prose Literacy, Quantile Regressions

                           10th percentile       50th percentile       90th percentile
High school graduates      0.2080*** (0.0103)    0.0940*** (0.0054)    0.0655*** (0.0038)
Some post-secondary        0.2584*** (0.011)     0.1354*** (0.0047)    0.0910*** (0.0046)
University graduates       0.3480*** (0.0128)    0.1877*** (0.0053)    0.1427*** (0.0063)
Female                     0.0375*** (0.006)     0.0299*** (0.0031)    0.0266*** (0.0021)
Age 26-35                  -0.0082 (0.0149)      -0.0074 (0.0057)      -0.0170** (0.007)
Age 36-45                  -0.0177 (0.0182)      -0.0266*** (0.0082)   -0.0618*** (0.0106)
Age 46-55                  -0.0333 (0.0207)      -0.0483*** (0.0082)   -0.0939*** (0.0138)
Age 56-65                  -0.1099*** (0.0253)   -0.0798*** (0.0111)   -0.1433*** (0.0141)
Mother: less than HS       -0.0353*** (0.0077)   -0.0302*** (0.0035)   -0.0172*** (0.0044)
Mother: post-secondary     0.0218*** (0.0069)    0.0133*** (0.0051)    0.0231*** (0.0047)
Mother: not reported       -0.0581*** (0.0178)   -0.0486*** (0.0077)   -0.0129* (0.0072)
Father: less than HS       -0.0383*** (0.0082)   -0.0259*** (0.0032)   -0.0103*** (0.0039)
Father: post-secondary     0.0008 (0.0069)       0.0110*** (0.0037)    0.0246*** (0.0038)
Father: not reported       -0.0612*** (0.0121)   -0.0278*** (0.0073)   -0.0272*** (0.0091)
1974 Cohort                -0.009 (0.0109)       0.002 (0.004)         0.0118 (0.0078)
1964 Cohort                -0.015 (0.0168)       0.0124* (0.0075)      0.0564*** (0.0118)
1954 Cohort                0.0092 (0.0225)       0.0410*** (0.0074)    0.0935*** (0.0126)
1944 Cohort                0.027 (-0.023)        0.0394*** (0.0114)    0.1158*** (0.0155)
Constant                   5.3195*** (0.0142)    5.5910*** (0.0056)    5.7334*** (0.0058)
Observations               16072                 16072                 16072
Standard errors in parentheses. * significant at 10%; ** significant at 5%; *** significant at 1%

Table 20: Document Literacy, Quantile High school or less
10 th 50 th percentile percentile Fem ale 0.0360*** -0.0035 (0.0102) (0.0048) Regressions, Education Breakdowns University or m ore 90 th 10 th 50 th percentile percentile percentile -0.0016 -0.0092 0.0113* (0.0045) (0.008) (0.0062) 90 th percentile -0.0077* (0.0041) Age groups 0.0108 (0.0596) 0.0258*** (0.0098) 0.0655*** (0.0121) 0.1025*** (0.0138) 0.1277*** (0.0184) 0.0709*** (0.0091) 0.0997*** (0.0097) 0.1906*** (0.0226) 0.1984*** (0.0266) 0.1068*** (0.0165) 0.0684*** (0.0056) -0.0183* (0.01) 0.1665*** (0.0332) 26-35 -0.0076 (0.0193) 36-45 0.0137 (0.0356) 46-55 -0.0063 (0.0399) 56-65 Mother’s education Less than high school Postsecondary Not reported Father’s education Less than high school Postsecondary Not reported 1974 Cohort 1964 Cohort 1954 Cohort 1944 Cohort Constant -0.0299 (0.0345) 0.0803*** (0.03) 0.0136 (0.0165) -0.0997** (0.0395) 0.2067*** (0.0445) -0.0129 (0.0206) 0.0864*** (0.0289) -0.0144 (0.0124) 0.1068*** (0.0207) 0.1568*** (0.0199) 0.1955*** (0.0214) 0.0424*** (0.0066) 0.0495*** (0.013) 0.0312*** (0.0061) 0.0127* (0.0068) 0.0032 (0.0068) 0.0744*** (0.0115) 0.0191* (0.0106) 0.0352*** (0.0113) 0.0180* (0.0099) -0.0152 (0.0508) 0.0127** (0.0061) 0.0762*** (0.0259) 0.0960*** (0.0199) 0.0470*** (0.0064) -0.0116* (0.0065) -0.0059 (0.0105) -0.0023 (0.0067) 0.0198*** (0.0046) 0.0145 (0.0164) 0.0966*** (0.0215) 0.0491*** (0.0139) -0.0093 (0.0408) 0.0068 (0.0418) -0.1165** (0.0593) 5.4899*** (0.01) 0.0173*** (0.0054) 0.0881*** (0.0073) 0.0492*** (0.0094) 0.0687*** (0.0107) 0.1017*** (0.0139) 0.0713*** (0.0181) 5.6883*** (0.0061) 0.0119* (0.0071) 0.0350*** (0.0089) 0.0588*** (0.0075) 0.0955*** (0.0098) 0.1718*** (0.019) 0.1462*** (0.0265) 5.8213*** (0.0109) 0.0058 (0.0092) 0.0095* (0.005) 0.0001 (0.0081) -0.0349 (0.1666) 0.0323 (0.0311) 0.0803** (0.0343) 0.0830** (0.0369) 0.1246*** (0.0438) 5.6655*** (0.0192) -0.0476** (0.0207) -0.0093 (0.0152) 0.0064 (0.0167) -0.0166 (0.0197) 0.0304 (0.0227) 5.7838*** (0.0072) 0.0135 (0.0217) 0.0571*** (0.0149) 0.1539*** 
(0.0221) 0.1792*** (0.0226) 0.1917*** (0.0226) 5.8581*** (0.0123) 67 0.023 (0.0214) 0.0413*** (0.0062) 0.0419 (0.0405) Observations 9259 9259 9259 2846 Standard errors in parentheses * significant at 10%; ** significant at 5%; *** significant at 1% 68 2846 2846 Table 21: Prose literacy, Quantile Regressions, Education Breakdowns High school or less University or m ore 10 th 50 th 10 th 50 th 10 th percentile percentile percentile percentile percentile Fem ale 0.0607*** 0.0398*** 0.0393*** 0.0016 0.0335*** (0.0128) (0.0045) (0.0037) (0.0086) (0.0045) Age groups 26-35 -0.0124 (0.0184) -0.0107 (0.0102) 36-45 -0.0165 (0.0297) -0.0145 (0.0176) 46-55 0.0264 (0.0363) -0.0374* (0.0217) 56-65 -0.0209 (0.0522) -0.0463* (0.0268) -0.0121 (0.0091) 0.0513*** (0.0183) 0.0889*** (0.0203) 0.1302*** (0.0236) 0.0933*** (0.012) 0.0392*** (0.0048) 0.0035 (0.0148) 0.1654*** (0.0247) Mother’s education Less than high school Postsecondary Not reported Father’s education Less than high school Postsecondary Not reported 1974 Cohort 1964 Cohort 1954 Cohort 1944 Cohort Constant 50 th percentile 0.0210*** (0.0048) 0.0328** (0.0144) -0.009 (0.0113) 0.0204*** (0.0057) 0.0187 (0.0185) 0.0006 (0.0204) -0.0421 (0.0266) 0.1621*** (0.0283) 0.0016 (0.0206) 0.0639*** (0.0226) 0.1096*** (0.0217) 0.0378*** (0.0049) 0.0738*** (0.0157) 0.0251*** (0.0069) 0.0231*** (0.0069) 0.0256*** (0.0065) 0.0550*** (0.0106) 0.0095* (0.0054) 0.0218*** (0.0053) 0.0303** (0.0121) 0.0180*** (0.0046) 0.0430*** (0.0093) -0.0197 (0.0651) -0.0391 (0.0261) 0.0504*** (0.0103) 0.0788*** (0.0157) 0.0392*** (0.008) -0.0094* (0.0049) -0.0028 (0.0092) -0.0073 (0.007) 0.0095 (0.0131) 0.0814*** (0.0153) 0.0416** (0.0171) 0.0162 (0.0309) -0.0015 (0.0357) -0.075 (0.0458) 5.4661*** (0.0145) 0.0157** (0.0064) 0.0537*** (0.0063) 0.0276*** (0.0095) 0.0129 (0.0163) 0.0512** (0.0201) -0.0041 (0.0241) 5.6460*** (0.006) 0.0182*** (0.0053) 0.0487*** (0.0053) 0.0168* (0.0093) 0.0587*** (0.0173) 0.0906*** (0.0183) 0.0981*** (0.023) 5.7877*** 
(0.0054) -0.0128 (0.0099) 0.0215*** (0.0083) -0.0028 (0.0054) 0.0188** (0.0077) -0.0348 (0.2085) -0.0135 (0.0191) -0.0318 (0.021) 0.0363 (0.0237) 0.1568*** (0.0263) 5.6570*** (0.0132) -0.0428* (0.0241) 0.0282** (0.0125) 0.0383** (0.0184) 0.0685*** (0.0221) 0.0844*** (0.0233) 5.7465*** (0.0094) 0.032 (0.0209) 0.0309*** (0.0109) 0.0375* (0.0194) 0.0816*** (0.0248) 0.1116*** (0.0375) 5.8116*** (0.0112) -0.0457* (0.0246) -0.0853** (0.041) Observations 9259 9259 9259 2846 2846 2846
Standard errors in parentheses. * significant at 10%; ** significant at 5%; *** significant at 1%

Table 22: Cross country comparison, All youth, Document literacy

                        Canada   U.S.   Norway
5th percentile          209      198    224
25th percentile         265      249    274
50th percentile         293      279    305
75th percentile         320      313    332
95th percentile         354      351    368
Mean                    290      277    301
Standard deviation      43       48     45
Number of observations  3574     320    516

Table 23: Cross country comparison, All youth, Prose literacy

                        Canada   U.S.   Norway
5th percentile          211      198    225
25th percentile         259      241    272
50th percentile         292      276    300
75th percentile         319      306    325
95th percentile         349      346    352
Mean                    287      272    296
Standard deviation      43       46     39
Number of observations  3574     320    516

Table 24: Cross country comparison, All youth, Numeracy literacy

                        Canada   U.S.   Norway
5th percentile          199      178    207
25th percentile         249      227    254
50th percentile         283      271    286
75th percentile         313      307    317
95th percentile         350      350    350
Mean                    279      266    284
Standard deviation      47       53     44
Number of observations  3574     320    516

Table 25: Prose Literacy Regression, Norway

                           OLS                10th quantile      50th quantile      90th quantile
1983 Cohort                .005 (.0121)       -.009 (.0198)      .018** (.0082)     .013 (.0137)
1964 Cohort                .015 (.0195)       .032*** (.0120)    .006 (.0078)       .004 (.0132)
1954 Cohort                .005 (.0205)       .028 (.0278)       -.017 (.0145)      .001 (.0212)
1944 Cohort                -.031 (.0352)      -.015 (.0364)      -.054*** (.0176)   .017 (.0229)
Age 16-24                  .024* (.0137)      .044** (.0171)     .002 (.0098)       .020 (.0154)
Age 35-44                  -.004 (.0079)      -.015 (.0137)      -.000 (.0084)      .001 (.0120)
Age 45-54                  -.047*** (.0133)   -.094*** (.0268)   -.033*** (.0123)   -.034** (.0135)
Age 55-65                  -.076*** (.0183)   -.111*** (.0361)   -.058*** (.0157)   -.073*** (.0231)
Less than HS               -.083*** (.0169)   -.122*** (.0179)   -.072*** (.0073)   -.025* (.0137)
College degree             .055*** (.0067)    .064*** (.0147)    .050*** (.0055)    .043*** (.0096)
BA or more                 .103*** (.0094)    .125*** (.0124)    .084*** (.0057)    .075*** (.0106)
Female                     .019 (.0114)       .022** (.0100)     .020*** (.0036)    .021*** (.0058)
Mother: less than HS       -.011* (.0063)     -.010 (.0093)      -.011* (.0055)     .003 (.0064)
Mother: Ps, univ. or more  -.003 (.0157)      -.009 (.0186)      .001 (.0067)       .010 (.0089)
Mother: none reported      -.055** (.0227)    -.041** (.0177)    -.081*** (.0177)   -.011 (.0239)
Father: less than HS       -.018 (.0107)      -.017 (.0113)      -.016*** (.0045)   -.013** (.0060)
Father: Ps, univ. or more  .014* (.0075)      .028*** (.0105)    .020*** (.0045)    .012 (.0105)
Father: none reported      -.006 (.0143)      -.026* (.0157)     .016 (.0214)       -.013 (.0155)
Constant                   5.677*** (.0121)   5.540*** (.0149)   5.694*** (.0077)   5.787*** (.0103)
Observations               3709               3890               3890               3890

Note: Number of observations in quantile regressions is from the re-created datasets. Cohort 2 (omitted category) are people who were from 26 to 35 in 1998, Cohort 1 are people who were from 16 to 25 in 1998, Cohort 3 are people who were from 36 to 45 in 1998, Cohort 4 are people who were from 46 to 55 in 1998, and Cohort 5 are people who were from 56 to 65 in 1998.
Table 26: Document Literacy Regressions, Norway

                           OLS                10th quantile      50th quantile      90th quantile
1983 Cohort                -.001 (.0146)      -.029 (.0278)      .009 (.0092)       .008 (.0094)
1964 Cohort                .023 (.0315)       .036* (.0185)      .013 (.0090)       -.008 (.0113)
1954 Cohort                .022 (.0439)       .030 (.0259)       -.013 (.0132)      .013 (.0136)
1944 Cohort                -.026 (.0648)      .004 (.0428)       -.055** (.0231)    .000 (.0220)
Age 16-24                  .033* (.0163)      .072** (.0283)     .016 (.0102)       .014 (.0094)
Age 35-44                  -.013 (.0111)      -.032 (.0203)      -.008 (.0093)      .004 (.0127)
Age 45-54                  -.664** (.0263)    -.101*** (.0269)   -.041*** (.0120)   -.041*** (.0155)
Age 55-65                  -.110** (.0419)    -.173*** (.0348)   -.076*** (.0159)   -.101*** (.0206)
Less than HS               -.106*** (.0174)   -.158*** (.0324)   -.084*** (.0132)   -.046*** (.0118)
College degree             .063*** (.0097)    .068*** (.0172)    .069*** (.0066)    .035*** (.0100)
BA or more                 .105*** (.0107)    .133*** (.0136)    .093*** (.0048)    .066*** (.0067)
Female                     -.030*** (.0061)   -.032** (.0125)    -.028*** (.0060)   -.017*** (.0050)
Mother: less than HS       -.010 (.0076)      -.030*** (.0108)   -.015** (.0074)    .017*** (.0064)
Mother: Ps, univ. or more  -.002 (.0262)      -.025 (.0206)      .011 (.0084)       .007 (.0086)
Mother: none reported      -.050 (.0334)      -.061* (.0340)     -.051*** (.0157)   -.009 (.0175)
Father: less than HS       -.011 (.0089)      -.004 (.0123)      -.011** (.0051)    -.013* (.0073)
Father: Ps, univ. or more  .016* (.0128)      .033** (.0168)     .019*** (.0050)    .025*** (.0072)
Father: none reported      -.009 (.0145)      -.027 (.0257)      .010 (.0152)       -.022 (.0205)
Constant                   5.723*** (.0096)   5.580*** (.0140)   5.739*** (.0074)   5.854*** (.0091)
Observations               3709               3890               3890               3890

Note: Number of observations in quantile regressions is from the re-created datasets. Cohort 2 (omitted category) are people who were from 26 to 35 in 1998, Cohort 1 are people who were from 16 to 25 in 1998, Cohort 3 are people who were from 36 to 45 in 1998, Cohort 4 are people who were from 46 to 55 in 1998, and Cohort 5 are people who were from 56 to 65 in 1998.
Table 27: Prose Literacy Regressions, US

                           OLS                10th quantile      50th quantile      90th quantile
1983 Cohort                -.033** (.0142)    -.040* (.0241)     -.027* (.0151)     -.012 (.0196)
1964 Cohort                .030 (.0245)       -.009 (.0220)      .037*** (.0138)    .049*** (.0152)
1954 Cohort                .067 (.0402)       .056 (.0378)       .075*** (.0192)    .095*** (.0249)
Age 16-24                  .081*** (.0137)    .133*** (.0242)    .093*** (.0205)    .021 (.0169)
Age 35-44                  -.030 (.0249)      -.031 (.0228)      -.031*** (.0104)   -.015 (.0188)
Age 45-54                  -.060 (.0408)      -.040 (.0407)      -.054*** (.0178)   -.066*** (.0204)
Age 55-65                  -.113** (.0544)    -.091* (.0547)     -.108*** (.0289)   -.125*** (.0313)
Less than HS               -.175*** (.0254)   -.248*** (.0390)   -.173*** (.0215)   -.114*** (.0117)
College degree             .049** (.0195)     .054* (.0285)      .054*** (.0104)    .031** (.0155)
BA or more                 .135*** (.0080)    .169*** (.0220)    .127*** (.0088)    .095*** (.0123)
Female                     .038 (.0256)       .046*** (.0156)    .021*** (.0070)    .019* (.0101)
Mother: less than HS       -.052*** (.0110)   -.077*** (.0237)   -.052*** (.0133)   -.012 (.0096)
Mother: Ps, univ. or more  .011 (.0165)       .010 (.0185)       .009 (.0080)       .017* (.0098)
Mother: none reported      -.086*** (.0210)   -.062 (.0384)      -.076** (.0371)    -.077*** (.0241)
Father: less than HS       -.026 (.0171)      -.017 (.0258)      -.048*** (.0119)   -.017 (.0128)
Father: Ps, univ. or more  .015* (.0086)      .033 (.0207)       .023** (.0113)     .006 (.0139)
Father: none reported      -.076*** (.0210)   -.116*** (.0358)   -.057*** (.0209)   -.060*** (.0145)
Constant                   5.627*** (.0158)   5.424*** (.0214)   5.640*** (.0107)   5.800*** (.0138)
Observations               2162               2138               2138               2138

Note: Number of observations in quantile regressions is from the re-created datasets. Cohort 2 (omitted category) are people who were from 26 to 35 in 1994, Cohort 1 are people who were from 16 to 25 in 1994, Cohort 3 are people who were from 36 to 45 in 1994, and Cohort 4 were people who were 46 to 56 in 1994.
Table 28: Document Literacy Regressions, US

Variable                  OLS               10th quantile     50th quantile     90th quantile
1983 Cohort               -.006 (.0152)      .022 (.0358)     -.031* (.0177)    -.012 (.0168)
1964 Cohort               -.003 (.0245)     -.016 (.0332)      .034** (.0133)    .024 (.0164)
1954 Cohort                .020 (.0397)      .013 (.0704)      .046** (.0193)    .030 (.0245)
Age group
  16-24                    .066*** (.0149)   .051 (.0472)      .067*** (.0207)   .022 (.0138)
  35-44                   -.014 (.0243)      .000 (.0270)     -.041*** (.0152)  -.022 (.0158)
  45-54                   -.022 (.0387)      .009 (.0506)     -.069*** (.0158)  -.050* (.0263)
  55-65                   -.063 (.0528)     -.035 (.0823)     -.119*** (.0238)  -.081** (.0386)
Education
  Less than HS            -.198*** (.0372)  -.307*** (.0614)  -.172*** (.0257)  -.151*** (.0160)
  College degree           .043** (.0137)    .068* (.0399)     .052*** (.0147)   .012 (.0127)
  BA or more               .137*** (.0110)   .183*** (.0204)   .138*** (.0092)   .091*** (.0142)
Female                     .009 (.0326)      .035** (.0149)   -.002 (.0087)     -.017* (.0089)
Mother's education
  Less than high school   -.063*** (.0153)  -.103*** (.0256)  -.052*** (.0104)  -.059*** (.0174)
  Ps, univ. or more       -.008 (.0298)      .028 (.0259)     -.000 (.0100)     -.009 (.0193)
  None reported           -.095*** (.0200)  -.111 (.0823)     -.067* (.0369)    -.047** (.0208)
Father's education
  Less than high school   -.032* (.0170)     .004 (.0216)     -.028** (.0123)    .004 (.0111)
  Ps, univ. or more        .029** (.0133)    .045** (.021)     .019* (.0106)     .018** (.0082)
  None reported           -.091*** (.0201)  -.103** (.0420)   -.114*** (.0196)  -.057*** (.0169)
Constant                  5.629*** (.0172)  5.370*** (.0328)  5.664*** (.0102)  5.830*** (.0149)
Observations              2162              2138              2138              2138

Note: The number of observations in the quantile regressions is from the re-created datasets. Cohort 2 (the omitted category) are people who were aged 26 to 35 in 1994; Cohort 1 were aged 16 to 25 in 1994; Cohort 3 were aged 36 to 45 in 1994; and Cohort 4 were aged 46 to 56 in 1994.
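As a reading aid for Tables 26 through 28: alongside OLS, each table reports regressions for the 10th, 50th and 90th quantiles of the literacy-score distribution. As a sketch of the generic quantile-regression estimator (the standard Koenker-Bassett formulation, not necessarily the authors' exact implementation), the coefficients in the quantile columns solve

```latex
\hat{\beta}(\tau) \;=\; \arg\min_{\beta} \sum_{i} \rho_\tau\!\left(y_i - x_i'\beta\right),
\qquad
\rho_\tau(u) \;=\; u\left(\tau - \mathbf{1}\{u < 0\}\right),
```

with \tau = .10, .50 and .90 for the three quantile columns. The constants near 5.4 to 5.9 are consistent with the dependent variable y_i being the natural log of the literacy score (e.g., log 300 is approximately 5.70).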
Table 29: Literacy Related Activities, 2003 IALS
(percent of each cohort)

Variable                                          1983 Cohort (Age 16-25)   1974 Cohort (Age 26-35)
Reading tasks at work
  4 or 5 at least once/week                             15.8%                     34.7%
  1 to 3 at least once/week                             25.1                      33.7
  Rarely                                                 8                         6.5
Writing tasks at work
  4 or 5 at least once/week                              4.1                      14.1
  1 to 3 at least once/week                             24.8                      42.9
  Rarely                                                18.8                      14.7
Math tasks at work
  4 or 5 at least once/week                             15.8                      28.8
  1 to 3 at least once/week                             29.4                      39.7
  Rarely                                                 5.1                       6.5
Less than 1 hour per day watching TV                    27                        26.2
Read a newspaper at least once a week                   66.7                      72.5
Read a book at least once a week                        38.1                      45.6
Agree or strongly agree that read only when have to     27                        21

[Figures 1-20: density plots of literacy scores (Density vs. Literacy Score), each comparing IALS94 with IALS03.]
Figure 1: Document Literacy, Age 16-25
Figure 2: Document Literacy, Age 26-35
Figure 3: Document Literacy, Age 36-45
Figure 4: Prose Literacy, Age 16-25
Figure 5: Prose Literacy, Age 26-35
Figure 6: Prose Literacy, Age 36-45
Figure 7: Document Literacy, Cohort 2 (16-25 in 1994)
Figure 8: Document Literacy, Cohort 3 (26-35 in 1994)
Figure 9: Document Literacy, Cohort 4 (36-45 in 1994)
Figure 10: Prose Literacy, Cohort 2 (16-25 in 1994)
Figure 11: Prose Literacy, Cohort 3 (26-35 in 1994)
Figure 12: Prose Literacy, Cohort 4 (36-45 in 1994)
Figure 13: Document Literacy, Less than High School
Figure 14: Document Literacy, High School Graduates
Figure 15: Document Literacy, Some Post-Secondary
Figure 16: Document Literacy, University Graduates
Figure 17: Prose Literacy, Less than High School
Figure 18: Prose Literacy, High School Graduates
Figure 19: Prose Literacy, Some Post-Secondary
Figure 20: Prose Literacy, University Graduates

[Figures 21-32: density plots of prose literacy scores, IALS94 vs. IALS03, with summary statistics as reported below each panel.]

Figure 21: Prose Literacy (Norway), Age 16-25
                      IALS94   IALS03
5th percentile           237      235
25th percentile          282      282
50th percentile          304      306
75th percentile          326      328
95th percentile          356      352
Mean                     301      302
Standard deviation        36       37
Observations             336      521

Figure 22: Prose Literacy (Norway), Age 26-35
                      IALS94   IALS03
5th percentile           237      238
25th percentile          279      281
50th percentile          301      311
75th percentile          320      328
95th percentile          348      355
Mean                     299      303
Standard deviation        32       38
Observations             327      521

Figure 23: Prose Literacy (Norway), Age 36-45
                      IALS94   IALS03
5th percentile           244      232
25th percentile          276      276
50th percentile          299      304
75th percentile          316      323
95th percentile          344      351
Mean                     296      297
Standard deviation        31       37
Observations             326      526

Figure 24: Prose Literacy (Norway), 16-25 in 1998
                      IALS94   IALS03
5th percentile           237      249
25th percentile          282      287
50th percentile          304      311
75th percentile          326      332
95th percentile          356      354
Mean                     301      307
Standard deviation        36       36
Observations             336      433

Figure 25: Prose Literacy (Norway), 26-35 in 1998
                      IALS94   IALS03
5th percentile           237      238
25th percentile          279      277
50th percentile          301      308
75th percentile          320      325
95th percentile          348      353
Mean                     299      299
Standard deviation        32       37
Observations             327      553

Figure 26: Prose Literacy (Norway), 36-45 in 1998
                      IALS94   IALS03
5th percentile           244      224
25th percentile          276      269
50th percentile          299      296
75th percentile          316      318
95th percentile          344      346
Mean                     296      292
Standard deviation        31       37
Observations             326      531

Figure 27: Prose Literacy (U.S.), Age 16-24
                      IALS94   IALS03
5th percentile           169      201
25th percentile          262      246
50th percentile          288      276
75th percentile          314      306
95th percentile          358      347
Mean                     284      274
Standard deviation        54       44
Observations             192      283

Figure 28: Prose Literacy (U.S.), Age 25-34
                      IALS94   IALS03
5th percentile           206      205
25th percentile          261      251
50th percentile          292      280
75th percentile          325      308
95th percentile          362      350
Mean                     288      279
Standard deviation        51       45
Observations             241      296

Figure 29: Prose Literacy (U.S.), Age 35-44
                      IALS94   IALS03
5th percentile           180      198
25th percentile          251      253
50th percentile          294      277
75th percentile          330      311
95th percentile          382      349
Mean                     290      279
Standard deviation        59       46
Observations             269      332

Figure 30: Prose Literacy (U.S.), 16-25 in 1994
                      IALS94   IALS03
5th percentile           169      205
25th percentile          259      251
50th percentile          288      280
75th percentile          314      308
95th percentile          359      350
Mean                     284      279
Standard deviation        53       45
Observations             205      296

Figure 31: Prose Literacy (U.S.), 26-35 in 1994
                      IALS94   IALS03
5th percentile           199      198
25th percentile          261      253
50th percentile          293      277
75th percentile          328      311
95th percentile          366      349
Mean                     290      279
Standard deviation        53       46
Observations             254      332

Figure 32: Prose Literacy (U.S.), 36-45 in 1994
                      IALS94   IALS03
5th percentile           186      192
25th percentile          252      242
50th percentile          297      283
75th percentile          330      309
95th percentile          384      341
Mean                     292      276
Standard deviation        59       46
Observations             267      321
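The summary statistics reported under Figures 21-32 (percentiles, mean, standard deviation, observations) are straightforward to compute from a vector of literacy scores. A minimal sketch using NumPy, with simulated scores standing in for the IALS microdata; note the published figures are based on survey data and may use sampling weights, which this sketch ignores:

```python
import numpy as np

def literacy_summary(scores):
    """Return the statistics reported under each density figure:
    5th/25th/50th/75th/95th percentiles, mean, std. dev., and N."""
    scores = np.asarray(scores, dtype=float)
    p5, p25, p50, p75, p95 = np.percentile(scores, [5, 25, 50, 75, 95])
    return {
        "p5": p5, "p25": p25, "p50": p50, "p75": p75, "p95": p95,
        "mean": scores.mean(),
        "sd": scores.std(ddof=1),  # sample standard deviation
        "n": scores.size,
    }

# Illustrative only: simulated scores, roughly matching the Norway
# age 16-25 profile (mean ~300, sd ~36); not the actual IALS sample.
rng = np.random.default_rng(0)
sim_scores = rng.normal(loc=300, scale=36, size=500)
stats = literacy_summary(sim_scores)
```

Comparing the output of `literacy_summary` for the IALS94 and IALS03 samples within an age group or cohort reproduces the kind of 1994-vs-2003 contrast the figures display.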