int. j. lang. comm. dis., 2000, vol. 35, no. 2, 165–188

Prevalence and natural history of primary speech and language delay: findings from a systematic review of the literature

James Law†*, James Boyle‡, Frances Harris†, Avril Harkness‡ and Chad Nye§

† Department of Language and Communication Science, City University, London, UK
‡ Department of Psychology, University of Strathclyde
§ Department of Communication Disorders, University of Central Florida

(Received July 1998; accepted March 1999)

Abstract

The prevalence and the natural history of primary speech and language delays were two of four domains covered in a systematic review of the literature related to screening for speech and language delay carried out for the NHS in the UK. The structure and process of the full literature review are introduced and criteria for inclusion in the two domains are specified. The resulting data set gave 16 prevalence estimates generated from 21 publications and 12 natural history studies generated from 18 publications. Results are summarized for six subdivisions of primary speech and language delays: (1) speech and/or language, (2) language only, (3) speech only, (4) expression with comprehension, (5) expression only and (6) comprehension only. Combination of the data suggests that both concurrent and predictive case definition can be problematic. Prediction improves if language is taken independently of speech and if expressive and receptive language are taken together. The results are discussed in terms of the need to develop a model of prevalence based on risk of subsequent difficulties.

Keywords: primary speech and language delay, children, prevalence, natural history, epidemiology.
*Address correspondence to: James Law, Department of Language and Communication Science, City University, Northampton Square, London EC1V 0HB, UK; e-mail: J.C.Law@city.ac.uk

International Journal of Language & Communication Disorders ISSN 1368-2822 print/ISSN 1460-6984 online © 2000 Royal College of Speech & Language Therapists
http://www.tandf.co.uk/journals/tf/13682822.html

Introduction

The present review was commissioned and managed by the National Health Service Centre for Reviews and Dissemination at the University of York on behalf of the National Health Technology Assessment Programme of the NHS in the UK. The overall aim of the review was to establish whether there was a case for the introduction of universal population screening for speech and language delay. An epidemiological model was adopted (Law et al. 1998) and this led to the identification and review of four key domains of the literature: prevalence, natural history, intervention and screening. Although these domains have been treated independently in the literature to date, the position taken in the review was that they are closely associated. While it may be possible to calculate prevalence at any age given a suitable measurement, it is only a meaningful construct in the context of the natural history of the condition. Similarly, both screening and intervention are tied into the need for a clear concept of case definition, and appropriate targeting depends on an understanding of the course of the disability. The intervention and screening literature are reviewed elsewhere (Boyle et al. in preparation, Law et al. in press).

Prevalence

Prevalence is conventionally expressed as the proportion or percentage of cases in a given population at a specified time. The focus of the present review is the normal rather than the clinical population. Prevalence differs from, but is often confused with, incidence, which refers to the number of new cases (Mosciki 1984).
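The distinction between prevalence and incidence can be illustrated with a minimal sketch. The function names and all of the figures below are hypothetical and chosen only for illustration; none is taken from the studies reviewed. Incidence is expressed here as a proportion of the children at risk over a period, which is one common convention and an assumption on our part.

```python
def point_prevalence(existing_cases: int, population: int) -> float:
    """Percentage of the population who are cases at one specified time."""
    return 100.0 * existing_cases / population


def incidence_proportion(new_cases: int, population_at_risk: int) -> float:
    """Percentage of the at-risk population who become new cases in a period."""
    return 100.0 * new_cases / population_at_risk


# Hypothetical figures for illustration only (not drawn from the review).
population = 1000
existing_cases = 60   # children who are cases at the time of the survey
new_cases = 15        # children who become cases over the following year

prevalence = point_prevalence(existing_cases, population)
# Only children who are not already cases can become new cases.
incidence = incidence_proportion(new_cases, population - existing_cases)

print(f"point prevalence: {prevalence:.1f}%")
print(f"incidence over the period: {incidence:.1f}%")
```

Prevalence counts everyone who is a case at the survey date, whereas incidence counts only those who become cases during the period.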
Cases may move out of the pool from which prevalence is taken because of mediating factors (such as spontaneous recovery, intervention or death), but prevalence also depends upon incidence. Prevalence will rise if incidence exceeds the decrease brought about by these mediating factors. It is an important concept for three reasons: it allows planning of service delivery; it should allow calculation, at an epidemiological level, of the impact of intervention, with a successful intervention being one that results in a decline in prevalence; and it should reflect current knowledge about a disorder, namely where the boundaries lie between, for example, normality and abnormality, and which cases respond to intervention. To date there has been no attempt to carry out a systematic synthesis of the prevalence literature, although there have been some extensive narrative reviews (MacKeith and Rutter 1972, Healey et al. 1981, Silva et al. 1987, Harasty and Reed 1994).

Natural history

‘Natural history’ is a specific term used to describe the prognosis of a condition in the absence of intervention (Gordis 1996). It can be a difficult concept when it comes to children’s development because, in most cases, children receive some form of intervention by virtue of entering the education system. Nonetheless, there probably is a distinction to be drawn between intervention of a generic nature and intervention which is targeted at the child’s specific difficulties. Natural history is important because the anticipated status of a putative case informs the determination of its current status. Thus, if two cases are identified as having comparable levels of skill at time one, but it is known that there is a much greater risk of persistent problems for case one than for case two, then the relative weighting attributed to case one will be higher.
Systematic review of the literature

The aims of a systematic review are to locate, appraise and synthesize evidence from studies to provide empirical answers to scientific research questions. It differs from other types of review in that it adheres to a strict scientific design to make it more comprehensive and to minimize the chance of bias (NHS Centre for Reviews and Dissemination 1996). Such a review follows a strict procedure that involves the identification of published and unpublished data sources, the development of research questions, search strategies and inclusion criteria, the reliable coding of the data, and its synthesis, ultimately leading to data-driven conclusions.

Methodology

Scope of the literature

The literature for 1967 to May 1997 was targeted. The CROS and DIALINDEX search mechanisms identified the databases most likely to yield appropriate literature. Thus, six databases were identified: Embase, Medline, ERIC, PsycINFO, CINAHL and LLBA. In addition, the System for Indexing Grey Literature in Europe (SIGLE) and the Boston Spa Conferences (British Library Databases) were searched for unpublished, or ‘grey’, literature. Grey literature is scanned in an attempt to overcome publication bias. Finally, journals were hand-searched to identify relevant authors in the field. In all, 53 papers met the criteria of relevance for prevalence, but of these only 21 met the full inclusion criteria. These provided 16 sets of data. Twenty-three papers met the initial criteria for relevance in the natural history domain, but of these only 18 were included in the review, providing 12 data sets (see Appendices 1 and 2).

Criteria for inclusion in the review

The review dealt with children with primary speech and/or language delays. That is, it did not include children with delays secondary to other conditions such as autism, or more general developmental disabilities.
It is recognized that, in the absence of universally recognized criteria for primary delays, this may be a difficult judgement to make and that there may be more similarities than differences between these groups. Nonetheless, the distinction between primary and secondary delay is one that is widely recognized by both researchers and clinicians and, therefore, has face validity. Authors use many different ways to define the populations they describe. Some use discrepancies between measures of language and general abilities, but again there is no widely accepted level of discrepancy that could be imposed on the data. In addition, some design criteria were specified for each domain.

Prevalence studies

Studies were included that estimated the prevalence of speech and language delays in children aged up to 16 years in a general population. The studies needed to present data about the number of participants and the diagnostic samples, and case status had to be determined either by standardized measures of speech and/or language or by clearly defined clinical judgement. Studies of clinic populations were excluded. In addition, a distinction has been drawn between two designs. Single-level prevalence studies are based on surveys or clinical judgement and are not subjected to external validation. Two-stage studies are based on a pre-screen or net of the population, with proportions of passes and fails sampled and then given a diagnostic assessment, either on a standardized language procedure or on a criterion-referenced clinical judgement (which the authors have made some attempt to validate or define such that it could be replicable). The ‘single-level designs’ have been excluded on the grounds that they would be impossible to replicate and are likely to lead to under-reporting in all but the most clear-cut medical cases (Leske 1981).
Many of the largest studies have been excluded because, although the sampling was potentially of considerable interest, their single-stage approach would not have met the inclusion criteria for the review. They have also tended to report the lowest prevalence figures (Blum-Harasty and Rosenthal 1992). This two-stage criterion resulted in the exclusion of a number of studies that are commonly included in narrative reviews, most notably Drillien and Drummond (1983), Fundudis et al. (1979), Hull et al. (1971), Morley (1965) and Peckham (1973). The reason for applying such a stringent criterion is that it provides a measure of control against threats to the reliability of the results and hence to the validity of the interpretation.

Natural history

Large-scale prospective same-age cohort longitudinal studies offer the best evidence of natural history outcomes. However, most of the prospective studies located by the review were those where at least a proportion of the subjects received speech and language therapy services. A statement regarding the numbers in therapy, or the amount of therapy received, led to the study being excluded from this review. ‘Therapy’ was interpreted in the context of the studies as specific advice or assessment from speech and language therapists. Where there was no such statement about therapy contact for the subjects, the study was included. In a minority of instances a study explicitly stated there was no intervention for the subjects. General advice given routinely by a health professional other than a speech and language specialist was regarded as non-specific intervention, and such studies were included. This applied, for example, to some treatment studies where control groups received only general help. Moreover, within a prospective design with some subjects receiving treatment, where data for individual untreated subjects could be retrieved, their outcome was noted.
Another criterion for inclusion was a retest interval of at least 6 months, using norm- or criterion-referenced language outcome measures. No minimum number of subjects was set. These constraints, in particular the stringent inclusion criteria, also resulted in a number of well-known studies being excluded from the review (e.g. Sheridan and Peckham 1975, Klackenburg 1980, Paul 1993). While their designs have many exemplary features, and have achieved long-term follow-up for relatively large samples, it has not proved possible to separate out possible treatment effects in their data sets.

Reliability

To ensure that the data were being extracted consistently, two independent coders coded a subset of the papers and the percentage agreement rate was reported. Nineteen per cent of the prevalence papers and 14% of the natural history papers were subjected to the reliability check, resulting in reliabilities of 84.8 and 85.4% respectively. Differences were resolved following discussion between the coders. The majority of differences between the coders arose in gauging which figures to use as the denominator at any one stage in the calculation of prevalence or natural history, because of differential attrition in the populations concerned.

Results

Results from prevalence studies

To represent their diversity, the data in the present review were classified by the language domains measured and by the age of the children in the sample. Table 1 shows prevalence for (1) speech and/or language, (2) language and (3) speech. Table 2 focuses on those studies that have looked at language delay in the absence of speech delay and separates those language delays out into expressive/receptive delays, isolated expressive delays and isolated receptive delays. This group represents a subgroup of those quoted in table 1.

Table 1. Median prevalence estimates by type of speech and language delay and age
Type of delay: speech/language delay, language delay only and speech delay only. Each cell gives the median of estimates (%) with the range in square brackets.

Age (years)   Speech/language delay     Language delay only            Speech delay only
2;0           5.0 a                     16 [8–19] f                    –
3;0           6.9 [5.6–8] a,c,e         2.63 [2.27–7.6] g,h,k          –
4;6           5.0 [–] a                 –                              –
5;0           11.78 [4.56–19.0] b†      6.8 [2.14–10.4] b,g,i,j,d      7.8 [6.4–24.6] b,d,j
6;0           –                         5.5 d                          14.55 [12.6–16.5] d,j
7;0           –                         3.1 [2.02–8.4] g,d             2.3 d

a Bax et al. (1980, 1983); b Beitchman et al. (1986); c Burden et al. (1996); d Dudley and Delage (1980); e Randall et al. (1974); f Rescorla et al. (1993); g Silva et al. (1983); h Stevenson and Richman (1976), Richman et al. (1982); i Tomblin et al. (1997); j Tuomi and Ivanoff (1977); k Wong (1992).
† Beitchman et al. (1986) is the only study to include prevalence estimates for both speech and language and speech or language. This highlights the difficulty in synthesizing data in this area because both the level of difficulty and the extent to which the categories can reasonably be teased apart from the data are unclear.

Table 2. Median prevalence estimates by type of language delay and age
Type of delay: expressive and receptive language delay, expressive delay only and receptive delay only. Each cell gives the median percent with the range in square brackets.

Age (years)   Expressive and receptive language delay   Expressive delay only      Receptive delay only
2             –                                         16 [8–19] c                –
3             3.01 [2.63–3.4] a,b                       2.3 [2.27–2.34] a,d        2.63 a
5             2.14 a                                    4.27 a                     3.95 a
7             2.02 a                                    2.81 a                     3.59 a

a Silva et al. (1983); b Wong (1992); c Rescorla et al. (1993); d Stevenson and Richman (1976).
Table 3 displays studies that report prevalence across different age groups and which are impossible to combine in a meaningful way. In a number of studies more than one data set is provided for a given age band. To avoid over-representing such studies in the total data set, only a single (median) figure was included for each study.

Speech and/or language delay

The largest single group and, thus, the group most likely to be picked up in the process of early identification (the original aim of the systematic review) is that with combined speech/language difficulties, where researchers have included the possibility of either speech or language being implicated. This corresponds to column 1 of table 1, and the median for this column is 5.95%. However, although the median was adopted as the most appropriate measure of central tendency in view of the variability observed in the data, considerable caution needs to be taken in extrapolating to produce single composite prevalence estimates, especially given the relatively small number of studies involved. Taking speech and language delay as a single construct, the majority of studies took diagnostic assessment score cut-offs between −2 and −1.5 standard deviations (SD) below the mean for the standardization sample on a single measure, which, for normally distributed scores, automatically gives prevalence rates of between 2.28 and 6.68%. In most cases the range of findings is slightly wider than would be anticipated on this basis, ranging from 1.35 to 8% (table 1, column 1; table 3). Where more liberal cut points are adopted, higher prevalence figures result, and this is compounded when failure on any one of a number of measures is used as the criterion. For example, one of the highest figures came from a large population study that used multiple measures (Beitchman et al. 1986).
The figure for speech or language was much higher than most other estimates (19%), being an inclusive estimate for speech or language delay and using a cut-off of −1 SD below the mean on a series of different speech and language measures. By contrast, a scale of clinical difficulty adopted by Paul et al. (1992) (table 3) reported much lower figures (1.35% for mild, moderate and severe cases, 0.65% for moderate and severe, or what they term ‘serious’, cases only). These figures derive from a study in which case status depended upon a single objectified criterion for clinical judgement rather than performance on standardized assessments of speech and language. Such conservative criteria may reflect the fact that this study was carried out in Jamaica, a country where services, with fewer resources available, would be likely to apply a more stringent criterion as to who should receive services.

Table 3. Period prevalence estimates

Age (years)   Study                                        Type of delay         Prevalence estimate (%)
2:0–9.0       Paul et al. (1992), Thorburn et al. (1991)   Speech and language   1.35
6.0–12        Harasty and Reid (1994)                      Speech and language   8.0
3:0–5.0       Stewart et al. (1986)                        Speech                1.5
5.0–7.0       Kirkpatrick and Ward (1984)                  Speech                4.6
6.0–12        Harasty and Reid (1994)                      Speech                12.6
12–14         Warr Leeper et al. (1979)                    Speech                7.3

Language delay only

Language delay in the absence of speech delay is the group most commonly identified in the studies included in the present review. Some studies have reported children with either expressive or receptive delays as one group (table 1, column 2; table 2, column 1). The variability here is also wide, giving a range of 2.02–19%. In one case a cut-off which would anticipate a higher prevalence estimate (−1.25 SD) resulted in a convergent prevalence estimate of 7.4% because children had to receive two low test scores to achieve case status (Tomblin et al. 1997).
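The link between a cut-off and the prevalence it implies can be made concrete. The following is a minimal sketch, assuming that test scores are normally distributed and that case status depends on a single measure; the function name `expected_prevalence` is ours and purely illustrative. It reproduces the 2.28 and 6.68% figures quoted for cut-offs of −2 and −1.5 SD.

```python
from statistics import NormalDist

# Standardized scores: mean 0, SD 1 (an assumption about the test's norms).
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def expected_prevalence(cutoff_sd: float) -> float:
    """Percentage of a normally distributed population scoring below the cut-off."""
    return 100.0 * standard_normal.cdf(cutoff_sd)


for cutoff in (-2.0, -1.5, -1.25, -1.0):
    print(f"cut-off {cutoff:+.2f} SD -> {expected_prevalence(cutoff):.2f}%")
```

On these assumptions a −1.25 SD cut-off on a single measure would identify a little over 10% of children, which is why requiring two low scores, as in Tomblin et al. (1997), brings the estimate down towards those of other studies.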
The figures for expressive with receptive language delays combined show little variation within a narrow band, reflecting the conservative cut-off scores adopted in these studies and perhaps a greater confidence in the most appropriate level at which to set case status. It is of note that this most recent population study suggests a higher level of prevalence for children with language problems based on a discrepancy criterion than for the broader category of children with primary delays captured by most of the other studies. It is probable that the rather more liberal cut point used in this study explains the variation, but it would be wrong to overemphasize the difference. The important issue is whether the more inclusive definition of language delay results in more cases whose difficulties would not otherwise resolve being provided with appropriate intervention. Most notable is the sharp drop between 2 and 3 years, the high early figure being created by the measure of output vocabulary adopted by Rescorla et al. (1993). In the main, the estimates use single measures of language which amalgamate different components (comprehension, expression, vocabulary, verbal memory, etc.). Sometimes cases are identified by failure on any one of a number of subtests (Stevenson and Richman 1976, Dudley and Delage 1980). In other cases composite measures made up of a number of different tests are used (Tomblin et al. 1997). The evidence seems to suggest that it does not make much difference which approach is adopted. It is noteworthy that the level of prevalence rises at the 5-year mark. In part this is a function of the 10.4% figure reported by Silva et al. (1983), which sums expressive, receptive and expressive-receptive cases. These are separated out in table 2, column 1.
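As a brief worked check, the 10.4% figure can be recovered from the separate 5-year entries attributed to Silva et al. (1983) in table 2. This sketch assumes that the combined figure is simply the rounded sum of the three mutually exclusive categories; the variable names are illustrative.

```python
# Prevalence (%) at 5 years, as attributed to Silva et al. (1983) in table 2.
expressive_and_receptive = 2.14
expressive_only = 4.27
receptive_only = 3.95

# Assumption: the three categories do not overlap, so they can be added.
combined = expressive_and_receptive + expressive_only + receptive_only

print(f"combined language delay at 5 years: {combined:.2f}%")
```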
It is also possible that there is a change in expectations at this point in the child’s development: a closer approximation to the adult norm is now expected, and as the expectations increase so does the number of potential cases of delay. In a study of primary speech and language delay it is obviously important to establish what is known about the prevalence of specific language impairment (SLI), that is, language impairments that are identified by means of the discrepancy between language and IQ measures. Only two studies have achieved this, with very disparate results: Stevenson and Richman (1976) obtained a prevalence of 0.6% for SLI, and Tomblin et al. (1997) obtained 7.4% for a group of SLI children. This difference is likely to stem from the difference in the discrepancy criterion adopted and the age of the children in the study. Thus, Tomblin et al. specified that IQ had to be within the normal range with language level below −1.25 SD on their composite measure at 5 years, while Stevenson and Richman specified that expressive language had to be less than two-thirds of mental age, which in turn had to be two-thirds of chronological age at 3 years of age.

Speech delay only

The figures for speech (not language) delay are again highly variable, ranging from 2.3 to 24.6% (table 1, column 3; table 3). Kirkpatrick and Ward (1984) identified 4.6% using a −2 SD cut-off. Beitchman et al. (1986) set a −2 SD cut-off on particular subtests to reach their speech prevalence rate of 6.4%. In contrast, Tuomi and Ivanoff (1977) adopted a criterion of 1 year behind chronological age and reported 24.6% in kindergarten children, although this fell to 16.5% by Grade 1. These figures are relatively high and probably reflect the relatively liberal definition of what constitutes a case, which included children with poor ‘stimulability’, multiple and consistent errors and those with a lateral lisp.
It would appear that the children’s speech is reported in absolute rather than developmental terms. Significantly, prevalence studies of speech delay in the pre-school years did not fulfil the inclusion criteria for the review. This probably reflects a recognition of the difficulty in establishing case status during the very early years.

Language only: expressive and receptive delay

Two studies have reported data that allow combination of expressive and receptive language scales (Silva et al. 1983, Wong et al. 1992). It might be anticipated that these data should coincide with those from table 1, column 2. In fact, this method of classifying the children suggests a rather more conservative estimate of prevalence of between 2 and 3%, which remains constant across the age range. To classify children in this way it is necessary to use standardized procedures that tease apart expressive and receptive skills. But inevitably the precision of using measures in this way effectively exacerbates the circularity associated with the use of this type of metric, a point discussed further below.

Language only: expressive delay

Three authors report figures for delays only in expressive language skills. Two studies use only an expressive language measure (Stevenson and Richman 1976, Rescorla et al. 1993), while Silva et al.’s criteria (1983) seek to identify expressive delay in the absence of delay in receptive language skills. The figures reported by Stevenson and Richman, and Silva et al., range from 2.34 to 4.27% over the age range 3–7 years. The highest figure comes from Rescorla et al. (1993), with 8, 16 and 19% according to the screen cut-off. These figures were computed from a single vocabulary checklist rather than from diagnostic test performance, which was the approach adopted in the other studies. That is, the range of reported prevalence reflects the initial screening stage of this study. Alternative prevalence figures can be computed reflecting the reference measure, giving 9.8 and 13%.
However, the age range for Rescorla et al.’s study was at least 1 year below that of the other studies, and it may be that the range of expressive vocabulary development is especially wide in the slightly lower age range.

Language only: receptive delay

The figures for receptive delay are again tightly grouped, ranging from 2.63 to 3.95%. All the figures come from the Dunedin study (Silva et al. 1983). Silva et al.’s figures here could include children who are actually expressive-receptive language delayed, but who just achieved a pass on the expressive language measure while failing the receptive test. The figures may, therefore, be an instance of psychometric convention for cut-offs confounding the clinical impression.

Relying on standardized assessments to establish case status makes it difficult to judge whether prevalence decreases with age. A given cut-off on such a measure will result in the same percentage of the population being identified at any given time. The data presented here suggest that prevalence does not decrease over time. Bax et al. (1983) indicate fairly stable figures over the 2–4.5-year age range for their ‘definitely abnormal’ group. It seems likely that the majority of studies reviewed above take this latter relatively stable group as their subject pool. This could suggest that similar prevalence figures across a time range reflect the same group of children at different points in time. In fact, the evidence from Silva et al. (1983) is that individual children may move in and out of the disordered group. This may be true or it may be a function of the lack of stability of the metrics adopted. Predictability of problems depends upon the presenting symptoms at any one point, children with expressive and receptive delays being much more likely to have persistent problems than children with expressive delays alone.
By contrast, Bax et al.’s ‘possibly abnormal’ group shows prevalence decreasing from 17 to 12 to 7% between 2 and 4;6 years. This study suggests that there may be a group of children for whom development is especially variable in rate but who may have less entrenched problems. These problems may tend to reduce over time, perhaps for no other reason than test/retest error or regression to the mean (Robson 1993). But a further possibility is that this reduction comes as a result of the effect of speech and language therapy services, a point made by Butler (1989) with regard to the Bax et al. study.

Confounding factors

There are potentially confounding factors in the literature (most notably gender, socio-economic status and bilingualism) that need to be considered. A propensity for marked speech and language delays to be more common in males than females is generally confirmed by the studies reviewed here. Gender ratios derived are 1.25:1 (Randall et al. 1974), 2.26:1 (Stevenson and Richman 1976), 2.30:1 (Burden et al. 1996), 1.25:1 for both speech and language at 4 years (Stewart et al. 1986) and 2.3:1 (speech), with 1.2–1.6:1 (language) (Tuomi and Ivanoff 1977). There are two exceptions to this pattern. One is Beitchman et al. (1986), who found the reverse pattern for speech only (0.98:1), language only (0.98:1) and speech or language (0.82:1), and a most unexpected 0.46:1 for the speech and language diagnosis. The other is Tomblin et al. (1997), who suggest that while boys are more likely to present with SLI, the ratio is nearer equivalence. There are possible explanations for these figures. The first is suggested by the design of the Beitchman et al. study, which sought to sample and then project the false-negatives back into the original population sample. Of the false-negatives, the majority were girls, and in projecting back up to the main sample the authors projected the gender balance as well as the number of cases.
A second explanation is that the relatively liberal cut-off effectively misses the commonly observed discrepancy between the genders because those cases found may be less likely to be true clinical cases and as such may tend to reflect the normal gender balance in the population. A third explanation, and one favoured by Tomblin et al., is that existing data are the result of under-reporting of difficulties in girls, a phenomenon which has also been reported in the literature related to reading disabilities (Shaywitz et al. 1990). The fact that the other major cohort study in this area did detect the predicted imbalance (2:1), adopting the fifth percentile as a cut-off, suggests that it may be the cut-off which is the determining factor here rather than a high level of undetected difficulties in girls (Silva 1980).

The studies quoted here are not generally helpful in addressing the issue of increased prevalence in lower socio-economic groups. Researchers have commented on this issue, but the inclusion criteria for this review effectively excluded groups that might be considered as lower socio-economic status (SES). Some of the studies were carried out in areas with a relatively advantaged population (Burden et al. 1996, Rescorla et al. 1993). Bax et al. (1983) commented on the cumulative effect of low SES on language delay. Similarly, Harasty and Reed (1994) give some indication of the potential effect of introducing variation of this type into prevalence estimates. For example, they found higher rates of refusal when making up their diagnostic sample of lower SES children.

Also, the data here do not address bilingual or ethnically diverse populations. Stevenson and Richman (1976) deliberately excluded non-indigenous families from their study. Tomblin et al. (1997) noted increased prevalence of language delays in monolingual African-Americans, a finding not replicated by Stewart et al. (1986) in the only reviewed study explicitly to target prevalence in a black population.
Wong (1992) found levels of language delay for children from Hong Kong comparable to those in other studies.

In the review no studies were found, with the exception of Tomblin et al. (1997) (also Records and Tomblin 1994), which attempted to integrate clinical judgement and standardized procedure and to determine the effect that this might have on the estimate of prevalence. Accordingly, it is necessary to interpret the above results as being based on a notional psychometric convention. One would argue that this may tend, especially when seen in terms of a cut point of more than 1 SD below the mean, to lead to a relatively liberal cut-off which may, in turn, overestimate real need. However, it does not help one to escape the essentially circular nature of statistically derived prevalence estimates.

Results from natural history studies

A range of different findings is reported, namely those relating to speech and language, language-only and speech-only delays. As above, the criteria for determining what is a case are established by the authors of the papers in question, and the disparate nature of some of the findings probably reflects these criteria as much as it does real differences in the groups. This section summarizes each category in turn, reporting median figures for each category. Persistence here is defined as the number of children who remain cases of speech and/or language delay at the longest time point quoted by the study. Importantly, only speech and language outcomes were used, rather than broader educational outcomes or behaviour problems, on the grounds that these might result in a more comparable data set. The median persistence figures for studies examining speech only, language only, and speech and language are given in table 4. The individual studies are listed with their persistence estimates in table 5. Persistence here refers to the proportion of children whom the authors reported as continuing to meet their criteria for case status.
Table 4. Median persistence reported for speech and language, language only and speech only

Type of delay (no. of studies) | Median persistence (%) | Range of reported persistence (%) | Age range (years;months)
Speech only (3) | 50 | 22–54 | 4;10–33;0
Language only (10) | 66 | 0–100 | 0;10–7;0
Speech and language (1) | 38 | n/a | 3;0–7;0

Table 5. Persistence for individual studies: speech only, speech and language, and language only

Study | Cases/sample size | Persistence (%) | Age range (years;months)
Speech and language
Fiedler et al. (1971)† | 46/138 | 38 | 3;0–7;0
Median | | 38 |
Language (expressive and receptive)
Hall et al. (1993), Hall (1996) | 5/9 | 100 | 4;7–7;0
Rescorla and Schwartz (1990) | 25/25 | 54 | 2;2–3;0
Richman et al. (1982) | 22/705 | 65 | 2;0–3;0
Scarborough and Dobrich (1990) | 4/16 | 0 | 2;6–5;6
Silva et al. (1983)† | 23/1027 | 78 | 3;0–7;0
Thal and Tobias (1992), Thal and Bates (1998) | 10/30 | 40 | 1;10–3;0
Klee et al. (1998) | 6/36 | 67 | 2;0–3;0
Ward (1992) | 119/321 | 82 | 1;0–2;0
Ward (1992) | 61/321 | 73 | 0;10–1;10
Ward (1992) | 23/321 | 50 | 0;10–1;10
Median | | 66 |
Speech
Bralley and Stoudt (1977)† | 60/60 | 22 | 6;6–11;6
Felsenfeld et al. (1992) | 24/52 | 50 | 4;10–33;0
Renfrew and Geary (1973) | 150/150 | 54 | 5;0–5;6
Median | | 50 |

† Persistence for the longest of three periods within the same study.

The median figures for studies examining expressive/receptive, expressive and receptive delays are given in table 6, and the corresponding individual studies are listed with persistence estimates in table 7.

Table 6. Median persistence for receptive language only, expressive language only and expressive and receptive language

Type of delay (no. of studies) | Median persistence (%) | Range of reported persistence (%) | Age range (years;months)
Expressive and receptive language (6) | 75.6 | 26–100 | 0;10–7;0
Expressive language only (5) | 40 | 0–54 | 0;10–7;0
Receptive language only (1) | 8.7 | n/a | 3;0–7;0
Median persistence is derived from the single quoted figure for each sample of children within the study concerned.

Table 7. Persistence for individual studies: receptive language only, expressive language only and expressive/receptive language

Study | Cases/sample size | Persistence (%) | Age range (years;months)
Expressive/receptive language
Richman et al. (1982) | 22/705 | 65 | 2;0–3;0
Ward (1992) | 119/321 | 82 | 1;0–2;0
Ward (1992) | 61/321 | 73 | 0;10–1;10
Hall et al. (1993) | 5/9 | 100 | 4;7–7;0
Klee et al. (1998) | 6/36 | 67 | 2;0–3;0
Silva et al. (1983)† | 23/1027 | 78.2 | 3;0–7;0
Median | | 75.6 |
Expressive language
Rescorla and Schwartz (1990)‡ | 25/25 | 54 | 2;2–3;0
Thal and Tobias (1992) | 10/30 | 40 | 1;10–3;0
Scarborough and Dobrich (1990) | 4/16 | 0 | 2;6–5;6
Silva et al. (1983) | 21/1027 | 28.6 | 3;0–7;0
Ward (1992) | 23/321 | 50 | 0;10–1;10
Median | | 40 |
Receptive language
Silva et al. (1983)† | 23/1027 | 8.7 | 3;0–7;0

† Persistence for the longest of three periods within the same study.
‡ Median between two expressive scales reported.

To give some indication of the populations reported in tables 5 and 7, the total number of clinical cases relative to the size of the sample is expressed as cases/sample size. The population could have been made up entirely of cases or could have been made up of a larger population sample from which a clinical subgroup was picked out.

Speech and language delay

Only one very early study combined speech and language delay (Fiedler et al. 1971) in following children between 3 and 7 years of age. The relatively high figure for spontaneous improvement over the 4 years of the study needs to be contrasted with the much higher level of persistence over the first year (87%). Of interest here is that of the control group 29% developed problems by 4 years and 8% had minor speech problems at 7 years. Unfortunately, beyond knowing that the children failed a screening measure at 3 years, there is little in this study about the initial presenting symptoms of the children concerned.
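The medians in tables 4 to 7 can be checked directly from the per-study persistence figures. The following is a minimal sketch, assuming the language-only percentages and sample sizes as transcribed from table 5; the group labels for the three Ward (1992) samples and the function name `summarise` are illustrative and not taken from the original studies. The figures quoted in the text for the larger language-only studies (range 50–82%, median 70%) are reproduced when the exclusion is read as removing samples of 30 or fewer; that reading is an assumption made here.

```python
from statistics import median

# (study, total sample size, persistence %) for the language-only studies,
# transcribed from table 5. The Ward group labels are illustrative only.
LANGUAGE_ONLY = [
    ("Hall et al. (1993), Hall (1996)", 9, 100),
    ("Rescorla and Schwartz (1990)", 25, 54),
    ("Richman et al. (1982)", 705, 65),
    ("Scarborough and Dobrich (1990)", 16, 0),
    ("Silva et al. (1983)", 1027, 78),
    ("Thal and Tobias (1992), Thal and Bates (1998)", 30, 40),
    ("Klee et al. (1998)", 36, 67),
    ("Ward (1992), group 1", 321, 82),
    ("Ward (1992), group 2", 321, 73),
    ("Ward (1992), group 3", 321, 50),
]

# Persistence % for the three speech-only studies in table 5.
SPEECH_ONLY = [22, 50, 54]


def summarise(rows, min_sample=0):
    """Return (median, minimum, maximum) persistence for studies whose
    total sample size is strictly greater than min_sample."""
    values = [pct for _, n, pct in rows if n > min_sample]
    return median(values), min(values), max(values)


if __name__ == "__main__":
    print(summarise(LANGUAGE_ONLY))                 # all ten samples
    print(summarise(LANGUAGE_ONLY, min_sample=30))  # samples of 30 or fewer dropped
    print(median(SPEECH_ONLY))
```

Because a median depends only on rank order, dropping the four smallest samples narrows the range sharply while moving the central value by only a few points.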
Language delay only

As with the prevalence data this group represents the largest single source of data. Across the 10 studies the range is extreme, although it is apparent that taking out the studies with the smallest samples would also reduce this range considerably. Thus, if studies with samples below 30 are removed the range shifts to 50–82%, although the median only increases marginally (to 70%). Ward (1992) followed infants from a mean age of 1 year to 2 years of age (tables 5 and 7). The sample is divided into three groups, being expressive and receptive delay with listening difficulties, expressive and receptive delay without listening difficulties, and expressive delay alone. There is a differential outcome for the three groups: while 82% of the first group continues to show language delay, the figures are 73 and 50% respectively for groups 2 and 3. Of interest is the fact that the type of presenting delay changed in some cases from a combined expressive-receptive delay to one of expressive delay alone.

Speech delay only

Speech problems may appear less persistent than language problems, in that Bralley and Stoudt (1977) report up to 78% of articulation errors resolving. However, the long-term data from Felsenfeld et al. (1992) suggest that underlying language difficulties may continue for children originally identified as having speech delay. Also, there is a relatively small body of evidence from follow-up studies of children treated for speech problems (and thus not represented in the present review) to suggest that literacy skills are at risk even after resolution of speech delay (Stackhouse and Snowling 1992). The studies looking at speech development give a broad span of 6 months, 5 and 28 years follow-up, all starting with subjects initially ~5 years of age.
Over 6 months Renfrew and Geary (1973) showed that 54% of speech cases persist, while Bralley and Stoudt (1977) reported nearly 22% of speech problems persisting over 5 years. Felsenfeld et al. (1992) followed some of a longitudinal cohort (begun in 1960) to track adulthood outcomes. In this study children definitely had no therapy until third grade (USA schooling, ~8 years); there is no account in Felsenfeld et al.’s paper of any later therapy received. Felsenfeld et al. found 50% of children with residual speech problems, as assessed by sentence level tests of articulation. Language measures in adulthood also showed deficits relative to controls, even though the children were originally identified as speech delayed. More broadly, nonverbal reasoning and personality scores in adulthood were not significantly different between the group originally speech delayed and the control group. It should be noted here that the controls were identified in adulthood and were not matched to those adults who had originally presented as speech delayed.

Language only: expressive and receptive delay

This group of studies corresponds closely to that reported in table 5 once the smaller studies focusing on expressive delays have been reassigned, the only differences being that Hall (1996) remains in and the expressive group from the Ward (1992) study is taken out. The result is that the median changes little. The Ward data are made up of two separate groups of children both with expressive/receptive difficulties, but one with and one without concomitant listening difficulties. In one study 65% of delays persisted to age 4 (Richman et al. 1982) and 38% of the same group was experiencing cognitive deficits at 8 years. Of particular interest are Silva et al.’s data (1983), which show that those children presenting with a language delay are a fluctuating group; some children failed at each of the three language assessment points and some failed at only one or two assessment points.
There is, however, the possibility of test-retest error contributing to this effect. The more stable subgroup comprised those children with generalized language problems affecting both receptive and expressive skills.

Language only: expressive delay only

Again there was a considerable disparity between the samples adopted. Two were taken from relatively large-scale population studies (Silva et al. 1983, Ward 1992) and the other three were recruited specifically for the study by notices in paediatricians’ offices or local newspapers. Some indicated precise criteria for expressive language delay (Rescorla and Schwartz 1990). Others were looking to identify late talkers on the basis of expressive measures but monitored receptive language and other skills (Scarborough and Dobrich 1990, Thal et al. 1991, Thal and Tobias 1992). The picture that emerges is one of relatively high spontaneous remission. Interestingly, where receptive skills are monitored there is some indication that expressive skills tend to improve but that there is little change in receptive skills (Scarborough and Dobrich 1990). Thal and Tobias (1992) noted a better outcome for those children with normal receptive skills and who use gesture to compensate for their lack of expressive output. Additional outcomes for areas other than spoken language are given by three studies (Scarborough and Dobrich 1990, Richman et al. 1982, Silva et al. 1987) (tables 5 and 7). These all point to reduced reading skills at age 7 or 8 years among those with earlier language delay (whether or not that oral language delay has resolved).

Language only: receptive delay only

Only one study sought to identify a group of children with receptive language delay in the absence of expressive delays (Silva et al. 1982).
Although it did prove possible to identify such a group, the fact that only two of 23 children continued to have problems rather suggests that such children are not at obvious risk for persistent language difficulties. Nevertheless 46% of this group did go on to have reduced IQ and reading performance at age 7. This also suggests that this is a lower risk group than the expressive delay only group if language measures are the target outcome but of comparable risk when other areas of development are taken into consideration. The risk of persistent problems would appear to be much lower than that for the expressive/receptive delay group.

Summary of major findings

Inevitably such a review is a retrospective exercise and it is only possible to use the classification framework adopted by the authors concerned. Combining data across studies can be problematic because of potential incompatibility. Nevertheless there are major differences in the range of persistence reported. In the main the figures for speech delay appear to be more variable than those for language delay. Equally there is considerable variability in the persistence of speech plus language delay in untreated samples. The highest level of persistence is that for expressive and receptive language delay, in turn suggesting that it is this group which is recognized and best defined in terms of its case status.

Discussion

At one level there does seem to be a measure of consensus emerging regarding the prevalence of speech and language delay. The median of 5.95% quoted here corresponds well with the 5.59% that was the earliest estimate quoted for speech impairment (Blanton 1916) and with the 6% found by Provonost (NINDS 1969) in a large-scale study (87 288 subjects) identifying speech and language impairments in New England (both cited in Healey et al. 1981). Neither of these works was included in the present synthesis for methodological reasons.
Of course reliability does not necessarily imply validity and it is unclear whether the definitions adopted are comparable from a vertical (severity of delay) or a horizontal (range of included measures) perspective. However, this level of convergence suggests that some agreement may be emerging as to what level of difficulty should be considered a clinical problem. In few of the studies reviewed was any attempt made to monitor persistent schooling difficulties explicitly. Nevertheless it might reasonably be argued that the level set corresponds to a clinical understanding about which children are most ‘at risk’.

These estimates suggest that speech and language delay in early childhood is a common problem which can have important implications for the child. Whether such delay is, in fact, a sufficiently important health issue to merit population screening is addressed elsewhere (Law et al. 1998). Even a moderate or mild delay may cause appreciable concern to parents and other carers. For a high prevalence condition such as speech and language delay the priority in terms of the screening process is to prevent the identification of false positives (maximize specificity) while maximizing the proportion of true positives (sensitivity). Acceptable levels of accuracy of such measures are discussed in Law et al. (1998) but their validity will, in the end, depend on the relative costs of over and under identification. There are also social and economic costs associated with different outcomes but these have yet to be explored. The natural history data suggest that a predictive element needs to be built into any such programme. To date no such model exists.

The present review indicates that at least two common beliefs are not based on evidence from existing prevalence data.
There is little evidence to suggest declining prevalence across the 0–16-year age range, with the possible exception of a sharp drop in expressive delays after 2 years, and there are no data to suggest that the prevalence has increased over the past 30 years. This apparent stability appears to contradict the current understanding of the progression of developmental language delays. In particular, the natural history data suggest that for many children (~60% of children with early expressive delays, perhaps 25% with expressive/receptive delays) difficulties resolve. The explanation for the stability comes from the relationship between prevalence and incidence. If the increase in the number of cases corresponds to the decrease resulting from spontaneous remission the prevalence will remain static. Clearly the key to good prevalence data is longitudinal rather than cross-sectional data.

There is no evidence which would suggest that there is a real increase in cases during the 30 years covered by the review (1967–97) despite the fact that many report increased rates of referral and educational placement for these children, particularly in the early school years (Jowett and Evans 1996, Reid et al. 1996, SOEID 1996). This suggests that the estimation of prevalence and the demands made on services are not necessarily equivalent. Indeed, referral is likely to be a function of the level of service provision and cannot be equated with prevalence. Only with a greater understanding of the longitudinal picture of cases can prevalence be established.

The present data set included studies where replicable criteria for case status were provided. In the majority of cases this involved the application of standardized assessment measures. The advantage of this approach is that the resulting data are easier to compare with those from other studies.
The disadvantage is that there is an implicit circularity in setting a cut-off on a measure that has been developed on a normal population. Identical cut-offs on standardized measures applied at different ages will necessarily result in the same proportion of the population being identified. Any discrepancy between the proportion of the population predicted from the cut-off adopted and that found in the prevalence study can better be explained by the potential differences between the two populations than by differences in the rate of true cases in the population studied. Cut-offs tend to follow psychometric convention without any direct attempt to link them with clinical judgement. Clinical judgement of case status may also be problematic because it is likely to relate to availability of services and to the anticipated response to therapy but, as Tomblin et al. (1997) found, it is likely to be closely related to psychometric norms and an explicit link between the two may result in higher clinical validity. Stronger support for prevalence estimates would be indicated if the rates derived by clinical judgement of case status showed agreement with those rates derived using conventional cut-off scores.

The review was charged with the identification of gaps in the literature. Apart from the lack of clarity regarding the level of case status there is one major omission in the data. The one important group that has been almost completely ignored in the available prevalence literature related to speech and language disorders is the child with pragmatic impairments. It is true that this group has only relatively recently been identified as communication impaired and it is true that many such children would be subsumed into the autistic continuum. However, given the intransigent nature of many of the problems experienced by these children (Rutter et al.
1992), it is important to know whether they have been incorporated into the other figures or omitted as a distinct non-linguistic category.

The evidence suggests that our understanding of the natural history of speech and language delay, and thus the definition of case status, remains something of an imprecise science at present. Improving prediction should help resolve issues of case definition by demonstrating what would happen if the children were not treated. Good and poor outcome groups could then be traced back to the point at which they were initially identified and relative risk assigned to individual profiles. Furthermore, it should be possible to establish what has been referred to as the ‘critical point’ in the progress of the disability before which therapy is either more effective or easier to apply than afterwards (Sackett et al. 1991).

This review confirms the conclusion reached by a number of other studies that children with language delays which incorporate both expressive and receptive skills present the clearest picture. These children are likely to find it difficult to process incoming language, to initiate communication with others and to formulate their responses appropriately. Accordingly they are less likely to compensate for their difficulties and are most likely to find difficulty in coping with the demands of school.

Many of these studies report persistence over comparatively short periods. While this may be interesting in its own right it does not help address the more central question of whether speech and language delays are likely to have a lasting impact on the child. If the condition is broadly benign it might be anticipated that persistence would decline the longer the follow-up period. Alternatively it may be that the condition left untreated leads to a wide range of social and educational sequelae.
These issues were not fully explored in the present synthesis but are highly relevant to the area of speech and language delay. Whether or not these oral language delays have resolved, multiple educational and social difficulties are noted for children who had earlier speech/language delays. Of early expressive language delayed children, 41–75% showed reading problems at age 8 years. This finding is confirmed by the larger body of literature from follow-up studies of children who may have received intervention for speech and language delay and who were subsequently identified as being dyslexic (e.g. Catts 1991).

Indeed there is also the issue of the prognosis for children with speech and language delays who have received intervention. Follow-up studies tend to paint a more negative picture of outcomes for children whose difficulties do not resolve by school entry (Petrie 1975, Bishop and Edmundson 1987, Cook et al. 1988, Huntley et al. 1988, Urwin et al. 1988, Haynes and Naidoo 1991, Davison and Howlin 1997). The findings reveal that in addition to continuing problems in verbal language, reading, spelling and other aspects of educational attainment can also be affected (Aram and Nation 1980, Stark et al. 1984, Bishop and Adams 1990, Catts 1991, Morris-Friehe and Sanger 1994, Tallal et al. 1997) together with behaviour and other aspects of psycho-social adjustment (Baker and Cantwell 1987, Silva et al. 1987, Beitchman et al. 1989, 1996). While the problems are more marked for those children whose language difficulties are associated with low intellectual ability or those with both receptive and productive language difficulties, those with primary delay can also experience marked long-term difficulties which may persist into adulthood (Haynes and Naidoo 1991). How does this rather more negative picture square with the natural history data reported above?
The main reason for this difference is that these follow-up studies in the main report data on children who are already confirmed ‘clinical’ cases and are, therefore, already in the highest risk group. By including only the most severe cases they may overestimate the negative sequelae of the population of speech and language delayed children as a whole. Given the level of persistence in the treated groups it might be more appropriate to argue that spontaneous improvement in this group is no longer the central issue. Rather the outcome should be how well the children can cope with the disabilities associated with their impairments.

The nature of the concept of natural history needs to be addressed. While it may be technically possible to provide no input at all, in reality the needs of the children contribute to the world around them. They receive help from parents, from teachers and nursery staff, which may reflect those needs. Intervention of this type may not be comparable to specialist speech and language input but it is difficult to say that it has no effect. The term ‘natural history’ is derived from less context-dependent interventions. Thus, a drug regimen may be said to be in place or not. If it is not, it may be possible to follow the course of the disease. However, it is not clear that the same is true of speech and language delays, especially in children who, as a group, may be paid particular attention by all those around them by virtue of those very difficulties.

A second difficulty in interpretation is the ethical implications of withholding intervention in order to carry out such studies. While this may have been true when no support services were available, for example for the children in the Dunedin study (Silva et al. 1987), this is now not the case for most children in the English-speaking world at least.
On what grounds would it be reasonable to withhold intervention for the sort of length of time such that it would be meaningful to ascertain natural history? It has been suggested that 3 years is an appropriate period over which to measure change (Aylward 1988). To a certain extent the answer to this question depends on what is known about intervention. On the one hand if one cannot show that intervention is effective it may be that natural history is a reasonable concept. On the other hand if benefits can be demonstrated then the case for withholding treatment over the long term recedes. The intervention data reviewed in the present synthesis (Boyle et al. 1999) have suggested that there is a good case for assuming that intervention can be shown to be beneficial although the data set remains comparatively small.

This has led some commentators to suggest that a distinction be drawn between routine intervention and specialist intervention (Shriberg 1994, Shriberg et al. 1994). Community provision of speech and language therapy could constitute natural history in the sense that children receive other services within the healthcare system, for example the services of a family doctor or a community nurse. Such services are often of a lower level of intensity than specialist provision and might reasonably be construed as the normal activities of social or health services. However, such an approach does not really help overcome the essentially rather arbitrary distinctions that are drawn between different levels of input. Where does specialist intervention begin: once children have been referred to specialist provision? The move towards integrating children into their local community schools may mean that fewer children would be receiving the kind of specific intervention that would preclude them from involvement in natural history studies of the kind reviewed here.
A model of predictive risk

Interpretation of the data is dogged by an uncertainty as to which children are truly cases in the sense that without intervention they would necessarily go on to experience problems associated with early speech and language delay in school and beyond. This issue is complicated further by social constructs of ‘normality’ and the difficulty in establishing a true ‘gold standard assessment’ with a theoretically determined definition of case status. An alternative approach is to move away from the model adopted in the above studies and towards a model of predicted risk. This would involve identifying the key predictive factors of subsequent case status and building them into a model that identifies an ‘at risk’ group. This is a relatively new concept in the field of speech and language therapy, although see Tomblin et al. (1997) for some early work in quantifying that risk with relation to SLI.

Much depends on the outcome measures adopted. It seems likely that speech and language measures alone will not provide a meaningful outcome and literacy and other educational measures will need to be built into such a model (Beitchman et al. 1996, Drillien et al. 1988, Glascoe 1999). Such an approach would circumvent the need to carry out separate prevalence studies in bilingual or disadvantaged populations but it would mean gathering data which included representative populations rather than attempting to model outcomes on a discrete clinical population, as is often the case. It would not be a matter of reporting, as most authors have already done, that males are over-represented in this group of children but of establishing to what extent gender is a risk factor in different subcategories of the population. The advantage of such an approach is that it could make use of the available genetic (Bishop et al. 1995) and intervention data (Boyle et al.
1999a, b) along with data regarding co-morbidity such as behaviour (Stevenson 1996) as it becomes available to modify and improve the model. Similarly such a model could begin to test the findings of contributory factors identified in other studies. Effectively a model based on risk would be one that embraced the variance in the population.

Early attempts to model the progression of speech and language disorder have relied heavily on regression analysis (Schery 1985) and have often included essentially normal middle class children in their samples (Bee et al. 1982). However, there is a need to look at the capacity of independent variables to discriminate clinical outcomes. To date there have been a great many studies examining the association between other factors and language delay/disorder (Lassman et al. 1980). But with three exceptions little attention has been paid to the risk attached to these associations (Drillien et al. 1988, Tomblin et al. 1997, Paul and Fountain 1999). Tomblin et al., in a retrospective study, adopted the odds ratio metric, a means of ascertaining the level of association between the occurrence of a particular condition and a particular risk factor. The odds of having an exposure to a given risk among clinical cases are contrasted with the odds of having it in the normal population. They calculated the odds ratios associated with case status of SLI at 5 years and found that paternal rather than maternal history of speech and language difficulties and paternal smoking during foetal development were particularly associated with specific language difficulties. Conversely, breast-feeding and length of breast-feeding functioned as protective factors, effectively reducing the risk of SLI. Paul and Fountain (1999) examined a small prospective cohort of thirty-six children with specific expressive language delays identified at 20–34 months and followed up to 7 years of age.
Using discriminant function analysis, they found that the best predictors of outcome were socio-economic status, parent report of adaptive behaviour and gross motor skills. Although case status in these studies was comparable (10th centile) the measures differed, as did the variables put into the analyses, which were in themselves based upon different statistics. In neither case was the level of intervention factored into the analysis.

It is also possible to look at speech and language delays as markers for other conditions (Whitehurst et al. 1991). Drillien et al. (1988), for example, used early language performance as a predictor of subsequent language and educational development. They examined the relationship between developmental screening results in five developmental areas and their association with subsequent educational and language performance. They reported significant odds ratios of 1.60 (CI 1.12–2.27) to 2.44 (1.5–3.97) for the association between all aspects of developmental screening and subsequent expressive language difficulties but found much weaker associations between early expressive language difficulties and subsequent school performance. This corresponds well with the relatively low level of persistence of expressive difficulties reported in the present review. Each of these associations was calculated on the basis of the individual child’s performance at developmental screening checks between 20 weeks and 5 years. Like Paul and Fountain they emphasized the saliency of adaptive behaviour and motor skills in the predictive model.

The aim of such analyses would be to provide the clinician with the tools to say that, given the following presenting symptoms, and predisposing and ameliorating factors, the following outcome could be predicted given the following intervention. For example, it might be possible to say that a child with an expressive delay has an even chance of developing persistent effects either in language or in other skills.
These odds may lengthen if negative social interaction and medical factors are included but may shorten if short-term intervention of known impact is added into the equation. For example, Boyle et al. (1999a, b) indicate a mean effect size of the order of 1 SD for relatively short-term ‘dosage’ interventions for expressive language delays in the pre-school years. Conversely, it might be possible to say, for example, that a child with an expressive/receptive difficulty may have a seventy-five percent chance of developing associated difficulties, a risk which may be increased if other contributory factors are added but which may not be alleviated with short-term intervention. Finally there may be differential effects of impairments and environmental effects across time. For example, there is some suggestion that for children with motor difficulties the motor skills serve as good predictors in the short term but the environment plays a more substantial role as the child gets older (Cohen et al. 1986).

The role of social class is clearly one that needs to be examined with care. Across a population it is likely to operate as a proxy variable accounting for a considerable proportion of the variance. However, in doing so it is likely to mask the effects of other components in any analysis, components which may be more amenable to modification at the level of the individual.

Reviews of the literature often end with a call for more research. The conclusion drawn here is that while health service providers may wish for measures of prevalence upon which to base services it would be premature to call for more prevalence data. More population estimates may be warranted, particularly in relation to the needs of local populations, but it is the risk, not the single prevalence estimate, towards which progress most needs to be made.
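The odds ratio metric discussed above, as adopted by Tomblin et al. and reported with confidence intervals by Drillien et al., can be illustrated with a small sketch. The counts used here are invented for illustration and are not taken from any of the studies reviewed; the function name `odds_ratio` is hypothetical, and the confidence interval uses the standard log (Woolf) approximation, which is an assumption and may not be the method the cited authors used.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio for a 2x2 table with an approximate confidence interval.

    The odds of exposure among cases are divided by the odds of exposure
    among controls. The interval is the Woolf (log) approximation;
    z = 1.96 gives roughly 95% coverage.
    """
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    ratio = (a / b) / (c / d)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 30 of 100 cases and 15 of 100 controls exposed.
    ratio, lower, upper = odds_ratio(30, 70, 15, 85)
    print(f"OR = {ratio:.2f} (CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 indicates an association unlikely to be due to chance alone, but it says nothing about how much of the outcome variance the factor explains, which is why the text argues for building several such factors into a single predictive model.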
There is a need for large-scale cohort studies to tease out the relationship between the component parts of the equation across time and to establish a better understanding of the incidence of speech and language delays. More attention needs to be paid to the risk and protective factors and the extent to which they are influenced by confounding variables and their interactions.

Conclusions

This paper has demonstrated the close association between prevalence and natural history. Appreciation of one cannot follow without a good understanding of the other. Ultimately the issue revolves around the concepts of prediction and relative risk. Such issues can only be properly dealt with using longitudinal population data and with clinical validation of psychometric classification.

Acknowledgements

The NHS Centre for Reviews and Dissemination at the University of York commissioned this systematic review. F. H. received support from the Department of Clinical Communication Studies, City University, London, in the latter part of this project. The authors also thank Christopher Norris, Biomedical Research Indexing, and two anonymous reviewers of the paper.

References

Aram, D. M. and Nation, J. E., 1980, Pre-school language disorders and subsequent language and academic difficulties. Journal of Communication Disorders, 13, 159–170.
Aylward, G. P., 1988, Issues in prediction and developmental follow-up. Developmental and Behavioural Pediatrics, 9, 307–308.
Baker, L. and Cantwell, D. P., 1987, A prospective psychiatric follow-up of children with speech/language disorders. Journal of the American Academy of Child and Adolescent Psychiatry, 26, 546–553.
Bee, H., Barnard, K. E., Eyres, S. J., Gray, C. A., Hammond, M. A., Spietz, A. L., Snyder, C.
and Clark, B., 1982, Prediction of IQ and language skills from perinatal status, child performance, family characteristics and mother–infant interaction. Child Development, 53, 1134–1156.
Beitchman, J. H., Hood, J., Rochon, J. and Peterson, M., 1989, Empirical classifications of speech/language impairment in children: II. Behavioral characteristics. Journal of the American Academy of Child and Adolescent Psychiatry, 28, 118–123.
Beitchman, J. H., Wilson, B., Brownlie, E. B., Walters, H. and Lancee, W., 1996, Long term consistency in speech/language profiles: 1. Developmental and academic outcomes. Journal of the American Academy of Child Psychiatry, 35, 804–825.
Bishop, D. and Adams, C., 1990, A prospective study of the relationship between specific language impairment, phonology and reading retardation. Journal of Child Psychology and Psychiatry, 31, 1027–1050.
Bishop, D. V. M. and Edmundson, A., 1987, Language-impaired four year olds: distinguishing transient from persistent impairment. Journal of Speech and Hearing Disorders, 52, 156–173.
Bishop, D. V. M., North, T. and Donlan, C., 1995, Genetic basis of specific language impairment: evidence from a twin study. Developmental Medicine and Child Neurology, 37, 56–71.
Blanton, S., 1916, A survey of speech defects. Journal of Educational Psychology, 7, 581–592.
Blum-Harasty, J. and Rosenthal, J., 1992, The prevalence of communication disorders in children: a summary and critical review. Australian Journal of Human Communication Disorders, 20, 63–80.
Boyle, J., Law, J., Harris, F., Harkness, A. and Nye, C., 1999a, The effectiveness of intervention approaches for primary speech and language delays: a systematic review of outcomes from group design studies. Journal of Speech, Language and Hearing Research (submitted).
Butler, J., 1989, Child Health Surveillance in Primary Care: A Critical Review (London: HMSO).
Catts, H.
W., 1991, Early identification of dyslexia: evidence from a follow-up study of speech-language impaired children. Annals of Dyslexia, 41, 163–177.
Cohen, S. E., Parmalee, A. H., Beckwith, L. and Sigman, M., 1986, Cognitive development in preterm infants: birth to 8 years. Journal of Developmental and Behavioural Pediatrics, 7, 102–110.
Cook, J., Urwin, S. and Kelly, K., 1988, Pre-school language intervention—a follow-up of some within-group differences. Child: Care, Health and Development, 15, 381–400.
Davison, F. M. and Howlin, P., 1997, A follow-up study of children attending a primary-age language unit. European Journal of Disorders of Communication, 32, 19–36.
Drillien, C. and Drummond, M., 1983, Developmental Screening and the Child with Special Needs (London: Heinemann).
Drillien, C. M., Pickering, R. M. and Drummond, M. B., 1988, Predictive value of screening for different areas of development. Developmental Medicine and Child Neurology, 30, 294–305.
Fundudis, T., Kolvin, I. and Garside, R. F., 1979, Speech Retarded and Deaf Children: Their Psychological Development (London: Academic Press).
Glascoe, F., 1999, Toward a model for an evidenced based approach to development/behavioural surveillance, promotion and patient education based on parents’ concerns. Developmental and Behavioural Pediatrics (submitted).
Gordis, L., 1996, Epidemiology (London: W. B. Saunders), pp. 77–89.
Haynes, C. and Naidoo, S., 1991, Children with Specific Speech and Language Impairment (Oxford: Blackwells).
Healey, W. C., Ackerman, B. L., Chappell, C. R., Perrin, K. L. and Stormer, J., 1981, The Prevalence of Communicative Disorders: A Review of the Literature (Rockville: American Speech and Hearing Association).
Hull, F. M., Mielke, P. W., Timmons, R. J. and Willeford, J. A., 1971, The national speech and hearing survey. ASHA, 13, 501–509.
Huntley, R. M. C., Holt, K., Butterfill, A. and Latham, C., 1988, A follow-up study of a language intervention programme.
British Journal of Disorders of Communication, 23, 127–140.
Jowett, S. and Evans, C., 1996, Speech and Language Therapy Services for Children (Slough: NFER).
Klackenburg, G., 1980, What happens to children with retarded speech at three? Acta Paediatrica Scandinavica, 69, 681–685.
Lassman, F. M., Fisch, R. O., Vetter, D. K. and LaBenz, E. S., 1980, Early Correlates of Speech Language and Hearing (Littleton: PSG).
Law, J., Boyle, J., Harris, F., Harkness, A. and Nye, C., 1998, Screening for Speech and Language Delay: A Systematic Review of the Literature. Health Technology Assessment, 2(9).
Law, J., Boyle, J., Harris, F., Harkness, A. and Nye, C., 1999, Screening children for speech and language delay—the results of a systematic review of the literature. Developmental Medicine and Child Neurology (submitted).
Leske, M., 1981, Prevalence estimates of communicative disorders in the US. ASHA Journal, 23, 217–225.
MacKeith, R. and Rutter, M., 1972, A note on the prevalence of language disorders in young children. In M. Rutter and J. Martin (eds), The Child with Delayed Speech (London: Heinemann), pp. 48–52.
Morley, M., 1965, The Development and Disorders of Speech in Childhood (Edinburgh: E&S Livingstone).
Morris-Friehe, M. and Sanger, D., 1994, Follow-up of children at risk for language problems. Journal of Communication Disorders, 27, 241–256.
Mosciki, E., 1984, The prevalence of ‘incidence’ is too high. Journal of the American Speech and Hearing Association, 26, 39–40.
National Health Service Centre for Reviews and Dissemination, 1996, Undertaking Systematic Reviews of Research on Effectiveness. Report no. 4 (York: NHS University of York, Centre for Reviews and Dissemination).
National Institute of Neurological Disease and Stroke, National Institutes of Health, US Department of Health, Education and Welfare, 1969, Human Communication and its Disorders: An Overview. Monograph no. 10 (Bethesda: NINDS).
Paul, R., 1993, Patterns of development in late talkers. Journal of Childhood Communication Disorders, 15, 7–14.
Paul, R. and Fountain, R., 1999, Predicting outcomes of early expressive language delay (submitted).
Peckham, C. S., 1973, Speech defects in a national sample of children aged seven years. British Journal of Disorders of Communication, 8, 2–8.
Petrie, I., 1975, Characteristics and progress of a group of language disordered children with severe receptive difficulties. British Journal of Disorders of Communication, 10, 123–133.
Records, N. L. and Tomblin, J. B., 1994, Clinical decision making: describing the decision rules of practising speech-language pathologists. Journal of Speech and Hearing Research, 37, 144–156.
Reid, J., Millar, S., Tait, L., Donaldson, M., Dean, E., Thomson, G. and Grieve, R., 1996, Pupils with Special Education Needs: The Role of Speech and Language Therapists (Edinburgh: SCER).
Robson, C., 1993, Real World Research: A Resource for Social Scientists and Practitioner-Researchers (Oxford: Blackwells).
Rutter, M., Mahwood, L. and Howlin, P., 1992, Language delay and social development. In P. Fletcher and D. Hall (eds), Specific Speech and Language Disorders in Children (London: Whurr).
Sackett, D. L., Haynes, R. B., Guyatt, G. H. and Tugwell, P., 1991, Clinical Epidemiology: A Basic Science for Clinical Medicine, 2nd edn (London: Little Brown).
Schery, T. K., 1985, Correlates of language development in language disordered children. Journal of Speech and Hearing Disorders, 50, 73–83.
The Scottish Office: Education and Industry Department, 1996, The Education of Pupils with Language and Communication Disorders (Edinburgh: HMSO).
Shaywitz, S. E., Shaywitz, B. A., Fletcher, J. M. and Escobar, M. D., 1990, Prevalence of reading disability in boys and girls: results of the Connecticut longitudinal study. Journal of the American Medical Association, 264, 998–1002.
Sheridan, M. D.
and Peckham, C., 1975, Follow-up at 11 years of children who had marked speech defects at 7 years. Child: Care, Health and Development, 1, 157–166.
Shriberg, L. D., 1994, Developmental phonological disorders: moving towards the 21st century—forwards, backwards or endlessly sideways? American Journal of Speech-Language Pathology: A Journal of Clinical Practice, 3, 26–28.
Shriberg, L. D., Gruber, F. A. and Kwiatkowski, J., 1994, Developmental phonological disorders. III: Long-term speech sound normalization. Journal of Speech and Hearing Research, 37, 1151–1177.
Silva, P. A., Williams, S. M. and McGee, R., 1987, A longitudinal study of children with developmental language delay at age three: later intelligence, reading and behaviour problems. Developmental Medicine and Child Neurology, 29, 630–640.
Stackhouse, J. and Snowling, M., 1992, Developmental verbal dyspraxia. II: A developmental perspective on two case studies. European Journal of Disorders of Communication, 27, 35–54.
Stark, R. E., Bernstein, L. E., Condino, R., Bender, M., Tallal, P. and Catts, H., 1984, Four-year follow-up study of language impaired children. Annals of Dyslexia, 34, 49–68.
Stevenson, J., 1996, Developmental changes in the mechanisms linking language disabilities and behaviour disorders. In J. H. Beitchman, N. J. Cohen, M. M. Konstantareas and R. Tannock (eds), Language, Learning and Behavior Disorders: Developmental, Biological and Clinical Perspectives (Cambridge: Cambridge University Press).
Stothard, S. E., Snowling, M. J., Bishop, D. V. M., Chipchase, B. B. and Kaplan, C. A., 1998, Language impaired pre-schoolers: a follow-up into adolescence. Journal of Speech, Language and Hearing Research, 41, 407–418.
Tallal, P., Allard, L., Miller, S. and Curtiss, S., 1997, Academic outcomes of language impaired children. In C. Hulme and M. Snowling (eds), Dyslexia: Biology Cognition and Intervention (London: Whurr).
Tomblin, J. B., Smith, E. and Zhang, X., 1997, Epidemiology of specific language impairment: prenatal and peri-natal risk factors. Journal of Communication Disorders, 30, 325–344.
Urwin, S., Cook, J. and Kelly, K., 1988, Pre-school language intervention: a follow-up study. Child: Care, Health and Development, 14, 127–146.
Whitehurst, G. J., Fischel, J. E., Lonigan, C. J., Valdez-Menchaca, M. C., Arnold, D. S. and Smith, M., 1991, Treatment of early expressive language delay: if, when and how. Topics in Language Disorders, 11, 55–68.

Appendix 1: Studies meeting the criteria for inclusion in the prevalence domain of the review

Bax, M., Hart, H. and Jenkins, S., 1980, The health needs of the pre-school child (unpublished).
Bax, M., Hart, H. and Jenkins, S., 1983, The behaviour, development and health of the young child: implications for care. British Medical Journal, 286, 1793–1796.
Beitchman, J. H., Nair, R., Clegg, M., Patel, P. G., Ferguson, B., Pressman, E. et al., 1986, Prevalence of speech and language disorders in 5-year-old kindergarten children in the Ottawa-Carleton region. Journal of Speech and Hearing Disorders, 51, 98–110. [Erratum, 1987, 52, 94.]
Burden, V., Stott, C. M., Forge, J. and Goodyer, I., 1996, The Cambridge Language and Speech Project (CLASP). 1: Detection of language difficulties at 36–39 months. Developmental Medicine and Child Neurology, 38, 613–631.
Dudley, J. G. and Delage, J., 1980, Incidence des troubles de la parole et du langage chez les enfants franco-quebecois. Communication Humaine, 5, 131–142.
Harasty, J. and Reed, V. A., 1994, The prevalence of speech and language impairment in two Sydney metropolitan schools. Australian Journal of Human Communication Disorders, 22, 1–23.
Kirkpatrick, E. and Ward, J., 1984, Prevalence of articulation errors in New South Wales primary school pupils. Australian Journal of Human Communication Disorders, 12, 55–62.
Paul, T. J., Desai, P. and Thorburn, M.
J., 1992, The prevalence of childhood disability and related medical diagnoses in Clarendon, Jamaica. West Indian Medical Journal, 41, 8–11.
Randall, D., Reynell, J. and Curwen, M., 1974, A study of language development in a sample of 3 year old children. British Journal of Disorders of Communication, 9, 3–16.
Rescorla, L., Hadicke-Wiley, M. and Escarce, E., 1993, Epidemiological investigation of expressive language delay at age two [Special Issue: Language Development in Special Populations]. First Language, 13, 5–22.
Richman, N., Stevenson, J. and Graham, P. J., 1982, Preschool to School: A Behavioural Study (London: Academic Press).
Silva, P. A., 1980, The prevalence, stability and significance of developmental language delay in preschool children. Developmental Medicine and Child Neurology, 22, 768–777.
Silva, P. A., McGee, R. and Williams, S. M., 1983, Developmental language delay from three to seven years and its significance for low intelligence and reading difficulties at age seven. Developmental Medicine and Child Neurology, 25, 783–793.
Stevenson, J. and Richman, N., 1976, The prevalence of language delay in a population of three-year-old children and its association with general retardation. Developmental Medicine and Child Neurology, 18, 431–441.
Stewart, J. M., Hester, E. J. and Taylor, D. L., 1986, Prevalence of language, speech and hearing disorders in an urban preschool black population. Journal of Childhood Communication Disorders, 9, 107–123.
Thorburn, M. J., Desai, P. and Durkin, M., 1991, A comparison of efficacy of key informant and community survey methods in the identification of childhood disability in Jamaica. Annals of Epidemiology, 1, 255–261.
Tomblin, J. B., Records, N., Buckwalter, P., Zhang, X., Smith, E. and O’Brien, M., 1997, Prevalence of specific language impairment in kindergarten children. Journal of Speech, Language and Hearing Research, 40, 1245–1260.
Tuomi, S.
and Ivanoff, P., 1977, Incidence of speech and hearing disorders among kindergarten and grade 1 children. Special Education in Canada, 51, 5–8.
Warr-Leeper, G. A., McShea, R. S. and Leeper, H. A. Jr, 1979, The incidence of voice and speech deviations in a middle school population. Language, Speech and Hearing Services in Schools, 10, 14–20.
Wong, V., Lee, P. W. H., Lieh-Mak, F., Yeung, C. Y., Leung, P. W. L., Luk, S. L. and Yiu, E., 1992, Language screening in preschool Chinese children. European Journal of Disorders of Communication, 27, 247–264.

Appendix 2: Studies meeting the criteria for inclusion in the natural history domain of the review

Bralley, R. C. and Stoudt, R. J., 1977, A five year longitudinal study of development of articulation. Language, Speech and Hearing Services in Schools, 8, 176–180.
Felsenfeld, S., Broen, P. A. and McGue, M., 1992, A 28-year follow-up of adults with a history of moderate phonological disorder: linguistic and personality results. Journal of Speech and Hearing Research, 35, 1114–1125.
Fiedler, M. F., Lenneberg, E. H., Rolfe, U. T. and Drorbaugh, J. E., 1971, A speech screening procedure with three-year-old children. Pediatrics, 48, 268–276.
Hall, N. E., 1996, Language and fluency in child language disorders: changes over time. Journal of Fluency Disorders, 21, 1–32.
Hall, N. E., Yamashita, T. S. and Aram, D. M., 1993, Relationship between language and fluency in children with developmental language disorders. Journal of Speech and Hearing Research, 36, 568–579.
Klee, T., Carson, D. K., Gavin, W. J., Hall, L., Kent, A. and Reece, S., 1998, Concurrent and predictive validity of an early language screening program. Journal of Speech, Language and Hearing Research, 41, 627–642.
Renfrew, C. E. and Geary, L., 1973, Prediction of persisting speech deficit. British Journal of Disorders of Communication, 8, 37–41.
Rescorla, L. and Schwartz, E., 1990, Outcome of toddlers with specific expressive language delay.
Applied Psycholinguistics, 11, 393–407.
Richman, N., Stevenson, J. and Graham, P. J., 1982, Preschool to School: A Behavioural Study (London: Academic Press).
Scarborough, H. S. and Dobrich, W., 1990, Development of children with early language delay. Journal of Speech and Hearing Research, 33, 70–83.
Silva, P. A., 1980, The prevalence, stability and significance of developmental language delay in preschool children. Developmental Medicine and Child Neurology, 22, 768–777.
Silva, P. A., McGee, R. and Williams, S. M., 1983, Developmental language delay from three to seven years and its significance for low intelligence and reading difficulties at age seven. Developmental Medicine and Child Neurology, 25, 783–793.
Silva, P. A., Williams, S. M. and McGee, R., 1987, A longitudinal study of children with developmental language delay at age three: later intelligence, reading and behaviour problems. Developmental Medicine and Child Neurology, 29, 630–640.
Stevenson, J. and Richman, N., 1976, The prevalence of language delay in a population of three-year-old children and its association with general retardation. Developmental Medicine and Child Neurology, 18, 431–441.
Thal, D. J. and Bates, E., 1988, Language and gesture in late talkers. Journal of Speech and Hearing Research, 31, 115–123.
Thal, D. J. and Tobias, S., 1992, Communicative gestures in children with delayed onset of oral expressive vocabulary. Journal of Speech and Hearing Research, 35, 1281–1289.
Thal, D. J., Tobias, S. and Morrison, D., 1991, Language and gesture in late talkers: a one year follow-up. Journal of Speech and Hearing Research, 34, 604–612.
Ward, S., 1992, The predictive validity and accuracy of a screening test for language delay and auditory perceptual disorder. British Journal of Disorders of Communication, 27, 55–72.