Model-Based Estimates of Income for Wards in England and Wales, 2001/02

User Guide

Summary

Consultation with users has indicated that there is a widespread requirement for better information on income at the small area level. Following the Government's decision not to include an income question in the 2001 Census, the Office for National Statistics (ONS) took the lead in exploring alternative measures for the provision of income data for small areas.

The Small Area Income Estimates Project was established with the aim of producing sets of ward-level estimates of average household income using the modelling techniques developed by the ONS Methodology Directorate. Estimates and confidence intervals for the values of average ward income (1998/99) for all wards in England and Wales have been produced based on 1998 ward boundaries. These were released as experimental¹ statistics on the ONS website in February 2004 (Longhurst et al (2004)).

One of the main limitations of the estimates highlighted by users concerned the relevance of the estimates in terms of making comparisons with the 2001 Census data, as they relate to 1998/99 and are based on 1998 ward boundaries.

A Neighbourhood Statistics (NeSS) funded project was established to improve the model and make it comparable with the 2001 Census by using more recent sources of covariate² and survey data.

Estimates of average weekly household income have been produced for 2001/02 based on Census Area Statistics (CAS) wards for 2003. These have been rounded to the nearest £10. Estimates are produced for the following four income types:

• total³ household weekly income (unequivalised);
• net household weekly income (unequivalised);
• net household weekly income before housing costs (equivalised); and
• net household weekly income after housing costs (equivalised).

Note that equivalised income means that the household income values have been adjusted to take into consideration household size and composition.

The release of a second, more up-to-date series of model-based average weekly household income estimates for wards raises the issue of comparability with the 1998/99 estimates. The 2001/02 estimates are not comparable with the 1998/99 estimates, and this document summarises the reasons why. The 1998/99 estimates have been removed from the public domain and replaced with the 2001/02 estimates to prevent erroneous comparisons. Details of how the estimates have been developed and guidance on the appropriate use of the estimates are also given.

The first section of the report provides some background to the project and some guidance on the use of the estimates. The second section provides a brief technical description of the methodology used to produce the model-based estimates. Maps of the ward-level estimates for England and Wales are presented in Section 3.

It must be noted that these experimental estimates are not calculated in the same way as the national and regional household income data published in June 2004 and April 2005 (ONS (2005)). The output geography and definition of income are different, as are the statistical data sources employed. Therefore these two data sets cannot be quantitatively reconciled.

¹ The term ‘experimental’ is applied to any set of ONS statistics that do not yet meet the rigorous quality standards of National Statistics and/or may be subject to change due to methodological development.
² A variable (or variables) that is either known or can be estimated relatively accurately and is then used in the estimation of other variables.
³ Total income is consistent with the gross income measure used in the 1998/99 model-based estimates. For more information on definitions see Appendix A.
1 Background and Guidance on Use

1.1 Introduction

Consultation with users, including representatives from central and local government, the academic and business sectors, about requirements for information from the 2001 Census underlined the need for a question on income. In particular, the work substantiated the widespread and increasing demand for detailed information at a range of geographical levels.

Although the Government recognised the need for information on income, concern about the risks to the conduct of the Census meant that the preferred approach was to assess whether or not requirements could be met by using alternative sources of data. In accordance with proposals set out in the Census White Paper, the Government Statistical Service (GSS) set up a working group to investigate the feasibility of various options including:

• using data on the receipt of benefits from the Department for Work and Pensions; and
• producing modelled income data.

The results of this work were set out alongside findings from the detailed Census programme of research and question testing in a paper circulated to users so they could identify the preferred approach for meeting their requirements. The Government considered this information before it made the decision not to include an income question in the 2001 Census.

Following the initial work into the feasibility of producing model-based estimates of income, the ONS established a project to implement this approach. Estimates and confidence intervals for average weekly household income for wards in England and Wales for 1998/99 were released in February 2004.

One of the main limitations of the estimates highlighted by users concerned the relevance of the estimates in terms of making comparisons with the 2001 Census data, as they relate to 1998/99 and are based on 1998 ward boundaries.
A Neighbourhood Statistics (NeSS) funded project was established to update the model-based estimates and make them comparable with the 2001 Census. The model-based methodology used for the 1998/99 estimates has been used for the 2001/02 estimates.

1.2 Model-based approach

The model-based approach is based on finding a relationship between weekly household income (as measured in the Family Resources Survey (FRS)) and covariate information (usually from Census or administrative sources) for the wards that are represented in the Survey. This relationship is then used to provide estimates of average weekly household income for all wards. To ensure that the model-based estimates are consistent with the FRS published estimates at higher geographical levels, the model-based estimates are constrained to the direct estimates for Government Office Regions (GORs) in England and the estimate for the country of Wales.

It is important to recognise that the model-based approach gives estimates that are of a different nature from the standard estimates from the FRS. This is because they are dependent upon correctly specifying the relationship between weekly household income and the covariate information. A brief explanation of the methodology is provided in Section 2.

1.3 Guidance on use and limitations of the estimates

The main limitation of estimates for small areas, either those estimated directly from responses to surveys or model-based, is that they are subject to variability (see Section 2). ONS has produced confidence intervals associated with the model-based estimates for each ward in order to make the accuracy of the estimates clear (see Section 2.6 for further information).
Five further limitations of the estimates must be considered:

• the consistency and accuracy of income estimates for other, often larger geographical areas;
• the conclusions that may be drawn from the estimates on the overall distribution of income and the ranking of specific areas;
• consistency between the estimates for the four different types of income;
• consistency with different time periods; and
• comparability with 1998/99 estimates.

1.3.1 Consistency and accuracy of estimates for other geographical areas

The model-based methodology produces ward-level estimates of average weekly household income. These ward-level estimates can be aggregated to provide income estimates for larger geographical areas such as Local Authority Districts (LADs) or regions. However, this method is approximate and hence it is not possible to assess the precision of the aggregated estimates.

The model-based methodology has been developed to ensure that the ward estimates are constrained to direct survey estimates from the FRS for GORs in England and the estimate for the country of Wales. For example, the model-based estimates for the wards in Wales, when aggregated, correspond to the FRS estimate of average weekly household income for Wales. However, the model-based estimates will not be consistent with FRS estimates of average weekly household income for other geographical levels.

The issue of geographic consistency and methods for assessing the accuracy of estimates for areas other than wards will be explored as part of future research.

1.3.2 Distribution and ranking of income levels

In common with any ranking based on estimates, care must be exercised in interpreting the ranking of the wards. One needs to take into account the variability of the estimates when using these figures.
For example, the confidence interval around the highest ranked ward suggests that the ward lies among the group of wards with the highest income levels rather than necessarily being the ward with the highest average income. Estimates for two particular wards can only be described as significantly different if the confidence intervals for those estimates do not overlap.

Although these model-based estimates can be used to rank wards by income, they cannot be used to make any inferences on the distribution of income levels across the wards. The estimation procedure will tend to shrink estimates towards the average level of income for the whole population, so model-based estimates at the top of the scale tend to be under-estimated and those at the bottom over-estimated. Nevertheless, estimates can be used to make certain inferences, e.g. the average weekly household income for ward A is greater than the average for ward B (if the appropriate confidence intervals do not overlap). However, making assertions such as x% of wards have an average household income over £y per week is not valid.

1.3.3 Consistency between the four different types of income

Estimates have been produced for four different types of income. In some cases slight inconsistencies may occur between the estimates of the different income types for a particular ward, e.g. a ward may have a larger estimate for net income than for total income. Although there may be some such inconsistencies, the models selected are the best possible to model the general pattern of income over all wards. This reinforces the need to look at the confidence intervals and not just the estimates, as the confidence intervals summarise the variability of the estimates caused by the modelling process (see Section 2).

1.3.4 Consistency with different time periods

These estimates have been produced on 2003 CAS ward boundaries and therefore cannot be translated onto any other boundary system.
Users must be aware of this when using the estimates in any application or drawing conclusions from the data. The estimates are also based on 2001/02 survey data and so are only valid for this period.

1.3.5 Comparability with the 1998/99 estimates

In order for two sets of model-based estimates to be comparable, the survey and covariate data used in the models should be the same with the exception of the reference time period. In addition, the methodology employed to produce the two sets of estimates should be the same, as should the output geographies for the estimates. If these criteria are met, one can say that estimates for the same ward in two different time periods are significantly different if the confidence intervals do not overlap.

The methodology for estimating ward-level average income for 2001/02 is exactly the same as that used for the 1998/99 estimates. However, due to the availability of different covariates and also different output geographies, the 2001/02 estimates are not comparable with the 1998/99 estimates. As a result, the 1998/99 estimates have been removed from the public domain and replaced with the 2001/02 estimates to prevent erroneous comparisons.

1.3.6 Examples of data use

Given that the model-based estimates are subject to limitations, some examples of appropriate and inappropriate uses for the estimates have been produced.

1.3.6.1 Ward comparisons

When comparing two model-based estimates, one ward may only be said to have a significantly lower or higher average income than another if the confidence intervals for the two wards do not overlap. For example, using Table 1 it may be said that ward C has a significantly lower model-based income estimate than ward A since the 95% confidence intervals do not overlap. However, it would be wrong to say that ward B has a significantly lower model-based income estimate than ward A, since the confidence intervals overlap.
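The non-overlap rule for ward comparisons can be sketched in Python as follows. This is an illustrative sketch only and not part of the ONS methodology: the names `WardEstimate`, `intervals_overlap` and `compare` are hypothetical, and the figures are those given for wards A, B and C in Table 1.

```python
from typing import NamedTuple


class WardEstimate(NamedTuple):
    """A model-based estimate with its 95% confidence limits (hypothetical type)."""
    name: str
    estimate: float
    lower: float
    upper: float


def intervals_overlap(a: WardEstimate, b: WardEstimate) -> bool:
    """Return True if the two confidence intervals share at least one value."""
    return a.lower <= b.upper and b.lower <= a.upper


def compare(a: WardEstimate, b: WardEstimate) -> str:
    """Describe ward b relative to ward a using the non-overlap rule."""
    if intervals_overlap(a, b):
        return "not significantly different"
    return "significantly lower" if b.upper < a.lower else "significantly higher"


# Figures taken from Table 1 of this guide.
ward_a = WardEstimate("Ward A", 1660, 1310, 2120)
ward_b = WardEstimate("Ward B", 1110, 910, 1360)
ward_c = WardEstimate("Ward C", 1080, 920, 1270)

print(compare(ward_a, ward_c))  # significantly lower
print(compare(ward_a, ward_b))  # not significantly different
```

The check deliberately uses only the published confidence limits, since these are the only measure of precision available for each ward.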
                      95% confidence interval for the income estimate
          Estimate    Lower Confidence Limit    Upper Confidence Limit
Ward A    1660        1310                      2120
Ward B    1110        910                       1360
Ward C    1080        920                       1270

Table 1: Model-based income estimates and confidence intervals for three wards

         Estimate
LAD A    420
LAD B    560
LAD C    770

Table 2: Aggregated model-based income estimates for three LADs

Ward-level estimates can be aggregated to higher geographical levels. Table 2 provides model-based estimates of the average level of household income for three LADs. Although the income estimate for LAD A may seem a great deal lower than that for LAD C, there are no confidence intervals available for geographies other than wards. This means that we have no measure of the precision of the estimates and therefore cannot say that one aggregated model-based estimate is significantly different from another.

1.3.6.2 Ward profiles

The model-based ward estimates of income can be used in conjunction with other data sources to build up a profile of a particular ward. Examples are shown below in Table 3.

                                                   Ward A        Ward B            Ward C
Model-based estimate of total weekly household     £230          £710              £1230
income, with 95% confidence interval               [200, 280]    [600, 840]        [950, 1610]
Rank of LAD in which ward lies on the Index of     Top 15        Upper Quartile    Bottom 15
Multiple Deprivation, income domain (2004)¹
% Adults claiming Income Support (2001/02)         37%           3%                2%
% Properties in Council Tax band H (2001)          <1%           3%                29%

Table 3: Ward Profiles

¹ The area with a rank of 1 is the most deprived.

1.4 Results

Model-based estimates and their 95% confidence intervals have been produced for 2001/02, for CAS wards (2003) in England and Wales, for the following income types:

• total household weekly income (unequivalised);
• net household weekly income (unequivalised);
• net household weekly income before housing costs (equivalised); and
• net household weekly income after housing costs (equivalised).

Equivalised income means that the household income values have been adjusted to take into consideration the number and type of people in the household; it represents the income level of every individual in the household. Equivalisation is needed in order to make sensible income comparisons between households. For more details on these income definitions see Appendix A.

Reliable estimates cannot be produced for a small number of wards. The wards affected are those lying in the City of London and the Isles of Scilly. Instead, estimates have been provided for the LADs in which these wards lie.

1.5 Future plans

This project was established to address users’ requirements for more up-to-date information on income levels, for areas that are comparable with outputs from the 2001 Census. The 2001/02 model-based average income estimates on 2003 CAS ward boundaries meet these requirements. However, a limitation is that they cannot be compared with the 1998/99 model-based estimates because they are based upon different covariate data sources and different output geographies.

In order to optimise the comparability of future series of average income estimates with the 2001/02 estimates, a scoping exercise should be carried out to identify and secure the availability of the same survey and covariate data sources.

As a result of strong user interest in information on the distribution of income within wards, a separate project has been set up to look at providing estimates of the proportions of households on low incomes within wards.

2 Guide to the Methodology

This section provides a brief description of the methodology for producing model-based estimates of average weekly household income at ward level. A full description of the methodology can be obtained by request from ‘spatialanalysis@ons.gov.uk’.
For more information on the general small area estimation modelling procedure developed by the ONS, refer to the Small Area Estimation Project (SAEP) Report (Heady et al (2003)).

2.1 How do model-based estimates differ from standard survey estimates?

The principal reasoning behind the need for small area estimation is that surveys are designed to provide reliable estimates at national and sometimes regional levels; they are not typically designed to provide estimates for smaller geographical areas. The inevitable result for areas such as wards is that the vast majority will contain no sample respondents at all and hence no direct survey estimates will be possible.

In order to provide ward estimates of income using survey data (here the Family Resources Survey (FRS) is used) a model-based approach has been adopted. This methodology is dependent upon the correct specification of the model, the quality and relevance of the input data sources and the fit of the model.

The premise behind the model-based methodology is that we can find a relationship between weekly household income, as measured by the FRS, and other covariate sources of information (mainly provided from Census and administrative data) in the sampled wards. We can then ‘borrow strength’ from this relationship to generalise and produce reliable estimates of average household weekly income for all wards.

During our research a number of different relationships and sources of information were investigated. The best sources of information available were selected for modelling. We are satisfied that while there are some limitations with our methodology (see Section 1) the models are well specified and the modelling assumptions hold.

2.2 Description of the data

The survey data: The survey data were obtained from the 2001/02 Family Resources Survey (FRS).
The FRS was chosen as the source for this study since it is the survey with the largest sample that includes appropriate questions on total (gross income plus tax credits/benefits) and net income.

The covariate data: The covariate data came from the following sources:

• the 2001 Census;
• 2001/02 Department for Work and Pensions (DWP) benefit claimant counts;
• 2001 HM Land Registry dwelling price data;
• 2002 Council Tax data; and
• Regional/country indicators: GOR indicators (that split England into nine regions) plus a country indicator for Wales.

2.3 Deriving the estimates

To derive the required estimates of average weekly household income for wards we:

• built a model relating the survey variable to the covariate information for wards covered by the survey;
• used the model and covariate data (which are available for all wards) to estimate the average weekly household income for all wards; and
• ensured the model-based estimates were constrained to the FRS income estimates at the GOR/country level. The model-based ward estimates of income were aggregated to GOR/country level and comparisons made between these aggregated estimates and the FRS estimates at these levels. The relevant ratios of the FRS estimates to the aggregated model-based estimates at the GOR/country level were then used to scale the model-based ward-level estimates.

2.4 The model for income

Models have been developed for the purpose of producing small area estimates of income for England and Wales, for each income type. The models relate the FRS survey estimate of weekly household income to the following predictors. The covariate data are listed in order of significance:

England and Wales

• the social class of the ward population;
• the composition of households in the ward, e.g. number of children in the household;
• regional/country indicators;
• the employment status of the ward population;
• the proportion of the ward population claiming DWP benefits; and
• the proportion of dwellings in each of the Council Tax bands in a ward.

The result is average weekly household income by ward.

2.5 Validation of the model

A number of diagnostic checks have been used to assess the appropriateness of the models developed for producing ward-level estimates of income. The analysis shows that in general the models are well specified and the assumptions are sound. This provides confidence in the accuracy of the estimates and the associated confidence intervals. In addition, the methodology used to produce the model-based estimates has undergone an academic review and been evaluated by the wider academic community.

As well as validating the process of making the estimates it is necessary to validate the estimates themselves. Analysis to compare the model-based estimates with other sources of income data was carried out to establish the plausibility of the model-based estimates.

These processes have ensured that the methodology and its application are valid, the models developed are the best possible for the data available and the model-based estimates are plausible.

2.6 How precise are the estimates?

Each of the estimates is accompanied by a 95% confidence interval. This interval represents ‘uncertainty’ in the modelling process: assuming the model holds, intervals constructed in this way are expected to contain the true value around 95% of the time. For example, if a ward estimate of average weekly household income is £580 and the 95% confidence interval is [£510, £650] then, assuming the model holds, we can be 95% confident that the true average weekly household income for that ward lies within this range.

3 Maps

3.1 Introduction

The model-based ward-level estimates of average weekly household income can be displayed on maps.
The interval ranges in each map have been chosen to aid interpretation. 5% of wards are included in the ranges for the highest and lowest income levels. 20% of wards are included in the ranges for the second highest and second lowest income levels. 75% of wards are included in the ranges for income levels nearest the average value. The number of wards in each interval range is shown.

Section 3.2 shows the maps of the estimates for the four income types for England and Wales.

3.2 England and Wales

Map 1 shows the geographical variation of ward estimates of average total weekly household income (unequivalised) in England and Wales. The map shows that the majority of wards with the highest levels (darkest areas) of total weekly household income are concentrated in the South of England, in and around London. As we move out further from this area the average ward income decreases. Areas of lighter colour, i.e. lower income levels, are common in the South West and North of England. The majority of wards in Wales have average income levels below the national average. South Wales and North East England show the lowest income levels.

Maps 2, 3 and 4 display the geographical variation of ward income in England and Wales for the three other income types. The maps show a similar pattern of income distribution to Map 1, although Map 3 and Map 4 show a slightly stronger North-South divide.

Map 1
Map 2
Map 3
Map 4

Appendix A Survey data - income definitions

This appendix contains details on the four income types modelled. For more specific information please refer to the survey reports (Dhaneca et al (2003) and Adihetty et al (2003)).

Total household weekly income (unequivalised)

Total household weekly income is the sum of the gross income of every member of the household plus any income from tax credits/benefits such as Working Families Tax Credit.
It is calculated as the sum of income from:

• earnings (gross);
• self-employment;
• investments;
• disability benefits;
• retirement pensions and income support;
• other benefits (including tax credits);
• other pensions; and
• other/remaining sources.

Net household weekly income (unequivalised)

Net household weekly income (unequivalised) is the sum of the net income of every member of the household. It is calculated using the same components as total income but with the following deducted:

• income tax payments;
• national insurance contributions;
• domestic rates/council tax;
• contributions to occupational pension schemes;
• all maintenance and child support payments, which are deducted from the income of the person making the payments; and
• parental contribution to students living away from home.

Net household weekly income before housing costs (equivalised)

Net household weekly income before housing costs (equivalised) is composed of the same elements as net household weekly income but is subject to the McClements equivalence scale (Adihetty et al (2003)). Applying the equivalence scale adjusts the household income values to take into consideration the number and composition of people in the household; it represents the income level of every individual in the household.

Equivalisation is needed in order to make sensible income comparisons between households. For example, one household may have 2 adults and 2 children and have a total weekly household income of £300. If this is compared with a household containing just 1 adult who has a total weekly household income of £270, then although the first household has the higher total weekly income it is the second that has the higher standard of living.

Although a number of equivalence scales have been developed, the equivalence scale used for the income estimates is the McClements scale, which is used across a number of Government surveys.
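Equivalisation divides a household's income by a factor that depends on who lives in the household. The Python sketch below illustrates the idea for a household head, an optional spouse and dependent children only. The weights are an assumption: they are the before-housing-costs McClements values as commonly published (couple = 1.00), they are not listed in this guide, and they should be checked against the HBAI report (Adihetty et al (2003)) before any real use. The names `child_weight` and `equivalise` are hypothetical.

```python
# ASSUMPTION: before-housing-costs McClements weights as commonly published,
# expressed relative to a couple with no children (= 1.00). Not taken from
# this guide; other adult types in the full scale are omitted for brevity.
HEAD = 0.61
SPOUSE = 0.39
CHILD_BANDS = [  # (maximum age in band, weight)
    (1, 0.09), (4, 0.18), (7, 0.21), (10, 0.23), (12, 0.25), (15, 0.27),
]


def child_weight(age: int) -> float:
    """Return the assumed weight for a dependent child of the given age."""
    for max_age, weight in CHILD_BANDS:
        if age <= max_age:
            return weight
    raise ValueError("this sketch only covers children aged 15 or under")


def equivalise(income: float, has_spouse: bool, child_ages=()) -> float:
    """Divide household income by the sum of the members' weights."""
    factor = HEAD + (SPOUSE if has_spouse else 0.0)
    factor += sum(child_weight(age) for age in child_ages)
    return income / factor


# Three households, each with an unequivalised net weekly income of 100 pounds.
print(round(equivalise(100, has_spouse=False)))                    # 164
print(round(equivalise(100, has_spouse=True)))                     # 100
print(round(equivalise(100, has_spouse=True, child_ages=(4, 7))))  # 72
```

Under these assumed weights the sketch reproduces the figures quoted in the worked example in this appendix.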
An example of the effect of applying the McClements scale is as follows: a single person, a couple and a couple with two children aged four and seven all have unequivalised net weekly household incomes of £100 before housing costs. After equivalisation, these become £164 (single person), £100 (couple) and £72 (couple with children).

Net household weekly income after housing costs (equivalised)

Net household weekly income after housing costs (equivalised) is composed of the same elements as net household weekly income but is subject to the following deductions prior to the McClements equivalence scale being applied:

• rent (including housing benefit);
• water rates, community water charges and council water charges;
• mortgage interest payments;
• structural insurance premiums (for owner occupiers); and
• ground rent and service charges.

B Bibliography

Adihetty, S., Lunn, S., Pitt, W., Stanborough, J., Vigurs, C., Wilkie-Jones, C. (2003). Households Below Average Income: An analysis of the income distribution from 1994/95 – 2001/02. DWP Publication. The Stationery Office, ISBN 1 84123 557 1. http://www.dwp.gov.uk/asd/hbai.asp/

Dhaneca, N., Ellerd-Elliott, S., Herring, I., Horsfall, E., Matejic, P., Shome, J., Snow, J. (2003). Family Resources Survey: Great Britain, 2001/02. DWP Publication. The Stationery Office, ISBN 1 84388 153 5. http://www.dwp.gov.uk/asd/frs/2001_02/index.asp/

Heady, P., Clarke, P., Brown, G., Ellis, K., Heasman, D., Hennell, S., Longhurst, J., Mitchell, B. (2003). Small Area Estimation Project Report. Model-Based Small Area Estimation Series No.2, ONS Publication. http://www.statistics.gov.uk/

Longhurst, J., Cruddas, M., Goldring, S., Mitchell, B. (2004). Model-based Estimates of Income for Wards, 1998/99: Technical Report. To be published in Model-Based Small Area Estimation Series, ONS Publication.

Office for National Statistics (2005). Regional Household Income. http://www.statistics.gov.uk/StatBase/Product.asp?vlnk=7359/