Secondary consumer data
Chapter 2
Statistics for Marketing & Consumer Research
Copyright © 2008 - Mario Mazzocchi

Primary vs. secondary data
• Primary data: data collected directly for the purposes of the research, under the researcher's control
• Secondary data: use of data that are already available and useful to the research aims, although not collected explicitly for that purpose

Using secondary data
A checklist:
1. Is primary data collection too expensive or time consuming compared to secondary data?
2. Is there information about the quality of secondary data (survey method, questionnaire if relevant, etc.)?
3. Is the sample probabilistic and the sampling method known?
4. Does the target population of the secondary data sources coincide with the research target population?
5. Are secondary data outdated?
6. Is the source of secondary data independent and free from biases?
7. Does the purpose of the secondary data fit that of the research?
8. Is there any risk that some key variable is missing from secondary data?
9. Is it possible to make time comparisons with secondary data?
10. Are micro-data available?

Costs and timing issues
• Some secondary data are so expensive that primary data collection may be the cheaper option
• Even when the costs of primary data collection are slightly higher, an ad-hoc survey gives more control and is preferable
• Secondary data are also preferred when a very quick response is needed and there is no time for a proper data collection step
• In some circumstances, even when secondary data have quality limitations, research findings accompanied by a transparent discussion of data issues and an effort to quantify errors can be a satisfactory alternative

Information on data quality
• Secondary data should be accompanied by detailed information on the sampling plan, questionnaire, interview methods, non-response treatments, etc.
• A critical review of how the secondary data were constructed may lead to two outcomes:
(a) data are of acceptable quality, hence statistical processing will be meaningful;
(b) quality is inadequate because of serious shortcomings
• Even in case (b) the review proves very useful: the necessary primary research can now be set up exploiting the strengths and fixing the weaknesses of the secondary source.

Sampling technique
• As discussed in lecture five (Sampling), inference on sample data is only possible when the sample is extracted through probabilistic rules
• Non-probabilistic samples (such as quota sampling or haphazard samples) might be biased and misleading
• When using secondary data, a necessary condition is to be informed about the sampling method and the steps taken to ensure sample representativeness.
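
The contrast between probabilistic and haphazard samples can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than material from the lecture: the population is synthetic, the function names are hypothetical, and the standard error ignores the finite population correction.

```python
import random
import statistics

# Hypothetical population: weekly spend of 1,000 households, listed from the
# highest spender to the lowest (as if ordered by how easy they are to reach).
population = list(range(1000, 0, -1))
true_mean = statistics.mean(population)

def random_sample_estimate(units, n, seed=42):
    """Mean and standard error from a simple random sample without replacement."""
    rng = random.Random(seed)
    sample = rng.sample(units, n)
    mean = statistics.mean(sample)
    # Simplified standard error (no finite population correction).
    std_error = statistics.stdev(sample) / n ** 0.5
    return mean, std_error

def haphazard_sample_estimate(units, n):
    """Mean of the first n units reached; no probabilistic rule, so no valid standard error."""
    return statistics.mean(units[:n])

srs_mean, srs_se = random_sample_estimate(population, 50)
haphazard_mean = haphazard_sample_estimate(population, 50)

print(f"true mean        : {true_mean}")
print(f"random sample    : {srs_mean:.1f} (s.e. {srs_se:.1f})")
print(f"haphazard sample : {haphazard_mean}")
```

Only the random sample comes with a meaningful measure of precision; the haphazard estimate simply reflects whichever units happened to be reached first.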

Target populations
• Representativeness is the extent to which observations in a sample reflect the key (targeted) characteristics of the population
• When relying on secondary data, it is necessary to check that the sample is also representative of our target population
• Even with probabilistic methods, representativeness is a relative concept, usually based on one or a few variables
• Example of misuse of secondary data: survey data on purchases of a wide basket of goods, where one is interested in fizzy drinks consumption and these are part of the basket. Even with a geographically representative sample and adequate data quality, suppose the data were collected in July: one gets an upward-biased estimate of consumption, because the target population of the secondary data (consumers in July) does not adequately match our target population (fizzy drinks consumers throughout the year)

Time lags
• A key question: could the time lag between the current research and the data collection step pose problems?
• The relevance of the time lag depends on the research objectives.
  • If the aim is to explore whether there is room for a novel product, it may be very risky to rely on outdated information
  • If the research purpose is to explore a relationship between two or more variables which can be safely assumed to be stable in the medium term (like income elasticities), then a time lag of a couple of years may be acceptable

Source reliability
• It is not infrequent to find very different measurements of the same variable depending on the source, especially for sensitive data
• Example: electoral surveys
• Meta-analyses of secondary data allow evaluations of source reliability

Meta-analysis
• Meta-analysis is the statistical analysis of several research findings. Its aim is to combine the results of several studies to gain broader knowledge of a phenomenon
• Simple meta-analysis: averaging measures of a given variable across different studies.
• More complex meta-analysis: relate the target effect (the target/dependent variable) to the characteristics of the research environment of the selected studies (the explanatory variables). This allows one:
(a) to gather useful information on the direction of the relationship; and
(b) to identify potential biases in research designs and sources.
• The contribution of meta-analysis depends on the number of studies taken into account and rests on a rigorous and systematic selection and classification of the meta-data
• Drawback of meta-analysis: the sources of bias are difficult to control. If an indicator of the methodological quality of each study is available, then the effect of improper methods can be evaluated. Otherwise, improper studies may distort the final estimates of the meta-analysis.
• The method is increasingly used in marketing and consumer research.
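
As a rough illustration of the simple meta-analysis described above, the sketch below averages an effect estimate across studies. The study figures are invented for the example and the function names are hypothetical. The inverse-variance weighted average is a common refinement that is not taken from these slides; it is included only to show how more precise studies can be given more weight.

```python
from math import sqrt

def simple_average(estimates):
    """Unweighted mean of the study estimates (the 'simple' meta-analysis)."""
    return sum(estimates) / len(estimates)

def inverse_variance_average(estimates, std_errors):
    """Fixed-effect pooled estimate: each study is weighted by 1 / (standard error)^2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Hypothetical income elasticities reported by three studies, with standard errors.
estimates = [0.4, 0.6, 0.8]
std_errors = [0.1, 0.2, 0.2]

simple = simple_average(estimates)
pooled, pooled_se = inverse_variance_average(estimates, std_errors)

print(f"simple average            : {simple:.3f}")
print(f"inverse-variance weighted : {pooled:.3f} (s.e. {pooled_se:.3f})")
```

In this made-up case the weighted estimate is pulled towards the first study because its standard error is the smallest. Neither average corrects for biased sources, which is the drawback noted in the slide.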

Purpose of secondary data collection
• Before using secondary data, one should consider the purpose of their collection
• Example:
  • Household food expenditure data, collected by an official source as part of a budget survey within the national accounting system
  • The current research: evaluate food consumption and nutrient intakes
  • Problem: expenditures depend on price levels, and the sample could be representative in terms of the range of price levels faced by the population, while food consumption may depend on a number of lifestyle variables not necessarily accounted for in designing the expenditure survey.
• While the above source of bias is likely to influence the results and needs to be carefully addressed, it is not necessarily a reason for abandoning the analysis of secondary data

Completeness of the secondary data
• Are all the necessary variables considered in the target data set?
• Example: food consumption data can be estimated through:
(a) respondent statements on consumption (e.g. food intake survey)
(b) purchase data divided by price data (e.g. household budget survey)
(c) disappearance data (e.g. production - export + import)
(d) supermarket scan data
• Scan data exclude shoppers who use local stores rather than supermarkets
• Purchase and disappearance data also include quantities that are wasted or otherwise not consumed within the household

Availability of data observed at several points in time
• When secondary data are collected at regular intervals, sometimes even on the same individuals, they allow for the evaluation of dynamic behaviours
• Panel data consist of variables observed on the same units at several points in time
• Repeated surveys are extremely expensive and consistency across surveys is not easy to guarantee; in this case secondary sources have a key advantage over running an expensive primary panel survey.

Household budget surveys (3)
Drawbacks
(a) some respondent biases (especially in diary keeping)
(b) (frequent) lack of information on prices and quantities
(c) delay between data collection and data publication (at least one or two years)
(d) usually lack information on attitudes or point of purchase

Micro-data
• Secondary data can be provided at different levels of aggregation
• Micro-data, i.e. individual information on each of the sampled units, are increasingly available
• Availability of individual (micro) data helps to correct sampling biases or target population mismatches, and allows one to take into account as many control factors as there are variables available for each statistical unit
• The availability of micro-data is a big incentive to use secondary data

A rough guide to secondary data sources for consumer research
1. Household surveys
2. Lifestyle and social trend surveys
3. Consumer panels
4. Attitude surveys
5. Retail level surveys

Household budget surveys (1)
• Main purpose: to gather data on household expenditures
• Nationally representative samples; sampling designs usually allow for representativeness of population sub-groups (e.g. regions, age groups, etc.)
• Data collection based on weekly or bi-weekly diaries where expenditures are recorded (in some cases receipts are collected) and a final questionnaire (with socio-demographics)

Household budget surveys (2)
Advantages
(a) based on large and representative samples
(b) allow exploration of the relation between expenditures and a set of household information
(c) carried out at regular intervals
(d) generally accessible at a very low cost, often for free
(e) individual household data (treated to guarantee anonymity) are usually available, allowing for a high degree of personalization in cross-tabulation and processing

Household surveys in the UK
• UK Expenditure and Food Survey
  • Merger of the National Food Survey and the Household Expenditure Survey
  • About 7,000 British households
  • Face-to-face interviews and a bi-weekly diary of daily expenditures for household members aged over sixteen
  • Harmonization with EU surveys through a common classification of individual consumption by purpose (COICOP).
  • Two questionnaires: information on one-time purchases and information about the income of each adult member of the household
• Other household surveys:
  • Family Resource Survey (tenure and housing costs)
  • National Diet and Nutrition Survey (every four to five years, to monitor food consumption and dietary habits through a seven-day dietary record).

The US consumer survey
• United States Consumer Expenditure Survey (CES)
  • Managed by the Bureau of Labor Statistics
  • Two separate and independent samples:
    • Computer-assisted interviews
    • Diary filled in by the household members
  • Household income and socio-economic characteristics are common to both
  • Data are publicly available and date back to 1980
  • About 7,500 households enter each of the surveys
  • Interview survey: monthly out-of-pocket expenditures such as housing, apparel, transportation, health care, insurance, and entertainment
  • Diary survey: weekly expenditure on frequently purchased items such as food and beverages, tobacco, personal care products, and non-prescription drugs and supplies.

Household budget surveys in Europe
• Each EU member state conducts its own household budget survey
• Since 1988 the European Statistical Institute (Eurostat) has pursued a harmonization effort
• An EU household budget survey is produced every five or six years through the ex-post harmonization of national micro-data
• Aggregate data are available on the Eurostat New-Cronos Data Bank, with average expenditure by country and COICOP category.

Household Budget Surveys in the European Union (EU-15)
Belgium: Enquête sur les Budgets des Ménages
Denmark: Forbrugerundersøgelsen
Germany: Einkommens- und Verbrauchsstichprobe
Greece: Family Budget Survey
Spain: Encuesta Continua de Presupuestos Familiares; Encuesta Basica de Presupuestos Familiares
France: Enquête Budgets des Familles
Ireland: Household Budget Survey
Italy: Rilevazione sui consumi delle famiglie italiane
Luxembourg: Enquête Budgets Familiaux
Netherlands: Budgetonderzoek
Austria: Konsumerhebung
Portugal: Inquérito aos orçamentos familiares
Finland: Kulutustutkimus
Sweden: Hushållens utgifter
UK: Expenditure and Food Survey

The COICOP classification
Example:
3 CLOTHING AND FOOTWEAR
  3.1 Clothing
    3.1.1 Garments
      3.1.1.1 Men's outerwear
      3.1.1.2 Women's outerwear
      3.1.1.3 Children's outerwear
      3.1.1.4 Other garments
    3.1.2/3 Other articles of clothing and clothing accessories and cleaning, repair and hire of clothing
  3.2 Footwear
    3.2.1/2 Shoes and other footwear and repair and hire of footwear

Lifestyle and social trend surveys
• Main purpose: to record a wide range of variables focusing on social dynamics, habits and lifestyle trends
• These surveys are multi-purpose: some sections remain unchanged over time to allow for comparisons, other sections change cyclically so that time comparisons are still possible although not on a yearly basis, and some sections change every year to look into specific issues of current relevance

Multi-purpose surveys in the UK
• General household survey
  • household as the sample unit, but information collected on individuals
  • approximately 9,000 households, about 16,000 adults aged sixteen and over
  • data on education, employment, health, housing, and population and family information
  • other areas covered periodically (e.g. leisure, household burglary, smoking and drinking)
  • face-to-face computer-assisted interviews and some self-completion sections.
• Office for National Statistics (ONS) omnibus survey
  • about 1,800 adults are interviewed by computer-aided personal interviewing
  • large range of topics, such as contraception, unused medicines, tobacco consumption, changes to family income, internet access, arts participation, transport, fire safety and time use
• Other surveys
  • Smoking, drinking and drug use among young people survey
  • Time use survey on lifestyles
    • About 6,400 respondents
    • Household questionnaire, individual questionnaire and self-completion 24-hour diaries (10-minute slots)
    • Topics covered include employment, qualifications, leisure time activities and demographic details
  • National Surveys of Sexual Attitudes and Lifestyles (sexual habits of British people)

Consumer panels
• Objective: monitor consumption dynamics or response to changing factors (like advertising)
• Repeated surveys where the same individuals are interviewed over time
• Panel surveys have higher response rates, but are very expensive, so access to secondary data is generally preferred

British household panel survey
• ONS British household panel survey
  • monitors social and economic change at the individual and household level in Britain
  • Data: a range of socio-economic variables (housing, neighbourhood, demographics, residential mobility, health and caring, employment and earnings, lifetime childbirth, marital and relationship history, employment status history, values and opinions, household finances and organization)
  • Annual survey of adults: 5,000 households (about 10,000 individuals).

European panel survey
• Launched in 2005 by Eurostat
• EU-wide household panel survey, named European Union Statistics on Income and Living Conditions (EU-SILC)
• Every year the survey collects micro-data on income, poverty, social exclusion, labour, education, health information and living conditions in all twenty-five member states plus Norway and Iceland
• Will include Turkey, Romania, Bulgaria and Switzerland in coming years.
• Cross-sectional data on income, poverty, social exclusion and other living conditions
• Longitudinal data to detect individual-level changes, referring to a four-year period
• EU-SILC is the evolution of the previous panel survey, ECHP (European Community Household Panel), run between 1994 and 2001

Attitudes and values
• Regular and official surveys do not usually provide sufficient detail.
• British social attitudes survey
  • series of annual surveys to monitor dynamics in British social, economic, political and moral values
  • around 3,600 individuals respond each year
  • some questions are the same throughout the years, others are asked less frequently
  • large range of issues: newspaper readership, political parties and trust, public spending, welfare benefits, health care, childcare, poverty, the labour market and the workplace, education, charitable giving, the countryside, transport and the environment, Europe, economic prospects, race, religion, civil liberties, immigration, sentencing and prisons, fear of crime and the portrayal of sex and violence in the media.

Retail surveys
• Data collection at the point of purchase
• Usually scan tracking at the retail level
• Retail Sales Inquiry
  • collects monthly sales data from 5,000 retailers
  • all retailers with more than 100 employees plus a stratified sample of smaller retailers.
  • records total retail turnover (including sales from stores, e-commerce, Internet sales, etc.), which is the value of sales of goods to the general public for personal or household use
  • sales volume is also recorded following the COICOP classification
• Retail tracking services based on scan data are also provided by GfK, ACNielsen and IRI

Consumer trends
• UK Statistical Office (HMSO) time series data, which include:
(a) quarterly data on household final consumption expenditure (from National Accounts) according to the COICOP international classification
(b) monthly consumer and retail price indices
(c) monthly data on consumer credit and other household borrowings
(d) monthly retail sales indices for different types of retailers and goods
• Consumer trends:
(a) details on Household Final Consumption Expenditure (HHFCE) at aggregate level for the whole UK on a quarterly basis
(b) quarterly time series from 1963 to the latest information for each COICOP item
(c) national expenditure (by UK citizens) and domestic expenditure (on UK territory)
(d) raw (not seasonally adjusted) and seasonally adjusted data (where seasonal factors have been removed through statistical methods)
(e) current prices and constant prices, where the latter implies that expenditures have been deflated so that the whole time series refers to the prices of a given base year

Commercial data
• ACNielsen consumer panel (from 1989)
  • home scanning technology (HomeScan™), with scans of purchased products taken at home
  • all information on the barcode is memorized
  • integration of scan data and traditional diaries
  • barcode scanners (MyScan™) for out-of-home purchases
  • nationally representative data which can be disaggregated by outlet type and by demographic group
  • about 10,000 UK households, plus the same methodology in twenty-six countries (210,000 households)
• GfK retail panel in the UK
  • more than 200 products monitored on retail outlets
  • plus quarterly consumer panel of 18,000 households (HomeAudit)
• Information Resources (IRI) InfoScan (from 1987)
  • scanner-based supermarket tracking system with information on sales, market shares, distribution, pricing and promotion, across retail channels
  • data provided on a weekly basis, in the week following collection
• Taylor-Nelson Sofres (TNS) continuous household panel
  • diaries and home scanners for fast moving consumer goods (FMCG)
  • plus a separate panel targeted at individuals rather than households, for goods like toiletries, textiles, impulse products, tobacco, telecoms, fast food, petrol
  • TNS also separately records media audience, providing data for assessing advertising campaigns
• Ipsos Omnibus Surveys
  • regular surveys sponsored by multiple clients on the same sample
  • one section is stable over time; other sections are based on the information required by the clients
  • in the UK, Ipsos regularly runs a survey of 2,000 households (Capibus™) with CAPI interviews
  • a CATI alternative (Express™) and a web-based version (i:omnibus™) are also provided