Primary data collection
Chapter 3
Statistics for Marketing & Consumer Research
Copyright © 2008 - Mario Mazzocchi

Some alternatives to the use of secondary data
• Primary data collection
• Experimental methods (experiments under controlled conditions)
• Ethnographic methods (targeted at the study of cultures; immersion in a cultural community to record behaviors)
• Test marketing (actually launching the marketing activity on a small scale)
• Qualitative research methods

Qualitative research methods
• Exploratory research: gathering insights and understanding of the research problem
• Loose definition of information needs
• Very flexible and almost unstructured research processes
• Samples are not necessarily representative and are often very small
• Useful to explore emotional and affective relationships (e.g. whether a TV advert is funny or not), or for sensitive topics on which people are likely to be unwilling or unable to answer

Running qualitative research
• Objectives of qualitative research:
  (1) obtain an adequate definition of the research problem
  (2) develop specific hypotheses to be tested through quantitative research
  (3) identify key variables which will require a specific quantitative analysis
  (4) set the priorities for further research
• Direct methods (straight and undisguised collection): focus groups, in-depth interviews and panels such as the Delphi method and nominal group techniques
• Indirect methods (disguised objectives, indirect collection): psychology-based projective techniques like association, completion, construction and expression

Errors in primary surveys
Source of error | Description
A SAMPLING ERROR | Error associated purely with the fact that we observe a sample rather than the whole population; with probabilistic samples it can be estimated
B NON-SAMPLING ERROR (B1+B2+B3+B4+B5) | This error includes all other sources of error that do not depend on the sampling process. Non-sampling errors can be random or non-random (biases), where the latter are more likely to affect the estimation results
B1 Sampling frame errors | Some of the population items are not represented in the sampling frame
B2 Non-response errors (B21+B22) | Some of the sampled units do not participate in the survey
B21 Not-at-home | The sampled unit could not be contacted
B22 Refusals | The sampled unit refused to cooperate
B3 Researcher errors | All those errors attributable to problems in the research design, such as errors in defining the population, inappropriate administration methods, inconsistencies between the research objectives and the questionnaire, errors in data processing, etc.
B4 Interviewer errors | Errors due to inappropriate actions of the interviewer. These include inappropriate selection of the respondents, errors in asking questions, errors in recording the responses or even fabricating them.
B5 Respondent errors | While participating in the survey, the respondent provides (willingly or unwillingly) incorrect answers or does not answer some of the questions
TOTAL SURVEY ERROR (A+B) | Overall error

Sources of error
• Sampling error: probabilistic sampling makes it possible to measure and control this error component
• In many situations the sampling error is low compared to non-sampling and potentially systematic errors (especially when the proportion of non-respondents is high)
• Systematic errors: cannot be quantified prior to the survey and are difficult to detect even after the field work
• Prevention of systematic errors is vital

Primary research process
1. Clearly formulate the research objectives
2. Set the survey research design
3. Design data-collection method and forms
4. Design sample and collect data

Formulating the research objectives
• A clear definition of the research problem is needed through:
  • Discussion with the final user
  • Interviews with experts on the topic
  • Analysis of secondary data
  • Qualitative research
• Then the precise research questions can be derived from the research problem
• Break down the research problem into components
• At this stage the theoretical and statistical framework needs to be chosen
• Questions can be expressed through hypotheses to be tested

The survey research design in seven steps
1) Identification of the reference population and sampling frame
2) Choice of sampling criteria
3) Definition of the estimation methodology for making inference on the surveyed parameters
4) Choice of sample size
5) Choice of the data-collection method
6) Questionnaire design
7) Cost evaluation
Not necessarily in this order: the decisions are interlinked

Reference population and sampling frame
• Sampling frame: the complete list of all elements in a population which can be used to extract a sample
• The survey gathers information about the population
• Definition of the reference population:
  • identification of the basic unit to be surveyed (the individual consumer, the household, geographic areas, etc.), which will provide the basis for selecting the sampling units
• The population and sampling units do not necessarily coincide with the basic elements of the population (e.g.
household as sampling/population units, then measurements on single individuals)
• The reference population needs to be reflected by an appropriate sampling frame; thus it might not be the ideal one, but simply the feasible one

Sampling criteria
• Choice of the type of sampling (see lecture 5)
  • probabilistic versus non-probabilistic
  • stratified versus simple random sampling
  • deep implications in terms of costs and precision levels
• The choice is constrained by other decisions
  • The sampling frame
  • Variables available in the sampling frame (stratification)
  • Interview method (e.g. telephone, mail, mall-intercept)

Estimation methodology (inference)
Inference: the generalization process which allows characteristics observed in a sample to be projected to the whole population of interest
• Sample estimators depend on the sampling criteria
• The methodological choice (average, proportions, models, classification, etc.) is relevant to:
  • Questionnaire design
  • Sample size

Sample size
• Mathematical rules (see lecture 5): size is a function of the precision levels and sampling design
• Other issues:
  • Non-response rates (which depend on the administration method) need to be taken into account
  • When non-response rates are high and non-responses are not random, sampling error is negligible compared to non-sampling errors
  • If information on sub-groups of the target population is relevant, representativeness requires an increase of the overall sample size
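
As a rough illustration of the sample size slide, the sketch below assumes simple random sampling and a proportion as the target parameter, and uses the textbook formula n = z² p(1 − p) / e². The slides defer the actual rules to lecture 5, so this formula and the function names `required_completes` and `units_to_contact` are assumptions made for illustration, not the course's stated method. The second function inflates the number of units to contact by the expected response rate.

```python
import math
from statistics import NormalDist


def required_completes(margin, confidence=0.95, p=0.5):
    """Completed interviews needed to estimate a proportion.

    Assumes simple random sampling from a large population.
    margin is the half-width of the confidence interval (0.05 = 5 points);
    p=0.5 is the most conservative guess for the unknown proportion.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def units_to_contact(completes, response_rate):
    """Sampled units to contact so that, on average, `completes` respond."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(completes / response_rate)


completes = required_completes(margin=0.05)  # 95% confidence, +/- 5 points
print(completes)
print(units_to_contact(completes, response_rate=0.5))
```

Inflating the contact list only protects the number of completed interviews. It does nothing about non-random non-response, which the slide flags as the larger problem when response rates are low.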

Data collection method
• Choice of the administration method (face-to-face interviews, telephone interviews, electronic surveys, postal surveys) is related to:
  • sampling method
  • sample size
  • sampling frame
  • questionnaire design
• This is one of the first decisions to be taken, usually based on:
  • Number of questions and duration of the interview
  • Type of question (sensitive or not)

Questionnaire
• Key factors for developing the questionnaire
  • Research objectives
  • Administration methods
  • Characteristics of the target population
  • Methodologies chosen for statistical processing
• Qualitative methods and pre-testing are key to quality improvement

Cost-effectiveness
• The ideal research design might be the most expensive one
• Compromise is often necessary
• Strategy:
  • Cost the ideal research design
  • Prioritize issues
  • Identify cost reductions
• The seven steps of the research design are not sequential, but are considered and adjusted simultaneously
• Strategy:
  • Choose the ideal administration method
  • Define the questionnaire length and sample size according to the administration method and budget constraint
  • If the solution is not acceptable, change the administration method

Final steps
• After determining all steps of the research design, the questionnaire is drafted considering:
  • Consistency with research objectives
  • Internal coherence
  • Potential sources of bias
• The draft questionnaire is pre-tested through a pilot study (this may include pilot statistical analysis)
• The draft questionnaire is adjusted and finalized (e.g. to meet duration constraints)
• The sample is extracted
• Field work!
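
The cost-effectiveness strategy above (cost a design, then adjust sample size or administration method to the budget) can be sketched numerically. The figures below are copied from the "Indicative costs" table later in this deck (British pounds, 2005 rates). The cost model, total = fixed + respondents × cost per questionnaire / response rate, is an assumption that happens to reproduce that table's totals. The function names and the 20,000 budget are hypothetical.

```python
# (fixed cost, variable cost per questionnaire, assumed response rate)
# Values taken from the deck's "Indicative costs" table; purely indicative.
METHODS = {
    "Mail survey": (2_000, 3, 0.20),
    "Personal": (4_000, 50, 0.80),
    "CAPI": (15_000, 60, 0.70),
    "Telephone": (4_000, 20, 0.50),
    "CATI": (15_000, 50, 0.60),
}


def total_cost(method, respondents):
    """Fixed cost plus the cost of all questionnaires needed to obtain
    `respondents` completed interviews at the assumed response rate."""
    fixed, per_questionnaire, response_rate = METHODS[method]
    return fixed + respondents * per_questionnaire / response_rate


def affordable_respondents(method, budget):
    """Largest number of completed interviews that fits within `budget`."""
    fixed, per_questionnaire, response_rate = METHODS[method]
    if budget <= fixed:
        return 0
    return int((budget - fixed) * response_rate / per_questionnaire)


BUDGET = 20_000  # hypothetical budget constraint
for name in METHODS:
    print(f"{name:12s} cost of 200 respondents: {total_cost(name, 200):>9,.0f}"
          f"   respondents within budget: {affordable_respondents(name, BUDGET)}")
```

If the affordable number of respondents under the preferred method falls short of the required sample size, the slide's last step applies: switch to another administration method and repeat the calculation.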

Administration methods
• Telephone interviews
• Personal interviews
• Mail surveys
• Electronic interviews

Telephone interviews
• Traditional interviewing (a phone, a pencil and a questionnaire)
• Computer Assisted Telephone Interviewing (CATI): computerised questionnaire administered to respondents over the phone
  • Software checks for consistency and completeness
  • Reduces the interviewers’ errors
  • May control sampling procedures (e.g. random dialling)
  • Measures quality parameters (e.g. duration)
  • Data are ready to use
  • Cannot reach those without a telephone
  • It is expensive (at least 700-1000 interviews to justify costs)
  • Not very suitable for open questions
  • Interviews should be short (twelve to fifteen minutes)
  • Use of stimuli is not possible

Personal interviews
• In-home (interviewer visits respondent at home)
  • Personal contact with interviewer
  • Highly expensive
  • Skilled interviewers increase quality
  • Interviewer influence/bias
  • Longer duration
  • Wariness of respondents
  • High response rate
• Mall-intercept (respondent is stopped outside shops or in the street)
  • Cheaper
  • Difficulties in obtaining sensitive information (no anonymity)
  • Easy use of stimuli
  • Can be run without sampling frame
  • High social desirability
• Computer-Assisted (CAPI) with interviewer
  • Increased involvement of respondent
  • On-screen and off-screen stimuli
  • Interviews may last even longer
  • Limited sampling control
  • Slower (but time perception varies)

Mail surveys
• Mail interviews (fax for businesses)
  • Cheap
  • Optimal for sensitive questions/anonymity/social desirability
  • No interviewer bias
  • Very low response rate
  • Selection bias / low sample control
  • Very slow
•
Mail panels
  • Allow for longitudinal (time comparison) design
  • Higher response rate
  • Higher sample control
  • More expensive
  • Low control of data collection environment

Electronic interviews
• E-mail (ASCII/text message)
  • Very cheap
  • Quick
  • No interviewer bias
  • Selection bias
  • Requires data entry before analysis
  • Low quality of data
  • Low sample control
  • Low response rate (and decreasing)
• Web-based (HTML/Java)
  • Allow for (some types of) stimuli
  • Logic/consistency checks (CAWI)
  • Higher sample control
  • Anonymity/sensitive questions
  • (Very) low sample control
  • Selection bias
  • Problems in compiling lists
  • Even lower response rate

The questionnaire
• Key step in ensuring consistency between the actual measurement and the targeted measurement
• Questionnaires are a likely source of non-sampling error
  • Potential discordance between the information provided by the respondent and the interpretation by the researcher
  • Bad questionnaires increase non-response errors
  • Ill-posed questions raise response errors (e.g. inaccurate answers)

Steps for a good questionnaire
• Eight steps towards a good questionnaire:
  1. Specify the information to be collected
  2. Define the information collected by each individual question
  3. Choose structure and measurement scale
  4. Determine the wording of each question
  5. Sort the questions (and possibly divide them into sections)
  6. Code the questions and simulate statistical processing
  7. Write an appropriate introduction / presentation and define the layout
  8. Pilot the questionnaire and revise where necessary

Specify the information to be collected
• Exploit preliminary qualitative research (e.g.
focus groups)
• Ensure consistency between the research objective and the research questions
• The use of a theoretical framework needs to be considered at this stage
• Take into account the chosen statistical methodology

Example
(1) Research objective: understanding the radio listening habits of a population
(2) Research questions: (a) when do people listen to radio; (b) where do people listen to radio; (c) what radio stations do they listen to
(3) Disaggregation of research questions: e.g. (a) could become (a1) time of the day when they listen to radio; (a2) days of the week when they listen to radio; (a3) activities they do while listening to radio; (a4) seasonal differences in radio listening habits

Information collected by each individual question
• Is the question necessary?
  • Unnecessary questions should be eliminated, unless they serve other purposes (e.g. disguise the purpose of sponsorship, etc.)
• Is a single question sufficient?
  • When do you listen to the radio? What does it mean? Two potential interpretations: on which days, or at what time of the day? Better:
    • How often do you listen to the radio? (number of days per week)
    • At what time of the day do you typically listen to the radio?
  • Why do you eat Sainsbury pizza?
• Define variables / prepare a draft spreadsheet

Choose structure and measurement scale
• Unstructured question (open-ended, free response)
  • Good as first questions on a topic
  • Less biasing influence (but interviewer bias)
  • Coding of responses is costly and time-consuming
• Structured questions
  • Multiple Choice (A, B or C?)
– order bias
  • Dichotomous (Yes or No or Don't know) – question wording bias
  • Scales (from one to ten)
• Choice of measurement scale (see lecture 1)
• Sensitivity issues may be dealt with by indirect elicitation techniques

Overcoming problems in answering
• Factors that might lead to unanswered questions or inaccurate answers:
  • Lack of information
  • Lack of memory
  • Incapacity to articulate certain responses
  • Unwillingness to answer (sensitive information, too much effort, the question/context is perceived as inappropriate)

Techniques to get sensitive questions answered
• Hide the question among a group of innocent questions
• State that the behavior of interest is common, or stress the usefulness of an answer
• Use the third-person technique
• Provide categories instead of asking for figures
• Use randomised techniques (but you lose any linkage with other questions)

Randomised techniques
Please flip a coin. If you get a head, please answer question A; if you get a tail, please answer question B.
A. Are you enjoying this lecture?
B. Are you a female?
YES   NO

Interpretation of randomised questions
• We got the following results for the question: YES: 40%, NO: 60%
• We know that 60% of our respondents are female and 40% are male
• We know that the probability of getting a head or a tail is 50%
$\Pr(\text{Yes}) = 50\% \times \Pr(A = \text{yes}) + 50\% \times \Pr(B = \text{yes})$

Result
$\Pr(\text{Yes}) = 0.5 \times \Pr(A = \text{yes}) + 0.5 \times \Pr(B = \text{yes})$
$40\% = 0.5 \times \Pr(A = \text{yes}) + 0.5 \times 60\%$
$\Pr(A = \text{yes}) = \dfrac{40\% - 0.5 \times 60\%}{0.5} = 20\%$

Wording
• Avoid long and elaborate questions
• Use wording compatible with the measurement scale
• Use ordinary words
• Avoid ambiguous words (e.g. "generally", "frequent", etc.)
• Avoid phrasing which suggests the answer (Do you think people should listen more to the radio and watch less television?)
• Avoid questions which need a particular effort for memory, computing, etc. (How many hours per year do you listen to the radio?; What is the frequency of your favourite radio station?)
• Avoid questions that are too generic (Why do you like radio programs?)
• Use positive and negative statements (advisable to use dual statements for different respondents; e.g. Is this cheese soft? Is this cheese hard?)

Sorting questions
• Use good opening (ice-breaking) questions (What is your current favourite radio hit?)
• Place difficult and sensitive questions towards the end (What is your salary?)
• Ask basic information first; target variables (Do you own a radio?)
• Ask classification and identification questions at the end (age, gender, etc.)
• General questions should precede specific questions
• Follow a logical order (a flow chart may help)

Code the question and test statistical processing
• Code the questions in a way that is functional for an electronic data sheet (or statistical package)
• A good strategy is to simulate data (or use pilot data) to test the statistical techniques
• Try to anticipate potential problems in terms of lack of variability (e.g. all respondents giving the same answers, number of items in a measurement scale, etc.)

Introduction, presentation and layout
• Decide whether it is good to mention who promotes the research (trust versus perception of vested interests)
• Consider pros and cons of providing incentives to participate (higher response rates vs. selection bias)
• Professional appearance in self-administered questionnaires
• Avoid splitting questions across pages
• Consider positioning key questions close to the top of the page

Pilot and pre-testing
• Pilot study: preliminary test of the questionnaire on a small number of respondents to check for all previous issues and potential non-sampling error
• Control of quality parameters (e.g.
length and timing of the questionnaire)
• Better by personal interview (regardless of the actual survey method; a second pre-testing may be carried out for some specific administration methods)
• Use a variety of interviewers for personal interviews (to detect potential interviewer bias)
• Respondent is asked to think aloud
• Debriefing (go through the questionnaire with the respondent after he or she has finished completing it)
• Check consistency with the research objectives

Indicative costs
Indicative UK costs and response rates for different types of interviews
Survey type | Fixed costs | Variable cost per questionnaire | Variable cost per respondent (assumed response rate, %) | Total costs, 200 respondents | Total costs, 500 respondents | Total costs, 1,000 respondents
Mail survey | 2,000 | 3 | 15 (20) | 5,000 | 9,500 | 17,000
Personal | 4,000 | 50 | 63 (80) | 16,500 | 35,250 | 66,500
CAPI | 15,000 | 60 | 86 (70) | 32,143 | 57,857 | 100,714
Telephone | 4,000 | 20 | 40 (50) | 12,000 | 24,000 | 44,000
CATI | 15,000 | 50 | 83 (60) | 31,667 | 56,667 | 98,333
Note: costs are in British pounds and are purely indicative, based on 2005 rates of private marketing research agencies. Fixed costs basically include sampling frame, hiring of equipment and training of interviewers

Four types of primary surveys
1. Socio-demographics
2. Behaviors and economics
3. Psychographics, lifestyle & attitudes
4. Response to marketing actions

Socio-demographic surveys
• Most surveys contain a socio-demographic section to test relationships between behaviors, attitudes or needs and belonging to a certain segment of the population
• Socio-demographic information is often used as a benchmark to test the representativeness of a sample
• Example: the UK Annual Population Survey
• Socio-demographic surveys are especially useful to: monitor demographic trends, gather information on housing, update information on ageing, monitor employment, collect information on transport usage, monitor migration and collect information on minority groups

Behavioral and economic surveys
• Monitor consumer purchasing decisions and relate behaviors to their determinants
• Typical expenditure survey: looks into recorded expenditures for different products or services, by brand or category
• Other purchasing decision information may refer to:
  • frequency of purchase
  • point of purchase
  • brand switching
• Other behavioral surveys:
  • Consumption and use
  • Disposal

Attitudinal, psychographics and lifestyle surveys
• Drivers of behavior other than economic ones are related to:
  • Lifestyles
  • Social pressure
  • Individual attitudes
  • Habits
• Integration of psychology-based questionnaire sections into market research surveys
• Example: Ajzen’s theory of planned behavior explains (intention to) behavior as a function of:
  • Attitudes towards behavior (e.g. attitude towards drinking coffee)
  • Social norms (e.g. what other people think about drinking coffee)
  • Perceived behavioral control (e.g.
factors which help or constrain the choice of drinking coffee)

Response to marketing actions
• Relation between marketing actions and consumer response
• Example: advertising has different objectives
  • Demand increase
  • Loyalty increase
  • Product positioning
  • Defensive strategies against other companies' actions, etc.

Responses to the marketing mix
Marketing mix | Example action | Examples of response variables
Price | Cut prices | Sales, Profits, New customers gained, Market shares, Perceived quality, Stock reduction
Product | Launch a new product | Consumer acceptance, Brand image effect, Target customer profile, Price positioning
Promotion | Advertising | Brand/product awareness, Brand loyalty, Brand image, Sales, Market shares, New customers gained, Customer retention, Product positioning, Willingness to pay
Place | Launch e-commerce | New customers gained, Sales, Profits, Brand image, Market shares
Participants | Improve customer contact management | Customer satisfaction, Brand loyalty, Customer retention, Perceived quality
Process | Introduce a customer complaints procedure | Customer satisfaction, Customer retention, Brand loyalty
Physical evidence | Change selling environment | Consumer acceptance, Sales, New customers gained, Perceived quality, Willingness to pay, Time spent in the shop, Customer satisfaction, Brand loyalty
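
Returning to the randomised technique from the "Interpretation of randomised questions" and "Result" slides, the calculation there can be written as a small function. This is a minimal sketch: the function name `estimate_sensitive_yes` and the `p_sensitive` parameter (the probability that the coin sends a respondent to question A, 0.5 for a fair coin) are introduced here for illustration and are not part of the slides.

```python
def estimate_sensitive_yes(observed_yes, innocuous_yes, p_sensitive=0.5):
    """Estimate the share answering "yes" to the sensitive question A.

    observed_yes  : overall share of "yes" answers (A and B pooled)
    innocuous_yes : known share of "yes" for the innocuous question B
    p_sensitive   : probability that a respondent is assigned question A

    Solves  observed_yes = p * Pr(A yes) + (1 - p) * Pr(B yes)  for Pr(A yes).
    """
    if not 0 < p_sensitive <= 1:
        raise ValueError("p_sensitive must be in (0, 1]")
    return (observed_yes - (1 - p_sensitive) * innocuous_yes) / p_sensitive


# Figures from the slides: 40% said yes overall, 60% of respondents are female.
share = estimate_sensitive_yes(observed_yes=0.40, innocuous_yes=0.60)
print(f"Estimated Pr(A = yes): {share:.0%}")
```

With small samples the estimate can fall outside the 0 to 1 range, because the observed shares are themselves subject to sampling error. As the slides note, the answer also cannot be linked back to any individual respondent.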