BUSINESS RESEARCH
DR HARITIKA CHHATWAL

The Nature of Business Research
• Application of the scientific method in searching for the truth about business phenomena.
• Includes defining business opportunities and problems, generating and evaluating alternative courses of action, and monitoring employee and organizational performance.
• Facilitates the managerial decision-making process.
• An essential tool for management in problem-solving and decision-making.
• Decreases the risk of making a wrong decision in each area.

Basic vs. Applied Research
Basic Business Research
• Research conducted without a specific decision in mind that usually does not address the needs of a specific organization.
• It attempts to expand the limits of knowledge in general and is not aimed at solving a particular pragmatic problem.
• Examples: 1. Understanding the consumer buying process. 2. Examining the consumer learning process.
Applied Business Research
• Research conducted to address a specific business decision for a specific firm or organization.
• Examples: evaluating the impact of a training programme on employee performance; examining consumer response to direct marketing programmes.

OBJECTIVES OF RESEARCH
• 1. To gain familiarity with a phenomenon or to achieve new insights into it.
• 2. To portray accurately the characteristics of a particular individual, situation or group.
• 3. To determine the frequency with which something occurs or with which it is associated with something else.
• 4. To test a hypothesis of a causal relationship between variables.

TYPES OF RESEARCH
Descriptive vs. Analytical
• Descriptive research includes surveys and fact-finding enquiries of different kinds.
• Purpose: description of the state of affairs as it exists at present.
• The researcher has no control over the variables and measures items such as frequency of shopping, preferences of people, or similar data.
• The methods of research: survey, comparative and correlational methods.
• In analytical research, on the other hand, the researcher has to use facts or information already available, and analyse these to make a critical evaluation of the material.

Applied vs. Fundamental
• Applied research aims at finding a solution for an immediate problem facing a society or an industrial/business organisation.
• Fundamental research is mainly concerned with generalisations and with the formulation of a theory.
• Research concerning some natural phenomenon or relating to pure mathematics is an example of fundamental research.
• Research studies concerning human behaviour, carried on with a view to making generalisations about human behaviour, are also examples of fundamental research, but research aimed at certain conclusions (say, a solution) facing a concrete social or business problem is an example of applied research.

Quantitative vs. Qualitative
• Quantitative research is based on the measurement of quantity or amount. It is applicable to phenomena that can be expressed in terms of quantity.
• Typical methods: surveys, questionnaires, and telephone or other interviews.
• Example: a survey conducted to understand how long a doctor takes to tend to a patient after the patient walks into the hospital.
• Qualitative research, on the other hand, is concerned with qualitative phenomena, i.e., phenomena relating to or involving quality or kind.
• Example: motivation research and attitude or opinion research, using in-depth interviews, word association tests, sentence completion tests, story completion tests and similar other projective techniques.

Conceptual vs. Empirical
• Conceptual research is that related to some abstract idea(s) or theory. It is generally used by philosophers and thinkers to develop new concepts or to reinterpret existing ones.
• Empirical research is data-based research, coming up with conclusions which are capable of being verified by observation or experiment.
• It is also called the experimental type of research.
• In such research it is necessary to get at facts firsthand, at their source, and actively to go about doing certain things to stimulate the production of desired information.
• In such research, a working hypothesis is required.
• Such research is thus characterised by the experimenter’s control over the variables under study and his deliberate manipulation of one of them to study its effects.
• Evidence gathered through experiments or empirical studies is today considered to be the most powerful support possible for a given hypothesis.

Significance of Research
The study of research methods provides the manager with the knowledge and skills needed to solve the problems and meet the challenges of a fast-paced decision-making environment. The factors that stimulate an interest in a scientific approach to decision-making are:
• increased need for more and better information
• availability of improved techniques/tools for collecting and analysing data
• overload of information

The increasingly complex nature of business, consequent to eventful changes in technology, competition, consumer behaviour, etc., is compelling businesses to pay more attention to research.

As an aid to economic policy, research has gained added importance, both for government and business organizations. Systematic research provides the basis for nearly all business policies of the government, and the government’s analysis of the needs and desires of the people, and of the revenue available to meet expenditure, rests on research.

Operational, marketing and motivational research are crucial for taking strategic business decisions.

Business and society are mutually dependent. Successful business relies on tracking the significant changes in society. Research is a dependable tool to feel the pulse of society.

Business research is a systematic inquiry that provides information to guide business decisions.
The information needed and the interpretation essential in the different functional areas of research are overwhelming, and are as follows:
• MARKETING: Demand forecasting, consumer buying behaviour, measuring effectiveness of advertisement, media selection, test marketing, product positioning, new product potential, etc.
• PRODUCTION: What to produce, how much to produce, when to produce, for whom to produce, how to improve quality control or reduce inventory cost, etc.
• MATERIALS: Where to buy, how much to buy, when to buy and at what price to buy.
• FINANCE: How to manage the working capital, how to juggle the debt–equity ratio or how to improve the accounting procedure.
• HRD: Human resources planning, incentive schemes, employment trend, turnover, performance appraisal, etc.
• GOVERNMENT: Budgets, planning, resource optimization, etc.

Step 1: Problem or Opportunity Identification
• The management of the company identifies the problem or opportunity in the organization or in the environment.
• For the management to understand the reasons for the problem, systematic research has to be adopted.

Step 2: Meeting to Discuss the Problem or Opportunity Dimensions (Decision Maker and Business Researcher)
• The decision maker should understand the dimensions of the research, and the researcher should understand the scope of decision making available to the decision maker.

Step 3: Defining the Management Problem and Subsequently the Research Problem
• The management problem is concerned with the decision maker and is action oriented in nature.
• The research problem is information oriented and focuses mainly on the causes and not on the symptoms.

Step 4: Formal Research Proposal and Introducing the Dimensions to the Problem
STEP 1 - Develop a theoretical model to quantify an attitude.
• For example, to estimate the “buying intentions” for a particular product:
a) Prepare a theoretical model to measure an attitude like buying intentions.
b) List the decisive factors in buying a particular product.
c) Collect these factors from the literature and then propose a model containing the various factors that are combined to measure the buying intention.
d) EXAMPLE: Suppose the researcher has decided to take five factors as determinants of the buying intention. These factors are brand image, brand awareness, price, availability, and after-sales services, and are treated as the main variables. Say the researcher has explored the first factor, brand image, and collected seven statements from the literature to characterize it. These seven statements are converted into seven questions and placed as the first seven questions of the questionnaire. In this manner, the questionnaire consists of approximately 35 questions, where each question is designed to measure some phenomenon of interest to the researcher. These questions are rated on a 1- to 7-point rating scale.
e) Moderating variables are the second set of independent variables, which a researcher believes to have a significant contributory or contingent impact on the originally assumed cause–effect relationship between the dependent and independent variables.

Hypothesis
• Buying intention is based on the five factors: brand image, brand awareness, price, availability, and after-sales services. The researcher assumes that, after quantification of these independent variables, any enhancement in one of these variables will enhance the buying intention of the consumer.
• Thus, five hypotheses can be constructed as follows:
• Hypothesis 1: “Brand image” has a significant linear impact on the buying intention.
• Hypothesis 2: “Brand awareness” has a significant linear impact on the buying intention.
• Hypothesis 3: “Price” has a significant linear impact on the buying intention.
• Hypothesis 4: “Availability” has a significant linear impact on the buying intention.
• Hypothesis 5: “After-sales services” has a significant linear impact on the buying intention.
• The researcher can also test the combined impact of these five variables on the buying intention. The proposed multiple regression model will be:
  Buying intention = β0 + β1 (Brand image) + β2 (Brand awareness) + β3 (Price) + β4 (Availability) + β5 (After-sales services) + ε
  where β0 is the intercept, β1 to β5 are the regression coefficients and ε is the error term.
• The corresponding hypothesis may be constructed as follows: Hypothesis 6: All the five factors in combination have a significant linear impact on the buying intention.
Similarly, three other hypotheses

Step 5: Approaches to Research
• The research approach is formulated.
• The questions are framed and scientifically placed in the questionnaire.
• Approaches to research consist of making suitable decisions regarding research components such as types of research, measurement and scaling, development of the questionnaire, sample size determination, sampling techniques and the data analysis plan.

Types of Research
Exploratory Research
• Used to gain insight into the general research problem.
• It is also used to find the relevant variables for framing the theoretical model.
• It is purely unstructured and provides an insight into the problem.
• Exploratory research is used to explore the different dimensions of the problem.
• The research procedure is unstructured, qualitative, and flexible.
• Findings of exploratory research are generally not conclusive.
1. Exploratory research is helpful for both formulating the problem and defining it precisely.
2. Exploratory research is used to identify and define the key research variables.
3. It is helpful in formulating the hypotheses.

Methods of Conducting Exploratory Research
• The methods are secondary data analysis, expert survey, focus group interviews, depth interview, case analysis, and projective techniques.
• Secondary data analysis: For problem understanding and exploration, and to develop an understanding of research findings.
• Expert survey: Consult the experts of the concerned field.
• Focus group interviews: A trained moderator leads a small group of participants in an unstructured discussion about the topic of interest.
• Depth interview: Held between a highly skilled interviewer and a respondent from the target population to uncover the underlying opinions, motivations and emotions.
• Case analysis: Combines record analysis and observation with individual and group interviews.
• Projective technique: Used to generate information when the researcher believes that the respondent will not or cannot reveal the desired meaningful information under direct questioning.
• Word association: The respondents are required to respond to the presentation of an object by indicating the first word, image, or thought that comes to mind as a response to that object.
• For example, a Sony colour television is tested on three words: quality, price, and availability. If the first response of the respondent is “quality”, then the unprocessed and spontaneous response to the brand indicates that the quality of the brand is perceived well by the consumer.
• Completion task: The respondent is presented with an incomplete sentence, story, argument, or conversation and asked to complete it. In the field of business research, the two widely used completion task techniques are the sentence completion task and the story completion task.
• For example, to assess the respondent’s feelings about LG air conditioners, the completion task may be, “I use LG air conditioner because it gives me ______________.” This is just one example; many incomplete sentences can be constructed to elicit responses from different angles. Another example is, “People who use LG air conditioners are _________.”

Descriptive Research
As evident from the name, descriptive research is conducted to describe business or market characteristics. It is used in segmenting and targeting the market.
It is also used to describe the characteristics of relevant groups, to understand the demographic and other characteristics of the population, to understand consumer perception of a product or service, to understand the degree of association between marketing variables, and to forecast sales, production, or other phenomena of interest.

Specific hypotheses are formulated before conducting descriptive research. Hence, it is structured and preplanned in comparison with exploratory research. It can be further classified into cross-sectional study and longitudinal study.

Cross-Sectional Study and Longitudinal Study
• A cross-sectional research design involves the collection of information from a sample of a population at only one point in time. Various segments of the population are sampled so that the relationship among the variables may be investigated by cross tabulation.
• Cross-sectional studies generally involve large samples that are representative of the population; hence, they are sometimes referred to as “sample surveys.”
• In a cross-sectional design, a representative sample taken from the population is studied at only one point in time.
• “What is the effectiveness of an advertisement campaign for an air conditioner?” is an example of a cross-sectional study.
• A longitudinal study involves a survey of the same population over a period of time. The sample remains the same over that period.
• “How have consumers changed their opinion about the performance of the air conditioner compared with last summer?” is an example of a longitudinal study.
• Longitudinal surveys usually combine both extensive (quantitative) and intensive (qualitative) approaches.

Causal Research
• Used to identify the cause-and-effect relationship between two or more business (or decision) variables.
• Many business decisions are based on the causal relationship between the variables of interest.
• For example, a cement manufacturer is working on the assumption that an increase in advertisement expenditure is going to increase the sales of the company.

Step 6: Fieldwork and Data Collection
• The researcher also has to decide whether to go for a survey or adopt observation methods, and whether the research will be based on field data collection or a laboratory experiment.

Step 7: Data Preparation and Data Entry
• There is a specific scientific procedure to deal with missing data and other problems related to the data-collection process.

Step 8: Data Analysis
• After the data are entered in the spreadsheet, data analysis is launched.
• Various sophisticated statistical techniques are used to execute the data analysis. These include univariate, bivariate, and multivariate statistical analysis.

Step 9: Interpretation of Result and Presentation of Findings
• There is a need to interpret the result and present the non-statistical findings derived from the statistical result.
• The researcher has to determine whether the result of the study is in line with the existing literature.
• It is important to present the findings in a scientific manner.
• The results obtained from the analysis are statistical in nature.

Step 10: Management Decision and Its Implementation
• As the last step, the findings are conveyed to the decision maker after consultation with the research programmer.
• The decision maker analyses the findings and takes an appropriate decision in the light of the statistical findings presented by the researcher.

Survey and Quantitative Techniques
• Survey method: A structured questionnaire administered to a sample of a target population, designed to elicit specific information from participants.
• Structured data collection: Use of a formal questionnaire that presents questions in a prearranged order.
• Fixed-response alternative questions: Questions that require participants to choose from a set of predetermined answers.

MEASUREMENT AND SCALING
• Measurement means assigning numbers or other symbols to characteristics of objects according to certain pre-specified rules.
• What we measure is not the object but some characteristic of it.
• Thus, we do not measure consumers, only their perceptions, attitudes, preferences or other relevant characteristics.
• Numbers are usually assigned for one of two reasons.
• First, numbers permit statistical analysis of the resulting data.
• Second, numbers facilitate a universal and transparent communication of measurement rules and results.
• Scaling may be considered an extension of measurement. Scaling involves creating a continuum upon which measured objects are located.
• To illustrate, consider a scale for locating consumers according to the characteristic ‘attitude towards visiting a cinema’. Each participant is assigned a number indicating an unfavourable attitude (measured as 1), a neutral attitude (measured as 2) or a favourable attitude (measured as 3).
• Measurement is the actual assignment of 1, 2 or 3 to each participant.
• Scaling is the process of placing the participants on a continuum; it is the process by which participants are classified as having an unfavourable, neutral or favourable attitude.

Scale Characteristics and Level of Measurement
Description
• The unique labels or descriptors that are used to designate each value of the scale. All scales possess description.
• Examples: 1 = Female, 2 = Male; 1 = Strongly disagree, 2 = Disagree, 3 = Neither agree nor disagree, 4 = Agree and 5 = Strongly agree; and the number of euros earned annually by a household. ‘Female’ and ‘male’ are unique descriptors used to describe values 1 and 2 of the gender scale.
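
As a minimal sketch of the cinema-attitude example above (an illustration, not part of the original slides), the pre-specified measurement rule can be written as a lookup table, and the scale descriptors can be recovered by reversing it. The names ATTITUDE_CODES, measure and describe, and the four responses, are hypothetical.

```python
from collections import Counter

# Hypothetical measurement rule taken from the cinema example:
# each attitude descriptor is assigned one number.
ATTITUDE_CODES = {"unfavourable": 1, "neutral": 2, "favourable": 3}


def measure(responses):
    """Measurement: assign the pre-specified number to each participant's attitude."""
    return [ATTITUDE_CODES[r] for r in responses]


def describe(codes):
    """Description: map numbers back to their descriptors and count each category."""
    labels = {number: label for label, number in ATTITUDE_CODES.items()}
    return Counter(labels[c] for c in codes)


# Made-up responses from four participants.
responses = ["favourable", "neutral", "favourable", "unfavourable"]
codes = measure(responses)
print(codes)
print(describe(codes))
```

Only frequency counts are computed here, because the sketch makes no assumption about whether the three codes carry order or distance.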
Order
• The relative sizes or positions of the descriptors.
• There are no absolute values, only relative values.
• Order is denoted by descriptors such as ‘greater than’, ‘less than’ and ‘equal to’.
• Example 1: Preference for a visit can be expressed by the following order, with the most-preferred art form listed first and the least-preferred last: Cinema, Theatre, Pop concert.
• For King Raghav, the preference for the cinema is greater than the preference for the theatre. Likewise, the preference for a pop concert is less than the preference for the theatre.
• Example 2: Participants who fall into the same age category, say 35 to 49, are considered to be equal to each other in terms of age, and greater than participants in the 20 to 34 age group.
• Not all scales possess the order characteristic. For example, the gender scale does not possess order.

Distance
• Absolute differences between the scale descriptors are known and may be expressed in units.
• Example: A five-person household has one person more than a four-person household, which in turn has one person more than a three-person household.

Origin
• The scale has a unique or fixed beginning, or true zero point.
• Thus, an exact measurement of income by a question such as ‘What is the annual income of your household before taxes? €____’ has a fixed origin or a true zero point.
• A scale that has origin also has distance (and order and description).
• Many scales used in research do not have a fixed origin or true zero point.
• The agree–disagree scale was defined as 1 = Strongly disagree, 2 = Disagree, 3 = Neither agree nor disagree, 4 = Agree and 5 = Strongly agree. However, 1 is an arbitrary origin or starting point. This scale could just as easily have been defined as 0 = Strongly disagree, 1 = Disagree, 2 = Neither agree nor disagree, 3 = Agree and 4 = Strongly agree, with 0 as the origin.
• Alternatively, shifting the origin to –2 will result in an equivalent scale: –2 = Strongly disagree, –1 = Disagree, 0 = Neither agree nor disagree, 1 = Agree and 2 = Strongly agree. All three forms of the agree–disagree scale, with the origin at 1, 0, or –2, are equivalent.
• Description, order, distance and origin represent successively higher-level characteristics, with origin being the highest scale characteristic. Description is the most basic characteristic and is present in all scales.
• If a scale has order, it also has description.
• If a scale has distance, it also has order and description.
• Finally, a scale that has origin also has distance, order and description.
• Thus, if a scale has a higher-level characteristic, it also has all the lower-level characteristics.
• The reverse may not be true, i.e. if a scale has a lower-level characteristic, it may or may not have a higher-level characteristic.

WHAT SHOULD BE MEASURED?
• Measurement objects can be tangible, such as the number of people or consumers, or psychological, such as attitude or perception.
• The measurement of psychological properties requires careful attention from the researcher.
• For example, take the question “What motivates a consumer to buy a luxury car?” In this case, the researcher has to focus on the motives of a consumer. Therefore, the researcher has to quantify “consumer motivation” to address the research question.
• Suppose that a researcher hired by Company X decides to measure it as below:
  Strongly motivated       5
  Motivated                4
  Neutral                  3
  Not motivated            2
  Not strongly motivated   1

PRIMARY SCALES OF MEASUREMENT
• The following are the four common data measurement levels used:
• Nominal scale
• Ordinal scale
• Interval scale
• Ratio scale

Nominal Scale
• A scale whose numbers serve only as labels or tags for identifying and classifying objects, with a strict one-to-one correspondence between the numbers and the objects.
• The numbers in a nominal scale do not reflect the amount of the characteristic possessed by the objects.
• For example, a high number on a football player’s shirt does not imply that the footballer is a better player than one with a low number, or vice versa.
• Only a limited number of statistics, all of which are based on frequency counts, are permissible. These include percentages, mode, chi-square and binomial tests.

Ordinal Scale
• A ranking scale in which numbers are assigned to objects to indicate the relative extent to which some characteristic is possessed. Thus, it is possible to determine whether an object has more or less of a characteristic than some other object.
• Thus, an ordinal scale indicates relative position, not the magnitude of the differences between the objects.
• Examples of ordinal scales include quality rankings, rankings of teams in a tournament and occupational status.
• In marketing research, ordinal scales are used to measure relative attitudes, opinions, perceptions and preferences. Measurements of this type include ‘greater than’ or ‘less than’ judgements from participants.

Interval Scale
• A scale in which the numbers are used to rank objects such that numerically equal distances on the scale represent equal distances in the characteristic being measured. An interval scale contains all the information of an ordinal scale, but it also allows you to compare the differences between objects.
• Example: temperature scale. In marketing research, attitudinal data obtained from rating scales are often treated as interval data.
• In an interval scale, the location of the zero point is not fixed, so it is not meaningful to take ratios of scale values.
• Statistical techniques that may be used on interval scale data include all those that can be applied to nominal and ordinal data, in addition to statistics such as the arithmetic mean and standard deviation.

Ratio Scale
• The highest scale.
• This scale allows the researcher to identify or classify objects, rank-order the objects and compare intervals or differences. It is also meaningful to compute ratios of scale values. Thus, ratio scales possess the characteristic of origin (and distance, order and description).
• Examples of ratio scales include height, weight, age and money. In marketing research, sales, costs, market share and number of customers are variables measured on a ratio scale.

MEASUREMENT SCALES
• Comparative scales are based on the direct comparison of stimulus objects and generally generate ranking or ordinal data. These scales are sometimes referred to as non-metric scales.
• Non-comparative scaling techniques generally involve the use of a rating scale, and the resulting data are interval or ratio in nature. This is the reason why these scales are referred to as monadic scales or metric scales by some business researchers.

ADVANTAGES AND DISADVANTAGES OF COMPARATIVE SCALES
Advantages
• Small differences between stimulus objects can be detected.
• As they compare the stimulus objects, participants are forced to choose between them.
• Participants approach the rating task from the same known reference points.
• They tend to reduce halo or carryover effects from one judgement to another.
Disadvantages
• The ordinal nature of the data and the inability to generalise beyond the stimulus objects scaled. For instance, to compare a visit to a pop concert with a cinema or theatre visit, the researcher would have to do a new study.

Non-Comparative Scales (monadic or metric scales)
• Each object is scaled independently of the others in the stimulus set.
• The resulting data are generally assumed to be interval or ratio scaled.
• For example, participants may be asked to evaluate a cinema visit on a 1 to 6 preference scale (1 = Not at all preferred, 6 = Greatly preferred).
• Similar evaluations would be obtained for a theatre visit and a pop concert visit.
• Non-comparative scales can be continuous rating or itemised rating scales. The itemised rating scales can be further classified as Likert, semantic differential or Stapel scales.

Comparative Scaling Techniques - Paired comparison scaling
• A participant is presented with two objects and asked to select one according to some criterion.
• The data obtained are ordinal in nature.
• With n brands, [n(n – 1)/2] paired comparisons include all possible pairings of objects.
• Transitivity of preference: under this assumption, paired comparison data can be converted into the participant’s order of preference, from most to least preferred.

Comparative Scaling Techniques - Rank order scaling
• A comparative scaling technique in which participants are presented with several objects simultaneously and asked to order or rank them according to some criterion.
• Rank order scaling results in ordinal data.
• It is used to measure attributes of products and services as well as preferences for brands.
• If there are n stimulus objects, only (n – 1) scaling decisions need be made in rank order scaling. In paired comparison scaling, however, [n(n – 1)/2] decisions would be required.
• Finally, under the assumption of transitivity, rank order data can be converted to equivalent paired comparison data, and vice versa. Hence, it is possible to derive an interval scale (how this is done is discussed later in the course).

Comparative Scaling Techniques - Constant sum scaling
• In constant sum scaling, participants allocate a constant sum of units, such as points or euros, among a set of stimulus objects with respect to some criterion.
• For example, participants may be asked to allocate 100 points to attributes of bottled beers in a way that reflects the importance they attach to each attribute. If an attribute is unimportant, the participant assigns it zero points. If an attribute is twice as important as some other attribute, it receives twice as many points.
The sum of all the points is 100, hence the name of the scale.
• The attributes are scaled by counting the points assigned to each one by all the participants and dividing by the number of participants.

Comparative Scaling Techniques - Q-sort and other procedures
• Q-sort scaling is a comparative scaling technique that uses a rank order procedure to sort objects based on similarity with respect to some criterion.
• For example, participants are given 100 attitude statements on individual cards and asked to place them into 11 piles, ranging from ‘most highly agreed with’ to ‘least highly agreed with’. The number of objects to be sorted should not be less than 60 nor more than 140; a reasonable range is 60 to 90 objects.

NON-COMPARATIVE SCALING TECHNIQUES
• A non-comparative scale does not compare the object being rated either with another object or with some specified standard.
• Participants evaluate only one object at a time; thus, these scales are referred to as monadic scales.
• They consist of continuous and itemised rating scales.

NON-COMPARATIVE SCALING TECHNIQUES - Continuous rating scale
• A measurement scale that has participants rate the objects by placing a mark at the appropriate position on a line that runs from one extreme of the criterion variable to the other. The form may vary considerably. It is also called a graphic rating scale.

NON-COMPARATIVE SCALING TECHNIQUES - ITEMISED RATING SCALES - LIKERT SCALE
• A measurement scale with, typically, five response categories ranging from ‘strongly disagree’ to ‘strongly agree’ that requires participants to indicate a degree of agreement or disagreement with each of a series of statements related to the stimulus objects.
• Each statement is assigned a numerical score, ranging either from –2 to +2 or from 1 to 5. The analysis can be conducted on an item-by-item basis (profile analysis), or a total (summated) score can be calculated for each participant by summing across items.
• It is important to use a consistent scoring procedure so that a high (or low) score consistently reflects a favourable response.
This requires that the categories assigned to the negative statements by the participants be scored by reversing the scale.
• Note that for a negative statement, agreement reflects an unfavourable response, whereas for a positive statement, agreement represents a favourable response. Accordingly, a ‘strongly agree’ response to a favourable statement and a ‘strongly disagree’ response to an unfavourable statement would both receive scores of 5.

NON-COMPARATIVE SCALING TECHNIQUES - ITEMISED RATING SCALES - Semantic differential scale
• The semantic differential is typically a seven-point rating scale with end points associated with bipolar labels that have semantic meaning. In a typical application, participants rate objects on a number of itemised, seven-point rating scales bounded at each end by one of two bipolar adjectives, such as ‘boring’ and ‘exciting’.
• Individual items on a semantic differential scale may be scored either on a –3 to +3 or on a 1 to 7 scale. The resulting data are commonly analysed through profile analysis. In profile analysis, mean or median values on each rating scale are calculated and compared by plotting or statistical analysis. This helps determine the overall differences and similarities among the objects.

NON-COMPARATIVE SCALING TECHNIQUES - ITEMISED RATING SCALES - Stapel scale
• A scale for measuring attitudes that consists of a single adjective in the middle of an even-numbered range of values.
• It is a unipolar rating scale with 10 categories numbered from –5 to +5, without a neutral point (zero).
• It is presented vertically.
• Participants are asked to indicate, by selecting an appropriate numerical response category, how accurately or inaccurately each term describes the object.
• The higher the number, the more accurately the term describes the object.

ITEMISED RATING SCALE DECISIONS
1 The number of scale categories to use.
2 Balanced versus unbalanced scale.
3 Odd or even number of categories.
4 Forced versus non-forced choice.
5 The nature and degree of the verbal description.
6 The physical form of the scale.

Number of scale categories
• Traditional guidelines suggest that the appropriate number of categories should be between five and nine.
• If the participants are interested in the scaling task and are knowledgeable about the objects, many categories may be employed; if they are not, fewer categories should be used.
• If telephone interviews are involved, many categories may confuse the participants.
• The size of the correlation coefficient, a common measure of the relationship between variables, is influenced by the number of scale categories. The correlation coefficient decreases with a reduction in the number of categories.

Balanced versus unbalanced scale
• In a balanced scale, the number of favourable and unfavourable categories is equal; in an unbalanced scale, the numbers are unequal.
• If the distribution of responses is likely to be skewed, either positively or negatively, an unbalanced scale with more categories in the direction of skewness may be appropriate.

Odd or even number of categories
• With an odd number of categories, the middle scale position is generally designated as neutral or impartial. The presence, position and labelling of a neutral category can have a significant influence on the response. The Likert scale is a balanced rating scale with an odd number of categories and a neutral point.
• If a neutral or indifferent response is possible from at least some of the participants, an odd number of categories should be used.

Forced versus non-forced choice
• On forced rating scales the participants are forced to express an opinion because a ‘no opinion’ option is not provided.
• In such a case, participants without an opinion may mark the middle scale position.
• If a sufficient proportion of the participants do not have opinions on the topic, marking the middle position will distort measures of central tendency and variance.
• In situations where the participants are expected to have no opinion, as opposed to simply being reluctant to disclose it, the accuracy of data may be improved by a non-forced scale that includes a ‘no opinion’ category.

Nature and degree of verbal description
• The strength of the adjectives used to anchor the scale may influence the distribution of the responses. With strong anchors (1 = Completely disagree, 7 = Completely agree), participants are less likely to use the extreme scale categories. This results in less variable and more peaked response distributions. Weak anchors (1 = Generally disagree, 7 = Generally agree), in contrast, produce uniform or flat distributions.

MULTI ITEM SCALE
• A multi-item scale consists of multiple items, where an item is a single question or statement to be evaluated.
• The Likert, semantic differential and Stapel scales are multi-item scales.

PROCESS
STEP 1
• The researcher begins by developing the construct of interest. A construct is a specific type of concept that exists at a higher level of abstraction than everyday concepts. Examples of such constructs in marketing include ‘brand loyalty’, ‘product involvement’ and ‘satisfaction’.
STEP 2
• Develop a theoretical definition of the construct that establishes the meaning of the central idea or concept of interest.
• A theory is necessary not only for constructing the scale, but also for interpreting the resulting scores.
STEP 3
• Generate an initial pool of scale items.
• The items are based on theory, analysis of secondary data and qualitative research.
STEP 4
• Data are collected on the reduced set of potential scale items from a large pre-test sample of participants.
• The data are analysed using techniques such as correlations, factor analysis, cluster analysis, discriminant analysis and statistical tests.
• Several more items are eliminated, resulting in a purified scale.
• The purified scale is evaluated for reliability and validity by collecting more data from a different sample.

SCALE EVALUATION - MEASUREMENT ACCURACY
• The true score model provides a framework for understanding the accuracy of measurement. According to this model, $X_O = X_T + X_S + X_R$, where
• $X_O$ = the observed score or measurement
• $X_T$ = the true score of the characteristic
• $X_S$ = systematic error
• $X_R$ = random error
• Four major error sources may contaminate the results: (1) the respondent, (2) the situation, (3) the measurer, and (4) the data collection instrument.
• Systematic error affects the measurement in a constant way. It represents stable factors that affect the observed score in the same way each time the measurement is made, such as mechanical factors.
• Random error, on the other hand, is not constant. It represents transient factors that affect the observed score in different ways each time the measurement is made, such as short-term personal factors or situational factors.

CRITERIA FOR GOOD MEASUREMENT: Validity, Reliability, and Sensitivity
• Validity is the ability of an instrument to measure what it is designed to measure.
• Example: using the behaviour of employees to measure consumer satisfaction in a big shopping mall raises a validity issue. Why? Consumer satisfaction is also determined by pricing policies, discount policy, parking facilities and other perks offered by the seller.

Content Validity OR face validity
• The extent to which a measuring instrument provides adequate coverage of the topic under study.
• EXAMPLE: A scale designed to measure a fashion boutique’s image would be considered inadequate if it omitted any of the major dimensions, such as:
• brand image(s) of stocked merchandise,
• quality of merchandise,
• assortment of merchandise,
• layout and merchandising,
• service of boutique personnel,
• prices, etc.
• EXAMPLE: Health, defined as high spiritual, physical, mental, emotional, and social levels.
Physical health is assessed through medical history, weight, body composition, activity levels, diet, lifestyle, and sleep routines, together with signs of temporary or chronic illness, injury, or substance abuse. Further, some evaluators may only be concerned with specific aspects of health or place more importance on certain aspects.
• Example: BMI does not accurately classify heavily muscular individuals with low body fat. Nor does it accurately gauge metabolic obesity (colloquially known as skinny fat), which can present just as many health risks as carrying a visible, significant amount of visceral fat.

CRITERION Validity
• Whether a scale performs as expected in relation to other selected variables treated as meaningful criteria.

TYPES OF CRITERION VALIDITY: CONCURRENT AND PREDICTIVE
• Concurrent validity is assessed when the data on the scale being evaluated (e.g. a loyalty scale) and the criterion variables (e.g. repeat purchasing) are collected at the same time.
• It reflects the amount of agreement between two different assessments. Typically, a validated test will be classified as the "gold standard," and concurrent validity will measure how a new test compares to it.

What is a good concurrent validity score?
Concurrent validity scores usually range between 0 and 1:
• Less than 0.25: small concurrence
• 0.25 to 0.50: moderate
• 0.50 to 0.75: good
• Over 0.75: excellent

Comparing tests and GPA
• In this example, a new math test is created. After the test is administered, researchers compare the results to each student's current GPA in that class. If the GPA correlates with the test's result, concurrent validity is established.

Aptitude test and supervisor assessment
• To confirm the validity of a leadership aptitude test, a business compares the test results to a supervisor’s assessment of the potential recruit.

Predictive Validity
• Predictive validity is concerned with how well a scale can forecast a future criterion.
• The criterion variable is the dependent variable Y and the predictor variable is X. To assess predictive validity, the researcher collects data on the scale at one point in time and data on the criterion variables at a future time. For example, attitudes reflecting how loyal customers feel to a particular brand could be used to predict future repeat purchases of that brand.
• The predicted and actual purchases (which could be tracked on CRM databases or scanned purchases) are compared to assess the predictive validity of the attitudinal scale.
• For example, if the correlation between a pre-employment test and employee productivity one year later is 0.86, this test is more predictive of employee productivity than a test that only has a correlation of 0.35.

CONSTRUCT Validity
• Construct validity addresses the question of what construct or characteristic the scale is, in fact, measuring.
• For a new questionnaire designed to evaluate aggression, the instrument’s construct validity would be the extent to which it actually assesses aggression as opposed to assertiveness or social dominance.
• It is of three types: convergent, divergent and nomological.

EXAMPLES
• Purchase Intention And Purchase Behavior
• We can determine construct validity by following up later to see if the answers to a questionnaire correlated with actual behavior. For example, after completing a questionnaire indicating you’re interested in movies, did you end up purchasing DVDs or going to the cinema?
• Construct: Purchase Intention
• Construct Validity Measure: Subsequent Consumer Behavior

TYPES OF CONSTRUCT VALIDITY TESTS
• Convergent Validity Testing: To determine the construct validity of a self-esteem rating scale, compare it to other established self-esteem rating scales to see if they correlate; this is a convergent validity test.
Establishing the construct validity of any one of these scales would simply involve administering the scale of interest (self-esteem scale #1) together with one of the others (self-esteem scale #2). The researcher then only needs to calculate a correlation between the scores on the two tests.
• If the correlation is close to 1, then it could be said that self-esteem scale #1 has construct validity. This type of construct validity is called convergent validity. It involves assessing the degree of similarity between two scales that measure the same construct.
• Divergent Validity Testing: Test the construct validity of a self-esteem rating scale by comparing it to a different rating scale to see if they correlate. Here, we want to see a low correlation because the scales are testing different things. We call this a divergent validity test.
• For this example, we could compare a self-esteem rating scale to an introvert/extravert rating scale. Ideally, these scales do not correlate.
• Conducting this assessment simply requires administering both scales to the same population, and then calculating the correlation between the scores on each scale.
• Ideally, there would be a very low correlation (close to 0) between the two because they are measuring two theoretically distinct constructs.
• Nomological validity is the extent to which the scale correlates in theoretically predicted ways with measures of different but related constructs. A theoretical model is formulated that leads to further deductions, tests and inferences.
• Exploratory Factor Analysis (EFA): EFA is a statistical procedure for assessing the individual questions on a measurement scale. It involves comparing questions that measure a component to see if the answers to these questions all correlate, demonstrating high construct validity.
• For example, if a researcher wants to develop a measure of extraversion, they would start by identifying the theoretical components of that construct.
They may be:
• Friendliness
• Agreeableness
• Sociability
• Cheerfulness
• Since the construct has four components, the scale should contain multiple questions that assess each component, perhaps 8 questions per component for a total of 32 questions.
• Of the 8 questions about ‘friendliness’, we should see that the respondents provide similar answers for all eight. In this case, the questions have a high degree of relatedness with each other and likely have high construct validity.
• If the scale has good construct validity, then this pattern should hold for each component. However, some questions may not be as related as they should be, which means they are not measuring the same thing that the others are measuring. This weakens construct validity.
• https://maps.org/research-archive/mdma/mt1_docs/neoinventory.pdf
• https://chsresults.com/blog/test/eysencks-personality-inventory-epiextroversionintroversion/

Reliability
• Reliability refers to the extent to which a scale produces consistent results if repeated measurements are made.
• Approaches for assessing reliability include the test–retest, alternative-forms and internal consistency methods.
• Examples: weighing scales and measuring tapes.
• If findings from research are replicated consistently, they are reliable. A correlation coefficient can be used to assess the degree of reliability. If a test is reliable, it should show a high positive correlation.

Reliability Estimates
• In test–retest reliability, participants are administered identical sets of scale items at two different times.
• In alternative-forms reliability, two equivalent forms of the scale are constructed. The same participants are measured at two different times, with a different form administered each time.
• The higher the correlation coefficient, the greater the reliability.

Multi-Item Scales
• Multi-item scales generate some interval type of information. In the interval scaling technique, a scale is constructed with a number or description associated with each scale position.
Therefore, the respondent’s rating on certain characteristics of interest is obtained. For the majority of researchers, rating scales are the preferred measuring device for obtaining interval (or quasi-interval) data on the personal characteristics (i.e., attitudes, preferences, and opinions) of individuals of all kinds [14]. There are arguments and counterarguments in favour of both single- and multi-item scales. In his research paper, “Reliability: A Review of Psychometric Basics and Recent Marketing Practices”, P. J. Peter argued that multiple measures are inherently more reliable because they enable computation of correlations between the items [15]. A second argument in favour of multi-item measures is that a multi-item measure captures more information than a single-item measure [16]. The following section discusses some common multi-item scales such as Likert scales, semantic differential scales, Stapel scales, and numerical scales.

Summated Scaling Technique: The Likert Scales
• The Likert scale was developed by Rensis Likert and is one of the most commonly used scales. In a Likert scale, each item response has five rating categories, with “strongly disagree” and “strongly agree” as the two extremes and “disagree,” “neither agree nor disagree,” and “agree” in the middle of the scale. Typically, a 1- to 5-point rating scale is used, but a few researchers use another set of numbers such as –2, –1, 0, +1, and +2.
• In the summated approach, scores are obtained from the respondents, and the sum is obtained across the scale items. After summing, an average is obtained for all the respondents. The summated approach is widely used, which is why the Likert scale is also referred to as the summated scale.

Semantic Differential Scales
• The semantic differential scale consists of a series of bipolar adjectival words or phrases placed on the two extreme points of the scale. Some researchers prefer bipolar scales, whereas others prefer unipolar scales.
In a bipolar scale, the mid-point is the neutral point, whereas in a unipolar scale, the mid-point is simply a point between the two poles.
• In the example semantic differential scale, positive adjectives are on the left side of the scale for Items 1, 2, 3, 4, and 7. This is why the highest number of the scale, that is, 7, is assigned to the left side of the scale for these items. In contrast, for Items 5 and 6 the left side of the scale carries negative adjectives or phrases. This is why the lowest rating number, 1, is assigned to the left side of Items 5 and 6. This is done deliberately because it avoids the halo effect. The halo effect has an adverse impact on the respondent’s answers because it is the tendency of a respondent to follow the previous judgment carelessly when all the items have the negative adjectives on the left side of the scale and the positive adjectives on the right. Good semantic differential scales keep some negative adjectives and some positive adjectives on the left side of the scale to tackle the problem of the halo effect.

Stapel Scales
• A Stapel scale is a variation of the semantic differential scale; however, each item consists of just one word or phrase on which respondents rate the attitude object using a 10-point scale with just numerical labels. The Stapel scale is generally presented vertically with a single adjective or phrase in the centre of the positive and negative ratings.

Numerical scales
• Numerical scales provide equal intervals separated by numbers, as scale points to the respondents. These scales are generally 5- or 7-point rating scales.

Continuous Rating Scale
• In a continuous rating scale, the respondents rate the object by placing a mark on a continuum to indicate their attitude. In this scale, the two ends of the continuum represent the two extremes of the phenomenon being measured.
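
The summated scoring and reversed-item handling described above for Likert and semantic differential scales can be illustrated with a minimal Python sketch. The data, the item positions and the function names (`reverse_score`, `summated_scores`, `item_profile`) are hypothetical and chosen only for illustration; the sketch assumes a 1- to 5-point scale on which a high score should always mean a favourable response.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 5  # assumed coding: 1 = strongly disagree, 5 = strongly agree


def reverse_score(raw, low=SCALE_MIN, high=SCALE_MAX):
    """Flip a rating so that agreeing with a negative statement scores low."""
    return low + high - raw


def adjust(ratings, negative_items):
    """Reverse only the items whose index is listed in negative_items."""
    return [
        reverse_score(r) if i in negative_items else r
        for i, r in enumerate(ratings)
    ]


def summated_scores(responses, negative_items):
    """One total (summated) score per respondent."""
    return [sum(adjust(ratings, negative_items)) for ratings in responses]


def item_profile(responses, negative_items):
    """Mean adjusted rating per item across respondents (profile analysis)."""
    adjusted = [adjust(ratings, negative_items) for ratings in responses]
    return [mean(column) for column in zip(*adjusted)]


# Hypothetical data: 3 respondents, 4 statements.
# The statement at index 2 is negatively worded and must be reversed.
responses = [
    [5, 4, 1, 5],
    [3, 3, 3, 3],
    [2, 1, 5, 2],
]
negative_items = {2}

totals = summated_scores(responses, negative_items)
print(totals)        # [19, 12, 6]
print(mean(totals))  # average summated score across respondents
print(item_profile(responses, negative_items))
```

The same `reverse_score` rule works for a 1 to 7 semantic differential item by passing `low=1, high=7`, and for a –2 to +2 coding by passing `low=-2, high=2`.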
Decision on the Basis of Objective of Conducting a Research

3.7.2 Decision Based on the Response Data Type Generated by Using a Scale
To get nominal information, a nominal scale (multiple-choice scale) is used. This kind of demographic information provides a new dimension to research. For example, product preference can always be combined with demographic variables such as the age or gender of a consumer. A respondent’s attitude varies significantly with differences in the demographic characteristics of respondents. A researcher has to use a ranking scale when he or she needs to make comparisons between two objects or object attributes. Ranking scales are also used when respondents rank different objects simultaneously from a list of objects presented to them. Rating scales are generally used when the research focus is to get the respondent’s response for an object on a rating continuum, usually a 1- to 5- or 1- to 7-point rating scale. For example, a respondent is presented with a 1- to 5-point rating scale in which 1 is strongly disagree and 5 is strongly agree. The respondent is asked to provide his or her opinion about service facility statements (items) for a five-star hotel group. Sometimes, a ratio scale is used to obtain direct information such as the average income of a group and its comparison with other groups.
• Bajpai Naval. Business Research Methods (p. 62). Pearson Education. Kindle Edition.

3.7.3 Decision Based on Using Single- or Multi-Item Scale
Both single-item and multi-item scales have their advocates and opponents in the field of business research.
Proponents of the multi-item scale believe that a single observation may be misleading and lacking in context, and that multi-item measurement scales can help to overcome these distortions [21]. Practitioners’ preference for single-item measures is not theoretically based but rather practical, in that single-item measures minimize respondent refusal and reduce data collection and data processing costs [22]. A researcher should decide between a single-item and a multi-item scale based on the research objective.

3.7.4 Decision Based on Forced or Non-Forced Choice
In a forced-choice rating scale, the researchers do not include a “no opinion” option in the scale points. This forces respondents to provide an opinion even when they have no opinion about the object. In some cases, researchers conduct a study under the assumption that the respondent will definitely provide an opinion by selecting a rating point. In such situations, respondents may sometimes have an undecided attitude, and they usually select the mid-point of the scale. This mid-point is not necessarily the no opinion option and hence biases the result. The respondent’s tendency to select the middle option distorts the result, as measures of central tendency such as the mean and median of the data tend to shift towards the mid-point. In such a situation, a researcher can incorporate a no opinion point in the scale to avoid biased responses. In a non-forced-choice rating scale, a no opinion option is provided by the researcher.

3.7.5 Decision Based on Using Balanced or Unbalanced Scale
In a balanced scale, the number of favourable categories and unfavourable categories is equal.
In an unbalanced scale, the numbers of favourable and unfavourable categories are unequal. A balanced scale is of the following form: strongly disagree, disagree, neither agree nor disagree, agree, and strongly agree. Note that in this type of scale, two rating points indicate agreement, two rating points indicate disagreement and one point is the neutral state (neither agree nor disagree). Hence, this scale is a balanced scale. An unbalanced scale is of the following form: strongly disagree, disagree, agree, strongly agree, and very strongly agree. In this scale, three rating points indicate agreement and only two rating points indicate disagreement, resulting in an unbalanced scale. Respondents tend to rate higher when the object is familiar to them or when the object involves their “ego”. Usually, researchers use a balanced scale with an equal number of favourable and unfavourable terms. Sometimes researchers know in advance that the respondents will present a response skewed for or against the research phenomenon. In this case, an unbalanced scale in the direction of skewness may be an appropriate choice. When using an unbalanced rating scale, the researchers have to take this into consideration when doing data analysis.

3.7.6 Decision Based on the Number of Scale Points and Its Verbal Description
Researchers generally use a 3-, 5-, 7-, 9-, or 11-point scale. In some rare cases, a 13-point scale is also used. When an object is simple and has no major impact on the respondent’s life, a simple 3-point scale can be used. In other cases, when the object requires high involvement of the respondent, any scale ranging from 5 to 11 points can be considered by the researcher. While deciding the number of scale categories, factors such as the respondents’ comfort in handling the scale, the respondents’ awareness of the subject matter, and the mode of data collection must be considered.
Respondents find it difficult to handle scale categories when there are many of them. Therefore, a researcher should include only the scale categories that are essential and avoid too many categories. The respondent’s familiarity with the subject matter or objects allows a researcher to include a large number of categories. On the other hand, if the respondent has little or no awareness of the object or subject matter, a large number of categories must be avoided. The mode of data collection is also a determinant of scale categories. For example, a researcher will be uncomfortable using a large number of categories while administering the questionnaire by telephone. The telephone method of data collection requires fewer scale categories. A scale can have numerical, verbal or pictorial descriptions associated with the scale points. In some cases, researchers label only the extreme scale points. In other cases, researchers label every scale point. As a general rule, the description of a scale point should be placed close to the point it describes. Labelling all the scale points also allows a researcher to avoid scale ambiguity. These are general recommendations; the final decision is a matter of the researcher’s judgement.

EXERCISES
• 1 You work in the marketing research department of a luxury watch brand. Your firm would like to measure the attitudes of retailers towards your brand and your main competitors. The attitudes would be measured using an online survey. You have been asked to develop an appropriate scale for this purpose. You have also been asked to explain and justify your reasoning in constructing this scale.
• 2 Develop three comparative (paired comparison, rank order and constant sum) scales to measure attitude towards five popular brands of soft drink (e.g. Coca Cola, Pepsi, Dr. Pepper, 7 Up and Red Bull).
Administer each scale to five students. No student should be administered more than one scale. Note the time it takes each student to respond. Which scale was the easiest to administer? Which scale took the shortest time?
• 3 Develop a constant sum scale to determine preferences for restaurants. Administer this scale to a pilot sample of 20 students to determine their preferences for some of the popular restaurants in your town or city. Based on your pilot, evaluate the efficacy of the scale items you chose, and design new scale items that could be used for a full survey.
• 4 Design Likert scales to measure the usefulness of the Louis Vuitton Moët Hennessy website. Visit the site at www.lvmh.com and rate it on the scales that you have developed. After your site visit, were there any aspects of usefulness that you had not considered in devising your scales? What were they, and why were they not apparent before you made your site visit?
• 5 In a small group, discuss the following issues: ‘A brand could receive the highest median rank on a rank order scale of all the brands considered and still have poor sales’ and ‘It really does not matter which scaling technique you use. As long as your measure is reliable, you will get the right results’.
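
As a possible starting point for tabulating the pilot data in Exercise 3, the constant sum scoring rule given earlier (total the points assigned to each object by all participants and divide by the number of participants) can be sketched in Python. The restaurant names, the pilot data and the function name `constant_sum_scores` are hypothetical, and the sketch assumes every participant allocates exactly 100 points.

```python
def constant_sum_scores(allocations, total=100):
    """Average the points given to each object across participants.

    Raises ValueError if any participant's allocation does not add up to
    `total`, since such a response breaks the constant sum rule.
    """
    objects = list(allocations[0].keys())
    for row in allocations:
        if sum(row.values()) != total:
            raise ValueError(f"allocation does not sum to {total}: {row}")
    n = len(allocations)
    return {obj: sum(row[obj] for row in allocations) / n for obj in objects}


# Hypothetical pilot responses from three students.
pilot = [
    {"Restaurant A": 50, "Restaurant B": 30, "Restaurant C": 20},
    {"Restaurant A": 40, "Restaurant B": 40, "Restaurant C": 20},
    {"Restaurant A": 60, "Restaurant B": 10, "Restaurant C": 30},
]

scores = constant_sum_scores(pilot)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.1f}")
```

Rejecting allocations that do not sum to 100 is one design choice among several; a researcher might instead rescale such responses, which would need to be justified in the write-up.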