Forecasting Techniques

Materials required for this course are at: www.business.duq.edu/faculty/davies

© Copyright 2006. Do not distribute or copy without permission.

Don’t Trust Your Eyes

Desirable Properties of Estimators

Unbiasedness: $E(\hat{\theta}) = \theta$

Consider rolling a die. The population mean of the die rolls is 3.5. Suppose we take a sample of N rolls of the die. Let $X_i$ be the i-th die roll. We then estimate the population mean via one of the following equations:

Parameter Estimator #1: $\hat{\theta} = \frac{1}{N} \sum_{i=1}^{N} X_i$
Parameter Estimator #2: $\hat{\theta} = \frac{1}{N+1} \sum_{i=1}^{N} X_i$
Parameter Estimator #3: $\hat{\theta} = 1$ if $\sum_{i=1}^{N} X_i$ is odd, $\hat{\theta} = 6$ otherwise

Estimators #1 and #3 are unbiased because, on average, they will equal 3.5. Estimator #2 is biased because, on average, it will be less than 3.5.

Consistency: $\lim_{N \to \infty} \Pr(|\hat{\theta} - \theta| < \varepsilon) = 1$ for every $\varepsilon > 0$

Estimators #1 and #2 are both consistent because, as N increases, the estimates (on average) approach 3.5. Estimator #3 is inconsistent because, as N increases, the estimate does not approach 3.5.

Efficiency: $\sigma_{\hat{\theta}}$ = minimum attainable in the class of linear, unbiased estimators

Standard Error of Estimator #1: $\frac{1.7078}{\sqrt{N}}$
Standard Error of Estimator #3: $\sqrt{\frac{1}{2}(1 - 3.5)^2 + \frac{1}{2}(6 - 3.5)^2} = 2.5$

Estimator #3 is not efficient because there is another linear unbiased estimator (Estimator #1) that attains a lesser standard error.
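The three properties can be checked by simulation. The sketch below is illustrative only: it assumes Estimator #2 divides the sum of the rolls by N + 1 and Estimator #3 returns 1 when the sum of the rolls is odd and 6 otherwise. The function names, the number of simulated samples, and the seed are arbitrary choices, not part of the course materials.

```python
import random
import statistics


def estimator_1(rolls):
    """Sample mean: sum of the rolls divided by N."""
    return sum(rolls) / len(rolls)


def estimator_2(rolls):
    """Sum of the rolls divided by N + 1 (assumed form of Estimator #2)."""
    return sum(rolls) / (len(rolls) + 1)


def estimator_3(rolls):
    """1 if the sum of the rolls is odd, 6 otherwise (assumed form of Estimator #3)."""
    return 1 if sum(rolls) % 2 == 1 else 6


def simulate(estimator, n_rolls, n_samples=4000, seed=0):
    """Return the mean and standard deviation of the estimator across samples."""
    rng = random.Random(seed)
    estimates = [
        estimator([rng.randint(1, 6) for _ in range(n_rolls)])
        for _ in range(n_samples)
    ]
    return statistics.mean(estimates), statistics.pstdev(estimates)


for n_rolls in (5, 50):
    for estimator in (estimator_1, estimator_2, estimator_3):
        mean, std_error = simulate(estimator, n_rolls)
        print(f"N={n_rolls:2d} {estimator.__name__}: "
              f"mean={mean:.3f} std_error={std_error:.3f}")
```

Under these assumptions, the mean of Estimator #2 should sit below 3.5 for small N and move toward 3.5 as N grows, while the spread of Estimator #3 should stay near 2.5 regardless of N.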
Impacts of Anomalies on Estimator Properties

                        Parameter Estimates                    Standard Error Estimates
Anomaly                 Biased   Inconsistent   Inefficient    Biased   Inconsistent
Non-Stationarity        X        X
Omitted Variable        X        X
Non-Linearity           X        X
Regime Shift            X        (X)
Non-Zero Errors         X        (X)
Measurement Error       X        X
Truncated Data          X        X
Censored Data           X        X
Serial Correlation                              X              X        X
Heteroskedasticity                              X              X        X
Extraneous Variable                             X
Multicollinearity                               X

Multiple Regression Analysis

Example: A trucking company wants to be able to predict the round-trip travel time of its trucks. Data Set #15 contains historical information on miles traveled, number of deliveries per trip, and total travel time. Use the information to predict a truck’s round-trip travel time.

Miles Traveled   Deliveries   Travel Time (hours)
500              4            11.3
250              3            6.8
500              4            10.9
500              2            8.5
250              2            6.2
400              2            8.2
375              3            9.4
325              4            8.0
450              3            9.6
450              2            8.1

Approach #1: Calculate Average Time per Mile

Trucks in the data set required a total of 87 hours to travel a total of 4,000 miles. Dividing hours by miles, we find an average of 0.02 hours per mile journeyed.

Problem: This approach ignores a possible fixed effect. For example, if travel time is measured starting from the time that out-bound goods begin loading, then there will be some fixed time (the time it takes to load the truck) tacked on to all of the trips. For longer trips this fixed time will be “amortized” over more miles and will have less of an impact on the time/mile ratio than for shorter trips. This approach also ignores the impact of the number of deliveries.
Approach #2: Calculate Average Time per Mile and Average Time per Delivery

Trucks in the data set averaged 87 / 4,000 = 0.02 hours per mile journeyed, and 87 / 29 = 3 hours per delivery.

Problem: Like the previous approach, this approach ignores a possible fixed effect. This approach does account for the impact of both miles and deliveries, but it ignores the possible interaction between miles and deliveries. For example, trucks that travel more miles likely also make more deliveries. Therefore, when we combine the time/miles and time/delivery measures, we may be double-counting time.

Approach #3: Regress Time on Miles

$Time_i = \beta_0 + \beta_1 (miles_i) + u_i$

The regression model will detect and isolate any fixed effect.

Problem: The model ignores the impact of the number of deliveries. For example, a 500 mile journey with 4 deliveries will take longer than a 500 mile journey with 1 delivery.
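The arithmetic behind Approaches #1 and #2 can be checked directly from the data set. The sketch below recomputes the two averages and then, as an illustration of the double-counting problem, adds them together to "predict" the first trip in the data. The function name naive_prediction is a hypothetical helper introduced here, not something from the course materials.

```python
miles = [500, 250, 500, 500, 250, 400, 375, 325, 450, 450]
deliveries = [4, 3, 4, 2, 2, 2, 3, 4, 3, 2]
hours = [11.3, 6.8, 10.9, 8.5, 6.2, 8.2, 9.4, 8.0, 9.6, 8.1]

# Approach #1: total hours divided by total miles (87 / 4,000).
hours_per_mile = sum(hours) / sum(miles)

# Approach #2 adds total hours divided by total deliveries (87 / 29).
hours_per_delivery = sum(hours) / sum(deliveries)


def naive_prediction(trip_miles, trip_deliveries):
    """Combine the two averages by simple addition (the flawed Approach #2)."""
    return hours_per_mile * trip_miles + hours_per_delivery * trip_deliveries


first_trip_prediction = naive_prediction(miles[0], deliveries[0])
print(f"hours per mile:     {hours_per_mile:.5f}")
print(f"hours per delivery: {hours_per_delivery:.2f}")
print(f"predicted {first_trip_prediction:.1f} hours vs. observed {hours[0]} hours")
```

Because each average already attributes all 87 hours to a single variable, adding them roughly doubles the predicted time for the first trip (500 miles, 4 deliveries) relative to the observed 11.3 hours.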
Approach #4: Regress Time on Deliveries

$Time_i = \beta_0 + \beta_1 (deliveries_i) + u_i$

The regression model will detect and isolate any fixed effect and will account for the impact of the number of deliveries.

Problem: The model ignores the impact of miles traveled. For example, a 500 mile journey with 4 deliveries will take longer than a 200 mile journey with 4 deliveries.

Approach #5: Regress Time on Both Miles and Deliveries

$Time_i = \beta_0 + \beta_1 (miles_i) + \beta_2 (deliveries_i) + u_i$

The multiple regression model (1) will detect and isolate any fixed effect, (2) will account for the impact of the number of deliveries, (3) will account for the impact of miles, and (4) will filter out the overlapping effects of miles and deliveries.
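The model in Approach #5 can be estimated by ordinary least squares. The following is a minimal sketch, not the spreadsheet procedure used in the course: it builds the normal equations by hand and solves them with a small Gauss-Jordan routine so that it needs nothing beyond the Python standard library. The helper names solve and ols are introduced here for illustration.

```python
miles = [500, 250, 500, 500, 250, 400, 375, 325, 450, 450]
deliveries = [4, 3, 4, 2, 2, 2, 3, 4, 3, 2]
hours = [11.3, 6.8, 10.9, 8.5, 6.2, 8.2, 9.4, 8.0, 9.6, 8.1]


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares fit of y on an intercept plus the given regressor columns."""
    rows = [[1.0] + list(values) for values in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot


beta, r_squared = ols([miles, deliveries], hours)
print("intercept, miles, deliveries:", [round(b, 4) for b in beta])
print("R squared:", round(r_squared, 4))
```

For this data set, the estimates should agree with the spreadsheet summary output for the same model (an intercept near 1.13, a miles coefficient near 0.012, a deliveries coefficient near 0.92, and an R squared near 0.90).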
Regression model: $Time_i = \beta_0 + \beta_1 (miles_i) + \beta_2 (deliveries_i) + u_i$

Estimated regression model: $\widehat{Time}_i = \hat{\beta}_0 + \hat{\beta}_1 (miles_i) + \hat{\beta}_2 (deliveries_i)$

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.950678166
R Square             0.903788975
Adjusted R Square    0.876300111
Standard Error       0.573142152
Observations         10

ANOVA
             df   SS            MS            F             Significance F
Regression   2    21.60055651   10.80027826   32.87836743   0.00027624
Residual     7    2.299443486   0.328491927
Total        9    23.9

               Coefficients   Standard Error   t Stat        P-value       Lower 95%      Upper 95%
Intercept      1.131298533    0.951547725      1.188903619   0.273240329   -1.118752683   3.38134975
X Variable 1   0.01222692     0.001977699      6.182396959   0.000452961   0.007550408    0.016903431
X Variable 2   0.923425367    0.221113461      4.176251251   0.004156622   0.400575489    1.446275244

$\hat{\beta}_0 = 1.13$ (0.952) [0.2732]
$\hat{\beta}_1 = 0.01$ (0.002) [0.0005]
$\hat{\beta}_2 = 0.92$ (0.221) [0.0042]
$R^2 = 0.90$

Standard deviations of parameter estimates and p-values are typically shown in parentheses and brackets, respectively, near the parameter estimates.

Notes on results:
1. Constant is not significantly different from zero.
2. Slope coefficients are significantly different from zero.
3. Variation in miles and deliveries, together, accounts for 90% of the variation in time.

The parameter estimates are measures of the marginal impact of the explanatory variables on the outcome variable.
Marginal impact measures the impact of one explanatory variable after the impacts of all the other explanatory variables are filtered out.

Marginal impacts of explanatory variables:
0.01 = increase in time (hours) given an increase of 1 mile traveled.
0.92 = increase in time (hours) given an increase of 1 delivery.

Causal vs. Exploratory Analysis

The goal of exploratory analysis is to obtain a measure of a phenomenon.

Example: Subjects are given a new breakfast cereal to taste and asked to rate the cereal. The measured phenomenon is taste. Although taste is subjective, by taking the average of the measures from a large number of subjects, we can measure the underlying objective components that give rise to the subjective feeling of taste.

The goal of causal analysis is to obtain the change in measure of a phenomenon due to the presence vs. absence of a control variable.

Example: Two groups of subjects are given the same breakfast cereal to taste and are asked to rate the cereal. One group is given the cereal in a black and white box, the other in a multi-colored box. The two groups of subjects exist under identical conditions (same cereal, same testing environment, etc.), with the exception of the color of the cereal box. Because the color of the cereal box is the only difference between the two groups, we call the color of the box the control variable.

If we find a difference in subjects’ reported tastes, then we know that the difference in perceived taste is due to the color (or lack of color) of the cereal box.

It is possible that one group of subjects reports liking the cereal and the other does not for a reason other than random chance or the control variable (e.g., one group was tested in the morning and the other in the evening). We would call this a confound. A confound is the presence of an additional (and unwanted) difference in the two groups.
When a confound is present, it makes it difficult (perhaps impossible) to determine how much of the difference in reported taste between the two groups is due to the control and how much is due to the confound.

Because the techniques for causal and exploratory analysis are identical (with the exception that causal analysis includes the use of a control variable whereas exploratory analysis does not), we will limit our discussion to causal analysis.

Designing Survey Instruments

The Likert Scale

We use the Likert scale to rate responses to qualitative questions.

Example: “Which of the following best describes your opinion of the taste of Coke?”
Too Sweet (1)   Very Sweet (2)   Just Right (3)   Slightly Sweet (4)   Not Sweet (5)

The Likert scale elicits more information than a simple “Yes/No” response: the analyst can gauge the degree rather than simply the direction of opinion.

Rules for Using the Likert Scale

1. Use 5 or 7 gradations of response. Fewer than 5 yields too little information; more than 7 creates too much difficulty for respondents in distinguishing one response from another.
2. Always include a mid-point (or neutral) response.
3. When appropriate, include a separate response for “Not applicable” or “Don’t know.”
4. When possible, include a descriptor with each response rather than simply a single descriptor on each end of the scale.

Example:
Yes:  Very Bad (1)   Bad (2)   Neutral (3)   Good (4)   Very Good (5)
No:   Very Bad (1)   (2)   (3)   (4)   Good (5)

The presence of the lone words at the ends of the scale will introduce a bias by causing subjects to shun the center of the scale.

5.
Use the same words and (where possible) the same number of words for each descriptor.

Example:
Yes:  Very Bad (1)   Bad (2)   Neutral (3)   Good (4)   Very Good (5)
No:   Bad (1)   Poor (2)   OK (3)   Better (4)   Best (5)

When using different words for different descriptors, subjects may perceive varying quantities of difference between points on the scale. For example, subjects may perceive that the difference between “Bad” and “Poor” is less than the difference between “Poor” and “OK.”

6. Avoid using zero as an endpoint on the scale.

Example:
Yes:  Very Bad (1)   Bad (2)   Neutral (3)   Good (4)   Very Good (5)
No:   Very Bad (0)   Bad (1)   Neutral (2)   Good (3)   Very Good (4)

On average, subjects will associate the number zero with “bad.” Thus, using zero at the endpoint of the scale can bias subjects away from the side of the scale with the zero.

7. Avoid using unbalanced negative numbers.

Example:
Yes:  Very Bad (-2)   Bad (-1)   Neutral (0)   Good (1)   Very Good (2)
No:   Very Bad (-3)   Bad (-2)   Neutral (-1)   Good (0)   Very Good (1)

Subjects associate negative numbers with “bad.” If you have more negative numbers on one side of the scale than the other, subjects will be biased away from that side of the scale.

8. Keep the descriptors balanced.

Example:
Yes:  Very Bad (1)   Bad (2)   Neutral (3)   Good (4)   Very Good (5)
No:   Very Bad (1)   Bad (2)   Slightly Good (3)   Good (4)   Very Good (5)

Subjects will be biased toward the side with more descriptors.

9. Arrange the scale so as to maintain (1) symmetry around the neutral point, and (2) consistency in the intervals between points.
Example:
Yes:  Very Bad (1)     Bad (2)     Neutral (3)     Good (4)     Very Good (5)
No:   Very Bad (1)                 Bad (2)   Neutral (3)   Good (4)   Very Good (5)
No:   Very Bad (1)           Bad (2)   Neutral (3)   Good (4)           Very Good (5)

In the second example, subjects perceive the difference between “Neutral” and “Very Bad” to be greater than the difference between “Neutral” and “Very Good.” Responses will be biased toward the right side of the scale.

In the third example, subjects perceive the difference between “Very Bad” and “Bad” to be greater than the difference between “Bad” and “Neutral.” Responses will be biased toward the center of the scale.

10. Use multi-item scales for ill-defined constructs.

Example:
Yes:
“I liked the product.”
Strongly Agree (1)   Agree (2)   Neutral (3)   Disagree (4)   Strongly Disagree (5)
“I am satisfied with the product.”
Strongly Agree (1)   Agree (2)   Neutral (3)   Disagree (4)   Strongly Disagree (5)
“I believe that this is a good product.”
Strongly Agree (1)   Agree (2)   Neutral (3)   Disagree (4)   Strongly Disagree (5)

No:
“I liked the product.”
Strongly Agree (1)   Agree (2)   Neutral (3)   Disagree (4)   Strongly Disagree (5)

Ill-defined constructs may be interpreted differently by different people. Use the multi-item scale (usually three items) and then average the items to obtain a single response for the ill-defined construct.

Example: The ill-defined construct is product satisfaction. We construct three questions, each of which touches on the idea of product satisfaction. A subject gives the following responses:

“I liked the product.”                      4
“I am satisfied with the product.”          4
“I believe that this is a good product.”    3

Average response for product satisfaction is 3.67.
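The averaging step for a multi-item scale is simple enough to show directly. The sketch below reuses the subject's three responses from the example; the variable names are illustrative.

```python
from statistics import mean

# One subject's responses to the three product-satisfaction items.
responses = {
    "I liked the product.": 4,
    "I am satisfied with the product.": 4,
    "I believe that this is a good product.": 3,
}

# The multi-item score is the average across the items.
product_satisfaction = mean(list(responses.values()))
print(round(product_satisfaction, 2))  # 3.67
```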
Be careful that the items in a multi-item scale all measure the same ill-defined construct.

Yes:
“I liked the product.”
“I am satisfied with the product.”
“I believe that this is a good product.”

No:
“I liked the product.”
“I am satisfied with the product.”
“I will purchase the product.”

The statement “I will purchase the product” includes the consideration of “price,” which the other two questions do not.

11. Occasionally, it is useful to verify that the subjects are giving considered (as opposed to random) answers. To do this, ask the same question more than once at different points in the survey. Look at the variance of the responses across the multiple instances of the question. If the subject is giving considered answers, the variance should be small.

12. Avoid self-referential questions.

Yes: “How do you perceive that others around you feel right now?”
No: “How do you feel right now?”

Self-referential questions elicit bias because they encourage the respondent to answer subsequent questions consistently with the self-referential question.

Example: If we ask the subject how he feels and he responds positively, then his subsequent answers will be biased in a positive direction. The subject will, unconsciously, attempt to behave consistently with his reported feelings.

Exception: You can ask a self-referential question if it is the last question in the survey. As long as the subject does not go back and change previous answers, there is no opportunity for the self-reference to bias the subject’s responses.
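The consistency check in Rule 11 can be sketched as a variance calculation over one subject's answers to the repeated question. Everything specific below is an assumption made for illustration: the two sets of answers are hypothetical, and the cutoff of 1.0 is an arbitrary threshold, since the rule only says the variance "should be small."

```python
from statistics import pvariance


def repeated_question_variance(answers):
    """Variance of one subject's answers to the same question asked several times."""
    return pvariance(answers)


def looks_considered(answers, max_variance=1.0):
    """Flag answers as considered when their variance is at most max_variance.

    max_variance is a hypothetical cutoff chosen for this sketch.
    """
    return repeated_question_variance(answers) <= max_variance


# Hypothetical answers on a 5-point scale to a question asked three times.
steady_subject = [4, 4, 5]
erratic_subject = [1, 5, 3]

for label, answers in (("steady", steady_subject), ("erratic", erratic_subject)):
    print(label, round(repeated_question_variance(answers), 2), looks_considered(answers))
```

In practice the cutoff would depend on the number of scale points and on how many times the question is repeated.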
Designing Survey Instruments

Example: We want to test the effect of relevant news on purchase decisions. Specifically, we want to know if the presence of positive news about a low-cost product increases the probability of consumers purchasing that product.

Causal Design: We will expose two groups of subjects to news announcements about aspirin. The control group will see a neutral announcement that says nothing about the performance of aspirin. The experimental group will see a positive announcement that says that aspirin has positive health benefits.

After exposure to the announcements, we will ask each group to rate their attitudes toward aspirin. Our null hypothesis is that there is no difference in the average attitudes toward aspirin between the two groups.

To account for possible preconceptions about aspirin, before we show the subjects the news announcements, we will ask how frequently they take aspirin. To account for possible gender effects, we will also ask subjects to report their genders.

All subjects are first asked to respond to these questions:

How often do you take aspirin?
Infrequently (1)   (2)   (3)   Occasionally (4)   (5)   (6)   Frequently (7)

Please identify your gender (M/F).

Subjects in the control group see this news announcement. The analyst reads the headline and the introductory paragraph. Subjects in the control group are then asked to answer this question.

Please rate your attitude toward aspirin.
Unfavorable (1)   (2)   (3)   Neutral (4)   (5)   (6)   Favorable (7)

Subjects in the experimental group see this news announcement. The analyst reads the headline and the introductory paragraph. Subjects in the experimental group are then asked to answer this question.

Please rate your attitude toward aspirin.
Unfavorable (1)   (2)   (3)   Neutral (4)   (5)   (6)   Favorable (7)

Statistical inference within the context of classical regression analysis requires that the researcher test a hypothesis against a sample data set.

Example
1. Hypothesize that the following relationship exists: $Y = \beta_1 X_1 + \beta_2 X_2 + u$
2. Determine the likelihood of observing the sample data assuming that the hypothesis is correct.

The process requires the specification of a hypothesized model: the theory leads the data.

However, in many applications, there is no hypothesized model because there is no (or incomplete) theory: the data leads the theory.

Example: Clinical data, financial data

In these instances, stepwise regression procedures are typically employed.

Stepwise Regression

Begin with a set of K candidate regressors. “Smartly” select regressors so as to achieve the highest adjusted $R^2$.

For a set of K candidate regressors, the number of regression models that can be constructed is $2^K - 1$.

Ideally, the researcher would look at all $2^K - 1$ models and select the “best” model. In practice, the number of models becomes prohibitively large very quickly. With 40 candidate regressors, one can construct more than 1 trillion models. It would take 100 years to estimate every one of the models.

[Figure: number of models (log scale, 1 to 1,000,000,000,000,000) versus number of candidate regressors (0 to 50).]

As K increases, the time required to run all $2^K - 1$ models increases exponentially.
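The model count is easy to verify, and the 100-year figure implies an estimation speed that can be backed out with one division. This is a rough sketch: the function name model_count is illustrative, and the implied rate is only what the two quoted figures suggest when taken at face value, not a number given in the course materials.

```python
def model_count(k):
    """Number of non-empty subsets of k candidate regressors: 2^k - 1."""
    return 2 ** k - 1


SECONDS_PER_YEAR = 365.25 * 24 * 3600

for k in (10, 20, 30, 40):
    print(f"K={k}: {model_count(k):,} models")

# Estimation speed implied by "100 years for 40 candidate regressors".
implied_models_per_second = model_count(40) / (100 * SECONDS_PER_YEAR)
print(f"implied speed: about {implied_models_per_second:.0f} models per second")
```

Each additional candidate regressor roughly doubles the count, which is why the time required grows so quickly.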
The practical limit (running on a single computer) is 25 to 30 candidate regressors.

[Figure: hours to estimate all possible models (log scale) versus number of candidate regressors (1 to 43). Annotation: 100 years to estimate all possible models that can be constructed from 40 candidate factors.]

Stepwise Regression

To avoid looking at all $2^K - 1$ models, stepwise methods employ heuristics to arrive at a “good” model in a reasonable amount of time.

Problems:
1. The likelihood of stepwise methods failing to find the best model rises as the number of candidate regressors rises. For K > 7, stepwise methods almost always fail to find the best model (according to stepwise’s definition of “best”).
2. Which model stepwise returns usually depends on the (arbitrary) starting point.
3. The likelihood of stepwise methods finding a spurious model rises with K.
4. Stepwise provides no information as to the quality of the returned model relative to other possible models.

Stepwise methods select a starting model and then iteratively adjust the model in an attempt to find better models.

[Figure: the set of all $2^K - 1$ possible models, shaded by quality of model (Bad, Poor, Better, Good, Best), with four stepwise starting points labeled A, B, C, and D.]

Of the four starting points shown, only with starting point D will stepwise methods discover the best model.

Alternative Approach: All Subsets or “Exhaustive” Regression

Find the “best” model by estimating all $2^K - 1$ possible models.

Problem: How to define “best”? Typically, highest adjusted $R^2$. But, even for relatively small K (e.g., K > 10), the likelihood of the model with the highest adjusted $R^2$ being spurious is high.

Question: The $R^2$ measure is a “within model” measure as it is based on information contained in a specific model.
Can a measure be constructed that utilizes information “across models” in an attempt to guard against spurious results?

Conclusion: Employing assumptions weaker than those of the Classical Linear Model, we can obtain unbiased parameter estimates via taking the mean of parameter estimates across models.

Question: What is the distribution of parameter estimates across models?

Construct a “cross-model test statistic” for each of the K candidate regressors.

$H_0: \beta_i = 0$
$H_a: \beta_i \neq 0$

$c_i = \sum_{j=1}^{2^{K-1}} \left( \frac{\hat{\beta}_{i,j}}{s_i} \right)^2 \sim \chi^2_{2^{K-1}}$

where
$\hat{\beta}_{i,j}$ = estimate of the i-th coefficient derived from the j-th subset of candidate regressors.
$s_i$ = standard deviation of the i-th coefficient across the subsets of candidate regressors.
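The enumeration behind all-subsets regression, and the idea of averaging a coefficient across the models that contain it, can be sketched on the trucking data used earlier (K = 2, so only three models). This is an illustration under stated assumptions, not the author's procedure: it does not compute the cross-model test statistic, and the helper names solve and fit are introduced here.

```python
from itertools import combinations

data = {
    "miles": [500, 250, 500, 500, 250, 400, 375, 325, 450, 450],
    "deliveries": [4, 3, 4, 2, 2, 2, 3, 4, 3, 2],
}
hours = [11.3, 6.8, 10.9, 8.5, 6.2, 8.2, 9.4, 8.0, 9.6, 8.1]


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(names):
    """Regress hours on an intercept plus the named regressors.

    Returns the slope estimates keyed by regressor name and the adjusted R squared.
    """
    n = len(hours)
    rows = [[1.0] + [data[name][i] for name in names] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, hours)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(hours) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(hours, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in hours)
    adjusted_r2 = 1 - (ss_res / (n - k)) / (ss_tot / (n - 1))
    return dict(zip(names, beta[1:])), adjusted_r2


# Estimate every non-empty subset of the candidate regressors.
models = {
    subset: fit(subset)
    for size in range(1, len(data) + 1)
    for subset in combinations(data, size)
}
best = max(models, key=lambda subset: models[subset][1])

# Average each coefficient across the models in which it appears.
cross_model_mean = {
    name: sum(models[s][0][name] for s in models if name in s)
    / sum(1 for s in models if name in s)
    for name in data
}

print("models estimated:", len(models))
print("highest adjusted R squared:", best, round(models[best][1], 4))
print("cross-model mean coefficients:", cross_model_mean)
```

With only two candidate regressors the exhaustive search is trivial; the same loop applies to larger K, subject to the practical limits on the number of models noted earlier.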