Statistical Methods/Techniques and Their Applications

Situation: Trying to determine if the population means of 2 groups are significantly different from one another. The observations are interval data.
Statistical Tool: t-test for two samples
    independent samples, equal variances
    independent samples, unequal variances
    matched pairs

Situation: Trying to determine if 2 population proportions are significantly different from one another. The observations are nominal data (each observation either falls in the category of interest or does not).
Statistical Tool: z-test for two population proportions

Situation: Trying to determine if observed frequencies (in a fixed number of categories) differ from expectations (from past data or from theory). The observations fall into nominal categories. We are counting the number of observations in each category.
Statistical Tool: Chi-Squared goodness of fit test

Situation: Are two classifications of a population of nominal data independent of one another? A two-dimensional contingency table is used that holds the count of observations in each cell.
Statistical Tool: Chi-Squared test for independence

Situation: We are trying to determine whether or not there is a linear relationship (or correlation) between two interval variables. We want to predict one variable from one other variable, if possible.
Statistical Tool: Simple linear regression

Situation: We are trying to determine whether or not the means of a number of (k) groups are significantly different from one another. The observations are interval data.
Statistical Tool: One-way ANOVA

Situation: We are interested in examining the effects of two factors on an interval response variable (the observations).
    Is there an AB interaction effect? Is the response for factor A dependent on the level of factor B, and vice-versa? (The interaction should be analyzed first.)
    Is there a factor A effect? Are the means of the groups significantly different for factor A?
    Is there a factor B effect? Are the means of the groups significantly different for factor B?
Statistical Tool: Two-factor ANOVA or GLM ANOVA

Situation: We are trying to predict or estimate the value of one variable (interval) from several other variables (interval or categorical). The relationship between the independent variables (iv) and the dependent variable (dv) can be linear or non-linear.
Statistical Tool: Multiple Regression

If the analyst believes there could be a quadratic relationship between an iv and the dv, a quadratic term should be included and tested. A quadratic relationship means that at middle values of the iv, the dv values are especially high (or low). If the scatter plot of the dv versus an iv shows this kind of curved pattern, we should likely test a quadratic term for significance, using a t test.

If the analyst would like to determine whether interaction terms are appropriate, these terms should be included and tested using a t test. If the interaction is significant, do not interpret the individual variables in isolation.

Example: Multiple Regression model with 2 predictor variables (one of which we believe has a quadratic relationship with the dv), and interaction.

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_1 x_2 + \varepsilon$$

Example: Multiple Regression with one Interval variable and Indicator (Nominal/Categorical) variables representing 3 groups. Three groups require two indicator variables; the third group is the baseline.

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 I_1 + \beta_3 I_2 + \varepsilon$$

Situation: We have historical data and we want to forecast a single variable at some point(s) in the future. The data has a long-term upward or downward trend and seasonality.
Statistical Tool: Time-Series Analysis

$$F_t = \hat{y}_t \times SI_t$$

$$\text{Ratio}_t = \frac{y_t}{\hat{y}_t}$$

Here $\hat{y}_t$ is the trend value for time period t and $SI_t$ is the seasonal index for that period.

Assessing the Forecast or Forecasting Methods

$$MAD = \frac{\sum_{t=1}^{n} \left| y_t - F_t \right|}{n}$$

MAD = mean absolute deviation
$y_t$ = actual value of the time series at time period t
$F_t$ = forecasted value of the time series at time period t
n = number of time periods
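
The forecast and MAD calculations above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `seasonal_forecast` and `mean_absolute_deviation` and all of the numbers are hypothetical, and the trend values and seasonal indexes are assumed to have been estimated already (for example, from a trend regression and the ratios of actual to trend values).

```python
def seasonal_forecast(trend_values, seasonal_indexes):
    """Return F_t = trend value at t times the seasonal index at t."""
    if len(trend_values) != len(seasonal_indexes):
        raise ValueError("need one seasonal index per trend value")
    return [trend * si for trend, si in zip(trend_values, seasonal_indexes)]


def mean_absolute_deviation(actual, forecast):
    """Return MAD = sum(|y_t - F_t|) / n over n time periods."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one time period")
    total_abs_error = sum(abs(y - f) for y, f in zip(actual, forecast))
    return total_abs_error / len(actual)


# Hypothetical data for four time periods (assumed, not from the text).
trend_values = [100.0, 110.0, 120.0, 130.0]   # trend estimates
seasonal_indexes = [0.75, 1.25, 0.75, 1.25]   # seasonal index per period
actual = [80.0, 135.0, 92.0, 160.0]           # observed values y_t

forecast = seasonal_forecast(trend_values, seasonal_indexes)
mad = mean_absolute_deviation(actual, forecast)
print("Forecasts:", forecast)
print("MAD:", mad)
```

When comparing two forecasting methods on the same time periods, the method with the smaller MAD has the smaller average forecast error in absolute terms.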