Stat 100 Learning Objectives

Data Collection and Surveys
A. Identify the population of interest, parameter, sample, and statistic in a study.
B. Distinguish between an observational study and an experiment.
C. Identify whether a probability sampling method or a non-probability sampling method was used to obtain the study data.
D. Determine if the probability method used to obtain data was a simple random sample, stratified, or cluster.
E. Explain the difference between random assignment and random selection and how these two concepts affect inference.
F. Given a study, recognize if potential biases such as response, nonresponse, or selection exist.
G. Calculate an approximate margin of error for a sample survey.

Graphing
A. Decide what graphs are appropriate for displaying quantitative and categorical variables.
B. Identify the shape of a distribution of data (right skew, left skew, or symmetric) when presented with a histogram.
C. Given a variable of interest, identify whether the variable is categorical (binary, ordinal, nominal) or quantitative (discrete, continuous).
D. From a numerical description of a variable, predict what shape the histogram would most likely take.
E. Identify potential outliers in a histogram or boxplot.
F. Identify errors in graphing such as not starting at zero, changing axis units, and mislabeling an axis.

Numeric Summarization
A. Calculate the five number summary for a set of data.
B. Explain how mean and median are related for different distribution shapes (right skew, left skew, and symmetric).
C. Describe how outliers affect various numerical summaries (mean, median, range, and standard deviation).
D. Use quartiles to find the Interquartile Range (IQR) and identify outliers using the 1.5*IQR technique.
E. Apply the Empirical Rule (68-95-99.7% Rule) to a symmetric set of data.
F. Calculate z-scores for a symmetric set of data and interpret what each z-score means in terms of the standard deviation.

Basic Probability
A. Understand the difference between subjective, relative frequency, and classical probabilities and be able to identify which approach was used to assign a probability in a given scenario.
B. Identify from a probability scenario events that are simple, complementary, mutually exclusive, and independent.
C. Explain the difference between events that are mutually exclusive and events that are independent.
D. Correctly apply the multiplication rule for two independent events, the addition rule for the union of two events, and the complement rule.
E. Determine if a given set of probabilities for all possible outcomes is legitimate by checking that each outcome probability is a number from zero to one, inclusive, and that the sum of all outcome probabilities is one.

Normal Distribution
A. When presented with the mean, standard deviation, and observed value for a normally distributed set of data, find the z-score and use the Standard Normal Table to find percentiles.
B. Calculate the observed score that corresponds to a given percentile, given the mean and standard deviation of a normally distributed set of data.
C. Use the Standard Normal Table to find the probability/proportion/percentage of observation values that would fall above/below a given z-score.

Sampling Distributions
A. Be able to identify if the sample mean or sample proportion satisfies the conditions to apply normal distribution methods.
B. Calculate the z-score for the sample mean and sample proportion when presented with summary statistics.
C. Identify if the sample mean or sample proportion would better describe the sample distribution for a given study.

Confidence Intervals
A. Correctly apply a given margin of error to find the confidence interval for a parameter.
B. Explain how sample size, level of confidence, and standard deviation can affect the width of confidence intervals (one proportion, one mean, and two mean difference).
C. Be able to identify which confidence interval (one proportion, one mean, and two mean difference) would be most appropriate to apply to a given study.
D. Explain the difference between standard error and margin of error.
E. Provide a correct interpretation for a given confidence interval (one proportion, one mean, and two mean difference).
F. Calculate the appropriate standard error and margin of error for a confidence interval (one proportion, one mean, and two mean difference) when given the level of confidence, sample statistics, and sample size.
G. Identify the appropriate parameter of interest being estimated by a given confidence interval.

Hypothesis Testing
A. Correctly identify the appropriate null and alternative hypotheses, including one or two sided, for a given study objective.
B. From a given set of summary statistics, calculate the Z test statistic and p-value, and make the appropriate statistical decision for one proportion, one mean, and two mean difference.
C. Make the correct statistical decision for a given study objective.
D. Explain contextually that a small p-value is the result of the sample producing a sample statistic with a large difference from the hypothesized value, and that this difference was unlikely to be obtained if the hypothesized value was true.
E. Be able to explain the relationship between a confidence interval and a two-sided hypothesis test when making statistical decisions.
F. Given a set of p-values, identify which p-value provides the strongest/weakest evidence against the null hypothesis.
G. Apply the appropriate hypothesis steps for conducting tests of significance for one proportion, one mean, and two mean difference.
H. Identify correctly what the Type I and Type II errors would be when presented with the results of a statistical study.
I. Recognize the relationship between test statistics and p-values: as test statistics move away from zero, the p-value will decrease.

Correlation and Simple Linear Regression
A. Visually determine from a scatterplot if a potential linear relationship is negative or positive.
B. Set up the correct hypotheses, in words, to test if a linear relationship exists between two quantitative variables.
C. Identify from a linear regression equation the response (outcome) and predictor (explanatory) variables, the slope, and the y-intercept.
D. Use a given linear regression equation to find the predicted value for a new observation.
E. Identify whether an unusual point in a scatterplot would be classified as an outlier or an influential outlier.
F. Determine from a given set of correlation values which correlation indicates the strongest/weakest linear relationship between two quantitative variables.
G. Know the risk of extrapolation when applying a linear regression equation to data outside the range of the observed x-values.
H. When making statistical decisions regarding correlation, be able to explain that the results do not imply causation.
I. Determine the direction of the slope of a regression equation from a given correlation, and vice versa. That is, understand that the slope and correlation share the same sign.
J. Understand that a change in units does not affect the correlation value between two quantitative variables.
K. When presented with a study of the linear relationship between two quantitative variables, identify which variable is the response and which is the predictor.

Analyzing Two Categorical Variables (2 x 2)
A. When given a 2x2 table, calculate the risk, relative risk, odds, and odds ratio.
B. Provide a proper interpretation of a relative risk and the importance the baseline risk plays in this interpretation.
C. Set up the correct hypothesis statements for a Chi-square test of independence between two categorical variables.
D. Identify possible lurking or confounding variables that may affect the statistical results in a study of two categorical variables.
E. Explain the principle of Simpson's Paradox in a study of two categorical variables.
F. Understand what is meant by the expected cell counts and how these are used to calculate the Chi-square test statistic.

Experiments
A. Determine if a given scenario uses random assignment.
B. Explain what random assignment means with regard to conducting an experiment, and that random assignment allows the researcher to infer causation.
C. Explain what role blinding the subjects, the researcher(s), or both plays in the results of an experiment, and which one is applied to avoid placebo and experimenter effects.

Miscellaneous
A. Explain the general concept of a meta-analysis, as well as the advantages and caveats of conducting such a study. This includes the file-drawer problem, the value of increased sample size, incorrectly combining studies, and the potential for Simpson's Paradox.
B. Correctly identify the appropriate statistical test that should be applied to analyze a study, choosing from one proportion, one mean, two mean difference, correlation/linear regression, and Chi-square test of independence.
C. Correctly identify and use appropriate statistical notation of a study value as either a parameter or statistic.
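
Worked illustration (not part of the official objectives): objectives A and F under Analyzing Two Categorical Variables (2 x 2) ask for risk, relative risk, odds, odds ratio, and expected cell counts. The sketch below shows one way these quantities might be computed in Python. The table counts, the group labels "exposed" and "unexposed", and the function names are all hypothetical and chosen only for illustration; the unexposed group is assumed to be the baseline, and no continuity correction is applied to the Chi-square statistic.

```python
# Hypothetical 2x2 table with made-up counts (illustration only).
# Rows are the two groups; columns are the outcome (yes / no).
table = {
    "exposed":   {"yes": 30, "no": 70},
    "unexposed": {"yes": 15, "no": 85},
}


def risk(row):
    """Risk = number with the outcome / total number in the group."""
    return row["yes"] / (row["yes"] + row["no"])


def odds(row):
    """Odds = number with the outcome / number without the outcome."""
    return row["yes"] / row["no"]


def expected_counts(tbl):
    """Expected count per cell = (row total * column total) / grand total."""
    cols = ["yes", "no"]
    row_totals = {r: sum(tbl[r][c] for c in cols) for r in tbl}
    col_totals = {c: sum(tbl[r][c] for r in tbl) for c in cols}
    n = sum(row_totals.values())
    return {r: {c: row_totals[r] * col_totals[c] / n for c in cols} for r in tbl}


def chi_square(tbl):
    """Chi-square statistic = sum over cells of (observed - expected)^2 / expected."""
    exp = expected_counts(tbl)
    return sum(
        (tbl[r][c] - exp[r][c]) ** 2 / exp[r][c]
        for r in tbl
        for c in tbl[r]
    )


# The unexposed group is treated as the baseline (an assumption of this sketch).
relative_risk = risk(table["exposed"]) / risk(table["unexposed"])
odds_ratio = odds(table["exposed"]) / odds(table["unexposed"])
expected = expected_counts(table)
chi_sq = chi_square(table)

print("relative risk:", round(relative_risk, 2))
print("odds ratio:", round(odds_ratio, 2))
print("expected counts:", expected)
print("chi-square statistic:", round(chi_sq, 2))
```

With these made-up counts the relative risk is about 2.0, the odds ratio is about 2.43, and the Chi-square statistic is about 6.45. The relative risk should still be read against the baseline risk of 0.15, as objective B in the same section points out.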