MODULE 1

Statistics - the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making more effective decisions.

Phases of Statistics:
1. Descriptive statistics - methods concerned with organizing, presenting, summarizing, and analyzing a set of data without drawing conclusions or inferences about a population.
2. Inferential statistics - methods concerned with the analysis of sample data leading to predictions or inferences about the population.

Basic Vocab of Statistics:
Population - the collection, or set, of individuals, objects, or events whose properties are to be analyzed
Sample - a portion, part, or subset of the population of interest
Parameter - a numerical characteristic of the population
Statistic - a numerical characteristic of the sample
Variable - a characteristic of an item or individual element in a population or sample
Data - the different values associated with a variable

Types of data:
1. Qualitative data (categorical) - data whose values can only be placed into categories
2. Quantitative data (numerical) - data that can be expressed in numbers
   - Discrete - counted; whole numbers
   - Continuous - measured; can take decimal values

Levels of measurement:
1. Nominal scale - no implied ranking of categories
2. Ordinal scale - the categories of a variable can be ranked
3. Interval scale - has the properties of identity, order, and equality of scale, but no absolute zero; multiples of measures are not meaningful
4. Ratio scale - has the properties of identity, order, equality of scale, and absolute zero; multiples of measures are meaningful

Sources of Data:
1. Primary source - the data collector is the one using the data for analysis
2. Secondary source - the person performing the data analysis is not the data collector

Sampling Techniques:
1. Probability sampling - every element in the population is given a chance of being included in the sample
   - Simple random sampling
   - Systematic sampling - a method of selecting a sample by taking every kth unit from an ordered population, the first unit being selected at random
   - Stratified random sampling - the population is divided, or stratified, into more or less homogeneous subpopulations (strata) before sampling is done
   - Cluster sampling - the population is divided into several "clusters," each representative of the population

MODULE 2

Pie chart - a circle divided into sectors in such a way that the area of each sector is proportional to the size of the quantity it represents
Column chart - a series of rectangular bars where the length of each bar represents the magnitude to be shown
Bar chart - used when data labels are long or there are too many data sets to display
Scatter plot - gives a visual picture of the relationship between two variables, and aids the interpretation of the correlation coefficient or regression model

Tabular presentation - the process of condensing classified data and arranging them systematically in rows and columns

Common Types of Tables:
1. Summary Table for Categorical Data - a form of frequency distribution table (FDT) where observations are classified based on categorical names
2. Single-value Grouping for Numerical Data - a form of frequency distribution table where distinct values are used as classes

Steps in Constructing an FDT:
1. Determine an adequate number of class intervals (K), usually between 5 and 20. Suggested formulas: K = √n, or K = 1 + 3.322 log n (base-10 logarithm), where n is the sample size.
2. Determine the range (R). R = highest value - lowest value
3. Compute the class width (c). c = R/K. Round off c to a value that is easy to work with. (Suggested rule: c must have the same number of decimal places as the original data.)
4. List the class intervals.

MODULE 5

Level of significance - the maximum probability with which we are willing to commit a Type I error
- Common values: 0.05 (significant), 0.01 (highly significant)

Test statistic - a statistic computed from the sample on which the decision to reject or not to reject H0 is based
- If the computed test statistic falls in the rejection region, H0 is rejected.

Critical or Rejection Region - the part of the set of all possible values of a test statistic for which H0 is rejected
- The size of the critical region is determined by the alpha level.
- Sample data that fall in the critical region warrant the rejection of the null hypothesis.

Critical Value - the boundary between the rejection region and the nonrejection region
- A value from a z- or t-table

Example: Suppose the null hypothesis, H0, is: Frank's rock climbing equipment is safe.
Type I error: Frank concludes that his rock climbing equipment may not be safe when, in fact, it really is safe.
Type II error: Frank concludes that his rock climbing equipment is safe when, in fact, it is not safe.
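
To see how the level of significance, test statistic, critical value, and rejection region fit together, here is a minimal Python sketch. It is not from the notes: it assumes a two-tailed one-sample z-test with a known population standard deviation, and the function name `z_test_two_tailed` and all the numbers are hypothetical, chosen only for illustration. The critical value is computed from the standard normal distribution instead of being read from a z-table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_tailed(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, critical_value, reject_h0) for a two-tailed one-sample z-test.

    Assumes the population standard deviation `sigma` is known.
    """
    # Test statistic: how many standard errors the sample mean is from mu0.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Critical value: boundary of the rejection region, alpha/2 in each tail.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # H0 is rejected when the test statistic falls in the rejection region.
    reject = abs(z) > critical
    return z, critical, reject


# Hypothetical data: H0 says the population mean is 50.
z, critical, reject = z_test_two_tailed(
    sample_mean=52.0, mu0=50.0, sigma=5.0, n=25, alpha=0.05
)
print(f"z = {z:.2f}, critical value = ±{critical:.2f}, reject H0: {reject}")
```

With these made-up numbers the test statistic is 2.00 and the critical value is about 1.96, so H0 is rejected at the 0.05 level. At the stricter 0.01 level the critical value rises to about 2.58 and the same data would not lead to rejection, which shows how the alpha level sets the size of the critical region.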