[BG1201] STATISTICS I

Chapter 1

What is statistics?
Statistics is the science of collecting, organizing, presenting, analyzing and interpreting data to assist in making more effective decisions.

Types of statistics
• Descriptive statistics: methods of organizing, summarizing and presenting data in an informative way.
• Inferential statistics (inductive statistics): the methods used to find out something about a population based on a sample, that is, a decision, estimate, prediction or generalization about a population based on a sample.

Population
A population is the collection of all possible individuals, objects or measurements of interest.

Sample
A sample is a portion, or part, of the population of interest.

Types of variables (data)
1. Qualitative variable
The characteristic being studied is nonnumeric.
2. Quantitative variable
The variable can be reported numerically. Examples: the balance in my account, number of children.

A quantitative variable can be classified as
- Discrete variable: can only assume certain values, and there are usually “gaps” between values.
- Continuous variable: can assume any value within a specific range.

[Diagram: a variable is either qualitative or quantitative; a quantitative variable is either discrete or continuous.]

Levels of measurement
1. Nominal
Data can be classified into categories but cannot be arranged in an ordering scheme. Each value can be assigned a numeric code, where the numbers are simply labels. You can count nominal data but not order or measure them. Examples: eye color, gender.
2. Ordinal
Data can be arranged in some order or have a rating scale attached. You can count and order ordinal data, but not measure them. The differences between data values cannot be determined or are meaningless. Example: class standing of students (freshman, sophomore, junior, senior).
3. Interval
The next higher level.
It includes all characteristics of the ordinal level and, in addition, the difference between values is a constant size. Note that 0 is just a point on the scale; it does not represent the absence of the condition. Examples: temperature, GPA, score, …
4. Ratio
The highest level. It has all characteristics of the interval level and, in addition, the 0 point is meaningful and the ratio between two numbers is meaningful. Examples: salary, age, …

Chapter 2
Frequency Distribution and Graphic Presentation

Frequency distribution
A grouping of data into mutually exclusive categories showing the number of observations in each category.

Steps for organizing data into a frequency distribution
Step 1 Decide on the number of classes, usually between 5 and 15.
Step 2 Compute the class width:
$\text{Class width} = \dfrac{\text{Highest value} - \text{Lowest value}}{\text{Number of classes}}$
Step 3 Create non-overlapping classes. The smallest value is the lower class limit of the first class. Add the class width to find the lower class limit of the second class, and so on.
Step 4 Tally the data into the classes and count the number in each class.

Suggestions on constructing a frequency distribution
1. The class widths used in the frequency distribution should be equal.
2. Too many or too few classes might not reveal the basic shape of the data set.
3. Avoid overlapping stated class limits.
4. Try to avoid open-ended classes. They cause problems in graphing and in determining the measures of central tendency and dispersion described in Chapters 3 and 4.

Class limits
Lower limit = the lower end of the class.
Upper limit = the upper end of the class.

Midpoint
Also called the class mark, the midpoint is halfway between the lower and the upper class limits.
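The four steps for building a frequency distribution, together with the class midpoint just defined, can be sketched in Python. This is only an illustration under stated assumptions: the data values are made up, the function name `frequency_distribution` is hypothetical, the data are assumed to be whole numbers, and the class width is rounded up to the next whole number so that the highest value always falls inside the last class.

```python
import math


def frequency_distribution(data, num_classes):
    """Group whole-number data into equal-width, non-overlapping classes.

    Returns a list of (lower_limit, upper_limit, midpoint, frequency) tuples.
    """
    lowest, highest = min(data), max(data)

    # Step 2: class width = (highest - lowest) / number of classes.
    # Assumption: round up to the next whole number so every value,
    # including the highest, lands in some class.
    width = math.floor((highest - lowest) / num_classes) + 1

    classes = []
    for i in range(num_classes):
        # Step 3: the smallest value is the lower limit of the first class;
        # adding the width gives the lower limit of each following class.
        lower = lowest + i * width
        upper = lower + width - 1  # stated upper limit for whole-number data

        # Step 4: tally the values that fall in this class.
        frequency = sum(1 for x in data if lower <= x <= upper)

        # Midpoint (class mark): halfway between the two class limits.
        midpoint = (lower + upper) / 2
        classes.append((lower, upper, midpoint, frequency))
    return classes


# Hypothetical data set of 20 observations (not from the course notes).
data = [82, 95, 101, 110, 118, 123, 127, 131, 135, 139,
        142, 146, 150, 155, 161, 168, 174, 183, 192, 201]

# Step 1: choose the number of classes (between 5 and 15).
table = frequency_distribution(data, num_classes=6)

for lower, upper, midpoint, frequency in table:
    print(f"{lower}-{upper}  midpoint {midpoint}  frequency {frequency}")
```

Because the classes are built from stated limits that do not overlap, each observation is counted exactly once, so the frequencies add up to the number of observations.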
$\text{Midpoint} = \dfrac{\text{Lower limit} + \text{Upper limit}}{2}$

Actual class limit (class boundary) → actual lower limit, actual upper limit
$\text{Actual class limit} = \dfrac{\text{Upper limit of lower class} + \text{Lower limit of higher class}}{2}$

Cumulative frequency
“Less than” = the number of data items whose values are smaller than the upper boundary of a class.
“More than” = the number of data items whose values are larger than the lower boundary of a class.

Relative frequency
The ratio of the frequency of a class to the total frequency.

Percentage frequency
Relative frequency × 100 = ____%

Graphic presentation of a frequency distribution
There are three commonly used graphic forms:

Histogram
A graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars, and the bars are drawn adjacent to each other.
[Figure: histogram. Vertical axis: Frequency (0 to 14). Horizontal axis: Actual class limit (79.5 to 219.5 in steps of 20).]

Frequency polygon
Consists of line segments connecting the points formed by the class midpoints and the class frequencies.
[Figure: frequency polygon. Vertical axis: Frequency (0 to 14). Horizontal axis: 79.5 to 219.5 in steps of 20.]

Ogive (cumulative frequency distribution)
Used to determine how many data values are below or above a certain value.
[Figure: ogive with “less than” and “more than” curves. Vertical axis: Frequency (0 to 60). Horizontal axis: 79.5 to 219.5 in steps of 20.]

Chapter 3
Measures of Location

Measures of central tendency
A single value that summarizes a set of data. It locates the center of the values.

You are familiar with the concept of an average. For example, the average annual maintenance expense is $269 for a new car and $565 for a car more than one year old.

We will begin by discussing the most widely used and widely reported measure of central tendency, the arithmetic mean.

Population mean
The sum of all values in the population divided by the number of values in the population.
Any measurable characteristic of a population is called a parameter. The mean of a population is a parameter.

$\mu = \dfrac{x_1 + x_2 + \cdots + x_N}{N} = \dfrac{\sum x}{N}$

where
μ represents the population mean. It is the Greek lowercase letter “mu”.
N is the number of items in the population.
x represents any particular value.
Σ is the Greek capital letter “sigma” and indicates the operation of adding.

Sample mean
The mean of a sample and the mean of a population are computed in the same way, but the notation used is different. The mean of a sample, or any other measure based on sample data, is called a statistic.

$\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{\sum x}{n}$

where
$\bar{x}$ stands for the sample mean. It is read “x bar”.
The lowercase n is the number of items in the sample.

The arithmetic mean has several important properties:
1. Every set of interval-level and ratio-level data has a mean.
2. All the values are included in computing the mean.
3. A set of data has only one mean.

The mean does have several disadvantages, however. Recall that the mean uses the value of every item in the sample or population in its computation. If one or two of these values are either extremely large or extremely small, the mean might not be an appropriate average to represent the data.

The mean is also inappropriate if there is an open-ended class for data tallied into a frequency distribution. If a frequency distribution has the open-ended class “$100,000 and more”, we do not know whether the incomes in that class are close to $100,000, $500,000, or $10 million. Since we lack information about these incomes, the arithmetic mean income for this distribution cannot be determined.

Weighted mean
A special case of the arithmetic mean. When the data are not equally important, we can assign to each value a weight that is proportional to its relative importance and calculate the weighted mean.

$\bar{x} = \dfrac{\sum wx}{\sum w} = \dfrac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}$

Example
Amy gets quiz scores of 65, 83, 80 and 90 points.
She gets 92 points on her final examination. Find the mean score if the quizzes each count for 15% and the final counts for 40% of the final grade.

$\bar{x} = \dfrac{15(65) + 15(83) + 15(80) + 15(90) + 40(92)}{15 + 15 + 15 + 15 + 40} = \dfrac{8450}{100} = 84.5$

Calculator keystrokes:
Shift CLR 1 =
65 Shift ; 15 DT
83 Shift ; 15 DT
80 Shift ; 15 DT
90 Shift ; 15 DT
92 Shift ; 40 DT
Shift S-var 1 =    (result: 84.5)

Combined mean
$\bar{x} = \dfrac{\sum n\bar{x}}{\sum n} = \dfrac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots}{n_1 + n_2 + \cdots}$

Median
It has been pointed out that for data containing one or two very large or very small values, the arithmetic mean may not be a good measure of central tendency. For such cases, a different measure of central tendency that can better describe the data is the median.

Shape of the distribution
• Symmetric
• Asymmetric (skewed to the right or skewed to the left)

As the distribution becomes nonsymmetrical, or skewed, the relationship among the three averages (mean, median and mode) changes. In a positively skewed distribution, the arithmetic mean is the largest of the three averages. Why? Because the mean is influenced more than the median or mode by a few extremely high values. The mode is the smallest of the three.

Conversely, in a distribution that is negatively skewed, the mean is the lowest of the three averages. The mean is influenced by a few extremely low observations. The median is greater than the arithmetic mean, and the modal value is the largest.

**If the distribution is highly skewed, the mean should not be used to represent the data.**

Chapter 4

Why study dispersion?
The first reason: an average only locates the center of the data; it does not tell us anything about the spread of the data.

For example, if your nature guide told you that the river ahead averages 3 feet in depth, would you cross it without additional information? Probably not. You would want to know something about the variation in the depth. If the maximum depth of the river is 3.25 feet and the minimum is 2.75 feet, you would probably agree to cross.
Before making a decision about crossing the river, you want information on both the typical depth and the variation in the depth of the river.

The second reason is to compare the spread in two or more distributions.

A small value for a measure of dispersion indicates that the data are clustered closely around the mean. Conversely, a large measure of dispersion indicates that the data are scattered widely about their mean.

We will consider several measures of dispersion. The range is based on the location of the largest and the smallest values in the data set. The mean deviation, the variance, and the standard deviation are all based on deviations from the mean.

Range
The simplest measure of dispersion.
Range = Max - Min

Mean deviation
A serious defect of the range is that it is based on only two values, the highest and the lowest; it does not take all of the values into consideration. The mean deviation does. It measures the mean amount by which the values in a population, or sample, vary from their mean.

$\text{Mean deviation} = \dfrac{\sum |x - \bar{x}|}{n}$

where | | indicates the absolute value (the signs of the deviations from the mean are disregarded).

The mean deviation has two advantages. First, it uses all the values in the computation. Second, it is easy to understand: it is the average amount by which the values deviate from the mean. However, its major drawback is the use of absolute values. Generally, absolute values are difficult to work with, so the mean deviation is not used as frequently as other measures of dispersion, such as the standard deviation.

Variance and standard deviation are also based on the deviations from the mean.
Variance is the average of the squared deviations from the mean.
Standard deviation is the positive square root of the variance.

Population variance ($\sigma^2$, pronounced “sigma squared”)
$\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$

Why would we use the standard deviation when we already have the variance?
Because the standard deviation is a more easily interpreted measure. The variance is a squared quantity: it is an average of squared numbers. By taking its square root, we “unsquare” the unit and get a quantity expressed in the original unit of the problem. If the observations differ from the mean by one unit or more, the variance tends to be large because it is in squared units. The mathematical properties of the variance simplify some computations, but the standard deviation is more easily interpreted.

Population standard deviation ($\sigma$, pronounced “sigma”)
$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$

Sample variance
$s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$

Sample standard deviation
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$

Why is this seemingly insignificant change made in the denominator? Although the use of n is logical, it tends to underestimate the population variance, $\sigma^2$. The use of (n − 1) in the denominator provides the appropriate correction for this tendency. Because the primary use of sample statistics like $s^2$ is to estimate population parameters like $\sigma^2$, (n − 1) is preferred to n when defining the sample variance.

Some properties of the mean and the variance
1. If a fixed value d is added to or subtracted from each of the observations in the data, then
a. Mean of the new data = mean of the original data ± d
b. Variance of the new data = variance of the original data (unchanged)
2. If each observed value in the data is multiplied by a fixed constant c, then
a. Mean of the new data = c times the mean of the original data
b. Variance of the new data = $c^2$ times the variance of the original data

Relative dispersion
The coefficient of variation (CV) is very useful when
1. The data are in different units.
2. The data are in the same units, but the means are far apart (such as the incomes of the top executives and the incomes of the unskilled employees).

The CV is the ratio of the standard deviation to the mean, expressed as a percent.

$CV = \dfrac{SD}{\text{Mean}} (100) = \cdots\,\%$

The coefficient of variation is often used as a measure of risk; for instance, in investment, the CV
measures the variation of the returns (standard deviation) relative to the size of the mean return.

Skewness is the measurement of the lack of symmetry of the distribution.

Coefficient of skewness
$sk = \dfrac{\text{Mean} - \text{Median}}{\text{Standard deviation}}$

Chapter 5

Principles of counting
Counting is a mathematical technique that enables us to determine the number of possible ways an event can occur.

The multiplication formula
If there are M ways of doing one thing and N ways of doing another thing, there are M × N ways of doing both.

The addition formula
If there are M ways of doing one thing and N ways of doing another thing, there are M + N ways of doing either one but not both.

Factorial (n! is called “n factorial”)
The continued product of the first n natural numbers.
n! = n(n-1)(n-2)(n-3)…(3)(2)(1)

Combination
The number of ways to choose r objects from a group of n objects without regard to order. (Order is not important.)
${}_nC_r = \dfrac{n!}{r!\,(n - r)!}$

Probability
The probability of an event is the measure of the chance that the event will occur. (It describes the relative possibility that the event will occur.)
A probability can only assume a value between 0 and 1, or between 0% and 100%.

Three key words are used in the study of probability: experiment, outcome, and event.
• Experiment: a process that leads to the occurrence of one (and only one) of several possible observations.
• Outcome: a particular result of an experiment.
• Event: a collection of one or more outcomes of an experiment.

Approaches to probability
Two approaches to probability will be discussed, namely, the objective and the subjective viewpoints.

Objective probability
1. Classical probability
$\text{Probability of an event} = \dfrac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}} = \dfrac{n(E)}{n(S)}$
2. Empirical probability
$\text{Probability of event happening} = \dfrac{\text{Number of times event occurred in the past}}{\text{Total number of observations}}$

Subjective probability
Evaluating the available opinions and other information and then estimating or assigning the probability.

Some rules of probability

Rule of addition
The general rule of addition is used to combine events that are not mutually exclusive.
P(A or B) = P(A) + P(B) - P(A and B)
P(A∪B) = P(A) + P(B) - P(A∩B)

If two events are mutually exclusive, the special rule of addition is used to combine them.
P(A or B) = P(A) + P(B)
P(A∪B) = P(A) + P(B)

Complement rule (A′ or Aᶜ)
P(A) + P(A′) = 1
P(A′) = 1 - P(A)

Conditional probability
$P(A|B) = \dfrac{P(A \cap B)}{P(B)}$ = probability of A given that B has occurred
$P(B|A) = \dfrac{P(A \cap B)}{P(A)}$ = probability of B given that A has occurred

Chapter 6
Discrete Probability Distribution

Probability distribution
A listing of all the outcomes of an experiment and the probability associated with each outcome.

Ex. Suppose we are interested in the number of heads showing face up on three tosses of a coin. What is the probability distribution for the number of heads?

Sample space = S = {TTT, TTH, THT, HTT, THH, HTH, HHT, HHH}

x = number of heads    P(x)
0                      1/8 = 0.125
1                      3/8 = 0.375
2                      3/8 = 0.375
3                      1/8 = 0.125
Total                  1

Random variable
A quantity resulting from an experiment that, by chance, can assume different values.
A random variable may be either discrete or continuous.

Discrete random variable
A variable that can assume only certain clearly separated values resulting from a count of some item of interest.
Ex. number of students, number of rooms in a house.

Continuous random variable
A variable that can assume one of an infinitely large number of values, within certain limitations.
Ex.
height, weight, tire pressure, …

Binomial probability distribution
The binomial probability distribution is one of the most widely used discrete probability distributions. It is applied to find the probability that an outcome will occur x times in n performances (trials) of an experiment.

Characteristics of a binomial distribution
1. The outcome on each trial of an experiment is classified into one of two mutually exclusive categories: a success or a failure.
2. The random variable is the result of counting the number of successes in a fixed number of trials.
3. The probability of a success stays the same for each trial. So does the probability of a failure.
4. The trials are independent, meaning that the outcome of one trial does not affect the outcome of any other trial.

To construct a particular binomial probability distribution, we must know the number of trials and the probability of success on each trial. For example, if the Stat I examination consists of 10 multiple-choice questions, the number of trials is 10. If each question has four choices, only one of which is correct, the probability of success (guessing correctly) on each trial is 1/4 or 0.25.

Using the formula of the binomial probability distribution
$P(x) = {}_nC_x \, \pi^x (1 - \pi)^{n - x}$
where
n is the number of trials.
x is the number of successes.
π is the probability of a success on each trial.

Mean of a binomial distribution
$\mu = n\pi$

Variance of a binomial distribution
$\sigma^2 = n\pi(1 - \pi)$

Chapter 7
The Normal Probability Distribution

We will continue our study of probability distributions in this chapter by examining a very important continuous probability distribution, the normal probability distribution. As noted in the preceding chapter, a continuous random variable is one that can assume an infinite number of possible values within a specified range. A large number of phenomena in the real world are normally distributed, either exactly or approximately.
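Before turning to the normal curve, the binomial formula from Chapter 6 can be checked numerically. The sketch below is an illustration only, not part of the course material: the function name `binomial_pmf` is hypothetical, and it relies on `math.comb` from the Python standard library (Python 3.8 or later) for the combination ${}_nC_x$. It recomputes the distribution for the number of heads in three coin tosses (n = 3, π = 0.5) and the mean and variance for the 10-question examination with π = 0.25.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Binomial probability P(x) = nCx * p**x * (1 - p)**(n - x).

    x: number of successes, n: number of trials,
    p: probability of a success on each trial (written as pi in the notes).
    """
    return comb(n, x) * p**x * (1 - p)**(n - x)


# Number of heads in three tosses of a fair coin: n = 3, p = 0.5.
coin_toss = [binomial_pmf(x, 3, 0.5) for x in range(4)]
print(coin_toss)  # [0.125, 0.375, 0.375, 0.125]

# 10 multiple-choice questions, probability 0.25 of a correct guess on each.
n, p = 10, 0.25
mean = n * p                # mu = n * pi
variance = n * p * (1 - p)  # sigma^2 = n * pi * (1 - pi)
print(mean, variance)       # 2.5 1.875
```

The coin-toss probabilities agree with the table obtained by listing the sample space, which is a useful sanity check on the formula: counting outcomes directly and applying the binomial formula are two routes to the same distribution.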
The normal probability distribution and its accompanying normal curve have the following characteristics: