Probability and Statistics
Dr Yehya Mesalam

Textbook
• Probability & Statistics for Engineers & Scientists, Ronald E. Walpole, 9th edition, 2012, Pearson

Brief List of Course Topics
1. Introduction to statistics and data analysis
2. Introduction to probability theory
3. Random variables and probability distributions
4. Mathematical expectation
5. Some discrete probability distributions
6. Some continuous probability distributions
7. Functions of random variables
8. Fundamental sampling distributions and data descriptions

Evaluation Scheme
• Midterm Exam, Report, Activities: 30 + 30 = 60 (60%)
• Final exam: 40 (40%)
• Total: 100 (100%)

Chapter 1: Introduction to Statistics and Data Analysis

What is Statistics?
• Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions.
• Statistics is a way to get information from data. It is the science of uncertainty.

Steps of Statistical Practice
• Preparation: Set clearly defined goals and questions of interest for the investigation.
• Data collection: Make a plan of which data to collect and how to collect it.
• Data analysis: Apply appropriate statistical methods to extract information from the data.
• Data interpretation: Interpret the information and draw conclusions.

Statistical Methods
• Descriptive statistics include the collection, presentation, and description of numerical data.
• Inferential statistics include making inferences and decisions from the collected data using appropriate statistical methods.
• Model building includes developing prediction equations to understand a complex system.

Descriptive Statistics
• Descriptive statistics involves the arrangement, summary, and presentation of data to enable meaningful interpretation and to support decision making.
• Descriptive statistics methods make use of
  – graphical techniques
  – numerical descriptive measures.
• The methods presented apply both to
  – the entire population
  – the sample

Basic Definitions
• Population: The collection of all items of interest in a particular study.
• Sample: A set of data drawn from the population; a subset of the population available for observation.
• Parameter: A descriptive measure of the population, e.g., the mean.
• Statistic: A descriptive measure of a sample.
• Variable: A characteristic of interest about each element of a population or sample.

Collecting Data
• Target Population: The population about which we want to draw inferences.
• Sampled Population: The actual population from which the sample has been taken.

Types of Variables
[Diagram: variables split into Qualitative (Ordinal, Non-ordinal) and Quantitative (Discrete, Continuous)]

Types of Data: Examples
Numerical data:
  Age   Income
  55    75000
  42    68000
  ...   ...

  Weight gain
  +10
  +5
  ...

Nominal data:
  Person   Marital status
  1        married
  2        single
  3        single
  ...      ...

  Computer   Brand
  1          IBM
  2          Dell
  3          IBM
  ...        ...

A descriptive statistic for nominal data is the proportion of data that falls into each category.
  Brand     Frequency   Percent
  IBM       25          50%
  Dell      11          22%
  Compaq    8           16%
  Other     6           12%
  Total     50

Types of Variables
• Qualitative variables (what, which type...) measure a quality or characteristic on each experimental unit (categorical data).
• Examples:
  – Hair color (black, brown, blonde...)
  – Make of car (Dodge, Honda, Ford...)
  – Gender (male, female)
  – State of birth (Iowa, Arizona...)
• Quantitative variables (how big, how many) measure a numerical quantity on each experimental unit (denoted by x).
  – Discrete if it can assume only a finite or countable number of values.
  – Continuous if it can assume the infinitely many values corresponding to the points on a line interval.

Graphing Qualitative Variables
• Use a data distribution to describe:
  – What values of the variable have been measured
  – How often each value has occurred
• "How often" can be measured 3 ways:
  – Frequency
  – Relative frequency = Frequency / n
  – Percent frequency = Relative frequency × 100

Example
• A bag contains 25 colored balls.
• Raw data: [the 25 balls are pictured on the slide]
• Statistical table:
  Color    Frequency   Relative Frequency   Percent
  Red      3           3/25 = .12           12%
  Blue     6           6/25 = .24           24%
  Green    4           4/25 = .16           16%
  Orange   5           5/25 = .20           20%
  Brown    3           3/25 = .12           12%
  Yellow   4           4/25 = .16           16%

Graphs
[Figures: bar chart and Pareto chart of frequency by color; pie chart with Brown 12.0%, Green 16.0%, Yellow 16.0%, Orange 20.0%, Red 12.0%, Blue 24.0%]
• Pie chart angle = Relative frequency × 360°

Example
A sample of 30 persons who often consume donuts were asked what variety of donuts was their favourite. The responses from these 30 persons were as follows:
  glazed   frosted  glazed   frosted  filled   filled
  filled   plain    plain    other    other    filled
  other    other    frosted  plain    glazed   glazed
  other    glazed   glazed   other    glazed   frosted
  glazed   other    frosted  filled   filled   filled
Construct a frequency distribution table for these data.

Solution
[Frequency distribution table shown on the slide]

Relative Frequency and Percentage Distributions
$\text{Relative frequency of a category} = \dfrac{\text{Frequency of that category}}{\text{Sum of all frequencies}}$
$\text{Percentage frequency} = (\text{Relative frequency}) \times 100$

Graphical Presentation of Qualitative Data
A graph made of bars whose heights represent the frequencies of respective categories is called a bar graph.
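The frequency, relative frequency, percent frequency, and pie-chart angle defined above can be computed directly for the donut survey. The following is a minimal Python sketch, not part of the original slides; the function name `frequency_table` and the row layout are assumptions made for illustration.

```python
from collections import Counter

# The 30 donut responses from the example, in the order listed on the slide.
donuts = (
    "glazed frosted glazed frosted filled filled filled plain plain other "
    "other filled other other frosted plain glazed glazed other glazed "
    "glazed other glazed frosted glazed other frosted filled filled filled"
).split()


def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, percent, pie angle)."""
    n = len(values)
    rows = []
    for category, freq in Counter(values).most_common():
        rel = freq / n                      # relative frequency = frequency / n
        rows.append((category, freq, rel, rel * 100, rel * 360))
    return rows


table = frequency_table(donuts)
for category, freq, rel, pct, angle in table:
    print(f"{category:8s} {freq:3d} {rel:6.3f} {pct:6.1f}% {angle:6.1f} deg")
```

The relative frequencies sum to 1, so the pie-chart angles sum to 360 degrees, which is a quick check on any hand-built table.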
Graphical Presentation of Qualitative Data
A circle divided into portions that represent the relative frequencies or percentages of a population or a sample belonging to different categories is called a pie chart.

Calculating Angle Sizes for the Pie Chart
[Table shown on the slide]

Pie Chart for the Percentage Distribution
[Figure shown on the slide]

Scatter Plot
[Figures shown on the slides]

Dot Plot
Draw the dot plot for the following data, then calculate the mean, median, and mode:
0.86, 0.49, 0.46, 0.52, 0.62, 0.79, 0.75, 0.47, 0.26, 0.43

Mean calculation:
$\sum x = x_1 + x_2 + x_3 + \dots + x_n = 5.65$
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{5.65}{10} = 0.565$

Median calculation: rearrange the data in ascending order. With n = 10, the median lies between the 5th and 6th ordered values.
0.26, 0.43, 0.46, 0.47, 0.49, 0.52, 0.62, 0.75, 0.79, 0.86
Median = (0.49 + 0.52) / 2 = 0.505

Mode: no mode (no value is repeated).

Stem and Leaf Displays
In a stem-and-leaf display of quantitative data, each value is divided into two portions: a stem and a leaf. The leaves for each stem are shown separately in a display.

Example
The following are the scores of 30 college students on a statistics test:
  75 69 83 52 80 72 81 84 77 96
  61 64 65 76 71 79 86 87 71 79
  72 87 68 92 93 50 57 95 92 98
Construct a stem-and-leaf display.

Solution
[Stem-and-leaf display built up over three slides]

Example
The following data give the monthly rents paid by a sample of 30 households selected from a small town.
  880  1210 1151 1081 721  985  1231 630  1175 1075
  1023 932  850  952  1100 775  825  1140 1235 1000
  750  750  915  1140 965  1191 1370 960  1035 1280
Construct a stem-and-leaf display for these data.
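Splitting each value into a stem and a leaf can be sketched in Python for both examples above. This is an illustrative sketch only: the function name `stem_and_leaf` and the `leaf_digits` parameter are assumptions, and using two-digit leaves for the rents (hundreds as stems) is one reasonable convention rather than necessarily the one on the solution slides.

```python
from collections import defaultdict

scores = [75, 69, 83, 52, 80, 72, 81, 84, 77, 96, 61, 64, 65, 76, 71,
          79, 86, 87, 71, 79, 72, 87, 68, 92, 93, 50, 57, 95, 92, 98]

rents = [880, 1210, 1151, 1081, 721, 985, 1231, 630, 1175, 1075,
         1023, 932, 850, 952, 1100, 775, 825, 1140, 1235, 1000,
         750, 750, 915, 1140, 965, 1191, 1370, 960, 1035, 1280]


def stem_and_leaf(values, leaf_digits=1):
    """Map each stem to its sorted leaves; the leaf keeps the last `leaf_digits` digits."""
    display = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10 ** leaf_digits)
        display[stem].append(leaf)
    return dict(display)


score_display = stem_and_leaf(scores)                # stems are tens, leaves are units
rent_display = stem_and_leaf(rents, leaf_digits=2)   # stems are hundreds, leaves are two digits

for stem in sorted(score_display):
    print(stem, "|", " ".join(str(leaf) for leaf in score_display[stem]))
```

Sorting the values first means the leaves within each stem come out in increasing order, which is the usual ranked form of the display.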
Solution
[Stem-and-leaf display shown on the slide]

Example
Construct a stem-and-leaf display for the given data.
[Data shown on the slide]

Solution
[Stem-and-leaf display shown on the slides]

Mean
The mean for ungrouped data is obtained by dividing the sum of all values by the number of values in the data set. Thus,
Mean for population data: $\mu = \dfrac{\sum x}{N}$
Mean for sample data: $\bar{x} = \dfrac{\sum x}{n}$
where $\sum x$ is the sum of all values, N is the population size, n is the sample size, $\mu$ is the population mean, and $\bar{x}$ is the sample mean.

Mean
1. Most common measure of central tendency
2. Acts as the "balance point"
3. Affected by extreme values ("outliers")
4. Denoted $\bar{x}$, where $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{x_1 + x_2 + \dots + x_n}{n}$

Example
Find the mean of the cash donations made by these eight persons:
319, 199, 110, 63, 21, 315, 26, 63

Solution
$\sum x = 319 + 199 + 110 + 63 + 21 + 315 + 26 + 63 = 1116$
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{1116}{8} = 139.5$, i.e. $139.5 million

Example
Raw data: 10.3 4.9 8.9 11.7 6.3 7.7
$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{10.3 + 4.9 + 8.9 + 11.7 + 6.3 + 7.7}{6} = 8.30$

Median
1. Measure of central tendency
2. Middle value in the ordered sequence
   • If n is odd, the middle value of the sequence
   • If n is even, the average of the 2 middle values
3. Position of the median in the sequence
   • n odd: order $\dfrac{n+1}{2}$
   • n even: orders $\dfrac{n}{2}$ and $\dfrac{n}{2} + 1$
4. Not affected by extreme values

Median Example (n odd)
• Raw data: 24.1 22.6 21.5 23.7 22.6
• Ordered: 21.5 22.6 22.6 23.7 24.1
• Position: 1 2 3 4 5
Positioning point $= \dfrac{n+1}{2} = \dfrac{5+1}{2} = 3$, so Median = 22.6

Median Example (n even)
• Raw data: 10.3 4.9 8.9 11.7 6.3 7.7
• Ordered: 4.9 6.3 7.7 8.9 10.3 11.7
• Position: 1 2 3 4 5 6
Positioning points $= \dfrac{n}{2}, \dfrac{n}{2} + 1 = 3, 4$, so Median $= \dfrac{7.7 + 8.9}{2} = 8.30$

Mode
1. Measure of central tendency
2. Value that occurs most often
3. Not affected by extreme values
4. May be no mode or several modes
5. May be used for quantitative or qualitative data

Mode Example
• No mode. Raw data: 10.3 4.9 8.9 11.7 6.3 7.7
• One mode. Raw data: 6.3 4.9 8.9 6.3 4.9 4.9 (mode = 4.9)
• More than 1 mode. Raw data: 21 28 28 41 43 43 (modes = 28 and 43)

Range
1. Measure of dispersion
2. Difference between largest & smallest observations
   Range = R = Max value – Min value
3. Ignores how data are distributed
[Figure: two dot plots on the axis 7, 8, 9, 10 with different distributions; both have Range = 10 – 7 = 3]

Variance & Standard Deviation
1. Measures of dispersion
2. Most common measures
3. Consider how data are distributed
4. Show variation about the mean ($\bar{x}$ or $\mu$)
[Figure: dot plot on the axis 4, 6, 8, 10, 12 with $\bar{x} = 8.3$ marked]

Standard Notation
  Measure              Sample       Population
  Mean                 $\bar{x}$    $\mu$
  Standard deviation   $s$          $\sigma$
  Variance             $s^2$        $\sigma^2$
  Size                 $n$          $N$

Variance and Standard Deviation
Basic formulas for the variance and standard deviation for ungrouped data:
$\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$ and $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$ and $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
where σ² is the population variance, s² is the sample variance, σ is the population standard deviation, and s is the sample standard deviation.

Variance and Standard Deviation
Short-cut formulas for the variance and standard deviation for ungrouped data:
$\sigma^2 = \dfrac{N \sum x^2 - (\sum x)^2}{N^2}$ and $s^2 = \dfrac{n \sum x^2 - (\sum x)^2}{n(n - 1)}$
$\sigma = \sqrt{\dfrac{N \sum x^2 - (\sum x)^2}{N^2}}$ and $s = \sqrt{\dfrac{n \sum x^2 - (\sum x)^2}{n(n - 1)}}$
where σ² is the population variance, s² is the sample variance, σ is the population standard deviation, and s is the sample standard deviation.

Variance Example (basic formula)
Raw data: 10.3 4.9 8.9 11.7 6.3 7.7
$s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ where $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = 8.3$
$s^2 = \dfrac{(10.3 - 8.3)^2 + (4.9 - 8.3)^2 + \dots + (7.7 - 8.3)^2}{6 - 1} = 6.368$

Variance Example (short-cut formula)
Raw data: 10.3 4.9 8.9 11.7 6.3 7.7, n = 6
$\sum x = 49.8$, $\sum x^2 = 445.18$
$s^2 = \dfrac{n \sum x^2 - (\sum x)^2}{n(n - 1)} = \dfrac{6 \times 445.18 - (49.8)^2}{6 \times 5} = 6.368$
$s = \sqrt{6.368} = 2.523$

Summary of Variation Measures
  Measure                           Formula                                                   Description
  Range                             $X_{max} - X_{min}$                                       Total spread
  Standard deviation (sample)       $\sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$  Dispersion about sample mean
  Standard deviation (population)   $\sqrt{\dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$          Dispersion about population mean
  Variance (sample)                 $\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$         Squared dispersion about sample mean

Box-and-whisker Plot
A box-and-whisker plot, also called a box plot, displays the five-number summary of a set of data. The five-number summary is
• The minimum value
• First quartile (Q1)
• Median
• Third quartile (Q3)
• The maximum value

Box-and-whisker Plot
In a box plot, we draw a box from the first quartile to the third quartile. A vertical line goes through the box at the median. The whiskers go from each quartile to the minimum or maximum.
[Figure: box plot labelled minimum, lower quartile, median, upper quartile, maximum]

Example
Construct a box-and-whisker plot for the following data.
85, 92, 78, 88, 90, 88, 89

Solution
1. Order the test scores from least to greatest: 78, 85, 88, 88, 89, 90, 92
2. Find the median of the test scores: 88
3. Find the quartiles. The first quartile (Q1) is the median of the data points to the left of the median: 85. The third quartile (Q3) is the median of the data points to the right of the median: 90.
4. Complete the five-number summary by finding the min and the max: Min = 78, Max = 92.
[Figure: ordered data 78, 85, 88, 88, 89, 90, 92 with Min = 78, Q1 = 85, Median = 88, Q3 = 90, Max = 92 marked]

Example
Use the given data to make a box-and-whisker plot.
31, 23, 33, 35, 26, 24, 31, 29

Solution
Order the data from least to greatest. Then find the minimum, lower quartile, median, upper quartile, and maximum.
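The quartile rule used in these box-plot examples (each quartile is the median of the values to one side of the overall median) can be sketched in Python. This is a minimal illustration with a hypothetical function name, `five_number_summary`; other quartile conventions exist and can give slightly different values.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # values to the left of the median
    upper = data[half + n % 2:]     # values to the right (skips the middle value if n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])


first = five_number_summary([85, 92, 78, 88, 90, 88, 89])
second = five_number_summary([31, 23, 33, 35, 26, 24, 31, 29])
print(first)
print(second)
```

When n is odd the middle value belongs to neither half, which matches step 3 of the first solution, where Q1 and Q3 are taken from the three points on each side of 88.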
Ordered data: 23 24 26 29 31 31 33 35
• minimum: 23
• lower quartile: $\dfrac{24 + 26}{2} = 25$
• median: $\dfrac{29 + 31}{2} = 30$
• upper quartile: $\dfrac{31 + 33}{2} = 32$
• maximum: 35

Solution
Draw the box and whiskers. Draw a number line and plot a point above each value.
[Figure: box plot of 23 24 26 29 31 31 33 35 above a number line from 22 to 38]

Frequency Histograms
• Divide the range of the data into 5-12 subintervals of equal length.
• Calculate the approximate width of the subinterval as Range / number of subintervals.
• Round the approximate width up to a convenient value.
• Sturges' rule: $K = 1 + 3.3 \log_{10} N$, where K is the number of subintervals and N is the number of observations.
• Create a statistical table including the subintervals, their frequencies, and relative frequencies.

Example
• The following are balances (in $) of 100 accounts receivable taken from the ledger of XYZ Store.
  31 38 41 52 59 46 74 69 93 69
  83 78 74 77 35 79 80 71 56 69
  34 33 92 37 60 43 51 74 68 83
  49 34 71 58 83 94 78 48 34 50
  68 65 64 95 92 77 84 41 40 38
  38 60 67 50 76 99 38 94 48 70
  80 95 98 55 49 54 60 62 70 88
  94 85 59 68 51 87 53 57 54 46
  46 69 64 61 63 78 55 66 73 75
  60 65 61 66 81 86 42 51 76 64
• Using 7 equal intervals with the lowest starting at 30, compute the mean and the variance using the shortcut method.
• Calculate the mode and median (analytically and graphically).
• Estimate the value below which 75% of the values fall.

Solution
• Determine the Min value = 31
• Determine the Max value = 99
• Calculate the range = Max – Min
• The starting point is given as 30, so use Min = 30
• Range = 99 – 30 = 69
• Interval length C = Range / No. of intervals
• C = 69 / 7 = 9.86, rounded up to C = 10

Solution
Build the classes from the lower limit 30 in steps of C = 10, then count the frequency f of each class:
  L.L   U.L   f
  30    39    11
  40    49    12
  50    59    16
  60    69    23
  70    79    17
  80    89    11
  90    99    10
  Sum         100

Solution
  L.L   U.L   f     F     f relative   X
  30    39    11    11    0.11         34.5
  40    49    12    23    0.12         44.5
  50    59    16    39    0.16         54.5
  60    69    23    62    0.23         64.5
  70    79    17    79    0.17         74.5
  80    89    11    90    0.11         84.5
  90    99    10    100   0.10         94.5
  Sum         100         1
f = frequency, F = cumulative frequency, f relative = relative frequency, C = 10.
Class mark (X) or mid point = (L.L + U.L) / 2, e.g. X1 = (30 + 39) / 2 = 34.5

Solution
  L.L   U.L   f     F     f relative   X      f*X      f*(X - X̄)²   f*X²
  30    39    11    11    0.11         34.5   379.5    9637.76      13092.75
  40    49    12    23    0.12         44.5   534      4609.92      23763
  50    59    16    39    0.16         54.5   872      1474.56      47524
  60    69    23    62    0.23         64.5   1483.5   3.68         95685.75
  70    79    17    79    0.17         74.5   1266.5   1838.72      94354.25
  80    89    11    90    0.11         84.5   929.5    4577.76      78542.75
  90    99    10    100   0.10         94.5   945      9241.6       89302.5
  Sum         100         1                   6410     31384        442265

Mean: $\bar{X} = \dfrac{1}{n} \sum f_i X_i = \dfrac{6410}{100} = 64.1$
Variance: $S^2 = \dfrac{1}{n - 1} \sum f_i (X_i - \bar{X})^2 = \dfrac{n \sum f_i X_i^2 - (\sum f_i X_i)^2}{n(n - 1)} = 317.0101$
Standard deviation: $s = 17.8048$
Coefficient of variation: $C.V = \dfrac{s}{\bar{X}} = 0.2778$

Histogram
[Figures: histogram built up bar by bar, frequency axis 0 to 25, classes 30-39, 40-49, 50-59, ...]
Choose a scale, for example 1 cm = 10 $ (or whatever unit your data use). Mark the class marks X1, X2, X3, ..., X7 = 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5 on the horizontal axis. Each bar has width C and is centered on its class mark, extending C/2 to either side.

Solution
  L.L   U.L   L.B    U.B
  30    39    29.5   39.5
  40    49    39.5   49.5
  50    59    49.5   59.5
  60    69    59.5   69.5
  70    79    69.5   79.5
  80    89    79.5   89.5
  90    99    89.5   99.5
The histogram must be drawn on the class boundaries, not on the class limits.
L.B for class i: $L.B_i = \dfrac{L.L_i + U.L_{i-1}}{2}$
U.B for class i: $U.B_i = \dfrac{L.L_{i+1} + U.L_i}{2}$
L.B 1 = (30 + 29) / 2 = 29.5
U.B 1 = (39 + 40) / 2 = 39.5
L.B 2 = (40 + 39) / 2 = 39.5
Then L.B i = U.B i-1, or equivalently U.B i = L.B i+1.
L.L & U.L are the lower and upper limits of the class; L.B & U.B are the lower and upper boundaries of the class.

Histogram
[Figure: histogram of frequency (0 to 25) against class mark (24.5 to 104.5), bars drawn between the boundaries 29.5, 39.5, 49.5, 59.5, ...; a second vertical axis shows relative frequency fr (0.05 to 0.20)]

Polygon
[Figure: frequency polygon of frequency (0 to 25) against class mark (24.5 to 104.5), built up point by point]

Histogram
[Figure: histogram with frequency and relative frequency axes; the mode is located graphically inside the tallest bar]

Median
  Class limit   fi    Less than F
  30-39         11    11
  40-49         12    23
  50-59         16    39
  60-69         23    62
  70-79         17    79
  80-89         11    90
  90-99         10    100
  Sum           100
$\tilde{X} = L_{med} + C \cdot \dfrac{\frac{n}{2} - F_{med-1}}{f_{med}}$
The median class is 60-69, the first class whose cumulative frequency reaches n/2 = 50, so $L_{med}$ = 60 – 0.5 = 59.5.
Median = 59.5 + 10 (50 – 39) / 23 = 64.28

Mode
  Class limit   fi
  30-39         11
  40-49         12
  50-59         16
  60-69         23
  70-79         17
  80-89         11
  90-99         10
  Sum           100
$\hat{X} = L_{mod} + C \cdot \dfrac{\Delta_1}{\Delta_1 + \Delta_2}$
The modal class is 60-69 (largest frequency), so $L_{mod}$ = 60 – 0.5 = 59.5.
$\Delta_1 = 23 - 16 = 7$
$\Delta_2 = 23 - 17 = 6$
Mode = 59.5 + 10 (7 / 13) = 64.88

Ogives (Less Than & More Than)
  Lower boundary   Less than
  29.5             0
  39.5             11
  49.5             23
  59.5             39
  69.5             62
  79.5             79
  89.5             90
  99.5             100
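The interpolation formulas for the grouped median and mode used for the accounts-receivable data can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function names `grouped_median` and `grouped_mode` are hypothetical, and the sketch assumes equal class widths and a single modal class.

```python
def grouped_median(lower_boundaries, freqs, width):
    """Median = L_med + C * (n/2 - F_before) / f_med, by linear interpolation."""
    n = sum(freqs)
    cumulative = 0                      # cumulative frequency before the current class
    for lb, f in zip(lower_boundaries, freqs):
        if cumulative + f >= n / 2:     # first class whose cumulative frequency reaches n/2
            return lb + width * (n / 2 - cumulative) / f
        cumulative += f


def grouped_mode(lower_boundaries, freqs, width):
    """Mode = L_mod + C * d1 / (d1 + d2), using the neighbours of the modal class."""
    i = max(range(len(freqs)), key=freqs.__getitem__)        # index of the modal class
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1, d2 = freqs[i] - before, freqs[i] - after
    return lower_boundaries[i] + width * d1 / (d1 + d2)


boundaries = [29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5]
freqs = [11, 12, 16, 23, 17, 11, 10]

median_value = grouped_median(boundaries, freqs, 10)
mode_value = grouped_mode(boundaries, freqs, 10)
print(round(median_value, 2), round(mode_value, 2))
```

Both results fall inside the 59.5 to 69.5 class, as they must, since that class is both the median class and the modal class for these data.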
Ogives (Less Than & More Than)
  Lower boundary   Less than   More than
  29.5             0           100
  39.5             11          89
  49.5             23          77
  59.5             39          61
  69.5             62          38
  79.5             79          21
  89.5             90          10
  99.5             100         0
More than = n – Less than, so More than + Less than = n.
[Figure: less-than and more-than ogives, cumulative frequency (0 to 110) against lower boundary (29.5 to 99.5); the two curves cross at the median, where the cumulative frequency is n/2 = 50]

Ogives (Less Than & More Than)
Estimate the value below which 75% of the values fall.
n = 100 corresponds to 100%, so 75% corresponds to a cumulative frequency of 75. At the frequency value 75, draw a horizontal line that cuts the less-than and more-than ogives, then read off the required values:
• 75% of the sample lies below the value 77 (from the less-than ogive).
• 75% of the sample lies above the value 51 (from the more-than ogive).
[Figure: both ogives with a horizontal line at cumulative frequency 75, meeting the more-than curve at 51 and the less-than curve at 77]

Short Cut Method
$\bar{X} = X_0 + C \cdot \dfrac{\sum f_i d_i}{n}$
$S^2 = C^2 \cdot \dfrac{n \sum f_i d_i^2 - (\sum f_i d_i)^2}{n(n - 1)}$
where $X_0$ is the class mark chosen as origin (the class with d = 0) and $d_i = (X_i - X_0)/C$.

Short Cut Method (origin X0 = 64.5)
  L.L   U.L   f     F     f relative   d     f*d    f*d²
  30    39    11    11    0.11         -3    -33    99
  40    49    12    23    0.12         -2    -24    48
  50    59    16    39    0.16         -1    -16    16
  60    69    23    62    0.23         0     0      0
  70    79    17    79    0.17         1     17     17
  80    89    11    90    0.11         2     22     44
  90    99    10    100   0.10         3     30     90
  Sum         100         1                  -4     314
$\bar{X}$ = 64.5 + 10 (–4 / 100) = 64.1
$S^2$ = 10² × [100 × 314 – (–4)²] / (100 × 99) = 317.0101
S.D (s) = 17.8048
C.V = s / $\bar{X}$ = 0.2778

Short Cut Method (origin X0 = 34.5)
  L.L   U.L   f     F     f relative   d     f*d
  30    39    11    11    0.11         0     0
  40    49    12    23    0.12         1     12
  50    59    16    39    0.16         2     32
  60    69    23    62    0.23         3     69
  70    79    17    79    0.17         4     68
  80    89    11    90    0.11         5     55
  90    99    10    100   0.10         6     60
  Sum         100         1                  296
$\bar{X}$ = 34.5 + 10 (296 / 100) = 64.1, the same mean as before: the result does not depend on the choice of origin.

Example
Complete the following table, and then find the mean, mode, median, variance, and CV. Draw the histogram, frequency polygon, and relative frequency histogram.
  Class limits   xi   fi   F    di   fi di   fi di²
  50  -               8
      -               10
      -                    34   0
      -               14
      -               10
      -
      - 119                65                16

Solution
There are 7 classes running from 50 up to 119, so 120 = 50 + 7C, which gives C = 10.
The missing frequencies follow from the given entries: f3 = F3 – F2 = 34 – 18 = 16; the last class has d = 4 and f d² = 16, so f7 = 1; then F6 = 65 – 1 = 64 and f6 = 64 – 58 = 6.
  Class limits   xi      fi   F    di   fi di   fi di²
  50 - 59        54.5    8    8    -2   -16     32
  60 - 69        64.5    10   18   -1   -10     10
  70 - 79        74.5    16   34   0    0       0
  80 - 89        84.5    14   48   1    14      14
  90 - 99        94.5    10   58   2    20      40
  100 - 109      104.5   6    64   3    18      54
  110 - 119      114.5   1    65   4    4       16
  Sum                    65             30      166

Short Cut Method
$\bar{X} = X_0 + C \cdot \dfrac{\sum f_i d_i}{n}$
Mean = 74.5 + 10 (30 / 65) = 79.12
$S^2 = C^2 \cdot \dfrac{n \sum f_i d_i^2 - (\sum f_i d_i)^2}{n(n - 1)}$
Variance = (10)² [65 × 166 – (30)²] / [65 × 64] = 237.74
S.D = (237.74)^0.5 = 15.42

Shape
1. Describes how data are distributed
2. Measures of shape
   • Skew = symmetry
   Left-skewed: Mean < Median
   Symmetric: Mean = Median
   Right-skewed: Median < Mean

Moment
About the origin:
$m'_K = \dfrac{\sum f_i X_i^K}{n}$
$m'_1 = \dfrac{\sum f_i X_i}{n}$, $m'_2 = \dfrac{\sum f_i X_i^2}{n}$, $m'_3 = \dfrac{\sum f_i X_i^3}{n}$, $m'_4 = \dfrac{\sum f_i X_i^4}{n}$
About the mean:
$m_K = \dfrac{\sum f_i (X_i - \bar{X})^K}{n}$
$m_1 = \dfrac{\sum f_i (X_i - \bar{X})}{n} = 0$, $m_2 = \dfrac{\sum f_i (X_i - \bar{X})^2}{n}$, $m_3 = \dfrac{\sum f_i (X_i - \bar{X})^3}{n}$, $m_4 = \dfrac{\sum f_i (X_i - \bar{X})^4}{n}$

Moment
• Coefficient of skewness $= \dfrac{m_3}{\sqrt{m_2^3}}$ (see page 38)
  – Less than 0: left-skewed (Mean < Median), skewness to the left
  – Equal to 0: symmetric (Mean = Median), normal distribution
  – Greater than 0: right-skewed (Median < Mean), skewness to the right

Moment
• Coefficient of kurtosis $= \dfrac{m_4}{m_2^2}$
  – Greater than 3: leptokurtic
  – Equal to 3: symmetric, normal distribution
  – Less than 3: platykurtic

Example
• From the given graph, complete the following table, draw the histogram and polygon, determine the mode and median graphically, and calculate the mean, median, mode, variance, standard deviation, and coefficient of variation.
[Graph shown on the slide; table columns to complete: Class limits, xi, fi, fr]

Solution
  Class limits   xi   fi   fr      d    f d
  12 - 16        14   6    6/80    -3   -18
  17 - 21        19   8    8/80    -2   -16
  22 - 26        24   14   14/80   -1   -14
  27 - 31        29   24   24/80   0    0
  32 - 36        34   14   14/80   1    14
  37 - 41        39   8    8/80    2    16
  42 - 46        44   6    6/80    3    18
  Sum                 80   1            0
$\bar{X}$ = 29 + 5 (0 / 80) = 29

Example
• Complete the table, then compute the mean and variance, and find the mode and median both analytically and graphically.
  Class limit   Frequency   Relative frequency   Boundaries        Cumulative frequency
  ? - ?         ?           ?                    More than ?       100
  20 - ?        ?           ?                    More than 19.95   92
  ? - ?         17          ?                    More than ?       ?
  ? - ?         ?           ?                    More than ?       46
  ? - ?         ?           0.12                 More than 37.95   ?
  ? - ?         ?           ?                    More than ?       5
                ?           ?                    More than ?       ?

Solution
  Class limit   Frequency   Relative frequency   Boundaries        Cumulative frequency
  14 - 19.9     8           0.08                 More than 13.95   100
  20 - 25.9     29          0.29                 More than 19.95   92
  26 - 31.9     17          0.17                 More than 25.95   63
  32 - 37.9     29          0.29                 More than 31.95   46
  38 - 43.9     12          0.12                 More than 37.95   17
  44 - 49.9     5           0.05                 More than 43.95   5
  Sum           100         1                    More than 49.95   0

Solution
  L.L   U.L    f     d    f d   f d²
  14    19.9   8     -3   -24   72
  20    25.9   29    -2   -58   116
  26    31.9   17    -1   -17   17
  32    37.9   29    0    0     0
  38    43.9   12    1    12    12
  44    49.9   5     2    10    20
  Sum          100        -77   237
Mean $\bar{X}$ = 30.33
Variance $S^2$ = 64.6218
S.D = 8.0388

Solution
[Figure: histogram titled "viscosity", frequency against class mark 16.95, 22.95, 28.95, 34.95, 40.95, 46.95 with bar heights 8, 29, 17, 29, 12, 5; the two tallest bars (frequency 29) are each marked "Mode"]

Solution
[Figure: ogives, cumulative frequency against lower boundary 13.95 to 49.95; less-than values 0, 8, 37, 54, 83, 95, 100; more-than values 100, 92, 63, 46, 17, 5, 0; the median is read where the cumulative frequency is n/2 = 50]
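The short-cut (coded) method used in the last solution can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name `coded_mean_variance` is hypothetical, the class marks 16.95 to 46.95 are taken from the histogram axis, the class width C = 6 is the distance between consecutive boundaries, and the origin X0 = 34.95 is the class mark of the class with d = 0.

```python
from math import sqrt


def coded_mean_variance(class_marks, freqs, origin, width):
    """Mean and sample variance of grouped data via coded deviations d = (X - X0) / C."""
    n = sum(freqs)
    d = [round((x - origin) / width) for x in class_marks]    # integer codes, e.g. -3 .. 2
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    mean = origin + width * sum_fd / n
    variance = width ** 2 * (n * sum_fd2 - sum_fd ** 2) / (n * (n - 1))
    return mean, variance


marks = [16.95, 22.95, 28.95, 34.95, 40.95, 46.95]
freqs = [8, 29, 17, 29, 12, 5]

mean, variance = coded_mean_variance(marks, freqs, origin=34.95, width=6)
print(round(mean, 2), round(variance, 4), round(sqrt(variance), 4))
```

Choosing a different class mark as the origin changes the codes d and the intermediate sums, but not the resulting mean or variance.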