STATISTICS AND OPTIMIZATION Dr. Asawer A. Alwasiti Contenet Chapter One: introduction fundamental elements of statistics, types of data, Methods of describing data, measures of central tendency, measures of variation measures of relative standing Chapter Two: Probability discreet of random variable: the probability distribution, the binomial probability distribution, The hypogeometric probability distribution, Poison distribution. Continuous random distribution: the continuous random variable, uniform probability distribution, normal probability distribution Chapter Three: Testing of Hypothesis Sampling distributions – Testing of hypothesis for mean, variance, proportions and differences using Normal, t, Chisquare and F distributions - Tests for independence of attributes and Goodness of fit. Chapter Four: Simple linear regression Regression model, methods of least squares, the coefficient of correlation, The coeffiecnt of determination Chapter Five DESIGN OF EXPERIMENTS Experimental design terminology, factorial design Completely randomized design – Randomized block design Chapter One Introduction Statistics: is concerned with scientific methods for collecting, organizing, summarizing, presenting and analyzing data as well as drawing valid conclusions and making reasonable decisions. Types of data: Quantitative data : are those that represent the quantity or amount of something, measured on a numerical scales. For example; the power frequency Qualitative data: it’s the data that can only classified i.e. posses no numerical representation Population: refers to all the persons, objects, source or measurements under consideration, or it is a data set that is our target of interest. Sample: refers to any portion of the population Descriptive Statistics: used to organize, summarize and describe measures of sample. It uses numbers to summarize information which is known about some situation. Inductive (inference) statistics: are used to predict population parameters from sample measures. Variables: is a symbol such as X, Y ,H…. which can assume any of the prescribed set of values. It contains qualitative and quantitative variables Continuous variable: can theoretically assume any value between two given values depending on accuracy of measurements Discrete variable: all data can be obtained from counting Parameter: the measures which describe population characteristics. Example: The reliability of computer system is measured in terms of life length of a specific hardware component (e.g hard disk life). To estimate the reliability of a particular system , 100 computer component are tested until they fail, under their life length are recorded. What is the population of interest? What is the sample? Are the data are qualitative or quantitative? How could the sample information be used to estimate the reliability of the computer system? Graphical method of representation Qualitative Data They are usually achieved using Bar graph or Pie chart Bar graph: the category (class) of the qualitative variable is represented by Bar graph in which the height of each bar is either the class frequency, class relative frequency or class percentage. Pie chart: the category (class) of the quantitative variable is represented by Pie chart. The size of each slice is proportional to the class relative frequency. Pareto diagram: a bar graph with the category (class) of the qualitative variable arranged by height in descending order from left to right. Example: Group of researchers investigating the safety of nuclear power reactors and the hazard of using energy, they discovered 45 energy related accident worldwide since1977 that resulted in multi factories as: category frequency Coal mine collapse 7 Dam frailer 4 Gas explosion 28 lightning 1 Nuclear reactor 1 Oil fire 4 total 45 Chart of Causes 30 28 26 24 22 Frequency 20 18 16 14 12 10 8 6 4 2 0 Coal mine collapse Gas explosion Nuclear reactor Dam frailer lightning Causes Oil fire Pie Chart of Causes Oil fire Nuclear reactor lightning Coal mine collapse Dam frailer Gas explosion Quantitative Data It can be represented in graphical or numerical way Graphical representation Quantitative Data can be represented graphically by Histogram Frequency distribution Raw data: are collected data which have been collected numerically Array: arranged of raw data in ascending or descending order. Range: the difference between the largest and smallest value Frequency distribution: a table arrangement of data by classes together with the corresponding class frequencies. Class interval: A symbol defining the class. Class mark: is the mid point of the class interval Formation of frequency distribution: Determine the largest and smallest observation Take total width = range + 1 unit in the last significant digit Dived total width in 5-20 class of equal width Calculate class width, interval and class mark Calculate frequencies Histogram Graphical representation of frequency distribution consist of a set of rectangular having: Basis with centers at class marks and lengths equal to the class width Area proportional to class frequencies Frequency polygon Formed by connecting the mid points of the tops of the rectangular in the histogram Relative frequency Is the frequency of the class divided by the total frequency and expressed as a percentage Example The pH level of drilling mud of well that determined within 24 hr is shown in table below, make the frequency distribution table and graph the data 7.25 7.37 7.33 7.38 7.3 7.35 7.34 7.38 7.41 7.39 7.26 7.3 7.39 7.31 7.39 7.34 7.34 7.32 7.42 7.43 7.36 7.35 7.34 7.32 7.24 7.35 7.37 7.35 7.45 7.46 7.36 7.26 7.39 7.35 7.33 7.3 7.34 7.39 7.4 7.4 7.34 7.34 7.39 7.3 7.37 7.25 7.33 7.33 7.41 7.4 7.3 7.29 7.28 7.29 7.32 7.36 7.32 7.38 7.43 7.45 Example The viscosity of 40 sample of drilling mud measured in cp is shown below. 50.2them in frequency 49.3 table and 49.9with histogram. 50.1 50.5 Represent 51.1 49.7 50.3 49.9 51.4 49.8 49.6 49.5 49.8 50.7 50.2 50.4 50 50.7 48.6 48.9 50 50 50.3 49.4 49.9 48.6 50 49.4 50.6 50 49.9 50.6 50.8 49 49.5 51.3 50.8 50.2 50.3