Module 1
EMath 2 - Engineering Data Analysis
The Role of Statistics in Engineering

OUTLINE
1-1. The Engineering Method and Statistical Thinking
1-2. Collecting Engineering Data
1-3. Mechanistic and Empirical Models
1-4. Probability and Probability Models

Learning Goals
At the end of the module, you are expected to be able to do the following:
1. Explain the engineering problem solving process.
2. Describe the application of statistics in the engineering problem solving process.
3. Distinguish between enumerative and analytic studies.
4. Explain three methods of data collection: retrospective study, observational study, and designed experiment.
5. Explain the difference between mechanistic and empirical models.
6. Describe the application of probability in the engineering problem solving process.

1-1 The Engineering Method and Statistical Thinking

Engineering Problem Solving Process
Engineers develop practical solutions and techniques for engineering problems by applying scientific principles and methodologies. Through the engineering approach, existing systems are improved and/or new systems are introduced to achieve better harmony between humans, systems, and environments.

In general, the engineering problem solving process includes the following steps:
1. Problem definition: Describe the problem to solve.
2. Factor identification: Identify the primary factors that cause the problem.
3. Model (hypothesis) suggestion: Propose a model (hypothesis) that explains the relationship between the problem and the factors.
4. Experiment: Design and run an experiment to test the tentative model.
5. Analysis: Analyze the data collected in the experiment.
6. Model modification: Refine the tentative model.
7. Model validation: Validate the engineering model with a follow-up experiment.
8. Conclusion (recommendation): Draw conclusions or make recommendations based on the analysis results.
(Note) Some of these steps may be iterated as necessary.
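
(Example) Steps 3 to 7 of the process above can be sketched in Python as follows. This is only an illustrative sketch, not part of the module's source material: the function names run_experiment and fit_line are hypothetical, the "experiment" is a simulated process with arbitrarily chosen constants (2.0, 0.8, and a noise standard deviation of 0.5), and the tentative model is assumed to be a straight line fitted by least squares.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable


def run_experiment(x_levels, noise_sd=0.5):
    """Stand-in for steps 4 and 7: a simulated (hypothetical) process, not real data."""
    return [2.0 + 0.8 * x + random.gauss(0.0, noise_sd) for x in x_levels]


def fit_line(x, y):
    """Step 5: least-squares estimates (b0, b1) for the tentative model y = b0 + b1*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


# Step 4: run the first experiment at chosen factor levels.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
y1 = run_experiment(x1)

# Steps 5-6: analyze the data and refine the tentative model.
b0, b1 = fit_line(x1, y1)

# Step 7: follow-up experiment; compare new observations with the model's predictions.
x2 = [2.5, 4.5, 6.5]
y2 = run_experiment(x2)
rmse = mean((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x2, y2)) ** 0.5

print(f"b0={b0:.2f}, b1={b1:.2f}, validation RMSE={rmse:.2f}")
```

If the validation error in the follow-up experiment were large relative to the observed variability, the process would return to step 6 (or earlier) rather than proceed to step 8.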

Application of Statistics
Statistical methods are applied to interpret data with variability. Throughout the engineering problem solving process, engineers often encounter data showing variability, and statistics provides essential tools to deal with it. Examples of the application of statistics in the engineering problem solving process include, but are not limited to, the following:
1. Summarizing and presenting data: numerical summary and visualization in descriptive statistics.
2. Inferring the characteristics (mean, median, proportion, and variance) of one or two populations: z, t, χ², and F tests in parametric statistics; sign, signed-rank, and rank-sum tests in nonparametric statistics.
3. Testing the relationship between variables: correlation analysis; categorical data analysis.
4. Modeling the causal relationship between the response and independent variables: regression analysis; analysis of variance.
5. Identifying the sources of variability in the response: analysis of variance.
6. Evaluating the relative importance of factors for the response variable: regression analysis; analysis of variance.
7. Designing an efficient, effective experiment: design of experiments.
These applications can lead to the development of general laws and principles, such as Ohm's law, and of design guidelines.

Enumerative vs. Analytic Studies
Two types of studies are defined depending on the use of a sample in statistical inference:
1. Enumerative study: Makes an inference about the well-defined population from which the sample is selected.
(e.g.) defective rate of products in a lot
2. Analytic study: Makes an inference about a future (conceptual) population.
(e.g.) defective rate of products at a production line

1-2 Collecting Engineering Data

Data Collection Methods
Three methods are available for data collection:
1. Retrospective study: Use existing records of the population. Some crucial information may be unavailable, and the validity of the data may be questioned.
2. Observational study: Collect data by observing the population with as little interference as possible. Information about the population under some conditions of interest may be unavailable, and some observations may be contaminated by extraneous variables.
3. Designed experiment: Collect data by observing the population while controlling conditions according to the experiment plan. The findings gain scientific rigor through deliberate control of extraneous variables.

1-3 Mechanistic and Empirical Models

Mechanistic vs. Empirical Models
Models (explaining the relationship between variables) can be divided into two categories:
1. Mechanistic model: Established based on the underlying theory, principle, or law of a physical mechanism.
(e.g.) I = E/R + ε
where: I = current, E = voltage, R = resistance, and ε = random error
2. Empirical model: Established based on the experience, observation, or experiment of a system (population) under study.
(e.g.) y = β0 + β1x + ε
where: y = sitting height, x = stature, and ε = random error

1-4 Probability and Probability Models

Application of Probability
Along with statistics, the concepts and models of probability are applied in the engineering problem solving process for the following:
1. Modeling the stochastic behavior of the system: discrete and continuous probability distributions.
2. Quantifying the risks involved in statistical inference: error probabilities in hypothesis testing.
3. Determining the sample size of an experiment for a designated test condition: sample size selection.

Reference:
Montgomery, D. C. and Runger, G. C., 2018, Applied Statistics and Probability for Engineers, 7th Edition, Hoboken, NJ: John Wiley & Sons, Inc.
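
(Example) Items 2 and 3 under Application of Probability in Section 1-4 can be illustrated with a short Python sketch. It assumes a two-sided z-test on a single mean with known standard deviation and uses the common normal-approximation formula n ≈ ((z_{α/2} + z_β)·σ/δ)², where α and β are the type I and type II error probabilities and δ is the shift in the mean to be detected. The function name and the numerical inputs (δ = 1.0, σ = 2.0) are hypothetical and chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_sided_z(delta, sigma, alpha=0.05, beta=0.10):
    """Approximate sample size for a two-sided z-test on one mean.

    delta: smallest shift in the mean worth detecting (assumed value)
    sigma: known population standard deviation (assumed value)
    alpha: probability of a type I error (rejecting a true null hypothesis)
    beta:  probability of a type II error when the true shift equals delta
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    z_beta = NormalDist().inv_cdf(1 - beta)        # upper beta point of N(0, 1)
    # Round up so that the stated error probabilities are not exceeded.
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical test condition: detect a shift of 1.0 when sigma is 2.0.
n = sample_size_two_sided_z(delta=1.0, sigma=2.0)
print(n)
```

Tightening either error probability, or asking to detect a smaller shift δ, increases the required sample size, which is how the designated test condition drives sample size selection.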