ENGINEERING DATA ANALYSIS

STATISTICS
The word is derived from “state” and was originally used to refer to a collection of facts of interest to the state.
The science that deals with the systematic method of collecting, classifying, presenting, analyzing, and interpreting qualitative and numerical data.
The art of learning from data. It is concerned with the collection of data, their subsequent description, and their analysis, which often leads to the drawing of conclusions.

POPULATION: The collection of all individuals or items under consideration in a statistical study.
SAMPLE: That part of the population from which information is obtained.
DATA: A set of observations; a set of possible outcomes.
PARAMETER: A number that is used to represent a population characteristic and that generally cannot be determined easily.
STATISTIC: A numerical characteristic of the sample; a statistic estimates the corresponding population parameter.

DESCRIPTIVE STATISTICS
This refers to the methods of summarizing and presenting data in a form that makes them easier to analyze and interpret. It characterizes the distribution of a set of observations on a specific variable or variables. It includes the construction of graphs, charts, and tables and the calculation of various descriptive measures such as averages, measures of variation, and percentiles.

INFERENTIAL STATISTICS
Consists of methods for drawing, and measuring the reliability of, conclusions about a population based on information obtained from a sample of the population. This refers to the drawing of valid conclusions or inferences about a population based on a representative sample systematically taken from the same population.
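To make the distinction between a parameter and a statistic concrete, here is a minimal Python sketch. The population of shaft diameters, the seed, and the sample size of 50 are all made-up assumptions for illustration; they do not come from the notes.

```python
import random
import statistics

# Hypothetical population: diameters (mm) of 10,000 machined shafts.
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(25.0, 0.2) for _ in range(10_000)]

# PARAMETER: a characteristic computed from every item in the population.
population_mean = statistics.fmean(population)

# SAMPLE: the part of the population from which information is obtained.
sample = rng.sample(population, 50)

# STATISTIC: a characteristic of the sample; it estimates the parameter.
sample_mean = statistics.fmean(sample)

print(f"parameter (population mean): {population_mean:.3f}")
print(f"statistic (sample mean):     {sample_mean:.3f}")
```

In practice the population mean is usually unavailable, which is why the sample mean is used as its estimate. Here both can be computed only because the population is simulated.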
This study source was downloaded by 100000858453199 from CourseHero.com on 12-13-2022 10:59:53 GMT -06:00
https://www.coursehero.com/file/45825333/REVIEWER-FINAL-Engineering-Data-Analysisdocx/

If the purpose of the study is to examine and explore information for its own intrinsic interest only, the study is descriptive. If the information is obtained from a sample of a population and the purpose of the study is to use that information to draw conclusions about the population, the study is inferential. Thus, a descriptive study may be performed either on a sample or on a population. Only when an inference is made about the population, based on information obtained from the sample, does the study become inferential.

VARIABLES
Any characteristic, number, or quantity that can be measured or counted. A variable may also be called a data item. It is called a variable because the value may vary between data units in a population and may change in value over time.

TYPES OF VARIABLES
NUMERIC VARIABLES have values that describe a measurable quantity as a number, like 'how many' or 'how much'. Therefore numeric variables are quantitative variables.
CATEGORICAL VARIABLES have values that describe a 'quality' or 'characteristic' of a data unit, like 'what type' or 'which category'. They fall into mutually exclusive (in one category or in another) and exhaustive (include all possible options) categories. Categorical variables are qualitative variables and tend to be represented by a non-numeric value.

NUMERIC/QUANTITATIVE VARIABLES
CONTINUOUS VARIABLE: Observations can take any value between a certain set of real numbers. The value given to an observation for a continuous variable can include values as small as the instrument of measurement allows.
DISCRETE VARIABLE: Observations can take a value based on a count from a set of distinct whole values. A discrete variable cannot take the value of a fraction between one value and the next closest value.
CATEGORICAL/QUALITATIVE VARIABLES
ORDINAL VARIABLE: Observations can take a value that can be logically ordered or ranked. The categories associated with ordinal variables can be ranked higher or lower than one another, but do not necessarily establish a numeric difference between each category.
NOMINAL VARIABLE: Observations can take a value that cannot be organized in a logical sequence.

FREQUENCY DISTRIBUTION
It is a tabular arrangement of data showing its classification or grouping according to magnitude or size.
Class interval – the grouping defined by a lower limit and an upper limit
Class frequency – the number of observations belonging to a class interval
Class mark – the midpoint or middle value of the class interval
Class boundary – the more precise expression of the class limits, also called the true limits
Class size – the width of each class interval

GRAPHS
Data can be summarized or presented in two ways:
1. Tabular
2. Charts/graphs

BAR CHART: Used to display the frequency distribution in graphical form.
PIE CHART: Used to display the frequency distribution. It displays the ratio of the observations.
LINE CHART: Used to display the trend of observations. It is a very popular display for data that represent time.
HISTOGRAM: Looks like the bar chart except that the horizontal axis represents data that are quantitative in nature. There is no gap between the bars.
FREQUENCY POLYGON: Looks like the line chart except that the horizontal axis represents the class mark of the data, which is quantitative in nature.
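The class marks that a frequency polygon plots, and the bars of a histogram, both come from a frequency distribution. As a rough sketch of how the class terms above fit together, the Python below builds one from a made-up set of 20 whole-number scores. The data, the class size of 10, the first lower limit of 50, and the helper name `frequency_table` are assumptions chosen for illustration, and the half-unit class boundaries assume data recorded to the nearest whole number.

```python
# Hypothetical data: 20 quiz scores recorded as whole numbers.
scores = [52, 55, 58, 61, 63, 64, 66, 67, 68, 70,
          71, 72, 74, 75, 77, 79, 81, 84, 88, 93]

class_size = 10         # width of each class interval (assumed)
first_lower_limit = 50  # lower limit of the first class (assumed)
num_classes = 5         # enough classes to cover 50 through 99


def frequency_table(data, lower, size, k):
    """Return one dict per class interval (hypothetical helper)."""
    rows = []
    for i in range(k):
        lo = lower + i * size  # lower class limit
        hi = lo + size - 1     # upper class limit for whole-number data
        rows.append({
            "limits": (lo, hi),
            "boundaries": (lo - 0.5, hi + 0.5),  # true limits
            "class_mark": (lo + hi) / 2,         # midpoint of the interval
            "frequency": sum(lo <= x <= hi for x in data),
        })
    return rows


table = frequency_table(scores, first_lower_limit, class_size, num_classes)

print("limits   boundaries    mark  freq")
for row in table:
    lo, hi = row["limits"]
    b_lo, b_hi = row["boundaries"]
    print(f"{lo}-{hi}    {b_lo}-{b_hi}   {row['class_mark']}  {row['frequency']}")
```

Note that the class size equals the difference between the two class boundaries of an interval (59.5 - 49.5 = 10), not the difference between its limits.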
OGIVE: A line graph in which the horizontal axis represents the upper limit of each class interval and the vertical axis represents the cumulative frequencies.

MEASURES OF CENTRAL TENDENCY
Offer us a “point” estimate, or single number, which we can use as a summary of a distribution of scores: one number that represents or characterizes the entire distribution (as best as one number can). Keep in mind that the center point of the scores in a distribution may not be in the middle of the scale of those scores.

MODE: The most frequently occurring score in a distribution.
MEDIAN: The “middle” score of a distribution. The point that lies in the middle of a distribution.
MEAN: The arithmetic average of the scores of a distribution. The sum of the observations divided by the number of observations in a data set.
TRIMMED MEAN simply refers to a mean calculated after “trimming” a certain percentage of extreme scores.
MEDIAN is an extreme example of a trimmed mean; the median trims all but the middle score or middle two scores.
M-ESTIMATORS are weighted means: scores near the middle are given more weight and scores at the extremes are given less weight.

VARIATION: A way to show how data are dispersed, or spread out.

MEASURES OF SPREAD
A proper description of a set of data should include both of these characteristics: its central tendency and its spread.

RANGE
The difference between the largest and the smallest sample values. It depends only on extreme values and provides no information about how the remaining data are distributed.
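The measures of central tendency and the range described above can be sketched in Python as follows. The data set is made up, with one deliberately extreme value, and `trimmed_mean` is a hypothetical helper. Cutting the same proportion from each end of the sorted data is one common trimming convention and is assumed here, since the notes do not specify one.

```python
import statistics

# Hypothetical sample: nine measurements with one extreme value (40).
data = [4, 5, 5, 6, 7, 7, 7, 9, 40]


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the scores from EACH end (assumed convention)."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of scores cut per tail
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.fmean(kept)


mode = statistics.mode(data)        # most frequently occurring score
median = statistics.median(data)    # middle score of the sorted data
mean = statistics.fmean(data)       # sum of observations / number of observations
trimmed = trimmed_mean(data, 0.2)   # drops one score from each tail here
data_range = max(data) - min(data)  # largest value minus smallest value

print(f"mode={mode}, median={median}, mean={mean}")
print(f"20% trimmed mean={trimmed:.3f}, range={data_range}")
```

With this data the single extreme score pulls the mean up to 10.0 while the median stays at 7, and the trimmed mean (about 6.571) lands close to the median. The range of 36 is determined entirely by the values 4 and 40.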
MEAN ABSOLUTE DEVIATION
All items in the distribution must be taken into account to determine the amount by which each item value varies from the mean of the distribution. It is the average distance between each data value and the mean.

VARIANCE
It is the average of the squared deviations from the distribution’s mean. If all values are identical, the variance is zero; the greater the dispersion of values, the greater the variance. It gives a general idea of the spread of your data.

STANDARD DEVIATION
Indicates how far, on average, the observations in the sample are from the mean of the sample.

FILL IN THE BLANKS

____________________ derived from the word “state”, was used to refer to a collection of facts of interest to the state.
____________________ the science that deals with the systematic method of collecting, classifying, presenting, analyzing and interpreting qualitative and numerical data.
____________________ The art of learning from data.
____________________ It is concerned with the collection of data, their subsequent description, and their analysis, which often leads to the drawing of conclusions.
____________________ The collection of all individuals or items under consideration in a statistical study.
____________________ That part of the population from which information is obtained.
____________________ a set of observations
____________________ a set of possible outcomes
____________________ This refers to the methods of summarizing and presenting data in the form which will make them easier to analyze and interpret.
____________________ It characterizes the distribution of a set of observations on a specific variable or variables.
____________________ Includes the construction of graphs, charts, and tables and the calculation of various descriptive measures such as averages, measures of variation, and percentiles.
____________________ Consists of methods for drawing and measuring the reliability of conclusions about a population based on information obtained from a sample of the population.
____________________ This refers to the drawing of valid conclusions or inferences about a population based on a representative sample systematically taken from the same population.
____________________ any characteristic, number, or quantity that can be measured or counted.
____________________ A _________ may also be called a data item.
____________________ It is called a _______ because the value may vary between data units in a population, and may change in value over time.
____________________ have values that describe a measurable quantity as a number, like 'how many' or 'how much'. Therefore numeric variables are quantitative variables.
____________________ have values that describe a 'quality' or 'characteristic' of a data unit, like 'what type' or 'which category'.
____________________ fall into mutually exclusive (in one category or in another) and exhaustive (include all possible options) categories.
____________________ _______________ are qualitative variables and tend to be represented by a non-numeric value.
____________________ Observations can take any value between a certain set of real numbers.
____________________ The value given to an observation for a _______________ can include values as small as the instrument of measurement allows.
____________________ Observations can take a value based on a count from a set of distinct whole values.
____________________ A ____________ cannot take the value of a fraction between one value and the next closest value.
____________________ Observations can take a value that can be logically ordered or ranked.
____________________ The categories associated with ordinal variables can be ranked higher or lower than one another, but do not necessarily establish a numeric difference between each category.
____________________ Observations can take a value that cannot be organized in a logical sequence.
____________________ Data can be summarized or presented in two ways: tabular and charts/graphs.
____________________ used to display the frequency distribution in graphical form.
____________________ used to display the frequency distribution.
____________________ It displays the ratio of the observations.
____________________ used to display the trend of observations.
____________________ It is a very popular display for data that represent time.
____________________ Looks like the bar chart except that the horizontal axis represents data that are quantitative in nature.
____________________ There is no gap between the bars.
____________________ looks like the line chart except that the horizontal axis represents the class mark of the data, which is quantitative in nature.
____________________ line graph in which the horizontal axis represents the upper limit of each class interval and the vertical axis represents the cumulative frequencies.
____________________ Offer us a “point” estimate, or single number, which we can use as a summary of a distribution of scores.
____________________ One number which represents or characterizes the entire distribution (as best as one number can).
____________________ Keep in mind that the center point of the scores in a distribution may not be in the middle of the scale of those scores.
____________________ The most frequently occurring score in a distribution.
____________________ The “middle” score of a distribution.
____________________ The point that lies in the middle of a distribution.
____________________ The arithmetic average of the scores of a distribution.
____________________ The sum of the observations divided by the number of observations in a data set.
____________________ simply refers to a mean calculated after “trimming” a certain percentage of extreme scores.
____________________ is an extreme example of a trimmed mean; the median trims all but the middle score or middle two scores.
____________________ are weighted means: scores near the middle are given more weight and scores at the extremes are given less weight.
____________________ The difference between the largest and the smallest sample values.
____________________ It depends only on extreme values and provides no information about how the remaining data are distributed.
____________________ All items in the distribution must be taken into account to determine the amount by which each item value varies from the mean of the distribution.
____________________ the average distance between each data value and the mean.
____________________ It is the average of the squared deviations from the distribution’s mean. If all values are identical, the variance is zero; the greater the dispersion of values, the greater the variance.
____________________ general idea of the spread of your data.
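As a worked illustration of the measures of spread defined earlier in these notes (mean absolute deviation, variance, and standard deviation), here is a minimal Python sketch on a made-up data set. It follows the notes' wording and divides by the number of observations n. The n - 1 divisor shown at the end is the usual sample variance, included only for comparison.

```python
import math
import statistics

# Hypothetical data set, chosen so the results come out as round numbers.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = statistics.fmean(data)  # 40 / 8 = 5.0

# Mean absolute deviation: average distance between each value and the mean.
mad = sum(abs(x - mean) for x in data) / n

# Variance: average of the squared deviations from the mean (divisor n).
variance = sum((x - mean) ** 2 for x in data) / n

# Standard deviation: square root of the variance, in the units of the data.
std_dev = math.sqrt(variance)

# For comparison: the sample variance divides by n - 1 instead of n.
sample_variance = statistics.variance(data)

print(f"mean={mean}, MAD={mad}, variance={variance}, std dev={std_dev}")
print(f"sample variance (n - 1 divisor)={sample_variance:.3f}")
```

For this data the absolute deviations sum to 12 and the squared deviations sum to 32, giving a mean absolute deviation of 1.5, a variance of 4.0, and a standard deviation of 2.0.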