2-6 Multivariate Data

2-1 Data Summary and Display

Engineering Example

Suppose that an engineer is developing a rubber compound for use in O-rings. The O-rings are to be employed as seals in plasma etching tools used in the semiconductor industry, so their resistance to acids and other corrosive substances is an important characteristic. The engineer uses the standard rubber compound to produce eight O-rings in a development laboratory and measures the tensile strength of each specimen after immersion in a nitric acid solution at 30 °C for 25 minutes [refer to the American Society for Testing and Materials (ASTM) Standard D 1414 and the associated standards for many interesting aspects of testing rubber O-rings]. The tensile strengths (in psi) of the eight O-rings are 1030, 1035, 1020, 1049, 1028, 1026, 1019, and 1010. The engineer then considers a modified formulation of the rubber in which a Teflon additive is included. Eight O-ring specimens are made from this modified rubber compound and subjected to the nitric acid immersion test described earlier. The tensile test results are 1037, 1047, 1066, 1048, 1059, 1073, 1070, and 1040.

Population Mean

For a finite population with N measurements, the mean is

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

The sample mean $\bar{x}$ is a reasonable estimate of the population mean $\mu$.
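As a small illustration, the sample mean (the sum of the observations divided by their count) can be computed for the two O-ring samples from the engineering example. The function name `sample_mean` is chosen here for illustration and does not come from the slides.

```python
def sample_mean(values):
    """Arithmetic average: sum of the observations divided by their count."""
    return sum(values) / len(values)


# Tensile strengths (psi) from the engineering example.
standard = [1030, 1035, 1020, 1049, 1028, 1026, 1019, 1010]
modified = [1037, 1047, 1066, 1048, 1059, 1073, 1070, 1040]

print(sample_mean(standard))  # 1027.125
print(sample_mean(modified))  # 1055.0
```

On these data the modified compound averages about 28 psi higher than the standard one. The averages alone do not show how much the individual specimens vary.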

Sample Variance and Sample Standard Deviation

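For a sample of n observations $x_1, x_2, \ldots, x_n$ with sample mean $\bar{x}$, the sample variance is

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

The sample standard deviation is the positive square root of the sample variance:

$$s = \sqrt{s^2}$$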
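A minimal sketch of the usual definitions follows: the sum of squared deviations from the sample mean is divided by n - 1, and the square root gives the standard deviation. It uses the modified-compound data from the engineering example. The function names are illustrative only.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Positive square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Tensile strengths (psi) of the modified compound.
modified = [1037, 1047, 1066, 1048, 1059, 1073, 1070, 1040]

print(sample_variance(modified))  # 1348 / 7, about 192.57 psi^2
print(sample_std(modified))       # about 13.88 psi
```

The standard deviation is in the same units as the data (psi), which makes it easier to interpret than the variance.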

Computational formula for $s^2$

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$$
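The shortcut form is assumed here to be the usual one, which needs only the sum of the observations and the sum of their squares. The sketch below checks it against the definitional form on the modified-compound data. Both function names are illustrative.

```python
def variance_by_definition(values):
    """Squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def variance_by_shortcut(values):
    """Shortcut form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


modified = [1037, 1047, 1066, 1048, 1059, 1073, 1070, 1040]

# Both forms give 1348 / 7 for these data.
print(variance_by_definition(modified))
print(variance_by_shortcut(modified))
```

The shortcut subtracts two large, nearly equal numbers. In floating-point arithmetic it can therefore lose precision when the mean is large relative to the spread, so the definitional form is often the safer choice in code.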

Population Variance

When the population is finite and consists of N values, we may define the population variance as

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

The sample variance $s^2$ is a reasonable estimate of the population variance $\sigma^2$.

2-2 Stem-and-Leaf Diagram

Steps for Constructing a Stem-and-Leaf Diagram
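A minimal sketch of one common construction is given below. It assumes integer data, with the last digit taken as the leaf and the leading digits as the stem. The function name `stem_and_leaf` and the choice of a one-digit leaf are assumptions made for this illustration.

```python
def stem_and_leaf(values):
    """Group integer observations by stem (leading digits), collecting
    the final digit of each observation as its leaf."""
    stems = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        stems.setdefault(stem, []).append(leaf)
    # Keep every stem between the smallest and largest, even if empty,
    # so that gaps in the data remain visible.
    low, high = min(stems), max(stems)
    return {s: stems.get(s, []) for s in range(low, high + 1)}


# Tensile strengths (psi) of the modified compound from the engineering example.
modified = [1037, 1047, 1066, 1048, 1059, 1073, 1070, 1040]

for stem, leaves in stem_and_leaf(modified).items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
# 103 | 7
# 104 | 078
# 105 | 9
# 106 | 6
# 107 | 03
```

Because the leaves are sorted within each stem, the ordered observations can be read directly off the display, which is convenient for locating the median and quartiles.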

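Q1 = average of the 20th and 21st ordered observations = (143 + 145)/2 = 144

Q2 = average of the 40th and 41st ordered observations = 161.5

Q3 = average of the 60th and 61st ordered observations = 181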
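The first quartile is quoted as 144 here and as 143.5 in the box plot section. One possible explanation, offered as an assumption rather than something the slides state, is that the two values come from different quartile conventions. Averaging the 20th and 21st ordered observations gives the midpoint. Interpolating at position (n + 1)/4 = 20.25, with n = 80 inferred from the median positions, gives a point one quarter of the way between them. The helper `between` is hypothetical.

```python
def between(lower, upper, fraction):
    """Value located a given fraction of the way from lower to upper."""
    return lower + fraction * (upper - lower)


# 20th and 21st ordered observations, as quoted in the slides.
x20, x21 = 143, 145

q1_midpoint = between(x20, x21, 0.5)       # simple average of the two
q1_interpolated = between(x20, x21, 0.25)  # position 20.25, an assumed convention

print(q1_midpoint)      # 144.0
print(q1_interpolated)  # 143.5
```

Statistical software packages differ in which convention they use, so small discrepancies of this kind are common and rarely affect the conclusions.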

2-3 Histograms

A histogram is a more compact summary of data than a stem-and-leaf diagram. To construct a histogram for continuous data, we must divide the range of the data into intervals, which are usually called class intervals, cells, or bins. If possible, the bins should be of equal width to enhance the visual information in the histogram.
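As a rough sketch of equal-width binning, the code below counts how many observations fall in each bin of the form [start + k*width, start + (k+1)*width). The bin start (1010 psi), width (10 psi), and number of bins (4) are choices made for this illustration, not values from the slides. The data are the standard-compound tensile strengths from the engineering example.

```python
def histogram_counts(values, start, width, num_bins):
    """Count observations per equal-width bin; each bin includes its
    lower edge and excludes its upper edge."""
    counts = [0] * num_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < num_bins:
            counts[index] += 1
    return counts


# Tensile strengths (psi) of the standard compound.
standard = [1030, 1035, 1020, 1049, 1028, 1026, 1019, 1010]

counts = histogram_counts(standard, start=1010, width=10, num_bins=4)
for k, count in enumerate(counts):
    low = 1010 + 10 * k
    print(f"[{low}, {low + 10}): {'*' * count}")
# [1010, 1020): **
# [1020, 1030): ***
# [1030, 1040): **
# [1040, 1050): *
```

With only eight observations the shape is not very informative. Histograms are more useful for larger samples, where the choice of bin width has a visible effect on the display.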

An important variation of the histogram is the Pareto chart. This chart is widely used in quality and process improvement studies where the data usually represent different types of defects, failure modes, or other categories of interest to the analyst. The categories are ordered so that the category with the highest frequency is on the left, followed by the category with the second highest frequency, and so forth.
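The ordering rule can be sketched as a sort by descending frequency, with a running cumulative percentage alongside. The cumulative percentage is a common companion to Pareto charts and is added here as an assumption. The defect categories and counts below are hypothetical, invented purely for illustration.

```python
def pareto_table(counts):
    """Return (category, count, cumulative percent) rows, ordered from
    the most frequent category to the least frequent."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for category, count in ordered:
        running += count
        rows.append((category, count, 100 * running / total))
    return rows


# Hypothetical defect counts, not taken from the slides.
defects = {"scratches": 12, "dents": 30, "misalignment": 5, "porosity": 3}

for category, count, cumulative in pareto_table(defects):
    print(f"{category:<14}{count:>4}{cumulative:>8.1f}%")
```

Reading down the table, the first one or two categories usually account for most of the total, which is what makes the chart useful for prioritizing improvement work.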

2-4 Box Plots

• The box plot is a graphical display that simultaneously describes several important features of a data set, such as center, spread, departure from symmetry, and identification of observations that lie unusually far from the bulk of the data.

[Figure: annotated box plot with the whiskers, outliers, and extreme outliers labeled]

2nd quartile = median = 161.5

1st quartile = 143.5

3rd quartile = 181

IQR = Q3 - Q1 = 181 - 143.5 = 37.5

1.5 IQR = 56.25

Q1 - 1.5 IQR = 143.5 - 56.25 = 87.25

Q3 + 1.5 IQR = 181 + 56.25 = 237.25
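The arithmetic above can be reproduced with a short sketch. The 1.5 IQR limits are the ones computed in the slides. The 3 IQR limits used to separate outliers from extreme outliers follow a common textbook convention and are an assumption here, as are the function names.

```python
def box_plot_fences(q1, q3):
    """Interquartile range and the limits used to flag unusual points."""
    iqr = q3 - q1
    return {
        "iqr": iqr,
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
        "lower_extreme": q1 - 3.0 * iqr,  # assumed convention
        "upper_extreme": q3 + 3.0 * iqr,  # assumed convention
    }


def classify(x, fences):
    """Label a point relative to the fences (3 IQR rule is an assumption)."""
    if x < fences["lower_extreme"] or x > fences["upper_extreme"]:
        return "extreme outlier"
    if x < fences["lower_fence"] or x > fences["upper_fence"]:
        return "outlier"
    return "inside"


fences = box_plot_fences(q1=143.5, q3=181)
print(fences["iqr"])          # 37.5
print(fences["lower_fence"])  # 87.25
print(fences["upper_fence"])  # 237.25
print(classify(250, fences))  # outlier
```

Each whisker is normally drawn to the most extreme observation that still lies inside the 1.5 IQR fences, not to the fence value itself.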

2-5 Time Series Plots

• A time series or time sequence is a data set in which the observations are recorded in the order in which they occur.

• A time series plot is a graph in which the vertical axis denotes the observed value of the variable (say x) and the horizontal axis denotes the time (which could be minutes, days, years, etc.).

• When measurements are plotted as a time series, we often see

• trends,

• cycles, or

• other broad features of the data

2-6 Multivariate Data

• The dot diagram, stem-and-leaf diagram, histogram, and box plot are descriptive displays for univariate data; that is, they convey descriptive information about a single variable.

• Many engineering problems involve collecting and analyzing multivariate data, or data on several different variables.

• In engineering studies involving multivariate data, often the objective is to determine the relationships among the variables or to build an empirical model.

Sample Correlation Coefficient

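The sample correlation coefficient r measures the strength of a linear relationship between two variables x and y:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$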

As a general guideline, the correlation is strong when 0.8 ≤ |r| ≤ 1, weak when 0 ≤ |r| ≤ 0.5, and moderate otherwise.
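A minimal sketch of the standard sample correlation coefficient follows, together with the strength guideline. The guideline is applied to the magnitude of r, an assumption made so that negative correlations are covered. The paired data and the function names are hypothetical.

```python
import math


def sample_correlation(xs, ys):
    """Sample correlation coefficient r for paired observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with n >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # Undefined (division by zero) if either variable is constant.
    return s_xy / math.sqrt(s_xx * s_yy)


def describe_strength(r):
    """Apply the strong / moderate / weak guideline to |r|."""
    magnitude = abs(r)
    if magnitude >= 0.8:
        return "strong"
    if magnitude <= 0.5:
        return "weak"
    return "moderate"


# Hypothetical paired measurements, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = sample_correlation(xs, ys)
print(round(r, 3), describe_strength(r))
```

A value of r near zero indicates only that there is little linear association. The variables may still be related in a nonlinear way, so a scatter diagram is worth inspecting alongside r.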
