Scatter Diagram - Zoe-s-wiki

Scatter Diagram
The Scatter Diagram is a tool for determining the potential correlation between two variables, i.e.,
how one variable changes with the other variable. The diagram simply plots pairs of corresponding
data from the two variables, which are usually two variables in a process being studied.
The scatter diagram does not determine the exact relationship between the two variables, but it does
indicate whether they are correlated or not. By itself, it also does not establish cause-and-effect
relationships between these variables.
The scatter diagram is used to: 1) quickly confirm a hypothesis that two variables are correlated; 2)
provide a graphical representation of the strength of the relationship between two variables; and 3)
serve as a follow-up step to a cause-effect analysis to establish whether a change in an identified
cause can indeed produce a change in its identified effect.
To make a scatter diagram for two variables requiring confirmation of correlation, the following
simple steps are usually followed:
1) collect 50-100 pairs of data for the two variables and tabulate them;
2) draw the x- and y-axes of the diagram, along with the scales that increase to the right for the
x-axis and upward for the y-axis;
3) assign the data for one variable to the x-axis (the independent variable) and the data for the other
variable to the y-axis (the dependent variable);
4) plot the data pairs on the scatter diagram, encircling (as many times as necessary) all data points
that are repeated.
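The steps above can be sketched in Python. This is only a minimal illustration, not part of the original procedure: the variable names and readings (oven_temp_c, bond_strength) are hypothetical, only 8 pairs are used instead of the recommended 50-100 to keep the listing short, and repeated points are shown with a larger marker rather than with extra circles. The plotting function assumes the third-party matplotlib library is installed.

```python
from collections import Counter


def tabulate_pairs(x_values, y_values):
    """Step 1: pair up the readings and count how often each (x, y) pair occurs."""
    if len(x_values) != len(y_values):
        raise ValueError("x and y must contain the same number of readings")
    return Counter(zip(x_values, y_values))


def plot_scatter(counts, x_label, y_label):
    """Steps 2-4: draw the axes and plot each distinct pair once.

    Repeated pairs get a proportionally larger marker (an assumption made
    here in place of encircling the point by hand).
    """
    # Imported lazily so that tabulation still works without matplotlib.
    import matplotlib.pyplot as plt

    xs = [x for x, _ in counts]
    ys = [y for _, y in counts]
    sizes = [30 * n for n in counts.values()]

    fig, ax = plt.subplots()
    ax.scatter(xs, ys, s=sizes)
    ax.set_xlabel(x_label)  # independent variable
    ax.set_ylabel(y_label)  # dependent variable
    return fig


# Hypothetical readings, for illustration only.
oven_temp_c = [150, 155, 155, 160, 165, 165, 165, 170]
bond_strength = [10.1, 10.4, 10.4, 10.9, 11.2, 11.2, 11.2, 11.8]

counts = tabulate_pairs(oven_temp_c, bond_strength)
```

Calling plot_scatter(counts, "Oven temperature (C)", "Bond strength") and then saving or showing the returned figure would produce the diagram itself.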
Interpretation of the resulting scatter diagram is as simple as looking at the pattern formed by the
points. If the data points plotted on the scatter diagram are all over the place with no discernible
pattern whatsoever, then there is no apparent correlation between the two variables of the scatter
diagram. An example of a scatter diagram that shows no correlation is shown in Figure 1.
Figure 1. A Scatter Diagram showing no correlation
There is positive correlation between two sets of data if an increase in the x-value is accompanied
by an increase in the y-value. Figure 2a shows a scatter diagram that exhibits positive correlation.
Note that in such a correlation, the data points constitute a perceivable diagonal line that goes from
the lower left to the upper right corner.
Not all sets of data pairs will exhibit a strong positive correlation, even if the y-value generally
tends to increase as the x-value increases. An example of this 'weak' type of positive
correlation is shown in the scatter diagram of Figure 2b, which is said to exhibit just a 'possible
positive correlation.' This scatter diagram still shows a perceivable diagonal line going toward the
upper right, but the points are more spread apart than in a scatter diagram with strong positive
correlation.
Figure 2. Scatter Diagrams showing positive correlation (a, left) and
just a possible positive correlation (b, right)
If the scatter diagram formed also shows a perceivable diagonal line, but the line is going in a
direction opposite that of positive correlation (i.e., from the upper left to the lower right corner) as
shown in Figure 3a, then the data pairs are exhibiting negative correlation. This means that y
decreases as x increases. Again, the negative correlation is strong if the line formed by the data
points is narrow and well defined.
If the negative correlation is not strong, resulting in data points that are not closely packed together,
then there is just a 'possible negative correlation.' An example of a scatter diagram for this type of
correlation is shown in Figure 3b.
Figure 3. Scatter Diagrams showing negative correlation (a, left) and
just a possible negative correlation (b, right)
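The visual categories above can be cross-checked with a number. One common choice is the Pearson correlation coefficient r, which ranges from -1 (points on a falling straight line) to +1 (points on a rising straight line). The sketch below is only an illustration: the function names are hypothetical, and the cut-off values 0.8 and 0.3 are arbitrary assumptions, not standard thresholds and not taken from the text above. Note also that r measures straight-line association only, so a low value does not rule out a curved pattern.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of paired readings."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)


def describe(r, strong=0.8, weak=0.3):
    """Map r onto the labels used in the figures.

    The 'strong' and 'weak' cut-offs are illustrative assumptions only.
    """
    if abs(r) < weak:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    if abs(r) >= strong:
        return f"{direction} correlation"
    return f"possible {direction} correlation"
```

For example, describe(pearson_r(x_readings, y_readings)) returns one of the five labels used in Figures 1 to 3 for a pair of hypothetical lists x_readings and y_readings.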
Of course, more complex types of correlation may also be identified using a scatter diagram. Once
a type of correlation is established, the engineer may choose to proceed with a further and more
in-depth investigation of the correlation using other analysis tools.
Determining the exact nature of correlation between variables can yield several benefits, including:
1) better understanding of cause-effect relationships; 2) reduction of data gathering requirements; 3)
establishment of more effective process controls; and 4) easier development of check-and-balance
schemes. To realize these benefits, however, the engineer has to use other analytical tools to
complement the scatter diagram, since the latter is only used as a quick visual check for possible
correlation before a more in-depth study is undertaken.
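As one example of such a complementary tool (chosen here for illustration; the text above does not name a specific one), a least-squares straight line can be fitted through the plotted pairs to estimate how much y changes per unit change in x. The function name fit_line is hypothetical, and the sketch assumes the pattern seen on the diagram is roughly linear.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x-values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A fitted slope describes the trend in the collected data only; as with the scatter diagram itself, it does not by itself show that changes in x cause changes in y.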
http://www.siliconfareast.com/scatterdiagram.htm