Scattered Diagram and Determination of the Outliers

Scattered Diagram:
I entered the data given in Table 12.1 into the JavaScript application. The application
reports 22 X-outliers and zero Y-outliers.
Regression:
For this application, I used the data from problem 12.25 in the text. The text gives a
P-value of 0.000 and F = 25.54. The application appears to be working correctly.
Regression Analysis with Diagnostic Tools
I used the example that begins with 12.3 and is used throughout the chapter, and found
the same results as in our textbook. The JavaScript application verified all the
information found in the book. It also showed me how to quickly obtain the P-value in
question 12.29. This is a useful tool for regression analysis.
Scattered Diagram
Since the application would not accept the data from Example 12.3, I used the data from
12.1. The resulting figure seems to echo the example given in the book. This is a good
way to get a sample scatter diagram and to find any outliers that may be present.
Testing the Population Correlation Coefficient
Again, I used the data in Example 12.3. The results agree with those found in the text,
and this is yet another useful tool for regression analysis.
Regression Analysis with Diagnostic Tools
I used the data from the “excel problem” in the lecture notes. The JavaScript results
agree with Excel; however, the JavaScript was easier to use and also provides diagnostic
checking, which is not available in Excel.
Regression Analysis with Diagnostic Tools
Data on highway deaths per 100 million vehicle miles and highway speed limits for
10 countries are given below:
(Death, Speed) = (3.0, 55), (3.3, 55), (3.4, 55), (3.5, 70), (4.1, 55), (4.3, 60), (4.7, 55),
(4.9, 60), (5.1, 60), and (6.1, 75).
I used these data to test this program.
Results are:
Mean (x) = 60
Mean (y) = 4.24
Variance (x) = 50
Variance (y) = 0.9493333
Slope = 0.0755556
Its Standard Error = 0.0407401
Intercept = -0.293333
Its Standard Error = 2.459341
Correlation = 0.54833
Its Standard Error = 0.2956633
F-Statistic = 3.4394525
Its P-value = 0.1008
Conclusion: Strong evidence against linearity
Report on the residuals:
Randomness: Strong evidence against randomness
Normality: No evidence against normality.
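As a cross-check, the summary figures above can be recomputed by hand. The following is a minimal sketch in plain Python, not the JavaScript program itself; the function name simple_regression and the dictionary keys are my own choices. It takes speed limit as X and death rate as Y, computes the intercept as Mean (y) minus slope times Mean (x), and leaves out the P-value, which would need an F-distribution function that the standard library does not provide.

```python
from math import sqrt

# Highway data from the text: (deaths per 100 million vehicle miles, speed limit).
data = [(3.0, 55), (3.3, 55), (3.4, 55), (3.5, 70), (4.1, 55),
        (4.3, 60), (4.7, 55), (4.9, 60), (5.1, 60), (6.1, 75)]
y = [death for death, _ in data]   # response: death rate
x = [speed for _, speed in data]   # predictor: speed limit


def simple_regression(x, y):
    """Least-squares fit of y on x (hypothetical helper, not the applet's code)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = slope * sxy            # regression sum of squares
    mse = (syy - ssr) / (n - 2)  # residual mean square
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),  # sample variances
        "var_y": syy / (n - 1),
        "slope": slope,
        "se_slope": sqrt(mse / sxx),
        "intercept": intercept,
        "correlation": sxy / sqrt(sxx * syy),
        "f_statistic": ssr / mse,
    }


results = simple_regression(x, y)
for name, value in results.items():
    print(f"{name:12s} = {value:.7f}")
```

The printed values can be compared line by line with the program output listed above.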
The numerical results agree with results from Excel. Notice that the P-value of 0.1008 for
the F-statistic is only marginally above 0.10, so at the 0.10 significance level the result
is borderline: strictly, the null hypothesis is not rejected, but there is some weak
evidence of a linear relationship. Notice also that there might be a scaling problem with
the data, since x and y differ greatly in magnitude. One might therefore suggest dividing
the X values by, say, 10 before constructing a regression model. To overcome the
non-randomness of the residuals, one may use a log transformation of the X values.
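To see what these two suggestions do to the fit, here is a small sketch in plain Python, again not the JavaScript program. The helper name slope_and_r is an assumption of mine. Dividing X by 10 multiplies the slope by 10 and leaves the correlation unchanged, so rescaling alters the units of the slope rather than the strength of the relationship. The sketch only refits the line on log(X); it does not test whether the residuals become random.

```python
from math import log, sqrt

# Highway data from the text, split into response and predictor.
deaths = [3.0, 3.3, 3.4, 3.5, 4.1, 4.3, 4.7, 4.9, 5.1, 6.1]
speed = [55, 55, 55, 70, 55, 60, 55, 60, 60, 75]


def slope_and_r(x, y):
    """Return (least-squares slope, correlation) for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / sqrt(sxx * syy)


raw_slope, raw_r = slope_and_r(speed, deaths)
scaled_slope, scaled_r = slope_and_r([s / 10 for s in speed], deaths)
log_slope, log_r = slope_and_r([log(s) for s in speed], deaths)

print(f"raw X:    slope = {raw_slope:.6f}, r = {raw_r:.5f}")
print(f"X / 10:   slope = {scaled_slope:.6f}, r = {scaled_r:.5f}")
print(f"log(X):   slope = {log_slope:.6f}, r = {log_r:.5f}")
```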
Scattered Diagram
Data from textbook problem 12.8 are used in this program. Results are:
Scatter Diagram
The numbers represent observation-counts,
which may have repeated values or almost the same magnitude.
[Text-mode scatter plot: a grid of dashes in which the digits 1 and 2 mark occupied
cells; the original layout is not recoverable.]
No outlier was removed.
The resulting scatter diagram gives a visual, although not exact, rendition of the data,
which is nevertheless acceptable for a high-level view of the data and of a possible
linear relationship.
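A character-grid diagram of this kind can be sketched as follows. This is only my guess at the general idea, not the program's actual method: the function name text_scatter, the 13-by-8 grid size, and the equal-width binning are all assumptions. Since the problem 12.8 data are not reproduced here, the sketch uses the highway data given earlier. Each cell shows how many observations fall in it, which is why nearby points merge and the picture is not exact.

```python
from collections import Counter


def text_scatter(x, y, cols=13, rows=8):
    """Return text rows (largest y first); each cell is a count or '-'.

    Assumes no cell holds more than 9 observations, so every count
    prints as a single character.
    """
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)

    def cell(value, low, high, size):
        # Map a value onto an equal-width bin index in [0, size - 1].
        if high == low:
            return 0
        return min(size - 1, int((value - low) / (high - low) * size))

    counts = Counter(
        (cell(yi, y_min, y_max, rows), cell(xi, x_min, x_max, cols))
        for xi, yi in zip(x, y)
    )
    lines = []
    for row in range(rows - 1, -1, -1):  # print the top of the plot first
        lines.append("".join(
            str(counts[(row, col)]) if counts[(row, col)] else "-"
            for col in range(cols)
        ))
    return lines


# Highway data from earlier in the text: X = speed limit, Y = death rate.
speed = [55, 55, 55, 70, 55, 60, 55, 60, 60, 75]
deaths = [3.0, 3.3, 3.4, 3.5, 4.1, 4.3, 4.7, 4.9, 5.1, 6.1]

plot = text_scatter(speed, deaths)
print("\n".join(plot))
```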
Regression Analysis with Diagnostic Tools for Predictions:
This is our lecture notes problem,
Testing the Population Correlation Coefficient:
This is our homework problem 12_8, for which we also found that the null hypothesis
was rejected.
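For reference, the usual test of a zero population correlation uses a t statistic with n - 2 degrees of freedom. The sketch below is a minimal illustration; the function name correlation_t_statistic is my own. Because the 12_8 data are not reproduced here, it uses the correlation and sample size from the highway example reported earlier (0.54833 with 10 observations). In simple linear regression the square of this t statistic equals the F-statistic, which gives a quick consistency check.

```python
from math import sqrt


def correlation_t_statistic(r, n):
    """t statistic for H0: population correlation = 0 (n - 2 degrees of freedom)."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Values taken from the highway example earlier in the text.
r, n = 0.54833, 10
t = correlation_t_statistic(r, n)
print(f"t = {t:.4f} with {n - 2} degrees of freedom")
print(f"t squared = {t * t:.4f}")
```

The squared value can be compared with the F-Statistic reported for the highway data.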
Scattered Diagram and Determination of the Outliers:
Textbook Problem 12_16: