Uploaded by Jocelyn A.

Activity - ChiSquare

Dr. Thomas R. Boucher
Texas A&M University-Commerce
MATH 453 Essentials of Statistics Activity:
Pearson’s Chi-Square
Name:
____________________________________
The file ‘2013 Car MPG.csv’ contains EPA MPG estimates for a variety of 2013
models. We would like to compare the number of cylinders ‘cylinders’ across the various
classes (‘class’) of automobiles. Consider the automobiles used in the testing to be a
representative sample of all 2013 automobiles.
(1) Use StatCrunch to create a contingency table with ‘class’ defining the rows and
‘cylinders’ defining the columns. Paste the table here. Does the table suggest
anything?



There are several 0’s, or empty cells, which will affect the chi-square values;
most of them come from pickup and SUV.
It appears that most cars across classes have 4, 6, or 8 cylinders.
Few cars in any class have 5 or 12 cylinders.
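The same kind of table can be built outside StatCrunch. The following is a minimal sketch in plain Python, assuming the data have already been read into (class, cylinders) pairs. The records shown are made up for illustration and are not the EPA data, and the helper name contingency_table is hypothetical.

```python
from collections import Counter

# Hypothetical (class, cylinders) records for illustration; NOT the real EPA data.
records = [
    ("compact", 4), ("compact", 4), ("compact", 6),
    ("midsize", 4), ("midsize", 6), ("midsize", 6),
    ("suv", 6), ("suv", 8), ("suv", 8),
    ("pickup", 8),
]

def contingency_table(pairs):
    """Count each (row, column) pair and return the sorted labels plus counts."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return rows, cols, counts

rows, cols, counts = contingency_table(records)

# Print the table with 'class' as rows and 'cylinders' as columns.
print("class".ljust(10) + "".join(str(c).rjust(4) for c in cols))
for r in rows:
    print(r.ljust(10) + "".join(str(counts[(r, c)]).rjust(4) for c in cols))

# Empty cells (observed count of 0) are worth flagging before running a test.
empty_cells = sum(1 for r in rows for c in cols if counts[(r, c)] == 0)
print("empty cells:", empty_cells)
```

A Counter returns 0 for a pair that never occurs, so empty cells show up in the printed table without any special handling.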
(2) Use StatCrunch to create individual bar charts for ‘class’ and for ‘cylinders’, and
a grouped bar chart for the data for ‘class’ and ‘cylinders’. Include the plots here.
Do the plots show anything?






The most frequent cylinder counts are 4, 6, and 8.
The most frequent car classes are compact and midsize.
We can see overlap between the most frequent classes and the most frequent
cylinder counts:
4-cylinder cars are mostly compact and midsize.
6-cylinder cars are the most varied across classes.
8-cylinder cars are mostly SUVs and pickups.
(3) Use Pearson’s Chi-Square test in StatCrunch to test for an association between
‘class’ and ‘cylinders’. Paste the output here. What do you conclude? Can you be
confident in the results?

The chi-square value is very large.


The p-value is very small, so we reject the null hypothesis of no association
between ‘class’ and ‘cylinders’ in favor of the alternative.
However, we can’t be confident in the result because over 20% of cells have an
expected count less than 5.
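The calculation behind the test, and the check on small expected counts, can be sketched in plain Python. This is an illustration under stated assumptions: the observed table is made up and is not the StatCrunch output, and the function name chi_square is hypothetical. Each expected count is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells.

```python
# Hypothetical observed counts (rows = class, columns = cylinders); NOT the real data.
observed = [
    [12, 4, 0],  # compact
    [8, 9, 1],   # midsize
    [2, 6, 8],   # suv
]

def chi_square(table):
    """Return Pearson's chi-square statistic, degrees of freedom, and expected counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

stat, df, expected = chi_square(observed)

# Share of cells whose expected count is below 5 (a common rule of thumb
# treats the chi-square approximation as unreliable above 20%).
cells = [e for row in expected for e in row]
share_small = sum(e < 5 for e in cells) / len(cells)

print(f"chi-square = {stat:.2f}, df = {df}")
print(f"share of cells with expected count < 5: {share_small:.0%}")
```

Turning the statistic into a p-value requires the chi-square distribution with df degrees of freedom, which this sketch leaves to statistical software.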
(4) You should have seen a significant association (with caveats) between ‘class’ and
‘cylinders’. Use StatCrunch to create another contingency table containing the
observed and expected counts, and the contributions to the Chi-Square and use
these to interpret the association.




The biggest contributions to rejecting the null come from compact cars with 4
cylinders, SUVs with 4 cylinders, and SUVs with 8 cylinders.
As we can see, several cells have observed counts above the expected counts, such
as 8-cylinder and compact cars.
More cells, however, have observed counts below the expected counts, such as
4-cylinder compact cars.
The cells that fell below their expected counts made large contributions to the
chi-square, which is likely why we got the results we did.
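Ranking the cells by their contribution can be sketched as follows. The counts are again made up for illustration and are not the StatCrunch output. Each cell contributes (observed − expected)² / expected, and the sign of observed − expected shows whether the cell holds more or fewer cars than independence would predict.

```python
# Hypothetical labels and observed counts for illustration; NOT the real data.
classes = ["compact", "midsize", "suv"]
cylinders = [4, 6, 8]
observed = [
    [12, 4, 0],
    [8, 9, 1],
    [2, 6, 8],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# One record per cell: (class, cylinders, observed, expected, contribution).
cells = []
for i, cls in enumerate(classes):
    for j, cyl in enumerate(cylinders):
        o = observed[i][j]
        e = row_totals[i] * col_totals[j] / n
        cells.append((cls, cyl, o, e, (o - e) ** 2 / e))

# Largest contributions first: these cells drive the chi-square statistic.
cells.sort(key=lambda cell: cell[4], reverse=True)

for cls, cyl, o, e, contrib in cells:
    direction = "more" if o > e else "fewer"
    print(f"{cls:8s} {cyl:2d} cyl: observed {o:2d}, expected {e:5.2f}, "
          f"contribution {contrib:5.2f} ({direction} than expected)")
```

Reading the sorted list from the top gives the cells that matter most, and the direction label says whether each one is over- or under-represented.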