# File - Nellie`s E

```Math 1040
• Intro To Statistics
• Professor: Zeph Allen Smith
• Presented by: Nellie Sobhanian
Introduction:
• For this class, Math 1040, our term project consisted of buying a bag of
skittles and counting the number of skittles of each color. Each student
reported these counts to the instructor, who compiled a collective data set
from every respondent's numbers. There are 38 respondents, or students,
in this class. The total number of skittles came to 2435. Color is the
categorical variable, and the number of skittles of each color is the
corresponding quantitative variable. The full data set consisted of
Red (500), Orange (446), Yellow (474), Green (503) and Purple (512). The
pie chart and Pareto chart that follow represent the collected data.
Skittles Pie Chart
Pareto Chart
Name              Value   Relative frequency (%)   Cumulative frequency (%)
Purple Skittles   512     21.03                    21.03
Green Skittles    503     20.66                    41.68
Red Skittles      500     20.53                    62.22
Yellow Skittles   474     19.47                    81.68
Orange Skittles   446     18.32                    100
SUM               2435
• Comparing the pie chart with the Pareto chart, it becomes evident that the
pie chart makes it easier to get a holistic view of the data at a glance. The
Pareto chart, however, gives a more detailed look at each color
individually. The frequency here is the number of skittles of each color, and
the relative frequency is that count as a percentage of all 2435 skittles.
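• As a check on the Pareto table, each relative frequency is a color's count
divided by the total of 2435, and each cumulative frequency is the running
count divided by the same total. The lines below only retrace the arithmetic
from the reported counts:
  Purple: 512 / 2435 = 21.03%
  Green:  503 / 2435 = 20.66%, cumulative (512 + 503) / 2435 = 41.68%
  Red:    500 / 2435 = 20.53%, cumulative 1515 / 2435 = 62.22%
  Yellow: 474 / 2435 = 19.47%, cumulative 1989 / 2435 = 81.68%
  Orange: 446 / 2435 = 18.32%, cumulative 2435 / 2435 = 100%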
Summary of My Bag of Skittles
• The table below shows the number of skittles of each color in my own bag,
which can be compared and contrasted with the class data.
Purple Skittles: 13
Red: 12
Orange: 13
Green: 13
Yellow: 10
Total: 61 Skittles
Summary Statistics
Column   Mean        Std. dev.   Median   Min   Max    Q1   Q3   IQR   Sum
Red      84.707317   383.86015   13       0     2435   10   16   6     3473
Orange   22.871795   69.67752    11       3     446    9    14   5     892
Yellow   24.307692   74.049825   12       3     474    10   14   4     948
Purple   26.25641    79.903467   14       4     512    11   16   5     1024
Green    25.794872   78.50115    13       7     503    11   16   5     1006
Histogram Charts
Box plot
Recap
• As presented above, each chart has its own way of portraying
the same set of information. Some are easier to read than others, but
seeing them side by side makes it possible to compare and contrast them and
figure out which chart ultimately works best with the data. The box plot
shows that the number of red skittles exceeds the number of any other
color, and that the numbers of the other colors are approximately equal;
the histogram verifies this observation. I only bought one bag of skittles,
so the 38 bags from the class, one per respondent, provide a
better sample.
Reflection
• As previously explained in the introduction, categorical (qualitative) data
describes characteristics such as color, brand name, or taste,
as opposed to quantitative data, which is anything that can be counted or measured,
such as number, weight, or distance. The most helpful graph for the qualitative
data was the pie chart because it gives a visual of how often one color, such as red,
appears compared with the other colors. The Pareto chart was the most helpful
with the quantitative data; it lists the frequency, the number of skittles of each color,
in descending order, which in my opinion makes it the chart that provides the most
quantitative information of the charts presented here.
Confidence Interval Estimates
• A confidence interval is an estimated range of values, calculated from a given set of sample data, that is likely to include an
unknown population parameter.
• Construct a 99% confidence interval estimate for the true proportion of yellow candies: using the confidence interval formula
for a proportion, the proportion of yellow candies was .179 < p < .221, that is, between about 18% and 22%. (See
scanned sheet for work shown, p.1)
• Construct a 95% confidence interval estimate for the true mean number of candies per bag: I calculated the sample mean, which
is 64, and the standard deviation, which is 13.2, for the 38 bags of skittles. The sample size is the number of bags, 38; the grand total of skittles is 2435.
I plugged these numbers into the formula, which gave a true mean number of candies per bag between 60 (rounded up
from 59.7) and 68. (See scanned sheet for work shown, p.2)
• Construct a 98% confidence interval estimate for the standard deviation of the number of candies per bag: By using the
formula for estimating a population parameter, specifically the standard deviation aspect, of 61, I multiplied 37 (n - 1) by the
standard deviation squared (13.2²), divided by χ²R, which the calculator gave as 49.588, and took the square root. The same
process was used with the left-tail value χ²L. The result was 11.40 < σ < 21.26. (See scanned sheet for work shown, p.3)
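• The three intervals above can be retraced from the reported summary values. This is only a sketch, not the scanned work: the
critical values z = 2.576 and t = 2.026 are standard table values assumed here, and p̂ = 0.20 is an assumption that happens to
reproduce the reported proportion interval.
  Proportion (99%): p̂ ± z·√(p̂(1 - p̂)/n), with n = 2435
    E = 2.576 · √(0.20 · 0.80 / 2435) ≈ 2.576 · 0.00811 ≈ 0.021, so 0.179 < p < 0.221
    With the unrounded p̂ = 474/2435 ≈ 0.1947, the same formula gives about 0.174 < p < 0.215
  Mean (95%): x̄ ± t·s/√n, with n = 38 and 37 degrees of freedom
    E = 2.026 · 13.2 / √38 ≈ 2.026 · 2.141 ≈ 4.34, so 64 - 4.34 ≈ 59.7 and 64 + 4.34 ≈ 68.3
  Standard deviation (98%): √((n - 1)s²/χ²R) < σ < √((n - 1)s²/χ²L)
    (n - 1)s² = 37 · 13.2² = 6446.88
    √(6446.88 / 49.588) ≈ 11.40 and √(6446.88 / 14.257) ≈ 21.26
    χ²L = 14.257 is assumed here; it is the table value that pairs with 49.588, and both are the entries for 29 degrees of
    freedom. With 37 degrees of freedom the critical values are roughly 59.9 and 20.0, which would give about 10.4 < σ < 18.0.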
Hypothesis Test
• A statistical hypothesis is an assumption about a population parameter. This assumption may
or may not be true. Hypothesis testing refers to the formal procedures used by statisticians to
reject or fail to reject statistical hypotheses.
• Use a 0.05 significance level to test the claim that 20% of all Skittles candies are red:
Using the hypothesis test formula for a proportion, we fail to reject the
null hypothesis because there is not sufficient evidence to support the alternative hypothesis. (See
scanned sheet below, p.1, for work shown)
• Use a 0.01 significance level to test the claim that the mean number of candies in a bag
of Skittles is 55: Using the hypothesis test formula for a mean (see scanned work), we reject H0 if
t is less than or equal to -2.715 or greater than or equal to 2.715. Since t (4.25) is greater than
2.715, we reject the null hypothesis; there is enough evidence to support the alternative
hypothesis. (See scanned sheet below, p.2, for work shown)
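• The two test statistics above can be retraced as follows. This is a sketch under assumptions, not the scanned
work: a two-tailed test is assumed for both claims, and ±1.96 is the standard z critical value at the 0.05 level.
  Red proportion: z = (p̂ - p)/√(p(1 - p)/n), with p̂ = 500/2435 ≈ 0.2053, p = 0.20, n = 2435
    z ≈ (0.2053 - 0.20) / 0.00811 ≈ 0.66, which lies between -1.96 and 1.96, so H0 is not rejected
  Mean per bag: t = (x̄ - μ)/(s/√n), with μ = 55, s = 13.2, n = 38
    With x̄ = 2435/38 ≈ 64.08, t ≈ 9.08 / 2.141 ≈ 4.24, close to the reported 4.25 and above 2.715, so H0 is rejected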
Recap
• To construct the confidence intervals and hypothesis tests, I entered
the data for the entire class into the appropriate formulas. My own
bag of skittles was consistent with the 20% red claim (12 of 61 skittles, about 19.7%). However,
the mean of my sample was 12 due to a smaller sample size. Some
errors could have been caused by outliers; there were 3 numbers that
were unusually high. The project could have been improved by comparing our
data with other statistics classes to obtain a better sample.
Reflection
What have I learned as a result of this project?
• Discuss the math skills that you applied in this project.
• Identify specific parts of the project and your own process in completing the
project that may have applications for other classes.
• Discuss how the project helped to develop your problem solving skills.
• Discuss how this project changed the way you think about real world math
applications. If your thinking was not changed, then discuss how the project