m&ms® Project
Part 1 (10 pts)
Throughout this course we will be using m&ms® data to help explain some of the concepts
being taught as well as give you a feel for how these methods can be used. We will be exploring
plain m&ms® and I have chosen to use the 1.69 oz size bags for convenience and
affordability. From a larger perspective, the purpose of our report is to examine the packaging
process for plain 1.69 ounce bags of m&ms®.
In order to get started, everyone needs to visit three (3) different stores and purchase a plain
m&ms® 1.69 oz bag of candy. It is not necessary to keep track of which bag came from which
store. Why do we need to do this? We are taking a sample of the population of all 1.69 oz bags
of plain m&ms® that are produced. In order for our results to be meaningful, we need a random
sample. In a random sample, every bag in the population has the same chance of being
selected. By visiting three different stores and then randomly selecting a bag from the
display, we come close to obtaining a true random sample. Other sampling methods can also
produce a random sample. The concepts of population, sample, and sampling methods will be explained further in
your readings.
Download the mnminput.xls file attached below. Once you have your three bags,
you will need to open each one individually and record the number of candies of each color
within the bag (blue, orange, green, yellow, red, and brown) into the file. Once you have done
this for each bag, save the file and place it into the Part 1 dropbox as an attachment.
Here is the best part of the project. When you are done counting the colors and recording your
data, feel free to enjoy the candy!
Please note that everyone’s data will be combined into ONE class data set. This data set will be
used for all the remaining parts of the project. There will be an announcement when the class
data set has been uploaded to Doc Sharing. Because we are only sampling 1.69 oz bags, all our
conclusions will be based upon plain 1.69 oz bags of m&ms®. We will be focusing on the
overall percentage of each color and the total number of candies per bag.
At the end of the project, you will be writing a summary of all parts of the project. It will be
written as a formal report to a supervisor or manager. More information will be given.
AP STATISTICS
M&Ms® Project
Part 2 (30 pts)
Use the Excel data file attached below (mnmproject.xls) to complete this
assignment. If using Excel 2003, you will want to go to Tools > Add-Ins and make sure the
boxes to the left of both analysis toolpaks are checked as we will be using those tools throughout
the term. If using Excel 2007, you will click on the Microsoft Office Button in the upper left,
then click “Excel Options”. Click on Add-Ins, then in the Manage box, select Excel Add-ins. Click Go. In the “Add-Ins available” box, select the Analysis ToolPak check box and then click
“Ok”. Please refer to the sample project file (samplemnmpart2.xls) in the Doc Sharing area for
an example of what your completed file will look like if you follow all the outlined steps. Be
sure to submit your Excel file into the dropbox.
From this point forward, we will be focusing on the data from two perspectives: color
proportions and the number of candies per bag. For the color proportions, the information that
will be used is the total for each color and the total number of candies sampled. For the number
of candies per bag, you will use the data in the num. candies in bag column.
18 pts (3 for each color). Add up each of the columns. Calculate the sample proportions (p̂) for
each of the colors. For example, the sample proportion of blue candies would be the total
number of blue candies from all the bags divided by the total number of candies in all the bags.
3 pts. Calculate the sample mean (x̄) number of candies per 1.69 oz bag. (Find the mean of the
number of candies in bag column).
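If you would like to check your Excel totals another way, the two calculations above can be sketched in Python. This is only an illustration: the three bags and their counts are made-up values, not the class data set, and the variable names are our own.

```python
# Hypothetical color counts for three bags (NOT the class data set).
bags = [
    {"blue": 12, "orange": 11, "green": 9, "yellow": 8, "red": 7, "brown": 8},
    {"blue": 14, "orange": 10, "green": 8, "yellow": 9, "red": 6, "brown": 9},
    {"blue": 13, "orange": 12, "green": 10, "yellow": 7, "red": 8, "brown": 6},
]
colors = ["blue", "orange", "green", "yellow", "red", "brown"]

# Column totals: number of candies of each color across all bags.
color_totals = {c: sum(bag[c] for bag in bags) for c in colors}

# Total number of candies sampled (the n used for color proportions).
total_candies = sum(color_totals.values())

# Sample proportion of each color = color total / total candies.
proportions = {c: color_totals[c] / total_candies for c in colors}

# Candies per bag, and the sample mean (divide by the number of BAGS).
candies_per_bag = [sum(bag.values()) for bag in bags]
mean_per_bag = sum(candies_per_bag) / len(bags)

for c in colors:
    print(f"{c}: {color_totals[c]} candies, proportion {proportions[c]:.6f}")
print(f"mean candies per bag: {mean_per_bag:.4f}")
```

Notice that the proportions divide by the total number of candies, while the mean divides by the number of bags.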
3 pts. In another sheet in the Excel data file, create a histogram for the number of candies per
bag. To do this in Excel 2003, go to Tools in the menu and click on Data Analysis, then select
Histogram. In Excel 2007, select the Data tab and then click on Data Analysis in the Analysis
box. Click in the input range and then select the column containing the total number of candies
per bag. Be careful that you do not include the cell that has the total for the column. Select the
bin column on the data page for bin range. The “bin” column tells Excel how to group the data
into classes for graphing. If you include the cell with the labels (total and bin) when selecting
the cells, be sure the box next to Labels is checked. Under output options, select the new
worksheet ply and name it Histogram. Check next to Chart output. Finally click ok, and the
histogram will be within the new worksheet page. If you click and hold the square at the
bottom center of the histogram chart and then move the mouse down, you can enlarge the chart and
make it easier to read.
If you do not have the Analysis toolpak in Excel, you can use StatCrunch to do this part. A link
to StatCrunch can be found under the Tools for Success, the webliography and on each
homework and MML quiz problem. Copy the “number of candies in bag” column into
StatCrunch. To do this, you will copy the column in Excel including the column header (num.
candies in bag). Then after opening StatCrunch, you will select Data > Load Data > from
paste. In the window that opens, click on “paste data from clipboard”, then click “okay”. The
data will be loaded into StatCrunch. Next select Graphics > Histogram. In the popup window,
select the name of the column that contains your data, click next. Next, enter 1 in the box to the
right of “binwidth”, then click “Create Graph!” The output can be copied and pasted into your
Excel file.
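If you have neither the Analysis ToolPak nor StatCrunch handy, a rough text histogram with a bin width of 1 can be sketched in Python. The bag counts below are invented for illustration, and text_histogram is a helper name we made up; it is not part of Excel or StatCrunch.

```python
from collections import Counter

# Hypothetical "num. candies in bag" values (NOT the class data set).
candies_per_bag = [55, 56, 56, 54, 57, 55, 56, 58, 55, 56]

def text_histogram(values):
    """Return one text line per integer bin (bin width 1)."""
    counts = Counter(values)
    lines = []
    for value in range(min(values), max(values) + 1):
        # Counter returns 0 for bins with no bags, so empty bins still show.
        lines.append(f"{value:>3} | {'#' * counts[value]}")
    return lines

for line in text_histogram(candies_per_bag):
    print(line)
```

Each row is one bin and each # is one bag, so you can read the shape (symmetric, skewed, possible outliers) by turning your head sideways.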
3 pts. In another sheet in the file, use Excel to compute the descriptive statistics for the total
number of candies per bag. This data includes sample mean, sample standard deviation, median,
mode, as well as other useful information. To complete this, select Data Analysis again, then
select Descriptive Statistics. In the input range, select the “number of candies in bag”
column. Include the first row containing the label. Make sure the grouped by columns is
selected and check the box next to “Labels in first row”. For the output, select new worksheet
ply and name it Descriptive Stats. Check the box next to “Summary Statistics”.
You can also do this part in StatCrunch. With the “number of candies in bag” data copied in the
previous step, select Stat > Summary Stats > Columns. In the popup window, select the name of
the column that contains your data, then click “Calculate”. The output can be copied and pasted
into your Excel file.
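As a cross-check on the Excel or StatCrunch output, the main summary statistics can also be sketched with Python's built-in statistics module. The data values are hypothetical, and the output will not include everything Excel reports.

```python
import statistics

# Hypothetical "num. candies in bag" values (NOT the class data set).
candies_per_bag = [55, 56, 56, 54, 57, 55, 56, 58, 55, 56]

summary = {
    "n (bags)": len(candies_per_bag),
    "mean": statistics.mean(candies_per_bag),
    # stdev() is the SAMPLE standard deviation (divides by n - 1).
    "sample std dev": statistics.stdev(candies_per_bag),
    "median": statistics.median(candies_per_bag),
    "mode": statistics.mode(candies_per_bag),
    "min": min(candies_per_bag),
    "max": max(candies_per_bag),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```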
3 pts. Finally, in another worksheet titled “Part 2 Summary”, please summarize the following
information from this part of the project:
• Total number and calculated proportions (sample proportions) for each color
• Sample mean number of candies per bag
• Sample standard deviation of the number of candies per bag
• Written description of the histogram. Describe its shape: normal, skewed, etc. Also
identify any potential outliers.
• Sample sizes of this project
o Total number of candies sampled (total of the number of candies in bag
column). This is the sample size for all future parts dealing with color
proportions.
o Total number of bags sampled (number of rows of data). This is the sample size
for all future parts dealing with the mean and standard deviation of the number of
candies per bag.
NOTE: You should be using the combined class data set, not your sample of three bags.
It is imperative that you correctly complete the Part 2 Summary as we will be using the
information summarized on that page for ALL the remaining parts of the project. To help keep
the sample sizes straight, remember that when you calculated the proportion of each color, you
divided by the total number of candies sampled. When you calculated the mean number candies
per bag, you divided by the total number of bags.
At the end of this project, you will be writing a report, explaining the method and presenting the
results from each part of the project. You might find it useful to write this as you complete the
work, so the report will be mostly written by the time it is assigned.
MAT 300
M&Ms® Project
Part 3 (21 pts)
We will be constructing confidence intervals for the proportion of each color as well as the mean
number of candies per bag. You will use the methods of 6.3 for the proportions and 6.1 for the
mean. For the Bonus, you will use the sample size formula on page 338.
You can use StatCrunch to assist with the calculations. A link for StatCrunch can be found
under Tools for Success in Course Home. Here is also a
link: http://statcrunch.pearsoncmg.com/statcrunch/larson_les4e/dataset/index.html. You can
also find additional help on both confidence intervals and StatCrunch in the Online Math
Workshop under Tab: “MAT300 Archived Workshops”. Specifically you will be looking for
Confidence Intervals and Using Technology – CI.
Submit your answers in Excel, Word or pdf format. Upload your file to the dropbox. If
calculating by hand, be sure to keep at least 4-6 decimal places for the sample proportions
to eliminate large rounding errors.
3 pts. Construct a 95% Confidence Interval for the proportion of blue M&Ms® candies.
3 pts. Construct a 95% Confidence Interval for the proportion of orange M&Ms® candies.
3 pts. Construct a 95% Confidence Interval for the proportion of green M&Ms® candies.
3 pts. Construct a 95% Confidence Interval for the proportion of yellow M&Ms® candies.
3 pts. Construct a 95% Confidence Interval for the proportion of red M&Ms® candies.
3 pts. Construct a 95% Confidence Interval for the proportion of brown M&Ms® candies.
3 pts. Construct a 95% Confidence Interval for the mean total number of candies (large
samples).
BONUS: 5 pts. How many candies should be sampled to obtain a 95% CI of the proportion of
blue candies with a 4% margin of error if the known proportion of blue candies is 0.24?
HELP:
Color Proportions
You will need the information from the Part 2 Summary. For the colors, the confidence intervals
need to be found using the formulas in section 6-3 for proportions. The margin of error formula
is
$$E = z_c \sqrt{\frac{\hat{p}\,\hat{q}}{n}}$$
p̂ is the sample proportion of the color. It will change for each color.
q̂ is found by 1 - p̂, so it will also change for each color.
n is the total number of candies sampled.
So let us do an example with “purple”. Let’s say there were 732 purple candies out of 3500 total
candies. The sample proportion of purple candies is 732/3500 = 0.209143. This is what you did
in part 2. Now to find the confidence interval, we need to calculate E. Let's construct a 95% CI.
For that confidence level (95%), the z-value is 1.96. We also need
q̂: q̂ = 1 - p̂ = 1 - 0.209143 = 0.790857.
Now let's plug in:
$$E = 1.96 \sqrt{\frac{(0.209143)(0.790857)}{3500}} = 0.013474$$
The confidence interval is found by (p̂ - E, p̂ + E):
p̂ - E: 0.209143 - 0.013474 = 0.19567 = 19.57%
p̂ + E: 0.209143 + 0.013474 = 0.22262 = 22.26%
So the confidence interval is (0.1957, 0.2226).
You will follow this procedure for EACH color.
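If you want to verify your hand calculations, the purple example can be sketched in Python as follows. The function name proportion_ci is our own, and the z-value is taken from Python's statistics.NormalDist rather than rounded to 1.96, so the last decimal places may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, confidence=0.95):
    """Confidence interval for a proportion: p-hat +/- z * sqrt(p-hat * q-hat / n)."""
    p_hat = x / n                      # sample proportion
    q_hat = 1 - p_hat
    # Two-sided critical z-value (about 1.96 for 95% confidence).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

# The "purple" example: 732 purple candies out of 3500 total candies.
low, high = proportion_ci(732, 3500)
print(f"({low:.4f}, {high:.4f})")  # prints (0.1957, 0.2226)
```

To use it for a real color, replace 732 with that color's total and 3500 with the total number of candies sampled.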
IF you have a TI 83/84, you can do the following: STAT, TESTS, 1-PropZInt, ENTER
NOTE: 1-PropZInt: 1 proportion z confidence interval
x = total number of that color
n = total number of candies
Then on the next screen enter 732 next to x, 3500 next to n, 0.95 next to C-Level and then
calculate enter.
On the next screen the second line is the confidence interval: (0.19567,0.22262)
The third line is the p̂ value: .2091428571
On the last line is the sample size: 3500
IF you want to use StatCrunch, you will select Stat > Proportions > one sample > with
summary. Then it will ask for the number of successes (total number of that color) and number
of observations (total number of candies). Click on Next. In the next screen, click on the radio
button to the left of “Confidence Interval”, enter the decimal of the confidence level and then
click Calculate. The output will have the confidence interval:
95% confidence interval results:
p : proportion of successes for population
Method: Standard-Wald
Proportion  Count  Total  Sample Prop.  Std. Err.    L. Limit    U. Limit
p           732    3500   0.20914286    0.006874427  0.19566923  0.2226165
Mean
To find the confidence interval for the mean number of candies, you will need x̄ (sample mean),
s (sample standard deviation) and n. All of these values were summarized on Part 2 Summary.
Here n is the number of BAGS sampled. The margin of error formula is
$$E = z_c \frac{s}{\sqrt{n}}$$
IF using the TI 83/84: STAT, TESTS, ZInterval, Enter.
Select Inpt: Stats
σ: enter s value to at least 4 decimal places
x̄: enter value to at least 4 decimal places
n: enter number of bags sampled
C-Level: enter confidence level desired
Calculate, ENTER
On the next screen the second line is the confidence interval
Then x̄ and n.
IF using StatCrunch, it is best if you copy the num. candies in bag data into StatCrunch, but you
can use the summary information. The path is Stat > Z Statistics > one sample > with summary
(if using summary information) or with data (if data is entered into the column).
If using summary: You will be prompted for the sample mean, sample standard deviation and
sample size. Click Next. In the next screen, click the button to the left of Confidence Interval
and then enter the confidence level as a decimal and then click Calculate.
If using data: You will first select the column with the data, then click next. The next screen is
the same as with summary.
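The large-sample interval for the mean can be sketched in Python as well. The summary values below (x̄ = 55.8, s = 1.6, n = 100 bags) are invented for illustration; use the values from your own Part 2 Summary. The name mean_ci is our own.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(x_bar, s, n, confidence=0.95):
    """Large-sample z interval for a mean: x-bar +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical summary values; n is the number of BAGS, not candies.
low, high = mean_ci(x_bar=55.8, s=1.6, n=100)
print(f"({low:.4f}, {high:.4f})")
```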
At the end of this project, you will be writing a report, explaining the method and presenting the
results from each part of the project. You might find it useful to write this as you complete the
work, so the report will be mostly written by the time it is assigned.
MAT 300
M&Ms® Project
Part 4 (21 pts)
Use the M&Ms® data to complete this assignment. You will be using the methods of 7.4 for the
color proportions and 7.2 for the mean number of candies per bag. For the Bonus you will be
using the methods of 7.5.
You can use StatCrunch to assist with the calculations. A link for StatCrunch can be found
under Tools for Success in Course Home. Here is also a
link: http://statcrunch.pearsoncmg.com/statcrunch/larson_les4e/dataset/index.html. You can
also find additional help on both confidence intervals and StatCrunch in the Online Math
Workshop under Tab: “MAT300 Archived Workshops”. Specifically you will be looking for
Hypothesis Tests and Using Technology – Hypothesis Testing.
Submit your answers in Excel, Word or pdf format. Place the file in the dropbox. Be sure to
state clear hypotheses, test statistic values, critical value or p-value, decision (reject/fail to
reject), and conclusion in English. When doing calculations for the color proportions, keep
at least 4-6 decimal places for the sample proportions; otherwise you will encounter large rounding
errors.
Masterfoods USA states that their color blends were selected by conducting consumer preference
tests, which indicated the assortment of colors that pleased the greatest number of people and
created the most attractive overall effect. On average, they claim the following percentages of
colors for M&Ms® milk chocolate candies: 24% blue, 20% orange, 16% green, 14% yellow,
13% red and 13% brown.
3 pts. Test their claim that the true proportion of blue M&Ms® candies is 0.24 at the 0.05
significance level.
3 pts. Test their claim that the true proportion of orange M&Ms® candies is 0.20 at the 0.05
significance level.
3 pts. Test their claim that the true proportion of green M&Ms® candies is 0.16 at the 0.05
significance level.
3 pts. Test their claim that the true proportion of yellow M&Ms® candies is 0.14 at the 0.05
significance level.
3 pts. Test their claim that the true proportion of red M&Ms® candies is 0.13 at the 0.05
significance level.
3 pts. Test their claim that the true proportion of brown M&Ms® candies is 0.13 at the 0.05
significance level.
3 pts. On average, they claim that a 1.69 oz bag will contain more than 54 candies. Test this
claim (µ > 54) at the 0.01 significance level (σ unknown).
BONUS: 5 pts. It is important that the total number of candies per bag does not vary very
much. As a result of this quality control, the desired standard deviation is 1.5. Test the claim (α
= 0.05) that the true standard deviation for number of candies per 1.69 oz bag is no more than 1.5
(σ < 1.5).
HELP:
Color proportions
Revisiting the purple example from before: we had 732 purple candies out of 3500 total candies.
The sample proportion of purple candies is 732/3500 = 0.2091428571.
Now let's say you want to test that the true proportion of purple candies is 21% (0.21).
First define your hypotheses. Claim: p = 0.21
H0: p = 0.21 (null)
H1: p ≠ 0.21 (alternative)
Next we need to calculate the test statistic. For this type of test, it is a z and a two tailed test. You
have been asked to test at alpha = 0.05, so we will reject the null if the test statistic, z, is positive
and greater than 1.96 OR if the test statistic, z, is negative and smaller than -1.96. (NOTE: This
is the same as if the absolute value of the test statistic is greater than 1.96.)
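Review:
p̂ → sample proportion (0.209143)
p → assumed value in null (0.21)
q → 1 - p (0.79)
n → total number of candies (3500)
Test statistic:
$$z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{0.209143 - 0.21}{\sqrt{(0.21)(0.79)/3500}} \approx -0.1245$$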
Because the test statistic is between -1.96 and 1.96, we FAIL TO REJECT. We have insufficient
evidence to suggest the true proportion is not 0.21.
You will follow this procedure for EACH color.
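The purple test can be checked with a short Python sketch. The function name one_proportion_z_test is our own, and it assumes a two-tailed test (H1: p ≠ p0), which is the case for the color claims.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(x, n, p0):
    """Two-tailed z test of H0: p = p0. Returns (z, p_value)."""
    p_hat = x / n
    # The standard error uses the ASSUMED proportion p0, not p-hat.
    std_err = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / std_err
    # Two-tailed p-value: area in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# The "purple" example: 732 of 3500 candies, claimed proportion 0.21.
z, p_value = one_proportion_z_test(732, 3500, 0.21)
print(f"z = {z:.4f}, p-value = {p_value:.4f}")  # z = -0.1245, p-value = 0.9009

alpha = 0.05
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Because the p-value is far larger than 0.05, the sketch reaches the same decision as the critical-value method: fail to reject.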
IF using the TI 83/84: STAT, TESTS, 1-PropZTest
Po: assumed proportion (0.21)
x: number of successes (732)
n: total number of candies (3500)
In the next line, select the correct alternative hypothesis/test, then Calculate, Enter.
On the next screen, the second line shows the test.
The next line has the test statistic.
The next line has the p-value of the test (if less than significance level, reject null)
The next two lines have p̂ and n.
IF using StatCrunch, you will want Stat > Proportions > One Sample > with summary. In the
first window, you will enter the same information as for part 3: number of the color (number of
successes) and total number of candies (number of observations). Then click Next, and in the
following window, enter the claimed proportion as a decimal in the box next to “null”, select the
inequality that matches the alternative hypothesis and then click Calculate. The output will
include the test statistic (Z-Stat) and the p-value.
Hypothesis test results:
p : proportion of successes for population
H0 : p = 0.21
HA : p ≠ 0.21
Proportion  Count  Total  Sample Prop.  Std. Err.    Z-Stat       P-value
p           732    3500   0.20914286    0.006884766  -0.12449848  0.9009
Mean
When you test for the mean number of candies per bag, you will need x̄ (sample mean), s
(sample standard deviation) and n (total number of bags) as before.
The test statistic is a z, because we have a large sample.
Test statistic:
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
IF using the TI 83/84: STAT, TESTS, Z-Test
Input: Stats
µ0: assumed mean value
σ: sample or known standard deviation
x̄: sample mean
n: sample size
Then select the correct alternative hypothesis/test, then Calculate, Enter.
On the next screen, the second line shows the test.
The next line has the test statistic.
The next line has the p-value of the test (if less than significance level, reject null)
The next two lines have x̄ and n.
IF using StatCrunch, you will use Stat > Z Statistics > one sample as with part 3.
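A right-tailed z test for a mean can be sketched in Python as shown here. All of the numbers (x̄ = 55.8, s = 1.6, n = 100, and an assumed mean of 55.5) are made up so that the example does not give away the assignment; mean_z_test_right is a name of our own choosing.

```python
from math import sqrt
from statistics import NormalDist

def mean_z_test_right(x_bar, s, n, mu0):
    """Right-tailed large-sample z test of H0: mu = mu0 vs H1: mu > mu0."""
    z = (x_bar - mu0) / (s / sqrt(n))
    # Right-tailed p-value: area to the right of z.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# Hypothetical summary values (NOT the class data set or the claimed mean).
z, p_value = mean_z_test_right(x_bar=55.8, s=1.6, n=100, mu0=55.5)
print(f"z = {z:.4f}, p-value = {p_value:.4f}")

alpha = 0.01
print("reject H0" if p_value < alpha else "fail to reject H0")
```

For a left-tailed or two-tailed alternative, the p-value line would need to change to match the direction of H1.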
BONUS
This is a test about a standard deviation. You can use StatCrunch, however, StatCrunch deals
with variances, so you would enter 1.5² = 2.25 as the null value. You would use Stat > Variance
> One Sample. Again, it would be best to use the actual data entered into a column, but you can
also use summary values, as long as you carry at least 4 decimal places.
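If you would rather see the arithmetic behind the variance test, the chi-square test statistic can be sketched in Python. The values s = 1.6, n = 100 and a claimed standard deviation of 2.0 are hypothetical. Python's standard library has no chi-square distribution, so this sketch stops at the test statistic; the critical value or p-value would still come from the chi-square table, the TI, or StatCrunch.

```python
def chi_square_statistic(s, n, sigma0):
    """Test statistic for a standard deviation: (n - 1) * s^2 / sigma0^2."""
    # Note that the formula works with VARIANCES (squared standard deviations),
    # which is why StatCrunch asks for sigma0 squared as the null value.
    return (n - 1) * s**2 / sigma0**2

# Hypothetical values (NOT the class data set or the assignment's claim).
chi2 = chi_square_statistic(s=1.6, n=100, sigma0=2.0)
degrees_of_freedom = 100 - 1
print(f"chi-square = {chi2:.2f} with {degrees_of_freedom} degrees of freedom")
```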
At the end of this project, you will be writing a report, explaining the method and presenting the
results from each part of the project. You might find it useful to write this as you complete the
work, so the report will be mostly written by the time it is assigned.
M&Ms® Project
Part 5 (3 pts)
Using the methods in Section 8.4, test the hypothesis (α = 0.05) that the population proportions
of red and brown are equal (p_red = p_brown). You are testing if their proportions are equal to one
another, NOT if they are equal to one another AND equal to 13%. NOTE: These are NOT
independent samples, but we will use this approach anyway to practice the method. This also
means that n1 and n2 will both be the total number of candies in all the bags. The “x” values for
red and brown are the counts of each we found on the Data page. You will need to calculate the
weighted p:
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
Be sure to state clear hypotheses, test statistic, critical value or p-value, decision (reject/fail to
reject), and conclusion in English. Submit your answer in Word, Excel, .rtf or .pdf format and
place in the dropbox.
HELP
You can use StatCrunch or the TI to help with this test. Needed information for both tools
include:
x1 = number of red
n1 = total number of candies
x2 = number of brown
n2 = total number of candies
For the TI, you will want 2-PropZTest. Then select the appropriate alternative (not equal), and
Calculate then enter. The output will have the test statistic (z), p-value (p), sample p values,
weighted p (p̂ on the TI screen), then repeat of sample sizes.
For StatCrunch, you will select Stat > Proportions > Two Sample > with summary. The output
will contain the test statistic (Z-Stat) and p-value.
Additional help is available in the Online Math Workshop under MAT300 Archived
Workshop. Specifically Two Sample Inferences and Using Technology – Two Sample.
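The two-proportion test, including the weighted p, can be sketched in Python as follows. The counts (450 red and 470 brown out of 3500 candies) are invented for illustration, and two_proportion_z_test is our own helper name. As with the TI and StatCrunch, the sketch treats the two samples as if they were independent.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed z test of H0: p1 = p2. Returns (z, weighted_p, p_value)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Weighted (pooled) proportion: all successes over all observations.
    weighted_p = (x1 + x2) / (n1 + n2)
    std_err = sqrt(weighted_p * (1 - weighted_p) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, weighted_p, p_value

# Hypothetical counts (NOT the class data set): red vs. brown,
# with n1 = n2 = total number of candies in all the bags.
z, weighted_p, p_value = two_proportion_z_test(450, 3500, 470, 3500)
print(f"weighted p = {weighted_p:.6f}")
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```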
At the end of this project, you will be writing a report, explaining the method and presenting the
results from each part of the project. You might find it useful to write this as you complete the
work, so the report will be mostly written by the time it is assigned.
M&Ms® Project Report
(15 pts)
A template report file can be found in the course shell: mnmprojectreport.doc. Before you write
your report, watch the video titled “mnmunwrapped.wmv” located in the course shell. It is a
3:30-minute video segment from the TV show “Unwrapped,” showing many parts of the
production process, which might give you some ideas. Ignore the color percentages quoted in the
segment.
Imagine you are a quality control manager at the Masterfoods plant. Write a two to three
(2–3) page report on all the parts of the project. Structure your paper using the following
headers:
o Introduction: Purpose of Report
o Project Part 1: Sampling Method
o Project Part 2: Method, Analysis, Results
o Project Part 3: Method, Analysis, Results
o Project Part 4: Method, Analysis, Results
o Project Part 5: Method, Analysis, Results
o Quality Control: Assume that at least one of the tests from Part 4 was rejected
(proportion not equal to targeted amount set by Masterfoods). Discuss how you
would investigate the operations of the plant to determine why the proportions
were off the targeted values. Speculate on three or more possible conditions in
the plant and bagging process that could have caused the observed results.
o Conclusion
You should explain what was done as well as the results. Tables can be used to present
results and information. Your audience is a supervisor or manager who is unfamiliar with
this project and may or may not be familiar with statistical terms. As a result, you will
either need to explain/define statistical terms or write them in a way that a layman can
understand.
You will be graded on the following criteria:
1. Present the methods, analysis, and results for the five parts of the project. See above
Project Parts 1 through 5.
2. Explain how to investigate unexpected results (failed test(s) in Part 4) and speculate on at
least three plausible causes for the observed results.
3. Clarity in explaining all statistical terminology in every-day language.
4. Writing, grammar, sentence structure, APA format.
The format of the report is to be as follows:
o Typed, double-spaced, Times New Roman font (size 12), one-inch margins on all
sides, APA format.
o Type the question followed by your answer to the question.
o In addition to the two to three (2–3) pages required, a title page is to be included.
The title page is to contain the title of the assignment, your name, the instructor’s
name, the course title, and the date.
NOTE: You will be graded on the presentation of information, the accuracy and completeness
of your answers, the logic/organization of the report, the clarity of your explanations, and your
writing skills (grammar, style, spelling, etc.)
The assignment will be graded using the following rubric and is worth 6% of course grade:
Outcomes Assessed
o Discuss application of course content in the context of a professional setting.
o Use technology and information resources to research issues in statistics.
Grading Rubric for M&M® Project Report
Criteria 1. Present the methods, analysis, and results for the five parts of the project.
o 0 Unacceptable: Did not submit or did not present the methods, analysis, and results for the
five parts of the project; or omitted key information and/or included irrelevant
information. Completed with less than 60% accuracy, thoroughness, and logic.
o 10 Developing: Partially presented the methods, analysis, and results for the five parts of
the project; completed with 60-79% accuracy and thoroughness.
o 20 Competent: Sufficiently presented the methods, analysis, and results for the five parts
of the project; completed with 80-89% accuracy and thoroughness.
o 30 Exemplary: Fully presented the methods, analysis, and results for the five parts of the
project; completed with 90-100% accuracy and thoroughness.
Criteria 2. Explain how to investigate unexpected results (failed test(s) in Part 4) and
speculate on at least three plausible causes for the observed results.
o 0 Unacceptable: Did not complete the assignment or failed to explain how to investigate
unexpected results or to speculate on plausible causes; or omitted key information and/or
included irrelevant information. Completed with less than 60% accuracy, thoroughness, and
logic.
o 10 Developing: Partially explained how to investigate unexpected results and speculated on
at least one plausible cause; completed with 60-79% accuracy and thoroughness.
o 20 Competent: Sufficiently explained how to investigate results and speculated on at least
two plausible causes; completed with 80-89% accuracy and thoroughness.
o 30 Exemplary: Fully explained how to investigate unexpected results and speculated on at
least two plausible causes; completed with 90-100% accuracy and thoroughness.
Criteria 3. Clarity: Explain all statistical terminology in every-day language.
o 0 Unacceptable: Did not complete the assignment or did not explain all statistical
terminology in every-day language; or omitted key information and/or included irrelevant
information. Completed with less than 60% accuracy, thoroughness, and logic.
o 10 Developing: Explained partially all statistical terminology in every-day language or
explained sufficiently some of the statistical terminology in every-day language; completed
with 60-79% accuracy and thoroughness.
o 20 Competent: Explained sufficiently all statistical terminology in every-day language;
completed with 80-89% accuracy and thoroughness.
o 30 Exemplary: Explained fully all statistical terminology in every-day language; completed
with 90-100% accuracy and thoroughness.
Criteria 4. Writing: Grammar, sentence structure, paragraph structure, spelling, punctuation,
APA usage.
o 0 Unacceptable: Did not complete the assignment or had 8 or more different errors in
grammar, sentence structure, paragraph structure, spelling, punctuation, or APA usage.
(Major issues)
o 10 Developing: Had 6 - 7 different errors in grammar, sentence structure, paragraph
structure, spelling, punctuation, or APA usage. (Many issues)
o 20 Competent: Had 4 - 5 different errors in grammar, sentence structure, paragraph
structure, spelling, punctuation, or APA usage. (Minor issues)
o 30 Exemplary: Had 0 - 3 different errors in grammar, sentence structure, paragraph
structure, spelling, punctuation, or APA usage.