Josh O'Farrell
Math 1040-011
Instructor: T. Hilton
April 30, 2012
Time and $, Are They Related When You Grocery Shop?
INTRODUCTION
It is probably safe to assume that almost all of us go grocery shopping on a relatively
regular basis. Because most of us cannot afford to eat out all the time, we need to go to the
store to buy food. By "store" I mean anywhere you go to buy your groceries. It is important
to understand that we spend more than just money at the store. We also spend our time, which
is a precious commodity in its own right. But what if the two are related? This aspect of
grocery shopping is what our group decided to focus on. Our group, consisting of Jennifer
Gerrard, Charonda Edwards, Brad Peterson, and myself, posed the following research question
for our statistics class: "Is the amount of time a person spends in a grocery store related to
the amount of money spent at that same grocery store?" Basically, we are trying to see whether
our independent (or explanatory) variable, the time spent at a grocery store, can predict our
dependent (or response) variable, the money spent on groceries at that store. Both of our
variables are quantitative, meaning they are numerical measurements (minutes and dollars).
To collect our data we needed to ask people coming out of the grocery store two
questions, one for each variable. The first question was, "How much time did you spend in the
grocery store?" The other was, "How much money did you spend in the grocery store?" This type
of data collection is an observational study, specifically a retrospective (or case-control)
study, because we had the individuals look back in time and report data from the past. It
would have been impossible to put these questions to the entire population of shoppers at the
stores we visited, so we chose a systematic sampling approach. Each of us went to a different
grocery store (each from a different grocery chain) and asked every Kth person exiting the
store the two questions above. We set K = 6, and each of us asked 10 people, for a total
sample of 40 data points. Each of us rolled a die to determine a random starting point. I
went to the Sunflower Farmers Market on State Street in Murray to collect my ten data points.
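The systematic sampling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `systematic_sample`, the list of 60 numbered shoppers, and the idea that the die roll directly gives the position of the first person asked are all hypothetical and are not taken from our actual field procedure.

```python
import random

def systematic_sample(arrivals, k, sample_size, start=None):
    """Pick every k-th item from `arrivals`, beginning at a random start in 1..k."""
    if start is None:
        start = random.randint(1, k)  # assumption: plays the role of the die roll
    # Positions are 1-based: start, start + k, start + 2k, ...
    positions = range(start, len(arrivals) + 1, k)
    return [arrivals[p - 1] for p in positions][:sample_size]

# Hypothetical stream of 60 shoppers leaving one store, labeled 1..60.
shoppers = list(range(1, 61))

# Assume the die showed a 4, so the 4th shopper is asked first.
picked = systematic_sample(shoppers, k=6, sample_size=10, start=4)
print(picked)  # [4, 10, 16, 22, 28, 34, 40, 46, 52, 58]
```

With K = 6 and a six-sided die, any roll still yields ten people from a stream of 60 shoppers, which is one reason the two numbers pair conveniently.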
DATA STATISTICS & ANALYSIS
All of us did indeed collect 10 data points each for a total of 40 in our sample. Table 1
contains all the data points of our sample.
Table 1 Sample
X - Time (min)   Y - Cost ($)
25               83
10               48
12               21
10               3
9                40
4                18
75               248
20               193
3                8
10               8
10               15
30               7
60               19
30               31
45               4
15               8
5                13
30               27
10               53
9                11
30               160
10               10
15               60
30               70
20               40
15               17
15               79
5                10
15               14
3                3
15               5
50               90
8                3
5                10
15               100
30               86
3                5
30               44
30               35
20               57
Table 2 shows the descriptive statistics for our independent variable. Below that table
are the frequency diagram and boxplot for this variable. Both diagrams indicate that the
distribution of the data is skewed right, with a median of 15 minutes and a range of 72
minutes. This variable has one outlier, located at 75 minutes.
Table 2 Descriptive Statistics for X - Time (min)
Mean    S.D.    Min   Q1    Median   Q3   Max   Range   Mode   Outliers
19.65   16.01   3     9.5   15       30   75    72      30     75
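The statistics for X can be reproduced from the Table 1 times with a short sketch. The paper does not say which quartile or outlier rule was used, so two assumptions are made here: quartiles are taken as the medians of the lower and upper halves of the sorted data, and outliers are flagged with the common 1.5 x IQR rule. The helper name `quartiles` is hypothetical.

```python
import statistics

# Time in store (minutes) for the 40 shoppers, transcribed from Table 1.
times = [25, 10, 12, 10, 9, 4, 75, 20, 3, 10, 10, 30, 60, 30, 45, 15, 5, 30, 10, 9,
         30, 10, 15, 30, 20, 15, 15, 5, 15, 3, 15, 50, 8, 5, 15, 30, 3, 30, 30, 20]

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method (an assumption)."""
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    # For an odd count, leave the middle value out of both halves.
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return statistics.median(lower), statistics.median(s), statistics.median(upper)

q1, med, q3 = quartiles(times)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # assumed 1.5 x IQR rule
outliers = [t for t in times if t < low_fence or t > high_fence]

print("mean:", round(statistics.mean(times), 2))
print("sample S.D.:", round(statistics.stdev(times), 2))
print("Q1, median, Q3:", q1, med, q3)
print("mode:", statistics.mode(times))
print("outliers:", outliers)
```

Under these assumptions the upper fence is 60.75 minutes, so the 60-minute shopper is not flagged and only the 75-minute value counts as an outlier.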
The descriptive statistics for our dependent variable can be found in Table 3. Below that
are the frequency diagram and boxplot for this variable. As with our independent variable,
the distribution of the data for our dependent variable is skewed right. It has a median of
$20 and a range of $245. This variable has three outliers, located at $160, $193, and
$248.
Table 3 Descriptive Statistics for Y - Money ($)
Mean   S.D.    Min   Q1   Median   Q3     Max   Range   Mode       Outliers
43.9   53.96   3     9    20       58.5   248   245     3, 8, 10   160, 193, 248
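As a worked check of the three outliers for Y, the fences below assume the common 1.5 x IQR rule (the paper does not state which rule was used) and use the quartiles reported for this variable, Q1 = 9 and Q3 = 58.5.

```latex
\begin{aligned}
\mathrm{IQR} &= Q_3 - Q_1 = 58.5 - 9 = 49.5 \\
\text{upper fence} &= Q_3 + 1.5\,\mathrm{IQR} = 58.5 + 74.25 = 132.75 \\
\text{lower fence} &= Q_1 - 1.5\,\mathrm{IQR} = 9 - 74.25 = -65.25
\end{aligned}
```

Only $160, $193, and $248 lie above $132.75. The next largest value in the sample, $100, falls inside the fence, and no value can fall below a negative lower fence.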
When we combine the variables in a Cartesian coordinate system, we can see what the
data look like in the scatter plot in Figure 1, which also includes the estimated linear
regression line based on the data we collected. The equation for this line is written in the
figure, and its R-value is 0.5410. The critical value of the correlation coefficient for
n = 40 is 0.312. Comparing our R-value to the critical value for our sample size of 40, we
can see that the R-value is greater. Because it is greater, and the slope of the regression
line is positive, we can say that there is a positive linear relationship between the two variables.
Figure 1 Scatter Plot Chart of Data
[Scatter Plot with Regression Line. Horizontal axis: X - Time in Store (min), 0 to 80. Vertical axis: Y - Money Spent ($), 0 to 300. Series: Y, Predicted Y, Linear (Y). Regression line: y = 1.8237x + 8.0639, R = 0.5410.]
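The regression line and correlation coefficient reported in Figure 1 can be recomputed from the 40 pairs in Table 1. The sketch below uses the standard least-squares formulas. It is not necessarily how the figure was originally produced, and the variable names are my own. The critical value 0.312 is the one quoted in the text.

```python
import math

# (time in minutes, cost in dollars) pairs transcribed from Table 1.
data = [
    (25, 83), (10, 48), (12, 21), (10, 3), (9, 40), (4, 18), (75, 248), (20, 193),
    (3, 8), (10, 8), (10, 15), (30, 7), (60, 19), (30, 31), (45, 4), (15, 8),
    (5, 13), (30, 27), (10, 53), (9, 11), (30, 160), (10, 10), (15, 60), (30, 70),
    (20, 40), (15, 17), (15, 79), (5, 10), (15, 14), (3, 3), (15, 5), (50, 90),
    (8, 3), (5, 10), (15, 100), (30, 86), (3, 5), (30, 44), (30, 35), (20, 57),
]

n = len(data)
xs = [x for x, _ in data]
ys = [y for _, y in data]
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)

slope = sxy / sxx                      # least-squares slope
intercept = mean_y - slope * mean_x    # line passes through (mean_x, mean_y)
r = sxy / math.sqrt(sxx * syy)         # Pearson correlation coefficient

R_CRITICAL = 0.312  # critical value for n = 40 quoted in the text

print(f"y = {slope:.4f}x + {intercept:.4f}")
print(f"r = {r:.4f}, exceeds critical value: {abs(r) > R_CRITICAL}")
```

Run on the Table 1 data, this should agree with the equation and R-value shown in Figure 1 to four decimal places.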
LESSONS LEARNED
One of the things I noticed right away was the difference in accuracy between the values
given for money spent and those given for time spent in the store. Individuals questioned
could easily look at their receipt and tell me the exact amount they spent; however, the
majority of the values given for time spent in the store were estimates at best.
Some individuals would look at their watch to try to calculate the time, and others would
simply give me a number (a best guess). In one case I questioned a couple who
had completed their shopping, and their answers for how long they had been in the store
differed by 15 minutes. Overall, I feel that the time values are less accurate than the
money values and may differ from the true times.
While a positive linear relationship does exist between the two variables, I believe there
could be other factors (lurking variables) that influence how much money is spent in a
given amount of time at the grocery store. The data points (45, 4) and (60, 19) are about $86
and $98, respectively, below the values the regression line predicts for those times (about
$90 and $117). One possible explanation is a shopper's familiarity with that particular
grocery store. If those individuals walked into that store for the very first time, it may
have taken them more time than the average shopper to find the items they were looking for.
Shopping habits may also be a factor. These shoppers might have been the kind that
like to walk up and down every aisle to "window shop" and may actually pick up only a couple
of items. There are also differences in the opposite direction: the points (15, 100) and
(20, 193) are about $65 and $148, respectively, above their expected values for those times.
Perhaps these shoppers were intimately familiar with "their" store and knew exactly where to
find the items on their list, therefore taking less time than normal to shop. They may also
have been purchasing significant quantities of items, such as in a case lot sale, or more
expensive items. Sales may also affect the values for money spent in the grocery store. While
these are reasonable explanations for the variation in the data, I cannot say definitively
whether these variables affected our study.
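The gaps quoted for the four points discussed above can be checked against the regression equation from Figure 1, y = 1.8237x + 8.0639. In this minimal sketch the function name `predicted_cost` is my own, and each residual is the observed cost minus the predicted cost. For the first two points the predicted costs themselves come out near $90 and $117.

```python
def predicted_cost(minutes, slope=1.8237, intercept=8.0639):
    """Predicted spending in dollars from the regression line in Figure 1."""
    return slope * minutes + intercept

# The four (time, cost) points singled out in the discussion.
points = [(45, 4), (60, 19), (15, 100), (20, 193)]

residuals = {}
for minutes, cost in points:
    # Residual = observed cost - predicted cost (negative means below the line).
    residuals[(minutes, cost)] = cost - predicted_cost(minutes)
    print(f"({minutes}, {cost}): predicted {predicted_cost(minutes):.2f}, "
          f"residual {residuals[(minutes, cost)]:+.2f}")
```

A negative residual means the shopper spent less than the line predicts for that amount of time, and a positive residual means the shopper spent more.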
CONCLUSION
We set out to see if there was a relationship between time spent and money spent in
the grocery store. Other factors may affect the correlation between the variables, but it is
difficult to say so with certainty. Based on the data we collected, we can say that there is a
positive linear relationship between the two variables we were looking at. In other words,
yes, there is a relationship between the time we spend in a grocery store and the amount of
money we spend there.