Statistics Term Project

My first numerical variable is age. The unit of measurement for this variable is years. A few possible values for this first numerical variable are 15, 30, and 65.

My second numerical variable is the number of states/countries lived in. The unit of measurement for this variable is number of states/countries. A few possible values for this second numerical variable are 1, 3, and 8.

My research question is “Is age related to the number of states or countries lived in?”

To answer this research question, I would gather data as follows: I will go to a few public places with a wide range of age groups, such as Liberty Park, Coffee Break, Tower Theater, and Trader Joe's. I will survey any patrons I have the time and opportunity to ask, approaching any person I happen to come by as I go from one part of the store or park to the other. My sample will be forty people.

The purpose of this study is to find the correlation between age and the number of states or countries people have lived in. I expect the number of states or countries lived in to rise with the age of the person sampled. I also expect the graph to level off after a certain age, around the mid-forties.

I gathered the data for this study by going to Trader Joe's and asking the survey question to ten patrons who were shopping around the store and one worker, as they were walking in and out of the store. I then went on to Liberty Park and gathered data as I walked from one end of the park to the other, asking people as they were jogging, walking, and biking. It was getting a little darker at this point, and there were hardly any children. All twenty people I asked at the park answered. Some people looked very uncomfortable answering the questions, but once I told them that I only needed their first name and that the data collected was for a school project, they seemed willing to help.
As I walked back to my car I passed a large crowd at the Tower Theater and asked the people outside my survey question. I asked seven people the survey question, and two other people walked away as I tried to ask. I also went to Coffee Break and got a small sample of eight people; I asked the workers and the few customers there at the time.

Overall, I was surprised by how willing most people were to participate in the survey. Gathering the data was much less intimidating than I thought it would be. Most of the people also wanted to tell me the story behind all the places they have lived, and it was really fun to talk to them. The population I tried to cover with these places was the general population of Salt Lake City residents. I chose these locations because they seemed to be places where I could find very diverse age groups and different lifestyles.

1. Statistics for your first quantitative variable organized in a table: mean, standard deviation, five-number summary, range, mode, outliers

1a.) Which would be a better measure of center for this variable (mean or median)?
I will use the MEAN to best measure the center of this variable because there were no outliers and it seems to have a symmetrical distribution.

Summary statistics:
Column  Mean      Std. Dev.  Median  Range  Min  Max  Q1  Q3  Mode  Outliers
Age     33.68182  16.957912  28.5    72     4    76   24  47  26    none

2. Histogram for the first quantitative variable

3. Box plot for the first quantitative variable

4. Statistics for the second quantitative variable organized in a table: mean, standard deviation, five-number summary, range, mode, outliers

4a.) Which would be a better measure of center for this variable (mean or median)?

Summary statistics:
Column                       Mean       Std. Dev.  Median  Range  Min  Max  Q1  Q3  Mode  Outliers
# of States/Countries Lived  3.1818182  2.6960154  2.5     13     1    14   2   3   3

5. Histogram for the second quantitative variable

6. Box plot for the second quantitative variable

7.
Statistics for testing the correlation between the two variables: linear correlation coefficient (use R, not R²) and equation for line of regression.

Simple linear regression results:
Dependent Variable: var3
Independent Variable: var2
var3 = 1.9900125 + 0.03538424 var2
Sample size: 44
R (correlation coefficient) = 0.2226
Estimate of error standard deviation: 2.659499

Parameter estimates:
Parameter  Estimate    Std. Err.    Alternative  DF  T-Stat     P-value
Intercept  1.9900125   0.8998044    ≠ 0          42  2.2116058  0.0325
Slope      0.03538424  0.023916256  ≠ 0          42  1.479506   0.1465

Analysis of variance table for regression model:
Source  DF  SS         MS         F-stat    P-value
Model   1   15.482214  15.482214  2.188938  0.1465
Error   42  297.06323  7.072934
Total   43  312.54544

8. Scatter plot that includes line of regression

9. The equation for the line of regression: var3 = 1.9900125 + 0.03538424 var2, where var2 is Age (X) and var3 is # of States/Countries Lived (Y).

Statistical Analysis:

For the X variable in my study (Age) I used the mean as the center of the distribution. The histogram I created looked symmetrical and there were no outliers, so no extreme values would pull the mean away from the center.

For the Y variable in this study (# of states/countries lived) I used the median to best describe the center of my data set. The histogram I created showed that the distribution was not symmetrical or normal, and the median works for a distribution of any shape. There were also a number of outliers in the Y variable data set, so using the mean would throw the true center off by a large margin. The median is a resistant measure of center because large outliers do not significantly change it.

When I ran the computations for the correlation coefficient I got R (correlation coefficient) = 0.2226, which means that the two variables X and Y have a WEAK/POSITIVE relationship. When X increases, Y also slightly increases.

10. Interpretation and Conclusions: discuss how you would answer the original question.
My original question was: “Is age related to the number of states or countries lived in?” Meaning, does the number of states or countries you've lived in increase with age? My survey showed that there was a weak but positive correlation between the two variables. So it suggests that with age there may be an increase in the number of states you've lived in. Although there were a number of outliers, I believe that with a larger sample the correlation would increase slightly. The problem I saw with this survey is that, after a certain point, age stops adding new places: being ninety years old does not mean that you will have lived in 50 states/countries. Many people tend to settle down and find a home at some point in their lives, so a straight line will not always be the best way to interpret the data. The relationship may follow a curve instead.

I was honestly dreading the data-gathering part of this survey. I actually did this twice. The first time I used a system of asking every “nth” person. This was not a good way to collect data because it automatically eliminated a portion of my population and was not an appropriate way to collect a random sample. The second time around I asked everybody I came across and anybody that would answer my questions. I covered a lot of ground and I feel that I got a better representation of the community, i.e., Salt Lake. What I wish I had done differently was to go to the park at an earlier time when there were more children, because I don't feel that I got a big enough sample of the younger population.

I ended up really having fun the second time around and people were really willing to help me out. I was surprised at how easy most people were to approach, although I did have a few people who avoided me like the plague. It was an eye-opening experience and most people wanted to tell me stories about their travels. For the most part I ended up getting into a lot of conversations with the people who participated in my survey.
I was pleasantly surprised by the kindness of the people in Salt Lake.
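
As a rough check on the numbers reported in this project, here is a minimal Python sketch that uses only the values printed in the summary statistics and regression output (the raw survey data is not reproduced here). It makes one assumption that the report does not state: that outliers are flagged with the common 1.5 x IQR rule. The function names (iqr_fences, has_outliers, r_from_anova, predict_states) are hypothetical helpers written for this illustration, not part of any statistics package.

```python
import math

# Values copied from the summary statistics tables in this report.
AGE = {"q1": 24, "q3": 47, "min": 4, "max": 76}
STATES = {"q1": 2, "q3": 3, "min": 1, "max": 14}


def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences.

    ASSUMPTION: the 1.5 x IQR rule; the report does not say which
    outlier rule was actually used.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def has_outliers(summary):
    """True if the min or max falls outside the 1.5 x IQR fences."""
    low, high = iqr_fences(summary["q1"], summary["q3"])
    return summary["min"] < low or summary["max"] > high


def r_from_anova(ss_model, ss_total, slope):
    """Recover R from the ANOVA table: R^2 = SS(Model) / SS(Total).

    R takes the sign of the regression slope.
    """
    r_squared = ss_model / ss_total
    return math.copysign(math.sqrt(r_squared), slope)


def predict_states(age, intercept=1.9900125, slope=0.03538424):
    """Predicted # of states/countries lived in, from the reported line."""
    return intercept + slope * age


if __name__ == "__main__":
    print("Age fences:", iqr_fences(AGE["q1"], AGE["q3"]))
    print("Age has outliers:", has_outliers(AGE))
    print("States fences:", iqr_fences(STATES["q1"], STATES["q3"]))
    print("States has outliers:", has_outliers(STATES))

    r = r_from_anova(15.482214, 312.54544, slope=0.03538424)
    print("R from ANOVA table:", round(r, 4))

    # T-Stat for the slope is the estimate divided by its standard error.
    print("Slope T-Stat:", round(0.03538424 / 0.023916256, 4))

    print("Predicted value at age 30:", round(predict_states(30), 2))
```

Under the 1.5 x IQR assumption, the age fences are wide enough to contain both the minimum (4) and the maximum (76), which agrees with "none" in the age table, while the maximum of 14 states/countries falls above the upper fence for the Y variable. The R recovered from the ANOVA sums of squares matches the reported 0.2226. The output also lists a P-value of 0.1465 for the slope, which is above the usual 0.05 cutoff, so the weak positive relationship should be read with caution.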