Cody Olsen
Statistics Term Project

Purpose of the Study: To see whether there is any correlation between the number of formal dances attended during grades 9-12 and the number of different people kissed during that same time frame.

Study Design: Due to limited time and resources for this project, I decided to take a simple random sample from the employee roster in my office. The individuals in my office range in age from 20 to 60 and come from many different states in the U.S. I assigned a two-digit number to each of our 80 employees and, using a random number generator in Excel, randomly chose 40 of them. I then handed each chosen person a survey with two simple questions:

Within grades 9-12, how many different people did you kiss?
Within grades 9-12, how many formal dances did you attend?

The surveys were filled out anonymously and then returned to me. I got exactly 30 surveys back.

The Data:

Statistics for testing correlation

Difficulties, Surprises encountered: One difficulty I came across was the wording of the survey question. Two people asked me what qualified as a “formal dance”. I answered their question and sent a message to the rest of those being surveyed explaining what a formal dance would be. This could still be a problem, because some of those surveyed may have interpreted “formal dance” differently. Looking back, I could have found a better way to phrase that question.

Analysis: Both of my variables had right-skewed distributions. The “People kissed” variable was more skewed, with a mean of 14.5 and an outlier of 75. “Dances attended” was also right skewed but did not have any outliers. The critical value for a correlation coefficient with a sample size of 30 is 0.361. The correlation coefficient from my sample is -0.256. Because the absolute value of the sample coefficient (0.256) is smaller than the critical value, the weak negative correlation in my sample is not statistically significant.

Interpretation and Conclusions: The data gathered showed a weak negative correlation.
The relationship is too weak to support any strong predictions about one of the two response variables from the other. The results did turn out somewhat as I predicted, in that the very high and the very low numbers of “people kissed” both went with a low number of “dances attended” for the most part. A simple color scale in Excel allowed me to see that. However, I believe that any prediction made using one response variable to predict the other would not carry much confidence.

People kissed    Dances attended
75               2
50               5
50               2
40               3
35               6
26               0
25               3
25               7
19               4
13               14
13               2
9                0
8                9
6                2
6                12
5                14
4                10
4                0
4                12
4                12
3                3
3                5
2                3
2                13
2                9
1                1
1                5
0                7
0                2
0                1
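
As a rough check on the correlation reported in the Analysis section, the short Python sketch below recomputes the sample correlation coefficient from the 30 survey responses and compares it with the critical value. It assumes that the two data lists are paired in the order they are listed. The names pearson_r, people_kissed, dances_attended, and CRITICAL_VALUE are illustrative only, and the critical value of 0.361 is copied from the report rather than calculated.

```python
import math

# Survey responses, assumed to be paired in the order they are listed.
people_kissed = [75, 50, 50, 40, 35, 26, 25, 25, 19, 13, 13, 9, 8, 6, 6,
                 5, 4, 4, 4, 4, 3, 3, 2, 2, 2, 1, 1, 0, 0, 0]
dances_attended = [2, 5, 2, 3, 6, 0, 3, 7, 4, 14, 2, 0, 9, 2, 12,
                   14, 10, 0, 12, 12, 3, 5, 3, 13, 9, 1, 5, 7, 2, 1]


def pearson_r(xs, ys):
    """Return the sample (Pearson) correlation coefficient of two paired lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Critical value for n = 30, taken from the report (not computed here).
CRITICAL_VALUE = 0.361

r = pearson_r(people_kissed, dances_attended)
# The correlation counts as significant only if |r| exceeds the critical value.
significant = abs(r) > CRITICAL_VALUE

print(f"mean people kissed = {sum(people_kissed) / len(people_kissed):.1f}")
print(f"r = {r:.3f}")
print(f"significant: {significant}")
```

With the data paired this way, the sketch gives the same mean of 14.5 and the same coefficient of about -0.256 as the report, and the comparison with 0.361 comes out as not significant.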