Isha Humagain
Data Analysis Project

Part 1

I selected the body measurement data set for this data analysis project. The project is about drawing samples from the population (the categorical data in the data set) and organizing and analyzing the data in pie charts and Pareto charts using Excel. Of the four sampling methods we have studied, I used two: simple random sampling and systematic sampling. I created six charts: a population pie chart, a population Pareto chart, a random sample pie chart, a random sample Pareto chart, a systematic sample pie chart and a systematic sample Pareto chart. Creating these charts required Excel and a different formula for each sample and for the population.

For the population pie chart, I first selected the whole population from the data set, keeping only the categorical data. From the categorical data I counted the males and females: there were 247 males and 260 females in the entire population. Using these frequencies (247 for male and 260 for female), I created the population pie chart (gender of the population) in Excel. Using the same frequencies, I created the population Pareto chart by finding the cumulative frequency and percentage.

For the other part, I had to choose two of the four sampling methods we have studied. I used simple random sampling and systematic sampling. I selected two samples, each of size n = 36 (satisfying n > 35; rows 1, 2, 3, …, 36). For the random sample pie chart, I first took the categorical data from the data set. There were 17 males and 19 females in the sample, which I obtained with Excel's RANDBETWEEN formula.
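The two sampling methods named above can also be sketched outside Excel. The following is a minimal Python sketch, not the workbook used in this project: the function names, the shuffled stand-in gender list, the seeds and the starting offset are all assumptions made for illustration. It draws 36 distinct rows at random and, separately, keeps every k-th row using a remainder test similar in spirit to Excel's MOD function.

```python
import random


def simple_random_sample(rows, n, seed=None):
    """Draw n distinct rows at random (sampling without replacement)."""
    rng = random.Random(seed)
    return rng.sample(rows, n)


def systematic_sample(rows, n, start=0):
    """Keep every k-th row, with k = len(rows) // n, beginning at `start`."""
    k = len(rows) // n
    if k < 1:
        raise ValueError("sample size exceeds population size")
    # A row is kept when its position leaves the chosen remainder after
    # division by k, much like a MOD(row, k) test in a spreadsheet.
    picked = [row for i, row in enumerate(rows) if i % k == start % k]
    return picked[:n]  # trim in case the step yields one row too many


# Hypothetical stand-in for the gender column: 1 = male, 0 = female.
# Only the totals (247 and 260) come from the report; the order is made up.
population = [1] * 247 + [0] * 260
random.Random(0).shuffle(population)

random_rows = simple_random_sample(population, 36, seed=1)
systematic_rows = systematic_sample(population, 36, start=3)

print("random sample:", len(random_rows), "rows,", sum(random_rows), "male")
print("systematic sample:", len(systematic_rows), "rows,", sum(systematic_rows), "male")
```

Random integers drawn one at a time can repeat, so the sketch samples without replacement to guarantee 36 distinct rows. The male counts it prints depend on the made-up row order and will not necessarily match the 17 and 19 reported here.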
I then generated the random sample data with Excel's random-number formula (random values between 0 and 1) and created the Pareto chart for the random sample. For this chart I found the cumulative frequency and percentage with Excel formulas.

For the systematic sampling method, the sample size was again n = 36 (n > 35; rows 1, 2, 3, 4, …, 36), much as for random sampling. For the systematic sample Pareto chart, I first took the categorical data from the data set. There were 17 males and 19 females, which I obtained using Excel's MOD formula, =MOD(number, divisor); this also helped me create the systematic sample Pareto chart.

Part 2

First I selected the whole population, and from it I selected the categorical data. From the categorical data I counted the males and females: there were 247 males and 260 females in the entire population. Using these frequencies (247 for male and 260 for female), I created the population pie chart (gender of the population) in Excel. Using the same frequencies, I created the population Pareto chart by finding the cumulative frequency and percentage.

Fig: Population pie chart (gender of the population): 49% male, 51% female
Fig: Population Pareto chart (1 = Male, 0 = Female; axes: frequency and percentage)

I used simple random sampling and systematic sampling. I selected two samples, each of size n = 36 (satisfying n > 35; rows 1, 2, 3, …, 36). For the random sample pie chart, I first took the categorical data from the data set. There were 17 males and 19 females in the sample, which I obtained with Excel's RANDBETWEEN formula. I then generated the random sample data with Excel's random-number formula (random values between 0 and 1) and created the Pareto chart for the random sample.
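The cumulative frequency and percentage behind a Pareto chart can be worked out as follows. This is a rough Python sketch under stated assumptions, not the Excel formulas used in the project: the function name pareto_table is hypothetical, and only the gender counts (247 and 260, and 17 and 19) come from the text. Categories are sorted from most to least frequent, which is the usual Pareto convention.

```python
def pareto_table(counts):
    """Return (category, frequency, cumulative frequency, cumulative %) rows."""
    total = sum(counts.values())
    # Pareto charts list categories from most frequent to least frequent.
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0
    for category, frequency in ordered:
        running += frequency
        cumulative_percent = round(100 * running / total, 1)
        rows.append((category, frequency, running, cumulative_percent))
    return rows


# Gender frequencies reported for the whole population.
population_counts = {"Male": 247, "Female": 260}

for row in pareto_table(population_counts):
    print(row)
# ('Female', 260, 260, 51.3)
# ('Male', 247, 507, 100.0)
```

The same function applied to the sample counts, {"Male": 17, "Female": 19}, gives a cumulative percentage of 52.8 after the first bar and 100.0 after the second.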
For the random sample Pareto chart, I found the cumulative frequency and percentage with Excel formulas.

Fig: Random sample pie chart (gender of the random sample data): 47% male, 53% female
Fig: Random sample Pareto chart (1 = Male, 0 = Female; axes: frequency and percentage)

For the other part I used the systematic sampling method. The sample size was again n = 36 (n > 35; rows 1, 2, 3, 4, …, 36). For the systematic sample Pareto chart, I first took the categorical data from the data set. There were 17 males and 19 females, which I obtained using Excel's MOD formula, =MOD(number, divisor); this also helped me create the systematic sample Pareto chart.

Fig: Systematic sample pie chart: 47% male, 53% female
Fig: Systematic sample Pareto chart (1 = Male, 0 = Female; axes: frequency and percentage)

Comparing the results from the two samples, I did not find any difference. Both samples have the same composition, 17 males and 19 females, so the random sample pie chart and Pareto chart look exactly the same as the systematic sample pie chart and Pareto chart.

Part 3

Fig: Population age (histogram of frequency by age; bins 19, 29, 39, 49, 59, 69, More)
Fig: Random sampling age (histogram of frequency by age; bins 18, 23.5, 29, 34.5, 40, 45.5, More)
Fig: Systematic sampling age (histogram of frequency by age; bins 19, 25, 31, 37, 43, 49, More)

For the quantitative variable I selected age from the entire data set. I computed the population mean, population standard deviation and five-number summary in Excel.
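The mean, standard deviation and five-number summary mentioned above can be illustrated with a short Python sketch. It is offered only as an approximation of the idea, not as the Excel workbook used here: the function name describe and the list of ages are hypothetical, and the quartiles use the "inclusive" interpolation rule, which is one of several conventions and may give slightly different values in other tools.

```python
import statistics


def describe(values, is_population=False):
    """Return n, mean, standard deviation and the five-number summary."""
    ordered = sorted(values)
    # Quartiles by linear interpolation between the observed values.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    if is_population:
        sd = statistics.pstdev(ordered)  # divides by N
    else:
        sd = statistics.stdev(ordered)   # divides by n - 1
    return {
        "n": len(ordered),
        "mean": statistics.mean(ordered),
        "sd": sd,
        "five_number": (ordered[0], q1, median, q3, ordered[-1]),
    }


# Hypothetical ages, for illustration only (not the project's data).
ages = [19, 21, 22, 24, 25, 27, 30, 33, 41]

summary = describe(ages)
print(summary["n"], round(summary["mean"], 2), round(summary["sd"], 2))
print(summary["five_number"])
```

Passing is_population=True switches to the population standard deviation, which is the appropriate choice when the values are the entire data set rather than a sample.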
From the age data I found the bins and frequencies, which I used to create a histogram for the entire population. Using all the age data, I also created the box plot for the population.

Of the four sampling methods we have studied, I chose simple random sampling and systematic sampling. Each sample had size n = 36 (satisfying n > 35; rows 1, 2, 3, …, 36). Again, I computed the sample mean, sample standard deviation and five-number summary in Excel for both the random sample and the systematic sample.

The histograms for the random sample, the systematic sample and the population age all look similar: each is right-skewed. The box plots also look the same for all three (population age, random sample and systematic sample).

The values for the simple random sample are:
Number (n) = 36
Sample mean (X bar) = 27.64
Sample standard deviation (s) = 8.98

The values for the systematic sample are:
Number (n) = 36
Sample mean (X bar) = 28.31
Sample standard deviation (s) = 8.82

Even with different sampling methods, the values are quite similar.

Part 6

In this data analysis project I learned many things about Excel. The project is about drawing samples from the population (the categorical data in the data set) and organizing and analyzing the data in pie charts and Pareto charts using Excel. This helped me gain knowledge of Excel. I applied different statistics skills to develop this project, such as simple random sampling and systematic sampling. Using these two samples, I learned to create a population pie chart, a population Pareto chart, a random sample pie chart, a random sample Pareto chart, a systematic sample pie chart and a systematic sample Pareto chart.
To create all these charts I had to learn Excel and use various formulas for each sample and for the population. In Part 3 of the project I learned that the quantitative variable had to be age, height or weight, and I selected age from the entire data set. I computed the population mean, population standard deviation and five-number summary in Excel, which I had never done before. From the age data I found the bins and frequencies, which I used to create a histogram for the entire population; this was also new to me. Using all the age data, I created the box plot for the population.

This project has changed the way I think about real-world applications of statistics. I used to think that statistics was only about samples, populations, parameters and different charts. Now I have learned that, beyond these, it is a valuable way of studying many different fields.