Statistics 1040 - Project & Reflection

Isha Humagain
Data Analysis Project
Part 1
I selected the body measurement data set.
This is a data analysis project. It involves drawing samples from a population (the data set),
starting with a categorical variable, and organizing and analyzing the data in pie charts and
Pareto charts using Excel. In this project I use two of the four sampling methods we have
studied: simple random sampling and systematic sampling. I create a population pie chart, a
population Pareto chart, a random sample pie chart, a random sample Pareto chart, a systematic
sample pie chart, and a systematic sample Pareto chart. To create these charts I use Excel, with
a different formula for each sample and for the population.
To create the population pie chart, I first selected the whole population from the data set,
using only the categorical data (gender). From the categorical data I counted the males and
females: 247 males and 260 females in the entire population. Using these frequencies (247 for
male and 260 for female), I created the population pie chart (gender of population) in Excel.
Using the same frequencies, I created the population Pareto chart by computing the cumulative
frequency and percentage for each gender.
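As a rough illustration outside Excel, the cumulative frequency and cumulative percentage behind a Pareto chart can be sketched in Python. The counts 247 and 260 come from the text above; the function name `pareto_table` is hypothetical, and this is not the Excel worksheet used in the project.

```python
# Gender counts reported in the text for the whole data set.
counts = {"Male": 247, "Female": 260}

def pareto_table(counts):
    """Return rows of (category, frequency, cumulative frequency, cumulative percent)."""
    total = sum(counts.values())
    rows = []
    running = 0
    # A Pareto chart orders categories from most to least frequent.
    for category, freq in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += freq
        rows.append((category, freq, running, round(100 * running / total, 1)))
    return rows

for row in pareto_table(counts):
    print(row)
```

The bars of the chart use the frequency column, and the line uses the cumulative percent column, which always ends at 100%.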
For the other part, I had to choose two of the four sampling methods that we have studied. I
used simple random sampling and systematic sampling. I selected two samples, each of size
n>35 (e.g. 1, 2, 3…….36). For the random sample, I first took the categorical data from the
data set. I obtained the sample using Excel's random-number formulas (=RANDBETWEEN, and =RAND,
which returns a random number between 0 and 1). The random sample contained 17 males and 19
females. I then created the pie chart and the Pareto chart for the random sample; for the
Pareto chart I computed the cumulative frequency and percentage with Excel formulas. For the
systematic sample, I again selected a sample of size n>35 (1, 2, 3, 4…..36), much as in the
random sampling method. I first took the categorical data from the data set and obtained the
sample using Excel's MOD formula, =MOD(Number, Divisor). The systematic sample also contained
17 males and 19 females, and I used these counts to create the systematic sample Pareto chart.
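The two sampling methods can be sketched in Python as follows. This is only an illustration under stated assumptions: the population list is a hypothetical stand-in with 247 "M" and 260 "F" labels in shuffled order, the function names are made up, and the interval k = 507 // 36 is one common choice rather than the divisor actually used in the Excel sheet.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Pick n distinct items, each equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick every k-th item after a random start, where k = len(population) // n."""
    rng = random.Random(seed)
    k = len(population) // n          # sampling interval
    start = rng.randrange(k)          # random start in 0..k-1
    # Row i is kept when i % k == start, which mirrors a MOD(row, k) test in a spreadsheet.
    picked = [item for i, item in enumerate(population) if i % k == start]
    return picked[:n]

# Hypothetical population: 247 "M" and 260 "F" labels in an arbitrary order.
population = ["M"] * 247 + ["F"] * 260
random.Random(0).shuffle(population)

srs = simple_random_sample(population, 36, seed=1)
sys_sample = systematic_sample(population, 36, seed=1)
print(len(srs), srs.count("M"), srs.count("F"))
print(len(sys_sample), sys_sample.count("M"), sys_sample.count("F"))
```

Because both samples depend on random choices, the male and female counts will generally change with the seed and will not necessarily match the 17 and 19 reported here.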
Part 2
First I selected the whole population, and from it I selected the categorical data (gender).
From the categorical data I counted the males and females: 247 males and 260 females in the
entire population. Using these frequencies (247 for male and 260 for female), I created the
population pie chart (gender of population) in Excel. Using the same frequencies, I created the
population Pareto chart by computing the cumulative frequency and percentage for each gender.
Population Pie chart
[Figure: pie chart "Gender of the population": male 49%, female 51%]
Population Pareto chart:
[Figure: Pareto chart with categories 1=Male and 0=Female; left axis Frequency, right axis cumulative Percentage (0% to 100%)]
I used simple random sampling and systematic sampling. I selected two samples, each of size
n>35 (1, 2, 3…….36). For the random sample, I first took the categorical data from the data set.
I obtained the sample using Excel's random-number formulas (=RANDBETWEEN, and =RAND, which
returns a random number between 0 and 1). The random sample contained 17 males and 19 females.
I then created the pie chart and the Pareto chart for the random sample; for the Pareto chart I
computed the cumulative frequency and percentage with Excel formulas.
Random Sample pie chart
[Figure: pie chart "Gender of the Random sample data": male 47%, female 53%]
Random Sample Pareto chart:
[Figure: Pareto chart with categories 1=Male and 0=Female; left axis Frequency, right axis cumulative Percentage (0% to 100%)]
For the other part I used the systematic sampling method. I selected a sample of size n>35
(1, 2, 3, 4…..36). I first took the categorical data from the data set and obtained the sample
using Excel's MOD formula, =MOD(Number, Divisor). The systematic sample contained 17 males and
19 females, and I used these counts to create the systematic sample Pareto chart.
Systematic sample Pie chart:
[Figure: pie chart titled "Frequency": male 47%, female 53%]
Systematic sample Pareto chart:
[Figure: Pareto chart with categories 1=Male and 0=Female; left axis Frequency, right axis cumulative Percentage (0% to 100%)]
Comparing the results from the two samples, I did not find any difference in the gender counts.
Both samples have the same size and the same composition, 17 males and 19 females, so the random
sample pie chart and Pareto chart look exactly the same as the systematic sample pie chart and
Pareto chart.
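The comparison can also be checked against the population with a small calculation. This sketch uses only the counts reported above (247 and 260 for the population, 17 and 19 for each sample); the helper name `percent_male` is hypothetical.

```python
def percent_male(male, female):
    """Percentage of males in a group, rounded to one decimal place."""
    return round(100 * male / (male + female), 1)

population_pct = percent_male(247, 260)   # whole data set
sample_pct = percent_male(17, 19)         # both samples reported 17 males, 19 females

print(population_pct, sample_pct, round(abs(population_pct - sample_pct), 1))
```

This gives about 48.7% male in the population and 47.2% male in each sample, a gap of roughly 1.5 percentage points.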
Part 3
Fig: Population age (histogram; vertical axis Frequency from 0 to 300; bins 19, 29, 39, 49, 59, 69, More)
Fig: Random sampling age (histogram; vertical axis Frequency from 0 to 16; bins 18, 23.5, 29, 34.5, 40, 45.5, More)
Fig: Systematic sampling age (histogram; vertical axis Frequency from 0 to 18; bins 19, 25, 31, 37, 43, 49, More)
For the quantitative variable I selected age from the entire data set. I computed the
population mean, population standard deviation, and five-number summary in Excel. From the data
I determined the bins and frequencies, which I used to create a histogram for the entire
population. Using all of the age data, I also created the box plot for the population.
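The same summary steps can be sketched in Python. The ages below are hypothetical values made up for illustration, not the project data; the bin edges are the ones shown on the population histogram. The quartiles here use linear interpolation (`method="inclusive"`), so they may differ slightly from Excel's values depending on which quartile function was used.

```python
import statistics

def summarize(values):
    """Mean, population standard deviation, and five-number summary."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "pstdev": statistics.pstdev(values),
        "five_number": (min(values), q1, median, q3, max(values)),
    }

def histogram_counts(values, upper_edges):
    """Count values up to each upper bin edge; the last count is the 'More' bin."""
    counts = [0] * (len(upper_edges) + 1)
    for v in values:
        for i, edge in enumerate(upper_edges):
            if v <= edge:
                counts[i] += 1
                break
        else:
            counts[-1] += 1   # larger than every edge: goes in 'More'
    return counts

# Hypothetical ages, for illustration only.
ages = [19, 21, 22, 24, 25, 27, 30, 33, 41, 62]
bins = [19, 29, 39, 49, 59, 69]   # upper edges from the population histogram

print(summarize(ages))
print(histogram_counts(ages, bins))
```

For a sample rather than the whole population, `statistics.stdev` (which divides by n - 1) would replace `statistics.pstdev`.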
From the four sampling methods we have studied, I chose simple random sampling and systematic
sampling. I selected two samples, each of size n>35 (i.e. 1, 2, 3…..36).
Again, I computed the sample mean, sample standard deviation, and five-number summary using
Excel formulas for both the random sample and the systematic sample. The histograms for the
random sample, the systematic sample, and the population age all have a similar shape, i.e.
right skewed. The box plots also look the same for all three (population age, random sample,
and systematic sample).
The values for the simple random sample are:
Number (n) = 36
Sample mean (X bar) = 27.64
Sample standard deviation (s) = 8.98
The values for the systematic sample are:
Number (n) = 36
Sample mean (X bar) = 28.31
Sample standard deviation (s) = 8.82
Even with different sampling methods, the values are quite similar.
Part 6:
In this data analysis project I learned many things about Excel. The project involved drawing
samples from the population (the data set), starting with categorical data, and organizing and
analyzing the data in pie charts and Pareto charts using Excel. This helped me build my Excel
skills. I applied several statistical skills to develop this project, including simple random
sampling and systematic sampling. Using these two samples, together with the population, I
learned to create a population pie chart, a population Pareto chart, a random sample pie chart,
a random sample Pareto chart, a systematic sample pie chart, and a systematic sample Pareto
chart. For all of these charts I used Excel, which I learned along the way, with various
formulas for each sample and for the population.
In Part 3 of the project I learned that I had to use age, height, or weight as the
quantitative variable, and I selected age from the entire data set. I computed the population
mean, population standard deviation, and five-number summary in Excel, which I had never done
before. From the data I determined the bins and frequencies, which I used to create a histogram
for the entire population; this was also new to me. Using all of the age data, I also created
the box plot for the population.
This project has changed the way I think about real-world applications of statistics. I used
to think that statistics was only about samples, populations, parameters, and different charts.
Now I have learned that, beyond these, it is a valuable tool for study in many fields.