Calculating Probability

Ch1.1 Population and Samples
I. What is Statistics?
Statistics is the science of collecting and analyzing (numerical) data (taken
from The Oxford American Dictionary)
Usually it involves collecting partial information (a sample) from a population,
and using it to make generalizations (inference) about the population.
Ex 1. Sue wants to know the mean height of undergraduate students at NC State
University. Since she doesn’t have the resources to measure every student, she
chooses to measure 100 randomly selected students at the University.
Ex 2. A GE engineer wants to know the average lifetime of the 13-W energy-saving
light bulbs produced by a new procedure. A random sample of light bulbs is
needed. Suppose data on the lifetimes of 30 such light bulbs were collected.
II. Some statistical terms:
Data: collection of facts or observations
Variable: A characteristic of the object (or individual) in the population
Univariate data: data with only one variable
Bivariate data: data with exactly two variables
Multivariate data: data with more than two variables
Population: A collection of objects (or individuals) about which we would like to
make inferences
Sample: A subset of the population of interest
Ex 1. In Sue’s study,
The data is: 100 students’ heights
The variable of interest is: (students’) Height
The data set is a set containing 100 students’ heights
The population of interest is: NCSU undergraduate students
The sample is: 100 selected NCSU undergraduate students
Ex 2. In the GE study,
The data is: 30 lifetimes
The variable of interest is: lifetime of GE’s light bulbs
The data set is a set of 30 lifetimes
The population of interest is: GE’s 13-W energy-saving light bulbs
The sample is: 30 light bulbs
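The difference between univariate, bivariate, and multivariate data can be made concrete with a small Python sketch. All values below, and the extra variables (weight and age), are invented for illustration; they do not come from Sue's study or the GE study.

```python
# Hypothetical records for four students; every number here is invented.

# Univariate: one variable (height, in inches) per student.
heights = [64.0, 67.5, 62.0, 70.0]

# Bivariate: exactly two variables (height, weight in pounds) per student.
height_weight = [(64.0, 130.0), (67.5, 155.0), (62.0, 120.0), (70.0, 180.0)]

# Multivariate: more than two variables (height, weight, age) per student.
height_weight_age = [
    (64.0, 130.0, 19),
    (67.5, 155.0, 20),
    (62.0, 120.0, 18),
    (70.0, 180.0, 21),
]


def variables_per_record(records):
    """Return how many variables each record carries (1 for a bare number)."""
    first = records[0]
    return len(first) if isinstance(first, tuple) else 1


print(variables_per_record(heights))            # 1
print(variables_per_record(height_weight))      # 2
print(variables_per_record(height_weight_age))  # 3
```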
III. Branches of Statistics
1. Producing data: Sampling design, experimental design
• Collect data to answer specific questions by sampling or experimentation.
2. Describing data: Descriptive statistics
• Deal with the presentation of the data: summarizing the data with
numerical and graphical methods
3. Making inference: Inferential statistics
• Use information from a sample to draw conclusions about a population
• One key aspect of inferential statistics is that there is some amount of
uncertainty associated with using sample data to draw conclusions about a
population
Ex 1. (Sue’s example)
1. Sue can follow a random sampling scheme to select the 100 students.
Random selection makes it likely, though not certain, that the selected
students are representative of NCSU undergraduate students.
2. Sue can use methods from descriptive statistics to summarize the information
from the 100 students (i.e., her sample), for example by reporting the average
height of the 100 students.
3. Sue can use techniques from inferential statistics to draw conclusions about
the overall population of undergraduate students at NCSU based on the
information obtained from her sample.
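The steps in Sue's example can be sketched in Python. This is a minimal illustration, not Sue's actual procedure: the population is simulated, and its size (20,000), mean (66 inches), and spread (4 inches) are assumptions made up for the sketch. In a real study the population mean would be unknown, which is the reason for sampling in the first place.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: heights (in inches) of 20,000 undergraduates.
# The size, mean, and spread are invented for illustration.
population = [random.gauss(66.0, 4.0) for _ in range(20_000)]

# Producing data: a simple random sample of 100 students, drawn without
# replacement so that no student is measured twice.
sample = random.sample(population, k=100)

# Describing data: summarize the sample with its average height.
sample_mean = statistics.mean(sample)

# Making inference: the sample mean serves as an estimate of the
# population mean, which Sue could not compute directly.
population_mean = statistics.mean(population)

print(f"sample mean: {sample_mean:.1f} in")
print(f"population mean (normally unknown): {population_mean:.1f} in")
```

The two printed values will generally be close but not identical, which is the uncertainty that inferential statistics has to account for.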
• Suppose that the average height of the 100 students was 65 inches. Sue may
estimate, based on her sample, that the average height of all undergraduate
students at NCSU is also 65 inches, with a possible error of 1.1 inches (that
is, 65 ± 1.1).
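The notes do not say how the possible error in Sue's estimate is obtained. One common approach, shown here only as an assumption, is an approximate 95% margin of error for a sample mean: z · s / √n, where s is the sample standard deviation, n is the sample size, and z = 1.96. The function name `margin_of_error` and the five heights below are hypothetical, and a sample this small is used only to keep the arithmetic easy to follow.

```python
import math
import statistics


def margin_of_error(sample, z=1.96):
    """Approximate margin of error for the sample mean: z * s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return z * s / math.sqrt(len(sample))


# Hypothetical heights in inches, invented for illustration.
heights = [64.0, 66.0, 65.0, 63.0, 67.0]

mean = statistics.mean(heights)
moe = margin_of_error(heights)

# Report the estimate in the "mean ± error" form used in the notes.
print(f"{mean:.1f} ± {moe:.2f}")
```

With a larger sample the margin shrinks, because the standard deviation is divided by √n.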
Ex 2. The GE engineer can follow the same steps as Sue.
• In this class, we’ll concentrate on descriptive statistics and inferential statistics.
• Big picture of the class: (also see syllabus)