MTH 207 Elementary Statistics

Chapter 1 & 2 Notes
Basic definitions:
statistics – is a collection of methods for planning experiments, obtaining
data, and then organizing, summarizing, presenting, analyzing, interpreting,
and drawing conclusions based on the data
descriptive statistics – are used when the purpose of an investigation
is to describe the data that have been collected.
inferential statistics – are used when the purpose of the research is
not to describe the data, but to generalize or make inferences based
on them. In general, two major factors influence one’s confidence that
what holds true for the sample also holds true for the population at
large: the method of sample selection and the size of the sample.
The research process:
1. Specify research goals
2. Review the literature
3. Formulate hypotheses
4. Measure and record
5. Analyze the data
6. Invite scrutiny
population – the complete collection of all elements to be studied
census – the collection of data from every element in a population
sample – a subcollection of elements drawn from a population
probability or random sample – selected in such a way that each element in
the population has an equal chance of being represented
sampling frame – a list of elements in the population
Common Sampling Methods:
• Simple random sample – n subjects are selected in such a way that
every possible sample of size n has the same chance of being
chosen
• Stratified sampling – subdivide the population into at least 2
different subpopulations (strata) whose members share the same
characteristic (such as gender), then draw a sample from each
stratum
• Systematic sampling – select every kth element in the population
• Cluster sampling – divide the population into sections/clusters,
then randomly select a few of those sections, and then choose all
of the members from those selected sections
• Convenience sampling – use what is readily available
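The first four methods above can be sketched in Python. This is only an
illustration, not part of the course text: the function names, the
population of ID numbers 1 to 100, and the choice of strata and clusters
are all assumptions made for the example. The systematic version also
assumes a random starting point, which the definition above does not
specify.

```python
import random

def simple_random_sample(population, n, rng):
    # Every possible sample of size n is equally likely.
    return rng.sample(population, n)

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    # Group elements into strata, then draw a random sample from each stratum.
    strata = {}
    for element in population:
        strata.setdefault(stratum_of(element), []).append(element)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def systematic_sample(population, k, rng):
    # Pick a random start among the first k elements, then take every kth.
    start = rng.randrange(k)
    return population[start::k]

def cluster_sample(population, cluster_of, n_clusters, rng):
    # Randomly choose whole clusters and keep ALL members of those clusters.
    clusters = {}
    for element in population:
        clusters.setdefault(cluster_of(element), []).append(element)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [e for c in chosen for e in clusters[c]]

# Hypothetical population: ID numbers 1..100.
rng = random.Random(0)
population = list(range(1, 101))

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(population, lambda x: x % 2, 5, rng)    # strata: odd / even IDs
syst = systematic_sample(population, 10, rng)                      # every 10th ID
clus = cluster_sample(population, lambda x: (x - 1) // 10, 2, rng)  # clusters: blocks of 10 IDs
```

Note the difference between the last two grouped methods: stratified
sampling takes some members from every group, while cluster sampling
takes every member from some groups.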
hypothesis – a statement that describes a relationship between at least two
variables; these statements are based on either research or personal
knowledge
independent variable – the variable that produces the effect on, or
otherwise influences, the dependent variable
dependent variable – the variable that is being affected or influenced
control variables – any variables other than the above that can have an
effect on the independent-dependent variable relationship (see page 16)
parameter – a numerical measurement describing some characteristic of a
population
statistic – is a numerical measurement describing some characteristic of a
sample
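A short Python sketch of the parameter/statistic distinction. The data
here are made up for illustration (500 hypothetical exam scores), and the
mean is just one example of a numerical measurement.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population: exam scores for all 500 students in a course.
population = [rng.randint(60, 100) for _ in range(500)]

# Parameter: computed from every element of the population.
parameter = statistics.mean(population)

# Statistic: computed from a sample drawn from that population.
sample = rng.sample(population, 25)
statistic = statistics.mean(sample)

print("population mean (parameter):", parameter)
print("sample mean (statistic):   ", statistic)
```

The two values will usually be close but not identical. Drawing a
different sample changes the statistic, while the parameter stays fixed.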
qualitative or categorical data – can be separated into different categories
that are distinguished by some nonnumeric characteristic. Ex: types of
majors in college
quantitative data (also known as scale or numerical) – consists of numbers
representing counts or measurements. Ex: the number of students in NY
colleges.
4 levels of measurement:
nominal level – classes or subclasses are only named or enumerated at
this level of measurement; they are not compared. Different numbers are
assigned to different classes, but the numbers serve only as labels; no
other comparisons between them are meaningful.
ordinal level – different numbers are assigned to different amounts of
the property, and the higher the number assigned to a person or object, the
less (or more) of the property the person or object is observed to have. Ex:
ranking your 10 favorite college teachers. Note: it is not true in the ordinal
level that equal numerical differences along the numerical scale correspond
to equal increments in the property being measured.
interval level – equal numerical differences correspond to equal
increments in the property; therefore we can make meaningful statements
about the amount of difference between the points.
ratio level – interval level with the additional property of a true
(absolute) zero point: zero means none of the property is present, so
ratios of values are meaningful.
Any numerical operation can be performed on any set of numbers;
whether the resulting numbers are meaningful, however, depends on the
particular level of measurement being used.
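The point above can be sketched as a lookup table in Python. The table
follows a common textbook convention for which summaries are meaningful
at each level; it is an assumption added for illustration, not something
stated in these notes, and the names MEANINGFUL and is_meaningful are
made up for the example.

```python
import statistics

# Assumed convention: each level allows everything the previous level
# allows, plus the operations that its extra property makes meaningful.
MEANINGFUL = {
    "nominal":  {"mode"},
    "ordinal":  {"mode", "median"},
    "interval": {"mode", "median", "mean", "difference"},
    "ratio":    {"mode", "median", "mean", "difference", "ratio"},
}

def is_meaningful(level, operation):
    # True if the operation yields an interpretable result at this level.
    return operation in MEANINGFUL[level]

# Hypothetical nominal data: region codes (1 = northeast, ..., 4 = west).
region_codes = [1, 2, 2, 3, 4]

# Python will happily compute a mean of the codes...
mean_code = statistics.mean(region_codes)

# ...but the result does not describe any region.
print(is_meaningful("nominal", "mean"))   # False
print(is_meaningful("nominal", "mode"))   # True
```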
discrete – results from either a finite number of possible values or a
countable number of possible values; typically takes on integer values
(also applies to qualitative data)
continuous – results from infinitely many possible values that can be
associated with points on a continuous scale in such a way that there are no
gaps or interruptions. In any unit of measurement, whenever the variable can
take on the values a and b, it can also theoretically take on all the values
between a and b.
More examples:
• Ice cream flavors
• The speed of five runners in a 1-mile race, as measured by each runner’s
order of finish: 1 for the winner, 2 for second, etc.
• The number of people going to a particular movie theater each night as a
measure of the theater’s gross income from ticket sales, assuming each
ticket costs $7.00.
• Population of all eighth grade students in the US, with X representing
the region of the country in which the student lives. 1 = northeast, 2 =
north central, 3 = south, and 4 = west.
• Toss a coin 100 times, where X represents the number of heads obtained for
each set of 100 tosses.
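The coin-toss example can be simulated with a few lines of Python. This
is a sketch that assumes a fair coin (the example above does not say so);
the function name heads_in_tosses and the choice of 20 repetitions are
made up for the illustration.

```python
import random

def heads_in_tosses(n_tosses, rng):
    # Count heads in n_tosses tosses of a coin assumed to be fair.
    return sum(rng.random() < 0.5 for _ in range(n_tosses))

rng = random.Random(42)

# X for 20 separate sets of 100 tosses each.
counts = [heads_in_tosses(100, rng) for _ in range(20)]
print(counts)
```

Every value of X is a whole number from 0 to 100, with no values possible
in between, which is what makes X discrete rather than continuous.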
Uses and abuses of statistics (not in text)
• small samples – even a large sample can be biased
• precise numbers – a statistic that is very precise is not necessarily
accurate
• guesstimates – e.g., estimating how many people attended the Million Man
March
• distorted percentages
• partial pictures
• deliberate distortions
• loaded questions – e.g., "Since we already have enough nuclear warheads
to blow up the world, should more federal money be spent on the
defense budget?"
• misleading graphs – see text!
• pictographs – often drawn distorted
• pollster pressure – answering to favor self-image
• bad samples