statistics - Roiden Fredrich

STATISTICS
What is Statistics?
Statistics consists of a body of methods for collecting and analyzing
data (Agresti & Finlay, 1997). It is a method of dealing with data. It
is a tool concerned with the collection, organization, presentation,
analysis and interpretation of numerical information.
Two branches of statistics
Descriptive Statistics is concerned with the presentation of
information in a convenient, usable and understandable form (Runyon and
Haber, 1986). Other writers refer to descriptive statistics as the
procedures used in describing properties of a sample, or of a
population where complete population data are available.
Example: If we measure the Intelligence Quotient (IQ) of all the
students in the School of Graduate Studies and calculate its mean, that
mean is a descriptive statistic because it describes a characteristic
of a complete population.
Inferential Statistics is concerned with generalizing this information
or, more specifically, with making inferences about populations based
upon samples taken from those populations (Runyon & Haber, 1986). Here
a sample is selected with the intent of predicting what the larger
population is like.
Example: If we wish to make a statement about the mean IQ of all
students in the School of Graduate Studies at the Bukidnon State
College based on a mean computed from a sample of 100 students, and to
estimate the error involved, we use procedures from inferential
statistics.
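The contrast between the two branches can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
population of 1,000 simulated IQ scores (drawn from a normal
distribution with mean 100 and standard deviation 15), the fixed random
seed, and all variable names are hypothetical and are not data from the
Bukidnon State College. The population mean is the descriptive step;
the sample mean with its estimated standard error is the inferential
step.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated IQ scores for 1,000 graduate students.
population = [random.gauss(100, 15) for _ in range(1000)]

# Descriptive: the mean of the complete population.
population_mean = statistics.mean(population)

# Inferential: estimate that mean from a random sample of 100 students.
sample = random.sample(population, 100)
sample_mean = statistics.mean(sample)

# Estimated standard error of the sample mean: s / sqrt(n).
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f} (standard error {standard_error:.2f})")
```

In practice the population mean is not available, which is why the
sample mean and its standard error are reported instead.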
Terms and Concepts
Variable and Constant

Variable refers to a characteristic or phenomenon which may take on
different values. In addition, a variable is something that has two or
more meaningful and useful divisions, categories, characteristics, or
values (Grimm & Wozniak, 1990).
Example:
1. Grade point average
2. Height
3. Weight
4. Tribe
5. Age
These will take on different values when
different individuals are observed.
Other examples of variables are:
Shirt size (small, medium, large, extra-large).
Social class, with categories of upper, middle and lower class.
Religion, with categories of Roman Catholic, Protestant, Seventh Day
Adventist, Mormon, etc.
A variable is contrasted with a constant, the value of which never
changes.
Example: pi is a constant whose value is always approximately 3.1416.
Population, Sample and Census

Population is a complete set of
individuals, objects or measurements of
interest in a study. Sometimes the
population is a clearly defined set of
subjects.
Example:
We may wish to investigate the grades of all the students after this
course to find out the relationship between their Grade Point Average
and their scores in other foundation subjects.
Sample is a subset of a population. It is a portion of the population.
Oftentimes it is impossible to take all the members of the population
because of cost, time and manpower constraints. A subgroup may be
selected to represent the total population.
Example:
We may choose only 100 students from the School of Graduate Studies at
the Bukidnon State College. The 100 students are then the sample.
Census is the collection of data from every element in the population
(Triola, 1998). A census involves what is called complete enumeration.
Closely related to the concepts of population and sample are the
concepts of parameter and statistic. The following definitions are easy
to remember if we recognize the alliteration in “population parameter”
and “sample statistic.”
Parameter and Estimates

Parameter is any characteristic of the population which is measurable.
It is a numerical measurement describing some characteristic of a
population. Usually, parameters or population values are unknown. We
estimate them from sample values. In statistical notation, Greek
letters (e.g. µ and σ) are used to represent population parameters.
Example: The mean and standard deviation of the grade point averages of
all students in the School of Graduate Studies.
Estimate or statistic is a value calculated from a sample in order to
estimate the population parameter. It is a numerical summary of the
sample data. We shall employ Roman letters (X̄ and s) to represent
estimates. Different symbols are used for parameters and statistics.
Example: The mean IQ score of a random sample of students in this class
is used to estimate the mean IQ score of all the students in the School
of Graduate Studies.
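The difference between a parameter and a statistic can be made concrete
with a short Python sketch. The GPA values, the choice of sample, and
the variable names below are hypothetical and serve only as an
illustration. The sketch relies on Python's standard `statistics`
module, which provides `pstdev` for a population standard deviation
and `stdev` for a sample standard deviation.

```python
import statistics

# Hypothetical GPAs for a complete (tiny) population of N = 8 students.
population = [1.5, 1.75, 2.0, 2.0, 2.25, 2.5, 2.75, 3.0]

# Parameters: computed from every member of the population.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)   # population standard deviation

# Statistics: computed from a sample of n = 4 of those students.
sample = [1.75, 2.0, 2.5, 3.0]
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)            # sample standard deviation

print(f"N = {len(population)}, mu = {mu:.4f}, sigma = {sigma:.4f}")
print(f"n = {len(sample)}, x_bar = {x_bar:.4f}, s = {s:.4f}")
```

The sample values x_bar and s serve as estimates of mu and sigma, which
in a real study would usually be unknown.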
Characteristic                     Parameter    Statistic
Mean                               µ (mu)       X̄
Standard deviation                 σ            s
Variance                           σ²           s²
Pearson Correlation Coefficient    ρ            r
Number of Cases                    N            n

The Nature of Data
Some data sets consist of numbers (such
as heights, scores in the test, etc.) and
others are nonnumerical (such as
gender). The terms quantitative and
qualitative data are often used to
distinguish between these two types.
1. Quantitative data consist of numbers representing counts or
measurements. Quantitative data can be described by distinguishing
between the discrete and continuous types.
Discrete data result from either a finite number of possible values or
a countable number of possible values. The number of possible values is
0, or 1, or 2 and so on.
Continuous data result from infinitely many possible values that can be
associated with points on a continuous scale in such a way that there
are no gaps or interruptions.
When data represent counts, they are discrete; when they represent
measurements, they are continuous.
The number of students in this class is discrete data; the amount each
one has in the wallet now is continuous data because it is a
measurement that can assume any value over a continuous span.
Four Levels of Measurement

Another way to classify data is to use four levels of measurement:
1. nominal,
2. ordinal,
3. interval and
4. ratio.
The nominal level of measurement is characterized by data that consist
of names, labels, or categories only. The data cannot be arranged in an
ordering scheme (such as low to high). The simplest measurement scale
is termed nominal or classificatory.
The categories of nominal variables do not differ by quantity, degree,
or amount, but only by kind.
Example:
The two categories of the nominal variable “gender” (male and female)
are distinct, do not overlap, include all possible sexes, and cannot be
ordered or ranked. The same would be true of the nominal variable
“region”, which might be broken into the categories of NCR, Region I,
Region II, Region III, Region IV, Region V, Region VI, Region VII,
Region VIII, Region IX, Region X, Region XI, Region XII, ARMM, etc.
Nominal scales represent the lowest level of
measurement because they allow you only to count
and compare the number of cases in each category.
Other examples of nominal scales are given below:

The numbers on baseball players’ uniforms are nominal in nature. In
social science research, groups in a sample are commonly labeled with
numbers (such as 1 = Matigsalog, 2 = Talaandig, 3 = Higaonon, 4 =
Manobo). However, when numbers have been attached to categories in this
way, averaging the numbers together is not advisable. On the scale
above for ethnic groups, an average score of 1.87 would have no
meaning.
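A minimal Python sketch of this point follows. The coding scheme is the
one given above; the ten coded responses and the variable names are
hypothetical. Counting cases per category and finding the most frequent
category are meaningful operations for nominal data, while the mean of
the codes can be computed but carries no interpretation.

```python
from collections import Counter
import statistics

# Coding scheme from the example above.
labels = {1: "Matigsalog", 2: "Talaandig", 3: "Higaonon", 4: "Manobo"}

# Hypothetical coded responses from ten respondents.
codes = [1, 2, 2, 4, 1, 3, 2, 4, 1, 2]

# Meaningful for nominal data: frequency of each category and the mode.
counts = Counter(labels[c] for c in codes)
most_common_group = labels[statistics.mode(codes)]

# Computable but meaningless: the codes are labels, not quantities.
mean_code = statistics.mean(codes)

print(counts)
print("mode:", most_common_group)
print("mean of codes (no interpretation):", mean_code)
```

Relabeling the groups in a different order would change the mean of the
codes but not the counts, which is another way to see that the mean
describes the labeling rather than the respondents.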
The
ordinal measurement scale involves data that may be arranged in some
order, but differences between data values either cannot be determined
or are meaningless. The ordinal measurement scale classifies people or
things into types or kinds, but with one additional feature: the
classes or categories can be ranked. Ordinal categories are distinct,
mutually exclusive, and exhaustive, but they are also orderable in
terms of quantity, magnitude, or some other criterion.
In other words, ordinal measurement scales have the property of
magnitude but not the property of equal intervals or the property of
absolute zero. They allow us to rank individuals or objects but not to
say anything about the meaning of the differences between the ranks.
Example:
The three categories of the ordinal scale “social class” (upper,
middle, and lower) are distinct, do not overlap, include the entire
range of social class, and can be ranked: the upper class is higher
than the middle class and the middle class is higher than the lower
class. No statement can be made, however, about the amount of
difference between categories. The differences between upper and middle
and between middle and lower are not calculable.
Another example is ranking students by GPA. If you ranked 1st in a
class of 400, the rank indicates greater than or less than, but not how
much higher or lower.
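The ranking example can be sketched in Python as follows. The student
names and GPA values are hypothetical, and the sketch assumes that a
higher GPA is better. It shows that adjacent ranks always differ by
one, while the underlying GPA differences are unequal, so the ranks
alone cannot tell us how far apart two students are.

```python
# Hypothetical GPAs (higher is assumed better) for four students.
gpas = {"Ana": 3.9, "Ben": 3.5, "Cara": 3.4, "Dan": 2.1}

# Rank 1 goes to the highest GPA.
ordered = sorted(gpas, key=gpas.get, reverse=True)
ranks = {name: position for position, name in enumerate(ordered, start=1)}

for first, second in zip(ordered, ordered[1:]):
    rank_gap = ranks[second] - ranks[first]   # 1 for every adjacent pair
    gpa_gap = gpas[first] - gpas[second]      # differs from pair to pair
    print(f"{first} -> {second}: rank gap {rank_gap}, GPA gap {gpa_gap:.1f}")
```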
The
interval level of measurement is like the ordinal level, with the
additional property that we can determine meaningful amounts of
difference between data. However, there is no inherent (natural) zero
starting point (where none of the quantity is present).
Although the categories of nominal and ordinal scales cannot be further
subdivided on a measurement scale, the values of interval scales permit
distances and differences between values on a scale to be considered or
measured. Some social researchers further distinguish between interval
and ratio scales. In both cases the intervals are of equal size. With
interval scales, however, there is an arbitrary zero point, whereas
with ratio variables there is a true zero point where zero is
equivalent to a total absence of the variable.
Example:
Time measured by calendars, temperature on the Fahrenheit scale, and
intelligence measured by IQ scores are interval variables because zero
values do not mean the total absence of time, temperature, or
intelligence, respectively. In contrast, age, income, and urbanization
(percent of a population living in urban places) are ratio variables
because zero values do indicate a total absence of those attributes.

The ratio level of measurement is the interval level modified to
include an inherent zero starting point (where zero indicates that none
of the quantity is present). For values at this level, differences and
ratios are both meaningful.
For most statistical purposes interval and ratio scales are treated as
a similar type of measurement scale. Note, however, that a major
difference is the fact that one cannot form ratios with values on an
interval scale. For example, it is incorrect to say that 60° is twice
as hot as 30°; but it is correct to say that PhP 60,000.00 is twice as
much as PhP 30,000.00. Because of the scarcity of interval variables,
the ambiguity concerning the differences between interval and ratio
scales, and their similar statistical treatment, it makes sense to
treat these two types of measurement scales as one type.
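One way to see why ratios fail on an interval scale is to change the
unit of measurement, as in the Python sketch below. It assumes the
temperatures in the example are in degrees Fahrenheit (the scale named
earlier as an interval variable); the function and variable names are
illustrative only. On an interval scale the ratio changes when the unit
changes, because the zero point is arbitrary. On a ratio scale it does
not.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9

# Interval scale: the zero point is arbitrary, so ratios depend on the unit.
ratio_f = 60 / 30
ratio_c = fahrenheit_to_celsius(60) / fahrenheit_to_celsius(30)

# Ratio scale: zero means "no money", so ratios survive a change of unit.
CENTAVOS_PER_PESO = 100
ratio_pesos = 60_000 / 30_000
ratio_centavos = (60_000 * CENTAVOS_PER_PESO) / (30_000 * CENTAVOS_PER_PESO)

print(f"Fahrenheit ratio: {ratio_f}, Celsius ratio: {ratio_c:.2f}")
print(f"peso ratio: {ratio_pesos}, centavo ratio: {ratio_centavos}")
```

The same two temperatures give a ratio of 2 in Fahrenheit but a
negative ratio in Celsius, so "twice as hot" describes the unit rather
than the temperatures. The two amounts of money stand in the same ratio
whether they are counted in pesos or in centavos.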