Presentation on Types of Data

Statistics
An intro to our world of Data
by S. Middleton, M.A.
There are lots of reasons one takes a
course in school but…
WHY STUDY STATS?
Statistics is Historical . . .
Statistical census taking goes back to
Babylonia and ancient Egypt; however, the
Roman Empire gives us its name. Statistics
is derived from the Latin word status,
which means “state.”
An example of a census:
“In those days Caesar Augustus issued a decree that
a census should be taken of the entire Roman
world. This was the first census that took place
while Quirinius was Governor of Syria. And
everyone went to his own town to register”.
Recorded in the Bible by the physician Luke (Luke 2:1-3).
Statistics is Current . . .
Modern day executives use statistics to
shape decision making. Take, for example:
Moneyball
Moneyball is the story of
Billy Beane, a Mt. Carmel
Graduate, who changed
the way Major League
Baseball evaluates its
prospects based upon
statistical information.
Why Study Stats?
As students, we study Statistics:
• To be able to read and understand the
statistical studies we meet in our careers.
• Because statistical procedures are basic to
research in all fields.
• To become better citizens and consumers.

In this presentation and in the Stats course…
WHAT WILL I LEARN?
Goals for our study…
We will answer these questions:
• What are the branches of Statistics?
• What are data?
• Where do we get data?
Two Branches of Statistics
• Descriptive Statistics,
which utilizes numerical & graphical methods to
look for patterns in a data set and summarize the
information in a convenient form.
• Inferential Statistics,
which utilizes sample data to make predictions,
estimates, decisions, or other generalizations
about a larger set of data.
DESCRIPTIVE STATISTICS
• The U.S. market for credit cards.
The Nilson company collected data on credit or debit
purchases recorded in the U.S. during the first six
months of 1998.
US Market Share for Credit Cards
VISA: 50%
MASTERCARD: 25%
AMERICAN EXPRESS: 18%
DISCOVER: 6%
DINERS CLUB: 1%
DESCRIPTIVE STATISTICS
With descriptive statistics, the statistician
tries to DESCRIBE a situation.
• Often the data is presented in some
meaningful form, such as charts, graphs, or
tables.
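As a rough illustration of describing data, the sketch below takes the market-share percentages from the credit card slide and prints them as a sorted text bar chart. The names `market_share` and `text_bar_chart` are made up for this example; they are not part of the original study.

```python
# Market-share percentages copied from the credit card slide.
market_share = {
    "VISA": 50,
    "MASTERCARD": 25,
    "AMERICAN EXPRESS": 18,
    "DISCOVER": 6,
    "DINERS CLUB": 1,
}


def text_bar_chart(shares):
    """Return one line per category, largest share first, with a bar of '#'."""
    lines = []
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    for name, pct in ranked:
        bar = "#" * (pct // 2)  # one '#' for every 2 percentage points
        lines.append(f"{name:<17}{pct:>3}% {bar}")
    return lines


for line in text_bar_chart(market_share):
    print(line)
```

Nothing is predicted or inferred here; the code only organizes and displays the data that was collected, which is the job of descriptive statistics.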

A second branch is…
INFERENTIAL STATISTICS

Family Home Journal Study
A group of 1,017 men, all 48 years old,
was studied for 18 years. It was found that
60% to 70% of the unmarried men were still
alive at age 65, while 90% of the married men
were alive at age 65. The researchers concluded
that marriage contributed to the length of
one’s life.
What are data?
• The KINDS of data.
• The SOURCE of data.
• TECHNIQUES in obtaining data.
• The MEASUREMENT classifications.

Data can be …
(classified by Kind)
• QUANTITATIVE
- involves numbers
OR
• QUALITATIVE
- is sometimes called “categorical”
DATA . . . Classified for you
Data
• Quantitative (Measured)
- Discrete (can be counted): 1 or 2, but not 1.2
- Continuous (can take any value in a range): 1, 1.01, 1.1, 1.07, etc.
• Qualitative (Categorical)
Data has a source …

A POPULATION: “All of the Observations or Measurements”

A SAMPLE: “A Portion of the Population”
An Example from Hawaii

The Department of Agriculture wants to know if
this crop of pineapples is undersized. They
take the individual weights of a sample of
100 pineapples from an experimental field of
pineapples for study.
What’s the Population?
The weights of all the pineapples in the field.
The Sample?
The weights of the 100 pineapples selected.
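The pineapple example can be mimicked in a few lines of Python. This is only a sketch: the field size, the average weight, and the spread of weights are invented numbers, since the slide gives none of them.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical POPULATION: the weight (in pounds) of every pineapple in the field.
# The field size (5000), mean (4.0), and spread (0.5) are invented for illustration.
population = [random.gauss(4.0, 0.5) for _ in range(5000)]

# The SAMPLE: 100 pineapples chosen at random, without replacement.
sample = random.sample(population, 100)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # what the department can measure

print(f"Population size: {len(population)}, sample size: {len(sample)}")
print(f"Population mean weight: {population_mean:.2f} lb")
print(f"Sample mean weight:     {sample_mean:.2f} lb")
```

The sample mean will usually land close to, but not exactly on, the population mean. Using the sample to say something about the whole field is the job of inferential statistics.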
TECHNIQUES to Produce data
• Observation
• Experiment
• Simulation
• Survey

Data can be Produced by…
• Observation –
the researcher merely observes what is
happening or what has happened in the past.
Motorcycle Industry Council–
collected data on the ages and incomes of
motorcycle owners in 1980 and then again in 1998.
The researchers merely stated that motorcycle
owners were getting older (USA Today). There was
no research intervention.
Data can be Produced by…

Experimentation –
the researcher manipulates one of the variables and
tries to determine how the manipulation influences
the other variables.
Virginia Polytechnic Institute
(Psychology Today)–
they divided female undergraduate students into two
groups and had them do as many sit-ups as possible in 90
seconds. The first group was told only to “do your best,”
while the second group was told to try to increase their
best by 10% each day. They were measured again after 4
days to see what happened.
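The slide does not say how the two groups were formed. A common approach in experiments is random assignment, sketched below. The function name `assign_groups`, the participant IDs, and the group size of 20 are all assumptions made for illustration.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split participants into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is not changed
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant IDs; the real study's group size is not given.
students = [f"student_{i:02d}" for i in range(1, 21)]
do_your_best, improve_by_10_percent = assign_groups(students, seed=42)

print("Told to do their best:  ", do_your_best)
print("Told to improve by 10%: ", improve_by_10_percent)
```

Letting chance decide who lands in which group helps make sure the two groups differ only in the instruction they were given, which is the variable the researcher is manipulating.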
Data can be Produced by…

Simulation –
a researcher may use probability experiments to
mimic real life situations that might be too costly,
dangerous, or time-consuming.
NASA space shuttle pilots are trained using the simulator, rather
than learning on the real shuttle.
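A probability experiment can be simulated on a computer as well. The sketch below is a small, made-up example (it is not from the slides): it rolls two dice many times and estimates how often the sum is 7. The function name and the number of trials are arbitrary choices.

```python
import random


def estimate_probability_of_seven(trials, seed=None):
    """Estimate P(two dice sum to 7) by simulating `trials` rolls."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        total = rng.randint(1, 6) + rng.randint(1, 6)  # roll two fair dice
        if total == 7:
            hits += 1
    return hits / trials


estimate = estimate_probability_of_seven(100_000, seed=0)
print(f"Estimated P(sum = 7): {estimate:.3f} (the exact value is 1/6)")
```

Rolling real dice 100,000 times would be time-consuming; the simulation produces comparable data in a fraction of a second.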
Data can be Produced by…

Survey –
one of the most common methods for obtaining information
is a survey. There are many types, but 4 common methods
are: Telephone, Mailed Questionnaire, Personal Interview,
and surveying records.
Literary Digest: In 1932, Literary Digest conducted a survey by mailing
questionnaires to subscribers asking questions about the
upcoming election.
Which Technique is Best ?

In the following slides, decide which of the
techniques just presented (Observation,
Experimentation, Simulation, or Survey)
seems most appropriate for each scenario.
A study of the effect of
stopping the cooling process of
a nuclear reactor.

Probably SIMULATION.
I don’t think you want a meltdown!
Study the effect of calcium
supplements given to young
girls on their bone mass

EXPERIMENTATION
This is very similar to studies the AMA performs
regularly. In fact this one was done by Tom
Lloyd, who used 94 girls, half of whom were
given the calcium and half a placebo. He
found that 1.3% more bone mass was gained by
girls using the calcium treatment. He published
his findings in the “Journal of the American
Medical Association”.
Study the credits earned by
each student enrolled at
MCHS.

Surveying Records
The registrar keeps these records for every
student; you could request a report of this data.
Data can be CLASSIFIED by
how it is measured.
There are 4 levels of measurement
• Nominal Level
• Ordinal Level
• Interval Level
• Ratio Level
Nominal Level
• Nominal means “in name only.”
It refers to data that cannot be
organized or ranked; no value is
greater than another.
Examples include: color,
names of cities, ski areas, etc.
Ordinal Level
• Ordinal Level includes
categorical data that can be ranked
or placed in an order, but actual
differences between data values
cannot be determined or are
meaningless.
Examples include: rankings of NBA teams,
AAA motel ratings, ratings of digital
cameras, etc.
Interval Level
• Interval Level means that data can be
ranked and that differences between
the values are meaningful.
There is NO real ZERO involved,
however.
Examples include: temperature
(in °F or °C), years when Democrats
won elections, etc.
Ratio Level
• Ratio Level allows for ranking,
taking differences, and finding a
“ratio” between the data values.
Because there is a real ZERO, it
makes sense to say that one data
value is “twice” as long as the other.
Examples include: length, time lapse
between checks at a bank, temperature
measured in kelvins (K).
Levels of Measurement
Classify each of the following data values as Nominal,
Ordinal, Interval, or Ratio.
Senator’s name is Sam Wilson.
Nominal
The Senator is 58 yrs old.
Ratio
He was elected to the Senate in
1980, 1986, and 1992.
Interval
His Income is $878,314.
Ratio
A leading magazine claims he is
ranked 7th based on his voting
record on bills regarding schools.
Ordinal
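One way to check answers like these is to ask three yes/no questions about the data: Can it be ranked? Are differences meaningful? Is there a real zero? The sketch below encodes that idea. The names `LEVELS` and `classify` are invented for this example, and the yes/no answers for each Senator fact are a reading of the slides, not an official rule.

```python
# Properties each level of measurement supports, following the slides:
# (can be ranked, differences are meaningful, has a real zero)
LEVELS = {
    "Nominal":  (False, False, False),
    "Ordinal":  (True,  False, False),
    "Interval": (True,  True,  False),
    "Ratio":    (True,  True,  True),
}


def classify(can_rank, meaningful_differences, real_zero):
    """Return the level whose properties match the three yes/no answers."""
    answers = (can_rank, meaningful_differences, real_zero)
    for level, properties in LEVELS.items():
        if properties == answers:
            return level
    raise ValueError("No level of measurement has this combination of properties")


senator_facts = {
    "name": classify(False, False, False),
    "age in years": classify(True, True, True),
    "election years": classify(True, True, False),
    "income": classify(True, True, True),
    "rank on school bills": classify(True, False, False),
}

for fact, level in senator_facts.items():
    print(f"{fact}: {level}")
```

Each level keeps every property of the level before it and adds one more, which is why the four levels are usually listed in this order.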
In Summary, Here’s What YOU
should know.

The KINDS of data. Quantitative & Qualitative

The SOURCE of data. Population or Sample

Techniques for PRODUCING data.
Observation, Survey, Experiment, Simulation

CLASSIFICATION by measurement level.
Nominal, Ordinal, Interval, or Ratio
It’s YOUR turn!!!!!!
Take each of the questions we
answered in the survey at the
beginning of this unit and classify
the level of measurement of the
data values collected.