File - Jenne Meyer PhD

Week 5
Dr. Jenne Meyer

The field of statistics is often regarded as dry,
antiseptic, and unfeeling.

We aim to dispel this misapprehension by sharing these
data stories with you.

In these stories, individual people speak from behind
the numbers.
Hurricane Katrina was the costliest and one of the
deadliest hurricanes in American history. Damages
exceeded $50 billion and fatalities exceeded 1300,
according to the National Oceanic and
Atmospheric Administration. In September 2005, a
survey was conducted of a group of hurricane
survivors who had later been moved to shelters in
the Greater Houston area. The respondents who
did not evacuate were asked for their most
important reason for not evacuating.
Figure: Katrina survivors’ most important reasons for
not evacuating.
Look at the accompanying illustration. Do you find
this 2000 presidential election ballot confusing?
There is evidence that many Palm Beach County,
Florida, residents did find the ballot confusing.

According to the Palm Beach Post, confused
voters marked more than one choice on the
“butterfly ballot.”

Gore likely lost 6607 votes because of these
ballots.
There is also evidence that many confused Palm Beach
County voters chose Reform Party candidate
Pat Buchanan by mistake.

Informal meaning of statistic: a number that
describes a person, a group, or a set of items.
According to a recent survey, 54% of the men
surveyed responded that they believed in aliens,
and 33% of the women did.
Statistics is the art and science of collecting,
analyzing, presenting, and interpreting data.
A business major interested in diversifying her
portfolio to stocks based on their price/earnings
ratio.
A psychology major interested in whether there are
differences in therapeutic outcomes between
traditional counseling methods and a new cognitive
approach.
An education major interested in whether listening
to a Mozart sonata before taking an exam can
significantly improve a grade.


How would researchers go about studying
whether superstitions change the way people
behave? What kind of evidence would
support the hypothesis that Friday the 13th
causes a change in human behavior?

T. J. Scanlon and his co-researchers thought that if
there were fewer vehicles on the road on Friday
the 13th than on the previous Friday, this would
be evidence that some people were playing it safe
on Friday the 13th and staying off the roads.

What they deemed important is the effect of
such a superstition on human behavior and how
to measure such an effect as a change in
behavior.

Phase 1 Data collection.

Select method to collect the data.

Obtained data kept by the British
Department of Transport on the traffic flow
through certain junctions of the M25
motorway in England.

Phase 2 Data analysis.

Determine ways to analyze the data.

Compared the number of vehicles passing
through certain junctions on the M25
motorway on Friday the 13th and the
previous Friday during 1990, 1991, and 1992.
Table 1.3 Traffic through M25 junctions, 1990–1992




Phase 3 Data presentation
Presentation of the results is important.
Researchers found a highly respectable
journal, the British Medical Journal, in which
to publish their findings.
Other methods: delivering a talk at a
conference, writing up a report for one’s
supervisor, or presenting a class project.




Phase 4 Data interpretation
Results should be understandable to nonstatisticians.
In this case, researchers chose decrease in
number of vehicles as the criterion on which to
base support for their hypothesis that people
changed their behavior on Friday the 13th.
Consistent decrease in traffic on Friday the
13th supports hypothesis.
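
The comparison described in Phases 2 and 4 can be sketched in a few lines of Python. This is only an illustration of the idea of a "consistent decrease": the counts below are placeholders invented for the example and are not the Table 1.3 values, and the function names are hypothetical.

```python
# Placeholder counts for illustration only; these are NOT the Table 1.3 values.
# Each pair is (vehicles on the previous Friday, vehicles on Friday the 13th).
paired_counts = [
    (1000, 980),
    (1200, 1175),
    (1100, 1090),
]

def differences(pairs):
    """Return previous-Friday count minus Friday-the-13th count for each pair."""
    return [previous - thirteenth for previous, thirteenth in pairs]

def consistent_decrease(pairs):
    """True when every pair shows fewer vehicles on Friday the 13th."""
    return all(d > 0 for d in differences(pairs))

print(differences(paired_counts))
print(consistent_decrease(paired_counts))
```

A positive difference in every pair is the criterion the researchers chose; a mix of positive and negative differences would not support the hypothesis.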

Descriptive statistics refers to methods for summarizing
and organizing the information in a data set.

We use numbers (such as counts and
percents) and graphics to describe the data
set, as a first step in data analysis.

An element is a specific entity for which
information is collected.

A variable is a characteristic of an element,
which can assume different values for
different elements.

An observation is the set of values of the
variables for a given element.
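
One way to picture these three terms is as a small table held in a Python dictionary. The students, variable names, and values below are hypothetical and serve only to show which part of the data set each term refers to.

```python
# Hypothetical data set: each outer key is an element (a student),
# each inner key is a variable, and each inner dict is an observation.
data_set = {
    "student_1": {"major": "business", "credits": 15},
    "student_2": {"major": "psychology", "credits": 12},
}

def observation(data, element):
    """Return the observation for one element: its values on every variable."""
    return data[element]

def variable_values(data, variable):
    """Return the values a single variable takes across all elements."""
    return [row[variable] for row in data.values()]

print(observation(data_set, "student_1"))
print(variable_values(data_set, "credits"))
```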

A qualitative variable is a variable that does not have
a numeric value but is classified into categories.

Qualitative variables are also called
categorical variables, because they can be
grouped into categories.



A quantitative variable is a variable that takes
numeric values.

Quantitative variables can be classified as
either discrete or continuous.

A discrete variable can take either a finite or a
countable number of values.

Each value can be graphed as a separate point on
a number line, with space between each point.

A continuous variable can take infinitely many
values, forming an interval on the number line,
with no space between the points.




Nominal – names, labels, or categories. There is no natural or
obvious ordering of nominal data (such as high to low).
Ordinal – data that can be arranged in a particular order, but no
arithmetic can be performed on them, e.g., poor, satisfactory,
good, or best.
Interval – same as ordinal data, with the extra property that
subtraction may be carried out. There is no natural zero, e.g.,
high temperatures in the city of Pompey’s Pillar, Montana, for
the month of December.
Ratio – similar to interval data, but division may also be carried
out because a natural zero exists, e.g., salaries of college
professors.
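
The four levels above build on one another, which can be sketched as a lookup of the operations that are meaningful at each level. The operation labels ("count", "order", "subtract", "divide") and the function name are assumptions made for this illustration, not standard terminology.

```python
# Operations that are meaningful at each level of measurement.
# Each level keeps the operations of the level before it and adds one.
PERMITTED = {
    "nominal": {"count"},
    "ordinal": {"count", "order"},
    "interval": {"count", "order", "subtract"},
    "ratio": {"count", "order", "subtract", "divide"},
}

def is_meaningful(level, operation):
    """Return True if the operation makes sense for data at this level."""
    return operation in PERMITTED[level]

# Temperatures are interval data: differences make sense, ratios do not.
print(is_meaningful("interval", "subtract"))
print(is_meaningful("interval", "divide"))
# Salaries are ratio data: "twice as much" is a meaningful statement.
print(is_meaningful("ratio", "divide"))
```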

Descriptive methods of data analysis are
widespread and quite informative.

The modern field of statistics involves much
more than simply summarizing a data set.

Statistical inference: learning about the characteristics
of a population by studying those characteristics
in a subset of the population (that is, in a
sample).
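
A minimal sketch of this idea, assuming a simulated population: the ages below are generated at random purely for illustration, and the sample size of 200 is an arbitrary choice. The point is only that the sample mean can stand in for the population mean without measuring every member.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated ages between 18 and 65.
population = [rng.randint(18, 65) for _ in range(10_000)]

# Study the characteristic in a subset (a sample) of the population.
sample = rng.sample(population, 200)

population_mean = mean(population)  # usually unknown in practice
sample_mean = mean(sample)          # what we can actually compute

print(round(population_mean, 1), round(sample_mean, 1))
```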





Time
Cost
Destructive nature of sampling
Access to sample
Sampling is good enough


Many attributes or characteristics of a population are
approximately normally distributed.
For instance, when attributes such as height and weight are
considered, most people will be clustered around the mean,
leaving only a small number at the extremes who are either
very tall or very short.
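
The clustering around the mean can be checked with a small simulation. This is a sketch under stated assumptions: the heights are simulated from a normal distribution with a made-up mean of 170 and standard deviation of 10, not taken from real measurements, and `share_within` is a hypothetical helper.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simulated heights (assumed mean 170, standard deviation 10).
heights = [rng.gauss(170, 10) for _ in range(10_000)]

def share_within(values, k):
    """Share of values lying within k standard deviations of the mean."""
    m, s = mean(values), stdev(values)
    return sum(abs(v - m) <= k * s for v in values) / len(values)

near_mean = share_within(heights, 1)     # most people are here
extremes = 1 - share_within(heights, 2)  # the very tall or very short

print(round(near_mean, 2), round(extremes, 2))
```

For normally distributed data, roughly two thirds of the values fall within one standard deviation of the mean and only about one in twenty lies more than two standard deviations away.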






What is the relevant target population of focus to
the study?
What exactly are the parameters we are interested
in investigating?
What kind of a sampling frame is available?
What is the sample size needed?
What costs are attached to the sampling design?
How much time is available to collect the data from
the sample?







Simple random sample
Systematic sampling (every nth one)
Stratified random sampling (random samples from
segments)
Cluster sampling (random clusters)
Area sampling
Convenience sampling
Quotas
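
The first four designs in this list can be sketched with Python's standard `random` module. The function names, the sampling frame, and the way strata and clusters are stored (as dictionaries of lists) are assumptions made for this illustration, not a prescribed implementation.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every subset of size n is equally likely to be chosen."""
    return rng.sample(frame, n)

def systematic_sample(frame, step, rng):
    """Pick a random starting point, then take every step-th element."""
    start = rng.randrange(step)
    return frame[start::step]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate random sample from each segment (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Choose whole clusters at random and keep every member of them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(1)
frame = list(range(100))  # hypothetical sampling frame of 100 IDs
print(simple_random_sample(frame, 5, rng))
print(systematic_sample(frame, 20, rng))
```

Convenience and quota sampling are not shown because they do not rely on random selection.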





Administrative – self identifying, usually
comes from the company
Classification – demographic info
Target questions
Structured – closed-ended
Unstructured – open-ended








Surveys (paper, online, in person, mail in)
Interviews
Focus Groups
Ethnography
Observation
Case study
Content analysis
Omnibus survey **




Should this question be asked?
Is the question of proper scope and
coverage?
Can the participant adequately answer this
question as asked?
Will the participant willingly answer this
question, as asked?






Vague or ambiguous terminology
Technical terminology
Hypothetical questions – must be reasonable
for meaningful answers
Leading questions, such as “would you agree” or
“do you agree” questions (Would you agree the
government’s policies on healthcare are
unfair?)
Value judgments – do not express your views
Context effects – be aware of the impact of



GM took a step back when it tried to market
the NOVA in Central and South America. In
Spanish, “No va” means “it doesn’t go.”
Pepsi’s “Come Alive With the Pepsi
Generation,” when translated into Chinese,
means “Pepsi brings Your Ancestors From the
Grave.”
Frank Perdue’s chicken slogan, “It takes a
strong man to make a tender chicken”
translates in Spanish to, “It takes an aroused





Demographic
Behavioral
Attitudes and opinions
Knowledge
Intentions, expectations, and aspirations




Conducted when survey or sampling
methods cannot be used.
Researchers investigate how varying the
predictor variable affects the response variable.
A predictor variable (or explanatory variable) is
a characteristic purported to explain
differences in the response variable.
Treatment – a predictor variable that takes the
form of a purposeful intervention.

Three main factors:

Control

Randomization

Replication
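
Randomization, the second of these factors, can be illustrated with a short sketch of random assignment. The subject labels and the even split into two groups are assumptions made for the example; an actual study design may call for more groups or unequal sizes.

```python
import random

def randomize(subjects, rng):
    """Randomly split subjects into a treatment group and a control group."""
    shuffled = subjects[:]  # copy so the original list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

rng = random.Random(3)
subjects = [f"subject_{i}" for i in range(1, 21)]  # hypothetical subjects
groups = randomize(subjects, rng)

print(len(groups["treatment"]), len(groups["control"]))
```

The control group supplies the comparison (control), chance decides who receives the treatment (randomization), and using many subjects per group reflects replication.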

Used when an experimental study is not
possible for ethical reasons

Observes whether the subjects’ differences
in the predictor variable are associated with
differences in the response variable

No attempt to manipulate the variables

A 2006 Surgeon General’s report found that
“the evidence is sufficient to infer a causal
relationship” between secondhand tobacco
smoke exposure from parental smoking and
respiratory illnesses in infants and children.
Was this report based on an experimental
study or an observational study?

Solution

It would be unethical to force the parents in a
treatment group to smoke tobacco.

The study must therefore have been an observational
one.