Open an Excel Worksheet in SPSS

Critical values for a test of hypothesis depend upon a test statistic,
which is specific to the type of test, and the significance level, α, which
defines the sensitivity of the test. A value of α = 0.05 implies that the
null hypothesis is rejected 5% of the time when it is in fact true.
The choice of α is somewhat arbitrary, although in practice values of
0.1, 0.05, and 0.01 are common. Critical values are essentially cut-off
values that define regions where the test statistic is unlikely to lie; for
example, a region where the critical value is exceeded with probability
α if the null hypothesis is true.
The null hypothesis is rejected if the test statistic lies within this
region, which is often referred to as the rejection region.
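As a minimal sketch of how a critical value follows from α, the snippet below computes two-sided cut-offs for a standard normal test statistic. The normal test statistic and the helper name `two_sided_critical_value` are assumptions made for illustration; other tests use other reference distributions.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha: float) -> float:
    """Cut-off z such that P(|Z| > z) = alpha for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# The three significance levels mentioned in the text
for alpha in (0.1, 0.05, 0.01):
    z = two_sided_critical_value(alpha)
    print(f"alpha = {alpha}: reject the null hypothesis if |z| > {z:.3f}")
```

A smaller α pushes the cut-off further into the tails, so the rejection region shrinks.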
Why P=0.05?
Friday, 18 March 2016
It was Fisher who suggested giving 0.05 its special status. Describing
the standard normal distribution, he states:
“The value for which P=0.05, or 1 in 20, is 1.96 or nearly 2; it is
convenient to take this point as a limit in judging whether a deviation
ought to be considered significant or not. Deviations exceeding twice
the standard deviation are thus formally regarded as significant. Using
this criterion we should be led to follow up a false indication only once
in 22 trials, even if the statistics were the only guide available. Small
effects will still escape notice if the data are insufficiently numerous
to bring them out, but no lowering of the standard of significance
would meet this difficulty.”
Fisher first published Statistical Methods for Research Workers in 1925.
R.A. Fisher (1970) “Statistical Methods for Research Workers”, 14th ed., revised
and enlarged. Edinburgh: Oliver and Boyd.
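The arithmetic behind "1 in 20" and "once in 22 trials" can be checked with a short sketch. It assumes a two-sided tail of the standard normal distribution; the helper name `two_sided_tail` is hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_sided_tail(cutoff: float) -> float:
    """Probability that a standard normal deviate exceeds the cutoff in absolute value."""
    return 2 * (1 - STANDARD_NORMAL.cdf(cutoff))


p_at_1_96 = two_sided_tail(1.96)  # the exact limit Fisher quotes
p_at_2 = two_sided_tail(2.0)      # the convenient "twice the standard deviation"

print(f"P(|Z| > 1.96) = {p_at_1_96:.4f}, about 1 in {1 / p_at_1_96:.0f}")
print(f"P(|Z| > 2)    = {p_at_2:.4f}, about 1 in {1 / p_at_2:.0f}")
```

The limit 1.96 corresponds to 1 in 20, while rounding the limit up to 2 gives roughly 1 in 22, which is the figure in the quotation.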
When To Plot?
On page 447 only two scree plots are presented. It is said that a picture
is worth 1000 words. Is this true? Surely you could describe these pictures
in fewer than 100!
Plots are fine for presentations but do not always belong in a paper/report.
Petraitis, J., Lampman, C., Boeckmann, R., & Falconer, E. (2014). Sex
differences in the attractiveness of hunter-gatherer and modern risks.
Journal of Applied Social Psychology, 44(6), 442-453. DOI: 10.1111/jasp.12237
Figure 1. Eigenvalue for 101 pairs of male behaviours and 101 pairs of female
behaviours.
When To Plot?
Caption is bigger than the graph and repeats the information. Does a Stock
Chart belong in a scientific paper?
Too much white space.
Figure 2. Mean attractiveness risk by type of risk and sex of dating partner.
Scores above .00 indicate that the risk behaviours were rated as attractive
traits for dating partners; scores below .00 indicate that risky behaviours
were not preferred in dating partners, and non-risky alternatives were
preferred. Standard errors are represented by whisker bars and ranged from
.042 to .047. A, B and C represent results from pairwise comparisons: Means
with the same lettering are not significantly different at the .05 level;
means with dissimilar lettering are significantly different.
In any publication, present data only once: as a graph, a table, or raw data.
Obviously you must then discuss it.
Stock Chart - Excel
To display High, Low and Closing values plus Volume Traded
Specimen Data
          Company A   Company B
Volume    2000        1500
High      3.5         2.8
Low       1.4         0.8
Close     2.9         1.4
[Chart: stock chart with axes labelled "Volume" and "Stock prices"]
Example - Excel
A study of 22 patients suffering from Parkinson's disease was
conducted. An operation was performed on 8 of them; while it improved
their general condition, it might adversely affect their speech. In the
data a higher value indicates a greater difficulty in speaking.
Operated: 2.6, 2, 1.7, 2.7, 2.5, 2.6, 2.5, 3
Others: 1.2, 1.8, 1.9, 2.3, 1.3, 3, 2.2, 1.3, 1.5, 1.6, 1.3, 1.5, 2.7, 2
What do you think?
What does the width imply?
Example - Excel
What do you
think?
Does the false
origin mislead?
Example - Excel
What do you think?
Much better, a
confidence interval
plot.
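The numbers behind such a confidence interval plot can be sketched as follows. This assumes a 95% t-based interval for each group mean, with the critical values (2.365 for 7 degrees of freedom, 2.160 for 13) taken from standard t tables rather than computed; the helper name `mean_ci` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

operated = [2.6, 2, 1.7, 2.7, 2.5, 2.6, 2.5, 3]
others = [1.2, 1.8, 1.9, 2.3, 1.3, 3, 2.2, 1.3, 1.5, 1.6, 1.3, 1.5, 2.7, 2]

# Two-sided 95% t critical values keyed by degrees of freedom (from tables, assumed)
T_CRIT_95 = {7: 2.365, 13: 2.160}


def mean_ci(sample):
    """Return (lower, upper) limits of a 95% confidence interval for the mean."""
    n = len(sample)
    half_width = T_CRIT_95[n - 1] * stdev(sample) / sqrt(n)
    centre = mean(sample)
    return centre - half_width, centre + half_width


ci_operated = mean_ci(operated)
ci_others = mean_ci(others)
print(f"Operated: {ci_operated[0]:.2f} to {ci_operated[1]:.2f}")
print(f"Others:   {ci_others[0]:.2f} to {ci_others[1]:.2f}")
```

Each interval would be drawn as a vertical bar centred on the group mean.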
Example - Minitab
Never underestimate a humble box plot, particularly for simultaneously comparing a
number of variables.
[Figure: box plots of m1_c, m1_n, m2_c, m2_n, m3_c, m3_n, m4_c and m4_n; vertical axis from 1 to 8]
Example - Minitab
John Tukey designed the box plot as an easy-to-draw data visualization as part of
exploratory data analysis. The box plot has persisted into the computer age as an
information-rich graphic that conveys key features of a numeric dataset at a glance.
Unlike the bar chart, it uses statistical summaries (median and interquartile range) that
are robust in the presence of skewness and outliers and require no assumptions about the
population. It shows the full range of the sample data, provides information about the
tails, and indicates the shape of the data.
The Box Plots Alternative for Visualizing Quantitative Data
Regina Nuzzo
American Academy of Physical Medicine and Rehabilitation 2016 DOI:
10.1016/j.pmrj.2016.02.001
Tukey JW. Exploratory Data Analysis. Boston, MA: Addison-Wesley; 1977.
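A box plot is built from a five-number summary, which can be sketched for the Parkinson's disease data used in these examples. The helper name `five_number_summary` is hypothetical, and the quartiles use the default method of Python's `statistics.quantiles`; statistical packages differ slightly in how they define quartiles, so a plot from Minitab or SPSS may not match exactly.

```python
from statistics import median, quantiles

operated = [2.6, 2, 1.7, 2.7, 2.5, 2.6, 2.5, 3]
others = [1.2, 1.8, 1.9, 2.3, 1.3, 3, 2.2, 1.3, 1.5, 1.6, 1.3, 1.5, 2.7, 2]


def five_number_summary(sample):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, _, q3 = quantiles(sample, n=4)  # three cut points; the middle one is the median
    return min(sample), q1, median(sample), q3, max(sample)


for name, sample in (("Operated", operated), ("Others", others)):
    low, q1, med, q3, high = five_number_summary(sample)
    print(f"{name}: min {low}, Q1 {q1:.2f}, median {med:.2f}, Q3 {q3:.2f}, max {high}")
```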
Example - Minitab
Two-sample T for Operated vs Others

            N    Mean   StDev  SE Mean
Operated    8   2.450   0.411     0.15
Others     14   1.829   0.557     0.15

Difference = mu (Operated) - mu (Others)
Estimate for difference: 0.621429
95% CI for difference: (0.149634, 1.093224)
T-Test of difference = 0 (vs not =): T-Value = 2.75  P-Value = 0.012  DF = 20
Both use Pooled StDev = 0.5103

p < 0.05
Confidence interval excludes zero
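The pooled two-sample calculation in this output can be reproduced with a minimal sketch using only Python's standard library. The critical value 2.086 (two-sided 95%, 20 degrees of freedom) is taken from standard t tables rather than computed, and the variable names are illustrative; the P-value is not recomputed here because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance

operated = [2.6, 2, 1.7, 2.7, 2.5, 2.6, 2.5, 3]
others = [1.2, 1.8, 1.9, 2.3, 1.3, 3, 2.2, 1.3, 1.5, 1.6, 1.3, 1.5, 2.7, 2]

n1, n2 = len(operated), len(others)
df = n1 + n2 - 2  # degrees of freedom for the pooled test

# Pooled variance: sample variances weighted by their degrees of freedom
pooled_var = ((n1 - 1) * variance(operated) + (n2 - 1) * variance(others)) / df
pooled_sd = sqrt(pooled_var)

diff = mean(operated) - mean(others)
se_diff = pooled_sd * sqrt(1 / n1 + 1 / n2)
t_value = diff / se_diff

T_CRIT_95_DF20 = 2.086  # from t tables (assumed, not computed)
ci = (diff - T_CRIT_95_DF20 * se_diff, diff + T_CRIT_95_DF20 * se_diff)

print(f"Pooled StDev = {pooled_sd:.4f}, T-Value = {t_value:.2f}, DF = {df}")
print(f"95% CI for difference: ({ci[0]:.3f}, {ci[1]:.3f})")
```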
Example - Minitab
Boxplot plus means
Example - SPSS
Independent Samples Test (variable: Data)

Levene's Test for Equality of Variances (equal variances assumed): F = 1.299, Sig. = .268

t-test for Equality of Means
                                                Sig.        Mean        Std. Error   95% Confidence Interval of the Difference
                              t        df       (2-tailed)  Difference  Difference   Lower       Upper
Equal variances assumed       -2.748   20       .012        -.62143     .22618       -1.09322    -.14963
Equal variances not assumed   -2.990   18.461   .008        -.62143     .20786       -1.05735    -.18551

p < 0.05
Confidence interval excludes zero
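The "Equal variances not assumed" row can be reproduced with a short sketch of Welch's t statistic and the Welch-Satterthwaite degrees of freedom. The helper name `welch_t` is hypothetical, and the negative sign in the SPSS output suggests the difference was taken as Others minus Operated, which is assumed here.

```python
from math import sqrt
from statistics import mean, variance

operated = [2.6, 2, 1.7, 2.7, 2.5, 2.6, 2.5, 3]
others = [1.2, 1.8, 1.9, 2.3, 1.3, 3, 2.2, 1.3, 1.5, 1.6, 1.3, 1.5, 2.7, 2]


def welch_t(a, b):
    """Return (t, df, standard error) for mean(a) - mean(b) without pooling variances."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    se = sqrt(va + vb)
    t = (mean(a) - mean(b)) / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, se


t_welch, df_welch, se_welch = welch_t(others, operated)
print(f"t = {t_welch:.3f}, df = {df_welch:.3f}, Std. Error Difference = {se_welch:.5f}")
```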
Example - SPSS
Confidence interval plot
Example - SPSS
Stock chart
Not keen!!
Advice
Where can you seek further advice?
“The goal of this chapter is to offer guidance to natural resource
professionals and other scientists on how to design tables and figures
for use in scientific communication.”
Chapter 7 - Design of Tables and Figures for Display of Scientific Data
Roger D. Moon
Scientific Communication for Natural Resource Professionals
Cecil A. Jennings, Thomas E. Lauer, and Bruce Vondracek, editors
Published by the American Fisheries Society
Publication date: August 2012
ISBN: 978-1-934874-28-8
Advice
Technical authors can convey their information in prose, tables, or
figures. Leaving aside visual information, such as photographs and maps,
Wainer (1997:129) quoted Tufte’s rule of thumb: “(if) three numbers
or fewer, use a sentence; four to twenty numbers, use a table; more
than twenty, use a graph.” A further consideration is an author’s
purpose. If the intent is simply to record a short set of numbers and
three or fewer can be arranged in a readable sequence, then prose is
likely to require the least space.
Wainer, H. 1997. Visual revelations. Graphical tales of fate and
deception from Napoleon Bonaparte to Ross Perot. Springer-Verlag,
New York.
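The rule of thumb quoted above is easy to express as a small decision function. This is only an illustrative sketch; the function name `suggested_display` is hypothetical and the thresholds are exactly those in the quotation.

```python
def suggested_display(n_numbers: int) -> str:
    """Map a count of numbers to the display suggested by the quoted rule of thumb."""
    if n_numbers <= 3:
        return "sentence"
    if n_numbers <= 20:
        return "table"
    return "graph"


# The 22 speech scores in the Parkinson's disease example would call for a graph
print(suggested_display(22))
```

As the text notes, the author's purpose is a further consideration, so the count alone should not settle the choice.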
Advice
Ehrenberg (1977) offered rules for construction of numerical tables,
and many apply to nonnumeric tables as well. His golden rule is
“patterns and exceptions should be obvious at a glance; at least once
one knows what they are.” Otherwise, a taxed reader is likely to give up
before the intended points are found.
Ehrenberg, A.S.C. 1977 “Rudiments of Numeracy” Journal of the Royal
Statistical Society 140(3) 277–297.
Advice
A graphical counterpart to Ehrenberg’s golden rule for tables is that
authors should design their graphs to allow readers to see values,
patterns, and exceptions in the graphed data, and they should be able
to do so with greatest accuracy and ease. In a similar vein, Tufte
(1983) urged graphic designers to strive for excellence by presenting
interesting data with “clarity, precision, and efficiency.”
Tufte, E. R. 1983. The visual display of quantitative information.
Graphics Press, Cheshire, Connecticut.
Advice
In this age of high graphics and electronic “solutions”, the humble
numeric table faces strong and often ill-informed opposition. Many
believe that tables are intrinsically less interesting than the
alternatives. Nonsense! Those with an interest in (or a compulsion to
learn about) a subject are interested in relevant numbers. You don’t
need to trick people into looking at them. As Tufte says: “If the
statistics are boring, then you’ve got the wrong numbers…” A more
justifiable criticism is that tables are often poorly designed and
difficult to read and interpret. This is true but the charge equally
applies to the alternatives.
When to use numeric tables and why: Guidelines for the brave
By Sally Bigwood and Melissa Spore
See “Presenting Numbers, Tables and Charts” published by Oxford
University Press in 2003
Advice
BS 7581:1992 - Guide to presentation of tables and graphs
Kitemark!
Advice on Tables (data), Charts, Data representation, Graphic representation,
Design, Colour, Documents for a mere £158.
Better order from our library! Not currently in stock!
Advice
But how can brevity and clarity be achieved, especially with complex
technical subject-matter? Much advice about writing is good but very
specific, like "Use active and not passive verbs." Such advice does not
have much impact. There are, however, five rules (below) that can have a
wider, more pervasive effect. Whenever I do not manage to apply them fully,
I know I could have done better.
Ehrenberg, A.S.C. 1982 “Writing Technical Papers Or Reports”
American Statistician 36(4) 326-329
DOI: 10.2307/2683079
Advice - The Five Rules
1. Start at the End. We usually write papers or reports in a historical
way, finishing with our results and conclusions. But readers usually
want to know our findings before learning how they were obtained.
Technical reports and learned articles are not detective stories. We
therefore should start at the end, giving our main results and
conclusions first.
2. Be Prepared to Revise. Few people can write clearly without revision.
3. Cut Down on Long Words. Technical writing is often dense and heavy.
It can be made more readable by using shorter sentences and fewer
long words.
4. Be Brief. Brevity is best achieved by leaving things out. This works at
all levels: sections, paragraphs, sentences, and words.
5. Think of the Reader. We must consider what our readers will do with
our report or paper. What will they want to communicate to others?
Advice
“Lack of numeracy is due mainly to the way data are presented. Most tables of
data can be improved by following a few simple rules, such as drastic rounding,
ordering the rows of a table by size, and giving a brief verbal summary of the
data.”
Ehrenberg, A.S.C. 1981 “The Problem Of Numeracy” American Statistician
35(2) 67-71
DOI: 10.2307/2683143
Advice
“These few simple rules or guidelines can work wonders in communicating a
table of numbers.
1. Giving marginal averages to provide a visual focus;
2. Ordering the rows or columns of the table by the marginal averages or
some other measure of size (keeping to the same order if there are
many similar tables);
3. Putting figures to be compared into columns rather than rows (with
larger numbers on top if possible);
4. Rounding to two effective digits;
5. Using layout to guide the eye and facilitate comparisons;
6. Giving brief verbal summaries to lead the reader to the main patterns
and exceptions.”
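Three of these guidelines (marginal averages, ordering by size, and rounding to two effective digits) can be sketched in a few lines. The table values, the row names and the helper names `round_sig` and `row_average` are all made up for illustration and do not come from the slides.

```python
from math import floor, log10


def round_sig(x: float, digits: int = 2) -> float:
    """Round x to the given number of significant (effective) digits."""
    if x == 0:
        return 0.0
    exponent = floor(log10(abs(x)))  # position of the leading digit
    return round(x, digits - 1 - exponent)


def row_average(values):
    return sum(values) / len(values)


# Hypothetical table: one row per region, one value per quarter
table = {
    "South": [48.3, 42.1, 61.7],
    "North": [97.63, 92.2, 100.31],
    "East": [75.9, 70.4, 79.8],
}

# Order rows by their marginal average, largest first
ordered = sorted(table.items(), key=lambda item: row_average(item[1]), reverse=True)

for name, values in ordered:
    cells = "  ".join(f"{round_sig(v):>5g}" for v in values)
    print(f"{name:<6}{cells}   average {round_sig(row_average(values)):g}")
```

Printing the rows in this order puts the figures to be compared in columns, with the larger numbers on top.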
Advice
“Many people say that graphs make data easier to see. But this is not always so.
Many graphs are not easy to follow. Graphs usually fail if they do not have a
simple story-line to tell. This restricts them to communicating qualitative
aspects of the data, i.e. shapes or orders of magnitude. For this they can excel.
In contrast, detailed quantitative features are usually difficult to read off
from a graph.”
Ehrenberg, A.S.C. 1978 “Graphs Or Tables” Statistician 27(2) 87-96
DOI: 10.2307/2987904
Advice
For the more adventurous.
“P(p)roposes a unified approach to exploratory and confirmatory data analysis,
based on considering graphical data displays as comparisons to a reference
distribution. The comparison can be explicit, as when data are compared to sets
of fake data simulated from the model, or implicit, as when patterns in a two-way
plot are compared to an assumed model of independence. Confirmatory
analysis has the same structure, but the comparisons are numerical rather than
visual.”
Andrew Gelman 2004 “Exploratory Data Analysis for Complex Models” Journal
of Computational and Graphical Statistics 13(4) 755–779
DOI: 10.1198/106186004X11435
Does It Always Matter?
The first rule of performing a project:
1. The supervisor is always right.
The second rule of performing a project:
2. If the supervisor is wrong, rule 1 applies.