Why P=0.05?

Friday, 18 March 2016 2:42 PM

Critical values for a test of hypothesis depend upon a test statistic, which is specific to the type of test, and the significance level, α, which defines the sensitivity of the test. A value of α = 0.05 implies that the null hypothesis is rejected 5% of the time when it is in fact true. The choice of α is somewhat arbitrary, although in practice values of 0.1, 0.05, and 0.01 are common. Critical values are essentially cut-off values that define regions where the test statistic is unlikely to lie; for example, a region where the critical value is exceeded with probability α if the null hypothesis is true. The null hypothesis is rejected if the test statistic lies within this region, which is often referred to as the rejection region(s).

It was Fisher who suggested giving 0.05 its special status. Describing the standard normal distribution, he states: “The value for which P=0.05, or 1 in 20, is 1.96 or nearly 2; it is convenient to take this point as a limit in judging whether a deviation ought to be considered significant or not. Deviations exceeding twice the standard deviation are thus formally regarded as significant. Using this criterion we should be led to follow up a false indication only once in 22 trials, even if the statistics were the only guide available. Small effects will still escape notice if the data are insufficiently numerous to bring them out, but no lowering of the standard of significance would meet this difficulty.”

Fisher first published Statistical Methods for Research Workers in 1925.
R.A. Fisher 1970 “Statistical Methods for Research Workers” 14th ed., revised and enlarged, Edinburgh, Oliver and Boyd

When To Plot?
On page 447 only two scree plots are presented. It is said that a picture is worth 1000 words. Is this true? Surely you could describe these pictures in fewer than 100! Plots are fine for presentations but do not always belong in a paper/report.
Petraitis, J., Lampman, C., Boeckmann, R., & Falconer, E. (2014). Sex differences in the attractiveness of hunter-gatherer and modern risks. Journal of Applied Social Psychology, 44(6), 442-453. DOI: 10.1111/jasp.12237
Figure 1. Eigenvalue for 101 pairs of male behaviours and 101 pairs of female behaviours.

When To Plot?
The caption is bigger than the graph and repeats the information. Does a stock chart belong in a scientific paper? Too much white space.
Figure 2. Mean attractiveness risk by type of risk and sex of dating partner. Scores above .00 indicate that the risk behaviours were rated as attractive traits for dating partners; scores below .00 indicate that risky behaviours were not preferred in dating partners, and non-risky alternatives were preferred. Standard errors are represented by whisker bars and ranged from .042 to .047. A, B and C represent results from pairwise comparisons: Means with the same lettering are not significantly different at the .05 level; means with dissimilar lettering are significantly different.
In any publication present data only once: as a graph, a table, or raw data. Obviously you must then discuss it.

Stock Chart - Excel
To display High, Low and Closing values plus Volume Traded.
Specimen data:
          Company A   Company B
Volume    2000        1500
High      3.5         2.8
Low       1.4         0.8
Close     2.9         1.4
[Chart axes: Volume, Stock prices]

Example - Excel
A study of 22 patients suffering from Parkinson's disease was conducted. An operation was performed on 8 of them; while it improved their general condition, it might adversely affect their speech. In the data a higher value indicates a greater difficulty in speaking.
Operated: 2.6 2 1.7 2.7 2.5 2.6 2.5 3
Others:   1.2 1.8 1.9 2.3 1.3 3 2.2 1.3 1.5 1.6 1.3 1.5 2.7 2
What do you think? What does the width imply?
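Before judging any chart of these data, it can help to run the comparison numerically. The following is a minimal sketch in Python, using only the standard library, of a pooled (equal-variance) two-sample t-test on the Operated and Others values above. The function name pooled_t_test is hypothetical, the equal-variance assumption is a choice made for this sketch, and the critical value 2.086 is the usual two-sided 5% table value for 20 degrees of freedom rather than something computed here.

```python
import math
import statistics

# Speech-difficulty scores from the slide (higher = greater difficulty).
operated = [2.6, 2.0, 1.7, 2.7, 2.5, 2.6, 2.5, 3.0]
others = [1.2, 1.8, 1.9, 2.3, 1.3, 3.0, 2.2, 1.3, 1.5, 1.6, 1.3, 1.5, 2.7, 2.0]

# Assumption: two-sided critical value for alpha = 0.05 and 20 degrees of
# freedom, taken from standard t tables (not computed here).
T_CRIT_DF20 = 2.086


def pooled_t_test(a, b, t_crit):
    """Pooled two-sample t-test; t_crit must match len(a) + len(b) - 2 df."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = statistics.mean(a) - statistics.mean(b)
    t = diff / se
    half_width = t_crit * se
    return {
        "diff": diff,
        "pooled_sd": math.sqrt(pooled_var),
        "t": t,
        "df": df,
        "ci": (diff - half_width, diff + half_width),
        # Reject the null hypothesis if t falls in the rejection region.
        "reject": abs(t) > t_crit,
    }


result = pooled_t_test(operated, others, T_CRIT_DF20)
print(f"difference = {result['diff']:.3f}, t = {result['t']:.2f}, df = {result['df']}")
print(f"95% CI = ({result['ci'][0]:.3f}, {result['ci'][1]:.3f}), reject = {result['reject']}")
```

Under these assumptions the statistic is about 2.75 on 20 degrees of freedom, which exceeds 2.086, so the null hypothesis of equal means is rejected at the 5% level and the 95% confidence interval for the difference excludes zero.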

Example - Excel
What do you think? Does the false origin mislead?

Example - Excel
What do you think? Much better, a confidence interval plot.

Example - Excel
What do you think? Does the false origin mislead?

Example - Minitab
Never underestimate a humble box plot, particularly for simultaneously comparing a number of variables.
[Box plots of m1_c, m1_n, m2_c, m2_n, m3_c, m3_n, m4_c, m4_n; vertical axis from 1 to 8]

Example - Minitab
John Tukey designed the box plot as an easy-to-draw data visualization as part of exploratory data analysis. The box plot has persisted into the computer age as an information-rich graphic that conveys key features of a numeric dataset at a glance. Unlike the bar chart, it uses statistical summaries (median and interquartile range) that are robust in the presence of skewness and outliers and require no assumptions about the population.
It shows the full range of the sample data, provides information about the tails, and indicates the shape of the data.
Regina Nuzzo 2016 “The Box Plots Alternative for Visualizing Quantitative Data” American Academy of Physical Medicine and Rehabilitation DOI: 10.1016/j.pmrj.2016.02.001
Tukey JW. Exploratory Data Analysis. Boston, MA: Addison-Wesley; 1977.

Example - Minitab
Two-sample T for Operated vs Others
            N    Mean    StDev   SE Mean
Operated    8    2.450   0.411   0.15
Others      14   1.829   0.557   0.15
Difference = mu (Operated) - mu (Others)
Estimate for difference: 0.621429
95% CI for difference: (0.149634, 1.093224)
T-Test of difference = 0 (vs not =): T-Value = 2.75  P-Value = 0.012  DF = 20
Both use Pooled StDev = 0.5103
p < 0.05. Confidence interval excludes zero.

Example - Minitab
Boxplot plus means

Example - SPSS
A study of 22 patients suffering from Parkinson's disease was conducted.
An operation was performed on 8 of them; while it improved their general condition, it might adversely affect their speech. In the data a higher value indicates a greater difficulty in speaking.
Operated: 2.6 2 1.7 2.7 2.5 2.6 2.5 3
Others:   1.2 1.8 1.9 2.3 1.3 3 2.2 1.3 1.5 1.6 1.3 1.5 2.7 2

Independent Samples Test (variable: Data)
Levene's Test for Equality of Variances: F = 1.299, Sig. = .268
t-test for Equality of Means (Lower and Upper are the 95% Confidence Interval of the Difference):
                              t        df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower      Upper
Equal variances assumed       -2.748   20       .012              -.62143           .22618                  -1.09322   -.14963
Equal variances not assumed   -2.990   18.461   .008              -.62143           .20786                  -1.05735   -.18551
p < 0.05. Confidence interval excludes zero.

Example - SPSS
Confidence interval plot

Example - SPSS
Stock chart. Not keen!!

Advice
Where can you seek further advice?
“The goal of this chapter is to offer guidance to natural resource professionals and other scientists on how to design tables and figures for use in scientific communication.”
Roger D. Moon, Chapter 7 - Design of Tables and Figures for Display of Scientific Data, in Scientific Communication for Natural Resource Professionals, Cecil A. Jennings, Thomas E.
Lauer, and Bruce Vondracek, editors. Published by the American Fisheries Society. Publication date: August 2012. ISBN: 978-1-934874-28-8

Advice
Technical authors can convey their information in prose, tables, or figures. Leaving aside visual information, such as photographs and maps, Wainer (1997:129) quoted Tufte’s rule of thumb: “(if) three numbers or fewer, use a sentence; four to twenty numbers, use a table; more than twenty, use a graph.” A further consideration is an author’s purpose. If the intent is simply to record a short set of numbers and three or fewer can be arranged in a readable sequence, then prose is likely to require the least space.
Wainer, H. 1997. Visual revelations. Graphical tales of fate and deception from Napoleon Bonaparte to Ross Perot. Springer-Verlag, New York.

Advice
Ehrenberg (1977) offered rules for construction of numerical tables, and many apply to nonnumeric tables as well. His golden rule is “patterns and exceptions should be obvious at a glance; at least once one knows what they are.” Otherwise, a taxed reader is likely to give up before the intended points are found.
Ehrenberg, A.S.C. 1977 “Rudiments of Numeracy” Journal of the Royal Statistical Society 140(3) 277–297.

Advice
A graphical counterpart to Ehrenberg’s golden rule for tables is that authors should design their graphs to allow readers to see values, patterns, and exceptions in the graphed data, and they should be able to do so with greatest accuracy and ease. In a similar vein, Tufte (1983) urged graphic designers to strive for excellence by presenting interesting data with “clarity, precision, and efficiency.”
Tufte, E. R. 1983. The visual display of quantitative information. Graphics Press, Cheshire, Connecticut.

Advice
In this age of high graphics and electronic “solutions”, the humble numeric table faces strong and often ill-informed opposition. Many believe that tables are intrinsically less interesting than the alternatives. Nonsense!
Those with an interest in (or a compulsion to learn about) a subject are interested in relevant numbers. You don’t need to trick people into looking at them. As Tufte says: “If the statistics are boring, then you’ve got the wrong numbers…” A more justifiable criticism is that tables are often poorly designed and difficult to read and interpret. This is true but the charge equally applies to the alternatives.
“When to use numeric tables and why: Guidelines for the brave” by Sally Bigwood and Melissa Spore. See “Presenting Numbers, Tables and Charts” published by Oxford University Press in 2003.

Advice
BS 7581:1992 - Guide to presentation of tables and graphs. Kitemark!
Advice on Tables (data), Charts, Data representation, Graphic representation, Design, Colour, Documents for a mere £158. Better order from our library! Not currently in stock!

Advice
But how can brevity and clarity be achieved, especially with complex technical subject-matter? Much advice about writing is good but very specific, like "Use active and not passive verbs." Such advice does not have much impact. There are, however, five rules (below) that can have a wider, more pervasive effect. Whenever I do not manage to apply them fully, I know I could have done better.
Ehrenberg, A.S.C. 1982 “Writing Technical Papers Or Reports” American Statistician 36(4) 326-329 DOI: 10.2307/2683079

Advice - The Five Rules
1. Start at the End. We usually write papers or reports in a historical way, finishing with our results and conclusions. But readers usually want to know our findings before learning how they were obtained. Technical reports and learned articles are not detective stories. We therefore should start at the end, giving our main results and conclusions first.
2. Be Prepared to Revise. Few people can write clearly without revision.
3. Cut Down on Long Words. Technical writing is often dense and heavy. It can be made more readable by using shorter sentences and fewer long words.
4. Be Brief.
Brevity is best achieved by leaving things out. This works at all levels: sections, paragraphs, sentences, and words.
5. Think of the Reader. We must consider what our readers will do with our report or paper. What will they want to communicate to others?

Advice
“Lack of numeracy is due mainly to the way data are presented. Most tables of data can be improved by following a few simple rules, such as drastic rounding, ordering the rows of a table by size, and giving a brief verbal summary of the data.”
Ehrenberg, A.S.C. 1981 “The Problem Of Numeracy” American Statistician 35(2) 67-71 DOI: 10.2307/2683143

Advice
“These few simple rules or guidelines can work wonders in communicating a table of numbers.
1. Giving marginal averages to provide a visual focus;
2. Ordering the rows or columns of the table by the marginal averages or some other measure of size (keeping to the same order if there are many similar tables);
3. Putting figures to be compared into columns rather than rows (with larger numbers on top if possible);
4. Rounding to two effective digits;
5. Using layout to guide the eye and facilitate comparisons;
6. Giving brief verbal summaries to lead the reader to the main patterns and exceptions.”

Advice
“Many people say that graphs make data easier to see. But this is not always so. Many graphs are not easy to follow. Graphs usually fail if they do not have a simple story-line to tell. This restricts them to communicating qualitative aspects of the data, i.e. shapes or orders of magnitude. For this they can excel. In contrast, detailed quantitative features are usually difficult to read off from a graph.”
Ehrenberg, A.S.C. 1978 “Graphs Or Tables” Statistician 27(2) 87-96 DOI: 10.2307/2987904

Advice
For the more adventurous.
“Proposes a unified approach to exploratory and confirmatory data analysis, based on considering graphical data displays as comparisons to a reference distribution.
The comparison can be explicit, as when data are compared to sets of fake data simulated from the model, or implicit, as when patterns in a two-way plot are compared to an assumed model of independence. Confirmatory analysis has the same structure, but the comparisons are numerical rather than visual.”
Andrew Gelman 2004 “Exploratory Data Analysis for Complex Models” Journal of Computational and Graphical Statistics 13(4) 755–779 DOI: 10.1198/106186004X11435

Does It Always Matter?
The first rule of performing a project:
1. The supervisor is always right.
The second rule of performing a project:
2. If the supervisor is wrong, rule 1 applies.