Statistics, not Sadistics! A Practical Guide to Statistics for Non-Statisticians By James Riley Estep, Jr., Ph.D. Professor of Christian Education Lincoln Christian Seminary (Lincoln, Illinois) Visiting Professor of Statistics Southern Baptist Theological Seminary (Louisville, Kentucky) Page |1 Table of Contents 2013 Edition Section 1: Foundational Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Section 2: Overview of the Statistical Landscape . . . . . . . . . . . . . . . . . . . . . . 6 Section 3: Descriptive Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 Section 4: Inferential Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 Section 5: Correlation Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Section 6: What to Look Out For . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 Page |2 Section 1 Foundational Concepts Where do the numbers come from? Education is at least in part based on theories derived from the findings of the social sciences, e.g. learning, developmental, or administrative theories. Social science research takes two different forms: Qualitative and Quantitative.1 Qualitative research generates data in the form of words, e.g. interview transcripts, written descriptions of observations, or historical documents; and is hence does not rely on statistical analysis. Quantitative research generates numerical data, e.g. satisfaction rated on a scale of 1-5, rankings, scores on standardized tests, or preparation times. The analysis of the numerical data generated form quantitative research methods is called statistics. Statistics is not just math. It is a specialized form of math. It is mathematics applied to social science research, i.e. quantitative research. Statistics are also used in the field of assessment, which is becoming increasingly important to educational institutions and student standardized testing. This guide to statistics is not intended to be an introduction or survey of quantitative research, nor is it intended to replace an introduction to statistics textbook. Rather, this paper makes several assumptions: You are in or have had a statistics class (such as the one may be in right now). You have access to a basic textbook on statistics that contains more in depth material, especially the formulas. This paper will not show the formulas since (a) most of you will use a computer and hence not even see the formula, and (b) some of the formulas has very complex in detail, but show nothing about their use in research. You will do most of your statistical computations on a computer, using Excel, SSPS You make use of user-friendly websites like www.surveymonkey.com, www.questionpro.com or the “dissertation helps” on A third form of social science research may be considered the “mixed-method,” but this form is in fact the intentional combination of qualitative and quantitative, and not really a new form of research. However, mixed method is becoming the preferred form of social science research. 1 Page |3 www.leadership.sbts.edu; and so you don’t need to generate the numbers yourself. You will probably employ a statistician to do most of your “number crunching” for a thesis or dissertation. In any of these cases, you will still need to comprehend statistics, so as to design the appropriate research and request the appropriate computations. In short, you have to be smart enough to ask the statistician for the right stat! This paper is a crash course in statistics for non-statisticians. It focuses more on understanding statistics than actually doing the math. Population, Sample, Soup and Statistics “How was the soup?” Technically, you cannot tell the waitress how the soup was, since you did not actually consume all the soup in the pot (or at least I hope not). Rather, you base your response on the bowl that was drawn from pot, only a sample of the soup that your waitress provided. The population is the pot, and the sample is what you draw from the pot. Sample size is important, so you need to stir the pot and use a ladle to gather a suitable sample from which to draw your conclusion, since using just a small spoon may not give you an accurate taste of the whole pot. Also, stirring the pot before pulling out the sample is important, so as to guarantee that the appropriate mixture of all the soup’s ingredients is represented. This is all part of the research design. All research design starts with a question, such as “What does the soup taste like?” It is an unknown. We design research to gather data about the soup so as to answer the question. It is concerned with the procedures that involve gathering accurate data from a population being studied. The numbers used in statistics are generated by doing quantitative research on a population or sample. Hence, the more precisely a population is described and the more accurately a sample is pulled from the population, the more accurate the numbers will be describing the population/sample. For example: QUESTION: How many hours does the average SBTS student study for an exam in Greek 1 class? Page |4 POPULATION All students at SBTS in Greek 1 SAMPLE 20 students pulled from population ???? DATA SURVEY generating numerical responses Report Findings Statistical Analysis Figure 1.1 Since a researcher may not be able to contact every student in Greek 1 (population), he/she decides to randomly select a representative group of Greek 1 students (sample) which is a more manageable project. The survey is then used on the sample to generate numerical data for statistical analysis. A sample/population calculator is available on www.leadership.sbts.edu Spoiler Alert: If the entire population can be reached, then one need only use “descriptive statistics”. However, if a sample is used, inferential and/or correlative statistics must be used, which are far more complicated. on “dissertation helps” and the “Articles and Resources” page. This resource will help you avoid making a basic mistake in research design, and flaw the data upon which the statistics are calculated. The findings of the survey are then reported based on statistical analysis. A Tale of Three “Datas” Where do the numbers come from? Obviously quantitative research methods, as described above. These numbers represent the reality being studied. But, not all numbers are alike. It is critical to keep in mind the three kinds of data as they relate to the three kinds of data: Categorical, Ordinal, Interval (also known as ratio). Categorical/Nominal: Data separated into mutually exclusive categories; A scale that “measures” in terms of names or designations of discrete units or categories. o For example: age groups, gender, academic major o Enables statistician to determine the mode, the percentage values, or a chisquared Page |5 Ordinal: Logically ordered categorical data; a scale that “measures” in terms of such values as “more” or “less,” “larger” or “smaller,” but without specifying the size of the interviews o For example, a Likert scale surveys with Strongly disagree; Disagree; Neutral; Agree; Strongly agree. o Enables statisticians to determine the median, percentile rank, and rank correlation. Interval/Ratio: The distinctions between interval and ratio data are subtle, but fortunately, this distinction is often not important. Certain statistical methods, i.e. geometric mean and a coefficient of variation can only be applied to ratio data. o Interval: A scale that measures in terms of equal intervals or degrees of difference, but whose zero-point, or point of beginning, is arbitrarily established (not absolute). For example: Intelligence Quotient (IQ), since there is not real zero. Enables statisticians to determine the mean, standard deviation, and product movement correlation; also allows for most inferential statistical analyses o Ratio: A scale that measure in terms of equal intervals and an absolute zero-point of origin. E.g., birth weight in kilograms or a measurement of distance or height Enables one also to determine the geometric mean and the percentage variation; allows one to conduct virtually any inferential statistical analysis. These three kinds of data require different statistical treatment, e.g. certain formulas only work with ordinal data rather than categorical. Mismatch the data with the statistical formula . . . its like asking how many sides a circle has. The type of data can also influence how data is presented and what kind of chart one uses to report findings. Page |6 Section 2 Overview of the Statistical Landscape Not all Statistics are Alike! Anyone who has traveled across country can readily tell the difference between the United State’s southwest vs. the Midwest vs. the deep South vs. New England. While certain similarities exist in all three regions, distinctive geographic and geological features mark the regions rather distinctively. The American landscape is in fact s mosaic, a set of landscapes, each with its own features, ecologies, and resources. Statistics is a plural term. The statistical landscape consists of three distinctive regions: Descriptive, Inferential, and Correlation. Table 2.1 provides a ready reference to the basic idea of each of these three statistics (Sections 3-5 will address each in detail). Statistics Type of Question: What is X? Statistics: Descriptive Metaphor: Snapshot Formulas/Tests: Mean, Median, Mode, Range, Quartile, Whisker-Box Plot, Zscore, and Standard Deviation Applications: “I want to know what my congregation’s membership is like?,” e.g. age, height, weight, gender mix, ethnicity, income levels, years attending this congregation. Table 2.1 Type of Question: What are the odds of X happening? Statistics: Inferential Metaphor: Gambling Formulas/Tests: Type of Question: Are and How Related are X and Y? Hypothesis testing, p-value, Z-test, t-Test, Chi-squared, ANOVA (1-way and 2-way) Applications: “Given the average age of a congregation member, what are the odds of finding a 72 year old?” OR “How sure am I that the opinion of a sample of my congregation about the new program represents the whole congregations?” Pearson r, Coefficient of Determination, Scattergram; Spearman rho, Kendall’s tau Applications: “Is there a significant correlation between the education level of my congregation members and their participation in Sunday school?” Statistics: Correlation Metaphor: Thermometer Formulas/Tests: Page |7 Descriptive Statistics Descriptive statistics . . . describe. It is the most basic form of statistical analysis of a body of numbers. “How many people attend your church?” How do you respond? Do you read off all the Sunday morning attendances for the previous year? Probably not. Rather, summarize all those attendances into a single number, the average attendance (statistically called the mean). “How did you do in seminary?” You don’t recount your academic transcript; you probably give your g.p.a. (grade point average) as a means of summarizing your academic performance into a single, concise number. We use descriptive statistic most frequently and often do not even realize it. Descriptive statistics are like a snapshot. “Tell me what you see?” Descriptive statistics are used to summarize, analyze and share any set of numbers. However, like a good photograph, you must be able to get everyone in the shot, or the picture is incomplete. Descriptive statistics are useful only when the entire population can be reached. While they are the basis of all the other forms of statistics, when you only have a sample, descriptive statistics are not enough to reach a firm conclusion. Descriptive statistics cannot answer all the questions or serve all our statistical needs, and hence two other forms of statistics exist. Inferential Statistics What are the odds? Many of us have felt the frustration of playing the board game Risk® and having a territory defended by one single army that consistently rolls a 6 every time against your superior forces. “What are the odds?” “That’s not right!” “You’re cheating!” and the battle ensues. Inferential statistics are typically used in regard to the probability of a response occurring. More specifically, if a sample is pulled from a population, how certain can the researcher be that the numbers gained from the sample accurately reflect the population being studied; i.e., what are the odds of the population data occurring outside the sample data? This is used in political polling when projecting a winner in an election. With a small percentage of the vote cast (sample) certain statistical extrapolations can be made to indicate if this is reflective of the voting public (population), answering the question “What are the odds the other candidate can still catch the leading candidate?” This also speaks to the assurance one has that their descriptive statistics are accurate. It becomes a matter of confidence. How confident are you that your study of the sample is an accurate Page |8 depiction of the population? Las Vegas “odds-makers” are the heir apparent to this form of statistics, since it began with a gambling question. In the mid-1600s a professional gambler named Chevalier de Mere made fortune. He would bet unsuspecting patrons that in 4 rolls of a die he could get a six at least once. He was so successful at the game that soon people refused to play him. He then challenged them that he could roll two sixes in 24 rolls of the die . . . he failed. Confused, he contacted Blaise Pascal – famous mathematician. He corresponded with a friend, Pierre de Fermat, and they developed the first successful theory of probability . . . inferential statistics!2 This may be the only beneficial development from the gambling industry! Correlation Statistics When the temperature rises, so does the red line in a thermometer. As it cools, the red line lowers. There is an obvious relationship between the red line and the temperature. However, if someone had never seen a thermometer before, they could incorrectly conclude, “Amazing, if the red line drops, the temperature lowers; and if it goes up, the temperature rises!” Correlation statistics is a special form of descriptive statistics in which the relationship between two or more variable relate to one another. Correlation can not only determine the presence of a relationship, but the nature of the relationship as well. “Is there a connection between small group participation and involvement in the congregation’s ministry?” This is a question that correlation statistics can answer. In its most basic form, only the fact of the relationship can be demonstrated, not the form of the relationship, e.g. causation. More advanced correlation statistics can determine not only the existence of a relationship, but the event of causation. Correlation statics are in part based on Inferential statistics, since it is dealing with a sample and explaining the probability of an event, i.e. if this happens to X, then Y occurs this much. Variables? A variable is the item or items being analyzed. If I want to see the correlation between a students grade point average and their score on a standardized test (such as SAT, ACT, or GRE), then I’m studying the relationship between two variables. 2 Cf. Allan G. Bluman (1995), Elementary Statistics, 2nd Edition (Chicago: Irwin), 13. Page |9 Variables come in two different types: Categorical are those that are non-numerical, such as gender, which has two categories male and female. Numerical variables are counted or measured. Discrete numerical variables are counted in whole numbers, e.g. 1, 2, 3, 4; like asking how many times someone attended worship annually. Continuous numerical variables are continuous because they are not simply counted, but are measured, i.e. using decimals, and hence are continuous; e.g., the distance a household is from your church building, 5.239 miles. You could say discrete is a count, continuous is an amount. The type of variable does appropriate statistical formula and the way in which data is presented (Section 6). Common Problem: Mismatching! As Table 2.1 illustrates, the appropriate use of statistics requires alignment. The question being asked requires the application of the appropriate type of statistics. Different types of statistics have different formulas associated with them. Finally, the outcomes of the different statistical formulas lend themselves to different analyses of the data. In short, mismatching creates an irreconcilable error to occur in research. For example, if statisticians hear the phrase “Chi-squared,” they know that inferential statistics are involved; whereas if someone mentions Pearson r, then correlation statistics are being discussed. [If you are slightly panicking right now; fear not, these will be explained later in the paper, and all these functions are calculated by Excel. For now, just realize that all statistics are not alike.] Likewise, if inferential statistics are being applied, then one can assume that hypothesis testing or sample reliability are involved. Similarly, if correlation statistics are being used, one can assume that the question involves the relationship of two variables. Table 2.1, as well as Figure 4.1 and 5.1, are designed to maintain statistical alignment in research design. P a g e | 10 Section 3 Descriptive Statistics What’s going on here? You have a large set of numbers . . . how do you describe them? Descriptive statistics are designed to provide an accurate overview of any set of numbers. Like a photograph, they capture certain elements of the scene. Like the photo to the left, you could say, “It has palm trees, a beach, an ocean, and mountains in the distance.” While there may be more the photo, such as waves in the ocean, or mountains closer than others in the distance, the descriptive given is enough to identify the photo and accurately describe the contents. Descriptive statistics has several standard methods of describing a set of numbers. It is not designed to project or capture an image beyond the limits of the frame, or explain the relationship between items in the photo. Photos capture one moment in time within the limited scope of the lens. That is descriptive statistics. Descriptive Statistics Computations Mean: The technical term for “average,” the balancing point of all the data. It is derived by adding all the values in a data set and dividing by the number of values. This is designed to identify the center of the data, i.e. central tendency. Median: The middle value when all the values in a data set are arranged lowest to highest. This is designed to identify the center of the data, i.e. central tendency. Mode: The value in a data set that occurs most frequently. Range: The difference between the lowest and highest number in a data set. This is designed to demonstrate the spread of values in a data set, the variance. Quartiles: The three values that split a data set into four equal parts, or quarters. This is designed to summarize large data sets into 25th, 50th, and 75th percentiles, i.e. quarters. Note: standardized educational testing uses the quartile system. o Q1, the first quartile, 25%, means that a quarter of the values in the data set are smaller than this value and 75% are larger o Q2, the second quartile, 50%, means that half of the values in the data set are smaller than this value and half are larger. P a g e | 11 o Q3, the third quartile, means that 75% of the values in the data set are smaller than this value and 25% are larger. Whisker-Box Plot: It is like a graphic representation of quartiles. It is a test of “skewness,” i.e. the relation of the mean and median in a data set (See Figure 3.1). o Symmetrical Shape: When the mean and median are the same o Left-Skewed Shape: When the mean is less than the median, and hence more values are to the left of the mean. o Right-Skewed Shape: when the mean is more than the median, and hence more values are to the right of the mean. o NOTE: This is critical since inferential statistics assumes that data has a symmetrical shape, or bell curve. 6 Mean 5 Mean Mean 4 3 2 1 0 Symmetric Left Right Figure 3.1: Symmetric, Left, and Right Skewed Data Sets Z-score: The difference between a value and the mean, divided by the standard deviation. This is designed to identify the spread of data from the mean, the variance. Zscore is a score; meaning it is used to assess variance. A Z-score of +/- 3 identifies that that value is an extreme value, far from the mean. Standard Deviation: The measure of variation of a set of data from the mean, like the width of the mean. This is designed to demonstrate the spread of data from the mean of the data set. ILLUSTRATION You are an instructor in a seminary. You give a midterm exam to a class of 20 students. The following are the scores on the midterm (the data set), arranged smallest to largest: P a g e | 12 58 59 63 68 74 77 83 85 87 88 88 88 90 90 91 92 95 96 97 99 At lunch, a colleague asked, “How did your student do on the midterm?” Rather than rattling off the grade book, you could simply respond as follows: Mean: 83.4 Median: 88.5 (since it is an even numbered data set, 20 values, the median is between the 10th and 11th value). Mode: 88 (That value occurs three times in the data set, more than any other value) Range: 41 points (the highest value – the lowest value, i.e. 99-58 = 41) Quartiles: Q1 = 76.25 Q2 = 88 Q3 = 91.25 Skewness: 0.931 (using Microsoft Excel, this figure was provided. Since a value of 0 would indicate a symmetrical skew, and this is “close,” i.e. it is still a zeropoint-something, then the data is almost symmetrical, maybe only slightly skewed! Z-score: Since the Z-score for the lowest test was -2.0 and the Z-score for the highest was +1.23; neither being outside the +/- 3 range, none of the scores are considered “extreme.” Standard Deviation: 12.65 points, meaning most of the grades were between 96.05 (the mean + the standard deviation) and 70.75 (the mean – the standard deviation). This is probably more information than your colleague requested, required, or even expected, but it does provide an accurate depiction of the class’ performance on the midterm exam! P a g e | 13 Section 4 Inferential Statistics How sure am I that my description is accurate? Inferential statistics can be described in a variety of ways, but perhaps the most common phrase is, “What are the odds of that happening?” Inferential statistics is the mathematical basis of gambling, or more specifically probability. However, for our purposes, the basic description for our purposes is to discuss the difference between sets of data. It is used in research to determine the level of confidence one can have generalizing conclusion about the population from a sample. “How confident can I be that the sample’s data is an accurate reflection of the population in general?” This is the task of inferential statistics. We can assume that the data drawn from a sample of a population will accurately reflect the actual population; but that’s an assumption! Inferential statistics serves to validate more objectively the accuracy of the sample’s data in relation to the population. Rather than having to say, “I feel confident . . .” or “Well, I hope it’s accurate!”, a statistician can say, “I’m 90% confident that the sample data accurately reflects the population.” Inferential Statistic Calculations Hypothesis Testing refers to a set of methods used to make inferences about expected or hypothesized values between a sample and population. What methods (plural) are available for testing hypotheses? p-value approach: Given the null hypothesis is true, the p-value tests for extreme values in the sample. You reject the null hypothesis if the p-value is less than α, you do not reject the null hypothesis is the p-value is more than α. Hence the phrase, “If the p-value is low, then the null hypothesis must go.” o α , also known as “level of significance, refers to the odds of a Type 1 error occurring in a hypothesis test. It should be measured at .05 to insure a Type 1 error does not occur. Z-test (not Z score): What level of confidence do you have that a given set of data is normally distributed and that a given value falls into a predictable pattern, not in P a g e | 14 the extreme? This is the Z-test or Z-value. Typically, 95% is the level of certainty used in research. This corresponds to a Z-value of +/- 1.96. If a test statistic is greater than 1.96, you reject the null hypothesis. If you lower the level of certainty, this may change . . . but Inferential Statistics Flowchart3 Figure 4.1 Group Differences Interval Ordinal 2 Groups Wilcoxin Matched Pairs T-test MannWhitney U-test Wilcoxin Rank-Sum Figure 4.1 test 3+ Groups KruskalWallis H test 1 Group One-sample z-test Sample & Population mean known; n>30 One-sample t-test σ unknown 2 Groups Independent Samples t-test 2 Independent Groups Matched Samples t-test 2 matched groups 3+ Groups One-way ANOVA 1 ind. var., 1 dep. var. – ind. groups Factorial ANOVA 2+ ind. var., 1 dep. var. – ind. groups DESCRIPTIONS Wilcoxin Matched Pairs t-Test: It is like the Matched samples t-Test. See below Adapted from Dr. Rick Yount, (2006). “Research Design and Statistical Analysis in Christian Ministry,” 4th Edition. (self published, Southwestern Baptist Theological Seminary, Fort Worth, Texas), pp. 5-3, and subsequent material from pp. 5-4 to 5-7. 3 P a g e | 15 Mann-Whitney U Test: Compares two independent samples; e.g. “Did learning, as measured by the number correct on a quiz, occur faster for Class 1 or Class 2?” It is like the Independent Samples t-Test. Wilcox Rank Sum: Compares the magnitude and direction of differences between two groups, e.g. “Is Bible college twice as effective as no Bible college experience for helping develop writing skills?” It is like the Independent Samples t-Test. Kruskal-Wallis H Test: A form of ANOVA that compares the difference between two or more independent samples; e.g. How do rankings of associate deans differ between four academic fields?” Like the one-way ANOVA. One-sample z-test: Is data from one sample significantly different from its population? If the sample is > than 30, use this test One-sample t-test: Is data from one sample significantly different from its population? If the sample is < than 30, use this test. See below For example: You know the average age of pastors in the SBC. You collect information from a single sample of SBC pastors with seminary educations. Is there a significant difference in the average age of the sample and the population? Independent Samples t-Test: Like the ones above, this test computes whether data from two independently randomly selected samples is significantly different. See below Matched Samples t-Test: This test computes whether data from two groups of paired subjects (e.g. husbands and wives; elders and deacons) samples is significantly different. See below One-way ANOVA (Analysis of Variance): Also known as a F-test, it tests for difference between two or more means on one dimension, e.g. How do rankings of associate deans differ between four academic fields?” Factorial/Two-way ANOVA (Analysis of Variance): tests for difference between two or more independent samples on more than one dimension; e.g. How do rankings of P a g e | 16 associate deans differ between four academic fields and gender?” It measures the interaction among the independent variables. t-Test, but which one? When you are testing for differences between groups, a variety of t Tests are available. How do you know which one to use? P a g e | 17 Same Participants NO YES 2 Groups 3+ Groups t-test for dependent samples Related measures of analysis of variance 2 Groups 3+ Groups t-test for independent samples Simple Analysis of Variance Figure 4.24 Adapted from Neil J. Salkind (2008), Statistics for People Who Think They Hate Statistics (Los Angeles: Sage Publications), 190. 4 P a g e | 18 Hypothesis Testing Made Easy! State a Hypothesis (Null & Alternative) Set the Level of Significance (α) (typically .05) Select the Appropriate Test Mean/Variance, z-Test, t-Test, X2, ANOVA Calculate p-value & compare to α Reject or Accept the Null Hypothesis! General Rule: If p is low, then the hypothesis must go. Figure 4.3 P a g e | 19 Section 5 Correlation Statistics How does X influence Y? “I wonder if there is a relationship between the temperature in my dorm room and the red liquid in the thermometer? What might be that relationship?” Correlation statistics is used to determine the existence, strength, and direction of variables in a study, especially similarities among variables. “Yes, there is a relationship between the temperature in my dorm room and the red liquid in the thermometer. When temperature rises, the red liquid rises in direct relation to the temperature, and this is very strong relationship.” Correlation statistics are sometimes called coefficient of similarity or just association. Different tests are used depending on the type of variable: Categorical or Nominal: one that has two or more categories, but there is no intrinsic ordering to the categories, e.g gender has two categories (male and female) or hair color having more than two categories (blonde, brown, brunette, red). Neither of these categories have an order highest to lowest. Two Categories: Spearman rho ( ) and/or Kendall’s tau ( ) Three or More Categories: Kendall’s W Ordinal: Like a categorical or nominal variable, except there is an obvious ordering of the variables, e.g. economic status with three categories (low, medium and high) or educational experience with levels such as elementary school graduate, high school graduate, some university and university graduate. In both instances, the spacing between the values may not be the same across the levels of the variables. One Categories: Chi-squared (χ2), “Goodness of Fit” Two Categories: Chi-squared (χ2), “Test of Independence” Interval: Like an ordinal variable, except that the intervals between the values are equally spaced, e.g. measuring economic status in increments of $10,000, the space between the variables is equal. Two Categories: Pearson’s r and Linear Regression P a g e | 20 Three Categories: Multiple Regression Specific types of tests are tied to specific types of variables. Mismatch the variable and/or test, bad data! Correlation Flow Chart5 Figure 5.1 Relationships/ Similarities Ordinal Categorical Interval 2 Ranks 3+ Ranks 1 Variable 2 Variables 2 Variables 3+ Variables Spearman rho (ρ) Kendall’s W Chi-square Goodness of Fit Chi-square Test of Independence Contingency Coefficient; Cramer’s Phi Pearson’s r Multiple Regression Kendall’s tau (τ) Figure 5.1 Simple Linear Regression DESCRIPTIONS Spearman rho (ρ): Computes correlation between ranks (used when at least 10 or more pares of rankings are available) given by two groups; e.g. “What is the correlation between rank in the senior year of college and rank during the first year of seminary?” Adapted from Dr. Rick Yount, (2006). “Research Design and Statistical Analysis in Christian Ministry,” 4th Edition. (self published, Southwestern Baptist Theological Seminary, Fort Worth, Texas), p. 5-3, and subsequent material from pages 5-4 to 5-7. 5 P a g e | 21 Kendall’s tau (τ): Similar to Spearman rho, but used when less than 10 ranks are available given by two groups; e.g. two church staff members’ ranking a list of seven characteristics about church health, this could compute the degree of agreement between the rankings. Kendall’s W (also known as Kendall’s Coefficient of Concordance): Similar to the Spearman rho and Kendall’s tau, it measures the degree of agreement in ranking from more than two groups, 3+. e.g. five church staff members’ ranking a list of seven characteristics about church health, this could compute the degree of agreement between the rankings. Chi Square (χ2): Determines if the number of occurrences across categories is random; e.g. “Did coffee, hot chocolate, and hot tea sell an equal number of cups during the last week?” Chi Square Goodness of Fit: Compares actual categorical counts with expected categorical counts, e.g. did actual gender enrollment match what was expected? How good did the expectation fit the actual? Chi Squared Test of Independence: Compares two categorical variables to determine if any relationship exists between them, i.e. are they independent variables. Pearson r (or Pearson’s Product Moment Correlation Coefficient): Directly measures the degree of relationship between two interval variables. Simple Linear Regression: Computes an equation of a line which allow researchers to predict one interval variable from another, e.g., “If x . . . then y.” Multiple Regression: A statistical procedure where several interval variables (3+) are used to predict one. t Test, but which one? When you are testing for relationships or similarities between groups, a variety of t Tests are available. How do you know which one to use? P a g e | 22 How many variables? 2 Groups 3+ Groups t-test for significance of the correlation coefficient Regression, factor analysis Figure 5.26 6 Adapted from Neil J. Salkind (2008), Statistics for People Who Think They Hate Statistics (Los Angeles: Sage Publications), 190. P a g e | 23 Section 6 What to Watch Out For Common Errors, Mistakes, Misfortunes, and Resources of Statistics (& Quantitative Research) Appropriate Presentations If the variable is categorical . . . Determine 1-2 variables to present If one, use summary chart, or a bar, pie, or pareto chart If two, 2-way cross table If the variable is numerical . . . Determine 1-2 variables to present If one, use frequency and percentage distribution, histogram or a dot scale If two, determine if time order is important: o If yes, time-series plot o If not, use a scatter plot Resources for Statistical Work7 www.statistics.com http://www.psychstat.missouristate.edu/scripts/dws148f/statisticsresourcesmain.asp http://www.stat.ucle.edu/calculators/ http://www.Anselm.edu/homepage/jpitocch/biostatshist.html -- history of statistics http://www.anu.edu.au/nceph/surfstat/surfstathome/surfstat.html http://www.davidmlane.com/hyperstat/index.html -- tutorials! http://www.lib.umich.edu/govdocs/stats.html http://mathforum.org/workshops/sum96/data.collections/datalibrary/data.set6.html http://www.stat.ufl.edu/vlib/statistics.html http://noppa5.pc.helsinki.fip/links.html Survey Dilemmas8 Ill-defined Population Neil J. Salkind (2008), Statistics for People Who Think They Hate Statistics (Los Angeles: Sage Publications), Chapter 19. 8 Adapted from Deborah Rumsey (2003), Statistics for Dummies (Hoboken, New Jersey: Wiley Publishing), Chapter 20. 7 P a g e | 24 Mismatched Sample and Population Sample wasn’t randomly selected Sample wasn’t large enough Insufficient response to requests Inappropriate form of survey Misworded survey items (statements or questions) Timing of the Study Inadequate training for research assistants Survey items didn’t relate to the research questions Avoiding Erroneous Statistical Conclusions9 Statistics don’t prove anything. Conclusions cannot be based on statistically insignificant data Correlation does not mean causation Assuming a normal distribution without demonstrating a normal distribution. Partial reporting of the findings Assuming a larger sample size is better and solves all problems Intentional omission of unexpected findings Misapplication of results beyond the population. Conclusions based on non-random sample Selected vs. Responded Adapted from Deborah Rumsey (2007), Intermediate Statistics for Dummies (Hoboken, New Jersey: Wiley Publishing), Chapter 21. 9 P a g e | 25 Bibliography Levine, David M. and David F. Stephan (2005). Even You Can Learn Statistics. Upper Saddle River, New Jersey: Pearson/Prentice Hall. Pyrczak, Fred (2006). Making Sense of Statistics: A Conceptual Overview, 4th Edition. Glendale, California: Pyrczak Publishing. Rumsey, Deborah (2003). Statistics for Dummies. Hoboken, New Jersey: Wiley Publishing. ________ (2007). Intermediate Statistics for Dummies. Hoboken, New Jersey: Wiley Publishing. Salkind, Neil J. (2008). Statistics for People Who (Think They) Hate Statistics, 3rd Edition. Los Angeles: Sage Publications. Yount, Rick (2006). “Research Design and Statistical Analysis in Christian Ministry,” 4th Edition. Fort Worth, Texas: Southwest Baptist Theological Seminary.