Why Statistics Are Important

In this introductory chapter we begin by providing our general theoretical position for this book: that social work research, including quantitative research methods, should be carried out from a structural and anti-oppressive perspective. We provide examples of how a quantitative approach has been used throughout the history of social work practice to address important structural issues. We also introduce the basic statistics terminology used and conclude with an overview of the book.

Over 10% of First Nations children in three sample provinces were in child welfare care as of May 2005, versus 3% for Metis children and 0.5% for non-Aboriginal children. Overall, it has been estimated that there are three times as many First Nations children placed in out-of-home care today than in residential schools at the height of the residential school movement. (Trocme, MacLaurin, Fallon, et al. 2005: 16)

The above quote, from a study entitled Understanding the Overrepresentation of First Nations Children in Canada's Child Welfare System, offers a dramatic example of providing quantitative evidence to bring attention to important social justice issues, in this case the deplorable state of the Aboriginal child welfare system in Canada. Victor Thiessen (1993), in his book Arguing with Numbers, states that convincing funders, the government and the general public to accept arguments requires hard evidence in the form of numbers. Whether it is to show how well a program serves the needs of clients or to advocate for social justice causes, social workers need to provide statistical evidence to defend their arguments. In order to present this kind of information effectively, social workers must have a good understanding of quantitative research methods and statistical analyses. The aim of this book is to lay the foundation for this knowledge and provide an introduction to statistical concepts as they relate to social work, and to do so using a social justice lens.

Victor Thiessen's book reinforces the point that those in the field of social work need to be familiar with quantitative research methods. This is in spite of the fact that progressive social workers have moved away from a positivist approach to research and rely instead on a post-positivist, or interpretist, approach. Positivism is a paradigm that holds that social behaviour can be studied and understood in a rational, objective and scientific manner. The term was coined by early French sociologist Auguste Comte, and the most influential advocates were a group of philosophers who were collectively known as the "Vienna Circle of Logical Positivists." Their approach "amounted to the methodological assertion that any variable which cannot be directly represented by a measurement operation has no place in science" (Ford 1975, cited in Lincoln and Guba 1985: 45). Post-positivist writers, on the other hand, argue that science subjugates knowledge and that approaches to social work knowledge should "install the client as an important site of knowledge" (Rossiter 2000: 27). For interpretist researchers (another term for post-positivist researchers), scientific methods are seen as limiting the multiple voices that can contribute to the construction of social work knowledge. They argue that the rigour and rigidity of experimental methods cannot account for the complexity of human relations and interventions.
These researchers claim that an interpretive approach, which relies primarily on qualitative methods, rebalances empiricist methods "with subjective, intuitive and inductive approaches, thus lending support to new paradigms which integrate theorizing, practice and research as part of holistic experience" (Fook 1996: 197). While quantitative methods best fit within the positivist paradigm, rather than the interpretivist paradigm, and in spite of the legitimacy of the above critiques of positivism, social work research does need to rely on quantitative methods in order to advocate for meaningful social change (van de Sande and Schwartz 2011). Progressive social workers, including those who adopt a structural perspective, need to be familiar with both qualitative and quantitative methods. What distinguishes the structural social work researcher is not the research method chosen but the purpose of the research: promoting social justice.

THE STRUCTURAL PERSPECTIVE

The term "structural" originated in the United States in the 1970s. Ruth R. Middleman and Gale Goldberg introduced this term to show how social workers could intervene "to improve the quality of the relationship between people and their social environment by bringing to bear, changing, or creating social structures" (1974: 32). In Canada, Maurice Moreau and his colleagues at Carleton University utilized feminist and Marxist principles to develop this approach. Bob Mullaly's work on the structural approach continued through the 1990s. He published a series of three books (1993, 1997, 2007) which provide a comprehensive framework for the approach. Mullaly proposes that the structural approach "seeks to change the social system and not the individual who receives, through no fault of their own, the results of defective social arrangements" (2007: 245). He argues that within the sociological literature there are two competing perspectives, order and conflict, with regard to how society functions. The order perspective, which Mullaly argues is consistent with neoliberal ideology, sees society as basically functioning in an orderly fashion. Neoliberal ideology holds that capitalism is a sound economic system and that the wealth accumulated by corporations and individuals will trickle down to lower income members of society. Those who believe in the order perspective see social problems as caused by individuals who do not respect the rules of society. Mainstream social work has adopted the order perspective and is concerned primarily with helping individuals adjust to the expectations of society. Donna Baines notes: "In social work, social problems are often depoliticized by defining them as the failings and shortcomings of individuals ... One of the ways that wider social problems are individualized and depoliticized is by giving them medical or psychiatric diagnoses or criminal labels" (2007: 5). Mullaly believes that the social work profession should adopt the conflict perspective, in keeping with a social democratic and Marxist ideology, which is based on socialist theory and a radical analysis of society. Those who subscribe to the conflict perspective believe that the more power a group possesses, the more it is able to pursue its own self-interests and oppress others through coercion and subjugation.
Mullaly characterizes society as a struggle between competing interest groups, and adds that "the goal of structural social work is twofold: (1) to alleviate the negative effects on people of an exploitative and alienating social order; and (2) to transform the conditions and social structures that cause these negative effects" (2007: 245). To explain how the structural approach translates into practice and research, Moreau (1989) identifies the following five practice methods.

1. Defence of the Client: Social workers using the structural approach help to defend their clients against an oppressive system. Quite often, clients are not familiar with their rights and require someone to advocate on their behalf. This includes writing letters, attending meetings and, if needed, subverting agency policy.

2. Collectivization: Clients need to know that they are not alone in their struggles. It is common for clients to feel that the problems they face are the result of their individual shortcomings. An important role of the structural social worker is to connect clients to support networks and reduce their felt sense of isolation and alienation.

3. Materialization: A cornerstone of the structural approach is the materialist analysis. Many of the personal problems experienced by clients are a direct result of material deprivation; for example, a single parent on social assistance may struggle with depression and feelings of inadequacy. Rather than focusing on the mental health issues experienced by clients, social workers using a material analysis will help the clients make the connection between their poverty and mental health issues and assure them they are not to blame for structural problems beyond their control.

4. Increasing Client Power in the Worker-Client Relationship: Clients coming for assistance typically experience feelings of powerlessness. Part of the work of a structural social worker is to increase the power of clients in the worker-client relationship by clear contracting, avoiding jargon, sharing rationales behind proposed interventions and ensuring that clients see what is in their files. In this way, clients will view themselves as being in control of their own problems and the possible solutions.

5. Enhancing Client Power Through Personal Change: A challenge for the structural social worker is to maximize the client's potential for personal change of thoughts and behaviours that are self-destructive or destructive to others without judging or blaming. This is done by focusing on clients' strengths and helping them make the connection between their thoughts and behaviours and their social context.

Moreau's practice methods also relate to research. For instance, with respect to materialization, researchers have found a strong correlation between poverty and mental illness; the lower a person's socio-economic status, the greater their chance of suffering from mental distress. Studies have shown that poverty, unemployment and lack of affordable housing precede a diagnosis of mental illness and psychiatric hospitalization and are, therefore, causal factors (Hudson 2005). Helping clients acquire resources such as safe and stable housing, employment and adequate food is of utmost importance in the prevention and relief of mental distress. Studies that document discrimination can be used in the process of collectivization to help clients understand the effects of social injustice (Schwartz and O'Brien 2010).
Involving clients at all stages of the research, including in quantitative studies, is one way of empowering the client. This practice is encouraged in participatory action research, an activist approach to research that seeks to engage and empower the community. Proponents of participatory action research and community empowerment research argue for participant empowerment and conscientization to be reflected in research studies. In this way, the funding organization is made aware of what clients feel they need and has the opportunity to be more responsive. Furthermore, it is empowering for clients to see their ideas reflected in the results of the study.

ANTI-OPPRESSIVE PRINCIPLES

Donna Baines (2011) identifies ten anti-oppressive practice (aop) principles, some of which directly relate to social work research. One that stands out is: "Social work is not a neutral, caring profession, but an active political process" (5). While most mainstream social work research texts emphasize the importance of being neutral and objective, progressive social workers argue that social work research should not be neutral; it should actively pursue social justice and social change. Another important aop principle is that "participatory approaches between practitioners and clients are necessary" (21). This means that the service user, typically the focus of the research, should be actively involved in the design of the research. Another aop principle is that "self-reflexive practice and ongoing social analysis are essential components of social justice oriented social work practice" (22). Self-reflexive exercises are also essential in social work research. We need to become aware of our biases as they relate to our research, such as the pressure to be neutral and objective, or biases as a result of our privilege, in order to best limit their effect on our research.

Some authors suggest that research should only be undertaken when there is a good fit between the researcher's theoretical approach to practice and their approach to research (Westhues, Cadell, Karabanow, et al. 1999). A good part of students' anxiety about conducting anti-oppressive research may result from a lack of logical fit. Research questions, methods of data collection and ways of interpreting data need to be consistent with the researcher's theoretical analysis. "Only by knowing that oppression is a social construction can social work embark on a deconstruction of oppressive practices and reconstruction of society characterized by true social equality" (Mullaly 2007: 284).

The difference between a traditional quantitative approach and a progressive approach lies in the underlying purpose of the research. We need to remember that the end goal of carrying out research from a social justice perspective is structural change. For example, for almost two decades, Campaign 2000, a group established in 1991, has been conducting research on child poverty in Canada. In 1989, the House of Commons unanimously adopted a resolution to work on the elimination of child poverty in Canada by the year 2000. Every year, Campaign 2000 publishes a report card to let the federal government know how it is doing in terms of meeting its promise. Clearly, the government has not achieved its goal. Nevertheless, the research conducted by Campaign 2000 serves as a constant reminder to the government to continue in its efforts.
On the twentieth anniversary of the adoption of the resolution, the government renewed its commitment to reduce and eventually eliminate child poverty in Canada. Our point here is that Campaign 2000 uses traditional quantitative methods of analysis (e.g., data from Statistics Canada) and produces frequency tables and graphs commonly found in mainstream, quantitative research, with the goal of advocacy and creating change.

PROGRESSIVE QUANTITATIVE SOCIAL WORK RESEARCH

One of the earliest examples of quantitative research to promote structural change was a study done by Dorothea Dix in the United States in the early 1840s. Dix conducted a detailed examination of the treatment of people with mental illness who were held in prison. With evidence in hand, she drafted a petition to the Massachusetts Legislature.

I come as the advocate of the helpless, forgotten, insane, and idiotic men and women; of beings sunk to a condition from which the most unconcerned would start with real horror; of beings wretched in our prisons, and more wretched in our almshouses.... If my pictures are displeasing, coarse, and severe, my subjects, it must be recollected, offer no tranquil, refined, or composing features. (quoted in Snyder 1975: 68)

Dix used the results of her findings to increase awareness of the inhumane treatment of these people and to advocate for improved conditions. She is credited with helping to establish state hospitals for mentally ill patients both nationally and internationally (Snyder 1975).

Later, in 1889, Jane Addams and Ellen Starr established Hull House in Chicago. The success of their work was the result of a careful study on the needs of the population in the Chicago tenement neighbourhoods. Addams and Starr carried out both quantitative and qualitative research that included community mapping as well as observational and interview data. The results of their research led to reforms at the state, federal and international levels. In her book Twenty Years at Hull House, Addams recalls the importance of statistical evidence in promoting social change:

There was, at that time, no statistical information on Chicago industrial conditions, and Mrs. Florence Kelley, an early resident of Hull House, suggested to the Illinois State Bureau of Labor that they investigate the sweating system in Chicago with its attendant child labor. The head of the Bureau adopted this suggestion and engaged Mrs. Kelley to make the investigation. When the report was presented to the Illinois Legislature, a special committee was appointed to look into the Chicago conditions. I well recall that on the Sunday the members of this commission came to dine at Hull-House, our hopes ran high, and we believed that at last some of the worst ills under which our neighbors were suffering would be brought to an end. (1912: 202)

Dennis Guest (1997) describes research carried out in Canada during the latter part of the nineteenth century by Sir Herbert Brown Ames, who was a wealthy manufacturer living in Montreal. Ames felt that his elite position carried with it a responsibility for the welfare of the working class. During the fall and winter of 1896, he surveyed each home in an area of one square mile, which included 38,000 homes in a working-class neighbourhood in Montreal. He gathered data on employment, family income, housing conditions and rental charges.
His report, published in 1897, challenged conventional attitudes about the causes of poverty:

As to the causes of poverty, chief among them is insufficient employment. Few are the families where nothing is earned, although there are such subsisting more or less worthily upon charity. Almost without exception each family has a wage-earner, often more than one, and upon the regularity with which the wage-earner secures employment depends the scale of living for the family. (Ames 1897: 52)

Ames also selected a smaller sample of 323 families, which he described as the poorest of the poor, in order to provide a more complete explanation of the causes of poverty.

With 109 families, or 34 per cent, the reply was "irregularity of work." The wage-earners were not without vocation but their employment was intermittent and often work ceased altogether for considerable periods. With 87 families, or 28 per cent, the answer was that the wage earner has no work whatsoever, nor did there seem to be any immediate prospect of getting any. With 27 families, or 9 per cent, old age has unfitted, and with a like number sickness had prevented, the workers from earning the requisite support. (Ames 1897: 55)

As a result of his research, Ames was instrumental in helping Canadians understand that poverty was "largely rooted in economic and social arrangement" (Guest 1997: 31). Another Canadian example described by Guest (1997) was the work by Leonard Marsh, former director of an interdisciplinary social science research program at McGill University. Based largely on his extensive research work as well as the work of other social scientists, Marsh published the Report on Social Security for Canada in 1943. This report became the blueprint for health and social security programs developed in Canada during the 1940s, 1950s and 1960s.

BASIC CONCEPTS

Many of the terms used in quantitative research methods have specific meanings. This section provides brief definitions of some of the basic concepts used in statistics. Many other terms are used in statistics, which are introduced in the following chapters.

Data and Information

The term data is used in both qualitative and quantitative research; when used in statistics, data refers to the results of the measurements, normally in the form of numbers or scores. Data are the raw scores or numbers obtained using questionnaires and other methodology. Note that data is the plural form, while datum is singular. The data are analyzed and interpreted, and the results of this analysis and interpretation are the information. Conclusions are based on the information resulting from analysis. For instance, if we were to conduct an evaluation of a program designed to help female survivors of abuse improve their self-esteem, we might use a standardized questionnaire that measures self-esteem, such as the Rosenberg Self-Esteem Scale. We would administer the questionnaire at the start of the program and again at the end, with the women's scores being the data. We would then analyze the data using a statistical test to determine if the scores obtained at the end of the program are significantly higher. The results of this analysis are the information on which we can base our conclusion about the effectiveness of the program.

Variables and Constants

In its simplest terms, a variable is a characteristic that varies. Examples of variables often found in social work research include gender, race, income, education level, religion and type of employment.
A constant is a characteristic that is the same for the people or objects that are the focus of the research. Referring to our example of the program for survivors of abuse, the constants for the women in our study are 1) that they are female, 2) that they have experienced abuse and 3) that they are all participants in the program.

Conceptualization and Operationalization

Quantitative research methods rely on precise definitions and measurement, and the term conceptualization refers to the process used in choosing and clearly defining the variables included in the study. This precise conceptualization allows others to replicate the study and allows readers of the research to be clear on what is being researched. In the above example, the variable that we are interested in is the self-esteem of the women in our study. We must therefore define what we mean by self-esteem. The term operationalization refers to the method used to measure the variable. In the case of our study, after we have defined what we mean by self-esteem, we must show how we plan to measure this variable, in this case by using a standardized instrument, the Rosenberg Self-Esteem Scale. We must demonstrate that this scale will provide the accurate and reliable data that will help us in our analysis.

Reliability and Validity

The degree to which the measurement instrument provides consistent results over time is called reliability. While it is impossible for any instrument, especially those used to measure psychological characteristics, to provide perfectly consistent measures, we need to provide evidence that the instrument we have chosen will provide reasonably consistent results. We explain in a later chapter how this is done. Validity refers to the degree to which our chosen instrument truly measures what it is supposed to measure, and not something else. The questionnaire mentioned above may provide consistent results, but it may not measure what it is intended to measure. There are accepted ways in which validity is determined, which are explained later in the book. In our example, the Rosenberg Self-Esteem Scale has been tested for reliability and validity, which means that we can be safe in assuming that it will provide reliable and valid data.

Levels of Measurement

There are four levels of measurement: nominal, ordinal, interval and ratio. A good way to remember the order is that the first letter of each word spells the French word "noir." The nominal level of measurement involves classifying observations into mutually exclusive categories, with no inherent order or rank. For example, if we were to ask "Are you employed?" as a yes or no question, this would be at the nominal level of measurement. The ordinal level is used when classifying observations that are mutually exclusive and have an inherent order to them. An example is level of education: elementary school, high school or university. The third is the interval level and involves classifying observations that are mutually exclusive, have an inherent order and have equal spacing between categories. The fourth is the ratio level and involves classifying observations that are mutually exclusive, have an inherent order, have equal spacing and reflect the absolute magnitude. A lot of quantitative data is at the interval/ratio level of measurement. Typical examples are variables such as income and scores on questionnaires.
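To make the four levels concrete, here is a minimal sketch in Python; the analyses in this book are carried out with statistical software rather than code, and the variable names and values below are entirely hypothetical. The point is simply that the level of measurement determines which operations on a variable are meaningful.

```python
# A minimal sketch: one hypothetical participant record, with one variable
# at each of the four levels of measurement. All names and values are invented.
participant = {
    "employed": "yes",            # nominal: categories only, no order or rank
    "education": "high school",   # ordinal: ordered categories, unequal spacing
    "self_esteem_score": 22,      # interval: ordered, equal spacing between scores
    "annual_income": 31500,       # ratio: equal spacing plus an absolute zero
}

# The level of measurement limits the arithmetic that makes sense: we can count
# "yes" answers, rank education levels, and average scores or incomes, but an
# "average" of the employed variable would be meaningless.
education_order = ["elementary school", "high school", "university"]
print(education_order.index(participant["education"]))  # rank of the ordinal value: 1
```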
OVERVIEW OF THE BOOK

Quantitative statistical analysis is normally divided into two broad categories: descriptive and inferential statistics. While Chapter 2 of this book provides an historical overview of empiricism, the epistemic basis of quantitative methods, Chapters 3 to 6 introduce descriptive statistics. Descriptive statistics are used to describe the characteristics of a sample or population. Most research projects carried out by social workers involve descriptive analyses. For instance, in the case of needs assessments, the data analysis can be presented using frequency distributions, bar charts and graphs that describe the characteristics and needs of the service users. In the case of client satisfaction surveys, the analysis may result in percentage distribution tables indicating the percentage of service users who are satisfied with the program. Chapter 3 introduces frequency distributions and graphs.

If we want to describe a typical case in our sample, we would use measures of central tendency. The three common measures of central tendency are mode, median and mean. The mode is the value that appears most often, the median is the halfway point in the range of values, and the mean is the average of all the values. If we want to describe to what extent the values vary, the most common method is the standard deviation, which describes the average distance of all the values from the mean. Chapter 4 provides a more complete explanation of central tendency and variability.

The next three chapters introduce the basic principles of statistics. Chapter 5 focuses on the properties of the normal distribution. This type of distribution is important because some of the more common statistical tests can only be done if the variables are normally distributed within the population of interest. Chapter 6 looks at the principles of hypothesis testing, which are fundamental to determining if the results of the study are statistically significant. Chapter 7 introduces sampling distributions and the various statistical tests used in quantitative methods.

The remainder of the book focuses on inferential statistics, which are used to generalize findings to a larger population based on a sample of cases. If the sample was selected using a probability approach, meaning that each individual within the population of interest had an equal chance of being selected, then we can infer the characteristics of the population from the characteristics of the sample. Inferential statistical tests can also show if there is a relationship between two or more variables. In the case of a summative program evaluation (discussed in Chapter 6), we would use an explanatory approach, which will tell us if there is a cause and effect relationship between the program (the independent variable) and the effect on clients (the dependent variable). For instance, did our program for women who are survivors of abuse cause the improvement in self-esteem? If we were able to use a classic experimental design and involve a control group, we would be able to compare the scores obtained by the program (or experimental) group with those obtained by the control group. We would start with a hypothesis: a testable statement describing the relationship between the independent and dependent variables.
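Before stating and testing such a hypothesis, it often helps to look at each group descriptively. The following sketch is an illustration only: the scores are invented, and Python's standard statistics module stands in for the statistical software used for the analyses described in later chapters.

```python
import statistics

# Hypothetical end-of-program self-esteem scores, invented for illustration.
experimental = [26, 28, 30, 27, 29, 31, 25, 28]
control = [21, 22, 22, 23, 24, 25, 20, 26]

for name, scores in [("experimental", experimental), ("control", control)]:
    print(name,
          "mode:", statistics.mode(scores),              # value appearing most often
          "median:", statistics.median(scores),          # halfway point of the ordered values
          "mean:", round(statistics.mean(scores), 2),    # average of all the values
          "sd:", round(statistics.stdev(scores), 2))     # spread of the values around the mean
```

Comparing the two group means descriptively is not enough on its own; the inferential tests introduced below tell us whether such a difference is statistically significant.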
In our example of the summative program evaluation, the hypothesis would be: "Participants in the experimental group will score higher than participants in the control group." For this type of inferential statistical analysis we would use the t test (described in Chapter 8). The t test is a parametric test, which means that the data for the dependent (outcome) variable is at the interval or ratio level and the data are normally distributed. On the other hand, if we are only able to obtain data at the nominal level, we can still carry out a program evaluation using a non-parametric test: the chi-square test (introduced in Chapter 9). Let's say we are looking at a program to help women survivors of abuse become more assertive. We want to know if women who attend the program are less likely to return to abusive situations. We would still have two variables: the independent variable would be program attendance, and the data would be at the nominal level with two possible scores: attend or did not attend. The data for the dependent or outcome variable would also be at the nominal level and have two scores: return or did not return to an abusive situation.

Another typical statistical method is testing for correlation (covered in Chapter 10). If we want to find out if there is a relationship between years of service and job satisfaction scores, we would find the correlation coefficient for these two variables. This test requires that the data for both variables be at the interval or ratio level. If we find that we have a very strong correlation, we can use this information to develop a simple linear regression, which we can use to predict what outcome score the participant will have, such as job satisfaction, based on the level of the predictor variable, years of service.
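To give a rough sense of how these tests look in practice, the sketch below runs an independent-samples t test, a chi-square test and a Pearson correlation with a simple linear regression on invented data. Python's scipy.stats library is used here only as a stand-in for the statistical software introduced later in the book, and every number is hypothetical.

```python
from scipy import stats

# Hypothetical end-of-program self-esteem scores (interval-level data).
experimental = [26, 28, 30, 27, 29, 31, 25, 28]
control = [21, 22, 22, 23, 24, 25, 20, 26]

# Independent-samples t test: is the difference between the group means significant?
t_stat, p_value = stats.ttest_ind(experimental, control)

# Chi-square test on a 2 x 2 table of nominal data: rows are attended / did not
# attend the assertiveness program; columns are returned / did not return to an
# abusive situation. The counts are invented.
table = [[8, 22],
         [15, 15]]
chi2, p_chi, dof, expected = stats.chi2_contingency(table)

# Pearson correlation and simple linear regression: years of service (predictor)
# versus a hypothetical job satisfaction score (outcome), both interval/ratio level.
years = [1, 2, 3, 5, 7, 10, 12, 15]
satisfaction = [55, 58, 60, 64, 66, 72, 75, 80]
r, p_r = stats.pearsonr(years, satisfaction)
model = stats.linregress(years, satisfaction)  # slope, intercept, rvalue, pvalue, stderr

print(f"t = {t_stat:.2f} (p = {p_value:.3f}), chi-square = {chi2:.2f}, r = {r:.2f}")
```

In each case the level of measurement of the variables determines which test is appropriate, which is exactly the logic set out above.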
In Chapter 11, we review how the concepts presented in the preceding chapters can help us achieve meaningful structural change. Chapter 12 covers report writing and a brief introduction to a commonly used statistical software package called SPSS.

SUMMARY

The important points to remember from this chapter are that social work does and should conduct research using both quantitative and qualitative methods as a way to promote social justice. Quantitative social work research should be carried out within a structural and anti-oppressive practice perspective. We provided several historical examples of social workers who used quantitative methods to argue for social change. Finally, we defined a number of terms used in statistics that we need to understand in order to conduct research and promote social justice.

REVIEW QUESTIONS

1. Describe the term "positivism." How is it different from post-positivism? Why are both paradigms important to social workers carrying out research?

2. Compare and contrast the "order perspective" and the "conflict perspective." Discuss how each relates to a structural approach to social work.

3. Which of Moreau's five practice methods do you feel is most important for a social worker conducting research from a structural perspective and why?

4. Provide an example of a research study that uses traditional quantitative methods whose goal is to promote social change. Where did you hear about this research? How does it relate to what was discussed in this chapter?

5. Discuss how a research study might conceptualize and operationalize the variable of "income."

6. Self-reflexive exercise: Take a few minutes to think about your history with research and statistics. Was this a positive experience? Was this research/statistics done using a structural or aop framework? How did it make you feel? How might your past experiences with research and/or statistics be shaping the way you view or feel about this course?

2 The History of Empiricism

In this chapter, we trace the history of empiricism from the early writings of Aristotle to the contributions made by the British empiricists. We also show that, while empiricism was developed primarily by white male philosophers, it has more recently been modified by feminist scholars, whose work illustrates that empiricism can be employed within a structural and anti-oppressive approach.

ARISTOTLE

For Aristotle, commonly considered to be the first empiricist, the natural sciences, such as physics and biology, and the human sciences, such as politics and ethics, have the same level of validity as mathematics. Because of his devotion to science, Aristotle is generally credited with being the Father of Modern Science. His influence on Western thinking in the area of logic and scientific investigation cannot be overstated; empiricism became the dominant scientific method for acquiring knowledge. From Aristotle, the Western world learned about the importance of observation and the structure of logic and of deductive reasoning, which are at the heart of the scientific method. Aristotle's approach shows us the promise and peril of empirical science, as well as the important distinction between craft and science.

Although Aristotle introduced empiricism, it was primarily the seventeenth-century British philosophers who established it as the dominant scientific method of the Western world. The British empiricists were a product of the modern age, when science gradually replaced religion as the chief source of knowledge. In some respects, because of unquestioned and almost dogmatic faith, science became the new religion. Social work was part of Western culture, which was convinced that science was capable of solving the ills of the world, including disease and poverty. This is why, even today, so many in our profession are reluctant to choose post-positivist and interpretist methods instead of empirical methods as important ways of knowing.

While the academic part of social work has largely moved beyond the positivist view and its faith in empiricism, the professional part of social work is as committed as ever. We see this in the extent to which evidence-based practice (ebp) has been adopted by our profession. The underlying principle of evidence-based practice is that social workers should use knowledge that is gathered and tested empirically to guide their practice. In social work research, especially in the area of program evaluations, funders often insist on empirically tested outcomes. In fact, social work practitioners are still told to develop intervention plans with clear measurable goals and objectives.

THE BRITISH EMPIRICISTS

Thomas Hobbes was born in Malmesbury, England, in 1588, the year of the Spanish Armada. He was educated at Oxford and served as a tutor to the Cavendish family, and as a secretary and clerk to Francis Bacon. In 1640, he published The Elements of Law, Natural and Politic, in which he described the principles of his philosophy on human nature and human society. That same year, he fled to Paris in anticipation of the civil war in England, and he remained there for more than ten years.
During this period, Hobbes became tutor to the future King Charles II. He also became familiar with Descartes' work and wrote a critique on his philosophy of the mind. Hobbes established himself as a materialist and believed that there was no such thing as a non-material mind. Historians view Hobbes and Descartes as founders of two opposing schools of philosophy: British empiricism and continental rationalism.

If Thomas Hobbes is described as the founder of British empiricism, John Locke was the most influential. He was born in Somerset, England, the son of a gentleman who fought in the parliamentary cavalry. He attended school at Westminster, where he studied Latin, Greek and Hebrew. He obtained his master of arts degree from Christ Church, Oxford and, upon graduation, became interested in chemistry and physiology and spent the next several years studying medicine. In 1667, Locke became physician and political advisor to the Earl of Shaftesbury, a member of King Charles II's inner cabinet. However, when Shaftesbury was found to be involved in a plot to exclude James II, the Catholic brother of the king, from his rightful claim to the throne, he had to flee to Holland. Locke was likewise obliged to flee. It was during his stay in Holland, in 1690, that Locke produced his greatest philosophical work, An Essay Concerning Human Understanding, which is regarded as one of the world's classics. Upon his return from exile, Locke obtained various posts in the civil service, and in 1704, after years of ill health, he passed away.

Locke lived during a time of great political and religious upheaval. Like his predecessors Descartes and Hobbes, Locke's ideas were revolutionary and challenged the religious dogma of the day, but unlike them, society was more ready and able to accept his work. An Essay Concerning Human Understanding built upon Hobbes's theories on sensations and proposed that all knowledge is derived from experience. He argued that the mind of a child is like a "tabula rasa," a blank slate upon which experience writes.

Let us then suppose the mind to be, as we say, white paper void of all characters, without any ideas. How comes it to be furnished? Whence comes it by that vast store which the busy and boundless fancy of man has painted on it with an almost endless variety? Whence has it all the materials of reason and knowledge? To this I answer, in one word, from experience; in that all our knowledge is founded, and from that it ultimately derives itself. Our observation, employed either about external sensible objects, or about the internal operations of our minds perceived and reflected on by ourselves, is that which supplies our understanding with all materials of thinking. These two are the fountains of knowledge, from whence all the ideas we have, or can naturally have, do spring. (Locke 1993: 45)

As part of his theory of knowledge, Locke introduced the concept of probability. He suggested that human reason could be divided into two parts, those of which an individual is certain and those of which "it is wise to accept" but which only have the probability of being true. He explained that since there is very little in our world that we know with certainty to be true, we need to act based on the probability of something being true (Russell 1972). He believed that the real essence of things is unknown to us; we cannot have true knowledge of items in the natural world but only a probable belief.
We can only have knowledge of things within the bounds of our sensations, and the "love of truth" should keep us from going beyond this point (Kenny 2004). In this respect, Locke anticipated the importance of probability, particularly as it is used in quantitative analysis. Locke greatly advanced the philosophy of knowledge. While many of his ideas were challenged because of their inconsistencies, during their time they were revolutionary and greatly influenced philosophy in England and throughout Europe. For his contribution, he is considered the Father of Empiricism.

The next member of the British empiricists we discuss, George Berkeley, was born in 1685 near Kilkenny, Ireland. When he was fifteen, he attended Trinity College in Dublin, where in 1704 he obtained a bachelor of arts. On the strength of two mathematical papers, he became a fellow of the College. His most famous works were published when he was still quite young. In 1709, when only twenty-one, he wrote An Essay Towards a New Theory of Vision, in which he gave an account of how we judge distances. Distances and sizes, he argued, we judge by vision, while shapes are judged by touch. We learn to judge these qualities by means of experience. In 1710, he wrote The Principles of Human Knowledge, where he proposed and ingeniously defended the idea that there is no such thing as matter. Matter as experienced by means of the senses is only an idea and does not actually exist in reality (Kenny 2006).

In 1728, he sailed to Newport, Rhode Island, in the United States, hoping to establish a college. However, when a promised grant did not materialize, he returned to England. Nevertheless, citizens of the United States were impressed by his commitment to education and named a college at Yale after him and, much later, a university town in California. In 1734, he became the Bishop of Cloyne and, in addition to his pastoral duties, continued to write and study until his death in 1753.

While accepting the basics of learning through experience, Berkeley believed that all the qualities exist in the mind of the individual from information derived through the senses. He suggested that the senses only perceive light, sound and smell but make no inferences about them. He distinguished between ideas and perception, stating "to be is to be perceived" ("esse est percipi"). He proposed that the only reality is the mind, since it is impossible for something to be thought about without its existing, even if only as a mental image. He agreed with Locke's belief that complex ideas are formed in the mind by the association of simple ideas. He also believed that, through experience, the mind is able to create associations of ideas.

The last of the British empiricists we consider took empiricist principles to an extreme. David Hume was born in Edinburgh, Scotland, in 1711, into a junior branch of a noble Scottish family. He was the youngest son of a widowed mother and consequently had to learn to make his own way in the world. From age twelve to fifteen, he studied literature and philosophy at the University of Edinburgh. He attempted to enter the legal profession but gave up because of an irresistible interest in philosophy. His attempts to enter the business world resulted in the same conclusion. Instead, he decided to move to France and live frugally on a small inheritance. He attended La Fleche College, where Descartes had studied over a hundred years earlier.
Hume's first major work, A Treatise of Human Nature, was published in 1739, when he was just twenty-eight. While the Treatise initially received little attention, it became the main target of criticism of the German idealists. While struggling to get recognition as a scholar, he worked as a diplomat and in various government services. He nevertheless continued to write, and, in the 1750s, when his publications finally began to sell, he at last enjoyed some prosperity and recognition. Hume retired in 1769 and returned to Edinburgh, where he lived until his death in 1776 (Kenny 2006).

Hume continued exploring the question of the mind, and going a step further than Berkeley, he completely did away with the mind as an entity beyond the sensible qualities available in experience and proposed instead that what is believed to be the mind is "the flow of ideas, memories, imagination and feelings" (Chaplin and Krawiec 1968: 21). The mind is simply a bundle of such processes; it is not observable itself, but only through its action in perception and thought.

Hume distinguished between various types of ideas and suggested that ideas of memory are closely related to the original perception while ideas of imagination are less distinct. In this way he acknowledged that there are mental processes at work organizing these various ideas into mental constructs. He also agreed with his empiricist predecessors that complex ideas are combinations of simple ideas formed by association.

We find by experience, that when any impression has been present with the mind, it again makes its appearance there as an idea; and this it may do after two different ways: either when in its new appearance it retains a considerable degree of the first vivacity, and is somewhat intermediate betwixt an impression and an idea; or when it entirely loses that vivacity, and is a perfect idea. The faculty, by which we repeat our impressions in the first manner, is called the memory, and the other the imagination. 'Tis evident at first sight, that the ideas of the memory are much more lively and strong than those of the imagination, and the former faculty paints its objects in more distinct colours, than any which are employ'd by the latter. (Hume 2003: 15)

As pointed out by Chaplin and Krawiec (1968), Hume had a reductionistic and mechanistic view of the mind that made possible a more contemporary, positivist paradigm. By giving such prominence to primary perceptions and minimizing the existence of even simple concepts of knowledge, the British empiricists radically changed the course of Western philosophy.

FEMINIST EMPIRICISM

Until recently, the dominant view of the Western scientific world was that only men possessed the necessary qualities to engage in science. Linda Jean Shepherd, biochemist and author of Lifting the Veil: The Feminine Face of Science, noted that when the institutions of science were forming during the mid-seventeenth century, the Royal Society of London believed that its business was "to raise a Masculine Philosophy" (1993: 19). Shepherd argued that the commonly held view was that scientists were to be rational, neutral and objective (and male). During the twentieth century, the traditional view of science and philosophy as solely the domain of men was challenged. Very slowly, a few influential female scientists paved the way for a feminist epistemology. Some notable examples include Marie Curie, who won a Nobel Prize in physics in 1903, and in chemistry in 1911, for her work on radioactivity.
Dorothy Crowfoot Hodgkin won a Nobel Prize in 1964 in chemistry for her work on the structures of biochemical substances. In 1983, Barbara McClintock won a Nobel Prize in physiology or medicine for her work in genetics. As of 2012, only forty-three women had been awarded a Nobel Prize, out of 862 people and organizations that had been named laureates (Connelly 2012).

Feminist empiricists accept the basic principles of traditional empiricism: that only knowledge based on direct observation through the senses should be accepted as scientific fact. They also uphold the empiricist view that science is about formulating hypotheses that must be tested against experience. However, they believe that traditional empiricism can be improved upon by making certain modifications. They suggest that science is not value-free and that the scientific method is not sufficient to screen out all of the influence of values.

Alessandra Tanesini states that there are two main criticisms of traditional "male" epistemology; the first concerns individualism. Philosophers such as Descartes and Locke felt that to achieve true knowledge an individual must free himself from the influences of society. Stemming from this, individualism holds that knowledge is only achievable by a fully autonomous and separate individual and rejects the notion that social factors play a role in the production of knowledge. Feminist epistemologists severely criticize the notion that knowledge is achievable only by an autonomous and separate individual knower. Instead they believe that social factors are both relevant to, and among the causes of, knowledge, and that the social location of the knower has a bearing on what they know. With respect to feminist work, Gayle Letherby states: "It is important that we recognize the importance of our intellectual biography" (2003: 8). Some feminists suggest that it is not the individual who has knowledge, but the community. Lynn Hankinson Nelson, for example, argues: "It is the communities that construct and acquire knowledge" (1993: 123).

A second criticism looks at the concept of the knower. The traditional subject was emotionally detached, objective and value-neutral. As such, scholars were expected to write using the third person passive. Those of us trained in the traditional scientific method were expected to say "it was found that" rather than "I found that." Letherby argues that feminist researchers should write in the first person. The characteristics of emotional detachment and objectivity were associated with "maleness" and contrasted with the more "feminine" characteristics, including nurturing, receptivity, cooperation and intuition (Shepherd 1993). It becomes evident in the literature that while the traditional epistemic subject was supposed to represent the universal subject, in reality, the subject was male (Tanesini 1999). Feminist epistemologists believe that detachment and value-neutrality are neither possible nor desirable. They posit that values cannot be turned on or off like a switch and suggest that emotions and values should be acknowledged. Once acknowledged, emotions and values can enrich an investigation. Letherby argues: "The 'value freedom' of traditional research is challenged but not the empiricist goals" (2003: 44).
are that the social location ofthe knower has a The Feminist History of Empiricism 19 empiricism is described by Sandra Harding as an attempt to rectify the sexism and androcentrism of current science by arguing that these are “social by stricter adherence to the existing methodological norms of scientific inquiry” (1987: 24). There are two ways in which these biases can be eliminated or at least minimized. The first is by recognizing that the “context of discovery” is just as important and the “context of justification.” The context of discovery refers to the elements that play a role in the discovery of theories. These elements can include, among others, 1) research hypotheses, 2) theoretical and conceptual framework and 3) method of analysis. The context ofjustification refers to the choice of evidence used to argue in favour or against a theory. In other biases correctable words, choices are made in terms of what evidence is examined and what evidence (Tanesini 1999). which the impact of biases can be eliminated or reduced is by focusing on the social location of the researcher. is deemed irrelevant Another way in Traditional empiricism does not direct the researcher to locate themselves plane as their subject matters. Consequently, when non-feminist researchers gather evidence for or against hypotheses, scientific method bereft of such a method is impotent to locate and eradicate the androcentrism that shapes the research processes. (Harding 1987:184) in the same critical Because traditional that they are neutral and objective, they acknowledge that social factors influence and shape their research. There are two kinds of feminist empiricism: contextual feminist empiricism, developed by Helen Longino (1990); and naturalized feminist empiricism, developed by Lynn Nelson (1990). We examine the key concepts ofeach by focus¬ ing on the contributions of each of these scholars. are not able empiricists assume to Helen Longino Helen professor of philosophy at the University of Minnesota. She began her career as an activist in the anti-war and women’s liberation movements of the 1960s and 1970s. She was chair of the American Philosophical Association s Committee on the Status of Women in the Profession and has written extensively on feminist epistemology. Longino first began working on her book Science as Social Knowledge (1990) out of frustration that traditional philosophy of science had not acknowledged the relationship between social values and scientific inquiry. As an alternative to the “value-free” concept of science, her book provides an analysis that reconciles social values with the objectivity of science (1990: ix). Longino begins by making Longino is a the distinction between the “constitutive” values of science — that is the values or 20 Statistics for Social Justice rules that determine what is with the “contextual” values, philosophy of science main¬ tains that these two are distinct and independent of one another. Longino, on the other hand, states that the “influence exerted by social and cultural context on the directions of scientific development have led many observer-critics of science to reject the value-freedom of science” (6). She believes that scientific knowledge must be viewed within its political, social and cultural context and that scientific knowledge is social knowledge and can be achieved only by individuals working which are within a acceptable science — the social and cultural values. Traditional community context. 
In her first chapter in Good Science, Bad Science, Longino covers what she principles of epistemology. First, she states that science means knowledge and true knowledge is achieved by scientific investigation. Second, the philosophy of science relies on general criteria such as truth, accuracy, simplicity and predictability; she believes that these criteria should produce “good” science. However, while traditional epistemology insists that allowing the influence of social and cultural values results in “bad” science, Longino argues “not only that the science practices and content on the one hand and social needs and values on the other are in dynamic interaction but that the logical and cognitive structures of scientific inquiry require such interaction” (1990: 5). Longino concludes by proposing that scientific inquiry should never stop. The solution to making science less traditional is that science should be ongoing and embrace multiple social contexts — even those that are incompatible. It should always be open to divergent ideas, including those which are radically different than the commonly held ideas of a community. By acknowledging the social context of science, Longino makes an important contribution to feminist empiricism and feminist epistemology more generally; however, some feminist scholars, such as Lynn Harkinson-Nelson, suggest that it does not go far enough. believes are basic Lynn Hankinson-Nelson professor of philosophy at the University of Washington. She obtained a B.A. in 1980 from Rutgers University in New Jersey and a Ph.D. in 1987 from Temple University in Philadelphia. Her areas of expertise include feminist epistemology, feminist philosophy of science and philosophy of biology and the social sciences. Like Longino, Nelson accepts the basic principles of empiricism and also agrees that science should acknowledge the influence ofvalues (Nelson 1990). Where she differs from Longino is in her beliefthat science is not only influenced by social and Lynn Hankinson-Nelson is a cultural values but that science is a social endeavour. Naturalized feminist empiricism shares the contextual feminist empiricist view that science is not value-free and must be understood within its political, social and cultural context. However, it goes further The History of Empiricism 21 in challenging traditional empiricism by suggesting that even the theories upon which are based are socially constructed. In traditional empiricism, a scientific investigation begins with a theoretical framework upon which a research hypothesis is formulated. Next, data are collected and analyzed, which will prove or disprove the hypothesis. The problem is that if the theory is socially constructed, the data are analyzed and conclusions drawn in the context of that socially constructed theory. The same data could result in very different conclusions if the investigation was based on a different socially constructed theory. Looking at our social work theories as examples, and comparing psychodynamic theories to structural theories, the former emphasizes ego functioning whereas the latter focuses on the structural causes of the problems faced by individuals (Lundy 2004). The naturalized feminist empiricist argues that data are not independent of the theory, but that theory are influenced by how the data are observed, measured and understood. Lynn Nelson accepts “that there is a world that shapes and constrains what is reasonable to believe, and that it does so by impinging on our sensory receptors” (1990: 20). 
She adds that “there probably is no logical coherent way to doubt (this) thesis” (20). The difficulty is, as Nelson points out, that once our mind has processed this sensory information, the end product bears little resemblance to the original sensory information. The process of conceptualizing the “raw ” sensory data requires language that is shaped by history and culture and involves psychological processes, which act as a filter. Any attempt to strip away these cultural and psycho¬ logical processes to return to the original sensory data is bound to meet with failure. What is required is an extra-theory which explains how we theorize and identify the connection between the “natural world” and our conceptualization of that world (Quine 1972). Traditional empiricists believe that epistemology, the study of how we understand knowledge, provides the “extra-scientific” justification of science. Nelson, who relies heavily on the work of WV.O. Quine, states that no such “extra-scientific” justification is possible and that epistemology is “within” scientific discoveries science. are Nevertheless, Nelson insists that made and the link between sensory an account of how scientific discoveries evidence and scientific discoveries, while difficult, should be possible. Nelson identifies the main feminist criticisms of traditional empiricism, point¬ ing out that traditional empiricists have ignored the sex/gender issue as it relates Traditional empiricists insist that their work focuses the “human experience” and that it includes women; however, there is ample evidence that the experiences of women and men are substantially different. If this is the case, Nelson asks,-why has traditional science refused to include sex/gender issues as a variable in its investigation? One of the fundamental principles of science is that its conclusions are always tentative. As new theories and discoveries are made, they replace old, less adequate ones. If this is true, traditional science should be to science. on 22 Statistics for Social Justice willing to take seriously the criticism that sex/gender issues have not been dealt with adequately as intervening variables. Another important criticism is the “individualism” of science. Traditional empiricists hold that science is an individual pursuit and is not connected to any particular community. Nelson argues that there is a scientific community that it is very much part of the larger social community, and its values are reflected in the scientific work being conducted. Since the values of society have been historically androcentric, Feminist so have the values of the scientific community. empiricism is different than traditional empiricism in number of feminist empiricism is not individualist but acknowledges that science is a community activity and that it reflects the values of society. Science is also done with subjects and that the participation ofthe subject should be acknowledged. Finally, a feminist empiricist recognizes the "who” that is theorizing matters and believes that they are not and cannot be neutral or objective. areas. a First, feminist empiricism incorporates feminist insights. Second, IMPLICATIONS FOR SOCIAL WORK KNOWLEDGE The introduction of empiricism marked a turning point in the Western scientific complete is our belief in the scientific method based on empiricism that, for most of us, it has become the unquestioned assumption on how to achieve true knowledge. 
In our Western world, only phenomena whose existence can be proven scientifically should be accepted as fact; true knowledge must be empirically verifiable. The scientific method based on empiricism heavily influences social work, just as it does all disciplines.

The most important contribution of feminist empiricists was that they challenged the dogmatic position that science is individualistic, objective and value-free. While upholding some of the basic premises of empiricism, namely that knowledge must be based on empirical evidence, feminist empiricists successfully contested the individualism of science and provided sound arguments to prove that knowledge should always be viewed within a social context. They also demonstrated that science can never be completely objective and that being value-free is neither possible nor desirable. This affirms our belief that basic social work values as expressed in our codes of ethics must be followed even as we conduct research. For instance, the International Federation of Social Workers (IFSW) states: "Social workers have a responsibility to promote social justice, in relation to society generally, and in relation to the people with whom they work" (2012). Potts and Brown explain: "The purpose of anti-oppressive research is not only to produce knowledge but also to examine, unsettle and shift the power relations" (2005: 5).

As stated, feminist empiricists uphold the notion that the basic principles of empiricism are sound. For social work, this suggests that we should continue to teach basic research methods, including, but not exclusively, quantitative methods. Like other empiricists, feminist empiricists believe that the scientific method is still the most effective method available for acquiring knowledge; that is, they believe that there is a world out there that we can discover through our senses. They point to the tremendous scientific achievements of the last centuries as proof, all the while acknowledging that improvements to the scientific method are needed.

In social work, the type of research approach we choose should depend on the nature of the topic being researched. A qualitative approach may be more suitable for the investigation of new phenomena about which little is known. On the other hand, when we want to know whether a certain cause, such as a new program or intervention, produces the desired effect, we may want to use a quantitative approach. The question is not which approach is better but which approach will best provide the information to answer our research question. In many cases it makes sense to combine qualitative and quantitative methods in order to take advantage of the best of both approaches.

Contrary to the belief held by much of the Western scientific world, the knower is not simply an individual, but rather is an individual who is part of a larger community. An important contribution of feminist epistemologists was their ability to demonstrate that science is a social endeavour, that it is usually conducted within a scientific community, which in turn is part of a larger society and which is subject to the social, cultural and political values of the time and place. An important aspect of social work training is reflexivity. Before we engage in practice with others, through introspection and intersubjective reflection, we critically examine our values and biases and the impact they have on our practice. The same is true with social work research.
Before we can become a knower, we should be aware of our values and biases and how our social location affects the type of research that we conduct and what we hold to be true knowledge. Finally, and most importantly, feminist empiricists have affirmed the concept that the knower is not simply male. It is safe to say that, while patriarchy is still prevalent within the scientific community, in social work at least, both women and men are engaged in scientific research. As a profession committed to eradicating all forms of oppression, including the oppression of women, the social work profession sees this development as long overdue.

SUMMARY

In this chapter, we explored the history of empiricism and the roots of the scientific method. We discussed the early writings of Aristotle and examined the contributions made by the British empiricists. We also looked at the important modifications to empiricism proposed by feminist empiricists, whose work brought it to a point where it is now more in keeping with a structural and anti-oppressive approach.

REVIEW QUESTIONS

1. Who were some of the key players in bringing empiricism into Western thinking? Discuss their individual contributions.
2. Discuss this statement: "While the academic part of social work has largely moved beyond the positivist view and its faith in empiricism, the professional part of social work is as committed as ever." Has this been true in your experience?
3. What were the contributions of feminist empiricists to our understanding of empiricism? Discuss how their ideas are in line with the structural approach to social work.
4. Describe how naturalized feminist empiricism is in line with conducting research from a structural and AOP perspective.
5. "Science is a social endeavour." Discuss what this statement means to you. How does it relate to research conducted from a structural perspective?

3 Frequency Distributions and Graphs

In this chapter we focus on descriptive statistics and explain various methods used to describe data. We begin by introducing different kinds of frequency distributions, including absolute, cumulative and percentage distributions. We describe different types of graphs, such as bar graphs, pie charts, histograms, frequency polygons and scatterplots. We also provide guidance on how to best use graphs, how to avoid common mistakes and how to maximize the impact of descriptions for the purpose of meeting social justice goals. The chapter concludes with examples of studies using frequency distributions.

As stated in Chapter 1, quantitative research methods are normally divided into two broad categories: descriptive statistics and inferential statistics. Most research projects carried out by social workers include some form of descriptive analysis and may also involve inferential analysis. Often, descriptive statistics are all that are required to answer important research questions, such as, what percentage of the population in our community is living below the poverty line? In Chapter 1, we described Campaign 2000, the national coalition dedicated to the reduction of child poverty in Canada. It publishes a government report card every year in which it makes extensive use of descriptive statistics to show the progress, or lack thereof, on reducing child poverty. The following hypothetical example illustrates how descriptive statistics can be used.

June Edwards is a young social worker who has been hired to work in the Community Housing Department for her city.
The mission of Community Housing is to provide safe, affordable housing for individuals and families whose income is insufficient to pay market level rent. June is aware that government cutbacks in social assistance rates and in subsidized housing have made it impossible for Community Housing to meet the increasing demands from low-income people. As a result, the waiting list for housing is growing quickly. While she accepts the fact that she has limited control over government funding, she and her colleagues feel that, at the very least, they should document the extent and characteristics of people on the waiting list in order to determine which ones are most in need. June is also hoping that this information will lead to public pressure to increase government funding.

How should June and her colleagues proceed? In what way should they describe people on the waiting list for subsidized housing? What format will provide the most useful information and have the greatest impact in terms of increasing public awareness and interest in the problem? In Chapter 1, we stated that descriptive statistics are used to describe the characteristics of a sample or population, while inferential statistics are used to show a relationship between two variables or to generalize findings to a larger population based on a sample of cases.

FREQUENCY DISTRIBUTIONS

The simplest way to describe the characteristics of the sample (or the population) under investigation is by using frequency distributions. These are tables that display the number of times a specific value appears for a particular variable. A variable is any characteristic in a given population or sample that varies, and the values are the different measurements that a variable can take. In June's study, one of the variables that they would naturally be interested in is income, and the values would be the different income levels (e.g., <$10,000, $10,001-$20,000, $20,001-$30,000, etc.).

To construct a frequency distribution, we start by creating an array, which displays all the values that exist within the data in an orderly manner, from the lowest value to the highest value. In our hypothetical example, suppose June noticed that there are a growing number of single-female-headed families with children on the waiting list for subsidized housing. Let's say that she is interested in finding out about the number of children in these families. This information would be important because it provides an indication of the sizes of the housing units needed by these families. In the most recent month, Community Housing received requests from 25 such families in need of subsidized housing. Before creating an array, June lists the names of each of these women along with the number of children they have (Table 3.1).

Table 3.1 Raw Data - Clients' Names and Number of Children
Name        Number of Children
Cindy       1
Helen       2
Kathy       1
Chelsea     3
Taylor      2
Nancy       1
Rebecca     4
Ann         5
Paula       2
Zaina       1
Jenny       1
Mary        2
Maria       4
Anna        1
Julia       2
Elizabeth   1
Irene       3
Gisele      2
Sophie      1
Colleen     1
Sarah       2
Claudia     3
Helene      1
Pauline     1
Stephanie   2

Looking at the raw data, June can already see that most of the families have only one or two children. In order to display her observation more clearly, she creates an array (Table 3.2) to organize her data from the lowest to the highest number of children. This array shows her that 11 of the 25 families have 1 child and 8 families have 2.
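The counting that June does by hand scales easily to larger lists. The short Python sketch below is our own illustration, not part of the original example; the values are the hypothetical counts from Table 3.1. It sorts the raw data into an array and tallies an absolute frequency distribution.

from collections import Counter

# Hypothetical raw data from Table 3.1: number of children per family
children = [1, 2, 1, 3, 2, 1, 4, 5, 2, 1, 1, 2, 4,
            1, 2, 1, 3, 2, 1, 1, 2, 3, 1, 1, 2]

# The array: the same values ordered from lowest to highest
array = sorted(children)
print("Array:", array)

# Absolute frequency distribution: how often each value occurs
frequencies = Counter(array)
for value in sorted(frequencies):
    print(f"{value} child(ren): {frequencies[value]} families")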
Table 3.2 Array - Clients' Names and Number of Children
Name        Number of Children
Cindy       1
Kathy       1
Nancy       1
Zaina       1
Jenny       1
Anna        1
Elizabeth   1
Sophie      1
Colleen     1
Helene      1
Pauline     1
Helen       2
Taylor      2
Paula       2
Mary        2
Julia       2
Gisele      2
Sarah       2
Stephanie   2
Chelsea     3
Irene       3
Claudia     3
Rebecca     4
Maria       4
Ann         5

Absolute Frequency Distribution

The array already provides June with helpful information with respect to the size of the units needed by these families. In order to make these data more useful, June creates an absolute frequency distribution, which displays the data on the variable she is interested in along with the different values. The variable is the number of children, and the values are the actual number of children in each of the families. Having already created an array, June is able to quickly count how many times each value occurs and places this number in the "Absolute Frequency" column (Table 3.3).

Table 3.3 Absolute Frequency Distribution - Clients' Names and Number of Children
Number of Children   Absolute Frequency   Names
1                    11                   Cindy, Kathy, Nancy, Zaina, Jenny, Anna, Elizabeth, Sophie, Colleen, Helene, Pauline
2                    8                    Helen, Taylor, Paula, Mary, Julia, Gisele, Sarah, Stephanie
3                    3                    Chelsea, Irene, Claudia
4                    2                    Rebecca, Maria
5                    1                    Ann

Cumulative Frequency Distribution

June also creates a table to show the cumulative frequency distribution (Table 3.4), which displays the cumulative total of the values. For instance, June found that 11 of the 25 clients have 1 child and 8 of the clients have two children. So the cumulative total for families with 1 or 2 children is 19.

Table 3.4 Cumulative Frequency Distribution - Clients' Names and Total Number of Children
Number of Children   Cumulative Frequency   Names
1                    11                     Cindy, Kathy, Nancy, Zaina, Jenny, Anna, Elizabeth, Sophie, Colleen, Helene, Pauline
2                    19                     Helen, Taylor, Paula, Mary, Julia, Gisele, Sarah, Stephanie
3                    22                     Chelsea, Irene, Claudia
4                    24                     Rebecca, Maria
5                    25                     Ann

Percentage Distribution

Because our example includes only 25 cases, a relatively small number, it is easy for someone to understand the breakdown without the need for any additional tables. But what if the number of cases was much larger? It may become necessary to use different tables to make the data more readable. In this case, it may be useful to create tables to show percentage distributions. June creates a percentage distribution table in order to display these data in the form of percentages. She is aware of the fact that as the waiting list for housing grows, she will have to use percentage distributions to make her data more useful and to allow the reader to understand the distribution of families waiting for subsidized housing. With the current 25 cases, the 11 families who have 1 child represent 44% of the total number of families. The 8 families with 2 children represent 32% of the total number of families (Table 3.5).

Table 3.5 Percentage Distributions - Clients' Names and Number of Children
Number of Children   Percentage   Names
1                    44%          Cindy, Kathy, Nancy, Zaina, Jenny, Anna, Elizabeth, Sophie, Colleen, Helene, Pauline
2                    32%          Helen, Taylor, Paula, Mary, Julia, Gisele, Sarah, Stephanie
3                    12%          Chelsea, Irene, Claudia
4                    8%           Rebecca, Maria
5                    4%           Ann

Just as she did for the absolute and cumulative frequency distributions, June could also create a cumulative percentage distribution. She simply totals the percentage for each value by adding it to the percentage of the lower value (Table 3.6).
For example, the cumulative percentage for the value "2" is calculated by adding 32% to 44%. The advantage of this table is that it shows the cumulative percentage of families with certain numbers of children. June could show that 76% of families on the wait list have 1 or 2 children. Down the road, this information could help with program planning.

Table 3.6 Cumulative Percentage Distributions - Clients' Names and Number of Children
Number of Children   Cumulative Percentage   Names
1                    44%                     Cindy, Kathy, Nancy, Zaina, Jenny, Anna, Elizabeth, Sophie, Colleen, Helene, Pauline
2                    76%                     Helen, Taylor, Paula, Mary, Julia, Gisele, Sarah, Stephanie
3                    88%                     Chelsea, Irene, Claudia
4                    96%                     Rebecca, Maria
5                    100%                    Ann

It is often useful to show both the absolute frequency and the percentage distributions in one table. In Table 3.7, June shows the absolute frequency and percentage distributions. In Table 3.8, she displays the cumulative frequency and percentage distributions.

Table 3.7 Absolute Frequency and Percentage Distributions - Clients' Names and Number of Children
Number of Children   Absolute Frequency   Absolute Percentage   Names of Clients
1                    11                   44%                   Cindy, Kathy, Nancy, Zaina, Jenny, Anna, Elizabeth, Sophie, Colleen, Helene, Pauline
2                    8                    32%                   Helen, Taylor, Paula, Mary, Julia, Gisele, Sarah, Stephanie
3                    3                    12%                   Chelsea, Irene, Claudia
4                    2                    8%                    Rebecca, Maria
5                    1                    4%                    Ann

Table 3.8 Cumulative Frequency and Percentage Distributions - Clients' Names and Number of Children
Number of Children   Cumulative Frequency   Cumulative Percentage   Names of Clients
1                    11                     44%                     Cindy, Kathy, Nancy, Zaina, Jenny, Anna, Elizabeth, Sophie, Colleen, Helene, Pauline
2                    19                     76%                     Helen, Taylor, Paula, Mary, Julia, Gisele, Sarah, Stephanie
3                    22                     88%                     Chelsea, Irene, Claudia
4                    24                     96%                     Rebecca, Maria
5                    25                     100%                    Ann

Grouped Frequency Distribution

When working with large data sets with many possible values, it is sometimes easier to visualize and comprehend the meaning of the data if they are grouped. For instance, June is interested in creating a table that displays the various income levels of all of the families on her waiting list. Rather than including every annual income amount, it makes more sense to group the incomes of the clients. June creates groups of income, such as $0 to $4,999, $5,000 to $5,999, $6,000 to $9,999 and so on (Table 3.9).

Table 3.9 Grouped Frequency Distribution - Income Levels and Number of Families
Income Level         Number of Families (Absolute Frequency)
$0 to $4,999         5
$5,000 to $5,999     16
$6,000 to $9,999     28
$10,000 to $14,999   14
$15,000 to $19,999   6
$20,000 and over     2

Determining which frequency distributions to use depends on what information will be most useful and have the greatest impact. As social workers, we are naturally concerned about the plight of individual families and their children, but providing the numbers in a manner that is easily understandable to agency administrators, government departments and the general public will have the best results in terms of reaching our social justice goals. (See Box 3.1 for an example.)
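Before turning to Box 3.1, note that the percentage and cumulative columns in Tables 3.5 to 3.8 can be generated directly from the frequency counts. The sketch below is our own illustration, not part of June's example, and uses the same hypothetical counts.

from collections import Counter

children = [1] * 11 + [2] * 8 + [3] * 3 + [4] * 2 + [5] * 1   # 25 hypothetical families
counts = Counter(children)
total = sum(counts.values())

cumulative = 0
for value in sorted(counts):
    freq = counts[value]
    cumulative += freq
    pct = 100 * freq / total
    cum_pct = 100 * cumulative / total
    print(f"{value} child(ren): frequency {freq}, cumulative {cumulative}, "
          f"{pct:.0f}%, cumulative {cum_pct:.0f}%")

Run on these counts, the output reproduces the figures in the tables above (44%, 76%, 88%, 96%, 100%).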
Box 3.1 Shelter Use by Homeless Families

The following summary describes a study using descriptive statistics. Homeless families are diverse in structure, with some including two parents, and many headed by a single parent (usually female). Family homelessness is largely underpinned by structural factors, including inadequate income, lack of affordable housing and family violence. Following the withdrawal of government housing programs and decreased supports, more families are turning to emergency shelters. A significant finding from the Segaert study was that the sharpest increase in shelter use has been among families (in most cases headed by women) and therefore children. For instance, the number of children staying in shelters increased by over 50% between 2005 (6,205) and 2009 (9,459). Segaert identifies that the average length of shelter stay for families was 50.2 days, an increase of 50% over five years, and more than triple the average stay for the total population of people who experienced homelessness. This means that while families accounted for just 4% of all shelter stays, they used 14% of total bed nights. This puts incredible pressure on the family shelter system, which has not had the capacity to deal with this increase. It is worth noting, once again, that these figures do not include female-headed families using shelters providing accommodation for women fleeing violent partners.
Source: Segaert 2012

GRAPHS

While frequency tables are useful for presenting data in a readable manner, with larger data sets, graphs often display these data in a more easily understandable way. The advantage of graphs is that they can quickly convey, in the form of a picture, large volumes of data. The drawback is that they often sacrifice detail, but they are nevertheless effective in conveying meaning to the audience. This may be especially important, for example, when making a presentation to a grassroots association whose members do not have the academic training to comprehend tables that convey large data sets. Graphs displaying one or two variables generally have two perpendicular lines. The horizontal line is called the x axis, the vertical line is the y axis, and the point where they join is the point of origin.

Bar Graph

A common type of graph is the bar graph, also called a bar chart. In this graph, the bars are drawn in equal widths and they do not touch; this is to acknowledge that the data are qualitative in nature, with values at the nominal or ordinal level of measurement. As stated in Chapter 1, nominal level measurement involves classifying observations into mutually exclusive categories, with no inherent order or rank. If we use the example of June's study, looking at families with children, versus families with no children, versus single adults, we could create a frequency distribution table with the three categories of clients. Let's assume that there are a total of 26 families with children on the waiting list, 15 families with no children and 38 single individuals. Table 3.10 shows the frequency distribution for these data. Category one includes families with children, category two shows families with no children and category three covers the single individuals. Figure 3.1 is a bar graph showing these same data. The bar graph clearly shows that the majority of clients are single individuals with no children and the next largest category is families with children.

Table 3.10 Categories of Clients
Clients                  Category   Frequency   Percent   Cumulative Percent
Families with Children   1          26          32.9%     32.9%
Families, no Children    2          15          19%       51%
Individuals              3          38          48.1%     100%
Totals                              79          100%

Figure 3.1 Bar Graph - Three Categories of Clients (x axis: client categories)
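A bar graph such as Figure 3.1 can also be produced in code. The following sketch is ours, not part of the original example; it assumes the matplotlib library is available and uses the hypothetical counts from Table 3.10.

import matplotlib.pyplot as plt

categories = ["Families with children", "Families, no children", "Single individuals"]
frequencies = [26, 15, 38]

plt.bar(categories, frequencies, width=0.5)   # narrow bars so they do not touch
plt.xlabel("Client categories")
plt.ylabel("Frequency")
plt.title("Three Categories of Clients")
plt.show()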
Pie Charts

Pie charts are generally used for nominal level data. When the values add up to a whole, a pie chart can be used. Each piece of the pie can reflect a segment of the whole. If we again turn to the example of the data from June's study shown in Table 3.10, which displays the three categories of clients, the pie chart for these data is illustrated in Figure 3.2.

Figure 3.2 Pie Chart - Three Categories of Clients

Pie charts are effective if there are few categories. If there are too many categories, the chart can become confusing and the data will not be understood. Pie charts are frequently used to show the breakdown of the budget of an organization, hence the phrase "bigger slice of the pie."

Histograms

Another type of graph that is commonly used is the histogram. This is similar to the bar graph except that the bars do touch; this is to reflect the fact that the data are rank ordered and to reflect differences in quantity. The data must be at the interval or ratio level. As stated in Chapter 1, the interval/ratio level involves observations that are mutually exclusive, have an inherent order and have equal spacing between categories. The height of the bars varies to reflect the difference in frequency. The bars can be of equal width, reflecting equal category intervals; however, the width can vary if the category or "bin" sizes vary. In some cases, the bars can be drawn on both sides of the y axis, reflecting positive values on the right and negative values on the left, but this may be confusing for the reader.

Let's say that Community Housing's intake form for clients requesting subsidized housing includes information on income level and years of education. Both of these variables are at the ratio level. Looking first at the income levels of all clients from the previous month requesting subsidized housing, June creates a frequency table (Table 3.11).

Table 3.11 Absolute Frequency and Percentage Distributions - Income Levels of Clients
Income Level         Category   Frequency   Percent   Cumulative Percent
$0 to $4,999         1          2           6.7       6.7
$5,000 to $5,999     2          5           16.7      23.3
$6,000 to $9,999     3          11          36.7      60.0
$10,000 to $14,999   4          7           23.3      83.3
$15,000 to $19,999   5          3           10.0      93.3
$20,000 and over     6          2           6.7       100.0
Total                           30          100.0

Next she used these data to develop a histogram (Figure 3.3). This histogram shows that the majority of the clients fit in categories 2, 3 and 4, with income levels ranging from $5,000 to $15,000. The distribution of income levels in this histogram resembles a bell curve, also called a normal distribution, which we discuss in Chapter 5.

Figure 3.3 Histogram - Income Levels of Clients

June then does the same with the data on education levels. She creates a frequency table and a histogram (Table 3.12 and Figure 3.4).

Table 3.12 Absolute Frequency and Percentage Distributions - Years of Education
Years of Education   Frequency   Percent   Cumulative Percent
6                    2           6.7       6.7
7                    7           23.3      30.0
8                    9           30.0      60.0
9                    6           20.0      80.0
10                   3           10.0      90.0
11                   2           6.7       96.7
12                   1           3.3       100.0
Total                30          100.0

We can clearly see that the majority of clients had education levels of seven, eight and nine years and that the distribution of education levels is beginning to look like a normal curve. This information is useful because, as we explain in later chapters, certain statistical tests can only be done for interval/ratio level data that are normally distributed.

Figure 3.4 Histogram - Years of Education
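Histograms such as Figures 3.3 and 3.4 can be generated with a few lines of code as well. This sketch is our own illustration (matplotlib assumed available); the education values are invented to match the frequencies in Table 3.12.

import matplotlib.pyplot as plt

# Hypothetical years of education for 30 clients, matching Table 3.12
education = [6] * 2 + [7] * 7 + [8] * 9 + [9] * 6 + [10] * 3 + [11] * 2 + [12] * 1

# One bin per year of education (6 through 12); the bars touch, as in a histogram
plt.hist(education, bins=range(6, 14), edgecolor="black")
plt.xlabel("Years of education")
plt.ylabel("Frequency")
plt.title("Histogram - Years of Education")
plt.show()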
Frequency Polygons

Frequency polygons are similar to histograms, except that dots are used instead of bars. If we place a dot at the top middle of each bar, then join the dots, we have a frequency polygon. Polygons are normally used to display data at the interval/ratio level. Let's consider, for example, a frequency polygon for grade levels reached. If the sample is large enough, the polygon will begin to resemble a normal curve.

Scatterplots

Scatterplots are useful for displaying the relationship between two variables which are both at either an interval or ratio level. For example, we could use a scatterplot to show the relationship between the number of treatment sessions and the score on a related psychological test. The two variables would each be shown on either the x axis (the independent or predictor variable, discussed in Chapter 10) or the y axis (the dependent or outcome variable, also discussed in Chapter 10). Create a scatterplot by plotting the points on the graph for each subject, such that each subject is represented by one point on the graph. If there is a relationship between the two variables, the dots should begin to resemble a line going up diagonally, in the case of a positive relationship, or down, in the case of a negative relationship.

In June's study, the data on income and education levels are both at the ratio level. June believes that there is likely a relationship between years of education and income level, such that clients with higher education levels also have higher levels of income. She decides to plot these data on a scatterplot, with each point on the chart showing where the education level and income level intersect (Figure 3.5).

Figure 3.5 Scatterplot - Education and Income Levels

The scatterplot shows data from fourteen subjects. Each point on the graph represents one subject in the study. One subject, with 6 years of education, has an income of $2,000. The graph shows a definite pattern: the higher the education level, the higher the income level. June therefore concludes that there is a positive relationship between education and income. This type of positive relationship is discussed further in Chapter 10 when we look at the Pearson r correlation.

COMMON MISTAKES

The purpose of displaying data in the form of tables and graphs is to provide the reader with visual evidence to support an argument in a way that words alone cannot do. While this method of describing data is useful, it is also important to make sure that we are not misrepresenting the data. For instance, if June receives 5 new applications, each from families with children, she would be correct in stating that there was a 20% increase in the families with children on her waiting list. Five families does not seem like a lot, but 20% could send a misleading message. Another common mistake made by researchers is to include too much information in the table or graph. If a table is too complex, it will be hard for the reader to interpret the information. It is easy, with the aid of computers, to create complex tables with large amounts of data. As a general rule, it is much better to develop simple, easy-to-understand tables and charts.
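A small worked example, ours rather than the book's, makes the first point concrete: the same five new applications look quite different depending on whether the absolute numbers are reported alongside the percentage. The base of 25 families is the hypothetical list from Table 3.1.

families_before = 25       # e.g., the 25 families with children in Table 3.1
new_applications = 5       # new families with children this month

percent_increase = 100 * new_applications / families_before
print(f"Absolute increase: {new_applications} families")
print(f"Relative increase: {percent_increase:.0f}%")   # 20%

# Reporting both figures ("5 more families, a 20% increase") gives a fairer
# picture than the percentage alone.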
EXAMPLES OF REAL STUDIES

The following example, from Christine Marlow (2005: 240), provides a good illustration of how the various frequency distributions can be used. In this data set there are twenty-five university students with preschool-aged children. Each student represents one observation, and the study is interested in their number of children, their ethnicity, their expressed need for daycare and the number of miles they live from campus. The different ethnic groups are coded as follows: 1 = white, non-Hispanic; 2 = Hispanic; and 3 = African American. On the expressed need for daycare, a score of 1 is for the least need, while 4 is for the greatest need. The number of children and the miles from campus are represented by the actual numbers as reported by the students. In terms of level of measurement, the variable ethnicity is at the nominal level, the need for daycare is at the ordinal level, and the number of children and distance from campus are at the ratio level. Table 3.13 displays the four variables for each observation.

Table 3.13 Four Variables by Each Observation
Observation #   Number of Children   Ethnicity   Need for Daycare   Miles from Campus
1               2                    1           3                  2
2               1                    1           4                  1
3               1                    3           4                  10
4               1                    1           4                  23
5               1                    2           3                  4
6               1                    1           3                  2
7               2                    2           4                  1
8               2                    1           3                  1
9               1                    2           2                  6
10              3                    1           4                  40
11              2                    1           3                  2
12              1                    2           1                  1
And so on
Source: Marlow 2005: 240

Tables 3.14 to 3.17 describe the frequency distributions for the four variables.

Table 3.14 Frequency and Percentage Distribution - Ethnicity
Label                Value   Frequency   Percent
Non-Hispanic white   1       17          68
Hispanic             2       6           24
African American     3       2           8
Total                        25          100

Table 3.15 Frequency and Percentage Distributions - Need for Daycare
Label           Value   Frequency   Percent
No need         1       3           12
A little need   2       2           8
Some need       3       7           28
Great need      4       13          52
Total                   25          100

Table 3.16 Frequency and Percentage Distributions - Number of Children
Value   Frequency   Percent
1       13          52
2       11          44
3       1           4
Total   25          100

Table 3.17 Frequency and Percentage Distributions - Miles from Campus
Value               Frequency   Percent
Less than 5         14          56
5-9 miles           5           20
10-14 miles         2           8
15 miles and over   4           16
Total               25          100

The following example of a study on homelessness also provides a good illustration of the use of frequency tables and graphs to inform governments, policy analysts and the general public about this important social justice issue. In 2009, research was conducted by Bri Trypuc and Jeffrey Robinson, whose goal was "to help people across Canada have a better understanding of our homeless situation based on evidence rather than myths, and bring to the public's attention programs that work in helping the homeless. With better information, we hope Canadians can make informed decisions that will create results for those in need" (2009: 1).

This research found staggering numbers. There are an estimated 157,000 homeless people in Canada. Of the 20% who remain homeless for more than three months (and are therefore considered chronically homeless), life on the streets can lead to addiction, abuse and suicide. An estimated 1,350 homeless people die each year; the average life expectancy of a homeless person is 39 years. Not only is there a large personal cost to homelessness, but there is also a societal cost; it is estimated that Canada spends $1.1 billion, or $35,000 per person, per year to keep homeless people in shelters, jails or hospital emergency wards. As stated by Trypuc and Robinson, homelessness occurs in every region of Canada. Figure 3.6 is a bar graph showing the distribution of shelter use in the provinces and territories.
Figure 3.6 Distribution of Shelter Use by Province
Source: Trypuc and Robinson 2009: 4

The graph reveals that the prevalence of shelter use is highest in Alberta and lowest in Atlantic Canada. While the bar graph shows the prevalence of shelter use, the authors of the study also felt it necessary to put a human face to the statistics:

Andrew was a happy child, a boy scout and a good student in school. When he was 17 the voices began. His schizophrenia was difficult to control; he rebelled against medications which left him feeling numb. His family could no longer cope alone with Andrew's erratic behaviour and he went to live in a group facility. When this did not address his needs, Andrew struck off on his own. Living alone was too great a challenge, and without steady wages, Andrew was evicted from his apartment and became homeless.

Carrie has long blond hair and beautiful blue eyes and loves to read Dostoevsky. At age 8 her step-father began raping her. Living with ongoing sexual abuse, Carrie escaped from 'home' at age 16. Carrie lives on the streets with Patches, her part-Rottweiler dog. Patches is her only source of unconditional love and companionship, offering protection, trust and body heat. Dogs are not allowed in the emergency shelters, so for four years Carrie has lived in a make-shift shanty camp.

Rob drank his first beer with his dad when he was 11. Within 2 years he was an alcoholic, hiding his daily drinking from his parents. To pay for booze, Rob began stealing, starting first with petty theft escalating to bank robberies. By 20, Rob was in federal jail. For the next 22 years, he was either in jail or drinking, moving from job to job. Rob would be dry for some time, holding a job, but his alcoholism was always lurking, leading Rob to homelessness for years.

Homelessness is the result we see: people lining up at shelters, sleeping on park benches, and squatting in doorways. But the causes of homelessness are varied. Addiction, severe mental illness, and child abuse are primary causes of years living on the streets. In most every case homelessness is triggered by a single crisis beyond a person's control which cascades. Without effective early intervention and family or community support, people fall through the gaps, leading to a desperate life on the streets. Homelessness can happen to anyone. But for the Grace of God, we too could be homeless. Sadly, we may view the homeless through the distorting lens of morality or character, judging those living on the streets as lazy, undeserving or less worthy than ourselves. Worse is the attitude that people choose to be homeless. No one in their right mind would choose to be homeless, with its violence, stress and degradation. Sometimes sleeping on the streets is safer than being in a crowded emergency shelter. Homelessness reflects a failure in us and in organizations to provide appropriate and responsive care. It doesn't have to be this way. There are effective programs and services that work to intervene with those who are homeless. The waitlists of these programs, and their capacity to work with more people, is constrained only by the lack of funding. (2009: 2)
In a study published in 2005, Steve Pomeroy examined the question of the relative cost of addressing homelessness through institutional settings, such as hospitals, treatment centres, prisons, emergency shelters and hostel programs, as compared to community-based and affordable housing. Figure 3.7 provides a comparison of the approximate per diem costs averaged over four major cities: Halifax, Montreal, Toronto and Vancouver.

Figure 3.7 Comparative Costs of Responses to Homelessness (per diem cost, $0 to $400, for: Prison/Jail; Psychiatric Hospitals; Emergency Shelters; Emergency Shelters - Families; Emergency Shelters - Family Violence; Psych/Detox Treatment Centres; Transitional Supportive Group; Long-term Supportive Group; Board/Room House - Community Supports; Independent Apart. - Single; Independent Apart. - Family)
Source: Pomeroy 2005: iv

What the bar graph by Pomeroy clearly reveals is that it is much cheaper, and arguably more effective, to deal with the problem of homelessness by providing supportive and affordable housing than it is to keep people in shelters, hospitals and prisons.

SUMMARY

In this chapter, we introduced descriptive statistics. We showed how to display quantitative data in a variety of frequency distribution tables, including absolute, cumulative and percentage distributions. We described a few of the main types of tables, charts and graphs, such as bar graphs, histograms, pie charts and scatterplots. These illustrated how complex numerical data can be displayed in ways that are easily understood by academics, government officials and the general public. We added a word of caution that including too much complex data may reduce the effectiveness of a table or chart.

REVIEW QUESTIONS

1. Describe a situation where you would want to use a distribution table (absolute, percentage or cumulative) and why you would choose to present your data in this way.
2. Discuss each type of graph (bar graphs, pie charts, histograms, frequency polygons and scatterplots), indicating their strengths, weaknesses and the level of data each is appropriate for.
3. Provide an example of how data could be manipulated to present an inaccurate picture.
4. Discuss how tables and graphs may be used differently by a structural social work researcher compared to a researcher working from a mainstream empirical framework.

4 Central Tendency and Variability

In this chapter we look at measures of central tendency and examine ways in which we can describe a typical case. We show how to calculate the mode, median and mean and examine the impact of outliers. We also discuss the degree of variability that exists in a population, using the range, variance and standard deviation. Together, Chapters 3 and 4 provide enough information to allow researchers to use descriptive statistics to argue for social change.

Social work researchers and practitioners may want to describe things like the typical clients on their caseload, the range of scores on one of their cases, how their clients compare to others and how their clients compare to the general population. Let's look at the following hypothetical example.

Anne Matthews is a social worker for a municipal social assistance department. She has a large caseload of over a hundred people, the majority of whom are single mothers.
She is familiar with the literature that states that the rate of depression among low-income single mothers is proportionally much higher than it is in the general population. She decides to find out if this is true of the clients on her caseload. Recognizing that she is not able to conduct the research herself because of her conflict of interest in this situation, she approaches the School of Social Work at her local university and asks a social work research professor if some of her students would be interested in conducting the study. After getting the necessary approvals and clearance from the university's research ethics board, the student researchers ask the single mothers on Anne's caseload if they would be willing to participate in their study. They decide to use the Depression Scale, a well-known standardized self-administered (paper and pencil) measure of depression. The scale yields scores that range from 1 (lowest possible depression score) to 25 (highest possible depression score), with scores of 1-5 indicating no depression and scores greater than 20 indicating extreme depression. Of the 75 single mothers on Anne's caseload, 55 agree to participate, so the sample size is 55. After collecting the data, the students find that the mean (average) score of the single mothers is 8, well above what is considered within the normal range, and many of Anne's clients score over 10. This confirms for Anne what the literature indicates: that the single mothers on her caseload are far more likely to be depressed than people in the general population. What she is able to show as a result of the research is that a typical client on her caseload is a single mother in her early twenties with one preschool-age child, who has no permanent address, has been staying temporarily with friends and relatives and is clinically depressed. This description of a typical case provides the management of Anne's department with a real sense of the problem of homelessness involving female-headed, single-parent families and the level of depression that these clients are facing. It also raises concerns about the lack of support available for families with very young children.

MEASURES OF CENTRAL TENDENCY

In descriptive statistics, it is often useful to provide an idea of what a typical case in the population or sample being investigated looks like. Measures of central tendency describe typical cases in research. There are three accepted ways of describing what is typical: the mode, the median and the mean.

Mode

The mode is the value that occurs most often within the data. It can be used with data at all four of the levels of measurement: nominal, ordinal, interval and ratio. For example, the following data set displays test scores out of 10: 5, 3, 8, 5, 4, 7, 4, 5, 6, 6. As a first step to determining the mode (or median), we display the data set in an array (discussed in Chapter 3). An array organizes the data from the lowest value to the highest or the other way around. The array for the data above looks like this: 3, 4, 4, 5, 5, 5, 6, 6, 7, 8. With the data arranged from smallest to largest, it is easy to see which values occur more than once and which occur most often (the mode). In this example, the value 5 appears most often, and therefore the mode is 5.
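For larger data sets, the mode can be found without building the array by hand. A minimal sketch of our own, using the same ten test scores, is shown below; because it keeps every value tied for the highest count, it also reports both modes when a data set is bimodal, as in the example that follows.

from collections import Counter

scores = [5, 3, 8, 5, 4, 7, 4, 5, 6, 6]
counts = Counter(scores)
highest = max(counts.values())
modes = [value for value, count in counts.items() if count == highest]
print("Mode(s):", sorted(modes))   # [5] for this data set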
Data sets can be bimodal, with two values occurring most frequently. Consider the following array: 3, 4, 4, 5, 5, 5, 6, 6, 7, 7, 7, 8. The values 5 and 7 each appear three times. Therefore, this data set is bimodal, and we would report the modes as both 5 and 7. If we were to draw a histogram (as described in Chapter 3), it would have two distinct peaks (Figure 4.1). Data can also be multimodal, that is, have more than two modes.

Figure 4.1 Histogram - Bimodal Distribution

Of the three measures of central tendency, the mode has the fewest restrictions on its use. While it can be used with all levels of measurement, the mode is not used as often as the other measures. The problem is that, while it does identify the value that appears most often, this value may not be the most accurate portrayal of a typical value. When dealing with data at the ordinal, interval or ratio level, a more accurate description of central tendency can be obtained by using either the median or the mean.

Median

Unlike the mode, the median has the restriction that it can only be used for data that are at the ordinal, interval or ratio level. This is because the data must be at a level where they can be placed in a meaningful rank order, from lowest to highest or the other way around. The median is the halfway point in the list of values. It divides a set of values into two equal halves. If the values were arranged in an array, we would simply count the number of values and divide by two. For example, consider the following data set: 5, 3, 8, 5, 4, 7, 4, 5, 6. Displayed as an array, this data set is as follows: 3, 4, 4, 5, 5, 5, 6, 7, 8. The value at the halfway point is 5, and therefore 5 is the median. If there is an odd number of values, it is easy to pick the middle number. In our example there are four values below 5 and four values above 5. If there is an even number of values, the halfway point will fall between two values, as with the following data set: 3, 4, 4, 5, 6, 7, 7, 8, 8, 9. The halfway point is between the 6 and the 7. Therefore, the median is 6.5, the average of 6 and 7 (6 plus 7, divided by 2). Unlike the mode, the median may not correspond to an actual value in the distribution; the halfway point may lie in between two values.

The main advantage of the median is that it is not influenced by extreme values. It is often used for skewed distributions (discussed in Chapter 5) and thus is commonly used to describe income levels, because this type of data tends to be skewed. The main disadvantages are that it cannot be used for variables at the nominal level and it can be a bit more difficult to compute than the mode.

Mean

The mean, commonly referred to as the average, is the most common method of describing central tendency. It is easy to compute: add all the values together and divide by the number of values. It is viewed as the most accurate measure of central tendency because it uses all of the values in the data set in its computation; it is not simply the one in the centre of an array, as is the case with the median, or the one that appears most frequently, as is the case with the mode. Consider, for example, the following data set: 5, 3, 8, 5, 4, 7, 4, 5, 6, 6. If we add all the values, we arrive at a total of 53. Since there are 10 values, we divide 53 by 10, and we obtain a mean of 5.30.
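Python's built-in statistics module can confirm these hand calculations. A small sketch of ours, using the same data sets as above:

import statistics

odd_set = [5, 3, 8, 5, 4, 7, 4, 5, 6]
even_set = [3, 4, 4, 5, 6, 7, 7, 8, 8, 9]
scores = [5, 3, 8, 5, 4, 7, 4, 5, 6, 6]

print(statistics.median(odd_set))    # 5: the middle value of the array
print(statistics.median(even_set))   # 6.5: the average of the two middle values
print(statistics.mean(scores))       # 5.3: the total of 53 divided by 10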
If we were to where the other values draw are a histogram, this value would lie out¬ found. Of all of the tendency, the median is the least affected by outliers, but the Figure 4.2 Histogram with Outlier Histogram = 618 Std. De v - 3 .25 Mean N-11 scores measures mean of central certainly is. Central Tendency and Variability 49 Figure 4.2 illustrates the main disadvantage of the mean, which is that it is by outliers. It is also the most restrictive of the three measures of central tendency in the sense that it can only be used with interval or ratio level data. affected DECIDING WHICH TO USE The decision of which measure what describes the data most of central tendency to use is usually based on accurately. With nominal data, the mode is the only appropriate measure of central tendency. With ordinal data, it may be helpful to use both the mode and median, but we cannot use the mean in this case. With interval and ratio data that have skewed distributions, such as income values, the median is used most often. With interval and ratio level data that normally distributed mean is used most commonly and provides the most accurate measure of central tendency. In a perfect normal distribution, the mode, median and mean are all the same value (discussed in Chapter 5). Consider the following hypothetical example. are (bell shaped), the Assume that you are a social work student doing a placement at a community agency working with new immigrants, conceptualized by the agency as those who have been in the country five years or less. This agency has recently developed a Head Start Program for the preschool children in the community. The agency director is concerned that both recent immigrants and families with the lowest income levels are not accessing the pro¬ gram. She asks you to do a review of case files offamilies in the community to determine who is and who is not accessing the program. You and the director have come up with the following two research questions: 1. Who is 2. What is the income level of the You accessing the Head Start Program? people using the Head Start Program? randomly select 25 active case files to answer you construct Table 4.1 to describe your data. your research questions, and 50 Statistics for Social Justice Table 4.1 Five Variables by Each Observation Number of Times Number of Families Preschool Ethnicity Years in Head Start Canada Program in Children 1 2 Using Income (x 1000) Rounded to the the Previous Nearest Month 1,000 3 4 5 6 1 2 1 5 7 35 2 4 2 1 1 8 3 3 3 3 4 10 4 1 1 5 6 23 5 2 2 3 3 24 6 2 1 2 3 15 7 4 2 1 1 8 8 2 1 2 3 16 9 1 2 4 4 26 10 3 1 5 7 40 11 2 1 2 3 20 12 3 2 2 1 7 13 1 1 1 2 23 14 2 3 2 2 7 15 3 3 2 4 8 16 1 1 3 4 19 17 1 1 4 6 15 18 2 1 3 4 25 19 1 2 2 1 9 20 2 4 2 1 8 21 2 1 3 4 22 22 1 2 4 7 36 23 2 1 4 5 31 24 1 1 3 4 28 25 2 1 2 3 18 Total 50, Mean 2.00 Note that columns 2, 5 Mode 1 Total 70 Total 90 Mean 2.80 Mean 3.6 and 6 contain the actual numbers Median 19 reported by the par¬ ticipants. For the data on the ethnicity of recent immigrants (column 3), 1 stands as Central Tendency and Variability for 51 European, 2 is for South Asian, 3 is for Arabic, 4 is for Chinese, and 5 is for ethnicity is at the nominal level of measurement, and the other other. The variable variables are at the ratio level. Because the variable for the number of preschool children is at the ratio level, any of the three measures of central tendency. 
However, the one that is used most often for ratio level data is the mean, which is calculated by adding up the total number of children (50) and dividing by the number of families (25). Therefore, the mean number of children is 2.00 (50 4- 25 = 2.00). we can use Table 4.2 Frequency and Percentage Distribution - Ethnicity Ethnicity Value Frequency Percentage European 1 14 56 Arabic 2 7 28 Chinese 3 3 12 Other 4 1 4 25 100 Total For the variable ethnicity, since we are dealing with nominal level data, the only of central tendency that is appropriate is the mode. Table 4.2 clearly shows that the value 1 occurs most frequently. Therefore, the mode for the ethnicity of measure immigrants is 1, which represents immigrants who identify as European. in Canada is at the ratio level of measurement; therefore, any of the three measures of central tendency can be used. We chose to use the mean, recent The variable years number of years in Canada, as it is the most often used and the one typical client, given that there are no outliers within this variable. The mean for this variable is 2.8 years in Canada (70 4 25 = 2.80). The variable for the number of times new immigrant families used the Head Start Program is also at the ratio level and any of the measures of central tendency or average which will best represent a can be used. We chose the Head Start to use the mean number of times families have attended Program which is 3.6 (90 4 25 The variable income is also = 3.6). the ratio level, at but instead of reporting the chose to report the median. Remember that the mean, we income is to use the median as the measure of central common practice with tendency as it is often more representative of the sample or population. This is because income distribution in the general population has a positively skewed distribution, making the median a accurate measure of a typical case. In our sample of 25 families, the median more income level is to also combine the values into four income be able to quickly examine those in the lowest income bracket (income $10,000). groups, to less than $19,000. We decide 52 Statistics for Social Justice Let’s original research questions: Who is accessing the Head Program? and What is the income level of the people using the Head Start Program? We can now describe the typical family in our sample of agency clients that use the Head Start Program. For the most part, they come from either Europe or Arabic speaking countries, have been in Canada between 1 and 3 years, and have return to our Start a median income of $ 19,000. With respect to the director s concern that those with the lowest income (con¬ ceptualized as those with an annual income of less than $ 10,000) are not using the Head Start Program as often, we can see that those families only used the Head Start Program once as compared to an overall mean of almost 4 times (see Table 4.3). Table 4.3 Frequency - Use of Head Start Program by Each Low-Income Family Income Frequency of Use 7,000 1 7,000 2 8,000 1 8,000 1 8,000 1 8,000 4 9,000 1 Mean 1.6 This confirms the held by the director that the lowest income families making much use of the program. It would now be important to find out why. Is it because of the transportation costs or are there other reasons? are concern not VARIABILITY Variability, also called dispersion, tells us how the values are distributed, whether all the values are close to the mean or spread apart. Why is it important to describe variability? 
If we provide only the measure of central tendency it may not give us a complete picture of the cases in our sample or population. For instance, in the previous example of single-female-headed families, we indicated that the mean score on the Depression Scale was 8, but how many clients scored close to this mean? If most clients scored between 5 and 10, somewhat this would indicate that most are clinically depressed. But what if many clients scored within the normal range (below 5) and many others scored at the more severe range of depression (above 20)? This would give us a different picture of the clients in our population and may require a different approach. Central Tendency and Variability There five 53 of variability: range, interquartile range, mean deviation, chapter, we focus on those that are used range, variance and standard deviation. are measures variance and standard deviation. In this most often: Range The range is the difference between the lowest and highest values. It is simply cal¬ culated by subtracting the minimum value from the maximum value and adding are included. Consider the following data set: 5, 3, 8,5,4,7,4, 5,6, 6. We start by displaying this data set as an array: 3,4,4, 5, 5,5, 6, 6, 7, 8. Next we calculate the range: 8 - 3 + 1 = 6. For this data set, the range is 6. The main disadvantage of the range is that it is influenced by outliers. Consider the array from the previous example, except with an outlier: 3, 4, 4, 5, 5, 5, 6, 6, 7, 12. Now the range would be 10 (12 - 3 + 1 = 10). This data set has only one different value (changing an 8 to a 12), and the range increases from 6 to 10, indicating that there is more variability within the data. 1; 1 is added so that all the values Variance be defined Variance can from the mean. the number of It is the scores deviation and is a as the average sum of the of the dispersion of the individual scores squared deviations from the mean divided by minus 1. It is the basis of the calculation for the standard fundamental way of describing the variability of a normally distributed variable. It is calculated using the following formula. s2 Zfx-y)2 n-i = where: s2 = Variance of the £ = Sum of x = Individual x = Mean of the n = Number of participants Consider the data set is sample raw score sample following array: 2,3,3,4,4,4,5,5,6. The total of the values in this are 9 values, we divide 36 by 9, which gives us a mean 36. Because there of 4. Next, we calculate the variance as subtract the from it, and square follows: we take each value in the data set, this value. This gives us what is called the squared deviation from the mean. (This makes sense as it is simply how much the values are deviating from the mean, squared.) The reason we have to square the difference is because if we added up the differences without squaring them, we would come up with 0 every time. Next, we total the squared deviations and mean 54 Statistics for Social Justice divide this value by the number of participants (in this case 9) minus 1. This value is what is called the variance. x - x x-x (x-x)2 2 - 4 -2 4 3 - 4 -1 1 3 - 4 -1 1 4 - 4 0 0 4 - 4 0 0 4 - 4 0 0 5 - 4 1 1 1 5 - 4 1 6 - 4 2 4 Total 12 12 h- Box 4.1 Note on (9-l) = 1.50 (variance) Calculating Formulas This is the first time in this text that we introduce a formula. In order to use this formula properly, you need to ensure that you are following the order of operations, which states the order in which different parts of the equation must be calculated. 
A common acronym used to remember the order is bedmas: brackets, exponents, division, multiplication, addition, subtraction. There are a number of YouTube videos that do a good job of explaining this, if you need a refresher. A second practice you may wish to review is the proper application of rounding principles. Keep in mind that we usually want to keep two decimal places. When we round, we typically use the rule that a 5 or more causes the number to the left to round up, whereas a 4 or less causes it to round down. For example, 1.546 would round up to 1.55, whereas 1.544 would round down to 1.54. THE STANDARD DEVIATION Standard deviation is by far the most commonly used measure of variability in by getting the square root of the variance. The formula statistics. It is calculated is as follows: EC* n - *)2 1 Notice that this formula is very ference is that you similar to the formula for variance; the only dif¬ take the square root of the calculated number as the last step Central Tendency and Variability (therefore, the square root variance score above of 1.50, we Table 4.4 Depression Scores 55 of the variance is the standard deviation). Using the obtain a standard deviation of 1.22, by simply taking the square root of the variance (Vl-50 = 1.22). How does the standard deviation help us to understand what a sample or a population looks like? Let’s use the example of the research on depression levels of single mothers, described at the beginning of this chapter. To make the calcula¬ tion of the standard deviation a bit easier, let’s take a sub-sample of 10 participants from the 55 participants who originally agreed to be part in the study and display their score in a table (Table 4.4). Depression Client Client Depression Score Score Mary 6 Jane 7 June 8 Shelley 7 Susan 10 Tania 8 Julie 8 Sonia 8 Martha 9 Tina 9 If we add up these sample), we obtain obtain a scores a mean and divide by 10 (the number of participants in our of 8. If we then calculate the standard deviation, we standard deviation of 1.15. x-x (x-x)x 6 8 -2 4 8 8=0 0 10 8=2 4 = 8 8=0 0 9 8=1 1 7 8=1 1 7 8=1 1 8 8=0 0 8 8=0 0 9 8=1 1 Total 12 -r (10 - l) = 12 1.33 (variance) V 1.33= 1.15 therefore the standard deviation is 1.15 Statistics for Social Justice 56 This suggests that the variability within the data is fairly narrow. This becomes display these data in a frequency table (Table 4.5) and a his¬ togram (Figure 4.3). The histogram is particularly effective at demonstrating that there is little variability within the data, as it is easy to see that the data are all close more to the evident if we mean Table 4.5 and the mode. Frequency and Percentage Distribution - Depression Scores Cumulative Frequency Percentage 6.00 1 10.0 10.0 7.00 2 20.0 30.0 8.00 4 40.0 70.0 Percentage 9.00 2 20.0 90.0 10.00 1 10.0 100.0 Total 10 100.0 Figure 4.3 Histogram - Depression Scores Histogram score Central Tendency and Variability 57 SUMMARY In this chapter, we covered two important ways of describing data: measures of tendency and variability. We explained that there are three measures of central tendency: mode, median and mean. We showed how to calculate each and examined the impact ofoutliers. We also looked at the measures of variability, including range, variance and standard deviation. The information covered in this chapter and Chapter 3 provide what is needed to make use of descriptive statistics in order to advocate for social change. central REVIEW QUESTIONS 1. Discuss the 2. tendency. 
SUMMARY
In this chapter, we covered two important ways of describing data: measures of central tendency and variability. We explained that there are three measures of central tendency: mode, median and mean. We showed how to calculate each and examined the impact of outliers. We also looked at the measures of variability, including range, variance and standard deviation. The information covered in this chapter and Chapter 3 provides what is needed to make use of descriptive statistics in order to advocate for social change.

REVIEW QUESTIONS
1. Discuss the strengths and weaknesses of each of the three measures of central tendency.
2. Explain how different measures of central tendency could paint different pictures of a typical case.
3. Describe the effect that outliers have on data as they relate to describing a typical case. How might this work in the researcher's favour at times and to their disadvantage in other cases?
4. Describe a situation where a social work researcher might use a measure of variability to advocate for social change.
5. Self-reflexive exercise: take a few minutes to think about your work to date in the social work profession. Have you encountered descriptive statistics? If so, when and where, and how were they being used? How did you react to this information? Were the numbers overwhelming, or did they make sense to you? How might your past experiences shape when and how you use descriptive statistics moving forward in your practice?

5 Probability and the Normal Distribution

In this chapter, we move into the area of inferential statistics, which are used when we want to generalize our findings to a larger population based on a sample. We begin by introducing one of the fundamental aspects of inferential statistics, that is, the concept of probability, and the rules related to determining the different levels of probability. We explain how the normal distribution can be used to establish the likelihood or probability of an outcome occurring based on all possible outcomes. Finally, we show how to calculate z scores and percentiles and how they can be used to make comparisons.

If the sample has been selected using a probability approach, meaning that each individual within the population of interest had an equal chance of being selected, then we can infer the characteristics of the population based on the characteristics of our sample. Inferential statistics can also be used to tell us if there is a relationship between two or more variables, for instance, if the independent variable (for example, alcohol consumption) is likely to be related to a change (increase or decrease) in the dependent variable (for example, injection drug use).

PROBABILITY AND SOCIAL JUSTICE
In a completely egalitarian society, everyone's life chances would be the same; however, we know that this is far from the case. We also know that children growing up in poverty, people of colour, First Nations people and other marginalized people may well not have the same life chances and opportunities as people who are raised in an affluent, White, mainstream family. The German sociologist Max Weber coined the term "life chances" to explain how someone's class influences their chances of achieving their goals in life.

   It is the most elemental economic fact that the way in which the disposition over material property is distributed among a plurality of people meeting competitively in the market for the purpose of exchange, in itself creates specific life chances. (1978: 927)

Eric Krieg (2012) uses the term "social capital" to refer to the number and range of connections with people who have political and financial influence. Those with greater social capital obviously have greater life chances. There are many examples of the use of probability figures to help describe important social justice related issues.
For instance, in an article published by Forbes Magazine online, Daniel Fisher (2012) points out that while 79% of students in the United States from the top income quartile (incomes over $98,875) obtained a bachelor's degree, only 11% of students from the lowest quartile (incomes below $33,000) had the same level of education. In other words, the probability of students from high-income families earning a bachelor's degree is roughly eight times higher than that of students from low-income families.

Another example involves poverty rates among First Nations children. In a study conducted in 2013 for the Canadian Centre for Policy Alternatives, David Macdonald and Daniel Wilson state that 40% of First Nations children in Canada live in poverty. The authors also point out that the probability of a First Nations child living in poverty is two and a half times that of the general population of children in Canada.

BASIC LAWS OF PROBABILITY
Probability is one of the fundamental concepts of statistical analysis. It is based on certain basic laws. One such law is that within repeated observations of a phenomenon, a certain pattern will be evident. For instance, let's say that the height of every grown Canadian male is measured and the average height is found to be 175 cm. While there may be large fluctuations in the height of individual males, with a large enough sample, the laws of probability indicate that the average height of men in the sample will be approximately 175 cm. Another example is the coin toss. Because there are only two possible outcomes, heads or tails, if you toss a coin often enough, in the long run, you should get roughly an equal number of heads and tails. Of course, if we only toss a coin 10 times, it is quite possible that we get 6 or 7 heads and only 4 or 3 tails.

Another basic law is that probability is expressed in a range from 0 (never) to 1 (complete certainty). Since there are only two possible outcomes in a coin toss, the probability of getting heads is .5. On a multiple choice question with four possible outcomes, the probability of getting a correct score is .25. Probability is used to forecast weather, anticipate economic growth and predict election results. It is also used by social workers interested in social policy because it can predict the impact of social policies affecting thousands of people. Calculating probability is therefore an effective and essential tool for social justice oriented social workers.

Calculating Probability
Probability is represented by the letter p and is expressed as a proportion of 1.0. Remember that since there are two possible outcomes, the probability of getting heads in a coin toss is .5 (p = .5). The formula for calculating probability is the following:

p = Number of outcomes in the event ÷ Number of all possible outcomes that can occur

Turning to our example of the coin toss, since there are two sides to the coin, the formula is expressed as follows:

p = 1 ÷ 2 = .5

Another example often used to explain how to calculate probability is rolling dice. Let's say we want to calculate the probability of rolling a die and getting a 3. Since there are 6 possible outcomes with a die, the probability of getting a 3 is .167:

p = 1 ÷ 6 = .167
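Both of these calculations are a single division, which the short Python sketch below makes explicit. This is our own illustration, not part of the original text, and the function name is ours.

```python
def probability(event_outcomes, possible_outcomes):
    # p = number of outcomes in the event / number of all possible outcomes
    return event_outcomes / possible_outcomes

print(probability(1, 2))   # coin toss, getting heads: 0.5
print(probability(1, 6))   # rolling a 3 on one die: about 0.167
print(probability(1, 4))   # guessing a four-option multiple-choice item: 0.25
```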
Addition Rule
There are a few basic rules to follow when calculating probabilities. The first is the addition rule. We know that the probability of flipping a coin and getting heads is .5, and that the probability of getting tails is also .5. By adding these two probabilities we obtain 1.0 (p = .5 + .5 = 1). This addition rule applies because of what are called disjoint events. Disjoint events are those where, when one occurs, no other event can occur: the events are mutually exclusive. Flipping a coin and getting a head precludes the possibility of getting a tail.

The addition rule can be applied to the example of rolling dice. If we want to know the probability of obtaining either a 3 or a 4, and we know that each of these two events has a probability of .167, we can add these probabilities together to get our answer. Adding these two together, the probability of obtaining either a 3 or a 4 is .334 (p = .167 + .167 = .334). As a word of caution, Krieg (2012) points out that we need to be aware of rounding errors. If we were to add up the fractions of rolling either a 3 or a 4, as opposed to adding the individual probabilities, it would be as follows:

1/6 + 1/6 = 2/6 = .333

A p of .334 as opposed to .333 may not seem like a big difference, but it could be important. As Krieg suggests, the best way to avoid rounding errors is to add frequencies (e.g., 1/6) instead of the individual probabilities (e.g., .167).

The Multiplication Rule
The next rule we have to consider in calculating probabilities is the multiplication rule. To understand this rule we must first look at what are called independent events. These are events where the occurrence of each event has no impact on the occurrence of any other event. If we were to calculate the probability of multiple independent events, we would simply multiply the probability of each event with the others. Looking at the example of a coin toss, the probability of getting one heads is .5. Getting heads on the first toss has no impact on getting a second heads with the second flip of the coin. The probability of getting two heads, then, is .25, or .5 times .5. The same applies to rolling dice. If we want to know the probability of getting two 3s by rolling two dice each just one time, the probability of the first event (rolling a 3) is .167, times the probability of the second event (rolling a 3), which is also .167, equals .028.

The last thing to consider within the multiplication rule is probability with or without replacement. Replacement involves selecting a case for one event and then replacing this case for the second event. The example used by Krieg is the alphabet. If we put all 26 letters of the alphabet in a bag and select one, we would be left with 25 letters. The probability of obtaining any one specific letter, say an A, is 1 ÷ 26, or .038. If we replace this same letter in the bag, the probability of obtaining a B is again 1 ÷ 26, or .038. However, if we did not replace the A, the probability of obtaining a B is 1 ÷ 25, or .04. This is because there are now only 25 letters left.
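The following Python sketch (ours, not from the text) works through the addition rule, the multiplication rule and the replacement example using exact fractions, which, as Krieg recommends, sidesteps the rounding problem noted above.

```python
from fractions import Fraction

p_three = Fraction(1, 6)   # rolling a 3 on one die
p_four = Fraction(1, 6)    # rolling a 4 on one die

# Addition rule (mutually exclusive events): a 3 OR a 4 on one roll
print(p_three + p_four)            # 1/3, i.e., .333

# Multiplication rule (independent events): a 3 on each of two rolls
print(p_three * p_three)           # 1/36, i.e., about .028

# With and without replacement, using the alphabet example
print(Fraction(1, 26))             # drawing a B after replacing the A: about .038
print(Fraction(1, 25))             # drawing a B without replacing the A: .04
```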
What do these calculations have to do with social justice issues? As Krieg points out, it is a small step from calculating probability to analyzing social data. Going back to the example of First Nations children in the study conducted by Macdonald and Wilson (2013), they found that 40% of First Nations children live in poverty. The probability of a First Nations child living in poverty is therefore 4 ÷ 10, or .4.

Assuming that we have a random sample of 25 female-headed families in need of public housing, as described in Table 3.3 on housing needs and numbers of children, out of a total of 25 families, the number of families with one child is 11. The probability of having a family with one child is 11 ÷ 25, or .44. In Table 3.3 we found that there were 8 families with 2 children. If we wanted to know the probability of having a family with either one or two children, we would add the probability of each: 11/25 + 8/25 = 19/25 = .76. If our sample was large enough, we could generalize these results to future service users, suggesting that there is a .44 (44%) chance that any new service user will have one child and a .76 (76%) chance that they will have either one or two children.

THE NORMAL DISTRIBUTION
The normal distribution can be used to establish the likelihood or probability of an outcome occurring based on all possible outcomes. It is thus an essential tool in statistics. However, it is important to understand that the normal distribution exists only as a theoretical concept. Although many distributions approximate the shape of the normal distribution, it is an abstract ideal.

Figure 5.1 The Normal Distribution (68.26% of cases fall within one standard deviation of the mean, 95.44% within two and 99.74% within three). Source: Lui n.d.

The normal distribution has a number of properties that help in calculating probabilities:
1. The mean, median and mode are all at the highest point and in the center of the curve.
2. The total area under the curve equals 1.0, or 100%: 50% below the mean and 50% above the mean.
3. The normal distribution contains six standard deviations, three above the mean and three below.
4. The ends of the curve never touch the horizontal axis, reflecting the fact that there may be extreme values (outliers) beyond the three standard deviations.
5. The distance of one standard deviation above the mean is equal to the distance of one standard deviation below.

As illustrated in Figure 5.1, approximately 34% of cases fall between the mean and one standard deviation above the mean. Similarly, 34% of all cases fall between the mean and one standard deviation below the mean. Adding these up, we can say that about 68% of all cases fall within one standard deviation of the mean. As we move further from the mean, more and more cases are included. If we were to look at three standard deviations from the mean, above and below, we would be including approximately 100% of all cases.
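These percentages can also be checked empirically. The simulation below is our own illustration (the sample size and random seed are arbitrary choices): it draws a large number of values from a normal distribution and counts how many fall within one, two and three standard deviations of the mean, which should come out close to 68.26%, 95.44% and 99.74%.

```python
import random

random.seed(1)
mean, sd, n = 0, 1, 100_000
values = [random.gauss(mean, sd) for _ in range(n)]

for k in (1, 2, 3):
    # Share of simulated cases within k standard deviations of the mean
    within = sum(1 for v in values if abs(v - mean) <= k * sd)
    print(f"within {k} SD: {within / n:.2%}")
```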
Kurtosis
Kurtosis is the degree to which a distribution is peaked as opposed to flat. Distributions that are peaked, that is to say with narrow variability, are described as leptokurtic (also called positive kurtosis). Distributions that are flat, that is to say with wide variability, are described as platykurtic (also called negative kurtosis). Distributions that are neither peaked nor flat, and most closely resemble a normal distribution, are described as mesokurtic.

Figure 5.2 Kurtosis. Source: Signal Trading Group 2014

Skewed Distributions
Some distributions are skewed. When they are skewed to the right, the slanting tail is on the right side and more data than would be expected are on the left of the mean. Such a distribution is called positively skewed. When they are skewed to the left, the slanting tail is on the left side and more data than would be expected are on the right of the mean. Such a distribution is called negatively skewed. As stated in Chapter 4, income is a typical example of a positively skewed distribution, with most people being grouped in the lower end of the distribution. When a curve is free of skewness it is said to be symmetrical; a distribution that is symmetrical and bell shaped is called a normal distribution.

Figure 5.3 Skewed Distributions: (a) a positively skewed distribution; (b) a negatively skewed distribution. Source: Dean and Illowsky 2008

Z SCORES
Another important concept used in statistics is the z score, also called the standard score. Z scores convert raw scores to a common standard score, which allows us to make direct comparisons on measurements taken from two different populations, across different research studies or using different measurement scales. Z scores tell us where a score sits in relation to the rest of the population described in a normal distribution. They are used for calculating the results of such measurement instruments as the Scholastic Aptitude Test (SAT) and the Graduate Record Exam (GRE).

To calculate the z score, simply divide the raw score minus the mean by the standard deviation:

z = (raw score − mean) ÷ standard deviation

The following calculation shows the z score for a raw score of 81 when the mean is 75 and the standard deviation is 5:

z = (81 − 75) ÷ 5 = 1.2

The following calculation shows the z score for a raw score of 70 when the mean is 75 and the standard deviation is 5:

z = (70 − 75) ÷ 5 = −1.0

Percentiles
Another way in which z scores are used is to calculate percentiles. For instance, if you score 56% on a test, you may feel that you have done poorly. However, if you find out that the percentile of your score was 96, you would probably feel a whole lot better. Why? Because 96% of the students did more poorly than you did. Scores on measurement instruments such as SATs and GREs are normally expressed as percentiles.

Convert z scores to percentile scores by using a z table, which describes the area under the normal curve. If the z score is positive, you add 50. If the z score is negative, you subtract the number from 50. A little practice is essential.

What is the percentile if the z score is 1.2? Using Table 5.1, which describes the area under the normal curve, the number indicated for 1.2 is 38.49. Since the z score is positive, we add 50 (38.49 + 50 = 88.49). The percentile is 88.49.

What is the percentile if the z score is −1.0? Again using Table 5.1, the number indicated for 1.0 is 34.13. Since the z score is negative, we subtract 34.13 from 50 (50 − 34.13 = 15.87). The percentile is 15.87.
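The same conversion can be done in software. The Python sketch below (ours, not from the text; it requires Python 3.8 or later for NormalDist) computes the z score and then uses the cumulative normal distribution, which is what Table 5.1 tabulates, to return the percentile directly. Any small differences from the table are due only to rounding.

```python
from statistics import NormalDist

def z_score(raw, mean, sd):
    # z = (raw score - mean) / standard deviation
    return (raw - mean) / sd

def percentile(z):
    # Area under the normal curve below z, expressed as a percentage
    return NormalDist().cdf(z) * 100

z = z_score(81, mean=75, sd=5)          # 1.2
print(round(percentile(z), 2))          # about 88.49

z = z_score(70, mean=75, sd=5)          # -1.0
print(round(percentile(z), 2))          # about 15.87
```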
Box 5.1 Suggestions for Using a z Table
Table 5.1 shows the area under the normal curve. It is also known as a z score table. First, find the z score that you have calculated, to the first decimal point, within the first column (remember, columns are vertical, or go up and down). Next, find the second decimal point in the first row on top (remember, rows are horizontal, or go from side to side). Last, find where the column and row intersect and use that score to complete your calculation.

For example, say you've calculated a z score of 1.78. Look up 1.7 in the first column and .08 in the first row. Find where that row and that column intersect and you've found the value that you need to continue your calculations: 46.25. Since this z score is positive (+1.78, not −1.78), add the number found in the table (46.25) to 50 to arrive at the percentile score (46.25 + 50 = 96.25). The percentile score is therefore 96.25.

Table 5.1 Area under the Normal Curve (z score table)
z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   00.00  00.40  00.80  01.20  01.60  01.99  02.39  02.79  03.19  03.59
0.1   03.98  04.38  04.78  05.17  05.57  05.96  06.36  06.75  07.14  07.53
0.2   07.93  08.32  08.71  09.10  09.48  09.87  10.26  10.64  11.03  11.41
0.3   11.79  12.17  12.55  12.93  13.31  13.68  14.06  14.43  14.80  15.17
0.4   15.54  15.91  16.28  16.64  17.00  17.36  17.72  18.08  18.44  18.79
0.5   19.15  19.50  19.85  20.19  20.54  20.88  21.23  21.57  21.90  22.24
0.6   22.57  22.91  23.24  23.57  23.89  24.22  24.54  24.86  25.17  25.49
0.7   25.80  26.11  26.42  26.73  27.04  27.34  27.64  27.94  28.23  28.52
0.8   28.81  29.10  29.39  29.67  29.95  30.23  30.51  30.78  31.06  31.33
0.9   31.59  31.86  32.12  32.38  32.64  32.90  33.15  33.40  33.65  33.89
1.0   34.13  34.38  34.61  34.85  35.08  35.31  35.54  35.77  35.99  36.21
1.1   36.43  36.65  36.86  37.08  37.29  37.49  37.70  37.90  38.10  38.30
1.2   38.49  38.69  38.88  39.07  39.25  39.44  39.62  39.80  39.97  40.15
1.3   40.32  40.49  40.66  40.82  40.99  41.15  41.31  41.47  41.62  41.77
1.4   41.92  42.07  42.22  42.36  42.51  42.65  42.79  42.92  43.06  43.19
1.5   43.32  43.45  43.57  43.70  43.83  43.94  44.06  44.18  44.29  44.41
1.6   44.52  44.63  44.74  44.84  44.95  45.05  45.15  45.25  45.35  45.45
1.7   45.54  45.64  45.73  45.82  45.91  45.99  46.08  46.16  46.25  46.33
1.8   46.41  46.49  46.56  46.64  46.71  46.78  46.86  46.93  46.99  47.06
1.9   47.13  47.19  47.26  47.32  47.38  47.44  47.50  47.56  47.61  47.67
2.0   47.72  47.78  47.83  47.88  47.93  47.98  48.03  48.08  48.12  48.17
2.1   48.21  48.26  48.30  48.34  48.38  48.42  48.46  48.50  48.54  48.57
2.2   48.61  48.64  48.68  48.71  48.75  48.78  48.81  48.84  48.87  48.90
2.3   48.93  48.96  48.98  49.01  49.04  49.06  49.09  49.11  49.13  49.16
2.4   49.18  49.20  49.22  49.25  49.27  49.29  49.31  49.32  49.34  49.36
2.5   49.38  49.40  49.41  49.43  49.45  49.46  49.48  49.49  49.51  49.52
2.6   49.53  49.55  49.56  49.57  49.59  49.60  49.61  49.62  49.63  49.64
2.7   49.65  49.66  49.67  49.68  49.69  49.70  49.71  49.72  49.73  49.74
2.8   49.74  49.75  49.76  49.77  49.77  49.78  49.79  49.79  49.80  49.81
2.9   49.81  49.82  49.82  49.83  49.84  49.84  49.85  49.85  49.86  49.86
3.0   49.87
3.5   49.98
4.0   49.997
5.0   49.99997
Source: Fisher and Yates 1963

EXAMPLES OF SOCIAL JUSTICE ISSUES
Normal distributions and standard deviation scores can be used to identify important social justice issues. For instance, in a study conducted by Justin Doubleday for the Chronicle of Higher Education, percentiles are used to show that students from low-income families score much lower than students from high-income families on standardized tests.

   Critics of standardized tests (such as SATs) contend that the examinations are unfair to students from low-income backgrounds. Many of those students, they argue, don't have the same access to advanced classes and test-preparation materials as their more-affluent peers do. The College Board's report showed that test takers in the lowest income percentile, whose families make less than $20,000 per year, averaged a score of 1326, well below the mean. The average score for students from families who make more than $100,000 was 1619. (2013: 1)

A study conducted by Jane Friesen and Brian Krauth from Simon Fraser University in British Columbia uses standard deviation z scores to show the gap between First Nations and non-First Nations students.
   First Nations students in grade 7 score on average more than 0.6 standard deviations below non-First Nations students on ... Foundation Skills Assessment (FSA) exams. The results by quartile are similar: the achievement gap ranges from 0.51 to 0.77 standard deviations. Among students who wrote the FSA numeracy test in both grades, the gap between the mean test scores of First Nations and non-First Nations students grew by an additional 0.05 standard deviations between grades 4 and 7, and the reading test score gap grew by 0.09 standard deviations. (2010: 7)

SUMMARY
In this chapter we covered a number of key aspects of analysis used for inferential statistics. The first was probability; establishing the probability of a relationship existing between variables is the underlying purpose of any inferential statistical analysis. We looked at the normal distribution, its properties and how it can be used to help identify the probability of any possible outcome occurring. We showed how to calculate z scores and percentiles and how they can be used to make comparisons.

REVIEW QUESTIONS
1. Describe the addition rule and the importance of mutual exclusivity.
2. Imagine you are at a casino at a roulette table with a friend. You can see that "red" has come up the past 8 spins. Knowing that you are taking a statistics course, your friend asks you for advice on placing their bet. They ask, "I should probably bet a lot on black, right? Since red has come up the past 8 times, black is due to come up!" What advice would you give this friend? Explain your reasoning.
3. Explain the importance of specifying whether there was or was not replacement for subsequent events when calculating probability. Create an example that relates to social justice oriented research.
4. Discuss the meaning of the number 99.74 as it relates to normal distributions.
5. Come up with an example of a variable which you think is skewed. Is it positively or negatively skewed? Why do you think this is?
6. Describe the difference between obtaining a 90% as a final mark in a statistics course, versus being in the 90th percentile in a statistics course.

6 Hypothesis Testing

In this chapter, we cover the basic concepts involved in hypothesis testing, including testing for relationships, design flaws, statistical significance, testing the null hypothesis, the two different types of hypotheses and, finally, the two types of research errors.

Hypothesis testing involves following a specific set of statistical procedures to determine if there is a relationship between two or more variables. There are many kinds of relationships that could be of interest to social workers; some involve cause and effect, but within the social sciences, many do not. For instance, a researcher may determine through a study that there is a relationship between poverty and crime. However, she cannot say that this is a cause and effect relationship; she cannot say that poverty causes someone to become involved in crime.

In terms of relationships that do involve cause and effect, one of the most common forms of research carried out in social work is the program evaluation. Social workers are naturally interested to find out if there is a relationship between an intervention program and the outcomes for service users, in particular whether the program causes the desired outcomes. Social workers are also interested in the effects of social problems on citizens. For instance, the causes of poverty remain in dispute, with some people blaming the poor for their poverty and others blaming the lack of secure well-paying jobs with benefits.
Regardless of the causes, the effects of poverty on children are clear. For one thing, children who grow up in poor families experience more severe and persistent health problems (see Box 6.1).

CLASSIC EXPERIMENTAL DESIGN
As stated above, a common type of research carried out in social work is the program evaluation. If a community organization applies for funding from a government or a foundation to create a new program to meet an identified social need, it typically must agree to conduct a program evaluation. Many funding bodies will insist on a summative evaluation, also called an outcome evaluation, which relies on quantitative evidence to show that the program produces the desired measurable outcomes.

Box 6.1 The Impact of Poverty on Health Status of Children
Child poverty in Canada is a significant public health concern. Because child development during the early years lays the foundation for later health and development, children must be given the best possible start in life. Family income is a key determinant of healthy child development. Children in families with greater material resources enjoy more secure living conditions and greater access to a range of opportunities that are often unavailable to children from low-income families. On average, children living in low-income families or neighbourhoods have poorer health outcomes. Furthermore, poverty affects children's health not only when they are young, but also later in their lives as adults. The health sector should provide services to mitigate the health effects of poverty, and articulate the health-related significance of child poverty, in collaboration with other sectors to advance healthy public policy.
Source: Paul-Sen Gupta, de Wit, and McKeown 2007

Ideally, this type of program evaluation makes use of what is referred to as the classic experimental design. This type of research is considered to be the gold standard against which other research designs are compared. It is the only research design that will allow the people responsible for running the program to state conclusively that the program is effective and has produced the desired outcomes. If the experimental group scores significantly higher on the measurement instrument than the control group, this is generally accepted as proof that the program is effective. The classic experimental design must include the following three conditions:
1. The selection of participants for the study must make use of probability sampling (random sampling), meaning that every member within the population of interest has an equal chance of being selected.
2. There must be two groups of participants: the experimental group, made up of participants who are included in the program, and the control group, made up of participants who are not included in the program.
3. The measurement instrument used must have been tested for reliability and validity.

Box 6.2 highlights the use of an experimental program evaluation design. To conduct research using the classic experimental design, the researcher must be familiar with the concept of hypothesis testing. To illustrate the concepts related to hypothesis testing, consider the following hypothetical example. Suppose you are running a Head Start Program in your community to help preschool children from low-income families improve their school readiness and ultimately their chances of succeeding in school. Suppose also that you are interested in demonstrating
Suppose also that you are interested in demonstrating Hypothesis Testing Box 6.2 71 Evaluating the Effectiveness of a Social Skills Program for Preadolescents Peer relations play Social workers who are a major role in the social development of children and youth. well aware of the potential negative consequences for children socially excluded and they frequently work with these children concerning truancy, academic performance, delinquency and substance abuse. This article discussed the outcome of a social skills program designed to improve the social interaction ot fifth grade children. Results indicated that the treatment group made significant gains on sociometrics, observational and self-perception measures. The importance of using school social workers and staff and integrating social skills programs into elementary school curricula is also discussed. are issues of Source: Hepler 1994 the effectiveness of your Head Start Program. Because you want to determine if there and effect relationship between your Head Start Program and better school readiness, you decide to use the classic experimental design. The program would be the cause and the school readiness outcome would be the effect. To carry out this rigorous form of program evaluation, three basic conditions must be met in order to explain causality: is a cause 1. The two variables be empirically linked to one another. In other words, Program must be linked to the outcome of school readiness. must precede the effect in time. This means that the program must must the Head Start 2. The cause occur 3. The before the outcome of school readiness is measured. relationship between the factors cannot be explained by other factors. are no other explanations for the improve¬ You need to demonstrate that there ment in school readiness. TESTING YOUR RESEARCH HYPOTHESIS You begin by stating your research hypothesis. In this example, your research hypothesis is that your program is effective in helping preschool children improve their readiness for school. Next you identify and define your variables. As stated in Chapter 1, the term conceptualization refers to the process we use in choosing the variables and clearly defining the variables that are included in our study. In this case, the program is the independent variable, and school readiness is the dependent variable. The independent variable is the variable which is believed to affect the dependent variable, and it is the one manipulated in some way. You need to clearly describe your program and how you will define “school readiness,” such as preschool literary or math skills. Next you state how you will measure your vari¬ ables. Again turning to the definitions in Chapter 1, the term operationalization 72 Statistics for Social Justice refers the method used the variable. Your independent variable is possible levels, either children are in the program (they are the experimental group) or they are not in the pro¬ gram (they are the control group). For the dependent variable, school readiness, you find an instrument to measure the outcomes of your program. Let’s call it the Head Start Outcome Measure, and it measures preschool literary and math skills. These measurements give you data at the interval/ratio level of measurement. You want to establish that there is a relationship between the independent variable and the dependent variable. Let’s say that you were able to use a probability sampling method and ran¬ domly select two groups of preschool children from your community. 
One group is participating in your Head Start Program. This is your experimental group. The second group is not involved in the program. This is your control group. By using a control group, you are able to control for other possible variables which are not part of your study but which may have an impact on the outcome, such as differences in parental support or differences in natural academic ability. You test both groups at the start of the program with your Head Start Outcome Measure to show that both groups are similar. This is your pre-test. You then test both groups again at the end of the program, for your post-test. If the experimental group scores higher than the control group on the post-test, are you safe in concluding that there is a relationship between the independent variable and the dependent variable and that your program is effective? Maybe, but you need to consider other possible problems.

At this point you must be open to the fact that there may still be other explanations which may exist and which may also explain the relationship. Robert Weinbach and Richard Grinnell (2010) identify two other possible explanations: rival hypotheses and research design flaws.

Rival Hypothesis
You are testing your hypothesis that your Head Start Program is effective with preschool children and the results confirm that it is. However, there may be other explanations for the improvement in school readiness. The rival or alternative hypothesis suggests that some non-random cause, or some other variable, may be causing the relationship you have found. For example, if the children attending your program receive more family support in the form of learning activities at home, the relationship between the independent and the dependent variable may be explained by the family support as opposed to your Head Start Program. Another possibility is that variables co-vary, that is to say they interact to create a relationship. In our example, perhaps the family support together with the Head Start Program created the outcome.

Research Design Flaws
There may also be design flaws in your study. Two possible design flaws are measurement error and sampling bias. Measurement error may be the result of consistent distortion of the measurement of variables, which can distort the quality of the data and the subsequent analysis of the results. This could create a measurement bias that goes undetected. There may also be a systematic error, which occurs when the researcher is using an invalid measure. This may happen if your Head Start Outcome Measure is not a valid measure of preschool literacy or math skills. There may also be random error, which could occur because of mood or health changes on the part of the children.

The second type of design flaw is referred to as sampling bias, which is the systematic distortion of a research sample. What if the sample you selected is not typical of the population you are investigating? In our example, what if the children attending the Head Start Program are not typical of the children from low-income families in your community? For instance, perhaps because of the lack of a school bus service, many low-income parents may not be able to bring their child to your program, and only families who have access to a car are able to take advantage of the program. These possible design flaws can be dealt with before you begin your study by using proper research procedures.
Because you are using a control group, you are able to dismiss the possibility of rival hypotheses. If your research instrument, the Head Start Outcome Measure, is a reliable and valid measure of school readiness, you can also dismiss the possibility of measurement error. Finally, if you used random sampling to select your sample and were able to show that your sample is representative of the low-income families in your community, you can disregard the problem of sample bias.

Sampling Error
Even if you carefully followed the most rigorous design procedures, there is one possible problem that could still affect the quality of your data and your ability to show that there is a true relationship between your independent and dependent variables. This problem has to do with sampling error, which is the concept that there is a natural tendency for any sample to differ, if only slightly, from the population from which it was drawn. As you can imagine, it would be a very rare occurrence for the sample statistics and the population parameters to be identical. However, there are two ways of refuting sampling error: replication and inferential statistical analysis. Replication refers to doing a study over and over again. The more we repeat it, the more we can be sure that the results are true. If we repeat the study 100 times and we obtain the predicted outcome 95 times, we are safe in concluding that our hypothesis is confirmed. A cheaper and more practical method is to use inferential statistical analysis. If we rigorously follow scientific procedures, and we find significant results after running inferential statistical analyses, we can arrive at the same conclusion, that our hypothesis is confirmed. This brings us to the issue of statistical significance.

STATISTICAL SIGNIFICANCE
Significance is a common term used in everyday conversation. We might, for example, say that someone has made a significant contribution to a social work organization. We may also say that the results of our work are significant. However, in discussing statistics, significance has a specific meaning.

Weinbach and Grinnell (2010: 99) state: "Statistical significance is the demonstration, through the use of mathematics and the laws of probability, that the relationship between variables in a sample is unlikely to have been produced by sampling error." Regardless of the direction, if there is a low probability that the results we obtain are due to chance, we say that the results are statistically significant. It cannot be emphasized enough that there is always a chance, albeit a slim one, that the results could be the result of sampling error.

If we are trying to prove that a relationship exists, and we accept that because of sampling error there is always some chance that no relationship exists, at what point are we safe in accepting that there is a relationship? Within the social sciences, the accepted level is 95%. Looking back at the concept of probability, we said that probability is expressed as a range from 0, meaning that there is no relationship, to 1.0, meaning that there is a 100% chance that a relationship exists. Therefore, if the accepted level is 95% or higher, we can then say that there is less than a 5% chance (or a .05 probability) that no relationship exists. The .05 is our level of statistical significance.

THE NULL HYPOTHESIS AND THE REJECTION LEVEL
We mentioned that a hypothesis is a tentative answer to a research question derived through a review of the literature.
It is a testable statement. Looking at our example of the evaluation of our Head Start Program, our test hypothesis is that our program is effective in helping children from low-income families improve their readiness for school. However, in the interest of being scientifically rigorous, we must now introduce a new concept, the null hypothesis. The null hypothesis is a statement that no relationship exists between the variables in our study and, if one does seem to exist, it is simply occurring by chance. In scientific studies, we are, therefore, not directly interested in finding evidence to accept our test hypothesis. Instead, we are looking for reasons to reject the null hypothesis, which would allow us to say that the relationship that we have found between the variables has not occurred by chance.

This brings us to the concept of the rejection level. The rejection level is the level at which we are safe in rejecting the null hypothesis and concluding that a real relationship exists; this is the level of sampling error that we are willing to accept within our research. If there is less than a 5% chance that the relationship that we found is due to sampling error, then we are safe in rejecting the null hypothesis, and we can then say that the relationship is statistically significant. In social work, as is the case in most social sciences, a probability value of .05 is considered acceptable for our purposes. The p = .05 is our rejection level, also called the significance level. This level was chosen through convention.

TYPES OF HYPOTHESIS
There are two types of research hypotheses, and each has a function in rejecting the null hypothesis. These two types are called the one-tailed and two-tailed hypotheses. The term "tailed" refers to the tails at either end of a normal distribution formed by the distribution of all possible outcomes.

One-Tailed Hypothesis
A one-tailed hypothesis is also known as a directional hypothesis. This is where we predict the direction of the relationship between the two variables. For instance, if we say that children in the Head Start Program will score higher on the Head Start Outcome Measure than children in our control group, we are implying a direction (we are predicting they will score higher, not lower, and are therefore specifying the direction we think the relationship will show). With a one-tailed hypothesis, to obtain a p of .05 or less, the result of our statistical analysis would have to be located at or above 1.645 standard deviations from the mean in a distribution of all possible outcomes (Figure 6.1).

Figure 6.1 One-Tailed Hypothesis, Area under the Curve. Critical Value = -1.64. Source: Institute for Digital Research and Education n.d.

Two-Tailed Hypothesis
If we believe, on the other hand, that there is a relationship but we do not know in what direction, we would use a two-tailed hypothesis, or a non-directional hypothesis. For example, if we say that there is a relationship between participating in a Head Start Program and the outcome on the Head Start Outcome Measure, we would not be implying a direction. With a two-tailed hypothesis, to obtain a p of .05 or less, we need to split the .05 level into two: .025 above the mean and .025 below. The result of our statistical analysis would have to be located at or beyond 1.96 standard deviations either above or below the mean in a distribution of all possible outcomes (Figure 6.2).

Figure 6.2 Two-Tailed Hypothesis, Area under the Curve. Critical Values = -1.96 and +1.96. Source: Institute for Digital Research and Education n.d.
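The cut-off values of 1.645 and 1.96 quoted above can be recovered from the normal distribution itself. The short Python sketch below is our own illustration (it requires Python 3.8 or later): inv_cdf is the inverse of the cumulative normal curve, so asking for the point that leaves 5% (or 2.5%) in the upper tail returns the critical value.

```python
from statistics import NormalDist

alpha = 0.05

# One-tailed test: the entire 5% rejection region sits in one tail
one_tailed = NormalDist().inv_cdf(1 - alpha)        # about 1.645

# Two-tailed test: the 5% is split into 2.5% in each tail
two_tailed = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96

print(round(one_tailed, 3), round(two_tailed, 3))
```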
ERRORS
Once we have completed our study and are ready to state whether we are able to accept or reject the null hypothesis, there are two possible errors that we could make in drawing our conclusions: type I and type II errors. These are two possible errors in interpreting the research data.

Type I Error
This error is where we reject the null hypothesis and conclude that there is a relationship when in fact there is none. Although we can never totally eliminate the chance of committing a type I error, we can 1) use larger samples, because larger samples reduce the possibility of sampling error, and 2) replicate our study to confirm our results.

Type II Error
The second type of error is where we accept the null hypothesis and say that there is no relationship when in fact there is a relationship. Although we would obviously want to reduce our chances of making either type of error, type II errors can result in us missing potentially useful relationships. We can reduce the chances of making a type II error by using a test with a higher level of statistical power. Statistical power refers to the ability of the test to correctly reject the null hypothesis, that is, to detect a true relationship between variables. Factors that affect statistical power include the strength of the actual relationship, the amount of variability within the variables, the rejection level being used, whether a one-tailed or two-tailed hypothesis is used and the size of the sample.

SUMMARY
In this chapter we explored the concept of hypothesis testing and explained that it is a fundamental part of the process of identifying whether a true relationship exists among variables. We looked at different kinds of relationships, and how to use the classic experimental research design to determine if a cause and effect relationship exists. We explained that, even if the results of our research show that there is a relationship, there may still be other explanations, including errors, as to why we obtained these results. We introduced the concepts of statistical significance, the null hypothesis, and one- and two-tailed hypotheses.

REVIEW QUESTIONS
1. Describe what is meant by cause and effect.
2. Discuss the importance of the criteria for the classic experimental design.
3. Explain how the concepts of sampling error, rejection level and statistical significance are related.
4. Describe the importance of formulating a hypothesis and a null hypothesis when conducting research.
5. Provide an example of a situation in which a rival hypothesis might explain the relationship found in a research study.
6. Discuss why committing a type I error is so dangerous. Explain why committing a type II error is also a bad thing.

7 Sampling Distributions

In this chapter, we examine the important concept of the sampling distribution and how we use this concept to make inferences about the population parameters based on the sample statistics. A sample never provides a completely accurate representation of the population. Therefore, we need to be familiar with concepts such as sampling error, sampling distributions, the central limit theorem and confidence intervals and learn how to use the information from these concepts to answer the question: at what point are we safe in concluding that the sample statistics accurately reflect the population parameters?
We attempt to provide that familiarity in this chapter.

Conservative political values favour low taxation rates, a reliance on the free market and minimal involvement of government in people's lives. With the Conservative Party of Canada winning three successive elections, forming minority governments in 2006 and 2008 and then a majority in 2011, one could be forgiven for thinking that public attitudes in Canada during the last decade have shifted towards such conservative values. And yet, a survey entitled Focus Canada 2011 by the polling company Environics found that the opposite is true. Michael Adams, founder of Environics, had this to say:

   Our Focus Canada survey showed three-quarters of Canadians believe taxes are generally a positive thing, as opposed to one in five (19 percent) who think taxes are mostly a bad thing.... A strong majority (68 percent) agree that "governments are essential to finding solutions to the important problems facing the country."... A large majority (82 percent) agree either strongly (50 percent) or somewhat (32 percent) that "governments in Canada should actively find ways to reduce the gap between wealthy people and those less fortunate."... In short, Canadians tend to think government is reasonably effective in how it operates (although many see room for improvement); large majorities think government has an important role to play in addressing society's problems, including inequality and the excesses of the private sector; and three-quarters are quite happy to fork over some of their own money to make a functioning government possible. These are not attitudes one would expect from a population that is utterly disgusted with public services or interested in burning government institutions to the ground. (Adams 2013)

For those of us who believe in the role that government programs play in supporting the most vulnerable in our society, "hard data" such as those above describing the attitudes of Canadians about the role of government in people's lives provide important evidence to dispel the myth promoted by right-wing conservatives, who claim that they speak for the majority of Canadians. Adams continued:

   When Foreign Affairs Minister John Baird revealed his behind-the-scenes efforts to oppose anti-gay policies signed into law in Russia in June [of 2013], the socially conservative lobby group REAL Women of Canada condemned him as a "left-wing elitist" who was out of step with "grassroots Canada." Unless "grassroots Canada" excludes the majority of Canadians, REAL Women is mistaken: Minister Baird's work on this file fits quite nicely with public attitudes. Social values research as well as polling indicates that Canadians are becoming more socially liberal; more at ease with diverse family models, diverse sexual orientations and gender identities; and generally more comfortable with sexuality, in real life and in popular culture.

How do
Put in another way, the sample statistics, for example, the mean and standard deviation of the sample, are very rarely a perfect representation ofthe true mean and standard deviation of the population. Sampling error is the difference between the sample statistics and the population parameters. Obviously, the larger the sample, the more likely it is that the sample statistics will be similar to the population parameters. To illustrate this point, let’s assume that we have a class of 100 undergraduate social work research students. Say that the class average (the mean) on an exam is 75 (out of 100), with a standard devia¬ tion of 5; these are the population parameters. If we randomly select the exams of three students, it is highly likely that the average score on the exams of these from the as a 80 Statistics for Social Justice three students will differ from the overall class average. It is possible that the three of 85. If we randomly select the exams of another three students, the average score of these three maybe 65. But what if we select a much larger sample, say 25 students? We could expect that the average score of the 25 students would be much closer to the average of the entire class. The average of the 25 students could be 74 or 76 and the standard deviation of this sample would also be close to the standard deviation of the total class; it could be something like 5.2. To put it simply: the larger the sample, the smaller the sample error, whereas the smaller the sample, the greater the sample error. students could have an average score SAMPLING DISTRIBUTION OF THE MEAN A sampling distribution of the mean is an abstract concept which may be best introduced through a relatively simple example. Imagine that Susan is an economics student who is interested in public attitudes about raising minimum wage. Suppose that Susan asks random sample of 25 Canadians if they support increasing the by $ 1 per hour, and let’s say that 15 of the 25 say yes. To verify her results, Susan randomly selects a different sample of 25 Canadians and asks them the same question, and this time only 10 say yes. This range from 10 to 15 out of 25 is a large spread. So Susan decides to do the same thing many more times, and she obtains results ranging from 9 to 16, with most results from all of her samples clustering at 13. If she were to draw a histogram with all the different results from the different samples, it would form a normal distribution with a mean score of 13, which would be very close if not the same as the population mean (if we could possibly know it). This distribution created by the means of all these samples is the sampling distribution. The sampling distribution of the mean is the mean (average) of the means. An important characteristic of the sampling distribution is that it will form a a minimum wage normal distribution even if the actual data of a variable are skewed. For instance, we know that income is positively skewed in our population, with most people having income that is in the low end of the income distribution. But if we randomly samples, the distribution created'of all of the average incomes of all of our samples (the mean of all of the means) would form a normal distribution. an select many Standard Error of the Mean Another important characteristic of the sampling distribution is that because of sampling error, the standard deviation of this distribution will not be the same as the standard deviation of the population. 
Standard Error of the Mean
Another important characteristic of the sampling distribution is that, because of sampling error, the standard deviation of this distribution will not be the same as the standard deviation of the population. Instead, it is called the standard error of the mean, and it is calculated by dividing the standard deviation of the population by the square root of the sample size. Once again, the larger the sample, the smaller the sampling error will be.

Central Limit Theorem
Of course, it is not realistic or necessary to do what Susan did. We do not need to repeat our study over and over again to create a sampling distribution in order to establish the population parameters. Instead, we rely on a statistical theory called the central limit theorem. This theory states that for any variable, if the sample is large enough, that is to say at least 30 participants, the sampling distribution of the mean will form a normal distribution and will approximate the population parameters. This is true even if the variable itself is skewed within the population. The concepts of the sampling distribution and the central limit theorem form a fundamental basis for statistical analysis. This is because once we have a normal distribution, we are able to carry out a variety of statistical procedures, which allows us to make inferences about the population based on the sample statistics.

CONFIDENCE INTERVALS
But what if we don't know the parameters of the population? How can we tell if our sample is representative of the population? Public opinion polls such as Focus Canada 2011 make inferences about the attitudes of Canadians without ever knowing the actual population parameters. To answer these questions, we look at confidence intervals. Simply put, a confidence interval is a range of values in which the true mean of the population is expected to fall, and it is calculated based on sample statistics.

To calculate the confidence interval, we must first choose the confidence level. Public opinion polls typically choose a 95% confidence level. They state that the results of their poll are accurate 19 times out of 20, or allow for a 5% margin of error, which represents a confidence level of 95%.

So how do we use the confidence level to calculate the confidence interval? Here we must go back to the concept of the normal distribution and z scores. Consider Figure 7.1.

Figure 7.1 Area under the Normal Curve Lying within 1.96 Standard Deviations of the Mean. Critical Values = -1.96 and +1.96. Source: Psychstatistics n.d.
(In this case we know the population parameters because the measurement instrument has been standardized for the population.) Let’s say that you want to calculate the 95% confidence interval of the population mean for the residents in the shelter. You randomly select a sample of 25 residents from the shelter and you we look for the obtain a mean z scores of 80 for assertiveness. Remember that the formula for determin¬ ing the standard error is the standard deviation divided by the square root of the sample size and that the z score for a 95% confidence interval is always 1.96 when using a two-tailed test. Sampling Distributions 83 Example 7.1 Calculating the Confidence Interval Mean of your 80 sample is The standard deviation is The z score for a 10 95% confidence interval The (two-tailed) is standard error is (10 4- 1.96 V2S) 2 Lower limit: 80.00 1.96x2 = - 3.92 76.08 Upper limit: 80.00 1.96x2 = +3-92. 83.92 Therefore, be 95% confident that the true assertiveness score of the population of residents in the women’s shelter lies between 76.08 and 83.92. In our you can example, we were able to calculate the standard error by dividing the standard deviation ofthe population on a standardized test measuring assertiveness by the square root of the sample size. The standard deviation on the assertiveness scale is 10, and we have a sample of 25 women. The square root of 25 is 5, and so 10 divided by 5 gives us a standard error of 2. STANDARD ERROR OF THE PROPORTION In the of public opinion polls, researchers do not know the standard deviation population. In this case, instead of using the standard error of the mean for a population, we use the standard error of the proportion. Let’s go back to our example of a public opinion poll at the beginning of this chapter. The poll found that 82% of Canadians agree that governments should actively find ways to reduce the gap between wealthy people and those less fortunate. How accurate is this percentage? These researchers surveyed a sample of 1,500 people. The formula for calculating the standard error of the proportion is as follows. case of the SE P(l-P) = 84 Statistics for Social Justice We SE can now .82(1 plug in the actual figures. - .82) = 1500 The standard of the proportion then is .0099, or 1.0%. We can use this figure in the same way as we use the standard error of the mean, as in Example 7.1. Assuming that we are interested in a 95% confidence interval, the calculation is as follows, shown in Example 7.2. error Example 7.2 Calculating the Confidence Interval The standard of the proportion error 1.0 multiplied by the z score is 1.96= 1.96 x Lower limit: 82.00 1.96 - 80.04 Upper limit: 82.00 + 1.96 83.96 Therefore, be 95% confident that the true proportion of the population of between the rich and those less fortunate lies between 80.04% you can of Canadians who believe that the Government of Canada should find ways reducing the gap and 83.96%. What this is that the results of the survey published by Focus Canada This example demonstrates the important role that statistics can play to help ensure that the truth is known Canadian attitudes are becoming more liberal, despite what some Conservatives would have us think. So how are these polls conducted? In Box 7.1 the polling company Gallup offers an explanation. Box 7.2 provides an example of polls showing public opinion on issues related to poverty. means 2011 do reflect the real attitudes of Canadians about the role of governments. - Sampling Distributions Box 7.1 How Does 85 Gallup Polling Work? 
Box 7.1 How Does Gallup Polling Work?
Gallup polls aim to represent the opinions of a sample of people, representing the same opinions that would be obtained if it were possible to interview everyone in a given country. The majority of Gallup surveys in the U.S. are based on interviews conducted by landline and cellular telephones. Generally, Gallup refers to the target audience as "national adults," representing all adults, aged 18 and older, living in the United States. The findings from Gallup's U.S. surveys are based on the organization's standard national telephone samples, consisting of directory-assisted random-digit-dial (RDD) telephone samples using a proportionate, stratified sampling design. A computer randomly generates the phone numbers Gallup calls from all working phone exchanges (the first three numbers of your local phone number) and not-listed phone numbers; thus, Gallup is as likely to call unlisted phone numbers as listed phone numbers.

Within each contacted household reached via landline, an interview is sought with an adult 18 years of age or older living in the household who has had the most recent birthday. (This is a method pollsters commonly use to make a random selection within households without having to ask the respondent to provide a complete roster of adults living in the household.) Gallup does not use the same respondent selection procedure when making calls to cell phones because they are typically associated with one individual rather than shared among several members of a household. When respondents to be interviewed are selected at random, every adult has an equal probability of falling into the sample.

The typical sample size for a Gallup poll, either a traditional stand-alone poll or one night's interviewing from Gallup's Daily tracking, is 1,000 national adults with a margin of error of ±4 percentage points. Gallup's Daily tracking process now allows Gallup analysts to aggregate larger groups of interviews for more detailed subgroup analysis. But the accuracy of the estimates derived only marginally improves with larger sample sizes.

After Gallup collects and processes survey data, each respondent is assigned a weight so that the demographic characteristics of the total weighted sample of respondents match the latest estimates of the demographic characteristics of the adult population available from the U.S. Census Bureau. Gallup weights data to census estimates for gender, race, age, educational attainment, and region.
Source: Gallup 2010

Box 7.2 Is It Really Possible to Have a Poverty-Free Canada?
For many experts, the answer is a clear yes, and the best way to reach that goal is through a guaranteed annual income. It's a radical idea that to date has been largely dismissed by government leaders as too costly, too difficult to implement and lacking in public support. Because of the perception that voters don't generally like the idea, few politicians bother even to think about such a program, let alone come out in support of it. However, a major new poll conducted this fall may provide the evidence that some risk-averse politicians need before giving their support to a guaranteed annual income. The survey showed more Canadians like the idea than oppose it. The findings are important because it's the first time a national poll has ever asked Canadians what they think of the idea of providing everyone with a guaranteed income.
Such a program "is often dismissed as giving free money to people who won't work," said Keith Neuman, Executive Director of the Environics Institute for Survey Research, which conducted the poll earlier this fall for the Montreal-based Trudeau Foundation. Neuman said the results suggest there's a "potential foundation for building public support for it (guaranteed income) by some bold government," especially if it was accompanied by the elimination of other programs. A guaranteed annual income is a single, cash payment that would replace all current social programs, such as welfare and employment insurance. It would create a minimum income below which no Canadian would fall. Statistics Canada now sets a "low-income line" at about $22,200 for a single person and $47,000 for a family with three children.

Proponents, such as Conservative Senator Hugh Segal, argue such a plan wouldn't cost Ottawa more money because it would get the needed dollars from other programs that would be killed. Also, they contend the idea would actually encourage people to work because it would eliminate provisions in the current welfare system that penalize the poor who take very low-paying or part-time jobs. The poll found 46 per cent of Canadians strongly (19 per cent) or somewhat (27 per cent) favour such a policy. Another 42 per cent said they strongly (25 per cent) or somewhat (17 per cent) oppose the idea. About 10 per cent said it would depend on how such a program was actually implemented or had no opinion. Support was highest in Quebec at 55 per cent and lowest in Alberta at 38 per cent. A majority of Canadians with household incomes under $100,000 and those with no post-secondary education also backed the idea, while support was lowest (38 per cent) amongst Canadians earning more than $100,000.

To date, no major national political party has embraced the idea of a guaranteed income, although all talk vaguely about the need to study it more closely just like any other policy option. Clearly we are in an era when our politicians are more wary than brave, afraid to champion new programs for fear of upsetting voters whose minds are focused only on cutting taxes. As Keith Neuman says, it would require a bold government to make the guaranteed income a reality. Given the poll results, though, the idea could be a winner for the political party with the courage to make it a serious part of the debate on tackling poverty in our country.
Source: Hepburn 2013

SUMMARY
In this chapter, we introduced the concept of sampling distributions. We covered a number of related topics, including sampling error, the central limit theorem and confidence intervals, and how we can use the information from these concepts to answer the question: "At what point are we safe in concluding that the sample statistics accurately reflect the population parameters?" We explained how sampling distributions can be used to bring attention to important social justice issues such as the elimination of poverty. The examples of public opinion polls included in this chapter show how confidence intervals based on the standard error of the proportion can be powerful tools in convincing governments at various levels about the political advantages of pursuing a social justice agenda.

REVIEW QUESTIONS
1. Explain what is meant by "sampling error." Why is it important to understand this concept? What level of sampling error is generally accepted within the social sciences?
2. Describe the relationship between sample statistics and population parameters.
3. The sampling distribution of the mean is the mean of the means. Elaborate.
4. What is the relationship between the central limit theorem and the concept of the normal distribution?
5. Provide an example of a scenario where a social work researcher would want to calculate a confidence interval.
6. Using a structural and AOP perspective, critique the research methodology of Gallup polls.

8 Chi-Square

In this chapter, we introduce the chi-square test of association, a non-parametric test that social workers may use if their data is at the nominal or ordinal level. We illustrate how to create a cross-tabulation table and how to use observed frequencies to calculate a chi-square. We explain that this test does not prove cause and effect but will tell us about the strength of the relationship among variables.

Social workers are often faced with having to conduct studies where the only data available are at the nominal or ordinal level of measurement. If, for example, they want to show that their intervention program is related to a positive outcome, they would use a well-known non-parametric test called the chi-square test of association, or simply chi-square. Consider, for instance, the following hypothetical example.

Julie Chow is working in a shelter for women leaving abusive relationships. She runs a counselling program for the residents designed to help these women plan for a future without their former abusive partners. She recognizes that there are many factors related to why women return to abusive relationships, certainly not the least of which is income security. In order to apply for funding to keep the program going, she needs to demonstrate to a local private foundation dedicated to addressing violence against women that her program is effective. She has designed a study to test the efficacy of the counselling program.

In Julie's study, the independent variable is simply whether the women attended the program or not. The dependent variable is whether the women who attended the program returned to their former abusive relationship or not. Both of these variables are at the nominal level of measurement. The research hypothesis in this case could be stated in the following manner: Women who attend a counselling program designed to help women refrain from returning to an abusive relationship will be less likely to return than women who do not attend the program. This is a one-tailed, directional hypothesis because Julie is predicting a positive outcome. The null hypothesis would state: there is no relationship between attending the program and returning to an abusive relationship. To comply with a rigorous research design for testing the research hypothesis, Julie needs to test the null hypothesis that there is no relationship. If she obtains a significant result, she will be able to reject the null hypothesis.

Julie decides to conduct a pilot project with a small group of women in the hopes of being able to reject the null hypothesis. She randomly selects a sample of 15 women from the shelter and, after obtaining the necessary informed consents, invites them to participate in an intensive five-session group counselling program. She plans to compare the outcome for this group with the outcome of a similar-sized group of women who did not attend the program.
Upon completing the program, and at the point when the women in her counselling program are ready to leave the shelter, there are four possible scenarios for the women in both groups:

1. attended the program and did not return to an abusive relationship.
2. attended the program and did return to an abusive relationship.
3. did not attend the program and did not return to an abusive relationship.
4. did not attend the program and did return to an abusive relationship.

Let's say that the number of women who fit into each of the four categories above were as follows:

1. attended the program and did not return to an abusive relationship. (10)
2. attended the program and did return to an abusive relationship. (5)
3. did not attend the program and did not return to an abusive relationship. (5)
4. did not attend the program and did return to an abusive relationship. (10)

CROSS-TABULATION TABLE
For the chi-square test of association, the above results would be displayed in what is called a cross-tabulation table (see Table 8.1). It would be set up for the above example with four cells, a, b, c and d, which display the observed frequencies. These are the actual results of the study. This cross-tabulation table is an example of a two-by-two table because for each of the two variables there are only two possible values - in this case, returned/not returned and attended/did not attend. If there were more than two values for each variable, we would have more cells.

Table 8.1 Cross-Tabulation Table
                  Returned                   Did Not Return              Totals
Attended          (a) observed frequencies   (b) observed frequencies    marginal total
Did Not Attend    (c) observed frequencies   (d) observed frequencies    marginal total
Totals            marginal total             marginal total              overall total

Table 8.2 Cross-Tabulation Table - Observed Frequencies for Women Surviving Abuse
                  Returned    Did Not Return    Totals
Attended          a  5        b  10             15
Did Not Attend    c  10       d  5              15
Totals            15          15                30

The far right column of the table displays the totals for each row. For instance, if we add up the observed frequencies recorded in the row that includes cells a and b, we have a total of 15. Similarly, if we add up the observed frequencies for cells c and d, we also have a total of 15. If we do the same for each of the two columns, we end up with the totals shown in the bottom row, 15 and 15. These are called the marginal totals. The overall total of 30 is recorded in the bottom right-hand box. Another important point is that the minimum number of cases for each cell should not be less than 5. As you can see, cells a and d each have 5 cases.

CALCULATING THE CHI-SQUARE
To answer the question of whether a true relationship exists, we must first calculate the expected frequencies. These are the frequencies that are most likely to occur if the null hypothesis is correct. The expected frequencies are calculated by using the following formula:

E = (R × C) / N

Where:
E = Expected frequency of a particular cell
R = Marginal total for the row in which the cell appears
C = Marginal total for the column in which the cell appears
N = Total number of cases

These expected frequencies are then compared with the observed, or actual, frequencies using the chi-square formula. Note that, normally, one would need to calculate the expected frequencies separately for each of the four cells (a, b, c, d), but because the marginal totals for each row and column are the same, the expected frequencies for each of the four cells are the same as well.
The expected frequency for cell a is 15 times 15, divided by 30, which equals 7.5.
The expected frequency for cell b is 15 times 15, divided by 30, which equals 7.5.
The expected frequency for cell c is 15 times 15, divided by 30, which equals 7.5.
The expected frequency for cell d is 15 times 15, divided by 30, which equals 7.5.

DEGREES OF FREEDOM
The larger the table and the more cells, the more chance there is for a large difference between the observed and expected frequencies. The number of cells is expressed in terms of degrees of freedom. Once we have obtained the chi-square results, we must calculate the degrees of freedom using the following formula:

df = (R − 1)(C − 1)

The R refers to the number of rows, and C stands for the number of columns. For Julie's two-by-two table, df = (2 − 1)(2 − 1) = 1.

Chi-Square Formula

χ² = Σ (O − E)² / E

Where:
O = observed frequency
E = expected frequency

Cell a: (5 − 7.5)² / 7.5 = .83
Cell b: (10 − 7.5)² / 7.5 = .83
Cell c: (10 − 7.5)² / 7.5 = .83
Cell d: (5 − 7.5)² / 7.5 = .83

χ² = .83 + .83 + .83 + .83 = 3.32, df = 1, p < .05

The critical value for χ² with 1 degree of freedom (see Table 8.3) at the .05 level for a one-tailed test is 2.71. Since the chi-square score Julie obtained was 3.32 and is greater than the χ² critical value, she is safe in rejecting the null hypothesis and accepting the research hypothesis that there is a significant relationship between attending the counselling program and not returning to an abusive relationship.

Table 8.3 Critical Values of Chi-Square
Level of Significance for a One-Tailed Test:  .10    .05    .025   .01    .005   .0005
Level of Significance for a Two-Tailed Test:  .20    .10    .05    .02    .01    .001
df
1     1.64    2.71    3.84    5.41    6.64    10.83
2     3.22    4.60    5.99    7.82    9.21    13.82
3     4.64    6.25    7.82    9.84    11.34   16.27
4     5.99    7.78    9.49    11.67   13.28   18.46
5     7.29    9.24    11.07   13.39   15.09   20.52
6     8.56    10.64   12.59   15.03   16.81   22.46
7     9.80    12.02   14.07   16.62   18.48   24.32
8     11.03   13.36   15.51   18.17   20.09   26.12
9     12.24   14.68   16.92   19.68   21.67   27.88
10    13.44   15.99   18.31   21.16   23.21   29.59
11    14.63   17.28   19.68   22.62   24.72   31.26
12    15.81   18.55   21.03   24.05   26.22   32.91
13    16.98   19.81   22.36   25.47   27.69   34.53
14    18.15   21.06   23.68   26.87   29.14   36.12
15    19.31   22.31   25.00   28.26   30.58   37.70
16    20.46   23.54   26.30   29.63   32.00   39.29
17    21.62   24.77   27.59   31.00   33.41   40.75
18    22.76   25.99   28.87   32.35   34.80   42.31
19    23.90   27.20   30.14   33.69   36.19   43.82
20    25.04   28.41   31.41   35.02   37.57   45.32
21    26.17   29.62   32.67   36.34   38.93   46.80
22    27.30   30.81   33.92   37.66   40.29   48.27
23    28.43   32.01   35.17   38.97   41.64   49.73
24    29.55   33.20   36.42   40.27   42.98   51.18
25    30.68   34.38   37.65   41.57   44.31   52.62
26    31.80   35.56   38.88   42.86   45.64   54.05
27    32.91   36.74   40.11   44.14   46.94   55.48
28    34.03   37.92   41.34   45.42   48.28   56.89
29    35.14   39.09   42.69   46.69   49.59   58.30
30    36.25   40.26   43.77   47.96   50.89   59.70
32    38.47   42.59   46.19   50.49   53.49   62.49
34    40.68   44.90   48.60   53.00   56.06   65.25
36    42.88   47.21   51.00   55.49   58.62   67.99
38    45.08   49.51   53.38   57.97   61.16   70.70
40    47.27   51.81   55.76   60.44   63.69   73.40
Source: Fisher and Yates 1963

She thus showed that women who attended the program were less likely to return to an abusive relationship.

It is important to note that the chi-square test will not demonstrate that there is a cause and effect relationship between variables, even if there is a strong relationship. In other words, in the case of Julie's study, she cannot state that the counselling sessions caused the women not to return to their former abusive relationship. What the chi-square test can show is that the relationship between the independent variable (the counselling program) and the dependent variable (women refraining from returning to an abusive relationship) is so strong that sampling error alone is unlikely to explain the relationship.
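For readers who want to check these numbers by machine, here is a minimal sketch in Python using scipy, which is our choice of tool rather than the book's (the text uses hand calculation and, later, SPSS). Note that scipy applies the Yates continuity correction to two-by-two tables by default, so it is switched off here to match the formula above.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed frequencies from Table 8.2
# (rows: attended / did not attend; columns: returned / did not return).
observed = np.array([[5, 10],
                     [10, 5]])

chi2, p, df, expected = chi2_contingency(observed, correction=False)

print(expected)      # every cell is 7.5, as calculated above
print(chi2, df)      # about 3.33 (the text rounds each term to .83, giving 3.32); df = 1
print(p)             # scipy reports the conventional two-tailed p-value (about .07);
                     # the book compares 3.32 to the one-tailed critical value of 2.71
```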
The above example was based on a study with a relatively small sample of individuals. It is also possible to use the chi-square test for much larger macro-level studies. Let's look at another hypothetical example.

Karen Weinstein is a social worker with Municipal Child Care Services who is interested in demonstrating the social and economic value of increasing the number of subsidized child care spaces in her city. She is aware that child poverty rates in Canada remain stubbornly high and that one of the most effective ways of reducing the rate of child poverty is through offering families quality regulated child care at an affordable price. Currently her department has 600 subsidized spaces where parents pay $10 per day, which is just under what someone would make per hour at minimum wage. She has 400 families on the waiting list, and most of these parents are single, female, unemployed and surviving on social assistance. She is aware that the child poverty rate for single-female-headed families is proportionally much higher than for families headed by single males or couples. Karen is therefore particularly interested in looking at the impact of having access to subsidized child care on single-female-headed families. She would like to demonstrate that there is a strong relationship between having access to subsidized child care and being employed full-time. Karen is hoping that a snapshot look at the impact of subsidized child care on employment status will provide the evidence she needs to argue for more funding for child care.

In this example, there are again two variables, both at the nominal level; the independent variable is access to subsidized care, with two scores: yes or no (on the waiting list), and the dependent variable is employment status, with two scores: employed full-time or unemployed/employed part-time. The one-tailed research hypothesis for this study is: Women who have access to subsidized child care are more likely to have full-time employment. The null hypothesis for this study is: There is no relationship between having access to subsidized child care and being employed full-time. The observed frequencies of a random sample chosen from families accessing child care and from the waiting list are shown in Table 8.4.

Table 8.4 Cross-Tabulation Table - Observed Frequencies for Single-Female-Headed Families
Access to Subsidized Care    Employed Full-Time    Unemployed or Employed Part-Time    Totals
Yes                          a  80                 b  20                               100
No                           c  30                 d  60                               90
Totals                       110                   80                                  190

The first step is to calculate the expected frequencies for each of the four cells. Using the formula E = (R × C) / N, the results are as follows:

The expected frequency for cell a is 100 times 110 = 11,000, divided by 190 = 57.9
The expected frequency for cell b is 100 times 80 = 8,000, divided by 190 = 42.1
The expected frequency for cell c is 90 times 110 = 9,900, divided by 190 = 52.1
The expected frequency for cell d is 90 times 80 = 7,200, divided by 190 = 37.9

Next, Karen must calculate the chi-square score for the above data.
χ² = Σ (O − E)² / E,  df = (R − 1)(C − 1) = 1

Cell a: (80 − 57.9)² / 57.9 = 8.44
Cell b: (20 − 42.1)² / 42.1 = 11.60
Cell c: (30 − 52.1)² / 52.1 = 9.37
Cell d: (60 − 37.9)² / 37.9 = 12.89

χ² = 8.44 + 11.60 + 9.37 + 12.89 = 42.30, df = 1, p < .05

Here again, the critical value for χ² at the .05 level for a one-tailed test is 2.71. Because the score that Karen obtained was 42.30, much higher than 2.71, she is definitely safe in rejecting the null hypothesis and accepting the research hypothesis that women who have access to subsidized child care are much more likely to obtain full-time employment. These results provide strong evidence in support of the argument that investing in subsidized child care is related to important social and economic benefits.

CROSS-TABULATION TABLES FOR VARIABLES WITH MORE THAN TWO LEVELS
Both of our examples are relatively simple in that they each involve a two-by-two cross-tabulation table. It is also possible to use the chi-square test for larger tables. Consider this third hypothetical example.

Jim Morris is a counsellor working for a government employment centre who has been providing employment counselling services to people who have been laid off from their jobs in the high tech sector as a result of downsizing and who need help locating and securing a position in emerging high tech companies. He offers help in resume writing, preparing for interviews and connecting people with possible employers. Up to now, he has been seeing people on an individual basis, but because of the recent increase in the number of layoffs and the growing number of people on the waiting list, he has decided to offer workshops to people in groups. He wants to compare the outcomes of the group workshops with the outcomes of the individual training. Given the number of people on the waiting list at the start of the study, he also wants to compare the results of people who took his training with the results of a group of people on the waiting list, who may find work on their own before entering either of the services he provides.

In Jim's study, there are two variables; the independent variable is the type of employment service provided (individual, group, waiting list), and the dependent variable is success in obtaining employment (yes, no). Since there are three values for the delivery of service and two values for success in obtaining employment, a two-by-three table is required. The cross-tabulation table for this study is shown in Table 8.5.

Table 8.5 Cross-Tabulation Table - Observed Frequencies for Employment Training Program
Type of Training    Employed    Unemployed    Totals
Individual          a  20       b  10         30
Group               c  10       d  5          15
Wait List           e  10       f  30         40
Totals              40          45            95

Using the formula E = (R × C) / N:

The expected frequency for cell a is 30 times 40 = 1,200, divided by 95 = 12.63
The expected frequency for cell b is 30 times 45 = 1,350, divided by 95 = 14.21
The expected frequency for cell c is 15 times 40 = 600, divided by 95 = 6.32
The expected frequency for cell d is 15 times 45 = 675, divided by 95 = 7.11
The expected frequency for cell e is 40 times 40 = 1,600, divided by 95 = 16.84
The expected frequency for cell f is 40 times 45 = 1,800, divided by 95 = 18.95

Next, Jim must calculate the chi-square score for the above data.
χ² = Σ (O − E)² / E,  df = (R − 1)(C − 1) = (3 − 1)(2 − 1) = 2

Cell a: (20 − 12.63)² / 12.63 = 4.30
Cell b: (10 − 14.21)² / 14.21 = 1.25
Cell c: (10 − 6.32)² / 6.32 = 2.14
Cell d: (5 − 7.11)² / 7.11 = 0.63
Cell e: (10 − 16.84)² / 16.84 = 2.78
Cell f: (30 − 18.95)² / 18.95 = 6.44

χ² = 4.30 + 1.25 + 2.14 + 0.63 + 2.78 + 6.44 = 17.54, df = 2, p < .05

Looking at Table 8.3, which shows the critical values for chi-square, for a one-tailed hypothesis at the .05 level with 2 degrees of freedom the value we are looking for is 4.60. Because Jim obtained a result of 17.54, he is confident in concluding that he can reject the null hypothesis that there is no relationship and accept the research hypothesis that there is a strong relationship between his services to laid-off employees and helping them obtain employment. Remember, he cannot say that there is a cause and effect relationship, but he can say that there is a relationship. Furthermore, proportionately as many people obtained employment who received individual counselling as in the group workshops, which suggests that the individual counselling and the group workshops may be equally effective. However, significantly more people who received Jim's services were able to obtain employment than those who were on the waiting list, so being in either of his programs is better than nothing at all.
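The same computation works for a table of any size. Below is a minimal sketch in Python using scipy (again our tooling choice, not the book's) applied to the observed frequencies in Table 8.5:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed frequencies from Table 8.5
# (rows: individual, group, wait list; columns: employed, unemployed).
observed = np.array([[20, 10],
                     [10, 5],
                     [10, 30]])

chi2, p, df, expected = chi2_contingency(observed)

print(np.round(expected, 2))  # 12.63, 14.21, 6.32, 7.11, 16.84, 18.95 as above
print(round(chi2, 2), df)     # about 17.54 with df = 2
print(p)                      # far below .05, so the null hypothesis is rejected
```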
Box 8.1 Link Between Sexual Abuse in Childhood and in Adolescence
The aim of this study was to examine the link between childhood experiences of sexual abuse and subsequent revictimization in adolescence. A sample of 281 female adolescents between 17-20 years of age, who participated in a prevalence survey of unwanted sexual contacts, completed the Sexual Experiences Survey as a measure of unwanted sexual contacts in adolescence and indicated whether or not they had experienced childhood sexual abuse. Childhood experiences of sexual abuse were reported by 8.9% of the respondents; a further 8.5% indicated they were not sure if they had been sexually abused as children. Both abused women and women uncertain about their victimization status were significantly more likely to report unwanted sexual contacts as adolescents than women who did not state abuse. The link between childhood abuse and subsequent victimization was mediated by a higher level of sexual activity among the abuse victims. The results support existing evidence on the impact of childhood sexual abuse on sexual relationships in subsequent developmental stages and underline the need to consider childhood sexual abuse as a risk factor of adolescent sexual victimization.
Source: Krahe, Scheinberger-Olwig, Waizenhofer and Koplin (1999)

SUMMARY
In this chapter, we introduced the chi-square test of association. We explained that social workers conducting research may only have access to nominal or ordinal level data. We showed how to create a cross-tabulation table, where we input the actual observed frequencies and compare these to the expected frequencies, which are the frequencies that would occur if there was no relationship between the independent and dependent variables. We explained how to calculate a chi-square test of association and what can be inferred from the results. Finally, we explained that we cannot state whether there is a cause and effect relationship, but we can say that the relationship is so strong that sampling error is not likely to explain it.

REVIEW QUESTIONS
1. Explain how the number of rows and columns in a cross-tabulation table is determined.
2. Discuss the relationship between observed and expected frequencies as well as their relationship with the null hypothesis.
3. Describe the steps from start to finish of calculating a chi-square test of association.
4. Create a hypothetical example of a research study for which a chi-square test of association would be the most appropriate inferential statistical test to use.

9 t Tests and ANOVA

This chapter examines three types of inferential statistical tests called t tests: one-sample t tests, paired t tests and independent t tests. We also briefly look at the test used to compare three or more groups, called the ANOVA, which stands for analysis of variance. We will begin by describing t tests in general and will continue to explain the theory behind the one-sample t test. Next, we will explain paired and independent t tests and demonstrate how to calculate the latter. We will then explain how to report findings of t tests. Lastly, we will conclude our description of the various t tests by introducing the ANOVA.

In Chapter 1, we noted that social workers are often called upon to demonstrate the effectiveness of their work. We explained that funding bodies, such as governments and private foundations, continue to require empirical evidence as proof that the programs they are funding are effective. In Chapter 6, we stated that the gold standard in terms of program evaluations is the classic experimental design, where the experimental group, the group receiving the program, is compared to a control group. To test whether the experimental group scores higher on a given variable than the control group, we generally use an inferential statistical test called the t test.

WHEN TO USE t TESTS
t tests are a type of inferential statistical test that use means as their primary way of making comparisons. These t tests can be used when you wish to determine if there is a significant relationship between 1) a sample and its population (one-sample t test) or 2) two variables (independent and paired t tests). To run any t test, the independent variable must be at the nominal level of measurement, and the dependent variable at the interval or ratio level of measurement. In addition, the dependent variable must be normally distributed, as this is a parametric test. An additional criterion for the independent t test is that participants/cases must be randomly selected. We use the following hypothetical example throughout the chapter to help us better understand the different types of t tests and their uses.

Kiran Tran is a social worker at a non-profit organization, Spectrum Group, whose mandate is to provide services to children on the autism spectrum, using the principles of applied behaviour analysis (ABA), an evidence-based approach to working with children on the spectrum. Spectrum Group uses the Verbal Behaviour Milestones Assessment (VB-MAPP), an assessment tool for children with autism, before placing children into a program. The assessment is re-administered at six-month intervals to check on the child's progress. Recently the non-profit organization was given a small amount of funding from the provincial government to evaluate the effectiveness of three of their programs offered to children prior to their entering the public school system: one-on-one ABA therapy in the centre, one-on-one ABA therapy in the child's home and a school-readiness preparation program. Kiran is tasked with carrying out the evaluation.
ONE-SAMPLE t TEST
The one-sample t test is most often used to determine if a research sample is representative of its population. Kiran may first want to ensure that the children in the sample from Spectrum Group do not somehow differ significantly from other children in Ontario on the autism spectrum. She may want to ensure that the severity of their autism and their functional status are representative, in order to be able to confidently generalize the results. When using a one-sample t test to determine if a sample is representative, we hope we are not able to reject the null hypothesis that there is no difference between the sample and the population. This is the opposite of what we are usually doing, which is collecting evidence that we hope will allow us to reject the null hypothesis and conclude that there is a significant difference between our sample and the control group. In the case of a one-sample t test, we want to show that there is no difference between the sample and the population, and that our sample is therefore representative of the population. Remember that in failing to reject the null hypothesis, we are stating that any difference between the mean of the sample and the mean of the population is likely due to sampling error rather than being a true difference.

Using their scores from the Verbal Behaviour Milestones Assessment (VB-MAPP), Kiran can compare the children at Spectrum Group to other children in Ontario (assuming she has the data of other children's scores on this assessment). Let's say she knows the mean score for children in Ontario on this assessment and the mean score for all of the children in the three programs at Spectrum Group. She could use the one-sample t test to compare these means. If the resulting p value is greater than .05 (p > .05), she would fail to reject the null hypothesis, and she would conclude that her sample is not significantly different from the population in regard to their scores on the VB-MAPP. She could then confidently proceed to run other statistical tests with the data and generalize the results to the population of children on the autism spectrum in Ontario. If the results of her one-sample t test resulted in a p value of less than .05 (p < .05), however, she would reject the null hypothesis and conclude that her sample is significantly different from the population in regard to their scores on the VB-MAPP. She would, therefore, not be able to generalize her findings, making them much less valuable. Kiran would likely present this information to Spectrum Group and its funder prior to continuing with any further statistical analyses.
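A minimal sketch of such a check in Python, using scipy (our choice of tool; both the sample scores and the Ontario mean below are hypothetical values, not figures reported in the text):

```python
from scipy.stats import ttest_1samp

# Hypothetical VB-MAPP scores for the children at Spectrum Group
sample_scores = [48, 55, 62, 40, 58, 51, 47, 66, 53, 49]

# Hypothetical mean VB-MAPP score for children on the autism spectrum in Ontario
ontario_mean = 52

t_stat, p_value = ttest_1samp(sample_scores, ontario_mean)

# If p > .05 we fail to reject the null hypothesis and treat the sample as
# representative of the population; if p < .05 we cannot generalize the findings.
print(t_stat, p_value)
```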
NON-PARAMETRIC ALTERNATIVE: CHI-SQUARE GOODNESS OF FIT TEST
When designing a research study, it is usually best to design it in such a way that the most robust statistical tests, parametric tests, can be used to evaluate the data. This, however, may not always be possible, for instance, if we are given data that have already been collected and that do not meet the criteria for a parametric test. When this is the case, non-parametric alternatives must be used. For every parametric test, there is a non-parametric alternative. For the one-sample t test, the alternative test is called the chi-square goodness of fit test. It fulfills the same purposes as the one-sample t test, but is used when the dependent variable is at the nominal level of measurement (remember that for the one-sample t test it must be at the interval/ratio level of measurement). It is similar to the chi-square test of association, described in Chapter 8. A less robust test, the chi-square goodness of fit test does not compare means, but instead compares the percentage of cases in each category of the nominal variable. This less precise test produces a chi-square value (instead of a t value), represented by χ².

PAIRED-SAMPLES t TEST
There are many names for a paired-samples t test: dependent or dependent-groups t test, paired or paired-groups t test, or matched-groups t test. This type of t test does not compare a sample mean with a population mean, as did the one-sample t test. Instead, it compares two sample means that are somehow related. The sample may be the same sample measured twice, often a one-group pre-test/post-test research design, or the test may involve two related samples, each measured once. Let's look a bit closer at these two scenarios.

Same Sample, Measured Twice
Using Kiran's task of evaluating the effectiveness of the programs at Spectrum Group, let's say she wants to look at the effectiveness of one program in particular: the one-on-one ABA therapy in-home. As mentioned, Spectrum Group uses the VB-MAPP assessment on all children prior to starting the program and then again after six months. Here we have the same sample of children being measured twice, often called a one-group pre-test/post-test research design. We could run a paired-samples t test to determine if the children's scores had changed significantly from one time point (pre-intervention) to another (mid-intervention). The research hypothesis could be: Children who participate in the one-on-one ABA therapy in-home program will have improved scores on the VB-MAPP assessment.

Two Related Samples, Each Measured Once
Here the two samples may naturally be similar in some way, such as two of the children being siblings, one of whom is put in a therapy group while the other receives no treatment. Or, the samples may be matched based on some other characteristic, for example, based on their VB-MAPP scores (low and high), with half of the low scores and half of the high scores each assigned to the one-on-one ABA therapy in-home program and the one-on-one ABA therapy in-centre program. This would ensure that both groups (in-home and in-centre) have an equal number of low and high scoring children on the VB-MAPP.

NON-PARAMETRIC ALTERNATIVE: WILCOXON SIGN TEST
The non-parametric alternative to the paired-samples t test is the Wilcoxon sign test. This test is used when the dependent variable is not at the interval/ratio level of measurement. The test matches the individuals (through their natural match, e.g., siblings, or calculated match, e.g., based on assessment scores) and then assigns each case a positive or negative, based on whether their score on the dependent variable is higher or lower than that of its matched counterpart. If there are many more positives or negatives in one group over the other, this suggests that a significant difference may be found.
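A paired-samples t test for the pre-test/post-test design described above can be sketched as follows, again using scipy as an assumption on our part; the scores are hypothetical:

```python
from scipy.stats import ttest_rel

# Hypothetical VB-MAPP scores for the same ten children, measured before the
# in-home ABA program and again at the six-month re-assessment.
pre_scores  = [40, 35, 50, 42, 38, 55, 47, 44, 36, 41]
post_scores = [48, 41, 57, 49, 45, 60, 50, 52, 40, 47]

t_stat, p_value = ttest_rel(post_scores, pre_scores)

# A p value below .05 would support the research hypothesis that the scores
# improved between the two time points.
print(t_stat, p_value)
```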
INDEPENDENT SAMPLES t TEST
Like the paired-samples t test, the independent samples t test compares the means of two samples. It is commonly used in experimental or quasi-experimental designs with two intervention conditions (intervention A/intervention B, or intervention/control group). With an independent samples t test, in addition to the criteria mentioned before (nominal level independent variable, interval/ratio dependent variable and a normally distributed dependent variable), the samples must also be randomly selected. That is, all cases have an equal chance of being selected for participation in the study. Importantly, the two samples do not have to be the same size. The degrees of freedom for an independent t test are calculated using the following formula: df = N1 + N2 − 2. The independent samples t test compares the means of the two samples, and a t value and p value are produced, which tell us if the two means are significantly different from each other. We can then either reject or fail to reject the null hypothesis.

In Kiran's task of looking at the effectiveness of Spectrum Group's three programs, she may choose to compare the effectiveness of one program over another. She could choose to determine which one-on-one ABA therapy location is most effective: in-home or in-centre. Assuming the children were randomly selected to participate in the in-home or in-centre program, and that those who participated in one program did not also participate in the other, the two groups would be independent of each other. Therefore, an independent t test would be used to test the following research hypothesis: Among children receiving one-on-one ABA therapy, those who receive their therapy in-centre will have higher scores on the VB-MAPP assessment than those who receive their therapy in-home.

First, we need to check to ensure that the criteria for the test have been met:
• Independent variable at the nominal level: therapy location (in-home, in-centre)
• Dependent variable at the interval/ratio level: score on the VB-MAPP assessment
• Dependent variable normally distributed: assumed
• Groups are independent of each other
• Random samples

Next, we calculate the independent samples t test.

Formula for the Independent t Test

t = (x̄1 − x̄2) / √[ ((SS1 + SS2) / (N1 + N2 − 2)) × (1/N1 + 1/N2) ]

Where:
x̄1 = The mean of group 1
x̄2 = The mean of group 2
SS1 = The sum of squares of group 1
SS2 = The sum of squares of group 2
N1 = The number of participants in group 1
N2 = The number of participants in group 2

Below are the results from the VB-MAPP assessments after six months of therapy. Each group (in-home and in-centre) had 10 children.

Table 9.1 Raw Data
Group 1: In-Home    Group 2: In-Centre
40                  40
30                  60
40                  60
20                  80
50                  30
70                  90
80                  60
40                  80
50                  70
60                  70

Step 1: Calculate x̄1 and x̄2. The mean of group 1 is x̄1 = 48 and the mean of group 2 is x̄2 = 64.

Step 2: Calculate SS1 and SS2 by doing the following for each case: subtract the mean from the given value, square the difference, then sum the squared values for all cases.

Table 9.2 Sums of Squares Calculations
SS1                              SS2
40 − 48 = −8, squared 64         40 − 64 = −24, squared 576
30 − 48 = −18, squared 324       60 − 64 = −4, squared 16
40 − 48 = −8, squared 64         60 − 64 = −4, squared 16
20 − 48 = −28, squared 784       80 − 64 = 16, squared 256
50 − 48 = 2, squared 4           30 − 64 = −34, squared 1156
70 − 48 = 22, squared 484        90 − 64 = 26, squared 676
80 − 48 = 32, squared 1024       60 − 64 = −4, squared 16
40 − 48 = −8, squared 64         80 − 64 = 16, squared 256
50 − 48 = 2, squared 4           70 − 64 = 6, squared 36
60 − 48 = 12, squared 144        70 − 64 = 6, squared 36
SS1 = 2960                       SS2 = 3040

Step 3: Plug the calculated values into the formula and calculate.
t = (48 − 64) / √[ ((2960 + 3040) / (10 + 10 − 2)) × (1/10 + 1/10) ]
  = −16 / √[ (6000 / 18) × 0.2 ]
  = −16 / √(333.33 × 0.2)
  = −16 / 8.16
t = −1.96

Degrees of freedom: df = N1 + N2 − 2 = 18

t = −1.96, df = 18, p > .05

Step 4: Use the t distribution table (Table 9.3) to look up the critical t score for the degrees of freedom and the p value you are using (.05). In this case, we are using a two-tailed test, as we do not have sufficient evidence to show that either in-home or in-centre therapy is superior, so the t critical value is 2.101. As the calculated t value (−1.96) does not exceed the critical value (±2.101), it does not fall in the rejection region, so Kiran fails to reject the null hypothesis that there is no difference between therapy locations, and she concludes that therapy location is not significantly related to scores on the VB-MAPP.

Table 9.3 Critical Values of t
Level of Significance for a One-Tailed Test:  .10     .05     .025     .01     .005     .0005
Level of Significance for a Two-Tailed Test:  .20     .10     .05      .02     .01      .001
df
1     3.078   6.314   12.710   31.821   63.657   636.619
2     1.886   2.920   4.303    6.965    9.925    31.598
3     1.638   2.353   3.182    4.541    5.841    12.941
4     1.533   2.132   2.776    3.747    4.604    8.610
5     1.476   2.015   2.571    3.365    4.032    6.859
6     1.440   1.943   2.447    3.143    3.707    5.959
7     1.415   1.895   2.365    2.998    3.499    5.405
8     1.397   1.860   2.306    2.896    3.355    5.041
9     1.383   1.833   2.262    2.821    3.250    4.781
10    1.372   1.812   2.228    2.764    3.169    4.587
11    1.363   1.796   2.201    2.718    3.106    4.437
12    1.356   1.782   2.179    2.681    3.055    4.318
13    1.350   1.771   2.160    2.650    3.012    4.221
14    1.345   1.761   2.145    2.624    2.977    4.140
15    1.341   1.753   2.131    2.602    2.947    4.073
16    1.337   1.746   2.120    2.583    2.921    4.015
17    1.333   1.740   2.110    2.567    2.898    3.965
18    1.330   1.734   2.101    2.552    2.878    3.922
19    1.328   1.729   2.093    2.539    2.861    3.883
20    1.325   1.725   2.086    2.528    2.845    3.850
21    1.323   1.721   2.080    2.518    2.831    3.819
22    1.321   1.717   2.074    2.508    2.819    3.792
23    1.319   1.714   2.069    2.500    2.807    3.767
24    1.318   1.711   2.064    2.492    2.797    3.745
25    1.316   1.708   2.060    2.485    2.787    3.725
26    1.315   1.706   2.056    2.479    2.779    3.707
27    1.314   1.703   2.052    2.473    2.771    3.690
28    1.313   1.701   2.048    2.467    2.763    3.674
29    1.311   1.699   2.045    2.462    2.756    3.659
30    1.310   1.697   2.042    2.457    2.750    3.646
40    1.303   1.684   2.021    2.423    2.704    3.551
60    1.296   1.671   2.000    2.390    2.660    3.460
120   1.289   1.658   1.980    2.358    2.617    3.373
Source: Fisher and Yates 1963
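The same pooled-variance calculation can be reproduced in a few lines of Python; the sketch below uses scipy (our tooling assumption) with the raw scores from Table 9.1:

```python
from scipy.stats import ttest_ind

in_home   = [40, 30, 40, 20, 50, 70, 80, 40, 50, 60]   # group 1, mean 48
in_centre = [40, 60, 60, 80, 30, 90, 60, 80, 70, 70]   # group 2, mean 64

# equal_var=True gives the pooled-variance (Student's) t test used in the text.
t_stat, p_value = ttest_ind(in_home, in_centre, equal_var=True)

print(round(t_stat, 2))   # about -1.96, matching the hand calculation
print(round(p_value, 3))  # roughly .07 for a two-tailed test, i.e. p > .05,
                          # so the null hypothesis is not rejected
```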
NON-PARAMETRIC ALTERNATIVE: MANN-WHITNEY U TEST
One of the most common non-parametric alternatives to the independent t test is the Mann-Whitney U test. This alternative test can be used when the dependent variable is 1) not at the interval/ratio level or 2) not normally distributed. As with its parametric counterpart, the samples do not have to be of equal size. The test compares how the ranked cases in one group fall above or below the ranked cases in the second group. For the purposes of this textbook, we will not demonstrate how to calculate this test, but rather only explain the theory behind it.

PRESENTING FINDINGS OF t TESTS
The proper way to report t test findings is the following:
For a significant test: t = ___, df = ___, p < ___.
For a non-significant test: t = ___, df = ___, p = ___.
Note that the t, df and p are always italicized, and there is a space both before and after the equal sign. Then summarize the numerical findings using a few concluding sentences for the reader. For example: An independent samples t test was conducted to evaluate the hypothesis that ___. The test was/was not significant (insert test results here: t(df) = ___, p =/< ___). This supports/does not support the research hypothesis. Participants in the ___ group (M = ___, SD = ___) on average scored higher/lower than participants in the ___ group (M = ___, SD = ___).

HOW TO USE A t TEST IN A PROGRAM EVALUATION
As stated previously, the gold standard for program evaluations is the classic experimental design, which compares an experimental group with a control group. Let's look at how the t test can be used in this type of program evaluation design.

A social work agency has been providing counselling to women survivors of abuse on an individual basis for several years with the goal of raising their self-esteem. Their service user list is over a hundred women and growing. Recently, to become more cost effective and treat more women at any one time, they have decided to begin offering the same service in groups. They are concerned that the group program may not be as effective as the individual one-to-one sessions. Justine and two colleagues at this agency have decided to run a pilot project to evaluate the results.

As a first step, Justine reviews the literature on the internet to find journal articles that describe research projects carried out with similar populations. She also needs to find out about the kinds of instruments that are used to measure the effectiveness of programs designed to raise the self-esteem of women survivors of abuse. She finds standardized tests commonly used in this type of research. One that seems to fit the requirements, the Rosenberg Self-Esteem Scale (SES) (Rosenberg 1989), is one of the most widely used measures of self-esteem. While this test was not specifically designed for women survivors of abuse, after a careful review of the items in the scale, the social work researchers agreed that it would measure what they were interested in.

As a next step, Justine designs the study to include two groups: the experimental group is made up of 10 women randomly selected from the population of service users to participate in the group treatment program, and the control group includes 10 randomly selected women who receive the usual one-to-one counselling. Both groups are asked to complete the SES at the start of the counselling program. This is to ensure that the two groups are similar in terms of their level on the SES. The experimental group participates in the group treatment program, with one group session per week for ten weeks, and the control group continues receiving one-to-one counselling. Once the programs finish, both groups again complete the SES. The research hypothesis is that the scores of the experimental group will be different than the scores of the control group. The null hypothesis is that there will be no difference between the two groups. Justine carries out the analysis using a t test for independent samples. Below are the results of her analysis.

Table 9.4 t Test Results Output
Group           Mean Scores   n    t       df   Probability
Experimental    23.6          10   2.287   18   .034
Control         20.6          10

Presentation of results: t = 2.29, df = 18, p < .05

Therefore, Justine rejected her null hypothesis and concluded that there is a difference between the scores of the experimental group and the control group, with the experimental group scoring significantly higher, lending support to the proposal to offer more group counselling. Not only will it be a cost saving for the organization, but it will also provide a more effective treatment for the women.
Box 9.1 Indigenous Parenting Practices
The following summary describes a study carried out on Indigenous parenting practices using the t test for independent samples.

Indigenous parenting practices are generally viewed as different than mainstream (Euro-Canadian) parenting. For instance, raising children is seen by Indigenous parents as a community responsibility, whereas the mainstream views parenting primarily as the responsibility of individual families (Northwest Indian Child Welfare Institute 1986). Furthermore, it has been recognized that the Canadian government's policy on assimilation has had a devastating effect on Indigenous communities (Shkilnyk 1985). So what has been the impact of assimilation on Indigenous parenting practices? This article presents the results of a study comparing Indigenous and mainstream parenting practices. The study attempted to answer the question: Are there significant differences between Indigenous and mainstream parenting?

To measure the difference between Indigenous and mainstream practices, the principal author developed a fifty-five-item paper-and-pencil questionnaire based on an Indigenous parenting program called Cherish the Children (Minnesota Indian Women's Resource Center 1988). A sub-group of items in this questionnaire called Family Life Skills looked at the role played by the extended family and elders in the care and supervision of children. A convenience sample of 102 Indigenous parents for this study was drawn from the First Nations communities situated in a semi-rural region of Northern Ontario west of Sudbury. For comparison purposes, a convenience sample of 60 mainstream parents who live in the same geographic region was also recruited. Both groups were asked to complete the questionnaire. A t test for independent samples was used to compare the difference between the two samples' results on the Family Life Skills questions. The table below shows the results of the analysis.

Subscale: Family Life Skills
Group                     N     Mean    t score   df    Prob.
Indigenous parents        102   40.42   2.13      160   .036
Non-Indigenous parents    60    37.88

Presentation of the results: t = 2.13, df = 160, p < .05

These results indicate that, even after years of assimilation, Indigenous parenting practices are significantly different than those of mainstream parents and that Indigenous parents continue to see parenting as a community responsibility.

Note: The term used in the original article was Native parenting. An acceptable term today is Indigenous. Because the samples were not randomly selected, the authors are limited in their ability to generalize the results to a wider population of Indigenous parents.
Source: van de Sande and Menzies 2003

ANOVA
ANOVA stands for analysis of variance. It is similar to t tests in the sense that it is used to compare means. It is also similar in that there is an independent variable at the nominal or ordinal level of measurement and a dependent variable at the interval or ratio level. One difference is that the ANOVA is used in situations where we are comparing three or more groups instead of two. That is, the independent variable has more than two levels. Another difference is that instead of the t value, the ANOVA produces an F ratio. If there is only one dependent variable, we refer to this test as a simple or one-way ANOVA. Where there is more than one dependent variable, we use the term MANOVA, which stands for a multivariate analysis of variance. With advanced statistical computer programs like SPSS, even the more complex ANOVAs are relatively simple to calculate.
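As a concrete illustration of what such software is doing, here is a minimal sketch of a one-way ANOVA in Python using scipy (our tooling assumption); the three sets of scores are hypothetical stand-ins for an experimental group, a comparison group and a waiting list:

```python
from scipy.stats import f_oneway

# Hypothetical self-esteem scores for three groups of service users.
group_counselling   = [24, 26, 22, 25, 27, 23, 24, 26, 25, 24]
individual_sessions = [21, 23, 20, 22, 24, 21, 22, 20, 23, 21]
waiting_list        = [18, 20, 19, 17, 21, 18, 20, 19, 18, 20]

f_ratio, p_value = f_oneway(group_counselling, individual_sessions, waiting_list)

# A p value below .05 indicates that at least one group mean differs
# significantly from the others.
print(round(f_ratio, 2), round(p_value, 4))
```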
The simple ANOVA compares the means of each group with each other as well as with the overall mean of all the groups combined. This is referred to as the between-group variance. It also looks at the amount of variability within each group. This is referred to as the within-group variance. To see how the simple ANOVA is used, let us consider the example of the program evaluation carried out by Justine. She compared the experimental group of service users who participated in group counselling with a control group of users who received one-to-one counselling. What if Justine wants to add a third group? Let's say that she adds a group of service users who are on the waiting list for counselling. Instead of doing multiple t tests comparing various combinations of two groups, she could instead carry out one test using the simple ANOVA. The results of the test would tell Justine if the differences between the three groups are the result of random error, in which case she would have to accept the null hypothesis, or, if the differences are great enough, she can reject the null hypothesis and conclude that there is a real, statistically significant difference.

SUMMARY
In this chapter, we presented a well-known parametric test called the t test. We explained that, since the t test is a parametric test, certain conditions must be met. While the independent variable should be at the nominal level, the dependent variable must be at the interval or ratio level of measurement and the results must be normally distributed. We stated that there are three variations of the t test: the one-sample t test, the paired t test and the independent samples t test, and we learned how to calculate the independent samples t test. We also introduced the ANOVA, which is used when we have three or more samples.

REVIEW QUESTIONS
1. Explain why the classic experimental design is considered the gold standard in research methodology. How might this be related back to what was covered in Chapter 2 on the history of empiricism?
2. Describe when you might use a non-parametric alternative to any of the three types of t tests discussed in this chapter.
3. After comparing a control group with an experimental group, the researcher finds p = .07. What does this mean in terms of statistical significance? What is the usefulness of these findings to the research?
4. Create a hypothetical example of when a social work researcher would use the ANOVA. Explain why they would use the ANOVA in this case instead of a t test.

10 Correlation Analysis

In this chapter, we examine studies where the data for both the independent and the dependent variables are at the interval or ratio level of measurement. The parametric test commonly used for this type of study is called the correlation coefficient, and it tests whether there is a linear relationship between the independent and the dependent variables. We explain that correlation analysis can determine whether the relationship is significant and illustrate how to use it to further social justice.
For instance, an article by Gerald Ogbuja (2012), Correlation between Poverty and Mental Health: Towards a Psychiatric Evaluation, reviews a number of epidemiological studies which identify poverty and socio-economic problems as some of the most important factors related to mental health issues. In a study by Silvernail, Sloan, Paul, etal. (2014: i) entitled The Relationship between School, Poverty and Student Achievement in Maine, the authors state that, as the level of poverty increases, the performance of students decline. While they admit that other factors affect school performance, the authors insist that the level of poverty is the “single best predictor of average student performance.” There are also cross-national studies which look at income inequality and health. In a report published by Inequality.org, an online journal that focuses on social justice issues, a comparison is made between countries with various levels of income inequality and infant mortality, both of which are ratio level variables. Figure 10.1, published by a U.K. organization called Equality Trust and cited in Inequality.org, plots the 22 richest countries and shows the ratio between the top 20% of income and the lowest 20%. It reveals that countries such as Japan, Finland and Norway have the lowest ratio (thus least income inequality) while countries such as Singapore, the United States and Portugal have the highest ratio (thus most income inequity). Statistics for Social Justice 112 Figure 10.1 Ratio of Top Income to Bottom Income (Average for years 2003-06) 9.7 Source: lnequality.org 2011 In Figure 10.2, these same countries are compared based on the rates of infant mortality, defined as the death of children during the first year oflife. It shows that some of the countries, like the U.S. and Portugal, also have the highest level ofinfant mortality. An exception is Singapore, which has the highest income inequality but the lowest infant mortality. Figure 10.3 displays the same data on a scatterplot that shows the levels of income inequality with infant mortality. Figure 10.4 is a scatterplot showing the same developed countries’ life expectancy compared to income inequality. Here again, we can see that the countries with the highest income inequality have the lowest life expectancy. Figure 10.2 Infant Mortality, 2005 (Deaths in first year of life per 1000 live births) 6.9 Source: Inequality.org 2011 Correlation Analysis 113 Figure 10.3 Scatterplot on Infant Mortality • 1 USA 6.5 £ 6.0 • New Zealand Ireland • â– 2 5.5 Denmark • -1 ° 5.0 t •'Netherlands —-'-"Germany • • France Italy • Spain • il 4.0 c w •& Swit2erland. AuStfia 45 Portugal • Canada Belgium • co • * UK Israel • Finland •* NorwaV 3.5 • Sweden • 2 3.0 Q Japan Singapore • 2.5 3.00 4.00 5.00 Income Source: 7.00 6.00 Inequality (Top 20% : 8.00 9.00 10.00 Bottom 20% Ratio) lnequality.org 2011 Figure 10.4 Scatterplot on Life Expectancy 82 • Japan 81 2 CO >- 80 c >. o 2 o „ Israel „ . Spain M Canada 79 • Norway t * •TBeJgiura^ France Oprmanu V 78 m • lta|y Austria < d» ^ Switzerland Greece < Netherlands Singapore • New Zealand .2 _i 77 Denmark • • 76 3.00 4.00 5.00 Income Source: 6.00 7.00 Portugal 8.00 9.00 10.00 Inequality (Top 20% : Bottom 20% Ratio) Inequality.org 2011 What Figures 10.1 to 10.4 reveal is that quantitative data at the ratio level can be displayed graphically to illustrate the impact of income inequality. 
Reports such as the one published by Inequality.org can be used by those who wish to argue for progressive social policy changes. An article published in the Toronto Star newspaper by Conservative Senator Hugh Segal (2011) states:

While all those Canadians who live beneath the poverty line are by no means associated with criminal activity, almost all those in Canada's prisons come from beneath the poverty line. Less than 10 per cent of Canadians live beneath the poverty line but almost 100 per cent of our prison inmates come from that 10 per cent. There is no political ideology, on the right or left, that would make the case that people living in poverty belong in jail.

OTHER USES OF THE CORRELATION COEFFICIENT
In addition to their use in social justice related studies, correlations are also used for a wide variety of purposes. For instance, authors developing standardized tests, like the Rosenberg Self-Esteem Scale (ses), mentioned in Chapter 9, can use them to establish the reliability and validity of their instrument. As mentioned in Chapter 1, reliability is defined as the degree to which the measurement instrument provides consistent results over time, and validity refers to the degree to which an instrument truly measures what it is supposed to measure, and not something else. The test-retest reliability scores for the ses range from r = .82 to r = .85, whereas the criterion validity score is r = .55. This shows that the reliability scores for the ses are quite good, while the validity score is acceptable (Rosenberg 1965).

SCATTERPLOTS
Earlier we stated that correlations are used to test whether there is a linear relationship between the independent and dependent variables. To refresh your memory on what is meant by a linear relationship, we suggest going back to Chapter 3, where we provide a description of scatterplots. Examples such as Figures 10.3 and 10.4 portray case values for two variables simultaneously. Each dot on a scatterplot represents the intersection between the two variables. In this way we can see if a relationship exists. If the dots form something close to a line, a relationship exists. If there is no pattern, then no relationship exists.

Friedrich Huebler (2005) provides the scatterplot in Figure 10.5, showing the relationship between poverty and education. Using data from the 2004 American Community Survey, Huebler states:

In the United States, the states with the highest poverty rates are also those with the lowest share of high school graduates. The graph below [Figure 10.5] plots the percent of the population living below the poverty level against the percent of the population above 25 years of age without complete high school education.

In other words, the higher the percentage of people who live below the poverty line, the higher the percentage of people who have not completed high school.

Figure 10.5 Poverty Line and Educational Achievement (x axis: percent of population below poverty level; y axis: percent of population over 25 without complete high school). Source: Huebler 2005

CORRELATION COEFFICIENT
The correlation coefficient is expressed as a continuum from -1 to +1, where -1 is a perfect negative relationship, +1 is a perfect positive relationship, and a score of 0, or close to 0, means that there is no relationship. The correlation coefficient describes two things: the strength and the direction of the linear relationship between two variables.
The closer the score is to 1.0 (or -1.0), the stronger the relationship. The positive or negative sign indicates the direction of the relationship. A line going up, from the lower left side to the upper right side of the scatterplot, indicates a positive direction. A line going down from the upper left side to the lower right side indicates a negative direction (Figure 10.6). If there is no pattern in the dots, it indicates a very weak relationship or none at all (Figure 10.7). Correlation coefficients equal to or greater than .80 are considered strong, those between .50 and .79 are considered moderate, and those of .49 or less are considered weak. Remember: a significant relationship does not imply causation. It will not tell us if the independent variable caused a change in the dependent variable. The statistic for correlation is Pearson's Product Moment Correlation, or simply Pearson's r.

Figure 10.6 Positive and Negative Correlation Examples

Figure 10.7 Weak or No Correlation Example

PREDICTOR AND OUTCOME VARIABLES
So far in this text, we have been using the terms independent and dependent variables, where the independent variable is the one that causes or contributes to the change in the outcome, which is called the dependent variable. We manipulate the independent variable to find out what effect this will have on the dependent variable. For instance, we may want to compare a new program or intervention approach to a control group or an existing approach to see if it will have the desired positive outcome on service users. The program or approach is the independent variable and the desired outcome is the dependent variable.

With correlations, the independent variable is referred to as the predictor variable and the dependent variable is called the outcome variable. These terms more accurately describe the role that these variables play. In the case of correlations, we are not looking at cause and effect relationships; instead, we are interested in finding out if the variables co-vary, that is, if the two variables are related in some way. If there is a strong relationship, either positive or negative, we may be able to look at the predictor variable and actually predict the effect it will have on the outcome variable.

MULTIPLE r's
Multiple r's are used to test the relationship between the outcome variable and three or more predictor variables. For instance, we may want to know about the relationship between results on a statistics exam and 1) the number of statistics courses taken, 2) the age of the student, 3) the grade point average and 4) scores on the math section of the sat.

Formula for Pearson's r Test:

r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}

Where:
r = correlation coefficient
n = number of cases
Σxy = sum of the xy column
Σx = sum of the x column
Σy = sum of the y column
Σx² = sum of the x² column
Σy² = sum of the y² column

Admittedly, this formula can appear to be a bit intimidating, but approached in a systematic, step-by-step manner, it is straightforward.
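To see that the formula amounts to nothing more than a few sums, here is a minimal Python sketch (illustrative, not from the original text) that computes r directly from the column totals. The totals used are the ones the worked example below arrives at in Table 10.2; the function name pearson_r is simply a label chosen for this sketch.

    # Pearson's r computed from column totals, using only the standard library.
    from math import sqrt

    def pearson_r(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
        numerator = n * sum_xy - sum_x * sum_y
        denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
        return numerator / denominator

    # Totals from Table 10.2: n = 10, sum x = 46, sum y = 590,
    # sum x^2 = 266, sum y^2 = 39300, sum xy = 3190.
    print(pearson_r(10, 46, 590, 266, 39300, 3190))   # prints roughly 0.96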
Let's look at the following hypothetical example. Sheila is a social worker offering assertiveness training for women. She was interested in finding out whether there is a relationship between the number of assertiveness training sessions service users participate in and their scores on a standardized scale measuring assertiveness. She believed that the more sessions service users participate in, the higher they would score on the assertiveness scale.

Sheila's test hypothesis is: There is a positive relationship between the number of sessions in assertiveness training that service users attend and their score on a standardized test measuring assertiveness. The corresponding null hypothesis is: There is no relationship between the number of sessions in assertiveness training that service users attend and their score on a standardized test measuring assertiveness.

She carries out the research, and her raw data are shown in Table 10.1. Table 10.2 contains the work necessary to obtain the values that need to be plugged into the formula, followed by the calculation itself.

Table 10.1 Raw Data
Service User   Sessions (x)   Score (y)
Mary           1              30
June           2              30
Lynn           3              40
Shelly         3              50
Liz            4              60
Carol          5              60
Debbie         5              60
Anne           7              90
Sue            8              80
Helen          8              90

Table 10.2 Plugging in the Numbers
x     y     x²     y²      xy
1     30    1      900     30
2     30    4      900     60
3     40    9      1600    120
3     50    9      2500    150
4     60    16     3600    240
5     60    25     3600    300
5     60    25     3600    300
7     90    49     8100    630
8     80    64     6400    640
8     90    64     8100    720
Totals: Σx = 46, Σy = 590, Σx² = 266, Σy² = 39300, Σxy = 3190

Solving the Formula for r

r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}
r = \frac{10(3190) - (46)(590)}{\sqrt{[10(266) - (46)^2][10(39300) - (590)^2]}}
r = \frac{31900 - 27140}{\sqrt{[2660 - 2116][393000 - 348100]}}
r = \frac{4760}{\sqrt{(544)(44900)}}
r = \frac{4760}{\sqrt{24425600}}
r = \frac{4760}{4942.23}
r = .96

Sheila's calculations result in an r score of .96, which is very close to a perfect positive score of 1.0. Therefore, Sheila is confident in concluding that there is a strong positive relationship between the number of sessions in assertiveness training a service user attends and their score on a standardized scale measuring assertiveness.

The next question is whether this score is significant. Looking at Table 10.3, showing the critical values of r, with an N (number of participants) of 10, and at the .05 level of significance for a one-tailed test, Sheila needs an r score of .5494 to reject the null hypothesis. Since an r score of .96 is well above the required .5494, she is certainly safe in accepting her research hypothesis that there is a strong positive relationship between the number of sessions in assertiveness training that service users attend and their score on a standardized test measuring assertiveness.
Table 10.3 Critical Values of r

      Level of Significance for a One-Tailed Test
N     .05      .025     .01      .005     .0005
      Level of Significance for a Two-Tailed Test
      .10      .05      .02      .01      .001
5     .8054    .8783    .9343    .9587    .9912
6     .7293    .8114    .8822    .9172    .9741
7     .6694    .7545    .8329    .8745    .9507
8     .6215    .7067    .7887    .8343    .9249
9     .5822    .6664    .7498    .7977    .8982
10    .5494    .6319    .7155    .7646    .8721
11    .5214    .6021    .6851    .7348    .8471
12    .4973    .5760    .6581    .7079    .8233
13    .4762    .5529    .6339    .6835    .8010
14    .4575    .5324    .6120    .6614    .7800
15    .4409    .5139    .5923    .6411    .7603
16    .4259    .4973    .5742    .6226    .7420
17    .4124    .4821    .5577    .6055    .7246
18    .4000    .4683    .5425    .5897    .7084
19    .3887    .4555    .5285    .5751    .6932
20    .3783    .4438    .5155    .5614    .6787
21    .3687    .4329    .5034    .5487    .6652
22    .3598    .4227    .4921    .5368    .6524
27    .3233    .3809    .4451    .4869    .5974
32    .2960    .3494    .4093    .4487    .5541
37    .2746    .3246    .3810    .4182    .5189
42    .2573    .3044    .3578    .3932    .4896
47    .2428    .2875    .3384    .3721    .4648
52    .2306    .2732    .3218    .3541    .4433
62    .2108    .2500    .2948    .3248    .4078
72    .1954    .2319    .2737    .3017    .3799
82    .1829    .2172    .2565    .2830    .3568
92    .1726    .2050    .2422    .2673    .3375
102   .1638    .1946    .2301    .2540    .3211
Source: Fisher and Yates 1963

Box 10.1 Use of Multiple r's in a Study on the Relationship between Poverty and Childhood Depression

The relation between low socio-economic status (ses) and depression has been well-documented in adult populations. A number of studies suggest that family ses may be associated with depression among children and adolescents as well, although the evidence is mixed. We assessed the relation between family income and depressive symptoms among 457 children aged 11-13 years old and examined pathways that may explain this relation. In-person interviews of children and their caregivers were conducted, including assessment of family income and administration of the Computer-based Diagnostic Interview Schedule for Children (c-disc). Family income was significantly associated with depressive symptoms, with children in the lowest income group (<$35,000) reporting a mean of 8.12 symptoms compared to 6.27 symptoms in the middle income group ($35,000-$74,999) and 5.13 symptoms in the highest income group (>$75,000; p<0.001). Controlling for the number of stressful life events experienced in the past six months attenuated the effect of low family income on depressive symptoms by 28%. Indicators of the family environment explained 45%, and neighbourhood median household income and aggravated assault rate explained 12%, of the relation. The family environment, including parental divorce or separation and perceived parental support, appears to explain most of the relation between low family income and childhood depressive symptoms. Further exploration of the pathways between family ses and depression may suggest potential interventions to reduce the occurrence and persistence of depressive symptoms in children.
Source: Tracy, Zimmerman, Stoep, et al. 2008

SUMMARY
In this chapter, we described another well-known parametric test, the correlation coefficient. We explained that correlations are used to compare two variables at the interval/ratio level of measurement. We showed that a scatterplot can be used to visually portray whether these two variables co-vary, in the sense that they form a linear relationship. We explained how to use a correlation analysis to test whether this linear relationship is significant and noted that correlations are often used in large population studies regarding important social justice issues.
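Before turning to the review questions, readers who would rather let software do the arithmetic can reproduce Sheila's result from the raw data in Table 10.1. This sketch assumes the SciPy library is installed; note that pearsonr reports a two-tailed p value, so significance can be judged directly rather than from a critical values table.

    # Verifying Sheila's hand calculation with SciPy.
    from scipy import stats

    sessions = [1, 2, 3, 3, 4, 5, 5, 7, 8, 8]            # x, from Table 10.1
    scores   = [30, 30, 40, 50, 60, 60, 60, 90, 80, 90]  # y, from Table 10.1

    r, p_two_tailed = stats.pearsonr(sessions, scores)
    print(round(r, 2))           # roughly 0.96, matching the hand calculation
    print(p_two_tailed < 0.05)   # True: significant even on the stricter two-tailed test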
REVIEW QUESTIONS
1. Explain the benefits of using a scatterplot to represent data in which both the predictor and outcome variables are at the interval/ratio level of measurement.
2. Describe how scatterplots can be used to advocate for social change.
3. Create a hypothetical example of a research study for which a correlation analysis would be appropriate. How might this study be tweaked in order to accommodate another predictor variable?
4. A researcher calculates a Pearson's r correlation and finds an r value of +.91. What does this tell us?
5. What if a researcher calculates an r value of -.80 but the r critical value needed is -.92? What does this tell us? Can this still be useful?

Simple Regression Analysis

In this chapter we show that, if there is a strong and significant relationship between two variables, we can use a simple regression to predict the score of the outcome variable based on the score of the predictor variable. We introduce the concepts of the regression line and the least-squares criterion and show how to calculate the least-squares regression equation.

Prediction refers to knowing (without measuring) what a case's value, or score, on an outcome variable is likely to be. Predicting an outcome based on one or more variables is used extensively in several fields of science. Two familiar examples are predicting the weather and predicting the rate of climate change. This method is also used for predicting economic growth and predicting consumer demand for commodities such as gasoline or new housing. In social work, we are interested in demographic changes, so we might want to predict the proportion of elderly people in the population at some future date or the increase in the proportion of Aboriginal people. We are regularly called upon to predict the success of intervention programs or the likelihood of child abuse and neglect based on certain risk factors (see Box 11.1). Child psychologists have been able to make predictions of the occurrence of depression among adolescents based on their self-image:

We investigated the ability of a measure of self-image, two measures of depression, and demographic characteristics to predict the outcome of depressive symptoms. Subjects were 47 adolescents who were referred to outpatient treatment for depression. Subjects were assessed for depressive symptoms at three time periods. (Fine, Haley, Gilbert and Forth 1993)

Being able to predict the impact of social policy decisions also has important implications for policy analysts and government officials. Let's consider the following hypothetical example of a study of the relationship between the number of child protection cases and the level of income support from social assistance programs. Estelle Young is a social worker working with an ngo called Poverty Ends Now. She is interested in the relationship between poverty and child abuse and neglect. She looks at a number of studies in the literature that find a positive relationship. She examined the number of child protection cases in her province, along with the changes in social assistance, over a twenty-year period. While there was not a perfect correlation, given the number of factors related to child protection cases, she nevertheless finds that there is a negative correlation between social assistance levels and child protection cases: as social assistance amounts available to families drop, the number of child protection cases increases.
She carries out the statistical analysis and is able to identify a simple regression model that will predict the number of child protection cases based on a specific level of social assistance. Acknowledging that governments have been cutting costs by reducing social assistance rates for families, she believes that it is important for governments, and the population in general, to understand that there are consequences to some of these cost-cutting measures. The savings realized by cutting social assistance will be reduced by the amount of money needed by child protection authorities to deal with the increased workload.

HOW DOES SIMPLE LINEAR REGRESSION WORK
As explained in Chapters 3 and 10, scatterplots are graphs that visually display the relationship between two variables at the interval or ratio level of measurement. The points along the x axis represent the scores on the predictor variable and the points along the y axis represent the scores on the outcome variable, also called the criterion variable. Each point on the graph represents the location where these variables converge for each case (or participant). If the points form something close to a line, we can say that there is a relationship. If there is a perfect relationship, a straight line can be drawn that would touch every point. If there is less than a perfect relationship, we could still draw a straight line through the points, but it would not touch every point. The line that comes closest to all the points is called the regression line.

Figure 11.1 shows a scatterplot comparing experience and income, with experience being the predictor variable, or variable x, and income being the outcome variable, or variable y. The dots on the graph form a rough line, indicating that there is a strong relationship between these two variables. The line in the graph is the regression line. If we want to predict a level of income and we only have the score on years of experience, we would choose a score on income that sits along this line. The term "regression" also evokes "regression toward the mean," and it suggests that for each score on experience, the income score will likely fall along this regression line. If we were to measure the distance from each dot to the line, we would be calculating the deviations. If we squared these deviations and totalled them, this total would be different depending on where we placed the line amongst the dots. The line with the lowest total squared deviations, called the least-squares criterion, would be the best regression line, or the line of best fit. Fortunately, we do not need to calculate all of the deviations for each possible line. Instead, by using the least-squares equation, we can easily identify the best regression line.

Figure 11.1 Scatterplot Showing the Relationship between Experience and Income (x axis: years of experience; y axis: income)

Stating the Research Question
The research question for simple regressions is not phrased in the form of a hypothesis but rather is worded as follows: "How does knowing a value of the predictor variable improve the prediction on the outcome variable?"

Limits of a Simple Regression
There are some limitations in terms of what we can say using simple regressions. For instance, we cannot make predictions using values for the predictor that are larger than the largest value or smaller than the smallest value used in the computation of the equation.
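This caution about staying inside the observed range of the predictor can be built directly into an analysis. The sketch below is illustrative only: it assumes NumPy is installed, the experience and income figures are made up, and predict_income is a hypothetical helper written for this example, not a function from the text.

    # Fitting a least-squares line and refusing to extrapolate beyond the data.
    import numpy as np

    experience = np.array([2, 5, 8, 12, 15, 20, 25, 30])     # hypothetical years
    income     = np.array([31, 38, 44, 52, 58, 67, 74, 83])  # hypothetical $000s

    slope, intercept = np.polyfit(experience, income, deg=1)

    def predict_income(years):
        if not (experience.min() <= years <= experience.max()):
            raise ValueError("prediction requested outside the observed range")
        return intercept + slope * years

    print(predict_income(10))   # fine: 10 lies between the observed 2 and 30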
Let's refer back to the example from Chapter 10, on correlations. Having established that there is a strong correlation between weeks in an assertiveness training program and scores on the assertiveness scale, Sheila is interested in being able to predict scores on assertiveness based on the number of weeks in the program. Using her data set (Table 11.1), she is able to create a model that allows her to make this prediction. The equation that Sheila uses to create her model is called the least-squares regression equation.

Table 11.1 Raw Data and Preliminary Calculations
Client    Weeks (x)   Score (y)
Mary      1           30
June      2           30
Lynn      3           40
Shelly    3           50
Liz       4           60
Carol     5           60
Debbie    5           60
Anne      7           90
Sue       8           80
Helen     8           90
Totals: Σx = 46, Σy = 590, Σx² = 266, Σy² = 39300, Σxy = 3190

Least-Squares Regression Equation

y' = a + b(x)

Where:
y' = the predicted y value for a particular x value
a = the point where the regression line would intersect the y axis
b = the slope of the line, where the amount of change in y is directly related to the amount of change in x
x = a selected value of the predictor variable used to predict the value of the outcome variable

Box 11.1 Abstract of a Study on Child Abuse Risk Factors
Secondary analyses of the 1998 Canadian Incidence Study of Reported Child Maltreatment were carried out to investigate the effect of caregiver vulnerabilities on the substantiation of child abuse and neglect. Analyses were done of (1) demographic factors, socio-economic disadvantage, and caregivers' history of abuse; (2) caregiver vulnerability factors; (3) involvement in partner violence; and (4) the interaction between caregiver vulnerability and partner violence. Results showed that the total number of caregiver vulnerabilities was the best predictor of the substantiation of child abuse and neglect. Caregiver substance abuse was the single most important caregiver vulnerability in predicting maltreatment substantiation. High caregiver vulnerability and high partner violence increased the likelihood that maltreatment would be substantiated.
Source: Wekerle, Wall, Leung and Troane 2007

THE REGRESSION COEFFICIENT
The regression coefficient is shown as (b) in the least-squares regression equation and represents the slope of the regression line. As explained by Weinbach and Grinnell (2010), the slope can be conceived of in the same way as the slope of a hill and represents the amount of vertical change in the outcome (y) variable for the corresponding horizontal change in the predictor (x) variable.

Regression Coefficient or Slope Formula

b = \frac{N\sum xy - (\sum x)(\sum y)}{N\sum x^2 - (\sum x)^2}

Where:
b = slope
N = number of cases
Σxy = sum of the xy column
Σx² = sum of the x² column
Σy² = sum of the y² column
Σx = sum of the x column
Σy = sum of the y column

b = \frac{10(3190) - (46)(590)}{10(266) - (46)^2}
b = \frac{31900 - 27140}{2660 - 2116}
b = \frac{4760}{544}
b = 8.75

THE y INTERCEPT FORMULA
After calculating the slope (b), the next step for Sheila is the calculation of the y intercept, which is the (a) in the regression formula. The y intercept is the point at which the regression line crosses the y axis. A word of caution: Sheila will not be able to predict the outcome for a client whose score falls below the lowest score in her original sample, which was 1 week. Obviously, the score for someone who attended for 0 weeks would be 0. However, if the lowest score in Sheila's sample had been 5 weeks, Sheila would not be able to make predictions for clients who attended for only 4 weeks or less.
Intercept Formula

a = \bar{y} - b\bar{x}

Where:
a = the y intercept
\bar{y} = the mean of the y column
b\bar{x} = the slope times the mean of the x column

a = 59 - 8.75(4.6)
a = 59 - 40.25
a = 18.75

Now that Sheila has calculated the slope and the y intercept, the least-squares regression equation for this example is y' = 18.75 + 8.75(x). This formula is our prediction model. We can use this model to predict any score on the criterion or outcome variable based on the score on the predictor variable. For instance, if we have a score of 6 on the predictor variable, we could carry out the following calculation:

y' = 18.75 + 8.75(6) = 71.25

The criterion or outcome score would be 71.25. This is the score on the assertiveness scale that we would predict an individual to achieve after 6 weeks in the program.

STANDARD ERROR
Obviously, if there were a perfect correlation, that is, an r value of either +1.0 or -1.0, the predictor variable would be a perfect predictor of the outcome variable. In this case, there would be no error. Since there is rarely a perfect correlation between the predictor and outcome variables, it would be helpful to have an indication of how well the regression equation will predict the outcome variable based on the predictor variable. If the r value is 0.0, which suggests that there is no relationship, then the regression equation would be of no use. The standard error provides this indication. The closer the r value is to either +1.0 or -1.0, the closer the standard error is to 0.0, and the more confident we can be that the regression equation will accurately predict the value of the outcome variable. However, the closer the standard error is to 1.0, the less confident we can be in our prediction.

EXCHANGING THE x AND y VARIABLES
In our example of the correlation between experience and income, we were interested in predicting income (the outcome variable) based on level of experience (the predictor variable). However, if instead we were interested in predicting years of experience based on level of income, we could conceivably exchange the variables, with income becoming the predictor variable and experience becoming the outcome variable. If there is a strong r value, indicating a strong relationship, it does not really matter which variable is identified as the predictor variable and which is identified as the outcome variable. The only difference is that we would have a slightly different regression equation, and so we must be careful to enter the data to reflect this change.

Table 11.2 Raw Data and Preliminary Calculations
Name of Client   Years of Service (x)   Job Satisfaction Score (y)
Sue              1                      8
Jane             6                      2
Jim              8                      2
Helen            3                      4
Sarah            4                      5
Ed               2                      9
Simone           5                      3
Mary             7                      2
Pauline          9                      2
John             4                      4
Totals: Σx = 49, Σy = 41, Σx² = 301, Σy² = 227, Σxy = 149

Slope

b = \frac{N\sum xy - (\sum x)(\sum y)}{N\sum x^2 - (\sum x)^2}

where, as before, N is the number of cases and Σxy, Σx, Σy and Σx² are the column totals, here taken from Table 11.2.

b = \frac{10(149) - (49)(41)}{10(301) - (49)^2}
b = \frac{1490 - 2009}{3010 - 2401}
b = \frac{-519}{609}
b = -.852

a = \bar{y} - b\bar{x}
a = 4.1 - (-.852)(4.9)
a = 4.1 + 4.17
a = 8.27

Now that we have calculated the slope and the y intercept, the least-squares regression equation is y' = 8.27 + (-.852)(x).
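Readers who want to check both sets of hand calculations by computer can do so in a few lines. The sketch below assumes NumPy is installed; because NumPy carries full precision, the second intercept comes out as 8.28 rather than the 8.27 obtained above by rounding the slope to -.852 before multiplying.

    # Reproducing both least-squares regressions with NumPy.
    import numpy as np

    # Sheila's data (Table 11.1): weeks of training (x) and assertiveness score (y).
    weeks  = [1, 2, 3, 3, 4, 5, 5, 7, 8, 8]
    scores = [30, 30, 40, 50, 60, 60, 60, 90, 80, 90]
    b, a = np.polyfit(weeks, scores, deg=1)
    print(round(b, 2), round(a, 2))     # 8.75 and 18.75, as calculated above

    # Years of service (x) and job satisfaction scores (y) from Table 11.2.
    years        = [1, 6, 8, 3, 4, 2, 5, 7, 9, 4]
    satisfaction = [8, 2, 2, 4, 5, 9, 3, 2, 2, 4]
    b2, a2 = np.polyfit(years, satisfaction, deg=1)
    print(round(b2, 3), round(a2, 2))   # -0.852 and 8.28 (8.27 above reflects rounding)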
SUMMARY
As a follow-up to Chapter 10, on correlations, this chapter explained that, if there is a significant relationship between two variables at the interval/ratio level of measurement, we can use a simple regression to predict what the score on the outcome variable will be based on the score on the predictor variable. We introduced the concepts of the regression line, the regression coefficient and the least-squares criterion. We showed that being able to predict the impact of social policy decisions can be very useful to social work researchers and advocates and also has important implications for policy analysts and government officials.

REVIEW QUESTIONS
1. Provide a hypothetical example of how a structural researcher might use regression analysis in order to advocate for social change.
2. Discuss the dangers of using regression analyses to make predictions. Why must we be cautious? What must we consider before making any predictions?
3. Explain the relationship between correlation analysis, discussed in Chapter 10, and simple regression analysis. What can the latter do that the former cannot?
4. Explain why it is important to first ensure that there is indeed a correlation between two variables before using a simple regression analysis to make predictions.
5. What does it mean if the "b" within the regression equation is negative? Do a quick sketch of a scatterplot to demonstrate what this would look like graphically.

12 Writing a Research Report

In this last chapter, we focus on the preparation of the final research report. We explain the importance of following the more mainstream writing style using the third person passive. We provide an outline of what should be included in a traditional academic research report and describe in some detail what each section of the research report should include.

In the previous chapters, we covered the basic principles of descriptive and inferential statistics as well as the more common statistical tests used to determine if a true relationship exists among variables. Whether it is to identify and describe important social justice issues, such as poverty and homelessness, or to evaluate a program using an experimental or quasi-experimental design, this book provides enough basic knowledge to carry out a variety of quantitative research. As budgets for social programs continue to shrink and as smaller non-governmental organizations struggle to provide essential services to the most vulnerable people in our society, social workers are increasingly called upon to use their research skills to demonstrate the effectiveness of these programs and services. While qualitative methods are gaining more widespread acceptance, they will not replace quantitative methods as the approach required by many government departments and funding bodies.

Once the data collection and analysis have been completed, the final step involves writing the research report. Traditional scientific writing continues to follow specific guidelines; this was the case in social work writing as well, though it is less so today. Those who took social work research courses before the 1980s were told that research reports had to be written in the third person passive. In other words, they were required to use phrases such as "It was found that..." rather than "We found that..." This was to maintain the appearance of objectivity.

As structural social work researchers, our main concern is that the report be accessible to those people we serve. While we want to ensure that our research is useful to service users, we are often obliged to write the final report following a traditional academic format. Regardless of the style, we need to keep in mind that the ultimate goal of our research and research report is structural change.
Increasingly, social work researchers are finding that they have to write two reports, a more formal one for an academic and/or government audience, and another more accessible one for service users. Our graduate students conducting research for community organizations are often asked to prepare a PowerPoint presentation in plain non-academic language for community members in addition to the formal report for the agency requesting the research.

THE WRITING PROCESS
Christine Marlow (2005) offers the following useful suggestions to make the writing process as straightforward as possible. These steps may seem like common sense, but following them can save much time later on.
1. Keep a log for ideas and decisions taken throughout the research process.
2. Prepare an outline; the more detail it has, the easier the final report will be to write.
3. Write a first draft. This is often the hardest part. Revise several times if necessary.
4. Ask a colleague to proofread the draft and don't be afraid of criticism.
5. Have someone proofread the final copy. There are usually mistakes that the writer misses.

SECTIONS IN THE ACADEMIC FORMAT
The traditional scientific/academic format normally includes the following sections:
• Title page
• Abstract
• Introduction
• Literature Review
• Methodology
• Findings
• Discussion
• Conclusion
• List of References (also called the bibliography)
• Appendices

The following summary of what each section should cover is provided by van de Sande and Schwartz (2011).

Title Page
The title page should provide the reader with enough information without being overly wordy (Rubin and Babbie 2008). In addition to the usual details such as the date and the list of authors with their degrees, try to think of a title that will catch the reader's interest. However, as Rubin and Babbie point out, with a formal academic report, it is important to maintain credibility and not come across as unscholarly. Achieving the right balance can, at times, be a challenge, and it is often helpful to seek feedback from colleagues.

Abstract
The abstract is a brief summary of the study. Most abstracts are between 150 and 200 words, although some journals ask for shorter ones. The abstract should begin with very basic information on the purpose of the study, the research question and the central thesis. It should have a sentence or two on the research design or methodology, and conclude with a couple of sentences on the major findings. If the plan is to present the research at an academic or professional conference, the abstract is what is submitted to the conference organizers.

Introduction
The introduction is the first section of the main body of the report. It should provide the reader with some background information concerning the study. It should also describe the issue being investigated, the goals of the research and the specific research question or questions. Next, it should provide a description of the theory that has informed the research. If the study follows a traditional empirical design, include the hypothesis and the conceptualization and operationalization of the variables. The conceptualization of the variables provides the reader with clear definitions of each of the variables, including the independent and the dependent variables. The operationalization tells the reader how these variables will be measured.
Literature Review
The literature review provides the reader with a summary of what has been written on the topic under investigation. It is a systematic examination and assessment of the publications available on the topic of the study. It looks at what is known about the topic and what gaps still exist. If there are studies mentioned in the literature which give contradictory views, provide a brief overview of the various sides of the question. If you disagree with what the literature says, provide an argument for your position. The literature review should conclude with a statement of the gaps in knowledge and how your research attempts to fill these gaps.

Methodology
The methodology section of a traditional academic report should begin with a description of the selection of participants, also called the sample selection. Explain how the participants were contacted and whether a probability sampling method or a non-probability sampling method was used. The next part of the methodology section deals with the administration of the measurement instruments used in the study and the method of analysis. If a standardized measurement instrument was used, include the reliability and validity scores of the instrument. Finally, identify which statistical test was used and whether it was a parametric test or a non-parametric test.

Findings
The findings section in a traditional research report provides the reader with a clear and concise description of the results of the study. With a quantitative design, this section would normally include tables and graphs that visually describe the results. State whether the results obtained were statistically significant and whether or not they supported the research hypothesis.

Discussion and Conclusion
In the discussion and conclusion section, the results of the study are related back to the literature review. Were the results supported or contradicted by the literature? What are the implications of the research for social work practice and/or social policy? In what way have the results added to our knowledge base? The discussion should also include a subsection on limitations. Describe any problems with the study. For example, perhaps the response rate from participants was low, resulting in some sample bias, or there were difficulties with the administration of the instrument, resulting in some design error. The discussion and conclusion section should end with suggestions for further research.

List of References
The in-text citations and the list of references for most social science reports follow the apa (American Psychological Association) method of referencing. The list of references must provide complete and accurate information on all sources cited in the report.

Appendices
The appendices section of the report should include, as a minimum, a copy of the measurement instrument used and copies of ethical consent letters. Many researchers include additional tables on results that were not included in the findings section but may still be useful to the readers. Each appendix should be clearly labeled using letters (A, B, C, etc.).

Example of a Study Written Using a Traditional Format
Appendix B is an example of a research report, which is based on actual research carried out by Julie Shaw, one of our graduate students.
SUMMARY
In this last chapter, we discussed the need to write the research report along traditional lines, using the third person passive, which is still the expectation of most academic disciplines. We offered some suggestions on how to make the writing process as straightforward as possible. The rest of the chapter focused on the traditional format of an academic research report. While many government and funding bodies still require an academic format, our first concern is meeting the needs of the people we serve and using our research to promote social justice. This may involve writing more than one report: a more traditional one for an audience made up of academic, government or other funding bodies, and a non-traditional, user-friendly report for a lay audience of service users.

REVIEW QUESTIONS
1. What are some of the challenges in writing an academic research report for a lay audience?
2. How important is it to maintain academic rigour in a report prepared for a mainstream audience?
3. How do you feel about writing a research report in the third person passive versus the first person active?