CODING AND CONTENT ANALYSIS
© LOUIS COHEN, LAWRENCE MANION & KEITH MORRISON

STRUCTURE OF THE CHAPTER
• Coding
• What is content analysis?
• How does content analysis work?
• A worked example of content analysis
• Reliability in content analysis

CONTENT ANALYSIS
• Data reduction is a major issue in qualitative data analysis.
• Content analysis reduces text to fewer categories.
• Categories may be pre-ordinate (decided in advance) or responsive (emerging from the data themselves).

CODING
• A code is a name or label that the researcher gives to a piece of text that contains an idea or a piece of information.
• Coding is the translation of question responses and respondent information to specific categories for the purpose of analysis.
• Coding is the ascription of a category label to a piece of data, the process of breaking down segments of text data into smaller units (based on whatever criteria are relevant), and then examining, comparing, conceptualizing and categorizing the data.
• The same piece of text may have more than one code ascribed to it, depending on the richness and contents of that piece of text.
• Coding enables the researcher to identify similar information.
• Coding enables the researcher to search and retrieve the data in terms of those items that bear the same code.
• Codes can be at different levels of specificity and generality when defining content and concepts.
• Some codes subsume others, creating a hierarchy of subordination and superordination that can be drawn as a tree diagram of codes.
• A code is a word or abbreviation sufficiently close to that which it is describing for the researcher to see at a glance what it means.
• Codes are frequently abbreviations.
• Codes should be kept as discrete as possible.
• Coding should start earlier rather than later.
• Coding involves iteration and reiteration to ensure comprehensiveness and consistency of coding.
• The researcher goes through the data systematically, typically line by line, and writes a descriptive code beside each relevant piece of data.

TYPES OF CODE
• Open code
  – A new label that the researcher attaches to a piece of text to describe and categorize that piece of text, line-by-line, phrase-by-phrase, sentence-by-sentence, paragraph-by-paragraph, or unit-of-text-by-unit-of-text.
• Analytic code
  – Interpretive and explanatory.
• Axial code
  – A category label ascribed to a group of open codes whose referents (the phenomena being described) are similar in meaning.
  – Connects related codes and subcategories into a larger category of common meaning.
• Selective code
  – Similar to an axial code, but at a greater level of abstraction.
  – Identifies the core category/ies of text data, integrating them to form a theory.
  – A core category is that central category or phenomenon around which all the other categories identified and created are integrated, to which other categories are systematically related, and by which it is validated.

WORKING WITH CODES
• Once codes have been assigned, ordered, and grouped, they can be structured into hierarchies of subsumption.
• Lower order codes (e.g. descriptive codes) are subsumed under analytic and axial codes, which, in turn, are subsumed under a selective code.
• Keep hierarchies ‘shallow’ (not too many levels).
• Take care with coding, as there is a risk of losing temporality, context and sequence in the coding and retrieval of text (the researcher may prefer to write a narrative account).

WHAT IS CONTENT ANALYSIS?
• The process of summarizing and reporting written data: the main contents of data and their messages.
• Content analysis defines a rule-governed, strict and systematic set of procedures for the rigorous analysis, examination and verification of the contents of written data.
• Content analysis reduces text to summary form and interrogates it, using both pre-existing categories and emergent themes, in order to generate or test a theory.
• Content analysis can yield frequencies (quantitizing text).

NARRATIVE AND BIOGRAPHICAL APPROACHES TO DATA ANALYSIS
• Narratives and biographies are selective, based on:
  – Key decision points in the story or narrative
  – Key, critical (or meaningful to the participants) events
  – Themes
  – Behaviours
  – Actions
  – People
  – Key experiences
  – Key places

SYSTEMATIC APPROACHES TO DATA ANALYSIS
• Comparing different groups simultaneously and over time
• Matching the responses given in interviews to observed behaviour
• Analyzing deviant and negative cases
• Calculating frequencies of occurrences and responses
• Assembling and providing sufficient data, keeping raw data separate from analysis

HOW DOES CONTENT ANALYSIS WORK?
• Numerical content analysis:
  – Define the units of analysis (e.g. words, sentences) and the categories to be used for analysis.
  – Code the texts and place them into categories.
  – Count and log the occurrences of words, codes and categories.
  – Apply statistical analysis and quantitative methods and interpret the results.
• Non-numerical content analysis:
  – Code and categorize data.
  – Compare categories and make links between them.
  – Conclude: draw theoretical conclusions from the text.

STEPS IN CONTENT ANALYSIS
Step One: Define the research questions to be addressed by the content analysis.
Step Two: Define the population from which units of text are to be sampled.
Step Three: Define the sample to be included.
Step Four: Define the context of the generation of the document.
Step Five: Define the units of analysis.
Step Six: Decide the codes to be used in the analysis.
Step Seven: Construct the categories for analysis.
Step Eight: Conduct the coding and categorizing of the data.
Step Nine: Conduct the data analysis.
Step Ten: Summarize.
Step Eleven: Make speculative inferences.

RELIABILITY IN CONTENT ANALYSIS
• Witting evidence (intended to be imparted) and unwitting evidence (what is inferred and unintended);
• The text may have been written for a very different purpose from that of the research; the researcher will need to know or be able to infer the intentions of the text;
• The documents may be limited, selective, partial, biased, non-neutral and incomplete because they were intended for a purpose other than research;
• It may be difficult to infer the direction of causality in the texts;
• Texts may not be corroborated or able to be corroborated;
• Classification of text may be inconsistent;
• Words are inherently ambiguous and polyvalent;
• Coding and categorizing may lose the nuanced richness of specific words and their connotations;
• Category definitions and themes may be ambiguous, as they are inferential;
• Some words may be included in the same category but may have more/less significance in that category;
• Words in a category may have different connotations and their usage may be more nuanced than the categories recognize;
• Categories may reflect the researcher’s agenda and imposition of meaning more than the text may sustain or the producers of the text may have intended;
• Aggregation may compromise reliability;
• A document may deliberately exclude something from mention, overstate an issue or understate an issue.
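
The numerical content analysis steps set out earlier in the chapter (define the units of analysis, code the texts, count and log occurrences of codes and categories) can be sketched in Python. This is only a minimal illustration under stated assumptions, not the authors' procedure: the codebook, the keyword lists, the two-level hierarchy, the sample text and every function name are hypothetical, and simple keyword matching stands in for the researcher's interpretive judgement.

```python
import re
from collections import Counter

# Hypothetical codebook (assumption): code label -> keywords that trigger it.
CODEBOOK = {
    "WORKLOAD": {"workload", "marking", "overtime"},
    "SUPPORT": {"support", "mentor", "help"},
    "STRESS": {"stress", "stressed", "tired"},
}

# Hypothetical shallow hierarchy (assumption): lower order code -> category.
HIERARCHY = {
    "WORKLOAD": "PRESSURES",
    "STRESS": "PRESSURES",
    "SUPPORT": "RESOURCES",
}


def split_units(text):
    """Split text into sentences, the unit of analysis in this sketch."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def code_unit(unit):
    """Return every code whose keywords occur in the unit.

    The same unit may receive more than one code, or none at all.
    """
    words = set(re.findall(r"[a-z']+", unit.lower()))
    return sorted(code for code, keys in CODEBOOK.items() if words & keys)


def analyse(text):
    """Code each unit, then count code and category occurrences."""
    coded = [(unit, code_unit(unit)) for unit in split_units(text)]
    code_counts = Counter(code for _, codes in coded for code in codes)
    # Category totals count code assignments, so a unit carrying two codes
    # from the same category contributes twice.
    category_counts = Counter()
    for code, n in code_counts.items():
        category_counts[HIERARCHY[code]] += n
    return coded, code_counts, category_counts


def retrieve(coded, code):
    """Search and retrieve all units that bear the given code."""
    return [unit for unit, codes in coded if code in codes]


# Invented sample text, for illustration only.
SAMPLE = (
    "The marking workload leaves me tired. "
    "My mentor gives me real support. "
    "I enjoy teaching."
)

coded, code_counts, category_counts = analyse(SAMPLE)
for unit, codes in coded:
    print(codes, "|", unit)
print(dict(code_counts))
print(dict(category_counts))
```

Keeping each unit alongside its codes, rather than storing only the counts, preserves the raw data separately from the analysis and allows retrieval by code, e.g. `retrieve(coded, "STRESS")`. The frequencies it produces are subject to the reliability cautions listed above, since keyword matching cannot register ambiguity, connotation or context.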