PCAD 2000
Psychiatric Content Analysis and Diagnosis
GB Software
Louis A. Gottschalk, M.D., Ph.D.
Robert J. Bechtel, Ph.D.

What is PCAD?
A software program designed to content-analyze written and transcribed text, based on the Scales for measuring various emotional and psychological states developed by Louis A. Gottschalk and Goldine Gleser.

PCAD
“In operation, the program assigns scores on the user-selected Scales to each clause in the input sample, then, at the user’s option, reports score summaries for each scored Scale with comparisons to established norms for the subject’s demographic group, provides an analysis of the score profile, and suggests possible diagnoses drawn from the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV).”

The DSM-IV
• Recommended diagnoses are derived from the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV)
• The DSM-IV is published by the American Psychiatric Association and covers all mental health disorders for children and adults
• It also contains known causes, prognosis, and some research on optimal treatment approaches

Gottschalk-Gleser Scales
• Anxiety: divided into six subscales (death, mutilation, separation, guilt, shame, and diffuse or non-specific anxiety)
• Human Relations
• Achievement Strivings
• Hostility Inward
• Dependency
• Hostility Outward
• Health and Sickness
• Ambivalent Hostility
• Quality of Life
• Social Alienation-Personal Disorganization
• Cognitive and Intellectual Impairment
• Hope

Scale Validation
• A number of construct validation studies were performed to assess the validity of each of the scales
• The studies included four kinds of criterion measures: psychological, physiological, pharmacological, and biochemical
• For example, to validate the computerized scoring of clauses, six five-minute speech samples were scored for hostility outward by both the computerized method and expert human content analysis technicians; the two sets of scores had a Spearman correlation of 0.80
• For more information on validation and reliability studies, refer to the PCAD 2000 manual or Gottschalk and Gleser's The Measurement of Psychological States Through the Content Analysis of Verbal Behaviors

Tips for Getting Started
• Select a text of at least 85-90 words
• Remove all formatting from the text to be analyzed
• In addition, remove all unnatural line breaks and dashes
• Save as a text file (.txt) in the PCAD folder (on the C: drive, under Program Files)
• Rename the file with a .sam extension
• Choose a file name that is fewer than six characters long and contains no capital letters
• Now you’re ready to analyze…

Start -> All Programs -> CATA -> PCAD -> Scoring

Step 1: Copyright Information

Step 2: Select “Guide Me”
In lieu of the “Guide Me” option, File -> Open will also work and allows you to choose the scales you wish to score

Step 3: A reminder to select a sample appears.
Click “OK”

Step 4: Select the file you want to analyze, then click “Open”

Step 5: Specify Your Scale
Unless otherwise specified, the text will be scored on all scales. To specify scales, click the “Scales” tab at the top, then select or deselect the scales you want.

Step 6: Select output options
Select printer, file, or spreadsheet, then click OK to begin the analysis

Analysis of Sylvia Plath
• We first analyzed the transcript of an interview Sylvia Plath gave in October 1962, a few months before her suicide in February 1963.
• Next, we analyzed two older journal fragments written by Plath in 1956.

Analysis Results of the scored clauses

Analysis on each scale

Summary Analysis

Diagnosis

Excel Output
If selected, output will automatically be saved in an Excel file with the same name as the sample.

Analysis for Plath Interview
Analysis Summary: Mildly elevated diffuse anxiety (no further diagnosis given)

Diagnosis for Plath journal fragments

Diagnosis from journal fragments continued
If the clinician is in the process of making a neuropsychiatric diagnosis, the DSM-IV diagnostic classifications to consider are:
For Adults:
Axis I:
Adjustment disorder with depressed mood (309.00)
Dysthymia (Depressive neurosis) (300.40)
Depressive disorder not otherwise specified (311.00)
Major depression, mild, single episode (296.21), recurrent (296.31)
Bipolar disorder, mild, depressed or mixed (296.51, 296.61)
Axis II:
Personality disorder not otherwise specified (301.90)

Diagnosis continued: DSM-IV
• Recommended diagnoses are derived from the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV)
• The DSM-IV is published by the American Psychiatric Association and covers all mental health disorders for children and adults
• It also contains known causes, prognosis, and some research on optimal treatment approaches

Caveat
• Scales are based on assessment of natural language.
Therefore, caution should be used when analyzing materials collected through means other than the “standard procedure,” especially when making comparisons to the norms.
• The “standard procedure” is described as an approach that “elicits speech samples by using purposely ambiguous standardized instructions simulating a projective test situation”

Standard Procedure Prompt
• “This is a study of speaking and conversation habits. I would like you to talk for five minutes about any personal interesting or dramatic life experiences you have ever had. If you finish telling about one life event, you can continue on telling about another one until the five minutes is over. While you are talking I would prefer not to reply to any questions you have until the five minutes is over. If you have any questions now, I will be happy to respond to them now.”

Comparison of Plath Anxiety Measures

LIWC 2001
Linguistic Inquiry and Word Count
James W. Pennebaker, Martha E. Francis & Roger Booth
Published in: 2003

LIWC, 2001
• Developed by Pennebaker and Francis in 1993
• Developed to provide an efficient and effective method for studying the various emotional, cognitive, structural, and process components present in individuals’ verbal and written speech.
• Designed to quantitatively code words based on an internal dictionary and to produce word counts and percentages of total word count on more than 80 text variables.

LIWC Dictionary Design
• The LIWC dictionary was created through several “rating phases” in which human judges rated whether or not each word should be included in a given dictionary category.
• This rating took place from 1992 to 1994, with a final revision in 1997. Agreement of 93% to 100% was recorded for words.
• LIWC was then used, alongside WordSmith (a previously validated program), to analyze text files totaling over 8 million words.
• Categories that accounted for less than 0.3 percent of the words in the analysis, or that suffered from poor reliability or validity, were eliminated.
• Words used less than 0.005 percent of the time, or words not listed in Francis and Kucera’s (1982) Frequency Analysis of English Usage, were excluded.

LIWC Validity
• LIWC has been validated a number of times in studies undertaken by Pennebaker and other scholars. Many of these studies, however, depend on judges’ ratings rather than on other scales.
• In 1996, 72 college students were asked to write about the college experience. LIWC’s coding was correlated with coding done by judges, and significant and important correlations were observed.

LIWC Validity
• In addition to judges’ ratings, some studies have looked for a relationship between LIWC-measured variable percentages and external measures, such as self-reports or other measures of traits such as personality type and motivation.
• See Pennebaker, J.W. & King, L.A. (1999). Linguistic Styles: Language Use as an Individual Difference. Journal of Personality and Social Psychology, 77(6), 1296-1312.

LIWC Dictionary
• Though proprietary, the LIWC dictionary is available for viewing in Yoshikoder, provided you have a running copy of Yoshikoder on your computer. See the next slide for an image of the dictionary.

LIWC Dictionary YKD

LIWC Notes
• It is important for researchers to note that all variables measured by LIWC are reported as a percentage of the total text, with the exception of:
• Raw word count (WC)
• Words per sentence (WPS)
• Percentage of sentences ending in question marks (Qmarks)

Preparing to Use LIWC
• Decide how you want your written data analyzed (multiple cases).
• Files must be plain text (.txt) or ASCII. LIWC cannot read Word or WordPerfect formats.
• Name and organize files based on the way you want the analysis data recorded (Participant, Day, Condition)
• Clean the text file, removing all misspellings and inappropriate word use (its vs.
it’s)
• Consider how this affects the “non-fluency” measure.

Preparing to Use LIWC
• Some more notes on cleaning the text files to make them readable by LIWC:
• Correct spelling errors; use standard U.S. spelling.
• Spell out meaningful abbreviations (Jan = January). Leave acronyms.
• Handle abbreviations:
• Make sure abbreviations are turned “on.”
• Be cautious when removing periods (U.S. vs. US)

Getting Started
• LIWC can be found under Program Files or Applications, depending on the operating system and program location.
• For users at Cleveland State University, the path is as follows: START > CATA > LIWC 2001
• To locate files and outputs saved in the LIWC folder, CSU students should look under: My Computer > C: > Program Files > CATA > LIWC 2001

Opening/Processing Text Files in LIWC
• Choosing “Process text” will run the text analysis.
• Choosing “Open” will simply open the text for you to review within LIWC; it will not process the text automatically.

Set Dictionary
• LIWC uses the Pennebaker dictionary by default. If you would like to load a custom dictionary, click on the dictionary tab and follow the instructions there, or in the manual.

Set Categories
The “Categories” menu allows you to select or deselect the variables you would like to measure. A best practice is to measure all variables and then eliminate unneeded variables in SPSS or Excel, in a new file. This is especially recommended if you do not have easy access to the software.

Category Selections
The Category box selections include all of the variables measured by LIWC. A researcher who is 100% sure which variables they are studying, and who has regular access to the software, might choose to isolate variables for ease of use and to make data files less unwieldy for analysis.
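
The “measure everything, trim later” practice described above can also be done in a few lines of Python instead of SPSS or Excel. The following is a minimal sketch, assuming the full LIWC results have been exported as comma-separated text; the column names and every number in the sample table are hypothetical placeholders, not real LIWC output.

```python
import csv
import io


def keep_columns(rows, wanted):
    """Return a copy of each row dict restricted to the wanted column names."""
    return [{name: row[name] for name in wanted} for row in rows]


# Hypothetical export of a full run (made-up values), one row per text file.
full_output = io.StringIO(
    "Filename,WC,WPS,I,Negemo,Death,Sexual\n"
    "poem1.txt,120,8.0,4.17,2.50,0.83,0.00\n"
    "poem2.txt,95,7.3,6.32,1.05,0.00,1.05\n"
)
rows = list(csv.DictReader(full_output))

# Keep only the variables needed for this analysis.
wanted = ["Filename", "WC", "I", "Death"]
trimmed = keep_columns(rows, wanted)

# Write the trimmed table to a NEW destination so the full output stays intact.
new_file = io.StringIO()
writer = csv.DictWriter(new_file, fieldnames=wanted)
writer.writeheader()
writer.writerows(trimmed)
print(new_file.getvalue())
```

Because the trimmed table goes to a separate destination, the complete set of variables remains available if the research question changes later.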
Extras & Punctuation
The extras selection under the “Categories” menu allows the researcher to exert more specific control over numerals, abbreviations, and emoticons, depending on the type of text being analyzed and the variables being used. If word count is an analysis variable for a researcher studying narrative texts, they would want to count numerals, whereas someone analyzing a text for different reasons might want to ignore them as non-words.
The punctuation selection under the “Categories” menu allows the researcher to exert specific control over which punctuation marks will be counted.

Analyze
The “Analyze” menu gives you the option to analyze your text in segments: by number of segments, by words per segment, or by delimiter. Segmenting the text is useful because it allows the researcher to have one output file with many segment analyses.

Segmenting the File
This text file, for example, is prepared to be segmented by delimiter. I will set the delimiter to “3 or more returns,” because each segment here is separated by three or more hard returns. These segments are Sylvia Plath poems.

Segmenting Selection

Processing File
Once you have set all the parameters, you can process your file: select “Process text” from the File menu, navigate to the text file, select it, and click “Select.”

Processing File (Cont.)
• This dialogue box, which opens after you choose your file, allows you to save your results as a .xls file in any location you choose.

LIWC Results Location

Results for Segmented Poems

Poem Results in Excel

Results for Consolidated Poems (used in analysis)

Plath Analysis
• We analyzed seven Plath poems, chosen randomly from poems written in 1963, the last year of her life. The poems were entered as seven segments. We also analyzed one segment of a journal entry.
• We are using the research design from Stirman & Pennebaker (2001) to predict which variables should be elevated in suicidal poems:

Analysis Variables
• Via Disengagement Theory: I (me, my)
• Via Hopelessness Theory: Death, Negative Emotion
• Via other findings in Stirman & Pennebaker (2001): Sexual Words

Graph of Results

Results
• There is a higher percentage of sexual words in both Plath’s journal and poetry compared to the mean.
• Death words are higher than the norm for the poetry, but there were 0 death words in the journal entry.
• Negative emotion words are lower than the norm in both the journal and the poems.
• I (me, my) words are lower than the norm in both the poems and the journal.

Discussion
• Why the strange findings?
• Our sample was relatively small compared to the sample used in the Pennebaker article.
• Our sample was not randomly chosen from the poet’s entire canon.
• NE (negative emotions) was also non-significant in the Pennebaker piece.
• The single journal entry is too small a sample to represent her journaling.

Results Explored
Viewing individual results for each of the analyzed poems (which were taken as a whole in our comparison), we can see how poem selection matters. Examine Poem 5. Look at its I words, Death words, and Sexual words. Compare that to Poem 8. The small random selection of poems likely accounts for the unexpected results of our simple analysis.

LIWC Applications
• LIWC has been used in many ways, from examining suicidal poets’ poetry to analyzing countless other texts for cognitive and topical variables.
• Julie’s proposed study: a description of a potential application with multiple measures.

Questions? Or suggestions for other research uses?
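
For readers who want to see the mechanics behind the poem analysis, the two ideas used above (splitting a file on “3 or more returns,” then reporting each category as a percentage of total words) can be sketched in Python. This is only an illustration under stated assumptions: the tiny `CATEGORIES` dictionary is hypothetical (the real LIWC dictionary is proprietary and far larger), the tokenizer is a simple stand-in for LIWC's own word handling, the sample text is invented rather than taken from Plath, and line endings are assumed to be plain `\n`.

```python
import re

# Hypothetical mini-dictionary for illustration only; not the LIWC dictionary.
CATEGORIES = {
    "I": {"i", "me", "my"},
    "Death": {"dead", "death", "die", "grave"},
}


def split_segments(text):
    """Split text wherever three or more consecutive newlines occur."""
    parts = re.split(r"\n{3,}", text.strip())
    return [part for part in parts if part.strip()]


def category_percentages(segment):
    """Return the raw word count plus each category as a percent of all words."""
    words = re.findall(r"[a-z']+", segment.lower())
    total = len(words)
    result = {"WC": total}
    for name, vocabulary in CATEGORIES.items():
        hits = sum(1 for word in words if word in vocabulary)
        result[name] = round(100.0 * hits / total, 2) if total else 0.0
    return result


# Invented two-segment sample, separated by three returns.
sample = "I saw my shadow die.\n\n\nThe grave is quiet."
for number, segment in enumerate(split_segments(sample), start=1):
    print(number, category_percentages(segment))
```

With segments this short, a single word moves a category by 20 percentage points or more, which illustrates why the small sample discussed above can produce unexpected results.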