Kate & Julie present LIWC & PCAD - Academic Csuohio

PCAD 2000
Psychiatric Content Analysis and Diagnosis
GB Software
Louis A. Gottschalk M.D. PhD
Robert J. Bechtel Ph.D.
What is PCAD?
Software program designed to content analyze
written and transcribed text based on the Scales
measuring various emotional and psychological
states developed by Louis A. Gottschalk and
Goldine Gleser
PCAD
“In operation, the program assigns scores on the user-selected Scales to each clause in the input sample,
then, at the user’s option, reports score summaries for
each scored Scale with comparisons to established
norms for the subject’s demographic group, provides
an analysis of the score profile, and suggests possible
diagnoses drawn from the Diagnostic and Statistical
Manual of Mental Disorders, Fourth Edition (DSM-IV).”
The DSM-IV
• Recommended diagnoses are derived from the
Diagnostic and Statistical Manual of Mental Disorders,
Fourth Edition (DSM-IV)
• The DSM–IV is published by the American Psychiatric
Association and covers all mental health disorders for
children and adults
• It also contains known causes, prognosis, and some
research on optimal treatment approaches
Gottschalk-Gleser Scales
• Anxiety: divided into six subscales
– death, mutilation, separation, guilt,
shame, and diffuse or non-specific
anxiety.
• Human Relations
• Achievement Strivings
• Hostility Inward
• Dependency
• Hostility Outward
• Health and Sickness
• Ambivalent Hostility
• Quality of Life
• Social Alienation-Personal
Disorganization
• Cognitive and Intellectual
Impairment
• Hope Scale
Validation
• A number of construct validation studies were performed to assess the
validity of each of the scales
• The studies included four kinds of criterion measures: psychological,
physiological, pharmacological, and biochemical
• For example, in order to validate the use of scored clauses, six five-minute speech samples were scored for hostility outward by both the
computerized method and expert human content analysis technicians;
the two sets of scores had a Spearman’s correlation of 0.80
• For more information on validation and reliability studies, refer to PCAD
2000 manual or Gottschalk and Gleser's The Measurement of
Psychological States Through the Content Analysis of Verbal Behaviors
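As a rough illustration of the agreement statistic mentioned above, Spearman's correlation can be computed by ranking each set of scores and correlating the ranks. The scores below are made-up placeholders, not data from the PCAD validation studies, and the function names are our own.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical hostility-outward scores for six speech samples.
computer_scores = [1.2, 0.8, 2.5, 1.9, 0.5, 1.4]
human_scores = [1.0, 0.9, 2.2, 2.4, 0.4, 1.1]

rho = spearman(computer_scores, human_scores)
print(round(rho, 3))  # 0.943
```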
Tips for Getting Started
• Select a text of at least 85 – 90 words
• Remove all formatting from text to be analyzed
• In addition, remove all unnatural line breaks and dashes
• Save as a text file (.txt) in the PCAD folder, on the C: drive
under Program Files
• Rename as a .sam file
• Select a file name that is fewer than six characters and
contains no capitals
• Now you’re ready to analyze…
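The preparation tips above could be scripted along these lines. This is only a sketch: the function names, the dash-handling rule, and the file encoding are our assumptions, not anything specified by PCAD.

```python
import re
from pathlib import Path

MIN_WORDS = 85  # lower bound of the 85-90 word guideline above

def clean_sample(text):
    """Strip dashes used as punctuation and collapse line breaks (our interpretation)."""
    # Double hyphens and en/em dashes become a space; hyphens inside words are kept.
    text = re.sub(r"\s*(?:--+|[\u2013\u2014])\s*", " ", text)
    # Line breaks and runs of spaces become a single space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()

def valid_sample_name(stem):
    """File name must be fewer than six characters and contain no capitals."""
    return 0 < len(stem) < 6 and stem == stem.lower()

def write_sample(text, stem, folder):
    """Clean the text, check the guidelines, and save it as <stem>.sam in folder."""
    if not valid_sample_name(stem):
        raise ValueError("name must be under six characters with no capitals")
    cleaned = clean_sample(text)
    if len(cleaned.split()) < MIN_WORDS:
        raise ValueError("sample is shorter than the recommended minimum")
    path = Path(folder) / f"{stem}.sam"
    # Encoding is an assumption; check what your PCAD installation expects.
    path.write_text(cleaned, encoding="utf-8")
    return path

# Example call (the folder is whatever your PCAD install uses):
# write_sample(raw_text, "plath", pcad_folder)
```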
Start -> All Programs -> CATA -> PCAD -> Scoring
Step 1: Copyright
Information
Step 2: Select “Guide Me”
In lieu of the “Guide Me” option,
File -> Open will also work and
allows you to choose the scales
you wish to score
Step 3:
A prompt reminds you to select a sample.
Click “OK”
Step 4:
Select the file you want to analyze, then click “Open”
Step 5: Specify Your Scale
Unless otherwise specified, text will be scored
on all scales
Scales can be specified by clicking the “scales” tab
at the top, then selecting or deselecting chosen scales
Step 6: Select output options
Select from printer, file, or spreadsheet – then click OK to
begin analysis
Analysis of Sylvia Plath
• We first analyzed the transcript of an interview
Sylvia Plath gave in October of 1962, shortly
before she committed suicide in February of
1963.
• Next, we analyzed two older journal fragments
written by Plath in 1956.
Analysis Results
of the scored clauses
Analysis on each scale
Summary Analysis
Diagnosis
Excel Output
If selected, output
will automatically be
saved in an Excel file
with the same name
as the sample.
Analysis for Plath Interview
Analysis Summary:
Mildly elevated
diffuse anxiety
(no further
diagnosis given)
Diagnosis for Plath journal fragments
Diagnosis from journal fragments
continued
If the clinician is in the process of making a neuropsychiatric diagnosis, the
DSM-IV diagnostic classifications to consider are:
For Adults:
Axis I:
Adjustment disorder with depressed mood (309.00)
Dysthymia (Depressive neurosis) (300.40)
Depressive disorder not otherwise specified (311.00)
Major depression, mild, single episode (296.21), recurrent (296.31)
Bipolar disorder, mild, depressed or mixed (296.51, 296.61)
Axis II:
Personality disorder not otherwise specified (301.90)
Caveat
• Scales are based on assessment of natural
language. Therefore, caution should be used when
analyzing materials collected through means
other than “standard procedure,” especially when
making comparison to the norms.
• “Standard procedure” is described as an approach
that “elicits speech samples by using purposely
ambiguous standardized instructions simulating a
projective test situation”
Standard Procedure Prompt
• “This is a study of speaking and conversation
habits. I would like you to talk for five minutes
about any personal interesting or dramatic life
experiences you have ever had. If you finish telling
about one life event, you can continue on telling
about another one until the five minutes is over.
While you are talking I would prefer not to reply to
any questions you have until the five minutes is
over. If you have any questions now, I will be
happy to respond to them now.”
Comparison of Plath Anxiety Measures
LIWC 2001
Linguistic Inquiry and Word Count
James W. Pennebaker,
Martha E. Francis & Roger Booth
Published in: 2003
LIWC, 2001
• Developed by Pennebaker and Francis in 1993
• Developed in order to provide an efficient and
effective method for studying the various
emotional, cognitive, structural and process
components present in individuals’ verbal and
written speech.
• Designed to quantitatively code words based on
an internal dictionary and produce word counts
and percentages of total word count on more
than 80 text variables.
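The general idea of dictionary-based coding can be sketched as follows. The dictionary here is a tiny hypothetical one, not the LIWC dictionary, and the trailing "*" stem convention and tokenizer are simplifying assumptions on our part.

```python
import re
from collections import Counter

# Hypothetical toy dictionary, NOT the LIWC dictionary.
# A trailing "*" marks a stem that matches any word starting with it.
TOY_DICTIONARY = {
    "negemo": ["sad", "hate*", "afraid"],
    "death": ["dead", "die*", "grave"],
    "i": ["i", "me", "my"],
}

def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def matches(word, entry):
    """Exact match, or prefix match when the entry ends in '*'."""
    if entry.endswith("*"):
        return word.startswith(entry[:-1])
    return word == entry

def category_percentages(text, dictionary):
    """Return each category's share of the total word count, as a percentage."""
    words = tokenize(text)
    counts = Counter()
    for word in words:
        for category, entries in dictionary.items():
            if any(matches(word, entry) for entry in entries):
                counts[category] += 1
    total = len(words)
    return {c: (100.0 * counts[c] / total if total else 0.0) for c in dictionary}

result = category_percentages("I hate my grave fears", TOY_DICTIONARY)
print(result)  # {'negemo': 20.0, 'death': 20.0, 'i': 40.0}
```

Crude prefix matching will over-match (for example, "die*" also matches "diet"), which is one reason real dictionaries are built and checked by human judges.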
LIWC Dictionary Design
• The LIWC dictionary was created by several “rating phases”
where human judges rated words as to whether or not they
should be included in a given dictionary category.
• This rating took place from 1992 to 1994, with a final revision in 1997.
Agreement of 93% to 100% was recorded for words.
• LIWC was then used to analyze text files totaling over 8 million
words at the same time as WordSmith, a previously validated
program.
• Categories that received less than .3 percent of words in the analysis,
or that suffered from poor reliability/validity were eliminated.
• Words used less than .005 percent of the time or words that were
not listed in Francis and Kucera’s (1982) Frequency Analysis of English
Usage were excluded.
LIWC Validity
• LIWC has been validated a number of times in
studies undertaken by Pennebaker and other
scholars. Many times, however, these depend
on “judges’ ratings” rather than other scales.
• In 1996, 72 college students were asked to
write about the college experience. LIWC’s
coding was correlated with coding done by
judges, and significant and important
correlations were observed.
LIWC Validity
• In addition to judges’ ratings, some studies
have looked for a relationship between LIWC
measured variable percentages and external
measures, such as self-report or other
measures of items such as personality type
and motivation.
• See Pennebaker, J. W., & King, L. A. (1999).
Linguistic styles: Language use as an individual
difference. Journal of Personality and Social
Psychology, 77(6), 1296-1312.
LIWC Dictionary
• Though proprietary, the LIWC dictionary is
available for viewing on Yoshikoder, provided
you have a running copy of Yoshikoder on your
computer.
See next slide for image of dictionary.
LIWC Dictionary YKD
LIWC Notes
• It is important for researchers to note that all
variables measured by LIWC are reported as a
percentage of the total text, with the
exception of:
• Raw Word Count (WC)
• Words per sentence (WPS)
• And percentage of sentences ending in
question marks (Qmarks)
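To make the three non-percentage variables concrete, here is a minimal sketch of how such counts could be computed. The sentence-splitting rule is our own simplification; LIWC's exact rules may differ.

```python
import re

def basic_counts(text):
    """Return raw word count, words per sentence, and % of sentences ending in '?'."""
    # A "sentence" here is any run of text closed by ., ! or ? (a simplification).
    # Trailing text with no closing punctuation is counted in WC but not as a sentence.
    sentences = re.findall(r"[^.!?]+[.!?]+", text)
    words = re.findall(r"[A-Za-z0-9']+", text)
    wc = len(words)
    n_sentences = len(sentences)
    wps = wc / n_sentences if n_sentences else 0.0
    questions = sum(1 for s in sentences if s.rstrip().endswith("?"))
    qmarks = 100.0 * questions / n_sentences if n_sentences else 0.0
    return {"WC": wc, "WPS": wps, "Qmarks": qmarks}

counts = basic_counts("Who is there? I am here. Is it late? Yes.")
print(counts)  # {'WC': 10, 'WPS': 2.5, 'Qmarks': 50.0}
```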
Preparing to Use LIWC
• Decide how you want your written data analyzed
(multiple cases).
• File format must be a plain-text (.txt) or ASCII file. LIWC
cannot read Word or WordPerfect formats.
• Name and organize files based on the way you
want the analysis data recorded (Participant, Day,
Condition)
• Clean the text file, removing all misspellings and
inappropriate word use (its vs. it’s)
• Consider how this affects “non-fluency” measure.
Preparing to Use LIWC
• Some more notes on cleaning the text files to
make them readable by LIWC:
• Correct spelling errors; use standard US spelling.
• Spell out meaningful abbreviations
(Jan=January). Leave acronyms.
• Handle abbreviations.
• Make sure abbreviations are turned “on.”
• Be cautious when removing periods (U.S. vs. US)
Getting Started
• LIWC can be found under program files or
applications, depending on the operating system
and program location.
• For users at Cleveland State University, the path is
as follows: START > CATA > LIWC 2001
• In order to locate files and outputs when saved in
the LIWC folder, CSU students should look under:
My Comp. > C: > Program Files > CATA > LIWC 2001
Opening/Processing Text
Files in LIWC
•Choosing “Process
text” will run the
text analysis.
•Choosing “Open”
will literally open
the text for you to
review it within
LIWC, and will not
process the text
automatically.
Set Dictionary
• LIWC uses the Pennebaker dictionary by
default. If you would like to load a custom
dictionary, click on the dictionary tab and
follow the instructions there, or in the manual.
Set Categories
The “Categories” Menu allows
you to select or deselect the
variables you would like to
measure. A best practice is to
measure all variables and
eliminate unneeded variables
in SPSS or Excel, in a new file.
This is especially recommended
if you do not have easy access
to the software.
Category Selections
The Category box selections include all of
the variables measured by LIWC. If a
person was 100% sure of what variables
they were studying, and had access to
the software regularly, they might
choose to isolate variables for ease of
use and to make data files less unwieldy for
analysis.
Extras & Punctuation
The extras selection under the “Categories” menu allows
the researcher to exert more specific control over
numerals, abbreviations and emoticons depending on
the type of text they are analyzing and the variables
which they will be using. If word count is an analysis
variable for a researcher studying narrative texts, they
would want to count numerals, whereas someone
analyzing a text for different reasons might want to
ignore them as non-words.
The punctuation selection under the “Categories”
menu allows the researcher to exert specific control
over which punctuation marks will be counted.
Analyze
The “Analyze” menu
gives you the option
to analyze your text
in segments, by
number, by
words/segment or by
delimiter.
Segmenting the text
is useful because it
allows the researcher
to have one output
file with many
segment analyses.
Segmenting the File
This text file, for example,
is prepared to be
segmented by
“delimiter.” I will set the
delimiter to “3 or more
returns,” because each
segment here has three
or more hard returns in
between.
These segments are
Sylvia Plath poems.
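Outside LIWC, the same "3 or more returns" segmentation can be approximated in a few lines. This is a sketch of the idea only; the function name is ours and it does not reproduce LIWC's own segmenting code.

```python
import re

def split_on_returns(text, min_returns=3):
    """Split text into segments wherever min_returns or more line breaks occur in a row."""
    # Normalize Windows and old Mac line endings to "\n" first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    pattern = r"\n{%d,}" % min_returns
    return [seg.strip() for seg in re.split(pattern, text) if seg.strip()]

sample = "Poem one\nline two\n\n\nPoem two\n\n\n\nPoem three"
segments = split_on_returns(sample)
print(len(segments))  # 3
```

Single and double returns inside a poem are left alone, so stanza breaks do not create extra segments.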
Segmenting Selection
Processing File
Once you have set all the parameters, you can process your file:
Select “Process text” from the File menu, navigate to the text file, and click “Select”
Processing File (Cont.)
• This dialogue
box, which
opens after you
choose your
file, allows you
to save your
results as a .xls
file in any
location you
choose.
LIWC Results Location
Results for Segmented
Poems
Poem Results in Excel
Results for Consolidated
Poems (used in analysis)
Plath Analysis
• We analyzed seven Plath poems, chosen
randomly from poems written in 1963, the last
year of her life. The poems were entered in as
seven segments. We also analyzed one
segment of a journal entry.
• We used the research design from Stirman
& Pennebaker (2001) to predict which
variables should be elevated in the poems of suicidal poets:
Analysis Variables
• Via Disengagement Theory: I (me, my)
• Via Hopelessness Theory: Death, Negative
Emotion
• Via other findings in Stirman & Pennebaker
(2001): Sexual Words
Graph of Results
Results
• There is a higher percentage of sexual words in
both Plath’s journal and poetry than in the
norm.
• Death words are higher than the norm for the
poetry, but there were 0 death words in the
journal entry.
• The norm for negative emotion is higher than
both the journal and the poems.
• I (me, my) words are lower than the norm for both
the poems and the journal.
Discussion
• Why the strange findings?
• Our sample size was relatively small compared
to the sample size utilized in the Pennebaker
article.
• Our sample was not randomly chosen from
the poet’s entire canon.
• NE (negative emotions) was also non-significant
in the Pennebaker piece.
• The journal entry is too small a sample to
represent her journaling.
Results Explored
Viewing individual results for each of the analyzed poems (which were taken as a
whole in our comparison), we can see how poem selection matters. Examine
Poem 5. Look at its I words, Death words and Sexual words. Compare that to
Poem 8. The small random selection of poems likely accounts for the unexpected
results of our simple analysis.
LIWC Applications
• LIWC has been used in many ways, from
examining suicidal poets’ poetry to analyzing
countless other texts for cognitive and topical
variables.
• Julie’s proposed study – a description of a
potential application with multiple measures.
Questions?
Or, suggestions for other research uses?