LIWC - Academic Csuohio

LIWC: Linguistic Inquiry & Word Count
Jeff Spicer & Matthew Egizii
The Pennebaker Dictionary


LIWC uses Dictionaries of Categories to define its search terms.

The Pennebaker Dictionary is built in, but others can be imported.
The Pennebaker Dictionary (2001)

LIWC's default set of psychologically meaningful categories

74 subdictionaries (categories), or 80 in LIWC 2007

Each subdictionary is composed of words chosen and
assessed by a set of judges, who agreed on the
subdictionary scales 93%-100% of the time.

Many of these words are in multiple categories.
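
To make the idea of overlapping categories concrete, here is a minimal Python sketch. The category names and word lists are invented for illustration and are not taken from the Pennebaker Dictionary. Treating a trailing `*` as a stem wildcard is also an assumption of this sketch.

```python
# Hypothetical mini-dictionary: category -> word patterns.
# These entries are illustrative only, not real LIWC content.
TOY_DICTIONARY = {
    "negemo": ["cry", "cried", "sad*"],
    "sad": ["cry", "cried", "sad*"],
    "past": ["cried", "was", "went"],
}


def matches(pattern, word):
    """Return True if word matches pattern; a trailing * matches any ending."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern


def categories_for(word, dictionary=TOY_DICTIONARY):
    """List every category whose patterns match the given word."""
    word = word.lower()
    return sorted(
        category
        for category, patterns in dictionary.items()
        if any(matches(p, word) for p in patterns)
    )


print(categories_for("cried"))    # ['negemo', 'past', 'sad']
print(categories_for("sadness"))  # ['negemo', 'sad']
```

A single word such as "cried" counts toward every category that lists it, so category totals can add up to more than the number of words in the text.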
The Pennebaker Dictionary

If you are able, use the Pennebaker 2007 rather than 2001:

It removes several categories that had, “Consistently
low base rates and were rarely used: Optimism, Positive
Feelings, Communication Verbs, Other References,
Metaphysical, Sleeping, Grooming, School, Sports,
Television, Up, and Down. The category of unique
Words (also known as Type/Token ratio) has also been
removed.”

It adds the categories of Conjunctions, Adverbs,
Quantifiers, Auxiliary Verbs, Commonly-used Verbs,
Impersonal Pronouns, Total Function Words, and Total
Relativity Words.

Also, the categories themselves are much more fleshed
out:

Religion is no longer strictly “Catholicism,” as it was
before (which seemed a tad biased).
The Pennebaker Dictionary

The LIWC website has a page with
comparisons between the scores of
each dictionary based on its library.


Means, SDs, Correlations
Comparing LIWC2007 with LIWC2001 Dictionaries
Preparing Text

LIWC uses .txt or ASCII files for analysis.

Files should be checked for:

Correct U.S. Spelling
Spelled-out meaningful abbreviations
Removal of “Non-Fluency” words
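
Two of the checks above can be partly automated before a file goes into LIWC. The following Python sketch is one possible approach, not part of LIWC itself. The abbreviation table and the non-fluency list are hypothetical examples, and spelling correction is left out.

```python
# Hypothetical examples; a real project would build its own lists.
ABBREVIATIONS = {"w/": "with", "b/c": "because"}
NON_FLUENCIES = {"um", "uh", "er", "hmm"}


def prepare_text(raw):
    """Expand abbreviations, drop non-fluency words, and normalize spaces."""
    cleaned = []
    for token in raw.split():
        token = ABBREVIATIONS.get(token.lower(), token)
        # Compare without surrounding punctuation so "um," is also caught.
        if token.lower().strip(".,;:!?") in NON_FLUENCIES:
            continue
        cleaned.append(token)
    return " ".join(cleaned)


def is_ascii(text):
    """Report whether the text can be saved as a plain ASCII file."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


sample = "Um, I went w/ him b/c he uh asked."
print(prepare_text(sample))  # I went with him because he asked.
print(is_ascii(sample))      # True
```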
Reading the Results


Results are given as a % of the total text.

Except for:

Word Count
Words Per Sentence
Sentences Ending with a Question Mark (?)
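
As a rough illustration of what "% of the total text" means, the sketch below counts how many words fall in a category and divides by the total word count. The word list is a made-up stand-in, not an actual LIWC category, and the tokenizer is a simplification.

```python
import re


def category_percent(text, category_words):
    """Return the share of words in text that belong to category_words, as a %."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in category_words)
    return round(100.0 * hits / len(words), 2)


# Hypothetical category; not the actual Metaphysical word list.
toy_category = {"god", "gods", "heaven", "altar"}
text = "The gods were angry, so he built an altar and prayed to heaven."
print(category_percent(text, toy_category))  # 23.08
```

Here 3 of the 13 words match, giving 23.08%.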
Results are placed in a .xls file (Spreadsheet)

The file is “Tab-Delimited” meaning that
importing it into an SPSS data file is quite
simple.
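
Because the output is tab-delimited text, it can also be read outside SPSS. The Python sketch below parses such data with the standard `csv` module. The column names and file names in the sample are assumptions for illustration; the numbers are the word counts and words per sentence reported in the results slides.

```python
import csv
import io

# Stand-in for the contents of a results file. Column and file names
# are assumed for this example.
sample_output = (
    "Filename\tWC\tWPS\n"
    "odyssey.txt\t117643\t37.43\n"
    "beowulf.txt\t23726\t25.19\n"
)


def read_results(handle):
    """Parse tab-delimited rows into dicts, converting numeric fields."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        parsed = {}
        for key, value in row.items():
            try:
                parsed[key] = float(value)
            except ValueError:
                parsed[key] = value  # keep non-numeric fields as text
        rows.append(parsed)
    return rows


for row in read_results(io.StringIO(sample_output)):
    print(row["Filename"], row["WC"], row["WPS"])
```

To read a saved file instead of the in-memory sample, pass `open(path, newline="")` as the handle.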
Opening & Processing Files

Opening: Allows you to read/edit the text within LIWC

Processing: Runs the text analysis
Setting Dictionaries & Categories

Each of the categories can be turned on/off with a checkbox.
Analyze Function
Segmenting the File
Segmenting the Selection allows you to divide the text into
multiple parts for analysis.
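
The slides do not describe how LIWC chooses segment boundaries, so the following Python sketch only shows one plausible scheme: splitting a text into a fixed number of consecutive parts of roughly equal word count.

```python
def segment_words(text, parts):
    """Split text into `parts` consecutive chunks of roughly equal word count."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    words = text.split()
    base, extra = divmod(len(words), parts)
    segments, start = [], 0
    for i in range(parts):
        # The first `extra` segments each take one additional word.
        end = start + base + (1 if i < extra else 0)
        segments.append(" ".join(words[start:end]))
        start = end
    return segments


print(segment_words("Sing to me of the man Muse", 3))
# ['Sing to me', 'of the', 'man Muse']
```

Each segment could then be analyzed separately, for example to compare the opening of an epic with its ending.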


Analysis of Epic Texts

We decided to use the power of CATA on several huge
literary blocks of text:

The Odyssey

The Aeneid

Beowulf
Analysis of Epic Texts

Textualization of Oral Epic Tradition

Attempt to capture the Ekphrasis of the original medium.
Some elements are lost in translation.
Question: Which elements are both difficult to describe and
also necessary to pass on to a culture?
Analysis of Epic Texts


Primarily we were interested in references to Gods, Religious
Tradition and Worship

We chose the (now defunct) category, “Metaphysical,”
rather than, “Religion.”

Its word choices are more in line with spirituality
rather than modern, formalized religion.
We also used the Standard Information category

Word Count, Words/Sentence, Sentences ending with ?,
LIWC dictionary words, Unique words, Words longer than 6
characters
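
Most of the Standard Information measures are simple counts. The Python sketch below approximates them with a basic tokenizer, so its numbers will not match LIWC's exactly. The "LIWC dictionary words" measure is omitted because it depends on the dictionary itself.

```python
import re


def standard_info(text):
    """Compute simple descriptive statistics for a text."""
    words = re.findall(r"[A-Za-z']+", text)
    # Split after ., ! or ? followed by whitespace (a simplification).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    total = len(words)
    if total == 0 or not sentences:
        return {}
    unique = {w.lower() for w in words}
    return {
        "word_count": total,
        "words_per_sentence": round(total / len(sentences), 2),
        "pct_sentences_question": round(
            100.0 * sum(s.endswith("?") for s in sentences) / len(sentences), 2
        ),
        "pct_unique_words": round(100.0 * len(unique) / total, 2),
        "pct_long_words": round(
            100.0 * sum(len(w) > 6 for w in words) / total, 2
        ),
    }


print(standard_info("Who goes there? The warrior answered. The warrior waited."))
```

For the sample sentence this gives 9 words, 3.0 words per sentence, 33.33% of sentences ending in a question mark, 77.78% unique words, and 33.33% words longer than 6 characters.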
Source of Error?

We had some trouble with the Psychological Processes group.

Several categories wouldn’t shut off, even after
de-selecting them. ???

So we decided to run them too!
Processed Files

After you hit Process and choose where to save the
.xls file, the results open as plain text within LIWC.
Results & Graphs

Total Word Counts:



Odyssey: 117643
Aeneid: 101370
Beowulf: 23726
Results & Graphs

Words/Sentence



Odyssey: 37.43
Aeneid: 32.74
Beowulf: 25.19

Unique Words (%)



Odyssey: 5.76
Aeneid: 8.54
Beowulf: 15.76
Results & Graphs

           Metaphysical    Question Marks      Exclamation Marks
           (% of Words)    (% of Sentences)    (% of Sentences)
Odyssey    0.97            0.23                0.01
Aeneid     1.37            0.41                0.1
Beowulf    1.42            0.03                0.72
Findings

The Odyssey:

Is MASSIVE
Has the longest sentences
Has the lowest % of unique words
Has the lowest % of exclamations
Is the least interested in the Metaphysical
Findings

The Aeneid:

Is also pretty big
Has a larger amount of Metaphysical text
Also isn’t interested in exclamations
Asks the most questions
Findings

Beowulf:

Is pretty short (comparatively)
Has much shorter sentences
Is filled with many unique words
Asks few questions
Is the most interested in the Metaphysical
And is very excitable!!!!!