Diction 5.0
By Ben Gifford and Terri Johnson

About the creators
Diction was created by Roderick P. Hart, Dean of the College of Communication at The University of Texas at Austin, and Craig Carroll, associate professor and department chair of Communication and Journalism at Lipscomb University. Hart focuses on political communication and is "passionate about many things, but especially about his family and basketball."

Features
• Examines text for 5 main semantic features:
  o Activity
  o Optimism
  o Certainty
  o Realism
  o Commonality
• Also has built-in dictionaries for numerical terms, ambivalence, self-referencing, tenacity, leveling terms, collectives, praise, satisfaction, inspiration, blame, hardship, aggression, accomplishment, communication, cognition, passivity, spatial terms, familiarity, temporal terms, present concerns, human interest, concreteness, past concern, centrality, rapport, cooperation, diversity, exclusion, liberation, denial, and motion.
• Reports an average frequency for each variable and whether it falls within a standard range.
• Diction can analyze
  o the first 500 words of a given passage
  o up to 500,000 words in 500-word units averaged together
  o any passage up to 5,000 words in length (500-word units)
  o units smaller than 500 words

Comparing Bill Clinton and Barack Obama
• Make sure you have some text (.txt files).

Create a new project file
File -> New or Ctrl+N

Add the text files
Edit -> Add File(s) or Insert
Then navigate to your .txt files and open them.

Check the properties
Go to File -> Properties. This is especially important if you want to use SPSS.
In the Processing tab, check under Large File Options. “Averaged” is probably your best bet.

Properties continued
Give the output file a unique name. Under “Numeric Filename,” highlight the text immediately before “.num” and replace it with a name of your choice.
*Note: this is different from saving the entire project.
When you save a project, this file is created and saved separately and can be used in SPSS.

Choose Norms Profile
• Diction notes when text falls outside of a normal range based on previous content analyses
  o The default is a "single normative profile"
  o Can be tailored to more specific needs: public speeches, poetry, newspaper editorials, music lyrics, etc.
• It's simple to change the profile

Choose Norms Profile Cont.
• Go to View -> Normative Values
• To choose a more specific set of norms, make a general selection under Class, and then a more specific selection under Type.
• Some normative values are "better" than others because they rest on larger samples, e.g. the creators sampled 2,357 campaign speeches, but only 78 poetry and verse samples.
• Searching for "Normative Values" under Help -> Help Topics will bring up a list of all the different profiles.

Process your files
Processing -> All Files (Ctrl+Shift+G) or Selected Files (Ctrl+G)
You may have to add new words to the Insistence score. More on that soon, but just go ahead and hit “Yes” (there’s really no reason not to).

Viewing output
• Output for one file
• Abridged output for all files

Clinton Results
It’s possible to look at some raw results. This presentation will touch on some of the variables. A full list is available in the manual.

Clinton Results Cont.
• Diction brings up a count of all words that appear three or more times (in a 500-word passage), called the “Insistence Score.”
• It looks for nouns, noun-derived objects, or words that can be used as both a verb and a noun/noun-derived object.
• Interesting, but probably not statistically significant or practical.

Clinton Results Cont.
Calculated Variables
• Insistence - repetition of key terms
• Embellishment - ratio of adjectives to verbs
• Variety - different words / total words
• Complexity - average number of characters per word
Master Variables
These scores use built-in dictionaries (see next slide).

Clinton Results Cont.
Activity: Language featuring movement, change, the implementation of ideas and the avoidance of inertia.
ex. formula: [Aggression + Accomplishment + Communication + Motion] - [Cognitive Terms + Passivity + Embellishment]
Optimism: Language endorsing some person, group, concept or event, or highlighting their positive entailments.
Certainty: Language indicating resoluteness, inflexibility, and completeness, and a tendency to speak ex cathedra (with authority from office/position).
Realism: Language describing tangible, immediate, recognizable matters that affect people’s everyday lives.
Commonality: Language highlighting the agreed-upon values of a group and rejecting idiosyncratic modes of engagement.

Clinton Results Cont.
Diction flags any of these variables that it deems "out of range."

When you’re ready to use SPSS…
• Save first (File -> Save, Ctrl+S). This creates the .num file.
• Find where your .num file went.
• Copy it (Ctrl+C).

Move some stuff around
• Create folders in C:\ called “RAWDATA” and “spssdata” if they aren’t there already.
• Go to C:\RAWDATA.
• Paste your .num file (Ctrl+V).
• Right-click it and click “Rename.”
• Rename it “mystudy.dat”
• Click “Yes” to confirm the change.

Open SPSS
• Get to the default blank view.
• Go to File -> Open -> Syntax…
• Navigate to C:\Program Files\Diction\Stats\
• Open ‘SPSS-DIC.SPS’

This file is a pain… no really.
Open it, you’ll see. It needs to be reformatted before it will run.

Consult the Diction manual
• Go to page 54 (of the Diction 4.0 manual) and look at Figure 28.
• You need to make SPSS-DIC.SPS look somewhat like that file.
• The most important part is that each word in all CAPS is on its own line. Figure out where each line break belongs and hit “Enter” there.

After that long and tedious process
Run the syntax. Cross fingers. Pray.
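The copy-and-rename steps under “Move some stuff around” can also be scripted. What follows is only a minimal sketch, not part of Diction or SPSS: the function name `stage_num_file` is hypothetical, while the folder names and the target name “mystudy.dat” come from the slides.

```python
import shutil
from pathlib import Path


def stage_num_file(num_file, rawdata_dir=r"C:\RAWDATA",
                   spssdata_dir=r"C:\spssdata", target_name="mystudy.dat"):
    """Copy a Diction .num file into RAWDATA under the name the slides use.

    Hypothetical helper: it only mirrors the manual steps (create the two
    folders, paste the file, rename it) and does not touch SPSS itself.
    """
    num_file = Path(num_file)
    rawdata = Path(rawdata_dir)
    spssdata = Path(spssdata_dir)

    # Create both folders if they are not there already.
    rawdata.mkdir(parents=True, exist_ok=True)
    spssdata.mkdir(parents=True, exist_ok=True)

    # Copying straight to the new name does the paste and the rename in one
    # step; the original .num file is left where Diction saved it.
    target = rawdata / target_name
    shutil.copy2(num_file, target)
    return target
```

A call would look like `stage_num_file(r"C:\path\to\yourproject.num")`, where the path is a placeholder for wherever your own .num file was saved.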
The file the syntax refers to must match the .dat file you created.

If it worked
You should have an SPSS sheet filled with data.
From here, the sky is the limit.

For example
A simple means comparison shows Clinton was much more ambivalent in the speeches sampled, though the results are not statistically significant.

WordStat
By Ben Gifford and Terri Johnson
Craig Stovall
JetBlue Airlines

To obtain a free trial of WordStat

About WordStat
• Content analysis module of SimStat
• Analyzes textual information
  o open-ended responses
  o interview transcripts
  o journal articles
  o websites
• Can be used for automatic categorization of text
• Can be used for manual coding
• Facilitates the development of new dictionaries

Features
• Integrated text-mining analysis
• Visualization tools
• Hierarchical categorization dictionary
• User-generated dictionary
• Keyword-in-context (KWIC) retrieval tools
• Statistical analysis capabilities
  o factor analysis
  o word frequencies

To Create Dictionary
Go to My Computer
...C drive
...Program Files
...Provalis Research
...Dictionary
...Copy an existing CAT file
...Rename to ______.cat
...Right-click the new cat file
...Open with Notepad ("Choose program" if Notepad is not listed)
Create the dictionary using correct formatting: category flush left, each word tabbed in with " (1)" after it (the space is important). Everything single-spaced.

Dictionary needs fixing

Dictionary results

To Open WordStat
...Go to CATA
...Provalis Research
...Simstat
...Simstat for Windows
Go to File -> Data -> New
You'll get a screen for creating variables.

Create Variables
...1) Person (integer); press Tab to add additional variables
...2) Speech (memo) (text files will always be memo variables)

A Content Analysis on Speeches by Clinton and Obama
Step #1 - Enter data
• In this case, enter "1" for a speech by Obama, "2" for a Clinton speech.
• Copy/paste the text into the window below when the appropriate "memo" column is highlighted.
• To add another line, hit Tab while in the right-most column.
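Looking back at “To Create Dictionary,” the layout described there (category flush left, each word tabbed in and followed by a space and "(1)") can be generated rather than typed by hand. The sketch below is an illustration only: the function name `render_dictionary` and the sample categories and words are hypothetical, and a real .cat file may contain elements the slides do not mention, so compare the output against the existing CAT file you copied.

```python
def render_dictionary(categories):
    """Render {category: [words]} in the layout the slides describe.

    Assumption: only the format stated on the slide is reproduced here
    (category flush left, word tabbed in, then a space and "(1)").
    """
    lines = []
    for category, words in categories.items():
        # Category name sits flush left on its own line.
        lines.append(category)
        for word in words:
            # Word is indented with one tab; the space before "(1)" matters.
            lines.append(f"\t{word} (1)")
    # Single-spaced: one entry per line, no blank lines in between.
    return "\n".join(lines) + "\n"
```

For instance, `render_dictionary({"ECONOMY": ["jobs", "taxes"]})` gives one category line followed by two tab-indented word lines, which can then be pasted into the renamed .cat file in Notepad.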
A Content Analysis on Speeches by Clinton and Obama Cont.
Step #2 - Select the variables
• Execute the STATISTICS...CHOOSE X-Y command
• Move the PERSON variable to INDEPENDENT
• Move the SPEECH variable to DEPENDENT
• Press the OK button

Step #3 - Run the content analysis module
• Execute STATISTICS...CONTENT ANALYSIS

Step #4 - Choose the proper dictionaries (for Inclusion)

Step #5 - View the results
• Click different tabs (word count and crosstabs)
• Click that button

Clinton and Obama Speeches
More Results

THE END... (or should we say this is just the beginning)