View - Leximancer

From Words to Meaning to Insight
Julia Cretchley & Mike Neal
Outline
 Getting started
 Creating projects and loading data
 Run the project
 Initial results interpretations
• The Concept Map
Getting Started
Help Button -->
About --> Shows version of Leximancer
Manual --> Access PDF Manual
Contact --> Starts email
Projects
Manage Projects --> Create folders under Leximancer Projects to organize your own projects.
Interviews --> Project --> Double click to open
Create Project Panel: Create projects in current folder.
Planning Projects
 Fast, first-cut analysis for pure discovery
(grounded theory)
1. Load Data
2. Run steps with no editing or configuration
3. Examine results and explore data
 Deliberate, planned analysis
1. Load Data
2. Set up custom configuration (tags, sentiment analysis)
3. Examine results; explore data; modify settings
4. Repeat 2 and 3
Project Control Panel
Four main interaction areas
• Current status
• Buttons
• Configure and option editors
• Reporting and Exploration
Stages: Load Data
Load Data
 Data formats
• xls, csv, tsv for spreadsheet loading
• pdf, doc, docx, rtf, txt, html, xml, xhtml
 Two options
1. Spreadsheet
2. Files and file folders of documents
 Tags (briefly...)
• Organize data into folders or spreadsheet columns
(automatic) by date or topic for Dashboard later
Stages: Run Project
What Did Leximancer Just Do?
 Split the text into sentences, paragraphs, and documents
 Divided the text into blocks of 2 sentences (by default)
 Identified Proper Nouns and multi-word (compound) names
 Removed non-lexical and weak semantic information (i.e., stop word list)
 Determined seed words via most frequent words and relationships
 Used seed words to build coding dictionary (i.e., thesaurus)
 Used the thesaurus to code the text and tagged the blocks with the concepts they contain
 Measured co-occurrence between concepts
 Produced concepts, themes, final thesaurus
• Statistics (frequencies, measurements)
• Outputs (Dashboard only if configured)
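The first few steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Leximancer's implementation: the sentence splitter is a naive regular expression, `STOP_WORDS` is a tiny hand-made sample, and `seed_words` simply takes the most frequent remaining words (the slide says seeding also uses word relationships, which is left out here). All function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption; real lists are far longer).
STOP_WORDS = {"the", "a", "an", "and", "is", "are", "was", "of", "to", "in", "it", "we", "on"}

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def make_blocks(sentences, size=2):
    """Group consecutive sentences into blocks of `size` (2 by default)."""
    return [" ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]

def content_words(block):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", block.lower()) if w not in STOP_WORDS]

def seed_words(blocks, n=3):
    """Most frequent content words, used here as a stand-in for seed words."""
    counts = Counter(w for b in blocks for w in content_words(b))
    return [w for w, _ in counts.most_common(n)]

# Made-up sample text, four sentences long.
text = ("The printer is fast. The toner is expensive. "
        "The printer jammed on Monday. A technician fixed the printer.")
blocks = make_blocks(split_sentences(text))
print(len(blocks))            # number of 2-sentence blocks
print(seed_words(blocks, 1))  # most frequent content word
```

Changing `size` shows how the block length affects which words end up counted together.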
View Results
 Concept Map and Concept Cloud are key interfaces
 Activities analyst typically performs now
• Understand the initial run and data
• Explore thesaurus; links to actual data
• Look for concepts to merge, remove, or make
compound
• Create Dashboard Report, export data; save map
 Run analysis again; repeat as necessary
Concept Map
Controls to toggle concept map, network display, center, zoom, save, export
Concept Summary
• Ranked list
• Name-like
• word-like
Theme Summary
• Examples
• more...
Colored spheres are Themes
Dots are concepts (size matters)
Connections shown
Control % of concepts and % of themes
Rotate for better display
Concept Map
 Leximancer uses concept frequency and co-occurrence data to compile a matrix of concept co-occurrences
• You can export this matrix to Excel for your own
visualizations
 A statistical algorithm is then used to create a two-dimensional concept map based on the matrix
 Initially, concepts are dispersed randomly in the
map space. Then the relationships between
concepts act like attractive forces to guide
concepts to their resting places.
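A toy version of the idea above, for intuition only. It counts concept pairs per block and then pulls co-occurring concepts together from random starting positions. Assumptions: `cooccurrence` and `layout` are hypothetical names, the coded blocks are made up, and the layout uses attraction only, so it is not the statistical algorithm Leximancer uses. A practical layout would also need something to stop connected concepts from collapsing onto one point; that is omitted here.

```python
import math
import random
from collections import Counter
from itertools import combinations

def cooccurrence(coded_blocks):
    """Count how often each pair of concepts is coded in the same block."""
    counts = Counter()
    for concepts in coded_blocks:
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts

def layout(counts, steps=200, rate=0.05, seed=0):
    """Attraction-only layout: start at random, pull co-occurring concepts together."""
    rng = random.Random(seed)
    names = sorted({c for pair in counts for c in pair})
    pos = {n: [rng.random(), rng.random()] for n in names}
    top = max(counts.values())
    for _ in range(steps):
        new = {n: list(p) for n, p in pos.items()}
        for (a, b), w in counts.items():
            pull = rate * w / top  # stronger co-occurrence, stronger pull
            for k in range(2):
                delta = pos[b][k] - pos[a][k]
                new[a][k] += pull * delta
                new[b][k] -= pull * delta
        pos = new
    return pos

def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Made-up coded blocks: printer and toner co-occur three times, paper and jam once.
coded = [{"printer", "toner"}] * 3 + [{"paper", "jam"}]
counts = cooccurrence(coded)
pos = layout(counts)
print(counts[("printer", "toner")])  # 3
print(dist(pos["printer"], pos["toner"]) < dist(pos["printer"], pos["paper"]))  # True
```

The `counts` mapping is the co-occurrence matrix in sparse form; writing it out row by row gives the kind of table that can be opened in Excel.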
Concept Cloud
Concept Relationships highlighted
Colors are heat mapped (Themes)
Rotate for better view
Save Map/Export Image in case of new run
Concept Tab
 Name-like concepts are listed at the top
(proper names, identified by a capital first letter)
 Click a name to get a ranked list of related concepts
 Count is the number of times the word (concept) appears
in the entire corpus (counted in 2-sentence blocks)
 Relevance treats the most frequent concept
(Japan: 7010) as 100%.
Divide each count by 7010 to get its percentage.
• Shows how representative concepts are
relative to each other
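The relevance arithmetic can be written out as a short sketch. Only the Japan count of 7010 comes from the slide; `concept_b` and `concept_c` and their counts are placeholders invented for illustration, and `relevance` is a hypothetical helper name.

```python
def relevance(counts):
    """Express each count as a percentage of the largest count."""
    top = max(counts.values())
    return {name: round(100 * n / top) for name, n in counts.items()}

# Japan: 7010 is from the slide; the other two entries are made-up placeholders.
counts = {"Japan": 7010, "concept_b": 3505, "concept_c": 1402}
print(relevance(counts))  # {'Japan': 100, 'concept_b': 50, 'concept_c': 20}
```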
Concept Extraction: A Test!
"We use the laser 500 printer here at the office. We are pretty
happy with it. Once there was a leak and all the toner spilled
out of the machine, but a technician came out and fixed the
problem for us. We still have to top the toner up often. The
printer goes through ink quickly and the cartridges are
expensive, but we put up with this because it delivers good
results reliably. We are pleased with the quality of printing we
get. The laser 500 can batch process, and collate the pages to
save us time. Sometimes paper gets jammed in the laser 500.
Then we have to open it up to remove the crumpled paper.
We have tried other machines in the past, but have not found
an alternative that works better for us."
For printer concept:
• 2 occurrences by ordinary keyword text search
• 5 occurrences from Leximancer
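The gap between the two counts can be illustrated with a small sketch. The keyword search below is exact. The concept count is only an approximation of the idea: `EVIDENCE` is a hand-picked, hypothetical list of evidence words, chosen for this passage, whereas Leximancer learns a weighted thesaurus from the data. The block count agrees with the slide's figure because of how the list was picked, not because this reproduces Leximancer's coding.

```python
import re

PASSAGE = (
    "We use the laser 500 printer here at the office. We are pretty happy with it. "
    "Once there was a leak and all the toner spilled out of the machine, but a "
    "technician came out and fixed the problem for us. We still have to top the "
    "toner up often. The printer goes through ink quickly and the cartridges are "
    "expensive, but we put up with this because it delivers good results reliably. "
    "We are pleased with the quality of printing we get. The laser 500 can batch "
    "process, and collate the pages to save us time. Sometimes paper gets jammed "
    "in the laser 500. Then we have to open it up to remove the crumpled paper. "
    "We have tried other machines in the past, but have not found an alternative "
    "that works better for us."
)

# Hypothetical evidence words for the "printer" concept (an assumption).
EVIDENCE = {"printer", "toner", "laser", "paper"}

def keyword_hits(text, word):
    """Count whole-word matches of `word`, ignoring case."""
    return len(re.findall(rf"\b{re.escape(word)}\b", text.lower()))

def concept_blocks(text, evidence, size=2):
    """Count 2-sentence blocks that contain at least one evidence word."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    blocks = [" ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]
    return sum(1 for b in blocks if evidence & set(re.findall(r"[a-z]+", b.lower())))

print(keyword_hits(PASSAGE, "printer"))   # 2
print(concept_blocks(PASSAGE, EVIDENCE))  # 5
```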
Select a Concept
Redcross clicked
Lines are drawn to related concepts
Count is the number of times a concept is
mentioned together with Redcross.
Example: donation: 196.
So, of all comments about donation, 68%
mention Redcross.
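The percentage is the co-occurrence count divided by the total count for the related concept. A minimal sketch: the 196 comes from the slide, but the total of 288 donation comments is an assumption, back-calculated so that the result matches the 68% shown; it does not appear in the source. `share` is a hypothetical helper name.

```python
def share(co_count, concept_total):
    """Percentage of a concept's mentions that also mention the selected concept."""
    return 100 * co_count / concept_total

# 196 is from the slide; 288 is an assumed total, back-calculated from the 68% figure.
print(round(share(196, 288)))  # 68
```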
Thesaurus
Concepts are listed here in alphabetical order
Click a concept to see its thesaurus:
the evidence words describing the concept.
Score is a z-score.
A higher score is more relevant.
A higher relevance value means the word:
- Occurs often in sentences
containing the concept
- Rarely occurs in sentences not
containing the concept
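One way to get a score that behaves like this is a standard two-proportion z statistic, sketched below. This is a swapped-in textbook formula used as an assumption; the slides do not give Leximancer's own calculation. It compares how often a word appears in text segments coded with the concept against how often it appears in the rest. The function name and all counts are made up for illustration.

```python
import math

def z_score(with_word_in, n_in, with_word_out, n_out):
    """Two-proportion z statistic: word rate inside vs outside the concept's segments."""
    p_in = with_word_in / n_in
    p_out = with_word_out / n_out
    pooled = (with_word_in + with_word_out) / (n_in + n_out)
    variance = pooled * (1 - pooled) * (1 / n_in + 1 / n_out)
    if variance == 0:
        return 0.0  # word appears in every segment or in none
    return (p_in - p_out) / math.sqrt(variance)

# Made-up counts: 50 segments coded with the concept, 150 without.
strong = z_score(40, 50, 10, 150)  # frequent inside, rare outside
flat = z_score(10, 50, 30, 150)    # same rate inside and outside
print(strong > flat)  # True
```

A word that is common everywhere scores near zero, which is why frequent but unspecific words do not dominate the list.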
Questions?