From Words to Meaning to Insight
Julia Cretchley & Mike Neal

Outline
• Getting started
• Creating projects and loading data
• Run the project
• Initial results interpretations
   • The Concept Map

Getting Started
Help button
• About --> Shows version of Leximancer
• Manual --> Access PDF manual
• Contact --> Starts email

Projects
• Manage Projects --> Create folders under Leximancer Projects to organize your own projects.
• Interviews --> Double-click the project to open it
• Create Project panel --> Create projects in current folder.

Planning Projects
Fast, first-cut analysis for pure discovery (grounded theory)
1. Load Data
2. Run steps with no editing or configuration
3. Examine results and explore data
Deliberate, planned analysis
1. Load Data
2. Set up custom configuration (tags, sentiment analysis)
3. Examine results; explore data; modify settings
4. Repeat 2 and 3

Project Control Panel
Four main interaction areas
• Current status
• Buttons
• Configure and option editors
• Reporting and Exploration

Stages: Load Data

Load Data
Data formats
• xls, csv, tsv for spreadsheet loading
• pdf, doc, docx, rtf, txt, html, xml, xhtml
Two options
1. Spreadsheet
2. Files and file folders of documents
Tags (briefly...)
• Organize data into folders or spreadsheet columns (automatic) by date or topic for Dashboard later

Stages: Run Project

What Did Leximancer Just Do?
• Split the text into sentences, paragraphs, and documents
• Divided the text into blocks of 2 sentences (by default)
• Identified proper nouns and multi-word (compound) names
• Removed non-lexical and weak semantic information (i.e., stop word list)
• Determined seed words via most frequent words and relationships
• Used seed words to build a coding dictionary (i.e., thesaurus)
• Used the thesaurus to code the text and tagged the blocks with the concepts they contain
• Measured co-occurrence between concepts
• Produced concepts, themes, final thesaurus
   • Statistics (frequencies, measurements)
   • Outputs (Dashboard only if configured)

View Results
• Concept Map and Concept Cloud are key interfaces
• Activities the analyst typically performs now
   • Understand the initial run and data
   • Explore thesaurus; links to actual data
   • Look for concepts to merge, remove, or make compound
   • Create Dashboard Report, export data; save map
• Run analysis again; repeat as necessary

Concept Map
• Controls to toggle concept map, network display, center, zoom, save, export
• Concept Summary
   • Ranked list
   • Name-like
   • Word-like
• Theme Summary
   • Examples
   • more...
• Colored spheres are Themes
• Dots are concepts (size matters)
• Connections shown
• Control % of concepts, % of themes
• Rotate for better display

Concept Map
• Leximancer uses concept frequency and co-occurrence data to compile a matrix of concept co-occurrences
   • You can export this matrix to Excel for your own visualizations
• A statistical algorithm is then used to create a two-dimensional concept map based on the matrix
• Initially, concepts are dispersed randomly in the map space. Then the relationships between concepts act like attractive forces to guide concepts to their resting places.
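
The two ideas on this slide (a co-occurrence matrix, then a layout driven by forces between concepts) can be illustrated with a small Python sketch. It is a toy under stated assumptions, not Leximancer's actual algorithm, which these slides do not specify. The sample blocks, the function names `cooccurrence_counts` and `relax_layout`, and the spring rule with target distance 1 / (1 + count) are all hypothetical choices made for illustration.

```python
import itertools
import math
import random
from collections import Counter


def cooccurrence_counts(blocks):
    """Count, for each unordered concept pair, the blocks that contain both."""
    counts = Counter()
    for concepts in blocks:
        for pair in itertools.combinations(sorted(set(concepts)), 2):
            counts[pair] += 1
    return counts


def relax_layout(concepts, counts, steps=300, rate=0.1, seed=0):
    """Toy spring layout (an assumption, not Leximancer's algorithm)."""
    rng = random.Random(seed)
    # Initially, concepts are dispersed randomly in the map space.
    pos = {c: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for c in concepts}
    pairs = list(itertools.combinations(sorted(concepts), 2))
    for _ in range(steps):
        for a, b in pairs:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-9
            # Assumed rule: more co-occurrence means a shorter target distance.
            target = 1.0 / (1.0 + counts.get((a, b), 0))
            # Move both concepts a fraction of the way toward the target gap.
            shift = rate * (dist - target) / dist
            pos[a][0] += shift * dx
            pos[a][1] += shift * dy
            pos[b][0] -= shift * dx
            pos[b][1] -= shift * dy
    return pos


# Hypothetical coded blocks: each inner list holds the concepts tagged
# on one 2-sentence block of text.
blocks = [
    ["printer", "toner"],
    ["printer", "toner"],
    ["printer", "toner", "paper"],
    ["paper", "jam"],
    ["paper", "jam"],
]
concepts = sorted({c for block in blocks for c in block})
counts = cooccurrence_counts(blocks)
positions = relax_layout(concepts, counts)

for name in concepts:
    x, y = positions[name]
    print(f"{name:8s} {x:+.2f} {y:+.2f}")
```

In this sketch, concepts that often share a block (printer and toner) settle close together, while concepts that never share a block (printer and jam) settle further apart. Fixing the random seed makes the map reproducible; with a different seed the map may come out rotated or mirrored, which is one reason to save the map before a new run.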

Concept Cloud
• Concept relationships highlighted
• Colors are heat mapped (Themes)
• Rotate for better view
• Save Map/Export Image in case of new run

Concept Tab
• Name-like concepts at top (proper names, identified by a capital first letter)
• Click a name to get a ranked list of related concepts
• Count is the number of times the word (concept) appears in the entire corpus (counted in 2-sentence blocks)
• Relevance takes the most frequent concept (Japan: 7010) as 100%. Divide each count by 7010 to get its percentage.
   • Shows proportionality (representative) relative to each other

Concept Extraction: A Test!
"We use the laser 500 printer here at the office. We are pretty happy with it. Once there was a leak and all the toner spilled out of the machine, but a technician came out and fixed the problem for us. We still have to top the toner up often. The printer goes through ink quickly and the cartridges are expensive, but we put up with this because it delivers good results reliably. We are pleased with the quality of printing we get. The laser 500 can batch process, and collate the pages to save us time. Sometimes paper gets jammed in the laser 500. Then we have to open it up to remove the crumpled paper. We have tried other machines in the past, but have not found an alternative that works better for us."
For printer concept:
____ 2 occurrences by ordinary keyword text search
____ 5 occurrences from Leximancer

Select a Concept
• Redcross clicked
• Lines drawn to related concepts
• Count is the number of times the concept is mentioned with Redcross.
• Example: donation: 196. So, of all comments about donation, 68% mention Redcross.

Thesaurus
• Concepts are listed here in alphabetical order
• Click a concept to see its thesaurus: evidence words describing the concept.
• Score is a z-score. A higher score is more relevant.
• A higher relevance value means the word:
- Occurs often in sentences containing the concept
- Rarely occurs in sentences not containing the concept

Questions?