L.A.S.I.
Linguistic Analysis for Subject Identification

Presented by Red Team: Scott Minter, Dustin Patrick, Aluan Haddad, Richard Owens, Brittany Johnson, and Erik Rogers
September 26th, 2012

Index
• Introduction
  ▫ NCSOSE
  ▫ Dr. Patrick Hester
  ▫ Problem Identification
• AID Significance
  ▫ Why it matters
  ▫ Criteria Abstraction Synergy
  ▫ Public Sector Work
  ▫ The Private Sector
• Beyond Strategic Assessment
  ▫ Applicability
• Current Workflow Process
  ▫ Diagram
  ▫ Mission Statement
  ▫ Project Vision
  ▫ Goals
  ▫ Organizational Hierarchy
• Problem Specifics
  ▫ Affected Areas
  ▫ Time
  ▫ Documentation Gathering
  ▫ Data Analysis
  ▫ Scientific Process
• The Potential Solution
  ▫ Characteristics
  ▫ How would the process change?
  ▫ Possible Issues
• The Competition
  ▫ Overview
  ▫ Syntactic v. Semantic
  ▫ Carrot2
  ▫ WordStat
  ▫ ReadMe
• A Unique Solution
  ▫ What the Competition Lacks
• Conclusion
  ▫ Final Thoughts
  ▫ Works Cited

Introduction: NCSOSE
• National Centers for System of Systems Engineering
• Provides key services in strategic analysis to corporations and government agencies
• Discovers the underlying causes of issues
[1] NCSOSE website

Introduction: Dr. Patrick Hester
• Ph.D. from Vanderbilt University, 2007
  ▫ Major: Risk and Reliability Engineering and Management
• B.S. from Webb Institute, 2001
  ▫ Major: Naval Architecture and Marine Engineering
"My research interests include multi-objective decision making under uncertainty, probabilistic and non-probabilistic uncertainty analysis, critical infrastructure protection, and decision making using modeling and simulation." [2] - Dr. Hester
[2] Patrick Hester Website

Introduction: Dr. Patrick Hester cont.
• While working for the Department of Homeland Security, he was involved in research on resource allocation.
• This research led to the development of the Enterprise AID methodology.
• It showed the importance of problem identification and assessment.

Introduction: Problem Identification
"Failing to key in on the right problem means even the best solution is destined to fail." - Dr. Hester
[4] Parsing Tool for Linguistic Analysis

AID Significance: Why it matters
The AID Methodology is a flexible, transparent, universal systems assessment approach which:
• Analyzes reports and documents critically and objectively
• De-shrouds issues from an author's frame of reference
• Identifies the relevance of explicitly stated problems
• Determines the feasibility of proposed solutions

AID Significance: Criteria Abstraction Synergy
"none directly measureable but every one justifiable as an attribute that merits the title of MOE and hence might promote the assessment, improvement, or design of innumerable enterprises." [3] - Hester & Meyers
• The AID process provides a basis for assessing qualitative as well as quantitative goals, such as:
  ▫ Team Chemistry
  ▫ Positive Attitude
  ▫ Customer Satisfaction [1]
• Qualitative concerns are highly significant:
  ▫ Qualitative concerns are what people care about
  ▫ They provide important context
  ▫ They integrate an agency's mission and vision directly into the specific problem area
[3] Enterprise AID

AID Significance: Public Sector Work
NCSOSE's public sector work improves government
agency efficiency and helps agencies maintain focus
• Currently engaged in a 3-year, $1.6 million effort to improve efficiency for the Department of Homeland Security [4]
• Cost and necessity assessments for the Federal Emergency Management Agency [3]
• Public sector projects matter to all of us
[4] Parsing Tool for Linguistic Analysis
[3] Enterprise AID

AID Significance: The Private Sector
• The AID methodology is generally applicable
  ▫ Malleable yet not generic
  ▫ Robust yet adaptable
• Beneficial to a wide array of specific issues, focus areas, and problem sets
  ▫ Corporate planning and management
  ▫ Academic organization and strategy

Beyond Strategic Assessment: Applicability of LASI
An intelligent, heuristic algorithm that analyzes written work to help reach an unbiased, well-reasoned conclusion would be invaluable.
• Broad Applicability
  ▫ Strategic Analysis
  ▫ Education [5]
  ▫ Translation Engines
  ▫ Web Search Algorithms [6]
  ▫ General Management Decisions [7]
[5] "A Framework for the Computerized Assessment of University Student Essays."
[6] "Using an Intelligent Agent to Enhance Search Engine Performance."
[7] "A fuzzy multi-criteria decision-making method for facility site selection"

Current Workflow Process: Process Diagram
Flowchart nodes: Customer Contact; Situational Awareness Meeting; Will NCSOSE Be Useful? (yes/no); Client Goes Elsewhere; Documentation Gathering Process; Does the customer provide documentation? (yes/no); Independent Research Required; Manual Search for: Strategy/Strategic Plans, Mission/Vision/Goals, Client Input/SA Meeting; Problem Statement; Strategic Plans; Vision/Mission/Goals; Org.
Hierarchy

Current Workflow Process: Mission Statement
• Qualitative overview of the company solving the problem
• One or two sentences at most
• Must be considered to find the scope of the problem

Current Workflow Process: Project Vision
• Qualitative outline of the problem to be solved
• Several sentences that outline the problem in question
• Very important for the Situational Awareness Meeting

Current Workflow Process: Goals
• Quantifiable statements, each outlining a specific task
• Several are included in each document
• The most useful thing to look for in a strategic document

Current Workflow Process: Organizational Hierarchy
Organization chart nodes: Dept. of Homeland Security; S&T Directorate; FEMA; TSA; Other; Individual Disaster Relief; Public Disaster Relief; Hurricane Relief
[8] "DHS Component Websites"

Problem Specifics: Affected Areas
• Time
• Documentation Gathering
• Data Analysis
• Scientific Process

Problem Specifics: Time
• Only two Ph.D.s, Dr. Hester and his counterpart, currently work on this process.
• The process is very time-consuming.
• It can take up to 12 hours to go through the problem statement process.

Problem Specifics: Documentation Gathering
• Documents will contain "jargon" specific to a client's organization, which must be defined and understood ahead of time.
• The client may not provide enough documentation, which requires additional research.
• If a lack of documentation requires extensive research, only one of the researchers will perform the entire problem statement process.

Problem Specifics: Data Analysis
• Large quantities of documents must be mentally digested, assessed, and interrelated.
• The analysts must determine which aspects are achievable given specific monetary and temporal restrictions.
• It is difficult to keep the problem space and solution space separate and free of bias.

Problem Specifics: Scientific Process
• Identify and weight individual problem aspects in terms of stated scope, goals, and resources.
• At this time, there are no defined "critique criteria" used to determine the problem statement.
• There is only limited scientific defense available to convince the client that the problem statement is accurate and to avoid having to start over.

The Potential Solution: Characteristics
• Fully Automated
• Runs Locally
• Web Querying Capabilities
• Unbiased
• Provides a Weighted Breakdown

The Potential Solution: How would the process change?
• Decreased time for researching documents
If the customer does not provide documentation, the program will be able to search for it.
Affected flowchart step: Does the customer provide documentation? (no: Independent Research Required)

The Potential Solution: How would the process change? cont.
• Decreased time for reading and assessment
Reading and analyzing documents currently takes about 6-8 hours, and the whole process can take up to 12 hours. [9]
Affected flowchart step: Manual Search for: Strategy/Strategic Plans, Mission/Vision/Goals, Client Input/SA Meeting; Problem Statement
[9] Hester, Patrick

The Potential Solution: How would the process change? cont.
• Provide justification for the analysts' reasoning
Statistical and visual proof is needed when presenting the final problem statement.
Affected flowchart step: Situational Awareness Meeting; Present findings to customer; Problem Statement

The Potential Solution: Possible Issues
• Incomplete Documents
• Computation Time
• Memory Usage
• Acronyms

The Competition: Overview
• Document parsing and linguistic analysis are both popular, widely used subjects. They are used:
  ▫ In the field of Linguistics
  ▫ In the field of Computer Science
  ▫ By the Government
  ▫ As an aid to productive business for companies

The Competition: Syntactic v.
Semantic
• Syntactic
  ▫ Logical grammar
  ▫ Statistical Data
  ▫ Alphabetical Frequencies
  ▫ Word Counts
  ▫ Parts of Speech
  ▫ Word Dependencies
• Semantic
  ▫ Relating syntactic structures to language-independent meanings
  ▫ Extracting meaning and conceptual arguments
  ▫ Summarization

The Competition: Carrot2
• Type: Semantic Analysis Tool
• Input:
  ▫ Documents
  ▫ Search Engine Results
• Output:
  ▫ List of associations of words between different sources
  ▫ Two different visual representations of the data
• Note: This is the tool currently in use.
[10] Carrot2

The Competition: WordStat
• Type: Syntactic Analysis Tool
• Input:
  ▫ Transcripts
  ▫ Websites
  ▫ Documents
  ▫ Books
• Output:
  ▫ Taxonomy lists
  ▫ User interface with plots and graphs
  ▫ Database
  ▫ Website
  ▫ Document
[11] WordStat

The Competition: ReadMe
• Type: Syntactic Analysis Tool
• Input:
  ▫ Populations of Social Science Documents
  ▫ User-input categories
• Output:
  ▫ Unbiased estimations for topics in the input categories
  ▫ Opinions of thousands of people about a given topic
[12] ReadMe: Software for Automated Content Analysis

A Unique Solution: What the Competition Lacks
• Single-theme finding capabilities across multiple documents
• Finding themes based on user-provided key terms
• Web querying capabilities to find organizationally related documents

Conclusion: Final Thoughts
• The ability to identify the right problem is essential.
• The current process is inefficient and resource-intensive.
• L.A.S.I. will "sniff" out the problem by searching for a single theme across a plethora of documents.
• Competitors can be outshone with additional help from data mining.
• Automating the document analysis process will ensure the problem statement is correct.

Works Cited
1 "National Centers for System of Systems Engineering." National Centers for System of Systems Engineering. Web. 23 Sept. 2012. <http://www.ncsose.org/>.
2 "Patrick Hester." Old Dominion University. N.p., n.d. Web. 24 Sept. 2012. <http://www.odu.edu/directory/people/p/pthester>.
3 Hester, P.T., and T. Meyers. Enterprise AID. 2012.
4 Hester, Patrick. Parsing Tool for Linguistic Analysis.
5 Duwairi, Rehab M. "A Framework for the Computerized Assessment of University Student Essays." Computers in Human Behavior 22.3 (2006): 381-88. Web.
6 Jansen, James. "Using an Intelligent Agent to Enhance Search Engine Performance." First Monday 2.3 (1997). Web.
7 Liang, Gin-Shuh, and Mao-Jiun J. Wang. "A Fuzzy Multi-Criteria Decision-Making Method for Facility Site Selection." International Journal of Production Research 29.11 (1991).
8 "DHS Component Websites." Homeland Security. N.p., n.d. Web. 26 Sept. 2012. <http://www.dhs.gov/>.
9 Hester, Patrick. Personal Interview. 12 Sept. 2012.
10 Osinski, Stanislaw, and Dawid Weiss. Carrot2. 13 Aug. 2012. Web. 25 Sept. 2012. <http://project.carrot2.org>.
11 "WordStat." Provalis Research. Web. 24 Sept. 2012. <http://provalisresearch.com/products/content-analysis-software/>.
12 "ReadMe: Software for Automated Content Analysis." Web. 24 Sept. 2012. <http://gking.harvard.edu/node/4520/rbuild_documentation/readme.pdf>.