NLyze: Interactive Programming by Natural Language for SpreadSheet Data Analysis and Manipulation
Sumit Gulwani and Mark Marron
SIGMOD 2014

Motivation
• Hundreds of millions of spreadsheet users
• Many data analysis tasks currently require “programming”
• Most non-technical users do not know how to program, or do not want to learn
• Even technical users struggle
  • Don’t remember the API or how to use it
  • NL is faster and less distracting than writing a formal program

NLyze Approach
• Allow users to express their intent in natural language
  • Many spreadsheet tasks are easily expressible
  • Easy for users to understand and provide
• Support common tasks identified on help forums and in usage data
  • sum, average, count, min, max, …
  • selection & vlookup
  • +, *, -, /
  • =, <, <=, empty, startswith, …
  • and, or, not

Challenges
• Many task compositions and variations
  • Even a bounded task model has tens of thousands of possible tasks
  • Many tasks in the long tail: (Min(A1:A12) + Max(A1:A12)) / 2
• Many ways to express a given task in natural language
  • =SUMIFS(hours, location, ="capitol hill", title, ="chef")
  • “sum the hours for the capitol hill location chefs”
  • “total hours capitol hill chefs”
  • “get the hours where title = chef that work at capitol hill and sum them up”
  • On average 37.7 semantic clusters for each intent

Key Ideas
• Domain-specific language (DSL) for the translation task
• Combined translation approach
  • Uses semantic parsing & keyword programming simultaneously
  • Co-operation through DSL API & type signatures (not grammar structure)
  • Generate & rank multiple possible interpretations
• Leverage the interactive environment
  • Ambiguity resolution
  • Live environment for task decomposition & programming-by-example

DSL Design
• Inspired by SQL & LINQ
  • Core map, filter, reduce algebra with limited joins (vlookup)
• Extensions for spreadsheet tasks
  • Selection, highlighting, and formatting
• Tuned for translation & tasks
  • Type system to support synthesis
  • Operations designed to map closely to “high-value” tasks
    • Conditional reduce (sum, average, count with predicates)
    • VLookup
    • Selection

Translation Overview
• Syntax-directed rules
  • Cover the majority of common expression styles with a small number of rules
  • High precision (but low recall)
• Type-based expression synthesis
  • Construct expressions from fragments in uncommon intent expressions
  • High recall (but low precision)
• Use both techniques during translation
  • Apply both translations to each part of the sentence
  • Allow each method to use results from the other

Example Sheet
[Figure: example spreadsheet]

Common Idioms (via Syntactic Rules)
• Rules
  • Sum the %CH %Pred → SUM(□CH, □Pred)
  • %V → □V.CH=□V
• Example
  • “Sum the totalpay for the chefs” → SUM(totalpay, title="chef")

Combined with Synthesis
• Rules
  • Sum their %CH → SUM(□CH, __)
  • %V → □V.CH=□V
• Example
  • “Get the chefs then sum their totalpay” → fragments title="chef" and SUM(totalpay, __)

Combination
• Interleaving approach
  • Allow arbitrary interleaving of synthesis & rule applications
  • Cannot allow algorithms to recursively call each other at any level
  • Build all interpretations bottom-up using dynamic programming
• Typed “holes” instead of grammar non-terminals
  • Types + restrictions are synthesis specifications
  • Allows combination of different interpretation theories
  • Supports mixing Excel formulas and natural language
    • “average hours + STDEVA(D2:D12)”

Result Ranking
• Translation produces multiple results
• Rank based on a number of features
  • Scores of various rules
  • Use of synthesis
  • Coverage of the user’s input
  • Ordering relative to the user’s input

Programming Model
• Ambiguity resolution
  • “Live Search” with multiple results
  • Paraphrase into disambiguated English
  • Highlight ignored, misspelled, and special words
• Task decomposition in live environment
  • Complete tasks in multiple steps
  • Integrate with operations in menu
  • Interoperability with programming-by-example (Flash Fill)

Evaluation
• Evaluation data set
  • 3570 NL commands (correctly labeled via Mechanical Turk)
  • 40 tasks on 4 sheets
• Expression results
  • Top Ranked: 94%
  • Top-3 Ranked: 97%
  • Overall Recall: 98%
• Computational cost
  • Average Response Time: 0.011s

User Experience

Conclusions
• Combination of NLP & synthesis is a powerful translation approach
  • Combines high-precision & high-recall methods
  • Bottom-up + type-based signature allows efficient & general combination
• NLyze is a proof-of-concept implementation for spreadsheets
  • Works with high precision and recall in evaluation (and user experiments)
  • Concept applicable to other domains as well (semi-structured data)
• Opportunities for further improvement
  • Currently not using advanced NLP (parsers, tagging, etc.)
  • Currently not using semantic info (value identification, ontologies, etc.)

Questions

Conclusions
• Natural language is a good specification for many tasks/applications
  • Spreadsheets are an important example of such an application
  • Similar high-impact opportunities in other “end-user” focused applications
• Combination of NLP & synthesis is a powerful translation approach
  • Combines high-precision & high-recall methods
  • Bottom-up + type-based signature allows efficient combination
• Opportunities for further improvement
  • Leverage work in NLP and DB communities
  • Extend live programming with PBE
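
Illustrative sketch (not from the talk)

The type-based synthesis and ranking ideas from the translation slides can be illustrated with a toy Python sketch. It is not the NLyze algorithm: it only collects typed fragments from the input (a reduce operator, a numeric column, predicates on cell values), fills a conditional-reduce "hole" signature with them, and ranks candidates by how much of the input they cover. The sheet rows, the `REDUCERS` keyword table, and the names `Pred`, `Reduce`, and `translate` are all assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import combinations

# Hypothetical sheet: column names echo the slides, the rows are invented.
SHEET = [
    {"title": "chef", "location": "capitol hill", "hours": 30, "totalpay": 600},
    {"title": "chef", "location": "queen anne", "hours": 20, "totalpay": 400},
    {"title": "waiter", "location": "capitol hill", "hours": 25, "totalpay": 300},
    {"title": "chef", "location": "capitol hill", "hours": 10, "totalpay": 200},
]

# Assumed keyword table mapping input words to reduce operators.
REDUCERS = {"sum": "SUM", "total": "SUM", "average": "AVERAGE", "count": "COUNT"}


@dataclass(frozen=True)
class Pred:
    """Fragment of type 'predicate': column == value."""
    column: str
    value: str


@dataclass(frozen=True)
class Reduce:
    """Conditional reduce: op(column, pred, pred, ...)."""
    op: str
    column: str
    preds: tuple

    def formula(self):
        args = [self.column] + [f'{p.column}="{p.value}"' for p in self.preds]
        return f"{self.op}({', '.join(args)})"

    def evaluate(self, sheet):
        values = [r[self.column] for r in sheet
                  if all(r[p.column] == p.value for p in self.preds)]
        if self.op == "SUM":
            return sum(values)
        if self.op == "COUNT":
            return len(values)
        return sum(values) / len(values)  # AVERAGE; assumes at least one match


def translate(text, sheet):
    """Return candidate interpretations, best first."""
    text = text.lower()
    words = text.split()
    ops = [REDUCERS[w] for w in words if w in REDUCERS]
    # Fragments typed as numeric columns: a column name used as a word.
    columns = [c for c in sheet[0]
               if c in words and isinstance(sheet[0][c], (int, float))]
    # Fragments typed as predicates: a text cell value mentioned in the input.
    preds = sorted({Pred(c, r[c]) for r in sheet for c in r
                    if isinstance(r[c], str) and r[c] in text},
                   key=lambda p: text.index(p.value))
    # Fill the typed holes of op(column, preds...) in every possible way.
    candidates = [Reduce(op, col, chosen)
                  for op in ops
                  for col in columns
                  for k in range(len(preds) + 1)
                  for chosen in combinations(preds, k)]
    # Rank by coverage: candidates using more of the input come first.
    return sorted(candidates, key=lambda e: len(e.preds), reverse=True)


best = translate("total hours capitol hill chefs", SHEET)[0]
print(best.formula(), "=", best.evaluate(SHEET))
```

Under these assumptions, both "total hours capitol hill chefs" and "get the chefs then sum their totalpay" resolve to a conditional reduce without any phrase-specific rule, which is the high-recall, low-precision behaviour the slides attribute to synthesis. The lower-ranked candidates (fewer predicates) stay in the list, mirroring the idea of showing several interpretations for the user to choose from.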