Text Summarization: News and Beyond
Kathleen McKeown, Department of Computer Science, Columbia University

1 What is Summarization?
- Data as input (database, software trace, expert system), text summary as output
- Text as input (one or more articles), paragraph summary as output
- Multimedia in input or output
- Summaries must convey maximal information in minimal space

2 Summarization is not the same as Language Generation
- Karl Malone scored 39 points Friday night as the Utah Jazz defeated the Boston Celtics 118-94.
- Karl Malone tied a season high with 39 points Friday night…. … the Utah Jazz handed the Boston Celtics their sixth straight home defeat 118-94.
(Streak, Jacques Robin, 1993)

3 Summarization Tasks
- Linguistic summarization: how to pack as much information as possible into as short a space as possible?
  Streak: Jacques Robin
  MAGIC: James Shaw
  PLanDoc: Karen Kukich, James Shaw, Rebecca Passonneau, Hongyan Jing, Vasilis Hatzivassiloglou
- Conceptual summarization: what information should be included in the summary?

4 Input Data -- STREAK
score(Jazz, 118) -- The Utah Jazz beat the Celtics 118-94.
score(Celtics, 94)
points(Malone, 39) -- Karl Malone scored 39 points.
location(game, Boston) -- It was a home game for the Celtics.
#homedefeats(Celtics, 6) -- It was the 6th straight home defeat.

5 Revision Rule: Nominalization
beat (Jazz, Celtics) is revised to hand (Jazz, Celtics, defeat).

6 SUMMONS Query Output
Summary: Wednesday, April 19, 1995, CNN reported that an explosion shook a government building in Oklahoma City. Reuters announced that at least 18 people were killed. At 1 PM, Reuters announced that three males of Middle Eastern origin were possibly responsible for the blast. Two days later, Timothy McVeigh, 27, was arrested as a suspect, U.S. attorney general Janet Reno said. As of May 29, 1995, the number of victims was 166.
Image(s): 1 (okfed1.gif) (WebSeek)
Article(s): (1) Blast hits Oklahoma City building (2) Suspects' truck said rented from Dallas (3) At least 18 killed in bomb blast - CNN (4) DETROIT (Reuter) - A federal judge Monday ordered James (5) WASHINGTON (Reuter) - A suspect in the Oklahoma City bombing
(SUMMONS, Dragomir Radev, 1995)

7 Briefings
- Transitional: automatically summarize a series of articles; input = templates from information extraction
- Merge information of interest to the user from multiple sources
- Show how perception changes over time; highlight agreement and contradictions
- Conceptual summarization via planning operators: refinement (number of victims), addition (a later template contains the perpetrator)

8 How is summarization done?
- 4 input articles parsed by an information extraction system
- 4 sets of templates produced as output
- Content planner uses planning operators to identify similarities and trends, e.g. refinement (a later template reports a new number of victims)
- New template constructed and passed to the sentence generator

9 Sample Template
Message ID: TST-COL-0001
Secsource: source: Reuters
Secsource: date: 26 Feb 93, early afternoon
Incident: date: 26 Feb 93
Incident: location: World Trade Center
Incident: type: bombing
Hum Tgt: number: at least 5

10 Text Summarization
- Input: one or more text documents; output: paragraph-length summary
- Sentence extraction is the standard method: identify salient sentences within documents using features such as key words, sentence position in the document, and cue phrases; extract and string the sentences together (Luhn, 1950s; Hovy and Lin, 1990s; Schiffman, 2000)
- Machine learning for extraction: from a corpus of document/summary pairs, learn the features that best determine important sentences (Kupiec 1995: summarization of scientific articles)

11 Text
Summarization at Columbia
- Shallow analysis instead of information extraction
- Extraction of phrases rather than sentences
- Generation from surface representations in place of semantics

12 Problems with Sentence Extraction
- Extraneous phrases: "The five were apprehended along Interstate 95, heading south in vehicles containing an array of gear including … ... authorities said."
- Dangling noun phrases and pronouns: "The five"
- Misleading: "Why would the media use this specific word (fundamentalists), so often with relation to Muslims? *Most of them are radical Baptists, Lutheran and Presbyterian groups."

13 Cut and Paste in Professional Summarization
- Humans also reuse the input text to produce summaries, but they "cut and paste" the input rather than simply extract
- Our automatic corpus analysis: 300 summaries, 1,642 sentences; 81% of sentences were constructed by cutting and pasting
- Linguistic studies

14 Major Cut and Paste Operations
(1) Sentence reduction
(2) Sentence combination
(3) Syntactic transformation
(4) Lexical paraphrasing

18 Summarization at Columbia
- News: single document, multi-document
- Email
- Meetings
- Journal articles
- Open-ended question-answering: What is a Loya Jirga? Who is Mohammed Naeem Noor Khan? Who is Al Sadr? What do people think of welfare reform?
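The standard extractive pipeline described on slide 10 scores each sentence with features such as key-word frequency, position in the document, and cue phrases, then strings the top scorers together. A minimal Luhn-style sketch of that idea follows; the stopword list, cue-phrase list, and feature weights here are illustrative assumptions, not the weights any of the cited systems used.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "that", "is", "was",
             "for", "on", "with", "as", "by"}
CUE_PHRASES = ("in conclusion", "in summary", "significantly", "importantly")

def summarize(text, k=2):
    """Score each sentence by keyword frequency, position, and cue phrases;
    return the top-k sentences in their original document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Document-wide content-word frequencies (Luhn's "significant words").
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)
    scored = []
    for i, sent in enumerate(sentences):
        tokens = [w for w in re.findall(r"[a-z']+", sent.lower())
                  if w not in STOPWORDS]
        keyword_score = sum(freq[w] for w in tokens) / (len(tokens) or 1)
        position_score = 1.0 if i == 0 else 0.0  # lead sentences are often salient in news
        cue_score = 1.0 if any(c in sent.lower() for c in CUE_PHRASES) else 0.0
        scored.append((keyword_score + position_score + cue_score, i, sent))
    top = sorted(sorted(scored, reverse=True)[:k], key=lambda t: t[1])
    return " ".join(s for _, _, s in top)
```

On news text the position feature alone (take the lead sentences) is a surprisingly strong baseline, which is why learned extractors such as Kupiec's weight it heavily.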
20 Cut and Paste Based Single Document Summarization -- System Architecture
Input: single document. Extraction produces extracted sentences; Generation (sentence reduction, sentence combination) produces the output summary. Supporting resources: parser, co-reference, corpus, decomposition, lexicon.

21 (1) Decomposition of Human-written Summary Sentences
- Input: a human-written summary sentence and the original document
- Decomposition analyzes how the summary sentence was constructed
- The need for decomposition: provide training and testing data for studying cut and paste operations

22 Sample Decomposition Output
Summary sentence: Arthur B. Sackler, vice president for law and public policy of Time Warner Cable Inc. and a member of the Direct Marketing Association, told the Communications Subcommittee of the Senate Commerce Committee that legislation to protect children's privacy on-line could destroy the spontaneous nature that makes the internet unique.
Document sentences:
S1: A proposed new law that would require web publishers to obtain parental consent before collecting personal information from children could destroy the spontaneous nature that makes the internet unique, a member of the Direct Marketing Association told a Senate panel Thursday.
S2: Arthur B. Sackler, vice president for law and public policy of Time Warner Cable Inc., said the association supported efforts to protect children on-line, but he…
S3: "For example, a child's e-mail address is necessary …," Sackler said in testimony to the Communications subcommittee of the Senate Commerce Committee.
S5: The subcommittee is considering the Children's Online Privacy Act, which was drafted…

26 The Algorithm for Decomposition
- A Hidden Markov Model based solution
- Evaluations: human judgements on 50 summaries, 305 sentences; 93.8% of the sentences were decomposed correctly
- Summary sentence alignment tested in a legal domain
- Details in (Jing & McKeown, SIGIR 99)

27 (2) Sentence Reduction
An example:
Original sentence: When it arrives sometime next year in new TV sets, the V-chip will give parents a new and potentially revolutionary device to block out programs they don't want their children to see.
Reduction program: The V-chip will give parents a new and potentially revolutionary device to block out programs they don't want their children to see.
Professional: The V-chip will give parents a device to block out programs they don't want their children to see.

28 The Algorithm for Sentence Reduction
Preprocess: syntactic parsing
Step 1: use linguistic knowledge to decide what phrases MUST NOT be removed
Step 2: determine what phrases are most important in the local context
Step 3: compute the probabilities of humans removing a certain type of phrase
Step 4: make the final decision

29 Step 1: Use linguistic knowledge to decide what MUST NOT be removed
Syntactic knowledge comes from a large-scale, reusable lexicon we have constructed, e.g.:
convince: meaning 1: NP-PP :PVAL ("of") (e.g., "He convinced me of his innocence"); NP-TO-INF-OC (e.g., "He convinced me to go to the party"); meaning 2: ...
Required syntactic arguments are not removed.

30 Step 2: Determining context importance based on lexical links
Saudi Arabia on Tuesday decided to sign… The official Saudi Press Agency reported that King Fahd made the decision during a cabinet meeting in Riyadh, the Saudi capital. The meeting was called in response to … the Saudi foreign minister, that the Kingdom… An account of the Cabinet discussions and decisions at the meeting… The agency...
33 Step 3: Compute probabilities of humans removing a phrase
The parse tree of the example sentence is annotated with corpus probabilities: the verb "will give" has subtrees vsubc (when-clause), subj (V-chip), iobj (parents), and obj (device), the last with determiner (a) and the coordinated modifiers (new and revolutionary). Probabilities such as Prob("when_clause is removed" | "v=give") and Prob("to_infinitive modifier is removed" | "n=device") are estimated from the corpus.

34 Step 4: Make the final decision
Each node in the tree carries three kinds of evidence: L (linguistic), Cn (context), and Pr (probabilities); the removal decision for each phrase combines all three.

35 Evaluation of Reduction
- Success rate: 81.3% on 500 sentences reduced by humans; baseline: 43.2% (remove all the clauses, prepositional phrases, to-infinitives, …)
- Reduction rate: 32.7%; professionals: 41.8%
- Details in (Jing, ANLP 00)

36 Multi-Document Summarization Research Focus
- Monitor a variety of online information sources: news (multilingual), email
- Gather information on events across source and time: same day, multiple sources; across time
- Summarize in real time, highlighting similarities, new information, different perspectives, and user-specified interests

37 Our Approach
- Use a hybrid of statistical and linguistic knowledge
- Statistical analysis of multiple documents; identify important new, contradictory information
- Information fusion and rule-driven content selection
- Generation of summary sentences by re-using phrases; automatic editing/rewriting of the summary

38 Newsblaster
Integrated in an online environment for daily news updates: http://newsblaster.cs.columbia.edu/ (Ani Nenkova, David Elson)

39 Newsblaster
- Clustering articles into events
- Categorization by broad topic
- Multi-document summarization
- Generation of summary sentences: fusion, editing of references

40 Newsblaster Architecture
Crawl news sites; form clusters; categorize; title clusters; summary router (event summary, biography summary, multi-event summary); select images; convert output to HTML.

42 Fusion

43 Sentence Fusion Computation
Common
information identification: alignment of constituents in parsed theme sentences (only some subtrees match); bottom-up local multi-sequence alignment; similarity depends on word/paraphrase similarity and tree structure similarity.
Fusion lattice computation: choose a basis sentence; add subtrees from the fusion not present in the basis; add alternative verbalizations; remove subtrees from the basis not present in the fusion.
Lattice linearization: generate all possible sentences from the fusion lattice; score sentences using a statistical language model.

46 Tracking Across Days
- Users want to follow a story across time and watch it unfold
- Network model for connecting clusters across days: separately cluster events from today's news, then connect the new clusters with yesterday's news; allows for forking and merging of stories
- Interface for viewing connections
- Summaries that update a user on what's new: statistical metrics to identify differences between article pairs; uses a learned model of features; identifies differences at clause and paragraph levels

51 Different Perspectives
- Hierarchical clustering: each event cluster is divided into clusters by country
- Different perspectives can be viewed side by side
- Experimenting with an update summarizer to identify key differences between sets of stories

55 Multilingual Summarization
Given a set of documents on the same event: some documents are in English, some documents are translated from other languages.

56 Issues for Multilingual Summarization
Problem: translated text is errorful. Exploit information available during summarization: similar documents in the cluster. Replace translated sentences with similar English sentences; edit translated text; replace named entities with extractions from similar English text.

57 Multilingual Redundancy
Spanish (machine translated): BAGDAD. - A total of 21 prisoners has been died and a hundred more hurt by firings from mortar in the jail of Abu Gharib (to 20 kilometers to the west of Bagdad), according to has informed general into the U.S.A. Marco Kimmitt.
German (machine translated): Bagdad in the Iraqi capital Aufstaendi attacked Bagdad on Tuesday a prison with mortars and killed after USA gifts 22 prisoners. Further 92 passengers of the Abu Ghraib prison were hurt, communicated a spokeswoman of the American armed forces.
Japanese (machine translated): The Iraqi being stationed US military shot on the 20th, the same day to the allied forces detention facility which is in アブグレイブ of the Baghdad west approximately 20 kilometers, mortar 12 shot and you were packed, 22 Iraqi human prisoners died, it announced that nearly 100 people were injured.
English: BAGHDAD, Iraq – Insurgents fired 12 mortars into Baghdad's Abu Ghraib prison Tuesday, killing 22 detainees and injuring 92, U.S. military officials said.

59 Multilingual Similarity-based Summarization
Pipeline: documents on the same event (Arabic, Japanese, German text) are machine translated into English text; English sentences from related English documents are simplified; similar sentences are detected; MT sentences are replaced or rewritten; summary sentences are selected to produce the English summary.

60 Similarity Computation
Simfinder computes similarity between sentences based on multiple features: proper nouns; verb, noun, and adjective WordNet synonyms; word stem overlap; new: a noun phrase and noun phrase variant feature (FASTR).

61 Similarity Example
Sentence 1 (machine translated): Iraqi President Saddam Hussein that the government of Iraq over 24 years in a "black" near the port of the northern Iraq after nearly eight months of pursuit was considered the largest in history.
- Similarity 0.27: Ousted Iraqi President Saddam Hussein is in custody following his dramatic capture by US forces in Iraq.
- Similarity 0.07: Saddam Hussein, the former president of Iraq, has been captured and is being held by US forces in the country.
- Similarity 0.04: Coalition authorities have said that the former Iraqi president could be tried at a war crimes tribunal, with Iraqi judges presiding and international legal experts acting as advisers.

62 Sentence Simplification
- Machine translated sentences are long and ungrammatical
- Use sentence simplification on English sentences to reduce them to approximately "one fact" per sentence
- Use Arabic sentences to find the most similar simple sentences; present multiple high-similarity sentences

63 Simplification Examples
'Operation Red Dawn', which led to the capture of Saddam Hussein, followed crucial information from a member of a family close to the former Iraqi leader. This becomes: 'Operation Red Dawn' followed crucial information from a member of a family close to the former Iraqi leader. Operation Red Dawn led to the capture of Saddam Hussein.
Saddam Hussein had been the object of intensive searches by US-led forces in Iraq but previous attempts to locate him had proved unsuccessful. This becomes: Saddam Hussein had been the object of intensive searches by US-led forces in Iraq. But previous attempts to locate him had proved unsuccessful.

64 F-measure vs. Threshold
[Figure: F-measure as a function of similarity threshold on alquds.co.uk.195, comparing full sentences, simplified sentences, and FASTR-simplified sentences at thresholds .10 and .65]

65 Multilingual Summarization: References to Named Entities
- Use related English text to find similar references; align translated text with English text
- Automated evaluation of references by comparison with references in a model text
- Metrics: precision, recall and F-measure; word order; determiner choice

66 Example Comparison
American contact Unity (generated) vs. The American Connecting Module Unity (model)
P = 2/3 = 0.67; R = 2/4 = 0.50; F = 0.57
Word order = 2/3 (at most 3 words can be aligned; in this case only 2 can be)
Determiner choice = 0

67 Ongoing Work
- Aligning named entities across multiple translations
- Learning of language models for word order based on related English text at runtime
- 3-part summarization: information common to English and Arabic; information appearing in Arabic only; information appearing in English only

68 Evaluation
DUC (Document Understanding Conference): run by NIST, held annually. Manual creation of topics (sets of documents), with 2-7 human-written summaries per topic. How well does a system-generated summary cover the information in a human summary?

69 User Study: Objectives
Does multi-document summarization help?
- Do summaries help the user find information needed to perform a report writing task?
- Do users use information from summaries in gathering their facts?
- Do summaries increase user satisfaction with the online news system?
- Do users create better quality reports with summaries?
- How do full multi-document summaries compare with minimal 1-sentence summaries such as Google News?

70 User Study: Design
Four parallel news systems:
- Source documents only; no summaries
- Minimal single-sentence summaries (Google News)
- Newsblaster summaries
- Human summaries
All groups write reports given four scenarios: a task similar to an analyst's; participants can only use Newsblaster for research; time-restricted.

71 User Study: Execution
- 4 scenarios, 4 event clusters each (2 directly relevant, 2 peripherally relevant), average 10 documents per cluster
- 45 participants, balanced between liberal arts and engineering; 138 reports
- Exit survey: multiple-choice and open-ended questions
- Usage tracking: each click logged, on or off-site

72 "Geneva" Prompt
The conflict between Israel and the Palestinians has been difficult for government negotiators to settle. Most recently, implementation of the "road map for peace", a diplomatic effort sponsored by ……
- Who participated in the negotiations that produced the Geneva Accord?
- Apart from direct participants, who supported the Geneva Accord preparations and how?
- What has the response been to the Geneva Accord by the Palestinians?
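Several of the evaluations in this talk reduce to precision, recall, and F-measure over overlapping units. The named-entity reference comparison earlier (generated "American contact Unity" vs. model "The American Connecting Module Unity") scored P = 2/3, R = 2/4, F = 0.57, with the determiner scored separately. A minimal sketch of that arithmetic follows; excluding determiners from the overlap count, to mirror the separate determiner-choice metric, is my assumption about how the slide's denominators were obtained.

```python
from collections import Counter

# Determiners are scored by the separate determiner-choice metric (assumption).
DETERMINERS = {"the", "a", "an"}

def prf(generated, model):
    """Precision, recall, and F-measure over multiset word overlap
    between a generated reference and a model reference."""
    gen = Counter(w for w in generated.lower().split() if w not in DETERMINERS)
    mod = Counter(w for w in model.lower().split() if w not in DETERMINERS)
    overlap = sum((gen & mod).values())  # multiset intersection
    p = overlap / sum(gen.values())
    r = overlap / sum(mod.values())
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

Under that assumption, `prf("American contact Unity", "The American Connecting Module Unity")` reproduces the slide's numbers: P = 0.67, R = 0.50, F = 0.57 after rounding.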
73 Measuring Effectiveness
- Score report content and compare across summary conditions
- Compare user satisfaction per summary condition
- Compare where subjects took report content from

75 User Satisfaction
- "More effective than a web search": true with Newsblaster, not with documents only or single-sentence summaries
- Easier to complete the task with summaries than with documents only; enough time with summaries versus documents only
- Summaries helped most: 5% single-sentence summaries, 24% Newsblaster summaries, 43% human summaries

76 User Study: Conclusions
- Summaries measurably improve a news browser's effectiveness for research
- Users are more satisfied: Newsblaster summaries are better than single-sentence summaries like those of Google News
- Users want search (not included in the evaluation)

77 Email Summarization
- A cross between speech and text: elements of dialog, informal language, but more context explicitly repeated than in speech
- Wide variety of types of email: conversation to decision-making
- Different reasons for summarization: browsing large quantities of email (a mailbox); catch-up: join a discussion late and participate (a thread)

78 Email Summarization: Approach
- Collected and annotated multiple corpora of email: hand-written summaries, categorization of threads and messages
- Identified 3 categories of email to address: event planning, scheduling, information gathering
- Developed tools: automatic categorization of email; preliminary summarizers; statistical extraction using email-specific features; components of category-specific summarization

79 Email Summarization by Sentence Extraction
- Use features to identify key sentences: non-email-specific (e.g., similarity to centroid) and email-specific (e.g., following quoted material)
- Rule-based supervised machine learning, training on human-generated summaries
- Add "wrappers" around sentences to show who said what

80 Data for Sentence Extraction
- Columbia ACM chapter executive board mailing list: approximately 10 regular participants, ~300 threads, ~1000 messages
- Threads include: scheduling and planning of meetings and events, question and answer, general discussion and chat
- Annotated by human annotators: hand-written summary; categorization of threads and messages; highlighting important information (such as question-answer pairs)

81 Email Summarization by Sentence Extraction
- Creation of training data: start with human-generated summaries; use SimFinder (a trained sentence similarity measure, Hatzivassiloglou et al. 2001) to label sentences in threads as important
- Learning of sentence extraction rules: use Ripper (a rule learning algorithm, Cohen 1996) to learn rules for sentence classification; use basic and email-specific features in machine learning
- Creating summaries: run learned rules on unseen data; add "wrappers" around sentences to show who said what
- Results: basic features: .55 precision, .40 F-measure; email-specific features: .61 precision, .50 F-measure

82 Sample Automatically Generated Summary (ACM0100)
Regarding "meeting tonight...", on Oct 30, 2000, David Michael Kalin wrote: Can I reschedule my C session for Wednesday night, 11/8, at 8:00?
Responding to this on Oct 30, 2000, James J Peach wrote: Are you sure you want to do it then?
Responding to this on Oct 30, 2000, Christy Lauridsen wrote: David, a reminder that your scheduled to do an MSOffice session on Nov. 14, at 7pm in 252 Mudd.

83 Information Gathering Email: The Problem
Summary from our rule-based sentence extractor:
Regarding "acm home/bjarney", on Apr 9, 2001, Mabel Dannon wrote: Two things: Can someone be responsible for the press releases for Stroustrup?
Responding to this on Apr 10, 2001, Tina Ferrari wrote: I think Peter, who is probably a better writer than most of us, is writing up something for dang and Dave to send out to various ACM chapters. Peter, we can just use that as our "press release", right?
In another subthread, on Apr 12, 2001, Keith Durban wrote: Are you sending out upcoming events for this week?
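The extractive email summarizer above combines general salience features with email-specific ones and wraps each extracted sentence with an attribution line. The sketch below is a simplified stand-in for the learned Ripper rules: it scores sentences by cosine similarity to the thread centroid plus a fixed bonus (an invented weight) when a sentence directly follows quoted material, then emits wrappers in the style of the ACM samples.

```python
import math
import re
from collections import Counter

def _vec(text):
    """Bag-of-words vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def _cos(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def summarize_thread(subject, messages, k=2):
    """messages: list of (sender, date, body); quoted lines start with '>'.
    Score = similarity to the thread centroid + bonus for sentences that
    directly follow quoted material (an email-specific feature)."""
    candidates = []  # (sentence, sender, date, follows_quote)
    for sender, date, body in messages:
        follows_quote = False
        for line in body.splitlines():
            if line.startswith(">"):
                follows_quote = True
                continue
            for sent in re.split(r"(?<=[.!?])\s+", line.strip()):
                if sent:
                    candidates.append((sent, sender, date, follows_quote))
                    follows_quote = False
    centroid = _vec(" ".join(c[0] for c in candidates))
    scored = [(_cos(_vec(s), centroid) + (0.5 if fq else 0.0), i, s, sender, date)
              for i, (s, sender, date, fq) in enumerate(candidates)]
    # Keep the k best, then restore original thread order for readability.
    top = sorted(sorted(scored, reverse=True)[:k], key=lambda t: t[1])
    return "\n".join(f'Regarding "{subject}", on {date}, {sender} wrote: {s}'
                     for _, _, s, sender, date in top)
```

The follows-quote bonus captures the same intuition as the reported results: replies that immediately react to quoted text tend to carry the thread's decisions, which is why the email-specific feature set outperformed the basic one (.50 vs. .40 F-measure).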
84 Detection of Questions
- Questions in interrogative form: inverted subject-verb order
- Supervised rule induction approach; training on Switchboard, testing on the ACM corpus
- Results: recall 0.56, precision 0.96, F-measure 0.70
- Recall is low because questions in the ACM corpus often start with a declarative clause: "So, if you're available, do you want to come?"; "if you don't mind, could you post this to the class bboard?"
- Results without declarative-initial questions: recall 0.72, precision 0.96, F-measure 0.82

85 Detection of Answers
- Supervised machine learning approach: use human-annotated data to generate gold-standard training data; annotators were asked to highlight and associate question-answer pairs in the ACM corpus
- Learn a classifier that predicts whether a segment following a question segment answers it; represent each question and candidate answer segment by a feature vector

            Labeller 1   Labeller 2   Union
Precision   0.690        0.680        0.728
Recall      0.652        0.612        0.732
F1-Score    0.671        0.644        0.730

86 Integrating QA Detection with Summarization
- Use QA labels as features in sentence extraction (F = .545)
- Add automatically detected answers to questions in extractive summaries (F = .566)
- Start with QA pair sentences and augment with extracted sentences (F = .573)

87 Integrated in Microsoft Outlook

88 Meeting Summarization (joint with Berkeley, SRI, Washington)
- Goal: automatic summarization of meetings by generating "minutes" highlighting the debate that affected each decision
- Work to date: identification of agreement/disagreement; machine learning approach with lexical, structural, and acoustic features; use of context: who agreed with whom so far?
- Addressee identification: Bayesian modeling of context

89 Conclusions
- Non-extractive summarization is practical today
- User studies show summarization improves access to needed information
- Advances and ongoing research in tracking events, multilingual summarization, perspective identification
- Moves to new media (email, meetings) raise new challenges with dialog and informal language