Opinion Observer: Analyzing and Comparing Opinions on the Web

Bing Liu, Minqing Hu, Junsheng Cheng
Paper Presentation: Vinay Goel

Introduction
• Web: excellent source of consumer opinions
• Online customer reviews of products
• Useful information to customers and product manufacturers
• Novel framework for analyzing and comparing customer opinions
• Technique based on language pattern matching to extract product features

Opinion Observer

Technical Tasks
• Identify product features that customers have expressed their opinions on
• For each feature, identify whether the opinion is positive or negative
• Review Format (2): Pros, Cons and detailed review
• The paper proposes a technique to identify product features from pros and cons in this format

Problem Statement
• Let P = {P1, P2, …, Pn} be a set of products that the user is interested in
• Each product Pi has a set of reviews Ri = {r1, r2, …, rk}
• Each review rj is a sequence of sentences rj = {sj1, sj2, …, sjm}

Product Feature
• A product feature f in rj is an attribute/component of the product that has been commented on in rj
• If f appears in rj, it is an explicit feature: “The battery life of this camera is too short”
• If f does not appear in rj but is implied, it is an implicit feature: “This camera is too large” (size)

Opinions and features
• Opinion segment of a feature: a set of consecutive sentences that expresses a positive or negative opinion on f
  “The picture quality is good, but the battery life is short”
• Positive opinion set of a feature (Pset): the set of opinion segments of f that express positive opinions about f, taken from all the reviews of the product
• Nset can be defined similarly

Visualizing Opinion Comparison

Automated opinion analysis
• Explicit and implicit features
• Synonyms
• Granularity of features

Extracting Product Features

Labeling
• Perform POS tagging and remove digits: “<V>included<N>MB<V>is<Adj>stingy”
• Replace actual feature words with [feature]: “<V>included<N>[feature]<V>is<Adj>stingy”
• Use n-grams to produce shorter segments: “<V>included<N>[feature]<V>is” and “<N>[feature]<V>is<Adj>stingy”
• Distinguish duplicate tags: “<N1>[feature]<N2>usage”
• Perform word stemming

Rule Generation
• Association rule mining
• Only need rules that have [feature] on the right-hand side (<N1>,<N2> --> [feature])
• Consider the sequence of items in the conditional part (left-hand side) of each rule
• Generate language patterns (<N1>[feature]<N2>)

Feature Refinement strategies
• There may be a more likely feature in the sentence segment that is not extracted by any pattern: “slight hum from subwoofer when not in use”
• Frequent-Noun: only a noun replaces another noun
• Frequent-Term: any type of replacement

Semi-Automated Tagging of Reviews

Extracting Reviews from Web Pages
• Nontrivial task
• MDR-2
• System finds patterns from a page containing reviews
• System uses these patterns to extract reviews from other pages of the site

System Architecture

Experimental Results
• Amount of time saved by semi-automatic tagging is around 45%
• Group synonyms using WordNet (52% recall and 100% precision)
• Does not handle context-dependent synonyms

Conclusion
• Novel visual analysis system
• Supervised pattern discovery method
• Interactive correction of errors of the automatic system
• Improve techniques, study strength of opinions
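
Illustration: labeling steps
The labeling steps listed under Extracting Product Features can be made concrete with a small sketch. This is not the authors' code, only a minimal illustration under stated assumptions: the input is already POS-tagged as (tag, word) pairs, the raw word "16MB" is an assumed input chosen so that digit removal yields the "MB" shown on the slide, and the function names (label_segment, ngrams, number_duplicate_tags, render) are hypothetical. Word stemming and association rule mining are left out.

```python
import re
from collections import Counter


def label_segment(tagged, feature_words):
    """Drop digits, then replace known feature words with [feature].

    tagged: list of (pos_tag, word) pairs, assumed to come from a POS tagger.
    feature_words: lowercase feature words marked by a human tagger.
    """
    labeled = []
    for tag, word in tagged:
        word = re.sub(r"\d+", "", word)  # remove digits, e.g. "16MB" -> "MB"
        if not word:
            continue  # the token was digits only
        token = "[feature]" if word.lower() in feature_words else word
        labeled.append((tag, token))
    return labeled


def ngrams(items, n):
    """Split a labeled segment into overlapping windows of n items."""
    if len(items) <= n:
        return [items]
    return [items[i:i + n] for i in range(len(items) - n + 1)]


def number_duplicate_tags(items):
    """Number tags that occur more than once: <N>, <N> -> <N1>, <N2>."""
    totals = Counter(tag for tag, _ in items)
    seen = Counter()
    numbered = []
    for tag, token in items:
        if totals[tag] > 1:
            seen[tag] += 1
            numbered.append((f"{tag}{seen[tag]}", token))
        else:
            numbered.append((tag, token))
    return numbered


def render(items):
    """Format (tag, token) pairs in the slide's <Tag>token notation."""
    return "".join(f"<{tag}>{token}" for tag, token in items)


# Assumed tagged input for the slide's example segment.
tagged = [("V", "included"), ("N", "16MB"), ("V", "is"), ("Adj", "stingy")]
labeled = label_segment(tagged, feature_words={"mb"})
print(render(labeled))  # <V>included<N>[feature]<V>is<Adj>stingy
for gram in ngrams(labeled, 3):
    print(render(gram))
print(render(number_duplicate_tags([("N", "[feature]"), ("N", "usage")])))
# <N1>[feature]<N2>usage
```

With a window of 3, the example segment yields the two shorter segments shown on the Labeling slide. The window size is a free parameter in this sketch; the slides do not state which value was used.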
