Extracting and Ranking Product Features in Opinion Documents
Lei Zhang #, Bing Liu #, Suk Hwan Lim *, Eamonn O'Brien-Strain *
# University of Illinois at Chicago   * HP Labs

1. Introduction

An important task of opinion mining is to extract people's opinions on the features/attributes of an entity. For example, the sentence "I love the GPS function of Motorola Droid" expresses a positive opinion on the "GPS function" of the Motorola phone; "GPS function" is the feature.

2. Existing Techniques

2.1 Double Propagation (Qiu et al. 2010)

A recently proposed unsupervised technique for extracting features from reviews.

Observation: Opinion words are often used to modify features; opinion words and features themselves also have relations within opinionated expressions.

Method: Double propagation assumes that features are nouns/noun phrases and opinion words are adjectives. Opinion words can be recognized through identified features, and features can be identified through known opinion words. The extracted opinion words and features are then used to identify new opinion words and new features, which are in turn used to extract more opinion words and features. This propagation (bootstrapping) process ends when no more opinion words or features can be found. The opinion word/feature relations are identified with a dependency parser based on dependency grammar.

3.2 Feature Ranking

The basic idea is to rank the extracted feature candidates by feature importance: a correct and important feature candidate should be ranked high, while an unimportant candidate or noise should be ranked low. We identify two major factors affecting feature importance:
- Feature relevance: how likely a feature candidate is to be a correct feature.
- Feature frequency: a feature is important if it appears frequently in the opinion documents.
We find that there is a mutual enforcement relation between features on one side and opinion words, part-whole patterns, and "no" patterns on the other.
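The bootstrapping loop of double propagation (Sec. 2.1) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes sentences have already been parsed into direct (adjective, noun-phrase) modification pairs, and the seed lexicon and data below are made up.

```python
def double_propagation(mod_pairs, seed_opinion_words):
    """Bootstrap features and opinion words from a seed opinion lexicon.

    mod_pairs: (adjective, noun_phrase) pairs in which the adjective
    directly modifies the noun phrase (e.g. from a dependency parse).
    """
    opinion_words = set(seed_opinion_words)
    features = set()
    changed = True
    while changed:  # propagation ends when nothing new is found
        changed = False
        for adj, noun in mod_pairs:
            # A known opinion word marks the noun phrase it modifies as a feature.
            if adj in opinion_words and noun not in features:
                features.add(noun)
                changed = True
            # A known feature marks its modifying adjective as an opinion word.
            if noun in features and adj not in opinion_words:
                opinion_words.add(adj)
                changed = True
    return features, opinion_words

pairs = [("great", "screen"), ("sharp", "screen"),
         ("sharp", "lens"), ("poor", "lens"), ("poor", "battery")]
features, opinions = double_propagation(pairs, {"great"})
print(sorted(features))  # ['battery', 'lens', 'screen']
print(sorted(opinions))  # ['great', 'poor', 'sharp']
```

Starting from the single seed "great", the loop first finds the feature "screen", through it the opinion words "sharp" and "poor", and through those the features "lens" and "battery".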
If an adjective modifies many correct features, it is very likely to be a good opinion word. Similarly, if a feature candidate can be extracted by many opinion words, part-whole patterns, or "no" patterns, it is very likely to be a correct feature. Feature indicators and feature candidates thus form a bipartite graph, and the Web page ranking algorithm HITS is directly applicable to it.

2.2 Dependency Grammar

Dependency grammar describes the dependency relations between the words in a sentence.
- Direct relations: one word depends on the other word directly, or both depend on a third word directly (e.g., "The camera has a good lens").
- Indirect relations: one word depends on the other word through other words, or both depend on a third word indirectly.
Parsing indirect relations is error-prone on Web corpora, so we only use direct relations to extract opinion words and feature candidates in our application.

2.3 Weakness

- For large corpora, double propagation may introduce a lot of noise (error propagation).
- For small corpora, it may miss some important features (features that are not modified by any opinion word).

3. Proposed Techniques

To deal with the problems of double propagation, we propose a novel method to mine features, which consists of two steps: feature extraction and feature ranking.

[System figure: each corpus (Corpus 1, ..., Corpus n) is parsed with the help of an opinion lexicon; feature extraction uses double propagation, part-whole relations, and "no" patterns; feature ranking uses the HITS algorithm. Example class concept word: "video"; example features: "picture quality", "GPS function".]

3.3 The Whole Algorithm

Step 1: Extract product features using double propagation, part-whole patterns, and "no" patterns.
Step 2: Compute each feature's score using HITS, without considering frequency.
Step 3: The final score function, which also takes feature frequency into account, is

    S = S(f) * log(freq(f))

where freq(f) is the frequency count of feature f and S(f) is the authority score of feature f computed by HITS.
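Steps 2 and 3 above can be sketched as a small HITS computation on the bipartite indicator-feature graph. This is an illustrative sketch under simplifying assumptions: the edge list, frequency counts, and iteration count are made up, and a real system would build the graph from the extraction step rather than hard-code it.

```python
import math

def rank_features(edges, freq, iters=50):
    """Score feature candidates with HITS, then apply S = S(f) * log(freq(f)).

    edges: (indicator, feature) pairs; an indicator (opinion word,
    part-whole pattern, or "no" pattern) extracted that feature candidate.
    freq:  frequency count of each feature candidate in the corpus.
    """
    indicators = {i for i, _ in edges}
    features = {f for _, f in edges}
    hub = {i: 1.0 for i in indicators}   # indicators act as hubs
    auth = {f: 1.0 for f in features}    # candidates act as authorities
    for _ in range(iters):
        # Authority: sum of hub scores of the indicators extracting the feature.
        auth = {f: sum(hub[i] for i, g in edges if g == f) for f in features}
        # Hub: sum of authority scores of the features the indicator extracts.
        hub = {i: sum(auth[f] for j, f in edges if j == i) for i in indicators}
        na = math.sqrt(sum(v * v for v in auth.values()))
        nh = math.sqrt(sum(v * v for v in hub.values()))
        auth = {f: v / na for f, v in auth.items()}
        hub = {i: v / nh for i, v in hub.items()}
    # Step 3: fold in feature frequency.
    return {f: auth[f] * math.log(freq[f]) for f in features}

edges = [("good", "screen"), ("great", "screen"),
         ("good", "battery"), ("no", "noise")]
freq = {"screen": 40, "battery": 25, "noise": 3}
scores = rank_features(edges, freq)
print(max(scores, key=scores.get))  # screen
```

"screen" ranks first because it is extracted by the most indicators (high authority) and is also the most frequent candidate.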
3.1 Feature Extraction

We still adopt the double propagation idea to populate feature candidates, but we make two improvements, based on part-whole relation patterns and a "no" pattern, to find features that double propagation cannot find. There are thus three kinds of feature indicators:

(1) Double propagation (opinion words).

(2) Part-whole relation patterns. A part-whole pattern indicates that one object is part of another object. It is a good indicator for features if the class concept word (the "whole" part, CP) is known.
- Phrase patterns: NP + Prep + CP (e.g., "battery of the camera"); CP + with + NP (e.g., "mattress with a cover"); NP CP or CP NP (e.g., "mattress pad").
- Sentence pattern: CP + verb + NP (e.g., "the phone has a big screen").

(3) The "no" pattern, a pattern specific to product reviews and forum posts. People often express comments or opinions on features with this short pattern (e.g., "no noise").

4. Experiments

We used 4 diverse corpora to evaluate the techniques. They were obtained from a commercial company; the data were crawled and extracted from multiple online message boards and blogs discussing different products and services.

Table 1. Descriptions of the 4 corpora
Table 2. Results of 1000 sentences
Table 3. Precision at top 50 feature candidates

5. Conclusions

We presented a new method to deal with the problems of double propagation. Part-whole and "no" patterns are used to increase recall; the extracted feature candidates are then ranked by feature importance, which is determined by feature relevance and feature frequency. HITS was applied to compute feature relevance.
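As a closing illustration, the extraction patterns of Sec. 3.1 can be sketched over plain token lists. Everything here is an assumption made for illustration: a real system would use POS tags and noun-phrase chunking, and the CP_WORDS and SKIP lists are toy stand-ins, not the authors' resources.

```python
CP_WORDS = {"camera", "phone", "mattress"}           # known class concept words
SKIP = {"the", "a", "an", "big", "of", "with", "has", "no"}  # toy function/adjective words

def np_head(tokens):
    """Crude NP head: the last token that is not a function/adjective word."""
    for t in reversed(tokens):
        if t not in SKIP:
            return t
    return None

def extract_candidates(tokens):
    """Apply the part-whole and 'no' patterns to one tokenized sentence."""
    cands = set()
    for i, tok in enumerate(tokens):
        # NP + "of" + CP, e.g. "battery of the camera"
        if tok == "of" and i > 0 and np_head(tokens[i + 1:i + 3]) in CP_WORDS:
            cands.add(tokens[i - 1])
        # CP + "with" + NP, e.g. "mattress with a cover"
        elif tok == "with" and i > 0 and tokens[i - 1] in CP_WORDS:
            head = np_head(tokens[i + 1:i + 3])
            if head:
                cands.add(head)
        # Sentence pattern CP + verb + NP, e.g. "the phone has a big screen"
        elif tok == "has" and i > 0 and tokens[i - 1] in CP_WORDS:
            head = np_head(tokens[i + 1:i + 4])
            if head:
                cands.add(head)
        # Noun compound CP NP, e.g. "mattress pad"
        elif (tok in CP_WORDS and i + 1 < len(tokens)
              and tokens[i + 1] not in SKIP and tokens[i + 1] not in CP_WORDS):
            cands.add(tokens[i + 1])
        # "no" pattern, e.g. "no noise"
        elif tok == "no" and i + 1 < len(tokens):
            cands.add(tokens[i + 1])
    return cands

print(sorted(extract_candidates("the phone has a big screen".split())))  # ['screen']
print(sorted(extract_candidates("no noise".split())))                    # ['noise']
```

Each branch corresponds to one pattern from Sec. 3.1; the candidates it yields would then be fed into the HITS-based ranking.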