Identifying Features in Opinion Mining via
Intrinsic and Extrinsic Domain Relevance
Abstract:
Sentiment analysis, or opinion mining, aims to use automated tools to detect
subjective information such as opinions, attitudes, and feelings expressed in
text. Supervised approaches to sentiment classification often fail to perform
satisfactorily when shifted to other domains, whereas the weakly supervised
nature of the joint sentiment-topic (JST) model makes it highly portable across
domains. Emotion mining can provide a new aspect for document categorization and
thereby help online users select related documents based on their emotional
preferences. It also discovers the connections between social emotions and
affective terms, and uses them to predict the social emotion of text content
automatically. The vast majority of existing approaches to opinion feature
extraction rely on mining patterns from a single review corpus only, ignoring the
nontrivial disparities in the word distributional characteristics of opinion
features across different corpora. In this paper, we propose a novel method to
identify opinion features from online reviews by exploiting the difference in
opinion feature statistics across two corpora: one domain-specific corpus (i.e.,
the given review corpus) and one domain-independent corpus (i.e., the contrasting
corpus). We capture this disparity via a measure called domain relevance (DR),
which characterizes the relevance of a term to a text collection.
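The abstract does not spell out how DR is computed, so the following Python sketch shows only one plausible formulation: a term's score is the product of its dispersion (how evenly its weight is spread across documents) and its deviation (how far its weight exceeds the average term weight in each document). The TF-IDF-style weighting, the function names `term_weights` and `domain_relevance`, the zero-variance guard, and the toy reviews are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_weights(docs):
    """Map each term to its list of per-document TF-IDF-style weights.

    docs is a list of token lists. The weighting (1 + log tf) * log(N / df)
    is an assumed choice, not something stated in the abstract.
    """
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    weights = {}
    for term, df in doc_freq.items():
        idf = math.log(n_docs / df)
        weights[term] = [
            (1 + math.log(c[term])) * idf if c[term] > 0 else 0.0
            for c in counts
        ]
    return weights


def domain_relevance(term, weights):
    """Assumed DR score: dispersion * deviation of the term's weights."""
    if term not in weights:
        return 0.0
    w = weights[term]
    n_docs = len(w)
    mean_w = sum(w) / n_docs
    std_w = math.sqrt(sum((x - mean_w) ** 2 for x in w) / n_docs)
    if std_w == 0:
        # Guard chosen for this sketch: a term with constant weight gets 0.
        return 0.0
    dispersion = mean_w / std_w
    # Average weight of all terms in each document.
    n_terms = len(weights)
    doc_avg = [
        sum(tw[j] for tw in weights.values()) / n_terms for j in range(n_docs)
    ]
    deviation = sum(w[j] - doc_avg[j] for j in range(n_docs))
    return dispersion * deviation


# Toy review corpus (made up for illustration).
reviews = [
    ["battery", "life", "great"],
    ["battery", "drains", "fast"],
    ["screen", "bright", "battery"],
    ["screen", "cracked"],
]
weights = term_weights(reviews)
for term in ("battery", "screen", "cracked"):
    print(term, round(domain_relevance(term, weights), 3))
```

Under this reading, the same function would be run once on the review corpus and once on the contrasting corpus to obtain two scores per term. A corpus of four short documents is far too small to give meaningful rankings; the example only demonstrates the mechanics.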
Existing system:
A sentiment classification model trained in one domain often does not work well
in another domain. Existing approaches favor supervised learning, which requires
labeled corpora for training and potentially limits applicability to other
domains of interest. Topic/feature detection and sentiment classification are
often performed separately, which ignores their mutual dependence. Failing to
consider the mixture of topics in the text limits the usefulness of the mining
results to users.
– e.g. ‘unpredictable’:
• Negative in an automobile review (‘unpredictable steering’)
• Positive in a movie review (‘unpredictable plot’)
Proposed system:
We first extract a list of candidate opinion features from the domain review
corpus by defining a set of syntactic dependence rules. For each extracted
candidate feature, we then estimate its intrinsic-domain relevance (IDR) and
extrinsic-domain relevance (EDR) scores on the domain-dependent and
domain-independent corpora, respectively. Candidates that are domain-specific
(IDR above one threshold) and not generic (EDR below another threshold) are then
confirmed as opinion features. By contrast, a supervised learning model may be
tuned to work well in a given domain, but it must be retrained when it is applied
to different domains. Topic modeling approaches can mine only coarse-grained,
generic topics or aspects, which are actually semantic clusters of the specific
features commented on explicitly in reviews.
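A minimal sketch of how the two scores might be combined, assuming a candidate is kept when its IDR is high enough and its EDR is low enough. The function name `select_opinion_features`, the threshold values, and the toy scores are all hypothetical; in practice the scores would come from the two corpora and the thresholds would need tuning.

```python
def select_opinion_features(candidates, idr, edr, idr_min, edr_max):
    """Keep candidates that look domain-specific (high IDR) and not generic (low EDR).

    idr and edr map each candidate term to its score on the review corpus
    and on the contrasting corpus. Candidates missing either score are dropped.
    """
    selected = []
    for term in candidates:
        if term not in idr or term not in edr:
            continue
        if idr[term] >= idr_min and edr[term] <= edr_max:
            selected.append(term)
    return selected


# Toy scores for a phone-review domain (illustrative values only).
candidates = ["battery", "screen", "thing", "time"]
idr = {"battery": 0.9, "screen": 0.8, "thing": 0.7, "time": 0.2}
edr = {"battery": 0.1, "screen": 0.2, "thing": 0.8, "time": 0.9}

features = select_opinion_features(candidates, idr, edr, idr_min=0.5, edr_max=0.3)
print(features)  # ['battery', 'screen']
```

Here "thing" is rejected despite its high IDR because it is also common in the contrasting corpus, which is the kind of generic term that a single-corpus method would have difficulty filtering out.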
System Requirements:
Software Requirements:
Operating System : Windows XP
Platform         : JDK 1.6
Server side      : Glassfish Server 2.1, JSP, Xampp 1.7.1
Frontend         : JSP
Backend          : MySQL 5.1
Hardware Requirements:
Processor        : Pentium 4
RAM              : 512 MB and above
Hard Disk        : 40 GB and above