Web-Based Traffic Sentiment Analysis

Web-Based Traffic Sentiment Analysis: Methods and Applications
Abstract
With the rise of social media, sentiment analysis has developed rapidly in recent years.
However, only a few studies have focused on the field of transportation, which is not enough to
meet the stringent requirements of safety, efficiency, and information exchange of intelligent
transportation systems (ITSs). We propose traffic sentiment analysis (TSA) as a new tool to
tackle this problem, which provides a new perspective for modern ITSs.
This paper proposes our TSA methods and models and analyzes the advantages and
disadvantages of rule- and learning-based approaches on web data.
In practice, we applied the rule-based approach to real problems, presented an
architectural design, constructed the related bases, demonstrated the process, and discussed
online data collection.
Existing system:
Existing approaches to sentiment analysis can be categorized into rule- and learning-based
approaches. Rule-based approaches often require an expert-defined dictionary of subjective
words; they predict the polarity of a sentence or document by analyzing the patterns in which
such words occur in the text. For example, Wiebe et al. provided a lexicon of
subjectivity clues, such as verbs, adjectives, and nouns, annotated with their polarity (i.e., positive,
negative, or neutral) and strength (i.e., strong or weak). However, this lexicon defines only
the original polarity of a word, and the actual polarity of a word may be modified
by its context in a sentence. Several approaches that consider the context of words have been
proposed to determine the sentiment orientation of words.
In previous studies, the data set contained several subjective texts that could not be easily analyzed
by the rules. The most typical case is the ironic sentence. For instance, in
posts regarding fuel prices, one thread was titled “the fuel price will rise,” to which one user
replied, “go to sell the car.” Such a reply clearly carries an ironic tone; thus, all annotators
manually labeled the reply as “negative.” However, because the computer cannot detect in
the given text any word expressing a negative sentiment, the methods cannot recognize the
sentiment polarity. Therefore, numerous problems remain unsolved.
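
The lexicon-based idea described above can be illustrated with a minimal Java sketch. It is not the authors' implementation: the class name, the lexicon entries, the scores, and the simple "negation word directly before a sentiment word flips its sign" rule are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative rule-based polarity scorer; all entries below are hypothetical. */
class RuleBasedPolaritySketch {

    // Hypothetical lexicon: word -> prior polarity score.
    // The sign encodes polarity; the magnitude encodes strength (2 = strong, 1 = weak).
    static final Map<String, Integer> LEXICON = new HashMap<String, Integer>();

    // Hypothetical negation words that flip the polarity of the word right after them.
    static final Set<String> NEGATIONS =
            new HashSet<String>(Arrays.asList("not", "never", "no"));

    static {
        LEXICON.put("terrible", -2);
        LEXICON.put("slow", -1);
        LEXICON.put("crowded", -1);
        LEXICON.put("smooth", 1);
        LEXICON.put("convenient", 1);
        LEXICON.put("excellent", 2);
    }

    /** Sums prior polarities, flipping a word's sign when a negation directly precedes it. */
    static int score(String sentence) {
        String[] tokens = sentence.toLowerCase(Locale.ROOT).split("[^a-z]+");
        int total = 0;
        boolean negated = false;
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            if (NEGATIONS.contains(token)) {
                negated = true;
                continue;
            }
            Integer prior = LEXICON.get(token);
            if (prior != null) {
                total += negated ? -prior : prior;
            }
            negated = false; // the negation only reaches the next token
        }
        return total;
    }

    /** Maps the summed score to a polarity label. */
    static String classify(String sentence) {
        int s = score(sentence);
        return s > 0 ? "positive" : (s < 0 ? "negative" : "neutral");
    }

    public static void main(String[] args) {
        System.out.println(classify("The bus is not convenient"));
        System.out.println(classify("go to sell the car"));
    }
}
```

With these made-up entries, "The bus is not convenient" is classified as negative because the negation flips the prior polarity of "convenient". The ironic reply "go to sell the car" is classified as neutral because none of its words appears in the lexicon, which is the limitation described above.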
DISADVANTAGES:
• The disadvantage of the rule-based approach is that the sentiment polarity results cannot be as
precise as expected if the context of the texts is not considered. Nevertheless, for
handling web data, this type of approach has the following advantages.
• First, the precision of the rule-based approach is independent of the sizes of the clauses.
Second, the syntax rules of a given language are basic and static despite the differences in
the stylistic features of various users. The thought process and word choice basically
remain unchanged.
• Third, because its rules are relatively static, the rule-based approach can be easily
extended by simply updating the sentiment lexicon, even though
new sentiment words emerge rapidly and the sentiment of some words may
change over time.
PROPOSED SYSTEM:
We propose traffic sentiment analysis (TSA) for processing traffic information from websites.
By taking human affect into consideration, TSA will enrich the capabilities of current ITSs.
TSA is a subfield of sentiment analysis that is concerned with traffic issues in
particular. Because sentiment analysis is domain-sensitive, it is necessary to discuss TSA
problems and construct TSA systems specifically.
TSA treats traffic problems from a new angle and supplements the capabilities of current
ITSs, playing the role of sensing, computing, and supporting
decision making in ITSs. The functions of the TSA system can be
illustrated as follows. 1) Investigation: Collecting public opinion through the TSA system is
more economical and efficient than a public poll. 2) Evaluation: The computational output
of the TSA system can be used to evaluate the performance of traffic services and policies. 3)
Prediction: The TSA system can be further developed to predict the trends of some social events.
For example, to predict whether a cancelled flight would cause chaos, we can analyze the
emotions that passengers express in their posts on Twitter or Weibo through TSA systems. In
addition, specific parts of the TSA system can be viewed as another form of “social sensors”
that, compared with traditional sensor systems, detect the situation from a new, human-centered
perspective.
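
As a rough illustration of the evaluation and prediction functions, the following Java sketch aggregates per-post polarity labels into a share of opinion. The class name, the method names, the label strings, and the threshold rule are assumptions made for this example, not part of the described system.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative aggregation of per-post polarity labels; names and rule are hypothetical. */
class OpinionSummarySketch {

    /** Returns the fraction of posts carrying the given label, or 0.0 for an empty list. */
    static double share(List<String> labels, String label) {
        if (labels.isEmpty()) {
            return 0.0;
        }
        int count = 0;
        for (String current : labels) {
            if (current.equals(label)) {
                count++;
            }
        }
        return (double) count / labels.size();
    }

    /** Hypothetical alert rule: flag a topic once the negative share reaches the threshold. */
    static boolean needsAttention(List<String> labels, double threshold) {
        return share(labels, "negative") >= threshold;
    }

    public static void main(String[] args) {
        // Made-up labels for four posts about one cancelled flight.
        List<String> labels = Arrays.asList("negative", "negative", "positive", "neutral");
        System.out.println("negative share: " + share(labels, "negative"));
        System.out.println("needs attention: " + needsAttention(labels, 0.5));
    }
}
```

In practice the labels would come from the sentiment classification step, and a suitable threshold would have to be chosen from data; the value 0.5 here is only a placeholder.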
ADVANTAGES:
• A rule-based approach is adopted here to address the distinct challenges posed by the web data set.
The architecture of TSA is based on the tackling process, and
its main components are 1) web data collection, 2) preprocessing, 3) extraction of
subjects and objects, 4) extraction of sentiment properties, 5) sentiment calculation and
classification, 6) evaluation or applications, and 7) feedback, which improves the construction of
the sentiment, rule, and TSA object bases.
• Data collection: We gathered data from several websites, ensuring that the
conclusions are based on public opinion or, at least, represent part of the public
opinion.
• Preprocessing: As previously mentioned, web documents must undergo additional processing
because words in the collected sentences are not necessarily separated by spaces. Preprocessing
includes the following steps: 1) the segmentation of text, 2) the labeling of words, and 3) the
replacement of synonymous expressions.
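
The three preprocessing steps can be sketched in Java as follows. This is a minimal illustration under stated assumptions: segmentation uses forward maximum matching, a common dictionary-based technique that the source does not name, and the dictionary, label table, synonym table, and class name are all hypothetical. A space-free ASCII string stands in for text that lacks word boundaries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative preprocessing steps; every table below is hypothetical. */
class PreprocessingSketch {

    // Hypothetical segmentation dictionary; its longest word has 7 characters.
    static final Set<String> DICTIONARY = new HashSet<String>(
            Arrays.asList("bus", "fare", "is", "pricey", "subway", "crowded"));
    static final int MAX_WORD_LENGTH = 7;

    // Hypothetical coarse word labels and synonym table.
    static final Map<String, String> LABELS = new HashMap<String, String>();
    static final Map<String, String> SYNONYMS = new HashMap<String, String>();

    static {
        LABELS.put("bus", "object");
        LABELS.put("fare", "object");
        LABELS.put("subway", "object");
        LABELS.put("pricey", "sentiment");
        LABELS.put("crowded", "sentiment");
        SYNONYMS.put("pricey", "expensive");
        SYNONYMS.put("packed", "crowded");
    }

    /** Step 1: forward maximum matching; unknown characters become one-character tokens. */
    static List<String> segment(String text) {
        List<String> words = new ArrayList<String>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + MAX_WORD_LENGTH);
            // Shrink the candidate until it is a dictionary word or a single character.
            while (end > i + 1 && !DICTIONARY.contains(text.substring(i, end))) {
                end--;
            }
            words.add(text.substring(i, end));
            i = end;
        }
        return words;
    }

    /** Step 2: attach a coarse label to each word ("other" when the word is unknown). */
    static List<String> label(List<String> words) {
        List<String> labels = new ArrayList<String>();
        for (String word : words) {
            String tag = LABELS.get(word);
            labels.add(tag != null ? tag : "other");
        }
        return labels;
    }

    /** Step 3: map synonymous expressions to one canonical form. */
    static List<String> replaceSynonyms(List<String> words) {
        List<String> result = new ArrayList<String>();
        for (String word : words) {
            String canonical = SYNONYMS.get(word);
            result.add(canonical != null ? canonical : word);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = segment("busfareispricey");
        System.out.println(words);
        System.out.println(label(words));
        System.out.println(replaceSynonyms(words));
    }
}
```

Replacing synonyms before sentiment calculation keeps the sentiment lexicon small, because only the canonical form of each expression needs an entry.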
System Configuration:
Hardware Configuration:
• Processor - Pentium IV
• Speed - 1.1 GHz
• RAM - 256 MB (min)
• Hard Disk - 20 GB
• Key Board - Standard Windows Keyboard
• Mouse - Two or Three Button Mouse
• Monitor - SVGA
Software Configuration:
• Operating System : Windows XP
• Programming Language : JAVA
• Java Version : JDK 1.6 & above