Web-Based Traffic Sentiment Analysis: Methods and Applications

Abstract

With the recent growth of social media, sentiment analysis has developed rapidly. However, only a few studies have focused on the field of transportation, which fails to meet the stringent requirements of safety, efficiency, and information exchange of intelligent transportation systems (ITSs). We propose traffic sentiment analysis (TSA) as a new tool to tackle this problem, which provides a new perspective for modern ITSs. The methods and models of TSA are presented in this paper, and the advantages and disadvantages of rule- and learning-based approaches are analyzed based on web data. In practice, we applied the rule-based approach to real problems, presented an architectural design, constructed the related bases, demonstrated the process, and discussed online data collection.

Existing system:

Existing approaches to sentiment analysis can be categorized into rule-based and learning-based approaches. Rule-based approaches often require an expert-defined dictionary of subjective words; they predict the polarity of a sentence or document by analyzing the patterns in which such words occur in the text. For example, Wiebe et al. provided a lexicon of subjectivity clues, such as verbs, adjectives, and nouns, annotated with their polarity (i.e., positive, negative, or neutral) and strength (i.e., strong or weak). However, this lexicon defines only the original polarity of a word, and the actual polarity of a word may be modified by its context in a sentence. Several approaches that consider the context of words have been proposed to determine the sentiment orientation of words.

In previous studies, the data set contained several subjective texts that could not be easily analyzed by the rules. The most typical case is the ironic sentence.
For instance, in posts regarding fuel prices, one thread was titled “the fuel price will rise,” to which one user replied, “go to sell the car.” The reply clearly carries an ironic tone; thus, all annotators manually labeled it as “negative.” However, because the text contains no word expressing a negative sentiment, the rule-based methods cannot recognize its sentiment polarity. Therefore, numerous problems remain unsolved.

DISADVANTAGES:

The disadvantage of the rule-based approach is that the sentiment polarity results cannot be as precise as expected if the context of the text is not considered. Nevertheless, for handling web data, this type of approach has the following advantages. First, the precision of the rule-based approach is independent of the sizes of the clauses. Second, the syntax rules of a given language are basic and static despite the differences in the stylistic features of various users; the thought process and word choice basically remain unchanged. Third, the rules of the rule-based approach are relatively static, and the approach can be easily extended by simply updating the sentiment lexicon, although new sentiment words rapidly emerge and the sentiment of several words may change.

PROPOSED SYSTEM:

We propose traffic sentiment analysis (TSA) for processing traffic information from websites. By taking human affect into consideration, TSA will enhance the performance of current ITSs. TSA is a subfield of sentiment analysis that is concerned with traffic issues in particular. Because sentiment analysis is sensitive to the field in which it is applied, it is necessary to discuss TSA problems and construct TSA systems specifically. TSA treats traffic problems from a new angle and supplements the capabilities of current ITSs. Within the modules of an ITS, TSA plays the role of sensing, computing, and supporting decision making.
The functions of the TSA system can be illustrated as follows.

1) Investigation: Collecting public opinion through the TSA system is more economical and efficient than conducting a public poll.
2) Evaluation: The computational output of the TSA system can be used to evaluate the performance of traffic services and policies.
3) Prediction: The TSA system can be further developed to predict the trends of some social events. For example, to predict whether a cancelled flight would bring chaos, we can analyze the emotions of passengers from the words they publish on Twitter or Weibo through TSA systems.

In addition, specific parts of the TSA system can be viewed as another form of “social sensors” compared with traditional sensor systems; they can detect the situation from a new, humanized perspective.

ADVANTAGES:

The rule-based approach is adopted here to address the distinct challenges posed by the web data set. The architecture of TSA is based on the tackling process. Its main components are 1) web data collection, 2) preprocessing, 3) extraction of subjects and objects, 4) extraction of sentiment properties, 5) sentiment calculation and classification, 6) evaluation or applications, and 7) feedback, which improves the construction of the sentiment, rule, and TSA object bases.

Data collection: We gathered data from several websites, ensuring that the conclusions are based on public opinion or, at least, represent part of the public opinion.

Preprocessing: As previously mentioned, web documents must be processed additionally because the source text does not separate words with spaces in sentences. Preprocessing includes the following steps: 1) the segmentation of text, 2) the labeling of words, and 3) the replacement of synonymous expressions.
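The preprocessing steps and the sentiment calculation described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the system's actual implementation: the class name `TsaRuleSketch`, the tiny lexicon, the synonym table, the negation list, and the single negation rule are all hypothetical, and splitting on non-letters stands in for a real word segmenter. In this sketch the labels are computed on the tokens after synonym replacement, for simplicity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; the lexicon, synonyms, and rule are hypothetical. */
final class TsaRuleSketch {

    // Hypothetical sentiment base: word -> polarity (+1 positive, -1 negative).
    static final Map<String, Integer> LEXICON = new HashMap<String, Integer>();
    // Hypothetical synonym table: variant -> canonical expression.
    static final Map<String, String> SYNONYMS = new HashMap<String, String>();
    // Hypothetical negation words that flip the next sentiment word.
    static final Set<String> NEGATIONS =
            new HashSet<String>(Arrays.asList("not", "no", "never"));

    static {
        LEXICON.put("congested", -1);
        LEXICON.put("expensive", -1);
        LEXICON.put("smooth", 1);
        LEXICON.put("convenient", 1);
        SYNONYMS.put("jammed", "congested");
        SYNONYMS.put("pricey", "expensive");
    }

    // Segmentation: splitting on non-letters is a stand-in; text without
    // spaces between words would need a real word segmenter here.
    static List<String> segment(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Replacement of synonymous expressions with their canonical form.
    static List<String> normalize(List<String> tokens) {
        List<String> out = new ArrayList<String>();
        for (String t : tokens) {
            out.add(SYNONYMS.containsKey(t) ? SYNONYMS.get(t) : t);
        }
        return out;
    }

    // Labeling: tags a token as NEGATION, SENTIMENT, or OTHER.
    static String label(String token) {
        if (NEGATIONS.contains(token)) {
            return "NEGATION";
        }
        return LEXICON.containsKey(token) ? "SENTIMENT" : "OTHER";
    }

    // Sentiment calculation: sums lexicon polarities; a negation word
    // flips the polarity of the next sentiment word that follows it.
    static int score(String text) {
        int total = 0;
        boolean negated = false;
        for (String t : normalize(segment(text))) {
            String tag = label(t);
            if (tag.equals("NEGATION")) {
                negated = true;
            } else if (tag.equals("SENTIMENT")) {
                int polarity = LEXICON.get(t);
                total += negated ? -polarity : polarity;
                negated = false;
            }
        }
        return total;
    }

    // Classification: maps the score to a polarity class.
    static String classify(String text) {
        int s = score(text);
        return s > 0 ? "positive" : (s < 0 ? "negative" : "neutral");
    }

    public static void main(String[] args) {
        System.out.println(classify("The road is jammed and parking is pricey"));
        System.out.println(classify("The new subway line is not convenient"));
        System.out.println(classify("go to sell the car"));
    }
}
```

With this toy lexicon, the first two sentences are classified as negative, while the ironic reply "go to sell the car" comes out as neutral because it contains no lexicon word. That is the limitation of rule-based methods noted earlier, and extending coverage amounts to adding entries to the lexicon and synonym table.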
System Configuration:

Hardware Configuration:
Processor Speed - Pentium IV 1.1 GHz
RAM - 256 MB (min)
Hard Disk - 20 GB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA

Software Configuration:
Operating System : Windows XP
Programming Language : Java
Java Version : JDK 1.6 & above