Uploaded by super nova

Twitter Research

Abstract
Social media has received increasing attention in recent years.
Public and private opinions on a wide variety of
subjects are expressed and spread continually via
numerous tweets. In the proposed work, tweets are extracted and
classified to identify trending topics
across the globe. Tweets are classified into various topics
in which positive, negative and neutral sentiments are
present. The percentage of each topic is calculated
and the trending topics are analyzed.
Objective of the work
• The main objective of this project is to extract
tweets posted within a particular period of time and
analyze them in order to classify them into various
topics.
• A report on the most trending fields is
generated.
Existing Work
Algorithms:
• Sequential Minimal Optimization
• Hyperpipes
Advantage:
• The accuracy obtained is 72-80%.
Demerits of the base paper:
• It has difficulty with ironic and sarcastic tweets.
Proposed Work
Algorithms used:
• Naïve Bayes
• Random forest
• Tag prediction is performed, in which the tweets are
tagged with relevant keywords.
Merits:
• It deals with ironic and sarcastic tweets.
• It gives higher accuracy than the existing system.
Related works
1) Paper (title, author, publication, year): Sentiment Analysis of Twitter Data: Case Study on Digital India, Prerna Mishra, InCITe 2016
   Technique: NLTK, GATE
   Merits: Achieves high accuracy in classifying sentiment by using machine learning algorithms.
   Demerits: Neutral sentiment tends to be much harder to identify, as it requires determining the context of the tweet message.
2) Paper (title, author, publication, year): Opinion Mining and Sentiment Analysis, Bo Pang and Lillian Lee, 2008
   Technique: Data Collection, Data Preprocessing, Feature Extraction, Sentiment Analysis & ...
   Merits: The unigram feature extractor is the simplest way to retrieve features from a ...
   Demerits: It has a problem with ironic and sarcastic tweets.
Related works
3) Paper (title, author, publication, year): Sentiment Knowledge Discovery in Twitter Streaming Data, Albert Bifet, SVBH 2010
   Technique: Red Opal, Opinion Finder
   Merits: Automatically extracts sentiment (positive or negative) from a tweet.
   Demerits: Evaluating data streams in real time is a challenging task.
4) Paper (title, author, publication, year): Semantic Sentiment Analysis of Twitter, Hassan Saif, ISWC 2012
   Technique: The semantic feature model outperforms the unigram and POS baselines for identifying both negative and positive sentiment.
   Merits: Machine learning techniques perform well for classifying sentiment in tweets.
   Demerits: Raw tweet data can be very noisy, so some pre-processing was necessary, such as replacing all hyperlinks with URL and removing repeated letters.
Problem Statement
To analyze Twitter tweets effectively and identify the
top trending topics across various domains and fields,
such as sports and education.
MODULES:
• Pre-processing
• Hash Tag Classification
• Polarity Classifier
• Emoticon Analysis
DATA RETRIEVAL
• Data from Twitter can be retrieved in several ways:
• Twitter Search APIs
• NodeXL
• Kimonofy Tool, with which we can generate APIs and import all the
required data.
• This data is preprocessed and classified according to
its polarity.
• The emoticon dataset can be retrieved from
twittersentiment.appspot.com.
Preprocessing
All caps identification
Lower casing
URL Removal
Emoticon Analysis
Removal of Punctuations and White spaces
Letter Redundancy / Compression of Words
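The preprocessing steps listed above can be sketched in Python roughly as follows. This is a minimal illustration, not the project's actual implementation: the function name `preprocess`, the regular expressions, the choice to keep `#` for later hashtag classification, and the rule that runs of three or more identical characters collapse to two are all assumptions.

```python
import re
import string

URL_RE = re.compile(r"(https?://\S+|www\.\S+)")
# Small illustrative emoticon set (assumption); ":'(" is listed before ":(".
EMOTICON_RE = re.compile(r"(:'\(|:\)|:\(|:D|:\||;\)|:/|:O)")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # three or more of the same character


def preprocess(tweet):
    """Return (clean_text, emoticons, all_caps_words) for one tweet."""
    # URL removal comes first, so "://" is not mistaken for the ":/" emoticon.
    text = URL_RE.sub(" ", tweet)
    # Emoticon analysis: pull emoticons out before punctuation is stripped.
    emoticons = EMOTICON_RE.findall(text)
    text = EMOTICON_RE.sub(" ", text)
    # All-caps identification: remember fully capitalised words, then lower-case.
    all_caps = [w for w in re.findall(r"[A-Za-z]+", text)
                if len(w) > 1 and w.isupper()]
    text = text.lower()
    # Removal of punctuation ('#' is kept for hashtag classification).
    punct = string.punctuation.replace("#", "")
    text = text.translate(str.maketrans(punct, " " * len(punct)))
    # Letter redundancy: "goooood" -> "good" (runs of 3+ collapse to 2).
    text = REPEAT_RE.sub(r"\1\1", text)
    # Collapse white space.
    text = " ".join(text.split())
    return text, emoticons, all_caps
```

For example, `preprocess("GREAT match today!!! soooo goooood :) http://t.co/abc #IndvsAus")` yields the cleaned text together with the emoticon `:)` and the all-caps word `GREAT`, which later stages can use as extra sentiment signals.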
Hashtag Classification
• Hashtag classification is very important for topic
modeling.
• When posting a message, the user often includes a hashtag,
e.g. #IndvsAus.
• From this we can tell that the post is about the
India versus Australia match.
• This helps in classifying the preprocessed data into
various topics.
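A minimal sketch of how hashtags could drive topic classification, and how the percentage of each topic could then be computed, is given here. The keyword-to-topic map `TOPIC_KEYWORDS` and the function names are hypothetical; a real system would need a much larger map or a learned model.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#(\w+)")

# Hypothetical keyword-to-topic map, for illustration only.
TOPIC_KEYWORDS = {
    "sports": ("indvsaus", "cricket", "ipl"),
    "education": ("exam", "university", "school"),
}


def classify_by_hashtag(tweet):
    """Return the first topic whose keyword appears in one of the tweet's hashtags."""
    for tag in HASHTAG_RE.findall(tweet.lower()):
        for topic, keywords in TOPIC_KEYWORDS.items():
            if any(k in tag for k in keywords):
                return topic
    return "other"


def topic_percentages(tweets):
    """Share of tweets per topic, as percentages."""
    counts = Counter(classify_by_hashtag(t) for t in tweets)
    total = sum(counts.values())
    return {topic: 100.0 * n / total for topic, n in counts.items()}
```

With this sketch, `classify_by_hashtag("What a chase! #IndvsAus")` returns `"sports"`, and `topic_percentages` over a batch of tweets gives the per-topic share used to rank trending topics.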
POLARITY CLASSIFIER
• The polarity classifier is the heart of this work.
• A Naïve Bayes classifier with unigram and bigram models
is used to classify polar data.
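To make the idea concrete, the following is a small, self-contained sketch of a multinomial Naïve Bayes classifier over unigram and bigram features. It is an illustration under stated assumptions, not the project's code: the class name `NaiveBayes`, whitespace tokenisation, and add-one (Laplace) smoothing are choices made here for the example.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Unigram and bigram features of a whitespace-tokenised text."""
    tokens = text.lower().split()
    bigrams = [a + " " + b for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams


class NaiveBayes:
    def __init__(self):
        self.class_counts = Counter()                # documents per label
        self.feature_counts = defaultdict(Counter)   # feature counts per label
        self.vocab = set()

    def train(self, examples):
        """examples: iterable of (text, label) pairs."""
        for text, label in examples:
            self.class_counts[label] += 1
            for f in features(text):
                self.feature_counts[label][f] += 1
                self.vocab.add(f)

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for f in features(text):
                # Add-one (Laplace) smoothing for unseen features.
                score += math.log((self.feature_counts[label][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A classifier trained on a handful of labelled tweets can then be queried with `clf.predict("great movie")`. The toy training data used to try it out is made up; real accuracy depends entirely on the labelled corpus.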
Polarity Shifter
• If a noun, verb or adjective has positive polarity and the word before
it is a negation such as ‘not’, the accuracy may decrease.
• To overcome this, we propose an algorithm that searches for
negation words.
• When the parser finds a negation word, it examines the three words that
follow it.
• If this three-word window contains a noun, verb or adjective with
positive polarity, the polarity of that data is reversed.
• Using this polarity shifter, we aim to achieve higher accuracy.
• For example, take the text ‘The movie was not good’.
• The text contains a positive sentiment word, ‘good’, so the machine may
classify it as positive sentiment.
• With the polarity shifter this does not happen, because the negation is
also taken into account.
• The polarity is reversed and the text is classified as negative sentiment.
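The negation rule described above could be sketched as follows. The word lists are hypothetical mini-lexicons, and part-of-speech tagging is skipped for brevity, so any lexicon word is treated as a candidate noun, verb or adjective. As in the description, only positive words are reversed.

```python
# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"good", "great", "love", "nice"}
NEGATIVE = {"bad", "terrible", "hate", "boring"}
NEGATIONS = {"not", "no", "never"}
WINDOW = 3  # number of words inspected after a negation


def polarity(text):
    """Classify text as 'positive', 'negative' or 'neutral' with negation handling."""
    tokens = text.lower().split()
    score = 0
    for i, tok in enumerate(tokens):
        if tok in POSITIVE:
            value = 1
        elif tok in NEGATIVE:
            value = -1
        else:
            continue
        # A positive word within WINDOW words after a negation is reversed.
        preceding = tokens[max(0, i - WINDOW):i]
        if value > 0 and any(w in NEGATIONS for w in preceding):
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With these lexicons, `polarity("The movie was not good")` is classified as negative, whereas `polarity("The movie was good")` is classified as positive.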
Emoticon Analysis
Emoticon List
Emoticon    Sentiment
:)          Positive
:(          Negative
:D          Positive
:|          Negative
:’(         Negative
;)          Positive
:/          Negative
:O          Negative
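The emoticon list can be turned into a simple lookup, sketched here. The mapping follows the list above; the function name `emoticon_sentiment`, the majority-vote rule, and the "Neutral" result for ties or tweets without emoticons are assumptions made for this example.

```python
import re
from collections import Counter

# Emoticon-to-sentiment mapping taken from the emoticon list.
EMOTICON_SENTIMENT = {
    ":)": "Positive",
    ":(": "Negative",
    ":D": "Positive",
    ":|": "Negative",
    ":'(": "Negative",
    ":’(": "Negative",  # same emoticon typed with a curly apostrophe
    ";)": "Positive",
    ":/": "Negative",
    ":O": "Negative",
}

# Longest emoticons first, so three-character forms are tried before shorter ones.
_PATTERN = re.compile("|".join(
    re.escape(e) for e in sorted(EMOTICON_SENTIMENT, key=len, reverse=True)
))


def emoticon_sentiment(tweet):
    """Majority sentiment of the emoticons in a tweet ('Neutral' on ties or none)."""
    counts = Counter(EMOTICON_SENTIMENT[e] for e in _PATTERN.findall(tweet))
    if counts["Positive"] > counts["Negative"]:
        return "Positive"
    if counts["Negative"] > counts["Positive"]:
        return "Negative"
    return "Neutral"
```

URLs should be removed before calling this function, since the `:/` inside `http://` would otherwise be counted as a negative emoticon.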