Mixed-Initiative Social Media Analytics at the World Bank:
Observations of Citizen Sentiment in Twitter Data to Explore
“Trust” of Political Actors and State Institutions and its
Relationship to Social Protest
20 October, 2015
IEEE Big Data and Humanities
Nadya Calderon, Brian Fisher, Jeff Hemsley,
Bill Cescavich, Greg Jansen, Richard Marciano
and Victoria L. Lemieux
Motivation
WE FEEL FINE PROJECT
Main Goal of the Project:
– Use Big Data Analytics to
contribute to research on trust
in government, specifically the
relationship among trust in
Government, trust in state
institutions and citizens’
collective behavior
Research Questions:
– how Brazilian citizens felt about
their state institutions
– how these feelings connected to
their sentiments about Brazilian
Federal and State government
services and politicians
– how such sentiments translated
into collective behaviors
INFORMATION LANDSCAPE IN BRAZIL
Number of internet users in Brazil
107.7m
Internet user penetration in Brazil
53.1%
Average duration of monthly internet usage in Brazil
29.4h
Google is the most popular search engine in Brazil based on market share
96.7%
Mobile phone internet users in Brazil
72.1m
Active social media penetration in Brazil
45%
Number of social network users in Brazil
78.1m
Number of Twitter Users
41.2m
CONTEXT OF THE STUDY
2014 FIFA WORLD CUP
MIXED INITIATIVE SOCIAL MEDIA ANALYTICS (MISMA)
Interleaved contributions by the user and
the system, together converging on a
solution to a single problem.
Asymmetric division of labour such that
the contributions made by the computer
and the user are distinct.
Kirkpatrick, A. E., Dilkina, B., & Havens, W. S. (2005, June). A framework for designing and evaluating
mixed-initiative optimization systems. In ICAPS Workshop on Mixed-Initiative Planning and Scheduling,
held in conjunction with the Fifteenth International Conference on Automated Planning and
Scheduling, Monterey, California, June.
OVERVIEW OF THE MISMA METHODOLOGY
1. Using the sentiment expressed in the content of Twitter data to measure “Trust”
2. Instrument choice = sentiment analysis (SentiStrength)
3. Initial “big picture” harvest of Twitter data
4. Visual analysis and Text analysis of “big picture”
5. Use of search terms to harvest historical Twitter data
6. Sentiment classification of harvested tweets
7. Development and use of VA tool to explore historical Twitter collections
8. Pair analysis
9. Analysis of competing hypotheses (ACH)
SENTIMENT ANALYSIS
• Rule-Based Classifier
• Representative Features : Dictionary of sentiment words,
booster words, negating words, question words.
• We integrated ANEW-BR valence words
• Evaluation with Portuguese Speakers
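A rule-based classifier of the kind listed above can be sketched in a few lines of Python. This is only an illustration of the general idea (a sentiment dictionary adjusted by booster and negating words), not SentiStrength's actual algorithm. The tiny word lists, the function name `score_tweet`, and the rule that a modifier applies to the next sentiment word are all assumptions made for the example, and question words are left out for brevity.

```python
import re

# Hypothetical toy lexicons for illustration only; these are NOT the
# SentiStrength or ANEW-BR word lists used in the study.
SENTIMENT = {"bom": 2, "ótimo": 3, "ruim": -2, "péssimo": -3, "corrupto": -3}
BOOSTERS = {"muito": 1, "pouco": -1}   # strengthen or weaken the next sentiment word
NEGATORS = {"não", "nunca"}            # flip the polarity of the next sentiment word
MAX_MAGNITUDE = 4                      # magnitude is capped at 4, as on the slides


def score_tweet(text):
    """Return (positive, negative) scores, each with magnitude 0 to 4."""
    tokens = re.findall(r"\w+", text.lower())
    positive, negative = 0, 0
    boost, negate = 0, False
    for token in tokens:
        if token in BOOSTERS:
            boost += BOOSTERS[token]
        elif token in NEGATORS:
            negate = True
        elif token in SENTIMENT:
            value = SENTIMENT[token]
            # Apply the pending booster, keeping the magnitude within 1..4.
            magnitude = max(1, min(MAX_MAGNITUDE, abs(value) + boost))
            is_positive = (value > 0) != negate
            if is_positive:
                positive = max(positive, magnitude)
            else:
                negative = max(negative, magnitude)
            boost, negate = 0, False  # modifiers are consumed by this word
    return positive, -negative


print(score_tweet("O serviço não é bom"))
print(score_tweet("Governo muito ruim"))
```

Keeping the strongest positive and strongest negative word separately, rather than summing them, lets a mixed tweet keep both signals; that design choice is also an assumption of this sketch.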
Sentiment Classification
with SentiStrength
Valence: + or -
Magnitude: 0 to 4
Score range: -4 to +4
HISTORICAL TWEET COLLECTION DETAILS
WORD CO-OCCURRENCE
serviço, serviços, saúde, hospital ,
hospitais,
policia, polícia, policiais,
educação, faculdade,
dilma, governo, lula, presidente,
presidenta, federal, prefeito,
prefeitura,
ministério, ministro, municipal,
vigilância,
politica, política, oposição, justiça,
justo, petista, pt,
corrupção, corrupto, corruptos,
brasileiro, brasileiros, brasileira,
brasileiras,
crise, água, emergência, falta,
petrobras, petrobrás
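As a rough illustration of how co-occurrence among terms like these might be counted, the sketch below tallies how often each pair of vocabulary words appears in the same tweet. The function name `cooccurrence_counts`, the sample tweets, and the choice of "same tweet" as the co-occurrence window are assumptions for the example; the slides do not describe the tool or window actually used.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(tweets, vocabulary):
    """Count how many tweets contain each unordered pair of vocabulary words."""
    vocab = set(vocabulary)
    counts = Counter()
    for text in tweets:
        # Distinct vocabulary words in this tweet, sorted so that each
        # pair is always stored in the same order.
        present = sorted(set(re.findall(r"\w+", text.lower())) & vocab)
        counts.update(combinations(present, 2))
    return counts


# Made-up sample tweets, for illustration only.
sample = [
    "Falta água e saúde no governo",
    "Governo Dilma e água",
    "Dilma",
]
pairs = cooccurrence_counts(sample, ["água", "saúde", "governo", "dilma"])
for pair, n in pairs.most_common(3):
    print(pair, n)
```

Pairs with high counts would be candidates for the two-word search phrases used to harvest historical tweets.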
COLLECTION THEMES: SEARCH PHRASES
Political Opinion
dilma lula pt, dilma lula política, organização criminosa pt,
dilma governo, pt governo, política dilma, política governo,
brasileiros dilma, brasileiro governo, oposição governo,
acabar corrupção, impeachment dilma, dilma precisa,
reforma politica, dilma vítima, dilma pobres,
povo brasileiro, brasileira política
Public Services **Water
falta água, crise água, crise hídrica,
água dilma, água governo, água saúde, falta d água,
água pt, água acabando, água corrupção, água brasileiro,
seca dilma, água educação,
educação saúde, educação dilma, educação serviços,
saúde serviços, água educação, governo saúde, dilma saúde,
educação governo, educação federal, federal polícia
Petrobras
dilma petrobrás, petrobrás pt, petrobrás corrupto,
corrupção petrobrás, petrobrás crise, petrobrás presidente,
petrobrás brasileiro, petrobrás dinheiro,
graça foster petrobrás, petrobrás mesma coisa, dilma graça
VISUAL ANALYSIS
The sense-making loop for visual analysis, based on
Van Wijk’s (2005) simple model of visualization.
PROJECT FINDINGS
Brazilian citizens were expressing negative sentiment about the national
government’s low level of investment in education, health and water, and to a
lesser extent security and electricity, relative to spending on the World Cup. At
the state level, water was the key issue.
The study also found support for theories of relative deprivation as a
cause of protest and for theories of digitally-mediated modes of political
contestation.
The use of big data analytics made it possible to observe the protests from a
distance, both in terms of space (i.e., the research team did not travel to Brazil)
and time (i.e., the study used historical Tweets).
LIMITATIONS OF THE STUDY
Representativeness of the
sample
Performance of Sentiment
Classification for political
opinion analysis: take
interactive machine learning
approach to refine during
exploration
Historical metadata:
limitations with geographical
analysis
Topical themes introduce
biases that need to be
made explicit
A-Historicity
Network analysis
Tool design
Domain Expertise
Privacy
FINAL THOUGHT
Evaluation persuades rather than convinces, argues
rather than demonstrates, is credible rather than certain,
is variously accepted rather than compelling.
- Ernest R. House, Evaluating with Validity, Beverly Hills, 1980