Social Media Intelligence: Measuring Brand
Sentiment from Online Conversations
David A. Schweidel
Goizueta Business School
Emory University
October 2012
What’s Trending on Social Media?
Agenda
• Social Dynamics in Social Media Behavior
– Why do people post a product opinion? What influences their posting behavior? (Moe and Schweidel 2012)
– What is the impact of these social dynamics on product sales? (Moe and Trusov 2011)
• Social Media Intelligence (work in progress with W. Moe)
– What factors influence social media metrics?
– How can we adjust our metrics for different sources of data?
Why do people post?
• Opinion formation
– Pre/post purchase (Kuksov and Xie 2008)
– Customer satisfaction and word-of-mouth (Anderson and Sullivan 1993, Anderson 1998)
• Opinion expression
– Opinion dynamics (Godes and Silva 2009, Li and Hitt 2008, Schlosser 2005, McAllister and Studlar 1991)
– Opinion polls and voter turnout (see, for example, McAllister and Studlar 1991)
[Diagram: Pre-Purchase Evaluation E[uij] leads to Purchase Decision and Product Experience, which leads to Post-Purchase Evaluation Vij = f(uij, E[uij])]
LATENT EXPERIENCE MODEL
Opinion formation versus opinion
expression (Berinsky 2005)
[Diagram: the Incidence Decision and the Evaluation Decision jointly produce Posted Product Ratings; the incidence decision introduces a SELECTION EFFECT and the evaluation decision an ADJUSTMENT EFFECT]
INCIDENCE & EVALUATION MODELS
Selection Effects: What influences participation?
• Extremely dissatisfied customers are more likely to engage in offline word-of-mouth (Anderson 1998)
• Online word-of-mouth is predominantly positive (e.g., Chevalier and Mayzlin 2006, Dellarocas and Narayan 2006)
• Subject to the opinions of others
– Bandwagon effects (McAllister and Studlar 1991, Marsh 1984)
– Underdog effects (Gartner 1976, Straffin 1977)
– Effect of consensus (Epstein and Strom 1981, Dubois 1983, Jackson 1983, Delli Carpini 1984, Sudman 1986)
Adjustment Effects: What influences posted ratings?
• Empirical evidence of opinion dynamics
– Online opinions decline as product matures (Li and Hitt 2008)
– Online opinions decline with ordinality of rating (Godes and Silva 2009)
• Other behavioral explanations of opinion dynamics
– “Experts” differentiate from the crowd by being more negative (Schlosser 2005)
– “Multiple audience” effects when opinion variance is high (Fleming et al. 1990)
Modeling Overview
• Online product ratings for bath, fragrance and home retailer over 6 months in 2007 (sample of 200 products with 3681 ratings)
• Two-component model structure (Ying, Feinberg and Wedel 2006):
– Incidence (Probit)
– Evaluation (Ordered Probit)
• Product utility links incidence and evaluation models (non-linear)
• Bandwagon versus differentiation effects
– Covariates of ratings environment
– Separate but correlated effects in each model component
• Product and individual heterogeneity
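The two-component structure above can be sketched numerically. The following is a toy simulation, not the fitted model: a latent product utility drives both a probit incidence decision and an ordered probit rating, so selection alone shifts the observed average rating. All parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

n = 10_000
V = rng.normal(loc=0.5, scale=1.0, size=n)  # latent post-purchase utility V_ij

# Incidence component (probit): Pr(post) = Phi(b0 + b1 * V).
# b0 and b1 are made-up values for illustration.
b0, b1 = -1.0, 0.4
posts = rng.random(n) < norm.cdf(b0 + b1 * V)

# Evaluation component (ordered probit): latent utility plus noise is
# cut into a 1-5 star rating; only ratings from posters are observed.
cutpoints = np.array([-1.5, -0.5, 0.5, 1.5])
ratings = np.digitize(V + rng.normal(size=n), cutpoints) + 1  # 1..5

observed = ratings[posts]
# With b1 > 0, high-utility customers post more often, so the observed
# average exceeds the population average (a pure selection effect).
print(observed.mean() > ratings.mean())
```

Making the incidence parameters depend on the ratings environment is where bandwagon versus differentiation covariates would enter this sketch.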
Role of post-purchase evaluation (Vij)
in rating incidence
[Chart: effect of individual-product utility value (x-axis, 0 to 5) on posting incidence (y-axis, 0 to -0.7)]
Classifying Opinion Contributors
• Groups based on frequency of posts (β0)
Empirical Trends
• Posted ratings
– Average rating decreases over time
– Variance increases over time
• Poster composition
– Community-builders are over-represented in the posting population.
– As the forum evolves, participation from community-builders increases while that of LI and BW decreases.
[Chart: average rating (left axis, 3 to 5) and variance (right axis, 0 to 3.5) by rating sequence number, 1 to 49]
[Chart: proportion of posters (% Activists vs. % Low-Involvement) by rating sequence number, 1 to 49]
Effect of Opinion Variance
[Charts: MEDIAN CUSTOMER BASE vs. POLARIZED CUSTOMER BASE; each panel plots average rating (left axis, 3 to 5) and variance (right axis, 0 to 3.5) by rating sequence number, 1 to 49]
• Customer bases have the same mean but different variance in opinions
• With a polarized customer base, ratings exhibit:
– Lower average with a significant decreasing trend
– Greater variance
• Negative ratings do not necessarily signal a lower average opinion among customers
Conclusions: Individual-level Analysis
• Empirical findings
– Heterogeneity in posting incidence and evaluation
– Incidence and evaluation behavior are related and can result in a systematic evolution of the posting population
• General trends in the evolution of product forums:
– Dominated by “activists”
– Participation by activists tends to increase as the forum evolves, while participation by low-involvement individuals tends to decrease
• Implications:
– The ratings environment does not necessarily reflect the opinions of the entire customer base, or even the socially unbiased opinions of the posters
– Posting behavior is subject to venue effects…
Posting Decisions
[Diagram: Opinion / Brand Evaluation drives three posting decisions: “Do I post?”, “What do I post?” (Sentiment, Product, Attribute), and “Where do I post?” (Venue Format, Domain)]
SOCIAL MEDIA METRICS
The value of social media as a research tool
• Does it matter where (i.e., blogs, microblogs, forums, ratings/reviews, etc.) we listen?
– Product/topic differences across venues?
– Systematic sentiment differences across venues?
– Venue-specific trends and dynamics?
• How do social media metrics compare to other available measures?
Social media monitoring
IN PRACTICE
• Early warning system: Kraft removed trans fats from Oreos in 2003 after monitoring blogs
• Customer feedback: Land of Nod monitors reviews to help with product modifications and redesigns
• Measuring sentiment: social media listening platforms collect comments across venues
IN RESEARCH
• Twitter to predict stock prices (Bollen, Mao and Zeng 2011)
• Twitter to predict movie sales (Rui, Whinston and Winkler 2009)
• Discussion forums to predict TV ratings (Godes and Mayzlin 2004)
• Ratings and reviews to predict sales (Chevalier and Mayzlin 2006)
Does source of data matter?
• Online venues (e.g., blogs, forums, social networks, micro-blogs) differ in:
– Extent of social interaction
– Amount of information
– Audience attracted
– Focal product/attribute
• Venue is a choice
– Consumers seek out brand communities (Muniz and O’Guinn 2001)
– Venue depends on posting motivation (Chen and Kirmani 2012)
• Social dynamics affect posting (Moe and Schweidel 2012, Moe and Trusov 2011)
Research Objective
• Assess online social media as a listening tool
• Disentangle the following factors that can systematically influence posted sentiment
– Venue differences
– Product and attribute differences
– Within-venue trends and dynamics
• Examine differences across different venue types
– Sentiment
– Product and attribute differences
– Implications for social media monitoring and metrics
Social Media Data
• Provided by Converseon (leading online social media listening platform and agency)
• Sample of approximately 500 postings per month pertaining to target brand
• Comments manually coded for:
– Sentiment (positive, neutral, negative)
– Venue and venue format
– Focal product/attribute
• Categories: (1) enterprise software, (2) telecommunications, (3) credit card services, and (4) automobile manufacturing
Data for Enterprise Software Brand
• 140 products within the brand portfolio
• 59 brand attributes (e.g., compatibility, price, service, etc.)
• Social media data spanned a 15-month period
– June 2009 to August 2010
– 7565 posted comments
– Across 800+ domains
Venue Format        Illustrative Website   Frequency   Direct Experience
Discussion Forum    forums.adobe.com       2728        93%
Micro-blog          twitter.com            2333        37%
Blog                wordpress.com          2274        23%
Social Network      linkedin.com           155         40%
Mainstream News     cnbc.com               36          3%
Social News         digg.com               19          47%
Wiki                adobe.wikia.com        10          50%
Video               vimeo.com              6           0%
Review Sites        epinions.com           4           25%
Modeling Social Media Sentiment
• Comments coded as “negative”, “neutral”, or “positive”
• Ordered probit regression:

Pr(y_i = r) = Pr(τ_{r-1,v(i)} < U_i + ε_i ≤ τ_{r,v(i)})
U_i = VS_i + π_{p(i),1} + α_{a(i),1}

where VS_i is venue-specific brand sentiment, π_{p(i),1} is a product effect, and α_{a(i),1} is an attribute effect.
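Numerically, the ordered probit turns the latent utility U_i into three category probabilities via cutpoints. A minimal sketch with assumed cutpoint and effect values (not the paper's estimates):

```python
import numpy as np
from scipy.stats import norm

def sentiment_probs(U, cutpoints):
    """Pr(y = r) = Phi(tau_r - U) - Phi(tau_{r-1} - U) for each category."""
    tau = np.concatenate(([-np.inf], cutpoints, [np.inf]))
    return np.diff(norm.cdf(tau - U))

VS = 0.3                 # venue-specific brand sentiment VS_i (assumed)
prod, attr = -0.1, 0.2   # product and attribute effects (assumed)
U = VS + prod + attr

p = sentiment_probs(U, cutpoints=np.array([-0.8, 0.6]))
print(p.round(3))  # Pr(negative), Pr(neutral), Pr(positive)
```

The three probabilities always sum to one, since the cutpoints partition the real line.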
What affects venue-specific brand sentiment?
• General brand impression (GBI)
• Domain and venue effects (including dynamics)

VS_i = GBI_{t(i)} + δ_{d(i)} + θ_{v(i)} + γ_{v(i),t(i)}

where δ_{d(i)} is a domain effect, θ_{v(i)} a venue-format effect, and γ_{v(i),t(i)} captures venue-specific dynamics.
Venue Attractiveness
• Model venue format as a choice made by the poster
• Multinomial logit model:

V_ij = g_j + l_j w_i + k_j × GBI_{t(i)}
w_i = π_{p(i),2} + α_{a(i),2}

where l_j w_i captures the effect of content (product effect π and attribute effect α) on venue choice.
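The venue choice can be read as a standard multinomial logit: each format j gets a deterministic utility V_ij and choice probabilities follow a softmax. The venue names and parameter values below are illustrative assumptions, not the fitted results:

```python
import numpy as np

venues = ["blog", "forum", "microblog"]
g = np.array([0.5, 0.8, 0.6])    # venue intercepts g_j (assumed)
l = np.array([1.0, -0.5, 0.3])   # content loadings l_j (assumed)
k = np.array([-0.2, 0.9, 0.6])   # GBI loadings k_j (assumed)

def venue_probs(w_i, gbi_t):
    """Multinomial logit choice probabilities over venue formats."""
    v = g + l * w_i + k * gbi_t
    expv = np.exp(v - v.max())   # subtract max for numerical stability
    return expv / expv.sum()

p = venue_probs(w_i=0.4, gbi_t=0.5)
print(dict(zip(venues, p.round(3))))
```

With a positive k_j for a format, raising GBI shifts choice probability toward that format, which is how a brand-sentiment effect on venue choice shows up in this structure.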
Model Comparisons
• Baseline model
– Independent sentiment and venue decisions
– Controlling for product, topic and domain effects
Model                        DIC        Sentiment hit rate   Venue choice hit rate
Baseline                     32588.0    0.454                0.316
+ Sentiment link             30040.7    0.451                0.424
+ Venue specific sentiment   29677.2    0.465                0.424
+ Venue specific dynamics    29529.1    0.473                0.424
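The hit rates in the table can be read as the share of comments whose modal predicted category matches the observed one. A toy illustration with made-up predicted probabilities:

```python
import numpy as np

# Rows: comments; columns: predicted Pr(negative, neutral, positive).
pred = np.array([[0.6, 0.3, 0.1],
                 [0.2, 0.5, 0.3],
                 [0.1, 0.2, 0.7],
                 [0.4, 0.4, 0.2]])
observed = np.array([0, 2, 2, 0])  # observed categories

# Hit rate: fraction of comments where the highest-probability
# predicted category equals the observed category.
hit_rate = (pred.argmax(axis=1) == observed).mean()
print(hit_rate)  # 0.75
```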
Sentiment metrics vary depending on what you are measuring.
[Chart: GBI vs. observed average sentiment by month, 1 to 15]

CORRELATIONS
                      GBI       Obs. Avg.   Blog      Forum    Microblogs
GBI                   1
Obs. Avg. Sentiment   -0.0346   1
Blog                  0.678     0.496       1
Forum                 0.00409   0.263       -0.0870   1
Microblogs            0.751     0.503       0.8196    -0.139   1
Sentiment Differences across Venues (v(i)) and Divergent Trends across Venues (v(i),t(i))
[Charts: venue-specific sentiment levels (0 to 3.5) and month-by-month sentiment trends (-0.5 to 0.4, months 1 to 15) for blogs, forums, and micro-blogs]
* Blogs, forums and microblogs are the 3 most common venues
• Sentiment varies across venues
• Venue-specific sentiment is subject to venue-specific dynamics
Attribute Effects on Sentiment (a(i),1)
[Chart: attribute-specific sentiment effects (-0.6 to 0.4) for attributes including Application Compatibility, Brand Reputation, Documentation/Support, Ease of Installation, Hardware Compatibility, Mobile Compatibility, Price, Quality of Product, Security, and Size of Company]
• Provides attribute-specific sentiment metrics
• Empirical measures are problematic due to data sparsity
• Correlation between model-based effects and observed attribute-sentiment metrics = -.276
Venue Attractiveness Results

Venue                    Intercept (g)   Product/Attrib. Effects (l)   Effect of GBI (k)
Blog                     5.012           1.000                         -1.610
Forum                    5.655           -2.713                        5.002
Mainstream Media         0.246           2.474                         -2.632
Microblogs               5.378           1.005                         2.978
Photoshare               -4.509          -0.469                        0.321
Review site              -1.229          -2.427                        0.767
Social network           2.349           1.771                         1.086
Social news aggregator   0.607           -0.058                        1.848
Video share              -1.024          0.784                         -1.105
Wiki                     --              --                            --

• Posters with positive sentiments toward the brand are attracted to forums and microblogs.
• Forums attract comments that focus on different products and attributes than microblogs.
Attribute effects on venue choice (a(i),2)
[Chart: top 10 and bottom 10 attribute effects on venue choice (-0.6 to 0.8), for products with >5 mentions only; attributes shown include Protocol Compatibility, Application Compat., Hardware Compatibility, Graphics support, Resource optimization, Ease of…, Customization, I/O Performance, Technical Issue, Server Virtualization, Infrastructure Cost, Private/Internal Cloud, Future of Virtualization, Market Share, Microsoft Partnership, Cloud Computing, Marketing, Marketing/Campaign, Corporate Partnership, and Industry News]
Predictive Value of GBI
• Offline brand tracking survey
– Satisfaction surveys conducted from Nov 2009 to Aug 2010 in waves (overlapping with last 10 months of social media data)
– Approximately 100 surveys conducted per month
– 7 questions re: overall sentiment toward brand
• Company stock price
– Weekly and monthly closing prices for firm
– Weekly and monthly closing S&P
– June 2009 to September 2010 (extra month for lag)
GBI vs. Offline Survey
[Charts: GBI sentiment (left axis, -0.3 to 0.4) and survey sentiment (right axis, 8.75 to 9.1) by month, shown for GBI at t and for GBI at t-1]
• Potential for GBI as a lead indicator
• Correlation with survey:
– GBI(t) = .375 [.277, .469]*
– GBI(t-1) = .875 [.824, .919]*
– Avg sentiment = -.0346
– Blogs = .678
– Forums = .00410
– Microblogs = .751
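The lead-indicator claim rests on GBI(t-1) correlating with the survey more strongly than GBI(t). That check is simple to reproduce on any pair of monthly series; the series below are simulated stand-ins for the proprietary data, constructed so the survey tracks last month's GBI:

```python
import numpy as np

rng = np.random.default_rng(1)
gbi = rng.normal(size=12)                            # 12 months of a GBI-like series
survey = 0.9 * gbi[:-1] + 0.3 * rng.normal(size=11)  # survey(t) tracks gbi(t-1)

lagged = np.corrcoef(gbi[:-1], survey)[0, 1]      # corr(GBI(t-1), survey(t))
same_month = np.corrcoef(gbi[1:], survey)[0, 1]   # corr(GBI(t), survey(t))
print(round(lagged, 2), round(same_month, 2))
```

By construction the lagged correlation dominates here; in the deck the analogous gap is .875 versus .375.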
GBI and Stock Price (DV = monthly close)

Posterior Means   Coeff     StdErr   p-val
Constant          -69.045   34.044   0.070
S&P*              0.104     0.031    0.008
GBI(t)            -16.695   10.324   0.137
GBI(t-1)          30.693    10.375   0.014
Adj R-sq          .475

Iteration level   Median Coeff   % p-val < .05
Constant          -67.096        0.138
S&P*              0.102          1
GBI(t)            -15.872        0.012
GBI(t-1)          29.921         0.9892
Adj R-sq          .4635

* Closing price in month
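The regression behind the table is ordinary least squares of the monthly close on the S&P close, GBI(t), and GBI(t-1). A self-contained sketch on simulated data (the series and coefficients below are placeholders, not the paper's estimates):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 16  # months, June 2009 to September 2010

sp = 1000 + np.cumsum(rng.normal(scale=20, size=T))  # S&P-like index
gbi = rng.normal(size=T)                             # GBI-like series
# Simulated price loads on the index and on *lagged* GBI.
price = 0.1 * sp + 30.0 * np.concatenate(([0.0], gbi[:-1])) + rng.normal(size=T)

# Drop the first month so GBI(t-1) is defined for every observation.
y = price[1:]
X = np.column_stack([np.ones(T - 1), sp[1:], gbi[1:], gbi[:-1]])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta.round(2))  # [constant, S&P, GBI(t), GBI(t-1)]
```

As in the deck's table, a strong loading on GBI(t-1) with a weak same-month loading is what the lead-indicator interpretation looks like in regression form.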
Observed Social Media Metrics (DV = monthly close)

           Average                     Blogs                       Forums                  Microblogs
Constant   -44.550 (30.776), p=0.178   -93.215 (31.675), p=0.015   222.605 (52.477), p=--  --
S&P*       0.073 (0.026), p=0.018      0.071 (0.021), p=0.007      -0.014 (0.021), p=--    --
SM(t)      34.995 (17.181), p=0.069    6.059 (8.055), p=0.469      -42.767 (10.838), p=--  --
SM(t-1)    5.078 (18.008), p=0.784     18.267 (8.289), p=0.052     -39.252 (10.840), p=--  --
Adj R-sq   0.404                       0.547                       0.753                   0.390

* Closing price in month
Conclusions
• Social media behavior varies across venue formats
– Need to account for the source of SM data
• Potential to use social media as market research
– Adjusted measure (GBI) can serve as a lead indicator
• Implications for academic research that uses social media measures and for practitioners who monitor social media sentiment
• Next steps
– Additional data sets
– Simulations of different GBI scenarios and resulting metrics