Open access and data processing of Social Media (Twitter) data – a new and valuable consumer research instrument
Thierry Worch, Anne Hasted & Hal MacFie

Overview
• Using Twitter for Research – Macro vs Micro
• The R-based macro twitteR
• A food product application
• Possible use in Sensory and Consumer Science

© Qi Statistics Ltd

What is Twitter?
• Online social network and microblog.
• Open text-based messages of up to 140 characters, also known as “Tweets”.
• Tweets are open:
  – personal information (what people are doing/feeling);
  – discussions;
  – sharing information...
• Tweets are grouped together according to their content (use of “#word”).
• People can “follow” friends, celebrities or brands to stay updated.
• Over 500 million registered users in 2012, generating over 340 million tweets/day and handling over 1.6 billion search queries/day.

MACRO APPLICATION 1
Diurnal and Seasonal Mood Vary with Work, Sleep, and Daylength Across Diverse Cultures
• Study from Golder et al., Science, 30 September 2011: 1878-1881.
• Previous studies used small samples of American students.
• Students are exposed to varying academic schedules that constrain when and how much they sleep.
• Retrospective self-reports are vulnerable to memory error and experimenter demand effects.
• Researchers have acknowledged the limitations of this methodology but have had no practical means for in situ, real-time, hourly observation of individual behavior in large and culturally diverse populations over many weeks.

Methodology
• Twitter data access: 2.4 million individuals worldwide, 509 million messages, between February 2008 and January 2010.
• Linguistic Inquiry and Word Count (LIWC) analysis.
[Figure: positive and negative term frequencies by time of day]

Results
• Individuals awaken in a good mood that deteriorates as the day progresses, which is consistent with the effects of sleep and circadian rhythm.
• Seasonal change in baseline positive affect varies with change in day length.
• People are happier on weekends, but the morning peak in positive affect is delayed by 2 hours, which suggests that people awaken later on weekends.

MACRO APPLICATION 2
Effects of the Recession on Public Mood in the UK
• Lansdall-Welfare, Lampos, & Cristianini (University of Bristol, UK).
• 484 million tweets from 9.8 million UK users, July 2009 to January 2012.

Results – 4 emotion categories

Micro Application 1: Airline companies
• “R by example: mining Twitter for consumer attitudes towards airlines”, by Jeffrey Breen (June 2011)

Airline satisfaction scores
• Retrieved from www.theacsi.org
• Airlines do not score very high compared to other sectors.

Example of Tweets
How can we access and summarize this data?

Searching tweets with twitteR

Game Plan for the Sentiment Analysis

Sentiment distributions
[Figure: positive and negative sentiment for Southwest and United Airlines]
Southwest has far fewer negative tweets than United Airlines.

Micro Application 2: Chocolate Study
• 5 chocolate products/brands:
  – Cadbury
  – Twix
  – Snickers
  – Hershey
  – KitKat
• Once a week for 8 weeks.
• 7000 tweets per brand.
• Circle around Manchester with a radius of 500 miles.
• English only.
• Duplicated tweets (and re-tweets) removed.
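
The clean-up and scoring steps named in the slides above (removing re-tweets and duplicated tweets, then summarising sentiment) can be sketched as follows. This is a minimal illustration in Python rather than R, and it is not the procedure used by the authors or by the twitteR package: the word lists are tiny hypothetical stand-ins for a full opinion lexicon, the "RT " prefix test is an assumed way of spotting re-tweets, and the function names `normalise`, `clean_tweets` and `sentiment_score` are invented for this example. The score is simply the number of positive words minus the number of negative words, which is one common lexicon-based approach.

```python
import re

# Hypothetical, deliberately tiny word lists (assumption).
# A real study would load a full opinion lexicon instead.
POSITIVE = {"love", "great", "good", "delicious", "yummy"}
NEGATIVE = {"hate", "bad", "awful", "sick", "disgusting"}


def normalise(text):
    """Lower-case the text, drop URLs and @mentions, collapse whitespace."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return " ".join(text.split())


def clean_tweets(tweets):
    """Drop re-tweets and duplicated messages, keeping first occurrences.

    Assumption: a re-tweet is any message whose text starts with 'RT '.
    Duplicates are detected on the normalised text.
    """
    seen = set()
    kept = []
    for tweet in tweets:
        if tweet.strip().lower().startswith("rt "):
            continue  # skip re-tweets
        key = normalise(tweet)
        if key in seen:
            continue  # skip duplicated text
        seen.add(key)
        kept.append(tweet)
    return kept


def sentiment_score(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", normalise(text))
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return positive - negative


if __name__ == "__main__":
    sample = [
        "I love KitKat, so good!",
        "RT @bob: I love KitKat, so good!",
        "i love kitkat, so good!",
        "Twix makes me sick",
    ]
    for tweet in clean_tweets(sample):
        print(sentiment_score(tweet), tweet)
```

Plotting the distribution of these scores per brand gives the kind of positive/negative comparison shown on the sentiment slides. Word-count scoring ignores negation and sarcasm, so it is only informative when the number of tweets is large.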

Sentiment Analysis
[Figure: positive and negative sentiment for Cadbury and KitKat]

Classification of the terms tweeted after clean-up using the R text mining package tm

9 sensory descriptors in the top 25 of each product

Term        Cadbury  Snickers  Twix  KitKat  Hershey
chocolate      2077       330   286     363      555
eat             381      1021   537     544      135
cream           279       485   132     110      198
bars            308       324   207      74       75
ice              60       494   118      54       39
milk            371        66    52      37       51
taste            98        60    41      52      110
cake             56        89    51      94       26
food            115        46    39      57       43

5 sensory descriptors specific to 2 or fewer products
• Cadbury: dairy 300, jelly 184, eggs 176, strawberries 155, dairymilk 117
• Snickers: flake 79, icecream 68, brownies 36, bites 35, flavour 35
• Twix: crisps 63, coffee 33, bites 30, dairy 28, sweet 27
• KitKat: chunky 795, mint 114, crisps 59, chunkies 57, coffee 51
• Hershey: sweet 71, dark 26, sweets 18, sauce 17, cupcake 14

Results (chocolate occasion)

Category terms – 9 descriptors in the top 15 of each product

Term        Cadbury  Snickers  Twix  KitKat  Hershey
today           191       149   123     186      104
always           32        68    70      63       57
tomorrow         94        31    34      49       49
home             58        45    56      42       34
tonight          28        33    47      57       24
craving          30        55    26      24       20
birthday         32        25    26      69       20
breakfast        27        63    64      56       13
diet             29        40    37      38        6

Unique terms – 2 descriptors specific to 2 or fewer products
• Cadbury: picnic 78, college 48
• Snickers: hungry 407, earlier 16
• Twix: japan 75, hungry 24
• KitKat: hungry 26, japan 24
• Hershey: july 94, concert 44

Results (chocolate)
• Cadbury have been running a competition, and this is reflected in the high-frequency responses.
• Some descriptors appear to define the category as a whole.
• Product-specific descriptors can be observed for both sensory and occasion terms.

Usage Comments
• twitteR package "easy" to use (once you know how)
• Large number of texts required, even for micro studies
• Linguistic/text processing software essential

Micro Applications – Sensory research
• Vocabulary development to define a category
• Brand-specific attributes
• Change in sentiment over time and place

Research – Macro
• Find a strong hypothesis and the numbers will do the rest

Conclusion
• Useful open access research source
• Methodological research needed
• Specialised sensory algorithms needed
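
The term tables in the chocolate study separate descriptors shared by every product from descriptors specific to two or fewer products. A minimal sketch of that split is given below, in Python rather than the R tm package used in the study. It is an illustration under stated assumptions, not the authors' routine: the stop-word list is a tiny hypothetical one, and the function names `term_counts`, `top_terms` and `split_terms` and their parameters `n` and `max_brands` are invented for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list (assumption).
STOPWORDS = {"a", "the", "i", "and", "to", "my", "is", "of", "in", "for"}


def term_counts(tweets):
    """Count how often each non-stop-word term occurs across the tweets."""
    counts = Counter()
    for tweet in tweets:
        words = re.findall(r"[a-z]+", tweet.lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts


def top_terms(counts, n):
    """Return the set of the n most frequent terms."""
    return {term for term, _ in counts.most_common(n)}


def split_terms(tweets_by_brand, n=25, max_brands=2):
    """Split each brand's top-n terms into category and brand-specific terms.

    Category terms appear in the top n of every brand.
    Specific terms appear in the top n of at most max_brands brands.
    """
    tops = {
        brand: top_terms(term_counts(tweets), n)
        for brand, tweets in tweets_by_brand.items()
    }
    # Number of brands in whose top-n list each term appears.
    presence = Counter(term for terms in tops.values() for term in terms)
    category = {term for term, k in presence.items() if k == len(tops)}
    specific = {
        brand: {term for term in terms if presence[term] <= max_brands}
        for brand, terms in tops.items()
    }
    return category, specific


if __name__ == "__main__":
    toy = {
        "A": ["eat chocolate today", "chocolate picnic"],
        "B": ["eat chocolate", "hungry chocolate"],
        "C": ["chocolate eat mint"],
    }
    category, specific = split_terms(toy)
    print(sorted(category))
    print({brand: sorted(terms) for brand, terms in specific.items()})
```

With real data the brand names themselves and other uninformative words would need to be added to the stop-word list, and the choice of `n` (25 for sensory terms and 15 for occasion terms in the slides) determines how many descriptors qualify.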