WHEN IS SOCIAL MEDIA MINING GOOD ENOUGH? OR HELP! I THINK I MIGHT BE A SCIENTIST.
Nick Buckley
Social Media Director
GfK NOP

1. What are we talking about?

What exactly are we talking about?
Definition* of social media monitoring: “Social Media Monitoring (SMM) means the identification, observation, and analysis of user-generated social media content for the purpose of market research.”
[Diagram labels: News sites; Forums; Review sites; Professional & Consumer; Client sites; Blogs/Microblogs; Video sites; Public Communities; What they say]
* http://www.social-media-monitoring.org

What was that 2.0 thing again?
The “era of shout marketing” is over*:
[Figure panels: Before the rise of the internet; Web 2.0; Eh?]
* Marshall, 2012

Web Mining, Social Media Monitoring or Social Media Mining?
I like “Mining”. User-generated content in social media lays down a rich seam of activity, opinion, thought and information… mess, echoes and ‘whimsy’.
For some time marketing and PR professionals have been monitoring Social Media to capture headline ‘buzz’ in real time, and to detect sudden changes requiring a response.
But collecting and counting this content is only the beginning of a process which can add value via many techniques… including integration with other sources such as market research data.

Rapid supply-side evolution. What has driven it?
For the original PR and Marketing Users…
• Boring outputs – flat-lining “buzz share”
• Commoditisation [seeming] of the core process by technology newcomers
• Differentiation by interface… the “Dashboard” – to emphasise use-cases
• Making user self-service easier – for all kinds of reasons
• Increasingly sophisticated users… looking for outputs suggestive of insights
• The ‘social CRM’ branch
http://blog.glennz.com/evolution/

2. What happens when Market Researchers get hold of it?
Sony brand damage was driven by PlayStation breach (2011)
[Charts: Sony buzz this year; Sony sentiment this year; Sony buzz in April; Sony sentiment in April; PlayStation buzz; PlayStation sentiment]

Market Researchers believe that SMM can also give clients a window on other dimensions of online conversations
SMM provides insights into:
• Category Dynamics
  - Consumer needs
  - Problems and issues consumers discuss
  - Product usage discussions
  - New product entries
• Corporate
  - Corporate mentions related to reputation
  - Crises
  - Social issues
• Brand
  - Brand/sub-brand mentions, brand “buzz”
  - Number of positive vs. negative sentiments for each brand
  - Brand content analysis: what’s being said about the brand
  - Advertising noticed most and related discussion
  - Source of mentions (specific sites) and the most influential sites
• Competition
  - All the above for competition

Inevitably they think about comparison with surveys…
Strengths
• Very immediate
• Unconditioned by participant awareness of a research process
• Often more emotive than considered survey responses
• Spontaneously generated content unconstrained by research frame.
• Offers insight into active social media users
• Potentially global
• You can ‘ask a new question’ without having to issue a new questionnaire*
• Low cost – under certain circumstances
Weaknesses
• Not necessarily representative of the general population
• Difficult to weight back to general population, as demographic data is sparse
• Automated sentiment analysis only as good as the algorithms [and these vary greatly]
• Automated harvesting can capture a lot of ‘noise’ for certain words or brands
• No guarantee of sufficient data
• Costs rise when we use supplementary analysis to overcome some of these issues
*within certain technical limitations
© 2012 GfK NOP

Different client needs indicate different SMM approaches
For example - Precision Extraction vs ‘Trawl & Filter’
[Diagram labels: Lower data volumes from targeted & compound search terms / Higher data volumes from simple search terms; More post processing, applied to data by MR agency, to reduce noise and refine sentiment attribution / Accept raw data output from application. Approaches shown: Exploratory Qual – more complex collection, manually manageable volumes and ‘tuning’; Quantitative Brand tracking and integration with traditional research; Indicative Qual, e.g. using trends and volumes to guide focus of analysis; Crude mention & mood tracking]

3. Too Abstract?

The raw material - Results from search terms
SMM applications extract results from wholesale supplies of data, conducting searches defined by “search terms”
• These can be anything from a simple and distinctive brand or product name, to a complex expression configured to capture discussions about a category or concept.
• A search term combines words or phrases via logical instructions such as AND, OR, NOT. Search terms may also employ functions such as WITHIN to detect words in a certain proximity to each other. Finally – just as in mathematical equations – brackets can dictate the sequence in which the instructions are applied, e.g.
• “word1” AND ( “word2” OR “word3” )

A typical SMM application offers a dashboard view of data returned by these search terms – and the facility to export the underlying data

Analyses
Whatever the Search Terms define – here is what can be measured about the results returned… in combination or in isolation
• Volume – “how much is it talked about, and how is this changing over time?”
• Location – “where in the world is it being talked about?”
• Channels – “where on the web is it being talked about… twitter, blogs, forums, comments?”
• Verbatims – drill-down to individual posts, in their own words – “what do people actually say?”
• Themes – “what other words and phrases are most regularly associated with it?”
• People – “who is talking about it?” That may be by influence – according to various proprietary indices – or by demographics [to be used with caution]
Sentiment: Across all of these variables is superimposed automatically generated “Sentiment” analysis – positive, negative or neutral language associated with the subject of the posts…

Examples of outcomes from SMM studies
FINDING: Focus on the right social media channels at the right time.
A manufacturer used a video from a high profile pop star to drive a major campaign. Predictably, when aired, the video generated a ‘spike’ of twitter activity. BUT – looking back down the timeline showed there had also been a burst of activity on forums, and some blogs, from fans of the artist when the video was being shot.
FINDING: Differentiate ‘trade press’ buzz from real engagement.
A manufacturer used a novel approach, through Facebook, to support advice and collaboration between users of its product. This appeared to have some success in stimulating social media conversations about the product.
However – deeper scrutiny revealed that this traffic was almost exclusively blogging by sector and marketing industry press, attracted by the novel approach, with further blog, forum and link-tweeting activity amongst sector insiders and social media enthusiasts.

Examples of outcomes from SMM studies (2)
FINDING: Consumers don’t always talk about the product features that you highlight.
Analysis of conversations about a newly launched electronics product revealed that the functional features most discussed [particularly those with largely positive sentiment attached] were not those which the manufacturer had chosen to highlight. Subsequent marketing was able to adjust to take account of these ‘more loved’ features.
FINDING: ‘The world’ can sometimes throw up more interesting stories about you than you could hope to generate for yourself… but not always with the connotations you would like.
An automotive manufacturer which had enjoyed modest online buzz as a result of its own sponsorship activities experienced a ‘spike’ in online mentions which was 10 times the size – as a result of a much-repeated witty comment. A high profile celebrity had appeared on TV news being interviewed from the driver’s seat of one of their vehicles. The comment – linking the celebrity to a negative ‘folk image’ of the vehicle – spread rapidly across a range of social media channels. The moral is that spontaneous, and genuinely social, media can currently still outperform marketers.

BUT!

There are many forces* which erode this nice model…
Accuracy? Reach? Relevance?
* Reach image from titletrack.com

Accuracy
Is the searched-for phrase even in the returned “snippet”?
Is it ‘content’ – or is it
• Navigation?
• Ticker or title content?
• Ad Content?
• Various species of spam [overlaps with ‘Relevance’]?
Is meta-data about the poster
• Present?
• Reliable?
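The first accuracy question above, whether the searched-for phrase is even in the returned snippet, lends itself to a quick mechanical audit of an exported sample. The following is a minimal Python sketch under stated assumptions: the nested-tuple form of the search term, the function names `words_in`, `matches` and `audit`, and the sample snippets are all hypothetical and are not any SMM vendor's syntax or data. It handles single words only, not quoted phrases or WITHIN proximity.

```python
import re


def words_in(snippet):
    """Return the lower-cased set of word tokens found in a snippet."""
    return set(re.findall(r"[a-z0-9']+", snippet.lower()))


def matches(term, tokens):
    """Evaluate a nested search term against a set of tokens.

    A term is either a plain word (str) or a tuple whose first element is
    "AND", "OR" or "NOT", followed by sub-terms. This representation is
    an assumption made for illustration only.
    """
    if isinstance(term, str):
        return term.lower() in tokens
    op, *args = term
    if op == "AND":
        return all(matches(a, tokens) for a in args)
    if op == "OR":
        return any(matches(a, tokens) for a in args)
    if op == "NOT":
        return not matches(args[0], tokens)
    raise ValueError(f"unknown operator: {op}")


def audit(term, snippets):
    """Split snippets into those that satisfy the term and those that do not."""
    ok, suspect = [], []
    for s in snippets:
        (ok if matches(term, words_in(s)) else suspect).append(s)
    return ok, suspect


# "word1" AND ( "word2" OR "word3" ), written as a nested tuple
term = ("AND", "word1", ("OR", "word2", "word3"))

# Made-up snippets standing in for an exported sample
returned = [
    "word1 beats word3 every time",
    "Home | About | Contact | word2",  # navigation residue, no word1
    "word1 on its own",                # neither word2 nor word3
]

ok, suspect = audit(term, returned)
print(len(ok), "snippet(s) satisfy the term;", len(suspect), "need a manual look")
```

Snippets landing in `suspect` are candidates for the manual checks, not proof of a fault: the supplier may have matched on text that lies outside the exported snippet.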
Understanding this, apart from making your own manual checks, is about understanding your third party suppliers’ processes and content and, often, those of their ‘wholesale data suppliers’ – each of which may differ from the others.

Reach
[T]here are known knowns; there are things we know that we know. There are known unknowns; that is to say there are things that we now know we don't know. But there are also unknown unknowns – there are things we do not know we don't know.
Donald Rumsfeld
• Are these results from scrutiny of the entire [English speaking] social web? No
• Are they results from a very large, sometimes stated, number of social sources? Yes
• Could this range be skewed relative to the subject under scrutiny? Yes
• Where it’s Twitter data – is it from the whole of Twitter? Maybe
• Is historical data always on the same basis as current data, or data gathered since the search was defined? Not always
• Do we always have a good idea of what the ‘Reach’ is? No

Relevance
Even when the application has collected exactly what we asked for, and it is legitimate content, with some nice useful data about the poster… it might not be relevant
“Cats are great company.”
“#EMT Bolt one cool cat!”
“Also, the Cat is a great resort”
“I love my aunt Cat!”
“I think Cat Stark is worse than any Lanister.”
“I think this hurricane was a scam cooked up by the fat cats in Big Grocer.”

Challenges include
However , commencing too early public smoking facts will just overstress your pet ; quite a fresh pet will not learn everything from services. Just after he has ended up perched for some a few moments, supply him with the particular take care of, plus for instance in advance of, make sure you compliment the pup. When dog house teaching your dog, continue to keep the dog house in the vicinity of the spot where you as well as the canine are usually conversing.

And I haven’t mentioned automated Sentiment Analysis yet!
Irony – really?
Slang/Dialect/Register
Multiple meanings – “50 strong”
Adjacent subjects – “My beautiful FIAT next to a BMW”

4. And what is Good, and what is not Good?

To Recap
• SMM tools make it very easy to “Super Google” certain Brands, people, objects and even categories or concepts – quickly generating convincing-looking tables and charts.
• But underneath there’s a complex story about accuracy, reach and relevance… which becomes apparent on scrutiny of drilled-down text samples – and can only fully be understood by getting inside the provider’s systems and sources.
• It doesn’t mean they are misleading users – it just means that they started out somewhere else.
• The conclusion is that you have to carefully consider use cases, or build your own better mousetrap, or wait for proprietary solutions to get better at certain things.
• Sentiment analysis is part of this story – but doesn’t define it.

Natural Language Processing [NLP] to the rescue?
Definition
“Specifically, it is the process of a computer extracting meaningful information from natural language input and/or producing natural language output”*
Most SMM applications claim some level of NLP. Whilst this may be legitimately contrasted with simple vocabulary, combination and probabilistic methods, it can end up meaning little. It may only mean that some rules of language have been ‘attended to’ in what is still essentially a pattern-matching exercise.
*Warschauer, M., & Healey, D. (1998). Computers and language learning: An overview

But clearly sophisticated NLP would make a big difference
• Improved Accuracy – including filtering out of unstructured spam
• More tools available to achieve/check Relevance
• Much-improved Sentiment Analysis
Some commercial tools have become available in the last 12 months which offer an assessment of their confidence in their own NLP analysis – dividing snippets into those with Low, Medium and High confidence. Significantly, ‘High’ is a minority of the output.
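To illustrate why the size of the ‘High’ band matters, here is a small Python sketch that buckets scored snippets into Low, Medium and High confidence and reports the share falling in each. The 0 to 1 score scale, the cut-offs `LOW_MAX` and `HIGH_MIN`, the function names and the sample scores are all assumptions made for the example; commercial tools define their own bands and do not necessarily expose a numeric score.

```python
from collections import Counter

# Hypothetical cut-offs on an assumed 0-1 confidence scale
LOW_MAX = 0.4
HIGH_MIN = 0.8


def band(confidence):
    """Map a confidence score to 'Low', 'Medium' or 'High'."""
    if confidence >= HIGH_MIN:
        return "High"
    if confidence > LOW_MAX:
        return "Medium"
    return "Low"


def band_shares(scored_snippets):
    """Return the fraction of snippets in each confidence band."""
    counts = Counter(band(score) for _text, score in scored_snippets)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: counts[name] / total for name in ("Low", "Medium", "High")}


# Made-up (snippet, confidence) pairs for illustration only
sample = [
    ("snippet a", 0.95),
    ("snippet b", 0.55),
    ("snippet c", 0.30),
    ("snippet d", 0.62),
    ("snippet e", 0.10),
]

shares = band_shares(sample)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

If only the ‘High’ share is trusted for sentiment reporting, the same calculation shows how much of the harvested volume is left for manual evaluation.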
Barking up the wrong Tree?
The recap assumes that the Market Researcher’s instinct is correct… to make the fuzzy working of the social web itself… the collection mechanisms and enterprises, and the analytical engines… into a familiar data collection process, somehow isomorphic with surveys.
But “what is good” is, as many of the ancient philosophers would tell us, about function and purpose.
I think we’ve now learned enough,
• and experienced enough un-straightforwardness
• and contemplated enough need for manual evaluation or augmentation - dispelling the notion that this is a self-evident labour-saving device along the way…
to stop and ask, “what was it we were trying to do?”

What are we trying to do?
• Use the social web as a proxy for the population?
• Understand how the social web is responding – for the benefit of those solely interested in this sub-set of the population as a channel or marketplace?
• Access particular niches which are more concentrated online than off?
• Detect significant events?
• Measure shifts and changes?
• Make rough comparisons?
• Discover new insights, themes and connections?

How useful is extracted Social Media content?
Mechanically extracted content is inevitably imperfect as regards:
• relevance
• comprehensiveness relative to ‘total web’
• accuracy of classification, sentiment etc
• representativeness of general population
It’s important to know when this matters, and how much.
It is vital to work honestly with the constraints and exploit the strengths…
In general web mining is therefore useful for:
• relative measures
• measuring and detecting change or discontinuity
• iterative discovery of related concepts and drivers
• comparing channels
• matching to events and schedules
…and, of course, integration with other sources of data.

Different client needs indicate different SMM approaches
For example - Precision Extraction vs ‘Trawl & Filter’
[Diagram labels: Lower data volumes from targeted & compound search terms / Higher data volumes from simple search terms; More post processing, applied to data by MR agency, to reduce noise and refine sentiment attribution / Accept raw data output from application. Approaches shown: Exploratory Qual – more complex collection, manually manageable volumes and ‘tuning’; Quantitative Brand tracking and integration with traditional research; Indicative Qual, e.g. using trends and volumes to guide focus of analysis; Crude mention & mood tracking. Annotations: Not radical enough!; Sensible; Too much like hard work]

Rather than wait for NLP utopia…
Settle for:
1. SMM as a powerful and novel Qual exploration tool
2. Do big number crunching on brands but take a “hyena” approach.
Accept all* occurrences of a brand or product name in posts as an indication of significance… even the spam and the adverts and the competitions
Similarly look for pure correlations between words/phrases and other words/phrases
Or between trends in these numbers and classes of offline events – such as sales, complaints and other behaviours… with a view to predicting, explaining or causing such events in the future.
*Except for the most obvious duplication errors such as over-indexing

5. Some Conclusions

I am not a scientist
OK – I’m a scientist amongst researchers, and possibly amongst programmers
But amongst scientists – and text analysis specialists – I’m a mere researcher.
Because I couldn’t use these tools “as is” with confidence I had to start delving…
… and delving is time consuming in a commercial environment.
Our technology suppliers have become more like partners… increasingly transparent as they’ve understood, but not challenged, what we tried to do.
The software and services will now adapt to us – whether they should or not.
PR monitors, real time trackers and ‘social CRM’ folks will carry on using the tools the same way they always have… and may even benefit from changes my industry has now initiated.

But
How will commercial SMM applications and services with the best accuracy, reach and relevance capabilities be recognised, validated and promoted?
Is the ‘bit in the middle’ just a holy grail until such time as the NLP part of the reckoning makes a step change – driven by all its other exploitations, such as ordinary language driven IT interfaces?
If you’re a researcher and you want to use this stuff tomorrow… what must be done?
Fortunately – there’s enough to learn by “super-Googling”, browsing and crude trend tracking to keep us going… and learning… for some time to come.

Dr Nick Buckley
Social Media Director
GfK NOP
M: 07958 516967
T: @grimbold
E: nick.buckley@gfk.com [from August 2012. E: nick@soshall.net]