https://searchbusinessanalytics.techtarget.com/definition/opinion-mining-sentiment-mining

Metadata is data that describes other data. Meta is a prefix that in most information technology usages means "an underlying definition or description." Metadata summarizes basic information about data, which can make finding and working with particular instances of data easier. For example, author, date created, date modified and file size are very basic document metadata. Being able to filter on that metadata makes it much easier to locate a specific document. In addition to document files, metadata is used for images, videos, spreadsheets and web pages.

The use of metadata on web pages can be very important. Metadata for a web page contains a description of the page’s contents, as well as keywords linked to the content. These are usually expressed in the form of metatags. The metadata containing the web page’s description and summary is often displayed in search results by search engines, so its accuracy and detail matter: it can determine whether a user decides to visit the site.

Metatags are often evaluated by search engines to help decide a web page’s relevance, and were used as the key factor in determining position in a search until the late 1990s. The increase in search engine optimization (SEO) towards the end of the 1990s led many websites to “keyword stuff” their metadata to trick search engines, making their sites seem more relevant than others. Since then search engines have reduced their reliance on metatags, though they are still factored in when indexing pages. Many search engines also try to stop web pages from gaming their systems by regularly changing their ranking criteria; Google is notorious for frequently changing its undisclosed ranking algorithms.

Metadata can be created manually or by automated information processing.
Manual creation tends to be more accurate, allowing the user to input any information they feel is relevant or needed to help describe the file. Automated metadata creation is usually much more elementary, recording only information such as file size, file extension, when the file was created and who created it.

-----------------------XxxxxxxxxxxxxX==================

Unstructured text is written content that lacks metadata and cannot readily be indexed or mapped onto standard database fields. It is often user-generated information such as email or instant messages, documents or social media postings.

Unstructured text is an important source of information for businesses, research institutes and surveillance agencies. Enterprises often mine unstructured text for data to enhance their business intelligence strategy and gain a competitive advantage in the marketplace. The unstructured text collected from social media activities plays a key role in predictive analytics for the enterprise because it is a prime source for sentiment analysis to determine the general attitude of consumers toward a brand or idea. Mining of unstructured text delivers new insights by uncovering previously unknown information, detecting patterns and trends, and identifying connections between seemingly unrelated pieces of data.

Natural language processing software and other automated tools are typically used to prepare unstructured text for indexing. Because language is often vague, disambiguation of the text through an examination of context is often an important initial step in the mining process. The content is also reviewed for word frequency and other patterns. Tagging is performed to label various pieces of text-derived data so they can be categorized and grouped in ways that are most likely to deliver useful information. Once the text has been turned into data, it can be analyzed and evaluated for relevance and importance.
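
The word-frequency and tagging steps described above can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the stop-word list, the tag names and their keywords (STOP_WORDS, TAG_RULES) and the sample message are invented for the example, and real text-mining tools use far richer language models and disambiguation than simple keyword matching.

```python
import re
from collections import Counter

# Hypothetical stop-word list and tag rules, chosen only for illustration.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "to", "of", "my", "it", "i", "but"}
TAG_RULES = {
    "billing": {"invoice", "refund", "charge"},
    "shipping": {"delivery", "package", "late"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how often each non-stop-word token occurs."""
    return Counter(t for t in tokenize(text) if t not in STOP_WORDS)


def tag_text(text):
    """Return the sorted tags whose keywords appear at least once in the text."""
    tokens = set(tokenize(text))
    return sorted(tag for tag, keywords in TAG_RULES.items() if tokens & keywords)


# Invented sample message standing in for a piece of unstructured text.
message = "My package was late and the delivery was late again. I want a refund."
print(word_frequencies(message).most_common(3))
print(tag_text(message))
```

Once each message carries counts and tags like these, it can be stored in ordinary database fields and grouped or filtered like any other structured record.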
-----------------------XxxxxxxxxxxxxxxxxX-----------------------------------------------------------------------------------

Sentiment analysis, also referred to as opinion mining, is an approach to natural language processing (NLP) that identifies the emotional tone behind a body of text. This is a popular way for organizations to determine and categorize opinions about a product, service or idea. It involves the use of data mining, machine learning (ML) and artificial intelligence (AI) to mine text for sentiment and subjective information.

Sentiment analysis systems help organizations gather insights from unorganized and unstructured text that comes from online sources such as emails, blog posts, support tickets, web chats, social media channels, forums and comments. Algorithms replace manual data processing by implementing rule-based, automatic or hybrid methods. Rule-based systems perform sentiment analysis based on predefined, lexicon-based rules, while automatic systems learn from data with machine learning techniques. A hybrid system combines both approaches.

In addition to identifying sentiment, opinion mining can extract the polarity (the amount of positivity and negativity), subject and opinion holder within the text. Furthermore, sentiment analysis can be applied at varying scopes such as document, paragraph, sentence and sub-sentence levels.

Vendors that offer sentiment analysis platforms or SaaS products include Brandwatch, Hootsuite, Lexalytics, NetBase, Sprout Social, Sysomos and Zoho. Businesses that use these tools can review customer feedback more regularly and proactively respond to changes of opinion within the market.

Types of sentiment analysis

1. Fine-grained sentiment analysis provides a more precise level of polarity by breaking it down into further categories, usually very positive to very negative. This can be considered the opinion equivalent of ratings on a 5-star scale.
2. Emotion detection identifies specific emotions rather than positivity and negativity. Examples could include happiness, frustration, shock, anger and sadness.
3. Intent-based analysis recognizes actions behind a text in addition to opinion. For example, an online comment expressing frustration about changing a battery could prompt customer service to reach out to resolve that specific issue.
4. Aspect-based analysis identifies the specific component being positively or negatively mentioned. For example, a customer might leave a review on a product saying the battery life was too short. The system would then report that the negative sentiment is not about the product as a whole, but about the battery life.

Applications of sentiment analysis

Sentiment analysis tools can be used by organizations for a variety of applications, including:

- Identifying brand awareness, reputation and popularity at a specific moment or over time.
- Tracking consumer reception of new products or features.
- Evaluating the success of a marketing campaign.
- Pinpointing the target audience or demographics.
- Collecting customer feedback from social media, websites or online forms.
- Conducting market research.
- Categorizing customer service requests.

Challenges with sentiment analysis

Challenges associated with sentiment analysis typically revolve around inaccuracies in training models. Objective statements, or comments with a neutral sentiment, tend to pose a problem for systems and are often misidentified. For example, if a customer received the wrong color item and submitted the comment "The product was blue," this would be identified as neutral when in fact it should be negative.

Sentiment can also be challenging to identify when systems cannot understand the context or tone.
Answers to polls or survey questions like "nothing" or "everything" are hard to categorize when the context is not given, as they could be labeled as positive or negative depending on the question. Similarly, irony and sarcasm often cannot be explicitly trained for and lead to mislabeled sentiment.

Computer programs also have trouble when encountering emojis and irrelevant information. Special attention needs to be given to training models with emojis and neutral data so as not to flag texts improperly.

Finally, people can be contradictory in their statements. Most reviews will have both positive and negative comments, which is somewhat manageable by analyzing sentences one at a time. However, the more informal the medium (Twitter or blog posts, for example), the more likely people are to combine different opinions in the same sentence and the more difficult it will be for a computer to parse.
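
A rule-based, sentence-at-a-time analysis of the kind mentioned above might look like the following Python sketch. Everything in it is an assumption made for illustration: the tiny lexicon and its scores (LEXICON), the negation list (NEGATIONS), the one-word negation rule and the sample review are invented, and they do not represent the method of any vendor named in this article.

```python
import re

# Hypothetical mini-lexicon mapping words to polarity scores.
# Real lexicons are far larger and usually domain-specific.
LEXICON = {"great": 2, "good": 1, "love": 2,
           "short": -1, "bad": -1, "terrible": -2, "broken": -2}
NEGATIONS = {"not", "never", "no"}


def sentence_score(sentence):
    """Sum lexicon scores, flipping the sign of a word that directly follows a negation."""
    score = 0
    previous = ""
    for token in re.findall(r"[a-z']+", sentence.lower()):
        value = LEXICON.get(token, 0)
        if previous in NEGATIONS:
            value = -value
        score += value
        previous = token
    return score


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(text):
    """Split the text into sentences and label each one separately."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [(s, label(sentence_score(s))) for s in sentences]


# Invented review mixing positive, negative and context-dependent statements.
review = "The screen is great. The battery life is too short. The product was blue."
for sentence, sentiment in analyze(review):
    print(f"{sentiment:>8}: {sentence}")
```

Scoring each sentence separately keeps the praise for the screen apart from the complaint about the battery. The last sentence comes out as neutral because no word in it appears in the lexicon, which reproduces the "The product was blue" problem: without knowing that the customer ordered a different color, a rule-based system has no basis for treating it as negative.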