CS6029 SOCIAL NETWORK ANALYSIS
SET WISE OPERATIONS

WHY SET WISE OPERATIONS?
• Query limitations
• Operator types: standalone and conjunction-required
• Boolean operators and grouping
• Order of operations
• Punctuation, diacritics, and case sensitivity
• Specificity and efficiency

BUILDING QUERIES FOR SEARCH TWEETS

QUERY
• A database query is a request for data from a database.
• The request targets a database table or a combination of tables and is written in a code known as a query language.

QUERY LIMITATIONS IN TWITTER
• Your queries will be limited depending on which access level you are using.
• If you have Essential or Elevated access, your query can be 512 characters long.
• If you have Academic Research access, your query can be 1024 characters long.

OPERATOR TYPES
• Standalone operators can be used alone or together with any other operators.
• Ex: #samantha
• Conjunction-required operators can only be used when at least one standalone operator is included in the query.
• Ex: "twitter data" has:mentions (has:media OR has:links)

BOOLEAN OPERATORS AND GROUPING

AND LOGIC
• Successive operators with a space between them will result in boolean "AND" logic, meaning that Tweets will match only if both conditions are met.
• Ex: rainyday #ilayaraja

OR LOGIC
• Successive operators with OR between them will result in OR logic, meaning that Tweets will match if either condition is met.
• Ex: small boi OR #epuuraaa OR #meme

NOT LOGIC, NEGATION
• Prepend a dash (-) to a keyword (or any operator) to negate it (NOT). For example, cat #meme -grumpy will match Tweets containing the hashtag #meme and the term cat, but only if they do not contain the term grumpy.
• One common query clause is -is:retweet, which will not match on Retweets, thus matching only on original Tweets, Quote Tweets, and replies.
• All operators can be negated, but negated operators cannot be used alone.

GROUPING
• Use parentheses to group operators together.
For example, (grumpy cat) OR (#meme has:images) will return either Tweets containing the terms grumpy and cat, or Tweets with images containing the hashtag #meme.
• Note that ANDs are applied first, then ORs are applied.

ORDER OF OPERATIONS
• When combining AND and OR functionality, the following order of operations will dictate how your query is evaluated.
• Operators connected by AND logic are combined first.
• Then, operators connected with OR logic are applied.
For example:
• poori OR pongal chutney would be evaluated as poori OR (pongal chutney)
• poori pongal OR chutney would be evaluated as (poori pongal) OR chutney

PUNCTUATION, DIACRITICS, AND CASE SENSITIVITY
• If you specify a keyword or hashtag query with character accents or diacritics, it will match Tweet text that contains the term with the accents and diacritics, as well as the same term written with plain characters.
• For example, queries with the keyword Diacrítica or the hashtag #cumpleaños will match Diacrítica or #cumpleaños, as well as Diacritica or #cumpleanos without the accented í or the eñe.
• Characters with accents or diacritics are treated the same as normal characters and are not treated as word boundaries. For example, a query with the keyword cumpleaños would match only Tweets containing the whole word cumpleaños and would not match Tweets containing only the fragments cumplea, cumplean, or os.
• All operators are evaluated in a case-insensitive manner. For example, the query osma will match Tweets with all of the following: osma, OSMA, Osma.

SPECIFICITY
• Using broad, standalone operators for your query, such as a single keyword or #hashtag, is generally not recommended.
• For example, if your query was just the keyword happy, you would likely get anywhere from 200,000 to 300,000 Tweets per day.
• Adding more conditional operators narrows your search results, for example: (happy OR happiness) place_country:GB -birthday -is:retweet

EFFICIENCY
• Writing efficient queries is also beneficial for staying within the query length limit. The character count includes the entire query string, including spaces and operators.
• For example, the following query is 67 characters long: (happy OR happiness) place_country:Kailasa -nithyananda -is:retweet
• Note that place_country expects a two-letter ISO country code, as in the GB example; Kailasa appears here only to illustrate character counting.

ANALYZING A USER'S FRIENDS AND FOLLOWERS

PROBLEM
You'd like to conduct a basic analysis that compares a user's friends and followers.

SOLUTION
Use setwise operations such as intersection and difference to analyze the user's friends and followers.

DISCUSSION
Given two sets, the intersection of the sets returns the items that they have in common, whereas the difference between the sets "subtracts" the items in one set from the other, leaving behind the difference. Recall that intersection is a commutative operation, while difference is not commutative.

In the context of analyzing friends and followers, the intersection of two sets can be interpreted as "mutual friends", that is, people you are following who are also following you back. The difference of two sets can be interpreted either as followers who you aren't following back or as people you are following who aren't following you back, depending on the order of the operands.

Given a complete list of friend and follower IDs, computing these setwise operations is a natural starting point and can be the springboard for subsequent analysis. For example, it probably isn't necessary to use the GET users/lookup API to fetch profiles for millions of followers for a user as an immediate point of analysis.

THANK YOU
MITHESH A (2019503533)
YOGEESWAR S (2019503573)
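
SUPPLEMENTARY SKETCH: SETWISE OPERATIONS IN PYTHON
The intersection and difference described in the discussion of friends and followers can be sketched with Python's built-in set type. This is only a minimal illustration under stated assumptions, not the presenters' implementation: the ID lists are made-up sample values standing in for friend and follower IDs that have already been fetched, and the function name compare_friends_and_followers and the dictionary keys are hypothetical.

```python
def compare_friends_and_followers(friend_ids, follower_ids):
    """Compare two collections of user IDs using set operations.

    Hypothetical helper for illustration; the inputs are assumed to be
    complete lists of IDs that were fetched beforehand.
    """
    friends = set(friend_ids)      # accounts the user follows
    followers = set(follower_ids)  # accounts that follow the user
    return {
        # Intersection: commutative, so operand order does not matter.
        "mutual_friends": friends & followers,
        # Difference: not commutative, so each order answers a different question.
        "friends_not_following_back": friends - followers,
        "followers_not_followed_back": followers - friends,
    }


# Made-up sample IDs, not real account data.
sample_friend_ids = [101, 102, 103, 104]
sample_follower_ids = [103, 104, 105]

result = compare_friends_and_followers(sample_friend_ids, sample_follower_ids)
for label, ids in result.items():
    print(label, sorted(ids))
```

Swapping the operands leaves mutual_friends unchanged but exchanges the two difference results, which is the non-commutativity noted in the discussion. Only the resulting, usually much smaller, ID sets would then need profile lookups.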