Data Stream
  Find similar items
    Jaccard Similarity
    Shingling -> MinHash -> LSH
    MinHash
      o Random permutations
      o One-pass MinHash
  Sampling data stream
    Fixed proportion
    Fixed size
      o Reservoir Sampling
  Find frequent items
    [Deterministic] Misra-Gries
      o At most (m-m')/(k+1) decrement steps
    [Randomized] CountMin Sketch
  Filtering data stream
    Bloom Filter
      o 1 hash function
      o k hash functions
  Locality Sensitive Hashing (LSH)
    (r, c, p1, p2)-sensitive
    Jaccard similarity/distance
      o MinHash
    Cosine similarity/distance
      o SimHash
Graph
  Link Analysis
    TF.IDF
    Term Spam
    PageRank
      o Dead ends
          Recursively remove
          Taxation
    Biased PageRank
    Link Spam (Spam Farm)
      o TrustRank
      o Spam Mass
  Social Network Analysis
    Small world property
      o Power law degree distribution
    Core-Periphery structure
    Strength of ties
      o Triadic closure
    Clustering Coefficient
      o Triangle enumeration
    Community Detection
      o K-core decomposition
    Influence Maximization
      o Greedy-based
      o Sketch-based
Join Analysis
  Natural join
  Semijoin
  Multiway join
  Acyclic join
    o Yannakakis Algorithm
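
Worked example for the Misra-Gries entry (the (m-m')/(k+1) bound on decrement steps). This is a minimal sketch, not taken from the outline itself. It assumes the common formulation with k counters, where m is the stream length and m' is the sum of the counters at the end. The function name misra_gries and the sample stream are illustrative choices.

```python
from collections import Counter


def misra_gries(stream, k):
    """Summarize a stream with at most k counters.

    Returns (counters, decrement_steps).
    """
    counters = {}
    decrement_steps = 0
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # All k counters are in use: decrement each of them and drop the
            # arriving item. One such step discards k + 1 occurrences.
            decrement_steps += 1
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters, decrement_steps


# Illustrative stream (assumption): 'a' is the clearly frequent item.
stream = list("abacabadabacabae")
k = 2
counters, steps = misra_gries(stream, k)

m = len(stream)                    # stream length
m_prime = sum(counters.values())   # sum of the surviving counters
true_counts = Counter(stream)      # exact counts, for comparison only

print("counters:", counters)
print("decrement steps:", steps, "bound:", (m - m_prime) / (k + 1))
```

Each decrement step removes k + 1 occurrences from the tally (one from each of the k counters plus the dropped arrival), so the number of steps cannot exceed (m-m')/(k+1). Every estimate therefore undercounts the true frequency by at most that many.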