PACT BPOE2010 2014 Big Data Workloads An Architect’s Perspective Lizy K. John University of Texas at Austin BPOE 2014 PACT 2010 The Buzz with Big Data Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 BIG DATA - Seeing things we could not see before Analyze massive amounts of data Derive Insights Business Medicine World Economy Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 An Architect would like to know What kind of cores, memory organizations and clustering support needed to support big data Performance metrics to guide workload partitioning strategies other than use available/affordable nodes Partitioning considering performance, power, energy Scaling of computation and communication depending on partitions Becomes important to understand big data workloads Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 What is “Big Data”? Common Definition 1 Terabyte? – Yesterdays “Big Data” “Data that is too large and Petabytes? Exabytes? complex to classify using traditional relational database – Today’s “Big Data” methods” Zettabytes? -Wikipedia – Tomorrow’s “Big Data” What does complex mean?? Need a more complete definition Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Some examples Combined Space of all hard drives in 2006 – 160 exabytes All hard drives sold by Seagate in 2011 Exa = 2^60 – 300 exabytes Zetta = 2^70 The world wide web in 2013 – 4 zettabytes Yotta = 2^80 NSA Utah Data Center in Snowden leaks – 5 zettabytes (some claimed it to be 1 YB) Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Characteristics of Big Data * Not always included in taxonomy Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Big Data Analytics = I got this in the mail the very same week my son turned 16 Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 What’s the Problem? Deriving insights from data NOT a new problem – Traditional relational databases that contain carefully pruned and organized data But storage is relatively cheap these days – Possible to store more data in unstructured form Need intelligent ways to distill large amounts of data in different formats to actionable KNOWLEDGE Many different levels to approach this problem….. Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Big Data Stack Algorithms – PageRank, Genetic Algorithms, SVM, etc. Frameworks and Implementations – Map/Reduce (Hadoop), MySQL, NoSQL (Cassandra), etc Hardware – SMT, Accelerator Nodes (Intel Phi, GPU), etc How does workload analysis fit in? – EVERYONE BENEFITS FROM A DEEP UNDERSTANDING OF A WORKLOAD AND ITS CHARACTERISTICS! Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Are New Benchmarks Needed? Already have industry standard benchmarks! Critical Question – Do Big Data workloads have different characteristics than these “traditional” Benchmarks? – Yes they do! • • • • • TLB Behavior [Wang et al] I-Cache Behavior [Ferdman et al, Zhen et al, Wang et al] SMT [Ferdman et al] Operation Intensity [Wang et al] Data Volume [Wang et al] Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Why New Benchmarks? I-Cache behavior from Cloudsuite [Ferdman et al] – Much higher miss rate than traditional benchmarks – Significant OS contribution to cache behavior Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Why New Benchmarks? OS Activity [Zhen et al] – Shows percentage of instructions – Significant variation in kernel/application dynamic instructions Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Why New Benchmarks? I-TLB Behavior from BigDataBench [Wang et al] – Once again, more misses than traditional benchmarks Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Big Data Characterization Challenges INPUT GENERATION Input data is critical! Couple of approaches – Synthetic data generation • Questionable Veracity – Grab data from industry • Not always possible • CAIDA-like How much data? – Feasibility vs accuracy Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Common Big Data Domains Databases – Structured • Typically relational data • SQL databases – Unstructured • Example: document oriented • Generally no fixed table schema – Semi-structured Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Common Big Data Domains Common NoSQL Databases – Cassandra • Industry leading, ultra scalable – HBase • Database built on top of Hadoop and HDFS – MongoDB • JSON- database with dynamic schema Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Common Big Data Domains Map/Reduce - Hadoop Key/ Value computation – Map and Reduce phase Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Common Big Data Domains Graph Algorithms – Important for Data Mining and Machine Learning – Graphlab – essentially Hadoop over large graphs – GraphChi – web scale graph computation – Streaming graph changes – asynchronous changes to the graph (i.e changes written to edges are immediately visible to subsequent computation) – Partitioning Challenges Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Hierarchical Decomposition of Workloads By dividing into functional blocks - e.g. front end, back end, and database. By subdividing into tasks, task groups, processes, threads, etc. By dividing considering hardware modules at microarchitectural level – memory subsystem, CPU, disk, etc. eg: consider AMD APUs Group together tasks in an application that use data from the same rack. Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Entropy Guided Optimizations Partitioning Graph Workloads – How do we assign work to nodes? Important Factors – Data Locality – Minimize Communication – Maximize Resource Utilization Bisection bandwidth Entropy Guided Optimization Lizy K. John Entropy = (memory-in, memory out, #computations, …other attributes) 3/1/2014 BPOE 2014 PACT 2010 In-Memory Map/Reduce IBM Main Memory Map Reduce (M3R) – Eliminates intermediate disk writes for Hadoop Map/Reduce Jobs – Pros • Significantly speeds up some workloads – 45x on sparse matrix mult – Cons • Data must fit in cluster memory • No failure resilience Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Big Data Benchmarking Challenges WORKLOAD VARIETY Ton of software stacks required – Configuration of software platform sometimes more important than workload (see next slide) A comprehensive benchmark should feature – Offline (Batch Style Analytics) – Online (Real Time Analytics) Seeing positive momentum here! TPC-* -> Cloudsuite, BigDataBench, etc Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Hadoop Case Study – Optimal Settings What are the optimal framework settings? – Workload Dependent? – Hardware Dependent? – Just set everything to the maximum value?? – Does it matter? How do engineers setup clusters for new platforms? – Some “rules of thumb” available, but imprecise Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Hadoop Case Study Standard Hadoop configuration algorithm ):hadoop_options = Google(“Best Hadoop Configuration”) launch_cluster() if (!cluster_boots || !clients_happy) { hadoop_options = Permute(hadoop_options) launch_cluster() if(!cluster_boots || !clients_happy) { options = Lookup_Options(Buddy_at_Other_Company) launch_cluster() if(!cluster_boots || !clients_happy) { options = default_options launch_cluster() } } Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Hadoop Case Study (Mapper-Reducer Slots) 8m8r 16m4r 32m4r 64m4r 2m2r CPU Occupancy of TeraSort for different mapper-reducer slots – Simple app, but different very different execution profile depending on configuration Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Hadoop Case Study (Mapper-Reducer Slots) 64m4r 2m2r 8m8r 16m4r 32m4r Memory Utilization of TeraSort for different mapper-reducer slots – Simple app, but different very different execution profile depending on configuration Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Hadoop Case Study (Block Size) 32MB 64MB 128MB 256MB 512MB TeraSort – Higher block size reduces total number of maps – Simple app, but different very different execution profile depending on configuration Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Big Data Benchmarking Frameworks Management frameworks and harnesses essential Example: AMD SWAT – Software platform for automating the….. • creation, deployment, provisioning, execution, and data gathering of synthetic workloads on scalable clusters Several benchmarks available – Cloudsuite – Hadoop – Graphlab – Anything you want to plugin! Lizy K. John 3/1/2014 PACT 2010 Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Big Data Benchmarking Challenges Big Cluster Lots of cores, lots of memory and disk space – Hard for non-industry researchers Prohibitively long runtimes Can we simulate Big Data? • Requires full system simulation • Cloudsuite on Flexus (EPFL) Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Adaptable Scalable Futuristic Benchmark Proxies Workload Characteristics Application Behavior Space n tio c tr u Ins Mix w Flo m l a r y tro ior og Pr calit Con ehav B Lo on el a ti s v c i ng e un ristic hari d L ism m a re el Com acte ata S erns Th arall D Patt ar P ch ‘Knobs’ for Changing Program Characteristcs Workload Synthesis Algorithm Benchmark Synthesizer Generate Clones by setting knobs to appropriate values Adaptable Scalable Synthetic Benchmark Compile and Execute Lizy K. John Futuristic Hardware Pre-silicon Model 3/1/2014 BPOE 2014 PACT 2010 No. 1 2 3 4 5 6 7 8 Abstract 9 10 Workload Model 11 12 13 14 15 16 17 18 19 20 21 22 23 24 Lizy K. John Metric Basic block size Branch taken rate for each branch Category Control flow predictability Branch transition rate Proportion of INT ALU, INT MUL, INT DIV, FP ADD, FP MUL, FP DIV, FP MOV, FP SQRT, LOAD & STORE Dependency distance distribution Private stride value per static load/store Data Footprint of the workload Mean and standard deviation of the MLP MLP frequency Number of threads Thread class and processor assignment Percentage loads to private data Percentage loads to read-only data Percentage migratory loads Percentage consumer loads Instruction mix Instruction level parallelism Data locality Memory Level Parallelism (MLP) Thread level parallelism Percentage irregular loads Percentage stores to private data Percentage producer stores Percentage irregular stores Shared stride value per static load/store Data pool distribution based on sharing patterns Number of lock/unlock pairs and Number of mutex objects Number of Instructions between lock and unlock 3/1/2014 Shared data access pattern and communication characteristics Synchronization Characteristics BPOE 2014 PACT 2010 Big Data Synthetics? A Possibility? Workload Characteristics Application Behavior Space on c ti tru ix s In M w Flo ol ior ramy r g t o Pr calit Con ehav B Lo n l tio ve ca ics i ng e n u rist ari d L ism h m a S e t m ns re el C o a r a c D a ta a tte r Th arall P P ch But what are the knob settings for “Big Data” ‘Knobs’ for Changing Program Characteristcs Workload Synthesis Algorithm Given challenges in Big Data workloads, this would be useful Benchmark Synthesizer – Need detailed characterization Synthetic Benchmark Compile and Execute Hardware Pre-silicon Model Figure 5: Proxy Benchmark Generation Framework Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Big Data Workload Clones Workload Characteristics Application Behavior Space on c ti tru ix s In M w Flo ol ior ramy r g t o Pr calit Con ehav B Lo n l tio ve ca ics i ng e n u rist ari d L ism h m a S e t m ns re el C o a r a c D a ta a tte r Th arall P P ch ‘Knobs’ for Changing Program Characteristcs Workload Synthesis Algorithm Benchmark Synthesizer Clones for DSS Clones for OLAP Hardware Pre-silicon Model Figure 5: Proxy Benchmark Generation Framework Lizy K. John Clones for Hadoop Clones for Graph Processing Synthetic Benchmark Compile and Execute CLONES WILL AVOID COMPLEX SOFTWARE STACKS: Clones for DSS with materialized views Need detailed characterization 3/1/2014 BPOE 2014 PACT 2010 Tricks from the Old Treasure Chest Search and Sort – – age old computer science problems – new issues raised by scale but Old OLTP, OLAP and DSS Combination of HPC and Database Ideas Old Scatter-Gather Piece-wise modeling Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Conclusion Big Data is here to stay Increasingly important Cloud and Big Data will take the world in unprecedented ways Appropriate hardware and software need to be developed Workload metrics to guide partitioning Need to act now to develop intelligent benchmarks and workload analysis methodology Lizy K. John 3/1/2014 BPOE 2014 PACT 2010 Thank You! Questions? Laboratory for Computer Architecture (LCA) The University of Texas at Austin lca.ece.utexas.edu Lizy K. John 3/1/2014 PACT 2010 References [1] M. Ferdman, et. al.. 2012. Clearing the clouds: a study of emerging scale-out workloads on modern hardware.SIGARCH Comput. Archit. News 40, 1 (March 2012), 37-48. [2] Zhen Jia, Lei Wang, Jianfeng Zhan Lixin Zhang, Chunjie Luo. Characterizing Data Analysis Workloads in Data Centers. In Workload Characterization (IISWC), 2013 IEEE International Symposium on. IEEE. [3] Lei Wang, Jianfeng Zhan, Chunjie Luo, Yuqing Zhu, Qiang Yang, Yongqiang He, Wanling Gao, Zhen Jia, Yingjie Shi, Shujie Zhang, Cheng Zhen, Gang Lu, Kent Zhan, Xiaona Li, and Bizhu Qiu. The 20th IEEE International Symposium On High Performance Computer Architecture (HPCA-2014), February 1519, 2014, Orlando, Florida, USA. [4] Huang, Shengsheng, et al. "The HiBench benchmark suite: Characterization of the MapReduce-based data analysis." Data Engineering Workshops (ICDEW), 2010 IEEE 26th International Conference on. IEEE, 2010. [5] Cooper, Brian F., et al. "Benchmarking cloud serving systems with YCSB."Proceedings of the 1st ACM symposium on Cloud computing. ACM, 2010. [6] GridMix [Online]. Available: https://hadoop.apache.org/docs/r1.2.1/gridmix.html. (21.10.2013). [7] PigMix [Online]. Available: https://cwiki.apache.org/confluence/display/PIG/PigMix.(21.10.2013). [8] PAVLO, A., PAULSON, E., RASIN, A., ABADI, D.J., DEWITT, D.J., MADDEN, S., and STONEBRAKER, M., 2009. A comparison of approaches to largescale data analysis. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data ACM, 165-178. [9] Transaction Processing Performance Council (Online) http://www.tpc.org/default.asp (02-13-2013) [10] GHAZAL, A., RABL, T., HU, M., RAAB, F., POESS, M., CROLOTTE, A., and JACOBSEN, H.-A., 2013. BigBench: Towards an Industry Standard Benchmark for Big Data Analytics. In SIGMOD ACM, New York, New York, 2013, 197-1208. [11] SUMBALY, R., KREPS, J., and SHAH, S., 2013. Linkbench: a database benchmark based on the Facebook social graph In Proceedings of the SIGMOD (New York, New Youk, USA2013), ACM, 1185-1196. [12] Cloudsuite on Flexus[Online]. http://parsa.epfl.ch/cloudsuite/isca12-tutorial.html (02-13-2013). ISCA 2012 Tutorial [13] Graphlab [Online]. Available: http://graphlab.com/). [14] Shinnar, A., Cunningham, D., Saraswat, V., & Herta, B. (2012). M3R: increased performance for in-memory Hadoop jobs. Proceedings of the VLDB Endowment,5(12), 1736-1747. [15] Nambiar, Raghunath Othayoth, and Meikel Poess. "The making of TPC-DS."Proceedings of the 32nd international conference on Very large data bases. VLDB Endowment, 2006. [16] Breternitz, Mauricio, et al. "Cloud Workload Analysis with SWAT." Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on. IEEE, 2012. Lizy K. John 3/1/2014