HPMR: Prefetching and Pre-shuffling in Shared MapReduce Computation Environment (IEEE, 2009)
Sangwon Seo (KAIST), Ingook Jang, Kyungchang Woo, Inkyo Kim, Jin-Soo Kim, Seungyoul Maeng
Presented 2013-04-25, Special Topics in File Processing, by Taehoon Kim

Contents
1. Introduction
2. Related Work
3. Design
4. Implementation
5. Evaluation
6. Conclusion

Introduction
- Internet services are difficult to support because they generate enormous volumes of data that must be processed every day.
- To address this problem, the MapReduce programming model is used: it supports distributed and parallel processing for large-scale data-intensive applications (e.g., data mining and data-intensive scientific simulation).

Introduction: HDFS
- Hadoop's distributed file system is called HDFS (Hadoop Distributed File System).
- An HDFS cluster consists of a single NameNode, the master server that manages the file-system namespace and regulates clients' access to files, and a number of DataNodes, each of which manages the storage directly attached to it.
- Under the HDFS replica placement policy, two of a block's three replicas are placed on nodes within the same rack.
- Advantage: write performance improves because inter-rack write traffic is cut down.

Introduction: MapReduce data flow
[Figure: MapReduce data flow across two nodes. Files loaded from HDFS are handled by an InputFormat, divided into splits, and fed through RecordReaders to the map tasks; a combiner and partitioner process the map output; the "shuffling" step moves intermediate data over the network; the reduce side sorts, reduces, and writes the results back to the local HDFS store through an OutputFormat.]
- Reducing the shuffling overhead is essential to improving the overall performance of a MapReduce computation.
- The network bandwidth between nodes is also an important factor in the shuffling overhead.

Introduction: "moving computation is better"
- Hadoop's basic principle is to migrate the computation closer to the data, which pays off when the data set is huge.
- Migrating the computation minimizes network congestion and increases the overall throughput (the amount of work processed within a given time) of the system.

Introduction: Hadoop On Demand
- HOD (Hadoop-On-Demand, developed by Yahoo!) is a management system for provisioning virtual Hadoop clusters over a large physical cluster.
- All physical nodes are shared by more than one Yahoo! engineer, which increases the utilization of the physical resources.
- When computing resources (e.g., network and hardware) are shared by multiple users, Hadoop's "moving computation" policy is no longer effective, precisely because the resources are shared.

Introduction: proposed schemes
- To solve this problem, two optimization schemes are proposed:
  - Prefetching: intra-block prefetching and inter-block prefetching
  - Pre-shuffling

Related Work
- J. Dean and S. Ghemawat (the original MapReduce work).
- Traditional prefetching techniques (V. Padmanabhan and J. Mogul; T. Kroeger and D. Long; P. Cao, E. Felten, et al.): prefetching methods that reduce I/O latency.

Related Work
- Zaharia et al.: LATE (Longest Approximate Time to End), which schedules speculative tasks more efficiently in a shared environment.
- Dryad (Microsoft): computations expressed as a directed acyclic graph.
- The degree of data locality is highly related to MapReduce performance.

Design (Prefetching Scheme): intra-block prefetching
[Figure 1: intra-block prefetching in the map phase. The input split assigned to a map task is divided into a "computation in progress" part and a "prefetching in progress" part. Figure 2: intra-block prefetching in the reduce phase, where the same division applies to the data expected by the reduce task.]
- Intra-block prefetching is bi-directional processing: a simple technique that prefetches data within a single block while a complex computation is being performed.
- While the complex job is performed on one side, the data that will be required next are prefetched in parallel and handed to the corresponding task.
- Advantages of intra-block prefetching:
  1. A processing bar monitors the current status of each side and raises a signal if synchronization is about to be broken.
  2. HPMR tries to find the prefetching rate at which performance is maximized while the prefetching overhead is minimized, so the network overhead can be kept low (see the sketch below).
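To make the intra-block idea concrete, here is a minimal, self-contained sketch in plain Java (illustrative only, not actual HPMR or Hadoop code): one thread performs the computation over records that are already in memory while a second thread reads ahead within the same block, and the bounded queue stands in for the processing bar that keeps the two sides from drifting apart. The class, field, and record names are assumptions made for the example.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch of intra-block prefetching (not HPMR code): a prefetch
// thread reads records of one block ahead of the compute thread, and the
// bounded queue plays the role of the "processing bar" that keeps the two
// sides of the block synchronized.
public class IntraBlockPrefetchSketch {

    // Stand-in for one HDFS block: here, just an array of records.
    static String[] block = new String[1000];
    static { for (int i = 0; i < block.length; i++) block[i] = "record-" + i; }

    public static void main(String[] args) throws InterruptedException {
        // The queue capacity bounds how far prefetching may run ahead
        // of the computation.
        BlockingQueue<String> prefetched = new ArrayBlockingQueue<>(64);

        Thread prefetcher = new Thread(() -> {
            try {
                for (String record : block) {
                    // Simulate reading the record from disk ahead of time;
                    // put() blocks when prefetching gets too far ahead.
                    prefetched.put(record);
                }
                prefetched.put("EOF");   // end-of-block marker
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        prefetcher.start();

        // The "map" side consumes records that are already in memory,
        // so disk latency overlaps with the (expensive) computation.
        long sum = 0;
        while (true) {
            String record = prefetched.take();
            if (record.equals("EOF")) break;
            sum += record.length();      // placeholder for real map work
        }
        prefetcher.join();
        System.out.println("processed " + block.length + " records, sum=" + sum);
    }
}

The queue capacity is the knob that corresponds, loosely, to the prefetching rate: a larger capacity lets prefetching run further ahead at the cost of more memory and I/O pressure.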
Design (Prefetching Scheme): inter-block prefetching
[Figure: inter-block prefetching. Data blocks are stored on nodes n1, n2, and n3 at distances D = 1, 5, and 8 from the tasks that need them.]
- Inter-block prefetching works at the block level by prefetching the expected block replicas (copies) to the local rack.
- Tasks A2, A3, and A4 are prefetching their required blocks (D = distance).

Design (Prefetching Scheme): inter-block prefetching algorithm
1. Assign the map task to the node nearest to the required blocks.
2. The predictor generates the list of data blocks, B, to be prefetched for the target task t.

Design (Pre-Shuffling Scheme)
- The pre-shuffling module in the task scheduler looks over the input split (the candidate data) in the map phase and predicts which reducer each key-value pair will be partitioned into (see the sketch after the implementation slides).

Design (Optimization)
- LATE (Longest Approximate Time to End) algorithm: robustly performs speculative execution to maximize performance in a heterogeneous environment, but does not consider the data locality that could accelerate the MapReduce computation further.
- D-LATE (Data-aware LATE) algorithm: almost the same as LATE, except that a task is assigned as close as possible to the location where the needed data are present.

Implementation: optimized scheduler
- Predictor module: not only finds stragglers, but also predicts candidate data blocks and the reducers into which the key-value pairs will be partitioned.
- Using these predictions, the optimized scheduler performs the D-LATE algorithm.

Implementation: optimized scheduler
- Prefetcher: monitors the status of the worker threads and manages prefetching synchronization through the processing bar.
- Load balancer: checks the logs (including disk usage per node and current network traffic per data block) and is invoked to maintain load balance based on disk usage and network traffic.
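Both the pre-shuffling module and the predictor in the optimized scheduler rely on knowing, before the shuffle happens, which reducer a given key will be partitioned into. The sketch below (plain Java, illustrative only, not HPMR's actual module) makes that prediction with the same arithmetic used by Hadoop's default HashPartitioner; the sampled keys and the per-reducer histogram are assumptions for the example.

import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (not HPMR's actual module): predicting, before the map
// output is shuffled, which reducer each key would be partitioned into.
public class PreShufflePredictionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner.
    static int predictReducer(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 4;
        String[] sampledKeys = { "1949", "1950", "cat", "dog", "hadoop" };

        // Count how many sampled keys land on each reducer; a scheduler could
        // use this histogram to place map output near its eventual reducer
        // and so cut shuffle traffic over the network.
        Map<Integer, Integer> perReducer = new HashMap<>();
        for (String key : sampledKeys) {
            int r = predictReducer(key, numReduceTasks);
            perReducer.merge(r, 1, Integer::sum);
            System.out.println("key " + key + " -> reducer " + r);
        }
        System.out.println("predicted load per reducer: " + perReducer);
    }
}

Running the class prints the predicted reducer for each sampled key plus a per-reducer count, which is the kind of information a scheduler could use to co-locate map output with the reducer that will consume it.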
Evaluation: test environment
- Yahoo! Grid, consisting of 1,670 nodes divided into 40 racks connected by L3 routers.
- Each node: two dual-core 2.0 GHz AMD processors, 4 GB of main memory, 400 GB ATA hard disk drives, and a Gigabit Ethernet network interface card.
- All tests are configured so that HDFS maintains four replicas of each data block, whose size is 128 MB.
- Three types of workload: wordcount, search-log aggregator, and similarity calculator.

Evaluation
- Figure 7: HPMR shows significantly better performance than native Hadoop for all of the test sets.
- Figure 8: test set #1 has the smallest ratio of the number of nodes to the number of map tasks; test set #5 gains the most, due to a significant reduction in shuffling overhead.

Evaluation
- The prefetching latency is affected by disk overhead or network congestion; therefore, a long prefetching latency indicates that the corresponding node is heavily loaded.
- (Prefetching rate increases beyond 60%.)

Evaluation
- HPMR assures consistent performance even in a shared environment such as the Yahoo! Grid, where the available bandwidth fluctuates severely (4 Kbps to 128 Kbps).

Conclusion
- Two innovative schemes:
  - The prefetching scheme exploits data locality.
  - The pre-shuffling scheme reduces the network overhead required to shuffle key-value pairs.
- HPMR is implemented as a plug-in type component for Hadoop.
- HPMR improves overall performance by up to 73% compared to native Hadoop.
- As a next step, the authors plan to evaluate more complicated workloads such as HAMA (an open-source Apache incubator project).

Appendix: MapReduce Example
- Example: analyzing a weather data set.
- Each record is stored as a single line of ASCII text; within a file, every field has a fixed width and there are no delimiters.
- Example record: 0057332130999991950010103004+51317+028783FM12+017199999V0203201N00721004501CN0100001N9-01281-01391102681
- Query: from the NCDC data files recorded between 1901 and 2001, find the highest temperature (F) for each year.
- Processing steps:
  - Input: the file is read in 64 MB chunks.
  - 1st map: extract <offset, record> pairs from the file.
  - 2nd map: extract <year, temperature> pairs from each record.
  - Shuffle: organize the pairs into per-year data groups.
  - Reduce: merge the groups and return the final result file.

Appendix: MapReduce Example
- 1st map: extract <offset, record> from the file, i.e., <Key_1, Value> = <offset, record>:
  <0, 0067011990999991950051507004...9999999N9+00001+99999999999...>
  <106, 0043011990999991950051512004...9999999N9+00221+99999999999...>
  <212, 0043011990999991950051518004...9999999N9-00111+99999999999...>
  <318, 0043012650999991949032412004...0500001N9+01111+99999999999...>
  <424, 0043012650999991949032418004...0500001N9+00781+99999999999...>
  ...
- 2nd map: extract the year and temperature from each record, i.e., <Key_2, Value> = <year, temperature>:
  <1950, 0> <1950, 22> <1950, −11> <1949, 111> <1949, 78> ...

Appendix: MapReduce Example
- Shuffle: because the 2nd map produces so many pairs, they are regrouped by year, which reduces the merge cost in the reduce phase.
  2nd map output: <1950, 0>, <1950, 22>, <1950, −11>, <1949, 111>, <1949, 78>
  After shuffling: <1949, [111, 78]>, <1950, [0, 22, −11]>
- Reduce: merge the candidate sets from all mappers and return the final result.
  Mapper_1: (1950, [0, 22, −11]), (1949, [111, 78])
  Mapper_2: (1950, [25, 15]), (1949, [30, 45])
  Reducer: (1950, [0, 22, −11, 25, 15]) -> (1950, 25); (1949, [111, 78, 30, 45]) -> (1949, 111)

Appendix: Hadoop, The Definitive Guide (pp. 19-20)
[Figures 1-4 reproduced from the book.]
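The appendix walks through the classic max-temperature example from Hadoop: The Definitive Guide; the sketch below shows what the corresponding map and reduce classes look like against the Hadoop MapReduce API. The substring offsets for the year and temperature fields and the quality-code filter follow the book's treatment of the NCDC fixed-width format and should be read as assumptions here; the job driver that wires the classes together is omitted.

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch of the max-temperature example: the map side turns each fixed-width
// NCDC record into a <year, temperature> pair, and the reduce side keeps the
// maximum per year.
public class MaxTemperature {

  public static class MaxTemperatureMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final int MISSING = 9999;           // sentinel for a missing reading

    @Override
    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String line = value.toString();
      String year = line.substring(15, 19);            // fixed-width year field
      int airTemperature;
      if (line.charAt(87) == '+') {                    // skip an explicit '+' sign
        airTemperature = Integer.parseInt(line.substring(88, 92));
      } else {
        airTemperature = Integer.parseInt(line.substring(87, 92));
      }
      String quality = line.substring(92, 93);         // reading-quality code
      if (airTemperature != MISSING && quality.matches("[01459]")) {
        context.write(new Text(year), new IntWritable(airTemperature));
      }
    }
  }

  public static class MaxTemperatureReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int maxValue = Integer.MIN_VALUE;
      for (IntWritable value : values) {
        maxValue = Math.max(maxValue, value.get());    // merge step: keep the max per year
      }
      context.write(key, new IntWritable(maxValue));
    }
  }
}

The mapper corresponds to the two map steps on the slides (record to <year, temperature>), and the reducer performs the per-year merge that yields pairs such as (1949, 111) and (1950, 25) in the example above.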