Network Computing II (ネットワークコンピューティング論Ⅱ, NC論2)
Academic year 2015 (Heisei 27), second semester
Tuesdays, 2nd period (10:40-12:10)
Tsutomu Yoshinaga (UEC)
yosinaga@is.uec.ac.jp

Contents
• This course covers interconnection networks for distributed and parallel computers, along with the message routing techniques used on them.
• Materials: http://comp.is.uec.ac.jp/yoshinagalab/yoshinaga/dp2.html
• http://www.gap.upv.es/slides/appendixE.html
• http://booksite.mkp.com/9780123838728/references.php appendix_f.pdf (118 pages, 2 MB)
• TA: Atsushi Murata (村田 篤志) amurata@comp.is.uec.ac.jp

References
• T. M. Pinkston and J. Duato: Interconnection Networks, Appendix E in Computer Architecture: A Quantitative Approach, 4th Edition, Morgan Kaufmann Publishers (2006).
• Also in the 5th Edition, Morgan Kaufmann Publishers (2011).
• J. Duato, S. Yalamanchili, L. Ni: Interconnection Networks: An Engineering Approach, 2nd Edition, Morgan Kaufmann Publishers (2003).
• S. Tomita (富田眞治): 並列コンピュータ (Parallel Computers), Shokodo (昭晃堂) (1996).
• W. D. Dally, B. Towles: Principles and Practices of Interconnection Networks, Morgan Kaufmann Publishers (2003).

What is an Interconnection Network?
• It is a programmable system that transports data between terminals, such as processors and memory.
• It is programmable in the sense that it makes different connections at different points in time.
• It is a system because it is composed of many components: buffers, channels, switches, and controls that work together to deliver data.

Interconnection Network (1/2)
[Figure: multicomputer. Processor (P) and memory (M) pairs connected through an interconnection network.]

Interconnection Network (2/2)
[Figure: UMA-type shared-memory multiprocessor. Processors (P) on one side of the interconnection network, memory modules (M) on the other.]
• It is also called the dance-hall architecture.

Trend
• Network performance is increasing along with processor performance, at a rate of about 50% per year.
• Communication is a limiting factor in the performance of many modern systems.
• Buses have been unable to keep up with the bandwidth demand, and point-to-point interconnection networks are rapidly taking over.

Computer Classifications
Share of the TOP500 (%), June 2013 to June 2015 (http://www.top500.org/)

            2015/06   2014/06   2013/06
MPP           13.2      14.6      16.6
Cluster       86.8      85.4      83.4
Others         0.0       0.0       0.0

Examples of clusters
• Tianhe-2 (天河2号), China, 2013
  – Processors: Intel Xeon E5-2692 12C 2.2 GHz ×2 per node, ×16K nodes
  – Accelerator: Xeon Phi 31S1P (57 cores) ×3 per node, ×16K nodes
  – Interconnect: TH Express-2 (proprietary), fat tree
• Tsubame 2.5, Tokyo Tech., 2013
  – Processors: Xeon X5670 2.93 GHz ×2 per node, ×1,408 nodes
  – Accelerator: NVIDIA Kepler K20x ×3 per node, ×1,408 nodes
  – Interconnect: InfiniBand QDR (40 Gbps) ×2, fat tree

Examples of MPPs
• K computer @RIKEN, Fujitsu, 2011
  – Node: SPARC64 VIIIfx 2 GHz (16 GFlops × 8 cores)
  – Topology: 6D mesh/3D torus, Tofu interconnect
  – #core: 80K nodes × 8 cores = 640K cores
  – Rmax: 10.51 PFlops, 7,890 kW
• Titan @ORNL, Cray XK7, 2012
  – Node: AMD Opteron 16C 2.2 GHz + NVIDIA K20x
  – Topology: 3D torus, Gemini interconnect
  – #node: 18,688 nodes (200 cabinets)
  – Performance: 27.11 PFlops (Rpeak), 8,209 kW

Other Networks of Supercomputers
• Sequoia / BG/Q (2011): 5D torus, proprietary IBM interconnect
• Pleiades / NASA (2011): partial 11D hypercube topology with IB QDR/DDR
• Red Sky / Sandia National Lab. (2010): 3D torus (12 bristled nodes) with IB QDR switches
• IBM Roadrunner (2009): fat tree with IB DDR
• Earth Simulator 2 / NEC SX-9E (2009): fat tree (64 GB/s per CPU, 8 CPUs/node, 160 nodes)
• IBM Blue Gene/L (2004): proprietary 3D torus (64 × 32 × 32 = 64K nodes)

Architecture vs. software

Architecture   Memory                     Programming
UMA (SMP)      shared                     OpenMP
NUMA (MPP)     distributed (not shared)   MPI (Message Passing Interface)

Network Design (1/3)
• Performance: latency and throughput (bandwidth)
• Scalability: number of processors vs. network, memory, and I/O bandwidth
• Incremental expandability: from a small configuration up to the maximum size
• Partitionability: the network may be partitioned among several users

Network Design (2/3)
• Simplicity: a simple design allows a higher clock frequency and is easier to use
• Distance span: a physically smaller system is preferred because of noise, cable delay, etc.
• Physical constraints: packaging (pin count), wiring (wire length), and maintenance (power consumption) must stay within physical limits.

Network Design (3/3)
• Reliability: fault tolerance, reliable communication, hot swap
• Expected workload: robust performance over a wide range of traffic conditions
• Cost: trade-offs between cost and performance

Classification of Interconnection Networks
• Shared-Medium Networks
  – Local area networks (Ethernet, token ring)
  – Backplane bus (e.g. Sun Gigaplane)
• Direct Networks (router-based)
  – mesh, torus, hypercube, tree, etc.
• Indirect Networks (switch-based)
• Hybrid Networks

Shared-Medium Networks (LAN)
• Arbitration is needed to determine which node is master of the shared medium and to resolve conflicting access requests.
• The best-known protocol is carrier-sense multiple access with collision detection (CSMA/CD).
• Token bus and token ring pass a token from node to node. Only the token owner has the right to access the bus/ring, which removes the nondeterministic waiting time.

Shared-Medium Networks (Backplane bus)
• It is commonly used to interconnect processor(s) and memory modules to provide an SMP (symmetric multiprocessor) architecture.
• It is realized by printed lines on a circuit board or by discrete wiring.
• Gigaplane in the Sun Enterprise x000 server (1996): 2.6 GB/s, 256-bit data, 42-bit address, 83.8 MHz clock.

Direct (static) Networks
• A direct network consists of a set of nodes.
• Each node is directly connected to a subset of the other nodes in the network.
• Examples:
  – 2D mesh (Intel Paragon), 3D mesh (MIT J-Machine)
  – 2D torus (Fujitsu AP3000), 3D torus (Cray T3D, T3E)
  – Hypercube (CM1, CM2, nCUBE)

Mesh topology
[Figure: 2D mesh and 3D mesh of nodes.]

Torus topology
[Figure: 2D torus (4-ary 2-cube) and 3D torus (3-ary 3-cube).]

Hypercube (binary n-cube)
[Figure: 4D hypercube (2-ary 4-cube).]

Tree
[Figure: binary tree, fat tree, and X-tree.]

Hierarchical topology (1/2)
[Figure: pyramid (hierarchical 2D mesh) and hierarchical ring.]

Hierarchical topology (2/2)
[Figure: cube-connected cycles and RDT (Recursive Diagonal Torus).]

Hypermesh (spanning-bus hypercube)
[Figure: hypermesh built from single or multiple buses.]

Base-m n-cube (hyper-crossbar)
[Figure: base-8 3-cube (Toshiba Prodigy). Nodes labeled 000 to 777 are connected through 8×8 crossbars.]

Diameter and degrees (1/2)

            2D mesh   2D torus   3D torus    Binary n-cube
#nodes      N         N          N           N = 2^n
Diameter    2√N       √N         (3/2)∛N     log2 N
Degree      4         4          6           log2 N

Diameter and degrees (2/2)

            Base-m n-cube   CCC                       Binary tree   Ring
#nodes      N = m^n         N = n·2^n                 N             N
Diameter    log_m N         2n + ⌊n/2⌋ − 2 (n ≥ 4)    2 log2 N      N/2
Degree      log_m N         3                         3             2
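
The diameter and degree entries for the mesh, torus, hypercube, and ring can be checked numerically on small instances. The following is a minimal Python sketch, not part of the original slides: the helper names (grid_neighbors, build_grid, diameter, max_degree) are hypothetical, and the approach (breadth-first search from every node of an explicitly built graph) is just one simple way to measure these values.

```python
from collections import deque
from itertools import product


def grid_neighbors(node, k, wrap):
    """Neighbors of `node` in a k-ary n-dimensional mesh (wrap=False) or torus (wrap=True)."""
    result = set()
    for dim, coord in enumerate(node):
        for step in (-1, 1):
            c = coord + step
            if wrap:
                c %= k              # torus: wrap-around link
            elif not 0 <= c < k:
                continue            # mesh: no link past the edge
            neighbor = node[:dim] + (c,) + node[dim + 1:]
            if neighbor != node:
                result.add(neighbor)
    return result


def build_grid(k, n, wrap):
    """Adjacency map of a k-ary n-cube (torus) or k-ary n-dimensional mesh."""
    return {node: grid_neighbors(node, k, wrap)
            for node in product(range(k), repeat=n)}


def eccentricity(graph, source):
    """Largest hop count from `source` to any other node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(graph):
    """Maximum shortest-path length over all node pairs."""
    return max(eccentricity(graph, s) for s in graph)


def max_degree(graph):
    """Largest number of links attached to any single node."""
    return max(len(neighbors) for neighbors in graph.values())


if __name__ == "__main__":
    topologies = {
        "2D mesh 4x4": build_grid(4, 2, wrap=False),
        "2D torus 4x4": build_grid(4, 2, wrap=True),
        "3D torus 4x4x4": build_grid(4, 3, wrap=True),
        "binary 4-cube": build_grid(2, 4, wrap=True),
        "ring of 8": build_grid(8, 1, wrap=True),
    }
    for name, graph in topologies.items():
        print(name, "N =", len(graph),
              "diameter =", diameter(graph), "degree =", max_degree(graph))
```

For the 4×4 mesh (N = 16) the measured diameter is 6, which is the exact value 2(√N − 1) that the table rounds to 2√N. The 4×4 torus gives √N = 4, the 4×4×4 torus gives (3/2)∛N = 6, the binary 4-cube gives log2 N = 4, and the 8-node ring gives N/2 = 4.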