ネットワークコンピューティング論Ⅱ 平成24年度 後期 火曜 第2時限(10:40-12:10) 吉永 努(UEC) [email protected] NC論2 1 内容 • 分散・並列処理計算機における相互結合ネット ワークとその上でのメッセージ・ルーティング技 法などについて学ぶ • 資料 http://comp.is.uec.ac.jp/yoshinagalab/yoshinaga/dp2.html • http://ceng.usc.edu/smart/presentations/archives/Appendi xE.ppt (253 slides, 13MB) • http://booksite.mkp.com/9780123838728/references/appe ndix_f.pdf (P.118, 2MB) • TA: 島 圭吾君 [email protected] NC論2 2 References • T. M. Pinkston and J. Duato: Interconnection Networks, Appendix E in Computer Architecture: A Quantitative Approach, 4th Edition, Morgan Kaufmann publishers (2006). • 5th Edition, Morgan Kaufmann publishers (2011). • J. Duato, S. Yalamanchili, L. Ni: Interconnection Networks - an Engineering Approach-, 第2版, Morgan Kaufmann publishers (2003) • 富田眞治: 並列コンピュータ、昭晃堂(1996) • W.D. Dally, B. Towles: Principles and Practices of Interconnection Networks, Morgan Kaufmann publishers (2003) 3 What is an interconnection Network? • It is a programmable system that transports data between terminals, such as processors and memory. • It is programmable in the sense that it makes different connections at different points. • It is a system because it is composed of many components: buffers, channels, switches, and controls that works together to deliver data. NC論2 4 Interconnection Network (1/2) Interconnection Network P P P M M M Multicomputer NC論2 5 Interconnection Network (2/2) P P P Interconnection Network M M M UMA type shared memory multiprocessor It is also called dance-hall architecture. NC論2 6 Trend • Its performance is increasing with processor performance at a rate of 50% per year. • Communication is a limiting factor in the performance of many modern systems. • Buses have been unable to keep up with the bandwidth demand, and point-to-point interconnection networks are rapidly taking over. NC論2 7 Computer Classifications (%) 2012/06 2011/06 2010/06 MPP 18.6 17.4 14.8 Cluster 81.4 82.2 84.8 Others 0.0 0.4 0.4 http://www.top500.org/ share of the TOP500 June, 2012 – June, 2010 NC論2 8 Examples of MPPs K computer @RIKEN Fujitsu 2011 Sequoia@LLNL IBM BlueGene/Q 2011 processor Topology #core Rmax SPARC64 VIIIfx 2GHz (16GFlops× 8 cores) 6D mesh/ 3D torus Tofu interconnect 80K-node x 8-core = 640K-core 10.51PFlops 7890KW Power BQC 16C 1.6 GHz ( 16 cores) 5D torus SeaStar interconnect NC論2 1,572,864-core 16.32PFlops 12660KW 9 Examples of clusters Tianhe-1A (天河一号A) China 2010 Tsubame 2.0 Tokyo Tech. 2010 processors GPU Interconnect Intel EM64T Xeon X5670 2.93 GHz (11.72 GFlops) ×14,336 Xeon X5670 2.93GHz×1,408 + Xeon E7520 2GHz×34 NVIDIA Tesla M2050 (515GFlops) ×7,168 Galaxy 160Gbps/link (proprietary) Fat tree Tesla M2050 Infiniband (515GFlops) QDR (40Gbps) ×2 ×1,048×3 Fat tree NC論2 10 Other Networks of Supercomputers • Cray XE6 (2011): 3D torus, proprietary GEMINI link) • Pleiades / NASA (2011): partial 11D hypercube topology with IB QDR/DDR • Red Sky/ Sandia National Lab. (2010): 3D torus (12 bristled node) with IB QDR switches • IBM Roadrunner (2009): fat-tree with IB DDR • Earth Simulator2 / NEC SX-9E (2009): Fat-Tree (64GB/s/cpu, 8-CPU/node, 160 nodes) • IBM Blue Gene/L (2004): 3D torus proprietary (64 x 32 x 32 = 64K nodes) NC論2 11 Architecture vs. software memory programming UMA (SMP) shared OpenMP NUMA (MPP) distributed (not shared) MPI (Message Passing Interface) NC論2 12 Network Design (1/3) • Performance: latency and throughput (bandwidth) • Scalability: #processors vs. network, memory, I/O bandwidth • Incremental expandability: small to maximum size • Partitionability: netwrok may be partitioned for several users NC論2 13 Network Design (2/3) • Simplicity: simple design, higher clock frequency, easy to use • Distance span: smaller system is preferred for noise and cable delay, etc. • Physical constraints: packaging (pin count), wiring(wire length), and maintenance (power consumption) should meet physical limitation. NC論2 14 Network Design (3/3) • Reliability: fault tolerant, reliable communication, hot swap • Expected workload: robust performance over a wade range of traffic conditions. • Cost: trade-offs between cost and performance. NC論2 15 Classifiction of Interconnection Networks • Shared-Medium Networks – Local area networks (ethernet, token ring) – Backplane bus (e.g. SUN Gigaplane) • Direct Networks (router-based) – mesh, torus, hypercube, tree, … etc. • Indirect Networks (switch-based) • Hybrid Networks NC論2 16 Shared-Medium Networks (LAN) • Arbitration that determines the mastership of the shared-medium network to resolve network access is needed. • The most well-known protocol is carrier-sense multiple access with collision detection (CSMA/CD). • Token bus and token ring pass a token from the owner which has the right to access the bus/ring and resolve nondeterministic waiting time. NC論2 17 Shared-Medium Networks (Backplane bus) • It is commonly used to interconnect processor(s) and memory modules to provide SMP (Symmetrical Memory Processor) architecture. • It is realized by printed lines on a circuit board by discrete wiring. • Gigaplane in SUN Enterprise x000 server(1996): 2.6GB/s, 256 bits data, 42 bits address, 83.8MHz clock. NC論2 18 Direct (static) Networks • Consists of a set of nodes. • Each node is directly connected to a subset of other nodes in the network. • Examples: – 2D mesh (intel Paragon), 3D mesh (MIT J-Mahine) – 2D torus (Fujitsu AP3000), 3D torus (Cray T3D, T3E) – Hypercube (CM1, CM2, nCUBE) NC論2 19 Mesh topology node 2D 3D NC論2 20 Torus topology 2D 3D (4-ary 2-cube) (3-ary 3-cube) NC論2 21 Hypercube (binary n-cube) 4D (2-ary 4-cube) NC論2 22 tree Binary tree fat tree NC論2 x tree 23 Hierarchical topology (1/2) Pyramid Hierarchical ring (Hierarchical 2D mesh) NC論2 24 Hierarchical topology (2/2) Cube-connected cycles RDT (Recursive Diagonal Torus) NC論2 25 Hypermesh (spaninng-bus hypercube) Single or multiple buses NC論2 26 Base-m n-cube (hyper-crossbar) 770 070 777 077 707 000 007 8x8 crossbar Base-8 3-cube (Toshiba Prodigy) NC論2 27 Diameter and degrees (1/2) 2D mesh #node N 2D torus N Diameter 2√N √N degree 4 4 NC論2 3D torus N binary n-cube N = 2n √N log N 6 log N 3 28 Diameter and degrees (2/2) Base-m n-cube #node N = mn N = n2n Diameter logm N degree CCC logm N 3n/2 3 NC論2 Binary tree N ring 2log N N/2 3 2 3 N 29
© Copyright 2024 ExpyDoc