A Quantitative Approach, 4 th Edition

ネットワークコンピューティング論Ⅱ
平成24年度 後期
火曜 第2時限(10:40-12:10)
吉永 努(UEC)
[email protected]
NC論2
1
内容
• 分散・並列処理計算機における相互結合ネット
ワークとその上でのメッセージ・ルーティング技
法などについて学ぶ
• 資料
http://comp.is.uec.ac.jp/yoshinagalab/yoshinaga/dp2.html
• http://ceng.usc.edu/smart/presentations/archives/Appendi
xE.ppt
(253 slides, 13MB)
• http://booksite.mkp.com/9780123838728/references/appe
ndix_f.pdf (P.118, 2MB)
• TA: 島 圭吾君 [email protected]
NC論2
2
References
• T. M. Pinkston and J. Duato: Interconnection
Networks, Appendix E in Computer Architecture:
A Quantitative Approach, 4th Edition, Morgan
Kaufmann publishers (2006).
• 5th Edition, Morgan Kaufmann publishers (2011).
• J. Duato, S. Yalamanchili, L. Ni: Interconnection
Networks - an Engineering Approach-, 第2版,
Morgan Kaufmann publishers (2003)
• 富田眞治: 並列コンピュータ、昭晃堂(1996)
• W.D. Dally, B. Towles: Principles and Practices
of Interconnection Networks,
Morgan Kaufmann publishers (2003)
3
What is an interconnection
Network?
• It is a programmable system that transports data
between terminals, such as processors and
memory.
• It is programmable in the sense that it makes
different connections at different points.
• It is a system because it is composed of many
components: buffers, channels, switches, and
controls that works together to deliver data.
NC論2
4
Interconnection Network (1/2)
Interconnection Network
P
P
P
M
M
M
Multicomputer
NC論2
5
Interconnection Network (2/2)
P
P
P
Interconnection Network
M
M
M
UMA type shared memory multiprocessor
It is also called dance-hall architecture.
NC論2
6
Trend
• Its performance is increasing with processor
performance at a rate of 50% per year.
• Communication is a limiting factor in the
performance of many modern systems.
• Buses have been unable to keep up with the
bandwidth demand, and point-to-point
interconnection networks are rapidly taking
over.
NC論2
7
Computer Classifications (%)
2012/06
2011/06
2010/06
MPP
18.6
17.4
14.8
Cluster
81.4
82.2
84.8
Others
0.0
0.4
0.4
http://www.top500.org/
share of the TOP500 June, 2012 – June, 2010
NC論2
8
Examples of MPPs
K computer
@RIKEN
Fujitsu
2011
Sequoia@LLNL
IBM BlueGene/Q
2011
processor
Topology
#core
Rmax
SPARC64 VIIIfx
2GHz
(16GFlops×
8 cores)
6D mesh/
3D torus
Tofu interconnect
80K-node x 8-core
= 640K-core
10.51PFlops
7890KW
Power BQC 16C
1.6 GHz
( 16 cores)
5D torus
SeaStar
interconnect
NC論2
1,572,864-core
16.32PFlops
12660KW
9
Examples of clusters
Tianhe-1A
(天河一号A)
China
2010
Tsubame 2.0
Tokyo Tech.
2010
processors
GPU
Interconnect
Intel EM64T
Xeon X5670
2.93 GHz
(11.72 GFlops)
×14,336
Xeon X5670
2.93GHz×1,408
+ Xeon E7520
2GHz×34
NVIDIA
Tesla M2050
(515GFlops)
×7,168
Galaxy
160Gbps/link
(proprietary)
Fat tree
Tesla M2050
Infiniband
(515GFlops) QDR (40Gbps)
×2
×1,048×3
Fat tree
NC論2
10
Other Networks of Supercomputers
• Cray XE6 (2011): 3D torus, proprietary GEMINI link)
• Pleiades / NASA (2011): partial 11D hypercube
topology with IB QDR/DDR
• Red Sky/ Sandia National Lab. (2010):
3D torus (12 bristled node) with IB QDR switches
• IBM Roadrunner (2009): fat-tree with IB DDR
• Earth Simulator2 / NEC SX-9E (2009):
Fat-Tree (64GB/s/cpu, 8-CPU/node, 160 nodes)
• IBM Blue Gene/L (2004): 3D torus proprietary
(64 x 32 x 32 = 64K nodes)
NC論2
11
Architecture vs. software
memory
programming
UMA
(SMP)
shared
OpenMP
NUMA
(MPP)
distributed
(not shared)
MPI
(Message Passing Interface)
NC論2
12
Network Design (1/3)
• Performance: latency and throughput
(bandwidth)
• Scalability: #processors vs. network,
memory, I/O bandwidth
• Incremental expandability:
small to maximum size
• Partitionability: netwrok may be partitioned
for several users
NC論2
13
Network Design (2/3)
• Simplicity: simple design, higher clock
frequency, easy to use
• Distance span: smaller system is preferred
for noise and cable delay, etc.
• Physical constraints: packaging (pin count),
wiring(wire length), and maintenance
(power consumption) should meet physical
limitation.
NC論2
14
Network Design (3/3)
• Reliability: fault tolerant, reliable
communication, hot swap
• Expected workload: robust performance
over a wade range of traffic conditions.
• Cost: trade-offs between cost and
performance.
NC論2
15
Classifiction of Interconnection Networks
• Shared-Medium Networks
– Local area networks (ethernet, token ring)
– Backplane bus (e.g. SUN Gigaplane)
• Direct Networks (router-based)
– mesh, torus, hypercube, tree, … etc.
• Indirect Networks (switch-based)
• Hybrid Networks
NC論2
16
Shared-Medium Networks
(LAN)
• Arbitration that determines the mastership of the
shared-medium network to resolve network access
is needed.
• The most well-known protocol is carrier-sense
multiple access with collision detection
(CSMA/CD).
• Token bus and token ring pass a token from the
owner which has the right to access the bus/ring
and resolve nondeterministic waiting time.
NC論2
17
Shared-Medium Networks (Backplane bus)
• It is commonly used to interconnect processor(s)
and memory modules to provide SMP
(Symmetrical Memory Processor) architecture.
• It is realized by printed lines on a circuit board by
discrete wiring.
• Gigaplane in SUN Enterprise x000 server(1996):
2.6GB/s, 256 bits data, 42 bits address, 83.8MHz
clock.
NC論2
18
Direct (static) Networks
• Consists of a set of nodes.
• Each node is directly connected to a subset
of other nodes in the network.
• Examples:
– 2D mesh (intel Paragon), 3D mesh (MIT J-Mahine)
– 2D torus (Fujitsu AP3000), 3D torus (Cray T3D, T3E)
– Hypercube (CM1, CM2, nCUBE)
NC論2
19
Mesh topology
node
2D
3D
NC論2
20
Torus topology
2D
3D
(4-ary 2-cube)
(3-ary 3-cube)
NC論2
21
Hypercube (binary n-cube)
4D
(2-ary 4-cube)
NC論2
22
tree
Binary tree
fat tree
NC論2
x tree
23
Hierarchical topology (1/2)
Pyramid
Hierarchical ring
(Hierarchical 2D mesh)
NC論2
24
Hierarchical topology (2/2)
Cube-connected cycles
RDT
(Recursive Diagonal Torus)
NC論2
25
Hypermesh
(spaninng-bus hypercube)
Single or
multiple buses
NC論2
26
Base-m n-cube (hyper-crossbar)
770
070
777
077
707
000
007
8x8 crossbar
Base-8 3-cube (Toshiba Prodigy)
NC論2
27
Diameter and degrees (1/2)
2D
mesh
#node
N
2D
torus
N
Diameter
2√N
√N
degree
4
4
NC論2
3D
torus
N
binary
n-cube
N = 2n
√N
log N
6
log N
3
28
Diameter and degrees (2/2)
Base-m
n-cube
#node
N = mn N = n2n
Diameter logm N
degree
CCC
logm N
3n/2
3
NC論2
Binary
tree
N
ring
2log N
N/2
3
2
3
N
29