Hardware-Independent Advanced Scientific and Technical Computing Codes ...

HPC Middleware on GRID
… as material for discussion in WG5
GeoFEM/RIST
August 2nd, 2001, ACES/GEM at MHPCC
Kihei, Maui, Hawaii
Background
• Various Types of HPC Platforms
– MPP, VPP
– PC Clusters, Distributed Parallel MPPs, SMP Clusters
– 8-Way SMP, 16-Way SMP, 256-Way SMP
– Power, HP-RISC, Alpha/Itanium, Pentium, Vector PE
• Parallel/Single-PE Optimization Is an Important Issue for Efficiency
– Everyone knows that ... but it is a big task, especially for application experts such as the geophysics people in the ACES community.
– Machine-dependent optimization/tuning is required.
• Simulation Methods such as FEM/FDM/BEM/LSM/DEM Have Typical Processes for Computation
• How about "Hiding" these Processes from Users? (see the sketch after this list)
– code development: efficient, reliable, portable, maintenance-free
• the number of lines of source code will be reduced
– accelerates advancement of the applications (= physics)
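A minimal sketch of what "hiding" a typical process could look like. The interface, class names, and storage format below are hypothetical illustrations, not the actual GeoFEM API: the application calls one abstract sparse matrix-vector multiplication, and machine-dependent tuning stays behind the interface.

// Hypothetical sketch: a "typical process" (sparse matrix-vector
// multiplication in CRS storage) exposed through one abstract call.
#include <cstddef>
#include <iostream>
#include <vector>

// Compressed-row storage, the usual sparse format in FEM codes.
struct CRSMatrix {
    std::vector<std::size_t> row_ptr;  // size n_rows + 1
    std::vector<std::size_t> col_idx;  // size nnz
    std::vector<double>      val;      // size nnz
};

// The application sees only this interface.
class SpMV {
public:
    virtual ~SpMV() = default;
    virtual void apply(const CRSMatrix& A,
                       const std::vector<double>& x,
                       std::vector<double>& y) const = 0;
};

// Portable reference back end; a vector-PE or SMP-tuned version would
// implement the same interface without any change to the application.
class GenericSpMV : public SpMV {
public:
    void apply(const CRSMatrix& A,
               const std::vector<double>& x,
               std::vector<double>& y) const override {
        const std::size_t n_rows = A.row_ptr.size() - 1;
        y.assign(n_rows, 0.0);
        for (std::size_t i = 0; i < n_rows; ++i)
            for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
                y[i] += A.val[k] * x[A.col_idx[k]];
    }
};

int main() {
    // 2x2 example: [[2,0],[1,3]] * [1,1] = [2,4]
    CRSMatrix A{{0, 1, 3}, {0, 0, 1}, {2.0, 1.0, 3.0}};
    std::vector<double> x{1.0, 1.0}, y;
    GenericSpMV().apply(A, x, y);
    std::cout << y[0] << " " << y[1] << "\n";
}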
Background (cont.)
• Current GeoFEM provides this environment
– limited to FEM
– not necessarily perfect
• GRID as the next-generation HPC infrastructure
– Middleware and protocols are currently being developed to provide a unified interface to various operating systems, computers, ultra-high-speed networks, and databases.
– What is expected of GRID?
• Meta-computing: simultaneous use of supercomputers around the world
• Volunteer computing: efficient use of idle computers
• Access Grid: a research collaboration environment
• Data-intensive computing: computation with large-scale data
• Grid ASP: application services on the Web
Similar Research Groups
• ALICE (ANL)
• CCA Forum (Common Component Architecture, DOE)
• DOE/ASCI Distributed Computing Research Team
– ESI (Equation Solver Interface Standards)
– FEI (The Finite Element/Equation Solver Interface Specification)
• ADR (Active Data Repository) (NPACI)
Are they successful? It seems not.
• Very limited targets and processes
– mainly optimization of linear solvers
• Where are the interfaces between applications and libraries?
– approach from computer/computational science people
– not really easy for application people to use
Computer/Computational Science: linear solvers, numerical algorithms, parallel programming, optimization
Applications: FEM, FDM, spectral methods, MD, MC, BEM
Example of HPC Middleware (1)
Simulation Methods Include Some Typical Processes
O(N) ab initio MD: sparse matrix multiplication, nonlinear procedure, FFT, Ewald terms
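A hypothetical sketch of this decomposition: one MD time step written only in terms of the typical processes named above. The function names and the State structure are illustrative stubs, not an existing library API.

// Hypothetical stubs for the typical processes of O(N) ab initio MD.
#include <iostream>
#include <vector>

struct State { std::vector<double> density, forces; };

// Each of these would be supplied (and tuned) by the middleware.
void sparse_matrix_multiply(State&) { std::cout << "apply Hamiltonian (sparse mat-vec)\n"; }
void nonlinear_procedure(State&)    { std::cout << "nonlinear (e.g. SCF) iteration\n"; }
void fft_3d(State&)                 { std::cout << "3-D FFT\n"; }
void ewald_terms(State&)            { std::cout << "Ewald long-range terms\n"; }

// One MD time step expressed purely as calls to the typical processes.
void md_time_step(State& s) {
    sparse_matrix_multiply(s);
    nonlinear_procedure(s);
    fft_3d(s);
    ewald_terms(s);
    // ... positions/velocities would be updated from s.forces here
}

int main() {
    State s;
    md_time_step(s);
}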
Example of HPC Middleware (2)
Each Individual Process Could Be Optimized for Various Types of MPP Architectures
O(N) ab initio MD: sparse matrix multiplication, nonlinear procedure, FFT, and Ewald terms, each tunable separately for MPP-A, MPP-B, or MPP-C
Example of HPC Middleware (3)
Use Optimized Libraries
For each target architecture, O(N) ab initio MD calls optimized library versions of the sparse matrix multiplication, nonlinear procedure/algorithm, FFT, and Ewald terms.
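One possible way (a hypothetical sketch, not the actual middleware mechanism) to let the application call one name while an architecture-specific optimized library is resolved underneath:

// Hypothetical registry of optimized FFT back ends, keyed by platform.
#include <functional>
#include <iostream>
#include <map>
#include <string>

using Kernel = std::function<void()>;

int main() {
    std::map<std::string, Kernel> fft_backends = {
        {"vector-pe",  [] { std::cout << "FFT tuned for vector pipelines\n"; }},
        {"smp-node",   [] { std::cout << "FFT tuned for a shared-memory node\n"; }},
        {"pc-cluster", [] { std::cout << "portable FFT fallback\n"; }},
    };

    // In practice this would come from hardware detection or a config file.
    std::string platform = "smp-node";

    // The application only ever asks for "the FFT"; the middleware
    // resolves it to the library optimized for this machine.
    fft_backends.at(platform)();
}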
Example of HPC Middleware (4)
- Optimized code is generated by a special language/compiler based on analysis data and H/W information.
- The optimum algorithm can be adopted.
The special compiler takes the data for the analysis model and the parameters of the H/W, and generates optimized code for the typical processes of O(N) ab initio MD (sparse matrix multiplication, nonlinear procedure, FFT, Ewald terms) for each target machine (MPP-A, MPP-B, MPP-C).
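A hypothetical sketch of the two inputs such a special compiler would consume (analysis-model data and H/W parameters) and of a toy rule that picks a kernel variant from them; the structures, names, and thresholds are illustrative only.

#include <iostream>
#include <string>

// Illustrative descriptions of the two inputs named on the slide.
struct ModelData    { long n_rows; double avg_nonzeros_per_row; };
struct HardwareInfo { std::string kind; long cache_bytes; };

// Toy decision rule standing in for the code-generation step.
std::string choose_spmv_variant(const ModelData& m, const HardwareInfo& hw) {
    if (hw.kind == "vector")
        return "long-vector CRS kernel";
    const double matrix_bytes = m.n_rows * m.avg_nonzeros_per_row * 8.0;
    if (matrix_bytes < static_cast<double>(hw.cache_bytes))
        return "cache-resident blocked kernel";
    return "general scalar kernel";
}

int main() {
    ModelData    model{1000000, 27.0};
    HardwareInfo mpp_a{"vector", 0};
    HardwareInfo mpp_b{"scalar-smp", 8L * 1024 * 1024};
    std::cout << "MPP-A: " << choose_spmv_variant(model, mpp_a) << "\n";
    std::cout << "MPP-B: " << choose_spmv_variant(model, mpp_b) << "\n";
}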
Example of HPC Middleware (5)
- On network-connected H/W (meta-computing)
- Optimized for each individual architecture
- Optimum load balancing
The analysis model space of the O(N) ab initio MD simulation is distributed across the network-connected machines.
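A hypothetical sketch of the load-balancing idea on this slide: split the analysis model space across the networked machines in proportion to their sustained performance. The machine names and GFLOPS figures are made up for illustration.

#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct Machine { std::string name; double gflops; };  // measured sustained speed

// Give each machine a share of elements proportional to its performance.
std::vector<long> split_elements(long n_elements, const std::vector<Machine>& ms) {
    double total = 0.0;
    for (const auto& m : ms) total += m.gflops;
    std::vector<long> share(ms.size());
    long assigned = 0;
    for (std::size_t i = 0; i + 1 < ms.size(); ++i) {
        share[i] = static_cast<long>(n_elements * (ms[i].gflops / total));
        assigned += share[i];
    }
    share.back() = n_elements - assigned;  // remainder to the last machine
    return share;
}

int main() {
    std::vector<Machine> ms{{"MPP-A", 400.0}, {"MPP-B", 250.0}, {"MPP-C", 100.0}};
    std::vector<long> share = split_elements(3000000, ms);
    for (std::size_t i = 0; i < ms.size(); ++i)
        std::cout << ms[i].name << ": " << share[i] << " elements\n";
}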
Example of HPC Middleware (6)
Multi-Module Coupling through the Platform
Modules such as ab initio MD, classical MD, and FEM are coupled through the HPC Platform/Middleware, which provides modeling, data assimilation, load balancing, visualization, optimization, and resource management.
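A hypothetical sketch of such coupling: modules publish and fetch named fields through the platform layer. The Platform class and field names here are illustrative, not a real API.

#include <iostream>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for the HPC Platform/Middleware layer.
class Platform {
    std::map<std::string, std::vector<double>> fields_;
public:
    void publish(const std::string& name, std::vector<double> data) {
        fields_[name] = std::move(data);  // e.g. written by the MD module
    }
    const std::vector<double>& fetch(const std::string& name) const {
        return fields_.at(name);          // e.g. read by the FEM module
    }
};

int main() {
    Platform pf;
    pf.publish("md_atomic_stress", {1.0, 2.0, 3.0, 4.0, 5.0, 6.0});
    const std::vector<double>& stress = pf.fetch("md_atomic_stress");
    std::cout << "FEM module received " << stress.size() << " stress components\n";
}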
PETAFLOPS on GRID
from GeoFEM's Point of View
• Why? When?
– Datasets (mesh, observation, result) could be distributed.
– The problem size could be too large for a single MPP system.
• according to G. C. Fox, S(TOP500) is about 100 TFLOPS now ...
• Legion
– Prof. Grimshaw (U. Virginia)
– Grid OS, Global OS
– Can handle MPPs connected through a network as one huge MPP (= Super MPP)
– Optimization on each individual architecture (H/W)
– Load balancing according to machine performance and resource availability
PETAFLOPS on GRID (cont.)
• GRID + (OS) + HPC MW/PF
• Environment for "Electronic Collaboration"
"Parallel" FEM Procedure
Pre-Processing: initial mesh data, partitioning
Main: data input/output, matrix assembly, linear solvers, domain-specific algorithms/models
Post-Processing: post processing, visualization
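A hypothetical stub skeleton of the parallel FEM procedure above, showing only the calling order of the three stages; none of the function names are from GeoFEM.

#include <iostream>

// Pre-processing
void read_initial_mesh()   { std::cout << "read initial mesh data\n"; }
void partition_mesh()      { std::cout << "partition the mesh into subdomains\n"; }

// Main analysis
void input_local_data()    { std::cout << "each process reads its local data\n"; }
void assemble_matrix()     { std::cout << "assemble local stiffness matrices\n"; }
void solve_linear_system() { std::cout << "parallel linear solver\n"; }

// Post-processing
void post_process()        { std::cout << "post processing\n"; }
void visualize()           { std::cout << "parallel visualization\n"; }

int main() {
    read_initial_mesh();
    partition_mesh();
    // Domain-specific algorithms/models plug into the main stage below.
    input_local_data();
    assemble_matrix();
    solve_linear_system();
    post_process();
    visualize();
}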