WTEC Panel on
High End Computing in Japan
Site visits: March 29 - April 3, 2004
Study Commissioned By:
National Coordination Office
Department of Energy
National Science Foundation
National Aeronautics and Space Administration
WTEC Overview
• Provides assessments of research and development
• This was one of 55 international technology assessments done by WTEC
• WTEC process:
   Write proposals for NSF “umbrella” grants
   Put together a coalition of sponsors
   Recruit a panel of experts
   Conduct the study with on-site visits
   Publish a report
   Full text reports at wtec.org
Purpose & Scope of this Study
• Gather information on the current status and future trends in Japanese high end computing
   Government agencies, research communities, vendors
   Focus on long-term HEC research in Japan
• Compare Japanese and U.S. HEC R&D
• Provide a review of the Earth Simulator (ES) development process and operational experience
   Include user experience and its impact on the computer science and computational science communities
   Report on follow-on projects
• Determine HEC areas amenable to Japan-U.S. cooperation to accelerate future advances
WTEC HEC Panel Members
• Al Trivelpiece (Panel Chair): Former Director, Oak Ridge National Laboratory
• Peter Paul: Deputy Director, S&T, Brookhaven National Laboratory
• Rupak Biswas: Group Lead, NAS Division, NASA Ames Research Center
• Kathy Yelick: Computer Science Professor, University of California, Berkeley
• Jack Dongarra: Director, Innovative Computing Lab, University of Tennessee & Oak Ridge National Laboratory
• Horst Simon (Advisor): Director, NERSC, Lawrence Berkeley National Lab
• Dan Reed (Advisor): Computer Science Professor, University of North Carolina, Chapel Hill
• Praveen Chaudhari (Advisor): Director, Brookhaven National Laboratory
Sites Visited (1)
1. Earth Simulator Center
2. Frontier Research System for Global Change
3. National Institute for Fusion Science (NIFS)
4. Japan Aerospace Exploration Agency (JAXA)
5. University of Tokyo
6. Tokyo Institute of Technology
7. National Institute of Advanced Industrial S&T (AIST)
8. High Energy Accelerator Research Org. (KEK)
9. Tsukuba University
10. Inst. of Physical and Chemical Research (RIKEN)
11. National Research Grid Initiative (NAREGI)
12. Research Org. for Information Sci. & Tech. (RIST)
13. Japan Atomic Energy Research Institute (JAERI)
Sites Visited (2)
14. Council for Science and Technology Policy (CSTP)
15. Ministry of Education, Culture, Sports, Science, and
Technology (MEXT)
16. Ministry of Economy, Trade, and Industry (METI)
17. Fujitsu
18. Hitachi
19. IBM-Japan
20. Sony Computer Entertainment Inc. (SCEI)
21. NEC
HEC Business and Government
Environment in Japan
Government Agencies
• Council for Science & Technology Policy (CSTP)
   Cabinet Office; the Prime Minister presides over its monthly meetings
   Sets strategic directions for S&T
   Rates proposals submitted to MEXT, METI, and others
• Ministry of Education, Culture, Sports, Science, and Technology (MEXT)
   Funds most S&T R&D activities in Japan
   Funded the Earth Simulator
• Ministry of Economy, Trade, & Industry (METI)
   Administers industrial policy
   Funds R&D projects with ties to industry
   Not interested in HEC, except for grids
Business and Government
• New Independent Administrative Institution (IAI) model
   Some research institutes had already converted
   Universities were being converted during our visit
   The government funds each institution as a whole; institutions control their own budgets
   Funding is also being cut annually
• Commercial viability of vector supercomputers is problematic
   Only NEC is still committed to this architectural model
• Commodity PC clusters are increasingly prevalent
   All three Japanese vendors have cluster products
Business Partnerships
• Each of the Japanese vendors is partnered with a US vendor
   NEC and Cray (?)
   Fujitsu and Sun Microsystems
   Hitachi and IBM
HEC Hardware in Japan
Architecture/Systems Continuum (loosely coupled → tightly coupled)
• Commodity processor with commodity interconnect
   Clusters
    • Pentium, Itanium, Opteron, Alpha, PowerPC
    • GigE, InfiniBand, Myrinet, Quadrics, SCI
   NEC TX7
   Fujitsu IA-Cluster
• Commodity processor with custom interconnect
   SGI Altix (Intel Itanium 2)
   Cray Red Storm (AMD Opteron)
   Fujitsu PRIMEPOWER (SPARC based)
• Custom processor with custom interconnect
   Cray X1
   NEC SX-7
   Hitachi SR11000
Fujitsu PRIMEPOWER HPC2500
[System architecture diagram]
• Peak performance (128 nodes): 85 Tflop/s
• High-speed optical interconnect: 4 GB/s x 4 per node, up to 128 nodes
• SMP nodes of 8-128 CPUs; a crossbar network provides uniform memory access within a node
• 1.3 GHz SPARC-based CPUs: 5.2 Gflop/s per processor, 41.6 Gflop/s per system board, 666 Gflop/s per node (16 system boards per node; see the arithmetic sketch below)
• 8.36 GB/s per system board, 133 GB/s total; DTU (Data Transfer Unit) boards link the system boards to the optical interconnect, and Channel Adapters/PCIBOX connect to I/O devices
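The quoted peaks follow directly from the per-CPU rate and the CPU counts on this slide. A minimal sketch of that arithmetic (all figures taken from the slide; nothing else assumed):

```python
# Peak-performance arithmetic for the PRIMEPOWER HPC2500, using only
# the figures quoted on this slide.
GFLOPS_PER_CPU = 5.2       # 1.3 GHz SPARC-based CPU
CPUS_PER_BOARD = 8         # 41.6 / 5.2
BOARDS_PER_NODE = 16       # "System Board x16"
NODES = 128                # maximum configuration

board = GFLOPS_PER_CPU * CPUS_PER_BOARD        # 41.6 Gflop/s
node = board * BOARDS_PER_NODE                 # 665.6 Gflop/s (~666)
system = node * NODES / 1000.0                 # ~85.2 Tflop/s
print(f"board {board:.1f} Gflop/s, node {node:.1f} Gflop/s, system {system:.1f} Tflop/s")
```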
Fujitsu IA-Cluster: System Configuration
System Configuration
• Control node: FUJITSU PRIMERGY (1U)
• Compute nodes
   PRIMERGY BX300: max. 20 blades in a 3U chassis
   PRIMERGY RXI600: IPF (1.5 GHz), 2-4 CPUs
• Compute network: InfiniBand or Myrinet switch
• Control network: Gigabit Ethernet switch
Latest Installation of FUJITSU HPC Systems
• Japan Aerospace Exploration Agency (JAXA): PRIMEPOWER 128CPU x 14 cabinets (9.3 Tflop/s)
• Japan Atomic Energy Research Institute (ITBL Computer System): PRIMEPOWER 128CPU x 4 + 64CPU (3 Tflop/s)
• Kyoto University: PRIMEPOWER 128CPU (1.5 GHz) x 11 + 64CPU (8.8 Tflop/s)
• Kyoto University (Radio Science Center for Space and Atmosphere): PRIMEPOWER 128CPU + 32CPU
• Kyoto University (Grid System): PRIMEPOWER 96CPU
• Nagoya University (Grid System): PRIMEPOWER 32CPU x 2
• National Astronomical Observatory of Japan (SUBARU Telescope System): PRIMEPOWER 128CPU x 2
• Japan Nuclear Cycle Development Institute: PRIMEPOWER 128CPU x 3
• Institute of Physical and Chemical Research (RIKEN): IA-Cluster (Xeon 2048CPU) with InfiniBand & Myrinet (8.7 Tflop/s)
• National Institute of Informatics (NAREGI System): IA-Cluster (Xeon 256CPU) with InfiniBand + PRIMEPOWER 64CPU
• Tokyo University (The Institute of Medical Science): IA-Cluster (Xeon 64CPU) with Myrinet + PRIMEPOWER 26CPU x 2
• Osaka University (Institute of Protein Research): IA-Cluster (Xeon 160CPU) with InfiniBand
HITACHI’s HPC system
[Timeline chart: peak performance (GFLOPS, 0.01 to 100,000) of Hitachi HPC systems, 1977-2005]
• Integrated Array Processor (IAP) systems: M-200H IAP, M-280H IAP, M-680 IAP
• Vector supercomputers (VOS3/HAP, HI-OSF/1-MJ; automatic vectorization): S-810 (the first Japanese vector supercomputer), S-820 (single-CPU peak 3 Gflop/s), S-3600, S-3800 (single-CPU peak 8 Gflop/s, fastest in the world at the time)
• Scalar parallel (HI-UX/MPP; automatic parallelization, MPP type): SR2201, the first commercially available distributed-memory parallel processor
• Vector-scalar combined (automatic pseudo vectorization): SR8000, the first HPC machine combining vector and scalar processing, and SR11000 (POWER4+, AIX 5L)
SR8000 Pseudo Vector Processing (PVP)
[Diagram: a vector unit (arithmetic unit fed by vector registers loaded from main memory) compared with the PVP-equipped scalar unit (arithmetic unit fed by floating-point registers (FPRs) via preload, and by the cache via prefetch), both pipelined]
• Problems of conventional RISC
   Performance drops on large-scale simulations because of cache overflow
   Sustained performance: under 10% of peak
• PVP features
   Prefetch (S/W controlled): read data from main memory into cache before calculation; accelerates sequential data access
   Preload (S/W and H/W controlled): read data from main memory into floating-point registers before calculation; accelerates strided and indirectly addressed memory access
Hitachi SR11000
• Based on the IBM POWER4+; SMP nodes with 16 processors
• 109 Gflop/s per node (6.8 Gflop/s per processor); IBM uses 32 processors per node in its own machines
• IBM Federation switch
   Hitachi uses 6 planes for its 16-processor nodes
   IBM uses 8 planes for its 32-processor nodes
• Pseudo vector processing features
   Minimal hardware enhancements
   Fast synchronization
   No preload, unlike the SR8000
• Hitachi's compiler effort is separate from IBM's
   Automatic vectorization; no plans for HPF
• Three customers for the SR11000 so far:
   National Institute for Materials Science, Tsukuba: 64 nodes (7 Tflop/s)
   Okazaki Institute for Molecular Science: 50 nodes (5.5 Tflop/s)
   Institute of Statistical Mathematics: 4 nodes
SR11000 Pseudo Vector Processing (PVP)
[Diagram: as on the SR8000 slide, a vector unit is compared with the PVP-equipped scalar unit; on the SR11000, data is prefetched into cache under both hardware and software control, and there is no preload into the floating-point registers (FPRs)]
• Problems of conventional RISC
   Performance drops on large-scale simulations because of cache overflow
   Sustained performance: under 10% of peak
• Prefetch (H/W and S/W controlled): read data from main memory into cache before calculation; accelerates sequential data access
SR11000 Next Model
• Continuing IBM partnership
• Power5 processor
• Greatly enhanced memory bandwidth
- Flat Memory Interleaving
• Hardware Barrier Synchronisation Register
NEC HPC Products
[Product positioning chart]
• High-end capability computing: SX-6/7 Series parallel vector processors
• Middle to small-size capacity computing: TX7 Series IA-64 servers (Express5800/1160Xa)
• Parallel PC clusters: Express5800 Parallel Linux Cluster
• IA-32 workstations: Express5800/50 Series
TX7 Itanium 2 Server
• Up to 32 Itanium 2 processors
• Up to 128 GB of RAM
• Linux operating system with NEC enhancements
• More than 100 Gflop/s on Linpack
• File server functionality for the SX series
• cc-NUMA architecture
• Employs a chipset and crossbar switch developed in-house by NEC
• Achieves near-uniform high-speed memory access
SX-Series Evolution
[Timeline: "the latest technology, always in the SX Series"]
• 1983: SX Series, the first computer in the world surpassing 1 Gflop/s
• 1989: SX-3 Series, shared-memory multi-function processor; UNIX OS
• 1994: SX-4 Series, innovative CMOS technology; entirely air-cooled
• 1998: SX-5 Series, high sustained performance; large-capacity shared memory
• 2001: SX-6 Series, single-chip vector processor; greater scalability
• Next-generation SX to follow
NEC SX-7/160M5
• Total memory: 1280 GB
• Peak performance: 1412 Gflop/s
• Number of nodes: 5
• PEs per node: 32
• Memory per node: 256 GB
• Peak performance per PE: 8.83 Gflop/s
• Vector pipes per PE: 4
• Data transfer rate between nodes: 8 GB/s
• For comparison, SX-6: 8 processors per node, 8 Gflop/s per processor, 16 GB per node
• SX-7: 32 processors per node, 8.825 Gflop/s per processor, 256 GB per node
• Rumors of an SX-8: 26 Gflop/s per processor, 8 CPUs per node
Special Purpose: GRAPE-6
• The 6th generation of the GRAPE (Gravity Pipe) project
• Gravity (N-body) calculation for many particles, at 31 Gflop/s per chip
• 32 chips per board: 0.99 Tflop/s per board
• The full system of 64 boards is installed at the University of Tokyo: 63 Tflop/s
• On each board, all particle data are placed in SRAM memory; each target particle is injected into the pipeline and its acceleration is computed (a sketch of this kernel follows)
   No software!
• Gordon Bell Prize at SC for a number of years (Prof. Makino, U. Tokyo)
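For context, the computation GRAPE-6 hardwires is the direct-summation gravitational acceleration on every target particle. A minimal Python sketch of that O(N^2) kernel (the softening parameter and G = 1 units are illustrative choices, not from the slide):

```python
import numpy as np

def accelerations(pos, mass, eps=1e-4):
    """Direct-summation gravitational acceleration (G = 1, softened)."""
    acc = np.zeros_like(pos)
    for i in range(len(pos)):          # each "target" particle, as in the GRAPE pipeline
        d = pos - pos[i]               # separation vectors to every particle
        r2 = (d * d).sum(axis=1) + eps**2
        # the j == i term contributes zero because d[i] is the zero vector
        acc[i] = (mass[:, None] * d / r2[:, None] ** 1.5).sum(axis=0)
    return acc

# Example: 1,000 randomly placed unit-mass particles
rng = np.random.default_rng(0)
a = accelerations(rng.standard_normal((1000, 3)), np.ones(1000))
print(a.shape)   # (1000, 3)
```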
Sony PlayStation2
• Emotion Engine: 6 Gflop/s peak
   Superscalar MIPS core at 300 MHz + vector coprocessor + graphics/DRAM
   About $200; 70M units sold (the PS1 sold 100M)
• 8 KB D-cache; 32 MB of memory, not expandable (the OS lives there as well)
• 32-bit floating point, not IEEE compliant
• 2.4 GB/s to memory (0.38 B/flop)
• Potential 20 floating-point ops per cycle (see the worked estimate below):
   FPU with FMAC + FDIV
   VPU1 with 4 FMAC + FDIV
   VPU2 with 4 FMAC + FDIV
   EFU with FMAC + FDIV
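The 20 ops/cycle figure is consistent with counting each FMAC as a fused multiply-add worth 2 flops and ignoring the divide units (that counting convention is an assumption, not stated on the slide): 2 (FPU) + 8 (VPU1, 4 FMAC) + 8 (VPU2, 4 FMAC) + 2 (EFU) = 20 flops/cycle, and 20 × 0.3 GHz = 6 Gflop/s, matching the quoted peak.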
High-Performance Chips for Embedded Applications
• The driving market is gaming (PCs and game consoles)
   Motivation for almost all of these technology developments
   Demonstrates that arithmetic is quite cheap
• Today there are three big problems with these apparently nonstandard "off-the-shelf" chips:
   Most have very limited memory bandwidth and little if any support for inter-node communication; integer or only 32-bit floating point
   No software support to map scientific applications onto these processors; minimal general-purpose programming tools
   Poor memory capacity for program storage
• It is not clear that they do much for scientific computing
   Developing "custom" software is much more expensive than developing custom hardware
TOP500 Data
[Charts: percentage of total Top500 performance (summed Rmax) held by the US and Japan, and sum of Rmax over time, June 1993 through November 2003; series shown: All, USA, Japan, Other]
Top 20 Computers: Where They Are Located
(Host-country legend: USA, Japan, Germany, UK, Canada, France, China)
Rank J-93 N-93 J-94 N-94 J-95 N-95 J-96 N-96 J-97 N-97 J-98 N-98 J-99 N-99 J-00 N-00 J-01 N-01 J-02 N-02 J-03 N-03 J-04
1 TMC Fujitsu Intel Fujitsu Fujitsu Fujitsu Hitachi Hitachi/T Intel Intel Intel Intel Intel Intel Intel IBM IBM IBM NEC NEC NEC NEC NEC
2 TMC TMC Fujitsu Intel Intel Intel Fujitsu Fujitsu Hitachi/T CraySGI CraySGI CraySGI SGI IBM IBM Intel IBM HP IBM HP HP HP Cal DC
3 TMC TMC TMC TMC Intel Intel Intel Hitachi Fujitsu CraySGI CraySGI CraySGI CraySGI SGI SGI IBM Intel IBM HP HP Linux NQ Self made HP
4 TMC TMC TMC TMC CraySGI CraySGI Intel Intel Hitachi Hitachi/T CraySGI SGI Hitachi CraySGI IBM SGI IBM Intel HP IBM IBM Dell IBM/LLNL
5 NEC TMC Fujitsu Fujitsu Fujitsu Fujitsu Intel Intel CraySGI CraySGI CraySGI CraySGI CraySGI Hitachi Hitachi IBM Hitachi IBM IBM Linux NQ IBM HP Dell
6 NEC NEC Fujitsu Fujitsu TMC IBM CraySGI Intel CraySGI Hitachi Hitachi/T IBM SGI CraySGI Hitachi IBM SGI HP HP HP IBM/Q Linux N IBM
7 TMC NEC Fujitsu Fujitsu Fujitsu IBM Fujitsu CraySGI CraySGI Fujitsu CraySGI CraySGI CraySGI SGI Cray Hitachi IBM Hitachi Intel HP Fujitsu Linux NQ Fujitsu
8 Intel Intel TMC TMC TMC IBM IBM Fujitsu CraySGI Fujitsu CraySGI IBM IBM CraySGI Cray IBM NEC SGI IBM HPTi HP IBM IBM/ LLNL
9 CraySGI Intel TMC TMC CraySGI NEC IBM Fujitsu CraySGI CraySGI CraySGI CraySGI CraySGI CraySGI Hitachi Hitachi IBM IBM IBM IBM HP IBM HP
10 CraySGI TMC Hitachi Hitachi CraySGI TMC NEC Fujitsu CraySGI CraySGI CraySGI CraySGI CraySGI IBM Cray Cray IBM IBM IBM IBM HP IBM/Q Dawning
11 CraySGI TMC Hitachi Hitachi IBM Fujitsu NEC CraySGI IBM CraySGI CraySGI CraySGI IBM CraySGI IBM Cray Cray IBM IBM IBM HPTi Fujitsu Linux N
12 CraySGI TMC NEC CraySGI IBM Fujitsu IBM CraySGI Intel CraySGI NEC CraySGI Hitachi IBM SGI Fujitsu Hitachi NEC IBM IBM IBM HP Linux NQ
13 CraySGI Intel NEC CraySGI IBM TMC TMC CraySGI CraySGI CraySGI NEC CraySGI CraySGI CraySGI Cray Hitachi IBM IBM Hitachi IBM IBM IBM IBM
14 CraySGI CraySGI CraySGI CraySGI Fujitsu Fujitsu Fujitsu IBM CraySGI CraySGI Hitachi Hitachi/T CraySGI CraySGI Cray Cray Hitachi IBM Hitachi IBM IBM lenovo IBM
15 CraySGI CraySGI CraySGI NEC Fujitsu Fujitsu Fujitsu IBM Intel CraySGI Fujitsu CraySGI CraySGI Fujitsu Cray IBM Cray Cray SGI Intel IBM HP IBM
16 CraySGI CraySGI CraySGI NEC Fujitsu CraySGI TMC IBM CraySGI CraySGI CraySGI CraySGI CraySGI IBM IBM IBM Fujitsu IBM IBM IBM IBM IBM IBM/Q
17 CraySGI CraySGI Hitachi NEC Intel CraySGI Fujitsu NEC Fujitsu IBM CraySGI CraySGI CraySGI Hitachi Hitachi IBM Hitachi Hitachi IBM Atipa Intel HPTi IBM
18 CraySGI CraySGI NEC Hitachi CraySGI CraySGI Fujitsu NEC Fujitsu Intel CraySGI NEC Hitachi/T CraySGI Cray Hitachi IBM IBM IBM HP IBM IBM IBM
19 CraySGI CraySGI NEC NEC TMC Fujitsu CraySGI NEC Intel CraySGI Fujitsu NEC CraySGI CraySGI IBM SGI Cray Hitachi NEC IBM Atipa Cray IBM
20 CraySGI CraySGI Intel NEC TMC Fujitsu CraySGI NEC CraySGI CraySGI CraySGI CraySGI IBM CraySGI Cray IBM IBM Cray IBM IBM HP Cray Cray
Efficiency Is Declining Over Time
• Analysis of the top 100 machines in 1994 and 2004
• Shows the number of machines in the top 100 that achieve a given efficiency on the Linpack benchmark (see the sketch after this slide)
• In 1994, 40 machines had >90% efficiency
• In 2004, 50 machines have <50% efficiency
[Histogram: number of machines in the top 100 versus Linpack efficiency (%), for 1994 and 2004]
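Efficiency here is the usual Linpack ratio Rmax/Rpeak from the TOP500 list. A minimal sketch of the tally behind the histogram (the three Rmax/Rpeak pairs are illustrative, not actual TOP500 entries):

```python
# Linpack efficiency = Rmax / Rpeak, tallied the way the histogram above is built.
machines = [               # (Rmax, Rpeak) in Gflop/s -- illustrative values
    (35860, 40960),        # a vector system: ~88% efficient
    (13880, 22938),        # a tightly coupled cluster: ~61% efficient
    (2004, 6758),          # a loosely coupled commodity cluster: ~30% efficient
]

def efficiency(rmax, rpeak):
    return 100.0 * rmax / rpeak

over_90 = sum(1 for r, p in machines if efficiency(r, p) > 90)
under_50 = sum(1 for r, p in machines if efficiency(r, p) < 50)
print(f">90% efficient: {over_90} machines, <50% efficient: {under_50} machines")
```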
ESS Impact on Climate Modeling
• NERSC IBM SP3: 1 simulated year per compute day on 112 processors
• ORNL/NCAR IBM SP4: ~2 simulated years per compute day on 96 processors
• ORNL/NCAR IBM SP4: 3 simulated years per compute day on 192 processors
• ESS: 40 simulated years per compute day on an unknown number of processors (probably ~128)
• Cray X1 (rumor): 14 simulated years per compute day on 128 processors
Source: Michael Wehner
Technology Transfer from Research
• Government projects encouraged new architectures.
• New technologies were commercialized:
   Numerical Wind Tunnel → Fujitsu VPP500
   CP-PACS → Hitachi SR2201
   Earth Simulator → NEC SX-6
   GRAPE, MDM, eHPC, … → ? (MD-engine)
Hardware Summary
• The commercial viability of "traditional" supercomputing architectures with vector processors and high-bandwidth memory subsystems is problematic.
   NEC is the only vendor remaining in Japan
• Clusters are replacing traditional high-bandwidth systems
HEC Software in Japan
Software Overview
• Emphasis on vendor software
 Fujitsu, Hitachi, NEC
 Earth Simulator software
• Languages and compilers
 Persistent effort in High Performance Fortran
 Including HPF/JA extensions
• Use of common libraries
 Little academic work for supercomputers: vendors supply tools
 Support for clusters
Achievements of HPF on the Earth Simulator
• PFES: an oceanic general circulation model based on the Princeton Ocean Model
   Achieved 9.85 Tflop/s on 376 nodes (41% of peak performance)
• Impact3D: a plasma fluid code using a Total Variation Diminishing (TVD) scheme
   Achieved 14.9 Tflop/s on 512 nodes (45% of peak performance; a quick consistency check follows this slide)
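These percentages are consistent with the Earth Simulator's 64 Gflop/s per node (8 vector processors at 8 Gflop/s each); that node figure is background knowledge about the ES, not stated on this slide. A quick check:

```python
# Percent of peak for the two HPF codes, assuming 64 Gflop/s per Earth Simulator node.
NODE_PEAK_GFLOPS = 64.0

for name, tflops, nodes in [("PFES", 9.85, 376), ("Impact3D", 14.9, 512)]:
    peak_tflops = nodes * NODE_PEAK_GFLOPS / 1000.0
    print(f"{name}: {100.0 * tflops / peak_tflops:.0f}% of {peak_tflops:.1f} Tflop/s peak")
# -> PFES: 41% of 24.1 Tflop/s peak; Impact3D: 45% of 32.8 Tflop/s peak
```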
HPF/JA Extensions
• HPF research in language and compilers
• HPF 2.0 extends HPF 1.0 for irregular apps
• HPF/JA further extends HPF for performance
 REFLECT: placement of near-neighbor communication
 LOCAL: communication not needed for a scope
 Extended ON HOME: partial computation replication
• With these directives, the compiler does not need full interprocedural communication and availability analyses
• HPF/JA was a consortium effort by vendors
 NEC, Hitachi, Fujitsu
Vectorization and Parallelization on the Earth Simulator (NEC)
[Diagram: Earth Simulator nodes, each with 8 arithmetic processors (APs) sharing main memory, connected by the interconnection network]
• Inter-node parallelization: HPF or MPI across nodes
• Intra-node parallelization: OpenMP or automatic parallelization over the node's shared memory
• Vectorization within each processor
Hitachi: Automatic Vectorization = COMPAS + PVP
[Diagram: nested DO loops (DO i=1,l / DO j=1,m / DO k=1,n) mapped onto a node of instruction processors (IPs)]
• Inter-node parallelization with parallel libraries (HPF, MPI, PVM, etc.)
• Intra-node element-wise parallel processing with COMPAS (CO-operative Micro-Processors in single Address Space): automatic parallelization
• Vector processing within each IP with PVP (Pseudo Vector Processing): automatic pseudo vectorization of the inner DO loop
Conclusions
• Longer sustained effort on HPF than in the US
   Part of the Earth Simulator vision
   Successful on two of the large codes, including a Gordon Bell prize winner
   Language extensions were also needed
• MPI is the dominant model for inter-node communication
   Although larger vector/parallel nodes mean a smaller degree of MPI parallelism
   Combined with automatic vectorization within nodes
• Other familiar tools were developed outside Japan: numerical libraries, debuggers, etc.
Grid Computing in Japan
Kathy Yelick
U.C. Berkeley
and
Lawrence Berkeley National Laboratory
Outline
• Motivation for Grid Computing in Japan
 E-Business, E-Government, Science
• Summary of grid efforts
 Labs, Universities
• Grid Research Contributions
 Hardware
 Middleware
 Applications
• Funding summary
Grid Motivation
• e-Japan: create a "knowledge-emergent society," where everyone can utilize IT
• In 2001, Japan's internet usage was at the lowest level among the major industrialized nations
• Four strategies to address this:
 Ultra high speed network infrastructure
 Facilitate electronic commerce
 Realize electronic government
• Key is information sharing across agencies and society
 Nurturing high quality human resources
• Training, support of researchers, etc.
Overview of Grid Projects in Japan
• Super-SINET (NII)
• National Research Grid Initiative (NAREGI)
• Campus Grid (Titech)
• Grid Technology Research Center (AIST)
• Information Technology Based Lab (ITBL)
• Applications:
 VizGrid (JAIST)
 BioGrid (Osaka-U)
 Japan Virtual Observatory (JVO)
SuperSINET: All Optical
Production Research Network
• Operational since January 2002
• 10 Gbps photonic backbone
• GbE bridges for peer connections
• 6,000+ km of dark fiber
• 100+ end-to-end lambdas and 300+ Gb/s of total capacity
NAREGI:
National Research Grid Initiative
• Funded by MEXT: Ministry of Education, Culture, Sports, Science and Technology
• 5-year project (FY2003-FY2007)
• ¥2B (~$17M) budget in FY2003
• Collaboration of national labs, universities, and industry in the R&D activities
• Applications in IT and nano-science
• Acquisition of computer resources underway
NAREGI Goals
1. Develop a grid software system
    R&D in grid middleware and the upper layers
    Prototype for a future grid infrastructure for scientific research in Japan
2. Provide a testbed
    100+ Tflop/s expected by 2007
    Demonstrate that a high-end grid computing environment can be applied to nano-science simulations over Super-SINET
3. Participate in international collaboration
    U.S., Europe, Asia-Pacific
4. Contribute to standards activities, e.g., GGF
NAREGI Phase 1 Testbed
[Testbed diagram: ~3000 CPUs, ~17 Tflop/s in total, connected over Super-SINET (10 Gbps)]
• Center for Grid R&D (NII): ~5 Tflop/s
• Computational Nano-science Center (IMS): ~10 Tflop/s
• TiTech Campus Grid, Osaka Univ. BioGrid, AIST SuperCluster
• Small test application clusters at Tohoku Univ., AIST, KEK, Kyoto Univ., Kyushu Univ., and ISSP
AIST Super Cluster for Grid R&D
• P32: IBM eServer 325, Opteron 2.0 GHz, 6 GB, 2-way x 1074 nodes, Myrinet 2000: 8.59 Tflop/s peak
• M64: Intel Tiger 4, Madison 1.3 GHz, 16 GB, 4-way x 131 nodes, Myrinet 2000: 2.72 Tflop/s peak
• F32: Linux Networx, Xeon 3.06 GHz, 2 GB, 2-way x 256+ nodes, GbE: 3.13 Tflop/s peak
• Total: 14.5 Tflop/s peak, 3188 CPUs
NAREGI Grid Software Stack
[Layered software-stack diagram; WP = "Work Package"]
• WP6: Grid-enabled applications
• WP3: Grid PSE, Grid workflow, Grid visualization
• WP2: Grid programming (GridRPC, GridMPI); WP4: Packaging
• WP1: SuperScheduler, grid monitoring & accounting, GridVM (over Globus, Condor, UNICORE/OGSA)
• WP5: High-performance and secure grid networking
R&D in Grid Software and Networking
Area (Work Packages)
• WP-1: Lower and middle-tier software for resource management: Matsuoka (Titech), Kohno (ECU), Aida (Titech)
• WP-2: Grid programming middleware: Sekiguchi (AIST), Ishikawa (AIST)
• WP-3: User-level grid tools & PSE: Miura (NII), Sato (Tsukuba-u), Kawata (Utsunomiya-u)
• WP-4: Packaging and configuration management: Miura (NII)
• WP-5: Networking, security & user management: Shimojo (Osaka-u), Oie (Kyushu Tech.), Imase (Osaka-u)
• WP-6: Grid-enabling tools for nanoscience applications: Aoyagi (Kyushu-u)
WP-1: Lower and Middle-Tier Software
for Resource Management
• Unicore/Condor/Globus interoperability
   Adoption of the Condor ClassAds framework (see the matchmaking sketch after this slide)
• Meta-scheduler
   Scheduling schema, workflow engine, broker function
• Grid information service
   Attaches to multiple monitoring frameworks
   User and job auditing and accounting
• Self-configurable management & monitoring
• GridVM (lightweight grid virtual machine)
   Support for co-scheduling and resource control
   Node (IP) virtualization
   Interfacing with OGSA (Open Grid Services Architecture)
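ClassAd-style matchmaking pairs a job's requirements with machine advertisements. A toy Python sketch of that idea (the attribute names and dict-based "ads" are illustrative; the real ClassAd language is richer and evaluates arbitrary expressions):

```python
# Toy matchmaker in the spirit of Condor ClassAds: a job "ad" states requirements,
# machine "ads" advertise attributes, and the broker picks machines that satisfy
# the job. All names and numbers below are made up for illustration.
machines = [
    {"name": "titech-cluster", "arch": "IA32",   "os": "LINUX",   "cpus": 512,  "free": True},
    {"name": "aist-p32",       "arch": "X86_64", "os": "LINUX",   "cpus": 2148, "free": True},
    {"name": "ims-smp",        "arch": "SPARC",  "os": "SOLARIS", "cpus": 128,  "free": False},
]

job = {"arch": "X86_64", "os": "LINUX", "min_cpus": 1024}

def matches(job, machine):
    return (machine["free"]
            and machine["arch"] == job["arch"]
            and machine["os"] == job["os"]
            and machine["cpus"] >= job["min_cpus"])

candidates = [m["name"] for m in machines if matches(job, m)]
print(candidates)   # -> ['aist-p32']
```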
WP-2: Grid Programming
GridRPC/Ninf-G2
• GridRPC: programming with Remote Procedure Calls (RPC) on the Grid
   GridRPC API standardization by GGF
   Ninf-G is a reference implementation of GridRPC
   Implemented on the Globus Toolkit (C and Java APIs)
   Used by groups outside Japan
[Call-sequence diagram: the client (1) sends an interface request and (2) receives an interface reply, using interface information generated from a numerical library's IDL file by the IDL compiler and published via MDS; it then (3) invokes the remote executable through GRAM, which is forked on the server side and (4) connects back to the client]
A rough sketch of the client-side calling pattern follows.
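The appeal of GridRPC on the client side is that a remote numerical routine is invoked almost like a local library call. As a rough stand-in for that calling pattern, here is the same style with Python's built-in XML-RPC; this is not Ninf-G (whose real API is the C/Java grpc_* interface), it only illustrates the call style, and the routine name and port are made up:

```python
# Rough stand-in for the GridRPC calling pattern using Python's built-in XML-RPC.
# Not Ninf-G: it only shows "remote routine called like a local function".
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def dot(a, b):                                   # stand-in for a remote numerical routine
    return sum(x * y for x, y in zip(a, b))

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(dot, "dot")
threading.Thread(target=server.serve_forever, daemon=True).start()

handle = ServerProxy("http://localhost:8000")    # GridRPC analogue: obtain a function handle
print(handle.dot([1, 2, 3], [4, 5, 6]))          # GridRPC analogue: grpc_call(...) -> 32
server.shutdown()
```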
WP-2: Grid Programming
GridMPI
• GridMPI: programming with MPI on the Grid
   An environment to run MPI applications efficiently on the Grid
   Flexible and heterogeneous process invocation on each compute node
   GridADI and a latency-aware communication topology:
    • Optimizes communication over non-uniform latencies
    • Hides the differences between lower-level communication libraries
   Extremely efficient implementation based on the MPI in SCore (not MPICH-PM)
[Stack diagram: the MPI core sits on a Grid ADI with IMPI and the latency-aware communication topology; point-to-point communication runs over TCP/IP, PMv2, vendor MPIs, and other communication libraries; remote process invocation uses RIM, SSH, RSH, or GRAM]
A minimal example of an unchanged MPI program follows.
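GridMPI's promise is that ordinary MPI programs run unchanged across sites, with the latency-aware layer coping with the wide-area links. Below is a minimal example of the kind of code that stays the same, shown with the mpi4py Python bindings rather than the C MPI that GridMPI itself targets (run with, e.g., `mpiexec -n 2 python pingpong.py`):

```python
# Round-trip "ping-pong" between two ranks. Under a grid-enabled MPI the two
# ranks could sit at different sites, and the measured time would reflect the
# wide-area latency that a topology-aware layer has to manage.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
msg = bytearray(1024)                 # 1 KB payload

t0 = MPI.Wtime()
if rank == 0:
    comm.send(msg, dest=1, tag=0)
    comm.recv(source=1, tag=1)
    print(f"round trip: {(MPI.Wtime() - t0) * 1e6:.1f} microseconds")
elif rank == 1:
    comm.recv(source=0, tag=0)
    comm.send(msg, dest=0, tag=1)
```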
WP-3: User-Level Grid Tools & PSEs
• Grid workflow
   Workflow language definition
   GUI (task-flow representation)
• Visualization tools
   Real-time volume visualization on the Grid
[Diagram: a simulation or storage system produces raw data on a server; 3D object generation and rendering stages turn it into 3D objects and images delivered to the user interface, with rendering done on either the server or the client]
• PSE / portals
   Multiphysics/coupled simulation
   Application pool
   Collaboration with the nano-science applications group
[Diagram: a problem solving environment in which a PSE portal, workflow system, and super-scheduler sit over a PSE toolkit, application pool, information service, and application server]
WP-4: Packaging and
Configuration Management
• Collaboration with WP1 management
• Activities
 Selection of packagers to use
 Interface with autonomous configuration
management (WP1)
 Test Procedure and Harness
 Testing Infrastructure
c.f. NSF NMI packaging and testing
WP-5: Network Measurement,
Management & Control
• Traffic measurement on SuperSINET
• Optimal QoS routing based on user policies and network measurements (a toy routing sketch follows this slide)
• Robust TCP/IP control for grids
• Grid CA / user grid account management and deployment
[Diagram: grid applications and the super-scheduler consult a grid network management server backed by user-policy and network-information databases; network control entities perform dynamic bandwidth control and QoS routing, and measurement entities carry out multi-point real-time measurement over high-speed managed networks]
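One concrete way to picture "QoS routing based on user policies and network measurements" is a shortest-path search over measured link latencies that skips links violating the user's bandwidth policy. A toy sketch (the topology, latencies, and bandwidths are made-up numbers, not SuperSINET measurements):

```python
# Toy policy-constrained QoS routing: lowest-latency path whose links all meet
# the user's minimum-bandwidth policy. All numbers are illustrative.
import heapq

links = {   # (a, b): (latency_ms, bandwidth_mbps), treated as bidirectional
    ("osaka", "kyoto"): (2, 1000), ("kyoto", "tokyo"): (5, 10000),
    ("osaka", "tokyo"): (6, 100),  ("tokyo", "tsukuba"): (1, 10000),
}
graph = {}
for (a, b), (lat, bw) in links.items():
    graph.setdefault(a, []).append((b, lat, bw))
    graph.setdefault(b, []).append((a, lat, bw))

def qos_route(src, dst, min_bw):
    """Dijkstra on latency, skipping links below the bandwidth policy."""
    heap, done = [(0, src, [src])], set()
    while heap:
        lat, node, path = heapq.heappop(heap)
        if node == dst:
            return lat, path
        if node in done:
            continue
        done.add(node)
        for nxt, l, bw in graph[node]:
            if bw >= min_bw and nxt not in done:
                heapq.heappush(heap, (lat + l, nxt, path + [nxt]))
    return None

print(qos_route("osaka", "tsukuba", min_bw=1000))   # (8, ['osaka', 'kyoto', 'tokyo', 'tsukuba'])
```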
ITBL Grid Applications Plan to Use a Mixture of Computational Technologies
[Three example applications, each coupling heterogeneous machines through the Stampi communication library]
• Environmental circulation simulation for pollutant materials: wind-field calculation and atmospheric environment simulation on a VPP300 (vector parallel computer); several hundred atmospheric dispersion simulations, covering the possible release parameters, run quickly in parallel on COMPAQ Alpha high-performance PCs; and marine and terrestrial environment simulations on an AP3000 (scalar parallel computer). The components exchange two-dimensional surface data and 3D wind-field data, with real-time visualization on a multi-vision display. Together with Japan Meteorological Agency numerical weather prediction data, predictions at monitoring points, observation data, and statistical analysis, the results feed a radioactive source estimation system.
• Fluid-particle hybrid simulation for tokamak plasmas: the electron fluid / electromagnetic field part runs on a vector machine, coupled to ion particles on a scalar machine.
• Large-scale Hartree-Fock calculation (SPring-8): control, diagonalization, and orthonormalization on a vector machine, with integral handling and partial accumulation (Fij <- Fij + Dkl*qijkl) farmed out to scalar machines through a pool of task distribution.
Grid for the Belle Detector
[Network diagram: the SuperSINET backbone of the Belle network]
• The Belle detector at KEK records e+e- → B0 B0bar events
• The KEK computing center connects at 10 Gbps to Tokyo Institute of Technology and the backbone
• University sites on the backbone: Tohoku U., U. Tokyo, Nagoya U., Osaka U.
• Per-site data rates today are roughly 170-400 GB/day (~45-100 Mbps), with ~1 TB/day planned; data is shared via NFS
• International partners: USA, Korea, Taiwan, etc.
Grid Applications: Fusion Grid
[Diagram: the Fusion Grid, built on ITBL, connects a real experiment with a numerical experiment run on a supercomputer; VR visualization links experiment and simulation]
Adaptation of Nano-science
Applications to Grid Environment
• Analysis of nanoscience applications
   Parallel structure
   Granularity
   Resource requirements
   Latency tolerance
• Coupled simulation
   RISM (Reference Interaction Site Model), running on an SMP supercomputer, supplies the solvent distribution
   FMO (Fragment Molecular Orbital method), running on a cluster/grid, supplies the solute structure
   A mediator exchanges these between the codes and handles the in-sphere correlation
RIKEN Grid and the ITBL Computer Resource Pool
[Diagram: the RIKEN grid, built around the RIKEN Super Combined Cluster, together with the ITBL computer resource pool. Users submit jobs through a web-portal front-end server; Globus dispatches the jobs (MD jobs, parallel jobs, etc.) and their input files to the cluster and the resource pool, and output files are returned to the users]
Grid Summary
• More emphasis on grids than expected
   More government support
   More application involvement
   Higher-level tools
• Computational, data, and business grids included
• Research contributions from Japan in:
   Cluster computing
   Grid middleware
• Heavy involvement in international collaborations