Evaluation of Early Systems -- recent results
PCCM2 Performance Results
As part of the Evaluation of Early Systems project and the CHAMMP climate modeling program, the performance of Version 2.1 of the Parallel Community Climate Model (PCCM2) has been used to evaluate parallel machines of interest to the U.S. High Performance Computing and Communications (HPCC) Program. One stage of this ongoing study was completed in June 1995, with performance measurements obtained on an Intel Paragon MP system and on an IBM SP2 system.
PCCM2 is a scalable, parallel implementation of the Community Climate Model version 2 (CCM2) developed at the National Center for Atmospheric Research (NCAR). CCM2 is a three-dimensional global atmospheric general circulation model using 18 vertical levels, with comprehensive physics for use in the analysis and prediction of global climate. CCM2 uses two different numerical methods to simulate the fluid dynamics of the atmosphere. The spherical harmonic (spectral) transform method is used for the horizontal discretization of vorticity, divergence, temperature, and surface pressure; this method features regular, static, global data dependencies. A semi-Lagrangian transport scheme is used for highly accurate advection of water vapor and an arbitrary number of other scalar fields, such as sulfate aerosols and chemical constituents; this scheme features irregular, dynamic, mostly local data dependencies. Neither method is straightforward to implement efficiently on a parallel computer.
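To illustrate why semi-Lagrangian transport has "irregular, dynamic, mostly local" data dependencies, the following is a minimal one-dimensional sketch. It is a toy, not CCM2's actual scheme: the periodic grid, the single backward step along the wind, the linear interpolation, and the function name `semi_lagrangian_step` are all assumptions made for illustration.

```python
import math


def semi_lagrangian_step(q, u, dx, dt):
    """Advance a scalar field q by one time step on a periodic 1-D grid.

    q  : list of field values at the grid points
    u  : list of wind speeds at the same grid points
    dx : grid spacing
    dt : time step
    (Hypothetical toy interface, not taken from CCM2.)
    """
    n = len(q)
    q_new = [0.0] * n
    for i in range(n):
        # Trace back from arrival point i to its departure point (grid units).
        x_dep = i - u[i] * dt / dx
        j = math.floor(x_dep)   # grid point just left of the departure point
        w = x_dep - j           # linear interpolation weight in [0, 1)
        # Which neighbours are read depends on the wind, so the data
        # dependencies change from step to step but stay near point i.
        q_new[i] = (1.0 - w) * q[j % n] + w * q[(j + 1) % n]
    return q_new


# A uniform wind that moves the field exactly one grid cell per step.
shifted = semi_lagrangian_step([0.0, 1.0, 2.0, 3.0], [1.0] * 4, dx=1.0, dt=1.0)
```

On a parallel machine, the grid points `j` and `j + 1` may live on a neighbouring processor, and which ones are needed is only known once the wind field for the step is available.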
CCM2 also performs numerous calculations to simulate the wide range of atmospheric processes relating to clouds and radiation absorption, moist convection, the planetary boundary layer, and the land surface. These processes are coupled horizontally only through the dynamics, so they have the important property of being independent for each vertical column of grid cells in the model. However, they also introduce significant spatial and temporal variations in computational load, and it proves useful to incorporate load balancing algorithms that compensate for these variations.
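Because each vertical column can be processed independently, columns can in principle be moved freely between processors to even out the load. The sketch below shows one generic way to do this, a greedy "heaviest column first" assignment. It is not necessarily the algorithm used in PCCM2; the function name `balance_columns` and the per-column cost values are hypothetical.

```python
import heapq


def balance_columns(column_costs, num_procs):
    """Assign each column, heaviest first, to the least loaded processor.

    column_costs : estimated physics cost of each vertical column
    num_procs    : number of processors
    Returns (assignment, loads), where assignment maps a processor index to
    its list of column indices and loads[p] is the total cost on processor p.
    """
    heap = [(0.0, p) for p in range(num_procs)]  # (current load, processor)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_procs)}
    ordered = sorted(enumerate(column_costs), key=lambda item: -item[1])
    for col, cost in ordered:
        load, p = heapq.heappop(heap)            # least loaded processor
        assignment[p].append(col)
        heapq.heappush(heap, (load + cost, p))
    loads = [sum(column_costs[c] for c in assignment[p]) for p in range(num_procs)]
    return assignment, loads


# Hypothetical costs: two expensive columns followed by two cheap ones.
costs = [3.0, 3.0, 1.0, 1.0]
assignment, loads = balance_columns(costs, num_procs=2)
```

With these made-up costs, a plain block decomposition (columns 0 and 1 on one processor, 2 and 3 on the other) would give loads of 6 and 2, whereas the greedy assignment gives each processor one expensive and one cheap column.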
The performance of PCCM2 on massively parallel multiprocessors (MPPs) is of great interest to the CHAMMP program. The communication-intensive nature of the spectral transform and of the semi-Lagrangian transport schemes, together with the sensitivity of performance to the efficient calculation of physical processes, also makes PCCM2 an interesting evaluation tool for MPPs. The following figures and table, taken from Drake, Foster, Michalakes, Toonen, and Worley, "Design and Performance of a Scalable Parallel Community Climate Model", describe the performance of PCCM2 Version 2.1 on the Paragon and on the SP2.
Figure: Computational rate in Gflop/sec as a function of node count at T42 resolution on the Paragon and SP2, for single and double precision.
Figure: Computational rate in Mflop/sec/node as a function of node count at T42 resolution on the Paragon and SP2, for single and double precision.
Table: Elapsed time per model day and computational rate at T170 resolution on the Paragon and SP2, for single and double precision.
These data demonstrate reasonable performance scalability with the number of processors. The 2.2 Gflop/sec rate (double precision) achieved on both the 128-node IBM SP2 and the 1024-node Paragon compares well with that achieved on a Cray Y-MP/8 (1.2 Gflop/sec), but it is less than that achieved on a Cray C90/16 (about 6 Gflop/sec). At T170 (double precision), the Paragon computational rate increases to 3.2 Gflop/sec on 1024 nodes, while the 128-node SP2 computational rate remains unchanged. This compares with 5.3 Gflop/sec on the C90/16. In summary, MPPs can provide useful cycles for climate modeling application codes like PCCM2, but they are not yet better than the modestly parallel vector supercomputers. Note, however, that a machine with the raw compute power of the SP2 and the communication speed of the Paragon would be a strong contender, and the technology for each of these already exists.
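The aggregate rates quoted above can be turned into per-processor rates with simple arithmetic, which makes the contrast between the machines easier to see. The short script below is a sketch that uses only the figures stated in the text; it assumes that the "/8" and "/16" suffixes on the Cray systems denote processor counts, and the labels and the helper name `mflops_per_processor` are chosen here for illustration.

```python
# Aggregate double-precision rates (Gflop/sec) and processor counts
# as quoted in the text. The C90/16 entry of 6.0 stands for "about 6".
systems = {
    "IBM SP2 (128 nodes)": (2.2, 128),
    "Paragon (1024 nodes)": (2.2, 1024),
    "Paragon (1024 nodes), T170": (3.2, 1024),
    "Cray Y-MP/8": (1.2, 8),       # assumes 8 processors
    "Cray C90/16": (6.0, 16),      # assumes 16 processors
    "Cray C90/16, T170": (5.3, 16),
}


def mflops_per_processor(gflops, processors):
    """Convert an aggregate Gflop/sec rate to Mflop/sec per processor."""
    return gflops * 1000.0 / processors


per_proc = {name: mflops_per_processor(*spec) for name, spec in systems.items()}

for name, rate in per_proc.items():
    print(f"{name:28s} {rate:8.1f} Mflop/sec per processor")
```

Under these assumptions, the SP2 delivers roughly 17 Mflop/sec per node and the Paragon roughly 2 Mflop/sec per node for the same 2.2 Gflop/sec aggregate, while the vector machines run at a few hundred Mflop/sec per processor.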