Preliminary Results Using PCTM Parallel Climate Model

Evaluation of Early Systems

PCTM

The following performance data were collected by Tom Bettge in November 2000 on three systems: the Compaq AlphaServer SC at Oak Ridge National Laboratory, the IBM SP at the National Center for Atmospheric Research (NCAR), and a 128-processor SGI Origin 2000 at Los Alamos National Laboratory (LANL). The Parallel Climate Transitional Model (PCTM) is the next generation of the Parallel Climate Model. It is made up of atmosphere, ocean, land surface, and sea ice component models, and a coupler that exchanges fluxes between the component models. The atmospheric model is a recent version of the Community Climate Model, developed at NCAR. The ocean model is POP (Parallel Ocean Program), developed at LANL, the Naval Postgraduate School (NPS), and NCAR.

The first graph plots the number of model years that can be simulated in one wall-clock day as a function of the number of processors. The Compaq and IBM systems are each clusters of 4-way SMP nodes. For these measurements, all four processors in each SMP node were used.
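The throughput metric in the first graph can be derived from a timing expressed as wall-clock seconds per simulated day. The following is a minimal sketch of that conversion, not code from the study. The function name and the 365-day model year are assumptions made for illustration, and the input value is hypothetical rather than a measurement.

```python
SECONDS_PER_WALLCLOCK_DAY = 24 * 60 * 60  # 86400 seconds


def simulated_years_per_day(seconds_per_model_day, days_per_model_year=365):
    """Convert the wall-clock cost of one simulated day into simulated
    years per wall-clock day.

    days_per_model_year=365 is an assumption (a calendar without leap
    years); the source does not state the model calendar.
    """
    if seconds_per_model_day <= 0:
        raise ValueError("seconds_per_model_day must be positive")
    model_days_per_wallclock_day = SECONDS_PER_WALLCLOCK_DAY / seconds_per_model_day
    return model_days_per_wallclock_day / days_per_model_year


if __name__ == "__main__":
    # Hypothetical timing, chosen only to exercise the function.
    hypothetical_seconds = 240.0
    print(simulated_years_per_day(hypothetical_seconds))
```

Under these assumptions, halving the seconds per model day doubles the simulated years per day, so a curve that is linear in processor count on the first graph corresponds to a cost per model day that falls in inverse proportion to the processor count.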

While the performance curves appear reasonably linear, with the Compaq superior in both achieved performance and scaling, the component models do not scale equally well. For example, the atmosphere model is currently limited to 64-way parallelism, and some algorithmic inefficiencies are introduced when using more than 32 processors. The following six graphs show the scaling of the component models on each of the platforms, one pair of graphs per platform. The first graph in each pair shows the number of seconds per model day, while the second shows the percentage of the PCTM run time that is spent in each of the individual components. Note that "ATM and coupler" also includes the land surface model. As noted above, POP refers to the ocean model. POP has two computational phases, barotropic and baroclinic, with dramatically different performance characteristics. For this reason, the two phases are graphed separately.
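The second graph in each pair is a normalization of the first: each component's seconds per model day divided by the total. A minimal sketch of that calculation follows. It assumes the component times are additive (that is, the percentages are taken against the sum of the listed components), which the source does not state, and the timings are hypothetical placeholders rather than measured values.

```python
def component_shares(seconds_by_component):
    """Return each component's percentage of the summed time.

    Assumes component times simply add up to the total; how the
    measurements treated overlap between components is not stated
    in the source.
    """
    total = sum(seconds_by_component.values())
    if total <= 0:
        raise ValueError("total time must be positive")
    return {name: 100.0 * secs / total
            for name, secs in seconds_by_component.items()}


if __name__ == "__main__":
    # Hypothetical seconds per model day, for illustration only.
    hypothetical = {
        "ATM and coupler": 40.0,
        "POP baroclinic": 50.0,
        "POP barotropic": 5.0,
        "ice": 5.0,
    }
    for name, pct in component_shares(hypothetical).items():
        print(f"{name}: {pct:.1f}%")
```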

On the Compaq, the baroclinic phase of the ocean model is the most expensive component. The baroclinic phase and the atmospheric model show similar scaling behavior. The ice model scales perfectly, while the barotropic phase scales poorly. However, little time is spent computing the barotropic phase, so the scalability of the full model is determined primarily by the atmosphere and the baroclinic phase of the ocean.
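The argument that a poorly scaling but cheap component has little effect on overall scalability can be illustrated with a short calculation of speedup and parallel efficiency. This is a sketch under stated assumptions: the helper names are hypothetical, component times are assumed to add, and every timing below is invented to mimic the qualitative behavior described (perfect ice scaling, flat barotropic phase), not taken from the graphs.

```python
def speedup(t_base, t_scaled):
    """Ratio of the baseline time to the time at the larger processor count."""
    return t_base / t_scaled


def parallel_efficiency(t_base, p_base, t_scaled, p_scaled):
    """Achieved speedup divided by the ideal speedup p_scaled / p_base."""
    return speedup(t_base, t_scaled) / (p_scaled / p_base)


def total_time(timings, procs):
    """Sum the component times at a given processor count (assumes additivity)."""
    return sum(by_procs[procs] for by_procs in timings.values())


# Hypothetical seconds per model day at 16 and 64 processors.
TIMINGS = {
    "ATM and coupler": {16: 60.0, 64: 20.0},
    "POP baroclinic": {16: 80.0, 64: 25.0},
    "POP barotropic": {16: 4.0, 64: 4.0},   # does not scale at all
    "ice": {16: 8.0, 64: 2.0},              # scales perfectly
}

if __name__ == "__main__":
    for name, t in TIMINGS.items():
        eff = parallel_efficiency(t[16], 16, t[64], 64)
        print(f"{name}: efficiency {eff:.2f}")
    overall = parallel_efficiency(total_time(TIMINGS, 16), 16,
                                  total_time(TIMINGS, 64), 64)
    print(f"whole model: efficiency {overall:.2f}")
```

With these made-up numbers the barotropic phase has the worst efficiency of any component, yet the whole-model efficiency stays close to that of the two expensive components, because the barotropic phase contributes only a small part of the total time.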

On the IBM, the atmosphere is the most expensive component, and it scales much worse than the baroclinic phase of the ocean. As with the Compaq, the ice model scales perfectly, and the barotropic phase of the ocean does not scale at all.

On the SGI, the baroclinic phase of the ocean model is the most expensive component. Although it scales much better than the atmosphere, the atmosphere never becomes the more expensive of the two. As with the other results, the ice model scales perfectly, and the barotropic phase of the ocean scales poorly, if at all.

Note that the IBM performance is somewhat different from the other two. POP-baroclinic scales very well on the IBM, and, unlike the other systems, is not the slowest component for any processor count. In fact, POP is faster on the IBM than on the Compaq for 64 processors. The Compaq is consistently (and significantly) faster for all of the other component models. Given this trend, and other performance benchmarks, it seems worthwhile to investigate the differences in the compiler optimizations between the Compaq and IBM for the POP code, which was originally a CM5 code and uses more F90 constructs than the other component models.

URL http://www.csm.ornl.gov/evaluation/PCTM/index.html
Updated: Monday, 20-Aug-2001 12:43:56 EDT