This graph shows the message traffic between processors during an interval
of time. Horizontal lines indicate the state of each processor (busy or
idle), while the other lines indicate the sending and receiving of
messages. The left end of a communication line indicates the
source of the message, while the right end indicates the message
destination. The color of the communication line indicates the size of the
message. This graph describes the performance of CCM3.2/MP-2D (the
two-dimensional message-passing parallel implementation of NCAR's Community
Climate Model version 3.2) using 128 processors on the T3E-900 at the
National Energy
Research Scientific Computing Center (NERSC). The problem size is T170L18,
corresponding to a 512 by 256 by 18 longitude-latitude-vertical
computational grid, with 5-minute timesteps. The parallel algorithm uses
a 4 by 32 logical processor grid, so 4 processors are used to decompose the
longitude direction and 32 processors are used to decompose the latitude
direction. A transpose algorithm is used to undecompose the longitude
coordinate in order to calculate the Fourier transforms, followed by another
transpose to (re)decompose along the resulting wavenumber coordinate direction.
The transposes are implemented using the MPI_ALLTOALLV routine. A recursive
halving algorithm is used to compute the collective sums in the latitude
direction required by the Legendre transforms. Communication is also
required to fill the halo regions when using the semi-Lagrangian algorithm to
advect moisture, when swapping data to load balance the columnar physics
computations, and when computing global statistics.
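
The decomposition and the recursive halving sum described above can be
illustrated with a small serial sketch. It is not taken from CCM3.2/MP-2D:
the function names local_block and recursive_halving_sum are hypothetical,
the decomposition is assumed to be an even block split, and the summation is
the textbook recursive-halving reduce-scatter simulated without MPI. The
variant actually used by the model may differ in its details.

```python
def local_block(nlon, nlat, nlev, plon, plat):
    """Grid points per processor, assuming an even block decomposition."""
    assert nlon % plon == 0 and nlat % plat == 0
    return nlon // plon, nlat // plat, nlev


def recursive_halving_sum(vectors):
    """Simulate a recursive-halving reduce-scatter across len(vectors) ranks.

    vectors[r] is the contribution of rank r. The rank count must be a power
    of two that divides the vector length. Returns, for each rank, a pair
    (start, values): the slice of the global sum that rank ends up holding.
    """
    p = len(vectors)
    n = len(vectors[0])
    assert p & (p - 1) == 0 and n % p == 0
    data = [list(v) for v in vectors]  # per-rank working copy
    lo = [0] * p                       # active segment is [lo[r], hi[r])
    hi = [n] * p
    dist = p // 2
    while dist >= 1:
        # Snapshot so that all exchanges in a step use pre-step values.
        snapshot = [row[:] for row in data]
        for r in range(p):
            partner = r ^ dist
            mid = (lo[r] + hi[r]) // 2
            # Keep one half of the active segment; the partner keeps the other.
            if r & dist == 0:
                keep_lo, keep_hi = lo[r], mid
            else:
                keep_lo, keep_hi = mid, hi[r]
            # Add the partner's partial sums for the half this rank keeps.
            for i in range(keep_lo, keep_hi):
                data[r][i] = snapshot[r][i] + snapshot[partner][i]
            lo[r], hi[r] = keep_lo, keep_hi
        dist //= 2
    return [(lo[r], data[r][lo[r]:hi[r]]) for r in range(p)]


if __name__ == "__main__":
    # T170L18 grid on a 4 by 32 logical processor grid.
    print(local_block(512, 256, 18, 4, 32))  # (128, 8, 18)
    # Four ranks, each contributing a vector of length 8.
    contributions = [[10 * r + i for i in range(8)] for r in range(4)]
    for rank, (start, values) in enumerate(recursive_halving_sum(contributions)):
        print(rank, start, values)
```

Under these assumptions each processor holds 128 longitudes, 8 latitudes, and
all 18 levels. In the simulated sum, each of the log2(P) steps halves the
amount of data exchanged, and every rank finishes with a distinct 1/P slice
of the total.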