While the serial complexity of a run of CCM/MP-2D is affected by
the evolving solution (for example, by the effect of cloud fraction on
the radiation calculations and of wind velocities on the advection of
moisture), much of the cost of a day of simulation is fixed from day to day.
To estimate the complexity of a typical day of simulation, we ran
CCM/MP-2D on an Origin 2000 at Los Alamos National Laboratory, using
the SGI Speed Shop tools to count floating point operations.
The "-ideal" option was used to count the requested number of floating
point operations, and the optimization level was varied to determine the
minimum number of operations.
The complexity was measured for two problem sizes: T42L18,
which uses a 128 longitude by 64 latitude grid with 18 vertical
levels and a 20 minute timestep, and T170L18, which uses a
512 longitude by 256 latitude grid with 18 vertical levels and a
5 minute timestep.
For T42L18, operation counts were measured for one day (72 timesteps)
and for two days (144 timesteps) on a single processor, then differenced
to estimate the cost of a standard day (without problem initialization).
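
The differencing step can be sketched as follows. This is only an
illustration: the function names are hypothetical, and the counts in the
usage lines are small placeholder numbers, not measurements.

```python
def standard_day_flops(count_one_day, count_two_days):
    """Cost of one day without initialization.

    Both runs include the same one-time initialization cost, so it
    cancels when the one-day count is subtracted from the two-day count.
    """
    return count_two_days - count_one_day


def initialization_flops(count_one_day, count_two_days):
    """One-time cost implied by the same two measurements."""
    return count_one_day - standard_day_flops(count_one_day, count_two_days)


# Placeholder counts (NOT measured values), chosen to make the arithmetic obvious.
one_day = 1_300    # initialization + 1 standard day
two_days = 2_500   # initialization + 2 standard days

print(standard_day_flops(one_day, two_days))    # cost of a standard day
print(initialization_flops(one_day, two_days))  # implied initialization cost
```

The sketch assumes that the second day costs the same as the first once
initialization is removed, which is the premise of the differencing.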
For T170L18, the counters used to tally floating point operations
could not hold counts this large, even when running on 16 processors
and using a separate counter for each processor, so the count for a
full day could not be measured in a single run.
In a day of simulation, there are three types of timesteps:
  - "standard"
  - "standard" plus short- and longwave radiation calculations
    (once an hour)
  - "standard" plus absorptivity and emissivity calculations
    (once every twelve hours)
Counts for these three types were measured directly: the counters were
flushed one timestep before the timestep to be examined, and two
experiments were run, one ending immediately before that timestep and one
immediately after it. Differencing the two gave the count for the
timestep. The counts for the three types were then weighted by the number
of timesteps of each type in a day to construct an operation count for
an entire day.
This approach was also used on the T42L18 problem size and compared to the
count computed with a direct measurement. The two approaches
gave essentially equivalent results.
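
The weighting can be illustrated with a short sketch. It rests on
assumptions not stated in the text: that each absorptivity/emissivity
timestep coincides with a radiation timestep and is counted once, as its
own type, and that the three types cover every timestep of the day. The
function names and the per-timestep counts in the usage lines are
hypothetical placeholders.

```python
MINUTES_PER_DAY = 24 * 60


def steps_by_type(timestep_minutes, radiation_every_min=60, absems_every_min=720):
    """Return (n_standard, n_radiation, n_absems) timesteps per simulated day.

    Assumption: absorptivity/emissivity steps fall on radiation steps,
    so they are subtracted from the hourly radiation total.
    """
    steps_per_day = MINUTES_PER_DAY // timestep_minutes
    n_absems = MINUTES_PER_DAY // absems_every_min
    n_radiation = MINUTES_PER_DAY // radiation_every_min - n_absems
    n_standard = steps_per_day - n_radiation - n_absems
    return n_standard, n_radiation, n_absems


def flops_per_day(timestep_minutes, flops_standard, flops_radiation, flops_absems):
    """Weight the per-timestep counts by how often each type occurs in a day.

    Each argument is the TOTAL count for one timestep of that type,
    not the increment over a standard timestep.
    """
    n_std, n_rad, n_abs = steps_by_type(timestep_minutes)
    return n_std * flops_standard + n_rad * flops_radiation + n_abs * flops_absems


print(steps_by_type(20))  # T42L18: 72 timesteps per day
print(steps_by_type(5))   # T170L18: 288 timesteps per day

# Placeholder per-timestep counts (NOT measured values).
print(flops_per_day(5, flops_standard=10, flops_radiation=30, flops_absems=50))
```

Under these assumptions a T170L18 day contains 264 standard timesteps,
22 radiation timesteps, and 2 absorptivity/emissivity timesteps.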

problem   grid             timestep     steps     floating point       sqrt calls in   fdiv calls in
                                        per day   operations per day   flop count      flop count
T42L18    128 X 64 X 18    20 minutes   72           59,554,603,237    8.3%            4.2%
T170L18   512 X 256 X 18   5 minutes    288       3,231,429,529,384    7.0%            3.1%

The high percentage of sqrt and fdiv calls makes it more complicated
to use these counts to compute meaningful flops per second rates.
While the percentages are functions of the SGI compiler and, for
example, how the "pow" function is implemented, they are still indicative
of an interpretation problem that will occur on any platform
for which sqrt and fdiv are significantly slower than a floating
point multiply/add.
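
As a rough illustration of how these counts might be turned into a rate,
the sketch below divides the per-day operation count by a wall-clock time.
The wall-clock time is a hypothetical input, not a measurement, and the
percentage columns are read here as fractions of the counted operations,
which is an assumption about how the table is to be interpreted.

```python
# Per-day operation counts and sqrt/fdiv percentages from the table.
FLOPS_PER_DAY = {"T42L18": 59_554_603_237, "T170L18": 3_231_429_529_384}
SQRT_SHARE = {"T42L18": 0.083, "T170L18": 0.070}
FDIV_SHARE = {"T42L18": 0.042, "T170L18": 0.031}


def flop_rate(problem, wall_seconds_per_sim_day):
    """Nominal flops per second for one simulated day of the given problem."""
    return FLOPS_PER_DAY[problem] / wall_seconds_per_sim_day


def slow_op_share(problem):
    """Combined share of the count attributed to sqrt and fdiv (assumed reading)."""
    return SQRT_SHARE[problem] + FDIV_SHARE[problem]


# Hypothetical wall-clock time: 100 seconds per simulated day (NOT measured).
rate = flop_rate("T42L18", 100.0)
print(f"nominal rate: {rate / 1e6:.1f} Mflop/s")
print(f"sqrt + fdiv share: {slow_op_share('T42L18'):.1%}")
```

A rate computed this way treats every counted operation as equal, so it
should be reported together with the sqrt and fdiv share rather than
compared directly with a platform's peak multiply/add rate.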