Experiments are run for problem sizes T42L18 and T170L18. Each T42L18 experiment times 9 or 10 timesteps, and each T170L18 experiment times 14 or 15 timesteps. To eliminate timing variations that occur only for the initial timesteps, the experiments typically run longer than the stated number, but only the last timesteps are timed. For T42L18, the timed timesteps are made up of 6 or 7 "normal" timesteps and 3 timesteps that include longwave and shortwave radiation calculations. For T170L18, the timed timesteps are made up of 13 or 14 "normal" timesteps and 1 radiation timestep. A third type of timestep, which includes absorptivity/emissivity calculations, is not represented. However, this omission does not change the qualitative comparison of the different parallel algorithms.
Experiments are run for a sequence of processor counts. For each processor count, experiments are run for the full range of supported aspect ratios. For example, for 64 processors, aspect ratios 64x1, 32x2, 16x4, 8x8, etc. would be tried for each of the identified parallel algorithm options.
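The enumeration of aspect ratios for a given processor count can be sketched as follows. This is only an illustration, not the scripts used for these experiments: the function name `aspect_ratios` is hypothetical, and it lists every two-factor decomposition of the processor count, whereas the set of aspect ratios actually supported on a given platform may be smaller.

```python
def aspect_ratios(nprocs):
    """Return all (rows, cols) pairs with rows * cols == nprocs.

    Pairs are ordered from the most elongated (nprocs x 1) to (1 x nprocs).
    Hypothetical helper: the supported subset may differ per platform.
    """
    if nprocs < 1:
        raise ValueError("nprocs must be a positive integer")
    pairs = []
    for rows in range(nprocs, 0, -1):
        if nprocs % rows == 0:  # rows divides nprocs evenly
            pairs.append((rows, nprocs // rows))
    return pairs


# Print the candidate aspect ratios for 64 processors.
print(" ".join(f"{r}x{c}" for r, c in aspect_ratios(64)))
```

For 64 processors this yields the four ratios named above followed by their transposes (4x16, 2x32, 1x64).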
For each of the following platforms, the set of identified parallel algorithm options is described. This is followed by the results of the comparison, where the best algorithm is described for each number of processors and problem size. A separate graph is generated for each number of processors. Each graph is a scatterplot of runtimes for each parallel algorithm, where the x-axis indicates the aspect ratio and the symbol indicates the particular parallel algorithm option, as defined on the "Candidate Algorithm" webpage. The graph is typically difficult to read, but it should give a sense of the general distribution of runtimes. Since all of the tested parallel algorithms are among the best available, most of the variation should be a function of the aspect ratio.
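Selecting the best algorithm for each number of processors amounts to taking the minimum runtime over all tested combinations of aspect ratio and algorithm option. A minimal sketch of that selection is given here, assuming each result is stored as a tuple of processor count, aspect ratio, algorithm label, and runtime. The function name, the record layout, the labels "A" and "B", and the timings are all placeholders invented for illustration; they are not measured data or the actual option names.

```python
def best_by_processor_count(records):
    """Map each processor count to its fastest (nprocs, aspect, algorithm, runtime) record."""
    best = {}
    for rec in records:
        nprocs, _aspect, _algorithm, runtime = rec
        # Keep the record with the smallest runtime seen so far for this count.
        if nprocs not in best or runtime < best[nprocs][3]:
            best[nprocs] = rec
    return best


# Placeholder timings in seconds, invented purely for illustration.
sample = [
    (64, "64x1", "A", 12.0),
    (64, "8x8", "A", 9.5),
    (64, "8x8", "B", 9.0),
    (32, "32x1", "A", 20.0),
    (32, "8x4", "B", 17.5),
]

for nprocs, rec in sorted(best_by_processor_count(sample).items()):
    print(nprocs, rec[1], rec[2], rec[3])
```

Because the timed timesteps differ between platforms, a selection like this is only meaningful within a single platform and problem size.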
Note that the experiments run on the different platforms are typically not the same, differing primarily in the number and choice of timesteps. Therefore, the raw timings are not comparable between platforms without postprocessing. Also, due to the cost of these experiments (and difficulty of running on large numbers of processors on some platforms), not all aspect ratios are examined for all problem sizes.
CCM/MP-2D Performance Studies Page
Patrick H. Worley (worleyph@ornl.gov)