
Reproducibility

Changing the number of processors used for a simulation is like changing the machine on which the code is run. Consistency of the numerical results is therefore an issue for production runs, where the number of processors may change over the course of the simulation. This issue is referred to as reproducibility. It can be stated as a question: do the results from one processor configuration exactly (bit for bit) reproduce the results from another configuration of the same machine?

Non-reproducibility can arise in otherwise "correct" parallel implementations because floating-point addition is not associative. In the parallel spectral algorithm, the order of summation in the Legendre transform depends on the number of processors. The global sums computed for diagnostics are likewise sensitive to the order in which the terms are added. It is possible, however, to impose a fixed order by carefully structuring the parallel algorithms, and this reordering results in little loss of performance.
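
The effect of summation order can be illustrated with a small sketch. It is not taken from PCCM2; the function names, the four-term data set, and the fixed-block strategy are assumptions chosen only for illustration. The naive version lets each "processor" sum its own contiguous chunk, so the grouping of terms changes with the processor count. The ordered version always groups terms into the same fixed blocks, regardless of how those blocks are distributed.

```python
def serial_sum(terms):
    """Add terms strictly left to right in double precision."""
    total = 0.0
    for t in terms:
        total += t
    return total


def split(items, nparts):
    """Split items into nparts contiguous chunks, as evenly as possible."""
    base, extra = divmod(len(items), nparts)
    chunks, start = [], 0
    for p in range(nparts):
        size = base + (1 if p < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def naive_parallel_sum(terms, nprocs):
    """Each simulated processor sums its chunk; partial sums are then combined.
    The grouping of terms, and hence the rounding, depends on nprocs."""
    partials = [serial_sum(chunk) for chunk in split(terms, nprocs)]
    return serial_sum(partials)


def ordered_parallel_sum(terms, nprocs, block=2):
    """Hypothetical fixed-order scheme: terms are grouped into blocks whose
    boundaries do not depend on nprocs. Processors only decide who computes
    which block; block sums are combined in global block order."""
    blocks = [terms[i:i + block] for i in range(0, len(terms), block)]
    block_sums = []
    for owned in split(blocks, nprocs):
        block_sums.extend(serial_sum(b) for b in owned)
    return serial_sum(block_sums)


# Illustrative data: large terms that cancel, mixed with small terms.
terms = [1e16, 1.0, -1e16, 1.0]
naive = [naive_parallel_sum(terms, p) for p in (1, 2, 4)]
ordered = [ordered_parallel_sum(terms, p) for p in (1, 2, 4)]
print("naive:  ", naive)    # varies with the processor count
print("ordered:", ordered)  # identical for every processor count
```

In this sketch the ordered result is the same for 1, 2, and 4 processors, but it is not the mathematically exact sum. Fixing the order makes the rounding error repeatable; it does not remove it.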

PCCM2.1 is fully reproducible for the power-of-two horizontal resolutions: T21, T42, T85, T170, etc. If the results differ by any amount when the number of processors is changed on a given machine, then there is an implementation error (bug) or a hardware problem. This feature has helped tremendously in troubleshooting the implementation on new architectures.
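
A bit-for-bit check between two runs might look like the following sketch. The helper names and the idea of holding each run's output as a list of floats are assumptions made for illustration; they do not describe the model's actual output format or tooling. Bit patterns are compared rather than values because `==` treats 0.0 and -0.0 as equal and never treats a NaN as equal to itself.

```python
import struct


def bit_pattern(x):
    """Return the 8-byte IEEE 754 double representation of x."""
    return struct.pack(">d", x)


def first_bitwise_difference(run_a, run_b):
    """Return the index of the first value whose bits differ between two
    runs, or None if the runs agree bit for bit."""
    if len(run_a) != len(run_b):
        raise ValueError("runs have different lengths")
    for i, (a, b) in enumerate(zip(run_a, run_b)):
        if bit_pattern(a) != bit_pattern(b):
            return i
    return None


# Hypothetical field values from the same simulation on two processor counts.
run_4_procs = [288.15, 101325.0, 0.0]
run_8_procs = [288.15, 101325.0, -0.0]
print(first_bitwise_difference(run_4_procs, run_8_procs))
```

Any index other than `None` would, by the criterion above, point to a bug or a hardware problem rather than to acceptable rounding differences.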



John B. Drake
Wed May 15 09:51:22 EDT 1996