next up previous
Next: Parallel Legendre Transform Up: PARALLEL ALGORITHMS Previous: The Spectral Transform

Data Decompositions

 

In the spectral transform algorithm, computations are performed in both the physical and spherical harmonic (or spectral) domains, and transforming from one domain to the other involves passing through the Fourier domain, whose coordinates are Fourier wavenumber and latitude coordinates. Thus, we must be concerned with the distribution of data in three domains.

In specifying the domain decompositions, the multiprocessor is viewed as a logical two dimensional processor grid. (P and Q are currently compile-time parameters for PCCM2.) For the physical domain, the latitudinal dimension is partitioned into 2Q intervals, each containing consecutive grid lines along the latitude axis. Each processor row is assigned two of these intervals, one from the northern hemisphere, and the reflected latitudes in the southern hemisphere. This assignment allows symmetry to be exploited in the Legendre transform. The assignment also restricts Q, the number of processor rows, to be no larger than .

The longitudinal dimension is partitioned into P equal intervals, with each interval being assigned to a different processor column. The resulting ``block'' decomposition of the physical domain is illustrated in Fig. 1 for a small example.

Figure 1,

The decomposition of the (a) physical, (b) spectral, and (c) Fourier domains over a grid of processors. For figures (a) and (b), each small cell represents a data item. The thicker lines show the boundaries between processors. The circles contain the processor coordinates. The shaded cells in figure (c) represent the spectral coefficients to be included in the spectral transform, and shows how these are decomposed over processor.

The Fourier domain can be regarded as a wavenumber-latitude grid, so, like the physical domain, the Fourier domain is two-dimensional. However, a different decomposition is used. The differences arise because of the way in which the FFT algorithm permutes the ordering of the output Fourier coefficients [36]. But, modulo this reordering, the wavenumber ``dimension'' is partitioned into P sets of consecutive wavenumbers, with each set being assigned to a different processor column. The partitioning function in the latitude direction is the same as in the physical domain. See Fig. 1 for an example decomposition.

The spectral domain can also be regarded as two dimensional. For example, for a triangular truncation, the domain is a triangular grid whose axes are wavenumber and degree of associated Legendre polynomial (n). The wavenumber ``dimension'' is partitioned and assigned to processors exactly as for the Fourier domain, i.e. the wavenumbers are reordered, partitioned into consecutive blocks, and assigned to the processor columns. But, unlike the physical and Fourier domains, the remaining dimension in the spectral domain is not partitioned. Instead, all spectral coefficients associated with a given wavenumber are duplicated across all processors in the processor column to which that wavenumber was assigned. It is this duplication that allows the vector sum algorithm described below to be used. Again, see Fig. 1 for an example decomposition.

Note that in a triangular truncation, the number of spectral coefficients associated with a given Fourier wavenumber decreases as the wavenumber increases. Without the reordering of the wavenumbers caused by the FFT, this would cause a noticeable load imbalance, with processor columns associated with larger wavenumbers having very few spectral coefficients. The reordering of the wavenumbers leads to a much better load balance.



next up previous
Next: Parallel Legendre Transform Up: PARALLEL ALGORITHMS Previous: The Spectral Transform



John B. Drake
Wed May 15 09:51:22 EDT 1996