CCM/MP-2D SP3 Parallel Algorithms

Performance Studies using CCM/MP-2D


IBM SP3-200 Winterhawk I Parallel Algorithms for CCM/MP-2D

(Results based on June 1999 PSTSWM Experiments)

The IBM SP3 is a distributed-memory parallel architecture that uses high-end workstation-class processors interconnected by a bidirectional multistage switch. The SP3 used in these experiments was the first stage of a larger machine being sited at Oak Ridge National Laboratory. We used the results of our June 1999 studies of PSTSWM performance to identify the following parallel algorithms for CCM/MP-2D on the SP3-200.

When referring to the parallel algorithms and their implementations, we use the following shorthand:

The parallel algorithms chosen for the SP3-200 experiments are listed below. Ten different algorithms were examined, four distributed FFT/distributed Legendre transform algorithms:

d0: (df0, dl0, ds0, lb0)
d1: (df1, dl1, ds1, lb1)
d2: (df2, dl1, ds0, lb0)
d3: (df1, dl1, ds2, lb2)

and six transpose FFT/distributed Legendre transform algorithms:

t0: (tf0, dl0, ds0, lb0)
t1: (tf1, dl1, ds0, lb0)
t2: (tf2, dl1, ds0, lb0)
t4: (tf0, dl1, ds0, lb0)
t5: (tf1, dl1, ds1, lb1)
t6: (tf2, dl1, ds2, lb2)

where the codes for the individual parallel algorithms are as follows:

Distributed FFT

Transpose FFT

Distributed Legendre Transform

Distributed semi-Lagrangian

Physics load balancing
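
For readers who want to work with the ten algorithm tuples listed earlier on this page, the following is a minimal sketch of one way to encode them as data. It assumes nothing about PSTSWM or CCM/MP-2D beyond the tuples themselves: the names `Algorithm`, `ALGORITHMS`, and `family` are hypothetical and used only for illustration, and the grouping simply reads the FFT code prefix (`df` or `tf`).

```python
from collections import namedtuple

# Hypothetical container; the field names are illustrative and are not
# identifiers from PSTSWM or CCM/MP-2D.
Algorithm = namedtuple(
    "Algorithm", ["fft", "legendre", "semi_lagrangian", "load_balance"]
)

# The ten algorithm tuples exactly as listed on this page.
ALGORITHMS = {
    "d0": Algorithm("df0", "dl0", "ds0", "lb0"),
    "d1": Algorithm("df1", "dl1", "ds1", "lb1"),
    "d2": Algorithm("df2", "dl1", "ds0", "lb0"),
    "d3": Algorithm("df1", "dl1", "ds2", "lb2"),
    "t0": Algorithm("tf0", "dl0", "ds0", "lb0"),
    "t1": Algorithm("tf1", "dl1", "ds0", "lb0"),
    "t2": Algorithm("tf2", "dl1", "ds0", "lb0"),
    "t4": Algorithm("tf0", "dl1", "ds0", "lb0"),
    "t5": Algorithm("tf1", "dl1", "ds1", "lb1"),
    "t6": Algorithm("tf2", "dl1", "ds2", "lb2"),
}


def family(name):
    """Classify an algorithm by the prefix of its FFT code."""
    prefix = ALGORITHMS[name].fft[:2]
    return {"df": "distributed", "tf": "transpose"}[prefix]


# Group the algorithm names by FFT family.
distributed = sorted(n for n in ALGORITHMS if family(n) == "distributed")
transpose = sorted(n for n in ALGORITHMS if family(n) == "transpose")

print("distributed FFT:", distributed)
print("transpose FFT:", transpose)
```

Keeping the tuples in a single table like this makes it easy to select, for example, every algorithm that uses a given load-balancing code when comparing timing results.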


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:17:29 EDT.