      PROGRAM M7
c     MPI version
c
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
c
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
c
c     Benchmark #7 -- Bit Twiddle
c
c
c     The Problem:
c
c     Given {A(i)}, a binary stream of length L, let {B(i)} and
c     {C(i)} be binary streams formed as follows:
c        {B(i)} = A(0),A(1),A(2),A(3),A(4),A(11),A(12),...etc.
c        {C(i)} = A(5),A(6),A(7),A(8),A(9),A(16),A(17),...etc.
c
c     i.e., to form the {B(i)} stream, take 5 from {A(i)}, drop 6,
c           take 5, drop 6, etc.,
c     and to form the {C(i)} stream, drop 5 from {A(i)}, take 5,
c           drop 6, take 5, drop 6, and so on (dropping 5 only on
c           the first instance).
c
c     Calculate the stream {D(i)} defined as
c
c        D(i) = (C(i) ^ B(i+1)) + (~C(i) ^ B(i-1))
c
c     where
c        ~X  = the complement of X
c        XOR = exclusive "or" function
c        ^   = "and" function
c        +   = can be integer add, exclusive "or", or inclusive "or".
c              They are all equivalent here.
c
c     Then calculate {E(i)} where
c
c        E(i) = C(i) XOR D(i) XOR C(i+37) XOR D(i+37) XOR
c               C(i+100) XOR D(i+100).
c
c     Now locate all sequences of zeros of length greater than 100
c     in bit stream {E(i)}, and output, for each sequence of zeros,
c     the starting position of the sequence in the E stream and the
c     length of the sequence.
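c
c     Worked index arithmetic: a sketch derived from the description
c     above, assuming 0-based stream positions as in the A(0),A(1),...
c     listing.  The routines S7 and P7 may organize the computation
c     differently; the names q, r and F below are illustrative only.
c
c        Each take-5/drop-6 cycle spans 11 bits of {A(i)}, so for
c        i >= 0, with q = i/5 (integer division) and r = mod(i,5),
c
c           B(i) = A(11*q + r)
c           C(i) = A(11*q + r + 5)
c
c        e.g. B(5) = A(11), C(0) = A(5), C(5) = A(16).
c
c        The definition of D(i) is a bitwise select on C(i):
c
c           C(i) = 1  gives  D(i) = B(i+1)
c           C(i) = 0  gives  D(i) = B(i-1)
c
c        At most one of the two "and" terms can be nonzero, which is
c        why add, exclusive "or" and inclusive "or" all agree for "+".
c
c        Writing F(i) = C(i) XOR D(i), the E stream is
c
c           E(i) = F(i) XOR F(i+37) XOR F(i+100)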
c
c     Parameter: NUMBUFS = 100
c
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
c
c     Main Program for Benchmark #7
c
c     Call time parameter:
c
c        NUMBUFS = Number of buffers of the A stream to process
c                  Default = 100 buffers =  72089600000 bits
c                  Maximum = 200 buffers = 144179200000 bits
c
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
c
      IMPLICIT INTEGER (A-Z)
c
#include "bench7.h"
#include <mpif.h>
c
      INTEGER (kind=8) A, START
c
c     Maximum length of the A stream buffer in 64 bit words
c
c     PARAMETER (BUFSIZE = 11 264 000)
c
c     NOTE: Changing this buffer size may result in errors in
c     the check routine C7 as the answers may not be in expected
c     locations with a different buffer size.
c
c     Minimum length of zeros to report.  The following code assumes
c     that NZ > 74.  See the notes in routine P7 on the relationship
c     between the offsets and NZ.
c
      PARAMETER (NZ = 100)
c
c     Default/max number of buffers of A
      PARAMETER (DEFBUFS = 100)
      PARAMETER (MAXBUFS = 200)
c
c     max number of PEs
      PARAMETER (MAXPES = 64)
c
      DIMENSION A(BUFSIZE)
c     Arrays to hold answers - MAXANS defined in bench7.h
      DIMENSION START(MAXANS), LENGTH(MAXANS), OK(MAXANS)
c
c     Error flag array
      INTEGER IER(2)
      external IARGC, GETARG
c     Array for run time parameters
      CHARACTER*80 ARG
c
c     Variables for timing purposes
      REAL (kind=8) CPUTIME,CSET,CRUN,WALL,WSET,WRUN,X0,Y0
c
      INTEGER PE0
      INTEGER LOCNN(MAXPES)
      INTEGER DISPL(MAXPES)
c
      integer status(MPI_STATUS_SIZE)
c
      call MPI_INIT(ierror)
      call MPI_COMM_RANK(MPI_COMM_WORLD, mype, ierror)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, npes, ierror)
c
c
#ifdef SW_LEADZ
      CALL INITLZ
#endif
c
#ifdef SW_POPCNT
      CALL INITPC
#endif
c
c
c     Get input parameter - number of buffers of A-stream
c
      NUMBUFS = 0
c
#ifdef CRAY
      numargs = ipxfargc()
      if (numargs.ge.1) then
         call pxfgetarg(1, arg, d1,d2)
         read(arg,'(I5)') NUMBUFS
      endif
#else
c     Not-Cray
      numargs = iargc()
      if (numargs.ge.1) then
         call getarg(1, arg)
         read(arg,'(I5)') NUMBUFS
      endif
#endif
c
      IF (NUMBUFS .LE. 0) NUMBUFS = DEFBUFS
      IF (NUMBUFS .GT. MAXBUFS) NUMBUFS = MAXBUFS
c
      IF ( NPES .GT. NUMBUFS ) THEN
         IF (MYPE .EQ. 0)
     *      PRINT 5, npes, NUMBUFS
c        I4 fields: NUMBUFS can be as large as MAXBUFS = 200
    5    FORMAT ("Sorry, you can't run with npes=", I4,
     *           " > numbufs=", I4)
         CALL MPI_FINALIZE(ierror)
         CALL EXIT(1)
      ENDIF
c
c     SEED for the random number generator in S7
      ISEED = 99907
c
c     Initialize timing variables
      CSET=0.0
      WSET=0.0
      CRUN=0.0
      WRUN=0.0
c
c     Initialize count of zero-runs
      NN = 0
c
c     Distribute the NUMBUFS buffers evenly among the npes PEs
c
      ntasks = numbufs / npes
      extra = numbufs - ntasks * npes
      myfirst = mype * ntasks + 1 + min ( mype, extra )
      mylast = (mype+1) * ntasks + min ( mype+1, extra )
c
c     Generate and process one buffer at a time - avoid I/O
c
      DO 1000 BUFNUM = MYFIRST, MYLAST
c
c        Generate bitstream A
         X0 = CPUTIME()
         Y0 = WALL()
c
#ifdef Debug
         print *, 'call S7', mype, bufnum
#endif
         CALL S7 ( A, BUFNUM, ISEED )
#ifdef Debug
         print *, 'return S7', mype, bufnum
#endif
c
         X0 = CPUTIME() - X0
         Y0 = WALL() - Y0
         CSET = CSET + X0
         WSET = WSET + Y0
c
c        Time the actual work being done in subroutine P7
         X0 = CPUTIME()
         Y0 = WALL()
c
#ifdef Debug
         print *, 'call P7', mype, bufnum, nn
#endif
         CALL P7 ( A, BUFNUM, MYFIRST, MYLAST, START, LENGTH, NN,
     *             mype, npes )
#ifdef Debug
         print *, 'return P7', mype, bufnum, nn
#endif
c
         X0 = CPUTIME() - X0
         Y0 = WALL() - Y0
         CRUN = CRUN + X0
         WRUN = WRUN + Y0
c
 1000 CONTINUE
c
      IF (NPES .GT. 1) THEN
c
c        Collect all results to PE0
         PE0 = 0
c
c        First get counts NN from each PE into array LOCNN on PE0
         CALL MPI_GATHER ( NN, 1, MPI_INTEGER,
     *                     LOCNN, 1, MPI_INTEGER,
     *                     PE0, MPI_COMM_WORLD, ierror )
c
c        Now accumulate results in START and LENGTH on PE0
c        Set up array of offsets for variable gather
         IF (MYPE .EQ. 0) THEN
            DISPL(1) = 0
            DO I = 2, NPES
               DISPL(I) = DISPL(I-1) + LOCNN(I-1)
            ENDDO
            GLOBNN = DISPL(NPES) + LOCNN(NPES)
         ENDIF
c
         CALL MPI_GATHERV (
     *        START, NN, MYMPI_INTEGER8,
     *        START, LOCNN, DISPL, MYMPI_INTEGER8,
     *        PE0, MPI_COMM_WORLD, ierror )
c
         CALL MPI_GATHERV (
     *        LENGTH, NN, MPI_INTEGER,
     *        LENGTH, LOCNN, DISPL, MPI_INTEGER,
     *        PE0, MPI_COMM_WORLD, ierror )
c
      ENDIF
c
c     Now results are in order on PE0, which checks and prints
c
      IF (MYPE .EQ. 0) THEN
c
         IF (NPES .GT. 1) THEN
c
c           FIRST check for overlap at boundary between PEs
c           If found, combine overlapping intervals and slide arrays
c           down 1 place
c           Must loop backwards to keep boundaries in right place
c
            DO I = NPES, 2, -1
               IF ( START(1+DISPL(I)) .LE.
     *              (START(DISPL(I)) + LENGTH(DISPL(I))) ) THEN
                  LENGTH(DISPL(I)) =
     *                 START(1+DISPL(I)) +
     *                 LENGTH(1+DISPL(I)) - START(DISPL(I))
                  DO J = DISPL(I)+2, GLOBNN
                     START(J-1) = START(J)
                     LENGTH(J-1) = LENGTH(J)
                  ENDDO
                  GLOBNN = GLOBNN - 1
               ENDIF
            ENDDO
c
            NN = GLOBNN
c
         ENDIF ! npes>1
c
c        Check the results
         CALL C7 ( NUMBUFS, START, LENGTH, OK, NN, IER )
c
c        Output results
         CALL R7 ( NUMBUFS, NN, START, LENGTH, OK, IER, NPES,
     *             CSET, WSET, CRUN, WRUN )
c
      ENDIF ! mype = 0
c
      call MPI_FINALIZE(ierror)
c
      STOP
      END
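c
c     Worked example of the buffer distribution in the main program
c     above.  This is an illustration only: the values NUMBUFS = 10
c     and NPES = 4 are assumed here, not taken from a benchmark run.
c
c        ntasks = 10 / 4 = 2,   extra = 10 - 2*4 = 2
c
c        mype   myfirst             mylast                 buffers
c          0    0*2 + 1 + 0 = 1     1*2 + min(1,2) = 3     1 - 3
c          1    1*2 + 1 + 1 = 4     2*2 + min(2,2) = 6     4 - 6
c          2    2*2 + 1 + 2 = 7     3*2 + min(3,2) = 8     7 - 8
c          3    3*2 + 1 + 2 = 9     4*2 + min(4,2) = 10    9 - 10
c
c     The first "extra" PEs receive ntasks+1 buffers and the rest
c     receive ntasks, so the per-PE counts differ by at most one.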