7 pghpf Parallelism

The HPF language, in conjunction with Fortran 90 array features, provides several ways for the programmer to express parallelism that the pghpf compiler detects and parallelizes. With the HPF INDEPENDENT directive, the programmer can specify the degree of dependence, and thus of parallelism, between the iterations of a DO loop or a FORALL. This chapter provides examples showing how to express parallelism in an HPF program, including examples of the FORALL statement.

7.1 Array Assignment Parallelism

The pghpf compiler treats Fortran array expressions as parallel expressions. If the arrays on the left-hand side of the expression are distributed, each node or processor on the parallel system executes its part of the computation. Array constructs are internally converted to an equivalent FORALL statement, which is then parallelized by localizing the array indices. For example, the compiler parallelizes the following Fortran 90 array statement and generates Fortran 77 code for it:
	Y=X+1
Assuming the array Y is distributed, the generated code runs locally on each processor.
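
For context, a complete source program around this statement might look like the following sketch. The array size of 16, the BLOCK distribution, and the program name are assumptions chosen for illustration; they are not taken from the compiler documentation. Because HPF directives are written as !HPF$ comment lines, a standard Fortran compiler ignores them and runs the program serially.

```fortran
program array_assign
  implicit none
  integer, parameter :: n = 16          ! assumed array size
  integer :: x(n), y(n), i
  ! Assumed mapping: x is divided into contiguous blocks across the
  ! processors, and y is aligned element-for-element with x.
!HPF$ DISTRIBUTE x(BLOCK)
!HPF$ ALIGN y(:) WITH x(:)

  x = (/ (i, i = 1, n) /)               ! x = 1, 2, ..., 16

  ! Array assignment: each processor updates the elements of y it owns.
  y = x + 1

  print *, y
end program array_assign
```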

Note that calls to the pghpf runtime library routines appear in the generated Fortran 77. Among other tasks, the runtime routines generate the bounds of the index space for the portion of each array that resides on the local processor. For example, on a four-processor system, such a call generates different loop bounds depending on the processor on which the call is made and on the portion of the array stored on that processor, as shown below:


! Processor 1
do i1 = 1, 4
   y(i1) = x(i1) + 1
enddo


! Processor 2
do i1 = 5, 8
   y(i1) = x(i1) + 1
enddo


! Processor 3
do i1 = 9, 12
   y(i1) = x(i1) + 1
enddo


! Processor 4
do i1 = 13, 16
   y(i1) = x(i1) + 1
enddo
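
The per-processor bounds above are what results from dividing a 16-element index space into four contiguous blocks. The following sketch reproduces them in plain Fortran 90. The subroutine name block_bounds and its arguments are hypothetical; this is not the pghpf runtime interface, only an illustration of the arithmetic for a BLOCK distribution, assuming a block size of the element count divided by the processor count, rounded up.

```fortran
program block_bounds_demo
  implicit none
  integer, parameter :: nelem = 16, nprocs = 4
  integer :: p, lo, hi

  ! Print the loop bounds each processor would use for its local block.
  do p = 1, nprocs
     call block_bounds(nelem, nprocs, p, lo, hi)
     print '(a,i0,a,i0,a,i0)', 'Processor ', p, ': do i1 = ', lo, ', ', hi
  end do

contains

  ! Hypothetical helper: bounds of the block of the index space 1..n
  ! owned by processor p (numbered from 1) out of np processors.
  subroutine block_bounds(n, np, p, lo, hi)
    integer, intent(in)  :: n, np, p
    integer, intent(out) :: lo, hi
    integer :: blk
    blk = (n + np - 1) / np      ! block size, rounded up
    lo  = (p - 1) * blk + 1
    hi  = min(p * blk, n)        ! last block may be short
  end subroutine block_bounds

end program block_bounds_demo
```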

7.2 WHERE Statement Parallelism

The WHERE statement is a Fortran 90 statement that conveys parallelism in a manner similar to the array assignment described in the previous section. The compiler adds a conditional statement to mask the elements of the array's index space that are assigned (or not assigned) a particular value. For example, given that X and Y are distributed arrays, the following WHERE statement produces code similar to the Fortran 77 output shown.
	WHERE(X/=0) Y=X

	call pghpf_localize_bounds(x$d1,1,1,16,1,i$l,i$u)
	do i1 = i$l, i$u
	   if (x(i1) .ne. 0) then
	      y(i1) = x(i1)
	   endif
	enddo
The generated code is similar to the node code for an array expression, with the addition of the conditional within the DO loop.

7.2.1 WHERE Construct Parallelism

The WHERE construct is a Fortran 90 construct that conveys parallelism by applying a conditional mask to a block of statements and, optionally, the complementary mask to an alternative block. Because of the way the WHERE construct is defined, the generated code uses a temporary array of logicals that holds the mask result for all local array elements. This logical array is computed before the WHERE block is executed.
        WHERE(X/=0)
          X=0
        END WHERE
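
The need for the logical temporary can be seen in a sketch like the one below, where the block modifies the very array that the mask tests. The arrays, their values, and the explicit mask variable are illustrative assumptions; the hand-written loops only approximate what a compiler might generate. Because the mask is evaluated once, before the block, Y is still set to 1 wherever X was originally nonzero, even though X has already been zeroed by the preceding statement.

```fortran
program where_mask
  implicit none
  integer, parameter :: n = 8
  integer :: x(n), y(n), x2(n), y2(n), i
  logical :: mask(n)       ! explicit stand-in for the compiler temporary

  x  = (/ 3, 0, 5, 0, 0, 7, 1, 0 /)
  y  = -1
  x2 = x
  y2 = y

  ! WHERE construct: the mask x /= 0 is evaluated once, up front.
  where (x /= 0)
     x = 0
     y = 1
  elsewhere
     y = 2
  end where

  ! Hand-written equivalent using an explicit mask array.
  mask = (x2 /= 0)
  do i = 1, n
     if (mask(i)) x2(i) = 0
  end do
  do i = 1, n
     if (mask(i)) y2(i) = 1
  end do
  do i = 1, n
     if (.not. mask(i)) y2(i) = 2
  end do

  ! Prints T when both versions produce the same arrays.
  print *, all(x == x2) .and. all(y == y2)
end program where_mask
```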

7.3 FORALL Statement and Construct Parallelism

The FORALL statement allows specification of a set of index values and an assignment expression utilizing the index values (or using a masked subset of the index values). The computation involving the index values for the assignment expression may be performed in an unspecified order on a scalar machine, or in parallel on a parallel system. For more details on the definition of FORALL, refer to The High Performance Fortran Handbook. The following example shows a simple masked FORALL.
FORALL(I=1:15, I>5) X(I)=Y(I)
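
As a rough illustration of what the masked FORALL above computes, the following sketch compares it with an ordinary DO loop containing an IF test. The array contents and the reference array xref are assumptions made for the example. The DO loop fixes an iteration order, whereas the FORALL leaves the order unspecified; for this assignment the results are the same either way.

```fortran
program forall_mask
  implicit none
  integer, parameter :: n = 15
  integer :: x(n), y(n), xref(n), i

  y    = (/ (10 * i, i = 1, n) /)
  x    = 0
  xref = 0

  ! Masked FORALL: only indices 6 through 15 are assigned.
  forall (i = 1:15, i > 5) x(i) = y(i)

  ! Serial reference: the same assignments in a fixed order.
  do i = 1, 15
     if (i > 5) xref(i) = y(i)
  end do

  ! Prints T when the FORALL and the loop agree.
  print *, all(x == xref)
end program forall_mask
```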

Note that HPF intrinsic functions can be called from the expression part of a FORALL statement.

The FORALL construct provides a parallel mechanism for assigning values to the elements of arrays. The FORALL construct is interpreted essentially as a series of single-statement FORALLs.

	FORALL (I = 1:3) 
		A(I) = D(I)
		B(I) = C(I) * 2
	END FORALL
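
The interpretation as a series of single-statement FORALLs matters when one statement in the construct reads an array that an earlier statement writes. The sketch below uses hypothetical arrays and values, chosen so that the dependence is visible: the first assignment completes for every index before the second begins, and the right-hand side of each assignment is evaluated for all indices before any element is stored.

```fortran
program forall_construct
  implicit none
  integer :: a(4), b(4), a2(4), b2(4), i

  a  = (/ 1, 2, 3, 4 /)
  b  = 0
  a2 = a
  b2 = b

  ! FORALL construct with two assignments.
  forall (i = 2:4)
     a(i) = a(i-1) + 10     ! reads the old values of a for every i
     b(i) = a(i) * 2        ! reads the new values of a for every i
  end forall

  ! The same computation as two single-statement FORALLs.
  forall (i = 2:4) a2(i) = a2(i-1) + 10
  forall (i = 2:4) b2(i) = a2(i) * 2

  ! Prints T when both forms produce the same arrays.
  print *, all(a == a2) .and. all(b == b2)
end program forall_construct
```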

7.4 Library Routines and Intrinsics

Many of the HPF library routines and intrinsics provide functions that are executed in parallel. The Fortran 90 array-valued intrinsics also execute in parallel when possible. Refer to the PGHPF Reference Manual for a list of the HPF and Fortran 90 intrinsics and the HPF library routines.

7.5 3F Library Routines

Many of the standard 3F library routines are supported on platforms running pghpf. An include file named lib3f.h is provided to support the use of these routines. The statement INCLUDE "lib3f.h" is required when using 3F procedures.

Programs that use getarg() or iargc() in pghpf require the INCLUDE statement.