The HPF language, in conjunction with Fortran 90 array features, provides
several ways for the programmer to express parallelism that the pghpf compiler
detects and parallelizes. In addition, the HPF INDEPENDENT directive lets the
programmer specify the degree of dependence, and thus of parallelism, between
iterations of a DO loop or a FORALL. This chapter provides examples showing how
to express parallelism in an HPF program, including examples of the FORALL
statement.
The pghpf compiler treats Fortran array expressions as parallel
expressions. Each node or processor on the parallel system executes its
part of the computation, provided the arrays on the left-hand side of
the expression are distributed. The compiler internally converts an array
construct to an equivalent FORALL statement, which it then parallelizes by
localizing the array indices. For example, the compiler parallelizes the
following Fortran 90 array statement and generates Fortran 77 node code for it:
Y=X+1
Assuming the array Y is distributed, the generated code runs locally on each
processor.
Note that the generated Fortran 77 contains calls to pghpf runtime library
routines. One task of the runtime routines is to generate the bounds of the
index space for the part of each array that resides on the local processor.
For example, on a four-processor system with a 16-element array, such a call
generates different loop bounds depending on the processor on which it is made
and the portion of the array stored on that processor, as shown below:
! Processor 1
do i1 = 1, 4
  y(i1) = x(i1) + 1
enddo
! Processor 2
do i1 = 5, 8
  y(i1) = x(i1) + 1
enddo
! Processor 3
do i1 = 9, 12
  y(i1) = x(i1) + 1
enddo
! Processor 4
do i1 = 13, 16
  y(i1) = x(i1) + 1
enddo
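The bounds listed above are what a BLOCK distribution of 16 elements over four
processors would give. The following sketch reproduces them in standard
Fortran 90. The subroutine block_bounds is a hypothetical helper written for
this illustration; it is not the pghpf runtime routine, and it assumes a BLOCK
distribution with unit stride, which the original text does not state.

```fortran
program block_bounds_demo
  implicit none
  integer, parameter :: n = 16, nprocs = 4
  integer :: p, lo, hi

  ! Print the loop each processor would execute for Y = X + 1.
  do p = 1, nprocs
    call block_bounds(n, nprocs, p, lo, hi)
    print '(a,i0,a,i0,a,i0)', 'processor ', p, ': do i1 = ', lo, ', ', hi
  end do

contains

  ! Hypothetical helper (not a pghpf routine): local bounds owned by
  ! processor p (1-based) when indices 1..nelem are distributed BLOCK
  ! over np processors.
  subroutine block_bounds(nelem, np, p, lo, hi)
    integer, intent(in)  :: nelem, np, p
    integer, intent(out) :: lo, hi
    integer :: blk
    blk = (nelem + np - 1) / np      ! block size, rounded up
    lo  = (p - 1) * blk + 1
    hi  = min(p * blk, nelem)        ! the last block may be short
  end subroutine block_bounds

end program block_bounds_demo
```

If the element count is not a multiple of the processor count, this helper
gives the trailing processors a short or empty range (hi less than lo), so
their DO loops simply execute zero times.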
The WHERE statement is a Fortran 90 statement that conveys parallelism in a
manner similar to the array assignment described in the previous section. The
compiler adds a conditional statement that masks the elements of the array's
index space, so that only the selected elements are assigned a value. For
example, given that X and Y are distributed arrays, the following
WHERE statement produces code similar to the Fortran 77 output shown.
WHERE(X/=0) Y=X

call pghpf_localize_bounds(x$d1,1,1,16,1,i$l,i$u)
do i1 = i$l, i$u
  if (x(i1) .ne. 0) then
    y(i1) = x(i1)
  endif
enddo
The generated code is similar to the node code for an array expression, with
the addition of the conditional within the DO loop.
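To see that the masked loop and the WHERE statement agree, the following
single-processor sketch in standard Fortran 90 runs both forms on the same
data and compares the results. The array contents and the sentinel value -9
are illustrative assumptions; the loop only mirrors the shape of the generated
code and makes no runtime library calls.

```fortran
program where_statement_demo
  implicit none
  integer, parameter :: n = 8
  integer :: x(n), y(n), y_loop(n), i

  x = (/ 3, 0, -1, 0, 7, 0, 0, 2 /)
  y = -9          ! sentinel: elements where x == 0 must keep this value
  y_loop = -9

  ! Array form: assign only where the mask is true.
  where (x /= 0) y = x

  ! Scalar form with the conditional inside the DO loop.
  do i = 1, n
    if (x(i) /= 0) then
      y_loop(i) = x(i)
    end if
  end do

  print '(8i4)', y
  if (any(y /= y_loop)) stop 'mismatch'
end program where_statement_demo
```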
7.2.1 WHERE Construct Parallelism
The WHERE construct is a Fortran 90 construct that conveys parallelism by
applying a conditional mask to a block of statements, and optionally to a
second block (ELSEWHERE) for the elements where the mask condition is false.
Because of the way the WHERE construct is defined, the generated code uses a
temporary array of logicals that holds the mask result for all local array
elements. This logical array is computed before the WHERE block is executed.
WHERE(X/=0)
  X=0
END WHERE
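The role of the mask temporary can be sketched in standard Fortran 90 by
computing the logical array explicitly. In the sketch below the array
contents, the ELSEWHERE branch, and the name mask are illustrative
assumptions; the actual temporary is created by the compiler and is not
visible in the source program.

```fortran
program where_construct_demo
  implicit none
  integer, parameter :: n = 6
  integer :: x(n), x_loop(n), i
  logical :: mask(n)      ! explicit stand-in for the compiler's temporary

  x = (/ 4, 0, -2, 0, 5, 0 /)
  x_loop = x

  ! Construct form. The mask is fixed on entry, so assigning X = 0
  ! inside the block does not change which elements ELSEWHERE selects.
  where (x /= 0)
    x = 0
  elsewhere
    x = 1
  end where

  ! Hand-written equivalent: compute the mask first, then apply it.
  mask = (x_loop /= 0)
  do i = 1, n
    if (mask(i)) x_loop(i) = 0
  end do
  do i = 1, n
    if (.not. mask(i)) x_loop(i) = 1
  end do

  print '(6i3)', x
  if (any(x /= x_loop)) stop 'mismatch'
end program where_construct_demo
```

Without the saved mask, the elements set to zero in the first block would be
picked up again by the second block, which is why the mask has to be
evaluated before any assignment in the construct.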
The FORALL statement specifies a set of index values, optionally restricted by
a mask, together with an assignment that uses those index values. The
computation for the individual index values may be performed in an unspecified
order on a scalar machine, or in parallel on a parallel system. For more
details on the definition of FORALL, refer to The High Performance Fortran
Handbook.
The following example shows a simple masked FORALL.
FORALL(I=1:15, I>5) X(I)=Y(I)
Note that HPF intrinsic functions can be called from the expression part of a
FORALL statement.
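As a minimal, self-contained sketch, the masked FORALL above can be placed in
a complete program. The array size of 16 and the initial values are
assumptions made for illustration; the check confirms that only indices 6
through 15 are assigned.

```fortran
program forall_statement_demo
  implicit none
  integer, parameter :: n = 16
  integer :: x(n), y(n), i

  y = (/ (10*i, i = 1, n) /)
  x = 0

  ! Only indices 6..15 pass the mask I > 5; X(1:5) and X(16) are untouched.
  forall (i = 1:15, i > 5) x(i) = y(i)

  print '(16i4)', x
  if (any(x(1:5) /= 0) .or. x(16) /= 0) stop 'mask not applied'
  if (any(x(6:15) /= y(6:15))) stop 'assignment missing'
end program forall_statement_demo
```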
The FORALL construct provides a parallel mechanism for assigning values to the
elements of arrays. The FORALL construct is interpreted essentially as a series
of single-statement FORALLs.
FORALL (I = 1:3)
  A(I) = D(I)
  B(I) = C(I) * 2
END FORALL
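The reading of the construct as a series of single-statement FORALLs can be
checked with a short sketch. The values given to C and D and the comparison
arrays A2 and B2 are assumptions introduced only for this illustration.

```fortran
program forall_construct_demo
  implicit none
  integer :: a(3), b(3), a2(3), b2(3), c(3), d(3), i

  c = (/ 1, 2, 3 /)
  d = (/ 10, 20, 30 /)

  ! Construct form, as in the example above.
  forall (i = 1:3)
    a(i) = d(i)
    b(i) = c(i) * 2
  end forall

  ! The same work written as two single-statement FORALLs.
  forall (i = 1:3) a2(i) = d(i)
  forall (i = 1:3) b2(i) = c(i) * 2

  if (any(a /= a2) .or. any(b /= b2)) stop 'mismatch'
  print '(3i4)', a, b
end program forall_construct_demo
```

Each statement in the construct completes for all index values before the next
statement starts, so a later statement may safely read what an earlier one
wrote.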
Many of the HPF library routines and intrinsics provide functions that are
executed in parallel. The Fortran 90 array-valued intrinsics also execute in
parallel when possible. Refer to the PGHPF Reference Manual for a list
of the HPF and Fortran 90 intrinsics and the HPF library routines.
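The sketch below applies a few standard Fortran 90 array intrinsics to one
array. It deliberately uses only Fortran 90 intrinsics, so it also compiles
without an HPF compiler; the array contents and the BLOCK distribution
directive are assumptions for illustration, and whether a given intrinsic
actually runs in parallel depends on the compiler and on how the array is
distributed.

```fortran
program array_intrinsics_demo
  implicit none
  integer, parameter :: n = 8
  integer :: x(n), i
  ! HPF directive; a plain Fortran compiler treats this line as a comment.
!HPF$ DISTRIBUTE x(BLOCK)

  x = (/ (i, i = 1, n) /)

  print *, 'sum    =', sum(x)        ! reduction over all elements
  print *, 'maxval =', maxval(x)     ! largest element
  print *, 'count  =', count(x > 4)  ! number of elements passing a mask
  print *, 'cshift =', cshift(x, 1)  ! circular shift by one position
end program array_intrinsics_demo
```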
Many of the standard 3F library routines are supported on platforms running
pghpf. The include file lib3f.h supports calls to these routines: a program
that uses 3F procedures must contain the statement INCLUDE "lib3f.h". In
particular, programs that call getarg() or iargc() under pghpf require
this INCLUDE statement.