This chapter covers two topics that pghpf users often have questions about. These topics are HPF features whose use is not clearly defined in The High Performance Fortran Handbook.
A .mod file is currently searched for in the directories specified as follows:
1. The directory containing the file that contains the USE statement.
2. Each -I directory specified on the command-line.
3. The standard include area.
Using the -I command line option module search directories can be added to the module search path.
MODULE B
CONTAINS
FUNCTION G
.
.
.
CALL H
END FUNCTION G
SUBROUTINE H
.
.
.
END SUBROUTINE H
END MODULE B MODULE C
CONTAINS
SUBROUTINE H
.
.
.
END SUBROUTINE H
FUNCTION G
.
.
.
CALL H
END FUNCTION G
END MODULE C
For example, the code below defines an interface to the DOTP_BLK procedure:
DOTP (N, X, Y, A)The called local Fortran 77 procedure (or a C routine with matching arguments) may use the underlying communication primitives upon which the HPF runtime is based, or may use the generic pghpf send and receive routines. The set of generic routines listed in this section may be expanded in the future and is supported on all systems.
USE HPF_LIBRARY
INTEGER*4 N
REAL*8 X(N), Y(N), A
C
!HPF$ DISTRIBUTE (BLOCK) :: X
!HPF$ ALIGN (:) WITH X(:) :: Y
C
INTERFACE
EXTRINSIC (F77_LOCAL) SUBROUTINE DOTP_BLK
& (RANK, SHAPE, N, X, Y, A)
INTEGER*4, INTENT(IN) :: RANK
INTEGER*4, INTENT(IN) :: SHAPE(RANK)
INTEGER*4, INTENT(IN) :: N
REAL*8, INTENT(IN) :: X(N)
REAL*8, INTENT(IN) :: Y(N)
REAL*8, INTENT(OUT) :: A
!HPF$ DISTRIBUTE (BLOCK) :: X
!HPF$ ALIGN (:) WITH X(:) :: Y
END SUBROUTINE DOTP_BLK
END INTERFACE
C
CALL DOTP_BLK (SIZE(PROCESSORS_SHAPE()), PROCESSORS_SHAPE(),
& N, X, Y, A)
C
RETURN
END
The called local routine DOTP_BLK is shown below. Note, since the data is distributed, the called routine must determine which data it owns, and handle the communications and computations on that portion of the data.
DOTP_BLK computes the dot product of global vectors X and Y and returns the result in A on each processor. Each processor determines which portions of X and Y it owns, computes the dot product of the local portion, and then performs the necessary communication to complete the dot product on each processor.
CWhen using EXTRINSIC(F77_LOCAL), the extrinsic is an Fortran 77 program unit and must be compiled using Fortran 77 compiler rather than with pghpf. A .o file produced by compiling the extrinsic with the -c option can then be linked with the HPF calling program by including it on the pghpf link line.
C D O T P R O D U C T
C
C PURPOSE:
C To compute the dot product of two block-distributed
C double-precision vectors.
C
SUBROUTINE DOTP_BLK (RANK, SHAPE, N, X, Y, A)
INTEGER*4 RANK, SHAPE(RANK), N
REAL*8 X(*), Y(*), A
C
INCLUDE '/usr/pgi/arch/include/pglocal.f'
INTEGER PGHPF_MYPROCNUM, PGHPF_NPROCS
EXTERNAL PGHPF_MYPROCNUM, PGHPF_NPROCS
INTEGER MAXCPUS
PARAMETER (MAXCPUS = 2048)
INTEGER MYCPU, NCPUS, COORD(7)
INTEGER I, J
DOUBLE PRECISION TA(0:2047)
C
C Get my processor number and number of processors.
C
MYCPU = PGHPF_MYPROCNUM()
NCPUS = PGHPF_NPROCS()
C
C Determine processor arrangement information.
C
CALL PGHPF_PROCNUM_TO_COORD (MYCPU, RANK, SHAPE, COORD)
C
C Check for error conditions.
C
IF (RANK .NE. 1) THEN
PRINT *, "DOTP: Processor arrangement must be of rank 1"
STOP
ENDIF
IF (N .LE. 0) RETURN
C
IF (SHAPE(1) .GT. MAXCPUS) THEN
PRINT *, "DOTP: Number of CPUs must be less than:",MAXCPUS+1
STOP
ENDIF
C
C Determine how many elements reside on this processor
C
BLKSZ = (N + SHAPE(1) - 1) / SHAPE(1)
MYCT = MIN((N - MYCPU * BLKSZ), BLKSZ)
MYCT = MAX(MYCT,0)
C
C Allocate an array to hold the intermediate results and do the
C local dot product
C
TA(MYCPU) = 0.0D0
DO I = 1, MYCT
TA(MYCPU) = TA(MYCPU) + X(I) * Y(I)
ENDDO
C
C Broadcast the results to all other processors
C
IF (SHAPE(1) .GT. 1) THEN
DO I = 0, SHAPE(1) - 1
IF (I .EQ. MYCPU) THEN
DO J = 0, SHAPE(1) - 1
IF (J .NE. MYCPU) THEN
CALL PGHPF_CSEND (J,TA(MYCPU),1,1,PGLCL_REAL8)
ENDIF
ENDDO
ELSE
CALL PGHPF_CRECV (I,TA(I),1,1,PGLCL_REAL8)
ENDIF
ENDDO
ENDIF
C
C Complete global sum of intermediate results
C
A = 0.0D0
DO I = 0, SHAPE(1) - 1
A = A + TA(I)
ENDDO
C
RETURN
END
For example, if DOTP_BLK.F is the local Fortran 77 routine, compile it as follows:
% pgf77 -c DOTP_BLK.FThen compile the HPF main program and link in the extrinsic as follows:
% pghpf DOTP.hpf DOTP_BLK.oIf desired, the local routine can be compiled using pghpf with the -Mnohpfc option present on the compile line:
% pghpf -Mnohpfc -c DOTP_BLK.F
% pghpf DOTP.hpf DOTP_BLK.o
C interface:
int __hpf_nprocs()Fortran interface:
nprocs = __hpf_nprocs()
integer pghpf_nprocs
external pghpf_nprocs
nprocs = pghpf_nprocs()
C interface:
int __hpf_myprocnum()Fortran interface:
myprocnum = __hpf_myprocnum()
integer pghpf_myprocnum
external pghpf_myprocnum
myprocnum = pghpf_myprocnum()
void __hpf_procnum_to_coordFortran interface:
(int procnum, int rank, int *shape, int *coord)
integer procnum, rank, shape(rank), coord(rank)The rank and shape arguments describe the processor grid. The pghpf processor number given by procnum is translated to grid coordinates returned in coord. Grid coordinates are integers between 1 and the size of the corresponding grid dimension. If the processor number is outside the bounds of the processor grid, zeroes are returned in coord.
call pghpf_procnum_to_coord(procnum, rank, shape, coord)
int __hpf_coord_to_procnum(int rank, int *shape, int *coord)Fortran interface:
integer procnum, rank, shape(rank), coord(rank)The rank and shape arguments describe the processor grid. The processor grid coordinates in coord are translated to a pghpf processor number. Grid coordinates are integers between 1 and the size of the corresponding grid dimension. If the coordinates are outside the bounds of the processor grid, -1 is returned.
integer pghpf_coord_to_procnum
external pghpf_coord_to_procnum
procnum = pghpf_coord_to_procnum(rank, shape, coord)
The data types for the generic local communications routines for the C interface are defined in the file /usr/pgi/arch/include/pglocal.h. (where arch is your system's architecture i.e. SPARC, SGI, or some other system). The data types for the Fortran interface are defined in /usr/pgi/arch/include/pglocal.f.
C interface:
void __hpf_csend(int cpu, void *adr, int cnt, int str, int typ) void __hpf_crecv(int cpu, void *adr, int cnt, int str, int typ)Fortran interface:
integer cpu, cnt, str, typThe cpu argument is the pghpf processor number for the remote partner, adr is the local data address, cnt is the number of data items to transfer, typ is the data type, and str is the stride between each item in the local array (in item units).
integer adr(*)
call pghpf_csend(cpu, adr, cnt, str, typ)
call pghpf_crecv(cpu, adr, cnt, str, typ)
Fortran interface:
integer cpu, cnt, strThe cpu argument is the pghpf processor number for the remote partner, adr is the local data address, cnt is the number of character items to transfer, and str is the stride between each item in the local character array (in item units). Each character item is a fixed-length sequence of characters.
character*(*) adr(*)
call pghpf_csendchar(cpu, adr, cnt, str)
call pghpf_crecvchar(cpu, adr, cnt, str)
int __hpf_tid(int procnum)Fortran interface:
tid = __hpf_tid(procnum)
integer pghpf_tidTranslates the pghpf processor number to the processor identifier used by MPI.
external pghpf_tid
itid = pghpf_tid(iprocnum)
int __hpf_tid(int procnum)Fortran interface:
tid = __hpf_tid(procnum)
integer pghpf_tidTranslates the pghpf processor number to the processor identifier used by PVM (the tid).
external pghpf_tid
itid = pghpf_tid(iprocnum)