The Portland Group 9150 SW Pioneer Court, Suite H Wilsonville, Oregon 97070
While every precaution has been taken in the preparation of this document, The
Portland Group, Inc. makes no warranty for the use of its products and assumes
no responsibility for any errors which may appear, or for damages resulting
from the use of the information contained herein. The Portland Group, Inc.
retains the right to make changes to this information at any time, without
notice. The software described in this document is distributed under license
from The Portland Group, Inc and may be used or copied only in accordance with
the terms of the license agreement. No part of this document may be reproduced
or transmitted in any form or by any means, for any purpose other than the
purchaser's personal use without the express written permission of The Portland
Group, Inc. Commercial uses are strictly prohibited.
PGI, pghpf, pgf77, pgcc, pgprof, and pgdbg
are trademarks of The Portland Group, Inc. Other brands and names are the
property of their respective owners.
pghpf Version 2.0 Release Notes Copyright (c) 1995 The Portland Group,
Inc. All rights reserved.
Printed in the United States of America
Printing History
October 1995: First Printing
January 1996: Second Printing
Part Number: 2401-990-990-1095
Phone: (503) 682-2806
Fax: (503) 682-2637
e-mail: trs@pgroup.com
This document describes important issues relating to pghpf Version 2.0.
This section briefly lists the features of pghpf Version 2.0.
Most features of full HPF are included, with the few exceptions noted in these
release notes. If you encounter any HPF feature that is not supported, and not
listed in Section 6, "Restrictions and Omissions - pghpf 2.0", you
should consider it a bug and report it to the e-mail address
trs@pgroup.com. Supported HPF language features include:
- Multi-dimensional block-cyclic distributions, including support for
BLOCK(k) and CYCLIC(k) distributions.
- Number of processors determined at run time.
- Parallelization of array assignments, FORALL statements and
FORALL constructs.
- Partial support for the INDEPENDENT directive for DO
loops. Calls to procedures within independent loops are inlined and
parallelized (if the -Minline option is supplied).
- Parallelization of parallelizable DO loops which operate on
distributed data using the -Mautopar option.
- Support for Fortran 77 sequence/storage association by directive or
compiler option.
- Support for array constructors.
- Support for all HPF_LIBRARY routines.
- Extrinsic support for Fortran 77 local routines through
EXTRINSIC(F77_LOCAL).
- Support for the HPF directives REALIGN, REDISTRIBUTE and
DYNAMIC.
- Compiler informational messages using the command line option
-Minfo.
- Enhanced support for MODULEs.
- Support for the Fortran 90 features: Derived types, the SELECT
CASE construct, and the KIND keyword.
Other features include:
- A self-contained compile-and-go usage model, along with support for the
widely used communication packages PVM and MPI.
- A cpp-like preprocessing capability that allows conditional compilation.
- Limited support for CM Fortran compatibility. Where there is no chance for
ambiguity, pghpf supports both the CM Fortran and HPF spellings of
common Fortran 90 features (refer to the option -Mcmf for details).
1.1 Release 2.0 Changes from Previous Release 1.3
Fortran 90 Additions and Changes
Fortran 90 derived-types are supported in pghpf 2.0. The Fortran
keywords TYPE and END TYPE are supported.
The Fortran CASE construct is now supported. The keywords
SELECT CASE, CASE, CASE DEFAULT
and END SELECT are now supported.
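As an illustration of these two additions, the following is a minimal sketch that classifies a point by quadrant. The program name QUADRANT, the type POINT, and the variables P and QUAD are hypothetical names chosen only for this example. Statements begin in column 7 and comments in column 1, so the text should be acceptable in either fixed or free source form.

```fortran
! Sketch only: QUADRANT, POINT, P and QUAD are hypothetical names.
      PROGRAM QUADRANT
      IMPLICIT NONE
      TYPE POINT
         REAL :: X, Y
      END TYPE POINT
      TYPE (POINT) :: P
      INTEGER :: QUAD
! Give P its value with a structure constructor, not a DATA statement.
      P = POINT(-1.0, 2.0)
! Encode the signs of X and Y as an integer from 0 to 3.
      QUAD = 0
      IF (P%X .LT. 0.0) QUAD = QUAD + 1
      IF (P%Y .LT. 0.0) QUAD = QUAD + 2
      SELECT CASE (QUAD)
      CASE (0)
         PRINT *, 'upper right'
      CASE (1)
         PRINT *, 'upper left'
      CASE (2)
         PRINT *, 'lower right'
      CASE DEFAULT
         PRINT *, 'lower left'
      END SELECT
      END PROGRAM QUADRANT
```

With the values shown, X is negative and Y is positive, so control reaches CASE (1).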
Support for MODULEs has been significantly enhanced from that
available in the previous version of pghpf. Generic procedures are now
supported. For details refer to Section 8, "Modules".
The random number generator intrinsics RANDOM_NUMBER and
RANDOM_SEED have been rewritten to provide, for a given seed,
including the default seed, a generated sequence that is independent of the
platform and number of processors used. The new, pghpf 2.0 random number
intrinsics should be much faster than those provided with pghpf 1.3, and
they replace all patch versions sent out subsequent to pghpf 1.3. For
details refer to Section 11, "Random Number Generation".
A program not containing a PROGRAM statement will now have a
PROGRAM statement added in the generated Fortran 77 code. The name of
the program will be unnamed$main in the Fortran 77 code.
The search rules for the Fortran 90 INCLUDE and USE statements have changed.
The directories where include files or modules may be found are the
following:
1. Each -I directory specified on the command-line.
2. The directory containing the file that contains the INCLUDE/USE
statement (the current working directory).
3. The standard include area.
HPF Additions and Changes
The HPF REALIGN, REDISTRIBUTE, and DYNAMIC
directives are now supported. The pghpf runtime now supports dynamic
realignment and redistribution.
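The following is a minimal sketch of how the three directives might be combined, assuming the directive forms defined in the HPF language specification. The program name REMAP and the array A are hypothetical. The array is declared DYNAMIC, given an initial BLOCK distribution, and later remapped at run time.

```fortran
! Sketch only: REMAP and A are hypothetical names.
      PROGRAM REMAP
      INTEGER, PARAMETER :: N = 64
      REAL :: A(N)
      INTEGER :: I
! A must be declared DYNAMIC before it can be redistributed.
!HPF$ DYNAMIC A
!HPF$ DISTRIBUTE A(BLOCK)
      FORALL (I = 1:N) A(I) = REAL(I)
! Executable directive: change the mapping of A at run time.
!HPF$ REDISTRIBUTE A(CYCLIC)
      A = 2.0 * A
      PRINT *, A(1), A(N)
      END PROGRAM REMAP
```

The values stored in A do not depend on its mapping; only the placement of elements on processors changes. A CYCLIC target is used here purely to show the syntax.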
The HPF directive INDEPENDENT is now supported for DO loops.
INDEPENDENT will operate on all DO loops without procedure
calls and on loops where the procedure calls can be inlined. The
-Mautopar option is no longer needed to parallelize INDEPENDENT
DO loops. The pghpf option -Minline is required to
inline procedures in INDEPENDENT DO loops. For more details
on using the -Minline option and the INDEPENDENT directive,
refer to Section 9, "INDEPENDENT DO Loops".
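As a minimal sketch, the loop below contains no procedure calls, so under the rules just described it should be a candidate for parallelization without -Minline. The program name INDEP and the arrays A and B are hypothetical names used only for illustration.

```fortran
! Sketch only: INDEP, A and B are hypothetical names.
      PROGRAM INDEP
      INTEGER, PARAMETER :: N = 16
      REAL :: A(N), B(N)
      INTEGER :: I
!HPF$ DISTRIBUTE (BLOCK) :: A, B
      B = 1.0
! Each iteration writes only its own A(I), so iterations do not
! interfere with one another.
!HPF$ INDEPENDENT
      DO I = 1, N
         A(I) = 2.0 * B(I) + REAL(I)
      END DO
      PRINT *, A(1), A(N)
      END PROGRAM INDEP
```

The directive is an assertion by the programmer; it is the programmer's responsibility that no iteration reads or writes data written by another iteration.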
The HPF_LIBRARY routines GRADE_UP(ARRAY,DIM) and
GRADE_DOWN(ARRAY,DIM) now support types other than
INTEGER. These functions are not operational for CYCLIC
distributions, and the DIM argument is required.
In pghpf 1.3, there were additional restrictions on these routines.
Other Additions and Changes from pghpf 1.3 to pghpf 2.0
Support for the MPI communications library has been added to the compiler.
Refer to Section 10, "MPI Runtime" for more details on the MPI runtime
library.
A new include file is available, named lib3f.h. Using the lib3f.h
include file, programs can call standard 3F routines available on most
platforms. The statement:
INCLUDE "lib3f.h"
is required when using 3F procedures.
Programs that use getarg() or iargc() in pghpf 2.0
require the INCLUDE statement. This was not required to use
these routines in pghpf 1.3.
The compiler option -Mnoautopar has been added. This option disables
automatic parallelization of INDEPENDENT DO loops.
The compiler has been modified to issue a severe error if a reference occurs to
an intrinsic function/subprogram and the number of arguments is incorrect or if
the types of the arguments are incorrect. Most of the Fortran 90 and HPF
intrinsics are checked for correctness.
The version of the pghpf 2.0 FLEXlm license manager software has
changed. The new version is FLEXlm v4.1, Copyright 1988-1995, Globetrotter
Software, Inc.
For more information on the license manager, see the following web address:
http://www.globetrotter.com/faq.html
Once pghpf has been installed, use the following steps to get started
using the compiler (this assumes you are using csh or a variant of
csh; for other shells the commands may differ). Assume that the compiler
has been installed in the directory /usr/pgi on your system, the
target is platform (for example, RS6000, SOLARIS, HP, SGI, etc.), and
that a valid license.dat file has been placed in /usr/pgi:
% setenv PGI /usr/pgi
% set path=($PGI/platform/bin $path)
% setenv LM_LICENSE_FILE $PGI/license.dat
You should now be able to compile and run HPF programs as follows:
% pghpf hello.hpf
% a.out options -pghpf pghpf_options
If you wish to link and run with a version of the pghpf runtime other than
the default for your system, refer to the pghpf User's Guide for more
details.
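The contents of hello.hpf are not listed in these notes. A minimal program along the following lines could serve as a first test of the installation; the program name HELLO and the array A are hypothetical names chosen for this sketch.

```fortran
! Sketch only: a possible hello.hpf; HELLO and A are hypothetical.
      PROGRAM HELLO
      INTEGER, PARAMETER :: N = 8
      INTEGER :: A(N), I
! Spread A across the available processors in contiguous blocks.
!HPF$ DISTRIBUTE A(BLOCK)
      FORALL (I = 1:N) A(I) = I * I
! SUM combines the distributed elements into a single value.
      PRINT *, 'sum of squares =', SUM(A)
      END PROGRAM HELLO
```

The sum of the squares of 1 through 8 is 204, and that value should be printed regardless of the number of processors selected at run time.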
The pghpf 2.0 compiler performs many optimizations. Some of these
optimizations are only available using higher levels of optimization (-O1
or -O2 on the compiler command-line). These optimizations include:
1. Generation of collective shift communication calls in the presence of
appropriate indexing patterns and for CSHIFT calls. For example, the
compiler inlines CSHIFT calls when the DIM argument is a
compile-time constant.
2. Generation of overlap shift communications in the presence of appropriate
indexing patterns, and for CSHIFT calls. This optimization involves
generation of overlap shift communication when certain compile-time
specifications are met. For example:
INTEGER, DIMENSION(N,N) :: A,B
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B
A = CSHIFT(B, DIM=1, SHIFT=2)
For this example, a temporary will not be created, and data requiring
communication will be communicated through an overlap area.
3. Generation of collective regular communication calls. For example:
INTEGER, DIMENSION(N,N) :: A,B
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B
FORALL(I=1:N,J=1:N) A(I,J) = B(J,I)
This example will generate a call to a runtime routine that handles
permutations of axes for communications.
4. Generation of collective irregular communication calls in the presence of
indexed array assignments or FORALL. For example:
INTEGER, DIMENSION(N,N) :: A,B,C
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B,C
FORALL(I=1:N,J=1:N) A(C(1,I),C(2,J)) = B(J,I)
This scatter array access is recognized and a scatter communication call is
generated.
5. Sharing of runtime data descriptors for arrays of like size and shape that
are identically aligned to a common template. For example:
INTEGER, DIMENSION(N,N) :: A,B
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B
One descriptor is created and shared for both arrays A and
B. The compiler automatically aligns A and B to the
same template. This alignment occurs even though the programmer did not align
A and B with each other or align them to a common template.
6. The compiler uses INTENT information to eliminate unnecessary
copying of arguments at subroutine boundaries. Note, many Fortran 90 compilers
currently ignore INTENT statements. If you compile code containing
erroneous INTENT statements, your program may fail under
pghpf.
7. Common runtime call elimination across basic blocks. For example:
INTEGER, DIMENSION(N,N) :: A,B,C,D
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B,D
!HPF$ DISTRIBUTE (CYCLIC,CYCLIC) :: C
A(:,:) = C(:,:)
B(:,:) = C(:,:)
This code will generate a single runtime communication sequence, so any
communication involved with the use of C will only happen once.
8. Sharing of communications schedules including schedules generated for
irregular communications. For example with arrays as above and an indirection
array V:
A = B(V)
C = D(V)
This sequence could generate two gather communication sequences, one for
B(V) and one for D(V). However, the compiler creates a single
schedule for the communications since they both use the same communication
pattern. This reduces the overhead of scheduling computation, especially in
nested loops.
9. Fusing is performed for Fortran 90 array assignments and FORALL
statements. For example, with this optimization, the statements for arrays
A, B, C and D will be fused in the same
generated loop. Without this optimization, several sequences of loops will be
generated for these statements:
A = B
C = D
10. Invariant communication calls are hoisted out of loops. For example:
DO I = 1,N
A(I) = B(1) + A(I)*2
CALL FOO(A(I))
END DO
The communication of B(1) is loop invariant and will be hoisted
out of the loop.
The following points may be helpful in enabling you to
obtain the best possible performance with the 2.0 release:
- If allocatable arrays are not aligned to the same template, pghpf
may not be able to recognize efficient communication patterns even though the
arrays appear to be distributed in the same way. It is best to align arrays to
a common template wherever possible.
- In general, cyclic and block-cyclic distributions do not perform well under
pghpf 2.0. For better performance, avoid distributing your arrays
with these distributions. For most problems, CYCLIC or
CYCLIC(K) distributions will be less efficient than BLOCK or
BLOCK(K) distributions.
- The operations performed in parallel include: array assignments,
FORALL statements, FORALL constructs, WHERE
statements and constructs, many of the HPF_LIBRARY procedures, the
transformational intrinsics, and parallelizable INDEPENDENT DO
loops (and all parallelizable DO loops when -Mautopar is used).
- By default, the INDEPENDENT directive currently only parallelizes
DO loops without calls. If you supply the option -Minline,
INDEPENDENT DO loops with inlinable calls will be
parallelized.
- With the -Mautopar option, parallelizable DO loops which
operate on explicitly distributed data are parallelized. DO loops may
also be expressed as array assignments or FORALL statements to realize
parallel execution.
- Using the -Minfo option on the command line, the compiler provides
information about expensive communications. Complicated array subscripts lead
to poor communication patterns. In particular, if pghpf cannot derive an
available collective communication pattern it will generate a scalarized loop.
The resulting code will in general be unacceptably slow, particularly if the
loop is nested. The compiler will flag these situations with compile-time
warning messages if you use the -Minfo option on the command
line.
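The first point above, aligning allocatable arrays to a common template, can be sketched as follows. This assumes the TEMPLATE and ALIGN directives as defined in the HPF language specification; the program name ALIGNED, the template T, and the arrays A and B are hypothetical names for illustration only.

```fortran
! Sketch only: ALIGNED, T, A and B are hypothetical names.
      PROGRAM ALIGNED
      INTEGER, PARAMETER :: N = 100
      REAL, ALLOCATABLE :: A(:), B(:)
! One template, distributed once; both arrays are aligned to it.
!HPF$ TEMPLATE T(N)
!HPF$ DISTRIBUTE T(BLOCK)
!HPF$ ALIGN A(I) WITH T(I)
!HPF$ ALIGN B(I) WITH T(I)
      ALLOCATE (A(N), B(N))
      B = 1.0
! Corresponding elements share a template cell, so this assignment
! should not need communication.
      A = B + 1.0
      PRINT *, SUM(A)
      DEALLOCATE (A, B)
      END PROGRAM ALIGNED
```

A BLOCK distribution is used for the template, in keeping with the advice above on cyclic distributions. Compiling with -Minfo is one way to check whether the compiler reports any expensive communication for the assignment.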
Debugging programs developed under pghpf 2.0 can be difficult. No HPF
debugger is currently provided. However, if necessary the programmer can debug
the generated SPMD Fortran 77 program on each node using multiple X windows.
This can be particularly useful in obtaining a traceback on a program that is
crashing unexpectedly.
To prepare an HPF program for debugging, use the -Mg compile-time option
to pghpf. The generated Fortran 77 output will be saved and the Fortran
77 node compiler will be invoked with the -g compile-time option to
provide symbolic information in the image file. If you wish to execute the
program on only a single processor, use the -Mrpm1 compiler option when
linking the program (this is not available on all platforms). You can then use
a standard debugger on the image file.
If you need to execute the program on multiple processors, the following
sequence is useful:
- Create a shell script, say xdbg, that spawns a debugger in an X
window. For example:
#!/bin/sh
exec xterm -e dbx $1
- Set the environment variable PGHPF_DEBUGGER to the name of the
script; make sure the script is in your path or that you use the full pathname:
% setenv PGHPF_DEBUGGER xdbg
- Execute the HPF program using the -pghpf -g all
option. For example:
a.out -pghpf -np 2 -g all
will create two xterm windows running dbx on the program.
Another debugging technique is to compile with the command-line option
-Mprof=lines and then run the program. If the program is interrupted with
control-C, a traceback will usually result.
Similar functionality is possible using PVM.
PGI provides a graphical HPF profiling tool, pgprof, which allows
function and line level profiling of HPF programs. Refer to the pgprof
User's Guide for more information.
Profiling can also be performed by hand by inserting calls to an appropriate
timing function on each platform and including the appropriate timing
libraries.
The following is a list of restrictions that apply to pghpf 2.0. Some of
these restrictions are known bugs, others are Fortran 90 features that are not
yet implemented, and others are HPF features that are not yet implemented.
6.1 Known Restrictions - Version 2.0
- The HPF_LIBRARY routines GRADE_UP and
GRADE_DOWN require a DIM argument; these routines do not
support cyclic distributions.
- A new include file is available, named lib3f.h. Using the
lib3f.h include file, programs can call standard 3F routines available on
most platforms. The statement INCLUDE "lib3f.h"
is required when using 3F procedures. Programs that use
getarg() or iargc() in pghpf 2.0 require the
INCLUDE statement. This is a change from pghpf 1.3, where getarg()
and iargc() could be used without the INCLUDE statement.
- The compiler command-line option -g is a beta feature for debugger
developers. Details of this option are available by contacting PGI at
sales@pgroup.com. This option creates files named with a
.stb extension. For a file filename.hpf, the -g option
creates a file named filename.stb in the current directory.
- An object of derived type cannot be initialized with a DATA statement; use
the f90-style form of initializing an object.
- HPF mapping directives cannot be used with an object of derived type or
any of its components.
- A structure component may not be declared as an allocatable
array.
Finally, when a and b are automatic arrays whose
extents n and m are equal, pghpf
currently cannot conclude that n and m are equal. If the
programmer declares them with the same extent, pghpf may perform more
optimizations. For example:
subroutine foo(a,b)
common /c1/ n,m
integer, dimension(n) :: a
integer, dimension(m) :: b
!hpf$ distribute (block) :: a,b
a(:) = b(:)
end
The assignment a(:) = b(:) implies that a and b must be equal-sized arrays
because of the conformability rule. If the same variable (either n or m) is
used in the declarations of both a and b, additional optimizations will
be performed, as compared with the code shown above.
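A sketch of the suggested form follows: both dummy arrays are declared with the single extent n. The short driver program is an addition for this example only, so that the text can be compiled and run on its own; its name DRIVER and its arrays X and Y are hypothetical.

```fortran
! Sketch only: same routine as above, but both arrays use extent n.
      subroutine foo(a,b)
      common /c1/ n,m
      integer, dimension(n) :: a
      integer, dimension(n) :: b
!hpf$ distribute (block) :: a,b
      a(:) = b(:)
      end

! Hypothetical driver, added only to make the example complete.
      program driver
      common /c1/ n,m
      integer x(4), y(4)
      n = 4
      m = 4
      y = 7
      call foo(x,y)
      print *, x
      end
```

Because a and b now share the extent n, the compiler no longer has to prove that two different variables are equal before treating the arrays as identically shaped and mapped.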
Module Restrictions
Refer to Section 8.2 "Limitations on Modules" for a description of
MODULE restrictions for pghpf 2.0.
INDEPENDENT DO Restrictions
Refer to Section 9 "INDEPENDENT DO Loops" for a description of the pghpf
2.0 INDEPENDENT directive implementation.
6.2 Omissions - Version 2.0
This section lists Fortran 90 and HPF features that are omitted from
pghpf 2.0.
Fortran 90 Language Omissions
- Internal Procedures are not currently supported in pghpf 2.0.
- The Fortran 90 intrinsic ASSOCIATED is not yet supported.
- Fortran 90 pointers are not supported and the keywords POINTER
and TARGET are not supported.
- The Fortran 90 character array language is not supported.
- Fortran 90 recursion is not supported.
- Functions returning CHARACTER*(*) are not supported.
HPF Language Omissions
- The PURE keyword is not supported in pghpf 2.0.
- The compile-time warning message:
PGHPF-W-3011-Non-replicated mapping for character/struct/union array,
char_table, ignored (file.F: lineno)
may occur. The pghpf 2.0 release ignores distribution
directives applied to character or record types, derived types,
data-initialized arrays, arrays subject to SAVE, arrays subject to a
SEQUENCE directive, and NAMELIST arrays. Support for legal
distributions of these types of arrays will be provided in upcoming releases.
- The INDEPENDENT directive has limited support in this release. In
particular, INDEPENDENT FORALL loops are recognized, but the
INDEPENDENT directive does not change the compiler's behavior. The
NEW clause is recognized in an INDEPENDENT directive, but it
does not change the behavior of the compiler.
6.3 Extrinsics Changes
Generic Extrinsic Routines - f77_local Changes
In pghpf 2.0, there is a change in the behavior of pghpf_csend
and pghpf_crecv. This change may affect existing f77_local
message-passing routines. For performance reasons, data transferred by
pghpf_csend and pghpf_crecv may not be buffered as in the
past, so programs that used to run under pghpf 1.3 may hang with release
2.0. The solution is to change the f77_local routine so that processors "pair
off" when exchanging messages: when one processor calls pghpf_csend,
the partner processor must call pghpf_crecv. A simple way to decide
who sends first is to compare the processor numbers, for example:
old:
call pghpf_csend(partner, x, ...)
call pghpf_crecv(partner, y, ...)
new:
me = pghpf_myprocnum()
if (partner .lt. me) then
call pghpf_csend(partner, x, ...)
call pghpf_crecv(partner, y, ...)
else
call pghpf_crecv(partner, y, ...)
call pghpf_csend(partner, x, ...)
end if
Note that pghpf_csend and pghpf_crecv did not and still do not
allow a processor to send a message to itself. The code must handle this case
if it can arise in the user's algorithm. For example, the preceding example
could be extended as shown here:
me = pghpf_myprocnum()
if (partner .eq. me) then
y = x
else if (partner .lt. me) then
call pghpf_csend(partner, x, ...)
call pghpf_crecv(partner, y, ...)
else
call pghpf_crecv(partner, y, ...)
call pghpf_csend(partner, x, ...)
end if
6.4 System Specific Notes
CRAY T3D Runtime
The execution of a T3D program depends on the policies of the host site. In
general, programs are executed with:
%a.out mppexec_opt user_opt -pghpf HPF_opt
The mppexec options are described in the mppexec(1) man page. The
-npes option is required and specifies the number of processors. The
number of processors must be a power of 2.
The only supported HPF options are -stat and -np. The HPF
-np option may be specified to reduce the number of processors from
the value specified by the -npes option. The use of the -np
option is not recommended as the unused processors are not available for other
uses.
CRAY T3D Profiling
The profiler, pgprof, is not currently supported on the CRAY T3D.
However, CRAY T3D programs can be compiled and run with the -Mprof
options. The resulting pgprof.out file can be analyzed on any supported
workstation platform.
CRAY T3D Compiler Options
There is a compiler option that is only available on the T3D. The option is
%pghpf -Ojump file1.hpf
The -Ojump switch will pass -Wf,"-ojump" to the T3D Fortran 77 compiler and link a
version of the runtime library compiled with "-h jump". See the documentation
on "-h jump" for the T3D C compiler for more details.
IBM SP2 Runtime
The MPI implementation used by pghpf is the IBM version (MPI-F 1.41).
The execution of a SP2 MPI program depends on the policies of the host site.
For example, programs could be executed with:
%mpirun -np numberofprocs a.out user_opt -pghpf HPF_opt
The only supported HPF option is -stat. The HPF -np option is not
supported.
For the IBM SP2, the MPL communications library is also available. To use the
MPL library, include the option -Mmpl on the compiler command line (this
is loaded when linking occurs).
SGI Linking
Due to a known bug in the IRIX 6.0 linker, some HPF programs may fail to link
and could produce the following link-time error:
ERROR 104: GOT page/offset relocation out of range: x.o
ERROR 104: GOT page/offset relocation out of range: x.o
where x.o is one of the object files being linked. This problem should not
occur with the version of the linker included in IRIX 6.1. No known workaround
is available.
Convex Exemplar
The pghpf compiler, running on the Convex Exemplar, requires the
permissions on /dev/lan0 to be 0666 for licensing to work correctly.
Note that these permissions may have security implications for the site.
The Convex Exemplar PVM runtime implementation has a limited buffer size. This
may cause programs to fail. The buffer size can be increased. Refer to your
system administrator or the bugs section in the PVM Readme.mp file for
more information.
When compiling extrinsic routines, the Fortran 77 compiler option +ppu
should be used. This option appends underscores at the end of definitions of
and references to externally visible symbols. Since the caller appends
underscores for extrinsic names, the callee extrinsic needs this option when it
is compiled.
Solaris Systems
The installation directory for Sun Solaris systems was /usr/pgi/sparc in
version 1.3 and previous versions of pghpf. For version 2.0, the default
directory has changed to /usr/pgi/solaris.
Intel Paragon Systems
The pghpf 2.0 release supports cross development from various systems to
Intel Paragon systems. To support this cross development environment, several
variables need to be set. The environment variable PARAGON_XDEV needs
to be set to use the Intel tools. This should be one directory above the
Intel-supplied paragon directory. Intel's documentation should provide
information on how to do this.
For example:
setenv PARAGON_XDEV /usr/local
The environment variable PGI needs to be set:
setenv PGI /usr/local/paragon/pgi
Then two elements need to be added to the path:
set path=($PARAGON_XDEV/paragon/bin.<arch> \
$PGI/pgon/bin.<arch> $path)
where <arch> is the architecture on which the compilation is
performed. Choices for arch include: sgi, solaris,
and sun4, among others.
This section briefly lists the bugs fixed from release 1.3 to 2.0.
- Declarations of common variables for adjustable arrays were output after
the array declarations, resulting in a warning message from the Fortran 77
compiler if IMPLICIT NONE was present in the program. For example:
subroutine sub(a)
implicit none
common /c/ n
integer n
character*8 a(n)
end
- This problem is fixed in pghpf 2.0 (tpr861).
- In previous versions of pghpf, a character literal or named
constant could not appear in a substring reference; this problem is fixed in
pghpf 2.0 (tpr 905).
- There was a problem in pghpf 1.3 where, in free form input, double
precision was not recognized properly. This problem has been fixed (tpr903).
- If a format label appeared in an ASSIGN statement, the compiler
generated the %loc builtin. However, %loc is nonstandard
Fortran 77, and this has been replaced by a call to a pghpf library
routine (tpr908).
- Allocatable arrays were not supported in modules. This is fixed in
pghpf 2.0 (tpr912 and tpr926).
- Optional vector arguments were handled incorrectly and calling a routine
with them gave a runtime error message. This is fixed in pghpf 2.0
(tpr918).
- Passing a replicated assumed-shape array to an extrinsic did not work in
pghpf 1.3. This bug is fixed (tpr920).
- Using freeform input, -Mfreeform, the compiler did not recognize
an END FORALL unless there was no blank separating the
END and the FORALL. This bug is fixed (tpr924).
- The format of list directed writes has been modified. By default, in
previous versions list directed output of the REAL and DOUBLE
PRECISION value 0.0 was treated as a special case, with a different number
of blank spaces being output. In pghpf 2.0, the G edit descriptor is
used for REAL and DOUBLE PRECISION 0.0, as is the case with
other REAL values. (tpr941).
- In release 2.0, pghpf detects uses of intrinsic functions whose
arguments do not match in kind or number as severe errors (tpr 943).
- When possible, the compiler now evaluates uses of the size,
ubound and lbound intrinsics at compile-time. Without
compile-time evaluation of these intrinsics, appearances in initialization
expressions would result in the compiler detecting 'non-constant expression'
errors. Appearances in specification expressions would result in errors at
execution time.
- In release 2.0, the compiler will also detect an error when the dimension
specifier is a constant whose value is equal to the rank of the array. If the
dimension specifier is a non-constant expression, the run-time will check for
the error (tpr 961, tpr 962, tpr 963).
- A number of bugs reported with the pghpf 1.3 runtime have been
fixed.
- A number of bugs reported when using the compiler command line option
-Mautopar have been fixed.
- In earlier versions of pghpf the EOSHIFT BOUNDARY
argument was not implemented. This restriction has been removed from
pghpf 2.0.
- GRADE_UP and GRADE_DOWN now support REAL
numbers.
- In earlier versions of pghpf the HPF_LIBRARY routines
xxx_PREFIX and xxx_SUFFIX did not support cyclic
distributions. This restriction has been removed from pghpf 2.0.
- In pghpf 1.3 there was a problem with list-directed I/O using
complex numbers. This problem caused message-passing to get out of sync. This
bug is fixed in pghpf 2.0.
- The following bugs are fixed only in pghpf 2.0.2 (build two
of pghpf 2.0).
- Symbol redefinition error when a subprogram name is use associated. In the
release 1 version of pghpf 2.0, when a subprogram uses a module which
contains an explicit interface of the subprogram, the compiler issued a severe
error (tpr 982).
- Using the -Mautopar option, some uses of the PARAMETER statement
caused copy propagation problems. (tpr 990).
- At times, the compiler generated non-portable calls to the intrinsic
JMAX0 (tpr 993).
- A problem with the pghpf driver and the use of the -g option which
caused the compiler to crash has been fixed. (tpr 994).
- The compiler was incorrectly creating statement labels that could conflict
with user specified statement labels. This problem is fixed. (tpr 995).
- When the value of a named real or double precision constant is negative,
the code generated by pghpf for a unary negate of the named constant
was reported as an error by the underlying Fortran 77 compiler if the compiler
did not allow the extension of adjacent unary operators. This problem is fixed.
(tpr 996).
- Blanks before the !HPF$ prefix were not ignored in fixed form (tpr 997).
- Incorrect code was sometimes generated for FORALL statements using the
intrinsic 'count' in the forall triplet. The count call was passed through in
the output (tpr 998).
- In some cases subscripted mapped arrays using vector subscripts generated
incorrect code. (tpr 999).
- When a distributed array was SAVEd, in the generated program its local
lower and upper bounds were not saved. (tpr 1000).
- A problem where the MODULO() intrinsic gave incorrect results in
some cases was fixed (tpr 1001).
- A problem with use of the SAVE statement and local allocatable
arrays has been fixed. (tpr 1003).
- A problem with modules and allocatable arrays specifically on T3D systems
has been fixed. (tpr 1004).
- The module reader was not rewriting correctly when using array
constructors as initializers and the internal compiler error
PGHPF-S-0000-Internal compiler error. sym_of_ast:unexp.ast was generated. This
problem is fixed (tpr 1009).
- A problem with ELSE WHERE in freeform format has been fixed: a space
between ELSE and WHERE is now accepted in freeform format
(tpr 1010).
- A problem with use of the ONLY clause and modules has been fixed
(tpr 1016).
- A problem with level -O2 optimizations causing incorrect results for some
programs has been fixed. (tpr 1017).
- The compiler was issuing an error message when public or private was
specified for derived types or for objects of derived type. The public and
private access attributes are now allowed to be applied to a derived type and
to an object of derived type (tpr 1018).
- A problem with the compiler caused the following error to be produced with
some programs containing the RESHAPE intrinsic: PGHPF-S-0074-Illegal number or
type of arguments to reshape. This problem has been fixed (tpr 1030).
- The public and private access attributes are allowed to be applied to
intrinsics. This was not the case in previous releases of pghpf (tpr 1031).
- Use of the DOT_PRODUCT function with array sections caused the following
runtime error message to be produced: : COPY_IN: nonconforming alignment. This
problem is fixed. (tpr 1043).
The compiler supports Fortran 90 modules. Modules can be independently compiled
and used within programs using the USE statement. Use of Fortran 90
modules causes the compiler to create a filename.mod file in the current
directory (a .mod file). This file contains all the information the
compiler needs concerning interface specifications and the data types for the
routines defined in the module. When a program, routine, or another module
encounters the USE statement, the .mod file is read and
"included" in the program, using the scope rules defined in Fortran 90 for
USE association. If you are using separate modules, this creates
another step in the program development process. When a module is compiled,
both a .mod and a .o file are created. The .mod file is
used when a USE statement is encountered, and the .o file is
used when the program is loaded.
For example, if module1.hpf contains a module with several procedures,
and test1.hpf contains a USE statement that uses
module1, the compilation would involve the following steps:
% pghpf -c module1.hpf
% pghpf -otest1 test1.hpf module1.o
A .mod file is searched for in the following directories:
1. Each -I directory specified on the command-line.
2. The directory containing the file that contains the USE statement
(the current working directory.)
3. The standard include area.
Using the -I command-line option directories can be added to the search
path for .mod files.
Note that if you currently have .mod files created with pghpf 1.3
and you are upgrading to pghpf 2.0, you will need to recreate the
.mod files using pghpf 2.0.
8.1 Modules with Generic Interfaces
The 2.0 version of pghpf now supports modules with generic procedures.
For example, the following is a valid module which defines FUNC_ANY
taking either a real or an integer argument.
MODULE A
INTERFACE FUNC_ANY
MODULE PROCEDURE FUNC1
MODULE PROCEDURE FUNC2
END INTERFACE
CONTAINS
FUNCTION FUNC1(R)
INTEGER R
! The result is implicitly typed REAL
FUNC1 = R
END FUNCTION
FUNCTION FUNC2(R)
REAL R
FUNC2 = R
END FUNCTION
END MODULE A
PROGRAM B
USE A
X = FUNC_ANY(1) ! func1
Y = FUNC_ANY(1.0) ! func2
END
8.2 Limitations on Modules
- A module subprogram cannot contain references to procedures defined later
in the same module. For example, the module B below will not work in
the current release, while module C will work:
MODULE B
CONTAINS
FUNCTION G()
.
.
.
CALL H
END FUNCTION G
SUBROUTINE H
.
.
.
END SUBROUTINE H
END MODULE B
MODULE C
CONTAINS
SUBROUTINE H
.
.
.
END SUBROUTINE H
FUNCTION G()
.
.
.
CALL H
END FUNCTION G
END MODULE C
- Named array constants defined in a module can't be used as an initializer
in a subprogram which USEs the module.
- NAMELIST objects are not allowed in the specification part of a
MODULE.
- The module PUBLIC/PRIVATE access statements cannot reference a CONTAIN'd
subprogram.
Pghpf includes a partial implementation of the INDEPENDENT
directive. This directive is applied to a DO loop and instructs the
compiler to generate parallel code. There are two phases of independent loop
processing: the inline phase, and the auto parallelization phase. This section
describes these two phases and the command line options that allow the user to
control the processing of independent loops.
Pghpf parallelizes certain loops without using the INDEPENDENT
directive (if the loops operate on distributed data). Such loops must not
contain any of the following: external procedure calls, array assignments,
FORALL, WHERE, ALLOCATE, or DEALLOCATE
statements. The INDEPENDENT directive allows the programmer to provide
information to the compiler to broaden the class of parallelizable loops. Use
of the INDEPENDENT directive on a loop nest provides assurance, by the
programmer, that all procedure calls within the loop nest are parallelizable
and each iteration can be performed independently. The compiler inlines
procedure calls within INDEPENDENT loops when -Minline is used,
so that the resulting statements can be parallelized.
There is no guarantee that loops within the scope of an independent directive
will be parallelized. If the loops contain array assignments, for example, the
auto-parallelizer phase of pghpf will not process the independent loop.
On the other hand, inlining procedure calls is an essential step for the
compiler to take in determining how to parallelize independent loops containing
procedure calls.
9.1 Directives
Inlining of function and subroutine calls will only take place within loop
nests that follow the INDEPENDENT directive. INDEPENDENT in
pghpf 2.0 is only applicable for a DO loop. For example:
!HPF$ INDEPENDENT
DO I = 1, N
A(I) = FUN(I) * B(I)
END DO
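A complete program built around this loop might look like the following sketch. The program name, the array size and the body of FUN are assumptions made for illustration. FUN is written as an external function so that there is a procedure call inside the INDEPENDENT loop for -Minline to work on.

```fortran
      PROGRAM INDEP
      INTEGER, PARAMETER :: N = 100
      REAL A(N), B(N)
      REAL FUN
      EXTERNAL FUN
      INTEGER I
!HPF$ DISTRIBUTE (BLOCK) :: A, B
      B = 2.0
!     The directive asserts that the iterations, including the
!     calls to FUN, can be performed independently.
!HPF$ INDEPENDENT
      DO I = 1, N
         A(I) = FUN(I) * B(I)
      END DO
      PRINT *, SUM(A)
      END

!     Hypothetical function body: no side effects, so calls from
!     different iterations do not interfere with each other.
      REAL FUNCTION FUN(K)
      INTEGER, INTENT(IN) :: K
      FUN = REAL(K)
      END
```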
9.2 Switches
When inlining is to occur within independent loops, use -Minline
on the compilation command line. Inlining requires a preliminary extraction
phase which saves compiler information about procedures. You can allow the
compiler to create a temporary extraction, thus handling the inlining
automatically, or you can create and maintain a directory of "extract" files
using -Mextract. The compiler produces a message when it is
creating its database of extracted procedures. For example, the following message
indicates that the compiler is extracting the routine scatter_count:
pghpfc_ex: extracting scatter_count
The inline Switch
All forms of the pghpf -Minline switch only inline
procedures within DO loop nests that follow an INDEPENDENT
directive (the pghpf INDEPENDENT inliner is not a general
purpose inliner). The full syntax of the -Minline switch
is as follows:
-Minline[=[lib:dir,][levels:n,]{name:fun,}]
For most users, supplying -Minline on the command line will suffice, for
example:
% pghpf -Minline filename.hpf
This instructs the compiler to perform inlining within loops with an
INDEPENDENT directive and to use a temporary directory for the extract
phase. Note that while the extract phase extracts all possible procedures, the
inliner will only inline procedures in an INDEPENDENT DO
loop.
Including the lib:dir parameter assumes that an extract phase has been
completed, and that the extracted procedures, if any, will be taken from
directory dir (created using the -Mextract switch
described below).
If the levels:n parameter is specified, inlining is repeated up to
n times within any inlined loop so that calls up to n levels deep
can be removed (the default value is 1).
If the name:fun parameter is provided, only the function or subroutine
fun will be inlined within the INDEPENDENT loop. Multiple name
parameters can be provided in order to inline multiple procedures.
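To make the levels:n parameter concrete, the following sketch has a call chain two levels deep inside an INDEPENDENT loop. The procedure names INNER and OUTER and the program name are hypothetical. Under the description above, a single level of inlining would remove only the call to OUTER, leaving a call to INNER in the loop, and levels:2 would be needed to remove the nested call as well.

```fortran
!     Second level of the call chain
      SUBROUTINE INNER(X, Y)
      REAL, INTENT(IN) :: X
      REAL, INTENT(OUT) :: Y
      Y = 2.0 * X
      END

!     First level: called from the loop, and itself calls INNER
      SUBROUTINE OUTER(X, Y)
      REAL, INTENT(IN) :: X
      REAL, INTENT(OUT) :: Y
      REAL T
      CALL INNER(X, T)
      Y = T + 1.0
      END

      PROGRAM LEVELS
      REAL A(10), B(10)
      INTEGER I
      B = 1.0
!HPF$ INDEPENDENT
      DO I = 1, 10
         CALL OUTER(B(I), A(I))
      END DO
      PRINT *, A
      END
```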
The extract Switch
The full syntax of the -Mextract pghpf switch is as follows:
-Mextract[=name1,name2...] -o dir
This switch instructs the compiler to extract inlineable functions and subroutines,
using directory dir to store the extract files. Names of procedures to
be extracted may be specified as parameters to the -Mextract
switch. Note that while the extract phase extracts all possible procedures, the
inliner will only inline procedures in an INDEPENDENT DO loop.
The noautopar Switch
By default, parallelization of INDEPENDENT loops is enabled,
which means that, where possible, FORALL statements will be generated for
parallelizable loops. This functionality can be disabled using the
-Mnoautopar compiler command-line switch. For example:
% pghpf -Mnoautopar test1.hpf
Compiler Information Switches
If you are using the pghpf inlining capability and you want to keep
track of which functions are inlined, as well as whether parallelization is
taking place for the independent loop, use the -Minfo=inline and
-Minfo=autopar switches. For example:
% pghpf -Minline -Minfo=inline,autopar test1.hpf
11, 1 FORALL generated
12, Inlining f
9.3 INDEPENDENT Inlining
This section covers further details of the process the compiler uses when
creating an extract directory and when inlining a procedure.
Extract Directories
When the -Mextract switch is specified, an extract directory is
created for holding extract files. An extract file is an ASCII file holding
information created by the HPF compiler about a single procedure. The
procedure's name is in the first line of the extract file. The extract
directory contains a special table-of-contents file, named TOC. This ASCII file
associates procedure names with extract file names.
Inlining Transformations
The inliner first inserts the statements of a called procedure into the calling
program unit at the point of the call. If a function has been inlined, the
function call is replaced by the variable holding the function's return value.
If there are name conflicts between variables local to the inlined procedure
and variables within the calling program unit, the inlined procedure's local
variables will be renamed.
Assignments of actual arguments to dummy parameters with INTENT
IN or INTENT OUT are made at the beginning of the
inlined statements. Actual arguments that are identifiers or subscript
expressions associated with dummy parameters with INTENT
INOUT textually substitute for the dummy parameters wherever they
occur within the inlined statements. If necessary, adjustments are made to
array subscripts to accommodate array bounds that are different between the
calling program unit and the called procedure.
9.4 Examples
Extracting to a temporary extract directory and inlining:
% pghpf -Minline test1.hpf
Creating an extract library:
% pghpf -Mextract -o exlib test1.hpf
Inlining with extract library exlib:
% pghpf -Minline=lib:exlib test1.hpf
9.5 Effects of Inlining
Consider the following program.
SUBROUTINE FOO(X, Y, Z, A)
REAL, INTENT(IN) :: X
REAL, INTENT(INOUT) :: Y
REAL, INTENT(OUT) :: Z
REAL, DIMENSION(*) :: A
Y = X + 1.0
Z = Y + 2.0
A(3) = A(2) + 3.0
END
REAL FUNCTION GOO(X, Y, Z, A) RESULT (RES)
REAL, INTENT(IN) :: X
REAL, INTENT(INOUT) :: Y
REAL, INTENT(OUT) :: Z
REAL, DIMENSION(*) :: A
Y = X + 1.0
Z = Y + 2.0
A(3) = A(2) + 3.0
RES = 5.0
END
PROGRAM TESTINLINE
REAL XMAIN, YMAIN, ZMAIN, Y1MAIN, Z1MAIN, AMAIN(10)
!HPF$ SEQUENCE :: AMAIN
!HPF$ INDEPENDENT
DO I = 1, 10
CALL FOO(GOO(7.0, Y1MAIN, Z1MAIN,AMAIN(2)),
& YMAIN, ZMAIN, AMAIN(3))
ENDDO
END
The independent loop will be rewritten as follows:
do i = 1, 10
goo$x = 7.0
y1main = goo$x + 1.0
z = y1main + 2.0
amain(4) = amain(3) + 3.0
goo$goo = 5.0
z1main = z
ymain = goo$goo + 1.0
foo$z = ymain + 2.0
amain(5) = amain(4) + 3.0
zmain = foo$z
enddo
Variables containing '$' are introduced by the compiler to disambiguate them from
variables in the main program.
9.6 INDEPENDENT Parallelization
The compiler parallelizes INDEPENDENT DO loops. A loop can be
parallelized when its variables are explicitly distributed using HPF
directives, and when dependence analysis shows that array elements do not
conflict across the indices of the loop (other dependence analysis is also
performed).
Parallelization of parallelizable DO loops is disabled when the
-Mnoautopar compiler option is supplied.
The compiler generates FORALL statements and calls to reduction
intrinsics in INDEPENDENT parallel DO-loops with distributed
arrays.
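The following sketch illustrates the kind of rewriting described here. It is written by hand and is not compiler output; the program name and array names are assumptions. The first loop is an INDEPENDENT DO loop over distributed arrays, and the FORALL statement after it is an equivalent form of the sort the text says is generated. The summation loop and the SUM call show, in the same hand-written spirit, a pattern that a reduction intrinsic can express.

```fortran
      PROGRAM AUTOPAR
      INTEGER, PARAMETER :: N = 16
      REAL A(N), B(N), S
      INTEGER I
!HPF$ DISTRIBUTE (BLOCK) :: A, B
      FORALL (I = 1:N) B(I) = REAL(I)

!     Element-wise loop with no conflicts across iterations
!HPF$ INDEPENDENT
      DO I = 1, N
         A(I) = 2.0 * B(I)
      END DO

!     Hand-written FORALL equivalent of the loop above
      FORALL (I = 1:N) A(I) = 2.0 * B(I)

!     Serial summation loop, followed by the reduction intrinsic
!     that computes the same value
      S = 0.0
      DO I = 1, N
         S = S + A(I)
      END DO
      PRINT *, S, SUM(A)
      END
```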
If the option -Mautopar is included on the command line, the compiler
checks the entire program for loops that may be parallelized, and does not
limit its search to only INDEPENDENT loops.
9.7 Limitations on INDEPENDENT Inlining and Parallelization
INDEPENDENT DO loops containing array assignments or
FORALL statements are not parallelized. Procedures with multiple
entries or returns, format lists, or namelist I/O are not extracted.
The pghpf 2.0 implementation supports full Fortran 90 I/O semantics.
This includes non-advancing I/O, namelist I/O, and I/O of array sections. There
are no restrictions on using mapped arrays in list directed, formatted, or
unformatted I/O statements. For example:
INTEGER, DIMENSION(N,N) :: A,B
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A,B
FORALL(I=1:N,J=1:N) A(I,J) = B(J,I)
PRINT *, A,B
END
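A slightly fuller sketch of the I/O features listed above follows, using a mapped array in list-directed output of an array section and in non-advancing formatted output. The program name, the array size and the values stored in A are assumptions made for illustration.

```fortran
      PROGRAM IODEMO
      INTEGER, PARAMETER :: N = 4
      INTEGER, DIMENSION(N,N) :: A
      INTEGER I, J
!HPF$ DISTRIBUTE (BLOCK,BLOCK) :: A
      FORALL (I=1:N, J=1:N) A(I,J) = 10*I + J

!     List-directed output of an array section (row 2)
      PRINT *, A(2,:)

!     Non-advancing output: each WRITE leaves the record open,
!     so row 1 is built up on a single line
      DO J = 1, N
         WRITE (*, '(I4)', ADVANCE='NO') A(1,J)
      END DO
!     Terminate the record started above
      PRINT *
      END
```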
Variables used in namelist groups (NAMELIST) may not be mapped; the compiler
issues a warning message if an attempt is made to map a variable in a namelist
group:
PGHPF-W-0311-Non-replicated mapping for namelist array, name, ignored (test1.hpf:4)
Currently, input and output are serialized. One processor reads or writes the data and
sends it to, or receives it from, the other processors owning the data.
Performance of I/O on mapped arrays is comparable to the I/O performance of a
single-processor Fortran 90 compiler.
MPI consists of a number of transport mechanisms conforming to the Message
Passing Interface standard. One version of MPI supported by pghpf is the
Public Domain software available from Argonne National Laboratory and
Mississippi State University. MPI is a software system that enables a
collection of computers or processors of a single computer to be used as a
coherent and flexible computational resource. PGI supports MPI version 1.10
(refer to section 14.3, "Retrieving Software and Documentation" for details on
obtaining MPI).
The options in this section apply to programs using MPI for communication.
The executable runtime option -pghpf specifies that the following
options are to be passed to the communication control portion of the executable
program. The -pghpf option allows you to pass user-defined options to
your application program and communications control runtime options to the
pghpf runtime library.
In general, the command line format for a compiled program is:
% mpirun mpirun_options a.out user_options -pghpf pghpf_options
where:
- mpirun
- is the command used to execute MPI programs.
- mpirun_options
- are the options to the mpirun command.
- a.out
- is the executable program.
- user_options
- are the program's options.
- pghpf_options
- are any of the valid options to the pghpf runtime library.
Most systems require the use of the mpirun command or some other command to
execute programs. Refer to your system's documentation for details.
Running an MPI program requires that MPI is installed on your system.
Table 11-1 shows the valid MPI executable options (-pghpf options).
Table 11-1 MPI Runtime Library Options and Variables
Option          Environment Variable   Purpose
-stat options   PGHPF_STAT options     Print runtime statistics upon program completion.
The random number generator used in the 2.0 release now generates a 46-bit lagged
Fibonacci pseudo-random sequence with a short lag of 5 and a long lag of 17.
For a given seed, including the default seed, the sequence generated is
independent of the platform and number of processors. Due to limitations of
some platforms' default integer type, the seed vector is of size 34. Only the
least significant 23 bits of each element of the seed array are used, thus a
seed array returned or used is portable between platforms. For non-degenerate
seed arrays, the period of this generator is (2^17 - 1) * 2^45.
If all the odd elements of the seed array are even, the
period will be shorter.[*]
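The seed array is accessed through the standard Fortran 90 RANDOM_SEED intrinsic. The following sketch, which uses only standard intrinsics, queries the seed length rather than hard-coding it, saves the current seed, and restores it to replay a sequence. The program and variable names are assumptions; under pghpf 2.0 the queried length would be 34 according to the text above.

```fortran
      PROGRAM SEEDS
      INTEGER K
      INTEGER, ALLOCATABLE :: SEED(:)
      REAL R(4)

!     Ask the runtime how long the seed array is
      CALL RANDOM_SEED(SIZE=K)
      ALLOCATE (SEED(K))

!     Save the current seed, then draw four numbers
      CALL RANDOM_SEED(GET=SEED)
      CALL RANDOM_NUMBER(R)
      PRINT *, R

!     Restore the saved seed and draw again; the intent is
!     that the same four values are printed a second time
      CALL RANDOM_SEED(PUT=SEED)
      CALL RANDOM_NUMBER(R)
      PRINT *, R
      DEALLOCATE (SEED)
      END
```

Querying the size at run time keeps the program portable to compilers whose seed arrays have a different length.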
The best performance on distributed arrays is for block distributions. The
higher the order of the first distributed dimension, the better the performance
will be.
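One reading of this guidance, offered here as an assumption rather than a statement from PGI, is that for a two-dimensional array it is preferable to leave the first dimension undistributed and block-distribute the second, so that the first distributed dimension is dimension 2 rather than dimension 1. The sketch below shows both mappings side by side; the program and array names are hypothetical.

```fortran
      PROGRAM LAYOUT
      INTEGER, PARAMETER :: N = 64
      REAL A(N,N), B(N,N)
!     First distributed dimension is dimension 2
!     (assumed to be the better-performing choice)
!HPF$ DISTRIBUTE (*,BLOCK) :: A
!     First distributed dimension is dimension 1
!HPF$ DISTRIBUTE (BLOCK,*) :: B
      A = 1.0
      B = 1.0
      PRINT *, SUM(A), SUM(B)
      END
```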
Table 13-1 provides a list of compiler command line options that are valid on
many systems. Some systems do not support some of these options. In addition,
most node compiler options are available for systems with a node compiler that
is not supplied with pghpf. The pghpf-specific -Marg
options are described in the pghpf User's Guide.
Table 13-1 Compiler Command-line Options
Option          Description
-c              Stops after assembling (results placed in filename.o).
-Dname[=val]    Defines a preprocessor macro name with value val.
-dryrun         Show but do not execute all commands created by the driver.
-E              Displays preprocessed HPF file to the standard output.
-F              Saves a preprocessed HPF file in filename.f.
-help           Display the complete list of driver options.
-Idirectory     Adds directory to the search path for #include files.
-Ldirectory     Adds directory to the search path for library files.
-llibrary       Loads the library, in addition to the standard libraries.
-O[level]       Specifies code optimization at the specified level.
-ofilename      Names the object file filename.
-r4             Interpret DOUBLE PRECISION variables as REAL.
-r8             Interpret REAL variables as DOUBLE PRECISION.
-time           Print execution times for the various compiler steps.
-Uname          Undefine a preprocessor macro name.
-V              Displays the compiler phase version messages.
-v              Displays the compiler, assembler and linker phase invocation.
-W0,arg         Passes arguments arg to the node compiler.
-Wa,arg         Passes arguments arg to the assembler.
-Wl,arg         Passes arguments arg to the linker.
-Wh,arg         Passes arguments arg to the HPF compiler.
-w              Do not print warning messages.
The Portland Group, Inc. has the following mail address and telephone number.
You can call PGI, or contact us by email as described below.
The Portland Group, Inc.
9150 SW Pioneer Ct, Suite H     +1-503-682-2806 (voice)
Wilsonville, OR 97070           +1-503-682-2637 (FAX)
14.1 Obtaining Sales Information
To obtain further information on pghpf 2.0, or on other PGI products,
please send e-mail to sales@pgroup.com or contact PGI at the
address/number shown above.
The Portland Group, Inc. also maintains a WWW home page with information on PGI
and its products; the URL is http://www.pgroup.com.
14.2 Reporting Bugs
To report bugs with the pghpf compiler or runtime, please send e-mail to
trs@pgroup.com. A bug report is most useful if it includes a
code sample that demonstrates the bug, a description of the system you are
using, the error message, and the options used to compile. For a
runtime error, or a problem with your program's results, also include the
options used while running the program.
To obtain further assistance on pghpf 2.0, or on other PGI products, you
can also use the address/number shown above.
14.3 Retrieving Software and Documentation
For information on the current version of MPI or PVM contact the following:
MPI
PGI currently supports version 1.10 of MPI. For information contact the
following:
http://www.mcs.anl.gov/mpi/index.html
PVM
PGI currently supports the latest release of PVM version 3.3. For information
on obtaining PVM, contact the following:
http://www.netlib.org/pvm3
14.4 pghpf 2.0 Online Documentation
Online documentation is available for pghpf 2.0 using a WWW browser
such as Mosaic. To access the online documentation, open the file
pghpf.index.html. For example, using Mosaic, the command to bring up the
online documents would be:
% xmosaic $PGI/doc/hpf/html/pghpf.index.html