PathScale Compiler
The PathScale compiler suite is a high-performance compiler suite with a proprietary back-end based on the SGI compilers and a GNU front-end. The suite includes Fortran 77/90/95 (with Cray/SGI Fortran 95 extensions and character pointers), C and C++ compilers. We have found that the PathScale compilers outperform PGI in the majority of codes tested at CSCS. The PathScale compiler suite is accessed by loading the PrgEnv-pathscale module file. To change from the PGI programming environment to the PathScale environment, issue the command module switch PrgEnv-pgi PrgEnv-pathscale.
Versions
The current default version of the compiler is shown here. Older and/or newer versions of the compiler may be available: to see which versions are available issue module avail pathscale. To use a different version of the PathScale compiler issue module switch pathscale pathscale/<new_version>.
Invocation
To compile a Fortran 90 MPI code on Rosa invoke the Cray compiler wrapper:
> ftn [compiler options] example.f90 -o example.x
Likewise for C and C++ codes:
> cc [compiler options] example.c -o example.x
> CC [compiler options] example.C -o example.x
The man pages (man pathf90; man pathcc) provide information on all the compiler options available. Note that if two compiler options conflict, the last option on the command line takes precedence!
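One practical consequence of the last-option-wins rule is that overrides should be appended after any default flags. The following is a minimal sketch; the variable names DEFAULT_FLAGS and OVERRIDE_FLAGS are hypothetical, and example.f90 stands for your own source file.

```shell
# Hypothetical defaults, e.g. set by a build script.
DEFAULT_FLAGS="-O3"
# The override goes last so that it wins the conflict with -O3.
OVERRIDE_FLAGS="-O2"
FFLAGS="$DEFAULT_FLAGS $OVERRIDE_FLAGS"

# Show the command line that would be used.
echo "ftn $FFLAGS example.f90 -o example.x"

# Compile only where the Cray wrapper and the source file are present.
if command -v ftn >/dev/null 2>&1 && [ -f example.f90 ]; then
    ftn $FFLAGS example.f90 -o example.x   # unquoted on purpose: one word per flag
fi
```

According to the rule above, this build should be compiled at -O2, because -O2 appears after -O3 on the command line.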
Source files suffix rules
The PathScale Fortran compiler supports the following file extensions:
.f, .F, .f90, .F90, .f95, and .F95
By default, the compiler expects fixed form Fortran source for file suffixes .f, .F, and free form for suffixes .f90, .F90, .f95 and .F95. You can use the -freeform or -fixedform flags to override these defaults.
Suffix | Processing to be done |
.f | fixed form Fortran source; compile |
.F | fixed form Fortran source; preprocess, compile |
.f90 | free form Fortran source; compile |
.F90 | free form Fortran source; preprocess, compile |
.f95 | free form Fortran source; compile |
.F95 | free form Fortran source; preprocess, compile |
.s | assembler code; assemble |
.S | assembler code; preprocess, assemble |
.o | object file; passed to linker |
.a | library archive file; passed to linker |
For Fortran source files with the suffix .f, .f90 or .f95 you can invoke the Fortran preprocessor with the -ftpp option, or the C preprocessor with the -cpp option. Files with the suffix .F, .F90 or .F95 are preprocessed with the C preprocessor (-cpp) by default.
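The suffix rules above can be summarised in a small helper. This is only a sketch: the function name preprocess_flag is hypothetical, and it covers just the six Fortran suffixes listed in the table. It prints the extra option needed to have a given file run through the C preprocessor.

```shell
# Hypothetical helper: print the option needed to run the C preprocessor
# on a Fortran source file, following the suffix table above.
preprocess_flag() {
    case "$1" in
        *.F|*.F90|*.F95) echo "" ;;      # upper-case suffix: preprocessed by default
        *.f|*.f90|*.f95) echo "-cpp" ;;  # lower-case suffix: must be requested
        *) echo "" ;;                    # not a Fortran source suffix
    esac
}

preprocess_flag solver.f90   # prints -cpp
preprocess_flag solver.F90   # prints an empty line
```

A build script could then call ftn $(preprocess_flag solver.f90) solver.f90 -o solver.x. To use the Fortran preprocessor instead, the helper would print -ftpp rather than -cpp.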
Options
The PathScale compiler organizes options into twelve groups by compiler phase or class of feature. The general syntax is
- -GROUPNAME:option[=value]{:option[=value]}
The group names are as follows:
Group | Meaning |
-LIST | Options relating to the writing of a listing file |
-OPT | Optimizations |
-TARG | Target machine |
-TENV | Target environment |
-INLINE | Back-end inlining |
-IPA | Interprocedural analysis |
-LANG | Language features |
-CG | Code generation |
-WOPT | Global scalar optimization |
-LNO | Loop nest optimization |
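To illustrate the general group syntax, the sketch below assembles a group option from a group name and a list of suboptions. The helper name group_opt is hypothetical; the suboptions used (roundoff and alias) are ones described elsewhere in this document.

```shell
# Hypothetical helper: join a group name and its suboptions into
# -GROUPNAME:option[=value]{:option[=value]}
group_opt() {
    go_flag="-$1"          # start with the group name, e.g. -OPT
    shift
    for go_sub in "$@"; do
        go_flag="$go_flag:$go_sub"   # suboptions are separated by colons
    done
    echo "$go_flag"
}

OPT_FLAGS=$(group_opt OPT roundoff=2 alias=typed)
echo "$OPT_FLAGS"   # prints -OPT:roundoff=2:alias=typed
```

The resulting string can be passed to the compiler wrapper as a single option, for example ftn $OPT_FLAGS example.f90 -o example.x.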
Optimization
Using appropriate compiler optimization flags is essential for reasonable performance of your application on the XT5 platform. The default general optimization level is -O2, which enables conservative optimizations, i.e. ones that are virtually always beneficial. As a first step, we recommend adding the following optimization flag for increased performance:
- -OPT:Ofast
This enables a subset of -OPT suboptions and is equivalent to -OPT:ro=2:Olimit=0:div_split=ON:alias=typed. A number of other -OPT suboptions are documented in the eko man page (man eko).
More aggressive optimization can be obtained with:
- -Ofast
which is equivalent to -O3 -ipa -OPT:Ofast -fno-math-errno -ffast-math.
- -O3 turns on aggressive optimizations that may or may not improve code performance
- -ipa turns on interprocedural analysis (see below)
- -fno-math-errno prevents errno being set when calling math functions that are executed in a single instruction, e.g. sqrt.
- -ffast-math improves floating point performance by relaxing ANSI and IEEE rules
Note that optimizations introduced with the -Ofast flag may affect floating point accuracy due to the rearrangement of computations (see "Floating point accuracy" below).
Floating point accuracy
Relaxing the precision requirements for floating-point numbers might allow for faster code. There are three relevant options here: -OPT:roundoff, -OPT:IEEE_arithmetic, and -OPT:IEEE_NaN_Inf.
The -OPT:roundoff option defines the extent to which the compiler can introduce roundoff error:
- -OPT:roundoff=0 no roundoff error permitted (default at -O0, -O1, and -O2)
- -OPT:roundoff=1 permits limited roundoff error (default at -O3)
- -OPT:roundoff=2 permits roundoff error due to reassociating expressions (default at -Ofast)
- -OPT:roundoff=3 permits any roundoff error
The -OPT:IEEE_arithmetic option specifies the level of conformance to the IEEE 754 floating-point roundoff and exception handling behaviour:
- -OPT:IEEE_arithmetic=1 defines strict conformance to IEEE standard (default at -O0, -O1, and -O2)
- -OPT:IEEE_arithmetic=2 allows some relaxation of accuracy
- -OPT:IEEE_arithmetic=3 allows any mathematically equivalent transformations to be applied
The -OPT:IEEE_NaN_Inf=(on|off) option controls conformance to the IEEE standard for NaN and Infinity. The default is on.
Note: the GNU-style flag -ffast-math (which is implied by -Ofast) is equivalent to -OPT:IEEE_arithmetic=2 -fno-math-errno. If you wish to adhere strictly to IEEE arithmetic then you should use the -fno-fast-math flag, which implies -OPT:IEEE_arithmetic=1 -fmath-errno.
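One way to judge how sensitive a code is to these settings is to build it twice, once with relaxed and once with strict floating-point options, and compare the results. The sketch below only uses options described in this section; the variable names, the file example.f90 and the executable names are hypothetical.

```shell
# Relaxed build: -Ofast implies -OPT:roundoff=2 and -ffast-math.
FAST_FLAGS="-Ofast"
# Strict build: keep -O3 but request no roundoff error and strict IEEE behaviour.
STRICT_FLAGS="-O3 -OPT:roundoff=0 -OPT:IEEE_arithmetic=1 -OPT:IEEE_NaN_Inf=on"

# Compile only where the Cray wrapper and the source file are present.
if command -v ftn >/dev/null 2>&1 && [ -f example.f90 ]; then
    ftn $FAST_FLAGS   example.f90 -o example_fast.x
    ftn $STRICT_FLAGS example.f90 -o example_strict.x
fi
```

If the two executables produce results that differ by more than your application can tolerate, the strict flags (or an intermediate setting such as -OPT:roundoff=1) are probably the safer choice.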
Loop nest optimization
The loop nest optimizer (LNO) performs loop transformations to optimize nested loops by making better use of cache. The -LNO options are only enabled at general optimization level -O3 or higher. See man eko for a description of the LNO options.
Interprocedural analysis
Interprocedural analysis and optimization can be turned on (at any optimization level) with the -ipa option, and is turned on by default with the -Ofast option. Note that the interprocedural analysis is "whole program" and must be enabled for all source files. When -ipa is used, the majority of compiler optimization is done at the link stage rather than at the compile stage, so compilation will be fast but linking may take significantly longer. The interprocedural analysis flag must be turned on at both compile and link time.
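When compiling and linking in separate steps, the requirement that -ipa be present at both stages looks roughly as follows. This is a sketch with hypothetical file names (solver.f90 and main.f90); -c is the usual compile-only option.

```shell
# The same flags are used for every compile step and for the link step.
IPA_FLAGS="-O3 -ipa"

# Build only where the Cray wrapper and the source files are present.
if command -v ftn >/dev/null 2>&1 && [ -f solver.f90 ] && [ -f main.f90 ]; then
    ftn $IPA_FLAGS -c solver.f90                    # compile stage: fast
    ftn $IPA_FLAGS -c main.f90
    ftn $IPA_FLAGS solver.o main.o -o example.x     # link stage: most optimization happens here
fi
```

Keeping the flags in a single variable makes it harder to omit -ipa from one of the steps by accident.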
Alias Options
It is possible to allow the compiler to make assumptions about aliasing, which could improve the performance of your code.
- -OPT:alias=typed (this is implied by -OPT:Ofast)
- -OPT:alias=restrict
- -OPT:alias=disjoint
See man eko for an explanation of these options.
Automatic optimization tuning
The pathopt2 tool can be used to help tune the PathScale compiler for higher performance for a specific code. This tool iteratively tests different compiler options and combinations of options, tracks the results, selects the best options within a subgroup, and then elevates those options within the test hierarchy. See man pathopt2 for more details.
OpenMP
For the PathScale compiler use the -mp option to enable OpenMP support. There is no support for OpenMP directives in C++ that use exceptions, classes or templates.
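A minimal sketch of building an OpenMP code with -mp is shown below. The file and executable names are hypothetical, and the thread count of 4 is only an example; OMP_NUM_THREADS is the standard OpenMP environment variable for setting the number of threads.

```shell
# Number of OpenMP threads to use at run time (example value).
export OMP_NUM_THREADS=4

# Compile only where the Cray wrapper and the source file are present.
if command -v ftn >/dev/null 2>&1 && [ -f example.f90 ]; then
    ftn -mp example.f90 -o example_omp.x
fi

# Assumption: on a Cray XT system the executable is launched on the compute
# nodes with aprun, where -d reserves cores for the threads of each task, e.g.
#   aprun -n 1 -d $OMP_NUM_THREADS ./example_omp.x
```

Check the batch system documentation for your machine before relying on the launch line in the comment.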
GCC object compatibility
PathScale is fully compatible with GCC, which means that you can mix and match GNU- and PathScale-compiled binaries and libraries when linking. The front-end is source compatible with the GNU compiler suite for C/C++.
Debugging
The following compiler flags may be useful for helping debug your code:
Flag | Meaning |
-g | Generate debugging information (changes optimization to -O0 unless explicitly overridden) |
-C | Enable array bounds checking for Fortran 90 codes. If you then set the environment variable F90_BOUNDS_CHECK_ABORT=YES the code will crash on an out-of-bounds access |
-trapuv | Initializes variables with NaN. If the program uses the variable it will crash rather than producing incorrect results |
-zerouv | Initializes variables to 0 |
-OPT:alias=no_parm | If your program gets the right answers with this flag and wrong without, you are likely breaking Fortran aliasing rules |
-LANG:rw_const=on | Prevent segmentation fault when a constant parameter in Fortran is written to |
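Several of the flags in the table can be combined into a debug build. The following is a sketch only: the variable name DEBUG_FLAGS and the file names are hypothetical, and the flags themselves are taken from the table above.

```shell
# Debug information, array bounds checking, and NaN-initialised variables.
DEBUG_FLAGS="-g -C -trapuv"

# Make an out-of-bounds access abort the program instead of continuing.
export F90_BOUNDS_CHECK_ABORT=YES

# Compile only where the Cray wrapper and the source file are present.
if command -v ftn >/dev/null 2>&1 && [ -f example.f90 ]; then
    ftn $DEBUG_FLAGS example.f90 -o example_debug.x
fi
```

Since -g lowers the optimization level to -O0 unless overridden, a build like this is intended for finding bugs rather than for production runs.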
Further Information
See the man pages for detailed information on the compilers and compiler flags (man pathcc, man pathf95, man eko).
Refer to the online documentation from PathScale.


