To: J3 J3/24-156 From: Jeff Larkin Subject: OpenACC Liaison Report Date: 2024-June-27 Reference: 22-194 OpenACC Liaison Report ====================== Information about OpenACC, including the current standard document, training materials, and upcoming events can be found at http://www.openacc.org/. The current OpenACC standard is version 3.3. Version 3.3 of the OpenACC specification was adopted November 2022. The specification can be found at https://www.openacc.org/specification. The OpenACC technical committee is tentatively planning to release a specification update in November 2024. News ---- The 3.3 specification release focused primarily on standardizing extensions that were introduced by the NVIDIA compilers and used in multiple important applications. Several important new features are detailed below. * Multiple dimensions of gang parallelism. "Gangs" represent the coarsest of OpenACC's 3 levels of parallelism. Several important applications reported a challenge in generating sufficient parallelism for modern GPUs due to rich loop nests spanning multiple subroutines in the callstack. These subroutines prevent easily collpasing the loop nest to express more coarse-grained parallelism, so this feature enabled the programmer to identify loops in multiple subroutines for coarse-grained parallelization. The specification now allows for up to 3 degrees of gang parallelism. * A "force" modifier for the "collapse" clause. Traditionally collapsing was only allowed on tightly-nested loops, but this modifier enables the programmer to express a desire to collapse nested loops even if intervening code appears between those loops. Because this requires redundant execution of intervening code, it was deemed best that this collapsing be opt-in with a clause modifier. * Fortran interfaces were added to several memory management routines that did not have Fortran bindings previously. These routines were originally designed for C-style pointers to device memory and had no corresponding functionality in Fortran. Because multiple Fortran OpenACC applications were creating their own bindings to these routines it was decided that demand was sufficient to standardize on an interface. Hewlett Packard Enterprise has has continued to update its support for OpenACC in the crayftn compiler. OpenACC is supported on GPUs from NVIDIA and AMD as well as multicore targets. NVIDIA has released HPC SDK 24.5 with support for OpenACC in C, C++, and Fortran. The 23.11 release introduced a new memory mode that is targeted at machines with unified memory, either in hardware (NVIDIA's Grace Hopper Superchip) or software (Linux HMM). The NVIDIA compilers support OpenACC for NVIDIA GPUs and multicore CPUs. GCC supports OpenACC on AMD GPUs and NVIDIA GPUs. Progress continues on upstreaming support for OpenACC into the LLVM Compiler Infrastructure Project. This will include support in Fortran and C/C++. Applications ------------ There are over 250 applications in production or development using OpenACC, including: - Gaussian 16 - ANSYS Fluent - VASP 6 - MPAS-A - COSMO - GAMERA for GPU - Quantum Espresso - OVERFLOW More information on these can be found on the OpenACC web site at https://www.openacc.org/success-stories. Hackathons ---------- OpenACC continues to host numerous hackathons each year. OpenACC is generally the preferred approach for Fortran applications attending the events. Event dates are posted at https://www.gpuhackathons.org/events. It is not a requirement to use OpenACC directives to participate in these hackathons.