To: J3                                                     J3/23-172
From: Bill Long
Subject: OpenMP Report
Date: 2023-June-11

[Text provided by Deepak Eachempati, the HPE representative to the
OpenMP committee.]

The OpenMP language committee continues to work on the upcoming
OpenMP 6.0 release. A new technical report (TR) will be released in
November of this year, a public comment draft will likely be released
in Summer of 2024, and the final OpenMP 6.0 draft is expected to be
released in November of 2024.

Since the last update on OpenMP activities, we made progress on the
following:

* General improvements to the document: added hyperlinks throughout
  the specification to definitions in the glossary, and added missing
  definitions.

* Added the "induction" clause and the "declare induction" directive.
  This supports parallelization of loops that apply an arbitrary
  induction operation on a variable in each iteration. Here is a
  simple example:

    !$omp parallel do induction(step(2), *: k)
    DO i=1,n
      k = k * 2
      ...
    END DO

* Extended the num_threads clause to allow it to take a list. This can
  be used to control the default number of threads that are requested
  for a nested parallel region. An example:

    !$omp parallel num_threads(4,2)
      ! 4 threads requested here
      ...
      !$omp parallel
        ! 2 threads requested here for each encountering thread
        ...
      !$omp end parallel
    !$omp end parallel

* Added the coexecute construct for Fortran to, among other things,
  support parallelization of array syntax on GPUs. It functions much
  like workshare, except that it binds to teams regions rather than
  parallel regions. Example:

    !$omp target teams
    !$omp coexecute
      C(:) = A(:) * B(:)
      D(:) = C(:) * 2
      ...
    !$omp end coexecute
    !$omp end target teams

* Support for BLOCK inside of an atomic construct and more allowable
  forms for atomic conditional updates. Examples:

    !$omp atomic capture
    ! atomic update of x and capture into y
    BLOCK
      x = x + 1
      y = x
    END BLOCK

    !$omp atomic compare
    if (x < n) x = n
    ! as alternative to:
    !   !$omp atomic
    !   x = max(x,n)

* Allow the atomic construct to appear inside a loop construct.
  Example:

    !$omp loop
    DO i=1,n
      !$omp atomic
      a(i) = a(i) * 2
    END DO

* Permit the nowait clause to be (runtime) conditional. Example:

    !$omp target teams distribute nowait(do_async())
    DO i=1,n
      ...
    END DO

The following is a sample of topics that are being actively discussed
but have not yet been voted in by the language committee:

* Support for "free-agent" threads. Allows the implementation to
  reserve additional threads in a thread pool which are available to
  execute certain tasks (which may be called "free-agent",
  "unconfined", or something else) even when those threads are not
  actually assigned to a thread team associated with a parallel
  region.

* Support scalable vector architectures. The current simd construct
  allows one to specify a desired vector length, but this must be a
  compile-time constant. Scalable vector architectures (e.g., SVE or
  RISC-V) permit vector code that does not specify a vector length at
  compile-time. We are considering extensions to the "simdlen" clause
  to support such architectures.

* Support OpenMP parallelization of DO CONCURRENT. Currently, DO
  CONCURRENT is not considered one of the canonical loop forms. We are
  working on defining behavior when an OpenMP loop-associated
  directive is applied to DO CONCURRENT. We are also considering
  whether to allow other OpenMP constructs, such as "simd" or
  "atomic", to appear inside a DO CONCURRENT loop body.

* The tools subcommittee has been discussing DWARF extensions to
  support debugging of OpenMP or OpenACC programs. We plan to bring
  forward proposed extensions to the DWARF standards committee for
  possible inclusion in DWARF 6.

* Better support for controlling how data mapping works for
  environments that provide unified shared memory.

* Support for task dependencies for tasks generated by a taskloop
  construct.

* Better support for incrementally mapping parts of aggregate types
  (structures, arrays) to a device.

* Support updates to mapped pointers that are "attached" to a data
  object on a device. Currently, they aren't permitted to be modified
  while attached.

* Support for additional loop transformation directives, including
  fusion, fission, and concatenation.