To: J3 J3/23-227 From: Bill Long Subject: OpenMP report Date: 2023-October-16 Since the last report, the OpenMP language committee met once more to prepare the upcoming technical report (TR12), to be released at SC23. There will be more more TR released next summer, a 6.0 comment draft, before the final OpenMP 6.0 document is released (expected at SC24). The following are a list of new features that have been voted in: * Fixes to the specification of the new coexecute construct, enabling array syntax and intrinsic functions to parallelized more effectively in target regions. * Permit the loop directive (and combined constructs that end in the loop directive) to be applied to a DO CONCURRENT loop. This gives more control to the user on how a DO CONCURRENT loop should utilize the parallelism available in the OpenMP program. For instance, to offload a DO CONCURRENT loop to an accelerator, one can write: !$omp target teams loop do concurrent (i=1:n,j=1:n) ... end do There are limitations on the clauses that can appear on the loop directive, to avoid clashes with the DO CONCURRENT semantic. Specifically, no collapse clause and no data-sharing attribute clauses may be specified. OpenMP also restricts applying a directive to a DO CONCURRENT loop if there are conflicting accesses to a variable in multiple iterations (at least is a modification) unless a locality is explicitly specified for the variable. * Support for "free-agent threads" was added. This allows certain tasks to be scheduled on an available thread in an implementation's thread pool that is not part of the current thread team. This allows the program to achieve additional concurrency when generating tasks even if an active parallel region is not present. * The 'fuse' loop transformation construct was added, enabling a sequence of consecutive loop nests to be fused. Prior loop transformation constructs we have added include: unroll, tile, interchange, and reverse. * A 'self' modifier was added to the 'map' clause. This directs the implementation to map the list item without greating a new copy in device memory (i.e., the address of the list item is mapped to itself). While this would mostly commonly be used on systems that support unified shared memory, it could be supported on other systems as well through address pinning. * Add omp_target_memset and omp_target_memset_sync routines. These are some of the other big topics we are looking to fix or address before the next comment draft TR for OpenMP 6.0: * Improved specification for Fortran allocatable mapping. The current spec comes with limitations when a user wants to map derived types that is 'declare target' and that has many allocatable components. * Better control over mapping a "descriptor" versus the target of the descriptor for pointers and allocatables. * Allow more directives to be combined into a single combined directive. * Support ad-hoc memory partitioning with OpenMP memory allocators. * Support depend and affinity clauses on 'taskloop' directive.