To: J3 J3/23-165
From: T. CLune
Subject: Rank agnostic nested loops
Date: 2023-June-09
#Reference: 23-130
Forward:
========
Paper 23-130 from the previous meeting in Berkeley slipped through the
cracks and received little discussion. That paper suggested a new
feature which will be of considerable use for generic programming, but
falls squarely in the domain of JQR. Unfortunately, this oversight
prevented it from making the short list of new features provided in
23-154. I am therefore resubmitting in the hopes that JOR can give it
proper consideration for F202y.
1. Introduction
===============
The introduction of generics features which allow parameterizations of
an algorithm by the ranks of one or more arrays, is unfortunately
limited in certain important cases. ELEMENTAL and rank-agnostic
access of array elements handle a great many use cases, but important
capability gaps remain.
I propose that Fortran should be extended to include support for
rank-agnostic nested loops.
Section 2 provides some relevant use cases, and section 3 provides
potential solutions using suggestive syntax.
2. Use cases
============
The use cases presented here are:
2.1 Tensor/outer product
2.2 MINLOC analog
2.3 user-rolled stuff with assumed-rank (test framework generate
comparison messages)
https://gitlab.com/everythingfunctional/prune
2.1 Tensor/outer product
========================
Consider how one might write a template procedure that computes the
outer product of 2 array arguments. We assume that the template is
parameterized by two integers RA and RB that specify the ranks of the
two arguments A and B.
With the current features planned for Fortran 202X, the user would be
forced to manually implement rather tedious logic for iterating
through a set of indices:
FUNCTION OUTER_PRODUCT(A, B) RESULT(C)
REAL, RANK(RA), INTENT(IN) :: A
REAL, RANK(RB), INTENT(IN) :: B
REAL, RANK(RA+RB) :: C ! C(I,J) = A(I) * B(J)
INTEGER :: a_idx(RA)
INTEGER, POINTER :: b_idx(:)
INTEGER, POINTER :: c_idx(:)
LOGICAL :: done
c_idx = 1
DO
C(@c_idx) = A(@c_idx(1:RA)) * B(@c_idx(1:RB))
c_idx(1:RA) = next_idx(c_idx(1:RA), shape(A), done)
IF (done) THEN
a_idx = 1
c_idx(RA+1:) = next_idx(c_idx(RA+1:), shape(B), done)
IF (done) EXIT
END IF
END DO
CONTAINS
FUNCTION next_idx(idx, shp, done) result(new_idx)
INTEGER, INTENT(IN) :: idx(:)
INTEGER, INTENT(IN) :: shp(:)
LOGICAL, INTENT(OUT) :: done
INTEGER, INTENT(OUT) :: new_idx(size(idx))
INTEGER :: m
done = .FALSE. ! unless
DO m = 1, SIZE(idx)
IF (idx(m) < shp(m)) THEN
new_idx(m) = idx(m) + 1
RETURN
ELSE !
IF (m == SIZE(shp)) THEN
done = .true.
RETURN
ELSE
new_idx(m) = 1
END IF
END IF
END DO
END FUNCTION next_idx
END FUNCTION
2.2 MINLOC analog
=================
The proposed TEMPLATE features in F202Y will in theory allow us to
define a TEMPLATE that implements an intrinsic-like MINLOC function
for rank-N arrays of arbitrary type provided a compare operation.
However, the existing rank agnostic machinery in F2023 makes this more
difficult than one might expect.
Brad Richards has implemented this generalized MINLOC at:
https://github.com/j3-fortran/generics/blob/main/examples/intrinsics/
minloc_m.f90
Much as with the previous use case, the code is particularly complex
due to the need to manually loop through an arbitrary number of
dimensions.
2.3 Assertion library
=====================
Consider the use case where a testing framework wishes to provide an
overloaded assert() function that compares 2 same-shape arrays of rank
N for equality. If any elements differ, then the procedure produces a
message indicating the location of the first difference and the values
at that location.
This use case is quite similar to that of MINLOC, as it could be
expressed a
SUBROUTINE assert(expected, found)
REAL, RANK(N), INTENT(IN) :: expected
REAL, RANK(N), INTENT(IN) :: found
ASSOCIATE ( loc => FINDLOC(found /= expected) )
print*,'First difference at location ", loc
print*,' Expected: ', expected(@loc)
print*,' Found: ', found(@loc)
END ASSOCIATE
END SUBROUTINE
3. Suggested simplification
===========================
My suggested approach is weakly analogous to the approach for
rank-agnostic access of array elements. The index and bounds of a DO
loop would be permited to be a multi-index.
To clarify the syntax consider this simple doubly nested loop:
INTEGER :: I, J
DO I = 1, M
DO J = 1, N
A(I,J) = < some expression >
END DO
END DO
This could be rewritten as:
INTEGER :: IJ(2)
DO @IJ = [1,1]:[M,N]
A(@IJ) = < some expression >
END DO
-----------
Now consider the outer product use case
FUNCTION OUTER_PRODUCT(A, B) RESULT(C)
REAL, RANK(RA), INTENT(IN) :: A
REAL, RANK(RB), INTENT(IN) :: B
REAL, RANK(RA+RB) :: C ! C(I,J) = A(I) * B(J)
INTEGER :: idx(RA+RB)
DO @idx = 1:[SHAPE(A), SHAPE(B)]
C(@idx) = A(@idx(1:RA)) * B(@idx(RA+1:))
END DO
END FUNCTION
A variant that could exploit some degree of ELEMENTAL optimizations:
FUNCTION OUTER_PRODUCT(A, B) RESULT(C)
REAL, RANK(RA), INTENT(IN) :: A
REAL, RANK(RB), INTENT(IN) :: B
REAL, RANK(RA+RB) :: C ! C(I,J) = A(I) * B(J)
INTEGER :: idx(RA)
DO @idx = 1:SHAPE(A)
C(@idx,@:shape(B)) = A(@idx) * B
END DO
END FUNCTION
----------
For the MINLOC case, the suggestion notation would permit the
following much simpler, and arguably clearer, implementation shown
below. (For simplicity, I have omitted the mask and dim arguments.)
TEMPLATE MINLOC_T(T, N, less)
TYPE, DEFERRED :: T
INTEGER, CONSTANT :: N
INTERFACE
LOGICAL FUNCTION less(x,y)
TYPE(T), INTENT(IN) :: x, y
END FUNCTION
END INTERFACE
CONTAINS
FUNCTION MINLOC(A) RESULT(idx)
INTEGER :: idx(N)
TYPE(T), RANK(N), INTENT(IN) :: A
TYPE(T) :: x_min
INTEGER :: i(N)
SELECT RANK (A)
RANK (0) ! scalar
idx = 1
RANK DEFAULT
idx = 1
x_min = A(@idx)
DO @i = 1:SHAPE(A)
ASSOCIATE( x => A(@i) )
IF (less(x, x_min)) THEN
idx = i
x_min = x
END IF
END ASSOCIATE
END DO
END SELECT RANK
END FUNCTION
END TEMPLATE
===END===