To: J3                                                     J3/24-136
From: Malcolm Cohen
Subject: DIN-4: Generic processing of assumed-rank objects
Date: 2024-June-21


1. Introduction

DIN-4 suggests
    "Allow generic processing of assumed-rank arguments,
     possibly under suitable restrictions."

That is a very vague requirement.
This paper suggests some concrete possibilities.

Note that this is very much a first draft. Without doubt, it is not a
complete set of proposals. It may, however, be the start of something
small enough that we might be able to do it and get it right in one
revision.


2. Assumed size

The fly in the ointment for most proposals is the possibility that
the assumed-rank object might be associated with an assumed-size array.
There are four obvious solutions. Whatever is done (which might not be
one of these four) should be chosen consistently, for consistency.

    (1) Don't allow anything that does not work for an assumed-size array.
        This is approximately the current state of affairs, and is hardly
        adequate.
    (2) Use the bounds reported by LBOUND and UBOUND for assumed-size.
        That produces a zero extent in the final dimension, so nothing will
        happen in the case of assumed-size. Hello silent wrong answers.
    (3) Raise a runtime error if the "generic processing" is applied to an
        assumed-size array. This has the advantage of avoiding silent wrong
        answers, but the disadvantage of not (in general) being capable of
        being caught and handled by the user program.
    (4) Only permit the generic processing in the RANK DEFAULT block of a
        SELECT RANK that has a RANK(*) block. This forces the user program
        to handle assumed-size specifically. It might be too much of a
        straightjacket though.

Of those obvious solutions, both (3) and (4) seem reasonable.


3. Contexts for additional usage of assumed rank

3.1 Whole-array reductions

It would seem to unproblematic to permit assumed rank in array reduction
intrinsics when there is no DIM argument. The result in this case is
reduced all the way to scalar, so there is no nightmare "variable rank"
expression.

ALL, ANY, REDUCE, SUM, PRODUCT, MAXVAL, MINVAL, IALL, IANY,
IPARITY, PARITY.

Currently all those functions require the argument being reduced to be an
array. It would be highly undesirable for the scalar association case of
assumed rank to make an error, and the result is obvious (it's just the
value of the scalar).

Consistency might suggest permitting scalar all the time, but of course
that would inhibit detection of typos in the usage of such functions; the
consistency argument is weaker than the error-detection argument, so unless
there is another reason for permitting scalar, they should continue to
require arrays.

The location reductions MAXLOC, MINLOC, FINDLOC could also be permitted for
assumed rank. The size of their result would be non-constant, but the rank
would be constant.

3.2 Array constructors

An array is permitted in an array constructor, and simply expands to its
elements in array element order. That seems perfectly reasonable for
assumed rank. On the other hand, it might not be very useful.

3.3 Array transformation intrinsics

The only one here that is clearly non-problematic is RESHAPE.
It does not seem very useful.

3.4 Contiguous assumed rank

There are additional contexts where a contiguous assumed rank object could
be used, for example, as a sequence associated actual argument. Is this
sufficiently useful? Perhaps not.

3.5 C_LOC

C_LOC of an array returns the address of its first element. There would
thus seem to be no problem permitting it for an assumed rank array, though
it would not be of much use unless the array is contiguous.

3.6 Array elements

A slight extension of the existing rank-independent subscripting syntax
would make it possible to reference elements of an assumed-rank array, even
one that is associated with an assumed-size actual argument. The question
that arises is what happens when the number of subscripts is incorrect?
For insufficient subscripts, some obvious possibilities are
    - error termination
    - the missing subscripts are treated as equal to the lower bound
    - processor-dependent random garbage results or crash,
    - have a pseudo-subscript STAT=, like we do for image indexing.
For too many subscripts, some obvious possibilities are
    - error termination
    - the extra subscripts are ignored
    - segmentation fault crash.

The slight extension is that we need to clearly permit the subscripting
vector to have non-constant size when the object is assumed rank.

3.7 Array sections with triplets

A similar extension is possible here, and similar questions arise when the
user gets it wrong. It would, however, be important not to permit triplet
vectors to be variable size, otherwise we don't know the rank of the
subobject that would be produced.

3.8 Other array sections

These don't seem reasonably possible without falling foul of the "unknown
rank" issue.


4. Assumed rank array traversal

Due to the problem of variable-rank expressions, there is no obvious way to
use elemental procedures on assumed-rank objects, even though the compiler
would know how to traverse them.

Possibilities here would be
    - a DO loop which has an index vector (of unknown size)
      instead of an index variable;
    - a DO association loop which has an associate-name that is associated
      with consecutive array elements on each iteration.

Actually, it would probably be better not to reuse the DO keyword here, as
this is rather a special case, and operates quite differently to the normal
DO.

For example, in casual BNF:
(a)
    TRAVERSE (assumed-rank-object-name) WITH (index-vector-name)
        block
    END TRAVERSE

In the block, index-vector-name would take on successive values such that
it traverses the object in array element order. For example
    TRAVERSE (A) WITH (IDX)
        ... here, A(@IDX) is the array element for this iteration.

I think we'd want IDX to be a construct entity of rank 1, size equal to
RANK(A), and of type INTEGER with a processor-dependent kind not less than
default integer kind. Or we could allow an integer-type-spec in front of
IDX, e.g.
    TRAVERSE (A) WITH ( [ integer-type-spec :: ] IDX )
but it may be cleaner just to require the processor to automatically make
the kind big enough to hold any subscript value.

(b)
    TRAVERSE (assumed-rank-object-name) ASSOCIATE (element-name)
        block
    END TRAVERSE

In the block, element-name would be associated with each successive element
in the iteration, e.g.
    TRAVERSE (A) ASSOCIATE (ELT)
        ... here, ELT is the array element for this iteration

This avoids messing around with index values, needing to know what integer
kind they should be, etc. If one wants to mess around with index values,
e.g. to do a neighbourhood computation, perhaps one should be constructing
and updating the index values manually anyway.


5. Conclusions

The only thing here that is both easy and lacking in problems is permitting
array reduction intrinsics without DIM=.

Although there are potential problems, the more general array element
subscripting and array traversal operations are worth investigating.

===END===