To: J3                                                     J3/24-136
From: Malcolm Cohen
Subject: DIN-4: Generic processing of assumed-rank objects
Date: 2024-June-21

1. Introduction

DIN-4 suggests
    "Allow generic processing of assumed-rank arguments, possibly under
     suitable restrictions."

That is a very vague requirement. This paper suggests some concrete
possibilities.

Note that this is very much a first draft. Without doubt, it is not a
complete set of proposals. It may, however, be the start of something
small enough that we might be able to do it and get it right in one
revision.

2. Assumed size

The fly in the ointment for most proposals is the possibility that the
assumed-rank object might be associated with an assumed-size array.

There are four obvious solutions. Whatever is chosen (which might not be
one of these four) should be applied consistently.

(1) Don't allow anything that does not work for an assumed-size array.
    This is approximately the current state of affairs, and is hardly
    adequate.

(2) Use the bounds reported by LBOUND and UBOUND for assumed-size.
    That produces a zero extent in the final dimension, so nothing will
    happen in the case of assumed-size. Hello silent wrong answers.

(3) Raise a runtime error if the "generic processing" is applied to an
    assumed-size array. This has the advantage of avoiding silent wrong
    answers, but the disadvantage of not (in general) being capable of
    being caught and handled by the user program.

(4) Only permit the generic processing in the RANK DEFAULT block of a
    SELECT RANK that has a RANK(*) block. This forces the user program
    to handle assumed-size specifically. It might be too much of a
    straitjacket though.

Of those obvious solutions, both (3) and (4) seem reasonable.

3. Contexts for additional usage of assumed rank

3.1 Whole-array reductions

It would seem to be unproblematic to permit assumed rank in array
reduction intrinsics when there is no DIM argument.
The result in this case is reduced all the way to scalar, so there is no
nightmare "variable rank" expression. The functions in question are
    ALL, ANY, REDUCE, SUM, PRODUCT, MAXVAL, MINVAL, IALL, IANY, IPARITY,
    PARITY.

Currently all those functions require the argument being reduced to be
an array. It would be highly undesirable for the scalar association case
of assumed rank to be an error, and the result is obvious (it's just the
value of the scalar).

Consistency might suggest permitting scalar all the time, but of course
that would inhibit detection of typos in the usage of such functions;
the consistency argument is weaker than the error-detection argument, so
unless there is another reason for permitting scalar, they should
continue to require arrays.

The location reductions MAXLOC, MINLOC, FINDLOC could also be permitted
for assumed rank. The size of their result would be non-constant, but
the rank would be constant.

3.2 Array constructors

An array is permitted in an array constructor, and simply expands to its
elements in array element order. That seems perfectly reasonable for
assumed rank. On the other hand, it might not be very useful.

3.3 Array transformation intrinsics

The only one here that is clearly non-problematic is RESHAPE.
It does not seem very useful.

3.4 Contiguous assumed rank

There are additional contexts where a contiguous assumed rank object
could be used, for example, as a sequence associated actual argument.

Is this sufficiently useful? Perhaps not.

3.5 C_LOC

C_LOC of an array returns the address of its first element. There would
thus seem to be no problem permitting it for an assumed rank array,
though it would not be of much use unless the array is contiguous.

3.6 Array elements

A slight extension of the existing rank-independent subscripting syntax
would make it possible to reference elements of an assumed-rank array,
even one that is associated with an assumed-size actual argument.
The question that arises is what happens when the number of subscripts
is incorrect?

For insufficient subscripts, some obvious possibilities are
    - error termination;
    - the missing subscripts are treated as equal to the lower bound;
    - processor-dependent random garbage results or crash;
    - have a pseudo-subscript STAT=, like we do for image indexing.

For too many subscripts, some obvious possibilities are
    - error termination;
    - the extra subscripts are ignored;
    - segmentation fault crash.

The slight extension is that we need to clearly permit the subscripting
vector to have non-constant size when the object is assumed rank.

3.7 Array sections with triplets

A similar extension is possible here, and similar questions arise when
the user gets it wrong. It would, however, be important not to permit
triplet vectors to be variable size, otherwise we don't know the rank of
the subobject that would be produced.

3.8 Other array sections

These don't seem reasonably possible without falling foul of the
"unknown rank" issue.

4. Assumed rank array traversal

Due to the problem of variable-rank expressions, there is no obvious way
to use elemental procedures on assumed-rank objects, even though the
compiler would know how to traverse them.

Possibilities here would be
    - a DO loop which has an index vector (of unknown size) instead of
      an index variable;
    - a DO association loop which has an associate-name that is
      associated with consecutive array elements on each iteration.

Actually, it would probably be better not to reuse the DO keyword here,
as this is rather a special case, and operates quite differently to the
normal DO.

For example, in casual BNF:

(a) TRAVERSE (assumed-rank-object-name) WITH (index-vector-name)
        block
    END TRAVERSE

    In the block, index-vector-name would take on successive values such
    that it traverses the object in array element order.

    For example
        TRAVERSE (A) WITH (IDX)
            ... here, A(@IDX) is the array element for this iteration.

    I think we'd want IDX to be a construct entity of rank 1, size equal
    to RANK(A), and of type INTEGER with a processor-dependent kind not
    less than default integer kind. Or we could allow an
    integer-type-spec in front of IDX, e.g.
        TRAVERSE (A) WITH ( [ integer-type-spec :: ] IDX )
    but it may be cleaner just to require the processor to automatically
    make the kind big enough to hold any subscript value.

(b) TRAVERSE (assumed-rank-object-name) ASSOCIATE (element-name)
        block
    END TRAVERSE

    In the block, element-name would be associated with each successive
    element in the iteration, e.g.
        TRAVERSE (A) ASSOCIATE (ELT)
            ... here, ELT is the array element for this iteration

    This avoids messing around with index values, needing to know what
    integer kind they should be, etc. If one wants to mess around with
    index values, e.g. to do a neighbourhood computation, perhaps one
    should be constructing and updating the index values manually
    anyway.

5. Conclusions

The only thing here that is both easy and lacking in problems is
permitting array reduction intrinsics without DIM=.

Although there are potential problems, the more general array element
subscripting and array traversal operations are worth investigating.
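
For reference, the "current state of affairs" mentioned in section 2
can be illustrated with the whole-array reduction case of 3.1. The
following is only a sketch in Fortran 2018; the module name, the
procedure name TOTAL and the STAT convention are illustrative
assumptions and are not part of this paper or of any proposal in it.

```fortran
module assumed_rank_sum_sketch
  implicit none
  private
  public :: total
contains
  ! Illustrative helper (name and interface are assumptions): whole-array
  ! sum of an assumed-rank dummy using Fortran 2018 features only.
  subroutine total(a, s, stat)
    real,    intent(in)  :: a(..)
    real,    intent(out) :: s
    integer, intent(out) :: stat  ! 0 = ok, 1 = assumed-size, 2 = rank not handled
    s = 0.0
    stat = 0
    select rank (a)
    rank (0)
      s = a           ! scalar association: the result is just the value
    rank (1)
      s = sum(a)
    rank (2)
      s = sum(a)
    rank (3)
      s = sum(a)
    rank (*)
      stat = 1        ! assumed-size: the final extent is unknown
    rank default
      stat = 2        ! every further rank would need its own block
    end select
  end subroutine total
end module assumed_rank_sum_sketch
```

If the reductions of 3.1 were permitted for assumed rank, the numbered
RANK blocks above would presumably collapse into a single reference to
SUM; the RANK (*) case would still need one of the treatments discussed
in section 2.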