                                                            11-175
To: J3
From: Nick Maclaren
Subject: Interop TR: CFI_section and strides
Date: 2011 June 09
Reference: N1854

I don't like one aspect of CFI_section at all, but cannot find any
paper explaining why the decision was taken the way that it was.
However, I need to describe another topic first.


Background
----------

I was thoroughly confused by the TRANSPOSE and RESHAPE intrinsics for a
long time, and still think that the standard is unclear. I had
incorrectly assumed that they delivered an association, in the same way
that a section does, and not a free-standing value. I still think that
it was a mistake not to create such a class of intrinsic, but that
decision was taken 20 years ago.

Another part of the problem is that it IS possible to create an array
section that is associated with the diagonal of a contiguous square
array, and do similar reshaping (including increasing the rank of an
array), even in pure Fortran. All you have to do is to use either
sequence or pointer association, though both need contiguity. Also,
being able to create diagonal sections was shown to be straightforward
in Algol68, was in earlier drafts of Fortran 90, and its omission was
regretted by several people.

A related part is that it is a long-standing requirement to do those,
and similar manipulations, more cleanly, and without the constraint of
contiguity when it is unnecessary. Most FFT codes rely on this, and it
is often why they are stuck in the Fortran 77 era.

Lastly, one otherwise trivial problem that we punted on previously (in
this TR) is that the constraints on stride consistency (10:26-29) are
needed in most cases, but aren't always for extent sizes of 0 and 1.
Hard cases make bad law, and I don't propose complicating the wording,
but it does affect this function.


Current Specification
---------------------

The current specification of CFI_section provides exactly the right
tool to create array diagonals, and then forbids doing that, as an
apparently gratuitous restriction.
A lot of people will use it to do such reshaping, no matter what this
TR says, find that it works on one system, and then complain when it
doesn't on another.

Also, it is very error-prone to use and there is currently one omission
in the wording and one lack of clarity. The omission is that it does
not say anything about the sm member of elements with member
extent = -1. The lack of clarity is that the constraints on descriptor
consistency apply to the result descriptor and not to the argument, and
C programmers will quite correctly realise that the sm member is not
used when the extent size is 0 or 1. That will definitely cause
confusion over when CFI_INVALID_SM must be returned, must not be
returned and may be returned.

Some attention is needed to sort these issues out, whatever is decided.


Possibility 1
-------------

We could relax the current wording to allow the creation of valid views
of an array that are not possible in Fortran, but would be 'safe'. I
originally started with that opinion, but have come round to the view
that it is not a desirable path to follow in this TR. Its implications
are just too fundamental.

For example, we could provide a proper CFI_reshape function, either as
an alternative or a replacement, that would look a bit like this:

    int CFI_reshape ( CFI_cdesc_t * result, const CFI_cdesc_t * source,
        CFI_attribute_t attribute, CFI_rank_t rank,
        const CFI_dim_t dim[] );

This would establish a reshaping of the source, subject only to the
resulting descriptor being valid and all elements of the result being
elements of the source. The wording for this would be slightly simpler
than that for the current CFI_section, as there would be no special
interpretation of the dim argument.

This possibility may have been considered in Las Vegas, but it is hard
to tell. I feel that it needs explicit consideration, but would not
favour adopting it, because I do not know how much it would impact on
existing implementations.
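
To make the kind of view under discussion concrete, the following is a
minimal C sketch of how a descriptor that carries only a base address,
an extent and a byte stride can denote the diagonal of a contiguous
square array. The types view_dim_t and view1_t and the functions
diagonal_view and view_element are hypothetical names invented for this
illustration; they are not the TR's CFI_dim_t or CFI_cdesc_t, and this
is not a proposal for their layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one dimension of a descriptor.
   This is NOT the TR's CFI_dim_t; it only mirrors the idea of
   lower bound, extent and stride multiplier (sm, in bytes). */
typedef struct {
    ptrdiff_t lower_bound;
    ptrdiff_t extent;
    ptrdiff_t sm;
} view_dim_t;

/* Hypothetical rank-one view of some storage. */
typedef struct {
    char      *base_addr;
    size_t     elem_len;
    view_dim_t dim;
} view1_t;

/* Build a rank-one view of the main diagonal of a contiguous
   n-by-n array whose elements are elem_len bytes long.
   Stepping one row and one column at once moves n + 1 elements,
   so the byte stride is (n + 1) * elem_len. */
view1_t diagonal_view(void *base, size_t elem_len, ptrdiff_t n)
{
    view1_t v;

    assert(base != NULL && elem_len > 0 && n > 0);
    v.base_addr = (char *)base;
    v.elem_len = elem_len;
    v.dim.lower_bound = 0;
    v.dim.extent = n;
    v.dim.sm = (n + 1) * (ptrdiff_t)elem_len;
    return v;
}

/* Address of element i (zero-based) of a rank-one view. */
void *view_element(const view1_t *v, ptrdiff_t i)
{
    assert(v != NULL && i >= 0 && i < v->dim.extent);
    return v->base_addr + i * v->dim.sm;
}
```

Every element reached through such a view is an element of the
original array, which is the sense in which the view is 'safe', yet the
stride is not an integer multiple of any single dimension's stride, so
it cannot be written as a Fortran array section.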


Possibility 2
-------------

We could make no technical change and just fix the omission and lack of
clarity. The wording is little shorter because we need to put more
effort into the CFI_dim_t consistency wording.

This approach would seem to leave the door open to future extensions,
but they would be at most things like producing diagonals. CFI_section
could not be extended to a general reshaping function, and I believe
that reshaping would be better done by a proper CFI_reshape in any
case.

I do not favour it, because it is nearly as error-prone as a general
reshape, only marginally more flexible than Fortran sections, and
messier to specify than either.


Possibility 3
-------------

We could reduce the opportunity to create improper arrays and increase
the safety of this call, by making the sm members relative and not
absolute, exactly as in Fortran sections. Note that the argument that a
value of type CFI_dim_t is being treated anomalously doesn't hold
water, because that has to be done even under the existing approach!

I believe that this will be less error-prone in use, and offers
considerably less opportunity for C programmers to shoot their feet off
while claiming they are just doing what the TR allows. As far as I can
see, its functionality is identical.

This document is proposing this approach, in one of two forms, with a
mention of a third. The first is a minimal change, and the second
splits the dim argument up, on the grounds that it is not being used as
a normal CFI_dim_t value (even at present). The lower bounds, extents
and strides are then specified as separate arguments, and each may be
NULL, which adds a useful convenience. The forms are otherwise
functionally equivalent.


Edits to N1854:
---------------

Alternative A:
-------------

[17:24] Delete "of dim" and, after "value of -1", append "and the
corresponding sm member is ignored".
Also append "The other sm members of dim are treated as relative
strides and are multiplied by the corresponding sm members of source to
produce the sm members of result."

[18:1] Delete "*source->dim[0].sm".


Alternative B:
-------------

[17:9] Replace "const CFI_dim_t dim[]" by "const CFI_index_t
lower_bounds[], const CFI_index_t upper_bounds[], const CFI_index_t
strides[]".

[17:19-24] Replace the paragraph specifying the dim argument by:

    "lower_bounds points to an array specifying the subscripts of the
    element in the given array that is the first element of the array
    section. If it is NULL, the first element of source is used.

    upper_bounds points to an array specifying the subscripts of the
    element in the given array that is the last element of the array
    section. If it is NULL, the last element of source is used.

    strides points to an array specifying the strides of the array
    section in units of the sm member of the dim member of argument
    source; if an element is 0, the corresponding dimension is a
    subscription and the corresponding elements of lower_bounds and
    upper_bounds shall be equal. If it is NULL, the strides are
    treated as all being 1."

[17:27-28] Replace "the number of dim entries for which the extent
member is equal to -1" by "the number of stride elements which are
equal to 0".

[17:34-39] and [18:1-2] Replace the example by:

    "the following code fragment establishes a C descriptor for the
    array section A(3::5).

        CFI_index_t lower_bounds[] = {2}, strides[] = {5};
        CFI_CDESC_T(1) section;
        int ind;
        ind = CFI_section ( (CFI_cdesc_t *) &section, source,
            CFI_attribute_assumed, lower_bounds, NULL, strides );

    If source already points to a C descriptor for the rank-two
    Fortran array A declared as

        real A(100,100)

    the following code fragment establishes a C descriptor for the
    rank-one array section A(:,42).

        CFI_index_t lower_bounds[] = {source->dim[0].lower_bound,41},
            upper_bounds[] = {source->dim[0].upper_bound,41},
            strides[] = {1,0};
        CFI_CDESC_T(1) section;
        int ind;
        ind = CFI_section ( (CFI_cdesc_t *) &section, source,
            CFI_attribute_assumed, lower_bounds, upper_bounds, strides );
    "

{{{ Optionally:

    1) We could ignore the upper bound element if the stride is zero,
    rather than requiring it to be set.

    2) I do not think that we need the following any longer, but it
    could easily be added if people prefer:

    [17:28] Append to the end of the paragraph "The argument result
    shall be such that it specifies an array that could have been
    obtained by associating the source argument with a Fortran
    assumed-shape array and applying array section notation in
    Fortran."
}}}


Alternative C:
-------------

We could restore the CFI_bounds_t type, and proceed as in alternative
B, but by putting the wording in different places. I do not think that
it is worthwhile for a single function.
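
As a rough illustration of the relative-stride rule behind Alternatives
A and B, the following C sketch computes a section from separate
lower-bound, upper-bound and stride arrays, each of which may be NULL.
It is only a sketch under stated assumptions: sketch_dim_t,
sketch_desc_t, SKETCH_MAX_RANK and sketch_section are hypothetical
names that are not part of N1854, result lower bounds are taken to be
zero, the extent uses the usual Fortran triplet formula, and the error
return is a bare nonzero value rather than any CFI error code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_RANK 15   /* assumption for this sketch only */

/* Hypothetical descriptor types; NOT the TR's CFI_dim_t/CFI_cdesc_t. */
typedef struct { ptrdiff_t lower_bound, extent, sm; } sketch_dim_t;

typedef struct {
    char        *base_addr;
    int          rank;
    sketch_dim_t dim[SKETCH_MAX_RANK];
} sketch_desc_t;

/* Returns 0 on success and nonzero if the request is inconsistent. */
int sketch_section(sketch_desc_t *result, const sketch_desc_t *source,
                   const ptrdiff_t lower_bounds[],
                   const ptrdiff_t upper_bounds[],
                   const ptrdiff_t strides[])
{
    sketch_desc_t r = {0};
    ptrdiff_t offset = 0;    /* byte offset of the first element */
    int empty = 0;

    assert(result != NULL && source != NULL);
    assert(source->rank >= 0 && source->rank <= SKETCH_MAX_RANK);

    for (int i = 0; i < source->rank; i++) {
        const sketch_dim_t *s = &source->dim[i];
        ptrdiff_t src_hi = s->lower_bound + s->extent - 1;
        ptrdiff_t lo = lower_bounds ? lower_bounds[i] : s->lower_bound;
        ptrdiff_t hi = upper_bounds ? upper_bounds[i] : src_hi;
        ptrdiff_t st = strides ? strides[i] : 1;

        if (st == 0) {
            /* A plain subscript: the dimension is dropped. */
            if (lo != hi || lo < s->lower_bound || lo > src_hi)
                return 1;
            offset += (lo - s->lower_bound) * s->sm;
            continue;
        }

        /* Fortran triplet extent: max(0, (hi - lo + st) / st). */
        ptrdiff_t extent = (hi - lo + st) / st;
        if (extent <= 0) {
            extent = 0;
            empty = 1;
        } else {
            ptrdiff_t last = lo + (extent - 1) * st;
            if (lo < s->lower_bound || lo > src_hi ||
                last < s->lower_bound || last > src_hi)
                return 1;
            offset += (lo - s->lower_bound) * s->sm;
        }

        r.dim[r.rank].lower_bound = 0;
        r.dim[r.rank].extent = extent;
        /* The relative stride is scaled by the source's own sm. */
        r.dim[r.rank].sm = st * s->sm;
        r.rank++;
    }

    r.base_addr = empty ? source->base_addr : source->base_addr + offset;
    *result = r;
    return 0;
}
```

Because the caller never supplies an absolute byte stride, the result
stride is always an integer multiple of the source stride, which is the
safety property argued for under Possibility 3. With a rank-one source
of 100 elements, lower_bounds = {2}, upper_bounds = NULL and
strides = {5}, this sketch yields an extent of 20 and an sm of five
elements, matching the A(3::5) example.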