J3/03-253r1

Date:      14 November 2003
To:        J3
From:      Aleksandar Donev
Subject:   Post Fortran 2003: Allowing Multiple Nonzero-Rank Part
           References for Structure Components
Reference: "Multiple Nonzero-rank Part References", A. Donev,
           Fortran Forum, December 2002, pp 2-10.
           J3/03-007R1

Response is in J3/03-277

______________________________________
Summary
______________________________________

I propose to delete the constraint that prohibits multiple nonzero
rank part-refs:

"In a data-ref, there shall be no more than one part-ref with nonzero
rank."

There is no justification for this constraint, and removing it would
unleash a most useful capability that Fortran is uniquely positioned
to provide, given its ability to deal with non-contiguous arrays. The
constraint that

"A part-name to the right of a part-ref with nonzero rank shall not
have the ALLOCATABLE or POINTER attribute"

should remain, as this prohibits skewed arrays, which cannot be
implemented with regular array descriptors. However, structure
components with multiple nonzero rank part-refs *can* be implemented
with regular array descriptors. In fact, I have implemented
extensions that support such structure components for the three
compilers I use, in only a hundred lines or so of Fortran+C code. So
I know for sure that this won't pose an implementation challenge,
and I also know that it is a most useful functionality in scientific
codes. It is also relatively easy to incorporate the change into the
standard.

______________________________________
Motivation
______________________________________

Take the simple example:

   TYPE Point3D ! A point in 3D
      REAL :: coordinates(3), data(2)
   END TYPE Point3D

   TYPE(Point3D), DIMENSION(10), TARGET :: points ! A collection of points

In Fortran, points(1:2)%coordinates(1) produces a strided rank-1
array section (I will use this term more liberally than the actual
standard) which contains the x coordinates of the first two points.
This can, for example, be used as a target of a rank-1 array pointer.
However, the reference points(1:2)%coordinates(:) is not allowed. In
a user's mind, this would reference the xyz coordinates of the first
two points, and can be thought of as an "array of arrays". But in
fact, it can just as well be thought of as a rank-2 array of shape
(/3,2/). This is more than just a convenient convention. The memory
layout of the collection of real numbers (coordinates) referenced by
this array of arrays can be described by a regular strided array
section, so it is almost trivial for any existing F95 compiler to
implement the following nonstandard assignment of a rank-2 array
pointer to this "array of arrays" (the array points must have the
TARGET attribute):

   REAL, DIMENSION(:,:), POINTER :: selected_coordinates
   selected_coordinates => points(1:2)%coordinates(:)

Yet no compiler known to the author implements such an extension.

It should be obvious to the reader that this kind of functionality
would indeed be useful. For example, finding the centroid of the
whole collection of points would be performed with

   WRITE(*,*) "The centroid is", SUM(points%coordinates, DIM=2)/SIZE(points)

which requires no loops. Even more useful would be the ability to
pass the coordinates of the selected points to a procedure (note that
this procedure need not know that the coordinates came from an array
of derived type Point3D) as an actual argument associated with an
assumed-shape array dummy argument.

______________________________________
Solution
______________________________________

Delete "In a data-ref, there shall be no more than one part-ref with
nonzero rank" in C614 (105:12). Then add the following text and
constraint
_______________________
The rank of a data-ref is the sum of the ranks of the part-refs with
nonzero rank, if any; otherwise, the rank is zero.
...
Cxxx: The maximum rank of a data-ref is 7.
_______________________
and modify 106:14+ to say something like:
_______________________
The rank and shape of a nonzero rank part-ref are determined as follows.
If the part-ref has no section-subscript-list, the rank and shape are
those of part-name. Otherwise, the rank is the number of subscript
triplets and vector subscripts in section-subscript-list, and the
shape is the rank-1 array whose i-th element is the number of integer
values in the sequence indicated by the i-th subscript triplet or
vector subscript. If any of these sequences is empty, the
corresponding element in the shape is zero.

In an array-section, the rank of the array is the sum of the ranks of
the nonzero rank part-refs. The shape of the array is the rank-1
array obtained by concatenating the shapes of the nonzero rank
part-refs, in backward order, i.e., starting from the last one. If
the shape has an element with the value of zero, the array section
has size zero.
_______________________

There are some other edits that will be needed, mostly in Section
6.1.2.

______________________________________
Problems and Alternatives
______________________________________

Since Fortran only guarantees that arrays of rank up to 7 will be
supported by a conforming processor, a reference with a total rank of
more than 7, as in

   level1(:,:,:)%level2(:,:,:)%level3(:,:)

will need to be either prohibited or left "processor-dependent". I
believe this is a minor issue.

Another problem is that the Fortran order of specifying components,
structure%component, as opposed to the alternative
component%structure, is the opposite of the order of concatenation of
the shapes of the nonzero rank references. For example, the reference

   level1(1:4,1:5,1:6)%level2(1:2,1:3)%level3(1:1)

represents an array section of shape (/1,2,3,4,5,6/), and not
(/4,5,6,2,3,1/) as might be thought at first. Again, I believe this
to be a mere "steep learning curve" and not a sufficient reason to
deny very useful functionality to Fortran programmers.
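
The backward concatenation rule can be checked with a small
standard-conforming sketch. It is only an illustration, not part of
the proposed edits: the names shape1, shape2, shape3, combined and
emulated are hypothetical, and the test values are made up. The first
part builds the shape of the three-level reference above by
concatenating the part-ref shapes starting from the last one. The
second part emulates points(1:2)%coordinates(:) with a loop, to show
that the subscript of the last part-ref becomes the first dimension.

```fortran
program backward_shape_sketch
  implicit none

  type :: point3d
    real :: coordinates(3), data(2)
  end type point3d

  ! Shapes of the three part-refs in
  !   level1(1:4,1:5,1:6)%level2(1:2,1:3)%level3(1:1)
  ! (hypothetical names, used only for this illustration)
  integer, parameter :: shape1(3) = (/ 4, 5, 6 /)
  integer, parameter :: shape2(2) = (/ 2, 3 /)
  integer, parameter :: shape3(1) = (/ 1 /)
  ! Proposed rule: concatenate starting from the last part-ref.
  integer, parameter :: combined(6) = (/ shape3, shape2, shape1 /)

  type(point3d) :: points(10)
  real :: emulated(3, 2)  ! stands in for points(1:2)%coordinates(:)
  integer :: i, j

  ! Made-up test values: 10 * point index + coordinate index.
  do j = 1, size(points)
    points(j)%data = 0.0
    do i = 1, 3
      points(j)%coordinates(i) = real(10*j + i)
    end do
  end do

  ! Standard-conforming emulation: one loop over the selected points.
  ! The coordinate index is the first (fastest varying) dimension.
  do j = 1, 2
    emulated(:, j) = points(j)%coordinates
  end do

  print '(a, 6(1x, i0))', 'combined shape:', combined
  print '(a, 2(1x, i0))', 'emulated shape:', shape(emulated)
  print '(a, 2(1x, f5.1))', 'x coordinates: ', emulated(1, :)
  print '(a, 3(1x, f5.1))', 'point 2:       ', emulated(:, 2)
end program backward_shape_sketch
```

The program reports the combined shape 1 2 3 4 5 6 and the emulated
shape 3 2. With the proposed feature, the loop that fills emulated
would reduce to a single array reference.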
______________________________________
Edits
______________________________________

Will be written in a revision after some feedback is received.

______________________________________
Extended Example
______________________________________

Here is an illustration of the expressivity and power of the proposed
feature:

! A type hierarchy of meteorological data:
TYPE :: Hourly_Record
   ! Three temperature readings (water, air, soil)
   REAL (KIND=r_wp) :: temperature (3) = 0.0
   LOGICAL (KIND=l_byte) :: sunny = .TRUE.
END TYPE
TYPE :: Daily_Record
   TYPE (Hourly_Record), DIMENSION (24) :: hourly_records
   INTEGER (KIND=i_sp) :: sunrise = 7, sunset = 18
END TYPE
TYPE :: Weekly_Record
   TYPE (Daily_Record), DIMENSION (7) :: daily_records
   REAL (KIND=r_sp) :: forecast_success (5)
END TYPE

! Weather data over a grid of observation points:
INTEGER, PARAMETER :: n_x = 100, n_y = 50
TYPE (Weekly_Record), DIMENSION (n_x, n_y), TARGET :: weekly_records
...
! Assign values to the weather data, do calculations, etc...

! Select the second temperature reading on Mondays and Wednesdays
! at 9:00 and 15:00 hours at grid point (3,1):
WRITE (*,*) "The selected temperatures are:", &
   weekly_records(3,1)%daily_records(1:3:2)%hourly_records(9:15:6)%temperature(2)

! EOF
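
For comparison, the selection in the WRITE statement above can be
emulated in standard-conforming Fortran with explicit loops. The
sketch below is only an illustration under stated assumptions: the
kind parameters r_wp, r_sp, i_sp and l_byte are not defined in this
paper and are given assumed values here, the grid is shrunk to 4 x 2
to keep the program small, and the test values and the names
selected, d, h, i and j are hypothetical.

```fortran
program weather_selection_sketch
  implicit none

  ! Assumed kind parameters; the paper does not define them.
  integer, parameter :: r_wp = kind(1.0d0)
  integer, parameter :: r_sp = kind(1.0)
  integer, parameter :: i_sp = kind(1)
  integer, parameter :: l_byte = kind(.true.)

  type :: hourly_record
    real (kind=r_wp) :: temperature(3) = 0.0
    logical (kind=l_byte) :: sunny = .true.
  end type
  type :: daily_record
    type (hourly_record), dimension(24) :: hourly_records
    integer (kind=i_sp) :: sunrise = 7, sunset = 18
  end type
  type :: weekly_record
    type (daily_record), dimension(7) :: daily_records
    real (kind=r_sp) :: forecast_success(5)
  end type

  ! Assumed small grid (the example in the text uses 100 x 50).
  integer, parameter :: n_x = 4, n_y = 2
  type (weekly_record), dimension(n_x, n_y), target :: weekly_records

  ! Under the proposed rule the result has shape (/2,2/):
  ! hours first (last part-ref), then days.
  real (kind=r_wp) :: selected(2, 2)
  integer :: d, h, i, j

  ! Made-up test values: 100 * day + hour for the second reading.
  do d = 1, 7
    do h = 1, 24
      weekly_records(3,1)%daily_records(d)%hourly_records(h)%temperature(2) = &
        real(100*d + h, kind=r_wp)
    end do
  end do

  j = 0
  do d = 1, 3, 2        ! days 1 and 3, as in daily_records(1:3:2)
    j = j + 1
    i = 0
    do h = 9, 15, 6     ! hours 9 and 15, as in hourly_records(9:15:6)
      i = i + 1
      selected(i, j) = &
        weekly_records(3,1)%daily_records(d)%hourly_records(h)%temperature(2)
    end do
  end do

  write (*,*) "The selected temperatures are:", selected
end program weather_selection_sketch
```

With these test values the four selected readings are 109, 115, 309
and 315 in array element order. The two nested loops and the
temporary array are exactly what the single proposed reference would
replace.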