To: J3                                                     J3/18-279r1
From: Tom Clune & Steve Lionel
Subject: Extending usability of deferred-length strings
Date: 2018-October-18

Reference: 18-130

1. Introduction
---------------

Deferred-length strings have proven to be a very useful and popular
feature. However, several aspects of the language still require
either fixed-length strings or deferred-length strings that have
already been allocated with sufficient length. These aspects should
be extended to work with deferred-length strings directly.

At meeting 215, J3 approved this item for moving forward.

2. Use cases
------------

2.1 IOMSG and ERRMSG

Many Fortran statements optionally return status information in the
form of strings (IOMSG and ERRMSG). Use of these requires that the
user provide a string of "sufficient" length, and there is no
standard mechanism to determine the necessary length.

It would be preferable to allow these features to assign
deferred-length strings as if by intrinsic assignment. Unallocated
strings would be allocated and filled. Allocated strings of
different length would be reallocated and then filled.

2.2 Intrinsics that return strings

Several intrinsic procedures (e.g., GET_COMMAND,
GET_ENVIRONMENT_VARIABLE, ...) also have INTENT(OUT) string
arguments. Unlike the features above, some (but not all) of these
have a mechanism to determine the necessary length. A typical use is
to query the length first and then allocate:

   CALL GET_ENVIRONMENT_VARIABLE('PATH', LENGTH=n)
   ALLOCATE(CHARACTER(LEN=n) :: var)
   CALL GET_ENVIRONMENT_VARIABLE('PATH', VALUE=var)

A natural simplification would be to allow allocatable,
deferred-length strings to be passed to the intrinsic, which would
then perform the necessary allocation internally if the value length
is different from the current length (or if the variable is
unallocated). So, users could do:

   CHARACTER(LEN=:), ALLOCATABLE :: var, msg
   CALL GET_ENVIRONMENT_VARIABLE('PATH', VALUE=var, ERRMSG=msg)

If there is no error, then var would be assigned the value of the
environment variable as if by intrinsic assignment.
(msg, being deferred-length as well, is left unchanged if there is
no error.)

2.3 Writes to internal files

As with IOMSG and ERRMSG, there is no general mechanism for a user
to determine how long a string must be for a write to an internal
file. A typical approach is to use a hopefully-large-enough
fixed-length string (or an allocated large-enough deferred-length
string) and then trim:

   ALLOCATE (CHARACTER(LEN=MAX_LEN) :: buffer)
   WRITE(buffer, fmt) a, b, c
   buffer = TRIM(buffer)

Alternatively, IOSTAT can be used to iteratively increase the buffer
size to provide a more robust mechanism:

   INTEGER :: n, stat
   n = MIN_LENGTH
   stat = -1
   DO WHILE (stat /= 0)
      ALLOCATE (CHARACTER(LEN=n) :: buffer)
      WRITE(buffer, fmt, IOSTAT=stat) a, b, c
      IF (stat /= 0) THEN
         DEALLOCATE(buffer)
         n = 2*n   ! grow the buffer and retry
      END IF
   END DO
   buffer = TRIM(buffer)

If the buffer could instead be a deferred-length string, then the
implementation could be responsible for ensuring that a string of
sufficient length is allocated. The implementation could either
start with a large allocation or use a strategy of progressively
allocating larger strings until one is sufficient.

If the io-unit is an array, however, there is no reasonable way to
allow for reallocation, as each element (record) might need a
different length.

3. Proposals
------------

It should be noted that, if this feature is added to the standard, a
user can "opt out" by specifying 'varname(:)', thus disabling the
reallocation feature.

3a. IOMSG and ERRMSG

The current texts for IOMSG and ERRMSG (for statements) say that, if
a value is assigned, it "is assigned an explanatory message,
truncated or padded according to the rules of intrinsic assignment."
The rules of intrinsic assignment specify that, for an allocatable
deferred-length variable, if the length type parameters differ the
variable is deallocated and then reallocated with the length of the
value being assigned. The change here would be to replace "truncated
or padded according to the rules" with "as if by", and to note this
change in the Introduction.

3b. Intrinsic procedures

Character output arguments for intrinsic subroutines should have
wording added to indicate that assignment of the output value is as
if by intrinsic assignment. A note should be added to the
Introduction calling this out. The list of such procedures and
relevant arguments is:

   DATE_AND_TIME (DATE, TIME, ZONE)
   EXECUTE_COMMAND_LINE (CMDMSG)
   GET_COMMAND (COMMAND, ERRMSG)
   GET_COMMAND_ARGUMENT (VALUE, ERRMSG)
   GET_ENVIRONMENT_VARIABLE (VALUE, ERRMSG)
   MOVE_ALLOC (ERRMSG)

3c. Internal WRITE

Note: Discussion at M217 raised some concerns about the complexity
and efficiency of implementing this feature, though some members
noted that this is an area where the language currently gives the
programmer no help in implementing it on their own. It was decided
to say that J3 recommends this feature be added, but that if WG5
rejects it, the features described in the previous sections should
still be implemented.

JOR recommends that if the io-unit in an internal WRITE is an
unallocated deferred-length character allocatable scalar variable,
the WRITE behaves as if the record were assigned to the variable by
intrinsic assignment. The requirement that the variable be
unallocated is to provide consistency with the array case described
in the next paragraph. If the variable is allocated, it behaves as
it does in F2008, using the current length.

JOR further recommends that the standard disallow the case of the
io-unit being an unallocated deferred-length character array, as
there is no support for array elements having different lengths. If
the array is allocated, then it works the way it does in F2008, with
the current length being used.
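
The scalar cases in 3c can be illustrated with a minimal sketch that
runs on current processors. The first part shows the existing F2008
behavior for an allocated io-unit. The second part only emulates the
result proposed for an unallocated io-unit, by writing to a scratch
string and then using intrinsic assignment. The names scratch and
scratch_len are hypothetical, the scratch length is assumed to be
large enough, and the use of TRIM is a simplification: it drops any
trailing blanks that the format itself might produce.

```fortran
program internal_write_lengths
   implicit none
   ! Hypothetical scratch size, assumed large enough for this demo.
   integer, parameter :: scratch_len = 64
   character(len=scratch_len) :: scratch
   character(len=:), allocatable :: buffer

   ! Allocated io-unit: existing behavior, the current length is kept
   ! and the record is blank-padded to that length.
   allocate (character(len=8) :: buffer)
   write (buffer, '(a,i0)') 'n=', 42
   if (len(buffer) /= 8) error stop 'length changed unexpectedly'
   if (buffer /= 'n=42') error stop 'unexpected record contents'
   print '(3a,i0)', '[', buffer, '] len=', len(buffer)

   ! Unallocated io-unit: not permitted today, so emulate the proposed
   ! outcome. Intrinsic assignment allocates buffer with the length
   ! of the right-hand side.
   deallocate (buffer)
   write (scratch, '(a,i0)') 'n=', 42
   buffer = trim(scratch)
   if (len(buffer) /= 4) error stop 'emulated length is wrong'
   print '(3a,i0)', '[', buffer, '] len=', len(buffer)
end program internal_write_lengths
```

Under the proposal, the second part would reduce to a single WRITE
with the unallocated buffer as the io-unit, and no scratch string or
length guess would be needed.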