To: J3 J3/19-254 From: Rich Bleikamp Subject: Edits for ISO_FORTRAN_STRINGS US03 Date: 2019-October-16 Reference: 18-007r1, 19-160, 19-196r3, 19-245r1 See 19-245r1 specs and syntax. --------------------- Edits for 18-007r1: [xiii] In the bullet item "Intrinsic procedures and modules", add "The new intrinsic SPLIT parses a string into tokens." in table 16.1 (list of intrinsics) [337:?] after the entry for SPACING, add a new entry "SPLIT (STRING, SET, TOKENS S Parse a string into tokens. [, SEPARATOR, BACK]) or (STRING, SET, FIRST, LAST [, SEPARATOR, BACK]) or (STRING, SET, POS [, SEPARATOR, BACK]) " [418:18+] add a new entry for SPLIT after SPACING "16.9.180+ SPLIT (STRING, SET, TOKENS [, SEPARATOR, BACK]) or SPLIT (STRING, SET, FIRST, LAST [, SEPARATOR, BACK]) or SPLIT (STRING, SET, POS [, SEPARATOR, BACK]) Description. Parse a string into tokens. Class: Subroutine. Arguments. STRING shall be a scalar of type CHARACTER. It is an INTENT(IN) argument. The SPLIT intrinsic will scan STRING, identifying tokens as described in the description of SET. SET shall be a scalar of type CHARACTER of the same kind as STRING. It is an INTENT(IN) argument. Every character in SET is a token delimiter. Two consecutive token delimiters in STRING, or a token delimiter in the first or last character in STRING, indicate a null (zero-length) token. A sequence of zero or more characters in STRING delimited by any token delimiter, or the beginning or end of STRING, comprise a token. TOKENS shall be of type CHARACTER of the same kind as STRING. It is an INTENT(OUT) argument. It shall be an allocatable array with deferred length. When TOKENS is present, every token found in STRING is assigned, in the order found, to the next element of TOKENS. Every element of TOKENS has the length of the longest token found, and SIZE(TOKENS) is the number of tokens found. The elements of TOKENS are blank padded as needed. If the blank character was not one of the token delimiters in SET, then the form of SPLIT that returns FIRST and LAST may be more useful. SEPARATOR (optional) shall be of type CHARACTER of the same kind as STRING. It is an INTENT(OUT) argument. When TOKENS, or FIRST and LAST, are present, SEPARATOR, if present, shall be an allocatable array with deferred length, and the token delimiter which separated TOKENS(i) from TOKENS(i+1), if any, is returned in SEPARATOR(i). Note that when BACK is present and TRUE, SEPARATOR(i) is the token delimiter character, if any, that appeared immediately before the token stored in TOKENS(i). When POS and SEPARATOR are present, SEPARATOR shall be a scalar, and is assigned the value of a token delimiter as described in 1) and 2) below when a token delimiter was found, and is not assigned a value when no token delimiter was found. 1) When BACK is not present, or present with the value FALSE, the token delimiter that appeared immediately after the identified token, if any, is assigned to SEPARATOR. 2) When BACK is present with the value TRUE, the token delimiter that appeared before the identified token, if any, is assigned to SEPARATOR. FIRST shall be an allocatable array of type integer It is an INTENT(OUT) argument. LAST shall be an allocatable array of type integer It is an INTENT(OUT) argument. When FIRST and LAST are present, they are assigned the offsets in STRING such that STRING(FIRST(i):LAST(i))is the ith token found. When the ith token is a zero length token, LAST(i) will be FIRST(i)-1. POS shall be a scalar of type INTEGER. It is an INTENT(INOUT) argument. When POS is present, and BACK is not present or present with the value FALSE, the token beginning at STRING(POS+1:POS+1) is identified, and if that token was not the last token in STRING, POS is set to the position of the separator character that appeared immediately after the identified token. When the identified token was the last token in STRING, POS is set to LEN(STRING)+1. When SPLIT is invoked with POS LEN(STRING), the last token in STRING has been found. When POS is present, and BACK is present with the value TRUE, the token ending at STRING(POS-1:POS-1) is identified, and if that token was not the first token in STRING, POS is set to the position of the separator character that appeared immediately before the identified token. When the identified token was the first token in STRING, POS is set to 0. When SPLIT is invoked with POS The effect of the procedure is to divide STRING into tokens at every occurrence of a character that is in SET, and assign those tokens, in the order found, to TOKENS. Every element of TOKENS has the length of the longest token found, and SIZE(TOKENS) is the number of tokens found. The STRING is searched in the forward direction unless BACK is present with the value true, in which case the search is in the backward direction. If the argument SEPARATOR is present, the character which separated TOKENS(i) from TOKENS(i+1)is returned in SEPARATOR(i). Note that when BACK is present and TRUE, SEPARATOR(i) is the separator character that appeared immediately before the token stored in TOKENS(i). If no character from SET is found or SET is of zero length, the whole STRING is returned in the first element of TOKENS, and SEPARATOR (if present) is returned as zero length. Otherwise, SIZE(SEPARATOR) will be SIZE(TOKENS)-1. When FIRST and LAST are present, they are assigned the offsets in STRING such that STRING(FIRST(i):LAST(i))is the ith token found. SEPARATOR, if present, is assigned values as described in case (i) above. When the ith token is a zero length token, LAST(i) will be FIRST(i)-1. When POS is present, the token beginning at STRING(POS+1:POS+1) is identified, and if that token was not the last token in STRING, POS is set to the position of the separator character that appeared immediately after the identified token. When the identified token was the last token in STRING, POS is set to LEN(STRING)+1. When SPLIT is invoked with POS LEN(STRING), the last token in STRING has been found. SEPARATOR is assigned values as described in case (i) above. Edge cases: 1) if the first character in STRING is a separator character, or There are two separator characters, a null (zero length) token is found. 2) if a blank (space) character appears in SET, consecutive spaces in STRING will result in a null token. 3) If LEN(STRING) is zero, TOKENS, FIRST and LAST, if present, will have zero elements. <> Example 1: CHARACTER( LEN= :), ALLOCATABLE :: STRING CHARACTER( LEN= :), ALLOCATABLE, DIMENSION(:) :: TOKENS CHARACTER( LEN= 2) :: SET = ',;' STRING = 'first,second,third' CALL SPLIT( STRING, TOKENS, SET) PRINT *, STRING, TOKENS, SET will print first,second,third first second third ,; Example 2: CHARACTER( LEN= :), ALLOCATABLE :: STRING CHARACTER( LEN= 2) :: SET = ',;' INTEGER, DIMENSION(:):: FIRST, LAST STRING = 'first,second,,forth' CALL SPLIT( STRING, SET, FIRST, LAST) PRINT *, FIRST PRINT *, LAST prints 1 7 14 15 5 12 13 19 Example 3: CHARACTER( LEN= :), ALLOCATABLE :: INPUT CHARACTER( LEN= 2) :: SET = ', ' INTEGER P INPUT = "one,last example" P = 0 DO IF( P > LEN( INPUT) ) EXIT ISTART = P + 1 CALL SPLIT(INPUT, SET, P) IEND = P - 1 PRINT *, INPUT( ISTART: IEND) END DO prints one last example