To: J3 J3/21-115r1 From: Richard Bleikamp & JOR Subject: SPLIT intrinsic rework Date: 2021-February-23 Reference: 21-007, 21-109 This r1 paper deletes two sentences from the edits (present in the r0 paper), namely, in the two paragraphs that begin with "If BACK is", the last sentence of those paragraphs has been deleted. Milan Curcic proposed separating the 3rd form of the new SPLIT intinisic from the other forms, and either just keeping the 2 forms of SPLIT that tokenize the whole string all at once, or making the 3rd form of SPLIT, the serialized form, a new intrinsic. JOR is recommending option (a) in the straw vote below, and will move both batches of edits unless the straw vote in strongly opposed. JOR also notes that the names "TOKENIZE" and "SPLIT" can be changed to something more appropriate, but JOR has no objections to these names. STRAW VOTE: should the 3rd serialized form of SPLIT be retained? a: keep the 3rd serialized form as well as SCAN b: keep the 3rd serialized form as well and deprecate SCAN c: delete the 3rd serialized form of SPLIT d: undecided The edits below are against 21-007. They are segregated into two batches. The first batch renames SPLIT to TOKENIZE (or something similar if desired), and removes the 3rd serialized form. The second batch of edits adds the 3rd form of SPLIT back to the 007, named SPLIT (or something similar). Depending on the straw vote above, we can move/approve the first batch of edits only, effectively deleting the 3rd form. We also simplified the limitations for the value of POS (3rd form), since return values from SPLIT aren't affected by the simplification. Note that Bill's suggested edits for the original SPLIT, to prohibit coarrays and coindexed objects as arguments, is included (cut and pasted verbatim from 21-109, except substituting TOKENIZE for SPLIT). Bill's changes only affected the first two forms of the old SPLIT intrinsic, not the 3rd serialized form. ---------- Edits to 21-007, BATCH 1: [xiii, 2] under Intrinsic procedures and modules: Change "SPLIT" to "TOKENIZE" in the list of new intrinsics. [348, before line 1] In table 16.1, "Standard generic intrinsic procedure summary" Delete the 3rd form of SPLIT, and the word "or" at the end of the preceding line AND Rename SPLIT as TOKENIZE, and place the entry for TOKENIZE after the entry for TINY. Renumber as appropriate (16.9.xxx) [432:23] Delete the 3rd form of SPLIT (renamed TOKENSIZE above), and delete the word "or" from the preceding line (line number may vary after reordering in immediately precedding edit). [432:33-34] In 16.9.194 SPLIT (renamed TOKENIZE above), in the description of the TOKENS argument change the second sentence: "It is an INTENT(OUT) argument." to "It is an INTENT(OUT) argument. The corresponding actual argument shall not be a coarray or a coindexed object." [433:4-5] In 16.9.194 SPLIT (renamed TOKENIZE above), in the description of the SEPARATOR argument change the second sentence: "It is an INTENT(OUT) argument." to "It is an INTENT(OUT) argument. The corresponding actual argument shall not be a coarray or a coindexed object." [433:9] In 16.9.194 SPLIT (renamed TOKENIZE above), in the description of the FIRST argument change the second sentence: "It is an INTENT(OUT) argument." to "It is an INTENT(OUT) argument. The corresponding actual argument shall not be a coarray or a coindexed object." [433:15] In 16.9.194 SPLIT (renamed TOKENIZE above), in the description of the LAST argument change the second sentence: "It is an INTENT(OUT) argument." to "It is an INTENT(OUT) argument. The corresponding actual argument shall not be a coarray or a coindexed object." [433:20-36] Delete all the descriptions for POS and BACK. [434:3-19] Delete the example thats uses a DO loop. ---------- end of BATCH 1 edits Edits to 21-007, BATCH 2: [xiii, 2] under Intrinsic procedures and modules: Add this sentence after the description of the new TOKENIZE. "The intrinsic subroutine SPLIT parses a string into tokens, one at time." [348, 1-] in table 16.1, "Standard generic intrinsic procedure summary" Add an entry for SPLIT after the entry for SPACING "SPLIT (STRING, SET, POS [, BACK])" [432:22+] After the entry for SPACING, add a new entry (note, all the added text below is identical to what was in 21-007, for SPLIT, but just those parts that apply to the 3rd/serialized form, EXCEPT that the Description has been reworded) "16.9.xxx SPLIT (STRING, SET, POS [, BACK]) Description. Parse a string into tokens, one at a time. Class. Subroutine. Arguments. STRING shall be a scalar of type character. It is an INTENT (IN) argument. SET shall be a scalar of type character with the same kind type parameter as STRING. It is an INTENT (IN) argument. Each character in SET is a token delimiter. A sequence of zero or more characters in STRING delimited by any token delimiter, or the beginning or end of STRING, comprise a token. Thus, two consecutive token delimiters in STRING, or a token delimiter in the first or last character of STRING, indicate a token with zero length. POS shall be an integer scalar. It is an INTENT (INOUT) argument. The value of POS shall be in the range 0 < POS <= LEN (STRING) + 1. If BACK is absent or is present with the value false, POS is assigned the position of the leftmost token delimiter in STRING whose position is greater than POS, or if there is no such character, it is assigned a value one greater than the length of STRING. If BACK is present with the value true, POS is assigned the position of the rightmost token delimiter in STRING whose position is less than POS, or if there is no such character, it is assigned the value zero. If SPLIT is invoked with a value for POS in the range 1 <= POS <= LEN (STRING), and the value of STRING (POS:POS) is not equal to any character in SET, the token identified by SPLIT will not comprise a complete token as described in the description of the SET argument, but rather a partial token. BACK (optional) shall be a logical scalar. It is an INTENT (IN) argument. Example. Execution of CHARACTER (LEN=:), ALLOCATABLE :: INPUT CHARACTER (LEN=2) :: SET = ', ' INTEGER P INPUT = "one,last example" P = 0 DO IF (P > LEN (INPUT)) EXIT ISTART = P + 1 CALL SPLIT (INPUT, SET, P) IEND = P - 1 PRINT '(T7,A)', INPUT (ISTART:IEND) END DO will print one last example" ---------- end of BATCH 2 edits