To: J3 J3/13-345 From: Bill Long, John Reid Subject: Comments on collectives Date: 2013 October 01 References: N1983, N1989, 13-318 Overview -------- This paper contains a portion of the responses to the comments in N1989, the results of the ballot on the TS 18508 draft N1983. The following identifiers are used to indicate the source of comments: Reinhold : Reinhold Bader Bill : Bill Long Tobias : Tobias Burnus Nick : Nick Maclaren Daniel : Daniel Chen Dan : Dan Nagel Malcolm : Malcolm Cohen John : John Reid Steve : Steve Lionel Van : Van Snyder along with {Ed} for the document editor. Discussion - Responses with edits --------------------------------- {John 3}: At [15:22] After "the beginning" add "to the end". Response: Agreed. Edit included below. {Nick 13}: 7.3 p15:24+. Since we have not agreed whether and how collectives should be consistent with atomics, a Note along the following lines should be provided: Collective subroutines are ordered only by the usual execution sequence (2.3.5) and segment ordering (8.5.2) rules and behave as if they were implemented using private coarrays to transfer the data. In particular, using them together with atomic subroutines in unordered segments is likely to produce effects that appear to be inconsistent. Response. We agree that it is desirable to add a note explaining that collectives involve the transfer data between images and synchronization. It would be a good idea also to mention the interaction between atomics and collectives in a note. Edits are provided. {Bill I8}: At [15:32-34] The first sentence of paragraph 5 in 7.3 says that a failed execution of a collective is the same as SYNC MEMORY which has the effect of making it an image control statement. But a successful execution is not an image control statement. This is not consistent. The end of the sentence "and the effect...SYNC MEMORY statement" should be deleted. Response: Agreed. (Edit subsumed by {Nick 14}.) {Nick 14}: 7.3 p15:32-39. This should state that all INTENT(INOUT) and INTENT(OUT) arguments become undefined if an error occurs. Response: The STAT and ERRMSG arguments should not become undefined - their purpose is to return information in the event of an error. That leaves the SOURCE and RESULT arguments. An edit is provided. {Bill I9}: At [15:38-39] The last sentence of the same paragraph says "..an image had failed..". It is unclear what images count here. Any one from the initial team, or just ones in the current team (and hence involved with the call)? Probably intend the current team, but it should be clarified. Response: Agreed. An edit is provided. {Nick 15}: 7.3 p15:38. "If an image had failed" should be changed to "If a processor detects that an image has failed". Response. Agreed. An edit is provided. {John 2.3}: The last para. of 15, says that if an error condition occurs and STAT is present, the effect is as if SYNC MEMORY were executed. This seems wrong because RESULT has intent out so one expects it to become undefined. Do we expect SOURCE to be used by the implementation for workspace when RESULT is absent? If so, an error condition should cause SOURCE to become undefined. Response: See {Bill I8} edit for the SYNC MEMORY issue. We do intend that the implementation be allowed to use SOURCE as scratch space if RESULT is not present. An edit is provided. {John 1.1b}: The collective procedures are not massively parallel. They should surely fail if any of the images of the team have failed. However, the last paragraph of page 15 says "If an image has failed, but no other error condition occurred, the argument is assigned the value STAT_FAILED_IMAGE.". If this behavior is retained, the effect of failed images on the result needs to be described. Response: In the case of failed images, the answer to the collective has to be considered undefined. (Edit needed.) {Nick 15}: 7.4.6 p18:2. "shall be a coarray" is almost certainly erroneous and should be removed. If it is a coarray, then all images can access it directly. And the usual requirement is to broadcast a local array, anyway. Response: The argument that all images could access a coarray anyway misses the purpose of CO_BROADCAST, which is to provide a high-performance broadcast operation. Whether require a lower performance version that accommodates non-coarray SOURCE arguments is at topic for subgroup discussion. Edits deferred. {Nick 16}: 7.4.6 p18:2-4. All collectives except CO_BROADCAST specify that, if SOURCE is an array, all images must have the same type parameters and shape, but there is no such wording for scalars. This is a problem for character lengths, but may also be one for polymorphic procedures. Either all images need to specify that SOURCE is the same, or it needs to state what happens if that is not the case; the former would be a lot easier for implementors and to specify. Specific issues that must be fixed are described below [B]. Response: SOURCE should be the same. An edit is provided. {Nick 17}: 7.4.6 p18:2-4, 7.4.7 p18:16-20 and 7.4.8 p1-5. If polymorphic procedures allow the same dummy argument to be different types on different images, CO_BROADCAST, CO_MIN and CO_MAX need to forbid mixing numeric and character. See [B]. Response: Modifications similar to the Proposed edits for {Nick 16} should apply to all of the collectives. Require type to also be the same. Edits are provided. {Nick 18}: The specification is [of the collectives] messy and restrictive, and should be changed. For example, it is not possible to reduce INTENT(IN) variables, which is a fairly common requirement. While copying them is possible, it involved extra data accesses and is potentially inefficient. A far better solution would be either two procedures or a generic interface, of the following forms (CO_REDUCE is used as an example): CO_REDUCE (OBJECT, OPERATOR [, STAT, ERRMSG]) ! OBJECT would be INTENT(INOUT) CO_REDUCE (RESULT, SOURCE, OPERATOR [, RESULT_IMAGE, STAT, ERRMSG]) ! RESULT would be INTENT(INOUT) and SOURCE would be INTENT(IN) Response. Although SOURCE has INTENT(INOUT) for all the collectives, it is not modified if RESULT is present. A much simpler solution to the problem presented here is to avoid specifying intent for SOURCE. Edits are provided. {Van 26, 27, 28, 29} In collectives is the image index an index in the current team or the initial team? Response: The current team. Edits are provided. {Van 30}: At [19:29] If SOURCE is scalar, is it required to have the same type parameter values on all images? Response: Yes. Covered by edits for {Nick 16,17}. {Van 31}: At [19:32] "parameters" -> "parameter values". Repsonse: Change is not necessary because the previous edit deleted this text. {Bill I10}: At [19:37] In the description of the OPERATOR argument to CO_REDUCE we say that the operation is commutative, but fail to say is is associative. That should be added. Response: Agreed. Need to specify "mathematically" associative. An edit is provided. {Nick 19}: 7.4.9 p19:37-39. The last two sentences should be replaced by "OPERATOR shall implement a mathematically associative and commutative operation.", as I remember we agreed. Response: See {Bill I10}. {Nick 20}: 7.4.9 p19:35-39. This must specify that OPERATOR must implement the same mathematical function on all images, but I have no idea how to say that. Indeed, I am not completely sure that there is no way of providing different procedures for OPERATOR, because it involves areas of Fortran I am not familiar with. Response. We agree that the same function should be invoked on all the images. Discussion - Responses only --------------------------- {Dan 6}: In the last paragraph [16:3-5] change wording to give an explanatory message when an informative value is passed to the stat= specifier. Response: This is a subroutine call, not a statement, so there is a STAT argument, not a stat= specifier. The current wording specifies the same semantics as for the similar arguments to the EXECUTE_COMMAND_LINE intrinsic. {John 2.2}: For CO_MAX, CO_MIN, and CO_SUM, there is a corresponding transformational function so it is easy to write code for the common case where the max, min, or sum of all the elements of the arrays on all the images is wanted. We need to add REDUCE to play the same role for CO_REDUCE. Response: A REDUCE intrinsic should be considered as a feature for F201x rather than as an addition to the TS. The routine does not have any parallel programming association. See paper 13-318. {Nick 21}: 7.4.9 p20:8-11. This must state that the procedure may be called on any image an indeterminate number of times. Response. We do not see why this needs to be said. Since the function is pure, the number of times it is called cannot affect the executing program. Edits to N1983 -------------- [15:22] {John 3} After "the beginning" add "to the end". [15:33-34] {Nick 14} Change "and the effect...SYNC MEMORY statement" to ", the SOURCE argument becomes undefined, and the RESULT argument becomes undefined if it is present". [15:38] {Bill I9} After "If an image" add "of the current team". [15:38] {Nick 15} Change "had failed" to "had been detected as failed". [15:39] [Missing edit for {John 1.1b}.] [15:39+] {John 2.3} Add a new Note "NOTE 7.0 SOURCE becomes undefined in the event of an error condition for a collective with RESULT present because it is intended that implementations be free to use SOURCE as scratch space." [16:5+] {Nick 13} Add notes: "NOTE 7.0a Each collective involves the transfer of data between images and synchronization." "NOTE 7.0b There is no synchronization at the beginning and end of an invocation of a collective procedure, which allows overlap with other actions. Note that the rules of Fortran do not allow the value of an associated argument such as SOURCE to be changed except via the argument. This includes action taken by another image that has not started its execution of the collective or has finished it." [18:2] {Nick 16, 17} Before the second "SOURCE" add "It shall have the same type and type parameters on all images of the current team. If it is an array, it shall have the same shape on all images of the current team." [18:2] {Nick 18} Delete "It is an INTENT(INOUT) argument." [18:5] {Van 25} Change "an image index" to "the image index of an image of the current team". [18:16] {Nick 16, 17} After "character." add "It shall have the same type and type parameters on all images of the current team." [18:16] {Nick 18} Delete "It is an INTENT(INOUT) argument." [18:18] {Nick 16, 17} Delete "and type paramaters". [18:25] {Van 26} Change "an image index" to "the image index of an image of the current team". [19:1] {Nick 16, 17} After "character." add "It shall have the same type and type parameters on all images of the current team." [19:1] {Nick 18} Delete "It is an INTENT(INOUT) argument." [19:3] {Nick 16, 17} Delete "and type paramaters". [19:10] {Van 27} Change "an image index" to "the image index of an image of the current team". [19:29] {Nick 16, 17} After "polymorphic." add "It shall have the same type and type parameters on all images of the current team." [19:29] {Nick 18} Delete "is an INTENT(INOUT) argument. It " [19:31-32] {Nick 16, 17} Delete "and type paramaters". [19:36] {Nick 20} Before "Its" add "It shall be the same function on all images of the current team." [19:37] {Bill I10} Change "mathematically commutative" to "mathematically commutative and associative". [19:38-39] {Bill I10} Delete "If the operation implemented ... processor dependent." [19:44] {Van 28} Change "an image index" to "the image index of an image of the current team". [20:21] {Nick 16, 17} After "argument." add "It shall have the same type and type parameters on all images of the current team." [20:21] {Nick 18} Delete "It is an INTENT(INOUT) argument." [20:31] {Van 29} Change "an image index" to "the image index of an image of the current team".