To: J3                                                     J3/19-181
From: John Reid
Subject: Interpretation request on FORM TEAM and failed images
Date: 2019-July-17
Reference: 18-007r1

----------------------------------------------------------------------

NUMBER: F18/nnnn
TITLE: FORM TEAM and failed images
KEYWORDS: failed images, FORM TEAM
DEFECT TYPE: Interpretation
STATUS: J3 consideration in progress

QUESTION 1:

18-007r1, 11.6.9 p6 says
  "If execution of a FORM TEAM statement assigns the value
   STAT_FAILED_IMAGE to the stat-variable, the effect is the same as
   for the successful execution of FORM TEAM except for the value
   assigned to stat-variable."
How can this be the case? 11.6.11 says:
  "If a STAT= specifier appears in a sync-stat in an image control
   statement, the stat-variable is assigned the value zero if
   execution of the statement is successful."

ANSWER 1:

This was intended to refer to a successful execution for the team of
active images. An edit is provided.

EDIT 1:

In 11.6.9 FORM TEAM statement p6 add
  "on a team consisting of the active images of the team"
before "except" to make the sentence read
  "If execution of a FORM TEAM statement assigns the value
   STAT_FAILED_IMAGE to the stat-variable, the effect is the same as
   for the successful execution of FORM TEAM on a team consisting of
   the active images of the team except for the value assigned to
   stat-variable."

QUESTION 2:

If there are any failed images in a team that executes a FORM TEAM
statement, do these images belong to any of the new teams?

ANSWER 2:

No, provided the statement is successful or returns with its
stat-variable having the value STAT_FAILED_IMAGE. See edit 1 above.

QUESTION 3:

What happens if a new team was intended to have 16 images, one of
these images has failed, and an active image of the new team specified
16 as its NEW_INDEX value?

ANSWER 3:

This would be an error condition because the new team has 15 images.
If no image in the team has stopped and the FORM TEAM statement has a
STAT= specifier, the variable specified will be given a
processor-dependent value other than STAT_STOPPED_IMAGE and
STAT_FAILED_IMAGE. This follows from the rules in 11.6.11, p5.

QUESTION 4:

In the definition of the FAILED_IMAGES intrinsic function, 16.9.77 p5
states:
  "If the executing image has previously executed an image control
   statement whose STAT= specifier assigned the value
   STAT_FAILED_IMAGE from the intrinsic module ISO_FORTRAN_ENV, or
   referenced a collective subroutine whose STAT argument was set to
   STAT_FAILED_IMAGE, at least one image in the set of images
   participating in that image control statement or collective
   subroutine reference shall be known to have failed."
Should the list include the cases of a reference to the intrinsic
IMAGE_STATUS that returned STAT_FAILED_IMAGE, an atomic subroutine
whose STAT argument was set to STAT_FAILED_IMAGE, and an image
selector with a STAT= specifier that was set to STAT_FAILED_IMAGE?

ANSWER 4:

Yes. An edit is provided.

EDIT 4:

In 16.9.77 FAILED_IMAGES ([TEAM, KIND]) p5, change
  "or referenced a collective subroutine whose STAT argument was set
   to STAT_FAILED_IMAGE, at least one image in the set of images
   participating in that image control statement or collective
   subroutine reference"
to
  "referenced an atomic or a collective subroutine with a STAT
   argument that assigned the value STAT_FAILED_IMAGE, referenced an
   image selector with a STAT= specifier that assigned the value
   STAT_FAILED_IMAGE, or referenced the intrinsic IMAGE_STATUS with
   value STAT_FAILED_IMAGE, at least one image in the set of images
   involved in that execution or reference"
so that the sentence reads:
  "If the executing image has previously executed an image control
   statement whose STAT= specifier assigned the value
   STAT_FAILED_IMAGE from the intrinsic module ISO_FORTRAN_ENV,
   referenced an atomic or a collective subroutine with a STAT
   argument that assigned the value STAT_FAILED_IMAGE, referenced an
   image selector with a STAT= specifier that assigned the value
   STAT_FAILED_IMAGE, or referenced the intrinsic IMAGE_STATUS with
   value STAT_FAILED_IMAGE, at least one image in the set of images
   involved in that execution or reference shall be known to have
   failed."

QUESTION 5:

A collective subroutine is defined (3.143.2) as an
  "intrinsic subroutine that performs a calculation on a team of
   images without requiring synchronization",
and collective subroutines are invoked
  "on all active images of its current team in segments that are not
   ordered with respect to each other" (16.6 par. 1).
Is the guarantee provided by 16.6 par. 7:
  "If the STAT argument is present and ... the current team contained
   failed images, an error condition occurs and the value
   STAT_FAILED_IMAGE from the intrinsic module ISO_FORTRAN_ENV is
   assigned to the STAT argument"
too strong? It seems to imply that all images in the team will have
knowledge of a failed image if there is one.

ANSWER 5:

Yes.
It was not intended to require a synchronization of all the images
involved. Rather than "if the current team contained failed images",
the standard should say "if the current team is detected to contain
failed images". This is particularly important for CO_BROADCAST,
because the failure of one image does not prevent others receiving
the correct result, and each should be allowed to continue execution
as soon as it has the result.

EDIT 5:

In 16.6 Collective subroutines, para 7, sentence 2, change
  "if the current team contained failed images"
to
  "if the current team is detected to contain failed images"
so that the sentence becomes
  "Otherwise, if the current team is detected to contain failed
   images, an error condition occurs and the value STAT_FAILED_IMAGE
   from the intrinsic module ISO_FORTRAN_ENV is assigned to the STAT
   argument."
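
EXAMPLE (illustrative only, not part of the proposed edits):

The behaviour discussed in Question 5 can be sketched with a short
program. This is a minimal sketch, assuming a processor that supports
the Fortran 2018 failed-image facilities. The program name and the
variables payload and status are hypothetical and do not appear in
18-007r1. Under the wording proposed in EDIT 5, each image tests its
own STAT value, and only an image that has detected a failure would
see STAT_FAILED_IMAGE.

```fortran
program co_broadcast_stat_sketch
  ! Illustrative sketch only; all names here are hypothetical.
  use, intrinsic :: iso_fortran_env, only: stat_failed_image
  implicit none
  integer :: payload, status

  payload = 0
  if (this_image() == 1) payload = 42

  ! With STAT present, an error condition is reported in status
  ! rather than initiating error termination.
  call co_broadcast(payload, source_image=1, stat=status)

  if (status == 0) then
    ! No error condition on this image: payload holds the value
    ! broadcast from image 1.
    print '(a,i0,a,i0)', 'image ', this_image(), ' received ', payload
  else if (status == stat_failed_image) then
    ! This image detected a failed image in the current team.
    ! payload should not be relied on after an error condition.
    print '(a,i0,a)', 'image ', this_image(), ' detected a failed image'
  else
    error stop 'co_broadcast: unexpected error condition'
  end if
end program co_broadcast_stat_sketch
```

An image that takes the first branch can continue as soon as it has
the result, without waiting to learn whether any other image has
failed.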