To: J3 J3/15-178r1 From: Nick Maclaren Subject: Event ordering semantics Date: 2015 June 29 References: WG5/N1971, 13-352 and many others Discussion ---------- In the context of TS 18508, specific issues were raised as soon as events were introduced, but have never been addressed. Leaving some uses processor-dependent (i.e. unspecified) and others undefined is fine, but it is essential to agree what defined uses actually do, and exactly where the boundaries are with processor-dependent and undefined. Currently, none of that is agreed, let alone specified. N1971, 13-352 and several other documents contain examples of open questions. We do not necessarily need a mathematical consistency model, but we DO need to know that one could be produced, in theory. There are three key points: 1) Events look simple, but are not, because semaphores are not even known to be self-consistent, and 'obvious truths' in parallel semantics are very likely to be false. See Background for how the Java people got caught by that one, and how POSIX semaphores are a well-known trap. 2) I have been unable to find a consensus in WG5 as to the semantic intent, and individual people's answers have usually indicated that they often hold two incompatible views. The current wording of TS 18508 reflects that. I have proposed three possible solutions: 3) I cannot guarantee that any of these solutions are both consistent and implementable under all circumstances (see Background again for why that is unavoidable), but they are the best that I can do. 4) It will be seriously inefficient on large clusters, because of the requirement for EVENT POST to block, and EVENT_QUERY to synchronise, but that seems to be what people want and is needed for example A.3.3 to be defined. Proposed Solution One --------------------- This would make examples A.3.2 and A.3.3 both defined. [Fortran 2008 189:10+] Add a new paragraph after 8.5.2p2: "The execution of ordered segments is sequentially consistent, and the values returned from EVENT_QUERY are consistent with that ordering." [18:1+] Add a new paragraph to 7.3: "An EVENT POST statement completes when the event variable has been updated on the owning image, and does not depend on the execution of a matching EVENT WAIT statement." This means that EVENT POST will block until the acknowledgement of the update has been received. [18:2-4] Delete the paragraph (7.3p4). [18:4+] Delete NOTE 7.2. [18:21-24] Replace 7.4p3 by: "If an EVENT WAIT statement using an event variable is executed with a threshold of k when the event variable has a value of n, and k <= n, the segments preceding all of the EVENT POST statements that have updated that event variable will precede the segment following the EVENT WAIT statement. If k > n, the EVENT WAIT statement will wait until the value of the event variable reaches k. EVENT_QUERY is now an anomalous intrinsic procedure that performs logic (but not data) synchronisation, and I feel that it would be better in a new class. [18:25+] Delete NOTE 7.4 because it is now normative. [27:21+] Add a new paragraph to 8.4.15: "If the value returned by an EVENT_QUERY invocation is in position k of the sequence of values that the event variable takes during execution and the EVENT_QUERY invocation precedes an EVENT WAIT invocation on the variable, the segment preceding the EVENT POST statement that made change i, for any i <= k, will precede the segment following that EVENT WAIT statement." [27:21+] Add two new Notes to 8.4.15: "NOTE 8.x This means that EVENT_QUERY has a synchronization effect, but it does not mean that it synchronizes data updates." "NOTE 8.y In order to meet this requirement, implementations may need to ensure that EVENT POST blocks until it has received an acknowledgement that the event variable has been updated." Proposed Solution Two --------------------- This would make example A.3.2 defined, but A.3.3 would be processor- dependent. EVENT POST would be inefficient, but EVENT_QUERY could be more efficient. [Fortran 2008 189:10+] Add a new paragraph after 8.5.2p2: "The execution of ordered segments is sequentially consistent, and the values returned from EVENT_QUERY for variables on the executing image are consistent with that ordering." [18:1+] Add a new paragraph to 7.3: "An EVENT POST statement completes when the event variable has been updated on the owning image, and does not depend on the execution of a matching EVENT WAIT statement." This means that EVENT POST will block until the acknowledgement of the update has been received. [18:2-4] Delete the paragraph (7.3p4). [18:4+] Delete NOTE 7.2. [18:21-24] Replace 7.4p3 by: "If an EVENT WAIT statement using an event variable is executed with a threshold of k when the event variable has a value of n, and k <= n, the segments preceding all of the EVENT POST statements that have updated that event variable will precede the segment following the EVENT WAIT statement. If k > n, the EVENT WAIT statement will wait until the value of the event variable reaches k. [18:25+] Delete NOTE 7.4 because it is now normative. [27:21+] Add a new paragraph to 8.4.15: "If the value returned by an EVENT_QUERY invocation on an event variable on the same image is in position k of the sequence of values that the event variable takes during execution and the EVENT_QUERY invocation precedes an EVENT WAIT invocation on the variable, the segment preceding the EVENT POST statement that made change i, for any i <= k, will precede the segment following that EVENT WAIT statement." [50:35+] Add a new paragraph to A.3.2: "Because the value returned from EVENT_QUERY is purely indicative on any image other than the one on which the event variable is located, whether this program will make progress or hang is processor-dependent." Proposed Solution Three ----------------------- This would make examples A.3.2 and A.3.3 both processor-dependent, but would be by far the easiest to implement and most efficient. [Fortran 2008 189:10+] Add a new paragraph after 8.5.2p2: "Segment ordering is sequentially consistent." [18:1+] Add a new paragraph to 7.3: "A program that depends on the segment following an EVENT POST statement being executed before a matching EVENT WAIT statement is executed for that event variable is not conforming." [18:2-4] Delete the paragraph (7.3p4). [18:22] Change "at least" to "exactly" in 7.4p3. [18:23] Following the first "WAIT statement." in 7.4p3, add: "If the event variable has a value of greater than k, which of those segments precede the EVENT WAIT is processor-dependent." [18:25+] Replace the body of NOTE 7.4 in 7.4 by: "The ordering of segments is precisely that defined by the execution of the image control statements, and the value of the event variable (as returned from EVENT_QUERY) is purely indicative". [49:13+] Add a new paragraph to A.3: "Because the value returned from EVENT_QUERY is purely indicative, whether these programs will make progress or hang is processor- dependent." FULL DISCUSSION --------------- --------------- Background ---------- Events are general semaphores (as distinct from binary semaphores), which are often called counting semaphores, so there are many APIs that use them. However, every specification I have found has been merely the syntax and intent, with only a few aspects of the semantics specified. What is more, there are rarely any useful constraints on what those semantics may be, so they are closer to Fortran undefined than processor-dependent. POSIX is well-known for that (see below). Up until the late 1960s, essentially all parallel experts believed that parallel correctness was merely a matter of getting the synchronisation right, but experience with the early parallel systems showed that was not even approximately true. It was another decade before any real progress was made (Hoare in 1978 and Lamport in 1979) - that identified data consistency as the key concept, and one that could NOT be reduced to a synchronisation property. Dijkstra invented semaphores in 1962/3, and described them solely as synchronisation facilities, with only an API (in the above sense). In over 50 years, nobody seems to have been able to provide a specification of general semaphores that does not contain serious ambiguities or inconsistencies, and many of the experts I have talked to suspect that it is impossible. See, for example: https://github.com/android/platform_bionic/blob/master/libc/... .../bionic/semaphore.cpp // Memory order requirements for POSIX semaphores appear unclear and are // currently interpreted inconsistently. // We conservatively prefer sequentially consistent operations for now. // CAUTION: This is more conservative than some other major // implementations, and may change if and when the issue is resolved. An indication of how deceptive this area is may be seen by the fact that it was only in the early 1990s that the theoreticians finally managed to understand the relationship between causality (i.e. that which is delivered by sychronisation) and the 'obvious' semantics. See, for example: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.6403 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.3356 Fortran events impose the restriction that they can be waited on only by their owning image, but that simplifies their semantics only very slightly (see below), and a precise specification of them remains beyond the state of the art in computer science. The situation is far nastier than it seems to be, because a lot of hardware and software optimisations allow 'out of thin air' effects, and that includes IBM POWER. Despite common belief, such effects cannot be excluded by ANY specification that limits itself to the semantics of a single semaphore. None of the experts I know trust their own intuition, because it is so common that 'obviously correct' specifications contain serious flaws, often unusabilities or inconsistencies. I have been following this area for 40 years, have 20 years practical experience, and am still learning new ways in which 'obvious truths' are incorrect. Java has had threading since 1995, but its model was discovered to be badly flawed only in 2003 or so, and it had to be replaced in 2004. http://www.cs.umd.edu/~pugh/java/memoryModel/ These are not purely theoretical issues, either, as I can witness from helping people with parallel problems. Even the more extreme effects (e.g. 'out of thin air' ones) really do occur in many parallel contexts. Speaking to other people in the parallel development and support area, most of them had encountered similar issues, but resolved them by rewriting code to use different facilities or a different algorithm, sometimes repeating that for each new environment! Issue 1 (Blocking) ------------------ Is EVENT POST allowed to block until there is a matching EVENT WAIT? The reason that this is a critical question is that all image control statements except SYNC MEMORY are allowed to block, and the ordering of that is explicitly processor-dependent (8.5.5p4). The point is that message-passing implementations have to ensure that all relevant outstanding transfers (including externally-initiated ones) are completed appropriately, and it is very hard to identify them without blocking. It appears that the consensus is "no", in the light of the decision taken in Las Vegas to require asynchronous progress (N2020), but it is not clear from the current normative wording. The current Note 7.2 clarifies nothing, unfortunately. Specifically, are the following programs required to terminate? PROGRAM Block_One USE, INTRINSIC :: ISO_FORTRAN_ENV TYPE(EVENT_TYPE) :: event[*] IF (NUM_IMAGES() /= 2) STOP POST(event[3-THIS_IMAGE()]) END PROGRAM Block_One PROGRAM Block_Two USE, INTRINSIC :: ISO_FORTRAN_ENV TYPE(EVENT_TYPE) :: event[*] IF (NUM_IMAGES() /= 2) STOP POST(event[3-THIS_IMAGE()]) WAIT(event) END PROGRAM Block_Two Solution A: If those are conforming: ---------- [18:1+] Add a new paragraph to 7.3: "An EVENT POST statement completes when the event variable has been updated on the owning image, and does not depend on the execution of a matching EVENT WAIT statement." This means that EVENT POST will block until the acknowledgement of the update has been received. [18:4+] Delete NOTE 7.2. Solution B: If not: ---------- [18:1+] Add a new paragraph to 7.3: "A program that depends on the segment following an EVENT POST statement being executed before a matching EVENT WAIT statement is executed for that event variable is not conforming." Issue 2 (EVENT_QUERY) --------------------- Is the segment ordering required to match the value of the event variable, and must that be consistent across images? Are the following programs conforming? PROGRAM Query_One ! This asks whether, if the event count is higher than the wait ! count, the WAIT synchronises after all of the POSTs USE, INTRINSIC :: ISO_FORTRAN_ENV TYPE(EVENT_TYPE) :: event[*] INTEGER :: data[*] = 0, count IF (NUM_IMAGES() /= 2) STOP IF (THIS_IMAGE() == 1) THEN DO CALL EVENT_QUERY(event,count) IF (count == 2) EXIT END DO WAIT(event) data = 1 ELSE IF (THIS_IMAGE() == 2) THEN POST(event[1]) data[1] = 0 POST(event[1]) END IF END PROGRAM Query_One PROGRAM Query_Two ! Assuming the answer to Query_One is "yes", this asks whether the event ! count must be consistent across all images USE, INTRINSIC :: ISO_FORTRAN_ENV TYPE(EVENT_TYPE) :: event_1[*], event_2[*] INTEGER :: data[*] = 0, count IF (NUM_IMAGES() /= 3) STOP IF (THIS_IMAGE() == 1) THEN WAIT(event_2) WAIT(event_1) data = 1 ELSE IF (THIS_IMAGE() == 2) THEN POST(event_1[1]) data[1] = 0 POST(event_1[1]) ELSE IF (THIS_IMAGE() == 3) THEN DO CALL EVENT_QUERY(event_1[1],count) IF (count == 2) EXIT END DO POST(event_2[1]) END IF END PROGRAM Query_Two PROGRAM Query_Three ! Assuming the answer to Query_One is "yes", this asks whether waiting ! for the event count to reach the required value is enough (as distinct ! from requiring the WAIT to be executed) USE, INTRINSIC :: ISO_FORTRAN_ENV TYPE(EVENT_TYPE) :: event[*] INTEGER :: data[*] = 0, count IF (NUM_IMAGES() /= 2) STOP IF (THIS_IMAGE() == 1) THEN WAIT(event) DO CALL EVENT_QUERY(event,count) IF (count == 1) EXIT END DO data = 1 ELSE IF (THIS_IMAGE() == 2) THEN POST(event[1]) data[1] = 0 POST(event[1]) END IF END PROGRAM Query_Three Five plausible responses are "no, no, no", "yes, no, no", "yes, yes, no", "yes, no, yes" and "yes, yes, yes". The solution depends on which one is preferred. Solution A: If "no, no, no": ---------- I favour this, and there are certainly some other people who do, but I do not know whether we are in a majority. [18:22] Change "at least" to "exactly" in 7.4p3. [18:23] Following the first "WAIT statement." in 7.4p3, add: "If the event variable has a value of greater than k, which of those segments precede the EVENT WAIT is processor-dependent." [18:25+] Replace the body of NOTE 7.4 in 7.4 by: "The ordering of segments is precisely that defined by the execution of the image control statements, and the value of the event variable (as returned from EVENT_QUERY) is purely indicative". [49:13+] Add a new paragraph to A.3: "Because the value returned from EVENT_QUERY is purely indicative, whether these programs will make progress or hang is processor- dependent." Solution B: If "yes, no, no": ---------- My understanding from what they have said is that many people expect this, and would make example A.3.2 defined, but not A.3.3. [18:21-24] Replace 7.4p3 by: "If an EVENT WAIT statement using an event variable is executed with a threshold of k when the event variable has a value of n, and k <= n, the segments preceding all of the EVENT POST statements that have updated that event variable will precede the segment following the EVENT WAIT statement. If k > n, the EVENT WAIT statement will wait until the value of the event variable reaches k. [18:25+] Delete NOTE 7.4 because it is now normative. [27:21+] Add a new paragraph to 8.4.15: "If the value returned by an EVENT_QUERY invocation on an event variable on the same image is in position k of the sequence of values that the event variable takes during execution and the EVENT_QUERY invocation precedes an EVENT WAIT invocation on the variable, the segment preceding the EVENT POST statement that made change i, for any i <= k, will precede the segment following that EVENT WAIT statement." [50:35+] Add a new paragraph to A.3.2: "Because the value returned from EVENT_QUERY is purely indicative on any image other than the one on which the event variable is located, whether this program will make progress or hang is processor-dependent." Solution C: If "yes, yes, no": ---------- This differs from solution B in making example A.3.3 defined and requiring EVENT_QUERY to synchronised. I do not like it, because I can't convince myself that it doesn't have some nasty ambiguities lurking, and it adds serious inefficiency, but am prepared to propose it for the sake of consensus. [18:21-24] Replace 7.4p3 by: "If an EVENT WAIT statement using an event variable is executed with a threshold of k when the event variable has a value of n, and k <= n, the segments preceding all of the EVENT POST statements that have updated that event variable will precede the segment following the EVENT WAIT statement. If k > n, the EVENT WAIT statement will wait until the value of the event variable reaches k. [18:25+] Delete NOTE 7.4 because it is now normative. [27:21+] Add a new paragraph to 8.4.15: "If the value returned by an EVENT_QUERY invocation is in position k of the sequence of values that the event variable takes during execution and the EVENT_QUERY invocation precedes an EVENT WAIT invocation on the variable, the segment preceding the EVENT POST statement that made change i, for any i <= k, will precede the segment following that EVENT WAIT statement." EVENT_QUERY is now an anomalous intrinsic procedure that performs logic (but not data) synchronisation, and I feel that it would be better in a new class. [27:21+] Add two new Notes to 8.4.15: "NOTE 8.x This means that EVENT_QUERY has a synchronization effect, but it does not mean that it synchronizes data updates." "NOTE 8.y In order to meet this requirement, implementations may need to ensure that EVENT POST blocks until it has received an acknowledgement that the event variable has been updated." Solution E: If "yes, no, yes": ---------- I do not think that anyone on WG5 favours this, though some users may, so I have not tried to specify it. Solution F: If "yes, yes, yes": ---------- I sincerely hope that nobody on WG5 favours this, but some people have said things that are equivalent to it. The best solution in this case would be to change EVENT_QUERY from a subroutine into an image control statement, but there seems to be no willingness to do that. If not, the next best solution would be a new class of intrinsic subroutine, i.e. an atomic one containing an image control statement, but that would mean significant changes to the TS and main standard. Issue 3 (Sequential Consistency) -------------------------------- Is event order required to be sequentially consistent? Many people will assume they are, as Fortran 2008 8.5.2 does. The second example is relevant only if the answer to Query_One is "yes", because otherwise the ordinary atomic consistency rules (whatever they are) apply. Is the following program required to print '1'? PROGRAM SC_One ! This is a classic example, and a surprising number of environments ! will, at least on occasion, print '0' (which indicates simultaneous ! access to data in unordered segments) - 'obviously' that can't ! happen, but parallelism is no more obvious than quantum mechanics USE, INTRINSIC :: ISO_FORTRAN_ENV TYPE(EVENT_TYPE) :: event[*] INTEGER :: data[*] = 0 IF (NUM_IMAGES() /= 3) STOP IF (THIS_IMAGE() == 1) THEN WAIT(event) POST(event[2]) PRINT *, data ELSE IF (THIS_IMAGE() == 2) THEN WAIT(event) POST(event[1]) ELSE IF (THIS_IMAGE() == 3) THEN data[1] = 1 POST(event[1]) END IF END PROGRAM SC_One Is the following program allowed to print "1 0" from image 4 and "0 1" from image 6? PROGRAM SC_Two ! This is the classic IRIW example, and often occurs when the network ! has multiple paths between two points. USE, INTRINSIC :: ISO_FORTRAN_ENV TYPE(EVENT_TYPE) :: event[*] INTEGER :: count_1, count_2 IF (NUM_IMAGES() /= 6) STOP IF (THIS_IMAGE() == 1) THEN POST(event[5]) ELSE IF (THIS_IMAGE() == 2) THEN POST(event[6]) ELSE IF (THIS_IMAGE() == 3) THEN CALL EVENT_QUERY(event[5],count_1) CALL EVENT_QUERY(event[6],count_2) PRINT *, count_1, count_2 ELSE IF (THIS_IMAGE() == 4) THEN CALL EVENT_QUERY(event[6],count_2) CALL EVENT_QUERY(event[5],count_1) PRINT *, count_1, count_2 END IF END PROGRAM SC_Two It would be a good idea to close the issue by explicitly requiring sequential consistency for segment ordering and, if the answer to Query_One is "yes", to include EVENT_QUERY. Solution A: If the answer to Query_One is "no": ---------- [Fortran 2008 189:10+] Add a new paragraph after 8.5.2p2: "Segment ordering is sequentially consistent." Solution B: If the answer to Issue 2 is "yes, yes, yes": ---------- [Fortran 2008 189:10+] Add a new paragraph after 8.5.2p2: "Segment ordering is sequentially consistent, and the values returned from EVENT_QUERY are consistent with that." Solution C: Otherwise, if the answer to Query_One is "yes": ---------- [Fortran 2008 189:10+] Add a new paragraph after 8.5.2p2: "The execution of ordered segments is sequentially consistent, and the values returned from EVENT_QUERY for variables on the executing image are consistent with that ordering." or (potentially with serious efficiency loss): "The execution of ordered segments is sequentially consistent, and the values returned from EVENT_QUERY are consistent with that ordering." Issue 4 (Tidying Up) -------------------- One response to these issues was to add the following paragraph to 7.3p4 (EVENT POST): "If the segment that precedes an EVENT POST statement is unordered with respect to the segment that precedes another EVENT POST statement for the same event variable, the order of execution of the EVENT POST statements is processor dependent." I am unable to see that it adds anything useful. That never was disputed and does not resolve any problems. It should be deleted. [18:2-4] Delete the paragraph (7.3p4).