J3/05-183r1 Date: 13 May 2005 To: J3 From: John Reid Title: Co-arrays for parallel programming Subject: Revision of Requirement UK-001 Status: For Consideration Severity Level: 6 References: ISO/IEC JTC1/SC22/WG5 N1317, N1626 Basic Functionality: Co-arrays provide a simple parallel extension of Fortran. The extension assumes the SPMD programming model with the program replicated a fixed number of times. Each copy is called an image. Normal subscripts refer to memory addresses within an image while a second set of subscripts refer to an image. Each image executes asynchronously with explicit synchronization provided by the programmer. For example, the statement REAL, DIMENSION(N)[*] :: X,Y declares that each image has two real arrays of size N. When an image executes the statement X(:) = Y(:)[Q] it makes a copy of the array Y from the memory of image Q into array X in the memory of the image executing the statement. A reference to X or Y without its second set of subscripts is a reference to the local array. Rationale: It is our belief that the wide adoption of this extension, compared with other parallel programming models, would lead to easier program development, faster execution times, and better program maintainability, particularly where the number of processes is large. Fortran 95 included language features needed by HPF in the expectation that HPF would become widely used for parallel programming. Unfortunately, this has not happened because HPF is difficult to implement and its performance in practice is often disappointing. Furthermore, its ability to represent complicated data distributions is limited. Instead MPI has become the de-facto standard for parallel programming. MPI is awkward to program and its performance tends to decay as the number of processes increases. MPI is essentially a library of C procedures, and the overhead from calling these procedures often limits scalability for programs that generate many small messages. In contrast, co-arrays add additional syntax to the language that requires support in the compiler. They allow the programmer to design complicated data distributions and to write customized communication patterns using a Fortran-like syntax. The lack of any external library call, coupled with the fact that the extension has been designed to be easy to implement, gives great scope for communications optimization compared with either HPF or MPI codes. Estimated Impact: Other features would be little affected since execution is normal and normal optimizations are applicable except where there is explicit co-array syntax. Uniprocessors would be required to recognize the syntax but can ignore most of it and, if asked, would indicate that the number of images is one. Detailed Specification: This is a revision of the previous specification, which simply referenced report RAL-TR-1998-060, an updated version of N1317. This report is still useful for further explanation and is available as ftp://ftp.numerical.rl.ac.uk/pub/reports/nrRAL98060.ps.gz This document is now self-contained. We have made some technical changes, each of which is highlighted with a note. In this proposal, we use the notation < ... > to represent bold font (definitions) and braces { ... } to represent italic (Fortran bnf). Fortran 2003 uses square brackets for array constructors, which does not conflict with square brackets indicating a co-array reference though it does mean that such a reference will not be as obvious as it was in Fortran 95. In this proposal, we use the Fortran 95 syntax for array constructors and reserve square bracket-syntax for co-arrays. We also use square brackets to indicate optional arguments. 1 Program images A Fortran program containing co-arrays executes as if it were replicated a number of times, the number of replications remaining fixed during execution of the program. Each copy is called an and each image executes asynchronously. A particular implementation may permit the number of images to be chosen at compile time, at link time, or at execute time. The number of images may be the same as the number of physical processors, or it may be more, or it may be less. The programmer may retrieve the number of images at run time by invoking the intrinsic function NUM_IMAGES(). Note: The model is that the program is written for any number of images and the actual number for a specific run is specified on the command line. This allows the program to be developed with fewer images that will be used in the eventual production runs. If the system is a uniprocessor, the compiler knows that the number of images is 1 and can take advantage of this. Thus the early testing of the program can be performed on a uniprocessor. For compilation on a multiprocessor, the vendor could supply a switch to specify that the number of images is 1, which would permit the optimizer to eliminate overhead associated with parallel execution. Each image is indexed by an . These form a sequence of integers running from one to NUM_IMAGES(). The programmer may retrieve the index of the invoking image through the intrinsic function THIS_IMAGE(). The programmer may retrieve the set of subscript values that correspond to the invoking image for a particular co-array by providing an optional argument to THIS_IMAGE (Section 10.3). The programmer may retrieve the index of the image that corresponds to a set of subscripts for a particular co-array through the intrinsic function IMAGE_INDEX (Section 10.3). The programmer controls the execution sequence in each image through explicit use of Fortran control constructs and through explicit use of intrinsic synchronization procedures. 2 Specifying data objects Each image has its own set of data objects, all of which may be accessed in the normal Fortran way. Some objects are declared with in square brackets immediately following dimensions in parentheses (round brackets) or in place of them. Such an object is a . These are examples of co-array declarations: REAL, DIMENSION(20)[20,*] :: A REAL :: C[*], D[*] CHARACTER :: B(20)[20,0:*] INTEGER :: IB(10)[*] TYPE(INTERVAL) :: S[20,*] Unless the array is allocatable (Section 6), the form for the dimensions in square brackets is the same as that for the dimensions in parentheses for an assumed-size array. The total number of subscripts and co-subscripts is limited to 15. Note: change from N1626 following the decision of WG5 re item J3-039 in N1626. The part of a co-array that resides on another image may be addressed by using subscripts in square brackets following any subscripts in parentheses, for example: A(5)[3,7] = IB(5)[3] D[3] = C A(:)[2,3] = C[1] Each such subscript must be a scalar integer expression (section subscripts are not permitted). Note: Change from RAL-TR-1998-060. Co-subscripts are now limited to scalars. This is a big change that substantially simplifies the extension. It was suggested at Delft that at least elementals be allowed but it was decided that this would be better left for a later extension. We call any object whose designator includes square brackets a . For each co-array, subscripts in square brackets are mapped to image indices in the same way as Fortran array subscripts are mapped to the position of the array element in array element order. On each image, the image index is available from the intrinsic THIS_IMAGE with no arguments, the set of subscript values for a co-array A that correspond to the image are available as THIS_IMAGE(A), and the image index that corresponds to a set of subscript values for a co-array A is available as IMAGE_INDEX(A,SUB). For example, on image 5, for the array declared as REAL :: A(10,20)[10,0:9,0:*] THIS_IMAGE() has the value 5 and THIS_IMAGE(A) has the value (\ 5,0,0 \). For the same example on image 213, THIS_IMAGE(A) has the value (\ 3,1,2 \). On any image, IMAGE_INDEX(A, (\ 5,0,0 \) ) has the value 5 and IMAGE_INDEX(A, (\3,1,2 \) has the value 213. Note: Change from RAL-TR-1998-060. The intrinsic function IMAGE_INDEX has been added. Note: On a shared-memory machine, we expect a co-array to be implemented as if it were an array of higher rank. On a distributed-memory machine with one physical processor for each image, a co-array may be stored from the same virtual address in each physical processor. On any machine, a co-array may be implemented in such a way that each image can calculate the virtual address of an element on another image relative to the array start address on that other image. An implementation might arrange for each co-array to be stored from the same virtual address in each image, but this is not required. The , , , , and of a co-array are given by the data in parentheses in its declaration or allocation. The , , and are given by the data in square brackets in its declaration or allocation. The co-size of a co-array is always equal to the number of images. The syntax and semantics mirror those of assumed-size arrays - the final extent is always indicated with an asterisk and a co-array has no final co-extent, no final upper bound, and no co-shape. For example, the co-array declared thus REAL, DIMENSION(10,20)[20,5,0:*] :: A has rank 2, co-rank 3, and shape (/10,20/); its lower co-bounds are 1, 1, 0. Note: Change from RAL-TR-1998-060. Now that co-subscripts are limited to scalars, it is better to use 'rank' for what was 'local rank', and similarly for the other terms. The actual argument corresponding to a dummy co-array must be a co-array or a subobject of a co-array, with no square brackets in its designator. It is not required to be the same object on all images. For example, the subroutine SUBROUTINE SOLVE1(N,A) INTEGER :: N REAL :: A(N)[*] : may be called at the same time thus on three images: CALL SOLVE1 (100,A) CALL SOLVE1 (200,B) CALL SOLVE1 (100,A(10)) Furthermore, each image independently defines the bounds and co- bounds. For example, the subroutine SUBROUTINE SOLVE2(M,N,A) INTEGER :: M,N REAL :: A(M)[N,*] might be called simultaneously on two images with differing values of M and N. Note: We expect that the case with all images having the same co-array with the same bounds and co-bounds will occur very frequently, but the additional freedom (used with care) may be very useful. An example is when different teams of images are working independently. Note: Change from RAL-TR-1998-060. We previously required the dummy co-arrays to be identical. A co-array may be allocatable: SUBROUTINE SOLVE3(N,A,B) INTEGER :: N REAL :: A(N)[*], B(N) REAL,ALLOCATABLE :: WORK(:)[:] Allocatable arrays are discussed in Section 6. Unless it is allocatable or a dummy argument, a co-array that is not declared in a main program must be given the SAVE attribute. Note: If a co-array is declared in a procedure, with a fixed size but without the SAVE attribute, there would need to be an implicit synchronization on entry to the procedure and return from it. Without this, there might be a reference from one image to non-existent data on another image. An allocatable array is not required to have the SAVE attribute because a recursive procedure may need separate allocatable arrays at each level of recursion. Note: Change from N1626: the SAVE attribute was previously acquired automatically when needed. Automatic co-arrays are not permitted; for example, the co-array work in the above code fragment is not permitted to be declared thus SUBROUTINE SOLVE3(N,A,B) INTEGER :: N REAL :: A(N)[*], B(N) REAL :: WORK(N)[*] ! Not permitted Note: Were automatic co-arrays permitted, it would be necessary to require image synchronization, both after memory is allocated on entry and before memory is deallocated on return. We would also need rules to ensure that the sizes are the same in all images. Effectively the arrays would be just like allocatables, except for not needing to write allocate syntax. A co-array is not permitted to be a constant. Note: This restriction is not necessary, but the feature would be useless since each image would hold exactly the same value. To ensure that data initialization is local (the same on each image), co-subscripts are not permitted in DATA statements. For example: REAL :: A(10)[*] DATA A(1) /0.0/ ! Permitted DATA A(1)[2] /0.0/ ! Not permitted A derived type is not permitted to have a co-array component unless the component is allocatable. If an object has a co-array component at any level of component selection, each ancestor of the co-array component must be a non-allocatable non-pointer scalar. If the object is declared in a procedure and does not have the SAVE attribute, there is an implicit deallocation (and associated synchronization) before the procedure in which it is declared is exited by execution of a RETURN statement or an END statement. Note: change from N1626. An array component was not permitted to be a co-array. Note: Were we to allow a co-array of a type with co-array components, we would be confronted with references such as Z[P]%X[Q]. A logical way to read such an expression would be: go to image P and find component X on image Q. This is logically equivalent to Z[Q]%X. A co-array is not permitted to be a pointer but a co-array may be of a derived type with pointer or allocatable components. The targets of such components are always local. Such a component or a subobject of such a component must not be associated as an actual argument with a dummy co-array since its shape may vary from image to image. Note: If a large array is needed on a subset of images, it is wasteful of memory to specify it directly as a co-array. Instead, it should be specified as an allocatable component of a co-array and allocated only on the images on which it is needed. Note: It has been suggested that we might allow a co-array to have the pointer attribute with the meaning that on each image it has a local target or is disassociated but access between images is exactly as for non-pointer co-arrays. This would add no extra functionality - just the minor convenience of allowing references such as A[P]%COMP(I:) being written as A(I:)[P]. 3 Accessing data objects In an expression, a reference without square brackets is always a reference to the object on the invoking image. For example, SIZE(IB) for the co-array IB declared at the start of Section 2 returns 10. The co-subscript list must map to a valid image index. For example, if there are 16 images and the co-array A is declared thus REAL :: A(10)[5,*] A(:)[1,4] is valid since it has co-subscript order value 16, but A(:)[2,4] is invalid. Square brackets attached to objects alert the reader to communication between images. Unless square brackets appear explicitly, all objects reside on the invoking image. Communication may take place, however, within a procedure that is referenced, which might be a defined operation or assignment. Note: Whether the executing image is selected in square brackets has no bearing on whether the executing image evaluates the expression or assignment. For example, the statement P[6] = 1 is executed by every image, not just image 6. If code is to be executed selectively, the Fortran IF or CASE statement is needed. For example, the code REAL :: P[*] ... IF (THIS_IMAGE(P)==1) THEN READ(6,*)P DO I = 2, NUM_IMAGES() P[I] = P END DO END IF CALL SYNC_ALL employs the first image to read data and broadcast it to other images. 4 Procedures A co-array subobject is permitted in intrinsic operations, intrinsic assignments, and input/output lists. It is also permitted in non-intrinsic operations and as an actual argument in a procedure call provided the interface is explicit and the dummy argument is not a co-array and has INTENT(IN) or the VALUE attribute; a local copy of the actual argument is made before execution of the procedure starts. Note: Change from N1626. Allowing co-array subobjects in the non-intrinsic case is an addition. If a dummy argument is a co-array, the interface must be explicit. The rules for resolving generic procedure references are based on the local properties and are therefore unchanged. Note: The rules cannot be extended to allow overloading of array and co-array versions since the syntactic form of an actual argument would be the same in the two cases. If a dummy argument is a co-array that is not allocatable, the co-rank and co-bounds are defined afresh and are completely independent of those of the actual argument. The actual argument must be the name of a co-array or a subobject of a co-array without any vector-valued subscripts, allocatable component selection, or pointer component selection. Note: It is intended to ensure that copy-in or copy-out does not take place for a dummy co-array. A dummy co-array may be of assumed size or assumed shape: SUBROUTINE SUBR(N,A,B) INTEGER :: N REAL :: A(N,*)[*], B(:,:)[*] If an assumed-shape array or a subobject of an assumed-shape array appears as an actual argument corresponding to a dummy co-array, the dummy co-array must be of assumed shape. If an array section appears as an actual argument corresponding to a dummy co-array that is not of assumed shape, the section must have elements whose subscript order values in its parent array consist of a sequence without gaps. Note: Change from RAL-TR-1998-060. These rules have been added to ensure that copy-in or copy-out does not take place. If a dummy argument is an allocatable co-array, the corresponding actual argument must be an allocatable co-array of the same rank and co-rank. Note: Change from RAL-TR-1998-060. Allocatable dummy co-arrays are permitted in view of allocatable dummy arguments being permitted in Fortran 2003. A function result is not permitted to be a co-array. Note: A co-array function result is like an automatic co-array and is disallowed for the same reasons. A pure or elemental procedure is not permitted to contain any co-array syntax. The array reduction functions of 13.5.12 are extended to be collectives with the same name and an additional optional argument TEAM that specifies the team of images or is the string 'ALL'. All participating images return a result, but roundoff effects may cause these to vary between images. The procedure must be called on all participating images and involves synchronization across these images. Note: change from N1626. The extension to collectives is new. 5 Storage association COMMON and EQUIVALENCE statements are permitted for co-arrays and specify how the storage is arranged on each image (the same for every one). Therefore, co-array references are not permitted in an EQUIVALENCE statement. For example EQUIVALENCE (A[10],B[7]) ! Not allowed (compile-time constraint) is not permitted. Appearing in a COMMON or EQUIVALENCE statement has no effect on whether an object is a co-array; it is a co-array only if declared with square brackets. An EQUIVALENCE statement is not permitted to associate a co-array with an object that is not a co-array. For example INTEGER :: A,B[*] EQUIVALENCE (A,B) ! Not allowed (compile-time constraint) is not permitted. Note: The ban on associating a co-array with an object that is not a co-array is not necessary, but we see it as desirable to keep some separation between objects that are co-arrays and those that are not. A COMMON block that contains a co-array must have the SAVE attribute. Which objects in the COMMON block are co-arrays may vary between scoping units. Since blank COMMON may vary in size between scoping units, co-arrays are not permitted in blank COMMON. 6 Allocatable arrays A co-array may be allocatable. The ALLOCATE statement is extended so that the co-bounds can be specified, for example, REAL, ALLOCATABLE :: A(:)[:], S[:,:] : ALLOCATE ( ARRAY(10)[*], S[-1:34,0:*] ) The co-bounds must always be included in the allocate statement and the upper bound for the final co-dimension must always be an asterisk. For example, the following are not permitted (compile-time constraints): ALLOCATE( A(NUM_IMAGES()) ) ! Not allowed (no co-bounds) ALLOCATE( A(10)[NUM_IMAGES()] ) ! Not allowed (co-bound not *) Also, the values of all the local bounds are required to be the same on all images. For example, the following is not permitted (run-time constraint) ALLOCATE( A(THIS_IMAGE())[*] ) ! Not allowed (varying local bound) There is implicit synchronization of all images in association with each ALLOCATE statement that involves one or more co-arrays. Images do not commence executing subsequent statements until all images finish executing the ALLOCATE statement. Similarly, for DEALLOCATE, all images delay making the deallocations until they are all about to execute the DEALLOCATE statement. This synchronization is independent of those obtained by calling SYNC_ALL and SYNC_TEAM (see Sections 8 and 9). Note: Without these rules, an image might reference data on another image that has not yet been allocated or has already been deallocated. Note: Change from RAL-TR-1998-060. We now require all images to execute the same ALLOCATE statement and the same DEALLOCATE statement. This ensures that the co-arrays are allocated and deallocated in the same order on each image without imposing complicated extra rules. If the new rule is found in practice to be too restrictive, it could be relaxed in a revision of the language. Note: When an image executes an allocate statement, no communication is necessarily involved apart from any required for synchronization. The image allocates the local part and records how the corresponding parts on other images are to be addressed. The compiler, except perhaps in debug mode, is not required to enforce the rule that the bounds are the same on all images. Nor is the compiler responsible for detecting or resolving deadlock problems. For an allocatable co-array without the SAVE attribute there is an implicit deallocation (and associated synchronization) before the procedure in which it is declared is exited by execution of a RETURN statement or an END statement. Note: Change from RAL-TR-1998-060. Previously, it was the programmer's responsibility to explicitly deallocate such a co-array. We do not want the possibility of the allocation state becoming undefined. Anyway, it is awkward for the user to program this if there is more than one return statement. Note: An allocatable co-array may be given the SAVE attribute unless separate arrays are needed at each level of recursion in a recursive procedure. For allocation of such a co-array, each image must descend to the same level of recursion or deadlock may occur. In an array assignment of an array to an allocatable co-array, the shapes must agree. In an intrinsic assignment to a remote object of a type that has an allocatable component at any level of component selection, the component shapes must agree. Note: Change from RAL-TR-1998-060. This rule was not needed in Fortran 95. 7 Array pointers A co-array may be of a derived type with pointer or allocatable components. For example, if P is a pointer component, Z[Q]%P is a reference to the target of component P of Z on image Q. This target must reside on image Q and must have been established by an allocate statement executed on image Q or a pointer assignment executed on image Q. For example, R => Z[Q]%P ! Not allowed (compile-time constraint) is not permitted. There is no requirement for all these components to be allocated or associated or for them to have the same shape, so an allocatable or pointer component or a subobject of such a component is not permitted as an actual argument that corresponds to a co-array dummy argument. Note: Change from RAL-TR-1998-060. Previously, pointer components were severely restricted in order that they could be associated with co-array dummy arguments. Note: Change from RAL-TR-1998-060. Allocatable components added (not included in Fortran 95). Intrinsic assignments are not permitted for co-array subobjects of a derived type that has a pointer component, since they would involve a disallowed pointer assignment for the component: Z[Q] = Z ! Not allowed if Z has a pointer Z = Z[Q] ! component (compile-time constraint) Similarly, for a co-array of a derived type that has a pointer or allocatable component, it is illegal to allocate one of those components on another image: TYPE(SOMETHING), ALLOCATABLE :: T[:] ... ALLOCATE(T[*]) ! Allowed ALLOCATE(T%PTR(N)) ! Allowed ALLOCATE(T[Q]%PTR(N)) ! Not allowed (compile-time constraint) A co-array is permitted to be of a type that has a procedure pointer component or a type bound procedure. Such a procedure must not be invoked remotely; for example, the statement CALL A[P]%PROC(X) is not permitted. Note: Change from RAL-TR-1998-060. These were not part of Fortran 95. A possible future extension is for these to be used for remote procedure calls. 8 Execution control Most of the time, each image executes on its own as a Fortran program without regard to the execution of other images. It is the programmer's responsibility to ensure that whenever an image alters a co-array datum, no other image might still need the old value. Also, that whenever an image accesses a co-array datum, it is not an old value that needs to be updated by another image. The programmer uses invocations of the intrinsic synchronization procedures to do this, and the programmer should make no assumptions about the execution timing on different images. Note: This obligation on the programmer provides the compiler with scope for optimization. When constructing code for execution on an image, it may assume that it is the only image in execution until the next invocation of one of the intrinsic synchronization procedures and thus it may use all the optimization techniques available to a standard Fortran compiler. Note (cont): In particular, if the compiler employs temporary memory such as cache or registers (or even packets in transit between images) to hold co-array data, it must copy such data to memory that can be accessed by another image to make it visible to it. Also, if another image changes the co-array data, the executing image must recover the data from global memory to the temporary memory it is using. The intrinsic procedure FLUSH_MEMORY is provided for both purposes. It is concerned only with data held in temporary memory on the executing image for co-arrays in the local scope. Given this fundamental intrinsic procedure, the other synchronization procedures can be programmed in Fortran (see Appendix 1), but the intrinsic versions are likely to be more efficient. In addition, the programmer may use it to express customized synchronization operations in Fortran. If data calculated on one image are to be accessed on another, the first image must call FLUSH_MEMORY after the calculation is complete and the second must call FLUSH_MEMORY before accessing the data. Synchronization is needed to ensure that FLUSH_MEMORY is called on the first before FLUSH_MEMORY is called on the second. Note: A compiler can hold co-arrays in temporary storage, such as cache or registers, between calls to FLUSH_MEMORY. Note: If the local part of a co-array or a subobject of it is an actual argument corresponding to a dummy argument that is not a co-array, a copy may be passed to the procedure. To avoid the possibility of the original being altered by another image after the copy has been made, a synchronization may be needed ahead of the procedure invocation. Similarly, a synchronization is needed after return before any other image accesses the result. The subroutine SYNC_TEAM (see Section 10) provides synchronization for a team of images. The subroutine SYNC_ALL provides a shortened call for the important case where the team contains all the images. Each invocation of SYNC_TEAM or SYNC_ALL also has the effect of FLUSH_MEMORY. The subroutine SYNC_ALL is not discussed further in this section. Note: No information is available about whether an action on one image occurs before or after an action on another image unless one is executed ahead of a synchronization call and the other is executed behind the corresponding synchronization call on the other. For example, while one image executes the statements between two invocations of SYNC_ALL, another image might be out of execution. Here is a example that imposes the fixed order 1, 2, ... on images: ME = THIS_IMAGE() NE = NUM_IMAGES() IF(ME>1) CALL SYNC_TEAM( ME-1 ) P[6] = P[6] + 1 IF(ME= CMEQ ) EXIT END DO END DO END SUBROUTINE WAIT_TEAM LOGICAL FUNCTION READY_TEAM(TEAM) INTEGER, INTENT(IN) :: TEAM(:) CALL FLUSH_MEMORY DO K= 1,SIZE(TEAM) Q = TEAM(K) IF( C(Q) < C(ME)[Q]) EXIT END DO READY = K>SIZE(TEAM) END FUNCTION READY_TEAM SUBROUTINE SYNC_TEAM11(TEAM,WAIT) INTEGER, INTENT(IN) :: TEAM(:),WAIT(:) CALL NOTIFY_TEAM(TEAM) CALL WAIT_TEAM(WAIT) END SUBROUTINE SYNC_TEAM11 SUBROUTINE SYNC_TEAM1(TEAM) INTEGER, INTENT(IN) :: TEAM(:) CALL SYNC_TEAM11(TEAM,TEAM) END SUBROUTINE SYNC_TEAM1 SUBROUTINE SYNC_TEAM0(TEAM) INTEGER, INTENT(IN) :: TEAM CALL SYNC_TEAM1((/ME,TEAM/)) END SUBROUTINE SYNC_TEAM0 SUBROUTINE SYNC_TEAM10(TEAM,WAIT) INTEGER, INTENT(IN) :: TEAM(:) INTEGER, INTENT(IN) :: WAIT CALL SYNC_TEAM11(TEAM,(/WAIT/)) END SUBROUTINE SYNC_TEAM10 SUBROUTINE SYNC_TEAM00(TEAM,WAIT) INTEGER, INTENT(IN) :: TEAM INTEGER, INTENT(IN) :: WAIT CALL SYNC_TEAM11((/ME,TEAM/),(/WAIT/)) END SUBROUTINE SYNC_TEAM00 SUBROUTINE SYNC_TEAM01(TEAM,WAIT) INTEGER, INTENT(IN) :: TEAM INTEGER, INTENT(IN) :: WAIT(:) CALL SYNC_TEAM ((/ME,TEAM/),WAIT) END SUBROUTINE SYNC_TEAM01 SUBROUTINE SYNC_ALL CALL SYNC_TEAM (ALL,ALL) END SUBROUTINE SYNC_ALL SUBROUTINE SYNC_ALL1(WAIT) INTEGER, INTENT(IN) :: WAIT(:) CALL SYNC_TEAM(ALL,WAIT) END SUBROUTINE SYNC_ALL1 SUBROUTINE SYNC_ALL0(WAIT) INTEGER, INTENT(IN) :: WAIT CALL SYNC_TEAM(ALL,(/WAIT/)) END SUBROUTINE SYNC_ALL0 END MODULE SYNC Appendix 2. Module for THIS_IMAGE(ARRAY) and IMAGE_INDEX(ARRAY,SUB) The intrinsics THIS_IMAGE(ARRAY) and IMAGE_INDEX(ARRAY,SUB) cannot be coded in Fortran since ARRAY may be of any type and THIS_IMAGE(ARRAY) needs to know the index of the image on which the code is running. We therefore require the bounds to be specified as integer arrays and we require the image index for THIS_IMAGE(ARRAY). MODULE INDEX CONTAINS INTEGER FUNCTION IMAGE_INDEX(LBOUND,UBOUND,SUB) INTEGER,INTENT(IN) :: LBOUND(:),UBOUND(:),SUB(:) INTEGER :: I,N N = SIZE(SUB) IMAGE_INDEX = SUB(N) - LBOUND(N) DO I = N-1,1,-1 IMAGE_INDEX = IMAGE_INDEX*(UBOUND(I)-LBOUND(I)+1) + & SUB(I) - LBOUND(I) END DO IMAGE_INDEX = IMAGE_INDEX + 1 END FUNCTION IMAGE_INDEX INTEGER FUNCTION THIS_IMAGE(LBOUND,UBOUND,ME) RESULT(SUB) INTEGER,INTENT(IN) :: LBOUND(:),UBOUND(:),ME INTEGER :: SUB(SIZE(LBOUND)) INTEGER :: EXTENT,I,M,ML,N N = SIZE(SUB) M = ME - 1 DO I = 1,N-1 EXTENT = UBOUND(I)-LBOUND(I)+1 ML = M M = M/EXTENT SUB(I) = ML - M*EXTENT + LBOUND(I) END DO SUB(N) = M + LBOUND(N) END FUNCTION THIS_IMAGE END MODULE INDEX History: Presented by John Reid at the 1998 meeting of WG5 in Trollhattan.