J3/06-135 Date: 05-February-2006 To: J3 From: Bob Numrich Subject: Ragged allocatable/pointer co-arrays References: Feature UK-001, J3/05-208, J3/05-272r2, J3/06-122, J3/06-129 The proposal (J3/06-129) to outlaw co-array derived types with allocatable or pointer components is ill-advised. These structures are in fact the most important, the most powerful, and the most useful features of the entire co-array model (Numrich, Parallel Computing, 31:588-607, 2005). The proposal for ragged allocatable and pointer co-arrays (J3/06-129) is also ill-advised. It violates the fundamental principles of the co-array model. The first principle is simplicity. The underlying philosophy of the co-array programming model is that it is simple to understand and simple to implement. It is designed so that existing compilers and run-time systems can implement it with only minimal new technology. New types of objects are not added to the language unless there is a compelling reason for them. A ragged array is a new type of allocatable object. They provide no functionality that does not already exist. The second principle is that the rules for co-dimensions are the same as the rules for normal dimesnions unless there is a compelling reason to change them. The ALLOCATE statement for these objects violates this principle because it requires the co-dimensions to be specified in the declaration rather than being defined dynamically when the allocate statement executes. This has the major disadvantage that the program is hard-wired to a specific co-shape and must be recompiled to change the co-shape for each run when the total number of images changes. The third principle is that the compiler knows how to locate co-arrays from purely local information. A ragged array most likely will be placed on the local heap and the dope vector describing the array will be stored somewhere in local memory. When the array is used with co-array syntax, the compiler will have to retrieve the remote dope vector from another image. How does the compiler keep track of where these dope vectors are located using purely local information? It would need to build a large table for every ragged array and there would have to be a global update of all these tables, on all images, every time any image allocates or deallocates one of these ragged arrays. The ALLOCATE statement becomes a complicated statement with global communication required across images. But the proposal does not even require an implicit synchronization when one of these objects is allocated or when a pointer is associated with local objects. Might the local assignment of a ragged pointer generate global communication across images? Does such an assignment need to be a segment boundary statement? Furthermore, suppose one of these ragged arrays is passed to a subroutine where it is used as a co-array. Across the subroutine call, all information that this is a ragged array is lost. How would the compiler know that it needs to read up the remote dope vector? In fact, the dummy argument doesn't even have the same name. Are we suggesting a whole new calling sequence for ragged arrays? In addition, the programmer must keep track of lower bounds and upperbounds on remote images. The LBOUND and UBOUND intrinisic functions will need to access remote dope vectors to get this information. Using this information will be error-prone and bugs wiil be difficult to find. The proposal would allow assignment of a ragged pointer to a local target. This destroys the whole logic of the proposal. Unless the implementation knows how to handle references to local heaps, a reference to this local target from another image will fail. But this is the problem that the proposal was supposed to solve. The memory for that local target has not been registered with the communication protocol. Even if the local pointer assignment triggers global communication to update tables, it will not work. Even if the compiler reads up the local dope vector, it will not work. It has all the same "problems" the proposal was trying to solve. This proposal seems to be asking the language to make up for poor hardware design and poor communication protocols. Rather than indulging their inability to design, or at least emulate, a global address space, we should pressure them to learn how to handle the co-array model. The situation is quite different from using allocatable components of co-array derived types. Co-arrays of derived type follow all the design principles of the co-array model. These objects are the same on all images, and the compiler always knows how to find them from puerly local information. The fact that these are co-arrays does not get lost across procedure calls because they are always passed as co-array structures. The compiler knows it needs to read up the dope vector for the component because it is a component of a co-array structure. It knows how to find the dope vector from purely local information because it knows how to find the co-array structure on any image and the dope vector for each component will be located in the same place on each image. If just the component of the co-array structure is passed to a procedure, it can only be used as a local variable. "Ragged" data structures are important. Co-array structures with allocatable components are the logical way to define them within the logical design of Fortran 90. What is "awkward" about using the normal Fortran syntax for type components? The syntax RAGGED_ALLOC[1]%ARRAY(3) is no more awkward than UNRAGGED_ALLOC(1)%ARRAY(3). Is the proposal suggesting that we change this syntax for references to components of an array of derived types even for those that are not co-arrays? Furthermore, derived type co-array structures encourage the use of object-oriented techniaues. An allocated array with the name RaggedArray may or may not actually be a ragged array. A derived type with the name RaggedArray, on the other hand, can be guaranteed to be a ragged array by designing a constructor that creates the data components correctly and records information about what "ragged" means. The derived type can contain as much or as little information as one wishes including the relationship between RaggedArrays across images and methods that perform communication among RaggedArrays across images. There is a very simple, straightforward solution to the problem of supporting allocatables that may or may not be associated with a co-array. Just register ALL memory allocation with the communication protocol if you have to. With a little thought, there are probably better solutions. Let's get the cart before the horse. Fortran was not designed to support Seymour's vector architecture. Seymour designed his machine to support the Fortran. I dare say, were he still with us, his parallel machine would support a global address space and would be a perfect machine for Co-Array Fortran.