To: J3 J3/19-122r1 From: Van Snyder & Gary Klimowicz Subject: Supporting generic programming -- updaters Date: 2019-February-12 Reference: Minutes of 1986 X3J3 Albuquerque meeting, 04-141, 18-249 Introduction ============ In "On the Criteria for Dividing Programs into Modules" (CACM December 1972), David Parnas pointed out that a change to a program more frequently has a cost proportional to the size of the program than to the extent or complexity of the change. This is because, in most programming languages, the syntax to reference or define an object depends upon its representation. Parnas proposed that access to objects ought to be done only by procedures, hiding the dependence on representation behind the uniform syntax or procedure references. This was the beginning of the practice known as "information hiding," an early development of the larger discipline of software engineering. Douglas T. Ross had observed the problem in 1970. Charles M. Geschke and James T. Mitchell observed the problem in 1975. Both proposed a better solution: The syntax to reference an object ought not to depend upon its representation. Parnas's proposal caught on because it didn't require changes to any existing languages, or converting codes to new languages such as Mesa. Parnas's approach is incomplete, because assignment doesn't have the same syntax as a subroutine reference. It is also less efficient (both in labor and computer resources) than the proposals by Ross, Geschke, and Mitchell. The applicability of generic programming is not restricted to assisting to create similar entities during program development. It can also provide significant reduction in maintenance cost. Fortran has the fortunate history that function references and array references have the same syntax. I proposed in 1986 that structure component references ought also to use a function- or array-like reference syntax, but at the time, Parnas's advice hadn't sunk in. The X3J3 consensus was "Fortran programmers want to see what their program is doing and how it is doing it," which is, of course, now recognized as unwise advice. History ======= On Monday 15 October at meeting 217, paper 18-249 was announced "no action" without plenary discussion. Meeting 217 minutes include "no use case." Use cases appeared in the referenced paper 04-141, and an accompanying PDF. The PDF file was removed from the server on Monday 15 October at meeting 217. One simple use case is presented below. An extended use case appears in the accompanying PDF, which is a significant revision of the paper that was removed from the server. There is a companion pdf.gz for this paper. This paper is a summary. Read the PDF, which is a complete proposal, cast in the form of a TS, but without edits. Proposal ======== Provide a new variety of procedure, called an "updater" in POP-2, that has the same syntax as a function reference, but that can appear in a variable-definition context. When it is invoked, a value is associated to its "transfer" or "receiver" or "acceptor" variable, in the same way that an actual argument is associated to a dummy argument that has INTENT(IN). Execution of an updater might be thought of as a "time reversed" execution of a function reference. Provide a new entity, called an "accessor" that explicitly couples procedures to reference and update objects, i.e., functions and updaters, and a new data object called an activation record. An activation record can have several instances, called instance variables. Details and examples of this proposal appear in the PDF file accompanying this paper. The 1986 proposal to X3J3 included updaters, but not accessors and instance variables. Instance variables allow a single collection of functions, updaters, and subroutines to have independent persistent states that are not represented only by their arguments, and variables accessed by host or use association. Without instance variables, functions and updaters are limited to operations on their arguments, or variables they access by host or use association. Without instance variables, to have independent persistent states, other than by argument association, it is necessary to copy functions, updaters, and subroutines to different scoping units, either physically or by using INCLUDE statements. A function or updater that is accessed by host or use association is the same entity in every scoping unit, and accesses the same scope by host association. If it is necessary to revise a program so that several similar objects that need independent persistent states are represented by functions and updaters instead of variables, it is necessary either to copy the procedures to different scoping units, one for each persistent state, provide the persistent state explicitly using additional arguments, or use instance variables. Scoping units are not allocatable, so the "copy" strategy is limited to fixed numbers of persistent states. If the persistent states were to be provided using argument association, each persistent state, a sparse matrix for example, would need to be provided as an argument in addition to, for example, subscripts. This would require all references to and definitions of each original entity to be revised. Use case outline ================ Suppose that in a program contains a sparse array represented densely, with explicit zeroes. Suppose more ambitions requirements cause the array to become so large that it needs to be represented sparsely. References to the array are of the form A(I,J). These can be provided by a function. Assignments to the array are also of the form A(I,J), in a variable-definition context, e.g., A(I,J) = 42 In order to represent the array sparsely, it becomes necessary to replace every definition with a subroutine call, e.g., call Set_A ( I, J, 42 ) This is the primary reason that modifications to a program frequently have a cost proportional to the size of the program, not to the extent or complexity of the revision. The appearance of an assignment statement cannot be preserved using defined assignment, because the subscripts cannot be passed to the defining procedure. It could in principle be done using left-hand functions, i.e., A(I,J) returns a pointer. But in A(I,J) = B one does not want to create a representation for a new element in the sparse representation for the array if an A(I,J) object does not already exist, because B might be zero. This cannot, however, be avoided, because the left-hand function is not given the value to be stored. The reason to use a sparse representation is to avoid explicitly representing elements whose values are zero. An updater would allow the assignment to remain unchanged, e.g., A(I,J) = 42 READ THE ACCOMPANYING PDF for more complete descriptions and examples! JoR Review of the Proposal ========================== JoR finds a. This proposal defines a set of mechanisms to support generic programming, but only for access and updates to entities. b. The mechanism of accessors and updaters is not sufficient (in and of itself) to solve the problem of writing programs and modules that are type-agnostic at the time the programs or modules are written (e.g., for generic containers). c. Hiding the access and update mechanisms also hides the potential performance implications of using the accessors or updaters. That is, what looks like a simple variable reference or indexed variable reference might invoke an almost unbounded amount of code complexity. d. The extended example and use cases do not demonstrate the use of the SECTION type, or motivate its need. e. Exposing the details of activation records, which it seems we do, does not seem like the proper level of abstraction for scientists and application programmers. f. The use of procedures and function references to get this capability does not seem too onerous a mechanism. g. This feels very "un-Fortran-like". A mechanism that has been implemented in so few languages does not feel like a good candidate to add to Fortran. Disposition of the Proposal =========================== JoR recommends that we do not add this item to the worklist for the Fortran 202X standardization effort. It is possible that some of the mechanisms proposed here may be useful "behind the scenes" for the ultimate proposal for generic programming. It does not seem like the sort of mechanism that we would like to expose directly to Fortran programmers themselves.