To: J3 J3/19-167 From: Van Snyder Subject: Preliminary requirements for support for containers Date: 2019-June-14 References: 04-153 04-383r1 19-120 19-121 19-122 19-123 Introduction ============ At meeting 216, there was agreement to develop support for containers, but not to provide any specific containers such as lists, stacks, queues, trees.... Containers need =============== o A representation for their persistent state, o Methods to initialize their persistent state, e.g.: - Initial size (e.g. for a stack or queue), - Internal parameters (e.g., dimensions of a sparse matrix), o Methods to enter or update elements, o Methods to retrieve a specified element or a pointer associated with one, o Methods to traverse the elements or a subset thereof, o Methods to finalize their persistent state, o Ability to instantiate a framework for a representation for a persistent state, and associated methods, for specified type and kind, and o Ability to declare or allocate more than one container object without physically copying program text, or effectively copying it using a device such as a parameterized module, macro, template or INCLUDE statement. Physically or effectively copying program text is not a run-time action. The ability to allocate a container, or an array of containers, using an ALLOCATE statement, is important. Representation of persistent state ================================== The persistent internal state of a container could be represented by o Save variables within a subprogram. This would require all methods to be in one subprogram, with either argument values or ENTRY statements to specify which method to apply. This allows only one persistent state, or array of similar persistent states (all the same type and kind) per instance. o Save variable within a modulle. This allows only one persistent state, or array of similar persistent states (all the same type and kind) per instance. o Object of derived type with type-bound methods. This requires to invoke methods using type-bound procedure syntax. If a container replaces an abstraction originally represented by an array or arrays, this requires substantial labor to modify existing programs. o A new variety of entity that combines a representation of the state and the methods, but does not require to invoke methods using type-bound procedure syntax, as described in 19-122 and 19-123. This - Is similar to a derived type definition in that it declares the respresentation of data entities that represent the persistent internal state; - Is similar to a derived type definition in that it declares methods used to initialize and manipulate the persistent internal state; - Is similar to a derived type definition in that it allows more than one container object without physically copying program text or instantiating it using a device such as a parameterized module, macro, template, or INCLUDE statement; - is different from a derived type in that . invoking procedures to initialize and manipulate the persistent internal state does not require to use type-bound procedure invocation syntax; this would allow to replace an intrinsic representation such as an array with a persistent state representation and non-intrinsic methods, without requiring changes to the syntax to insert or examine elements of the container, . procedures to initialize and manipulate the persistent internal state are defined within the entity, and access data declarations within the entity by host association, without the need to name the entity in references, and . entities that represent the persistent internal state are not accessible outwith the container; if it were a derived type, the entities that represent the persistent internal state would be private components. Methods to initialize persistent state ====================================== The persistent internal state of a container could be initialized by o Declarations, if the initial size and internal parameters (e.g. dimensions of a sparse matrix) can be known - when the container is developed, - when it is instantiated if it is defined by a parameterized module, macro, or template, - when its scope is entered (e.g. the persistent state is represented as an automatic variable) or - when it is allocated. o ENTRY points in a subprogram in which the persistent internal state is represented by local saved variables. o Module procedures in the module in which the persistent internal state is represented by local saved variables. o Type bound procedures if the persistent internal state is represented by an object of derived type. o Procedures associated with a new variety of entity that combines a representation of the state and the methods, as described in 19-122 and 19-123. Methods to enter elements into the persistent state =================================================== Elements could be updated or entered into the persistent internal state by o Assignment to a data object. This requires to expose the details of the representation of the persistent internal state. o Subroutine calls. If a container was originally represented by an intrinsic entity such as an array, and elements were entered by assignment statements, and it becomes necessary to represent it using more elaborate data structures, this requires to find and replace every assignment of an element of the persistent state by a subroutine call. This is one reason that changes to programs frequently have costs proportional to the size of the program rather than to the magnitude of the change in representation of one container. o Stand-alone updater procedures of the variety associated with the representation described in 19-122 and 19-123. Updater procedures have the advantage that the syntax is the same as for an assignment statement, and therefore it is not necessary to find and replace every assignment of an element of the persistent state by a subroutine call. It has the disadvantage that a stand-alone updater procedure does not easily provide for more than one container object without physically copying or instantiating program text using a device such as a parameterized module, macro, template or INCLUDE statement. o Type-bound updater procedures of the kind associated with the representation described in 19-122 and 19-123. This has the advantage that an assignment statement using a type-bound updater has the same syntax as an assignment statement to a component, provided it is permitted to invoke an updater that has no (non optional) arguments without using empty parentheses. o Updater procedures associated with a new variety of entity that combines a representation of the state and the methods, as described in 19-122 and 19-123. This has the advantage that the syntax is the same as for an assignment statement, and therefore does not require to find and replace every assignment of an element of the persistent state by a subroutine call. It also has the advantage that there can be several containers, without requiring physically to copy, or instantiate program text, using a device such as a parameterized module, macro, template or INCLUDE statement. In particular, a container or array of containers can be instantiated by an ALLOCATE statement. Methods to retrieve a specified element ======================================= A specified element of a container, or a pointer associated with it, can be retrieved by o Subroutine calls. This is required if the internal state is represented by save variables within a subprogram, unless initialization, and entering elements, are performed using functions that have side effects. If a container had been represented by an intrinsic data object such as an array and it becomes necessary to replace it with a more elaborate data structure and methods, this requires to find and replace every reference with a subroutine call. This is one reason that changes to programs frequently have costs proportional to the size of the program rather than to the magnitude of the change in representation of one container. o References to stand-alone functions. If a container had been represented by an intrinsic data object such as an array, and a revised representation consists either of local save variables within a subprogram or module, this has the advantage that it does not require to find and replace references. It has the disadvantage that a stand-alone function does not easily provide for more than one container object without physically copying or instantiating program text using a device such as a parameterized module, macro, template or INCLUDE statement. A subset of elements cannot be accessed using array section notation unless the SECTION type described in 19-122 and 19-123 is provided. All elements (as opposed to the container itself) cannot be retrieved without changes to the program because a function cannot be referenced without parentheses. o References to type-bound functions. If a container had been represented by an intrinsic data object such as an array, and a revised representation consists either of local save variables within a subprogram or module, this has the disadvantage that it requires to find and replace references. If a container had been represented by a derived type and type-bound procedures, and some facet of a revised container had been represented by a scalar component, but in the revision it is required to be represented by procedures, this has the disadvantage that every reference to that facet must be found and changed to a function reference. The change is trivial because it consists of adding empty parentheses. This could easily be ameliorated by allowing to reference type-bound procedures that have no (non-optional) arguments without using parentheses. o Reference to functions associated with a new variety of entity that combines a representation of the state and the methods, as described in 19-122 and 19-123. This has the advantage that it does not require to find and replace references to array elements. It would still be necessary to find and replace scalar references or whole-array references because parentheses are necessary to invoke a function. Methods to traverse the elements or a subset thereof ==================================================== The elements of a container, or a subset of the elements, could be traversed using o Ordinary looping mechanisms. This requires: - To expose the details of the representation. - To provide specialized procedures for each operation to be performed by a traversal, even if some operations are performed in only one place. - To provide a generalized procedure to carry out a traversal, which procedure has one or more procedure arguments to carry out operations on traversed elements. This can cause difficulty to provide additional data to the procedure that carries out an operation, especially when revising "legacy" programs. - To provide a generalized procedure to carry out a traversal, which procedure uses "reverse communication" to return to the invoking process to carry out operations on traversed elements, after which the invoking process invokes the traversal procedure to continue the traversal. This in turn requires the traversal procedure to maintain an internal state of the traversal process, which in turn limits the applicability to one container object. It is also inherently not thread safe. - To provide coroutines. This is similar to "reverse communication" in that a coroutine temporarily suspends its execution by transferring control to the invoking process, instead of returning to the invoking process. After carrying out the desired operation, the invoking process resumes the coroutine instead of invoking it anew. This has the important distinctions to "reverse communication" that . the internal state of the traversal is automatically represented by the ordinary control flow of the coroutine, . local variables of the coroutine are not destroyed when it is suspended and not re-created when it is resumed, which reduces execution cost, . a coroutine as described in 19-120 and 19-121 can traverse several containers, even simultaneously, without duplicating the program unit text, and . a coroutine is inherently thread safe. o Iterators and ITERATE constructs, as described in 19-120 and 19-121. Iterators and ITERATE constructs have the advantages of coroutines, a more concise syntax, and clear indication and scope of iteration. Methods to finalize their persistent state ========================================== The persistent internal state of a container can be finalized by o Subroutine or function references to ENTRY points if the container's data are represented by local save variables within a subprogram. This has the disadvantage that a stand-alone subprogram does not easily provide for more than one container object without physically copying or instantiating program text using a device such as a parameterized module, macro, template or INCLUDE statement. o References to module subroutines or functions. This also has the disadvantage that it does not easily provide for more than one container object without physically copying or instantiating program text using a device such as a parameterized module, macro, template or INCLUDE statement. o Reference to type-bound procedures. o Final subroutines bound to a derived type. o Final subroutines associated with a new variety of entity that combines a representation of the state and the methods, as described in 19-122 and 19-123. Ability to instantiate a framework ================================== Instantiating a framework for a container could be done by: o Physically copying the text of the program unit that implements the container's representation and methods. This has the disadvantage that it is not possible to create similar containers for different type or kind. o Effectively copying the text of the program unit that implements the container's representation and methods, using INCLUDE statements. Although useful, and not difficult to instantiate a container for a particular type with a different kind, it is tedious to instantiate a container using different types. This has the disadvantage that such instantiations are not allocatable, and therefore a program can have only a fixed number of them. o A new device such as - parameterized modules - macros, or - templates. o Declaring an object of a new entity such as described in 19-122 and 19-123 that allows to combine the declarations of a container's internal state and methods. This can, of course, be combined with a new device such as - parameterized modules - macros, or - templates.