To: J3 J3/22-120r4 From: Tom Clune & generics subgroup Subject: Generics formal requirements Date: 2022-March-08 Reference: 21-187, 21-144r4, 22-123r1, 03-264 1. Introduction =============== At meeting 224, generics subgroup introduced a use cases paper that was sufficient to drive solutions that would accommodate a broader set of use cases that were endorsed by subgroup. Here we present the formal requirements that have been derived from these use cases. This version of the paper deletes several obsolete statements that were not noticed after separate development of the corresponding specs paper. It also attempts to address some of the concerns raised during plenary discussion. 2. Formal requirements ====================== A. Named templates shall be provided. A TEMPLATE is a parameterized scoping unit which can contain variables, type definitions, procedure definitions, etc. Templates can be 'instantiated' to yield concrete (i.e., non-parameterized) instances of entities defined within the template. Aside: The TEMPLATE approach for providing generic programming facilities is similar to previously proposed PARAMETERIZED MODULES (PM) (paper 03-264). The key differences are: 1. A template can be declared within a program unit and then instantiated within that same program unit. (Not within itself!) This is _not_ a formal requirement, but it can be useful to use locally defined specialized templates within a program unit. 2. The template approach sidesteps some thorny issues that would need to be resolved for PM. In particular, SUBMODULEs, which were specifically disallowed in paper 03-264. Basically, instantiation of a parameterized module would necessitate instantiation of the corresponding submodules, but there is (currently) no mechanism to determine this set of submodules. Likely, a solution could be found for this, but would itself be an extension. An example of a simple TEMPLATE with suggestive syntax is given below. The TEMPLATE name is "TMPL" and has one (dummy) TEMPLATE parameter "U" which is a type. The TEMPLATE provides a single function "GET_ITH" whose first argument is of type U. TEMPLATE TMPL(U) TYPE :: U END TYPE CONTAINS PURE FUNCTION GET_ITH(arr, i) result(ith) TYPE(U) :: ith TYPE(U), INTENT(IN) :: arr(:) INTEGER, INTENT(IN) :: i ith = arr(i) END FUNCTION END TEMPLATE TMPL B. Named restrictions shall be provided. A RESTRICTION is a parameterized construct that provides a mechanism for expressing necessary relationships among a set of TEMPLATE dummy parameters. Such relationships serve as a filter on allowed actual parameters to a TEMPLATE (see "weak concepts" below), and also to facilitate verification that a template will produce valid code for sets of actual parameters that satisfy the restriction. (See "strong concepts" below.) Notionally, a RESTRICTION is a construct which specifies the procedures and operators that must be supported for legal instantiations of a TEMPLATE. E.g., one may require that types T and U can be added to produce objects of type V. Suggestive syntax for this is: RESTRICTION :: BINARYOP(F,T,U,V) TYPE T; END TYPE TYPE U; END TYPE TYPE V; END TYPE INTERFACE FUNCTION F(x,y) RESULT(z) TYPE(T), INTENT(IN) :: x TYPE(U), INTENT(IN) :: y TYPE(V) :: z END FUNCTION END INTERFACE END RESTRICTION Aside: A RESTRICTION is somewhat similar to an abstract interface in that it contains a group of otherwise unrelated procedure interfaces, but different in that (1) it has a name, which facilitates reuse in multiple contexts, and (2) is parameterized. C1. A template shall have zero or more template dummy parameters. C2. A template can define any of the following entities: - derived type - procedure - interface - variable - another template - constants - enumeration types and their enumerators Note: Allowing PUBLIC enumeration types/enumerators is probably problematic. D. A template dummy parameter must be one of the following: - type (see straw votes at end of paper) - value - procedure - operator A suggestive example involving all categories of template dummy parameters: TEMPLATE TMPL2(T,C,sub,OPERATOR(+)) TYPE :: T END TYPE COMPLEX :: C INTERFACE SUBROUTINE sub(x) TYPE(T), INTENT(INOUT) :: x END SUBROUTINE sub FUNCTION OPERATOR(+)(x,y) RESULT(z) TYPE(T), INTENT(IN) :: x, y TYPE(T) :: Z END FUNCTION END INTERFACE END TEMPLATE E. The INSTANTIATION statement shall be provided. The INSTANTIATE statement specifies the name of a TEMPLATE and a list of TEMPLATE actual parameters corresponding to the TEMPLATE's dummy parameters. E1. A type-spec must be provided for a type name template dummy parameter. E2. A const-expr (of the correct type) must be provided for a value template dummy parameter. E3. A procedure or a generic-spec must be provided for a procedure template dummy parameter E4. An operator must be provided for an operator template dummy parameter. Suggestive syntax: INSTANTIATE TMPL(INTEGER(KIND=INT32)) ... INTEGER(KIND=INT32) :: I, ARRAY(10) ... I = GET_ITH(ARRAY, 3) E5. A generic-spec may be passed as an actual template parameter corresponding to a dummy procedure parameter, because generic resolution can be done during instantiation. Likewise for generic operators. Rationale: Client code often does not have access to specific names for procedures and operators. And the ability to use generic entities is far too valuable too disallow in this context. Intrinsic operators are a particularly important case of this. F. Multiple instantiations of a given TEMPLATE with identical template actual parameters define the same instance within a program. Consequences: 1. A derived type defined in a template is the _same_ type across all instantiations that have the identical template actual parameters. 2. An enumeration and its associated enumerators are the same across all instantiations that have the identical template actual parameters. Multiple instantiations of a given template with _any_ different actual parameters do _not_ define the same instance. In particular any derived types, enumerators etc. defined in these instances are distinct. NOTE: While BIND(C) use is not prohibited within a TEMPLATE, it may be problematic as it can associate distinct entities with the same global identifier. G. A named RESTRICTION shall have a set of zero or more dummy restriction parameters. NOTE: The zero-parameter case here is not useful. Possibly this should become "one or more parameters". H. A RESTRICTION dummy parameter shall be one of the same categories as for TEMPLATE dummy parameters. (type, value, procedure, operator) Note: "value" probably does not make sense in this context. But possibly does no harm? I. A named RESTRICTION shall specify a set of interfaces that hold among its dummy parameters: TKR of procedure/operator arguments and function results. K. The REQUIRES statement shall be provided. A REQUIRES statement specifies the name of a RESTRICTION and a set of actual parameters corresponding to the RESTRICTION parameters. A TEMPLATE uses REQUIRES statements to indicate relations among its own parameters that are necessary for the TEMPLATE to be valid. Legal instantiations must provide actual parameters that satisfy all of a template's REQUIRES statements ("weak concepts") and also may not use any other procedures/operators involving the template parameters that are types ("strong concepts"). Note: A more precise statement of strong concepts will be provided in the specs paper to come. Also, weak and strong concepts are called out as separate requirements below. A suggestive example of a template with 2 REQUIRES: RESTRICTION :: MAGMA(U, OPERATOR(+)) TYPE :: U END TYPE :: U INTERFACE SUBROUTINE OPERATOR(+)(x,y) RESULT(z) TYPE(T), INTENT(IN) :: x, y TYPE(T) :: Z END SUBROUTINE END INTERFACE END RESTRICTION TEMPLATE :: TMPL2(U, V, OPERATOR(+), OPERATOR(*)) REQUIRES MAGMA(U, OPERATOR(+)) REQUIRES MAGMA(V, OPERATOR(*)) ... CONTAINS SUBROUTINE DO_SOMETHING(x, y) TYPE(U) :: x TYPE(V) :: y x = x + x*x ! legal y = y + y*y ! legal x = x + y ! illegal - no supporting requirement x = 2*x ! illegal integer multiply not in requirements END SUBROUTINE DO_SOMETHING END TEMPLATE TMPL2 L. A TEMPLATE instantiation must satisfy all imposed restrictions. Note: This is the notion of a "weak concept" which requires that invalid template dummy parameters be diagnosed as such rather than being diagnosed as syntax or linkage errors in the instantiated code. M. A TEMPLATE may not reference any procedures or operators involving dummy type parameters except those that have been specified in a REQUIRES statement unless those. Rationale: This is the notion of a "strong concept" which allows template developers to ensure that a template does not have any _implicit_ restrictions that might not be detected for some unanticipated template parameters. Straw votes: =========== Discussion: ----------- An important issue that needs to be resolved is that of what aspects of a type specification can appear when declaring template dummy type parameters. Plenary discussion suggested that all of the following should be permitted in such declarations: - kind/length typeparameters - type-bound procedures - data components Allowing for such is indeed relatively straightforward, but there are some consequences that suggest disallowing these at least in the initial introduction of generic programming features. Rationale for _disallowing_ the above features in templates. ------------------------------------------------------------ Templates that rely on type bound procedures (TBPs) generally have limited applicability. Enabling the implementation of such templates encourages bad practices, and backward compatibility concerns will prevent reversing course at a later stage. Consider the following template that implements a find_root() procedure for a category of types that provide a TBP eval(). TEMPLATE find_root_tmpl(T) TYPE :: T CONTAINS PROCEDURE :: eval END TYPE T INTERFACE REAL FUNCTION eval(this, x) CLASS(T), INTENT(INOUT) :: this REAL, INTENT(IN) :: x END FUNCTION eval END INTERFACE CONTAINS REAL FUNCTION find_root(functor) TYPE(T), INTENT(INOUT) :: functor ... f0 = functor%eval(x) ... END FUNCTION find_root END TEMPLATE find_root_tmpl Initially this example appears to present a strong argument in favor of supporting TBPs in this context. However, a major concern with this approach is that it relies on the hard-coded name, "eval", and will not work directly with types that instead name their type-bound procedure "evaluate", "execute", "run", etc. As written, this template also will not work directly with ordinary (i.e., non TBP) functions that operate on type T. In each such case the user must instead implement a wrapper _TYPE_ (not just a wrapper _procedure_) with the designated name as a type-bound procedure in order to apply this template. Now, consider instead the template below that provides similar capabilities via a second procedure parameter: TEMPLATE find_root_tmpl(T, eval) TYPE :: T END TYPE T INTERFACE REAL FUNCTION eval(this, x) TYPE(T), INTENT(INOUT) :: this REAL, INTENT(IN) :: x END FUNCTION eval END INTERFACE CONTAINS REAL FUNCTION find_root(obj) TYPE(T), INTENT(INOUT) :: func ... f0 = eval(obj, x) ... END FUNCTION find_root END TEMPLATE find_root_tmpl This template can work with _any_ actual procedure name, but admittedly still requires a wrapper function to use with implementations based on TBPs. I.e., this approach addresses a much wider set of use cases. The distinction highlighted above is even worse when one considers templates using multiple TPBs. The applicability of the template reduces _exponentially_ with the number of TPBs as particular naming conventions of the implementor are exposed. Users would generally be required to create custom wrapper classes for _each_ such template that they intend to use. In contrast, a user with a type containing type-bound procedures need only wrap their methods _once_, and can then work with any (relevant) templates based on procedure parameters. That are written using procedure template parameters. Now the disparity between the two approaches can be seen to be so extreme that we should give at least some thought to disallowing what should be considered bad practice in general. One might naively expect that type-bound operators might be a bit safer with regard to the arguments above. However, there are a number of examples that Magne Haveraaen, our outside generics consultant, has provided in which one needs to wrap operators in surprising ways such as: wrapping "min" as "+" and "+" as "*", etc. (GraphBlas with "tropical smearing"). Perhaps a more mundane, but familiar example, is that one might want to sort using ">" when a template relies on a type-bound operator "<". Summary: Generics promotes an alternative, non-inheritance based, structuring of software. Encouraging a mixture will not be productive. A key point is that inheritance binds the names (and inheritance structure) of the methods in the required API, while generics does not. Rationales for _allowing_ the above features to be used: -------------------------------------------------------- 1. As a general rule, we try to allow various Fortran features to work together. The large prohibition suggested here is seemingly severe. 2. Significant use cases involving legacy code will be hard to adapt. 3. Not all templates will be developed with an external community in mind. So even if such templates will be limited in applicability, it might be more than sufficient for the intended purpose. Straw vote 1: Should declarations of TEMPLATE dummy type parameters be permitted to specify type-bound procedures? (YES - NO - UNDECIDED) Straw vote 2: Should declarations of TEMPLATE dummy type parameters be permitted to specify data components? (YES - NO - UNDECIDED) Straw vote 3: Should declarations of TEMPLATE dummy type parameters be permitted to specify kind/length type parameters? (YES - NO - UNDECIDED) If any of straw votes 1-3 are YES: Straw vote 4: Should subgroup pursue mechanisms to allow some form of name mapping between data components, type-bound procedures, and/or kind/length parameters of an actual template parameter and those of the dummy template parameter? (YES - NO -UNDECIDED) If any of the straw votes 1-3 are NO: There will be an amended paper that includes some/all of the following requirement: N. Declarations of dummy template parameters that are types may not specify: - type-bound procedures (pending straw vote 1) - data components (pending straw vote 2) - kind/len type parameters (pending straw vote 3) ===END===