J3/02-314

Subject: Comments on Public Review Draft ISO/IEC CD 1539-1
From: Kurt W. Hirchert (Meeting 163) 08 Nov 2002

1. I do not intend to comment extensively on editorial matters, but I did notice that the procedure descriptions in section 15 look "wrong" when compared to those in sections 13 and 14. I think this is because each procedure description in 13 and 14 is a separate subsection beginning with a level 3 header, while the procedure descriptions in 15 are combined into a single subsection. It is unclear to me whether it would be better to restructure the section 15 descriptions to also be separate subsections or just mimic the typography of a level 3 header for the "headers" within 15.1.2.

2. Annex D would benefit from the addition of a cross reference listing for the BNF terms. (I believe the editor has been given the tools necessary to generate such a cross reference, so I was somewhat surprised one was not included in this draft.) Also, the title of Annex D seems "wrong" for its current contents.

3. I believe that 16.2.3 would be both easier to read and easier to maintain if it first defined what it means for two dummy arguments to be "distinguishable" and then expressed the remaining rules in terms of that concept. Note that while most of the rules can be expressed directly in terms of this concept, the rule labeled (1) also needs the related concepts of a dummy argument that is partially indistinguishable from a second (because some of its valid actual arguments would also be valid actual arguments for the second) and of a dummy argument that is completely indistinguishable from a second (because all of its valid actual arguments would also be valid actual arguments for the second).

4. There appears to be a serious problem with the rules in 16.2.3 in that they are violated by intrinsic assignment as it applies to extensible types.
The source of this problem appears to be that 12.4.1.2 allows an actual argument of an extension type (e.g., TYPE(APPLE)) to correspond with a dummy argument of its parent type (e.g., TYPE(FRUIT)). I see no reason for this allowed mismatch; if the programmer truly wanted it, the dummy argument could have been declared CLASS(FRUIT) instead of TYPE(FRUIT). To repair this problem, I suggest the following changes:

   a. Change the first paragraph of 12.4.1.2 to disallow type mismatches when the dummy argument is declared with TYPE rather than CLASS.

   b. Change the definition of "distinguishable" in 16.2.3 (see comment #3 above) to reflect this change.

   c. To facilitate a user explicitly doing what was being done implicitly before, give objects of CLASS(T) (but not those of TYPE(T)) a component named T that represents their TYPE(T) subobject. [If the dynamic type of the object is T, then the "component" is the entire object; otherwise, it is the component T that is already defined for that dynamic type.]

5. Failure to execute FINAL procedures for variables with the SAVE attribute will cause serious problems for abstractions with side-effects outside the program. Such abstractions need _all_ variables to be finalized, even when the program is about to terminate. Ideally, FINAL procedures should be defined to be executed for all variables, and it should be up to the implementation to optimize away those executions that do nothing useful when the program is about to terminate anyway. If the analysis to do this optimization is too difficult for current compiler technology, then there should be a way for the programmer to flag that a particular FINAL procedure need not be executed immediately prior to program termination, but the fundamental semantics of finalizing all variables should be available to those programmers who need it.

6. The semantics of the IMPORT statement are usable, but subtly wrong.
An interface body is intended to describe the interface of a procedure using the declarations with which it is defined. Without IMPORT, an interface body has no host association (like an external procedure). With IMPORT, the interface body should be host associated with the same host as the procedure it is describing, i.e., with the module, even when the interface body is contained in a module procedure or internal procedure nested within that module. In most cases, this will make little difference, but it will be much more difficult to make this "right" in a future revision than it would be to correct it now. In the long run, I would like the module procedure to also be allowed to specify IMPORT to get only limited host association, but such a feature seems beyond the scope of what can reasonably be done at this time.

7. In R1224, the comma before is irregular and thus is a syntax error just waiting to happen. I recommend eliminating it. Additionally, the ordering between and RESULT() has no obvious rationale and thus is also likely to be the cause of errors for the unsuspecting. I suggest allowing them in either order. In R1231, I would similarly eliminate the comma. To avoid ambiguity in fixed source form, I would move inside the brackets making the parentheses optional, so a (possibly empty) argument list is required if is present. In R1234, I would eliminate the second alternative and change the first alternative to be analogous to R1224 and R1231 as modified.

8. I agree with those people who dislike the language keyword NONKIND. I suggest SIZE as an alternative (since nonkind type parameters are ultimately used as character lengths and array bounds).

9. I similarly dislike NONOVERRIDABLE, but I have no immediate suggestion as an alternative.

10. The name C_F_POINTER does not appear to follow the naming patterns of other intrinsics. (I also do not understand why it was made a subroutine rather than a pointer-valued function.)

11.
I would like to add a constraint on the of R914 specifying that if it is an initialization expression, its value shall contain a valid format. (In the general case, a compiler must defer this checking until execution, but in the commonly-occurring special case of a constant format, it would be nice to be notified of errors in the format during compilation.)

12. 5.3 contains a restriction on the relative order of the IMPLICIT statement and any uses of the implicit typing rule it defines. I would like to see this restriction converted to a constraint (so programmers could receive notification when they get this order wrong). (5.3 may need some clarification on exactly where the implicit typing occurs.)

13. The addition of allocatable components theoretically allows very complex recursive data structures to be built, but in practice this facility can be difficult to use at anything more than a trivial level because there is no way to adjust such a data structure without totally rebuilding it. A facility analogous to pointer assignment is needed for allocatable objects. I suggest the addition of the MOVE_ALLOC intrinsic proposed in J3/01-161.

14. The intrinsic random number generator is less portable than had been originally hoped because of inconsistent interpretations of how to implement its interface. I believe the standard needs to provide better guidance on this point. Ideally, I would like such guidance to be normative, but at this point I would happily settle for non-normative guidance. Editorially, I would provide guidance on all forms of reference to RANDOM_NUMBER and RANDOM_SEED, but the most critical point is that CALL RANDOM_SEED was intended to put the random number generator in an unpredictable state, without the programmer having to deal with questions about the size of a seed, whether the implementation has a real-time clock, whether repeated values in the seed degrade the quality of the random numbers generated, etc.

15.
I am sympathetic to the suggestion for a COMPLEX intrinsic that takes its output kind from the kind of its input argument(s). I note that there is a similar infelicity in taking the real part of a complex value (but not in taking the imaginary part).

16. I have seen suggestions that Fortran should have a SIZEOF intrinsic. Since the output of such a function has no meaning in terms of Fortran semantic concepts, I suggest that if such an intrinsic is added, it should be a part of the C interoperability facility.

17. Given the rather major extensions we have made in things like statement length, I do not understand the reluctance to increase the maximum array rank to something larger than 7.

18. I believe the time is long overdue for "procedureness" to be a distinguishing factor in the generic rules (see #3 above). (If we don't do that, we at least need to look at whether the type distinguishing rules are appropriate for procedures. Also, we should do whatever interpretation is necessary to say that subroutines do not have the same type as functions.)

19. In the future, I would like to see a more complete declaration vocabulary for Fortran, so attributes can be declared directly rather than inferred from the lack of a contrary attribute. For the most part, such a project is too large for the current revision (unless there is to be a significant schedule change), but it might be worthwhile to add a DISCARD or NOSAVE attribute to negate the SAVE attribute. I can see several places where this might be useful:

   a. Most programmers would find it far more convenient if SAVE were the default for variables in modules (and possibly for variables in COMMON), but there are parallel performance opportunities inherent in the current rules. Allowing an explicit DISCARD would allow programmers who need that performance to request it without burdening the average programmer with the need to SAVE shared variables.

   b.
It is too late to change the DATA implies SAVE rule, but an explicit DISCARD could override it, so INTEGER, DISCARD :: NUM = 0 could do what many programmers expect INTEGER :: NUM = 0 to do.

20. Late in the process, the concept of deferred type-bound procedure bindings was removed from the draft. There is some talk of trying to reinstate it. In my opinion, a. the ability to require that a procedure binding be explicitly overridden in extension types was a positive feature, but b. the ability to create a procedure "slot" but leave it empty was definitely a misfeature. It created syntactic inconsistencies and many special cases in the execution rules. Thus, I would support the reinstatement of the former but not the latter. (If someone comes up with a clever way to make "prototype" procedures more clearly something that should not be called, I might be open to that.)

21. If T is a type with a single public component C, the notation Tvar%C can be used to access a variable of type T as a variable of the type of C, and the notation T(Cval) can be used to convert a value of the type of C to type T, but there is no notation that allows one to use a variable of the type of C as a variable of type T. If such a notation (e.g., T(/Cvar/)) were available, it would _significantly_ extend the applicability of the features for polymorphic programming. (The illustrative notation in this comment appears to be adequate for this particular application, but it may need to be altered to support other things people want to do, like viewing a complex array as a real array or viewing a character array as a character string.)

22. The absence of INITIAL procedures analogous to FINAL procedures is a significant hole in the facilities being provided.
The fact that in some cases, default initialization or "user-written constructors" can be used as a replacement for INITIAL procedures should not be misconstrued as the latter facilities being an adequate replacement for INITIAL procedures in the general case. (I don't expect INITIAL procedures to be added in the current revision, but don't expect to get away with leaving them out of the next revision.)

23. Allowing generic procedures to use the same name as a type provides a way to do "user written structure constructors" for non-parameterized derived types, but it does not provide the appropriate syntax for parameterized derived types. I believe it would have been better to provide a that causes a procedure to implement the constructor syntax. (This is probably too large a change to make this late in the process, but there will be no way to "correct" this in the next revision.)

24. The C interoperability promises to have great value, but the way it was done is unfortunate. Making entities in Fortran be represented like C entities and have C semantics allows Fortran procedures and interfaces to masquerade as C procedures and interfaces, but it effectively imports large parts of C into Fortran. (The recent controversy over enums is illustrative of the kind of problems this importation creates.) A better way would have been to provide for a bridge that copies Fortran values to C entities and vice versa. In cases where the representations are the same, this copying could still be optimized away, so performance would not suffer, but this conceptual copying would eliminate the requirement to provide Fortran types corresponding to all C types. (Unless something drastic happens to delay the completion of this revision, I see no way the approach could be changed at this late date, but I still felt I must express my dismay at the implications of the current approach.)

25.
Although some improvements in the generic rules can be made by elaborating the definition of "distinguishable" (see #3 above), in the long run, these rules need extension in other ways. In particular, they need to be extended to handle preferences, so, for example, a pointer dummy argument could be distinguished from a nonpointer dummy argument. (Such an extension is not especially difficult, but because it would require an extension of the structure of the generic rules, it is probably beyond the scope of what can be addressed in this revision.)

26. In current implementations, pointers tend to destroy performance. I believe that in many cases this situation could be improved by providing a mechanism to limit the targets to which a pointer can point (and conversely to limit the pointers which can point to a particular target). (Probably too large for this revision.)

27. Another common source of complaints about performance relates to mismatches in the expected organization of arrays. Pointers are allowed to point at discontiguous array sections, and thus generate overly general instructions when used to access allocated arrays (that are generally contiguous). Conversely, explicit-shape dummy arguments expect to be associated with contiguous storage, so if the actual argument is an array section or other discontiguous representation, it must be copied to contiguous storage before executing the procedure and copied back afterwards. I would like to add an attribute to pointers and dummy arguments to declare which array organizations they are to allow or be optimized for.
Tentatively, I would suggest the following alternatives:

   * CONSECUTIVE - contiguous
   * UNIFORM - possibly discontiguous but uniformly spaced in memory (e.g., the result of extracting a component of a CONSECUTIVE array)
   * LINEAR - uniformly spaced along each dimension, but not necessarily overall (e.g., a general array section)
   * GENERAL - arbitrary mapping (e.g., vector-valued subscripts or following pointer components)

This could be combined with #25 so, for example, it would be preferred to associate a contiguous actual argument with a CONSECUTIVE dummy argument rather than a LINEAR one, and a generic could include both a CONSECUTIVE version of a routine to give highest performance in the common case and a LINEAR (or even GENERAL) version to avoid the performance hit of having to copy a more general argument into contiguous storage. (Again, this is probably too big to do in this revision, but I like to offer a couple of "bigger" suggestions in case there is an unexpected schedule delay that would make them practical to consider.)

- end -