J3/02-314

Subject: Comments on Public Review Draft ISO/IEC CD 1539-1
From: Kurt W. Hirchert                  (Meeting 163) 08 Nov 2002

1.  I do not intend to comment extensively on editorial matters,
    but I did notice that the procedure descriptions in section 15
    look "wrong" when compared to those in sections 13 and 14.  I
    think this is because each procedure description in 13 and 14
    is a separate subsection beginning with a level 3 header, while
    the procedure descriptions in 15 are combined into a single
    subsection.  It is unclear to me whether it would be better to
    restructure the section 15 descriptions to also be separate
    subsections or just mimic the typography of a level 3 header
    for the "headers" within 15.1.2.

2.  Annex D would benefit from the addition of a cross reference
    listings for the BNF terms.  (I believe the editor has been
    given the tools necessary to generate such a cross reference,
    so I was somewhat surprised one was not included in this
    draft.)

    Also, the title of Annex D seems "wrong" for its current
    contents.

3.  I believe that 16.2.3 would be both easier to read and easier
    to maintain if it first defined what it means for two dummy
    arguments to be "distinguishable" and then expressed the
    remaining rules in terms of that concept.

    Note that while most of the rules can be expressed directly
    in terms of this concept, the rule labeled (1) also needs
    the related concepts of a dummy argument that is partially
    indistinguishable from a second (because some of its valid
    actual arguments would also be valid actual arguments for the
    second) and of a dummy argument that is completely
    indistinguishable from a second (because all of its valid
    actual arguments would also be valid actual arguments for the
    second).

4.  There appears to be a serious problem with the rules in 16.2.3
    in that they are violated by intrinsic assignment as it applies
    to extensible types.

    The source of this problem appears to be that 12.4.1.2 allows
    an actual argument of an extension type (e.g., TYPE(APPLE)) to
    correspond with a dummy argument of its parent type (e.g.,
    TYPE(FRUIT)).  I see no reason for this allowed mismatch; if
    the programmer truly wanted it, the dummy argument could have
    been declared CLASS(FRUIT) instead of TYPE(FRUIT).

    To repair this problem, I suggest the following changes:

    a.  Change the first paragraph of 12.4.1.2 to disallow type
        mismatches when the dummy argument is declared with TYPE
	rather than CLASS.

    b.  Change the definition of "distinguishable" in 16.2.3 (see
        comment #3 above) to reflect this change.

    c.  To facilitate a user explicitly doing what was being done
        implicitly before, give objects of CLASS(T) (but not those
	of TYPE(T)) a component named T that represents its
	TYPE(T) subobject.  [If the dynamic type of the object is
	T, then the "component" is the entire object; otherwise,
	it is the component T that is already defined for that
	dynamic type.]

5.  Failure to execute FINAL procedures for variables with the
    SAVE attribute will cause serious problems for abstractions
    with side-effects outside the program.  Such abstractions
    need _all_ variables to be finalized, even when the program
    is about to terminate.

    Ideally, FINAL procedures should be defined to be executed for
    all variables, and it should be up to the implementation to
    optimize away those executions that do nothing useful when the
    program is about to terminate anyway.  If the analysis to do
    this optimization is too difficult for current compiler
    technology, then there should be a way for the programmer to
    flag that a particular FINAL procedure need not be executed
    immediately prior to program termination, but the fundamental
    semantics of finalizing all variables should be available to
    those programmers who need it.

6.  The semantics of the IMPORT statement are useable, but subtly
    wrong.  An interface body is intended to describe the
    interface of a procedure using the declarations with which it
    is defined.  Without IMPORT, an interface body has no host
    association (like an external procedure).  With IMPORT, the
    interface body should be host associated with the same host
    as the procedure it is describing, i.e., with the module, even
    when the interface body is contained in a module procedure or
    internal procedure nested within that module.

    In most cases, this will make little difference, but it will
    be much more difficult to make this "right" in a future revision
    than it would be to correct it now.

    In the long run, I would like the module procedure to also be
    allowed to specify IMPORT to get only limited host association,
    but such a feature seems beyond the scope of what can reasonably
    be done at this time.

7.  In R1224, the comma before <proc-language-binding-spec> is
    irregular and thus is a syntax error just waiting to happen.
    I recommend eliminating it.  Additionally, the ordering between
    <proc-language-binding-spec> and RESULT(<result-name>) has no
    obvious rationale and thus is also likely to be the cause of
    errors for the unsuspecting.  I suggest allowing them in
    either order.

    In R1231, I would similarly eliminate the comma.  To avoid
    ambiguity in fixed source form, I would move <proc-language-
    binding-spec> inside the brackets making the parenthese
    optional, so a (possibly empty) argument list is required if
    <proc-language-binding-spec> is present.

    In R1234, I would eliminate the second alternative and change
    the first alternative to be analogous to R1224 and R1231 as
    modified.

8.  I agree with those people who dislike the language keyword
    NONKIND.  I suggest SIZE as an alternative (since nonkind
    type parameters are ultimately used as character lengths and
    array bounds).

9.  I similarly dislike NONOVERRIDABLE, but I have no immediate
    suggestion as an alternative.

10. The name C_F_POINTER does not appear to follow the naming
    patterns of other intrinsics,  (I also do not understand why
    it was made a subroutine rather than a pointer-valued
    function.)

11. I would like to add a constraint on the <default-char-expr>
    of R914 specifying that if it is an initialization expression,
    its value shall contain a valid format.  (In the general case,
    a compiler must defer this checking until execution, but in
    the commonly-occurring special case of a constant format, it
    would be nice to be notified of errors in the format during
    compilation.)

12. 5.3 contains a restriction on the relative order of the
    IMPLICIT statement and any uses of the implicit typing rule
    it defines.  I would like to see this restriction converted
    to a constraint (so programmers could receive notification
    when they get this order wrong).

    (5.3 may need some clarification on exactly where the
    implicit typing occurs.)

13. The addition of allocatable components theoretically allows
    very complex recursive data structures to be built, but in
    practice this facility can be difficult to at anything more
    than a trivial level because there is no way to adjust such
    a data structure without totally rebuilding it.  A facility
    analogous to pointer assignment is needed for allocatable
    objects.  I suggest the addition of the MOVE_ALLOC intrinsic
    proposed in J3/01-161.

14. The intrinsic random number generator is less portable than
    had been originally hoped because of inconsistent interpreta-
    tions of how to implement its interface.  I believe the
    standard needs to provide better guidance on this point.
    Ideally, I would like such guidance to be normative, but at
    this point I would happily settle for non-normative guidance.
    Editorially, I would provide guidance on all forms of
    reference to RANDOM_NUMBER and RANDOM_SEED, but the most
    critical point is that CALL RANDOM_SEED was intended to
    put the random number generator in an unpredictable state,
    without the programmer having to deal questions about the
    size of a seed, whether the implementation has a real-time
    clock, whether repeated values in the seed degrade the
    quality of the random numbers generated, etc.

15. I am sympathetic to the suggestion for a COMPLEX intrinsic
    that takes its output kind from the kind of its input
    argument(s).  I note that there is a similar infelicity
    in taking the real part of a complex value (but not in taking
    the imaginary part).

16. If have seen suggestions that Fortran shoud have a SIZEOF
    intrinsic.  Since the output of such a function has no meaning
    in terms of Fortran semantic concepts, I suggest that if
    such an intrinsic is added, it should be a part of the
    C interoperability facility.

17. Given the rather major extensions we have made in things like
    statement length, I do not understand the reluctance to
    increase the maximum array rank to something larger than 7.

18. I believe the time is long overdue for "procedureness" to be
    a distinguishing factor in the generic rules (see #3 above).
    (If we don't do that, we at least need to look at whether the
    type distinguishing rules are appropriate for procedures.
    Also, we should do whatever interpretation is necessary to
    say that subroutines do not have the same type as functions.)

19. In the future, I would like to see a more complete declaration
    vocabulary for Fortran, so attributes can be declared directly
    rather than inferred from the lack of a contrary attribute.
    For the most part, such a project is too large for the current
    revision (unless there is to be a significant schedule change),
    but it might be worthwhile to add a DISCARD or NOSAVE attribute
    to negate the SAVE attribute.  I can see several places where
    this might be useful:

    a.  Most programmers would find it far more convenient if SAVE
        were the default for variables in modules (and possibly for
	variables in COMMON), but there are parallel performance
	opportunities inherent in the current rules.  Allowing an
	explicit DISCARD would allow programmer who need that
	performance to request it without burdening the average
	programmer with the need to SAVE shared variables.

    b.  It is too late to change the DATA implies SAVE rule, but
        an explicit DISCARD could override it, so
	    INTEGER, DISCARD :: NUM = 0
	could do what many programmers expect
	    INTEGER :: NUM = 0
	to do.

20. Late in the process, the concept of deferred type-bound
    procedure bindings was removed from the draft.  There is some
    talk of trying to reinstate it.  In my opinion,

    a.  the ability to require that a procedure binding be
        explicitly overridden in extension types was a positive
	feature, but

    b.  the ability to create a procedure "slot" but leave it
        empty was definitely a misfeature.  It created syntactic
	inconsistencies and many special cases in the execution
	rules.

    Thus, I would support the reinstatement for the former but not
    the latter.  (If someone comes up with a clever way to make
    "prototype " procedures more clearly something that should not
    be called, I might be open to that.)

21. If T is a type with a single public component C, the notation
    Tvar%C can be used to access a variable of type T as a variable
    of the type of C, and the notation T(Cval) can be used to
    convert a value of the type of C to type T, but there is no
    notation that allows one to use a variable of the type of C as
    a variable of type T.  If such a notation (e.g., T(/Cvar/)) were
    available, it would _significantly_ extend the applicability of
    the features for polymorphic programming.

    (The illustrative notation in this comment appears to be
    adequate for this particular application, but it may need to
    be altered to support other things people want to do, like
    viewing a complex array as a real array or viewing a
    character array as a character string.)

22. The absence of INITIAL procedures analogous to FINAL procedures
    is a significant hole in the facilities being provided.  The
    fact that in some cases, default initialization or "user-written
    constructors" can be used as a replacement for INITIAL procedures
    should not be misconstrued as the latter facilities being an
    adequate replacement for INITIAL procedures in the general case.
    (I don't expect INITIAL procedures to be added in the current
    revision, but don't expect to get away with leaving them out of
    the next revision.)

23. Allowing generic procedures to use the same name as a type
    provides a way to do "user written structure constructors"
    for non-parameterized derived types, but it does not provide
    the appropriate syntax for parameterized derived types.  I
    believe it would have been better to provide a <generic-spec>
    that causes a procedure to implement the constructor syntax.
    (This is probably too large a change to make this late in the
    process, but there will be no way to "correct" this in the
    next revision.)

24. The C interoperability promises to have great value, but the
    way it was done is unfortunate.  Making entities in Fortran
    be represented like C entities and have C semantics allows
    Fortran procedures and interfaces to masquerade as C procedures
    and interfaces, but it effectively imports large parts of C
    into Fortran.  (The recent controversy over enums is
    illustrative of the kind of problems this importation creates.)
    A better was would have been to provide for a bridge that
    copies Fortran values to C entities and vice versa.  In cases,
    where the representations are the same, this copying could
    still be optimized away, so performance would not suffer, but
    this conceptual copying would eliminate the requirement to
    provide Fortran types corresponding to all C types.  (Unless
    something drastic happens to delay the completion of this
    revision, I see no way the approach could be changed at this
    late date, but I still felt I must express my dismay at the
    implications of the current approach.)

25. Although some improvements in the generic rules can be made by
    elaborating the definition of "distinguishable" (see #3 above),
    in the long rule, these rules need extension in other ways.
    In particular, they need to be extended to handle preferences,
    so, for example, a pointer dummy argument could be distinguished
    from a nonpointer dummy argument.  (Such an extension is not
    especially difficult, but because it would require an extension
    of structure of the generic rules, it is probably beyond the
    scope of what can be addressed in this revision.)

26. In current implementations, pointers tend to destroy performance.
    I believe that in many cases this situation could be improved
    by providing a mechanism to limit the targets to which a pointer
    can point (and conversely to limit the pointers which can
    point to a particular target).  (Probably too large for this
    revision)

27. Another common source of complaints about performance relates
    to mismatches in the expected organization of arrays.  Pointers
    are allowed to point at discontiguous array sections, and thus
    generate overly general instructions when used to access
    allocated arrays (that are generally contiguous).  Conversely,
    explicit-shaped dummy arguments expect to be associated with
    contiguous storage, so if the actual argument is an array
    section or other discontiguous representations, it must be
    copied to contiguous storage before executing the procedure
    and copied back afterwards.  I would like to add an attribute
    to pointers and dummy arguments to declare which array
    organizations it is to allow or be optimized for.  Tentatively,
    I would suggest the following alternatives:
    *   CONSECUTIVE - contiguous
    *   UNIFORM - possibly discontiguous but uniformly spaced in
        memory (e.g., the result of extracting a component of a
	CONSECUTIVE array)
    *   LINEAR - uniformly spaced along each dimension, but not
        necessarily overall (e.g., a general array section)
    *   GENERAL - arbitrary mapping (e.g., vector-valued subscripts
        or following pointer components)
    This could be combined with #25 so, for example, it would be
    preferred to associated a contiguous actual argument with
    a CONSECUTIVE dummy argument rather than a LINEAR one and a
    generic could include both a CONSECUTIVE version of a routine
    to give highest performance in the common case and a LINEAR
    (or even GENERAL) version to avoid the performance hit of having
    to copy a more general argument into contiguous storage.
    (Again, this is probably too big to do in this revision, but
    I like to offer a couple of "bigger" suggestions in case there
    is an unexpected schedule delay that would make them practical
    to consider.)

                              - end -