To: J3 J3/18-122 From: Steve Lionel Subject: J3 F202X suggestions Date: 2018-February-13 Reference: 18-007 At meeting 215, J3 attendees were asked to provide their "top 5" requests for new features in Fortran 202X, other than the two top vote-getters in the user survey (N2147) - generic programming/ templates and exceptions. This paper compiles the requests as a basis for further discussion, sorting and winnowing. Top requests from user survey ----------------------------- Generic programming or templates Exceptions Requests from J3 members ------------------------ Numbers in parentheses represent ideas from more than one member - (2)Specifying default values for optional dummy arguments that are omitted - User control of the optional leading zero in F format (18-120) - (3)More support of deferred-length allocatable character variables, to eliminate the requirement that the programmer pre-allocate, including: o Formatted READ o Use as an internal file in WRITE o Use in ERRMSG and IOMSG o Allow default initialization as derived-type component - An INTENT keyword that means the same as not specifying INTENT - Providing methods for easily accepting NUL-terminated strings in calls to Fortran from C. - Asynchronous subroutine execution with event post on completion - Relax type matching rules specifically for real/complex to allow passing real to complex and vice-versa, and also pointer assignment (maybe with new intrinsics that accept real or complex and return pointer of the other type with half/double elements in first rank) - Constructors - procedure called automatically when a derived type comes into existence - Remove requirement that LOGICAL requires one numeric storage unit - Degree trigonometric functions - Significantly increase free-form line length - A SWAP_ALLOCS procedure - (2)Unsigned integers - (2)PROTECTED types, pointers and structure components - Invoke type-bound or object-bound function component using result of another function. Select subobject of function result. - RANGE attribute and statement. 18-119:2.5.2 - Views (taming pointers, alternative/extension to EQUIVALENCE 18-119:2.5.2) - Specification variables 18-119:2.3.4 - Bit strings - POINTERs initially null instead of undefined - Allowing a specification section in (some) constructs without forcing use of BLOCK/END BLOCK seems useful. - A VIRTUOUS (superpure) procedure prefix that declares the procedure does not access data outside of the procedure. - Syntax to specify that the target of a dummy argument pointer is not to be changed (INTENT applies to the pointer, not its target) - CFI function to return text version of a processor's CFI error code - Binary compatibility of C descriptors so that a single C library could work with multiple Fortran implementations - Allow more than one procedure interface to share the same binding name (currently disallowed by 19.2p1) - Reihnold's proposal for requesting different memory types for different data objects. 18-118 - Any information that can help avoid OpenMP false sharing, e.g. intrinsics to inquire about cache sizes, and other hardware features at run time. - processor dependent information on sizes of vector instruction, that helps arrange data in an optimal fashion for SIMD processing. - Strenghening existing intrinsics, i.e. avoiding special processing of corner cases. - A0 format that trims whitespace from left and right of strings - Short-circuit logical operators - More CHARACTER intrinsics (delimiter processing, perhaps?) - Container class libraries for at least 'set', 'map' ('dictionary'), 'list'; using the generic facility I presume we will add. - (2)An ability to declare a reduction variable on DO CONCURRENT. - Standardizing preprocessor behavior - Proper enumerated types - Units Full text of member wishlists ----------------------------- 1. asynchronous subroutine execution uc- more general asynch i/o, asynch message-passing, specialized cores, off-chip cores use asynchronous attribute to mark affected data items ex: type( event_type) :: all_done asynchronous :: data-items call foo (...), ev= all_done 2. real-complex uc- f77 equivalence, common, argument passing, or other storage sequence this can be done via intrinsics or by some decoration of pointer assignment 3. constructors uc- dt value must be correctly set in a larger data structure ex: derived type represents a star (data-item) which must be located in a galaxy (data-structure) 4. logical need not require one numeric storage unit uc- as NS sizes grow, logical is more unnecessary as a full NSU (this likely represents one more new-way/old-way switch for compilers) 5. a way to initialize allocatable strings ex: character( :), allocatable :: foo = 'string' also allow unallocated characters as processor messages: iomsg= foo, errmsg= foo also allow unallocated characters as list items in READ stmts 1. the degree trigonometric functions 2. longer lines, probably only for free-form Not in N2147, but longer lines would be handy if preprocessor macros are allowed. 3. The SWAP_ALLOCS intrinsic subroutine. I want the SWAP_ALLOCS intrinsic subroutine for selfish reasons. I wrote a program where I had to use three calls to MOVE_ALLOC to swap two allocatable variables in several places in the code. While a very clever compiler could optimize the three calls of MOVE_ALLOC to the equivalent of a call of SWAP_ALLOCS, I doubt any compiler will ever do that optimization. 4. a true string type A true string type would eliminate the need for ALLOCATE on READ. 5. Unsigned integers Big projects: 1. Generic programming, as parameterized modules, and not crippled by allowing only integers as instance parameters. To do it right also requires: a. Extension to subscripts 18-119:2.4.1 04-195 for rank genericity b. Coroutines and iterators 18-119:1.6 04-380r1, detailed spec exists c. Uniform syntax 18-119:1.6, 1986 paper correspondence with X3J3, paper by Donald T. Ross, paper by Charles Geschke and James Mitchell, and David Parnas's paper "On the criteria for decomposing programs into modules," wherein his solution for uniform syntax is to hide everything in procedures. d. Accessors or updaters 18-119:1.3, detailed specs exist 2. Exceptions, using enumerators of an enumeration type to identify exceptions. This requires enumeration types. 18-114 18-115 Little projects: 1. Protected types, pointers, and structure components. Types 18-119:2.2.7 04-167 14-165 18-119:2.2.8 13-215 Pointers: 18-119:2.5.3 Components: N2147 It's silly to do only some. It was silly to do LOCK_TYPE and EVENT_TYPE as special cases instead of doing protected types. 2. Invoke type-bound or object-bound function component using result of another function. Select subobject of function result. Invoke: 18-119:2.7.1 Subobject: 18-119:2.2.12 It's silly to do one without the other 3. RANGE attribute and statement. This was briefly in Fortran 8X. 18-119:2.5.2 S8.99 (1986) 4. Views (taming pointers, alternative/extension to EQUIVALENCE) 18-119:2.5.2 5. Specification variables 18-119:2.3.4 1. My first, is BIT strings, though I am not sure if this is considered big ticket or medium ticket. That said, I think it would be a substantial amount of work to implement in a compiler and runtime system, but perhaps not so much work for J3 to define. 2. I am sympathetic to POINTERs being initially unallocated, 3. and a way to have a default value for not present optional arguments. 4. PROTECTED components of derive types seems reasonable (and part of making the language consistent). 5. Allowing a specification section in (some) constructs without forcing use of BLOCK/END BLOCK seems useful. 1. A VIRTUOUS (superpure) procedure prefix that declares the procedure does not access data outside of the procedure. This is a performance request so that such a procedure can be move around with no limitation that other data has to be accessible. 2. Enhancement of the INTENT(IN) attribute or on that front for protecting the target associated with the dummy pointer that is intent(in). Similar to the CONSTPOINTER proposed in the Survey paper. 3. Add const char *CFI_strerror(int ind). which is a function that when given an error code from a CFI function returns a processor-dependent string representation of the error. For example: CFI_strerror(CFI_SUCCESS) would return "CFI_SUCCESS" or "No errors". User value: Right now, different compilers have different error codes. Even for the error codes required by the standard, the standard does not assign values to these codes. So the use case here is a user who wants to abort on error, and want to print what error they got before they abort. Today, they can only print the error number and then someone has to go to ISO_Fortran_binding.h to find what error it is. With this function, they can get the human-readable representation of the error. 4. To provide a way for binary portability between C descriptors. More specifically, a portable way for code compiled by any compiler to access all the required (non-vendor specific) fields of a C descriptor created by code compiled by another compiler. User value: There is currently no direct way for Fortran programs that use C descriptors to interoperate with each other. One example would be a third party library that has a procedure with a C descriptor dummy argument. With the way things are today, the third party has to build a separate version of the library for every compiler on the platform. But if C descriptors were binary interoperable (where any compiler can access the required fields), the library vendor could ship just one version of the library. This would require to bump up the version in the C descriptor. 5. To allow multiple specific interfaces to have the same bind(c) binding name. interface myproc function foo1(i) bind(c, name='__foo') type(c_ptr), value :: i end function function foo2(i) bind(c, name='__foo') type(*), intent(in) :: hostptr(*) end function end interface So the Fortran caller can pass different aregument but still calling the same C function. I'd vote for anything that helps achive high performance on HPC systems. Specifically: 1. 18-118 - Reihnold's proposal for requesting different memory types for different data objects. If, and only if, this can be implemented in a portable and future proof way, and if the implementers believe these can lead to improved performance, then I'd support it. For examlple, these requests can be just hints to the compiler on the intended usage of different data objects, e.g. "this array will be mostly read (or mostly written to)", "this data will be accessed and changed very often, so use the smallest latency", etc. The implementation, of course, can choose to interpret these requests as it sees fit, e.g. by ignoring them completely, but perhaps doing some clever data placement. 2. Any information that can help avoid OpenMP false sharing, e.g. intrinsics to inquire about cache sizes, and other hardware features at run time. 3. Related to (2) - processor dependent information on sizes of vector instruction, that helps arrange data in an optimal fashion for SIMD processing. For example, some constants available via iso_fortran_env, which can be used in allocating multi-dimensional arrays. 4. Strenghening existing intrinsics, i.e. avoiding special processing of corner cases. This is linked to IEEE support. Specifically: allow complex Bessel and error functions, allow LOG(0.0) and LOG(+-0.0, +-0.0). There are probably other intrinsics for which the range of acceptable inputs can be extended with the use of IEEE +-infinities, etc. - intrinsic sort() - DO LOOP syntax which is compliant with DO CONCURRENT - automatically NULLed pointers (rather than undefined) Big thing: The preprocessor (I know, I'm wasting my time). I would like to have a standard set of preprocessor behaviors defined that include macros, string concatenation, and the ability to generate standard conforming source output. It should include the ability to show the original source files and lines in the output. An ability to declare a reduction variable on DO CONCURRENT. This can save on temporaries and can be optimized with atomic instructions. Small things Unsigned integers Shortcut conditionals on LOGICAL expressions A0 format specifier that implicitly trims the object string More CHARACTER intrinsics, though I don't have anything specific yet Container class libraries for at least 'set', 'map' ('dictionary'), 'list'; using the generic facility I presume we will add. 1. asynchronous subroutine execution uc- more general asynch i/o, asynch message-passing, specialized cores, off-chip cores use asynchronous attribute to mark affected data items ex: type( event_type) :: all_done asynchronous :: data-items call foo (...), ev= all_done 2. real-complex uc- f77 equivalence, common, argument passing, or other storage sequence this can be done via intrinsics or by some decoration of pointer assignment 3. constructors uc- dt value must be correctly set in a larger data structure ex: derived type represents a star (data-item) which must be located in a galaxy (data-structure) 4. logical need not require one numeric storage unit uc- as NS sizes grow, logical is more unnecessary as a full NSU (this likely represents one more new-way/old-way switch for compilers) 5. a way to initialize allocatable strings ex: character( :), allocatable :: foo = 'string' also allow unallocated characters as processor messages: iomsg= foo, errmsg= foo also allow unallocated characters as list items in READ stmts 1. Concurrent reductions (use case attached) 2. Assertions (use case attached) 3. Generic programming 4. Enumerated types (use case attached) 5. Units (use case attached)