J3/97-217r1                                              page 1 of 9

To:      J3
From:    JOR (Bleikamp)
Subject: Specs and Syntax for Derived Type I/O
Date:    Aug. 25, 1997

Major Changes since 97-184.

| - The runtime library's values for SP, SS, P, BN and BZ
|   are saved by the library before calling the user defined
|   derived type I/O routine, RESET to the processor's default
|   values, and restored to the saved
|   values after the UDDTIO routine returns.
| - Prohibited UDDTIO when the originating READ/WRITE statement
|   contained an ASYNCHRONOUS specifier.
| - The INTERFACE FORMATTED (READ) statement was renamed
|   INTERFACE READ (FORMATTED), etc.
|
| Other Issues discussed, but not acted upon.
|
| - We discussed expanding these routines to be usable for
|   ALL edit descriptors, allowing users to provide formatting
|   routines for I, F, and other edit descriptors.
|   No such changes have been made yet, nor has it been decided
|   how to specify such functionality.
| - We may wish to allow an array to be passed in for the DTV
|   dummy argument (at least for unformatted I/O), to facilitate
|   efficient processing of arrays of values.  No changes have
|   been made to support this yet, but they will be considered in
|   the future.  The implementation issues are complicated.
| - The "w" field in the DT edit descriptor is still optional.
| - Supporting async I/O in UDDTIO is considered desirable.

Management Synopsis (also see the Rationale and Conceptual Model
at the end of this paper):

- The provider of a derived type may also provide I/O routines
  for that type, called "user defined derived type I/O routines"
  (hereafter referred to as UDDTIO routines), which are called by
  the Fortran I/O library when certain conditions are met.  These
  UDDTIO routines perform input and output of list items of a
  particular derived type.
  In essence, the effect is as if the UDDTIO routines were
  substituting list items into the original I/O list (where the
  derived type item was), and adding edit descriptors into the
  middle of the original format specification, under control of
  the provided routines.

- The F90 way of doing formatted and unformatted I/O on derived
  types still works the same as before.  Only the presence of an
  interface for the appropriate UDDTIO routine triggers this new
  functionality.

- FORMATs have a new edit descriptor, "DT".  When the I/O library
  encounters this, it must match up with a derived type list item.
  The I/O library will call the appropriate UDDTIO routine, which
  will actually do the I/O.  Typically, the provider of a derived
  type (and corresponding module) would provide these UDDTIO
  routines as part of the module.

  NOTE: we have chosen not to implicitly overload the existing
  data transfer edit descriptors (I,D,E,F,G,L,...) when such an
  interface is visible, and call the UDDTIO routine for those
  edit descriptors (in addition to DTxxx edit descriptors).  This
  capability is easy to add should we wish to, but makes it more
  difficult for the user to get to the F90 functionality.
  Interval 2 may propose some additional syntax to address this
  issue.

- The UDDTIO routines will be called with a unit number, the
  derived type variable/value, and other misc. information.  The
  UDDTIO routine will use normal I/O statements (READ/WRITE) on
  the supplied unit to read/write the derived type item's
  components.

- Full support for complicated data structures is provided.
  These UDDTIO routines can invoke themselves (to traverse a
  linked list for example), and can invoke the UDDTIO routines
  for another derived type to handle nested derived types.
  Internal I/O may be used to easily construct or decompose
  character string values.

- The UDDTIO routines will be able to inquire about, and in the
  most general (robust) case, might want to worry about
    - list directed vs. namelist I/O vs. a format spec.
    - the DELIM= and PAD= values for this file (accessible via
      INQUIRE) on external (positive) unit numbers.

- List directed and NAMELIST I/O will also call these same UDDTIO
  routines under certain, F90 compatible circumstances (when the
  appropriate interface is visible).

Detailed Specification:

UDDTIO routines shall have the following interface (not all 4
routines are required for a particular derived type; any subset
can be provided):

|   INTERFACE READ ( FORMATTED )
      SUBROUTINE my_read_routine_formatted              &
                      (unit,                            &
                       dtv,                             &
                       iotype, w, d, m,                 &
                       eof, err, eor, errmsg)
        INTEGER, INTENT(IN) :: unit       ! unit number
        ! the derived type value/variable
        TYPE (whateveritis), INTENT(OUT) :: dtv
        ! the edit descriptor string
        CHARACTER (LEN=*), INTENT(IN) :: iotype
        INTEGER, OPTIONAL, INTENT(IN) :: w,d,m
        LOGICAL, INTENT(OUT) :: eof, err, eor
        CHARACTER (LEN=*), INTENT(OUT) :: errmsg
      END SUBROUTINE
    END INTERFACE

|   INTERFACE READ ( UNFORMATTED )
      SUBROUTINE my_read_routine_unformatted            &
                      (unit,                            &
                       dtv,                             &
                       eof, err, eor, errmsg)
        INTEGER, INTENT(IN) :: unit
        ! the derived type value/variable
        TYPE (whateveritis), INTENT(OUT) :: dtv
        LOGICAL, INTENT(OUT) :: eof, err, eor
        CHARACTER (LEN=*), INTENT(OUT) :: errmsg
      END SUBROUTINE
    END INTERFACE

|   INTERFACE WRITE ( FORMATTED )
      SUBROUTINE my_write_routine_formatted             &
                      (unit,                            &
                       dtv,                             &
                       iotype, w, d, m,                 &
                       err, errmsg)
        INTEGER, INTENT(IN) :: unit
        ! the derived type value/variable
        TYPE (whateveritis), INTENT(IN) :: dtv
        ! the edit descriptor string
        CHARACTER (LEN=*), INTENT(IN) :: iotype
        INTEGER, OPTIONAL, INTENT(IN) :: w,d,m
        LOGICAL, INTENT(OUT) :: err
        CHARACTER (LEN=*), INTENT(OUT) :: errmsg
      END SUBROUTINE
    END INTERFACE

|   INTERFACE WRITE ( UNFORMATTED )
      SUBROUTINE my_write_routine_unformatted           &
                      (unit,                            &
                       dtv,                             &
                       err, errmsg)
        INTEGER, INTENT(IN) :: unit
        ! the derived type value/variable
        TYPE (whateveritis), INTENT(IN) :: dtv
        LOGICAL, INTENT(OUT) :: err
        CHARACTER (LEN=*), INTENT(OUT) :: errmsg
      END SUBROUTINE
    END INTERFACE

where the actual specific routine names (my_xxx_routine above) and
the dummy argument names may be chosen by the user.

These routines shall not be invoked directly by the user's
program.

The "dtv" dummy argument may also be given the TARGET attribute.
It may not be given any other attributes.

The UDDTIO routines are called when:
  - for unformatted, list directed, and namelist I/O, an
    appropriate interface for the derived type of a particular
    list item is visible
  - for I/O statements with a format specification, there is an
    appropriate interface visible AND the list item matches up
    with a "DTxxx" edit descriptor.

A new edit descriptor, "DT", with the usual (optional)
"[w[.d[.m]]]" widths is provided for use with format
specifications.  It must match up with a variable/value of a
derived type.  The DT characters may be followed by up to 253
alphabetic characters (interspersed blanks allowed), e.g.
"DTLNKLST".  The entire string of alphabetic characters, including
the initial "DT", will be passed into the UDDTIO routine (the
"iotype" argument).  This argument will be converted to UPPERCASE
and have all blanks removed.  The user can support different types
of formatting for one derived type via this extended edit
descriptor.  For example, the consecutive characters after the
"DT" could be used to request different formatting rules for
consecutive components in the derived type, or different
formatting rules for nested derived types, etc.

When a derived type variable/value matches up with a "DT" edit
descriptor, the user must have also provided the matching
read/write procedure for that derived type, with a visible
interface that matches the definition in this paper.

The "unit" dummy argument will have the same unit value as
specified by the user in the originating I/O statement for all
external units except "*".
When an internal unit or the "*" external unit was specified in
the originating I/O statement, the "unit" dummy argument will have
a processor dependent negative value.  Note that an INQUIRE
statement cannot be executed when "unit" is negative.

The "iotype" argument (FORMATTED I/O routines only) will have the
value:
  - "LISTDIRECTED" if the originating I/O statement specified list
    directed I/O,
  - "NAMELIST" if the original I/O statement contained an NML=
    specifier, or
  - "DTxxx" if the originating I/O statement contained a format
    specification and the list item matched up with a DT edit
    descriptor, where the "xxx" is the string of alphabetic
    characters (if any) that actually followed "DT" in the edit
    descriptor.

If the original I/O statement is a READ statement, the "dtv"
dummy arg should be assigned a value by the UDDTIO read routine.

If the original I/O statement is a WRITE or PRINT, the "dtv"
dummy arg contains the value of the list item from the original
I/O statement, to be output by the UDDTIO routine.

The "w", "d", and "m" arguments contain the user specified values
from the FORMAT (e.g. FORMAT(DT12.5.2)).  If the user did not
specify "w", "d", and/or "m", those dummy arguments will not be
present.  They will also not be present if the original I/O
statement was a list directed or namelist I/O statement.

The UDDTIO routines for reads shall assign a value of .FALSE. or
.TRUE. to the "err", "eof", and "eor" dummy args.  The value
assigned to these dummy arguments shall determine whether or not
the corresponding condition will be triggered in the I/O library
when the UDDTIO routine returns.  If the value .TRUE. is assigned
to the "err" dummy argument, the "errmsg" dummy argument shall
also be defined before the UDDTIO routine returns.
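
One way a read routine might derive the "eof", "eor" and "err"
values described above is from the IOSTAT= result of its own READ
statements.  The following is a minimal sketch, not part of this
proposal: the module name uddtio_status_sketch and the helper
classify_iostat are hypothetical, and it assumes a processor that
provides the IS_IOSTAT_END and IS_IOSTAT_EOR intrinsics, which
were standardized later than this paper.

```fortran
module uddtio_status_sketch
  implicit none
contains
  ! Hypothetical helper (not defined by this paper): translate an
  ! IOSTAT= value into the eof/err/eor flags that a UDDTIO read
  ! routine has to define before it returns.
  subroutine classify_iostat(ios, eof, err, eor, errmsg)
    integer, intent(in)           :: ios
    logical, intent(out)          :: eof, err, eor
    character(len=*), intent(out) :: errmsg  ! caller supplies enough room

    eof = is_iostat_end(ios)   ! assumption: later-standard intrinsic
    eor = is_iostat_eor(ios)   ! assumption: later-standard intrinsic
    ! Any other nonzero status is treated as an error condition.
    err = (ios /= 0) .and. .not. (eof .or. eor)

    ! errmsg only has to be defined when err is true; it is blanked
    ! otherwise so that it never holds stale text.
    errmsg = ""
    if (err) write (errmsg, "('read failed, IOSTAT=',I0)") ios
  end subroutine classify_iostat
end module uddtio_status_sketch
```

A read routine would call classify_iostat once after its last READ
on "unit", passing the IOSTAT= value and its own "eof", "err",
"eor" and "errmsg" dummy arguments.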
When "err" is set to true, and the originating I/O statement
contained neither an ERR= nor an IOSTAT= specifier, the processor
shall attempt to output the "errmsg" value (to something) and stop
execution of the program.  If we add an ERRMSG= specifier to all
read/write statements, this value would be returned thereto.

In the absence of an appropriate visible interface in the scope of
the I/O statement, unformatted, list-directed, and namelist I/O
will behave as they did in Fortran 90.

When an appropriate interface is visible for a particular derived
type, and either:
  1. the original I/O statement specified unformatted, list
     directed, or namelist I/O, OR
  2. the original I/O statement specified a FORMAT and the list
     item of derived type matches up with a "DT" edit descriptor,
THEN the restrictions on derived type I/O, such as no private
components, all components must be defined, no ultimate components
with the pointer attribute, etc. do not apply to the list item of
derived type, but the normal rules in F95 still apply, about not
referencing undefined entities, not referencing/defining POINTERS
which are not associated, etc.

If NO appropriate interface is visible for a particular derived
type, the processor will perform "F90" style I/O, and a "DT" edit
descriptor which matches that derived type list item will cause an
error (at runtime possibly).  When F90 style I/O is selected, all
the old F90 restrictions on derived type list items still apply.

The user's routine may choose to interpret the "w" argument as a
field width, but this is not required.  If it does, it would be
appropriate to fill an output field with "*"s if "w" is too small.

When the original I/O statement was a READ, the UDDTIO routine
may not READ from any external unit other than the one passed in
via the dummy arg "unit", nor WRITE to any external unit.
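
The field-width convention suggested above for the "w" argument
could look like the following.  This is only a sketch under stated
assumptions: the module uddtio_width_sketch and the function
int_field are hypothetical names, and a real UDDTIO routine is
free to interpret "w" differently or to ignore it.

```fortran
module uddtio_width_sketch
  implicit none
contains
  ! Hypothetical helper (not defined by this paper): render an
  ! integer right-justified in a field of w characters, or return
  ! w asterisks when the digits do not fit.
  function int_field(n, w) result(field)
    integer, intent(in) :: n, w
    character(len=w)    :: field
    character(len=32)   :: digits

    ! Internal I/O produces the minimal-width digit string first.
    write (digits, "(I0)") n
    if (len_trim(digits) > w) then
      field = repeat("*", w)                                  ! too small
    else
      field = repeat(" ", w - len_trim(digits)) // trim(digits)
    end if
  end function int_field
end module uddtio_width_sketch
```

With this helper, int_field(42, 5) yields the two digits preceded
by three blanks, and int_field(123456, 3) yields "***".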

When the original I/O statement was a WRITE, the UDDTIO routine
may not WRITE to any external unit other than the one passed in
via the dummy arg "unit", nor READ from any external unit.

| Thou shalt not execute an OPEN, CLOSE, BACKSPACE, ENDFILE, or
  REWIND statement while a UDDTIO routine is active.

The UDDTIO routines ARE permitted to use a FORMAT with a DT edit
descriptor, for handling components of the derived type which are
themselves a derived type.  List directed and NAMELIST I/O are
also permitted for the "recursive" I/O statement.

WRITE statements contained in the UDDTIO routine which specify the
same value as passed in via the "unit" dummy arg will insert the
characters "written" into the record started by the original WRITE
statement, starting at the position in the record where the last
edit descriptor left off.  Record boundaries may be created by
WRITE statements in the UDDTIO routines.  Non-advancing I/O may be
used to avoid creating record boundaries.

READ statements contained in the UDDTIO routine which specify the
same value as passed in via the "unit" dummy arg will "pick up" in
the current record, where the last edit descriptor from the
original I/O statement left off.  Multiple records can be read,
and the current position can be left within a record by the READ
statement in the UDDTIO routine, through the use of non-advancing
I/O.

Record positioning edit descriptors, such as TL and TR, used on
"unit" while a UDDTIO routine is active, shall not cause the
record position to be positioned before the record position at the
time the UDDTIO routine was invoked.

A very robust UDDTIO routine may wish to use INQUIRE to determine
what BLANK=, PAD= and DELIM= are for the unit.

Edit descriptors which affect subsequent edit descriptors'
behavior, such as BN, BZ, P, etc., are permitted in FORMATs in
UDDTIO routines.
The Fortran I/O library will save the state of BN, BZ, S, SP, SS,
and P before calling a UDDTIO
| routine, reset these states to the processor default (as if we
| were starting a new I/O statement), call the user's UDDTIO
| routine, and reset the library's state of BN, BZ, S, SP, SS, and
| P to the saved state when the UDDTIO routine returns.
The UDDTIO routine is free to use these state changing edit
descriptors without having any effect on the formatting of list
items in the originating I/O list.  If directed rounding mode edit
descriptors are added, these will be added to the list of "saved"
states.

READ and WRITE statements executed in a UDDTIO routine, or
executed in a routine called (directly or indirectly) from a
| UDDTIO routine shall not have an ASYNCHRONOUS specifier, nor
| shall the original READ or WRITE statement which triggered
| the UDDTIO routine contain an ASYNCHRONOUS specifier.

| A UDDTIO routine shall not define (modify) any storage location
| referenced by any I/O list item, the corresponding format, or
| any specifier in the original READ/WRITE, except through the DTV
| dummy argument.

--------------------------------------------------------------

Rationale

The desire to allow users to implement new data types in a MODULE
requires additional language features, including I/O support.  The
provider of a module which implements a new datatype needs to be
able to also provide I/O support.

The approach chosen extends existing Fortran features to support
derived types, is fairly easy to use, bypasses the restrictions on
derived type I/O present in Fortran 90, and allows the I/O support
to be bundled with the MODULE which supplies the derived type
definition and implements the operations thereon.  This also
provides the ability to protect these I/O operations from the
user.

The use of visible interfaces to trigger this functionality helps
preserve Fortran 90 compatibility, since no Fortran 90 program can
specify such an interface.

--------------------------------------------------------------

Conceptual Model

The key concept is that the UDDTIO routines can, more or less, be
viewed as adding individual components into the middle of the
original item list, and edit descriptors into the middle of the
original format-specification (if any).  They also have full
control over how input values are processed, and how values are
represented on output.  They can do so in an intelligent, dynamic,
and arbitrarily complex manner.  They can also avoid the
restrictions on F90 derived type I/O (pointers, etc.), handle
nested derived types, and support complex data structures (such as
linked lists).

The UDDTIO routines provide a familiar mechanism, Fortran I/O
statements, to insert data into an output record, and to retrieve
values from an input record.

The user of a derived type uses familiar Fortran syntax to
activate this capability.  Usually, the user only needs to "USE"
the appropriate module, and possibly insert some "DT" edit
descriptors into their format-specifications.

All of the hard work is done by the provider/writer of the derived
type.  Once that hard work is done, many users can easily adapt
their programs to use it.

The interface provides all the information necessary to
accommodate all types of Fortran I/O.  A robust UDDTIO routine may
be quite large, but not necessarily very complicated.  A simple
UDDTIO routine can be written quickly for one or two forms of I/O,
and extended later to handle all the possible forms of Fortran
I/O.

--------------------------------------------------------------

Example for a WRITE(FORMATTED) interface:

    TYPE linkedList
      TYPE (linkedList), POINTER :: next
      INTEGER                    :: value
    END TYPE linkedList

    RECURSIVE SUBROUTINE my_write_routine (unit, dtv,         &
                                           iotype, w, d, m,   &
                                           err, errmsg)
      INTEGER, INTENT(IN)                   :: unit
      ! the derived type value
      TYPE (linkedList), TARGET, INTENT(IN) :: dtv
      CHARACTER (LEN=*), INTENT(IN)         :: iotype ! the edit descriptor
      INTEGER, OPTIONAL, INTENT(IN)         :: w,d,m
      LOGICAL, INTENT(OUT)                  :: err
      CHARACTER (LEN=*), INTENT(OUT)        :: errmsg

      TYPE (linkedList), POINTER :: ptr
      INTEGER                    :: ww, dd   ! local copies of w,d
      INTEGER                    :: en       ! iostat= error value
      CHARACTER (LEN=20)         :: fmt      ! format specification

      err = .FALSE.

      ! handle the optional "w" and "d" arguments
      ww = 10
      IF ( present ( w ) ) THEN
        ww = w
      END IF
      dd = 1
      IF ( present ( d ) ) THEN
        dd = d
      END IF

      ! if we will need a format-spec, build it now
      IF ( iotype(1:2) == "DT" ) THEN
        write(fmt, "('(1X,I',I4,'.',I4,')')" ) ww, dd   ! (1X,Iw.d)
      END IF

      ptr => dtv
      DO         ! main loop through the linked list
        IF ( iotype == "LISTDIRECTED" ) THEN
          WRITE (unit, *, ADVANCE="NO", ERR=99, IOSTAT=en) ptr%value
        ELSE IF ( iotype(1:2) == "DT" ) THEN
          write(unit, fmt, ADVANCE="NO", ERR=99, IOSTAT=en) ptr%value
        ELSE     ! unrecognized i/o type
          errmsg="Unsupported I/O request:type(linkedList):"//iotype
          err = .TRUE.
          RETURN
        END IF
|       ptr => ptr%next
|       IF ( .NOT. ASSOCIATED (ptr) ) EXIT   ! stop after the last node
      END DO
      RETURN     ! normal exit

 99   write(errmsg, "('Error writing linkedList%value, IOSTAT=',I9)") en
      err = .TRUE.
      RETURN     ! error exit
    END SUBROUTINE my_write_routine
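
The format-building step of the example can be checked on its own
with ordinary internal I/O, independently of the proposed UDDTIO
syntax.  The sketch below is illustrative only: the module name
uddtio_format_sketch and the function build_int_format are
hypothetical and do not appear in this paper.  It relies on blanks
inside a format specification being insignificant, so the text
produced by the two I4 fields is still a valid Iw.d descriptor.

```fortran
module uddtio_format_sketch
  implicit none
contains
  ! Hypothetical helper (not defined by this paper): build the
  ! run-time format "(1X,Iw.d)" the same way the example routine
  ! does, using an internal WRITE into a 20-character variable.
  function build_int_format(ww, dd) result(fmt)
    integer, intent(in) :: ww, dd
    character(len=20)   :: fmt

    ! The outer parentheses delimit the format used for this WRITE;
    ! the quoted pieces and the two I4 fields form the text of the
    ! format being built (16 characters, padded with blanks).
    write (fmt, "('(1X,I',I4,'.',I4,')')") ww, dd
  end function build_int_format
end module uddtio_format_sketch
```

A caller would assign the result to a character variable first and
then use that variable as the format of a later WRITE, for example
fmt = build_int_format(10, 1) followed by WRITE (line, fmt) 42,
which right-justifies the value in a 10-character field after one
leading blank.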