To: X3J3							X3J3/96-148r1
From: /io (bleikamp)
Subject: Preliminary Functional Specification for Derived Type I/O
Date: Aug. 14, 1996

This document is a PRELIMINARY specification of the high level functionality
to be provided for supporting I/O on derived types.  The goals are to
provide a powerful and portable way to encapsulate I/O support in a MODULE
which defines a derived type.  This I/O support would be accessible via
the traditional Fortran I/O formatted READ/WRITE statements.

Subgroup has discussed some, but not all, of the issues dealt with herein.
Please send comments to Rich Bleikamp (bleikamp@rsn.hp.com) or to the X3J3
mail alias.

Management Synopsis:
  - Add a new edit descriptor, "DT".  When the I/O library encounters this,
    it must match up with a derived type.  The I/O library will call a
    user supplied procedure, which will actually do the I/O.  Typically,
    the provider of a derived type would provide these formatting routines.

    List directed and NAMELIST I/O will also call these same user supplied
    routines under certain circumstances.

    The F90 way of doing formatted (and unformatted) I/O on derived types
    still works the same as before.  Only the presence of an interface for
    the appropriate I/O routine triggers this new functionality.

    The user supplied procedures (one for READs, one for WRITEs) will
    be called with a unit number, the derived type variable/value, and other
    misc. information.  The procedure will use normal I/O statements
    (READ/WRITE) on the supplied unit to read/write the data in the
    records of the file.  This use of "recursive" I/O will be restricted
    to this particular feature of the language.

    The user supplied procedure will be able to inquire about, and in the most
    general case, have to worry about:
      - Formatted vs. list directed vs. namelist
      - both sequential and direct access I/O
      - non-advancing and advancing I/O
      - updating the SIZE= variable for non-advancing I/O
      - the DELIM= and PAD= values for this file (accessible via INQUIRE)

Detailed Specification:
  - A new edit descriptor, "DT", with the usual (optional) "[w[.d[.m]]]"
    widths is provided.  It must match up with a variable/value of a
    derived type.

  - The DT characters may be followed by an arbitrary number of alphabetic
    characters (interspersed blanks allowed).  The entire string of
    alphabetic characters, including the initial "DT", will be passed
    into the formatting routine (as the "ed" argument).  This passed in edit
    descriptor will have been converted to UPPERCASE and had all blanks removed.
    The user can develop very sophisticated schemes for requesting different
    types of formatting for one derived type via this extended edit descriptor.
    For example, the consecutive characters after the "DT" could be used to
    request different formatting rules for consecutive components in the
    derived type, or different formatting rules for nested derived types, etc.

  - For list directed I/O, the "ed" argument will have the value
    "LISTDIRECTED".  For NAMELIST I/O, the "ed" argument will have the
    value "NAMELIST".  There is no leading "DT" in these cases.

  - If a derived type is specified in an I/O list and will match up with a
    "DT" edit descriptor, the user must also have provided the matching
    read/write procedure for that derived type, with the specified
    visible interface.  Conversely, if the interface is visible, the derived
    type item MUST match a "DT" edit descriptor.  (This converse limitation
    is not due to implementation difficulties, so we don't have to restrict
    it in this way.)

    Said procedures must be defined as follows:

        INTERFACE FORMAT ( DT )
	  SUBROUTINE READ_xxx (dtv, ed, w, d, m)
	    TYPE (whateveritis) dtv	! the derived type value/variable
	    CHARACTER (*) ed		! the edit descriptor string
	    INTEGER, OPTIONAL :: w,d,m
	  END
	  SUBROUTINE WRITE_xxx (dtv, ed, w, d, m)
	    TYPE (whateveritis) dtv	! the derived type value/variable
	    CHARACTER (*) ed		! the edit descriptor string
	    INTEGER, OPTIONAL :: w,d,m
	  END
        END INTERFACE

	where "xxx" is the name of the derived type and
	the DT in parens on the INTERFACE statement must be one of
	the allowed edit descriptors ("DT", "DTA", ...).
	We don't actually need the derived type name as part of the
	routine name.  We could just name the routines READ and WRITE,
	or FORMATTED_READ, ...

	The "dtv" dummy arg should be assigned a value by the READ_xxx
	routine, and contains the value to be output by the WRITE_xxx
	routine.

	The "w", "d", and "m" arguments contain the user specified values
	from the FORMAT, e.g. FORMAT ( DT12.5.2 ).
	If the user did not specify "w", "d", and/or "m", they will not
	be present.  They will not be present for list directed and namelist
	I/O either.

        In the absence of an appropriate visible interface in the scope of
	the I/O statement, list-directed I/O will behave as it did for
	Fortran 90.  Same for namelist.

  - When an appropriate interface is visible (really, when the routines will
    be called instead of using F90 semantics), the restrictions on derived
    type I/O, such as no private components, all components must be defined,
    no ultimate components with the pointer attribute, etc. do not apply.
    The normal rules in F95 still apply, about not referencing undefined
    entities, not referencing/defining POINTERS which are not associated,
    etc.

  - If NO appropriate interface is visible for a particular derived type,
    the processor will assume that the "F90" style I/O is happening, and
    a "DT" edit descriptor will cause an error (at runtime possibly).
    When F90 style I/O happens, all the old restrictions still apply.

  - END=, ERR=, EOR=, SIZE= ...
    We have not decided yet how to accommodate these specifiers in the original
    READ/WRITE statement.  Most likely, there will be some extra arguments
    passed in, which need to be defined with .TRUE. if an error occurs, or
    the end of the file is reached, etc.  We may not need to deal with
    SIZE=, since the runtime might be able to do this all by itself.

  - The user's routine may choose to interpret the "w" argument as a field width,
    but this is NOT required.  If it does so, it would be appropriate, but
    not required, to fill an output field with "***"s if the value does not
    fit.

  - The formatting routines must use the passed in unit #.
    An implementation is free to substitute a special unit number (such as -999)
    if it chooses, to help the runtime library identify this special recursive
    I/O stuff.  This is likely to be needed to support internal files also.

    No other I/O is permitted in the formatting routines.

  - If the original READ/WRITE statement specified sequential I/O,
    only sequential I/O may be performed by the formatting routine.
    Similarly for DIRECT ACCESS I/O.  We'll probably have to pass in
    the RECord number or something.

  - When the original I/O statement was a READ, the formatting routine
    may only do READs.  Similarly for WRITE.

  - The formatting routines ARE permitted to use a FORMAT with
    a DT edit descriptor, for handling components of the derived type
    which are themselves a derived type.  List directed and NAMELIST
    I/O are also permitted for the recursive I/O statement.

  - The WRITE routine will in essence insert the characters written
    by the recursive WRITE into the record started by the original
    WRITE statement.  Record boundaries may be created by the recursive
    WRITE.  See examples below.  Non-advancing I/O may be used to avoid
    creating record boundaries.

  - The READ routine will pick up in the current record, where the
    last edit descriptor left off.  Multiple records can be read,
    and the current position can be left within a record by
    the recursive READ statement, through the use of non-advancing I/O.

  - A very robust formatting routine may need to use INQUIRE to determine
    whether sequential or direct access I/O is being performed, what
    PAD= and DELIM= are for the specified unit, etc.
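
Sketch (not part of the proposal): three of the points above, written as
helpers that a formatting routine might call.  The module and procedure
names (dt_format_helpers, normalize_ed, fit_field, unit_modes) are
hypothetical.  normalize_ed mimics the uppercase/blank-removal rule for
the "ed" argument, fit_field shows the optional asterisk fill when a value
does not fit in "w" characters, and unit_modes uses a standard INQUIRE to
look up ACCESS=, PAD= and DELIM=.  Only standard Fortran 90 is assumed;
nothing here depends on the DT feature itself.

```fortran
! Hypothetical helper module; none of these names come from the proposal.
module dt_format_helpers
  implicit none
contains

  ! Uppercase an edit descriptor and remove all blanks, mirroring what the
  ! I/O library is described as doing before it passes "ed" in.
  ! The result is blank padded to the length of the input.
  function normalize_ed(raw) result(ed)
    character(len=*), intent(in) :: raw
    character(len=len(raw))      :: ed
    integer :: i, n, code
    ed = " "
    n = 0
    do i = 1, len(raw)
      if (raw(i:i) == " ") cycle
      n = n + 1
      code = iachar(raw(i:i))
      if (code >= iachar("a") .and. code <= iachar("z")) then
        ed(n:n) = achar(code - 32)   ! ASCII lower case to upper case
      else
        ed(n:n) = raw(i:i)
      end if
    end do
  end function normalize_ed

  ! Right-justify text in a field of width w (w >= 1 is assumed);
  ! fill the whole field with asterisks when the text does not fit.
  function fit_field(text, w) result(field)
    character(len=*), intent(in) :: text
    integer, intent(in)          :: w
    character(len=w)             :: field
    integer :: n
    n = len_trim(text)
    if (n > w) then
      field = repeat("*", w)
    else
      field = repeat(" ", w - n) // text(1:n)
    end if
  end function fit_field

  ! Ask the runtime how the unit is connected.
  subroutine unit_modes(unit, is_direct, pad_on, delim)
    integer, intent(in)           :: unit
    logical, intent(out)          :: is_direct, pad_on
    character(len=*), intent(out) :: delim
    character(len=10) :: acc, padval
    inquire (unit=unit, access=acc, pad=padval, delim=delim)
    is_direct = (acc == "DIRECT")
    pad_on    = (padval == "YES")
  end subroutine unit_modes

end module dt_format_helpers
```

For instance, normalize_ed("dt a b") yields "DTAB" (blank padded), and
fit_field("12345", 3) yields "***".  A formatting routine could call
unit_modes once on the passed in unit and branch on the results.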

Example:  ! derived type i/o example: user program
    USE mytype_module
    TYPE (mytype) :: a
    write(6,*) "hi there, here is my derived type", a
    ...

! derived type i/o example, user written formatting routine for READ
! usually contained in the MODULE which defines the derived type
MODULE .......
TYPE mytype
    REAL :: rval1, rval2
END TYPE mytype
INTERFACE FORMAT (READ)
  MODULE SUBROUTINE read (unit, dtv, ed, w, d, m, eof, err)
    ...
  END
END INTERFACE

SUBROUTINE read (unit, dtv, ed, w, d, m, eof, err)
  INTEGER  ::  unit 			! unit number
  TYPE (mytype) ::  dtv			! derived type variable to assign to
  CHARACTER (LEN=*) :: ed		! the edit descriptor specified
  INTEGER, OPTIONAL :: w, d, m
  LOGICAL :: eof, err			! set to true if condition occurs

  INTEGER ww, dd, mm
  CHARACTER (LEN=255) :: fmtstr	! we'll build a runtime format here

  ! use user specified field widths (if present), else use default values
  IF ( PRESENT(w) ) THEN
    ww = w
  ELSE
    ww = 12
  END IF
  IF ( PRESENT(d) ) THEN
    dd = d
  ELSE
    dd = 6
  END IF
  IF ( PRESENT(m) ) THEN
    mm = m
  ELSE
    mm = 2
  END IF
  ! make sure field widths are in range
  ww = MAX (MIN(ww, 99), 5)
  dd = MAX (MIN(dd, 97), 1)
  mm = MAX (MIN(mm,  6), 1)

  eof = .false. ;  err = .false.
  IF ( ed == "NAMELIST") THEN
    err = .TRUE. ; RETURN	! haven't implemented NAMELIST support yet
  ELSE IF ( ed == "LISTDIRECTED" ) THEN
    ! fall thru
  ELSE				! FORMAT specification
    IF ( ed /= "DT") THEN
      ! only "DT" supported for now, no extended DTxxx's allowed
      ERR = .TRUE. ; RETURN
    END IF
  END IF

  IF ( ed == "LISTDIRECTED" ) THEN
    ! non-advancing list directed I/O has not been added to F2000 yet
    READ (UNIT, *, ADVANCE="NO", ERR=101, END=102)  dtv%rval1, dtv%rval2
  ELSE
    ! Formatted I/O, build a runtime format based on w,d, and m
    ! fmtstr should look like "( 2(Ew.dEm,1x) )"
        WRITE (UNIT=fmtstr, FMT=1, ERR=101) ww, dd, mm
1       FORMAT ("( 2(E",I2,".",I2,"E",I1,",1x) )")
    READ (UNIT=unit, FMT=fmtstr, ADVANCE="NO",ERR=101,END=102) dtv%rval1, dtv%rval2
  END IF
  RETURN
101 err = .TRUE. ; RETURN
102 eof = .TRUE. ; RETURN
END
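
Sketch (not part of the proposal): the runtime-format step used in the
routine above, isolated so it can be tried with standard Fortran 95.  It
assumes the third number is meant as an exponent width, i.e. the standard
Ew.dEe form; that reading is an assumption.  The names
runtime_format_sketch and build_e_format are hypothetical.  I0 editing is
used so the generated format contains no embedded blanks.

```fortran
! Hypothetical helper; not part of the proposed interface.
module runtime_format_sketch
  implicit none
contains

  ! Build a format such as "(2(E12.5E2,1X))" from a width, a number of
  ! decimals and an exponent width, using an internal WRITE.
  function build_e_format(ww, dd, ee) result(fmtstr)
    integer, intent(in) :: ww, dd, ee
    character(len=32)   :: fmtstr
    write (fmtstr, "('(2(E',I0,'.',I0,'E',I0,',1X))')") ww, dd, ee
  end function build_e_format

end module runtime_format_sketch
```

The result can then be passed as FMT= in a READ or WRITE of two REAL
values.  It is safest to assign the result to a character variable first
and not to call the function from inside another I/O statement.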