02-318

    Public Review Comment #6 by Jean Vezina

To:         J3
From:       Craig Dedo
Date:       November 11, 2002
Subject:    Public Review Comment #6 by Jean Vezina

    COMMENTS ON THE DRAFT FORTRAN 2000 STANDARD
    (ISO/IEC 1539-1)
    BY
    JEAN VEZINA


Mail Address:
Jean Vezina
6292 Villanelle
St-Leonard, Quebec
Canada
H1S 1W1
e-mail addresses:
jvezina@attglobal.net
Jean.Vezina@tdam.com

   INTRODUCTION

The new proposed Fortran 2000 standard enhances the language considerably
by
introducing a number of modern new features such as object-oriented
programming
facilities and numerical exception handling. However, there are a number of
features,
some of these relatively minor, that we would like to be introduced
immediately in the
language. These features have been frequently requested in the various
discussions
groups about Fortran in the past and we feel that they will be beneficial
to the Fortran
community. Specifically, we ask for a few changes to the draft standard and
three new
features:

The changes to the draft standard we request is to rename the NONKIND
attribute to
EXTENT in parameterized derived type declarations and a few corrections to
the C
interoperability feature.

The new features we propose are:
1. Unsigned integers.
2. A CONVERSION EXPLICIT statement, to disallow mixing types and kinds in
intrinsic numeric expressions, assignments and initializations.
3. The FIRSTLOC and LASTLOC array transformational functions
The following pages describe our requested changes and a proposed
implementation of
the three suggested features.

If, because of time or resource constraints, these features cannot be
included in the next
Fortran standard, then we would like that these be included in the
repository of
requirements for the successor of Fortran 2000.

1- NONKIND should be renamed to EXTENT

In parameterized derived type declarations, there are currently two
categories of
parameter specification "arguments": KIND to specify the kind of the
declared variables
in the derived type definition and NONKIND for array bounds and character
string
lengths. The word NONKIND is somewhat restrictive as it already impairs the
possibility
of adding additional parameter attributes in a future revision of the
standard. This is
because NONKIND means "anything that is not a kind". Adding future type
parameter
attributes (such as TYPE for genericity) will be problematic as they will
be also
"nonkinds". We therefore suggest replacing the NONKIND keyword by its
current
meaning: EXTENT. Then any future improvements to parameterized derived
types will
be not hampered by the choice of an inadequate keyword.

2- Comments on the C interoperability feature

We have noticed one error and one serious omission.

The error is in the C function prototype given at Note 15.22 at page 390.

The prototype in the note is:

short func(int i; double *j; int *k; int l[10]; void *m)

This is not valid C. The semicolons must be replaced by commas such as:

short func(int i, double *j, int *k, int l[10], void *m)

The omission is the lack of a SIZEOF function or its equivalent. A number
of C routines
related to I/O or communication APIs have a size parameter that it is
traditionally
computed by using the C library function sizeof. Many Fortran compiler
vendors that
provide extensions to access C routines also supply a nonstandard sizeof
function to
allow the size of items in a C sense to be computed. Otherwise, workarounds
such as
using the IOLENGTH keyword of INQUIRE or SIZE(TRANSFER … will be required.
These workarounds are not guaranteed to give the correct results.

We therefore suggest the addition of a C_SIZEOF function to the
ISO_C_BINDING
module. The C interoperability feature will then standardize a feature that
is already
implemented in a number of extended Fortran 95 compilers.

Note: Some commenters suggested that the sizeof function (or its
equivalent) should
be added to the Fortran 2000 intrinsic function list rather than to the
ISO_C_BINDING module. This is a good alternative.


3- Unsigned integers

Justification for requirement:

In a number of scientific and engineering applications, the data are
represented using
unsigned integers. For example, in image processing applications, pixels
are represented
by integer quantities varying from 0 to n. Also, in many signal processing
applications,
the signal levels are digitized into discrete values varying again from 0
to n. These data
are therefore stored using an unsigned integer format. Some data encryption
and pseudo-random
number generation algorithms are also more naturally expressed using
unsigned
integers. In addition, the C interoperability facility introduced in F2K
will cause Fortran
programs to interact with C procedures that may expect unsigned integer
arguments.
Finally, many "industry standard" file formats such as TIFF include
unsigned integers as
one of their possible data representations. The unformatted stream I/O
facility introduced
in F2K will finally allow Fortran programs to access portably these kinds
of files,
requiring programs to process data stored in unsigned integer format.

The current methods of dealing with unsigned integers in Fortran are
awkward and not
portable. We can list three workarounds that are currently used along with
their problems:

1- Using the character data type to hold 8-bit unsigned integer values

Considering byte values as characters is not always possible as some
implementations of
Fortran limit the values allowed in the CHAR and ICHAR intrinsic functions
to the 127
ASCII characters, thus not accessing the full byte.

2- Converting the unsigned integer to a signed integer capable of
representing the
maximum value of the unsigned integer, for example a 32-bit signed integer
to hold
a 16-bit unsigned integer.

This method has the inconvenient of wasting space, which may be significant
for some
applications such as image processing where huge amounts of data are
processed.

3- Transforming the unsigned values to signed values by subtracting an
appropriate value.

Since this method requires extra processing, it is not optimal.

Consequently, we feel that an unsigned integer facility should be added in
this standard in
order to satisfy user needs. Our proposal is described beginning at the
following page.

The proposal

Our proposal provides a complete unsigned integer facility with minimal
syntactical
additions to the language.

The model set

The model set for unsigned integers is similar to the one for signed
integers (Section
13.4: Numeric Models of the draft) with the exception that there is no s
(sign) multiplier.
For bit manipulation purposes, the model defined at section 13.3 is useable
directly.

Implementation

Unsigned integers should be provided as additional KINDs of the basic
INTEGER type.
To select an unsigned integer kind, the SELECTED_INT_KIND is extended by
adding
an optional UNSIGNED logical argument. Thus SELECTED_INT_KIND(2,.TRUE.) or
SELECTED_INT_KIND(2,UNSIGNED=.TRUE.) will return a kind suitable for an
unsigned integer where the range 0 to 99 can be represented.

Using kinds for implementing unsigned integers has the advantage of
requiring no
additional syntax to specify constants and to reuse the intrinsics.
Unsigned integers
should be allowed in all places where nondefault kind integer quantities
(variables or
expressions) are permitted, except at places where a negative value is
likely to be
returned (such as IOSTAT).

Conversion rules when unsigned integer operands are mixed with other types
in
expressions

When unsigned integer operands are mixed with REAL and COMPLEX operands in
an
expression, the same rules apply as for ordinary integers: the unsigned
integer is
converted to the type and kind of its corresponding REAL and COMPLEX
operand.

When unsigned integer operands of different kinds are mixed, the kind of
the result is that
of the operand that allows for the maximum representable value. For example
if the kind
for the first operand allows for values ranging from 0 to 255 and the kind
of the second
operand allows a range from 0 to 65536, the kind of the result will be of
the second
operand.

When unsigned integers are mixed with signed integers, the issue is not
clear and we
present four possibilities:

1- Disallow mixing signed and unsigned integers.

This is the "safe" approach if any of the following suggestions seems too
complicated or
error prone.

2- The kind of the result is that of the signed integer operand.

This is consistent with the Fortran behavior (inferred, not explicitly
mentioned in the
standard) of converting from the simpler data type to the more complex data
type.
Unsigned integers have a simpler representation than signed integers and
are thus "lower"
in the hierarchy. There are some risks of overflow or underflow, however,
if the
programmer is not careful.

3- Use a behavior similar to C, that is, use the "size" of the variable to
determine
the kind of the result.

It is agreed that the C rules are very error prone, particularly when both
operands have
the same "size".

4- The kind of the result is that of a signed integer capable of
representing the
maximum positive value of either operand.

The rule is the following: Let two operands A and B. The kind of the result
is that of a
signed integer capable of representing MAX(HUGE(A),HUGE(B)). If such a kind
is not
available, the program is not standard conforming.
Of course, only one possibility should be selected by the committee.


4- CONVERSION EXPLICIT Statement

Justification for requirement:

It has been mentioned several times that allowing mixed types and kinds in
intrinsic
numeric expressions, assignments, and initializations is error prone and
may conduce to
hard to find bugs. For example, when a default real constant is
inadvertently assigned to a
double precision variable, the programmer is often surprised by the
precision loss. Also,
when integer variables are mixed with real variables in an expression, if
the programmer
is not careful, unexpected truncations may occur because of operator
hierarchy. A
negative criticism of the Fortran language has been made on that basis.

What we suggest to solve this problem is the introduction of a CONVERSION
EXPLICIT statement that, when present, forces the programmer to explicitly
specify
type conversion by means of the usual Fortran type conversion intrinsics.

Description of the feature

The CONVERSION EXPLICIT statement has exactly the same scope of an
IMPLICIT statement. Its position in a program unit should be the same as
the
IMPLICIT statement.

Its effect is that automatic conversion of types and KIND promotions in
intrinsic numeric
expressions, assignments, and initializations are disallowed. The
exponentiation operator
remains a special case where an integer exponent should be allowed to be
used with a real
or complex base.

Note: in an early draft of "Fortran 8x" published in the eighties, the
statement
CONVERSION NONE was proposed for the feature. This is an acceptable
alternative spelling.


5- FIRSTLOC and LASTLOC array location functions

Justification for requirement:

Searching for the first or last location of the occurrence of a particular
value in an array is
a frequent programming task. However, this operation is awkward to program
efficiently
using Fortran array notation. As a result, serial DO loops are needed to
accomplish this
simple task.We suggest two intrinsic functions that provide this
functionality. We can
give two examples of the usefulness of such a search capability:
1- Finding the position of the first item that matches a given criterion in
an unordered
list.
2- Finding the position of the first and last nonzero pixels in a digitized
image
represented as a two-dimensional array.

Description of the feature

We have prepared two entries similar to those found in the Fortran 2000
draft to describe
these two functions.


FIRSTLOC (MASK [, DIM, KIND])

Description. Determine the location of the first .TRUE. element of MASK
along
dimension DIM.

Class. Transformational function.

Arguments.

MASK shall be of type logical. It shall not be scalar.
DIM (optional) shall be scalar and of type integer with a value in the
range 1 ≤ DIM ≤ n,
where n is the rank of MASK. The corresponding actual argument shall
not be an optional dummy argument.
KIND (optional) shall be a scalar integer initialization expression.

Result Characteristics. Integer. If KIND is present, the kind type
parameter is that
specified by the value of KIND; otherwise the kind type parameter is that
of default
integer type. If DIM is absent, the result is an array of rank one and of
size equal to the
rank of MASK; otherwise, the result is of n   1 and shape (d1, d2, ...,
dDIM-1, dDIM+1,
..., dn), where (d1, d2, ..., dn) is the shape of MASK.

Case (i): The result of FIRSTLOC (MASK) is a rank-one array whose element
values are the values of the subscripts of an element of MASK which is
the first element having the value .TRUE. occurring in array element
order. If there is at least one element having a .TRUE. value, the ith
subscript returned lies in the range 1 to ei, where ei is the extent of the
ith
dimension of MASK.  If all the elements of MASK have the value
.FALSE., then the ith subscript returned has the value ei + 1 (See note).
If
MASK has size zero, all elements of the result are zero.

Case (ii): If MASK has rank one, FIRSTLOC (MASK, DIM = DIM) is a scalar
whose value is equal to that of the first element of FIRSTLOC (MASK).
Otherwise, the value of element (s1, s2, ..., sDIM-1, sDIM+1, ..., sn) of
the
result is equal to

    FIRSTLOC (MASK(s1, s2, ..., sDIM-1, sDIM+1, ..., sn), DIM=1)

Examples:

Case (i): The value of FIRSTLOC((/.FALSE.,.TRUE.,.TRUE.,.FALSE./)) is [2];
The value of FIRSTLOC((/.FALSE.,.FALSE.,.FALSE./)) is [4].

Case(ii): The value of FIRSTLOC((/.FALSE.,.TRUE.,.TRUE.,.FALSE./), DIM=1)
is 2.

If B has the value ⎡ 1  2 -9 ⎤ , FIRSTLOC(B == 2, DIM=1) is [2, 1, 3]
                   ⎣ 2  2  6 ⎦
and FIRSTLOC(B == 2, DIM=2) is [2,1].


LASTLOC (MASK [, DIM, KIND])

Description. Determine the location of the last .TRUE. element of MASK
along
dimension DIM.

Class. Transformational function.

Arguments.

MASK shall be of type logical. It shall not be scalar.
DIM (optional) shall be scalar and of type integer with a value in the
range 1 ≤ DIM ≤ n,
where n is the rank of MASK. The corresponding actual argument shall
not be an optional dummy argument.
KIND (optional) shall be a scalar integer initialization expression.

Result Characteristics. Integer. If KIND is present, the kind type
parameter is that
specified by the value of KIND; otherwise the kind type parameter is that
of default
integer type. If DIM is absent, the result is an array of rank one and of
size equal to the
rank of MASK; otherwise, the result is of rank n   1 and shape (d1, d2,...,
dDIM-1, dDIM+1,
..., dn), where (d1, d2, ..., dn) is the shape of MASK.

Case (i): The result of LASTLOC (MASK) is a rank-one array whose element
values are the values of the subscripts of an element of MASK which is
the last element having the value .TRUE. occuring in array element order.
If there is at least one element having a .TRUE. value, the ith subscript
returned lies in the range 1 to ei, where ei is the extent of the ith
dimension
of MASK.  If all the elements of MASK have the value .FALSE., then all
elements returned are zero (See note). If MASK has size zero, all
elements of the result are zero.

Case (ii): If MASK has rank one, LASTLOC (MASK, DIM = DIM) is a scalar
whose value is equal to that of the first element of LASTLOC (MASK).
Otherwise, the value of element (s1, s2, ..., sDIM-1, sDIM+1, ..., sn) of
the
result is equal to

    LASTLOC (MASK(s1, s2, ..., sDIM-1, sDIM+1, ..., sn), DIM=1)

Examples:

Case (i): The value of LASTLOC((/.FALSE.,.TRUE.,.TRUE.,.FALSE./)) is [3];
The value of LASTLOC((/.FALSE.,.FALSE.,.FALSE./)) is [0].

Case(ii): The value of LASTLOC((/.FALSE.,.TRUE.,.TRUE.,.FALSE./), DIM=1)
is 3.
If B has the value ⎡ 1 2 -9 ⎤ , LASTLOC(B == 2, DIM=1) is [2, 2, 0]
                           ⎣ 2 2 6 ⎦
and LASTLOC(B == 2, DIM=2) is [2,2].

Note : The choice of returning a position of extenti+1 when there are no
.TRUE.
elements in FIRSTLOC and 0 in the case of LASTLOC for the same case is
consistent with the behavior observed in the case of an identical search
operation
performed with DO loops. However, the INDEX intrinsic function returns 0
when
the substring is not found irrespective of the BACK parameter. If the
committee
chooses to replace the proposed behavior by something else, the sentences
in the
function description describing the behavior and the values in the examples
corresponding to the all .FALSE. condition are shown in red to ease their
replacement.


References
02-007r3, Fortran 2000 Committee Draft

[End of J3 / 02-318]