J3/13-356
To: J3
From: Malcolm Cohen
Subject: Conformance to IEC 60559:2010
Date: 2013 October 16
1. Introduction
This paper is a report into what is required for the next revision of the
Fortran standard to conform to IEC 60559:2010.
This paper does not contain any edits. There are some recommendations,
but really it is for background and for discussion.
2. Changes not required
IEC 60559:2010 uses "subnormal number" for what the original IEEE-754
called "denormalized number". However, it also establishes "denormalized
number" as an alias, so we can continue to use that term if we wish.
Option 1: continue to use denormal everywhere.
Option 2: use subnormal in text, but retain DENORMAL in generic and
constant names.
Option 3: Use subnormal everywhere, but retain DENORMAL as aliases for
generic and constant names.
Recommendation: My preferences would be 3>2>1, but 1 is clearly least
work for all concerned.
There are an infinite number of floating-point formats defined by IEC
60559:2010; these are (I think) a superset of the infinite number defined
by IEC 60559:1989. No action seems to be necessary for us to get all the
new types.
3. Random Discussion
I think we should require that for an IEEE floating-point interchange
format, the representation when written to an external file should be as
required by IEC 60559. For decimal that gives a choice of two formats for
the same set of computational numbers; for binary only one format is
allowed. It is also arguable that the internal representation should be
the same as that required by IEC 60559.
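To make the representation question concrete, here is a minimal sketch of how one might inspect the internal encoding of a value. The module name binary32_bits and the helper bits_of are hypothetical, and the sketch assumes that REAL32 on the processor in question is the IEC 60559 binary32 interchange format, which the Fortran standard does not guarantee.

```fortran
module binary32_bits
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none
contains
  ! Hypothetical helper: the storage bits of x, reinterpreted as a
  ! 32-bit integer.  Assumes REAL32 is IEC 60559 binary32.
  pure function bits_of(x) result(b)
    real(real32), intent(in) :: x
    integer(int32) :: b
    b = transfer(x, b)   ! b serves only as the mold here
  end function bits_of
end module binary32_bits
```

On a processor that does use the binary32 encoding, bits_of(1.0_real32) has the bit pattern Z'3F800000' (sign 0, biased exponent 127, zero fraction).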
IEC 60559:2010 says that we *shall* provide a means to specify an
attribute, such as rounding mode, with block scope, i.e. statically. Also
that we *should* provide a dynamic means of specifying attributes.
Currently we only provide a dynamic means; however one could argue that
because mode setting by called procedures does not affect the mode in the
caller, we effectively have the static means as well, by simply putting the
appropriate CALL IEEE_SET_ROUNDING_MODE (or whatever) in. Therefore,
although it might be nice to have special syntax for static modes, we need
not do anything here.
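The argument above can be sketched as follows. The module and procedure names (static_mode_sketch, divide_up, caller_mode_preserved) are hypothetical, invented for illustration; the sketch relies on the existing rule that the rounding mode on return from a procedure is the same as it was on entry.

```fortran
module static_mode_sketch
  implicit none
contains
  ! Hypothetical helper: x/y computed with upward rounding where supported.
  function divide_up(x, y) result(q)
    use, intrinsic :: ieee_arithmetic
    real, intent(in) :: x, y
    real :: q
    if (ieee_support_rounding(ieee_up, x)) call ieee_set_rounding_mode(ieee_up)
    q = x / y
    ! No explicit restore: the caller's mode is reinstated on return.
  end function divide_up

  ! .true. if the caller's rounding mode is the same before and after
  ! the call to divide_up.
  function caller_mode_preserved() result(ok)
    use, intrinsic :: ieee_arithmetic
    logical :: ok
    type(ieee_round_type) :: before, after
    real :: q
    call ieee_get_rounding_mode(before)
    q = divide_up(1.0, 3.0)
    call ieee_get_rounding_mode(after)
    ok = (before == after)
  end function caller_mode_preserved
end module static_mode_sketch
```

The mode change is thus confined to the body of divide_up, which is about as close to block scope as we get without new syntax.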
The description of IEEE_RINT needs to be replaced or updated so that it
says that it does the IEEE operation roundToIntegralExact (yes, that is a
stupid name, but it is what is in IEC 60559:2010). Various other
operations and functions would also benefit from explicit linking to the
IEC 60559:2010 operation they are supposed to be conforming to.
4. Rounding mode
There is an extra rounding mode, roundTiesToAway; this is not required for
binary formats, but is required for decimal formats. We should add this
extra rounding mode, as IEEE_AWAY or IEEE_ROUND_AWAY. The existing
IEEE_SET_ROUNDING_MODE and IEEE_SUPPORT_ROUNDING procedures need no change.
There is little effect on implementations, which can simply report false
for IEEE_SUPPORT_ROUNDING(IEEE_ROUND_AWAY). This is likely to be about as
useful as IEEE_OTHER, but is not exactly a burden.
5. Required operations
IEC 60559:2010 requires a lot more operations than the 1989 version.
5.1 Rounding functions (without change of format)
For the sake of expository conciseness, let
  {rounding} = { TiesToEven, TiesToAway, TowardZero,
                 TowardPositive, TowardNegative }
corresponding to what we call IEEE_NEAREST, (the new) IEEE_ROUND_AWAY,
IEEE_TO_ZERO, IEEE_UP and IEEE_DOWN.
roundToIntegral{rounding} : 5 functions.
For roundToIntegralTiesToEven, this can be done with the user function
  REAL(ieee_kind) FUNCTION ieee_rint_even(x) RESULT(y)
    REAL(ieee_kind),INTENT(IN) :: x
    IF (.NOT.IEEE_SUPPORT_ROUNDING(IEEE_NEAREST)) &
      ERROR STOP 'SBABAW'
    CALL IEEE_SET_ROUNDING_MODE(IEEE_NEAREST)
    y = IEEE_RINT(x)
    ! This is not permitted to raise INEXACT.
    CALL IEEE_SET_FLAG(IEEE_INEXACT,.FALSE.)
  END FUNCTION
and similarly for the others.
Although it could be unconvincingly argued that this satisfies the "shall
provide" requirement, I recommend adding an optional rounding-mode argument
to IEEE_RINT that would make it do these. This has some implementation
impact but it is not so large.
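As a rough illustration of what such an argument would do, here is a user-level sketch that generalises the function above to any supported rounding mode. The names rint_mode_sketch and rint_mode are hypothetical; this is not proposed standard text, and it is written for default REAL only.

```fortran
module rint_mode_sketch
  implicit none
contains
  ! Hypothetical stand-in for IEEE_RINT with a rounding-mode argument.
  ! Not ELEMENTAL, because IEEE_SET_ROUNDING_MODE is not pure.
  function rint_mode(x, mode) result(y)
    use, intrinsic :: ieee_arithmetic
    real, intent(in) :: x
    type(ieee_round_type), intent(in) :: mode
    real :: y
    if (.not. ieee_support_rounding(mode, x)) &
      error stop 'rounding mode not supported'
    ! The caller's mode is reinstated on return from this procedure.
    call ieee_set_rounding_mode(mode)
    y = ieee_rint(x)
    ! roundToIntegral{rounding} is not permitted to raise INEXACT.
    call ieee_set_flag(ieee_inexact, .false.)
  end function rint_mode
end module rint_mode_sketch
```

A call such as rint_mode(x, IEEE_UP) then plays the role of roundToIntegralTowardPositive, and similarly for the other existing modes.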
roundToIntegralExact : this is IEEE_RINT as is; we should change the
description so that it says so.
The differences between this and the preceding 5 functions are
(a) IEEE_INEXACT is raised if X is not already integral, whereas no signal
is raised with the preceding, and
(b) there is only one function, and it uses the dynamic rounding mode.
5.2 nextUp and nextDown vs. NEAREST and IEEE_NEXT_AFTER
nextUp : one might think that we could do
IEEE_NEXT_AFTER(x,IEEE_VALUE(x,IEEE_POSITIVE_INF))
but IEEE_NEXT_AFTER raises signals, whereas nextUp is quiet (except when x
is a signalling NaN). Another plausible effort is
MERGE(x,NEAREST(x,1.0_kind),x>HUGE(x))
but there is no specification of whether NEAREST is quiet.
Therefore our options are:
(0) change IEEE_NEXT_AFTER (incompatibly, but not in an important way?),
(1) change our description of NEAREST so that it is quiet except for
signalling NaN, and NEAREST(+INF,+1.0) is +INF instead of nonsense,
or
(2) add IEEE_NEXT_UP(x).
Recommendation: (2).
nextDown : similarly to nextUp, we should add IEEE_NEXT_DOWN.
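For comparison, here is a sketch of what a user would have to write today to approximate a quiet nextUp. The names next_up_sketch and next_up are hypothetical. The sketch assumes that NEAREST behaves sensibly for finite arguments, it is for default REAL only, and it does not signal for a signalling NaN as nextUp is required to do.

```fortran
module next_up_sketch
  implicit none
contains
  ! Hypothetical user-level approximation to nextUp for default REAL.
  function next_up(x) result(r)
    use, intrinsic :: ieee_arithmetic
    real, intent(in) :: x
    real :: r
    type(ieee_status_type) :: saved
    call ieee_get_status(saved)     ! remember flags and modes
    if (ieee_is_nan(x) .or. x > huge(x)) then
      r = x                         ! NaN and +INF are returned unchanged
    else if (x < -huge(x)) then
      r = -huge(x)                  ! nextUp(-INF) is the most negative finite
    else
      r = nearest(x, 1.0)           ! may raise flags; discarded below
    end if
    call ieee_set_status(saved)     ! drop any flags raised by NEAREST
  end function next_up
end module next_up_sketch
```

The need to save and restore the whole floating-point status just to stay quiet is a fair indication of why a dedicated IEEE_NEXT_UP is the cleaner option.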
5.3 Exact remainder
remainder : provided by IEEE_REM. The description here basically
duplicates the specification in IEC 60559.
Options: (0) do nothing,
(1) append to description saying it does the IEC 60559 op,
(2) replace the description by referring to IEC 60559.
Recommendation: (1) or (2).
5.4 Minimum and maximum
minNum : Ugh.
  REAL(kind) ELEMENTAL FUNCTION IEEE_MIN_NUM(x,y) RESULT(r)
    REAL(kind),INTENT(IN) :: x,y
    IF (x<y) THEN
      r = x
    ELSE IF (x>y) THEN
      r = y
    ELSE IF (IEEE_IS_NAN(x)) THEN
      r = y
    ELSE
      r = x
    END IF
  END FUNCTION
or the horrible nested merge
  MERGE(x,MERGE(y,MERGE(y,x,IEEE_IS_NAN(x)),x>y),x<y)
[...]
    ELSE IF (ABS(x)>ABS(y)) THEN
      r = y
    ELSE IF (IEEE_IS_NAN(x)) THEN
      r = y
    ELSE
      r = x
    END IF
  END FUNCTION
or the terrible nested merge
  MERGE(x,MERGE(y,MERGE(y,x,IEEE_IS_NAN(x)),ABS(x)>ABS(y)),ABS(x)<ABS(y))