J3/03-278
To: J3
From: Bill Long
Date: November 13, 2003
Subject: Post 2003 proposal for TYPELESS data type.
A TYPELESS data type in Fortran
Introduction
------------
Fortran currently lacks support for variables that represent a sequence
of bits independent of the interpretations imposed by the traditional
numeric, logical, and character data types. Various features have been
added to Fortran to facilitate handling sequences of bits by overloading
the INTEGER data type. This has lead to limitations in the usefulness of
these features as well as awkward wording in the Fortran standard text.
Adding a new type, named TYPELESS, would simplify and enhance the use of
Fortran for certain types of non-numeric problems, as well as allowing
for more clarity in the text of the standard. A TYPELESS type also
provides a way to standardize some common Fortran language extensions,
and provide a rational method for dealing with BOZ constants.
A new intrinsic data type, TYPELESS, is added to Fortran. The following
section describes the declaration and kind characteristics, constants,
type conversion, and intrinsic assignment. Subsequent sections describe
intrinsic operations and functions, input and output, and actual and
dummy argument characteristics for typeless objects.
Declarations, constants, assignment
-----------------------------------
TYPELESS is a non-numeric kind type and a numeric sequence type. Support
for two KINDs of TYPELESS data is required, corresponding to bit
sequences with the lengths of one and two numeric storage units
respectively.
BOZ constants are TYPELESS constants. BOZ constants must not specify
more bits that can be represented in the assignment target. If fewer
bits are specified, the constant is padded with zero bits on the left.
The syntax for BOZ constants is extended to allow a trailing _kind
specifier.
Assignments between TYPELESS objects and integer, real, or complex
objects of the same size involve transfers of bits without data
conversions. Assignment of a smaller TYPELESS object to an object of
type typeless, real, integer, or complex is allowed and results in the
result being padded with zero bits on the left. The same rules apply to
data initializations in declaration statements, DATA statements, and
default component initializations.
The type conversion intrinsics, CHAR, REAL, INT, CMPLX, and DBLE, are
extended to allow TYPELESS arguments. The result values are of the
indicated type and kind, but with the same bit pattern as the TYPELESS
argument, padded on the left with zero bits if necessary.
A new intrinsic, TYPELESS(), is added with an argument of type
typeless, char, real, integer, or complex, and an optional KIND
argument. The result is a typeless value with the same bit pattern as
the argument, padded on the left with zero bits if necesary.
A new intrinsic, SELECTED_TYPELESS_KIND() is added. The argument
specifies a number of bits; the result is a default integer equal to the
KIND value of the smallest typeless KIND that can hold that number of
bits. The default typeless KIND is returned by
SELECTED_TYPELESS_KIND(bit_size(0)). If no typeless kind exists with
the specified number of bits, the value -1 is returned.
Intrinsic operations
--------------------
The .and., .or., .xor., and .not. operators are defined for TYPELESS
operands. The computations are bitwise operations. The result of the
expression evaluation is typeless. If the operands are of unequal
size, the smaller is padded on the left with zero bits before the
operation is performed.
The .eq., .ne., .lt., .le., .gt., .ge., ==, /=, <, <=, >, and >=
operators are defined for TYPELESS operands. If the operands are of
unequal KIND, the smaller is padded on the left with zero bits before
the operation is performed. Operands A and B are equal if their
corresponding bits are the smae, and are unequal otherwise. If A and B
are unequal, and the leftmost unequal corresponding bit of A is 1 and
of B is 0, then A > B, otherwise A < B. Note that the behavior of
typeless comparisons is functionally the same as comparisons between
unsigned integers.
Several new intrinsic functions are defined for typeless arguments. The
POPCNT() intrinsic returns a default integer result equal to the number
of 1 bits in the argument. The POPPAR() intrinsic returns a default
integer result equal to 1 if the number of bits set in the argument is
odd, and 0 if the number of bits set in the argument is even. Note that
the result of POPPAR is equal to the low order bit of the result of
POPCNT() with the same argument. The LEADZ() intrinsic returns a
default integer result equal to the number of leading zero bits in the
argument. The TRAILZ() intrinsic returns a default integer result equal
to the number of trailing zero bits in the argument.
In addition, new intrinsics are defined for typeless arguments to
implement optimized shift operations, mask generation, and mask merge
operations. These functions facilitate the use of typeless data to
represent packed bit fields.
Intrinsic arithmetic operations are not defined for TYPELESS objects. To
use a TYPELESS object as an operand of a numeric intrinsic operator, it
must first be cast to an appropriate type with a type conversion
intrinsic function.
Formatted Input and Output
--------------------------
The format specifiers for typeless objects are I, B, O, and Z. For
output, if the .d part of the Bw.d, Ow.d, or Zw.d specifier is
omitted, the value assumed for d is >= 1 and large enough to represent
the bits in the io list item excluding leading 0 bits. For the I
format specifier, the object value is interpreted as an unsigned base
10 integer. For list directed output, the format used is Zw.d where d
is (the number of bits in the io list item + 3)/4 and w = d+1. For
list directed input, base 10 unsigned integers, and BOZ constants are
valid for typeless io list items. The number of bits required to
represent the input value must be less than or equal to the size of
the corresponding io list item. If it is less, the io list item is
padded with zero bits on the left.
Note: If a typeless object is the actual argument corresponding to an
unsigned int dummy argument of a C funtion, or the unsigned int result
of a C function is assigned to a typeless variable, the I format can
be used to output the correct form of the C value.
Procedure actual and dummy arguments and pointers
-------------------------------------------------
Two objects with different types have "compatible kind parameters" if
they are both numeric sequence types and occupy the same number of
numeric storage units. The processor may extend the concept of
compatible kind parameters to objects that are not numeric sequence
types.
TYPELESS objects can be targets of pointers with compatible kind
parameters and TYPELESS pointers can point to targets with compatible
kind parameters.
TYPELESS dummy arguments of non-intrinsic procedures are compatible with
actual arguments having compatible kind parameters.
Note: This feature can significantly simplify the interfaces for
procedures that move data in memory, but do not perform any operations
on the data. Classic examples include dusty-deck codes containing MOVE
or COPY routines, as well as some MPI library routines written in C with
dummy arguments of type void.
Section 13.3 of the standard, describing Bit Models, is modified to apply
to TYPELESS quantities, rather than integers. The bit manipulation
intrinsic procedures are defined in terms of TYPELESS data. The BIT_SIZE
intrinsic is extended to allow TYPELESS arguments. TYPELESS objects
interoperate with same size C objects, including C unsigned integer
objects.
Modification of language intrinsics:
------------------------------------
Inquiry and type conversion functions need to be extended or introduced
to accommodate a new typeless type. In addition, the bit manipulation
intrinsics should be defined in terms of typeless arguments.
Inquiry functions:
BIT_SIZE(G) : Result: integer, same size as G
G : typeless or integer
KIND(A) : Result: integer
A : any intrinsic type. In the case of typeless
constant, the kind for the smallest typeless
type that can contain the constant is returned.
SELECTED_TYPELESS_KIND(SIZE) : Result: default scalar integer
SIZE : scalar integer
SIZE is number of bits. The returned KIND value is for the smallest
typeless kind that can contain SIZE bits. If no kind is provided for the
requested SIZE the result is -1.
Note: The default kind is SELECTED_TYPELESS_KIND(bit_size(0))
Type conversion functions:
--------------------------
CHAR(G,KIND) : Result: character
G : integer or typeless
KIND : integer initialization expression
CMPLX(X, Y, KIND) : Result: complex
X : real, complex, integer, or typeless
Y : real, integer, or typeless
KIND : integer initialization expression
INT(A,KIND) : Result: integer
A : integer, real, complex, or typeless
KIND : integer initialization expression
REAL(A,KIND) : Result: real
A : integer, real, complex, or typeless
KIND : integer initialization expression
TYPELESS(A,KIND) : Result: typeless
A : integer, real, complex, or typeless
KIND : integer initialization expression
Bit manipulation functions:
BTEST(G,POS) : Result: default logical
G : typeless or integer
POS : integer
For the bit operation functions iand, ieor, and ior, the two arguments
must have compatible kind parameters. The type of the result is
determined by the following chart. Results labeled X are disallowed
argument combinations.
Type of actual argument H
T I R
-------------
T | T I R
Type of actual |
I | I I X
argument G |
R | R X X
IAND(G,H) : Result: (see chart)
G : (see chart)
H : (see chart)
IEOR(G,H) : Result: (see chart)
G : (see chart)
H : (see chart)
IOR(G,H) : Result: (see chart)
G : (see chart)
H : (see chart)
IBCLR(G,POS) : Result: same as type of G
G : typeless or integer
POS : integer
IBITS(G,POS,LEN) : Result: same as type of G
G : typeless or integer
POS : integer
LEN : integer
IBSET(G,POS) : Result: same as the type of G
G : typeless or integer
POS : integer
ISHFT(G,SHIFT) : Result: same as the type of G
G : typeless or integer
SHIFT : integer
ISHFTC(G,SHIFT,SIZE) : Result: same as the type of G
G : typeless or integer
SHIFT : integer
SIZE : integer
MVBITS(FROM, FROMPOS, LEN, TO, TOPOS) : FROM : typeless or integer
FROMPOS : integer
LEN : integer
TO : same as type of FROM
TOPOS : integer
NOT(G) : Result: same as the type of G
G : typeless or integer
New Intrinsic Functions
-----------------------
POPCNT(G) : Result: integer, in range [0..bit_size(G)]
G : typeless or integer
POPPAR(G) : Result: integer, 0 or 1
G : typeless or integer
LEADZ(G) : Result: integer, in range [0..bit_size(G)]
G : typeless or integer
TRAILZ(G) : Result: integer, in range [0..bit_size(G)]
G : typeless or integer
SHIFTL(G,SHIFT) : Result: typeless
G : typeless or integer
SHIFT : integer, >= 0
SHIFTR(G,SHIFT) : Result: typeless
G : typeless or integer
SHIFT : integer, >= 0
DSHIFTL(G,H,SHIFT) : Result: typeless
G : typeless or integer
H : same type and kind as G
SHIFT : integer, >= 0
DSHIFTR(G,H,SHIFT) : Result: typeless
G : typeless or integer
H : same type and kind as G
SHIFT : integer, >= 0
MASKL(NBITS) : Result: typeless
NBITS : integer, in range [0..bit_size(NBITS)]
MASKR(NBITS) : Result: typeless
NBITS : integer, in range [0..bit_size(NBITS)]
MERGE_BITS(G,H,MASK) : Result: typeless
G : typeless
H : typeless, same kind as G
MASK : typeless, same kind as G
Result bits are the bits from G if the corresponding bits in MASK are 1
and the bits from H if the corresponding bits in MASK are 0. Equivalent
to (G .and. MASK) .or. (H .and. (.not. MASK))