To: J3                                                     J3/19-221r1
From: Tom Clune
Subject: BFLOAT16 variant of REAL16 representation
Date: 2019-October-14
Reference: 19-139r1

I. Introduction

New generations of processors now provide half-precision floating-point
capabilities that often offer significant performance advantages over
the more typical precisions.  While such capabilities must be used
carefully, various applications have demonstrated useful results.  In
particular, machine learning algorithms can often exploit such hardware
capabilities in a robust fashion.

Paper 19-139r1 has already introduced a new REAL16 kind that partially
addresses this application space, but an increasingly popular
representation in the machine learning world is BFLOAT16.  This
representation sacrifices mantissa bits for an expanded exponent range:
where IEEE binary16 has 1 sign, 5 exponent, and 10 fraction bits,
BFLOAT16 has 1 sign, 8 exponent, and 7 fraction bits, giving it the
same exponent range as IEEE binary32.

II. Use case

Enable portable development of applications that can exploit the
increasingly available BFLOAT16 hardware capabilities.  This is
particularly relevant to machine learning.

III. Rough requirement

Expand ISO_FORTRAN_ENV to include BFLOAT16.  If, for any of these
constants, the processor supports more than one kind of that size, it
is processor dependent which kind value is provided.  (A non-normative
usage sketch appears in the appendix below.)

IV. Pros and Cons

Pros:
  - Supported by multiple hardware vendors, with wider adoption likely.
  - Appears useful (fast) for machine learning use cases.

Cons:
  - Not defined in any ISO standard.  How would we reference it?
  - Machine learning layers are generally implemented in other
    languages, and BFLOAT16 may not actually be exposed in the
    relevant interfaces.
  - Would we specify the bits for exponent and mantissa explicitly in
    the ISO_FORTRAN_ENV section?
  - Could we wait for a definition in C?
  - Pragmatic: mixed arithmetic between REAL16 and BFLOAT16 likely
    involves promoting both operands to REAL32.

Straw votes:

1) Y/N: Should we proceed with specs and syntax?

2) If (1) is YES: how should BFLOAT16 be described?
   (a) unspecified (aside from 16 bits)
   (b) reference a non-ISO standard (probably not allowed)
   (c) explicit definition in terms of bits?
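
Appendix: illustrative sketches (non-normative)

The following minimal sketch shows how the proposed constant might be
used.  It assumes the BFLOAT16 name proposed in this paper and the
REAL16 constant introduced by 19-139r1; the kind values and printed
results are illustrative only, and no current processor is expected to
accept the program as-is.

   program bfloat16_demo
      use, intrinsic :: iso_fortran_env, only: bfloat16, real16, real32
      implicit none

      real(bfloat16) :: b   ! proposed: 1 sign, 8 exponent, 7 fraction bits
      real(real16)   :: h   ! IEEE binary16: 1 sign, 5 exponent, 10 fraction bits
      real(real32)   :: s

      b = 3.0_bfloat16
      h = 2.0_real16

      ! BFLOAT16 trades precision for range relative to binary16:
      print *, precision(b), range(b)   ! expected: roughly 2 and 37
      print *, precision(h), range(h)   ! expected: roughly 3 and 4

      ! The pragmatic point from section IV: mixed REAL16/BFLOAT16
      ! arithmetic would most plausibly promote both operands to REAL32.
      s = real(b, real32) * real(h, real32)
      print *, s
   end program bfloat16_demo

The explicit REAL(..., REAL32) conversions make the promotion visible;
whether implicit mixed-kind expressions should be permitted at all is
part of the specs discussion under straw vote (1).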
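
A second sketch illustrates why BFLOAT16 retains the binary32 exponent
range: a BFLOAT16 value is, by construction, the upper 16 bits of the
corresponding IEEE binary32 bit pattern.  This program uses only
existing Fortran 2008 features; the value 3.140625 is an arbitrary
choice that happens to be exactly representable in 7 fraction bits.

   program bfloat16_bits
      use, intrinsic :: iso_fortran_env, only: int16, int32, real32
      implicit none
      real(real32)   :: x
      integer(int32) :: bits32
      integer(int16) :: bits16

      x = 3.140625_real32              ! 1.1001001 * 2**1 in binary
      bits32 = transfer(x, bits32)     ! reinterpret the binary32 bit pattern
      ! Keep the sign, exponent, and top 7 fraction bits:
      bits16 = int(shiftr(bits32, 16), int16)
      print '(b16.16)', bits16         ! prints 0100000001001001
   end program bfloat16_bits

Truncating (or rounding away) the low 16 fraction bits in this fashion
is how REAL32-to-BFLOAT16 conversion is commonly implemented, which is
one reason option (c) of straw vote (2), an explicit definition in
terms of bits, would be straightforward to state.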