To: J3                                                     J3/19-184
From: Van Snyder
Subject: From IFIP WG 2.5: Correct dot product
Date: 2019-July-26
Reference: Takeshi Ogita, Siegfried Rump, Shin'Ichi Oishi, "Accurate
           Sum and Dot Product," SIAM J. Scientific Computing (SISC)
           26, 6 (2005) pp 1955-1988.

The request for this work item arose as a result of discussions during
the ICIAM meeting in Valencia 15-19 July 2019, and the meeting of IFIP
Working Group 2.5 (Numerical Software) 20 July 2019.  As a result of
IFIP WG 2.5 requests, it is a recommended part of IEEE 1788.

Number
======

TBD

Title
=====

Correct dot product

Status
======

For consideration.

Basic functionality
===================

Add an argument to DOT_PRODUCT to request to compute a correctly-rounded
result.

Rationale
=========

The dot product is an ubiquitous operation in mathematically-related
problems, not just in library modules.  Floating-point arithmetic is not
associative.  The dot product can be arbitrarily badly conditioned.  The
numerical software community would like to have either an exact dot
product or a correctly-rounded dot product.  Ill-posed problems cannot
be solved if it's not correct.  Defect correction and iterative
refinement of ill-posed linear problems fail if it's not correct.
Bounds in interval arithmetic can unnecessarily expand.  Nearly-
identical eigenvalues and their correspinding eigenvectors go wrong.
Even Monte Carlo methods can get into trouble if it's not correct....
There are certainly many users develop what they believe to be reliable
numerical algorithms, but struggle with incorrect results and don't
realize that an exact or correctly-rounded dot product would solve their
problems.

Estimated impact
================

Minor.  Description in the standard would be very simple.  Algorithms
to compute a correctly-rounded dot product are well known.  Software to
compute a correctly-rounded dot product exists.

Detailed specification
======================

Add specific functions for the DOT_PRODUCT intrinsic function that allow
to request that the result is correctly rounded:

  DOT_PRODUCT ( VECTOR_A, VECTOR_B, CORRECT )

The VECTOR_A and VECTOR_B arguments shall be of real or complex type.
There's no point to provide this for integer or logical type, because
computations for those types are associative, while floating-point
arithmetic is not associative.

The CORRECT argument is of logical type.  If its value is false, the
processor can use whatever algorithm the existing DOT_PRODUCT intrinsic
function uses.  If its value is true, the processor shall compute a
result that is correctly rounded to the precision of the result,
according to the kind of the result.  Whether the processor uses an
algorithm based upon floating-point arithmetic, or exploits equipment
that computes an exact dot product, and then rounds that result to the
kind of the function result, is processor dependent.

This is a separate calling sequence, not an optional argument.
Otherwise, the processor would need to check whether the CORRECT
argument is present, and if so whether to use the obvious fast, but
potentially incorrect, algorithm, in the case that the value of the
actual argument is false.

A correctly-rounded result, computed using floating-point arithmetic,
requires up to six times longer than the obvious method.  Equipment has
been constructed at least three times (IBM ACRITH, Kulish PCI board,
Demmel PCI board) that computes an exact dot product in a 4,288-bit
super accumulator, at approximately six times the speed of the obvious
method.  A processor could detect whether such equipment is available,
and exploit it if so.