To: J3 J3/25-195r1 From: Malcolm Cohen Subject: Generalised reduction prefix (inclusive) semantics Date: 2025-October-22 Reference: WG5/N2249, 25-145r1, 25-186r1 1. Introduction There are several competing models for how the IDENTITY argument should work in a generalised reduction prefix operation. This paper attempts to describe those models succintly, with simple examples, to help us to decide which model we want to use. If the IDENTITY argument is a true identity, that is OPERATION(identity,x) has the value of x, and OPERATION(x,identity) has the value x, all three models act the same. The three models have different effects in different situations, in INCLUSIVE mode: (a) there is a MASK argument with first value FALSE, a later one TRUE; (b) there is a MASK argument with first value TRUE, a later one FALSE; (c) there is no MASK argument (or equivalently, all values TRUE). To keep the paper simple, we will do one example with no MASK, and one with a MASK that has both an initial FALSE and a later FALSE, and the rest TRUE. We will also have an example where the IDENTITY is palpably not any sort of identity, not even a mathematical one, with the 1st MASK element FALSE. Also to keep the paper simple, we will only discuss INCLUSIVE. (The EXCLUSIVE semantics follow straightforwardly from the INCLUSIVE.) Similarly, the results for higher ranks follow from the vector results. 2. The examples Example 1: A = [ 1,2,3,4 ], IDENTITY = 100, no MASK Pure Integer Function OPERATION(a,b) Integer,Intent(In) :: a,b OPERATION = a + b End Function Example 2: Type t Integer res Integer :: missing = 0 End Type A = [ t(1),t(2),t(3),t(4) ], IDENTITY = t(0,1), MASK = [ F,T,F,T ] Pure Type(t) Function OPERATION(a,b) Type(t),Intent(In) :: a,b OPERATION = t(a%res+b%res,a%missing+b%missing) End Function Example 3: A = [ 1,2,3,4 ], IDENTITY = 100, MASK = [ F,T,F,T ] That is, the A and IDENTIFY from Example 1, and MASK from Example 2. 3. The Models Note: In all models, where MASK=TRUE, R(n+1) is OPERATION(R(n),A(n+1)). And where MASK(1) is FALSE, R(1) is IDENTITY. 3.1 Model 1: Use IDENTITY only when necessary, that is only for the first element when masked out; later MASK=FALSE elements just copy the previous. This is the model described in REDUCE. Example 1 Result = [ 1,3,6,10 ] Example 2 Result = [ t(0,1), t(2,1), t(2,1), t(6,1) ] Example 3 Result = [ 100, 102, 102, 106 ] 3.2 Model 2: Prefix the whole sequence with the IDENTITY value, skip the result for it, and for later MASK=FALSE elements just copy the previous. This is the model described in 25-186r1. Example 1 Result = [ 101,103,106,110 ] Example 2 Result = [ t(0,1),t(2,1),t(2,1),t(6,1) ] Example 3 Result = [ 100, 102, 102, 106 ] 3.3 Model 3: Use the IDENTITY argument for all masked-out elements. This is the model described by some committee members during discussion. Example 2 is the motivating ("how many missing elements included in this value") use case. Example 1 Result = [ 1,3,6,10 ] Example 2 Result = [ t(0,1),t(2,1),t(2,2),t(6,2) ] Example 3 Result = [ 100, 102, 202, 206 ] Note that with this model, it would be an error for IDENTITY to be absent if MASK has any FALSE element, whereas models 1 and 2 it is only an error for IDENTITY to be absent if the first element of MASK is FALSE. 4. Properties of the models When MASK is absent or MASK(1) is TRUE, model 1 is the only one that describes the actual semantics for a prefix reduction: that is, one that has the property REDUCE_PREFIX_INCLUSIVE_model_1(A, [ MASK=M ] ...) == (/ (REDUCE(A(:i), [ MASK=MASK(:i) ] ...), i=1,n) /) using here "..." to indicate all other arguments, which will be the same. WLOG, array bounds 1 to n. When IDENTITY is not an identity and MASK(1) is FALSE, that property does not hold for model 1 (or any other model). Model 2 has instead the property (when IDENTITY is present): REDUCE_PREFIX_INCLUSIVE_model_2(A, IDENTITY=IDENTITY, [ MASK=M ] ...) == (/ (REDUCE([IDENTITY,A(:i)], [ MASK=(/.True.,M(:i)/) ] ...), i=1,n) /) We can see here that IDENTITY is not functioning like an identity value, instead changing the function so that it computes the ordered set of reductions of a different array from the one specified, and with the first element of those omitted. Model 3 has the property (again, always): REDUCE_PREFIX_INCLUSIVE_model_3(A, IDENTITY=IDENTITY, [ MASK=M ] ...) == (/ (REDUCE(MERGE(A(:i),IDENTITY,M(:i)), ...), i=1,n) /) === REDUCE_PREFIX_INCLUSIVE_model_1(MERGE(A,IDENTITY,M), ...) We can see here that model 3 can be expressed quite simply via model 1. If the processor optimises prefix reductions of elemental expressions to avoid unnecessary array temps, this formula should perform similarly to a native intrinsic version. 5. Discussion and recommendation Desirable though it is to to have the property REDUCE_PREFIX_INCLUSIVE(A, [ MASK=M ] ...) == [ (REDUCE(A(:i), [ MASK=MASK(:i) ] ...), i=1,n) ] we can see from the above examples that this is not, in fact, possible, without inserting a requirement that IDENTITY is a true identity, or at least a mathematical identity. But that cannot be easily checked. However, if we remove the MASK and IDENTITY arguments from REDUCE_PREFIX_INCLUSIVE, there is no problem. We can still compute masked versions (by using MERGE for the model 3 result or for models 1 and 2 when there is a true identity available, and PACK/UNPACK for the models 1 and 2 result). Therefore the recommendation is to remove the MASK and IDENTITY arguments from REDUCE_PREFIX_INCLUSIVE. For consistency, MASK should also be dropped from REDUCE_PREFIX_EXCLUSIVE. 6. Postscript The EXCLUSIVE version is equivalent to the INCLUSIVE version of a different array, viz with an initial value inserted at the beginning and the last element omitted. This initial value is best characterised by having the dummy argument called INITIAL (not IDENTITY). This is illustrated by the following formulae (for vectors): REDUCE_PREFIX_EXCLUSIVE(A, INITIAL=INITIAL, ...) === REDUCE_PREFIX_INCLUSIVE( (/ INITIAL,A(:n-1) /), ...) === REDUCE_PREFIX_INCLUSIVE( EOSHIFT(A,1,INITIAL), ...) ===END===