J3/98-239 Date: 1998/11/11 To: J3 From: interop Subject: Approved syntax for interoperability References: J3/98-165r1, J3/98-195r1, J3/98-196r2 This paper is a reference paper that collects the previously approved syntax for interoperability features. Describing pre-defined C data types ----------------------------------- An ISO_C_TYPES module shall be provided that makes accessible the following named constants of type default integer: C_INT, C_SHORT, C_LONG, C_LONG_LONG, C_SIGNED_CHAR, C_FLOAT, C_DOUBLE, C_LONG_DOUBLE, C_COMPLEX, C_DOUBLE_COMPLEX, C_LONG_DOUBLE_COMPLEX and C_CHAR. C_INT, C_SHORT, C_LONG, C_LONG_LONG and C_SIGNED_CHAR shall have values that are representation methods for integers that exist on the processor or shall have the value -1. Because the C standard specifies that the representations for positive signed integers are the same as the representations for corresponding values of unsigned integers, and because Fortran will not provide any real support for unsigned kinds of integers, we have decided not to provide C_UNSIGNED_INT, C_UNSIGNED_SHORT, C_UNSIGNED_LONG, C_UNSIGNED_LONG_LONG or C_UNSIGNED_CHAR constants in the ISO_C_TYPES module. Instead a user can use the constants for the signed kinds of integers to access the unsigned kinds as well. Note that this has the potentially surprising side-effect that unsigned char would be declared as INTEGER(C_SIGNED_CHAR) in Fortran. C_FLOAT, C_DOUBLE and C_LONG_DOUBLE shall have values that specify approximation methods for the real type that exist on the processor or shall have the value -1. The values of C_COMPLEX, C_DOUBLE_COMPLEX, and C_LONG_DOUBLE_COMPLEX shall be the same as the values of C_FLOAT, C_DOUBLE, and C_LONG_DOUBLE, respectively. The value of C_CHAR shall specify a representation method for characters that exists on the processor or shall have the value -1. A scalar entity or derived type component in the Fortran type column of the following table, that has a kind type parameter that has the same value as the named constant made accessible from the ISO_C_TYPES modules specified in the Type kind column, are said to interoperate with scalars or structure components of C types that are compatible (ref. C standard) with the C types in the corresponding row of the C type column. +-------------------------------------------------------------+ | Fortran type | Type kind | C type | |--------------+-----------------------+----------------------| | INTEGER | C_INT | int | | | | signed int | | |-----------------------+----------------------| | | C_SHORT | short int | | | | signed short int | | |-----------------------+----------------------| | | C_LONG | long int | | | | signed long int | | |-----------------------+----------------------| | | C_LONG_LONG | long long int | | | | signed long long int | | |-----------------------+----------------------| | | C_SIGNED_CHAR | signed char | | | | unsigned char | |--------------+-----------------------+----------------------| | REAL | C_FLOAT | float | | |-----------------------+----------------------| | | C_DOUBLE | double | | |-----------------------+----------------------| | | C_LONG_DOUBLE | long double | |--------------+-----------------------+----------------------| | COMPLEX | C_COMPLEX | complex | | |-----------------------+----------------------| | | C_DOUBLE_COMPLEX | double complex | | |-----------------------+----------------------| | | C_LONG_DOUBLE_COMPLEX | long double complex | |--------------+-----------------------+----------------------| | CHARACTER | C_CHAR | char | +-------------------------------------------------------------+ So, for example, a scalar object of type integer, with a kind parameter equal to the value of C_SHORT, interoperates with a scalar object of the C type short or of any C type derived (via typedef) from short. No other entities than those mentioned in this document shall be made accessible from the ISO_C_TYPES module. This prevents a program that is conforming with respect to one processor from being made non-conforming with respect to another due to names made accessible from the ISO_C_TYPES module. C pointer types --------------- The ISO_C_TYPES module shall make accessible an entity with the name C_PTR. This entity shall be a derived type or a type alias name. A Fortran scalar entity or derived type component of type C_PTR interoperates with C scalars or structure components that are of any C pointer type. Note that this requires the representation method for all C pointer types to be the same for the C processor, if it is to be the target of interoperability of a Fortran processor. The C standard does not impose this requirement, so this may limit the ability of some processors to conform to Fortran 2000. Whether any C processors of interest actually take advantage of this needs to be determined. Dereferencing of C pointers within Fortran will not be supported. Handling of structures ---------------------- A new attribute (that's not the right term, but. . .) is introduced for Fortran derived type declarations. This is the BIND(C) attribute. For example, TYPE, BIND(C) :: MYFTYPE INTEGER(C_INT) :: I, J REAL(C_FLOAT) :: R END TYPE MYFTYPE Such a derived type definition shall not specify the SEQUENCE statement. A Fortran scalar object or derived type component of derived type is said to interoperate with a C scalar object or derived type component of a struct type, if the derived type definition includes the BIND(C) attribute, the derived type and the struct type have the same number of components, and components of the derived type interoperate with the corresponding components of the struct type, which shall not be bit fields or arrays whose first bound is unspecified (need correct C terminology again). A Fortran derived type that specifies the BIND(C) attribute shall satisfy the following. o It shall not be a parameterized derived type. o It shall not specify either the EXTENDS or the EXTENSIBLE attribute. o Any component that is of derived type shall be of a type that specifies the BIND(C) attribute as well. o Any component shall not specify the POINTER nor the ALLOCATABLE attribute. For example, a C scalar object of type myctype interoperates with a Fortran scalar object of type myftype. typedef struct { int m, n; float r; } myctype; Note that C9x requires the names and component names of two struct types to be the same in order for the types to be considered to be the same. This is similar to Fortran's rule describing when sequence derived types are considered to be the same type. However, because of the problem of mixed-case names in C, we have decided to be more forgiving. Note that unions and bit fields are not supported. In addition, C structs like the following cannot interoperate with any Fortran structure (this is a new feature of C9x): struct { int m[]; /* Last component has an unspecified bound - like assumed-size */ } Straw vote: Should we use BIND(C) or add an optional "(C)" to the SEQUENCE statement? Result of straw vote: BIND(C) - 4 SEQUENCE(C) - 4 Undecided - 3 Handling of arrays ------------------ Because Fortran arrays are stored in column-major order, whereas C arrays are stored in row-major order, the nth dimension of a Fortran array corresponds to the (r-n+1)th dimension of the C array, where both arrays are of rank r. A Fortran explicit-shape or assumed-size array object or structure component interoperates with C objects or structure components of array types if the elements interoperate, the ranks of the arrays are the same, and 1. if the Fortran array is explicit-shape, the extent in each dimension is the same as the extent of the corresponding dimension of the C array (need C terminology here); or 2. if the Fortran array is assumed-size, the extent in each dimension but the last must be the same as the extent of the corresponding dimension of the C array, and the extent of the first dimension of the C array shall be unspecified. Type aliases ------------ In order to facilitate portable use of C functions that use data types defined with C's typedef facility, a type aliasing statement is introduced to Fortran. The syntax is: <> TYPEALIAS :: <> => A type alias name can then appear as the (R502) in a (R501), a (R425) or an (R542). Explicit or implicit declaration of an entity or component using a type alias name is identical to declaration using the for which it is an alias. The keyword TYPE is used in declarations. For example, TYPEALIAS :: DOUBLECOMPLEX => COMPLEX(KIND(1.0D0)), & & NEWTYPE => TYPE(DERIVED) TYPE(DOUBLECOMPLEX) :: X, Y TYPE(NEWTYPE) :: S The type alias name can also be used as a structure constructor name, if it is an alias for a derived type. A type alias name shall not be the same as the name of an intrinsic type. The type alias name declared is a local entity of class (1) (14.1.2), and can be made accessible via use or host association. Note the => in the type alias statement syntax precludes making the :: optional. Otherwise, there is an ambiguity with pointer assignment in fixed source form. Also, the TYPE keyword is required when the type alias name is used in s, as there would be a potential for ambiguity in fixed source form were it omitted: TYPEALIAS :: REWIND => LOGICAL REWIND I Suggestions that allow the "::" to be optional and suggested alternatives to using TYPE in declarations will be gladly entertained. Note that the TYPEALIAS is not as flexible as C's typedef facility because certain type modifiers of C (such as array bounds) are attributes in Fortran, rather than being a part of the type. Attributes of procedures and dummy arguments -------------------------------------------- A new VALUE attribute (along with a VALUE declaration statement) is defined for dummy data objects, and a BIND attribute is introduced for the function and subroutine statements. A Fortran procedure interoperates with a C function if: o the procedure is declared with the BIND(C) attribute; o the results of the procedure and function interoperate, if the Fortran procedure is a function, or the result type of the C function is void, if the Fortran procedure is a subroutine; o the number of dummy arguments of the Fortran procedure is equal to the number of formal parameters of the C function; o dummy arguments with the VALUE attribute interoperate with corresponding formal parameters; and o dummy arguments without the VALUE attribute correspond to formal parameters that are pointers whose reference types (need correct C terminology here!) interoperate with the types of the corresponding dummy argument. The BIND(C) attribute shall not be specified for a subroutine or function if it requires an explicit interface or has asterisk dummy arguments. Note that the requirement that the Fortran procedure not require an explicit interface prohibits dummy arguments from having the POINTER attribute, having the ALLOCATABLE attribute, being assumed-shape arrays, having an array result, etc. Here's an example: BIND(C) INTEGER(C_SHORT) FUNCTION FUNC(I, J, K, L, M) INTEGER(C_INT), VALUE :: I REAL(C_DOUBLE) :: J INTEGER(C_INT) :: K, L(10) TYPE(C_PTR), VALUE :: M END FUNCTION FUNC short func(int i; double *j; int *k; int l[10], void *m); This ties together some of the syntax specified in previous sections. Note that a C pointer may correspond to a Fortran dummy argument of type C_PTR, or to a Fortran scalar that does not have the VALUE attribute. Fortran's rules of type checking will not be hobbled in order to provide access to C pointers to void, and the like; instead, a C_LOC intrinsic will be introduced to get the address of a Fortran object. If an object is not a dummy argument, it shall not have the VALUE attribute. An object shall not have the VALUE attribute if it is an array or has INTENT(OUT) or INTENT(INOUT). A dummy argument of type character with a length parameter whose value is not one, shall not have the VALUE attribute. If a dummy argument in a subprogram has the VALUE attribute, it implicitly has the INTENT(IN) attribute. (That is, VALUE can be used in a Fortran subprogram definition.) Handling of characters ---------------------- All references in the following are to 98-007r2. In C, objects that are of a character type contain a single character value. In Fortran, entities of type character can be composed of more than one character value. In C, multi-character objects are created using arrays, and in the case of strings, a specially distinguished NULL character marks the end of the string. In order to facilitate passing Fortran character entities with character length parameters greater than one to C strings, the Fortran rules on sequence association need to be relaxed. Today, the following is a valid Fortran program: program p character(10) :: a(1) call sub(a) contains subroutine sub(b) character(1) :: b(10) end subroutine sub end program p but the following is not: program p character(10) :: a call sub(a) contains subroutine sub(b) character(1) :: b(10) end subroutine sub end program p That is, sequence association involving characters permits arrays of characters to blur the division between the number of array elements and the length of the element, but this blurring only applies when an actual argument is a character array, an element of a character array or a substring of such an array element, but does not allow a scalar of type character to be argument associated with a dummy argument associated with a character array, unless it is an element of a character array. In order to make this change, the rules in 12.4.1.1 [227:13-15] and 12.4.1.4 need to be modified to allow this. Note: This change could conceivably cause problems for some Fortran 95 processor, though whether there is actually a processor that relies on this is unclear. An example of this approach as it applies to interoperability: program p interface bind(c) subroutine myprintf(dummy) ! The Fortran array corresponds with the C ! array dummy character(1) :: dummy(0:12) end subroutine myprintf end interface character(13) :: actual = "Hello, world!" ! actual is a scalar that is not an array ! element, which is not currently allowed call myprintf(actual) end program p void myprintf(char dummy[13]) { ... } Note: This approach was preferred in a straw vote to the alternative of allowing Fortran scalar characters to interoperate with C character arrays by a vote of 6-4-1. An additional straw vote on whether this extension to sequence association should be permitted only in references to BIND(C) procedures passed by a vote of 5-4-2. A third straw vote on whether to allow this extension to other types failed by a vote of 2-8-2. In order to simplify things, subgroup decided to ignore the result of the second straw vote. Sample edits: In 5.1.2.4.4 [63:38+] Add "If the actual argument is of type default character and is a scalar that is not an array element or array element substring designator, the size of the dummy array is MAX(INT(l/e), 0), where e is the length of an element in the dummy character array, and l is the length of the actual argument." In 12.4.1.1 [226:9-10] Change "or array element" to "array element, or scalar" In 12.4.1.1 [226:11] After "array" add "or scalar" In 12.4.1.1 [227:14-15] Change "or a substring of such an element" to "or a scalar of type default character". In 12.4.1.4 [228:45] Change "an array element substring designator" to "a scalar of type default character" In 12.4.1.4 [229:8+] Add "If the actual argument is of type default character and is a scalar that is not an array element or array element substring designator, the element sequence consists of the character storage units of the actual argument." Null characters --------------- The C character and wchar_t data types have designated null characters, whose value is zero. The ISO_C_TYPES module shall make accessible a named constant, C_NULLCHAR, of type character, whose kind type parameter is equal to the value of C_CHAR. The value of C_NULLCHAR shall be equal to the value of the null character of the C char data type. Note: A straw vote on whether to use CHAR(0) vs. a C_NULLCHAR named constant showed a preference for C_NULLCHAR (2-6-3). Length type parameters ---------------------- The character length parameter of a Fortran entity that interoperates with a C entity shall not be an asterisk. Global Data ----------- In Fortran, there are two facilities for specifying global data: common blocks and modules. The existing practice for sharing global data between Fortran and C is to use common blocks with the same name as the C object with external linkage. The interoperability facility will build off of such existing practice. The BIND(C) attribute can also be specified for a variable declared in the scope of a module, with the restriction that only one variable that is associated with a particular C variable with external linkage is permitted to be declared within a program. For uniformity with other attributes, a BIND(C) attribute statement will also be introduced. For example, integer, bind(c) :: i integer :: j, k common /com/ k bind(c) :: j, /com/ The BIND(C) attribute shall only be specified for a variable if it is declared in the scope of a module. The variable shall interoperate with the C type of the associated C variable that has external linkage. The variable shall not be explicitly initialized, it shall not have the POINTER attribute, the ALLOCATABLE attribute, appear in an EQUIVALENCE statement or be a member of a common block. If a common block is given the BIND(C) attribute, it shall be given the BIND(C) attribute in all scoping units in which it is declared. A C variable with external linkage interoperates with a common block that has the BIND(C) attribute, if the C variable is of a struct type and the variables that are members of the common block interoperate with corresponding components of the struct type, or if the common block contains a single variable, and the variable interoperates with the C variable. A variable in a common block with the BIND(C) attribute shall not be explicitly initialized and it shall not be the parent object of an in an EQUIVALENCE statement. If a variable or common block has the BIND(C) attribute, it has the SAVE attribute as well. A variable with the BIND(C) attribute is a global entity of a program (14.1.1). Such an entity shall not be declared in more than one scoping unit of the program. Note that a straw poll favoured allowing BIND(C) to be specified on both variables and common blocks: 7-4-1. Name Mangling ------------- Paper 98-139 contained an outline for syntax for mapping Fortran variable and procedure names to C variable and function names. That proposal is restated here. The BIND and attribute have an additional, optional specifier, NAME=, that is followed by a scalar initialization expression of type default character. If neither NAME= nor BINDNAME= is specified, NAME= is assumed to have a value that is the same as the name specified for the variable or procedure in lower case. Note: This is an arbitrary choice, but it seems like the reasonable one, since few users of C write their functions with names that are entirely in upper case. An additional BINDNAME= attribute may also appear, followed by a scalar default initialization expression of type default character. At most one NAME= specifier is permitted to appear in a BIND or attribute. More than one BINDNAME= specifier may appear in a BIND for a subprogram, but not an interface body or the BIND attribute for a variable. Any leading and trailing blanks in the value of a NAME= specifier are ignored. The value of the NAME= specifier on an interface body or variable must correspond to some C function or variable, respectively, with the same name. Note: The intent here is that NAME= allows the user to specify C names that are not valid Fortran names, and provides a mechanism through which the processor can distinguish between upper and lower case. Section 14.1.1 states that "A name that identifies a global entity shall not be used to identify any other global entity in the same program." This rule needs to be extended to make it clear that the value of the NAME= specifier might identify a global entity, and it shall not be used to identify any other global entity - the value isn't necessarily a name in the Fortran sense, so some modification of the existing rule is required. This has the effect that two external procedures might have the same name, but still be distinct entities, because the values specified by NAME= specifiers might be different. For example, program p interface bind(c,name='CSub') subroutine c_sub end subroutine c_sub end interface call f_sub end program p subroutine f_sub interface bind(c,name='CSub2') subroutine c_sub end subroutine c_sub end interface end subroutine f_sub The meaning ascribed to the BINDNAME= specifier is processor-dependent. Note: The value of the BINDNAME= specifier is intended to specify one or more alternative names by which a procedure defined by Fortran may be referenced from C, when a user wants to build a library that supports multiple C processors at once. The name is a (potentially) mangled name, rather than the name that is actually specified in the C code.