From: Kurt W. Hirchert
J3/97-265
Subject: Example of Polymorphism Issue
Meeting 143

This is intended to be a slightly more realistic illustration of the polymorphism issue described in J3/97-261, with some estimation of the costs of the alternatives and tradeoffs.

Let us begin with an extensible type representing 2D vectors with two type-bound operations: obtaining the length of a single vector and computing the cosine of the angle between two such vectors:

    TYPE,EXTENSIBLE::vector_2d
      REAL::x,y
    CONTAINS
      length=>length_2d
      corr=>corr_2d
    END TYPE

    FUNCTION length_2d(v) RESULT(length)
      TYPE(vector_2d)::v; REAL::length
      length=sqrt(v%x**2+v%y**2)
    END FUNCTION length_2d

    FUNCTION corr_2d(v1,v2) RESULT(corr)
      TYPE(vector_2d)::v1,v2; REAL::corr
      ! dot product divided by the product of the lengths
      corr=(v1%x*v2%x+v1%y*v2%y)/(v1%length()*v2%length())
    END FUNCTION corr_2d

Using this type, we can write a subroutine that prints out the cosines for all combinations of vectors drawn from a list of vectors:

    SUBROUTINE print_corr_table(vlist)
      OBJECT(vector_2d)::vlist(:); INTEGER::j,k
      DO j=1,size(vlist)
        print *,(vlist(j)%corr(vlist(k)),k=1,j)
      END DO
    END SUBROUTINE print_corr_table

There are two costs to declaring vlist with OBJECT rather than TYPE (thus making this procedure polymorphic):

1. The reference to vlist(j)%corr must be made through a run-time dispatch table, at the cost of an additional memory reference per function reference. In this particular example, a smart compiler might recognize that the function address will be invariant in the loop on j and keep that function address in a register.

2. For alternative B (the one similar to Ada 95), a naive analysis of the types in the reference to vlist(j)%corr would suggest the possibility of a run-time type mismatch.
If you wish to run in a mode where such errors are caught, the necessary code (assuming an appropriate representation of the type information) would involve verifying that one integer in memory is greater than or equal to another integer in memory and then that one address stored in memory is equal to another address stored in memory. In this particular case, the two integers would come from the same place in memory, as would the two addresses, so a smart compiler might recognize that the correctness condition is always true and optimize the test away. Under alternative A (the one similar to C++), a naive analysis of the types in that reference will show them to be safe, so no safety test would be necessary.

Let us now extend this type (and these operations) to a 3D vector type. Under alternative B (like Ada 95), this is easy:

    TYPE,EXTENDS(vector_2d)::vector_3d
      REAL::z
    REPLACES
      length=>length_3d
      corr=>corr_3d
    END TYPE

    FUNCTION length_3d(v) RESULT(length)
      TYPE(vector_3d)::v; REAL::length
      length=sqrt(v%x**2+v%y**2+v%z**2)
    END FUNCTION length_3d

    FUNCTION corr_3d(v1,v2) RESULT(corr)
      TYPE(vector_3d)::v1,v2; REAL::corr
      ! dot product divided by the product of the lengths
      corr=(v1%x*v2%x+v1%y*v2%y+v1%z*v2%z)/(v1%length()*v2%length())
    END FUNCTION corr_3d

The polymorphic procedure print_corr_table can operate just as efficiently on an array of type vector_3d as it does on one of type vector_2d.

Under alternative A (like C++), things do not go so smoothly. The required type for v2 in our vector_3d version of corr would be vector_2d. We can force a type match with this type, but we would then not have access to the z component of v2, and we need that access to compute the right formula. To get around this limitation, we must go back to corr_2d and change the declaration of v2 from TYPE(vector_2d) to OBJECT(vector_2d), permitting us to associate a vector_3d actual argument without losing access to its additional component.
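The run-time safety test described earlier in this section for alternative B (one integer comparison followed by one address comparison) can be made concrete with a small sketch. The paper does not specify the representation of the type information, so everything below is an assumption for illustration only: a hypothetical TypeInfo descriptor that records its extension depth and a table of its ancestors, written in Python rather than in the proposed Fortran syntax.

```python
class TypeInfo:
    """Hypothetical run-time type descriptor (an assumption, not from the paper)."""

    def __init__(self, name, parent=None):
        self.name = name
        # ancestors[k] is the descriptor at extension depth k; the last entry is self.
        self.ancestors = (parent.ancestors if parent else ()) + (self,)
        # Depth 0 is a base type; each extension adds one level.
        self.depth = len(self.ancestors) - 1


def extends(actual, required):
    """Return True if `actual` is `required` or an extension of it."""
    # Step 1: integer comparison of the two extension depths.
    # Step 2: identity ("address") comparison of two descriptor references.
    return (actual.depth >= required.depth
            and actual.ancestors[required.depth] is required)


# Hypothetical descriptors standing in for the two types of the example.
VECTOR_2D = TypeInfo("vector_2d")
VECTOR_3D = TypeInfo("vector_3d", parent=VECTOR_2D)
```

With such a representation, a call like extends(VECTOR_3D, VECTOR_2D) succeeds while extends(VECTOR_2D, VECTOR_3D) fails at the depth comparison. When both arguments of the test are read from the same descriptor, as with two elements of one array, the condition is trivially true, which is the case the text suggests a compiler could optimize away.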
If we made no further changes to corr_2d, we would have changed the reference v2%length() from something that can be resolved at compile time to something requiring dynamic dispatch, but we can fix that problem by rewriting that reference as v2%vector_2d%length().

The text for corr_3d cannot be corrected so easily. It is still not syntactically correct to reference v2%z. We must first coerce v2 from OBJECT(vector_2d) to TYPE(vector_3d). One way to do that would be as follows:

    FUNCTION corr_3d(v1,v2) RESULT(corr)
      TYPE(vector_3d)::v1; OBJECT(vector_2d)::v2; REAL::corr
      ! associating v2 with a TYPE(vector_3d) dummy performs the coercion
      corr=corr_3d_helper(v1,v2)
    END FUNCTION corr_3d

    FUNCTION corr_3d_helper(v1,v2) RESULT(corr)
      TYPE(vector_3d)::v1,v2; REAL::corr
      corr=(v1%x*v2%x+v1%y*v2%y+v1%z*v2%z)/(v1%length()*v2%length())
    END FUNCTION corr_3d_helper

The costs here are an extra level of procedure reference and a safety check that the OBJECT(vector_2d) v2 in corr_3d can be correctly associated with the TYPE(vector_3d) v2 in corr_3d_helper. Presumably, a smart compiler could eliminate the cost of the extra level of procedure reference by inlining corr_3d_helper in corr_3d.

Let us summarize the differences between the alternative A (like C++) and alternative B (like Ada 95) versions of this example:

1. They involve a similar number of safety checks, albeit at different levels. In this particular example, the alternative B safety checks could be optimized away, but I am certain there would be many variants where this would not be the case.

2. The direct implementation costs for alternative A are slightly higher (because of the need to coerce the OBJECT(vector_2d) to TYPE(vector_3d)), but it should be possible to optimize away those extra costs.

3. The alternative A version is a bit more verbose (more work to write).

The reason alternative B is less expensive for this example is that the natural extension of our operation between two 2D vectors is an operation between two 3D vectors.
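The alternative A coercion idiom shown earlier in this section (corr_3d delegating to corr_3d_helper) can be mimicked in a dynamically typed language. The following is a minimal Python sketch, not the proposed Fortran syntax. The names Vector2D, Vector3D, and _corr_helper are hypothetical stand-ins for vector_2d, vector_3d, and corr_3d_helper, and isinstance is assumed to play the role of the safety check on the coercion.

```python
import math


class Vector2D:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def length(self):
        return math.hypot(self.x, self.y)

    def corr(self, other):
        # `other` may be a Vector2D or any extension; only x and y are used.
        dot = self.x * other.x + self.y * other.y
        # Call the 2D length explicitly instead of dispatching dynamically,
        # mirroring the rewrite of v2%length() as v2%vector_2d%length().
        return dot / (Vector2D.length(self) * Vector2D.length(other))


class Vector3D(Vector2D):
    def __init__(self, x, y, z):
        super().__init__(x, y)
        self.z = z

    def length(self):
        return math.sqrt(self.x**2 + self.y**2 + self.z**2)

    def corr(self, other):
        # Safety check: the argument is declared only as "some 2D vector",
        # so confirm it really is 3D before touching its z component.
        if not isinstance(other, Vector3D):
            raise TypeError("corr on a Vector3D requires a Vector3D argument")
        return self._corr_helper(other)

    def _corr_helper(self, other):
        # Both operands are known to be 3D here.
        dot = self.x * other.x + self.y * other.y + self.z * other.z
        return dot / (self.length() * other.length())
```

In this sketch the extra method call and the isinstance test correspond to the two costs the text attributes to alternative A: one more level of procedure reference and one safety check at the point of coercion.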
If we had chosen an operation whose natural extension was an operation between a 3D vector and a 2D vector, alternative A would have been cheaper and less verbose. (I haven't been able to think of such an example, but maybe you can.)

Thus, it would appear that the following questions are relevant in deciding among the alternatives:

o How important is it that Fortran follow the example of well-known languages like C++ in this regard?
  * What fraction of Fortran programmers will know these languages?
  * Of them, how many will know these particular details?
  [I suspect that relatively few people who have learned C++ or Java well will choose to switch to Fortran for doing object-oriented programming, so I feel following their lead is not particularly important in avoiding surprise.]

o What proportion of real programs are like our example, better served by alternative B?
  [I have been unable to think of a real example that wasn't better served by alternative B.]

o How significant are the differences in implementation speed and program verbosity?
  [The implementation speed differences are small enough to be ignored in nearly all cases. It is the differences in verbosity that disturb me.]