To: J3 J3/23-166r1 From: Aleksandar Donev, Courant Institute, New York University Subject: Pushing the usability of templates: rank-agnostic loops and TBPs Date: 2023-June-09 Reference: J3/23-165r1, J3/23-164 1. Background =============== I wrote a paper that proposed that generic programming is a needed feature in Fortran titled "Genericity in Extending Fortran" in the Fortran Forum back in 2004. At that time, I was writing a large Fortran code for event-driven molecular dynamics (the executables from that work are still in use today), and was sorely missing some of the features templates provided in C++. One of those things was the ability to write rank-generic code. The practical application is writing code for neighbor search. The idea is very simple. Split a box in which there are particles/atoms/molecules into a grid of small pixels/voxels, sort the particles into voxels, and then for every particle quickly identify its neighbors by searching the nearby voxels. We are interested in molecular dynamics both in 2D and 3D, so we need to be able to work with a grid that is either of rank 2 or 3; for partial differential equations we also often use 1D. In existing codes, the way this is often handled is to always declare arrays to be of rank 3 but with the last one or two dimensions having size one. This is cumbersome and ackward, and does not allow one to go to 4 dimensions (which we eventually needed in my thesis work, so I had to write another code!). The new proposal for generic programming in Fortran greatly helps, but without the ability to write rank-generic or rank-agnostic loops, I cannot accomplish what I have wanted to do for more than 20 years. This is a bit depressing. Tom Clune proposed in J3/23-130 syntax for rank-agnostic loops. I strongly support this proposal. The purpose of this paper is to give a real-world complex example of code that requires that extension. Along the way, I also illustrate the need to be able to package a type along with operations on that type together and pass as one unit to a template. As proposed, the template feature now requires passing the type as a deferred type and all of the procedures as individual procedures. The problems with this are discussed at length in J3/23-164, where I propose SATISFACTIONs as a solution. An alternative solution is to use type-bound procedures instead, which we cannot do in the present proposal since a deferred type cannot have TBPs (or even appear as CLASS(deferred_type)). So in my example I also propose new syntax to allow one to specify that a deferred type has certain TBPs; this is an optional thing orthogonal to the rank-agnostic loops but the example I give illustrates both. 2. Linked-List Cells for Neighbor Search in Molecular Dynamics =============== The code below is an example code for something I and my students / postdocs have implemented many times in different programming languages. What I show here is how I would like to write this using TEMPLATEs in F202Y. I have assumed here also that Tom Clune's proposal 23-130 for "rank-agnostic loops" is accepted. Without that feature some of the code below would be hard to make independent of the space dimension. In this example, I do not use type extension. The good thing about this non-OOP version is that it does not bind the user to a specific class hierarchy. Since we do not have multiple inheritance, requiring that all types be an extension of some base type can be a problem in large codes, especially if they reuse code from other projects. We begin with a module that defines a type capable of representing a box inside which the particles/atoms/molecules live. This is often a periodic box, called a unit cell. It can be a rectangle or rectangular cuboid, but it can also be "slanted" or "sheared." The code can be inlined/compiled to be much more efficient if one assumes an orthogonal unit cell. The templates below illustrate this optimization. module UnitCells ! This is a simple template, without type arguments template UnitCellTemplate(wp,dim) integer, constant :: wp, dim ! This module implements several type-s of periodic unit cells in R^d ! Specifically, we support simple orthogonal unit cells ! (rectangular in 2D, rectangular cuboid in 3D), ! and also more general (sheared) unit cells ! (parallelogram in 2D, parallelepiped in 3D) private public :: OrthogonalUnitCell, GeneralUnitCell ! Both types OrthogonalUnitCell and GeneralUnitCell ! are consistent with a type UnitCell, ! see requirement UnitCellClass below. ! Every type consistent with a UnitCell should have a component ! lengths(dim) that gives the size of the unit cell along ! each dimension in unit cell coordinates. ! One could make this be a type-bound function if it is decided ! that allowing for components is a serious complication ! and only TBPs are allowed for dererred types. ! Every type consistent with a UnitCell should also have two TBPs ! called eucl_distance and grid_size, see below. ! One can alternatively use type extension, as in Ada, that is ! one can require that an instantiation type corresponding to a ! deferred UnitCell type must be an extension of a base type UnitCell; ! I avoid this here since we lack multiple inheritance. ! (One can see what I do here as inspired by Java's interfaces) ! Define a new non-parameterized derived type: type :: OrthogonalUnitCell(wp,dim) ! This is an orthogonal unit unit cell ! (rectangular/rectangular cuboid) integer, kind :: wp, dim real(wp), dimension(dim) :: lengths ! We only need dim lengths along each dimension contains ! New proposed syntax: allow to specify TBPs procedure, pass :: eucl_distance => EuclideanDistance_ortho ! Computes Euclidean distance between points x and y procedure, pass :: grid_size => GridCellSize_ortho ! Computes grid size given an Euclidean cutoff end type type :: GeneralUnitCell(wp,dim) ! This is a general unit unit cell (parallelogram/parallelepiped) integer, kind :: wp, dim real(wp), dimension(dim) :: lengths ! Euclidean lengths of unit cell vectors ! But now we need another component for the lattice vectors of the unit cell real(wp), dimension(dim,dim) :: lattice ! dim vectors in R^dim (columns of lattice) contains ! Can I do this in the current design? procedure, pass :: eucl_distance => EuclideanDistance_nonortho procedure, pass :: grid_size => GridCellSize_nonortho ! Here we add this extra method, which is OK: procedure, pass :: set_lattice => SetLatticeVectors_nonortho end type contains !------------------------------- ! Compute Euclidean distance between points x and y ! This is just an example code so I do not implement here ! periodic wrapping/unwrapping, as it is easy to do. ! That is, this does not implement the "nearest image convention" ! commonly used in MD simulations. ! Benefit of templates: Since dim is a compile-time constant, ! compiler can inline/vectorize all of this! function EuclideanDistance_ortho(unit_cell,x,y) result(d) class(UnitCell(wp,dim)), intent(in) :: unit_cell real(wp), dimension(dim), intent(in) :: x, y real(wp) :: d ! Euclidean ||x-y|| d=sum(((x-y)*unit_cell%lengths)**2) end function function EuclideanDistance_nonortho(unit_cell,x,y) result(d) class(UnitCell(wp,dim)), intent(in) :: unit_cell real(wp), dimension(dim), intent(in) :: x, y real(wp) :: d ! Euclidean ||x-y|| d=sum(matmul(unit_cell%lattice, x-y)**2) ! Can be inlined easily end function !------------------------------- ! Compute what is the minimal length (in *unit unit_cell coordinates*) ! grid cells can have to ensure that points can be within a *Euclidean* ! distance cutoff only if in neighboring grid cells function GridCellSize_ortho(unit_cell,cutoff) result(lengths) class(UnitCell(wp,dim)), intent(in) :: unit_cell real(wp), intent(in) :: cutoff real(wp), dimension(dim) :: lengths lengths=cutoff end function function GridCellSize_nonortho(unit_cell,cutoff) result(lengths) class(UnitCell(wp,dim)), intent(in) :: unit_cell real(wp), intent(in) :: cutoff real(wp), dimension(dim) :: lengths lengths=... ! complicated formula involving the L2 norm of ! matmul(transpose(L^(-1)),L^(-1)) where L=unit_cell%lattice end function subroutine SetLatticeVectors_nonortho(unit_cell, lattice) & result(lengths) class(UnitCell(wp,dim)), intent(in) :: unit_cell real(wp), dimension(dim,dim) :: lattice ! dim vectors in R^dim (columns of lattice) unit_cell%lattice=lattice unit_cell%lengths=sum(lattice**2, dim=2) ! Euclidean lengths of lattice vectors end subroutine end template UnitCellTemplate end module UnitCells The main code to do neighbor searching in a way that is generic w.r.t. the physical dimension, real precision, and the type of unit cell (while allowing opimized inlining!) is in the following module: module NeighborSearch ! Implements linked-list neighbor search, as commonly used in ! Molecular Dynamics (MD) codes. ! We are given a periodic unit unit_cell in R^dim, ! and a bunch of points in that unit cell. ! We want to sort the points into a grid of sub-cells / grid cells ! to facilitate finding neighbors. By neighboring points we mean points ! closer than a *Euclidean* cutoff distance cutoff. ! This is proposed new syntax for allowing type parameters to templates ! where the type is parameterized and/or has type-bound procedures requirement UnitCellClass(wp,dim,UnitCell) integer, constant :: wp, dim ! Working precision and space dimension (1-max_dim) ! This is not allowed by the current design (not even in syntax) ! My proposal: ! The corresponding argument in the instantiate statement can be ! any type that has at least these components and TBPs. ! Notably, any extension of this type would qualify. type, deferred :: UnitCell ! This is periodic unit cell ! Every type consistent with a UnitCell should have this component real(wp), dimension(dim) :: lengths contains ! Every type of UnitCell should have at least these TBPs: procedure(EuclideanDistance), pass :: eucl_distance procedure(GridCellSize), pass :: grid_size end type interface ! These routines could either be template parameters/requirements, ! which requires long lists of arguments to templates, or, ! as I do here, type-bound procedures of type UnitCell. ! Euclidean distance between points x and y function EuclideanDistance(unit_cell,x,y) result(d) type(UnitCell(wp,dim)), intent(in) :: unit_cell real(wp), dimension(dim), intent(in) :: x, y real(wp) :: d ! Euclidean ||x-y|| end function ! Determines cutoffs for neighbor search in grid cells function GridCellSize(unit_cell,cutoff) result(lengths) type(UnitCell(wp,dim)), intent(in) :: unit_cell real(wp), intent(in) :: cutoff real(wp), dimension(dim) :: lengths end function end interface end requirement template NeighborListTemplate(wp,dim,UnitCell) requires UnitCellClass(wp,dim,UnitCell) ! Dim can go up to 15 (max portable rank) ! Question: How do we specify this as a requirement? private public :: NeighborList type :: NeighborList ! A linked-list data structure for neighbor search real(wp) :: cutoff=-1.0_wp ! Euclidean cutoff for search integer :: grid_size(dim)=1 ! Number of grid cells along each dimension real(wp), dimension(dim) :: grid_lengths ! The length (in unit cell coordinates) of a grid cell ! grid_lengths = unit_cell%lengths/list%grid_size real(wp), dimension(:), allocatable :: next ! Next in list "pointers" real(wp), rank(dim), allocatable :: heads ! Heads of linked lists "pointers" contains procedure, pass :: create => CreateNeighborList procedure, pass :: add => AddPoints procedure, pass :: iterate => IterateNeighborPoints procedure, pass :: callback => ProcessNeighborPair end type contains subroutine CreateNeighborList(list, unit_cell, cutoff, max_n_points) class(NeighborList), intent(inout) :: list type(UnitCell(wp,dim)), intent(in) :: unit_cell real(wp), intent(in) :: cutoff integer, intent(in) :: max_n_points (for allocating arrays) real(wp), dimension(dim) :: cell_lengths integer, dimension(dim) :: n_cells cell_lengths=unit_cell%grid_size(unit_cell,cutoff) n_cells=unit_cell%cell_lengths/cell_lengths list%cutoff=cutoff list%grid_size=n_cells list%grid_lengths=unit_cell%lengths/list%grid_size allocate(list%heads(n_cells)) list%heads=0 ! zero means no particles in that grid cell allocate(list%next(max_n_points)) list%next=0 ! zero means end of list end subroutine subroutine AddPoints(list, unit_cell, positions, n_points) class(NeighborList), intent(inout) :: list ! This can be changed to type(UnitClass) since class is not allowed class(UnitCell), intent(in) :: unit_cell ! Not legal right now integer :: n_points real(wp), dimension(dim,n_points) :: positions ! Positions of points to add to the list integer :: point, next_point integer, dimension(dim) :: grid_cell ! One can of course reallocate here to increase the size of ! list%next but for this example just abort: if(n_points>size(list%next)) stop "Increase max_n_points" ! Insert points into linked lists done using integers ! instead of pointers (much more efficient) do point=1, n_points grid_cell=floor(points(:,particle)/list%grid_lengths) next_point=list%heads(@grid_cell) list%heads(@grid_cell)=point list%next(point)=next_point end do end subroutine ! Execute callback for every pairs of neighboring points subroutine IterateNeighborPoints(list, unit_cell, positions, n_points) class(NeighborList), intent(in) :: list class(UnitCell), intent(in) :: unit_cell integer :: n_points ! Note that positions could have changed since the list was filled, ! but only if no particle crossed into another cell ! Type extensions of NeighborList can add methods to do ! elemental updates (particle by particle) ! to ensure no particle leaves the grid cell it is assigned to real(wp), dimension(dim,n_points) :: positions ! Positions of points to add to the list integer :: points(2) integer, dimension(dim) :: grid_cell, neigh_cell real, dimension(dim) :: position1, position2 ! This requires accepting Tom's proposal for "rank-agnostic loops" do @grid_cell=1:list%grid_size ! New syntax ! One needs to worry here about boundary conditions ! (going out of bounds of the unit cell) ! It is easy to do for periodic domains but I omit for simplicity do @neigh_cell = grid_cell-1:grid_cell+1 ! One should do periodic wrapping here but I omit for brevity point(1)=list%heads(@grid_cell) do while point(1)>0 point(2)=list%heads(@neigh_cell) do while point(2)>0 position1=positions(:,point(1)); position2=positions(:,point(2)); if(unit_cell%eucl_distance(position1,position2)<=& list%cutoff) call list%ProcessNeighborPair(points) end do end do end do end do end subroutine ! This method/TBP is meant to be overridden by child classes; ! The default method just prints to stdout. subroutine ProcessNeighborPair(list, points) ! Execute callback for every pairs of neighboring points class(NeighborList), intent(inout) :: list integer, intent(in) :: points(2) write(*,*) points end subroutine end template NeighborListTemplate end module NeighborList Now we actually use these templates to do some neighbor searching: program FancyTemplates implicit none use UnitCells use NeighborList integer, parameter :: wp=kind(0.0d0) ! Double precision integer, parameter :: dim=2 ! Two dimensions instantiate UnitCellTemplate(wp,dim) instantiate NeighborListTemplate(wp,dim,OrthogonalUnitCell), & NeighborListOrtho=>NeighborList instantiate NeighborListTemplate(wp,dim,NonOrthogonalUnitCell), & NeighborListGen=>NeighborList real(wp) :: lattice(dim,dim), lengths(dim) integer, parameter :: n_points = 1000 real(wp) :: points(dim,n_points) type(OrthogonalUnitCell) :: unit_cell_ortho type(GeneralUnitCell) :: unit_cell_nonortho type(NeighborListOrtho) :: neigh_list_ortho type(NeighborListGen) :: neigh_list_nonortho integer :: k, point real(wp) :: cutoff, lengths(dim), lattice(dim,dim) !------------------------------- cutoff = 0.1_wp; lengths=[(k*1.0_wp, k=1,dim)] call random_number(points) do concurrent k=1,d points(k,:)=points(k,:)*lengths(k) end do ! First try an orthogonal cell !------------------------ unit_cell_ortho%lengths=lengths call NeighborListOrtho%create(unit_cell_ortho, cutoff, n_points) call NeighborListOrtho%add(unit_cell_ortho, positions, n_points) call NeighborListGen%iterate(unit_cell_ortho, positions, n_points) ! Now a non-orthogonal cell !------------------------ call random_number(lattice) do concurrent k=1,d ! First normalize then make of the desired Euclidean length lattice(:,k)=lattice(:,k)*lengths(k)/sqrt(sum(lattice(:,k)**2)) end do call NeighborListGen%create(unit_cell_nonortho, cutoff, n_points) call unit_cell_nonortho%set_lattice(lattice) call NeighborListGen%add(unit_cell_nonortho, positions, n_points) call NeighborListGen%iterate(unit_cell_nonortho, positions, n_points) end program