To: J3                                                     J3/23-166r1
From: Aleksandar Donev, Courant Institute, New York University
Subject: Pushing the usability of templates: rank-agnostic loops and TBPs
Date: 2023-June-09

Reference: J3/23-165r1, J3/23-164

1. Background
===============

I wrote a paper that proposed that generic programming is a needed feature
in Fortran titled "Genericity in Extending Fortran" in the Fortran Forum
back in 2004. At that time, I was writing a large Fortran code for
event-driven molecular dynamics (the executables from that work are still
in use today), and was sorely missing some of the features templates
provided in C++.

One of those things was the ability to write rank-generic code. The
practical application is writing code for neighbor search. The idea is
very simple. Split a box in which there are particles/atoms/molecules into
a grid of small pixels/voxels, sort the particles into voxels, and then
for every particle quickly identify its neighbors by searching the nearby
voxels. We are interested in molecular dynamics both in 2D and 3D, so we
need to be able to work with a grid that is either of rank 2 or 3; for
partial differential equations we also often use 1D. In existing codes,
the way this is often handled is to always declare arrays to be of rank 3
but with the last one or two dimensions having size one. This is
cumbersome and ackward, and does not allow one to go to 4 dimensions
(which we eventually needed in my thesis work, so I had to write another
code!).

The new proposal for generic programming in Fortran greatly helps, but
without the ability to write rank-generic or rank-agnostic loops, I cannot
accomplish what I have wanted to do for more than 20 years. This is a bit
depressing. Tom Clune proposed in J3/23-130 syntax for rank-agnostic
loops. I strongly support this proposal. The purpose of this paper is to
give a real-world complex example of code that requires that extension.

Along the way, I also illustrate the need to be able to package a type
along with operations on that type together and pass as one unit to a
template. As proposed, the template feature now requires passing the type
as a deferred type and all of the procedures as individual procedures. The
problems with this are discussed at length in J3/23-164, where I propose
SATISFACTIONs as a solution. An alternative solution is to use type-bound
procedures instead, which we cannot do in the present proposal since a
deferred type cannot have TBPs (or even appear as CLASS(deferred_type)).
So in my example I also propose new syntax to allow one to specify that a
deferred type has certain TBPs; this is an optional thing orthogonal to
the rank-agnostic loops but the example I give illustrates both.

2. Linked-List Cells for Neighbor Search in Molecular Dynamics
===============

The code below is an example code for something I and my students /
postdocs have implemented many times in different programming languages.
What I show here is how I would like to write this using TEMPLATEs in
F202Y. I have assumed here also that Tom Clune's proposal 23-130 for
"rank-agnostic loops" is accepted. Without that feature some of the code
below would be hard to make independent of the space dimension.

In this example, I do not use type extension. The good thing about this
non-OOP version is that it does not bind the user to a specific class
hierarchy. Since we do not have multiple inheritance, requiring that all
types be an extension of some base type can be a problem in large codes,
especially if they reuse code from other projects.

We begin with a module that defines a type capable of representing a box
inside which the particles/atoms/molecules live. This is often a periodic
box, called a unit cell. It can be a rectangle or rectangular cuboid, but
it can also be "slanted" or "sheared." The code can be inlined/compiled to
be much more efficient if one assumes an orthogonal unit cell. The
templates below illustrate this optimization.

module UnitCells

! This is a simple template, without type arguments
template UnitCellTemplate(wp,dim)
   integer, constant :: wp, dim

   ! This module implements several type-s of periodic unit cells in R^d
   ! Specifically, we support simple orthogonal unit cells
   ! (rectangular in 2D, rectangular cuboid in 3D),
   ! and also more general (sheared) unit cells
   ! (parallelogram in 2D, parallelepiped in 3D)

   private
   public :: OrthogonalUnitCell, GeneralUnitCell

   ! Both types OrthogonalUnitCell and GeneralUnitCell
   ! are consistent with a type UnitCell,
   ! see requirement UnitCellClass below.

   ! Every type consistent with a UnitCell should have a component
   ! lengths(dim) that gives the size of the unit cell along
   ! each dimension in unit cell coordinates.
   ! One could make this be a type-bound function if it is decided
   ! that allowing for components is a serious complication
   ! and only TBPs are allowed for dererred types.
   ! Every type consistent with a UnitCell should also have two TBPs
   ! called eucl_distance and grid_size, see below.
   ! One can alternatively use type extension, as in Ada, that is
   ! one can require that an instantiation type corresponding to a
   ! deferred UnitCell type must be an extension of a base type UnitCell;
   ! I avoid this here since we lack multiple inheritance.
   ! (One can see what I do here as inspired by Java's interfaces)

   ! Define a new non-parameterized derived type:
   type :: OrthogonalUnitCell(wp,dim)
      ! This is an orthogonal unit unit cell
      ! (rectangular/rectangular cuboid)
      integer, kind :: wp, dim
      real(wp), dimension(dim) :: lengths
         ! We only need dim lengths along each dimension
   contains ! New proposed syntax: allow to specify TBPs
      procedure, pass :: eucl_distance => EuclideanDistance_ortho
         ! Computes Euclidean distance between points x and y
      procedure, pass :: grid_size => GridCellSize_ortho
         ! Computes grid size given an Euclidean cutoff
   end type

   type :: GeneralUnitCell(wp,dim)
      ! This is a general unit unit cell (parallelogram/parallelepiped)
      integer, kind :: wp, dim
      real(wp), dimension(dim) :: lengths ! Euclidean lengths of unit cell
vectors
      ! But now we need another component for the lattice vectors of the
unit cell
      real(wp), dimension(dim,dim) :: lattice ! dim vectors in R^dim
(columns of lattice)
   contains ! Can I do this in the current design?
      procedure, pass :: eucl_distance => EuclideanDistance_nonortho
      procedure, pass :: grid_size => GridCellSize_nonortho
      ! Here we add this extra method, which is OK:
      procedure, pass :: set_lattice => SetLatticeVectors_nonortho
   end type

contains

   !-------------------------------
   ! Compute Euclidean distance between points x and y
   ! This is just an example code so I do not implement here
   ! periodic wrapping/unwrapping, as it is easy to do.
   ! That is, this does not implement the "nearest image convention"
   ! commonly used in MD simulations.

   ! Benefit of templates: Since dim is a compile-time constant,
   ! compiler can inline/vectorize all of this!

   function EuclideanDistance_ortho(unit_cell,x,y) result(d)
      class(UnitCell(wp,dim)), intent(in) :: unit_cell
      real(wp), dimension(dim), intent(in) :: x, y
      real(wp) :: d ! Euclidean ||x-y||

      d=sum(((x-y)*unit_cell%lengths)**2)
   end function

   function EuclideanDistance_nonortho(unit_cell,x,y) result(d)
      class(UnitCell(wp,dim)), intent(in) :: unit_cell
      real(wp), dimension(dim), intent(in) :: x, y
      real(wp) :: d ! Euclidean ||x-y||

      d=sum(matmul(unit_cell%lattice, x-y)**2) ! Can be inlined easily
   end function

   !-------------------------------
   ! Compute what is the minimal length (in *unit unit_cell coordinates*)
   ! grid cells can have to ensure that points can be within a *Euclidean*
   ! distance cutoff only if in neighboring grid cells

   function GridCellSize_ortho(unit_cell,cutoff) result(lengths)
      class(UnitCell(wp,dim)), intent(in) :: unit_cell
      real(wp), intent(in) :: cutoff
      real(wp), dimension(dim) :: lengths

      lengths=cutoff
   end function

   function GridCellSize_nonortho(unit_cell,cutoff) result(lengths)
      class(UnitCell(wp,dim)), intent(in) :: unit_cell
      real(wp), intent(in) :: cutoff
      real(wp), dimension(dim) :: lengths

      lengths=... ! complicated formula involving the L2 norm of
      ! matmul(transpose(L^(-1)),L^(-1)) where L=unit_cell%lattice
   end function

   subroutine SetLatticeVectors_nonortho(unit_cell, lattice) &
         result(lengths)
      class(UnitCell(wp,dim)), intent(in) :: unit_cell
      real(wp), dimension(dim,dim) :: lattice
         ! dim vectors in R^dim (columns of lattice)

      unit_cell%lattice=lattice
      unit_cell%lengths=sum(lattice**2, dim=2)
         ! Euclidean lengths of lattice vectors

   end subroutine

end template UnitCellTemplate

end module UnitCells

The main code to do neighbor searching in a way that is generic w.r.t.
the physical dimension, real precision, and the type of
unit cell (while allowing opimized inlining!) is in the following module:

module NeighborSearch
! Implements linked-list neighbor search, as commonly used in
! Molecular Dynamics (MD) codes.
! We are given a periodic unit unit_cell in R^dim,
! and a bunch of points in that unit cell.
! We want to sort the points into a grid of sub-cells / grid cells
! to facilitate finding neighbors. By neighboring points we mean points
! closer than a *Euclidean* cutoff distance cutoff.

! This is proposed new syntax for allowing type parameters to templates
! where the type is parameterized and/or has type-bound procedures
requirement UnitCellClass(wp,dim,UnitCell)
   integer, constant :: wp, dim
      ! Working precision and space dimension (1-max_dim)

   ! This is not allowed by the current design (not even in syntax)
   ! My proposal:
   ! The corresponding argument in the instantiate statement can be
   ! any type that has at least these components and TBPs.
   ! Notably, any extension of this type would qualify.
   type, deferred :: UnitCell ! This is periodic unit cell
      ! Every type consistent with a UnitCell should have this component
      real(wp), dimension(dim) :: lengths
   contains
      ! Every type of UnitCell should have at least these TBPs:
      procedure(EuclideanDistance), pass :: eucl_distance
      procedure(GridCellSize), pass :: grid_size
   end type

   interface
      ! These routines could either be template parameters/requirements,
      ! which requires long lists of arguments to templates, or,
      ! as I do here, type-bound procedures of type UnitCell.

      ! Euclidean distance between points x and y
      function EuclideanDistance(unit_cell,x,y) result(d)
         type(UnitCell(wp,dim)), intent(in) :: unit_cell
         real(wp), dimension(dim), intent(in) :: x, y
         real(wp) :: d ! Euclidean ||x-y||
      end function

      ! Determines cutoffs for neighbor search in grid cells
      function GridCellSize(unit_cell,cutoff) result(lengths)
         type(UnitCell(wp,dim)), intent(in) :: unit_cell
         real(wp), intent(in) :: cutoff
         real(wp), dimension(dim) :: lengths
      end function

   end interface

end requirement

template NeighborListTemplate(wp,dim,UnitCell)
   requires UnitCellClass(wp,dim,UnitCell)
   ! Dim can go up to 15 (max portable rank)
   ! Question: How do we specify this as a requirement?

   private
   public :: NeighborList

   type :: NeighborList ! A linked-list data structure for neighbor search
      real(wp) :: cutoff=-1.0_wp ! Euclidean cutoff for search
      integer :: grid_size(dim)=1
         ! Number of grid cells along each dimension
      real(wp), dimension(dim) :: grid_lengths
         ! The length (in unit cell coordinates) of a grid cell
         ! grid_lengths = unit_cell%lengths/list%grid_size
      real(wp), dimension(:), allocatable :: next
         ! Next in list "pointers"
      real(wp), rank(dim), allocatable :: heads
         ! Heads of linked lists "pointers"
   contains
      procedure, pass :: create => CreateNeighborList
      procedure, pass :: add => AddPoints
      procedure, pass :: iterate => IterateNeighborPoints
      procedure, pass :: callback => ProcessNeighborPair
   end type

contains

   subroutine CreateNeighborList(list, unit_cell, cutoff, max_n_points)
      class(NeighborList), intent(inout) :: list
      type(UnitCell(wp,dim)), intent(in) :: unit_cell
      real(wp), intent(in) :: cutoff
      integer, intent(in) :: max_n_points (for allocating arrays)

      real(wp), dimension(dim) :: cell_lengths
      integer, dimension(dim) :: n_cells

      cell_lengths=unit_cell%grid_size(unit_cell,cutoff)
      n_cells=unit_cell%cell_lengths/cell_lengths

      list%cutoff=cutoff
      list%grid_size=n_cells
      list%grid_lengths=unit_cell%lengths/list%grid_size

      allocate(list%heads(n_cells))
      list%heads=0 ! zero means no particles in that grid cell

      allocate(list%next(max_n_points))
      list%next=0 ! zero means end of list

   end subroutine

   subroutine AddPoints(list, unit_cell, positions, n_points)
      class(NeighborList), intent(inout) :: list
      ! This can be changed to type(UnitClass) since class is not allowed
      class(UnitCell), intent(in) :: unit_cell ! Not legal right now
      integer :: n_points
      real(wp), dimension(dim,n_points) :: positions
         ! Positions of points to add to the list

      integer :: point, next_point
      integer, dimension(dim) :: grid_cell

      ! One can of course reallocate here to increase the size of
      ! list%next but for this example just abort:
      if(n_points>size(list%next)) stop "Increase max_n_points"

      ! Insert points into linked lists done using integers
      ! instead of pointers (much more efficient)
      do point=1, n_points
         grid_cell=floor(points(:,particle)/list%grid_lengths)
         next_point=list%heads(@grid_cell)
         list%heads(@grid_cell)=point
         list%next(point)=next_point
      end do

   end subroutine

   ! Execute callback for every pairs of neighboring points
   subroutine IterateNeighborPoints(list, unit_cell, positions, n_points)
      class(NeighborList), intent(in) :: list
      class(UnitCell), intent(in) :: unit_cell
      integer :: n_points
      ! Note that positions could have changed since the list was filled,
      ! but only if no particle crossed into another cell
      ! Type extensions of NeighborList can add methods to do
      ! elemental updates (particle by particle)
      ! to ensure no particle leaves the grid cell it is assigned to
      real(wp), dimension(dim,n_points) :: positions
         ! Positions of points to add to the list

      integer :: points(2)
      integer, dimension(dim) :: grid_cell, neigh_cell
      real, dimension(dim) :: position1, position2

      ! This requires accepting Tom's proposal for "rank-agnostic loops"
      do @grid_cell=1:list%grid_size ! New syntax
         ! One needs to worry here about boundary conditions
         ! (going out of bounds of the unit cell)
         ! It is easy to do for periodic domains but I omit for simplicity
         do @neigh_cell = grid_cell-1:grid_cell+1
            ! One should do periodic wrapping here but I omit for brevity
            point(1)=list%heads(@grid_cell)
            do while point(1)>0
               point(2)=list%heads(@neigh_cell)
               do while point(2)>0
                  position1=positions(:,point(1));
                  position2=positions(:,point(2));
                  if(unit_cell%eucl_distance(position1,position2)<=&
                     list%cutoff) call list%ProcessNeighborPair(points)
               end do
            end do
         end do
      end do

   end subroutine

   ! This method/TBP is meant to be overridden by child classes;
   ! The default method just prints to stdout.
   subroutine ProcessNeighborPair(list, points)
      ! Execute callback for every pairs of neighboring points
      class(NeighborList), intent(inout) :: list
      integer, intent(in)  :: points(2)

      write(*,*) points
   end subroutine

end template NeighborListTemplate

end module NeighborList

Now we actually use these templates to do some neighbor searching:

program FancyTemplates
   implicit none
   use UnitCells
   use NeighborList

   integer, parameter :: wp=kind(0.0d0) ! Double precision
   integer, parameter :: dim=2 ! Two dimensions

   instantiate UnitCellTemplate(wp,dim)
   instantiate NeighborListTemplate(wp,dim,OrthogonalUnitCell), &
      NeighborListOrtho=>NeighborList
   instantiate NeighborListTemplate(wp,dim,NonOrthogonalUnitCell), &
      NeighborListGen=>NeighborList

   real(wp) :: lattice(dim,dim), lengths(dim)

   integer, parameter :: n_points = 1000
   real(wp) :: points(dim,n_points)

   type(OrthogonalUnitCell) :: unit_cell_ortho
   type(GeneralUnitCell) :: unit_cell_nonortho

   type(NeighborListOrtho) :: neigh_list_ortho
   type(NeighborListGen) :: neigh_list_nonortho

   integer :: k, point
   real(wp) :: cutoff, lengths(dim), lattice(dim,dim)

   !-------------------------------

   cutoff = 0.1_wp;
   lengths=[(k*1.0_wp, k=1,dim)]

   call random_number(points)
   do concurrent k=1,d
      points(k,:)=points(k,:)*lengths(k)
   end do

   ! First try an orthogonal cell
   !------------------------
   unit_cell_ortho%lengths=lengths
   call NeighborListOrtho%create(unit_cell_ortho, cutoff, n_points)
   call NeighborListOrtho%add(unit_cell_ortho, positions, n_points)
   call NeighborListGen%iterate(unit_cell_ortho, positions, n_points)

   ! Now a non-orthogonal cell
   !------------------------
   call random_number(lattice)
   do concurrent k=1,d
      ! First normalize then make of the desired Euclidean length
      lattice(:,k)=lattice(:,k)*lengths(k)/sqrt(sum(lattice(:,k)**2))
   end do

   call NeighborListGen%create(unit_cell_nonortho, cutoff, n_points)
   call unit_cell_nonortho%set_lattice(lattice)
   call NeighborListGen%add(unit_cell_nonortho, positions, n_points)
   call NeighborListGen%iterate(unit_cell_nonortho, positions, n_points)

end program