To: J3 J3/22-189r1 From: Mark LeAir Subject: Flang Liaison Report Date: 2022-October-15 Flang Open Source Fortran Report ================================ Flang is an open source compiler for Fortran. It is an official subproject of the LLVM Project (llvm.org). NVIDIA's portion of the work is partially sponsored by the US Department of Energy Exascale Computing Project (particularly, LLNL, Sandia and LANL). The current goals of the project are to - Create an open source Fortran compiler with LLVM licensing, - that can be used for language and parallelization experimentation, - that exists as a peer in the LLVM community of languages, like Clang, - that can rely on LLVM code generation and parallelism support for CPUs and GPUs. There is a Slack workspace for Flang and a number of conference calls related to Flang development. - Flang biweekly sync call ("umbrella call") - Flang biweekly technical call - "Classic Flang" biweekly call - OpenMP in Flang Technical Call Details can be found on the Getting Involved page http://flang.llvm.org/docs/GettingInvolved.html. The New LLVM Flang Compiler --------------------------- The new front-end is available at https://github.com/llvm/llvm-project (the LLVM monorepo) in the flang/ directory. LLVM Flang is being developed under the LLVM license with LLVM exceptions (the same as LLVM, Clang, OpenMP, etc.). Recent efforts have concentrated on * F2003 Polymorphic Entities. An implementation design document has been completed and implementation has begun. * F2003 Parameterized Derived Types (PDTs). An implementation design document has been completed. * Flang Parser/Semantic Analyzer (Front-End) development. * Testing. NVIDIA runs approximately 15,500 tests on Flang nightly. More than 85% of the tests are from the Numerical Algorithms Group (NAG). Flang Front-End Work that has been completed in Q3 (i.e., between July 1, 2022 and September 30, 2022): Development of Fortran semantic analysis continues. This quarter saw fewer patches in semantic analysis compared to Q2 (31 patches compared to 83 in Q2 which was the time period between April 1, 2022 and June 30, 2022). We now have better regression testing that can be used to catch problems before a patch is committed into the Github repository. This led to fewer patches needed for Flang's parser/semantic analyzer development. Flang Lowering work that has been completed in Q3 (i.e., between July 1, 2022 and September 30, 2022): Note: Lowering is the translation of Fortran syntax into the semantic-based Fortran IR (FIR), which is a superset of LLVM's MLIR, and then into LLVM IR. Lowering features completed in Q3 include: New work (42 patches): * Work on design documents and documentation (6 patches): - Polymorphic entities - Adding guideline to write design docs - Design doc for parameterized derived types (PDTs) - Document the Intrinsic Types - Design document for a new high level FIR to improve expression lowering * Upstreaming finalization (pieces that had been lost in the process, and aligning with the newer llvm base ) (11 patches) - Upstreaming the last bits - Adapting lowering to upstream changes regarding opaque pointers and MLIR GEP - Updating build documentation * Work in lowering and MLIR to improve performance (15 patches in flang and 9 in mlir): - Lower TRANSPOSE without using runtime - Using MLIR arith or libm instead of pgmath calls when possible - Use MLIR algebraic simplification - DOT_PRODUCT inlining - Keep original data type for do-variable value to help backend optimizations - Support all real and integer types in SimplifyIntrinsicsPass(SUM and DOT_PRODUCT inlining) - Support multidimensional reductions in SimplifyIntrinsicsPass - Only copyin/out non simply contiguous arguments when they are actually noncontiguous at runtime - [mlir] Fixed ordering of pass statistics - [mlir] Adding arith::IPowI operation - [mlir] Set llvm readnone attribute for libm functions - [mlir] Canonicalization patterns for subi * Work on flang tools (4 patches): - Restore dump-symbols option in bbc - Adding an option to control SUM inlining - Adding an option to control TRANSPOSE inlining * Work on OpenACC (7 patches): - Use TableGen to generate the clause parser - Preserve location information in directives - Support for array sections and derived type components - Adding tests of arrays with allocatable and pointer attributes TODOs on partially implemented features (48 patches): * Replacing compiler/runtime crashes by explicit compilation TODO messages (4 patches): - Add todo for the REDUCE intrinsic - 128 bits MODULO - Make the TODO messages for intrinsics more consistent - Add TODOs for some FORALL assignments with Pointers and Allocatables * Intrinsic procedure lowering fixes/implemented todos (9 patches): - Dynamically optional arguments in RANDOM_SEED - Make SET_EXPONENT(-0.0) return -0.0 - Handle special case for SHIFTA intrinsic - Lower UBOUND of assumed shape with non-default lower bounds correctly - Accept assumed shape arrays as SHAPE in C_F_POINTER - Support optional pointer TARGET argument in ASSOCIATED - Correctly handle aliasing in allocatable assignment - Lower MOD to Fortran runtime call to get correct edge case results - Generate DOT_PRODUCT runtime call based on the result type * Runtime related fixes/implemented todos (6 patches): - Improve runtime error message - Fix runtime character comparison - Enable real/complex kind 10 and 16 variants of dot_product - Fixes for element size calculation - Enabled HAS_FLOAT128 for builds with clang - Fixed identity value for REAL(16) * IO related fixes/implemented todos (4 patches): - Handle non contiguous formats - Internal IO to non contiguous variables - Finalize IO operations after calling EndIO - Support common block POINTERs in namelist * Character related fixes/implemented todos (4 patches): - Round length of function returning characters to zero if negative - Ensure new storage is created when padding a character expression - Lower character result of bind(c) function by value * Call related fixes/implemented todos (3 patches): - Intent(out) deallocation on function entry - Lowers calls to procedure with CONTIGUOUS assumed shape dummies * Derived type related fixes/implemented todos (2 patches): - Handle parent component of extended derived-type - Initialize derived type allocatable components with the unallocated status * Various fixes (16 patches): - Run FIR simplification pass after canonicalization to remove dead blocks - Preserve argument attributes order after target rewrite pass - Handle 128 bits integer constants - Lower empty CASE in SELECT CASE - Fix invalid GOTO rewrite optimization - Fix ASSOCIATE construct with pointer or allocatable selector - Deallocate WHERE masks temporary after all where assignments - Fix inquiries to ASSOCIATE construct entity - Handle NULL(mold) used in initializer region - Triplet computation - Compilation segfault - Fix default initialization of main program arrays in equivalence - Adding generic names to the interfaces to exceptions in IEEE modules - Compute type allocation size based on the actual target representation - FIR clean-up - Build fix Runtime Changes in Q3 (i.e., between July 1, 2022 and September 30, 2022): The Flang runtime support library became more stable this quarter. This meant fewer patches (i.e., 11 runtime related patches in Q3 compared to 68 in Q2). The current state of the LLVM Flang compiler is - written in modern C++ following LLVM conventions - over 200,000 lines of code, tests and scripts - parses all of Fortran 2018 to abstract syntax trees (AST) - parses OpenMP 4.5 and some OpenMP 5.0 - parses OpenACC 3.0 - defines a "Fortran Intermediate Representation" (FIR) based on LLVM's MLIR - can lower most OpenMP and OpenACC constructs - can compile and correctly run Fortran 77 programs (passes FCVS test suite) and nearly all Fortran 95 programs. - supports end-to-end compilation of all OpenMP 1.1 except for reduction, privatization, and one atomic construct. - for later versions of OpenMP, Flang has support for simd, task, taskgroup etc. Upcoming work: * Develop an implementation design document for the F2008 Block construct. * Continue to address runtime errors that are not blocked by not-yet-implemented messages and are not covered by the in-progress design documents. * Work on adding OpenMP target-related and loop transformation constructs is in progress. Current ("Classic") Flang Compiler ---------------------------------- The previous version of Flang, now known as Classic Flang, is derived from the PGI Fortran compiler, with some proprietary features removed (e.g., OpenACC support, inter-procedure analysis). It is the basis for commercial Fortran compilers from Arm, AMD, and Huawei. Classic Flang is available for Linux on x86-64, OpenPOWER, and Arm processors. OpenPower support is not currently being maintained but it is still available. The Classic Flang Community has made progress with bug fixes and new versions of LLVM. Classic Flang works with LLVM 14 and LLVM 15 support is in progress. Support for Mac and Windows architectures, quad-precision, and improved debug support are also in progress. The Classic Flang Community has also switched to a rotating mechanism for organizing its technical call between Arm, AMD and Huawei. Pawel Osmialowski, Shivarama Rao, and Bryan Chan respectively now run the technical calls. Detailed status can be found in the biweekly call Google doc at: docs.google.com/document/d/1-OuiKx4d7O6eLEJDBDKSRnSiUO2rgRR-c2Ga4AkrzOI