## Developments in LLVM-based toolchains and tooling for RISC-V

Alex Bradbury asb@igalia.com

RISC-V Summit Europe, 2023-06-08



```
int add(int a, int b) {
   return a+b;
}
```















add.o: file format ELF32-riscv

Disassembly of section .text:

00000000000000 add:

- 0: 33 05 b5 00 add a0, a0, a1
- 4: 67 80 00 00 ret
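To make the link between the bytes `33 05 b5 00` and `add a0, a0, a1` concrete, here is a minimal sketch that splits a 32-bit R-type instruction word into its fields. The names `rtype_fields`, `word_from_le_bytes` and `decode_rtype` are hypothetical and exist only for this illustration; they are not part of LLVM.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an R-type instruction in the base RISC-V encoding. */
typedef struct {
    uint32_t opcode; /* bits 6:0   */
    uint32_t rd;     /* bits 11:7  */
    uint32_t funct3; /* bits 14:12 */
    uint32_t rs1;    /* bits 19:15 */
    uint32_t rs2;    /* bits 24:20 */
    uint32_t funct7; /* bits 31:25 */
} rtype_fields;

/* Build the 32-bit instruction word from the little-endian bytes that
   a disassembler prints in address order. */
uint32_t word_from_le_bytes(const uint8_t bytes[4]) {
    return (uint32_t)bytes[0] | ((uint32_t)bytes[1] << 8) |
           ((uint32_t)bytes[2] << 16) | ((uint32_t)bytes[3] << 24);
}

/* Split an R-type instruction word into its fields. */
rtype_fields decode_rtype(uint32_t insn) {
    /* Uncompressed 32-bit encodings have the two lowest bits set. */
    assert((insn & 0x3u) == 0x3u);
    rtype_fields f;
    f.opcode = insn & 0x7fu;
    f.rd     = (insn >> 7) & 0x1fu;
    f.funct3 = (insn >> 12) & 0x7u;
    f.rs1    = (insn >> 15) & 0x1fu;
    f.rs2    = (insn >> 20) & 0x1fu;
    f.funct7 = (insn >> 25) & 0x7fu;
    return f;
}
```

Decoding `33 05 b5 00` this way gives opcode 0x33, rd = 10, rs1 = 10 and rs2 = 11, with funct3 and funct7 both zero. Registers x10 and x11 have the ABI names a0 and a1, which matches `add a0, a0, a1`.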





#### What is LLVM

- A collection of modular compiler and toolchain technologies
- Modern C++ implementation
- Library-based design
- Permissively licensed
- C/C++ toolchain (Clang) and equivalents to various binutils tools
- Primary backend for e.g. Rust
- Used by many downstream vendor toolchains



#### What is LLVM: Beyond the compiler you know

- MLIR
- Flang
- libopenmp
- libcxx
- lld
- lldb
- libc
- BOLT
- ...



#### **RISC-V LLVM current status**

- A 'default' (rather than experimental) backend since LLVM/Clang 9.0 (Sep 2019), first patches merged October 2017.
- Current extension support status: https://llvm.org/docs/RISCVUsage.html
- LLD (with linker relaxation) now also stable.



#### **RISC-V LLVM current status: vs RVA22U64 profile**

|                        | Assembler | Codegen |
|------------------------|-----------|---------|
| rv64imafdc             | ✓         | ✓       |
| zicsr, zicntr, zihpm   |           | N/A     |
| zicclsm                | ×         | ✓ (*)   |
| zihintpause            | ✓         | N/A     |
| zba, zbb, zbs          | ✓         | ~       |
| zicbom, zicbop, zicboz | ✓         | N/A     |
| zfhmin, zfh            | ✓         | ~       |
| zkt                    | ✓         | N/A     |
| V                      | ✓         | ~       |
| zkn, zks               | ✓         | ✓       |



#### **RISC-V LLVM current status: vs draft RVA23U64 profile**

|                                            | Assembler | Codegen |
|--------------------------------------------|-----------|---------|
| zicond (experimental)                      | ~         | ✓ (*)   |
| zcb                                        | ~         | ~       |
| zfa (experimental)                         | ~         | ~       |
| zbc                                        | ~         | ~       |
| zvfh (experimental)                        | ~         | ✓ (*)   |
| zfbfmin (experimental)                     | ~         | WIP     |
| zvfbfmin, zvfbfwma<br>(experimental)       | ~         | WIP     |
| zvkng, zvksg, zvbb, zvbc<br>(experimental) | ~         | ×       |



# RISC-V LLVM: A success story for cross-community upstream collaboration



#### (Partial) RISC-V LLVM credits

(Contributions in the form of code, reviews, advice, etc.)

Sameer Abu Asal, Alexey Bataev, Alexey Baturo, Alex Bradbury, Qihan Cai, Chandler Carruth, Leonard Chan, Ahmed Charles, Chih-Mao Chen, Piyou Chen, Shiva Chen, Kito Cheng, Vitaly Cheptsov, Nelson Chu, David Chisnall, Liao Chunyu, Jessica Clarke, Simon Cook, Fraser Cormack, David Craven, Nick Desaulniers, Conor Dooley, Sam Elliott, Hal Finkel, Eli Friedman, Mikhail Gadelha, Ondrej Glasnak, Eric Gouriou, Mandeep Singh Grang, Jianjian Guan, Jonas Hahnfeld, Ben Horgan, Mitchell Horne, Petr Hosek, ShihPo Hung, Roger Ferrer Ibanez, Ed Jones, Andrew Kelley, David Kipping, Paul Kirth, James Y Knight, Aditya Kumar, Yeting Kuo, Luke Lau, Jim Lin, Michael Maitland, David Majnemer, Luís Marques, Ed Maste, John McCall, Dylan McKay, Azharuddin Mohammed, Job Noorman, Tim Northover, Krzysztof Parzyszek, Ana Pazos, Wang Pengcheng, Jordy Portman, Nitin John Raj, Philip Reames, Lewis Revill, John Russo, Colin Schmidt, Ed Schouten, Andreas Schwab, Jun Sha, Ben Shi, Anton Sidorenko, Pavel Šnobl, Fangrui Song, Shao-Ce Sun, Sami Tolvanen, Philipp Tomsich, Manolis Tsamis, Rui Ueyama, Hsiangkai Wang, Ulrich Weigand, Mario Werner, Jim Wilson, Brandon Wu, Xinlong Wu, Eugene Zelenko, Florian Zeitz, Leslie Zhai, Zhu Zijia, ... and certainly more I missed (sorry!)

**Lots** of contributors over time, but a small core set of most active contributors - more contributions very welcome!



#### **RISC-V LLVM stats**

- About 4600 commits (\*)
- About 56KLoC in llvm/lib/Target/RISCV
  - Many more lines in tests of course!

(\*): git rev-list --count HEAD -- llvm/lib/Target/RISCV llvm/test/CodeGen/RISCV/ llvm/test/MC/RISCV/ lld/ELF/Arch/RISCV.cpp clang/test/CodeGen/RISCV/



#### How we collaborate in (RISC-V) LLVM

- Time-based releases
  - Prioritisation?
- RFCs
- Mailing list / discourse discussion
- 'Code owners' and pre-commit review
- Biweekly sync-up / coordination calls



#### How we collaborate in RISC-V LLVM: Related repos

- riscv-non-isa/riscv-elf-psabi-doc: A RISC-V ELF psABI Document
- riscv-non-isa/riscv-asm-manual: RISC-V Assembly Programmer's Manual
- riscv-non-isa/riscv-toolchain-conventions: Documenting the expected behaviour and supported command-line switches for GNU and LLVM based RISC-V toolchains
- riscv/riscv-isa-manual: RISC-V Instruction Set Manual
- riscv-non-isa/rvv-intrinsic-doc
- riscv/riscv-c-api-doc: Documentation of the RISC-V C API

### Not-yet-ratified and vendor specific extensions

- Enable upstream collaboration on not-yet-ratified standards
  - Agreed policy on merging support behind 'experimental' flags (e.g. -menable-experimental-extensions) with explicit spec version
  - Usual code review standards apply
  - No backwards compatibility or support expectation for anything other than final ratified spec.
- Allow vendor extensions to be supported upstream, reducing need for fragmentation for vendor-specific toolchains.
  - e.g. XVentanaCondOps, Xsfvcp, XTHeadVDot (and many others)
  - Considerations for inclusion: complexity/invasiveness, support story, user base, ...



### Compilation for RISC-V: Custom passes

- RISCVSExtWRemoval
- RISCVCodeGenPrepare
- RISCVExpandAtomicPseudoInsts
- RISCVInsertVSETVLI
- RISCVRedundantCopyElimination
- RISCVMakeCompressible
- RISCVRVVInitUndef



### What's new (ish): Vector support

- Auto vectorisation with the loop vectorizer
  - Enabled upstream
  - Parallel downstream work with BSC (and others) on tail folding using LLVM vector predication intrinsics (setting VL rather than using masked loads and stores).
  - More tuning to be done
  - Has support for scalable vectors (and vector register grouping)
    - Can generate scalable strided loads and stores
    - Patches for scalable interleaved and deinterleaved ("segmented") loads and stores very close to landing
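As an illustration of the loop shapes discussed above, here is a minimal sketch of a unit-stride loop and a strided-load loop in plain C. The function names `saxpy` and `gather_strided` are hypothetical, and whether a given Clang version vectorises either loop (and with which instructions) depends on the release, target flags and cost model, so treat this as input to experiment with rather than a statement of what the compiler emits.

```c
#include <assert.h>
#include <stddef.h>

/* Unit-stride loop: contiguous loads and stores, a typical candidate
   for the loop vectoriser. `restrict` promises the arrays do not alias. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y) {
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Strided access: reads every `stride`-th element of `src`. This is the
   access pattern that strided vector loads are intended to cover. */
void gather_strided(size_t n, size_t stride, const int *restrict src,
                    int *restrict dst) {
    assert(stride > 0);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i * stride];
}
```

One way to see what the vectoriser decided is to compile for a vector-enabled target (for example `-march=rv64gcv -O3`) and add Clang's `-Rpass=loop-vectorize` remark flag.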



### What's new (ish): Vector support

- Auto vectorisation with the superword-level parallelism (SLP) vectorizer
  - Not yet enabled upstream by default, but getting very close: work is ongoing to ensure the cost model disables it when it's not beneficial.
  - Recent tuning e.g. using scalar instructions for copies/stores of small fixed size vectors.
- Intrinsics
  - v0.11.1 supported, eagerly awaiting v1.0 finalisation.
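For readers unfamiliar with SLP vectorisation, the following sketch shows the kind of straight-line code it looks for: several independent, identically shaped scalar operations on adjacent data, with no loop involved. The type `vec4` and the functions `vec4_add` and `copy4` are hypothetical examples. Whether the SLP vectoriser combines these operations, or instead keeps scalar instructions for the small copy, is a cost-model decision that this sketch does not predict.

```c
#include <assert.h>

typedef struct {
    float x, y, z, w;
} vec4;

/* Four isomorphic adds on adjacent fields: a candidate for being
   merged into a single fixed-length vector add. */
void vec4_add(const vec4 *a, const vec4 *b, vec4 *out) {
    assert(a && b && out);
    out->x = a->x + b->x;
    out->y = a->y + b->y;
    out->z = a->z + b->z;
    out->w = a->w + b->w;
}

/* Copy of a small fixed-size array: the kind of case where scalar
   loads and stores may be cheaper than setting up vector state. */
void copy4(int dst[4], const int src[4]) {
    dst[0] = src[0];
    dst[1] = src[1];
    dst[2] = src[2];
    dst[3] = src[3];
}
```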



### What's new (ish): BOLT

- BOLT is a post-link optimiser designed to speed up large applications.
  - => those suffering from high iTLB / I\$ misses
- Takes information from a sampling profiler, disassembles functions and reconstructs the CFG, performing (primarily) code layout optimisations
- RISC-V port largely finished and almost merged.



#### What's new (ish): BOLT

Work needed to get BOLT upstream:



### What's new (ish): Other

- LLVM libc
  - Same level of completeness as x86-64
- CI improvements
- llvm-mca
- Various newly supported ISA extensions (merged or WIP). e.g. z[f|d]inx, code size reduction extensions, vector crypto, zacas.



#### The future - RISC-V compilers work going forwards

- More ISA extensions, enablement of additional LLVM tooling and features.
- More targeted optimisations due to real hardware, different microarchitectures, investment in specific workloads.
- Very different kind of work to early enablement efforts
- How to take on more of a leadership role within toolchain-related projects?





### The future: features and development directions

**Disclaimer**: Not a declared roadmap, but an interpretation of areas people likely want to invest in. Unordered.

- Ongoing autovectorisation improvements
- LTO fixes
- CI and performance tracking
- Performance modeling
- llvm-mca
- MLIR RISC-V vector dialect exploration
- µarch-specific tuning + scheduling models
- Enable SLP autovectorisation
- Improved constant materialisation
- Security hardening features
- Formal verification
- Fuzzing
- LLDB
- TLSDESC
- Atomics ABI changes
- RV32E codegen
- GlobalISel
- libc
- LLVM-built Linux userspace investments
- Vector crypto
- SIMD
- Easy custom instruction definition flow





#### Get involved and keep track

- https://llvm.org/docs/RISCVUsage.html
- Biweekly contributor sync-ups (see "RISC-V" on discourse.llvm.org)
- LLVM Weekly https://llvmweekly.org/
- Announcement posts for 6-monthly LLVM releases (https://muxup.com)

#### Muxup

#### What's new for RISC-V in LLVM 15

LLVM 15.0.0 was released around about two weeks ago now, and I wanted to highlight some of the RISC-V specific changes or improvements that were introduced while going into a little more detail than I was able to in the release notes.

In case you're not familiar with LLVM's release schedule, it's worth noting that there are two major LLVM releases a year (i.e. one roughly every 6 months) and these are timed releases as opposed to being cut when a pre-agreed set of feature targets have been met. We're very fortunate to benefit from an active and growing set of contributors working on RISC-V support in LLVM projects, who are responsible for the work I describe below - thank you! I coordinate biweekly sync-up calls for RISC-V LLVM contributors, so if you're working in this area please consider dropping in.

#### Linker relaxation

Linker relaxation is a mechanism for allowing the linker to optimise code sequences at link time. A code sequence to jump to a symbol might conservatively take two instructions, but once the target address is known at link-time it might be small enough to fit in the immediate of a single instruction, meaning the other can be deleted. Because a linker performing relaxation may delete bytes (rather than just patching them), offsets including those for jumps within a function may be changed. To allow this to happen without breaking program semantics, even local branches that might typically be resolved by the assembler must be emitted as a relocation when linker relaxation is enabled. See the description in the RISC-V psABI or Palmer Dabbelt's blog post on linker relaxation for more background.

Although LLVM has supported codegen for linker relaxation for a long time, LLD (the LLVM linker) has until now lacked support for processing these relaxations. Relaxation is primarily an optimisation, but processing of R\_RISCV\_ALIGN (the alignment relocation) is necessary for correctness when linker relaxation is enabled.
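The "small enough to fit in the immediate of a single instruction" condition can be sketched as a range check. The example below assumes the common case of a two-instruction call sequence being relaxed to a single `jal`, whose immediate is a 21-bit signed, 2-byte-aligned PC-relative offset. The helpers `fits_in_jal` and `call_size_bytes` are hypothetical names for illustration; a real linker such as LLD applies further conditions that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* JAL encodes a 21-bit signed PC-relative offset whose lowest bit is
   implicitly zero, so the reachable range is [-2^20, 2^20 - 2]. */
#define JAL_MIN_OFFSET (-(INT64_C(1) << 20))
#define JAL_MAX_OFFSET ((INT64_C(1) << 20) - 2)

/* Returns true if a jump from `pc` to `target` fits in a single JAL. */
bool fits_in_jal(uint64_t pc, uint64_t target) {
    int64_t offset = (int64_t)(target - pc);
    return (offset % 2 == 0) && offset >= JAL_MIN_OFFSET &&
           offset <= JAL_MAX_OFFSET;
}

/* Size of the call sequence in bytes: one 4-byte instruction if the
   target is in JAL range, otherwise the conservative two-instruction
   (8-byte) sequence. */
unsigned call_size_bytes(uint64_t pc, uint64_t target) {
    /* Instruction addresses are at least 2-byte aligned. */
    assert((pc & 1u) == 0);
    return fits_in_jal(pc, target) ? 4u : 8u;
}
```

Because relaxing a call removes four bytes, every later offset in the section can shift, which is why branches have to be kept as relocations until link time.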

#### What's new for RISC-V in LLVM 16

2023Q1. Last update Mar 2023.

LLVM 16.0.0 was just released today, and as I did for LLVM 15, I wanted to highlight some of the RISC-V specific changes and improvements. This is very much a tour of a subset of additions rather than an attempt to be exhaustive.

In case you're not familiar with LLVM's release schedule, it's worth noting that there are two major LLVM releases a year (i.e. one roughly every 6 months) and these are timed releases as opposed to being cut when a pre-agreed set of feature targets have been met. We're very fortunate to benefit from an active and growing set of contributors working on RISC-V support in LLVM projects, who are responsible for the work I describe below - thank you! I coordinate biweekly sync-up calls for RISC-V LLVM contributors, so if you're working in this area please consider dropping in.

#### Documentation

LLVM 16 is the first release featuring a user guide for the RISC-V target (16.0.0 version, current HEAD). This fills a long-standing gap in our documentation, whereby it was difficult to tell at a glance the expected level of support for the various RISC-V instruction set extensions (standard, vendor-specific, and experimental extensions of either type) in a given LLVM release. We've tried to keep it concise but informative, and add a brief note to describe any known limitations that end users should know about. Thanks again to Philip Reames for kicking this off, and the reviewers and contributors for ensuring it's kept up to date.

#### Vectorization

### End

- Thanks again to all of the (many) contributors so far.
- Closing summary
  - RISC-V LLVM as a model of successful upstream collaboration.
  - Recent milestones and developments.
  - A vision for the future.
- Contact: asb@igalia.com

