### A Methodology for Automating the Integration of User-Defined Instructions into RISC-V Systems based on the CV-X-IF Interface

Florian Egert, Sofia Maragkou, Markus Kobelrausch, Bernhard Fischer, Axel Jantsch Contact: florian.egert@siemens.com



Electronics Design & Integrated Circuits, Siemens Technology; Institute of Computer Technology, Vienna University of Technology

### **RISC-V, Custom Instructions, Interfaces**

Application-specific **Custom Instructions (CIs)** facilitate efficient RISC-V-based systems while preserving the versatility that standard instructions offer. The integration of CIs in RISC-V processors is a manual, time-consuming process.

The interface implemented in our wrapper is the **Core-V eXtension interface** (CV-X-IF) from the OpenHW Group. The recently ratified specification is still in active development, and we envision upcoming support from multiple processors and coprocessors.

In this work, we present a way to semi-automate the CI development process. The main feature of our approach is a configurable wrapper for CI modules, which the **generation flow** produces as RTL.

Our proposed flow follows an interface-based integration approach: external CI modules are connected to the processor via a CI interface. This allows the generated wrappers to be used and reused with multiple different processors without changing the processor's internals.

One goal of utilizing CV-X-IF is to preserve the ability to immediately react to an offloaded instruction. CV-X-IF and our wrapper allow the processor to collect the **result** of an instruction in the **same cycle** in which it was issued.




**CI Integration Flow** (figure: a RISC-V processor connected via CV-X-IF to a coprocessor; the figure labels the issue, commit and result channels of the interface)

**Inputs** - what the user provides to the flow:

- Custom Instructions in the form of RTL modules
- Opcodes in .json format

**Generation Flow** - a wrapper generator tool that combines the inputs with a wrapper template (RTL integration).

**Output** - generated CI-specific CV-X-IF wrapper (RTL):

- Decoder: handles instructions offloaded by the processor
- CI Module: user-provided CI execution unit
- Result Generation: buffers results until the processor collects or dismisses them
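To make the decoder's role concrete, the following is a minimal Python sketch of how an opcode description could be turned into match values for R-type instructions. The JSON schema, the instruction names and the helper functions are all hypothetical; the poster does not specify the actual .json format, and the real flow emits RTL rather than Python. The only fixed facts used are the standard RISC-V R-type field layout and the `custom-0` major opcode (`0x0B`).

```python
import json

# HYPOTHETICAL opcode description: the real .json schema of the flow is not
# given in the poster. Instruction names are placeholders.
OPCODES_JSON = """
[
  {"name": "ci_a", "opcode": "0x0B", "funct3": "0x0", "funct7": "0x00"},
  {"name": "ci_b", "opcode": "0x0B", "funct3": "0x1", "funct7": "0x00"}
]
"""

# Bits that identify an R-type instruction: funct7[31:25], funct3[14:12], opcode[6:0].
R_TYPE_MASK = 0xFE00707F


def build_table(text):
    """Turn the JSON description into (name, match) pairs."""
    table = []
    for entry in json.loads(text):
        match = (
            (int(entry["funct7"], 16) << 25)
            | (int(entry["funct3"], 16) << 12)
            | int(entry["opcode"], 16)
        )
        table.append((entry["name"], match))
    return table


def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """Assemble a 32-bit R-type instruction word (helper for the example)."""
    return (
        (funct7 << 25) | (rs2 << 20) | (rs1 << 15)
        | (funct3 << 12) | (rd << 7) | opcode
    )


def decode(instr, table):
    """Return the matching CI and its register fields, or None to reject."""
    for name, match in table:
        if (instr & R_TYPE_MASK) == match:
            return {
                "name": name,
                "rd": (instr >> 7) & 0x1F,
                "rs1": (instr >> 15) & 0x1F,
                "rs2": (instr >> 20) & 0x1F,
            }
    return None


table = build_table(OPCODES_JSON)
offloaded = encode_r(funct7=0, rs2=3, rs1=2, funct3=1, rd=5, opcode=0x0B)
accepted = decode(offloaded, table)
rejected = decode(0x00000033, table)  # a standard OP-major instruction word
```

Returning `None` for an unknown word models the coprocessor declining an offloaded instruction, so that the processor can treat it as illegal.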

The current wrapper version supports **R-type single-cycle CIs**. Therefore, CI modules provided by the user must consist of combinational logic. To realize a **multi-cycle CI**, the workaround is to split it into a sequence of multiple combinational CIs.
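The splitting workaround can be illustrated with a small software model. The sketch below assumes a CRC-32 computation broken into byte-wise steps, where each step is a pure function of two source operands (the running CRC state and one data byte) and produces one result, matching the R-type shape. This decomposition is an illustrative assumption and not necessarily how the CRC coprocessor in this work is partitioned; the function names are hypothetical.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial


def crc32_byte_step(crc, byte):
    """One hypothetical single-cycle CI: rs1 = CRC state, rs2 = data byte.

    The loop has a fixed trip count of 8, so in hardware it unrolls into
    purely combinational shift/XOR stages with no internal state.
    """
    crc ^= byte & 0xFF
    for _ in range(8):
        crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    return crc & 0xFFFFFFFF


def crc32_via_ci_sequence(data):
    """Software issues one CI per byte; the state lives in a register."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = crc32_byte_step(crc, b)
    return crc ^ 0xFFFFFFFF


check = crc32_via_ci_sequence(b"123456789")
```

The intermediate state is carried in a general-purpose register between instructions, so the CI module itself stays stateless, which is what the single-cycle restriction requires.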



## **Results - Runtime and Development Time**

#### **Runtime improvements**:

- We validated the flow by generating a CRC and an **AES coprocessor**.
- For AES specifically, we compared the instruction and cycle count against two reference algorithms.
- The resulting reduction in runtime ranged from about **61% to 88%**.

| Reference algorithm | Reduction in instructions | Reduction in cycles |
|---------------------|---------------------------|---------------------|
| 1                   | 88.05%                    | 87.53%              |
| 2                   | 68.26%                    | 60.52%              |

#### **Development time reduction**:

We estimated the degree to which the presented flow reduces the user's working hours compared to implementing the CIs manually.

- The estimation is based on an **expert interview**.
- We created three scenarios with use cases of varying complexity.
- We also introduced weights based on the system's reusability and the user's experience level and training time.
- The estimation yields a reduction in development time of about **64%** when using the proposed flow.

# **Conclusion and Outlook**

- We proposed a semi-automated **CI integration flow** that generates an RTL wrapper. The support of the Core-V eXtension interface enables rapid execution of issued CIs and compatibility with multiple RISC-V processors.
- We found **runtime improvements** for an AES use case of up to **88%**.
- We estimate a **reduction in working hours** for the flow's users of around **64%**.

The results demonstrate the potential benefits of a fully automated CI development framework based on interfacing. In a follow-up action as part of TRISTAN, we are currently developing a **High-Level Synthesis (HLS)** flow to add automation support for applications given in C/C++. HLS is leveraged to generate RTL code from high-level C/C++ descriptions of the CIs and the wrapper, and to further optimize for performance, power, and area.



1. M. Damian et al. "SCAIE-V: an open-source SCAlable interface for ISA extensions for RISC-V processors". Proc. DAC '22, pp. 169–174.
2. F. Egert. "FRANCIS-V: FRAmework for iNtegrating Custom instructionS into RISC-V systems". Master's thesis. 2023.
3. "CORE-V eXtension Interface". OpenHW Group, 2021.



#### Acknowledgements

The TRISTAN project, nr. 101095947, is supported by the Chips Joint Undertaking (CHIPS-JU) and its members, including top-up funding by the Austrian Research Promotion Agency (FFG) and the program "ICT of the Future" of the Austrian Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK).




