Presentations
Notes for speakers
Preparation before the conference:
- At least one author of the presentation must register for the core conference (Tuesday 9 to Thursday 11).
- There are no templates for slides.
- As soon as possible, upload an updated PDF abstract on the submission web site:
- To add the authors’ names if the submission was blind.
- To fix any typos if your submission was non-blind.
- Before Friday May 29th, AOE (Anywhere on Earth):
- Upload your slides as PDF or PPTX on the submission web site.
- Upload your poster as PDF on the submission web site.
At the conference:
- Get in touch with your session chair during the previous half-day, or early on Tuesday for talks on Tuesday 9 morning.
- Your abstract, slides and poster (if any) will be published online as PDFs on the conference web site:
- The abstract will be published as soon as we get it.
- The poster, if any, slightly before or at the Summit’s opening.
- The slides will be pushed online after the Summit.
Accepted presentations
Check this page regularly for the final schedule!
Accepted presentations will be associated with keynotes and invited talks into consistent plenary sessions, over the three days of the core conference (Tuesday 9 to Thursday 11).
Heuristic-free system call interception on RISC-V
Sub. #3EZXZV.
Iacopo Colonnelli and Ottavio Monticelli.
Abstract: Many applications benefit from the ability to intercept, block, or modify system calls efficiently. Binary rewriting is one of the fastest techniques to achieve this, but it often relies on instruction-dependent heuristics that limit its applicability. To date, exhaustive rewriting techniques (introduced by zpoline) are only available for the x86-64 ISA. This work introduces vpoline, the first fully heuristics-free system call interception library for RISC-V. By leveraging the RISC-V linker relaxation mechanism, vpoline achieves the same benefits as zpoline while overcoming the intrinsic limitation of requiring privileged access.
Accelerating Sparse Linear Solvers in OpenFOAM using RISC-V Vector Extensions
Sub. #D3TVFS.
Gabriele Ceccolini.
Abstract: Computational Fluid Dynamics (CFD) relies heavily on the efficiency of linear solvers based on sparse linear algebra kernels. Widely used frameworks like OpenFOAM exploit parallelism primarily at the domain decomposition level via MPI. Support for vector/SIMD architectures is limited to compiler auto-vectorization. Furthermore, support for such architectures is limited by OpenFOAM’s internal matrix data format, which is intrinsically ill-suited for the contiguous memory accesses required for efficient execution on vector processors. In this work, we focused on two very different RISC-V architectures: the prototype long-vector EPAC accelerator and the commercial short-vector CPU Sophon SG2044. On these platforms, we optimized the Sparse Matrix-Vector multiplication (SpMV) using RISC-V vector intrinsics and integrated it into a custom smoother, performing a runtime conversion of internal data into a vector-friendly format. Experimental results on the EPAC test chip show a 6× speedup for the smoother; benchmarks on the Monte Cimone (MCv2) cluster with the Sophon SG2044 processor achieve a 1.5× smoother speedup, proving that legacy CFD codes can be effectively accelerated on both research and commercial emerging hardware.
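The kind of runtime conversion the abstract describes can be illustrated with a plain-Python sketch. The ELLPACK-style padded layout below is an assumption chosen for illustration, not necessarily the authors' format: converting a row-pointer sparse matrix into padded, column-major slices turns the inner SpMV loop into unit-stride sweeps that a vector unit can consume.

```python
# Illustrative sketch (not the authors' code) of converting a CSR matrix
# into a padded, column-major "vector-friendly" layout (ELLPACK-style,
# an assumed format), so SpMV becomes a series of unit-stride passes.

def csr_to_ellpack(row_ptr, col_idx, vals):
    """Pad every row to the longest row and transpose into slices."""
    n = len(row_ptr) - 1
    width = max(row_ptr[i + 1] - row_ptr[i] for i in range(n))
    cols = [[0] * n for _ in range(width)]     # padded entries point at
    data = [[0.0] * n for _ in range(width)]   # column 0 with value 0.0
    for i in range(n):
        for k, j in enumerate(range(row_ptr[i], row_ptr[i + 1])):
            cols[k][i] = col_idx[j]
            data[k][i] = vals[j]
    return cols, data, width

def ellpack_spmv(cols, data, width, x):
    """y = A @ x; each pass over k is a contiguous, vectorizable sweep."""
    n = len(cols[0])
    y = [0.0] * n
    for k in range(width):
        for i in range(n):  # unit-stride over i: the loop RVV would consume
            y[i] += data[k][i] * x[cols[k][i]]
    return y
```

The zero-padding wastes a few multiply-adds on irregular matrices, which is the usual trade-off such formats make in exchange for contiguous access.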
All The Scaling, No New State: One Matrix ISA with Microarchitectural Freedom
Sub. #ETWPMQ.
Dr. Philipp Tomsich.
Abstract: RISC-V’s Zvvm matrix extension stores all tile state in the standard V register file and derives tile geometry algebraically from VLEN, SEW, and a new aspect-ratio field λ. This yields arithmetic intensity that scales with VLEN: a binary compiled at VLEN=256 delivers higher throughput at VLEN=65536 with no recompilation. The same partial-VL mechanism that enables one-column-at-a-time embedded streaming also drives full HPC bulk tiling, while microscaling is integrated via vm-bit opcode aliasing with no new architectural state.
Tile dimensions are not programmer-specified constants — they are consequences of existing parameters. The tile is always square: M = N = VLEN/(SEW×λ), with inner dimension K_eff = λ×W×LMUL. Arithmetic intensity (M/2) grows proportionally with VLEN, and the ratio of intensity to cache-to-VRF bandwidth remains constant — a provable algebraic identity with no equivalent in Arm SME or Intel AMX.
Zvvm’s geometry knobs form an intent vocabulary expressed from both sides: software selects LMUL and VL to control K_eff depth and streaming granularity; hardware determines λ and VLEN to shape the tile for its datapath. Setting VL = K_eff with LMUL = 1 gives portable streaming; increasing LMUL or computing multiple C panels trades register pressure for compute intensity — all via the same opcode.
Microscaling (MX) support is integrated by aliasing the vm bit in FP multiply-accumulate opcodes, introducing no new encoding space, registers, or modes.
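The tile-geometry algebra quoted above can be checked numerically. The sketch below is illustrative only (parameter names follow the abstract; the function is not part of any specification):

```python
# Hedged sketch of the Zvvm tile-geometry algebra from the abstract:
# M = N = VLEN/(SEW*lam), K_eff = lam*W*LMUL, intensity = M/2.
# Parameter names follow the abstract; nothing here is normative.

def zvvm_tile(vlen_bits, sew_bits, lam, w_bits, lmul):
    """Derive tile shape and arithmetic intensity from existing parameters."""
    m = n = vlen_bits // (sew_bits * lam)  # the tile is always square
    k_eff = lam * w_bits * lmul            # inner (reduction) dimension
    intensity = m / 2                      # grows proportionally with VLEN
    return m, n, k_eff, intensity
```

With SEW=8 and λ=4, raising VLEN from 256 to 65536 grows the square tile from 8×8 to 2048×2048 while K_eff stays fixed, matching the abstract's claim that the same binary gains arithmetic intensity on wider hardware without recompilation.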
Enabling Confidential Computing on RISC-V: An Open-Source MPT Implementation
Sub. #EXN7PD.
Haoyuan Liu.
Abstract: Memory Protection Tables (MPT) is an emerging RISC-V extension under community discussion that enables fine-grained multi-supervisor domain physical memory isolation and access control for multi-tenant computing, addressing the security and isolation limitations of the traditional PMP mechanisms. This work presents the first open-source hardware implementation of the MPT draft specification (v0.4). Our design features a multi-level cache for accelerated permission checking and an L1TLB extension to reduce query frequency, with a decoupled architecture for portability. Evaluation shows only 2.32% average SPEC06 performance overhead and a 0.244% core area overhead, providing a hardware reference for SMMPT standardization.
CHAKRA-GP: A Retargetable Compiler Framework for RISC-V GPGPU Architectures
Sub. #FRDLL7.
Prachi Pandey and PRANOSE J EDAVOOR.
Abstract: The emergence of RISC-V as an open and extensible instruction set architecture has enabled the development of domain-specific accelerators and General-Purpose Graphics Processing Units (GPGPUs). While the RISC-V ISA provides support for scalar instructions and the RISC-V Vector Extension (RVV) enables data-parallel vector execution, these models do not directly support the Single-Instruction Multiple-Thread (SIMT) execution paradigm required by modern GPU architectures. Consequently, efficient software enablement for RISC-V–based GPUs requires compiler support capable of generating SIMT-oriented instruction sequences and managing massively parallel execution. This proposal talks about CHAKRA-GP, a hardware-optimized compiler framework for RISC-V–based GPGPU architectures. Built upon LLVM and MLIR infrastructures, CHAKRA-GP provides a scalable compilation pipeline enabling efficient kernel generation, memory optimization, and parallel execution mapping for massively parallel workloads. The compiler targets custom RISC-V GPGPU platforms and enables efficient execution of HPC, scientific computing, and AI workloads. The work demonstrates how an extensible compiler infrastructure can bridge the gap between the RISC-V ISA and SIMT-based GPU execution models, enabling efficient compilation for customizable RISC-V GPGPU architectures.
ARCANE: Enabling High-Performance In-Cache Tensor Extensions in RISC-V
Sub. #G7Y79Q.
Vincenzo Petrolo and Flavia Guella.
Abstract: Modern data-centric workloads increasingly expose the limitations of traditional von Neumann architectures, where excessive data movement limits throughput and energy efficiency. While hardware accelerators improve performance, they often lack flexibility and still require costly memory transfers. Existing compute in- and near-memory solutions reduce the memory bottleneck but introduce usability challenges related to constraints on data placement. ARCANE is a cache architecture that doubles as a tightly-coupled near-memory coprocessor. The embedded RISC-V cache controller executes custom instructions offloaded by the host CPU, relying on near-memory vector processing units within the cache memory subsystem. This architecture hides memory synchronization and data mapping from application software, while offering software-based Instruction Set Architecture extensibility. Evaluations demonstrate up to an 84× speedup on 8-bit convolution layers over a traditional system-on-chip, incurring only a 41.3% area overhead.
Ultra Low Power RISC-V core: Retention with Warm Restart Extension
Sub. #G7YSXG.
Anne Merlande.
Abstract: Energy saving is a top priority for STMicroelectronics products. For the STxP5 embedded CPU based on the RISC-V architecture, there is a particular focus on minimizing static power when the core is inactive. Additionally, it is important to optimize the CPU restart time, silicon area, implementation complexity, and software overhead. The Ultra Low Power Retention with Warm Restart Mode addresses these challenges by maximizing power savings and reducing drawbacks typically associated with resuming operation. This solution leverages the modular, scalable, customizable, and extensible nature of the RISC-V architecture by defining and implementing a custom RISC-V extension and tailored microarchitecture.
RIVIERA: A Programmable RISC-V Edge Architecture for NFC Signal Processing
Sub. #HC7JS8.
Luca Lingardo.
Abstract: The RIVIERA core, developed within the Chips-JU TRISTAN Project, is a valid alternative to state-of-the-art DSP architectures used in NFC reader downlink signal processing. Instead of relying on custom hardware, RIVIERA employs an open-source RISC-V core and its ISA extension interface to implement a software-defined radio (SDR) architecture, thus moving processing to the extreme edge of an NFC communication system. The first RIVIERA prototype targets decoding of NFC Type A tag responses and is ready by design to cover other NFC standards and rates. By replacing hardened logic functions with SW data processing supported by a general-purpose DSP accelerator, RIVIERA reduces pre-silicon engineering effort, enables continuous post-silicon improvements, and facilitates portability across SoC designs and technology nodes. This work demonstrates how application-specific custom RISC-V ISA extensions can effectively and efficiently handle RF baseband workloads, paving the way for the adoption of SDR architectures in RF communications for the IoT mass market.
A Proof-of-Concept RISC-V with 128-bit Extension
Sub. #MUFY8Z.
Frédéric Pétrot.
Abstract: Addressing ever-larger amounts of memory is a fact of (computerized) life. The authors of the RISC-V unprivileged specification did recognize that and coined, in less than one and a half pages, what could be a natural extension to 128-bit of the 32- and 64-bit RISC-V ISA. Given this RV128I draft, we (a) defined an ELF128 extension for binaries, (b) built a GNU-based cross-compilation environment able to use RV128I instructions and generate ELF128 binaries, (c) added support for this extension and ELF128 in QEMU, and (d) added the necessary instructions and resources in the CVA6 processor.
SVM: A Synthesizable Approach to Efficient RISC-V CPU Verification
Sub. #QVKCU9.
Yinan Xu.
Abstract: The growing complexity of RISC-V processors, driven by rapidly expanding ISA extensions and sophisticated microarchitectures, has made functional verification a dominant bottleneck. Contemporary CPU verification commonly relies on RTL co-simulation against a software reference model, but on hardware-assisted simulation platforms (e.g., FPGAs) this workflow is fundamentally limited by high-volume communication between the accelerated RTL and the host-executed reference, preventing verification throughput from scaling. This paper addresses this by eliminating the RTL-host interaction bottleneck and proposing a Synthesizable Verification Methodology (SVM). We re-architect a RISC-V reference model as synthesizable hardware and deploy it alongside the design under test on the same acceleration platform, enabling fully hardware-based co-simulation at near-native speeds (60 MHz on FPGAs) while preserving reference-model checking and debug observability.
An Open-Source CVA6S+ based High-Performance, Cache-Coherent Cluster for 64b Automotive MPUs
Sub. #R7MK7D.
Riccardo Tedeschi.
Abstract: Driven by the need for zonal control architectures in software-defined vehicles, open-source RISC-V cores are becoming a compelling solution for automotive microprocessor units (MPUs). We introduce a 64b cache-coherent, tightly coupled cluster built upon the industry-backed OpenHW CVA6S+ core and HPDCache, capable of executing SMP Linux and RTOS kernels. A design space exploration of the core branch predictor identifies an embedded tournament configuration that reduces its area by 11.6% with no loss in accuracy. Evaluated on the Splash-3 benchmark suite, the cluster achieves a geometric mean speedup of 1.75× over a single-core baseline, and a 1.21× speedup over a prior implementation based on the scalar CVA6 and legacy cache subsystem. Synthesized in GlobalFoundries’ 12 nm FinFET, the dual-core cluster incurs less than 1% per-core area overhead, with the coherent unit in the interconnect contributing only 35 kGE (1.5%) to the total cluster footprint.
Practical Implications of SPMP-Based Virtualization in RISC-V
Sub. #XFYMDU.
Manuel Rodríguez.
Abstract: The RISC-V SPMP for Hypervisor specification enables MMU-less virtualization through a multi-layered memory protection architecture. While this model provides strong isolation for mixed-criticality MCUs, concerns have been raised regarding the hardware overhead and timing impact of multiple PMP layers. In this work, we present an empirical evaluation of an SPMP for Hypervisor proof-of-concept implementation. We analyze FPGA resource utilization and timing behavior as a function of entry count and discuss realistic entry requirements for MCU-based virtualization workloads, providing insights for hardware designers adopting SPMP-based architectures.
RISC-V Custom Instructions for Automotive Control and DSP Algorithms Compliant with ISO 26262
Sub. #8LUM7U.
Sai Swaroop Maram and Johannes Sanwald.
Abstract: The stringent safety requirements of the automotive industry necessitate compliance with standards like ISO 26262. Processor cores, often pre-certified to ASIL-B or ASIL-D, face certification risks when modified. This work focuses on the development of custom instructions that are integrated through Codasip’s Bounded Customization (BC) without directly modifying the core’s verified RTL. The paper details a workflow for this process and presents performance results demonstrating the acceleration achieved for key automotive and DSP algorithms, including Field Oriented Control (FOC). All extensions were consolidated into a unified custom processor, termed the Motor Control with DSP (MCXD) core, featuring a scheduling algorithm that coordinates FOC and filtering routines. Synthesis showed an area increase of ~31%, while runtime and instruction count measurements demonstrated performance improvements of up to ~21%. These results validate that domain-specific acceleration can be achieved within the boundaries of ISO 26262.
RVA23 Profile Support in Linux Kernel: From Extension Definitions to Userspace Export
Sub. #B7EASJ.
Guodong Xu.
Abstract: The RVA23 profile, ratified in October 2024, defines a mandatory baseline of 33 U-mode and 25 S-mode extensions. During the upstream enablement of SpacemiT K3, I identified gaps in the kernel’s RVA23 extension coverage and submitted patches to address them. After several revision cycles, the patches were merged into Linux v7.0, raising coverage from 69% to 100%.
This talk will examine how the Linux kernel community approaches RISC-V extension support - the design principles behind accepting new extensions into the kernel, and how the maintainers manage the growing complexity of the RISC-V extension landscape.
I will then present two patchsets currently under review for Linux v7.1: my series adding cpufeature parsing and hwprobe export for RVA23 extensions, and Andrew Jones’ (Qualcomm) RFC introducing rva23u64 base behavior detection. I will discuss the key architectural decisions in these patches, the review feedback received, and the current status.
Depending on the upstream timeline, these may already be merged by the Summit, or still in progress - either way, the talk will reflect the latest state of the kernel community’s work.
Achieving complete RVA23 support in mainline Linux is a prerequisite for distributions to ship generic RISC-V images that work across compliant hardware, reducing fragmentation. We hope to invite RISC-V kernel community members at the Summit for an open discussion on remaining challenges and future profile evolution.
Building the software ecosystem for a RISC-V datacenter
Sub. #BTUW3M.
Jon Taylor.
Abstract: The RISC-V software ecosystem has grown steadily over the last few years. For embedded software it is reasonably complete, with good compiler, RTOS, and IDE support. The Linux kernel is also well supported, with RISC-V long having upstream support, and RVA23 now supported too. Canonical moved to requiring RVA23 with the release of Ubuntu 25.10, ready for the next generation of RISC-V silicon. But building out a data center takes more than just a good desktop experience. This poster/paper examines the other elements required, including provisioning, hypervisors, containers, and orchestration, and discusses how to manage custom instructions, security, and maintenance. Going beyond theory, it discusses the data center Canonical will be building to include RISC-V RVA23 silicon supporting the Launchpad.net community website as well as other uses.
Evaluating Tenstorrent RISC-V Accelerators for High Performance Scientific Computing
Sub. #HNQPKX.
Elisabetta Boella.
Abstract: We implemented an N-body astrophysical simulation code and offloaded its most computationally intensive kernel to Tenstorrent RISC-V–based accelerators using the TT-Metalium programming interface. Performance was assessed on the Wormhole n300 card in terms of execution time and energy consumption, and compared with both an optimized CPU implementation and a CUDA version. The TT-Metalium implementation achieves a speedup of 2× over the CPU baseline, although its performance still slightly lags behind the CUDA implementation. Finally, we investigated strategies for scaling the application across multiple Tenstorrent accelerators, evaluating configurations with up to four devices.
Accelerating RISC-V Innovation with open MPACT Tools from Google
Sub. #JKTENR.
Yenkai and Tor Jeremiassen.
Abstract: The MPACT Tools portfolio provides open-source tools that increase the velocity of HW-SW co-design and development of RISC-V based systems.
MPACT-Sim [1] is an ISS framework in C++ that makes it easier to create ISSs from scratch, and supports rapid changes in response to ISA design changes or user-needed functional enhancements. Using DSLs to describe the instruction set and encoding, it automatically generates instruction decoder source, and provides support for generating assemblers and disassemblers. MPACT-Sim enables rapid HW/SW co-design and early pre-Silicon software development.
MPACT-RiscV [2] (built using MPACT-Sim) is a highly configurable RISC-V ISS, with an interactive command interface for assembly-level debugging and a customizable assembler which generates both relocatable and executable output files.
To demonstrate the practical impact of the MPACT ecosystem, we present the real-world case study of the CoralNPU machine learning core [3], which is focused on development and execution of ML kernels. The CoralNPU-MPACT ISS [4] development was significantly accelerated by leveraging the fundamental MPACT-Sim and MPACT-RiscV infrastructure, requiring only limited modifications to support the additions to the CoralNPU’s instruction set and memory access rules.
The CoralNPU UVM testbench [5] captures every retired instruction via the standardized RISC-V Verification Interface (RVVI) and steps the MPACT ISS model using a SystemVerilog DPI bridge. The testbench then retrieves golden reference values from the model to verify equivalence against the CoralNPU RTL, detecting functional bugs during development.
The RISE Project: Advancing the RISC-V Software Ecosystem
Sub. #KMNP8Q.
Nathan Egge.
Abstract: The RISC-V Software Ecosystem (RISE) Project is a Linux Foundation Europe initiative where hardware, software and services companies collaborate to bridge the gap between architectural potential and commercial software readiness. While the RISC-V community is highly impactful, industrial-grade software often requires an extra push. RISE provides this through direct engineering, an RFP process that has already deployed over €1M in contracts, and individual support through the RISE Developer Appreciation Program.
This session highlights how RISE is accelerating RISC-V adoption within key upstream open-source projects. We will detail our strategic push to enable AI/ML workloads through targeted investments in PyTorch, Llama.cpp, IREE, oneDNN, and OpenBLAS. We’ll demonstrate how RISE is lowering the barrier to entry by providing free GitHub and GitLab runners for riscv64, alongside self-service remote hardware access via the RISE Board Farm. Finally, we will share our long-term roadmap for ecosystem performance and stability, focusing on LLVM auto-vectorization for RVV at scale and the RISE Build Farm’s role in proactive bug detection across kernel, toolchain and system libraries.
World's first lunar exploration rover using FPGA-based RISC-V processor
Sub. #MCBEUE.
Tetsuo YOSHIMITSU.
Abstract: A small lunar rover named “LEV-1” performed surface mobility exploration on the Moon in January 2024. It was the first lunar exploration robot from our country. LEV-1 was installed in the lunar lander “SLIM” and was deployed onto the Moon’s surface just before landing.
LEV-1 explored the landing area fully autonomously after deployment. The obtained data, including images, were transmitted directly to the ground with no relay by the lander.
The onboard computer of the rover used a RISC-V soft-core CPU implemented within an FPGA. The system is one of the world’s first onboard computers using a RISC-V processor to be operated on the Moon.
This paper describes the configuration of the RISC-V controller installed on the LEV-1 rover, as well as the technical background for using RISC-V in space applications.
Why the industry needs CHERI to be able to meet the EU Cyber Resilience Act
Sub. #PC8KYU.
Tariq Kurd.
Abstract: The Cyber Resilience Act (CRA) is fully enforced for all products “with a digital element” sold in the EU from December 2027. It places highly stringent requirements on manufacturers, such as products being “secure by design and by default” and “having no known vulnerabilities” at the point of going on sale. Discovered vulnerabilities in the product must be reported within 24 hours for critical exploits. All vulnerabilities must be patched within a short time frame, and support must last 5 years or longer depending on the product. As a specific example of the effect of the CRA on consumer products, the Linux kernel had 4336 reported exploits (CVEs) in 2024 (12 per day) and 5779 in 2025 (16 per day). Linux is used in an increasingly large range of consumer devices, not least a large proportion of the world’s smartphones. To be able to continue selling these products in Europe, the industry needs to move to much more securely constructed systems. CHERI systems have memory safety built in, which resolves 70% of the vulnerabilities seen in weaker non-CHERI legacy systems. Resolving such a large proportion of vulnerabilities at source will greatly reduce support and maintenance costs, if nothing else. As a result of the CRA, there will be a large shift in the industry to make systems much more secure. We expect that much of that shift will be towards CHERI systems as manufacturers wake up to the cost savings.
Enabling High-Performance Storage for RISC-V: Porting the Lustre Parallel File System
Sub. #RRD8XA.
Dave Cremins.
Abstract: Lustre powers approximately 70% of TOP500 supercomputers and is essential infrastructure for high-performance computing (HPC). Our work enables RISC-V systems to access Lustre storage clusters, addressing a critical gap in the RISC-V HPC ecosystem. The port required only 8 minimal patches (19 lines changed across 9 files) to Lustre 2.17.0, demonstrating the maturity of both the RISC-V Linux ecosystem and Lustre’s portable codebase. We validated functionality through QEMU-based testing with multi-client mount operations, FIO, and IOR benchmarks. The patches are being submitted upstream to the Lustre project for inclusion in future releases.
Optimizing Llama.cpp and GGML for RISC-V Vector (RVV)
Sub. #TX9SGW.
Taimur Ahmad and Adeel Ahmad.
Abstract: Llama.cpp is a widely used open-source platform for running Large Language Models (LLMs) on CPUs, but its support for RISC-V remains limited compared to x86 and ARM. Many floating-point and quantized kernels lack RISC-V Vector (RVV) implementations, restricting the performance of existing hardware. This work improves the upstream RISC-V performance by vectorizing core floating-point kernels and extending support across multiple quantization types, enabling first-class support for RVV in Llama.cpp. VLEN-aware data repacking is introduced to accelerate GEMM and GEMV kernels for both floating point and quantization types. The optimized kernels are validated across VLENs up to 1024-bit, with benchmarking on Banana Pi BPI-F3 (256-bit VLEN) demonstrating considerable performance gains over upstream Llama.cpp. This work is supported by the RISC-V Software Ecosystem (RISE), with the vectorized kernels being upstreamed to Llama.cpp along with the test infrastructure.
RVEdge-Vision: A Fully Open, Ultra-Efficient On-Device AI Platform for Smart Eyewear
Sub. #UCYLJC.
Michele Magno.
Abstract: Smart eyewear promises unobtrusive, context-aware human–computer interaction by leveraging strategically placed multimodal sensors and on-device intelligence. However, integrating high-bandwidth sensing and machine learning inference within a compact and lightweight form factor remains challenging due to strict constraints on power consumption, memory footprint, and computational efficiency. This work presents RVEdge-Vision, an open hardware and software platform built on the RISC-V ecosystem that enables rapid prototyping and evaluation of next-generation smart glasses. The platform adopts a modular architecture supporting both frame-based and event-based vision sensors. To the best of our knowledge, this is the first open smart-glasses platform integrating event-based vision sensing in a glasses form factor, enabling ultra-efficient visual perception for wearable edge AI systems. The system incorporates a hardware–software co-designed power management framework optimized for battery-operated edge devices and continuous sensing workloads. As a reference implementation, we present a compact smart-glasses prototype that integrates multimodal sensing and on-device ML acceleration. The device can operate for several hours on a 300 mAh battery while sustaining real-time embedded vision workloads. A YOLOv8-based hand gesture recognition model runs on-board with a few milliseconds of latency, without relying on cloud connectivity. By releasing the platform as open hardware, OpenEdge RV aims to accelerate innovation within the RISC-V and open-edge AI communities, providing a reproducible foundation for research in wearable sensing, neuromorphic vision, and ultra-efficient on-device intelligence.
A User-Friendly and AI-Ready Desktop for RISC-V: Bianbu LXQt
Sub. #WHFNV3.
Xiaogang Fan.
Abstract: We present Bianbu LXQt, a user-oriented desktop environment for RISC-V platforms built on a deeply adapted LXQt software stack, optimized for real hardware such as SpacemiT’s K1 and K3 SoCs. Unlike straightforward ports that assume x86-like hardware standardization, this work addresses common RISC-V Linux challenges, including fragmented peripheral support and the absence of a unified hardware abstraction layer. SpacemiT’s CPUs integrate AI-oriented instruction extensions such as IME, enabling CPU-based inference without discrete GPUs or NPUs, requiring coordinated adaptation across the OS and AI frameworks. Preserving LXQt’s lightweight design, we redesigned the UI and interaction logic to improve responsiveness and visual consistency on resource-constrained RISC-V systems. Development was accelerated using AI-assisted tooling, while continuous feedback from educators and early adopters guided iterative fixes for lag, crashes, and complex configuration—letting users focus on creation, learning, and development rather than system tuning. The full software stack is open source with reproducible builds and modular components. We provide educational AI examples covering image recognition, speech processing, video analysis, and large language model inference, all with intuitive GUIs. Frameworks including ONNX Runtime, llama.cpp, and Ollama run reliably, demonstrating the feasibility of RISC-V systems for AI deployment and local AI development. Through practical system integration, community-driven iteration, and accessible AI tooling, this work shows RISC-V can deliver a polished, daily-driver desktop environment—moving beyond a demo into a trusted open platform for developers, educators, and innovators.
Proposal of State Sensitive Counter (Sssscnt)
Sub. #WWT8EV.
Fengxue Zhang and Bohua Kou.
Abstract: PELT (Per-Entity Load Tracking) is an exponential decay-based per-entity load tracking algorithm in the Linux kernel. It significantly enhances the scheduler’s load awareness accuracy, response latency, and energy efficiency. However, there are still drawbacks in load tracking: the load metrics used are not CPU-frequency invariant. CPU frequency scaling causes task physical runtime to fluctuate with frequency, which, if uncorrected, distorts util_avg and leads to scheduling misjudgments. To address this, the kernel employs hardware counters (e.g., Intel APERF/MPERF, ARMv8.4-AMU) to implement frequency-invariance accounting, ensuring util_avg remains anchored to the CPU’s maximum capacity, thereby maintaining load statistics accuracy and scheduling optimality in dynamic frequency environments. Targeting RISC-V architectures, this proposal introduces State Sensitive Counters to fill the gap in PELT frequency-invariance support. Together, these counters enable the derivation of real-time operating frequency and normalized utilization without costly synchronous queries.
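The frequency-invariance correction the proposal targets can be sketched as follows. This is a hedged illustration modeled on the APERF/MPERF scheme the abstract cites, not on the Sssscnt draft itself; the counter names and the 1024 fixed-point scale (the kernel's SCHED_CAPACITY_SCALE convention) are illustrative:

```python
# Illustrative sketch of frequency-invariance accounting: one counter
# advances at the actual clock (APERF-like), another at a fixed reference
# clock (MPERF-like). Their delta ratio approximates curr_freq/max_freq,
# which normalizes a raw utilization sample to maximum CPU capacity.

def freq_scale(aperf_delta, mperf_delta, max_scale=1024):
    """Return curr_freq/max_freq as a fixed-point factor in [0, max_scale]."""
    return min(max_scale, aperf_delta * max_scale // mperf_delta)

def invariant_util(raw_util, aperf_delta, mperf_delta):
    """Scale a raw PELT-style utilization sample by the frequency factor."""
    return raw_util * freq_scale(aperf_delta, mperf_delta) // 1024
```

For example, a task measured while the CPU ran at half its maximum frequency has its utilization halved, so util_avg stays anchored to full capacity regardless of DVFS state.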