RISC-V Assembler: Compiler Explorer
The Godbolt Compiler Explorer is a fantastic tool for assembler programmers. In this post, I show you how to use Compiler Explorer to generate RISC-V assembly code and offer some ideas to make the best use of this tool.
In the last few years, we’ve seen an explosion of RISC-V CPU designs on FPGA and ASIC, including the RP2350 found on the Raspberry Pi Pico 2. Thankfully, RISC-V is ideal for assembly programming with its compact, easy-to-learn instruction set. This series will help you learn and understand 32-bit RISC-V instructions and programming.
Getting Started with Compiler Explorer
The Godbolt Compiler Explorer lives at godbolt.org.
Compiler Explorer lets you see the results of compiling C, C++, Rust and other high-level languages in your browser. Change your high-level code and see the assembled code update immediately. This is invaluable when experimenting and learning, and Compiler Explorer even shows you which assembly instructions correspond to which parts of your high-level code.
I will focus on C and 32-bit RISC-V (RV32), but much of this advice applies to other languages and architectures. My examples use the Hazard3 RISC-V CPU because it’s both open source and available in a low-cost microcontroller, the Raspberry Pi RP2350.
You write your high-level code on the left, and the assembler appears on the right. You can choose your programming language (A), compiler (B), and compiler options (C). Click on output (D) to see compiler output, including errors and warnings.
Choosing a Compiler
For 32-bit RISC-V, you can choose GCC or Clang in many versions. While I normally use GNU assembler (gas) to assemble my RISC-V designs, Clang often generates more readable assembler. The beauty of Compiler Explorer is you can easily use both and compare them.
Add your chosen compilers to your favourites; otherwise, you’ll be doing a lot of scrolling! You do this by clicking on the compiler drop-down menu and selecting the stars next to your chosen compilers.
Functions and Optimisation
The best way to experiment with simple designs is to write a function. That way, the inputs and outputs are clear, and you can plainly see what’s happening.
By default, the generated code is unoptimised. This is probably not what you want because it adds a stack frame to your functions, making it harder to see what your algorithm is doing.
Consider this trivial C function that squares a number:
int square(int num) {
    return num * num;
}
In Clang 18.1, without optimisation, you get 11 instructions!
It makes sense if you know that sp is the stack pointer, ra is the return address, and s0 is the frame pointer. However, unless you’re learning about functions, these instructions are just getting in the way.
If we add -O to the compiler options (top right of window), we get more readable code:
As expected, squaring a number requires a single multiply instruction.
You can optimise further with -O2 or for size with -Os. See GCC Options That Control Optimization.
Compiler Options
Beyond optimisation, you can pass many more options to the compiler.
For RISC-V, the two key options are:
-mabi=ABI-string
-march=ISA-string
The ABI specifies the integer and floating-point calling convention. The three main 32-bit conventions all have 32-bit int, long, and pointer but differ in their floating-point support:
- ilp32: 32 bit without floating point (soft floats)
- ilp32f: 32 bit with single-precision floating point
- ilp32d: 32 bit with double-precision floating point
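A small floating-point function makes a handy probe for the ABI setting. This is only a sketch: the function name scale is made up for illustration. With -O -mabi=ilp32 -march=rv32imac, I'd expect the compiler to call a soft-float library helper (typically __mulsf3), while -O -mabi=ilp32f -march=rv32imafc should give a single fmul.s. The exact output depends on your compiler and version.

```c
// Hypothetical probe function: multiply two single-precision floats.
// Try it with -mabi=ilp32 (soft float) and -mabi=ilp32f (hardware float)
// and compare the generated assembler.
float scale(float value, float factor) {
    return value * factor;
}
```

Remember that the ABI and ISA must agree: a hardware floating-point ABI needs the matching floating-point extension in -march.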
The ISA specifies the size (32 or 64-bit) and RISC-V extensions to use. There are many possible combinations; here are a few of the 32-bit possibilities:
- rv32i: base 32-bit integer instructions
- rv32im: 32-bit integer with multiply extension
- rv32imac: 32-bit integer with multiply, atomic, and compressed extensions
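To see what the M extension buys you, try a pair of tiny functions like the ones below. The names product and quotient are my own, and the predictions are assumptions based on typical GCC and Clang behaviour: with -O -march=rv32i -mabi=ilp32, I'd expect calls to library routines such as __mulsi3 and __udivsi3, while -march=rv32im should produce single mul and divu instructions.

```c
#include <stdint.h>

// Hypothetical examples for comparing rv32i with rv32im.
// Unsigned types are used so that overflow is well defined.
uint32_t product(uint32_t a, uint32_t b) {
    return a * b;   // needs the M extension for a single instruction
}

uint32_t quotient(uint32_t a, uint32_t b) {
    return a / b;   // division is also part of the M extension
}
```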
For Hazard3, which lacks floating-point hardware but supports bit-manipulation extensions:
-mabi=ilp32
-march=rv32imac_zicsr_zifencei_zba_zbb_zbkb_zbs
For a complete list of RISC-V compiler options, see GCC RISC-V Options (Clang accepts most of the same options).
Compiler support for RISC-V extensions has evolved rapidly in recent years, so if you have trouble assembling some code, make sure your compiler is new enough.
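The Hazard3 options above enable Zba, the address-generation extension, and array indexing is a quick way to see it in action. The function below is a sketch with names I've invented. With plain rv32imac, I'd expect a shift (slli), an add, and a load; with zba in the ISA string, the shift and add should collapse into a single sh2add. Treat that as an expectation to check in Compiler Explorer rather than a guarantee.

```c
#include <stdint.h>

// Hypothetical helper: fetch one element from an array of 32-bit words.
// The address is values + index * 4, which is what sh2add computes.
int32_t element_at(const int32_t *values, uint32_t index) {
    return values[index];
}
```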
ProTip: Another handy option is -fno-inline, which prevents functions from being inlined.
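Here's a minimal sketch to try with and without -fno-inline; both function names are made up. With -O2 alone, I'd expect the compiler to fold the body of cube into cube_plus_one. Adding -fno-inline should leave a visible call to cube, which is useful when you want to study how arguments and return addresses are handled.

```c
// Hypothetical helper that an optimising compiler will normally inline.
int cube(int n) {
    return n * n * n;
}

// Caller: compare the output with -O2 and with -O2 -fno-inline.
int cube_plus_one(int n) {
    return cube(n) + 1;
}
```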
C Types
C types are broadly architecture-dependent, and C sets a low bar for acceptable implementations. For example, int is signed and must be capable of the range −32767 to +32767.
If you’re doing anything vaguely numerical, you want to be precise with your types using stdint.h.
Types you might want to use include:
- signed: int8_t, int16_t, int32_t, int64_t
- unsigned: uint8_t, uint16_t, uint32_t, uint64_t
For example, compare 32-bit and 64-bit addition on 32-bit RISC-V:
#include <stdint.h>
int32_t add32(int32_t a, int32_t b) {
    return a + b;
}
int64_t add64(int64_t a, int64_t b) {
    return a + b;
}
Learn more about RISC-V set instructions and multi-word addition.
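RISC-V has no carry flag, so a 64-bit addition on RV32 has to work out the carry with a comparison. The sketch below spells out the idea in C. The words64 type and the function name are my own inventions for illustration, and this is not the code the compiler generates for add64, just the same technique written by hand.

```c
#include <stdint.h>

// Hypothetical type: a 64-bit value held as two 32-bit words.
typedef struct {
    uint32_t lo;   // least significant word
    uint32_t hi;   // most significant word
} words64;

// Add two 64-bit values one word at a time.
words64 add64_by_words(words64 a, words64 b) {
    words64 sum;
    sum.lo = a.lo + b.lo;              // may wrap around
    uint32_t carry = sum.lo < b.lo;    // 1 if the low word wrapped, else 0
    sum.hi = a.hi + b.hi + carry;      // add the carry into the high word
    return sum;
}
```

The unsigned comparison that produces the carry is the job of the set-less-than-unsigned instruction, sltu.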
C’s rules for type conversion can lead to unexpected results. For example, multiplying two 32-bit values can produce a product that needs 64 bits, but C only performs a 64-bit multiplication if you explicitly cast the input values to 64 bits first. Contrast the assembled code for these two functions:
#include <stdint.h>
int64_t mul64_broken(int32_t x, int32_t y) {
    return x * y; // returns 32-bit result!
}
int64_t mul64(int32_t x, int32_t y) {
    return (int64_t)x * (int64_t)y; // 64-bit cast
}
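If you want to see the difference in actual values, an unsigned variant is safer to experiment with, because signed overflow is undefined behaviour in C. The function names below are hypothetical. Multiplying 65536 by 65536 gives 0 from the first function and 4294967296 from the second. On RV32 with the M extension, I'd expect the wide version to use mul for the low word and mulhu for the high word, though you should confirm that with your own compiler.

```c
#include <stdint.h>

// The operands are multiplied as 32-bit values, so the product wraps
// around before it is widened to 64 bits.
uint64_t umul_truncated(uint32_t x, uint32_t y) {
    return x * y;
}

// Widening one operand first makes the whole multiplication 64-bit.
uint64_t umul_wide(uint32_t x, uint32_t y) {
    return (uint64_t)x * y;
}
```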
RISC-V Extensions
You can easily explore the impact of different RISC-V extensions on code generation.
In this example, let’s reverse the order of bytes in a 32-bit word:
#include <stdint.h>
int32_t endian_swap(int32_t word) {
    return ((word>>24) & 0xFF) |
           ((word<<8) & 0xFF0000) |
           ((word>>8) & 0xFF00) |
           ((word<<24) & 0xFF000000);
}
The result from Clang 18.1 with -O:
endian_swap:
    srli a1, a0, 8
    lui a2, 16
    addi a2, a2, -256
    and a1, a1, a2
    srli a3, a0, 24
    or a1, a1, a3
    and a2, a2, a0
    slli a2, a2, 8
    slli a0, a0, 24
    or a0, a0, a2
    or a0, a0, a1
    ret
If we use the extensions supported by Hazard3, the generated code is rather shorter.
Clang 18.1 with -O -mabi=ilp32 -march=rv32imac_zicsr_zifencei_zba_zbb_zbkb_zbs:
endian_swap:
    rev8 a0, a0
    ret
The Zbb (basic bit-manipulation) extension includes the rev8 instruction, which reverses the bytes in a word, swapping big endian to little endian and vice-versa.
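You don't have to rely on the compiler recognising a pattern. GCC and Clang provide builtins such as __builtin_bswap32 and __builtin_popcount that state your intent directly. The wrapper names below are my own. With Zbb enabled, I'd expect rev8 for the byte swap and cpop for the bit count; without Zbb, the compiler has to fall back to a longer sequence or a library call. This is a sketch to experiment with, not a statement of what any particular compiler version emits.

```c
#include <stdint.h>

// Hypothetical wrapper: reverse the byte order of a 32-bit word.
uint32_t swap_bytes(uint32_t word) {
    return __builtin_bswap32(word);
}

// Hypothetical wrapper: count the bits set to 1 in a 32-bit word.
int count_set_bits(uint32_t word) {
    return __builtin_popcount(word);
}
```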
Comparing Architectures
Compiler Explorer is a great way to compare and contrast architectures and instruction sets. Let’s look at a couple of simple functions compiled for different architectures.
#include <stdint.h>
int64_t add64(int64_t a, int64_t b) {
    return a + b;
}
int32_t greater(int32_t a, int32_t b) {
    if (a > b) return 1;
    else return 0;
}
32-bit RISC-V (Clang 18.1 with -O):
add64:
    add a1, a1, a3
    add a0, a0, a2
    sltu a2, a0, a2
    add a1, a1, a2
    ret
greater:
    slt a0, a1, a0
    ret
armv7 (Clang 18.1 with -O):
add64:
    adds r0, r2, r0
    adc r1, r3, r1
    bx lr
greater:
    mov r2, #0
    cmp r0, r1
    movgt r2, #1
    mov r0, r2
    bx lr
Or how about the Intel 486 (GCC 14.2 with -O -m32 -march=i486):
add64:
    mov eax, DWORD PTR [esp+12]
    mov edx, DWORD PTR [esp+16]
    add eax, DWORD PTR [esp+4]
    adc edx, DWORD PTR [esp+8]
    ret
greater:
    mov eax, DWORD PTR [esp+8]
    cmp DWORD PTR [esp+4], eax
    setg al
    and eax, 255
    ret
You can pass all the usual compiler options to select specific architectures. Refer to your chosen compiler documentation for details.
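A couple of extra comparison functions are worth pasting in when you compare architectures. The names are hypothetical. On RV32, I'd expect the unsigned version to use sltu instead of slt, and the less-than-or-equal version to need a second instruction (such as xori) to invert the result. Flag-based architectures like Arm and x86 just pick a different condition code. Check the details against your chosen compilers.

```c
#include <stdint.h>

// Unsigned comparison: 0xFFFFFFFF is a large number here, not -1.
int32_t greater_unsigned(uint32_t a, uint32_t b) {
    return a > b;
}

// a <= b is the same as !(b < a), which costs RISC-V an extra instruction.
int32_t less_or_equal(int32_t a, int32_t b) {
    return a <= b;
}
```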
Compiler Explorer supports an impressive collection of instruction sets: 6502, aarch64, amd64 (inc. i386), arm32, avr, c6x, ebpf, kvx, loongarch, m68k, mips, mrisc32, msp430, powerpc, riscv32, riscv64, s390x, sh (SuperH), sparc, vax, wasm32, and xtensa! I lost a fair few hours learning about some of the more obscure of them.
What’s Next?
Check out the RISC-V Assembler Cheat Sheet and my FPGA & RISC-V Tutorials.
Share your thoughts with me on Mastodon or X. If you enjoy my work, please sponsor me. Sponsors help me create new projects for everyone, and they get early access to blog posts and source code. 🙏