# RISC-V Assembler: Branch Set

This post looks at RISC-V branch and set instructions, such as **beq**, **bltu**, **bgez**, and **slt**. RISC-V takes a different approach to branching, even compared to other RISC processors. We’ll also cover the zero register, program counter, condition codes, and multi-word addition. Branch and set instructions are included in **RV32I**, the base integer instruction set. New to the series? Check out the first part on arithmetic instructions.

In the last few years, we’ve seen an explosion of RISC-V CPU designs, especially on FPGA. Thankfully, RISC-V is ideal for assembly programming with its compact, easy-to-learn instruction set. This series will help you learn and understand 32-bit RISC-V instructions (RV32) and the RISC-V ABI.

Share your thoughts with @WillFlux on Mastodon or Twitter. If you like what I do, sponsor me. 🙏

## Branch

Conditional branches control the flow of execution in a program. A conditional branch jumps to another program address if a condition is true. In high-level programming languages, this can take the form of a for loop, if-then-else, or switch statement.

RISC-V is unusual because branch instructions include the comparison and branch destination in one instruction. This makes branching simple, but it has trade-offs, which we will consider later.

Let’s start by looking at the six regular branch instructions:

```
beq # equal
bne # not equal
blt # less than
bgt # greater than
ble # less than or equal to
bge # greater than or equal to
```

Branch instructions have a consistent format and always **compare two registers**:

```
branch rs1, rs2, offset
```

Where **rs1** and **rs2** are the registers to compare, and **offset** is the *program counter* offset. The **program counter** (PC) points to the *next* instruction to execute. We’ll discuss the PC in more detail later in this post.

These instructions use **signed comparisons**: a register with the contents `0xFFFFFFFF` is treated as `-1`. We’ll cover unsigned comparisons in the next section.

Branch offsets are signed 12-bit immediates in units of **two bytes**. Standard RISC-V instructions are four bytes long, so why two-byte units? The optional compressed instructions are only two bytes long, so a branch target can sit on any two-byte boundary.

Offsets are sign extended, so you can easily branch backwards in your code. With 12-bit offsets in units of two bytes, branch instructions have a range of ±4 KiB.

In practice, you never write an offset directly; you use a **label** instead.

For example, we can create a wait loop with **bne**:

```
    li   t0, 1000            # time to wait
.L_timer:                    # local label
    lw   t1, TIMER_WAIT(t6)  # load hardware timer into t1
    bne  t0, t1, .L_timer    # branch (loop) if t1 isn't equal to t0
```

*ProTip: .L_name is a common naming convention for local labels.*
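The timer loop's label is only a few bytes away, but a label can end up outside the ±4 KiB branch range. One common workaround is to invert the condition and hop over an unconditional jump, which reaches much further (±1 MiB for **j**). A minimal sketch, assuming `far_target` is a hypothetical label a long way off in the same program:

```
    # want: beq a0, a1, far_target   (but far_target is out of branch range)
    bne  a0, a1, .L_skip     # inverted condition: skip the jump if not equal
    j    far_target          # unconditional jump with a longer range
.L_skip:
    ret                      # reached when a0 != a1

    # ... imagine more than 4 KiB of code here ...

far_target:                  # hypothetical distant label
    ret
```

Jumps are the subject of the next post; here **j** simply stands in for "go there, no condition".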

Some of these branch instructions are pseudoinstructions, but as a programmer, this doesn’t matter: they always assemble to one instruction. For example, **bgt** (greater than) assembles to **blt** (less than) with the operands swapped. Use whichever branch instruction you prefer, and let the assembler worry about the underlying instruction.
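To make the operand swap concrete, here is a sketch based on the standard pseudoinstruction definitions. Each pseudoinstruction is followed by the real instruction the assembler should emit for it; `.L_target` is just a placeholder label:

```
.L_target:
    bgt  t0, t1, .L_target   # pseudoinstruction: branch if t0 > t1
    blt  t1, t0, .L_target   # real instruction:  branch if t1 < t0

    ble  t0, t1, .L_target   # pseudoinstruction: branch if t0 <= t1
    bge  t1, t0, .L_target   # real instruction:  branch if t1 >= t0
```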

## Unsigned Branching

If your numbers are unsigned, you add a “u” to the end of the instruction name:

```
bltu # less than unsigned
bgtu # greater than unsigned
bleu # less than or equal to unsigned
bgeu # greater than or equal to unsigned
```

Equal and not equal aren’t affected by sign, so there aren’t unsigned versions of them.

With unsigned comparison, a register with the contents `0xFFFFFFFF` is treated as `4294967295`.
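The unsigned view is handy for range checks. A negative index looks like a huge unsigned number, so a single **bgeu** can reject both negative and too-large values. A minimal sketch, assuming a hypothetical helper `in_range` that takes a signed index in `a0` and a length in `a1` (with the length below 2^31):

```
# in_range: returns a0 = 1 if 0 <= index < length, otherwise a0 = 0
#   a0: index (may be negative)
#   a1: length (assumed to be less than 2^31)
in_range:
    bgeu a0, a1, .L_out_of_range  # taken if index >= length, or if index is negative
    li   a0, 1                    # index is within range
    ret
.L_out_of_range:
    li   a0, 0                    # index is out of range
    ret
```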

## Branching with Zero

You often want to compare a register to zero, for example, to check for the end of a loop or null-terminated string. RISC-V provides a set of handy pseudoinstructions for this:

```
beqz # equal to zero
bnez # not equal to zero
bltz # less than zero
bgtz # greater than zero
blez # less than or equal to zero
bgez # greater than or equal to zero
```

They’re the same as the regular branch instructions with a “z” at the end of the instruction name.

With these instructions, you only specify a single register because the second register is **x0**:

```
beqz rs1, offset
```
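For reference, this is how the zero pseudoinstructions map onto the real branch instructions, going by the standard pseudoinstruction definitions. The register `t0` and label `.L_done` are only placeholders:

```
    beqz t0, .L_done    # beq t0, x0, .L_done
    bnez t0, .L_done    # bne t0, x0, .L_done
    bltz t0, .L_done    # blt t0, x0, .L_done
    bgez t0, .L_done    # bge t0, x0, .L_done
    blez t0, .L_done    # bge x0, t0, .L_done   (0 >= t0)
    bgtz t0, .L_done    # blt x0, t0, .L_done   (0 < t0)
.L_done:
```

Note that **blez** and **bgtz** put **x0** first, the same operand swap that turns **bgt** into **blt**.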

For example, you could implement the absolute function with **bgez**:

```
abs:
    bgez a0, .L_abs_end  # branch to .L_abs_end if a0 is greater or equal to zero
    neg  a0, a0          # make negative a0 value positive
.L_abs_end:
    ret                  # return from function (a0 holds the return value)
```

We cover the **neg** instruction under subtraction. In the next post, we’ll examine functions in detail.
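The null-terminated string case mentioned earlier can be sketched the same way. This is an illustrative `strlen`, not code from the series; it assumes `a0` points to a valid string and uses **lbu** (load byte unsigned) and **j** (jump), which are covered in other posts:

```
# strlen: length of a null-terminated string
#   a0: pointer to string
# return:
#   a0: number of bytes before the terminating zero
strlen:
    mv   t0, a0              # remember the start of the string
.L_strlen_loop:
    lbu  t1, 0(a0)           # load the next byte
    beqz t1, .L_strlen_end   # stop at the zero terminator
    addi a0, a0, 1           # advance to the next byte
    j    .L_strlen_loop
.L_strlen_end:
    sub  a0, a0, t0          # length = end pointer - start pointer
    ret
```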

### The Power of Zero

RISC-V dedicates the register **x0** to zero. At first glance, this appears wasteful, but zero is used in many places and having it permanently available simplifies the instruction set. Other architectures, such as MIPS and ARM64, have a zero register, and mainframe computers, such as the CDC 6600 and IBM System/360, used a zero register in the 1960s!

As we’ve seen, many branch pseudoinstructions use the zero register, but you’ll find the zero register used across RISC-V.
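A few everyday pseudoinstructions show how far the zero register reaches. The expansions in the comments follow the standard pseudoinstruction definitions; the registers and the `.L_next` label are arbitrary choices for the sketch:

```
    nop                 # addi x0, x0, 0     (result thrown away)
    li   t0, 0          # addi t0, x0, 0     (0 + 0)
    neg  a0, a0         # sub  a0, x0, a0    (0 - a0)
    j    .L_next        # jal  x0, .L_next   (return address thrown away)
.L_next:
    snez t1, a0         # sltu t1, x0, a0    (0 < a0 unsigned, so a0 != 0)
    ret                 # jalr x0, 0(ra)     (return address thrown away)
```

Writes to **x0** are discarded and reads always return zero, so one register covers both "constant zero" and "don't care".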

## Branch Instruction Summary

The following table summarises all 16 RISC-V branch (pseudo)instructions:

Comparison | Registers | Signed | Unsigned | Zero |
---|---|---|---|---|
equal (eq) | rs1 = rs2 | beq | beq | beqz |
not equal (ne) | rs1 ≠ rs2 | bne | bne | bnez |
less (lt) | rs1 < rs2 | blt | bltu | bltz |
greater (gt) | rs1 > rs2 | bgt | bgtu | bgtz |
less or equal (le) | rs1 ≤ rs2 | ble | bleu | blez |
greater or equal (ge) | rs1 ≥ rs2 | bge | bgeu | bgez |

*NB. equal and not equal are the same for signed and unsigned comparisons.*

## Program Counter

Branch offsets are relative to the **program counter** (PC). The program counter points to the *next* instruction the CPU will execute. Usually, the CPU adds 4 to the PC when executing an instruction: addresses are in bytes, and each instruction is 4 bytes long. When you take a branch, the CPU instead sets the PC to the branch target: the address of the branch instruction plus the offset.
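A small worked example may help. The addresses in the comments are made up for illustration; the point is that the offset is measured from the address of the branch instruction itself:

```
    beq  a0, a1, .L_same   # at 0x1000: offset = 0x100C - 0x1000 = +12 bytes
    addi a0, a0, 1         # at 0x1004: runs only if the branch is not taken
    addi a1, a1, 1         # at 0x1008
.L_same:
    ret                    # at 0x100C: the branch target
```

If the branch is not taken, the PC moves from 0x1000 to 0x1004 as usual. If it is taken, the PC becomes 0x1000 + 12 = 0x100C.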

Learn more about memory addresses, alignment, and addressing modes.

*ProTip: In x86 land, the PC is known as the instruction pointer (IP), a much more descriptive name.*

### auipc

RISC-V includes an instruction to help with position independent code: **auipc** (add upper immediate to PC). **auipc** works like **lui** (load upper immediate), but it adds the 20-bit immediate, shifted left by 12 bits, to the program counter and writes the result to the destination register. The PC itself isn’t changed.

```
auipc rd, imm # rd = pc + (imm << 12)
```

With **auipc**, you can use PC-relative addressing to reach a symbol anywhere in 32-bit memory space. For example, combine **auipc** and load instructions to load data from a distant memory location.
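Here is a sketch of that pairing using the GNU assembler's `%pcrel_hi` and `%pcrel_lo` relocation operators. The symbol `counter` and the function `load_counter` are hypothetical names for this example:

```
    .data
counter:
    .word 42                   # hypothetical data word

    .text
# load_counter: returns the word stored at counter in a0
load_counter:
.L_pcrel_counter:
    auipc t0, %pcrel_hi(counter)                # t0 = pc + upper part of the distance to counter
    lw    a0, %pcrel_lo(.L_pcrel_counter)(t0)   # add the lower 12 bits and load the word
    ret
```

Note that `%pcrel_lo` takes the label of the **auipc** instruction rather than the symbol, because the low part depends on where the **auipc** sits. In practice the GNU assembler's **lla** pseudoinstruction produces a similar auipc/addi pair for you.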

The program counter is not one of the general-purpose registers, so you can’t access it directly. However, you can copy the PC using **auipc** with an immediate of zero:

```
auipc t0, 0 # copy program counter into register t0
```

## Set

Earlier, we noted that RISC-V handles comparison and branching in a single instruction. This worked well for branching, but you don’t always want to compare then branch.

Most CPUs use condition codes or status flags such as zero, carry, and overflow. These condition codes can be used for branching, but also for arithmetic and general comparisons.

RISC-V doesn’t have condition codes, but the **set** instructions can handle many of the same situations, such as checking for zero, carry, or overflow. The set instructions compare two registers or a register to an immediate and write **1** to the destination register if the comparison is true and **0** otherwise.

There are only four set instructions, all variants of **set less than**:

```
slt rd, rs1, rs2 # set less than: rd = rs1 < rs2
sltu rd, rs1, rs2 # set less than unsigned: rd = rs1 < rs2 (unsigned)
slti rd, rs1, imm # set less than immediate: rd = rs1 < imm
sltiu rd, rs1, imm # set less than immediate unsigned: rd = rs1 < imm (imm is sign-extended, then compared unsigned)
```

These immediates are 12-bit sign-extended values that can represent -2048 to 2047 inclusive. See arithmetic sign extension for further details.

There aren’t standard pseudoinstructions for “set greater than”, which seems like an oversight. Recent versions of GCC do allow **sgt**, but this isn’t supported by other assemblers.
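In the meantime, you can get "set greater than" from the existing instructions by swapping the operands, just as **bgt** is built from **blt**. A minimal sketch with arbitrary registers:

```
    slt  t0, a1, a0    # t0 = 1 if a0 > a1 (signed), because a1 < a0
    sltu t1, a1, a0    # t1 = 1 if a0 > a1 (unsigned)
```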

Register-register set examples:

```
li t0, 2 # t0 = 2
li t1, -2 # t1 = -2
li t2, 42 # t2 = 42
slt t3, t0, t2 # t3 = 1 because 2 < 42
sltu t4, t0, t2 # t4 = 1 because 2 < 42
slt t5, t1, t2 # t5 = 1 because -2 < 42
sltu t6, t1, t2 # t6 = 0 because 4294967294 > 42
```

Note how treating **t1** as unsigned produces a completely different result! Negative numbers are stored using two’s complement, so -2 is 0xFFFFFFFE. Treating 0xFFFFFFFE as unsigned, we get 2^32 - 2, or 4,294,967,294.

Register-immediate set examples:

```
li t0, 2 # t0 = 2
li t1, -2 # t1 = -2
slti t3, t0, -1 # t3 = 0 because 2 > -1
sltiu t4, t0, -1 # t4 = 1 because 2 < 4294967295
slti t5, t1, -1 # t5 = 1 because -2 < -1
sltiu t6, t1, -1 # t6 = 1 because 4294967294 < 4294967295
```
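Set instructions can also stand in for an overflow flag. The sketch below follows the signed overflow check pattern suggested in the RISC-V specification; the function name `add_checked` and the choice to return the overflow bit in `a1` are assumptions for illustration:

```
# add_checked: signed 32-bit addition with overflow detection
#   a0: x
#   a1: y
# return:
#   a0: x + y (wrapped to 32 bits)
#   a1: 1 if the addition overflowed, otherwise 0
add_checked:
    add  t0, a0, a1    # t0 = x + y
    slti t1, a1, 0     # t1 = 1 if y is negative
    slt  t2, t0, a0    # t2 = 1 if the sum is less than x
    xor  a1, t1, t2    # the sum should be less than x exactly when y < 0
    mv   a0, t0
    ret
```

Adding a negative number should make the result smaller, and adding a non-negative number should not. When those two facts disagree, the addition overflowed.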

### Set Less Than or Equal To

For “less than or equal to” you need two instructions.

For example, to set if **a0** is less than or equal to **a1**:

```
slt t0, a1, a0
xori t0, t0, 1
```

We check if **a1** is less than **a0**, then invert the set bit with `xori` because:

`(a0 <= a1) == !(a1 < a0)`

Learn more about xor and xori (exclusive OR) in my post on logical instructions.
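The same trick gives the other missing comparisons. A short sketch with arbitrary registers, showing "greater than or equal to" and "equal":

```
    # t0 = (a0 >= a1), signed
    slt   t0, a0, a1    # t0 = 1 if a0 < a1
    xori  t0, t0, 1     # invert: t0 = 1 if a0 >= a1

    # t1 = (a0 == a1)
    xor   t1, a0, a1    # t1 = 0 only when a0 and a1 are identical
    sltiu t1, t1, 1     # t1 = 1 if t1 was 0 (only 0 is unsigned-less-than 1)
```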

### Set Zero

RISC-V provides pseudoinstructions for comparing with zero:

```
seqz rd, rs # set equal zero: rd = rs == 0
snez rd, rs # set not equal zero: rd = rs != 0
sltz rd, rs # set less than zero: rd = rs < 0
sgtz rd, rs # set greater than zero: rd = rs > 0
```

Examples of set zero comparisons:

```
li t0, -2 # t0 = -2
seqz t3, t0 # t3 = 0 because -2 != 0
snez t4, t0 # t4 = 1 because -2 != 0
sltz t5, t0 # t5 = 1 because -2 < 0
sgtz t6, t0 # t6 = 0 because -2 is not greater than 0
```
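Like the zero branches, these are thin wrappers around the four real set instructions. The comments show the standard expansions, using the same registers as the example above:

```
    seqz t3, t0    # sltiu t3, t0, 1    (only 0 is unsigned-less-than 1)
    snez t4, t0    # sltu  t4, x0, t0   (0 < t0 unsigned means t0 != 0)
    sltz t5, t0    # slt   t5, t0, x0   (t0 < 0)
    sgtz t6, t0    # slt   t6, x0, t0   (0 < t0)
```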

### Multi-Word Addition

You can use a set instruction to carry out multi-word addition in place of a carry flag. For example, to add 64-bit integers on 32-bit RISC-V:

```
# 64-bit integer addition
# arguments:
#   a0: x lower 32 bits
#   a1: x upper 32 bits
#   a2: y lower 32 bits
#   a3: y upper 32 bits
# return:
#   a0: x+y lower 32 bits
#   a1: x+y upper 32 bits
#
add64:
    add  a0, a0, a2  # add lower 32 bits
    add  t0, a1, a3  # add upper 32 bits
    sltu t1, a0, a2  # if lower 32-bit sum < a2 then set t1=1 (carry bit)
    add  a1, t0, t1  # upper 32 bits of answer (upper sum + carry bit)
    ret
```

If the sum of the lower 32 bits is less than a2, then we need to carry a bit. The **sltu** instruction tests for this and sets `t1=1` when it occurs. We add `t1` to the sum of the upper 32 bits to create the correct answer.
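Subtraction works the same way, with a borrow in place of a carry. This `sub64` is a sketch mirroring the addition routine, not code from the original series. The borrow has to be computed before `a0` is overwritten:

```
# 64-bit integer subtraction (same register layout as the addition routine)
# return:
#   a0: x-y lower 32 bits
#   a1: x-y upper 32 bits
sub64:
    sltu t1, a0, a2  # borrow: t1=1 if x lower < y lower (unsigned)
    sub  a0, a0, a2  # subtract lower 32 bits
    sub  a1, a1, a3  # subtract upper 32 bits
    sub  a1, a1, t1  # subtract the borrow bit
    ret
```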

32-bit CPUs with a carry flag can add 64 bits in just two instructions. For example:

```
# multi-word addition on arm
add64_arm:
    adds r0, r0, r2  # add and set flags (including carry)
    adc  r1, r1, r3  # add with carry
    bx   lr
```

At first glance, RISC-V’s approach seems inferior. However, avoiding condition codes simplifies hardware design, especially on modern out-of-order CPUs. Some RISC-V CPUs can also fuse multiple instructions into one internally, so the set instruction in a multi-word add doesn’t necessarily increase the number of instructions executed.

CPU design is a trade-off. While a load-store architecture and zero register are almost universally admired, it’s safe to say not everyone appreciates RISC-V’s lack of condition codes.

To learn more about RISC-V addition and subtraction, see the post on arithmetic.

## What’s Next?

We’ve almost completed our tour of the RISC-V base instruction set. The next instalment of *RISC-V Assembler* is all about jumping, functions, and the ABI (coming soon). In the meantime, read my posts on RISC-V Arithmetic, Logical, Shift, and Load Store.

Check out my FPGA & RISC-V Tutorials and my series on early Macintosh History.

### References

- RISC-V Technical Specifications (riscv.org)