Welcome back to *Exploring FPGA Graphics*. It’s time to turn our attention to drawing. Most modern computer graphics come down to drawing triangles and colouring them in. So, it seems fitting to begin our tour of drawing with triangles and the straight lines that form them. This post will implement Bresenham’s line algorithm in Verilog, creating lines, triangles, and even a cube (our first sort-of-3D graphic).

In this series, we explore graphics at the hardware level and get a feel for the power of FPGAs. We’ll learn how displays work, race the beam with Pong, animate starfields and sprites, paint Michelangelo’s David, simulate life with bitmaps, draw lines and shapes, and finally render simple 3D models. New to the series? Start with FPGA Graphics.

**You can watch an FPGA Graphics demo reel with designs from across this series.**

*Updated 2021-08-02. Get in touch with @WillFlux or open an issue on GitHub.*

### Series Outline

- FPGA Graphics - learn how displays work and draw your first graphics
- Pong - race the beam to create the arcade classic
- Hardware Sprites - fast, colourful graphics with minimal resources
- Ad Astra - graphics demo with starfields and hardware sprites
- Framebuffers - driving the display from a bitmap in memory
- Life on Screen - the screen comes alive with Conway’s Game of Life
- Lines and Triangles (this post) - drawing lines and triangles with a framebuffer
- 2D Shapes - filling shapes and drawing pictures
- Animated Shapes - animating shapes and double-buffering (coming soon)

### Requirements

For this series, you need an FPGA board with video output. We’ll be working at 640x480, so pretty much any video output will do. It helps to be comfortable with programming your FPGA board and reasonably familiar with Verilog.

We’ll be demoing with these boards:

- **iCEBreaker** (Lattice iCE40) with **12-Bit DVI Pmod**
- **Digilent Arty A7-35T** (Xilinx Artix-7) with **Pmod VGA**

### Source

The SystemVerilog designs featured in this series are available from the projf-explore git repo under the open-source MIT licence: build on them to your heart’s content. The rest of the blog content is subject to standard copyright restrictions: don’t republish it without permission.

## An Address for Every Pixel

Let’s start by reminding ourselves how a framebuffer works. A framebuffer memory location backs every pixel on the screen. To update a pixel, we convert its coordinates into a memory address and write the colour to that address.

For this post, we’ll be using a 16:9 framebuffer with a resolution of **320 x 180**:

- Works well in 16:9 and 4:3
- Scale up 4x for 720p and 6x for 1080p
- Scale up 2x for 640x480 with letterbox
- 320x180 with 16 colours fits in 256 Kb (32 KiB)
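
The scaling claims in this list are easy to check with a little arithmetic. The sketch below is only an illustration in Python (the function names are made up for this example); it assumes the scaled image is centred vertically, which matches the `sy >= 60 && sy < 420` letterbox test used in the top module later in this post.

```python
FB_WIDTH, FB_HEIGHT = 320, 180  # framebuffer resolution used in this post


def scaled_size(scale):
    """Size of the framebuffer on screen after integer scaling."""
    return FB_WIDTH * scale, FB_HEIGHT * scale


def letterbox_rows(display_height, scale):
    """First row and one-past-last row covered by the scaled framebuffer,
    assuming the image is centred vertically on the display."""
    used = FB_HEIGHT * scale
    top = (display_height - used) // 2
    return top, top + used
```

For a 480-line display at 2x scale, `letterbox_rows(480, 2)` gives rows 60 to 419 for the image, leaving 60 blank lines above and below.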

Our framebuffer module takes care of turning coordinates into memory addresses for us. We supply the colour and the (x,y) position of the pixel, and the framebuffer module does the rest. Take a look at the post on Framebuffers if you need a reminder on how this works.

Screen Coordinates

Our coordinate system has the origin `(0,0)` at the top-left of the screen, and the Y-coordinate increases *down* the screen. Many 3D systems, such as OpenGL, have the origin at the bottom-left, and the Y-coordinate increases *up* the screen.
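
To make the coordinate-to-address step concrete, here is a minimal Python sketch. It assumes a simple row-major layout (one address per pixel, rows stored one after another from the top of the screen); the framebuffer module in this series may differ in detail, and `pixel_address` is a name invented for this example.

```python
FB_WIDTH, FB_HEIGHT = 320, 180


def pixel_address(x, y, width=FB_WIDTH, height=FB_HEIGHT):
    """Linear address of pixel (x, y), or None if it is off-screen.

    Assumes row-major order with (0, 0) at the top-left.
    """
    if not (0 <= x < width and 0 <= y < height):
        return None
    return y * width + x
```

With this layout, moving one pixel right adds 1 to the address and moving one pixel down adds the framebuffer width.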

## Many Colours?

In our previous posts, we loaded an image and picked a palette to match. Now that we’re drawing, we want the freedom to choose from a wide range of colours. However, we also want to leave enough memory for double-buffering when we start animating, so we’ll go for 16 colours.

### Framebuffer Memory

A single framebuffer requires: `4 * 320 * 180 = 230,400 bits (225 Kb)`

225 Kb uses 8 of the 50 BRAMs on the Arty, but what about iCEBreaker?

The iCE40UP5K FPGA has 120 Kb of BRAM, but it also includes 1 Mb of single-port memory: **SPRAM**. The SPRAM is organised as four 256 Kb blocks and supports 4-bit writes, so it’s ideal for our purposes. Learn more from SPRAM on iCE40 FPGA.
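
The memory arithmetic can be written out as a short Python check. The sizes come from the figures quoted above; the names are only for illustration.

```python
def framebuffer_bits(width, height, colours):
    """Bits needed for one framebuffer with the given number of colours."""
    bits_per_pixel = (colours - 1).bit_length()  # 16 colours -> 4 bits
    return width * height * bits_per_pixel


FB_BITS = framebuffer_bits(320, 180, 16)
ICE40_BRAM_BITS = 120 * 1024   # iCE40UP5K block RAM (120 Kb)
SPRAM_BLOCK_BITS = 256 * 1024  # one of the four SPRAM blocks (256 Kb)

fits_in_bram = FB_BITS <= ICE40_BRAM_BITS          # False: too big for BRAM
fits_in_spram_block = FB_BITS <= SPRAM_BLOCK_BITS  # True: one block is enough
```

One SPRAM block per framebuffer leaves room for a second buffer when we come to double-buffering.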

### 16 Colour Palette

For the 16 colour palette, I’ve chosen the PICO-8 palette (adjusted for 12-bit output):

We load the 16 colours into the colour lookup table (CLUT) ROM using a file: **[16_colr_4bit_palette.mem]**.

```
000 // 0 - black
235 // 1 - dark-blue
825 // 2 - dark-purple
085 // 3 - dark-green
B53 // 4 - brown
655 // 5 - dark-grey
CCC // 6 - light-grey
FFF // 7 - white
F05 // 8 - red
FA0 // 9 - orange
FF2 // A - yellow
0E3 // B - green
3BF // C - blue
87A // D - indigo
F7B // E - pink
FCA // F - peach
```

It’s easy to create your own palette: the three hex digits represent red, green, and blue intensity. I recommend leaving the first entry as black (`000`).
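
If you want to inspect or generate a palette file on your computer, the format is simple to handle. The following Python sketch is an illustration rather than part of the designs: it splits each entry into 4-bit red, green, and blue values, and `to_24bit` shows one common way (nibble repetition) to preview the colour as 8 bits per channel.

```python
def parse_palette(text):
    """Parse lines such as 'FA0 // 9 - orange' into (r, g, b) tuples (0-15 each)."""
    colours = []
    for line in text.splitlines():
        entry = line.split("//")[0].strip()  # drop the trailing comment
        if not entry:
            continue
        value = int(entry, 16)
        colours.append(((value >> 8) & 0xF, (value >> 4) & 0xF, value & 0xF))
    return colours


def to_24bit(rgb):
    """Expand 4-bit channels to 8 bits by repeating the nibble (0xF -> 0xFF)."""
    return tuple(channel * 17 for channel in rgb)
```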

## From Point to Line

We can draw a point by writing to a single memory address, but we want to draw a line *between* two points. Bresenham’s line algorithm is the definitive way to do this, and The Beauty of Bresenham’s Algorithm has just what we need: a clearly written version of the algorithm using integers.

Here’s the C design:

```
void plotLine(int x0, int y0, int x1, int y1)
{
    int dx =  abs(x1-x0), sx = x0<x1 ? 1 : -1;
    int dy = -abs(y1-y0), sy = y0<y1 ? 1 : -1;
    int err = dx+dy, e2;  /* error value e_xy */

    for(;;){  /* loop */
        setPixel(x0,y0);
        if (x0==x1 && y0==y1) break;
        e2 = 2*err;
        if (e2 >= dy) { err += dy; x0 += sx; }  /* e_xy+e_x > 0 */
        if (e2 <= dx) { err += dx; y0 += sy; }  /* e_xy+e_y < 0 */
    }
}
```
```

For the hows and whys, read A Rasterizing Algorithm for Drawing Curves (PDF). Kudos to Alois Zingl.
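
Before moving to hardware, it helps to have a software reference you can run and compare against. Below is a direct Python port of the C above; instead of calling `setPixel`, it returns the pixels in drawing order. It is a sketch for experimenting on your computer, not part of the FPGA design.

```python
def plot_line(x0, y0, x1, y1):
    """Return the pixels of the line from (x0, y0) to (x1, y1), in drawing order."""
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    pixels = []
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:  # step in x
            err += dy
            x0 += sx
        if e2 <= dx:  # step in y
            err += dx
            y0 += sy
    return pixels
```

For example, `plot_line(0, 0, 4, 2)` visits `(0,0)`, `(1,1)`, `(2,1)`, `(3,2)`, and `(4,2)`.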

## From C to Verilog

There are two stages to the algorithm: setting the initial values and running the algorithm in the loop.

As initial values, we need the difference between the start and end coordinates and the sign and absolute value of that difference. Your first thought might be to mess around with two’s complement to determine `abs(x1-x0)`, but we can use a little combinational logic, remembering to use `logic signed` as needed:

```
logic signed [CORDW:0] dx, dy;  // a bit wider as signed
logic right, down;  // drawing direction
always_comb begin
    right = (x0 < x1);
    down  = (y0 < y1);
    dx = right ? x1 - x0 : x0 - x1;  // dx = abs(x1 - x0)
    dy = down  ? y0 - y1 : y1 - y0;  // dy = -abs(y1 - y0)
end
```

*NB. The sign of dy is different from dx; check the C version of the algorithm to see what I mean.*

### Going Loopy

Next, we could quickly bash out an `always_ff` block to cover the loop. But this isn’t software; there’s a trap lurking to catch the unwary.

Rewriting the C in Verilog, we could end up with the following (dubious) logic:

```
always_ff @(posedge clk) begin
    // ...
    if (e2 >= dy) begin
        x <= (right) ? x + 1 : x - 1;
        err <= err + dy;
    end
    if (e2 <= dx) begin
        y <= (down) ? y + 1 : y - 1;
        err <= err + dx;
    end
end
```

At first glance, it looks OK, and your tools will almost certainly build it without complaint. Experienced Verilog engineers are probably rolling their eyes, but it’s worth thinking through why this won’t work.

Consider what happens if `(e2 >= dy)` and `(e2 <= dx)` are *both* true?

`x` and `y` are incremented correctly, but `err <= err + dy;` is ignored. Huh?!

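The `<=` assignment is **non-blocking**, and non-blocking assignments happen in parallel. The Verilog standard says that if a variable has multiple non-blocking assignments in the same block, **the last assignment wins**. Both assignments read the old value of `err`, so when both conditions are true, we end up with `err + dx` rather than `err + dy + dx`.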
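
One way to build intuition for this is to model a clock edge in software. In the Python sketch below (an illustration only; `buggy_tick` and the `pending` dictionary are invented for this example), every right-hand side reads the register values from before the clock edge, and the updates are applied together at the end, so a later update to the same register replaces an earlier one. For brevity, it assumes we are drawing right and down.

```python
def buggy_tick(regs, dx, dy):
    """Model one clock edge of the dubious always_ff block (drawing right and down)."""
    pending = {}  # non-blocking updates, all applied together at the end
    e2 = 2 * regs["err"]
    if e2 >= dy:
        pending["x"] = regs["x"] + 1
        pending["err"] = regs["err"] + dy  # scheduled...
    if e2 <= dx:
        pending["y"] = regs["y"] + 1
        pending["err"] = regs["err"] + dx  # ...then replaced, still using the OLD err
    return {**regs, **pending}


# Both conditions are true here: err should become 2 + (-2) + 4 = 4, but we get 6.
after = buggy_tick({"x": 0, "y": 0, "err": 2}, dx=4, dy=-2)
```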

We can’t calculate the error with just a combinational block either: the new error value depends on the previous one (we need to maintain state). Instead, we use a combinational block, with **blocking** assignment, to calculate the change in error, then add it to the previous value in a clocked `always_ff` block:

```
logic signed [CORDW:0] err, derr;
logic movx, movy;  // move in x and/or y required
always_comb begin
    movx = (2*err >= dy);
    movy = (2*err <= dx);
    derr = movx ? dy : 0;
    if (movy) derr = derr + dx;
end

always_ff @(posedge clk) begin
    // ...
    if (movx) x <= right ? x + 1 : x - 1;
    if (movy) y <= down  ? y + 1 : y - 1;
    err <= err + derr;
end
```

The two blocking assignments to `derr` happen one after the other.

Note how we’ve also eliminated the need for `e2`, replacing it with `2*err` in our comparisons.
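
To check that this split between combinational and clocked logic still follows the C algorithm, we can model it in software. The Python sketch below is an approximation for experimenting, not generated from the Verilog: `line_step` plays the role of one clock tick, and `draw` is a hypothetical driver that sets up the initial values and steps until the end point is reached.

```python
def line_step(x, y, err, dx, dy, right, down):
    """One clock tick of the line loop: returns the next (x, y, err)."""
    # always_comb: blocking assignments, evaluated in order
    movx = 2 * err >= dy
    movy = 2 * err <= dx
    derr = dy if movx else 0
    if movy:
        derr = derr + dx
    # always_ff: right-hand sides use the values from before the clock edge
    next_x = (x + 1 if right else x - 1) if movx else x
    next_y = (y + 1 if down else y - 1) if movy else y
    return next_x, next_y, err + derr


def draw(x0, y0, x1, y1):
    """Step from (x0, y0) to (x1, y1), collecting each drawing position."""
    right, down = x0 < x1, y0 < y1
    dx = x1 - x0 if right else x0 - x1  # abs(x1 - x0)
    dy = y0 - y1 if down else y1 - y0   # -abs(y1 - y0)
    x, y, err = x0, y0, dx + dy
    pixels = [(x, y)]
    while (x, y) != (x1, y1):
        x, y, err = line_step(x, y, err, dx, dy, right, down)
        pixels.append((x, y))
    return pixels
```

When both moves are required, `derr` is `dy + dx`, so the error is updated correctly in a single tick.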

Our first attempt at a line drawing module:

```
module draw_line #(parameter CORDW=10) (  // framebuffer coord width in bits
    input  wire logic clk,    // clock
    input  wire logic rst,    // reset
    input  wire logic start,  // start line drawing
    input  wire logic signed [CORDW-1:0] x0,  // point 0 - horizontal position
    input  wire logic signed [CORDW-1:0] y0,  // point 0 - vertical position
    input  wire logic signed [CORDW-1:0] x1,  // point 1 - horizontal position
    input  wire logic signed [CORDW-1:0] y1,  // point 1 - vertical position
    output      logic signed [CORDW-1:0] x,   // horizontal drawing position
    output      logic signed [CORDW-1:0] y,   // vertical drawing position
    output      logic drawing,  // line is drawing
    output      logic done      // line complete (high for one tick)
    );

    // line properties
    logic signed [CORDW:0] dx, dy;  // a bit wider as signed
    logic right, down;  // drawing direction
    always_comb begin
        right = (x0 < x1);
        down  = (y0 < y1);
        dx = right ? x1 - x0 : x0 - x1;  // dx = abs(x1 - x0)
        dy = down  ? y0 - y1 : y1 - y0;  // dy = -abs(y1 - y0)
    end

    // error values
    logic signed [CORDW:0] err, derr;
    logic movx, movy;  // move in x and/or y required
    always_comb begin
        movx = (2*err >= dy);
        movy = (2*err <= dx);
        derr = movx ? dy : 0;
        if (movy) derr = derr + dx;
    end

    // drawing high when in_progress
    logic in_progress;  // drawing in progress
    always_comb drawing = in_progress;

    enum {IDLE, DRAW} state;  // we're either idle or drawing
    always_ff @(posedge clk) begin
        case (state)
            DRAW: begin
                if (x == x1 && y == y1) begin
                    in_progress <= 0;
                    done <= 1;
                    state <= IDLE;
                end else begin
                    if (movx) x <= right ? x + 1 : x - 1;
                    if (movy) y <= down  ? y + 1 : y - 1;
                    err <= err + derr;
                end
            end
            default: begin  // IDLE
                done <= 0;
                if (start) begin
                    err <= dx + dy;
                    x <= x0;
                    y <= y0;
                    in_progress <= 1;
                    state <= DRAW;
                end
            end
        endcase

        if (rst) begin
            in_progress <= 0;
            done <= 0;
            state <= IDLE;
        end
    end
endmodule
```

We’ve got a good start here, but our module has a couple of significant problems we should tackle.

### Oh dear! I shall be too late!

Line drawing crops up all over the place; if it’s slow, it’ll be a significant performance bottleneck.

Our current line drawing module makes direct use of relatively complex combinational logic. For example, we use `movy` to control whether to move our drawing position vertically. `movy` depends on `dx`, which depends on `right`. All these signals are purely combinational, with nothing stored in registers (flip-flops). Unsurprisingly, my tests showed this path was the limiting factor for line drawing speed.

Our first improvement is straightforward: we register `dx` and `dy` in an `always_ff` block. Even better, because `dx` and `dy` don’t change for a given line, we only have to do this once and don’t suffer a latency penalty:

```
always_comb begin
    right = (x0 < x1);
    down  = (y0 < y1);
end

always_ff @(posedge clk) begin
    dx <= right ? x1 - x0 : x0 - x1;  // dx = abs(x1 - x0)
    dy <= down  ? y0 - y1 : y1 - y0;  // dy = -abs(y1 - y0)
end
```

We can further improve timing by removing the combinational `derr` and using `dx` and `dy` directly in the main `always_ff` block:

```
DRAW: begin
    if (oe) begin
        if (x == xb && y == yb) begin
            in_progress <= 0;
            done <= 1;
            state <= IDLE;
        end else begin
            if (movx) begin
                x <= right ? x + 1 : x - 1;
                err <= err + dy;
            end
            if (movy) begin
                y <= y + 1;  // always down
                err <= err + dx;
            end
            if (movx && movy) begin
                x <= right ? x + 1 : x - 1;
                y <= y + 1;  // always down
                err <= err + dy + dx;
            end
        end
    end
end
```

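This Verilog seems overly verbose compared to the combinational `derr`, but the timing is much better on simpler FPGAs, such as the iCE40. The final `if (movx && movy)` relies on the last-assignment-wins rule we met earlier to override the two single-axis updates. For example, the cube design we will discuss shortly improves from ~22 MHz to ~28 MHz with these changes (we need 25 MHz to meet timing). This excerpt also includes the output enable `oe`, the end point `(xb,yb)`, and an always-down drawing direction; we introduce all three in the following sections.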
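
To put those frequencies in terms of a timing budget, you can convert them to clock periods. The Python sketch below uses the rounded figures quoted above; `slack_ns` is a simplified illustration of the slack value a timing report gives you, not how the tools calculate it.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def slack_ns(required_mhz, achieved_mhz):
    """Positive slack means the slowest path fits within the required clock period."""
    return period_ns(required_mhz) - period_ns(achieved_mhz)


before = slack_ns(25, 22)  # negative: slowest path is longer than 40 ns
after = slack_ns(25, 28)   # positive: a few nanoseconds to spare
```

At 25 MHz, each clock cycle is 40 ns, so every register-to-register path must settle within that time.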

With experience, you’ll get a feel for when registering a signal makes sense. For example, back in 2020, I learnt that iCE40 subtraction takes two layers of logic, making registering the initial line values all the more valuable. Both Vivado (Arty) and nextpnr (iCEBreaker) provide timing reports to help you improve the performance of your designs.

### Breaking Symmetry

Bresenham’s line algorithm is not symmetrical: drawing from `(x0,y0)` to `(x1,y1)` is not necessarily the same as drawing from `(x1,y1)` to `(x0,y0)`.

For example, I drew the triangle (2,2) (6,2) (4,6) clockwise then anticlockwise:

Variations in rendering may not matter if you’re drawing a single shape, but what happens if we draw two shapes next to each other? We don’t want any gaps between the shapes. To ensure one unique rendering of the line `(x0,y0)` to `(x1,y1)`, we need a consistent way to order the points. I have chosen to draw *down* the screen; that is, with the y-coordinate increasing. To achieve this, we compare the y-coordinates and swap the two points if `y0` is greater than `y1`.

That leaves horizontal lines: the y-coordinate is the same for both points in this case. However, it does not matter in which direction we draw horizontal lines: Bresenham’s line algorithm visits the same pixels in both directions.

The swapping logic looks like this:

```
// line properties
logic swap;   // swap points to ensure y1 >= y0
logic right;  // drawing direction
logic signed [CORDW-1:0] xa, ya;  // start point
logic signed [CORDW-1:0] xb, yb;  // end point
logic signed [CORDW-1:0] x_end, y_end;  // register end point
always_comb begin
    swap = (y0 > y1);  // swap points if y0 is below y1
    xa = swap ? x1 : x0;
    xb = swap ? x0 : x1;
    ya = swap ? y1 : y0;
    yb = swap ? y0 : y1;
end
```
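
You can see both the asymmetry and the effect of the swap in software. The Python sketch below reuses a port of the C algorithm and adds a hypothetical `plot_line_down` wrapper that orders the points the same way as the swap logic above. It uses one edge of the example triangle, `(6,2)` to `(4,6)`.

```python
def plot_line(x0, y0, x1, y1):
    """Python port of the C plotLine: returns pixels in drawing order."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    pixels = []
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            return pixels
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy


def plot_line_down(x0, y0, x1, y1):
    """Swap the points if needed so we always draw down the screen."""
    if y0 > y1:
        x0, y0, x1, y1 = x1, y1, x0, y0
    return plot_line(x0, y0, x1, y1)


down_edge = plot_line(6, 2, 4, 6)  # (6,2) (5,3) (5,4) (4,5) (4,6)
up_edge = plot_line(4, 6, 6, 2)    # (4,6) (5,5) (5,4) (6,3) (6,2)
```

The two directions share only three of their five pixels. With `plot_line_down`, both orderings of the end points produce the same pixels.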

If we use these new combinational signals directly, our timing will suffer. To avoid this, we can register the end coordinate and drawing direction:

```
always_ff @(posedge clk) begin
    // ...
    x_end <= xb;
    y_end <= yb;
    // ...
    right <= (xa < xb);  // draw left to right?
    // ...
end
```

### Ready to Draw

We’re now ready to use our improved line drawing module **[draw_line.sv]**:

```
module draw_line #(parameter CORDW=16) (  // signed coordinate width
    input  wire logic clk,    // clock
    input  wire logic rst,    // reset
    input  wire logic start,  // start line drawing
    input  wire logic oe,     // output enable
    input  wire logic signed [CORDW-1:0] x0, y0,  // point 0
    input  wire logic signed [CORDW-1:0] x1, y1,  // point 1
    output      logic signed [CORDW-1:0] x,  y,   // drawing position
    output      logic drawing,   // line is drawing
    output      logic complete,  // line complete (remains high)
    output      logic done       // line done (high for one tick)
    );

    // line properties
    logic swap;   // swap points to ensure y1 >= y0
    logic right;  // drawing direction
    logic signed [CORDW-1:0] xa, ya;  // start point
    logic signed [CORDW-1:0] xb, yb;  // end point
    logic signed [CORDW-1:0] x_end, y_end;  // register end point
    always_comb begin
        swap = (y0 > y1);  // swap points if y0 is below y1
        xa = swap ? x1 : x0;
        xb = swap ? x0 : x1;
        ya = swap ? y1 : y0;
        yb = swap ? y0 : y1;
    end

    // error values
    logic signed [CORDW:0] err;  // a bit wider as signed
    logic signed [CORDW:0] dx, dy;
    logic movx, movy;  // horizontal/vertical move required
    always_comb begin
        movx = (2*err >= dy);
        movy = (2*err <= dx);
    end

    logic in_progress = 0;  // calculation in progress (but only output if oe)
    always_comb begin
        drawing = 0;
        if (in_progress && oe) drawing = 1;
    end

    enum {IDLE, INIT_0, INIT_1, DRAW} state;
    always_ff @(posedge clk) begin
        case (state)
            DRAW: begin
                if (oe) begin
                    if (x == x_end && y == y_end) begin
                        state <= IDLE;
                        in_progress <= 0;
                        complete <= 1;
                        done <= 1;
                    end else begin
                        if (movx) begin
                            x <= right ? x + 1 : x - 1;
                            err <= err + dy;
                        end
                        if (movy) begin
                            y <= y + 1;  // always down
                            err <= err + dx;
                        end
                        if (movx && movy) begin  // last assignment wins
                            x <= right ? x + 1 : x - 1;
                            y <= y + 1;
                            err <= err + dy + dx;
                        end
                    end
                end
            end
            INIT_0: begin
                state <= INIT_1;
                dx <= right ? xb - xa : xa - xb;  // dx = abs(xb - xa)
                dy <= ya - yb;  // dy = -abs(yb - ya)
            end
            INIT_1: begin
                state <= DRAW;
                err <= dx + dy;
                x <= xa;
                y <= ya;
                x_end <= xb;
                y_end <= yb;
                in_progress <= 1;
            end
            default: begin  // IDLE
                done <= 0;
                if (start) begin
                    state <= INIT_0;
                    right <= (xa < xb);  // draw left to right?
                    complete <= 0;
                end
            end
        endcase

        if (rst) begin
            state <= IDLE;
            in_progress <= 0;
            complete <= 0;
            done <= 0;
        end
    end
endmodule
```

The pixel to draw is output as `(x,y)`, and the line coordinates are input as `(x0,y0)` and `(x1,y1)`. A high `start` signal begins drawing, and drawing completion is marked by `complete` (remains high) and `done` (high for one tick). An output enable signal, `oe`, allows you to pause drawing, handy for multiplexing memory access or slowing down the action to make it visible.
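
If you would like to explore the `start`, `oe`, and `done` handshake without a simulator, a cycle-by-cycle software model can help. The Python sketch below is my reading of the state machine above, not a verified equivalent: each call to `tick` represents one clock edge, and `run` is a hypothetical driver that pulses `start` for one tick and records a pixel whenever `drawing` is high. The `oe_every` parameter holds `oe` low on most cycles to imitate a busy framebuffer.

```python
class DrawLine:
    """Software model of the draw_line state machine, one tick() per clock."""

    def __init__(self):
        self.state = "IDLE"
        self.x = self.y = 0
        self.x_end = self.y_end = 0
        self.dx = self.dy = self.err = 0
        self.right = False
        self.in_progress = False
        self.complete = False
        self.done = False

    def drawing(self, oe):
        """Combinational output: (x, y) is valid only while this is true."""
        return self.in_progress and oe

    def tick(self, start, oe, x0, y0, x1, y1):
        # combinational logic, using register values from before the clock edge
        swap = y0 > y1
        xa, ya = (x1, y1) if swap else (x0, y0)
        xb, yb = (x0, y0) if swap else (x1, y1)
        movx = 2 * self.err >= self.dy
        movy = 2 * self.err <= self.dx

        if self.state == "DRAW":
            if oe:
                if (self.x, self.y) == (self.x_end, self.y_end):
                    self.state = "IDLE"
                    self.in_progress = False
                    self.complete = self.done = True
                else:
                    if movx:
                        self.x += 1 if self.right else -1
                        self.err += self.dy
                    if movy:
                        self.y += 1  # always down
                        self.err += self.dx
        elif self.state == "INIT_0":
            self.state = "INIT_1"
            self.dx = xb - xa if self.right else xa - xb
            self.dy = ya - yb
        elif self.state == "INIT_1":
            self.state = "DRAW"
            self.err = self.dx + self.dy
            self.x, self.y = xa, ya
            self.x_end, self.y_end = xb, yb
            self.in_progress = True
        else:  # IDLE
            self.done = False
            if start:
                self.state = "INIT_0"
                self.right = xa < xb
                self.complete = False


def run(x0, y0, x1, y1, oe_every=1):
    """Draw one line; oe is high on every oe_every-th cycle."""
    line = DrawLine()
    line.tick(True, True, x0, y0, x1, y1)  # one-tick start pulse
    pixels, cycle = [], 0
    while not line.done:
        oe = cycle % oe_every == 0
        if line.drawing(oe):
            pixels.append((line.x, line.y))
        line.tick(False, oe, x0, y0, x1, y1)
        cycle += 1
    return pixels
```

Pausing with `oe` changes how long the line takes to draw, but not which pixels are drawn: `run(0, 0, 4, 2)` and `run(0, 0, 4, 2, oe_every=3)` return the same list.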

There’s a test bench you can use to exercise the module with Vivado: **[xc7/draw_line_tb.sv]**.

We test several sorts of lines: steep and not steep, drawn upwards, downwards, left to right, and right to left, as well as points, and the longest possible horizontal, vertical, and diagonal lines. A steep line is one in which the vertical change is larger than the horizontal.
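
When writing your own test cases, two small helpers are handy for deciding what to expect. The Python sketch below is an illustration only, and its names are not taken from the test bench. It relies on the fact that Bresenham’s algorithm advances one pixel along the longer axis at every step, so a line covers `max(|dx|, |dy|) + 1` pixels.

```python
def is_steep(x0, y0, x1, y1):
    """A steep line changes more vertically than horizontally."""
    return abs(y1 - y0) > abs(x1 - x0)


def expected_pixels(x0, y0, x1, y1):
    """Number of pixels a line should cover; a single point covers one."""
    return max(abs(x1 - x0), abs(y1 - y0)) + 1
```

For instance, a diagonal line from `(70,0)` to `(249,179)` should produce 180 pixels, while a point such as `(3,3)` to `(3,3)` produces exactly one.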

### Top of the Line

It’s time to get drawing with actual hardware.

Create a new top module and build it for your board:

- Arty (XC7): **xc7/top_line.sv**
- iCEBreaker (iCE40): **ice40/top_line.sv**

This design is similar to the top modules we used in the framebuffers post.

Building the Designs

In the Lines and Triangles section of the git repo, you’ll find the design files, a makefile for iCEBreaker, a Vivado project for Arty, and instructions for building the designs for both boards.

The iCE40 version of `top_line` with SPRAM looks like this:

```
module top_line (
    input  wire logic clk_12m,      // 12 MHz clock
    input  wire logic btn_rst,      // reset button (active high)
    output      logic dvi_clk,      // DVI pixel clock
    output      logic dvi_hsync,    // DVI horizontal sync
    output      logic dvi_vsync,    // DVI vertical sync
    output      logic dvi_de,       // DVI data enable
    output      logic [3:0] dvi_r,  // 4-bit DVI red
    output      logic [3:0] dvi_g,  // 4-bit DVI green
    output      logic [3:0] dvi_b   // 4-bit DVI blue
    );

    // generate pixel clock
    logic clk_pix;
    logic clk_locked;
    clock_gen_480p clock_pix_inst (
        .clk(clk_12m),
        .rst(btn_rst),
        .clk_pix,
        .clk_locked
    );

    // display timings
    localparam CORDW = 16;
    logic signed [CORDW-1:0] sx, sy;
    logic hsync, vsync;
    logic de, frame, line;
    display_timings_480p #(.CORDW(CORDW)) display_timings_inst (
        .clk_pix,
        .rst(!clk_locked),  // wait for pixel clock lock
        .sx,
        .sy,
        .hsync,
        .vsync,
        .de,
        .frame,
        .line
    );

    // framebuffer (FB)
    localparam FB_WIDTH   = 320;
    localparam FB_HEIGHT  = 180;
    localparam FB_CIDXW   = 4;
    localparam FB_CHANW   = 4;
    localparam FB_SCALE   = 2;
    localparam FB_IMAGE   = "";
    localparam FB_PALETTE = "../res/palette/16_colr_4bit_palette.mem";

    logic fb_we;
    logic signed [CORDW-1:0] fbx, fby;  // framebuffer coordinates
    logic [FB_CIDXW-1:0] fb_cidx;
    logic fb_busy;  // when framebuffer is busy it cannot accept writes
    logic [FB_CHANW-1:0] fb_red, fb_green, fb_blue;  // colours for display

    framebuffer_spram #(
        .WIDTH(FB_WIDTH),
        .HEIGHT(FB_HEIGHT),
        .CIDXW(FB_CIDXW),
        .CHANW(FB_CHANW),
        .SCALE(FB_SCALE),
        .F_IMAGE(FB_IMAGE),
        .F_PALETTE(FB_PALETTE)
    ) fb_inst (
        .clk_sys(clk_pix),
        .clk_pix(clk_pix),
        .rst_sys(1'b0),
        .rst_pix(1'b0),
        .de(sy >= 60 && sy < 420 && sx >= 0),  // 16:9 letterbox
        .frame,
        .line,
        .we(fb_we),
        .x(fbx),
        .y(fby),
        .cidx(fb_cidx),
        .clip(),
        .busy(fb_busy),
        .red(fb_red),
        .green(fb_green),
        .blue(fb_blue)
    );

    // draw line in framebuffer
    logic signed [CORDW-1:0] vx0, vy0, vx1, vy1;  // line coords
    logic draw_start, drawing, draw_done;  // drawing signals

    // clear FB before use (contents are not initialized)
    logic signed [CORDW-1:0] fbx_clear, fby_clear;  // framebuffer clearing coordinates
    logic clearing;  // high when we're clearing

    // draw state machine
    enum {IDLE, CLEAR, INIT, DRAW, DONE} state;
    always_ff @(posedge clk_pix) begin
        case (state)
            CLEAR: begin  // we need to initialize SPRAM values to zero
                fb_cidx <= 4'h0;  // black
                if (!fb_busy) begin
                    if (fby_clear == FB_HEIGHT-1 && fbx_clear == FB_WIDTH-1) begin
                        clearing <= 0;
                        state <= INIT;
                    end else begin  // iterate over all pixels
                        if (clearing == 1) begin
                            if (fbx_clear == FB_WIDTH-1) begin
                                fbx_clear <= 0;
                                fby_clear <= fby_clear + 1;
                            end else begin
                                fbx_clear <= fbx_clear + 1;
                            end
                        end else clearing <= 1;
                    end
                end
            end
            INIT: begin  // register coordinates and colour
                vx0 <=  70; vy0 <=   0;
                vx1 <= 249; vy1 <= 179;
                fb_cidx <= 4'h9;  // orange
                draw_start <= 1;
                state <= DRAW;
            end
            DRAW: begin
                draw_start <= 0;
                if (draw_done) state <= DONE;
            end
            DONE: state <= DONE;
            default: if (frame) state <= CLEAR;  // IDLE
        endcase

        if (!clk_locked) state <= IDLE;
    end

    logic signed [CORDW-1:0] fbx_draw, fby_draw;  // framebuffer drawing coordinates
    draw_line #(.CORDW(CORDW)) draw_line_inst (
        .clk(clk_pix),
        .rst(!clk_locked),  // must be reset for draw with Yosys
        .start(draw_start),
        .oe(!fb_busy),  // draw when FB is available
        .x0(vx0),
        .y0(vy0),
        .x1(vx1),
        .y1(vy1),
        .x(fbx_draw),
        .y(fby_draw),
        .drawing,
        .complete(),
        .done(draw_done)
    );

    // write to framebuffer when drawing or clearing
    always_ff @(posedge clk_pix) begin
        fb_we <= drawing || clearing;
        fbx <= clearing ? fbx_clear : fbx_draw;
        fby <= clearing ? fby_clear : fby_draw;
    end

    // reading from FB takes one cycle: delay display signals to match
    logic hsync_p1, vsync_p1, de_p1;
    always_ff @(posedge clk_pix) begin
        hsync_p1 <= hsync;
        vsync_p1 <= vsync;
        de_p1 <= de;
    end

    // Output DVI clock: 180° out of phase with other DVI signals
    SB_IO #(
        .PIN_TYPE(6'b010000)  // PIN_OUTPUT_DDR
    ) dvi_clk_io (
        .PACKAGE_PIN(dvi_clk),
        .OUTPUT_CLK(clk_pix),
        .D_OUT_0(1'b0),
        .D_OUT_1(1'b1)
    );

    // Output DVI signals
    SB_IO #(
        .PIN_TYPE(6'b010100)  // PIN_OUTPUT_REGISTERED
    ) dvi_signal_io [14:0] (
        .PACKAGE_PIN({dvi_hsync, dvi_vsync, dvi_de, dvi_r, dvi_g, dvi_b}),
        .OUTPUT_CLK(clk_pix),
        .D_OUT_0({hsync_p1, vsync_p1, de_p1, fb_red, fb_green, fb_blue}),
        .D_OUT_1()
    );
endmodule
```

We use a new version of the framebuffer for SPRAM: **[framebuffer_spram.sv]**.

We also have to clear the SPRAM framebuffer before drawing as SPRAM starts with random values. There’s currently a minor issue with clearing the SPRAM before drawing: one pixel remains uncleared. I’m planning to implement clearing within the SPRAM version of the framebuffer and tackle this issue then.

### That Ain’t No Cube

If we can draw one line, we can draw many! Let’s draw a cube as you’ve probably doodled on paper; this requires nine lines. To see how the drawing works, we’ve linked the drawing *output enable* to the `frame_sys` signal, which occurs once per frame. After an initial delay of 300 frames, which gives the monitor time to show the image, we draw one pixel each frame.

- Arty (XC7): **xc7/top_cube.sv**
- iCEBreaker (iCE40): **ice40/top_cube.sv**

Arty cube drawing looks like this:

```
// draw cube in framebuffer
localparam LINE_CNT=9;  // number of lines to draw
logic [3:0] line_id;    // line identifier
logic signed [CORDW-1:0] vx0, vy0, vx1, vy1;  // line coords
logic draw_start, drawing, draw_done;  // drawing signals

// draw state machine
enum {IDLE, INIT, DRAW, DONE} state;
always_ff @(posedge clk_100m) begin
    case (state)
        INIT: begin  // register coordinates and colour
            draw_start <= 1;
            state <= DRAW;
            fb_cidx <= 4'h8;  // red
            case (line_id)
                4'd0: begin
                    vx0 <= 130; vy0 <=  60; vx1 <= 230; vy1 <=  60;
                end
                4'd1: begin
                    vx0 <= 230; vy0 <=  60; vx1 <= 230; vy1 <= 160;
                end
                4'd2: begin
                    vx0 <= 230; vy0 <= 160; vx1 <= 130; vy1 <= 160;
                end
                4'd3: begin
                    vx0 <= 130; vy0 <= 160; vx1 <= 130; vy1 <=  60;
                end
                4'd4: begin
                    vx0 <= 130; vy0 <= 160; vx1 <=  90; vy1 <= 120;
                end
                4'd5: begin
                    vx0 <=  90; vy0 <= 120; vx1 <=  90; vy1 <=  20;
                end
                4'd6: begin
                    vx0 <=  90; vy0 <=  20; vx1 <= 130; vy1 <=  60;
                end
                4'd7: begin
                    vx0 <=  90; vy0 <=  20; vx1 <= 190; vy1 <=  20;
                end
                4'd8: begin
                    vx0 <= 190; vy0 <=  20; vx1 <= 230; vy1 <=  60;
                end
                default: begin  // should never occur
                    vx0 <=   0; vy0 <=   0; vx1 <=   0; vy1 <=   0;
                end
            endcase
        end
        DRAW: begin
            draw_start <= 0;
            if (draw_done) begin
                if (line_id == LINE_CNT-1) begin
                    state <= DONE;
                end else begin
                    line_id <= line_id + 1;
                    state <= INIT;
                end
            end
        end
        DONE: state <= DONE;
        default: if (frame_sys) state <= INIT;  // IDLE
    endcase
end

// control drawing speed with output enable
localparam FRAME_WAIT = 300;  // wait this many frames to start drawing
logic [$clog2(FRAME_WAIT)-1:0] cnt_frame_wait;
logic draw_req;  // draw requested
always_ff @(posedge clk_100m) begin
    if (!fb_busy) draw_req <= 0;  // disable after FB available, so 1 pix per frame
    if (frame_sys) begin  // once per frame
        if (cnt_frame_wait != FRAME_WAIT-1) begin
            cnt_frame_wait <= cnt_frame_wait + 1;
        end else draw_req <= 1;  // request drawing
    end
end

draw_line #(.CORDW(CORDW)) draw_line_inst (
    .clk(clk_100m),
    .rst(1'b0),
    .start(draw_start),
    .oe(draw_req && !fb_busy),  // draw if requested when framebuffer is available
    .x0(vx0),
    .y0(vy0),
    .x1(vx1),
    .y1(vy1),
    .x(fbx),
    .y(fby),
    .drawing,
    .complete(),
    .done(draw_done)
);
```
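
How long does the cube take to appear at one pixel per frame? The Python sketch below gives a rough estimate using the nine sets of line coordinates from the listing above. It assumes a 60 Hz display, counts `max(|dx|, |dy|) + 1` pixel writes per line, and ignores the handful of clock cycles spent starting each line, so treat the result as approximate.

```python
# (x0, y0, x1, y1) for each of the nine cube lines
CUBE_LINES = [
    (130, 60, 230, 60), (230, 60, 230, 160), (230, 160, 130, 160),
    (130, 160, 130, 60), (130, 160, 90, 120), (90, 120, 90, 20),
    (90, 20, 130, 60), (90, 20, 190, 20), (190, 20, 230, 60),
]
FRAME_RATE = 60   # frames per second (assumed)
FRAME_WAIT = 300  # frames to wait before drawing starts


def line_pixels(x0, y0, x1, y1):
    """Pixel writes needed for one line."""
    return max(abs(x1 - x0), abs(y1 - y0)) + 1


total_pixels = sum(line_pixels(*line) for line in CUBE_LINES)
total_seconds = (FRAME_WAIT + total_pixels) / FRAME_RATE
```

That comes to 729 pixel writes, or roughly 17 seconds including the initial wait. Corner pixels are written more than once because neighbouring lines share end points.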

It looks like a cube, but it’s an ersatz cube. Our cube has no real depth; it cannot move in 3D space, nor can we apply realistic lighting. We’ll cover real 3D models in a later post, but for now, let’s turn our attention to the most critical shape in all of computer graphics: the triangle.

## The Triangle

As you gaze upon the beautiful 4K vista from a AAA game in 2021, know this: it’s all triangles!

A triangle consists of three lines, so we could issue three `draw_line` commands, but it’s so valuable that it deserves its own module **[draw_triangle.sv]**:

```
module draw_triangle #(parameter CORDW=16) (  // signed coordinate width
    input  wire logic clk,    // clock
    input  wire logic rst,    // reset
    input  wire logic start,  // start triangle drawing
    input  wire logic oe,     // output enable
    input  wire logic signed [CORDW-1:0] x0, y0,  // vertex 0
    input  wire logic signed [CORDW-1:0] x1, y1,  // vertex 1
    input  wire logic signed [CORDW-1:0] x2, y2,  // vertex 2
    output      logic signed [CORDW-1:0] x,  y,   // drawing position
    output      logic drawing,   // triangle is drawing
    output      logic complete,  // triangle complete (remains high)
    output      logic done       // triangle complete (high for one tick)
    );

    logic [1:0] line_id;  // current line (0, 1, or 2)
    logic line_start;     // start drawing line
    logic line_done;      // finished drawing current line?

    // current line coordinates
    logic signed [CORDW-1:0] lx0, ly0;  // point 0 position
    logic signed [CORDW-1:0] lx1, ly1;  // point 1 position

    enum {IDLE, INIT, DRAW} state;
    always_ff @(posedge clk) begin
        case (state)
            INIT: begin  // register coordinates
                state <= DRAW;
                line_start <= 1;
                if (line_id == 2'd0) begin  // (x0,y0) (x1,y1)
                    lx0 <= x0; ly0 <= y0;
                    lx1 <= x1; ly1 <= y1;
                end else if (line_id == 2'd1) begin  // (x1,y1) (x2,y2)
                    lx0 <= x1; ly0 <= y1;
                    lx1 <= x2; ly1 <= y2;
                end else begin  // (x2,y2) (x0,y0)
                    lx0 <= x2; ly0 <= y2;
                    lx1 <= x0; ly1 <= y0;
                end
            end
            DRAW: begin
                line_start <= 0;
                if (line_done) begin
                    if (line_id == 2) begin  // final line
                        state <= IDLE;
                        complete <= 1;
                        done <= 1;
                    end else begin
                        state <= INIT;
                        line_id <= line_id + 1;
                    end
                end
            end
            default: begin  // IDLE
                done <= 0;
                if (start) begin
                    state <= INIT;
                    line_id <= 0;
                    complete <= 0;
                end
            end
        endcase

        if (rst) begin
            state <= IDLE;
            line_id <= 0;
            line_start <= 0;
            complete <= 0;
            done <= 0;
        end
    end

    draw_line #(.CORDW(CORDW)) draw_line_inst (
        .clk,
        .rst,
        .start(line_start),
        .oe,
        .x0(lx0),
        .y0(ly0),
        .x1(lx1),
        .y1(ly1),
        .x,
        .y,
        .drawing,
        .complete(),
        .done(line_done)
    );
endmodule
```
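
The order in which the module walks the edges can be summarised in a few lines of software. The Python sketch below mirrors the `line_id` selection in the `INIT` state; the function names are invented for this illustration, and the pixel count assumes each edge covers `max(|dx|, |dy|) + 1` pixels.

```python
def triangle_edges(v0, v1, v2):
    """Edges in the order selected by line_id 0, 1, and 2."""
    return [(v0, v1), (v1, v2), (v2, v0)]


def triangle_pixel_writes(v0, v1, v2):
    """Total pixel writes for the outline; each vertex is written twice."""
    total = 0
    for (xa, ya), (xb, yb) in triangle_edges(v0, v1, v2):
        total += max(abs(xb - xa), abs(yb - ya)) + 1
    return total
```

For the small triangle `(2,2) (6,2) (4,6)` from earlier, the outline takes 15 pixel writes covering 12 distinct pixels.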

There’s a test bench you can use to exercise the module with Vivado: **[xc7/draw_triangle_tb.sv]**.

We can tweak our existing top module to draw a few triangles:

- Arty (XC7): **xc7/top_triangles.sv**
- iCEBreaker (iCE40): **ice40/top_triangles.sv**

Arty triangles look like this:

```
// draw triangles in framebuffer
localparam SHAPE_CNT=3;  // number of shapes to draw
logic [1:0] shape_id;    // shape identifier
logic signed [CORDW-1:0] vx0, vy0, vx1, vy1, vx2, vy2;  // shape coords
logic draw_start, drawing, draw_done;  // drawing signals

// draw state machine
enum {IDLE, INIT, DRAW, DONE} state;
always_ff @(posedge clk_100m) begin
    case (state)
        INIT: begin  // register coordinates and colour
            draw_start <= 1;
            state <= DRAW;
            case (shape_id)
                2'd0: begin
                    vx0 <=  60; vy0 <=  20;
                    vx1 <= 280; vy1 <=  80;
                    vx2 <= 160; vy2 <= 164;
                    fb_cidx <= 4'h9;  // orange
                end
                2'd1: begin
                    vx0 <=  70; vy0 <= 160;
                    vx1 <= 220; vy1 <=  90;
                    vx2 <= 170; vy2 <=  10;
                    fb_cidx <= 4'hC;  // blue
                end
                2'd2: begin
                    vx0 <=  22; vy0 <=  35;
                    vx1 <=  62; vy1 <= 150;
                    vx2 <=  98; vy2 <=  96;
                    fb_cidx <= 4'h2;  // dark purple
                end
                default: begin  // should never occur
                    vx0 <=  10; vy0 <=  10;
                    vx1 <=  10; vy1 <=  30;
                    vx2 <=  20; vy2 <=  20;
                    fb_cidx <= 4'h7;  // white
                end
            endcase
        end
        DRAW: begin
            draw_start <= 0;
            if (draw_done) begin
                if (shape_id == SHAPE_CNT-1) begin
                    state <= DONE;
                end else begin
                    shape_id <= shape_id + 1;
                    state <= INIT;
                end
            end
        end
        DONE: state <= DONE;
        default: if (frame_sys) state <= INIT;  // IDLE
    endcase
end

// control drawing speed with output enable
localparam FRAME_WAIT = 300;  // wait this many frames to start drawing
logic [$clog2(FRAME_WAIT)-1:0] cnt_frame_wait;
logic draw_req;  // draw requested
always_ff @(posedge clk_100m) begin
    if (!fb_busy) draw_req <= 0;  // disable after FB available, so 1 pix per frame
    if (frame_sys) begin  // once per frame
        if (cnt_frame_wait != FRAME_WAIT-1) begin
            cnt_frame_wait <= cnt_frame_wait + 1;
        end else draw_req <= 1;  // request drawing
    end
end

draw_triangle #(.CORDW(CORDW)) draw_triangle_inst (
    .clk(clk_100m),
    .rst(1'b0),
    .start(draw_start),
    .oe(draw_req && !fb_busy),  // draw if requested when framebuffer is available
    .x0(vx0),
    .y0(vy0),
    .x1(vx1),
    .y1(vy1),
    .x2(vx2),
    .y2(vy2),
    .x(fbx),
    .y(fby),
    .drawing,
    .complete(),
    .done(draw_done)
);
```

We can draw millions of pixels per second, but drawing 60 per second (one per frame) is fun to watch.

## Explore

I hope you enjoyed this instalment of *Exploring FPGA Graphics*, but nothing beats creating your own designs. Here are a few suggestions to get you started:

- Experiment with different lines, triangles, and colours
- What’s the most impressive thing you can draw with a handful of straight lines?
- We drew a cube, but how about the other Platonic solids?
- Draw a landscape with one-point perspective (YouTube example)

## Next Time

Next time, we’ll be covering filled shapes in 2D Shapes.

*Constructive feedback is always welcome. Get in touch with @WillFlux or open an issue on GitHub.*