Welcome back to Exploring FPGA Graphics. It’s time to turn our attention to drawing. Most modern computer graphics come down to drawing triangles and colouring them in. So, it seems fitting to begin our tour of drawing with triangles and the straight lines that form them. This post will implement Bresenham’s line algorithm in Verilog, creating lines, triangles, and even a cube (our first sort-of 3D graphics).
In this series, we explore graphics at the hardware level and get a feel for the power of FPGAs. We start by learning how displays work, before racing the beam with Pong, starfields and sprites, simulating life with bitmaps, drawing lines and triangles, and finally creating simple 3D models. I’ll be writing and revising this series throughout 2020 and 2021. New to the series? Start with Exploring FPGA Graphics.
Updated 2021-02-26. Get in touch with @WillFlux or open an issue on GitHub.
Series Outline
- Exploring FPGA Graphics - learn how displays work and animate simple shapes
- FPGA Pong - race the beam to create the arcade classic
- Hardware Sprites - fast, colourful, graphics with minimal resources
- FPGA Ad Astra - demo with hardware sprites and animated starfields
- Framebuffers - driving the display from a bitmap in memory
- Life on Screen - the screen comes alive with Conway’s Game of Life
- Lines and Triangles (this post) - drawing lines and triangles with a framebuffer
More parts to follow.
Requirements
For this series, you need an FPGA board with video output. We’ll be working at 640x480, so pretty much any video output will do. It helps to be comfortable with programming your FPGA board and reasonably familiar with Verilog.
We’ll be demoing with these boards:
- iCEBreaker (Lattice iCE40) with 12-Bit DVI Pmod
- Digilent Arty A7-35T (Xilinx Artix-7) with Pmod VGA
Source
The SystemVerilog designs featured in this series are available from the projf-explore repo on GitHub. The designs are open source hardware under the permissive MIT licence, but this blog is subject to normal copyright restrictions.
An Address for Every Pixel
Let’s start by reminding ourselves how a framebuffer works. A framebuffer memory location backs every pixel on screen. To update a pixel, we convert its coordinates into a memory address using the framebuffer dimensions.
For this post we’ll be using:
- Arty: 320 x 240 pixels
- iCEBreaker: 160 x 120 pixels
For example, to change the colour of pixel (42,21), we write to the following address:
- Arty: (320 * 21) + 42 = 6,762
- iCEBreaker: (160 * 21) + 42 = 3,402
If our pixel coordinates are (px, py) and the framebuffer width is hres, our Verilog is:
always_ff @(posedge clk) begin
    pix_addr <= (hres * py) + px;
end
We could have written this with combinatorial logic, but timing is better if you store the result in a register and accept one cycle of latency.
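To double-check the address arithmetic in software, here is a minimal C sketch. The function name fb_pix_addr is made up for illustration and is not part of the project's designs.

```c
#include <stdint.h>

/* Hypothetical helper (not from the project): linear framebuffer address
 * of pixel (px, py) in a framebuffer that is hres pixels wide. */
uint32_t fb_pix_addr(uint32_t hres, uint32_t px, uint32_t py)
{
    return (hres * py) + px;
}
```

Calling fb_pix_addr(320, 42, 21) gives 6762 and fb_pix_addr(160, 42, 21) gives 3402, matching the worked examples for Arty and iCEBreaker.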
Many Colours?
In our previous framebuffer design, we loaded an image and picked a palette to match. Now that we’re drawing, we want the freedom to choose from a good range of colours. However, we also want to leave enough RAM for double-buffering when we start animating, so we’ll settle for four colours (2 bit) on iCEBreaker, and 16 colours (4 bit) on Arty.
A single framebuffer requires:
- Arty: 4 * 320 * 240 = 307,200 bits (12 of 50 BRAMs)
- iCEBreaker: 2 * 160 * 120 = 38,400 bits (10 of 30 BRAMs)
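The trade-off between colour depth and memory is easy to explore in software. This is a small C sketch with hypothetical helper names (fb_bits and fb_colours are not from the project); it only reproduces the bit counts above and says nothing about how the tools pack those bits into BRAMs.

```c
#include <stdint.h>

/* Hypothetical helper: bits of memory needed for one framebuffer. */
uint32_t fb_bits(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    return width * height * bits_per_pixel;
}

/* Hypothetical helper: number of colours a given depth can index. */
uint32_t fb_colours(uint32_t bits_per_pixel)
{
    return 1u << bits_per_pixel;
}
```

For example, fb_bits(320, 240, 4) is 307200 and fb_bits(160, 120, 2) is 38400; doubling either figure gives the cost of double-buffering.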
I have selected two palettes to get you started, but use anything you like.
16 Colour Palette
For the 16 colour palette I’ve chosen the PICO-8 palette (adjusted for 12-bit output):
We load the 16 colours into the colour lookup table (CLUT) ROM using a file: [16_colr_4bit_palette.mem].
000 // 0 - black
235 // 1 - dark-blue
825 // 2 - dark-purple
085 // 3 - dark-green
B53 // 4 - brown
655 // 5 - dark-grey
CCC // 6 - light-grey
FFF // 7 - white
F05 // 8 - red
FA0 // 9 - orange
FF2 // A - yellow
0E3 // B - green
3BF // C - blue
87A // D - indigo
F7B // E - pink
FCA // F - peach
4 Colour Palette
For the iCEBreaker palette I’ve chosen four of these colours:
000 // 0 - black
FA0 // 1 - orange
0E3 // 2 - green
3BF // 3 - blue
We load the 4 colours into the CLUT ROM using a file: [4_colr_4bit_palette.mem].
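Each palette entry is a 12-bit value with four bits per channel. The C sketch below splits an entry into its channels, assuming red sits in the top nibble (which is how the VGA output is wired up in the top module later in this post). The type and function names are invented for illustration.

```c
#include <stdint.h>

typedef struct {
    uint8_t r, g, b;  /* 4-bit channel values, 0x0 to 0xF */
} rgb444_t;

/* Hypothetical helper: split a 12-bit palette entry into its channels.
 * Assumes the layout is RGB with red in bits 11:8. */
rgb444_t clut_split(uint16_t colr)
{
    rgb444_t c;
    c.r = (colr >> 8) & 0xF;
    c.g = (colr >> 4) & 0xF;
    c.b = colr & 0xF;
    return c;
}
```

Under that assumption, orange (FA0) is full red, about two-thirds green, and no blue.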
From Point to Line
We can draw a point, but we want to draw a line between two points. Bresenham’s line algorithm is the definitive way to do this, and The Beauty of Bresenham’s Algorithm has just what we need: a clearly written version of the algorithm using integers.
Here’s the C design:
void plotLine(int x0, int y0, int x1, int y1)
{
    int dx = abs(x1-x0), sx = x0<x1 ? 1 : -1;
    int dy = -abs(y1-y0), sy = y0<y1 ? 1 : -1;
    int err = dx+dy, e2; /* error value e_xy */
    for(;;){ /* loop */
        setPixel(x0,y0);
        if (x0==x1 && y0==y1) break;
        e2 = 2*err;
        if (e2 >= dy) { err += dy; x0 += sx; } /* e_xy+e_x > 0 */
        if (e2 <= dx) { err += dx; y0 += sy; } /* e_xy+e_y < 0 */
    }
}
For the hows and whys read A Rasterizing Algorithm for Drawing Curves (PDF). Kudos to Alois Zingl.
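If you want to try the C version on your computer before moving to hardware, one option is to collect the pixels in an array rather than calling setPixel. The sketch below is the same integer algorithm; the point_t type, the function name, and the array interface are my own additions for illustration.

```c
#include <stdlib.h>

typedef struct { int x, y; } point_t;

/* Same algorithm as plotLine, but pixels are stored in pts[] (at most
 * max_pts of them) instead of being drawn. Returns the pixel count. */
int plot_line_collect(int x0, int y0, int x1, int y1,
                      point_t *pts, int max_pts)
{
    int dx =  abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy, e2;
    int n = 0;

    for (;;) {
        if (n < max_pts) { pts[n].x = x0; pts[n].y = y0; }
        n++;
        if (x0 == x1 && y0 == y1) break;
        e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
    }
    return n;
}
```

A line from (0,0) to (3,1) produces four pixels: (0,0), (1,0), (2,1), (3,1). A line whose start and end coincide produces a single pixel.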
From C to Verilog
There are two stages to the algorithm: setting the initial values and running the algorithm in the loop.
As initial values, we need the difference between the start and end coordinates, and the sign and absolute value of that difference. Your first thought might be to mess around with two’s complement to determine abs(x1-x0), but we can use a little combinatorial logic, remembering to use logic signed as needed:
logic signed [CORDW:0] dx, dy; // a bit wider as signed
logic right, down; // drawing direction
always_comb begin
    right = (x0 < x1);
    down = (y0 < y1);
    dx = right ? x1 - x0 : x0 - x1; // dx = abs(x1 - x0)
    dy = down ? y0 - y1 : y1 - y0; // dy = -abs(y1 - y0)
end
NB. The sign of dy is different from dx: dx holds abs(x1 - x0), but dy holds -abs(y1 - y0), just as in the C version of the algorithm.
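The set-up logic can be mirrored in C to check a few cases by hand. This is only a sketch: line_setup_t and line_setup are hypothetical names, and plain int is used instead of fixed-width signals, so it doesn't model bit widths.

```c
typedef struct {
    int right, down;  /* drawing direction flags */
    int dx, dy;       /* dx = abs(x1 - x0), dy = -abs(y1 - y0) */
} line_setup_t;

/* Hypothetical software model of the always_comb block above. */
line_setup_t line_setup(int x0, int y0, int x1, int y1)
{
    line_setup_t s;
    s.right = (x0 < x1);
    s.down  = (y0 < y1);
    s.dx = s.right ? x1 - x0 : x0 - x1;  /* never negative */
    s.dy = s.down  ? y0 - y1 : y1 - y0;  /* never positive */
    return s;
}
```

For a line from (10,5) to (3,9) this gives right = 0, down = 1, dx = 7, and dy = -4.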
Going Loopy
Next, we could quickly bash out an always_ff block to cover the loop. But this isn’t software; there’s a trap lurking to catch the unwary.
Rewriting the C in Verilog, we could end up with the following (dubious) logic:
always_ff @(posedge clk) begin
    // ...
    if (e2 >= dy) begin
        x <= (right) ? x + 1 : x - 1;
        err <= err + dy;
    end
    if (e2 <= dx) begin
        y <= (down) ? y + 1 : y - 1;
        err <= err + dx;
    end
end
At first glance, it looks OK, and your tools will almost certainly build it without complaint. Experienced Verilog engineers are probably rolling their eyes, but it’s worth thinking through why this won’t work.
Consider what happens if (e2 >= dy) and (e2 <= dx) are both true.
x and y are updated correctly, but err <= err + dy; is ignored. Huh?!
The <= assignment is non-blocking: each right-hand side is evaluated first, using the current value of err, and the updates are applied together afterwards. The Verilog standard says that if a variable has multiple non-blocking assignments in the same block, the last assignment wins.
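To make the difference concrete, here is a rough C model of the two behaviours. It is an illustration of the semantics described above rather than a simulation of Verilog, and both function names are invented.

```c
/* Sequential C semantics: the second update sees the result of the first. */
int err_sequential(int err, int dx, int dy, int movx, int movy)
{
    if (movx) err += dy;
    if (movy) err += dx;
    return err;
}

/* Model of the dubious always_ff block: both right-hand sides read the
 * OLD err, and the assignment scheduled last replaces the earlier one. */
int err_nonblocking(int err, int dx, int dy, int movx, int movy)
{
    int next = err;              /* register keeps its value by default */
    if (movx) next = err + dy;   /* scheduled first */
    if (movy) next = err + dx;   /* scheduled last: overrides the above */
    return next;
}
```

With err = 1, dx = 3, dy = -1 and both conditions true, the sequential version returns 3 but the non-blocking model returns 4. When only one condition is true, the two agree, which is why the bug is easy to miss.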
We can’t calculate the error with just a combinatorial block either: the new error value depends on the previous one; in other words, we need to maintain state. Instead, we use a combinatorial block, with blocking assignment, to calculate the change in error, then add it to the previous value in a clocked always_ff block:
logic signed [CORDW:0] err, derr;
logic movx, movy; // move in x and/or y required
always_comb begin
    movx = (2*err >= dy);
    movy = (2*err <= dx);
    derr = movx ? dy : 0;
    if (movy) derr = derr + dx;
end
always_ff @(posedge clk) begin
    // ...
    if (movx) x <= right ? x + 1 : x - 1;
    if (movy) y <= down ? y + 1 : y - 1;
    err <= err + derr;
end
The two blocking assignments to derr happen one after the other. Alternatively, we could have calculated next_err in our always_comb block and assigned it to err in our always_ff; I tried this approach, but it used more LUTs.
Note how we’ve also eliminated the need for e2, replacing it with 2*err in our comparisons.
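One way to convince yourself the corrected logic matches the C algorithm is to model a single clock tick in software. The sketch below follows the two blocks above line for line; line_state_t and line_step are hypothetical names, and plain int stands in for the signed registers.

```c
typedef struct { int x, y, err; } line_state_t;

/* One clock tick of the corrected logic: movx, movy and derr come from
 * the current err (the always_comb block), then x, y and err are all
 * updated together (the always_ff block). */
line_state_t line_step(line_state_t s, int dx, int dy, int right, int down)
{
    int movx = (2 * s.err >= dy);
    int movy = (2 * s.err <= dx);
    int derr = movx ? dy : 0;
    if (movy) derr = derr + dx;

    if (movx) s.x += right ? 1 : -1;
    if (movy) s.y += down ? 1 : -1;
    s.err += derr;
    return s;
}
```

For the line (0,0) to (3,1) we have dx = 3, dy = -1 and an initial err of 2. Three calls to line_step visit (1,0), (2,1) and (3,1), the same pixels the C version plots.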
Ready to Draw
With the error logic fixed we’re ready to create our line drawing module [draw_line.sv]:
module draw_line #(parameter CORDW=10) (  // FB coord width in bits
    input  wire logic clk,             // clock
    input  wire logic rst,             // reset
    input  wire logic start,           // start line drawing
    input  wire logic oe,              // output enable
    input  wire logic [CORDW-1:0] x0,  // horizontal start position
    input  wire logic [CORDW-1:0] y0,  // vertical start position
    input  wire logic [CORDW-1:0] x1,  // horizontal end position
    input  wire logic [CORDW-1:0] y1,  // vertical end position
    output      logic [CORDW-1:0] x,   // horizontal drawing position
    output      logic [CORDW-1:0] y,   // vertical drawing position
    output      logic drawing,         // line is drawing
    output      logic done             // line complete (high for one tick)
    );
    // "constant" signals
    logic signed [CORDW:0] dx, dx_c, dy, dy_c; // a bit wider as signed
    logic right, right_c, down, down_c; // drawing direction
    always_comb begin
        right_c = (x0 < x1);
        down_c = (y0 < y1);
        dx_c = right_c ? x1 - x0 : x0 - x1; // dx_c = abs(x1 - x0)
        dy_c = down_c ? y0 - y1 : y1 - y0; // dy_c = -abs(y1 - y0)
    end
    // error values
    logic signed [CORDW:0] err, derr;
    logic movx, movy; // move in x and/or y required
    always_comb begin
        movx = (2*err >= dy);
        movy = (2*err <= dx);
        derr = movx ? dy : 0;
        if (movy) derr = derr + dx;
    end
    logic in_progress = 0; // drawing calculation in progress
    always_comb begin
        drawing = 0;
        if (in_progress && oe) drawing = 1;
    end
    enum {IDLE, INIT, DRAW} state;
    always @(posedge clk) begin
        case (state)
            DRAW: begin
                if (x == x1 && y == y1) begin
                    in_progress <= 0;
                    done <= 1;
                    state <= IDLE;
                end else if (oe) begin
                    if (movx) x <= right ? x + 1 : x - 1;
                    if (movy) y <= down ? y + 1 : y - 1;
                    err <= err + derr;
                end
            end
            INIT: begin
                err <= dx + dy;
                x <= x0;
                y <= y0;
                in_progress <= 1;
                state <= DRAW;
            end
            default: begin // IDLE
                done <= 0;
                if (start) begin // register "constant" signals
                    right <= right_c;
                    down <= down_c;
                    dx <= dx_c;
                    dy <= dy_c;
                    state <= INIT;
                end
            end
        endcase
        if (rst) begin
            in_progress <= 0;
            done <= 0;
            state <= IDLE;
        end
    end
endmodule
The pixel to draw is output as (x,y) and the start and end coordinates are input as (x0,y0) and (x1,y1) respectively. A high start signal begins drawing and drawing completion is marked by done. An output enable signal, oe, allows you to pause drawing, handy for multiplexing memory access, or for slowing down the action to make it visible.
We register the “constant” signals to improve timing; this is particularly important on the iCE40 where subtraction takes two layers of logic.
There’s a test bench you can use to exercise the module for Xilinx: [xc7/draw_line_tb.sv].
We test different sorts of lines: steep and not steep, drawn upwards, downwards, left to right, and right to left, as well as points, and the longest possible horizontal, vertical, and diagonal lines. A steep line is one in which the vertical change is larger than the horizontal.
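When checking simulation output, it helps to know whether a test line counts as steep and how many pixels it should contain. The C sketch below works both out from the endpoints; the function names are invented, and the pixel count relies on the algorithm taking one step along the longer axis per pixel.

```c
#include <stdlib.h>

/* A steep line has a larger vertical change than horizontal change. */
int line_is_steep(int x0, int y0, int x1, int y1)
{
    return abs(y1 - y0) > abs(x1 - x0);
}

/* Expected number of pixels, including both endpoints. */
int line_pixel_count(int x0, int y0, int x1, int y1)
{
    int adx = abs(x1 - x0);
    int ady = abs(y1 - y0);
    return (adx > ady ? adx : ady) + 1;
}
```

A point has one pixel, the longest horizontal line in a 320-pixel-wide framebuffer has 320, and a diagonal from (40,0) to (279,239) has 240.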
Top of the Line
It’s time to get drawing with actual hardware.
Create a new top module and build it for your board:
- Xilinx XC7: xc7/top_line.sv
- Lattice iCE40: ice40/top_line.sv
This design is similar to the top modules we used in the framebuffers post.
Building the Designs
In the Lines and Triangles section of the git repo, you’ll find the design files, a makefile for iCEBreaker, a Vivado project for Arty, and instructions for building the designs for both boards.
The XC7 version of top_line looks like this:
module top_line (
    input  wire logic clk_100m,     // 100 MHz clock
    input  wire logic btn_rst,      // reset button (active low)
    output      logic vga_hsync,    // horizontal sync
    output      logic vga_vsync,    // vertical sync
    output      logic [3:0] vga_r,  // 4-bit VGA red
    output      logic [3:0] vga_g,  // 4-bit VGA green
    output      logic [3:0] vga_b   // 4-bit VGA blue
    );
    // generate pixel clock
    logic clk_pix;
    logic clk_locked;
    clock_gen clock_640x480 (
        .clk(clk_100m),
        .rst(!btn_rst), // reset button is active low
        .clk_pix,
        .clk_locked
    );
    // display timings
    localparam CORDW = 10; // screen coordinate width in bits
    logic [CORDW-1:0] sx, sy;
    logic hsync, vsync, de;
    display_timings_480p timings_640x480 (
        .clk_pix,
        .rst(!clk_locked), // wait for clock lock
        .sx,
        .sy,
        .hsync,
        .vsync,
        .de
    );
    // size of screen with and without blanking
    localparam H_RES_FULL = 800;
    localparam V_RES_FULL = 525;
    localparam H_RES = 640;
    localparam V_RES = 480;
    // vertical blanking interval (will move to display_timings soon)
    logic vbi;
    always_comb vbi = (sy == V_RES && sx == 0);
    // framebuffer (FB)
    localparam FB_WIDTH = 320;
    localparam FB_HEIGHT = 240;
    localparam FB_CORDW = $clog2(FB_WIDTH); // assumes WIDTH>=HEIGHT
    localparam FB_PIXELS = FB_WIDTH * FB_HEIGHT;
    localparam FB_ADDRW = $clog2(FB_PIXELS);
    localparam FB_DATAW = 4; // colour bits per pixel
    localparam FB_IMAGE = "";
    localparam FB_PALETTE = "16_colr_4bit_palette.mem";
    logic fb_we;
    logic [FB_ADDRW-1:0] fb_addr_write, fb_addr_read;
    logic [FB_DATAW-1:0] fb_cidx_write;
    logic [FB_DATAW-1:0] fb_cidx_read, fb_cidx_read_1;
    bram_sdp #(
        .WIDTH(FB_DATAW),
        .DEPTH(FB_PIXELS),
        .INIT_F(FB_IMAGE)
    ) fb_inst (
        .clk_write(clk_pix),
        .clk_read(clk_pix),
        .we(fb_we),
        .addr_write(fb_addr_write),
        .addr_read(fb_addr_read),
        .data_in(fb_cidx_write),
        .data_out(fb_cidx_read_1)
    );
    // draw line in framebuffer
    logic [FB_CORDW-1:0] lx0, ly0, lx1, ly1; // line start and end coords
    logic [FB_CORDW-1:0] px, py; // line pixel drawing coordinates
    logic draw_start, drawing, draw_done; // draw_line signals
    // draw state machine
    enum {IDLE, INIT, DRAW, DONE} state;
    initial state = IDLE; // needed for Yosys
    always @(posedge clk_pix) begin
        draw_start <= 0;
        case (state)
            INIT: begin // register coordinates and colour
                lx0 <= 40; ly0 <= 0;
                lx1 <= 279; ly1 <= 239;
                fb_cidx_write <= 4'h9; // orange
                draw_start <= 1;
                state <= DRAW;
            end
            DRAW: if (draw_done) state <= DONE;
            DONE: state <= DONE;
            default: if (vbi) state <= INIT; // IDLE
        endcase
    end
    draw_line #(.CORDW(FB_CORDW)) draw_line_inst (
        .clk(clk_pix),
        .rst(!clk_locked),
        .start(draw_start),
        .oe(1'b1),
        .x0(lx0),
        .y0(ly0),
        .x1(lx1),
        .y1(ly1),
        .x(px),
        .y(py),
        .drawing,
        .done(draw_done)
    );
    // pixel coordinate to memory address calculation takes one cycle
    always_ff @(posedge clk_pix) fb_we <= drawing;
    pix_addr #(
        .CORDW(FB_CORDW),
        .ADDRW(FB_ADDRW)
    ) pix_addr_inst (
        .clk(clk_pix),
        .hres(FB_WIDTH),
        .px,
        .py,
        .pix_addr(fb_addr_write)
    );
    // linebuffer (LB)
    localparam LB_SCALE = 2; // scale (horizontal and vertical)
    localparam LB_LEN = FB_WIDTH; // line length matches framebuffer
    localparam LB_BPC = 4; // bits per colour channel
    // LB output to display
    logic lb_en_out;
    always_comb lb_en_out = de; // Use 'de' for entire frame
    // Load data from FB into LB
    logic lb_data_req; // LB requesting data
    logic [$clog2(LB_LEN+1)-1:0] cnt_h; // count pixels in line to read
    always_ff @(posedge clk_pix) begin
        if (vbi) fb_addr_read <= 0; // new frame
        if (lb_data_req && sy != V_RES-1) begin // load next line of data...
            cnt_h <= 0; // ...if not on last line
        end else if (cnt_h < LB_LEN) begin // advance to start of next line
            cnt_h <= cnt_h + 1;
            fb_addr_read <= fb_addr_read == FB_PIXELS-1 ? 0 : fb_addr_read + 1;
        end
    end
    // FB BRAM and CLUT pipeline adds three cycles of latency
    logic lb_en_in_2, lb_en_in_1, lb_en_in;
    always_ff @(posedge clk_pix) begin
        lb_en_in_2 <= (cnt_h < LB_LEN);
        lb_en_in_1 <= lb_en_in_2;
        lb_en_in <= lb_en_in_1;
    end
    // LB colour channels
    logic [LB_BPC-1:0] lb_in_0, lb_in_1, lb_in_2;
    logic [LB_BPC-1:0] lb_out_0, lb_out_1, lb_out_2;
    linebuffer #(
        .WIDTH(LB_BPC), // data width of each channel
        .LEN(LB_LEN), // length of line
        .SCALE(LB_SCALE) // scaling factor (>=1)
    ) lb_inst (
        .clk_in(clk_pix), // input clock
        .clk_out(clk_pix), // output clock
        .data_req(lb_data_req), // request input data (clk_in)
        .en_in(lb_en_in), // enable input (clk_in)
        .en_out(lb_en_out), // enable output (clk_out)
        .vbi, // start of vertical blanking interval (clk_out)
        .din_0(lb_in_0), // data in (clk_in)
        .din_1(lb_in_1),
        .din_2(lb_in_2),
        .dout_0(lb_out_0), // data out (clk_out)
        .dout_1(lb_out_1),
        .dout_2(lb_out_2)
    );
    // improve timing with register between BRAM and async ROM
    always @(posedge clk_pix) begin
        fb_cidx_read <= fb_cidx_read_1;
    end
    // colour lookup table (ROM) 16x12-bit entries
    logic [11:0] clut_colr;
    rom_async #(
        .WIDTH(12),
        .DEPTH(16),
        .INIT_F(FB_PALETTE)
    ) clut (
        .addr(fb_cidx_read),
        .data(clut_colr)
    );
    // map colour index to palette using CLUT and read into LB
    always_ff @(posedge clk_pix) begin
        {lb_in_2, lb_in_1, lb_in_0} <= clut_colr;
    end
    // LB output adds one cycle of latency - need to correct display signals
    logic hsync_1, vsync_1, lb_en_out_1;
    always_ff @(posedge clk_pix) begin
        hsync_1 <= hsync;
        vsync_1 <= vsync;
        lb_en_out_1 <= lb_en_out;
    end
    // VGA output
    always_ff @(posedge clk_pix) begin
        vga_hsync <= hsync_1;
        vga_vsync <= vsync_1;
        vga_r <= lb_en_out_1 ? lb_out_2 : 4'h0;
        vga_g <= lb_en_out_1 ? lb_out_1 : 4'h0;
        vga_b <= lb_en_out_1 ? lb_out_0 : 4'h0;
    end
endmodule
Note how we set the line colour using the hex index value from the palette: 0x9 is orange.
The module for calculating the pixel address looks like this [pix_addr.sv]:
module pix_addr #(
    parameter CORDW=10, // framebuffer coordinate width in bits
    parameter ADDRW=13  // width of memory address bus
    ) (
    input  wire logic clk,               // clock
    input  wire logic [CORDW-1:0] hres,  // horizontal framebuffer resolution
    input  wire logic [CORDW-1:0] px,    // horizontal pixel position
    input  wire logic [CORDW-1:0] py,    // vertical pixel position
    output      logic [ADDRW-1:0] pix_addr // pixel address
    );
    always_ff @(posedge clk) begin
        pix_addr <= (hres * py) + px;
    end
endmodule
pix_addr has only one line of logic, and a simple one at that, so a module seems like overkill. However, when you use a more complex memory setup, it helps to have this logic defined in one place.
Calculating the pixel address takes one cycle, so we need to delay write enable for the framebuffer by one cycle in the top modules: always_ff @(posedge clk_pix) fb_we <= drawing;
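One detail to watch when reusing pix_addr is the width parameters. Its default ADDRW of 13 covers only 8,192 addresses, too few for either framebuffer in this post, which is why top_line passes FB_ADDRW derived with $clog2. The C sketch below reproduces that sizing for the values used here; ceil_log2 is a hypothetical stand-in for $clog2 and is only intended for inputs from 1 to 2^31.

```c
/* Smallest number of bits b such that 2^b >= n.
 * Hypothetical stand-in for $clog2; assumes 1 <= n <= 2^31. */
unsigned ceil_log2(unsigned n)
{
    unsigned bits = 0;
    while ((1u << bits) < n) bits++;
    return bits;
}
```

For Arty, ceil_log2(320) is 9 coordinate bits and ceil_log2(320 * 240) is 17 address bits; for iCEBreaker, the figures are 8 and 15.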
That Ain’t No Cube
If we can draw one line, we can draw many! Let’s draw a cube of the kind you’ve probably doodled on paper; this requires nine lines. To see how the drawing works, we’ve wired the drawing output enable to a signal, draw_oe, that pulses once per frame. After a delay of 300 frames, which gives the monitor time to start showing the image, one new pixel is drawn each frame.
- Xilinx XC7: xc7/top_cube.sv
- Lattice iCE40: ice40/top_cube.sv
The cube drawing part looks like this for iCE40:
    // draw cube in framebuffer
    localparam LINE_CNT=9;
    logic [3:0] line_id; // line identifier
    logic [FB_CORDW-1:0] lx0, ly0, lx1, ly1; // line start and end coords
    logic [FB_CORDW-1:0] px, py; // line pixel drawing coordinates
    logic draw_start, drawing, draw_done; // draw_line signals
    // draw state machine
    enum {IDLE, INIT, DRAW, DONE} state;
    initial state = IDLE; // needed for Yosys
    always @(posedge clk_pix) begin
        draw_start <= 0;
        case (state)
            INIT: begin // register coordinates and colour
                draw_start <= 1;
                state <= DRAW;
                fb_cidx_write <= 2'h2; // green
                case (line_id)
                    4'd0: begin
                        lx0 <= 65; ly0 <= 45; lx1 <= 115; ly1 <= 45;
                    end
                    4'd1: begin
                        lx0 <= 115; ly0 <= 45; lx1 <= 115; ly1 <= 95;
                    end
                    4'd2: begin
                        lx0 <= 115; ly0 <= 95; lx1 <= 65; ly1 <= 95;
                    end
                    4'd3: begin
                        lx0 <= 65; ly0 <= 95; lx1 <= 65; ly1 <= 45;
                    end
                    4'd4: begin
                        lx0 <= 65; ly0 <= 95; lx1 <= 45; ly1 <= 75;
                    end
                    4'd5: begin
                        lx0 <= 45; ly0 <= 75; lx1 <= 45; ly1 <= 25;
                    end
                    4'd6: begin
                        lx0 <= 45; ly0 <= 25; lx1 <= 65; ly1 <= 45;
                    end
                    4'd7: begin
                        lx0 <= 45; ly0 <= 25; lx1 <= 95; ly1 <= 25;
                    end
                    4'd8: begin
                        lx0 <= 95; ly0 <= 25; lx1 <= 115; ly1 <= 45;
                    end
                    default: begin
                        lx0 <= 0; ly0 <= 0; lx1 <= 0; ly1 <= 0;
                    end
                endcase
            end
            DRAW: if (draw_done) begin
                if (line_id == LINE_CNT-1) begin
                    state <= DONE;
                end else begin
                    line_id <= line_id + 1;
                    state <= INIT;
                end
            end
            DONE: state <= DONE;
            default: if (vbi) state <= INIT; // IDLE
        endcase
    end
    // control drawing output enable - wait 300 frames, then 1 pixel/frame
    localparam DRAW_WAIT = 300;
    logic [$clog2(DRAW_WAIT)-1:0] cnt_draw_wait;
    logic draw_oe;
    always_ff @(posedge clk_pix) begin
        draw_oe <= 0;
        if (vbi) begin
            if (cnt_draw_wait != DRAW_WAIT-1) begin
                cnt_draw_wait <= cnt_draw_wait + 1;
            end else draw_oe <= 1;
        end
    end
    draw_line #(.CORDW(FB_CORDW)) draw_line_inst (
        .clk(clk_pix),
        .rst(!clk_locked),
        .start(draw_start),
        .oe(draw_oe),
        .x0(lx0),
        .y0(ly0),
        .x1(lx1),
        .y1(ly1),
        .x(px),
        .y(py),
        .drawing,
        .done(draw_done)
    );
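For a rough idea of how long the animated cube takes to appear, we can add up the pixels on the nine lines. The C sketch below copies the coordinates from the listing above; the type and function names are made up, and the total is only an estimate because it counts shared corners once per line and ignores state machine overhead.

```c
#include <stdlib.h>

typedef struct { int x0, y0, x1, y1; } line_t;

/* The nine cube lines, copied from the iCE40 listing above. */
const line_t cube_lines[9] = {
    { 65, 45, 115, 45}, {115, 45, 115, 95}, {115, 95,  65, 95},
    { 65, 95,  65, 45}, { 65, 95,  45, 75}, { 45, 75,  45, 25},
    { 45, 25,  65, 45}, { 45, 25,  95, 25}, { 95, 25, 115, 45}
};

/* Pixels on one line: the longer axis sets the number of steps. */
int line_pixels(line_t l)
{
    int adx = abs(l.x1 - l.x0);
    int ady = abs(l.y1 - l.y0);
    return (adx > ady ? adx : ady) + 1;
}

/* Total over all nine lines; shared corners are counted once per line. */
int cube_pixels(void)
{
    int total = 0;
    for (int i = 0; i < 9; i++) total += line_pixels(cube_lines[i]);
    return total;
}
```

The total is 369 pixels, so at about one pixel per frame and 60 frames per second, the cube takes roughly six seconds to draw once the 300-frame (five second) wait is over.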
It looks like a cube, but it’s an ersatz cube. Our cube has no real depth; it cannot move in 3D space, nor can we apply realistic lighting. We’ll cover real 3D models in a later post, but for now let’s turn our attention to the most critical shape in computer graphics: the triangle.
The Triangle
As you gaze upon the beautiful 4K vista from a AAA game in 2021, know this: it’s all triangles!
A triangle consists of three lines, so we could issue three draw_line commands, but it’s so useful that it deserves its own module [draw_triangle.sv]:
module draw_triangle #(parameter CORDW=10) (  // FB coord width in bits
    input  wire logic clk,             // clock
    input  wire logic rst,             // reset
    input  wire logic start,           // start triangle drawing
    input  wire logic oe,              // output enable
    input  wire logic [CORDW-1:0] x0,  // vertex 0 - horizontal position
    input  wire logic [CORDW-1:0] y0,  // vertex 0 - vertical position
    input  wire logic [CORDW-1:0] x1,  // vertex 1 - horizontal position
    input  wire logic [CORDW-1:0] y1,  // vertex 1 - vertical position
    input  wire logic [CORDW-1:0] x2,  // vertex 2 - horizontal position
    input  wire logic [CORDW-1:0] y2,  // vertex 2 - vertical position
    output      logic [CORDW-1:0] x,   // horizontal drawing position
    output      logic [CORDW-1:0] y,   // vertical drawing position
    output      logic drawing,         // triangle is drawing
    output      logic done             // triangle complete (high for one tick)
    );
    enum {IDLE, INIT, DRAW} state;
    localparam CNT_LINE = 3; // triangle has three lines
    logic [$clog2(CNT_LINE)-1:0] line_id; // current line
    logic line_start; // start drawing line
    logic line_done; // finished drawing current line?
    // line coordinates
    logic [CORDW-1:0] lx0, ly0; // current line start position
    logic [CORDW-1:0] lx1, ly1; // current line end position
    always @(posedge clk) begin
        line_start <= 0;
        case (state)
            INIT: begin // register coordinates
                if (line_id == 2'd0) begin // (x0,y0) -> (x1,y1)
                    lx0 <= x0; ly0 <= y0;
                    lx1 <= x1; ly1 <= y1;
                end else if (line_id == 2'd1) begin // (x1,y1) -> (x2,y2)
                    lx0 <= x1; ly0 <= y1;
                    lx1 <= x2; ly1 <= y2;
                end else begin // (x2,y2) -> (x0,y0)
                    lx0 <= x2; ly0 <= y2;
                    lx1 <= x0; ly1 <= y0;
                end
                state <= DRAW;
                line_start <= 1;
            end
            DRAW: begin
                if (line_done) begin
                    if (line_id == CNT_LINE-1) begin
                        done <= 1;
                        state <= IDLE;
                    end else begin
                        line_id <= line_id + 1;
                        state <= INIT;
                    end
                end
            end
            default: begin // IDLE
                done <= 0;
                if (start) begin
                    line_id <= 0;
                    state <= INIT;
                end
            end
        endcase
        if (rst) begin
            line_id <= 0;
            line_start <= 0;
            done <= 0;
            state <= IDLE;
        end
    end
    draw_line #(.CORDW(CORDW)) draw_line_inst (
        .clk,
        .rst,
        .start(line_start),
        .oe,
        .x0(lx0),
        .y0(ly0),
        .x1(lx1),
        .y1(ly1),
        .x,
        .y,
        .drawing,
        .done(line_done)
    );
endmodule
There’s a test bench you can use to exercise the module for Xilinx: [xc7/draw_triangle_tb.sv].
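The order in which draw_triangle visits its edges can be summarised in a few lines of C. This sketch is illustrative only: edge_t and triangle_edge are hypothetical names, and the modulo arithmetic is simply a compact way of writing the three cases in the INIT state.

```c
typedef struct { int x0, y0, x1, y1; } edge_t;

/* Endpoints of edge line_id (0, 1 or 2) for vertices (vx[i], vy[i]).
 * Edges run v0 -> v1, v1 -> v2, then v2 -> v0 to close the triangle. */
edge_t triangle_edge(int line_id, const int vx[3], const int vy[3])
{
    int a = line_id % 3;        /* start vertex */
    int b = (line_id + 1) % 3;  /* end vertex: wraps from 2 back to 0 */
    edge_t e = { vx[a], vy[a], vx[b], vy[b] };
    return e;
}
```

Because each edge starts where the previous one ended, the outline is drawn as one continuous path that finishes back at vertex 0.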
And a similar top module can be used to draw a few triangles:
- Xilinx XC7: xc7/top_triangles.sv
- Lattice iCE40: ice40/top_triangles.sv
The triangle drawing part looks like this for XC7:
    // draw shapes in framebuffer
    localparam SHAPE_CNT=3;
    logic [1:0] shape_id; // shape identifier
    logic [FB_CORDW-1:0] tx0, ty0, tx1, ty1, tx2, ty2; // triangle coords
    logic [FB_CORDW-1:0] px, py; // triangle pixel drawing coordinates
    logic draw_start, drawing, draw_done; // draw_triangle signals
    // draw state machine
    enum {IDLE, INIT, DRAW, DONE} state;
    initial state = IDLE; // needed for Yosys
    always @(posedge clk_pix) begin
        draw_start <= 0;
        case (state)
            INIT: begin // register coordinates and colour
                draw_start <= 1;
                state <= DRAW;
                case (shape_id)
                    2'd0: begin
                        tx0 <= 20; ty0 <= 60;
                        tx1 <= 60; ty1 <= 180;
                        tx2 <= 110; ty2 <= 90;
                        fb_cidx_write <= 4'h2; // dark purple
                    end
                    2'd1: begin
                        tx0 <= 70; ty0 <= 200;
                        tx1 <= 240; ty1 <= 100;
                        tx2 <= 170; ty2 <= 10;
                        fb_cidx_write <= 4'hC; // blue
                    end
                    2'd2: begin
                        tx0 <= 60; ty0 <= 30;
                        tx1 <= 300; ty1 <= 80;
                        tx2 <= 160; ty2 <= 220;
                        fb_cidx_write <= 4'h9; // orange
                    end
                    default: begin // should never occur
                        tx0 <= 10; ty0 <= 10;
                        tx1 <= 10; ty1 <= 30;
                        tx2 <= 20; ty2 <= 20;
                        fb_cidx_write <= 4'h7; // white
                    end
                endcase
            end
            DRAW: if (draw_done) begin
                if (shape_id == SHAPE_CNT-1) begin
                    state <= DONE;
                end else begin
                    shape_id <= shape_id + 1;
                    state <= INIT;
                end
            end
            DONE: state <= DONE;
            default: if (vbi) state <= INIT; // IDLE
        endcase
    end
    // control drawing output enable - wait 300 frames, then 1 pixel/frame
    localparam DRAW_WAIT = 300;
    logic [$clog2(DRAW_WAIT)-1:0] cnt_draw_wait;
    logic draw_oe;
    always_ff @(posedge clk_pix) begin
        draw_oe <= 0;
        if (vbi) begin
            if (cnt_draw_wait != DRAW_WAIT-1) begin
                cnt_draw_wait <= cnt_draw_wait + 1;
            end else draw_oe <= 1;
        end
    end
    draw_triangle #(.CORDW(FB_CORDW)) draw_triangle_inst (
        .clk(clk_pix),
        .rst(!clk_locked),
        .start(draw_start),
        .oe(draw_oe),
        .x0(tx0),
        .y0(ty0),
        .x1(tx1),
        .y1(ty1),
        .x2(tx2),
        .y2(ty2),
        .x(px),
        .y(py),
        .drawing,
        .done(draw_done)
    );
We can draw millions of pixels per second, but drawing 60 per second (one per frame) is fun to watch.
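The once-per-frame pacing comes from the small counter that drives draw_oe in the listing above. Here is a frame-level C model of that counter. It is a sketch under stated assumptions: the function name is invented, the counter is assumed to start at zero, and the one-cycle register delay on draw_oe is ignored.

```c
#define DRAW_WAIT 300

/* Call once per frame, at the start of vertical blanking.
 * Returns 1 if the output enable pulses this frame, 0 while waiting.
 * cnt models cnt_draw_wait and is assumed to start at 0. */
int draw_oe_frame(unsigned *cnt)
{
    if (*cnt != DRAW_WAIT - 1) {
        *cnt = *cnt + 1;  /* still counting down the initial delay */
        return 0;
    }
    return 1;             /* delay over: one pulse every frame */
}
```

In this model, the first 299 frames return 0, and every frame from the 300th onwards returns 1, allowing one drawing step per frame.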
Explore
I hope you enjoyed this instalment of Exploring FPGA Graphics, but nothing beats creating your own designs. Here are a few suggestions to get you started:
- Experiment with different lines, triangles, and colours
- What’s the most impressive thing you can draw with a handful of straight lines?
- We drew a cube, but how about the other Platonic solids?
- Draw nested squares of different colours
- Cycle the colours every frame to create the feeling of flying down a tunnel
- Draw a landscape with one point perspective (YouTube example)
Next Time
Next time we’ll be expanding our repertoire of shapes and animating them using double buffering. Stay tuned.
Constructive feedback is always welcome. Get in touch with @WillFlux or open an issue on GitHub.