10 June 2020

FPGA Ad Astra

Welcome back to Exploring FPGA Graphics. In the previous part we learnt how to create hardware sprites. In this fourth part, we create a demo by combining our knowledge of sprites with animated starfields.

In this series, we explore graphics at the hardware level and get a feel for the power of FPGAs. We start by learning how displays work, before racing the beam with Pong, starfields and sprites, simulating life with bitmaps, drawing lines and triangles, and finally creating simple 3D models. I’ll be writing and revising this series throughout 2020. New to the series? Start with Exploring FPGA Graphics.

Updated 2020-11-25. Get in touch with @WillFlux or open an issue on GitHub.

Series Outline

  • Exploring FPGA Graphics - learn how displays work and animate simple shapes
  • FPGA Pong - race the beam to create the arcade classic
  • Hardware Sprites - fast, colourful graphics with minimal resources
  • FPGA Ad Astra (this post) - demo with hardware sprites and animated starfields
  • Framebuffers - driving the display from a bitmap in memory
  • Life on Screen - the screen comes alive with Conway’s Game of Life

More parts to follow.

Requirements

For this series, you need an FPGA board with video output. We’ll be working at 640x480, so pretty much any video output will do. You should be comfortable with programming your FPGA board and reasonably familiar with Verilog.

We’ll be demoing with two boards: the iCEBreaker (iCE40) and the Arty (Xilinx).

Source

The SystemVerilog designs featured in this series are available from the projf-explore repo on GitHub. The designs are open source hardware under the permissive MIT licence, but this blog is subject to normal copyright restrictions.

Computer Space

Computer Space was the first video arcade game: it features a simple backdrop of stars over which the player’s rocket battles two flying saucers. The backdrop may have been simple, but this game was released in 1971, when a “cheap” computer, such as the Data General Nova, cost around $8,000 (c. $40,000 in 2020). Unable to find anything both fast and cheap, developers Nolan Bushnell and Ted Dabney created their own custom hardware instead. It’s strangely hard to find details of the Computer Space logic design on the Internet, but TTL 7400s were used.

Quick Aside: Computer Space Cabinet
Computer Space had an amazing fibreglass cabinet. You can see more photos at Marvin’s Marvelous Mechanical Museum (courtesy of the Wayback Machine).

Linear Feedback Shift Register

I don’t know how Computer Space created its starfield, but we’re going to use a linear feedback shift register (henceforth LFSR). Rather like field programmable gate arrays themselves, their name makes LFSRs sound arcane and inscrutable; happily for us, this is not the case.

An LFSR can create a pseudorandom number sequence in which every number appears just once. For example, an 8-bit LFSR can generate all the numbers from 1 to 255 in a repeatable sequence that seems random. The logic for an LFSR can be written in a single line of Verilog:

sreg <= {1'b0, sreg[7:1]} ^ (sreg[0] ? 8'b10111000 : 8'b0);

The first part of the statement right-shifts the shift-register one bit. In the second part, we check the value of the bit we shifted: this is the feedback. If the feedback is true, we XOR the whole shift register with a magic pattern of bits known as taps. Values at the bit positions of the taps are flipped by the XOR. The initial value of an LFSR is known as the seed.

For example, an 8-bit LFSR with a seed of 169:

  • 10101001 - seed value (169)
  • 01010100 - after right shift (bit shifted out was 1)
  • 01010100 XOR 10111000 - XOR with taps
  • 11101100 - 2nd value (236)
  • 01110110 - after right shift (bit shifted out was 0)
  • 01110110 XOR 00000000 - XOR with 0
  • 01110110 - 3rd value (118)

For an 8-bit LFSR, we set tap bits 8, 6, 5, and 4. To find taps for other lengths, you can refer to the feedback polynomials on Wikipedia.
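To check the worked example without building anything, here is a minimal software model of the same shift-and-XOR step. It is written in Python rather than Verilog so it can be run on a PC; the function names lfsr_step and lfsr_period are my own for illustration and don't appear in the designs.

```python
def lfsr_step(sreg, taps=0b10111000, length=8):
    """Advance a right-shifting LFSR by one step (same logic as the Verilog one-liner)."""
    feedback = sreg & 1               # the bit about to be shifted out
    sreg >>= 1                        # right shift: the top bit becomes 0
    if feedback:
        sreg ^= taps                  # flip the bits at the tap positions
    return sreg & ((1 << length) - 1)


def lfsr_period(seed, taps=0b10111000, length=8):
    """Count the steps taken before the register returns to the seed value."""
    value = lfsr_step(seed, taps, length)
    steps = 1
    while value != seed:
        value = lfsr_step(value, taps, length)
        steps += 1
    return steps


second = lfsr_step(169)
third = lfsr_step(second)
print(second, third)         # 2nd and 3rd values from the worked example
print(lfsr_period(169))      # length of the repeating sequence
```

With the seed of 169, the model should reproduce the 2nd and 3rd values listed above and a sequence length of 255. A seed of zero is the one value to avoid: with no bits set, the XOR never fires and the register stays at zero.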

To create a starfield, we need a sequence that’s the same length as the number of pixels we’re drawing. If we iterate through our shift register every time we get to a new pixel, then each pixel will always be associated with the same number in the shift register. We could take a single bit and draw a star if it were true, but that would cover half the pixels on the screen with stars; instead, we only draw a star if a group of bits are all 1.

We’ll start with a 17-bit LFSR, as its longest sequence, 2^17-1, fits within 640x480. If we draw this within an area of 512x256, which has 2^17 pixels, the sequence is one value shorter than the area, so it restarts one pixel earlier each frame and the starfield moves left by one pixel every frame.
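The one-pixel shift is easy to confirm with a software model. The sketch below steps a 17-bit LFSR once per pixel over a 512x256 area for two frames and compares them. It uses the 17-bit taps from the top module and the all-ones default seed; render_frames is a hypothetical helper for this check, not part of the design.

```python
TAPS_17 = 0b10010000000000000   # 17-bit taps, as used in the top module
AREA = 512 * 256                # pixels in the starfield area (2^17)


def lfsr_step(sreg, taps, length):
    """Advance a right-shifting LFSR by one step."""
    feedback = sreg & 1
    sreg >>= 1
    if feedback:
        sreg ^= taps
    return sreg & ((1 << length) - 1)


def render_frames(frames, seed=0x1FFFF):
    """Return the LFSR value seen at each pixel for consecutive frames."""
    sreg = seed
    result = []
    for _ in range(frames):
        frame = []
        for _ in range(AREA):
            frame.append(sreg)                    # value shown at this pixel
            sreg = lfsr_step(sreg, TAPS_17, 17)   # advance once per pixel
        result.append(frame)
    return result


frame0, frame1 = render_frames(2)
# in raster order, each pixel of frame 1 shows what the next pixel showed in frame 0
print(frame1[:-1] == frame0[1:])
```

The comparison is in raster order, so the pixel at the start of a line takes its value from the end of the line above; on screen this reads as a steady leftward drift.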

Create an LFSR module [lfsr.sv]:

module lfsr #(
    parameter LEN=8,                  // shift register length
    parameter TAPS=8'b10111000,       // XOR taps
    parameter SEED={LEN{1'b1}}        // initial seed value
    ) (
    input  wire logic clk,            // clock
    input  wire logic rst,            // reset
    input  wire logic en,             // enable
    output      logic [LEN-1:0] sreg  // lfsr output
    );

    always_ff @(posedge clk) begin
        if (en) sreg <= {1'b0, sreg[LEN-1:1]} ^ (sreg[0] ? TAPS : {LEN{1'b0}});
        if (rst) sreg <= SEED;
    end
endmodule

We use a simple top module to drive it, based on the design we used in Exploring FPGA Graphics.

Building the Designs
In the FPGA Ad Astra section of the git repo, you’ll find the design files, a makefile for iCEBreaker, a Vivado project for Arty, and instructions for building the designs for both boards.

Shown below is the LFSR top module for iCE40:

module top_lfsr (
    input  wire logic clk_12m,      // 12 MHz clock
    input  wire logic btn_rst,      // reset button (active high)
    output      logic dvi_clk,      // DVI pixel clock
    output      logic dvi_hsync,    // DVI horizontal sync
    output      logic dvi_vsync,    // DVI vertical sync
    output      logic dvi_de,       // DVI data enable
    output      logic [3:0] dvi_r,  // 4-bit DVI red
    output      logic [3:0] dvi_g,  // 4-bit DVI green
    output      logic [3:0] dvi_b   // 4-bit DVI blue
    );

    // generate pixel clock
    logic clk_pix;
    logic clk_locked;
    clock_gen clock_640x480 (
       .clk(clk_12m),
       .rst(btn_rst),
       .clk_pix,
       .clk_locked
    );

    // display timings
    localparam CORDW = 10;  // screen coordinate width in bits
    logic [CORDW-1:0] sx, sy;
    logic de;
    display_timings timings_640x480 (
        .clk_pix,
        .rst(!clk_locked),  // wait for clock lock
        .sx,
        .sy,
        .hsync(dvi_hsync),
        .vsync(dvi_vsync),
        .de
    );

    logic sf_area;
    always_comb sf_area = (sx < 512 && sy < 256);

    // 17-bit LFSR
    logic [16:0] sf_reg;
    lfsr #(
        .LEN(17),
        .TAPS(17'b10010000000000000)
    ) lsfr_sf (
        .clk(clk_pix),
        .rst(!clk_locked),
        .en(sf_area && de),
        .sreg(sf_reg)
    );

    // adjust star density (~512 stars for AND 8 bits with 512x256)
    logic star;
    always_comb star = &{sf_reg[16:9]};

    // DVI clock output
    SB_IO #(
        .PIN_TYPE(6'b010000)
    ) dvi_clk_buf (
        .PACKAGE_PIN(dvi_clk),
        .CLOCK_ENABLE(1'b1),
        .OUTPUT_CLK(clk_pix),
        .D_OUT_0(1'b0),
        .D_OUT_1(1'b1)
    );

    // DVI output
    always_ff @(posedge clk_pix) begin
        dvi_de <= de;
        dvi_r <= (de && sf_area && star) ? sf_reg[3:0] : 4'h0;
        dvi_g <= (de && sf_area && star) ? sf_reg[3:0] : 4'h0;
        dvi_b <= (de && sf_area && star) ? sf_reg[3:0] : 4'h0;
    end
endmodule
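The "~512 stars" figure in the comment above can be checked by counting. This Python sketch walks one full sequence of the 17-bit LFSR and counts the values whose top eight bits are all 1, mirroring the reduction AND on sf_reg[16:9]. The function count_stars is an illustrative name of my own.

```python
def lfsr_step(sreg, taps, length):
    """Advance a right-shifting LFSR by one step."""
    feedback = sreg & 1
    sreg >>= 1
    if feedback:
        sreg ^= taps
    return sreg & ((1 << length) - 1)


def count_stars(taps=0b10010000000000000, length=17, and_bits=8):
    """Count values with the top and_bits bits all set over one full sequence."""
    seed = (1 << length) - 1
    top_mask = ((1 << and_bits) - 1) << (length - and_bits)   # e.g. bits 16:9
    sreg, stars = seed, 0
    while True:
        if (sreg & top_mask) == top_mask:
            stars += 1
        sreg = lfsr_step(sreg, taps, length)
        if sreg == seed:          # back at the start: sequence complete
            return stars


print(count_stars(and_bits=8))    # stars with 8 bits ANDed together
print(count_stars(and_bits=9))    # one more bit halves the density
```

Each extra bit in the AND halves the number of stars, so this is a simple way to pick a density before rebuilding the design.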

A Screen Full of Sky

If we use the maximum sequence of an LFSR, then our starfield is limited to a few sizes. To fill the screen, we choose an LFSR that produces a sequence longer than our number of pixels, then restart it when we reach the end of the screen. There are two ways to do this:

  1. Find the LFSR value at the point we want to restart
  2. Use a separate counter

The first option is extremely efficient when it comes to logic. As each value appears only once, it uniquely describes a position in the sequence. Historically, LFSRs were used as counters, including on FPGAs. Xilinx has a nice application note describing this: XAPP210.

The second option requires separate counter logic, but on a contemporary FPGA, the cost is minimal. The advantage of this approach is we can easily adjust the direction and speed of the starfield by counting a little more or a little less. This is the approach we’ll use in this project.

We’re going to want a few starfields, so let’s create a dedicated module [starfield.sv]:

module starfield #(
    parameter H=800,
    parameter V=525,
    parameter INC=-1,
    parameter SEED=21'h1FFFFF,
    parameter MASK=21'hFFF
    ) (
    input  wire logic clk,           // clock
    input  wire logic en,            // enable
    input  wire logic rst,           // reset
    output      logic sf_on,         // star on
    output      logic [7:0] sf_star  // star brightness
    );

    localparam RST_CNT = H * V + INC - 1;  // counter starts at zero, so subtract 1
    logic [20:0] sf_reg, sf_cnt;

    always_ff @(posedge clk) begin
        if (en) begin
            sf_cnt <= sf_cnt + 1;
            if (sf_cnt == RST_CNT) sf_cnt <= 0;
        end
        if (rst) sf_cnt <= 0;
    end

    // select some bits to form stars
    always_comb begin
        sf_on = &{sf_reg | MASK};
        sf_star = sf_reg[7:0];
    end

    lfsr #(
        .LEN(21),
        .TAPS(21'b101000000000000000000),
        .SEED(SEED)
        ) lsfr_sf (
        .clk,
        .rst(sf_cnt == 21'b0),
        .en,
        .sreg(sf_reg)
    );
endmodule

The starfield module defaults to a 21-bit LFSR, which has a maximum sequence of just over two million. The module takes the screen dimensions as parameters H and V: we’re using the full screen, including the blanking interval, so the starfield doesn’t immediately repeat. The MASK allows us to control the stellar density: the more 1s in the mask, the more stars there will be. Finally, the module outputs an 8-bit value for star brightness, which can be used to create a more varied starfield.
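To get a feel for the MASK parameter, we can estimate the star count in software. A star is on only when every bit outside the mask is 1, so each 0 in the mask halves the density. The Python sketch below works this out for the module defaults; star_fraction and approx_stars are hypothetical helpers, and the result is an average, since a partial run of the sequence won't be perfectly even.

```python
def star_fraction(mask, length=21):
    """Fraction of LFSR values that light a star: bits outside the mask must all be 1."""
    ones = bin(mask & ((1 << length) - 1)).count("1")
    zero_bits = length - ones
    return 1 / (1 << zero_bits)


def approx_stars(h=800, v=525, inc=-1, mask=0xFFF):
    """Rough star count for one pass of the counter (H * V + INC pixels)."""
    return (h * v + inc) * star_fraction(mask)


print(round(approx_stars(mask=0xFFF)))   # default mask: 9 bits must be 1
print(round(approx_stars(mask=0x7FF)))   # one fewer 1 in the mask: about half as many
```

These totals include the blanking interval, so fewer stars are visible on screen. The counter repeats every H * V + INC pixels, so with INC=-1 the pattern restarts one pixel early each frame, and a larger negative INC scrolls faster.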

Into Space

Using our module, we can create multiple starfields at different speeds and densities to give that real in-space feeling. Our example top module has three starfields:

Rebuild your project with the new starfield and top modules. Try experimenting with the INC and MASK parameters to create different speeds and densities.

Greetings, World!

Our starfields make an ideal backdrop for a greetings demo. Sprites are not ideal for large quantities of text, but do make for a flexible way to animate short messages.

Bitmap Font

If we’re going to display text messages, we need a font. I’ve experimented with a few simple bitmap fonts in the past, but Unscii is one of the best. It’s available in 8x8 and 8x16/16x16 pixels, with thousands of glyphs for many languages as well as glyphs for ASCII art. Plus, the GNU Unifont hexdump version is trivial to convert to $readmemh format for use with Verilog.

The full list of glyphs is too large to be held in internal FPGA memory. For the purposes of this demo, I’ve created two subsets of the Unscii font: one for upper-case basic Latin (including punctuation and numbers), and one for Hiragana (without marks):

Hex Glyphs

If you look at the entry for ‘F’ in the Latin memory file you won’t find an 8x8 array of 1s and 0s:
7E 60 60 7C 60 60 60 00 // U+0046 (F)

Each eight-pixel line of the glyph is represented by two hex digits, so 0x7E is the first line of pixels, 0x60 the second etc. There are two ways these lines could be drawn: most significant bit (MSB) first, or least significant bit (LSB) first. For our ‘F’ glyph the results of MSB and LSB first are shown below.

 ######         ######
 ##                 ##
 ##                 ##
 #####           #####
 ##                 ##
 ##                 ##
 ##                 ##

For this font we want to draw the MSB first, but other font data might require the LSB first; we provide support for both options in our module (discussed below).
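If you want to see the difference for yourself, this short Python sketch prints the ‘F’ glyph data both ways. The render_line function is an illustrative helper of my own, not part of the sprite design.

```python
GLYPH_F = [0x7E, 0x60, 0x60, 0x7C, 0x60, 0x60, 0x60, 0x00]  # U+0046 (F)


def render_line(value, width=8, msb_first=True):
    """Turn one line of glyph data into text: '#' for 1 and ' ' for 0."""
    bits = range(width - 1, -1, -1) if msb_first else range(width)
    return "".join("#" if (value >> bit) & 1 else " " for bit in bits)


for line in GLYPH_F:
    # MSB-first on the left, LSB-first on the right
    print(render_line(line, msb_first=True), "|", render_line(line, msb_first=False))
```

Drawing a glyph both ways like this is a quick test when you try an unfamiliar font: only one orientation will look right.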

Choosing Your Own Characters
You can easily create your own font version with different characters: check out the hex source on the Unscii site. Use VS Code Column Selection Mode to quickly turn Unifont hex into $readmemh format. Just be aware you can’t mix different glyph sizes with our sprite design, and watch out for memory usage: the Hiragana file uses 6 of the iCEBreaker’s 30 BRAMs.
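If you'd rather script the conversion than edit by hand, something like the following Python sketch may do. It assumes each input line has the Unifont-style form of a hex code point, a colon, then the glyph data; check this against the hex file you download, as I haven't confirmed the exact layout here. The function name hex_to_readmemh is my own.

```python
def hex_to_readmemh(hex_line, bytes_per_row=1):
    """Convert 'CODE:HEXDATA' into space-separated glyph lines plus a comment."""
    code, data = hex_line.strip().split(":")
    step = 2 * bytes_per_row                        # hex digits per glyph line
    rows = [data[i:i + step] for i in range(0, len(data), step)]
    code_point = int(code, 16)
    return f"{' '.join(rows)} // U+{code_point:04X} ({chr(code_point)})"


print(hex_to_readmemh("0046:7E60607C60606000"))
```

For 16-pixel-wide glyphs, set bytes_per_row=2 so each memory entry holds a full line of the glyph.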

New Sprite Features

We already have a solid sprite design from the previous part, but there are three improvements we’ll make for working with fonts.

Shared Memory Bus

In Hardware Sprites, each sprite had exclusive access to a memory instance. Exclusivity works well for simple sprites with fixed graphics, but doesn’t make it easy to change or share sprite graphics. Instead, we’ll load a set of font glyphs into memory, then share them amongst all the sprite instances.

Sharing requires arbitration between the different sprite instances: only one sprite can read from the memory at a time. We control access by adding a dma_avail signal to the sprite module: the sprite waits for this signal before reading from memory. To avoid potential clashes, we read the required data for each sprite in the blanking interval, so each sprite can then draw at any position on the following line.

An Entire Line

Our fonts are monochrome: each pixel is either 0 or 1, so we could read one bit at a time over a one-bit bus. However, as we’re reading the sprite during the blanking interval there’s no need to read a pixel at a time; it’s more efficient to read a whole line of 8 or 16 pixels (for the Hiragana glyphs) at a time. By reading a whole sprite line in a single clock, we make very light use of the memory bus. We will also change the memory layout in the top module (covered later), so each glyph consists of 8 or 16 entries of 8 or 16 bits.

Most or Least Significant?

As we discussed in Hex Glyphs, above, a font may be drawn most or least significant bit first. Because we load an entire glyph line into our sprite at a time, it’s easy to reverse the bits with a for loop if required:

    if (state == READ_MEM) begin
        if (LSB) begin
            spr_line <= data_in;  // NB. Assumes read takes one clock cycle
        end else begin  // reverse if MSB is left-most pixel
            for (i=0; i<WIDTH; i=i+1) spr_line[i] <= data_in[(WIDTH-1)-i];
        end
    end

On the surface, for loops in Verilog seem the same as those in software; this is misleading.

A for loop in Verilog duplicates logic, so the above bit reversal is actually equivalent to (if WIDTH=8):

    spr_line[0] <= data_in[7];
    spr_line[1] <= data_in[6];
    spr_line[2] <= data_in[5];
    spr_line[3] <= data_in[4];
    spr_line[4] <= data_in[3];
    spr_line[5] <= data_in[2];
    spr_line[6] <= data_in[1];
    spr_line[7] <= data_in[0];

All eight bits are read in one cycle. Using the for loop doesn’t change the design, but makes writing it much more compact and less error-prone. We’ll make further use of for loops shortly, to handle multiple sprites.
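For comparison, here is the same reversal as a software loop in Python. The result is identical, but the software version really does visit one bit after another, whereas the hardware wires all the bits across at once. The function name reverse_bits is my own.

```python
def reverse_bits(data_in, width=8):
    """Software equivalent of the unrolled loop: spr_line[i] = data_in[width-1-i]."""
    spr_line = 0
    for i in range(width):
        bit = (data_in >> (width - 1 - i)) & 1   # read from the far end of data_in
        spr_line |= bit << i                     # write to position i of spr_line
    return spr_line


print(f"{reverse_bits(0x60):08b}")   # second line of the 'F' glyph, reversed
```

After reversal, bit 0 of the result holds the left-most pixel, which is the order the sprite draws in.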

New Sprite Module

Our new sprite module incorporating these changes [src]:

module sprite #(
    parameter WIDTH=8,         // graphic width in pixels
    parameter HEIGHT=8,        // graphic height in pixels
    parameter SCALE_X=1,       // sprite width scale-factor
    parameter SCALE_Y=1,       // sprite height scale-factor
    parameter LSB=1,           // first pixel in LSB
    parameter CORDW=10,        // width of screen coordinates
    parameter H_RES_FULL=800,  // horizontal screen resolution inc. blanking
    parameter ADDRW=9          // width of graphic memory address bus
    ) (
    input  wire logic clk,                  // clock
    input  wire logic rst,                  // reset
    input  wire logic start,                // start control
    input  wire logic dma_avail,            // memory access control
    input  wire logic [CORDW-1:0] sx,       // horizontal screen position
    input  wire logic [CORDW-1:0] sprx,     // horizontal sprite position
    input  wire logic [WIDTH-1:0] data_in,  // data from external memory
    output      logic [ADDRW-1:0] pos,      // sprite line position
    output      logic pix,                  // pixel colour to draw (0 or 1)
    output      logic draw,                 // signal sprite is drawing
    output      logic done                  // signal sprite drawing is complete
    );

    logic [WIDTH-1:0] spr_line;  // local copy of sprite line

    // position within sprite
    logic [$clog2(WIDTH)-1:0]  ox;
    logic [$clog2(HEIGHT)-1:0] oy;

    // scale counters
    logic [$clog2(SCALE_X)-1:0] cnt_x;
    logic [$clog2(SCALE_Y)-1:0] cnt_y;

    enum {
        IDLE,       // awaiting start signal
        START,      // prepare for new sprite drawing
        AWAIT_DMA,  // await access to memory
        READ_MEM,   // read line of sprite from memory
        AWAIT_POS,  // await horizontal position
        DRAW,       // draw pixel
        NEXT_LINE,  // prepare for next sprite line
        DONE        // set done signal
    } state, state_next;

    integer i;  // for bit reversal in READ_MEM

    always_ff @(posedge clk) begin
        state <= state_next;  // advance to next state

        if (state == START) begin
            done <= 0;
            oy <= 0;
            cnt_y <= 0;
            pos <= 0;
        end

        if (state == READ_MEM) begin
            if (LSB) begin
                spr_line <= data_in;  // NB. Assumes read takes one clock cycle
            end else begin  // reverse if MSB is left-most pixel
                for (i=0; i<WIDTH; i=i+1) spr_line[i] <= data_in[(WIDTH-1)-i];
            end
         end

        if (state == AWAIT_POS) begin
            ox <= 0;
            cnt_x <= 0;
        end

        if (state == DRAW) begin
            if (SCALE_X <= 1 || cnt_x == SCALE_X-1) begin
                ox <= ox + 1;
                cnt_x <= 0;
            end else begin
                cnt_x <= cnt_x + 1;
            end
        end

        if (state == NEXT_LINE) begin
            if (SCALE_Y <= 1 || cnt_y == SCALE_Y-1) begin
                oy <= oy + 1;
                cnt_y <= 0;
                pos <= pos + 1;
            end else begin
                cnt_y <= cnt_y + 1;
            end
        end

        if (state == DONE) begin
            done <= 1;
        end

        if (rst) begin
            state <= IDLE;
            ox <= 0;
            oy <= 0;
            cnt_x <= 0;
            cnt_y <= 0;
            spr_line <= 0;
            pos <= 0;
            done <= 0;
        end
    end

    // output current pixel colour when drawing
    always_comb begin
        pix = (state == DRAW) ? spr_line[ox] : 0;
    end

    // create status signals and correct horizontal position
    logic last_pixel, load_line, last_line;
    logic [CORDW-1:0] sprx_cor;
    always_comb begin
        last_pixel = (ox == WIDTH-1 && cnt_x == SCALE_X-1);
        load_line  = (cnt_y == SCALE_Y-1);
        last_line  = (oy == HEIGHT-1 && cnt_y == SCALE_Y-1);
        draw = (state == DRAW);

        // BRAM adds an extra cycle of latency
        case (sprx)
            0: sprx_cor = H_RES_FULL - 2;
            1: sprx_cor = H_RES_FULL - 1;
            default: sprx_cor = sprx - 2;
        endcase
    end

    // determine next state
    always_comb begin
        case(state)
            IDLE:       state_next = start ? START : IDLE;
            START:      state_next = AWAIT_DMA;
            AWAIT_DMA:  state_next = dma_avail ? READ_MEM : AWAIT_DMA;
            READ_MEM:   state_next = AWAIT_POS;
            AWAIT_POS:  state_next = (sx == sprx_cor) ? DRAW : AWAIT_POS;
            DRAW:       state_next = !last_pixel ? DRAW : (!last_line ? NEXT_LINE : DONE);
            NEXT_LINE:  state_next = load_line ? AWAIT_DMA : AWAIT_POS;
            DONE:       state_next = IDLE;
            default:    state_next = IDLE;
        endcase
    end
endmodule
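Two details of the sprite module are easier to see in a software model: the corrected horizontal position, which starts the sprite two pixels early and wraps around the end of the line, and the scale counter, which repeats each pixel SCALE_X times. This Python sketch is a simplified model with names of my own (sprx_corrected, draw_line); it ignores the state machine and memory timing.

```python
def sprx_corrected(sprx, h_res_full=800):
    """Start two pixels early, wrapping round the end of the line (as the case statement does)."""
    return (sprx - 2) % h_res_full


def draw_line(spr_line, width=8, scale_x=1):
    """Pixels output in the DRAW state: bit 0 first, each repeated scale_x times."""
    pixels = []
    for ox in range(width):
        pixels.extend([(spr_line >> ox) & 1] * scale_x)
    return pixels


print(sprx_corrected(0), sprx_corrected(1), sprx_corrected(192))
print(draw_line(0b00000110, scale_x=2))
```

The modulo in sprx_corrected does in one step what the hardware does with two special cases and a default; the case statement avoids needing a divider.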

F in Space!

We have an animated starfield, we have a font-friendly sprite module, let’s draw an ‘F’ in space.

To avoid confusion, we’re going to use code point to refer to the numerical representation of a character and glyph to refer to the graphical representation.

Capital F has the code point U+0046. However, our Latin font file has 64 glyphs covering code points U+0020 - U+005F, so we need to subtract 0x20 (32 decimal) from the code point to load the correct glyph.
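The address calculation is simple enough to check in a few lines of Python. This sketch assumes the 8x8 Latin font described above, with one memory entry per glyph line; glyph_addr is an illustrative name of my own.

```python
FONT_HEIGHT = 8     # lines (memory entries) per glyph
CP_START = 0x20     # first code point in the Latin font file
CP_END = 0x5F       # last code point in the Latin font file


def glyph_addr(char, line=0):
    """Font memory address of the given line of a character's glyph."""
    cp = ord(char)
    if not CP_START <= cp <= CP_END:
        raise ValueError(f"U+{cp:04X} is not in the font")
    return (cp - CP_START) * FONT_HEIGHT + line


print(hex(glyph_addr("F")))      # address of the first line of 'F'
print(glyph_addr("_", line=7))   # last line of the last glyph
```

The last address comes out as one less than 64 glyphs x 8 lines, which is a handy check that the ROM depth is right.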

The new top module combines the starfield design from earlier in this post, with a single sprite:

Xilinx version shown below:

module top_space_f (
    input  wire logic clk_100m,     // 100 MHz clock
    input  wire logic btn_rst,      // reset button (active low)
    output      logic vga_hsync,    // horizontal sync
    output      logic vga_vsync,    // vertical sync
    output      logic [3:0] vga_r,  // 4-bit VGA red
    output      logic [3:0] vga_g,  // 4-bit VGA green
    output      logic [3:0] vga_b   // 4-bit VGA blue
    );

    // generate pixel clock
    logic clk_pix;
    logic clk_locked;
    clock_gen clock_640x480 (
       .clk(clk_100m),
       .rst(!btn_rst),  // reset button is active low
       .clk_pix,
       .clk_locked
    );

    // display timings
    localparam CORDW = 10;  // screen coordinate width in bits
    logic [CORDW-1:0] sx, sy;
    logic de;
    display_timings timings_640x480 (
        .clk_pix,
        .rst(!clk_locked),  // wait for clock lock
        .sx,
        .sy,
        .hsync(vga_hsync),
        .vsync(vga_vsync),
        .de
    );

    // size of screen with and without blanking
    localparam H_RES_FULL = 800;
    localparam V_RES_FULL = 525;
    localparam H_RES = 640;
    localparam V_RES = 480;

    // font glyph ROM
    localparam FONT_WIDTH  = 8;   // width in pixels (also ROM width)
    localparam FONT_HEIGHT = 8;   // height in pixels
    localparam FONT_GLYPHS = 64;  // number of glyphs
    localparam F_ROM_DEPTH = FONT_GLYPHS * FONT_HEIGHT;
    localparam FONT_FILE   = "font_unscii_8x8_latin_uc.mem";

    logic [$clog2(F_ROM_DEPTH)-1:0] font_rom_addr;
    logic [FONT_WIDTH-1:0] font_rom_data;  // line of glyph pixels

    rom_sync #(
        .WIDTH(FONT_WIDTH),
        .DEPTH(F_ROM_DEPTH),
        .INIT_F(FONT_FILE)
    ) font_rom (
        .clk(clk_pix),
        .addr(font_rom_addr),
        .data(font_rom_data)
    );

    // sprite
    localparam SPR_SCALE_X = 32;  // enlarge sprite width by this factor
    localparam SPR_SCALE_Y = 32;  // enlarge sprite height by this factor

    // horizontal and vertical screen position of letter
    localparam SPR_X = 192;
    localparam SPR_Y = 112;

    // start sprite in blanking of line before first line drawn
    logic [CORDW-1:0] spr_y_cor;  // corrected for wrapping
    logic spr_start;
    always_comb begin
        spr_y_cor = (SPR_Y == 0) ? V_RES_FULL - 1 : SPR_Y - 1;
        spr_start = (sy == spr_y_cor && sx == 0);
    end

    // subtract 0x20 from code points as font starts at U+0020
    localparam SPR_GLYPH_ADDR = FONT_HEIGHT * 'h26;  // F U+0046

    // font ROM address
    logic [$clog2(FONT_HEIGHT)-1:0] spr_glyph_line;
    logic spr_fdma;  // font ROM DMA slot
    always_comb begin
        font_rom_addr = 0;
        spr_fdma = (sx == H_RES);  // load glyph line at start of blanking
        font_rom_addr = (spr_fdma) ? SPR_GLYPH_ADDR + spr_glyph_line : 0;
    end

    logic spr_pix;  // sprite pixel
    sprite #(
        .WIDTH(FONT_WIDTH),
        .HEIGHT(FONT_HEIGHT),
        .SCALE_X(SPR_SCALE_X),
        .SCALE_Y(SPR_SCALE_Y),
        .LSB(0),
        .CORDW(CORDW),
        .H_RES_FULL(H_RES_FULL),
        .ADDRW($clog2(FONT_HEIGHT))
        ) spr (
        .clk(clk_pix),
        .rst(!clk_locked),
        .start(spr_start),
        .dma_avail(spr_fdma),
        .sx,
        .sprx(SPR_X),
        .data_in(font_rom_data),
        .pos(spr_glyph_line),
        .pix(spr_pix),
        .draw(),
        .done()
    );

    // starfields
    logic sf1_on, sf2_on, sf3_on;
    logic [7:0] sf1_star, sf2_star, sf3_star;

    starfield #(.INC(-1), .SEED(21'h9A9A9)) sf1 (
        .clk(clk_pix),
        .en(1'b1),
        .rst(!clk_locked),
        .sf_on(sf1_on),
        .sf_star(sf1_star)
    );

    starfield #(.INC(-2), .SEED(21'hA9A9A)) sf2 (
        .clk(clk_pix),
        .en(1'b1),
        .rst(!clk_locked),
        .sf_on(sf2_on),
        .sf_star(sf2_star)
    );

    starfield #(.INC(-4), .MASK(21'h7FF)) sf3 (
        .clk(clk_pix),
        .en(1'b1),
        .rst(!clk_locked),
        .sf_on(sf3_on),
        .sf_star(sf3_star)
    );

    // sprite colour & star brightness
    logic [3:0] red_spr, green_spr, blue_spr, starlight;
    always_comb begin
        {red_spr, green_spr, blue_spr} = (spr_pix) ? 12'hFC0 : 12'h000;
        starlight = (sf1_on) ? sf1_star[7:4] :
                    (sf2_on) ? sf2_star[7:4] :
                    (sf3_on) ? sf3_star[7:4] : 4'h0;
    end

    // VGA output
    always_ff @(posedge clk_pix) begin
        vga_r <= de ? spr_pix ? red_spr   : starlight : 4'h0;
        vga_g <= de ? spr_pix ? green_spr : starlight : 4'h0;
        vga_b <= de ? spr_pix ? blue_spr  : starlight : 4'h0;
    end
endmodule

Hello - こんにちは

Our ‘F’ in space doesn’t take advantage of the new functionality of the sprite module. Let’s update our top module to say “Hello” using five sprites; I’ve done this for English and Japanese:

Try creating your own five-character message. Check the fonts for which characters are available, or create your own font-variant to write in another language.

These modules make extensive use of for loops to avoid duplication. To create multiple instances of a module we need to use generate rather than a normal for loop.

NB. The iCE40 designs don’t currently use generate as I’ve had an issue with the spr_glyph_line signal that prevents it from working.

Controlling the Message

To display a custom message we can store a message as code points: greet.mem.

A single two-line message looks like this.

//   FPGA
// AD ASTRA
20 20 46 50 47 41 20 20
41 44 20 41 53 54 52 41
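Writing hex code points by hand gets tedious, so here is a small Python sketch that generates a line of the message file. It centres the text in eight characters, converts it to upper case, and rejects anything outside the Latin font's range. The function name encode_line is my own.

```python
def encode_line(text, width=8):
    """Return one line of hex code points for the greeting file."""
    text = text.upper().center(width)       # pad with spaces to the sprite count
    if len(text) != width:
        raise ValueError(f"'{text}' is longer than {width} characters")
    for ch in text:
        if not 0x20 <= ord(ch) <= 0x5F:     # range covered by the Latin font
            raise ValueError(f"'{ch}' is not in the font")
    return " ".join(f"{ord(ch):02X}" for ch in text)


print(encode_line("FPGA"))
print(encode_line("AD ASTRA"))
```

The two printed lines should match the hex in the example message above.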

The greeting ROM loads the messages and looks similar to the font ROM:

    // greeting message ROM
    localparam GREET_MSGS   = 32;    // 32 messages
    localparam GREET_LENGTH = 16;    // each containing 16 code points
    localparam G_ROM_WIDTH  = $clog2('h5F);  // highest code point is U+005F
    localparam G_ROM_DEPTH  = GREET_MSGS * GREET_LENGTH;
    localparam GREET_FILE   = "greet.mem";

    logic [$clog2(G_ROM_DEPTH)-1:0] greet_rom_addr;
    logic [G_ROM_WIDTH-1:0] greet_rom_data;  // code point

    rom_sync #(
        .WIDTH(G_ROM_WIDTH),
        .DEPTH(G_ROM_DEPTH),
        .INIT_F(GREET_FILE)
    ) greet_rom (
        .clk(clk_pix),
        .addr(greet_rom_addr),
        .data(greet_rom_data)
    );

To express ourselves better, I’ve expanded the number of sprites to eight, then reused them to form a second line of text. This gives us 16 characters to work with per message.

Now we have our message in memory, the process for calculating the font ROM address is more complex:

  1. Set the greeting ROM address to the chosen message (at start of blanking)
  2. Save the code points of the characters from the ROM (one cycle later)
  3. Use the code points to calculate the glyph address (two cycles later)

We use for loops and offset the times relative to the start of the blanking interval:

    // greeting ROM address
    logic [$clog2(G_ROM_DEPTH)-1:0] msg_start;
    always_comb begin
        greet_rom_addr = 0;
        msg_start = greeting * GREET_LENGTH;  // calculate start of message
        for (i = 0; i < SPR_CNT; i = i + 1) begin
            if (sx == H_RES+i) greet_rom_addr = (sy < LINE2) ? (msg_start+i) : (msg_start+i+8);
        end
    end

    // load code point from greeting ROM
    logic [G_ROM_WIDTH-1:0] spr_cp [SPR_CNT];
    always_ff @(posedge clk_pix) begin
        for (i = 0; i < SPR_CNT; i = i + 1) begin
            if (sx == H_RES+i+1) spr_cp[i] <= greet_rom_data;  // wait one cycle
        end
    end

    // font ROM address
    logic [$clog2(F_ROM_DEPTH)-1:0] spr_glyph_addr [SPR_CNT];
    logic [$clog2(FONT_HEIGHT)-1:0] spr_glyph_line [SPR_CNT];
    logic spr_fdma [SPR_CNT];  // font ROM DMA slots
    always_comb begin
        font_rom_addr = 0;
        for (i = 0; i < SPR_CNT; i = i + 1) begin
            spr_fdma[i] = (sx == H_RES+i+2);  // wait two cycles
            spr_glyph_addr[i] = (spr_cp[i] - CP_START) * FONT_HEIGHT;
            if (spr_fdma[i]) font_rom_addr = spr_glyph_addr[i] + spr_glyph_line[i];
        end
    end
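It can help to see the timing laid out as a schedule. This Python sketch lists what happens at each horizontal position for eight sprites, following the offsets in the code above; blanking_schedule is a hypothetical helper, and the descriptions are mine.

```python
H_RES = 640     # first pixel of horizontal blanking
SPR_CNT = 8     # number of sprites


def blanking_schedule(spr_cnt=SPR_CNT, h_res=H_RES):
    """Map each sx value to the actions taken at that position."""
    schedule = {}
    for i in range(spr_cnt):
        schedule.setdefault(h_res + i, []).append(f"greet ROM address for sprite {i}")
        schedule.setdefault(h_res + i + 1, []).append(f"save code point for sprite {i}")
        schedule.setdefault(h_res + i + 2, []).append(f"font ROM read for sprite {i}")
    return schedule


for sx, actions in sorted(blanking_schedule().items()):
    print(sx, actions)
```

The three steps overlap like a pipeline: while one sprite reads the font ROM, the next is saving its code point and the one after is addressing the greeting ROM. Each ROM still only serves one sprite per cycle, and all eight sprites are loaded within the first ten pixels of blanking.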

Greetings Demo v1

Using the greetings logic, I’ve created a demo to greet a few of the open-source FPGA projects we love; apologies to everyone we missed:

We cycle through the greetings using a frame counter; I’ve chosen 80 frames (about 1.33 seconds at 60 frames per second). You can adjust this with the MSG_CHG parameter.

Copperbars

The text feels a bit flat in plain gold: what we need are copper bars! While we don’t have a co-processor (yet), we can create the effect using a simple counter. I’ve gone for a sky and earth colour scheme, but you can easily change the colours to your own taste:

    // font colours
    localparam COLR_A   = 'h125;  // initial colour A
    localparam COLR_B   = 'h421;  // initial colour B
    localparam SLIN_1A  = 'd150;  // 1st line of colour A
    localparam SLIN_1B  = 'd178;  // 1st line of colour B
    localparam SLIN_2A  = 'd250;  // 2nd line of colour A
    localparam SLIN_2B  = 'd278;  // 2nd line of colour B
    localparam LINE_INC = 3;      // lines of each colour

    logic [11:0] font_colr;  // 12 bit colour (4-bit per channel)
    logic [$clog2(LINE_INC)-1:0] cnt_line;
    always_ff @(posedge clk_pix) begin
        if ((sy == SLIN_1A || sy == SLIN_2A) && sx == 0) begin
            cnt_line <= 0;
            font_colr <= COLR_A;
        end else if ((sy == SLIN_1B || sy == SLIN_2B) && sx == 0) begin
            cnt_line <= 0;
            font_colr <= COLR_B;
        end else if (sx == 0) begin
            cnt_line <= cnt_line + 1;
            if (cnt_line == LINE_INC-1) begin
                cnt_line <= 0;
                font_colr <= font_colr + 'h111;
            end
        end
    end
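Before changing the colours, it's worth previewing the gradient in software. This Python sketch follows the counter logic line by line using the parameters above and includes the 12-bit wrap of the colour register; line_colours is an illustrative name of my own.

```python
COLR_A, COLR_B = 0x125, 0x421
SLIN_1A, SLIN_1B, SLIN_2A, SLIN_2B = 150, 178, 250, 278
LINE_INC = 3


def line_colours(first_line=SLIN_1A, last_line=300):
    """Font colour on each scanline, mirroring the hardware counter."""
    colours = {}
    cnt_line, font_colr = 0, 0
    for sy in range(first_line, last_line + 1):
        if sy in (SLIN_1A, SLIN_2A):
            cnt_line, font_colr = 0, COLR_A
        elif sy in (SLIN_1B, SLIN_2B):
            cnt_line, font_colr = 0, COLR_B
        elif cnt_line == LINE_INC - 1:
            cnt_line = 0
            font_colr = (font_colr + 0x111) & 0xFFF   # 12-bit register wraps
        else:
            cnt_line += 1
        colours[sy] = font_colr
    return colours


colours = line_colours()
for sy in (150, 153, 177, 178, 181):
    print(sy, f"{colours[sy]:03X}")
```

Adding 0x111 brightens all three channels by one step, so keep an eye on the brightest channel: if it passes 0xF it carries into the next channel and the colours jump.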

Our final greeting design:

FPGA Ad Astra

Explore

I hope you enjoyed this instalment of Exploring FPGA Graphics, but nothing beats creating your own designs. Here are a few suggestions to get you started:

  • Change starfield speed and direction with buttons on your FPGA board (see FPGA Pong)
  • Write your own greeting messages
  • Cycle the greetings colours over time
  • Animate the text from different directions, so it slides in from all sides of the screen

If you create a cool demo, drop me a message @WillFlux, and I’ll add it to the blog.

Next Time

In the next part, we’ll introduce bitmaps and learn about framebuffers.

©2020 Will Green, Project F