10 June 2020

FPGA Ad Astra

Welcome back to Exploring FPGA Graphics. In the previous part we created a version of the classic game, Pong. In this third part, we take inspiration from an even earlier game, Computer Space, and work with sprites, bitmap fonts, and starfields.

In this series, we explore graphics at the hardware level and get a feel for the power of FPGAs. We start by learning how displays work, before racing the beam with Pong, drawing starfields and sprites, simulating life with bitmaps, drawing lines and triangles, and finally creating simple 3D models. I’ll be writing and revising this series throughout 2020. New to the series? Start with Exploring FPGA Graphics.

Updated 2020-08-17. Get in touch with @WillFlux or open an issue on GitHub.

Series Outline

  • Exploring FPGA Graphics - how displays work and simple animated colour graphics
  • FPGA Pong - race the beam to create the arcade classic
  • FPGA Ad Astra (this post) - animated starfields, hardware sprites, and bitmap fonts
  • Life on Screen - bitmaps and Conway’s Game of Life
  • Hard Lines - 2D drawing (planned)
  • More to follow

Requirements

For this series, you need an FPGA board with video output. We’ll be working at 640x480, so pretty much any video output will do. You should be comfortable with programming your FPGA board and reasonably familiar with Verilog.

We’ll be demoing with two boards: iCEBreaker and Arty.

Follow the source README to quickly build a project for either of these boards.

Source

All the Verilog designs featured in this series are available in the Exploring FPGAs repo and source links are included throughout the blog. The designs are open source under the permissive MIT licence, but this blog is subject to normal copyright restrictions.

Computer Space

Computer Space was the first video arcade game: it features a simple backdrop of stars over which the player’s rocket battles two flying saucers. The backdrop may have been simple, but this game was released in 1971, when a “cheap” computer, such as the Data General Nova, cost around $8,000 (c. $40,000 in 2020). Unable to find anything both fast and cheap, developers Nolan Bushnell and Ted Dabney created their own custom hardware instead. It’s strangely hard to find details of the Computer Space logic design on the Internet, but TTL 7400s were used.

Quick Aside: Computer Space Cabinet
Computer Space had an amazing fibreglass cabinet. You can see more photos at Marvin’s Marvelous Mechanical Museum (courtesy of the Wayback Machine).

Linear Feedback Shift Register

I don’t know how Computer Space created its starfield, but we’re going to use a linear feedback shift register (henceforth LFSR). Rather like field programmable gate arrays themselves, their name makes LFSRs sound arcane and inscrutable; happily for us, this is not the case.

An LFSR can create a pseudorandom number sequence in which every number appears just once. For example, an 8-bit LFSR can generate all the numbers from 1 to 255 in a repeatable sequence that seems random. The logic for an LFSR can be written in a single line of Verilog:

sreg <= {1'b0, sreg[7:1]} ^ (sreg[0] ? 8'b10111000 : 8'b0);

The first part of the statement right-shifts the shift register by one bit. In the second part, we check the value of the bit we shifted out: this is the feedback. If the feedback is true, we XOR the whole shift register with a magic pattern of bits known as taps. Values at the bit positions of the taps are flipped by the XOR. The initial value of an LFSR is known as the seed.

For example, an 8-bit LFSR with a seed of 169:

  • 10101001 - seed value (169)
  • 01010100 - after right shift (bit shifted out was 1)
  • 01010100 XOR 10111000 - XOR with taps
  • 11101100 - 2nd value (236)
  • 01110110 - after right shift (bit shifted out was 0)
  • 01110110 XOR 00000000 - XOR with 0
  • 01110110 - 3rd value (118)

For an 8-bit LFSR, we set tap bits 8, 6, 5, and 4. To find taps for other lengths, you can refer to the feedback polynomials on Wikipedia.
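The worked example and tap positions above can be checked with a small software model. This is a sketch in Python rather than Verilog; `lfsr_step` and `TAPS_8` are names made up for this illustration, and the function simply mirrors the one-line Verilog statement.

```python
def lfsr_step(sreg: int, taps: int) -> int:
    """One step of a right-shifting LFSR, mirroring the Verilog one-liner."""
    feedback = sreg & 1          # the bit about to be shifted out
    sreg >>= 1                   # right shift; a zero enters at the top
    if feedback:
        sreg ^= taps             # flip the bits at the tap positions
    return sreg

TAPS_8 = 0b10111000              # taps at bits 8, 6, 5, and 4

# reproduce the worked example, starting from a seed of 169
second = lfsr_step(169, TAPS_8)
third = lfsr_step(second, TAPS_8)
print(second, third)

# walk the sequence until the seed comes round again
seen = []
value = 169
while True:
    seen.append(value)
    value = lfsr_step(value, TAPS_8)
    if value == 169:
        break
print(len(seen), len(set(seen)))
```

The second print shows the sequence length and the number of distinct values in it; if the two match, no value repeats before the seed returns.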

To create a starfield, we need a sequence that’s the same length as the number of pixels we’re drawing. If we iterate through our shift register every time we get to a new pixel, then each pixel will always be associated with the same number in the shift register. We could take a single bit and draw a star if it were true, but that would cover half the pixels on the screen with stars; instead, we only draw a star if a group of bits are all 1.

We’ll start with a 17-bit LFSR, as its longest sequence, 2^17-1, fits within 640x480. If we draw this within an area of 512x256, which has 2^17 pixels, the sequence is one value shorter than the area, so the starfield will move left by one pixel every frame.
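To check the numbers for the 17-bit case, here is a Python model (a sketch, not part of the original design). It assumes the tap pattern with bits 17 and 14 set and an all-ones seed, and counts both the sequence length and how many values have their top eight bits set, which is one way of picking stars.

```python
def lfsr_step(sreg: int, taps: int) -> int:
    """One step of a right-shifting LFSR (software model of the Verilog)."""
    feedback = sreg & 1
    sreg >>= 1
    return sreg ^ taps if feedback else sreg

LEN = 17
TAPS_17 = 0b10010000000000000    # taps at bits 17 and 14
SEED = (1 << LEN) - 1            # all ones

period = 0
stars = 0
value = SEED
while True:
    # count a star when bits 16 to 9 are all 1
    if (value >> 9) == 0xFF:
        stars += 1
    value = lfsr_step(value, TAPS_17)
    period += 1
    if value == SEED:
        break

area = 512 * 256
print(period, area - period, stars)
```

The middle number is the difference between the drawing area and the sequence length. Because the LFSR only advances inside the area, that difference is how far the LFSR has moved on at any given pixel from one frame to the next.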

Create an LFSR module lfsr.sv [src]:

module lfsr #(
    parameter LEN=8,                    // shift register length
    parameter TAPS=8'b10111000,         // XOR taps
    parameter SEED={LEN{1'b1}}          // initial seed value
    ) (
    input  wire logic clk,              // clock
    input  wire logic rst,              // reset
    input  wire logic en,               // enable
    output      logic [LEN-1:0] sreg    // lfsr output
    );

    always_ff @(posedge clk) begin
        if (en) sreg <= {1'b0, sreg[LEN-1:1]} ^ (sreg[0] ? TAPS : {LEN{1'b0}});
        if (rst) sreg <= SEED;
    end
endmodule

We use a simple top module to drive it, based on the design we used in Exploring FPGA Graphics:

Shown below is the version for XC7:

module top_lfsr (
    input  wire logic clk_100m,         // 100 MHz clock
    input  wire logic btn_rst,          // reset button (active low)
    output      logic vga_hsync,        // horizontal sync
    output      logic vga_vsync,        // vertical sync
    output      logic [3:0] vga_r,      // 4-bit VGA red
    output      logic [3:0] vga_g,      // 4-bit VGA green
    output      logic [3:0] vga_b       // 4-bit VGA blue
    );

    // generate pixel clock
    logic clk_pix;
    logic clk_locked;
    clock_gen clock_640x480 (
       .clk(clk_100m),
       .rst(!btn_rst),  // reset button is active low
       .clk_pix,
       .clk_locked
    );

    // display timings
    localparam CORDW = 10;  // screen coordinate width in bits
    logic [CORDW-1:0] sx, sy;
    logic de;
    display_timings timings_640x480 (
        .clk_pix,
        .rst(!clk_locked),  // wait for clock lock
        .sx,
        .sy,
        .hsync(vga_hsync),
        .vsync(vga_vsync),
        .de
    );

    logic sf_area;
    always_comb sf_area = (sx < 512 && sy < 256);

    // 17-bit LFSR
    logic [16:0] sf_reg;
    lfsr #(
        .LEN(17),
        .TAPS(17'b10010000000000000)
    ) lsfr_sf (
        .clk(clk_pix),
        .rst(!btn_rst),
        .en(sf_area && de),
        .sreg(sf_reg)
    );

    // VGA output
    always_comb begin
        logic star;
        star = &{sf_reg[16:9]};  // (~512 stars for 8 bits with 512x256)
        vga_r = (de && sf_area && star) ? sf_reg[3:0] : 4'h0;
        vga_g = (de && sf_area && star) ? sf_reg[3:0] : 4'h0;
        vga_b = (de && sf_area && star) ? sf_reg[3:0] : 4'h0;
    end
endmodule

To build this example, you’ll also need the display timings, clock generation, and constraints that we created in the first part of the series. All these files are available in the FPGA Ad Astra repo, together with a makefile for iCEBreaker and a Vivado project for Arty.

A Screen Full of Sky

If we use the maximum sequence of an LFSR, then our starfield is extremely limited in size. To fill the screen, we choose an LFSR that produces a sequence longer than our number of pixels, then restart it when we reach the end of the screen. There are two ways to do this:

  1. Find the LFSR value at the point we want to restart
  2. Use a separate counter

The first option is extremely efficient when it comes to logic. As each value appears only once, it uniquely describes a position in the sequence. Historically, LFSRs were used as counters, including on FPGAs. Xilinx has a nice application note describing this: XAPP210.
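As a rough illustration of the first option, the Python sketch below steps a model of a 21-bit LFSR to find the value it holds on the last clock of the sequence we want, then checks that comparing against that value restarts the sequence at the right point. The names (`restart_value`, `seq_len`) and the choice of 800x525 with a sequence one shorter than the frame are assumptions for this example; in hardware the comparison would be a single equality check on the shift register.

```python
def lfsr_step(sreg: int, taps: int) -> int:
    """One step of a right-shifting LFSR (software model of the Verilog)."""
    feedback = sreg & 1
    sreg >>= 1
    return sreg ^ taps if feedback else sreg

LEN = 21
TAPS_21 = 0b101000000000000000000   # taps at bits 21 and 19
SEED = (1 << LEN) - 1               # all ones

def restart_value(seq_len: int, seed: int = SEED, taps: int = TAPS_21) -> int:
    """LFSR value on the last clock of a sequence of seq_len values."""
    value = seed
    for _ in range(seq_len - 1):
        value = lfsr_step(value, taps)
    return value

H, V, INC = 800, 525, -1            # full frame including blanking
seq_len = H * V + INC               # one shorter than the frame
last = restart_value(seq_len)

# model the comparator: count clocks until `last` appears, then reload the seed
value, clocks = SEED, 0
while True:
    clocks += 1
    if value == last:
        break
    value = lfsr_step(value, TAPS_21)

print(f"restart when sreg == 21'h{last:06X}")
print(seq_len, clocks)
```

The drawback is visible here: changing the speed or direction means recomputing the magic value, whereas a counter only needs a different limit.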

The second option requires separate counter logic, but on a contemporary FPGA, the cost is minimal. The advantage of this approach is we can easily adjust the direction and speed of the starfield by counting a little more or a little less. This is the approach we’ll use in this project.

We’re going to want a few starfields, so let’s create a dedicated module [src]:

module starfield #(
    parameter H=800,
    parameter V=525,
    parameter INC=-1,
    parameter SEED=21'h1FFFFF,
    parameter MASK=21'hFFF
    ) (
    input  wire logic clk,              // clock
    input  wire logic en,               // enable
    input  wire logic rst,              // reset
    output      logic sf_on,            // star on
    output      logic [7:0] sf_star     // star brightness
    );

    localparam RST_CNT = H * V + INC - 1;  // counter starts at zero, so subtract 1
    logic [20:0] sf_reg, sf_cnt;

    always_ff @(posedge clk) begin
        if (en) begin
            sf_cnt <= sf_cnt + 1;
            if (sf_cnt == RST_CNT) sf_cnt <= 0;
        end
        if (rst) sf_cnt <= 0;
    end

    // select some bits to form stars
    always_comb begin
        sf_on = &{sf_reg | MASK};
        sf_star = sf_reg[7:0];
    end

    lfsr #(
        .LEN(21),
        .TAPS(21'b101000000000000000000),
        .SEED(SEED)
        ) lsfr_sf (
        .clk,
        .rst(sf_cnt == 21'b0),
        .en,
        .sreg(sf_reg)
    );
endmodule

The starfield module defaults to a 21-bit LFSR, which has a maximum sequence of just over two million. The module takes the screen dimensions as parameters H and V: we’re using the full screen, including the blanking interval, so the starfield doesn’t immediately repeat. The MASK allows us to control the stellar density: the more 1s in the mask, the more stars there will be. Finally, the module outputs an 8-bit value for star brightness, which can be used to create a more varied starfield.
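To get a feel for the MASK and INC parameters before rebuilding, the following Python sketch estimates star density and per-frame movement. It assumes the enable input is held high on every pixel clock, including blanking, and treats LFSR values as roughly uniform, so the star counts are approximate. The helper names are made up for this example.

```python
H, V = 800, 525
LEN = 21

def star_density(mask: int, length: int = LEN) -> float:
    """Fraction of LFSR values that pass the `&{sf_reg | MASK}` test."""
    ignored_bits = bin(mask & ((1 << length) - 1)).count("1")
    tested_bits = length - ignored_bits    # these bits must all be 1
    return 1 / (1 << tested_bits)

def sequence_length(inc: int) -> int:
    """Clocks between LFSR restarts: the counter runs 0..RST_CNT inclusive."""
    rst_cnt = H * V + inc - 1
    return rst_cnt + 1

for mask in (0x7FF, 0xFFF, 0x1FFF):
    stars = H * V * star_density(mask)
    print(f"MASK={mask:#x}: roughly {stars:.0f} stars per frame, blanking included")

# a negative INC restarts the LFSR early; a positive INC moves the stars the other way
for inc in (-1, -2, -4):
    early = H * V - sequence_length(inc)
    print(f"INC={inc}: restart is {early} clock(s) early, so stars move {early} pixel(s) left")
```

Each extra 1 in the mask doubles the density, and some of the stars counted here fall in the blanking interval, so fewer are visible on screen.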

Into Space

Using our module, we can create multiple starfields at different speeds and densities to give that real in-space feeling. Our example top module has three starfields:

Rebuild your project with the new starfield and top modules. Try experimenting with the INC and MASK parameters to create different speeds and densities.

Greetings, World!

I had initially planned to do some more stuff with LFSRs, but on seeing the starfields, I thought they’d make an ideal backdrop for a message. We don’t have a bitmap on which to draw, so I decided to quickly design a hardware sprite. Sprites are not ideal for large quantities of text but do make for a fun way to display short messages on screen.

Hardware Sprite

Our simple hardware sprite design reads a line of pixels in the blanking interval before drawing them on the following screen line. We’ve departed a little from the most straightforward design by adding a couple of features:

  • Scaling - so a small font can be used to create screen-sized messages
  • Bit Swapping - so sprite data can be MSB or LSB first (no more back-to-front text!)

A hardware sprite has several different modes, such as loading data, waiting for screen position, and drawing; this makes it an ideal candidate for a finite state machine (FSM).

To create the sprite, I worked out the states I wanted to include on pen and paper, then created combinatorial logic to describe the state transitions. Finally, I added the sequential logic to perform the actions for each state, such as loading data from memory. I put the logic for choosing the pixel to draw within the combinatorial process, which is a little ugly, but it has worked well in practice.

Rather than describe each state, I’ve commented the sprite module in some detail [src]:

module sprite #(
        parameter LSB=1,      // first pixel in LSB
        parameter WIDTH=8,    // graphic width in pixels
        parameter HEIGHT=8,   // graphic height in pixels
        parameter SCALE_X=1,  // sprite width scale-factor
        parameter SCALE_Y=1,  // sprite height scale-factor
        parameter ADDRW=9,    // width of graphic address bus
        parameter CORDW=10    // width of screen coordinates
        ) (
        input  wire logic clk,                        // clock
        input  wire logic rst,                        // reset
        input  wire logic start,                      // start control
        input  wire logic dma_avail,                  // memory access control
        input  wire logic [CORDW-1:0] sx,             // horizontal screen position
        input  wire logic [CORDW-1:0] sprx,           // horizontal sprite position
        input  wire logic [WIDTH-1:0] gfx_data,       // sprite graphic data
        input  wire logic [ADDRW-1:0] gfx_addr_base,  // graphic base address
        output      logic [ADDRW-1:0] gfx_addr,       // graphic address (sprite line)
        output      logic pix,                        // pixel to draw
        output      logic done                        // sprite drawing is complete
    );

    // position within sprite
    /* verilator lint_off LITENDIAN */
    logic [$clog2(WIDTH)-1:0]  ox;
    logic [$clog2(HEIGHT)-1:0] oy;

    // scale counters
    logic [$clog2(SCALE_X)-1:0] cnt_x;
    logic [$clog2(SCALE_Y)-1:0] cnt_y;
    /* verilator lint_on LITENDIAN */

    logic [WIDTH-1:0] spr_line; // local copy of sprite line

    enum {
        IDLE,       // awaiting start signal
        START,      // prepare for new sprite drawing
        AWAIT_DMA,  // await DMA access to memory
        READ_MEM,   // read line of sprite from memory
        AWAIT_POS,  // await horizontal position
        DRAW,       // draw pixel
        NEXT_LINE,  // prepare for next line
        DONE        // set done signal, then go idle
    } state, state_next;

    integer i;  // for bit reversal in READ_MEM

    always_ff @(posedge clk) begin
        // advance to next state
        state <= state_next;

        // START
        // clear done signal
        // set vertical position to start
        // set graphic address to base
        if (state == START) begin
            done <= 0;
            oy <= 0;
            cnt_y <= 0;
            gfx_addr <= gfx_addr_base;
        end

        // READ_MEM
        // read sprite line, reversing if MSB is left-most pixel
        // NB. Assumes read takes one clock cycle
        if (state == READ_MEM) begin
            if (LSB) begin
                spr_line <= gfx_data;
            end else begin
                for (i=0; i<WIDTH; i=i+1) spr_line[i] <= gfx_data[(WIDTH-1)-i];
            end
        end

        // AWAIT_POS
        // set horizontal drawing position to start of sprite
        if (state == AWAIT_POS) begin
            ox <= 0;
            cnt_x <= 0;
        end

        // DRAW
        // count horizontal position, including scaling factor
        if (state == DRAW) begin
            if (SCALE_X <= 1 || cnt_x == SCALE_X-1) begin
                ox <= ox + 1;
                cnt_x <= 0;
            end else begin
                cnt_x <= cnt_x + 1;
            end
        end

        // NEXT_LINE
        // count vertical position, including scaling factor
        // increment memory address for new graphic line
        if (state == NEXT_LINE) begin
            if (SCALE_Y <= 1 || cnt_y == SCALE_Y-1) begin
                oy <= oy + 1;
                cnt_y <= 0;
                gfx_addr <= gfx_addr + 1;
            end else begin
                cnt_y <= cnt_y + 1;
            end
        end

        // DONE
        // set done signal
        if (state == DONE) begin
            done <= 1;
        end

        if (rst) begin
            state <= IDLE;
            ox <= 0;
            oy <= 0;
            cnt_x <= 0;
            cnt_y <= 0;
            spr_line <= 0;
            gfx_addr <= gfx_addr_base;
            done <= 0;
        end
    end

    logic line_complete, line_gfx_complete, draw_complete;
    always_comb begin
        /* verilator lint_off WIDTH */
        line_complete      = (ox == WIDTH-1 && cnt_x == SCALE_X-1);
        line_gfx_complete  = (cnt_y == SCALE_Y-1);
        draw_complete      = (oy == HEIGHT-1 && line_complete && line_gfx_complete);
        /* verilator lint_on WIDTH */

        pix = 0;
        state_next = IDLE;
        case(state)
            IDLE: state_next = (start) ? START : IDLE;
            START: state_next = AWAIT_DMA;
            AWAIT_DMA: state_next = (dma_avail) ? READ_MEM : AWAIT_DMA;
            READ_MEM: state_next = AWAIT_POS;
            AWAIT_POS: state_next = (sx == sprx) ? DRAW : AWAIT_POS;
            DRAW: begin
                pix = spr_line[ox];
                state_next = (draw_complete) ? DONE :
                             (line_complete) ? NEXT_LINE : DRAW;
            end
            NEXT_LINE: state_next = (line_gfx_complete) ? AWAIT_DMA : AWAIT_POS;
            DONE: state_next = IDLE;
        endcase
    end
endmodule

ProTip: Using SystemVerilog enums makes finite state machines easier to understand and debug.
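If you want to check pixel order and scaling without running a simulation, a simple software model of the READ_MEM, DRAW, and NEXT_LINE behaviour is enough. The Python below is a sketch of that behaviour only, not of the state machine timing; the function names and the sample row are made up for this example.

```python
def load_line(gfx_data: int, width: int = 8, lsb: bool = True) -> int:
    """Model of READ_MEM: keep the word as-is, or reverse its bits if LSB=0."""
    if lsb:
        return gfx_data
    reversed_line = 0
    for i in range(width):
        if gfx_data & (1 << (width - 1 - i)):
            reversed_line |= 1 << i
    return reversed_line

def draw_line(spr_line: int, width: int = 8, scale_x: int = 1) -> str:
    """Model of DRAW: output bit `ox` of the line, repeated scale_x times."""
    pixels = []
    for ox in range(width):
        pix = (spr_line >> ox) & 1
        pixels.extend([pix] * scale_x)
    return "".join("#" if p else "." for p in pixels)

def draw_sprite(rows, width=8, scale_x=1, scale_y=1, lsb=True):
    """Model of NEXT_LINE: each graphic line is drawn scale_y times."""
    out = []
    for data in rows:
        line = draw_line(load_line(data, width, lsb), width, scale_x)
        out.extend([line] * scale_y)
    return out

row = 0b11110000  # hypothetical font row: left half set when read MSB first
print(draw_line(load_line(row, lsb=True)))
print(draw_line(load_line(row, lsb=False)))
print(draw_line(load_line(row, lsb=False), scale_x=2))
```

With LSB set, bit 0 is the first pixel drawn, so a font stored MSB-first comes out mirrored; clearing LSB reverses each line as it is loaded.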

Bitmap Font

If we’re going to display text messages, we need a font. I’ve experimented with a few simple bitmap fonts in the past, but Unscii is one of the best. It’s available in 8x8 and 8x16/16x16 pixels, with thousands of glyphs covering many languages as well as glyphs for ASCII art. Plus, the GNU Unifont hexdump version is trivial to convert to readmemh for use with Verilog.

The full list of glyphs is too large to be held in internal FPGA memory. For the purposes of this demo, I’ve created two subsets of the Unscii font: one for upper-case basic Latin (including punctuation and numbers), and one for Hiragana (without marks):

Create Your Own Font
You can easily create your own font version with different characters: check out the hex source on the Unscii site. Use VS Code Column Selection Mode to quickly turn Unifont hex into readmemh format. Just be aware you can’t mix different glyph sizes with this sprite design, and watch out for memory usage: the Hiragana file uses 6/30 iCEBreaker BRAMs.
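If you’d rather script the conversion than edit by hand, something like the following Python sketch would do it. It assumes Unifont-style lines of the form `CODE:BITMAP`, glyphs 8 pixels wide (two hex digits per row), and one row per memory word; the function name and the sample glyph data are made up for illustration.

```python
def unifont_to_readmemh(hex_lines, first, last, width=8):
    """Convert Unifont-style 'CODE:BITMAP' lines to readmemh text."""
    digits_per_row = width // 4          # 8 pixels -> 2 hex digits per row
    out = []
    for line in hex_lines:
        line = line.strip()
        if not line:
            continue
        code_hex, bitmap = line.split(":")
        code = int(code_hex, 16)
        if not first <= code <= last:
            continue                     # skip glyphs outside the chosen range
        out.append(f"// U+{code:04X}")   # readmemh files may contain comments
        for i in range(0, len(bitmap), digits_per_row):
            out.append(bitmap[i:i + digits_per_row])
    return "\n".join(out) + "\n"

sample = [
    "0041:183C66667E666600",   # made-up 8x8 glyph data for illustration
    "0042:7C66667C66667C00",
    "3042:0000000000000000",   # outside the requested range, so skipped
]
print(unifont_to_readmemh(sample, 0x41, 0x5A), end="")
```

If the range you select has missing code points, the glyphs will no longer sit at evenly spaced addresses, so check for gaps before relying on a simple address calculation.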

Hello - こんにちは

Using the fonts we can create simple greetings. I’ve done this for English and Japanese:

Try creating your own five-character message. Check the fonts for which characters are available, or create your own font-variant to write in another language.

Greetings from Project F

At this point, I should have called it a day, but starfields and demos are a classic combination, so I couldn’t resist sending a few greetings. To be able to express ourselves better, I expanded the number of sprites to eight, then reused them to form a second line of text. This gives us 16 characters to work with.

We’ve included greetings for a few of the open-source FPGA projects we love; apologies to everyone we missed. I used readmemh hex format for the greetings like we used for the fonts: greets.mem.

For this design, the top module loads the character code point from the greetings ROM, then calculates the address in the font to pass to the sprite.
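The address calculation is simple if the font holds its glyphs contiguously. The Python sketch below shows the arithmetic under two assumptions that are mine rather than taken from the design: the font starts at U+0020 and each glyph is 8 rows tall. `glyph_address` is a hypothetical helper name.

```python
def glyph_address(code_point: int, font_first: int = 0x20, glyph_height: int = 8) -> int:
    """Address of a glyph's first row, assuming contiguous glyphs in the font ROM."""
    return (code_point - font_first) * glyph_height

message = "HELLO"
for ch in message:
    print(ch, glyph_address(ord(ch)))

# 64 glyphs of 8 rows need 512 addresses, which fits a 9-bit address bus
print(glyph_address(0x5F) + 7)
```

In hardware, a power-of-two glyph height turns the multiply into a left shift, so the calculation costs little more than a subtraction.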

A counter is used to switch between messages: I’ve chosen 80 frames (about 1.33 seconds at 60 frames per second). You can adjust this with the MSG_CHG parameter, and the size of the text with SPR_SCALE_X and SPR_SCALE_Y.

Rather than single colour for the text, we’ve got gradients in the classic Amiga demo style (copperbars FTW!). I’ve gone for a sky and earth colour scheme, but you can easily change the colours to your own taste using COLR_A and COLR_B.

FPGA Ad Astra

Explore

I hope you enjoyed this instalment of Exploring FPGA Graphics, but nothing beats creating your own designs. Here are a few suggestions to get you started:

  • Change starfield speed and direction with buttons on your FPGA board
  • Cycle the greetings colours over time
  • Create space ship and asteroid sprites
  • Animate the text, so it slides in from the bottom of the screen

If you create a cool demo, drop me a message @WillFlux, and I’ll add it to the blog.

Next Time

In the next part, we’ll learn about bitmaps, simulating life, and displaying Earth from space, in Life on Screen.

©2020 Will Green, Project F