20 May 2020

Exploring FPGA Graphics

In all beginnings dwells a magic force
Hermann Hesse, Stages from The Glass Bead Game

Welcome to Exploring FPGA Graphics. In this series we explore graphics at the hardware level and get a feel for the power of FPGAs. We start by learning how displays work, before racing the beam with Pong, drawing starfields and sprites, simulating life with bitmaps, drawing lines and triangles, and finally creating simple 3D models. I’ll be writing and revising this series throughout 2020.

In this first post we learn how computer displays work and animate simple colour graphics.

Updated 2020-09-08. Get in touch with @WillFlux or open an issue on GitHub.

Series Outline

  • Exploring FPGA Graphics (this post) - how displays work and simple animated colour graphics
  • FPGA Pong - race the beam to create the arcade classic
  • FPGA Ad Astra - animated starfields, hardware sprites, and bitmap fonts
  • Life on Screen - bitmaps and Conway’s Game of Life (being written)
  • Hard Lines - 2D drawing (planned)
  • More to follow

Requirements

For this series, you need an FPGA board with video output. We’ll be working at 640x480, so pretty much any video output will work. You should be comfortable with programming your FPGA board and reasonably familiar with Verilog. If you’re new to FPGAs and have a Xilinx board, try Hello Arty.

We’ll be demoing with these boards (FPGA type):

Follow the source README to build a design for either of these boards.

Source

All the Verilog designs featured in this series are available in the Exploring FPGAs repo and source links are included throughout the blog. The designs are open source under the permissive MIT licence, but this blog is subject to normal copyright restrictions.

Quick Aside: SystemVerilog?!
We’re using a few choice features from SystemVerilog to make Verilog a little more pleasant (no laughing at the back). If you’re familiar with Verilog, you’ll have no trouble.

Space and Time

The screen you’re looking at is a little universe with its own rules of space and time.

When you look at a screen from afar, you see a smooth two-dimensional image. Looking closely, you see many individual blocks: these are pixels. A typical high-definition image is 1920 pixels across and 1080 lines down: over 2 million pixels in total. Even a 640x480 image has over 300,000 pixels. The need to handle so much information so quickly is a big part of the challenge of working with graphics.

A VGA cable has five main signals: red, green, blue, horizontal sync, and vertical sync. There are no addressing signals to tell the display where to draw pixels; the secret is time, demarcated by the sync signals. The red, green, and blue wires carry the colour of each pixel in turn. Each pixel lasts a fixed length of time; when the display receives a horizontal sync, it starts a new line; when it receives a vertical sync, it starts a new frame.

The sync signals are part of blanking intervals. Originally designed to allow an electron gun to move to the next line or top of the screen, blanking intervals have been retained and repurposed in contemporary displays: HDMI uses them to transmit audio. The blanking interval has three parts: front porch, sync, and back porch.

Display Timings


In this series, we’re going to use 640x480 as our display resolution. Almost all displays support 640x480, and its low resource requirements make it possible to work with even small FPGAs. All the same principles apply at higher resolutions, such as 1280x720 or 4K.

We’ll use traditional horizontal and vertical timings, based on the original VGA display:

    640x480 Timings      HOR    VER
    -------------------------------
    Active Pixels        640    480
    Front Porch           16     10
    Sync Width            96      2
    Back Porch            48     33
    Blanking Total       160     45
    Total Pixels         800    525
    Sync Polarity        neg    neg

Learn more from Video Timings: VGA, SVGA, 720p, 1080p.

Taking blanking into account, we have a total of 800x525 pixels. A typical LCD refreshes 60 times a second, so the number of pixels per second is: 800 x 525 x 60 = 25,200,000, which equates to a pixel clock of 25.2 MHz.
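The arithmetic above can be checked with a few constants in a simulation-only module. This is a minimal sketch: the module and parameter names are made up for illustration and are not part of the series source.

```systemverilog
// Simulation-only sketch: derive the pixel clock from the 640x480 timings.
// All names here are illustrative and not taken from the series source.
module pixel_clock_calc;
    // horizontal: active + front porch + sync + back porch
    localparam int H_TOTAL = 640 + 16 + 96 + 48;  // 800 pixels per line
    // vertical: active + front porch + sync + back porch
    localparam int V_TOTAL = 480 + 10 + 2 + 33;   // 525 lines per frame
    localparam int REFRESH = 60;                  // frames per second

    localparam int PIX_HZ = H_TOTAL * V_TOTAL * REFRESH;  // 25,200,000

    initial begin
        $display("pixels per frame: %0d", H_TOTAL * V_TOTAL);
        $display("pixel clock (Hz): %0d", PIX_HZ);
        // refresh rate we get if we settle for a 25 MHz pixel clock
        $display("refresh at 25 MHz: %0.2f Hz",
            25000000.0 / (H_TOTAL * V_TOTAL));
        $finish;
    end
endmodule
```

The last line shows why a 25 MHz clock is usually acceptable: the refresh rate drops only slightly, to about 59.5 Hz.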

CAUTION: CRT Monitors
Any modern display, including multisync CRTs, should be fine with a 25.2 or 25 MHz pixel clock. Fixed-frequency CRTs, such as the original IBM 85xx series, could be damaged by an out-of-spec signal. Use these designs at your own risk.

Running to Time

We’ve decided we need a pixel clock of 25.2 MHz, but neither of our demo boards has such a clock. To reach the required frequency, we’re going to use a phase-locked loop (PLL). Almost all FPGAs include one or more PLLs, but there isn’t a standard way to configure them in Verilog, so we have to use vendor-specific designs.

We have provided implementations for Xilinx 7 Series (XC7) and Lattice iCE40; for other FPGAs, you’ll need to consult your vendor documentation. If you can’t reach 25.2 MHz exactly, then 25 MHz or thereabouts should be fine (but see the note about CRTs above). The iCE40 can’t generate 25.2 MHz from the 12 MHz clock on the iCEBreaker, but works fine at 25.125 MHz.
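To give a flavour of what a vendor-specific clock generator looks like, here is a hedged sketch for iCE40 using the SB_PLL40_PAD primitive. The divider values are my own working for a 12 MHz input and are an assumption, not necessarily the settings used in the repo. The module name is hypothetical; check the values against your vendor tools before relying on them.

```systemverilog
// Hedged sketch of a 12 MHz -> 25.125 MHz clock generator for iCE40.
// With simple feedback:
//   f_out = f_in * (DIVF + 1) / ((DIVR + 1) * 2^DIVQ)
//         = 12 MHz * 67 / (1 * 32) = 25.125 MHz
// Divider values are assumptions for illustration; verify with vendor tools.
module clock_gen_sketch (
    input  wire logic clk,         // 12 MHz board clock
    input  wire logic rst,         // reset (active high)
    output      logic clk_pix,     // ~25.125 MHz pixel clock
    output      logic clk_locked   // high once the PLL has locked
    );

    SB_PLL40_PAD #(
        .FEEDBACK_PATH("SIMPLE"),
        .DIVR(4'b0000),       // DIVR = 0
        .DIVF(7'b1000010),    // DIVF = 66
        .DIVQ(3'b101),        // DIVQ = 5
        .FILTER_RANGE(3'b001)
    ) pll_inst (
        .PACKAGEPIN(clk),
        .PLLOUTGLOBAL(clk_pix),
        .RESETB(!rst),        // PLL reset is active low
        .BYPASS(1'b0),
        .LOCK(clk_locked)
    );
endmodule
```

This only synthesizes for iCE40 parts; other FPGA families have their own PLL primitives with different parameters.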

Clock Generator Modules

Display Timings Module

Using our circa 25 MHz pixel clock, we can generate timings for our 640x480 display. Creating display timings is straightforward: there’s one counter for horizontal position and one for vertical position. We use these counters to decide on the correct time for sync signals.

640x480 display timing generator [src]:

module display_timings (
    input  wire logic clk_pix,          // pixel clock
    input  wire logic rst,              // reset
    output      logic [9:0] sx,         // horizontal screen position
    output      logic [9:0] sy,         // vertical screen position
    output      logic hsync,            // horizontal sync
    output      logic vsync,            // vertical sync
    output      logic de                // data enable (low in blanking interval)
    );

    // horizontal timings
    parameter HA_END = 639;             // end of active pixels
    parameter HS_STA = HA_END + 16;     // sync starts after front porch
    parameter HS_END = HS_STA + 96;     // sync ends
    parameter LINE   = 799;             // last pixel on line (after back porch)

    // vertical timings
    parameter VA_END = 479;             // end of active pixels
    parameter VS_STA = VA_END + 10;     // sync starts after front porch
    parameter VS_END = VS_STA + 2;      // sync ends
    parameter SCREEN = 524;             // last line on screen (after back porch)

    always_comb begin
        hsync = ~(sx >= HS_STA && sx < HS_END);  // invert: hsync polarity is negative
        vsync = ~(sy >= VS_STA && sy < VS_END);  // invert: vsync polarity is negative
        de = (sx <= HA_END && sy <= VA_END);
    end

    // calculate horizontal and vertical screen position
    always_ff @ (posedge clk_pix) begin
        if (sx == LINE) begin  // last pixel on line?
            sx <= 0;
            sy <= (sy == SCREEN) ? 0 : sy + 1;  // last line on screen?
        end else begin
            sx <= sx + 1;
        end
        if (rst) begin
            sx <= 0;
            sy <= 0;
        end
    end
endmodule

ProTip: Within an always block, the last assignment to a signal wins, so the reset overrides the earlier updates to sx and sy.
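If the reset idiom looks odd, a tiny simulation can make it concrete. This is a minimal sketch with made-up names (cnt, last_assignment_wins), intended for a simulator rather than synthesis.

```systemverilog
// Simulation-only sketch: a later non-blocking assignment in the same
// always_ff block overrides an earlier one.
module last_assignment_wins;
    logic clk = 0;
    logic rst = 1;
    logic [3:0] cnt = 4'd5;

    always #5 clk = ~clk;  // free-running test clock

    always_ff @(posedge clk) begin
        cnt <= cnt + 1;       // default behaviour: count up
        if (rst) cnt <= 0;    // later assignment wins while rst is high
    end

    initial begin
        @(posedge clk); #1;
        $display("rst=%b cnt=%0d", rst, cnt);  // reset held: cnt is cleared
        rst = 0;
        @(posedge clk); #1;
        $display("rst=%b cnt=%0d", rst, cnt);  // reset released: cnt counts
        $finish;
    end
endmodule
```

While rst is high, the counter is cleared despite the earlier increment; once rst is released, the increment takes effect.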

sx and sy store the horizontal and vertical position; their maximum values are 799 and 524 respectively, so we need 10 bits to hold them (2^10 = 1024). de is data enable, which is low during the blanking interval: we use it to decide when to draw pixels.
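Rather than working out the coordinate width by hand, we can let the tools do it with $clog2. A minimal sketch, using illustrative names that are not in the series source:

```systemverilog
// Simulation-only sketch: bits needed for the screen position counters.
// $clog2(N) gives the number of bits needed to count from 0 to N-1.
module coord_width;
    localparam int H_TOTAL = 800;  // sx counts 0 to 799
    localparam int V_TOTAL = 525;  // sy counts 0 to 524

    localparam int SXW = $clog2(H_TOTAL);  // 10 bits
    localparam int SYW = $clog2(V_TOTAL);  // 10 bits

    initial begin
        $display("sx needs %0d bits, sy needs %0d bits", SXW, SYW);
        $finish;
    end
endmodule
```

Both counters come out at 10 bits for 640x480, but the same expression adapts if you later change resolution.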

Display modes vary in the polarity of their sync signals; for traditional 640x480 the polarity is negative for both hsync and vsync. Negative polarity means the voltage is normally high; low indicates a sync signal.

The following simulation shows the vertical sync starting at the 490th line (counting starts at zero):

Sync Signal Simulation

Test Benches

You can exercise the designs with the included test benches (currently Xilinx only):

Some things to check:

  • What is the pixel clock period?
  • How long does the pixel clock take to lock?
  • Does a frame last exactly 1/60th of a second?
  • How much time does a single line last?
  • What are the maximum values of sx and sy when de is low?
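If you’d like to measure some of these yourself, the following is a hedged test bench sketch. It assumes the display_timings module from this post is compiled alongside it, and it models a roughly 25.2 MHz clock directly instead of using clock_gen. The test bench name and structure are my own, not those of the included test benches.

```systemverilog
`timescale 1ns / 1ps

// Hedged test bench sketch: measures line and frame duration.
// Assumes display_timings (from this post) is compiled alongside.
module display_timings_tb_sketch;
    localparam real CLK_PERIOD = 39.683;  // ~25.2 MHz period in ns (rounded)

    logic clk_pix = 0;
    logic rst = 1;
    logic [9:0] sx, sy;
    logic hsync, vsync, de;
    realtime t_line, t_frame;

    display_timings dut (.clk_pix, .rst, .sx, .sy, .hsync, .vsync, .de);

    always #(CLK_PERIOD / 2) clk_pix = ~clk_pix;

    initial begin
        repeat (4) @(negedge clk_pix);
        rst = 0;  // release reset away from the active clock edge

        // time between consecutive hsync falling edges is one line
        @(negedge hsync); t_line = $realtime;
        @(negedge hsync); t_line = $realtime - t_line;
        $display("line:  %0.3f us", t_line / 1000.0);

        // time between consecutive vsync falling edges is one frame
        @(negedge vsync); t_frame = $realtime;
        @(negedge vsync); t_frame = $realtime - t_frame;
        $display("frame: %0.3f ms", t_frame / 1000000.0);
        $finish;
    end
endmodule
```

Because the clock period is rounded, the measured frame time will be close to, but not exactly, what the real pixel clock gives.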

Bringing it Together

Now we have our display signals we’re ready to start drawing. To begin we’re going to keep it simple and draw a coloured square. When the screen x and y coordinates are both less than 32 we draw in orange, otherwise we use blue. Because our colour output has 4 bits per channel, we can use a single hex digit from 0-F to represent the intensity of red, green, and blue.

There are three versions of this top module:

Shown below is the version for XC7:

module top_square (
    input  wire logic clk_100m,     // 100 MHz clock
    input  wire logic btn_rst,      // reset button (active low)
    output      logic vga_hsync,    // horizontal sync
    output      logic vga_vsync,    // vertical sync
    output      logic [3:0] vga_r,  // 4-bit VGA red
    output      logic [3:0] vga_g,  // 4-bit VGA green
    output      logic [3:0] vga_b   // 4-bit VGA blue
    );

    // generate pixel clock
    logic clk_pix;
    logic clk_locked;
    clock_gen clock_640x480 (
       .clk(clk_100m),
       .rst(!btn_rst),  // reset button is active low
       .clk_pix,
       .clk_locked
    );

    // display timings
    localparam CORDW = 10;  // screen coordinate width in bits
    logic [CORDW-1:0] sx, sy;
    logic de;
    display_timings timings_640x480 (
        .clk_pix,
        .rst(!clk_locked),  // wait for clock lock
        .sx,
        .sy,
        .hsync(vga_hsync),
        .vsync(vga_vsync),
        .de
    );

    // 32 x 32 pixel square
    logic q_draw;
    always_comb q_draw = (sx < 32 && sy < 32) ? 1 : 0;

    // VGA output
    always_comb begin
        vga_r = !de ? 4'h0 : (q_draw ? 4'hF : 4'h0);
        vga_g = !de ? 4'h0 : (q_draw ? 4'h8 : 4'h8);
        vga_b = !de ? 4'h0 : (q_draw ? 4'h0 : 4'hF);
    end
endmodule

Take a look at the VGA or DVI output section of the top module. For each colour, we check whether we’re in the blanking interval (when de is 0). If we are in the blanking interval, we set the colour intensity to zero. Otherwise, we look at the value of q_draw: if true, we set the pixel to orange, if false we set it to blue. The colour intensity must be zero in the blanking interval; otherwise, your display may be garbled or misaligned.

We could have written the output logic using nested if blocks instead; for example:

    // VGA output
    always_comb begin
        // default values
        vga_r = 4'h0;
        vga_g = 4'h0;
        vga_b = 4'h0;

        if (de) begin
            if (q_draw) begin
                vga_r = 4'hF;
                vga_g = 4'h8;
                // vga_b is zero, so we don't override
            end else begin
                // vga_r is zero, so we don't override
                vga_g = 4'h8;
                vga_b = 4'hF;
            end
        end
    end

Let there be Pixels

Combine the top_square, display_timings, and clock_gen modules with suitable constraints and you’re ready to drive a display. The constraints map the pins on the FPGA to signals in our design:

You can find build instructions in the source README.

Once you’ve programmed your board, you should see something like this (colours #0088FF and #FF8800):

A Square
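The six-digit colour codes quoted above follow from our 4-bit channels: repeating each hex digit gives the equivalent 8-bit value, because c/15 equals (c x 17)/255. A minimal sketch, with a helper function name (expand4) invented for illustration:

```systemverilog
// Simulation-only sketch: expand 4-bit colour channels to 8 bits.
module colour_expand;
    // repeat the nibble: 4'h8 becomes 8'h88, the same fraction of full scale
    function automatic logic [7:0] expand4(input logic [3:0] c);
        return {c, c};
    endfunction

    initial begin
        // orange square: r=F, g=8, b=0
        $display("#%h%h%h", expand4(4'hF), expand4(4'h8), expand4(4'h0));
        // blue background: r=0, g=8, b=F
        $display("#%h%h%h", expand4(4'h0), expand4(4'h8), expand4(4'hF));
        $finish;
    end
endmodule
```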

Try experimenting with your own square size, position, and colours.

Animation

To create a simple animation, we can update the position of the square every frame. If we update the position of the square during active drawing, we risk screen tearing, so we create an animate signal that is high for one clock tick at the start of the vertical blanking interval.

We’re going to replicate the behaviour of the video display itself, scanning across then down the screen. The square “beam” disappears off the edge of the screen, like the signal in the blanking interval. Try rebuilding the design with top_beam:

Shown below is the version for iCE40 with DVI Pmod:

module top_beam (
    input  wire logic clk_12m,      // 12 MHz clock
    input  wire logic btn_rst,      // reset button (active high)
    output      logic dvi_clk,      // DVI pixel clock
    output      logic dvi_hsync,    // DVI horizontal sync
    output      logic dvi_vsync,    // DVI vertical sync
    output      logic dvi_de,       // DVI data enable
    output      logic [3:0] dvi_r,  // 4-bit DVI red
    output      logic [3:0] dvi_g,  // 4-bit DVI green
    output      logic [3:0] dvi_b   // 4-bit DVI blue
    );

    // generate pixel clock
    logic clk_pix;
    logic clk_locked;
    clock_gen clock_640x480 (
       .clk(clk_12m),
       .rst(btn_rst),
       .clk_pix,
       .clk_locked
    );

    // display timings
    localparam CORDW = 10;  // screen coordinate width in bits
    logic [CORDW-1:0] sx, sy;
    logic de;
    display_timings timings_640x480 (
        .clk_pix,
        .rst(!clk_locked),  // wait for clock lock
        .sx,
        .sy,
        .hsync(dvi_hsync),
        .vsync(dvi_vsync),
        .de
    );

    // size of screen (including blanking)
    localparam H_RES = 800;
    localparam V_RES = 525;

    // square 'Q' - origin at top-left
    localparam Q_SIZE = 32; // square size in pixels
    localparam Q_SPEED = 4; // pixels moved per frame
    logic [CORDW-1:0] qx, qy;     // square position

    logic animate;  // high for one clock tick at start of blanking
    always_comb animate = (sy == 480 && sx == 0);

    // update square position once per frame
    always_ff @(posedge clk_pix) begin
        if (animate) begin
            if (qx >= H_RES - Q_SIZE) begin
                qx <= 0;
                qy <= (qy >= V_RES - Q_SIZE) ? 0 : qy + Q_SIZE;
            end else begin
                qx <= qx + Q_SPEED;
            end
        end
    end

    // is square at current screen position?
    logic q_draw;
    always_comb begin
        q_draw = (sx >= qx) && (sx < qx + Q_SIZE)
              && (sy >= qy) && (sy < qy + Q_SIZE);
    end

    // DVI output
    always_comb begin
        dvi_clk = clk_pix;
        dvi_de  = de;
        dvi_r = !de ? 4'h0 : (q_draw ? 4'hF : 4'h0);
        dvi_g = !de ? 4'h0 : (q_draw ? 4'h8 : 4'h8);
        dvi_b = !de ? 4'h0 : (q_draw ? 4'h0 : 4'hF);
    end
endmodule
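One thing to watch if you simulate this design: qx and qy have no reset, and a simulator typically starts such signals as unknown (X), whereas FPGA registers usually power up to a known value. The following is a hedged sketch of the position logic as a stand-alone module with a reset added. The module name and ports are my own and are not in the series source.

```systemverilog
// Hedged sketch: the top_beam position logic with an explicit reset,
// so that qx and qy have defined values in simulation.
module beam_position_sketch #(
    parameter CORDW   = 10,   // screen coordinate width in bits
    parameter H_RES   = 800,  // screen width (including blanking)
    parameter V_RES   = 525,  // screen height (including blanking)
    parameter Q_SIZE  = 32,   // square size in pixels
    parameter Q_SPEED = 4     // pixels moved per frame
    ) (
    input  wire logic clk_pix,            // pixel clock
    input  wire logic rst,                // reset (active high)
    input  wire logic animate,            // high for one tick per frame
    output      logic [CORDW-1:0] qx,     // square horizontal position
    output      logic [CORDW-1:0] qy      // square vertical position
    );

    always_ff @(posedge clk_pix) begin
        if (animate) begin
            if (qx >= H_RES - Q_SIZE) begin
                qx <= 0;
                qy <= (qy >= V_RES - Q_SIZE) ? 0 : qy + Q_SIZE;
            end else begin
                qx <= qx + Q_SPEED;
            end
        end
        if (rst) begin  // last assignment wins, as in display_timings
            qx <= 0;
            qy <= 0;
        end
    end
endmodule
```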

Bounce!

Now we can animate, we can start to create some interesting effects. By adding collision detection, we can bounce squares around the screen. With three squares (red, green, and blue), we have a simple demo. While simple, it’s satisfying to watch the squares combine colours as they move around the screen.

Try rebuilding the design with top_bounce:

Bouncing Squares
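The colour combining comes for free if each square drives its own colour channel. This is a minimal sketch of the idea, assuming three draw signals named q1_draw, q2_draw, and q3_draw; the actual top_bounce source may differ in detail.

```systemverilog
// Hedged sketch: each square drives one colour channel, so overlapping
// squares mix additively (red + green = yellow, all three = white).
module colour_mix_sketch (
    input  wire logic de,             // data enable (low in blanking)
    input  wire logic q1_draw,        // red square at this position?
    input  wire logic q2_draw,        // green square at this position?
    input  wire logic q3_draw,        // blue square at this position?
    output      logic [3:0] vga_r,    // 4-bit VGA red
    output      logic [3:0] vga_g,    // 4-bit VGA green
    output      logic [3:0] vga_b     // 4-bit VGA blue
    );

    always_comb begin
        vga_r = (de && q1_draw) ? 4'hF : 4'h0;
        vga_g = (de && q2_draw) ? 4'hF : 4'h0;
        vga_b = (de && q3_draw) ? 4'hF : 4'h0;
    end
endmodule
```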

Collision Detection

Collision detection is one of those things that seems trivial, but has a number of subtleties. In our bounce module, each square checks for collisions in both horizontal and vertical directions. We’ll make use of this in the next part of this series on Pong, so it’s worth understanding.

Horizontal collision detection example from top_bounce:

    if (q1x >= H_RES - (Q1_SIZE + q1s)) begin  // right edge
        q1dx <= 1;
        q1x <= q1x - q1s;
    end else if (q1x < q1s) begin  // left edge
        q1dx <= 0;
        q1x <= q1x + q1s;
    end else q1x <= (q1dx) ? q1x - q1s : q1x + q1s;

  • H_RES - horizontal screen resolution
  • Q1_SIZE - size of square 1
  • q1x - horizontal position of square 1
  • q1dx - horizontal direction of square 1
  • q1s - horizontal speed of square 1
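To experiment with this logic in isolation, you could wrap it in a small module that handles one axis. The following is a hedged sketch: the module name, ports, and the default resolution of 640 are assumptions for illustration, with the collision tests copied from the example above.

```systemverilog
// Hedged sketch: one axis of bounce logic as a stand-alone module.
// The collision tests mirror the top_bounce example; names are illustrative.
module bounce_axis_sketch #(
    parameter CORDW = 10,   // coordinate width in bits
    parameter RES   = 640,  // extent of this axis in pixels (assumed)
    parameter SIZE  = 32,   // square size in pixels
    parameter SPEED = 4     // pixels moved per update
    ) (
    input  wire logic clk,              // pixel clock
    input  wire logic rst,              // reset (active high)
    input  wire logic animate,          // update position when high
    output      logic [CORDW-1:0] pos,  // position along this axis
    output      logic dir               // 0: increasing, 1: decreasing
    );

    always_ff @(posedge clk) begin
        if (animate) begin
            if (pos >= RES - (SIZE + SPEED)) begin  // far edge
                dir <= 1;
                pos <= pos - SPEED;
            end else if (pos < SPEED) begin  // near edge
                dir <= 0;
                pos <= pos + SPEED;
            end else pos <= (dir) ? pos - SPEED : pos + SPEED;
        end
        if (rst) begin
            pos <= 0;
            dir <= 0;
        end
    end
endmodule
```

Instantiating this twice per square, once with the horizontal and once with the vertical resolution, would give two-dimensional bouncing.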

A couple of things to consider:

  1. What needs to change to make the left and right edge collision tests symmetrical?
  2. Why do we need to account for the speed of the square?

At first blush it seems we can simplify this to the following, with a single position update for all situations:

    if (q1x >= H_RES - (Q1_SIZE + q1s)) q1dx <= 1;
    if (q1x < q1s) q1dx <= 0;
    q1x <= (q1dx) ? q1x - q1s : q1x + q1s;

What’s the problem with this approach? Hint: logic in an always_ff block operates in parallel.

Can you suggest a change to the comparisons to make this simpler approach work?

Explore

I hope you enjoyed the first instalment of Exploring FPGA Graphics. Nothing beats creating your own designs; here are a few suggestions to get you started:

  • Try drawing some country flags, many of which are composed of rectangular shapes
  • Animate the size of the squares so they grow and shrink
  • Add collision detection between animated squares so they bounce off each other
  • Create a separate square module to avoid code duplication

Next Time

That’s all for this quick introduction to FPGA graphics.
In the next part, we’ll build the classic game: Pong.

©2020 Will Green, Project F