The first post in this series talked about the flow of information between the user and implementer of an interface through a “machine”. In the case of software user interfaces, that machine is the hardware and software on a computer—a desktop computer, laptop, or game console.
I’ve written, thus far, about building that machine’s software to support this flow of information between user and programmer. I’ve built up some ideas about easily expressing certain interface designs, and how the code can be structured to support those expressions.
But I have not yet written, in-depth, about the two obvious parts of this “information cycle”—the inputs of the system, and the outputs of the system. I’ve focused only on the “machinery”.
I’ve skipped details on the inputs and outputs, so far, because if you’ve used a computer at all, you’re well-aware of them. The user specifies their input by moving their mouse, pressing its buttons, or typing on a keyboard, and the programmer specifies output by setting the colors of pixels on a screen, or playing a waveform through the speakers.
In this post, I’d like to get specific about the “setting the colors of pixels” part.
Identifying Rendering Effects We Want
You can, of course, get arbitrarily fancy with this part of the problem. User interfaces in games often require much more sophisticated renderers (and asset pipelines) than those of “boring” applications (even “boring” applications that look nice). If we just want to focus on building fairly nice looking “boring” applications (or a game that isn’t too ambitious about its interface rendering), though, the set of on-screen rendering effects you need is not particularly large.
Let’s take a look at a few user interfaces and jot down some crucial rendering features.
The first two obviously-important features we can spot in the above screenshots are solid-color rectangles and text. In any application you build, you’ll pretty much always need to have a way of rendering both.
An interface that only uses solid-color rectangles and text does not have to be ugly, nor does it have to be unreadable. That being said, these two features are not sufficient to express a number of effects that often help readability and visual appeal.
Looking at those examples, you’ll notice a few other visual features—buttons look embossed or debossed, rectangular borders (“hollow” rectangles) are common, corners of both hollow and filled rectangles can be rounded, and windows or dropdown menus have shadows behind them. Many interfaces also prominently feature icons accompanying text.
As I’ve already said, you can introduce arbitrarily-many features that you might want to support. In this post, I’ll be covering one way of supporting just the ones I’ve mentioned.
The Naïve Top-Down Approach
A very common strategy I’ve seen in the programming world is thinking of certain desired high-level features, then directly codifying each one as a separate codepath. I call this strategy “top-down” because it begins enforcing constraints on code by starting with high-level requirements. Such an approach would mean that, in the case of user interface rendering, each of the features I’ve mentioned—text, rectangles, rounded rectangles, rectangle borders, rounded rectangular borders, drop shadows, and icons—would all be implemented as distinct codepaths. Each feature is seen as a separate “case” to handle, and the programmer in charge naïvely translates that into distinct cases in the code.
This has a number of possible drawbacks. The first is simply that you may write (and thus maintain) more code to implement each feature, when compared to an alternative world where you got all of the features you wanted out of a smaller number of codepaths. That might not sound too bad for a small number of features, but it is worse than you might first assume. The “cases” are heterogeneous, and so a degree of variability propagates elsewhere, forcing itself not only into the implementation of each case, but also into any other code that must interact with those cases. Any codepath that wants to programmatically parameterize this rendering codepath must now scale itself with the number of cases. For these reasons, I consider each addition of a “case” to be a multiplication of code, rather than an addition of code.
The top-down approach also, in the case of rendering, makes for worse batches of GPU work. GPUs are particularly good at doing very large batches of parallelized homogeneous workloads. So, in my estimation, if there is a way to use a single, simple, homogeneous codepath to express all needed features, that will often be preferable to the alternatives.
So this “single, simple, homogeneous codepath” thing sounds pretty great. How do we build it? Unsurprisingly, we’ll approach this with “bottom-up” thinking.
Bottom-Up: Layering On Constraints
We’ve all been in this situation. You know the one—you’re putting together a piece of furniture, and you get a little too ambitious with tightening screws, before everything is in place. Your plan is to screw everything in as tightly as possible in a single pass, so that when you screw the last piece in place, you’re left with a sturdy and entirely complete piece of furniture.
Unfortunately, while screwing together your furniture in this way, you also simultaneously screw yourself. This is because you’ve overcommitted to certain constraints. The wood or metal you’re using is forced to slightly contort to meet your demands. The problem is that it begins contorting to an incomplete set of demands—you haven’t yet “told the material” about all of the screws it ought to contort with respect to.
Code is, interestingly, similar. If you overfit a codepath to a subset of your constraints, then it may become contorted to fit those constraints before you’ve even introduced some features. Fitting in the features you left out originally, then, will be more painful. You may need to loosen other constraints, or undergo some friction in introducing the new constraints.
“Bottom-up” is perhaps best conceptualized as gradually introducing constraints to a single codepath over time. In the furniture analogy, it’s about putting in all screws up-front, leaving them only partially screwed in, then tightening them all together. Each “screw” is a lower-level requirement (at whatever level of analysis you want to focus on) about what effects you expect the codepath to be capable of producing.
This approach requires that you are more careful about each change you make. Before making a change, you first must check that it does not violate any of your constraints. If it does, you may need to slightly reform the change, and leave some “wiggle-room” for the constraints you’ve yet to visit.
Building A UI Shader (Bottom-Up Style)
So, with that picture in mind, and with the goal of using a GPU renderer to put our user interfaces on screen, let’s build vertex and pixel shaders (with the rest of the pipelining implied) that satisfy all of our feature constraints. I’ll be writing these shaders in a C-like pseudocode, which you should be able to easily translate to GLSL, HLSL, or whatever else fits your requirements.
Pass I → Solid Color Rectangles
Firstly, we know that we need to be able to render a batch of solid-colored rectangles. We can easily send down per-instance data for each rectangle that encodes its color and positioning on screen. We can use that per-instance data, then, in a large draw call for instanced triangle strips, with each instance letting us draw a rectangle.
struct Globals
{
float2 res; // resolution
};
struct VS_Input
{
float2 p0; // top left corner of rectangle
float2 p1; // bottom right corner of rectangle
float4 color;
uint vertex_id; // synthetic
};
struct PS_Input
{
float4 vertex;
float4 color;
};
Globals globals;
// vertex shader
PS_Input VS_Main(VS_Input input)
{
// static vertex array that we can index into
// with our vertex ID
static float2 vertices[] =
{
{-1, -1},
{-1, +1},
{+1, -1},
{+1, +1},
};
// "dst" => "destination" (on screen)
float2 dst_half_size = (input.p1 - input.p0) / 2;
float2 dst_center = (input.p1 + input.p0) / 2;
float2 dst_pos =
(vertices[input.vertex_id] * dst_half_size + dst_center);
// package output
PS_Input output = {0};
output.vertex = float4(2 * dst_pos.x / globals.res.x - 1,
2 * dst_pos.y / globals.res.y - 1,
0,
1);
output.color = input.color;
return output;
}
// pixel shader
float4 PS_Main(PS_Input input)
{
return input.color;
}
Note: The vertex_id member of VS_Input is “synthetic”, meaning it is not packaged directly with data you prepare on the CPU—it is an implicit detail about a shader codepath. In OpenGL, its equivalent is gl_VertexID. For a given invocation of a vertex shader, it will correspond with the index of the current vertex on the current primitive.
The above shaders (and implied pipeline) allow us to render a batch of solid-colored rectangles, so we’ve satisfied our first constraint.
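If it helps to picture the CPU side, here is a minimal sketch of how the per-instance data might be filled and submitted, assuming an OpenGL-style API. The names (R_RectInstance, instances) and the buffer setup are hypothetical, and the struct layout must match whatever input layout your API is configured with:

// per-instance data, mirroring VS_Input (minus the synthetic vertex_id)
typedef struct R_RectInstance R_RectInstance;
struct R_RectInstance
{
    float p0[2];    // top-left corner, in pixels
    float p1[2];    // bottom-right corner, in pixels
    float color[4]; // rgba
};

// fill an array of instances for this frame
static R_RectInstance instances[4096];
int instance_count = 0;
instances[instance_count++] = (R_RectInstance){{16, 16}, {128, 48}, {0.2f, 0.2f, 0.2f, 1.f}};

// upload `instances` to a GPU buffer bound as per-instance vertex data
// (e.g. glBufferSubData + glVertexAttribDivisor(attribute, 1)), then
// issue a single instanced draw for the whole batch:
//
//   glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, instance_count);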
Pass II → Text, Icons, Bitmaps
Now that we can draw solid-color rectangles, we’re only a few steps away from also rendering textured rectangles. Textured rectangles can be used for rendering text (in this case, the texture is a glyph atlas) and icons (this would also be a “glyph” atlas—more on this later). All of the code we have for using per-instance data to position solid-colored rectangles also applies to textured rectangles; we just need to equip the shaders with the ability to sample from textures as well.
Instead of introducing any branching, or modes of the shaders, we can collapse both solid-colored rectangles and textured rectangles into the same codepaths by plugging in a white texture for solid-colored rectangles, and always blending the rectangle’s color with the texture sample. This also allows for drawing tinted textured rectangles, which is what we would need for colored text.
To make this change, first let’s add a Texture2D to our Globals:
struct Globals
{
float2 res; // resolution
Texture2D texture;
};
We also need to—on a per-instance-basis—specify texture source coordinates, so that we can use atlas textures, with each rectangle only using a specific portion of an entire texture. This is very similar to specifying destination coordinates, but it’s just mapping between different spaces.
For this, we’ll first need to expand our per-instance data:
struct VS_Input
{
float2 dst_p0; // top left corner on screen
float2 dst_p1; // bottom right corner on screen
float2 src_p0; // top left corner on texture
float2 src_p1; // bottom right corner on texture
float4 color;
uint vertex_id; // synthetic
};
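On the CPU side, filling out these source coordinates might look something like the following sketch. The atlas metadata names (glyph->x0 and friends) and white_pixel_* are hypothetical; the important part is that solid-colored rectangles simply point their source rectangle at a reserved, fully-white region of the atlas:

// textured rectangle (e.g. a glyph): use its rectangle in the atlas
instance->src_p0 = float2(glyph->x0, glyph->y0);
instance->src_p1 = float2(glyph->x1, glyph->y1);

// solid-colored rectangle: point the source at a reserved white pixel,
// so that `color * sample` in the pixel shader leaves the color untouched
instance->src_p0 = float2(white_pixel_x,     white_pixel_y);
instance->src_p1 = float2(white_pixel_x + 1, white_pixel_y + 1);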
And unsurprisingly, the coordinate mapping portion of the vertex shader will look similar:
// "dst" => "destination" (on screen)
float2 dst_half_size = (input.dst_p1 - input.dst_p0) / 2;
float2 dst_center = (input.dst_p1 + input.dst_p0) / 2;
float2 dst_pos =
(vertices[input.vertex_id] * dst_half_size + dst_center);
// "src" => "source" (on texture)
float2 src_half_size = (input.src_p1 - input.src_p0) / 2;
float2 src_center = (input.src_p1 + input.src_p0) / 2;
float2 src_pos =
(vertices[input.vertex_id] * src_half_size + src_center);
We’ll then need to send this information to the pixel shader, so that it can sample from the texture at the appropriate location for each pixel. This will require us first to expand our PS_Input type:
struct PS_Input
{
float4 vertex;
float2 uv;
float4 color;
};
And then, we can prepare this data in the vertex shader. For destination coordinates, we were mapping to clip space, which is in the [-1, +1] range—in this case, we’ll map to UV space, which is in the [0, 1] range:
// package output
PS_Input output = {0};
output.vertex = float4(2 * dst_pos.x / globals.res.x - 1,
2 * dst_pos.y / globals.res.y - 1,
0,
1);
output.uv = float2(src_pos.x / globals.texture.size.x,
src_pos.y / globals.texture.size.y);
output.color = input.color;
return output;
And, finally, we can sample from our texture in the pixel shader:
// pixel shader
float4 PS_Main(PS_Input input)
{
float4 sample = SampleTexture2D(globals.texture, input.uv);
float4 color = input.color * sample;
return color;
}
I won’t cover how you might go about producing a glyph atlas texture, since that is out-of-scope for this series, and there are many ways to approach that problem, with the best option being highly dependent on problem-space. Here is a non-comprehensive list of options that you might want to look into if you’re trying to approach this problem:
Draw the atlas yourself. If you’re building a game—especially one with simpler graphics, like pixel art—you might just want a simple glyph atlas that you draw yourself, that supports the glyphs you specifically need. People forget about this option!
Generate the atlas with a third-party tool. I’ll leave it at “tool”, because being more specific would be a huge rabbit hole. For example, though, I’ve done this in the past when I was using signed-distance-field glyph atlases (these are useful when you need dynamically scaling text, like in a 3D game), in which case I used a tool called Hiero (I’ll avoid linking to it, because I always download it from sketchy software hosting websites—Google it and download random executables at your own risk), which exported both a signed-distance-field glyph atlas, and metadata for mapping codepoints to glyphs. The rendering pipeline I’m covering in this post wouldn’t support signed-distance-field texture rendering, but it wouldn’t take much to extend it for this case.
stb_truetype. This is a single-file C library that allows you to decode TrueType Font (.ttf) files and produce glyph atlases from them. It’s a very popular option. It likely will not be your best option for producing the highest-quality glyphs, but it does a decent job, and is simple to use (see the sketch after this list). I also don’t recommend using it in cases where you may have to load and use untrusted files. Even with those caveats, though, it’s a very useful bootstrapping option.
FreeType. I have no experience using FreeType, but I know that it’s a popular library for rasterizing glyphs and building font atlases. I’m just linking to it here so that you’re aware of it—I don’t really have any other information about it.
Native operating system libraries. These will vary depending on your target platforms, obviously, but they are sometimes the best way to produce very crisp glyphs, and deal with untrusted input files. On Windows, you’d want to look at DirectWrite. For doing this, I recommend Allen Webster’s DirectWrite example code. I can’t speak to other platforms.
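To make the stb_truetype option concrete, here is a rough sketch of baking a small ASCII atlas with it. The font path, atlas size, and pixel height are placeholders, and error handling is omitted:

#define STB_TRUETYPE_IMPLEMENTATION
#include "stb_truetype.h"
#include <stdio.h>

static unsigned char ttf_data[1 << 20]; // raw .ttf file contents
static unsigned char atlas[512 * 512];  // single-channel glyph atlas
static stbtt_bakedchar baked[96];       // per-glyph atlas rects + metrics

static void BakeGlyphAtlas(void)
{
    FILE *file = fopen("font.ttf", "rb");
    fread(ttf_data, 1, sizeof(ttf_data), file);
    fclose(file);
    // rasterize ASCII 32..126 at a 32px height into the atlas;
    // baked[c - 32] then holds the atlas rectangle (x0, y0, x1, y1),
    // offsets, and advance for codepoint c, which is exactly the data
    // needed to fill src_p0/src_p1 (and compute dst_p0/dst_p1) per glyph
    stbtt_BakeFontBitmap(ttf_data, 0, 32.f, atlas, 512, 512, 32, 96, baked);
}

The single-channel bitmap would then be uploaded as a texture, typically expanded to white-with-alpha so that the tinting in the pixel shader works as-is.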
Note: Icons are a useful way to improve the readability and polish of your user interfaces. Unfortunately, they’ve always seemed like a hassle to me. I’ve recently found a fairly low-friction way to use them, if you’re already using TrueType font files for text rendering. There is a free online service I found called Fontello (this service is also open source, so there might be a way to better integrate it into offline workflows) that allows easily generating TrueType font files with a custom codepoint → glyph mapping. This allows you to associate a Unicode codepoint with any glyph. These glyphs can come from Fontello’s fairly large icon library, but they can also come from custom SVGs.
What this means is that you can use the features of your renderer built for text rendering and also use them for icon rendering, just with a different source for your glyphs. Your glyph atlas rasterization should already allow for rasterizing glyphs at various sizes—this means that icons can also be scaled appropriately.
Pass III → Gradients
Earlier I mentioned embossed or debossed effects on certain controls in user interfaces. These effects—as well as a number of others (e.g. rendering color pickers)—can be achieved by upgrading our shaders to support gradients.
For our purposes, this can be achieved simply by specifying per-vertex colors, then using built-in interpolation to produce a blended color in the pixel shader. This will not produce the highest-quality gradients (simple per-vertex-color interpolation comes with visible artifacts), nor will it support more complicated gradient features (adjusting the transition curve between colors, allowing more than two colors, and so on). But those features and improvements, I’ve found, are not necessary to support a number of the most important effects. That being said, you can certainly fit those more complex features into this same style of renderer.
To make this change, we need to fatten up our per-instance data type a bit more:
struct VS_Input
{
float2 dst_p0;
float2 dst_p1;
float2 src_p0;
float2 src_p1;
float4 colors[4]; // (change is here)
uint vertex_id;
};
Then, we can sample from colors using vertex_id:
output.color = input.colors[input.vertex_id];
Using a combination of several gradients, you can render fairly high-quality rectangular embossing and debossing effects.
For example, if I want to make a button appear embossed when it is hovered by the user, I can render a gradient that fades from semi-transparent white on the top vertices to full-transparency on the bottom vertices. To make the same button appear debossed, I can render gradients that fade from a semi-transparent black to full-transparency to emulate shadows on the edge of the button, to make it appear “pressed”:
The strength of these gradients can be a function of animation state that is cached across frames, which is how you’d tie everything together to obtain fully animated interactions.
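As a concrete example, the per-instance colors for the hovered (embossed) case might be filled out like this, where hot_t is a hypothetical [0, 1] animation value cached across frames, and the indices follow the vertex order of the static array in the vertex shader (0 = top-left, 1 = bottom-left, 2 = top-right, 3 = bottom-right):

// an extra rectangle drawn over the button's base color, fading
// from semi-transparent white at the top edge to fully transparent
// at the bottom edge
float highlight = 0.25f * hot_t;
instance->colors[0] = float4(1, 1, 1, highlight); // top-left
instance->colors[2] = float4(1, 1, 1, highlight); // top-right
instance->colors[1] = float4(1, 1, 1, 0);         // bottom-left
instance->colors[3] = float4(1, 1, 1, 0);         // bottom-right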
Pass IV → Rounded Corners, Soft Edges
We can extend our rectangle renderer with support for rounded corners and soft edges (useful for glows or drop shadows) by applying a signed-distance function at each pixel of each rectangle, and using its result to adjust that pixel’s color.
First, we obviously don’t want rounded corners and soft shadows for every rectangle we draw—so, we need a way to parameterize it per-instance:
struct VS_Input
{
float2 dst_p0;
float2 dst_p1;
float2 src_p0;
float2 src_p1;
float4 colors[4];
float corner_radius; // new
float edge_softness; // new
uint vertex_id;
};
Because both of these new per-instance values will need to be used per-pixel, we’ll also need to add them to the pixel shader input. We also need to send information about the pixel position and destination coordinates to the pixel shader.
struct PS_Input
{
float4 vertex;
float2 uv;
float2 dst_pos; // new
float2 dst_center; // new
float2 dst_half_size; // new
float corner_radius; // new
float edge_softness; // new
float4 color;
};
And, obviously, we need to fill this information out when returning from the vertex shader:
// calculated earlier
output.dst_pos = dst_pos;
output.dst_center = dst_center;
output.dst_half_size = dst_half_size;
// pass-through, no vertex shader work to do
output.corner_radius = input.corner_radius;
output.edge_softness = input.edge_softness;
Below is a signed-distance function that calculates the distance between a point and a rounded rectangle. I won’t get into the weeds of building these functions (it’s mostly off-topic, and I suspect I’m not the best person to do so), but it isn’t too difficult to work out the math for something this simple.
float RoundedRectSDF(float2 sample_pos,
                     float2 rect_center,
                     float2 rect_half_size,
                     float r)
{
    float2 d2 = (abs(rect_center - sample_pos) -
                 rect_half_size +
                 float2(r, r));
    return min(max(d2.x, d2.y), 0.0) + length(max(d2, 0.0)) - r;
}
We can then use this function in our pixel shader to blend the output appropriately, to account for both corner radius and edge softness.
// we need to shrink the rectangle's half-size
// that is used for distance calculations with
// the edge softness - otherwise the underlying
// primitive will cut off the falloff too early.
float softness = input.edge_softness;
float2 softness_padding = float2(max(0, softness*2-1),
max(0, softness*2-1));
// sample distance
float dist = RoundedRectSDF(input.dst_pos,
                            input.dst_center,
                            input.dst_half_size-softness_padding,
                            input.corner_radius);
// map distance => a blend factor
float sdf_factor = 1.f - smoothstep(0, 2*softness, dist);
// use sdf_factor in final color calculation
float4 color = input.color * sample * sdf_factor;
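As a usage example of these two parameters, a drop shadow behind a window becomes just one more instance of the same primitive: a dark, mostly-transparent, slightly offset rectangle with a large edge softness. The field names and numbers below are placeholders:

// drop shadow: submitted before (underneath) the window's own rectangles
shadow->dst_p0        = window_p0 + float2(4, 4);  // offset down/right
shadow->dst_p1        = window_p1 + float2(4, 4);
shadow->src_p0        = float2(white_pixel_x,     white_pixel_y);
shadow->src_p1        = float2(white_pixel_x + 1, white_pixel_y + 1);
shadow->corner_radius = window_corner_radius;
shadow->edge_softness = 8.f;                       // blurry falloff
for(int i = 0; i < 4; i += 1)
{
    shadow->colors[i] = float4(0, 0, 0, 0.35f);
}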
Pass V → Hollow Rectangles
We can express “hollowed-out” rectangles (including those that must be rounded or soft) with a single additional piece of per-instance information: border thickness. We can use this per-instance parameter with another signed-distance sample for the rectangle’s “interior”. We can skip this with a branch in the case of filled rectangles, just by checking if the border thickness is specified as 0.
I’m sure you get the picture at this point, but for the sake of completeness, let’s pipe this data into the vertex shader and then into the pixel shader:
struct VS_Input
{
float2 dst_p0;
float2 dst_p1;
float2 src_p0;
float2 src_p1;
float4 colors[4];
float corner_radius;
float edge_softness;
float border_thickness; // new
uint vertex_id;
};
struct PS_Input
{
float4 vertex;
float2 uv;
float2 dst_pos;
float2 dst_center;
float2 dst_half_size;
float corner_radius;
float edge_softness;
float border_thickness; // new
float4 color;
};
Once you have this data available in the pixel shader, using it to produce a new blend factor might look something like this:
float border_factor = 1.f;
if(input.border_thickness != 0)
{
float2 interior_half_size =
  input.dst_half_size - float2(input.border_thickness,
                               input.border_thickness);
// reduction factor for the interior corner
// radius. not 100% sure the best way to go
// about this, but this is the best thing I've
// found so far!
//
// this is necessary because, without shrinking
// the interior radius, the border appears
// noticeably thicker at the corners
float interior_radius_reduce_f =
min(interior_half_size.x/input.dst_half_size.x,
interior_half_size.y/input.dst_half_size.y);
float interior_corner_radius =
(input.corner_radius *
interior_radius_reduce_f *
interior_radius_reduce_f);
// calculate sample distance from "interior"
float inside_d = RoundedRectSDF(input.dst_pos,
input.dst_center,
interior_half_size-
softness_padding,
interior_corner_radius);
// map distance => factor
float inside_f = smoothstep(0, 2*softness, inside_d);
border_factor = inside_f;
}
The final color calculation then becomes:
float4 color = input.color * sample * sdf_factor * border_factor;
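For reference, here is what the full pixel shader looks like once all of these pieces are combined. It is just the snippets above stitched together, in the same pseudocode, with RoundedRectSDF as defined earlier:

// pixel shader (all passes combined)
float4 PS_Main(PS_Input input)
{
    // texture sample (a reserved white pixel for solid rectangles)
    float4 sample = SampleTexture2D(globals.texture, input.uv);

    // shrink the SDF rectangle by the softness padding, as before
    float softness = input.edge_softness;
    float2 softness_padding = float2(max(0, softness*2-1),
                                     max(0, softness*2-1));

    // per-pixel blend factor from the rounded-rectangle SDF
    float dist = RoundedRectSDF(input.dst_pos,
                                input.dst_center,
                                input.dst_half_size-softness_padding,
                                input.corner_radius);
    float sdf_factor = 1.f - smoothstep(0, 2*softness, dist);

    // hollow rectangles: carve out the interior
    float border_factor = 1.f;
    if(input.border_thickness != 0)
    {
        float2 interior_half_size =
          input.dst_half_size - float2(input.border_thickness,
                                       input.border_thickness);
        float interior_radius_reduce_f =
          min(interior_half_size.x/input.dst_half_size.x,
              interior_half_size.y/input.dst_half_size.y);
        float interior_corner_radius =
          (input.corner_radius *
           interior_radius_reduce_f *
           interior_radius_reduce_f);
        float inside_d = RoundedRectSDF(input.dst_pos,
                                        input.dst_center,
                                        interior_half_size-softness_padding,
                                        interior_corner_radius);
        border_factor = smoothstep(0, 2*softness, inside_d);
    }

    return input.color * sample * sdf_factor * border_factor;
}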
With that, we’ve made a single rendering pipeline that supports all of the features I mentioned that we needed earlier: rectangles, rounded rectangles, hollow rectangles, hollow rounded rectangles, gradients, bitmaps, text, icons, and drop shadows. These features become, ultimately, different “expressions” that all use the same pipeline.
The Rest Of The Picture
I’m hoping that the rest of the renderer code is more-or-less clear at this point, as it mostly follows from the vertex and pixel shaders. I’ve skipped over a lot—concrete GLSL or HLSL code, wrangling GPU APIs to set up and use the rendering pipeline, specifying clipping rectangles, abstracting over multiple GPU APIs, and techniques for building a high level rendering API that cleanly batches up data for this pipeline. I’ve left these topics out because they’re really more appropriate for a general guide on renderer programming, and there are probably already a number of resources that cover them. For the purposes of rendering user interfaces, I’m hoping I’ve covered everything I need to in sufficient detail.
Once the renderer is in place, the final piece of the puzzle is the codepath that iterates a UI_Box hierarchy and queues up instances for this renderer. This codepath will be intertwined with which feature flags are applied to each UI_Box node, and whatever custom rendering parameterization that builder code has provided for each node.
Before wrapping up, I’d like to mention that this codepath is ripe for batching, because in a user interface, most widgets do not overlap. In cases when widgets do not overlap, you’re free to assume that the order in which you render those widgets does not matter—this means that you’re free to, for example, batch all non-overlapping text and non-overlapping rectangles together. What this means, at the end of the day, is that you can often render very complex user interfaces with only a small number of draw calls.
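Here is a rough sketch of what that iteration might look like. All of the helper and flag names are hypothetical, and it ignores details like clipping rectangles and the overlap analysis itself; the point is just that the whole hierarchy funnels into a small number of instance batches:

// walk the UI_Box hierarchy, appending one instance per rendered feature;
// the batch is only flushed when state that matters actually changes
// (texture, clip rectangle, or an overlap that forces an ordering)
R_Batch *batch = R_BatchBegin(glyph_atlas_texture);
for(UI_Box *box = root; box != 0; box = UI_BoxDepthFirstNext(box))
{
    if(box->flags & UI_BoxFlag_DrawBackground)
    {
        R_BatchPushRect(batch, box->rect, box->background_color,
                        box->corner_radius, /* edge_softness */ 0,
                        /* border_thickness */ 0);
    }
    if(box->flags & UI_BoxFlag_DrawBorder)
    {
        R_BatchPushRect(batch, box->rect, box->border_color,
                        box->corner_radius, 0, box->border_thickness);
    }
    if(box->flags & UI_BoxFlag_DrawText)
    {
        R_BatchPushText(batch, box->rect.p0, box->string, box->text_color);
    }
}
R_BatchEnd(batch); // ends up as one (or a few) instanced draw calls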
Conclusion
That’s everything I wanted to cover for user interface rendering. I hope this was an insightful look into how you can get a lot out of a little in user interface rendering—even though, like I said, you can get arbitrarily ambitious with many pieces of this problem, I’ve found that what I’ve covered is sufficient for very sophisticated user interfaces.
Thanks for reading!