UI, Part 4: The Widget Is A Lie (Node Composition)
Ditching the idea that a "widget" must be explicitly represented in our user interface hierarchy data structure.
So far, we’ve covered a lot of ground. I started this series with a first-principles definition of a user interface, and the idea that a user interface facilitates communication between a user and a programmer. I’ve introduced the concept of an immediate-mode API as a suitable design strategy for a user interface system’s builder code. And lastly, I’ve provided a basic overview of what the core code might look like—specifically, how it might avoid a combinatoric explosion of features.
In this part, I’d like to extend the plan to cover more complex widgets. The details of something like UI_Button and how that breaks down are fairly simple. But how does a more complex widget look? What about combo boxes, dropdown menus, and tables? Are those also just another UI_Widget in the tree? Is it enough to just continue adding features until a single UI_Widget is sophisticated and expressive enough to handle all of our problems?
Ditching Top-Down Thinking
First, I want to break an illusion I’ve been maintaining thus far throughout this series. I’ve been referring to nodes in the “user interface hierarchy” as UI_Widgets. I did this mostly because I expected it to be more presentable and understandable. But this name comes from a top-down, high-level conceptualization of what we’re doing. A “widget” can include a number of possible formulations in a user interface. Simple concepts—buttons, checkboxes, sliders—are generally called “widgets”, it’s true. But other, more sophisticated concepts—combo boxes, dropdown menus, single-line text editors, multi-line text editors, table editors—are also called “widgets”.
But what actually are UI_Widgets, and the code we’ve set up to work with them? A UI_Widget is certainly not sufficient—as I’ve already expressed it—to entirely provide functionality for any of those more complicated widgets.
A UI_Widget is, instead, one node in a hierarchical (and persistent caching) data structure we’ve set up. Each node can have any number of children nodes. Each node offers one possible granularity to apply the features that I’ve established are important for a user interface: a rectangular sub-allocation of screen space, clickability, rendering and animation characteristics, and so on.
For a moment, let’s ditch the UI_Widget name. It doesn’t matter what we call it—for now, let’s call it X. Don’t think about these nodes as “widgets” anymore, and instead just consider them as a useful building block we can use.
Now, let’s take a more complex widget example: a “list box”.

How might you go about composing multiple Xs into a hierarchy to produce the effect of the “list box” above, including the scroll bar on the side?
Here’s the formulation I came up with:
* List Box Region (child layout on x)
| * Scrollable Region (fill space, clip rectangle, overflow y)
| | * List Item 1 (clickable text)
| | * List Item 2 (clickable text)
| | * List Item 3 (clickable text)
| | * List Item 4 (clickable text)
| | * (etc.)
|
| * Scroll Bar Region (fixed size, child layout on y)
| | * Scroll-Up Button (clickable button with up-arrow)
| | * Space Before Scroller
| | * Scroller (fixed size, proportional to view - draggable)
| | * Space After Scroller
| | * Scroll-Down Button (clickable button with down-arrow)
In the above, each * corresponds to one X node. Hopefully you can see that one “widget”—the list box—can be decomposed into several of these X building blocks.
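As a concrete taste of one feature a single node in this decomposition supplies, the “proportional to view” note on the Scroller reduces to a bit of arithmetic. Here is a minimal sketch of it as a pure function—the names, signature, and units (pixels) are my own for illustration, not necessarily what the real core code uses:

```c
#include <assert.h>

// Hypothetical helper: compute the scroller ("thumb") size and offset along
// its track, from the scrollable region's dimensions.
typedef struct Scroller Scroller;
struct Scroller
{
    float size; // height of the draggable thumb, in pixels
    float off;  // offset of the thumb from the top of its track
};

static Scroller
ScrollerFromView(float track_px, float view_px, float content_px, float scroll_px)
{
    Scroller s = {0};
    if(content_px <= view_px)
    {
        // everything fits: the thumb fills the whole track
        s.size = track_px;
        s.off = 0;
    }
    else
    {
        // thumb size is proportional to the visible fraction of the content
        s.size = track_px * (view_px / content_px);
        // thumb offset is proportional to the scroll fraction
        s.off = (track_px - s.size) * (scroll_px / (content_px - view_px));
    }
    return s;
}
```

The Scroller node’s fixed size would then be set from this result each frame, and dragging it would invert the same mapping to produce a new scroll position.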
This decomposition may have already been clear to some readers, and unclear to others. My point is that it wasn’t clear to me when I first started doing user interface programming—you can hopefully see that it wasn’t obvious up-front that we’d decompose it that way, instead of adding more features into UI_Widget to make it powerful enough to express all of that.
This decomposition strategy allows builder code, with this system, to express very complicated widgets with a fairly small number of features supported by the core code. It also keeps our hierarchy/cache node type—while still large—from growing much larger past a certain stage of feature support.
So, after clearing that up, the only thing left to do is to decide on a name for X. It’s hopefully clear that an X is not actually a “widget”, but a building block that lets builder code construct widgets.
In all of the recent codebases I’ve worked on, I’ve settled on the name UI_Box. It’s not great, but it’s just a name I had to attach to the building block, and this one is small and easy to type. That name change applies to all of the associated functions too. Instead of UI_WidgetMake, it’ll be UI_BoxMake. Instead of UI_WidgetFlags, it’ll be UI_BoxFlags.
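To make the building block concrete, here is one possible shape for a UI_Box node in C. This is purely illustrative—the specific fields and flag names are my assumptions for the sketch, and a real node type would carry many more per-frame and cached fields (layout constraints, animation state, hash links for the cache, and so on):

```c
#include <assert.h>
#include <stdint.h>

// Per-node feature selection: each flag turns on one of the small set of
// features the core code supports, at the granularity of one box.
typedef uint32_t UI_BoxFlags;
enum
{
    UI_BoxFlag_Clickable  = (1<<0),
    UI_BoxFlag_DrawText   = (1<<1),
    UI_BoxFlag_DrawBorder = (1<<2),
    UI_BoxFlag_Clip       = (1<<3),
    UI_BoxFlag_OverflowY  = (1<<4),
};

typedef struct UI_Box UI_Box;
struct UI_Box
{
    // tree links: children, siblings, parent
    UI_Box *first, *last, *next, *prev, *parent;

    // features requested by builder code this frame
    UI_BoxFlags flags;

    // computed layout: this node's rectangular sub-allocation of screen space
    float x0, y0, x1, y1;
};
```

The important property is that every “widget”, simple or complex, is built out of nothing but these nodes and their flags.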
Naming seems fairly superficial and uninteresting, but I think it makes an impact on how we think about problems as programmers. Attaching a too high-level or top-down name to nothing more than a useful data-organization mechanism—like “widget”—can often lead us astray. And, indeed, using the “widget” label did lead me astray for years.
Note: This is why I am vehemently against practices often taught in, for example, a computer science university program, that suggest a good strategy for building software architecture is to bake a high-level mental model of what the problem is into which types are used (and, in the worst case, the “relationships” between these types being modeled after the relationships between concepts in the high-level mental model).
Complex Widget Decomposition Examples
To wrap up, let’s explore a few complex widgets, and how they may be decomposed.
Single-Line Text Field
A single-line text field may not initially seem like it requires any decomposition. But, to successfully account for all expected behavior, it is useful to decompose it.
In the above GIF, you’ll see that a single-line text field has a number of requirements in order to match expected behavior. There is an expectation of keyboard input (obviously), and of mouse input. It also requires a special rendering codepath to render a cursor and selection. Those features alone do not suggest that we need any decomposition—we can fit that into the features we have for a single node in the box hierarchy.
This stops being true when you consider that single-line text fields must have a strategy for text that goes “out-of-bounds”. We can deal with this easily by using the same codepaths we use for scrolling elsewhere, and just ensuring that the scroll position of the single-line text field is adjusted to keep the cursor in the visible range.
The rest of the code we’d already have around for scrolling continues to apply—notably, that of animation, or of using the mouse wheel (if we want).
Regardless of the scroll position, the single-line text field “widget” must remain in the same place (of course). We can use a single UI_Box for the “container” part of the widget. For the text itself, we can use a child UI_Box that we allow to grow and overflow the size of the “container”, and scroll around in the virtual “viewing space” of the “container”.
So, the decomposition is this:
* Container (allow overflow in x, clip rectangle, scrollable, clickable)
| * Text Content (text + extra rendering for cursor/selection)
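The “keep the cursor in the visible range” rule is a small pure function. A minimal sketch, with names and units (pixels) of my own choosing:

```c
#include <assert.h>

// Given the field's current scroll offset, the cursor's x position within
// the unscrolled text, and the visible width of the container, return a
// scroll offset that keeps the cursor on-screen. Illustrative sketch only.
static float
ScrollToKeepCursorVisible(float scroll, float cursor_x, float view_width)
{
    if(cursor_x < scroll)
    {
        // cursor ran off the left edge; snap the view left
        scroll = cursor_x;
    }
    else if(cursor_x > scroll + view_width)
    {
        // cursor ran off the right edge; snap the view right
        scroll = cursor_x - view_width;
    }
    return scroll;
}
```

Running this once per frame—after text edits move the cursor—is enough; the existing scrolling animation code can then ease the visible position toward the new target.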
Menu Bar
A menu bar is a very common widget you’ll find in native applications, particularly on Windows. The standard behavior of these is quite specific, and is often implemented incorrectly.
When the user presses their mouse on a menu bar button, the corresponding menu is opened (unlike other buttons, which require a press followed by a release while still hovering). At this point, the menu bar is considered “activated”, and hovering over another menu bar button (without pressing) will close the currently-open menu, and open the one corresponding to the hovered button.
I’ll leave the dropdown itself aside for now. The actual menu bar seen above can be decomposed as follows:
* Menu Bar Container (layout children in x)
| * File Menu Bar Button (clickable. if the menu bar is not
|                         activated, then activate the menu bar
|                         and load this button's menu on *press*.
|                         if it is activated, then activate on
|                         *hover*.)
| * Window Menu Bar Button
| * Panel Menu Bar Button
| * View Menu Bar Button
| * Control Menu Bar Button
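The press-to-activate, hover-to-switch rule can be sketched as a small piece of per-button logic. This is a standalone model with hypothetical names and state, not the article’s actual builder code:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct MenuBarState MenuBarState;
struct MenuBarState
{
    bool active;   // has a press already activated the bar?
    int open_menu; // index of the currently-open menu, or -1 for none
};

// Run once per menu bar button, per frame, with that button's input state.
static void
MenuBarButtonLogic(MenuBarState *state, int index, bool pressed, bool hovered)
{
    if(pressed)
    {
        // a press always activates the bar and opens this button's menu
        state->active = true;
        state->open_menu = index;
    }
    else if(state->active && hovered)
    {
        // once activated, merely hovering switches the open menu
        state->open_menu = index;
    }
}
```

Deactivation (clicking elsewhere, or pressing Escape) would clear `active` and reset `open_menu` to -1; I’ve left that out of the sketch.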
In-Place Keybinding Modifiers
At first glance, the “Open” button above looks like just a single button, with keybinding text displayed on the right-hand-side. But by composing multiple boxes together, the keybinding text can be an entire widget that acts separately as a child of the overall button. This is possible with the following simple decomposition:
* Button "Container" (clickable, but click happens after children)
| * Icon Label
| * Text Label
| * Space, to align to right-hand-side
| * Binding Button (clickable, with other special behavior)
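One way to read the “click happens after children” note is as hit-testing order: children are tested before their container, so the Binding Button steals clicks that land inside its rectangle. A standalone sketch of that ordering, with all names hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct Rect Rect;
struct Rect { float x0, y0, x1, y1; };

static bool
RectContains(Rect r, float x, float y)
{
    return (x >= r.x0 && x < r.x1 && y >= r.y0 && y < r.y1);
}

// Returns 1 if the binding child consumes the click, 0 if the container
// does, and -1 if neither. The child is tested first, mirroring the
// "click happens after children" rule.
static int
ClickTarget(Rect container, Rect binding, float x, float y)
{
    if(RectContains(binding, x, y))   { return 1; }
    if(RectContains(container, x, y)) { return 0; }
    return -1;
}
```

In the real hierarchy this falls out of visiting the tree depth-first when consuming events, rather than being special-cased per widget.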
Brief Note on Dropdown Menus & Tooltips
In the menu bar GIFs above, you can see both dropdown menus and tooltips. The actual structure of these is very easy to decompose—it is identical to how you’d decompose any other column of several widgets elsewhere in the user interface. The difference with these concepts is that they are often built in-line with a widget tree, but must be the first subtrees to consume events (and thus the last ones rendered). When building tooltips or dropdown menus, you’d ideally be able to write code as such:
UI_Comm comm = UI_ButtonF("Hover Me!");
if(comm.hovering)
{
    UI_BeginTooltip();
    UI_LabelF("I am a label!");
    UI_EndTooltip();
}
This may seem like a head-scratcher at first, because the tooltip should not appear at this location in the box hierarchy. But, an easy solution to this is to pre-build the tooltip and dropdown box hierarchy roots at the beginning of the frame, and push new widgets into them at any time. Then, in the case of dropdown menus, widgets inside of them should capture inputs, even if they are built after widgets elsewhere in the frame (that are covered by the dropdown). To solve this, you simply need to test the mouse position against that special subtree before consuming any events (because the dropdown layout will be preserved stably across frames).
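The pre-built-root trick can be modeled with a parent stack: UI_BeginTooltip simply pushes a root that was created at the start of the frame, so boxes built between Begin/End attach to the tooltip’s subtree rather than the current one. Below is a tiny standalone model of that mechanism—all types and names are invented for the sketch:

```c
#include <assert.h>

#define MAX_CHILDREN 8
#define MAX_STACK 8

typedef struct Box Box;
struct Box
{
    Box *children[MAX_CHILDREN];
    int child_count;
};

// Boxes attach to whatever parent is on top of this stack.
static Box *parent_stack[MAX_STACK];
static int parent_top = 0;

static void PushParent(Box *b) { parent_stack[parent_top++] = b; }
static void PopParent(void)    { parent_top -= 1; }

static void
BoxMake(Box *box)
{
    Box *parent = parent_stack[parent_top - 1];
    parent->children[parent->child_count++] = box;
}
```

With this, BeginTooltip is just PushParent(tooltip_root) and EndTooltip is PopParent(): the tooltip’s contents are written in-line next to the button that spawned them, but end up in a subtree that can be rendered last and hit-tested first.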
Conclusion
That’s all for now. I hope this article helped to clarify your understanding, clear up any unanswered questions I’ve left behind thus far, and further demonstrate the capabilities you’ll easily achieve by carefully making decisions about composability.
Thanks for reading, and hope you enjoyed. More will come soon.