R. Fleury Computing

UI, Part 8: State Mutation, Jank, and Hotkeys

Making the case for limiting the timeframe during which important state mutations can occur, making these mutations homogeneous, and how this idea can be used together with builder code.

Ryan Fleury
Aug 23, 2022

In this series, I’ve written a number of times that a software user interface is, fundamentally, the barrier between a user and some computational system. It is what allows information flow between a human using a computer and the human who programmed that computer. One direction of the information flow is what is often regarded as “inputs”—the user is expressing some intent to the system by using their peripheral devices. The other direction—normally called the “outputs”—is whatever is displayed on screen, or played through the speakers.

In this post, though, I’d like to look beyond the user interface barrier.

In one direction past this barrier, there is a system forming input. As it turns out, that’d be a human brain and body. While that is a complex, mysterious, and fascinating machine in itself, this is not a psychology, neurology, nor anatomy series! So, while I know you’d love to read my thoughts about those difficult and complicated subjects while being seriously underqualified to do so, I’ll be focusing on the system in the other direction.

That would be, of course, whatever computational system the human is interacting with using the user interface. That computational system is just, you know, a running program—it has some state lying around, and it has some code lying around that can perform transformations to that state. The user interface’s job is, in some way, to visually represent important aspects of that state on the screen, and provide interactive controls to cause transformations to that state accordingly.

This concept is related to the “model-view-controller pattern”, but I like to avoid using that language—like most things you’ll find in an object-oriented textbook, it has become overloaded and conflated with a number of nonsensical ideas. There is a kernel of truth in it, though, which corresponds to the fundamental reality of the information flow in any human-computer system.

The user interface can sometimes be a confusing place to write code, because it is responsible for transforming state that was also used to produce it. It’s inherently cyclical—it visually represents state that it’s also in charge of changing.

This intrinsic quality of the problem can be confusing or frustrating, and it often leads to a number of software bugs, particularly when state transformation occurs too early.

To explain what I mean, I’ll break down the problem from a functional perspective. A user interface can be expressed as a function UI that takes some State to a new State—UI: State → State’. When naively writing procedural code that mutates State in-place, it’s fairly easy to see that if you mutate whatever State you’re working with in the middle of your particular implementation of UI, then the rest of UI will not be using the original State. Thus, there arises the opportunity for disagreement within one single user interface frame. Some of the user interface will have one opinion about the “source State”, because it will have been derived from the State before the mutation, and other parts of the user interface will have another opinion, because they will have been derived after the mutation.

This possible discrepancy in “opinions” about State may arise in a number of places, and it manifests in a number of applications you’re probably familiar with. It will generally appear as ugly and perhaps confusing artifacts, if not severe bugs. These might look like text rendering outside of a box meant to contain it, multiple boxes in the user interface hierarchy moving together but one being a frame behind, graphical artifacts flickering for only a single frame, and so on.

A simple demonstrative example of this is a button in a user interface for deleting an object. If the actual deletion of that object occurs immediately in the builder codepath that is building the button for deletion, then any codepath that runs following the deletion that somehow still has a handle or pointer to that object will either break, or possibly crash the program.

If not addressed, all of this leads to the interface feeling (or literally being) unreliable and brittle—you may also prefer the more colloquial term “janky”. This not only influences the feeling that a user will have while using the interface, but it may also influence the psychology of the user such that their thought processes change in order to avoid what they see as more brittle situations in interacting with the user interface. This, of course, has more serious implications than just the user “feeling bad”—it may, in fact, degrade the user’s ability to effectively use your software. This, again, is not a psychology post, and I am not a psychologist, nor have I done any actual research to verify this conjecture, but I have noticed this effect in myself.

The psychological implications of “jank” on the user—which, again, I can’t verify for all people, but I find easy to identify personally—serve as a useful example of what I’d argue is a higher-order truth: it is not possible to cleanly separate user experience design from making software good and useful. Unfortunately, a nihilistic anti-human sentiment pervades programming culture and thus programming materials, which convinces humans to act like computers in all respects (and not merely as a thinking tool, to learn about how a computer works) and reject the subjective human experience. This is obviously nonsensical—computers are for people, and we write software for people. Deforming one’s experience to approximate only “objectivity” is overly-reductive, and indeed eliminates the entire reason for computation altogether. So, how people interact with our software is of deep importance. Making software usable for people is not a “hack”—it is an intrinsic part of the problem. If you prefer a “simpler” solution that disregards human behavior, then your solution is not actually simpler at all—it is merely ignoring certain crucial constraints.

The Information Flow’s Ground Truth

So, how might we avoid jank—or, at least this particular source of it?

The most straightforward way, conceptually, is to just build the “ideal” pure-functional transformation directly into the builder codepath. Given some type laying out the data format for your program’s state State, the state of the program can simply be “double-buffered”. A builder codepath can, then, take in a State state, and produce a State next_state. Once you’ve produced next_state, then the original state can be thrown away and replaced with next_state.

This is conceptually simple and matches the high-level description of the information flow, but it is likely to be prohibitively wasteful, both computationally and in the effort required to maintain a correct “deep-copy”, especially for the complex data structures that any reasonably complicated program necessarily involves. It will also likely introduce other complexities: for instance, any pointer to a structure in program state is invalidated on every frame, and it becomes much more difficult to have shared, stable locations where, for example, communication between threads can occur. So, this is not the way I choose to write most state transformation into my user interface code, and I recommend against a pure-functional data transform for user-interface-driven state mutation at the top level.

That being said, this strategy does become useful on a smaller scale. In a builder codepath, if you’d like to hold certain state constant (for instance, a boolean that encodes whether an expander is open), it is sometimes useful to explicitly reserve a “next state”, and treat the initial state for the frame as read-only. This ensures your builder codepath will only ever use the initial state for building rules, and it can cleanly produce the next frame’s state without any discrepancies.
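
As a minimal sketch of that smaller-scale idea, assuming a hypothetical `ExpanderState` struct and a `clicked` flag standing in for whatever input query your UI layer provides (none of these names come from a real codebase), the builder reads only `open` and writes only `next_open`:

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical per-widget state. `open` is treated as read-only while
// building; `next_open` is the only field the builder may write.
typedef struct ExpanderState {
    bool open;
    bool next_open;
} ExpanderState;

// Builds the expander for one frame and returns how many child rows were
// built. Every decision in here is derived from the frame-start `open`.
int build_expander(ExpanderState *s, bool clicked, int child_count) {
    assert(child_count >= 0);
    s->next_open = s->open;        // default: carry the value forward
    if (clicked) {
        s->next_open = !s->open;   // record the change, do not apply it yet
    }
    return s->open ? child_count : 0;
}

// Called once per frame, after all building has finished.
void end_frame_expander(ExpanderState *s) {
    s->open = s->next_open;
}
```

With this shape, a click on a closed expander still builds zero children on the frame of the click, and the children first appear on the following frame, so nothing built within one frame disagrees about whether the expander is open.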

While directly writing a pure-functional transform is only occasionally a suitable strategy in localized scenarios, it’s useful to keep the pure-functional “mathematical ground truth” in mind, because it can inform alternative strategies.

Delta-Based State Mutation

The alternative strategy I use, which I’ll call “delta-based state mutation”, comes from noticing that, in the direct pure-functional version, next_state and state are often nearly identical. State changes caused by a user interface are generally incremental, relatively infrequent (humans run much slower than computers), and small. So, instead of encoding a full deep-copy of the initial state and mutating it, the builder codepath can produce an encoding of a state delta, to later be applied to the program’s state. This delta can be applied on the following frame before any user interface building codepaths. This setup ensures that, for the period of time during which building occurs, the state largely remains identical while all of the information required to mutate the state later is gathered.

With this strategy, the problem is split in two—first, there is the sub-problem of producing the encoding for a state delta, and second, there is the sub-problem of “applying” the state delta, and directly mutating the state. There are a number of other subtle—but important—benefits hidden in this factorization of the problem.

First, each sub-problem’s corresponding codepaths take place at a different point in the frame (producing the state delta takes place during the builder codepath, and applying the state delta takes place before any builder codepath runs). This solves the problem I’ve already introduced: different sections of a builder codepath disagreeing on whatever the frame’s original state is.

Secondly, and perhaps more importantly, each sub-problem’s corresponding codepaths take place at different depths of your program’s callstack. That may sound weird, but allow me to explain.

Imagine you are running your program, which implements a user interface, in a debugger, and you set a breakpoint in a builder codepath function that is called at a relatively deep level in the callstack on every frame. The callstack you’ll see in your debugger might look something like this (starting from the lowest point in the callstack):

  • EntryPoint

  • UpdateAndRenderApplication

  • UpdateAndRenderWindow

  • BuildSettingsViewUI

  • UI_CheckBox (Imagine this is a checkbox controlling whether or not the user is using a dark theme or light theme.)

  • UI_Label

  • UI_BoxMake

  • UI_HashFromString

Note: In many cases, not all of these would literally be functions in a program—I am using the callstack as a model for nested conceptual layers.

At lower levels of your user interface thread’s callstack, you will find high-level application logic: your entry point, your top-level main loop, all of the work to organize work for the frame, and other code making top-level decisions.

Now, I’ll pose the question—where, in this callstack, does it make most sense to inform the system that the user has pressed the checkbox to toggle between a dark theme and light theme? I’d say that’d unambiguously happen in the BuildSettingsViewUI builder codepath. That is where the code would have all of the information necessary about what the user is doing, and thus what clicking that particular checkbox implies about the user’s intent. So, this is a natural place for some state delta to be encoded about what the user has done.

But now, consider that the user also has a hotkey bound which toggles between the dark theme and light theme. When the user hits that hotkey, they intend for the same state mutation to occur as that which would happen if they clicked the checkbox. Whenever that input event is seen, a state mutation must occur that matches that caused by the checkbox, or a state delta identical to that produced by the corresponding user interface codepath must also be produced. Where would that occur in the callstack?

It certainly couldn’t happen in BuildSettingsViewUI—what if the user wants to hit that hotkey even with the settings view closed? It should, of course, still work. So, it must be placed in either UpdateAndRenderWindow or UpdateAndRenderApplication. Importantly, it cannot be placed in the same place as the checkbox.

So, now that we know we’ll have two distinct codepaths for the hotkey consumption and for the user interface builder codepath, we have two choices: we must either mutate the state in an identical way in two places (one in the state delta application, one in the hotkey consumption), or we must also produce state deltas from hotkey consumption.

The correct choice, to me, is obvious, and it’s the latter—producing state delta data from hotkey consumption in the same way it is produced by the user interface builder codepath allows for a singular codepath that does all of the important state mutation, irrespective of who decided to cause that mutation. It keeps all state mutations in a single place in the frame, so in all locations other than that which applies state deltas, disagreements about state do not occur.
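
To make that choice concrete, here is a small sketch. It assumes a hypothetical fixed-size `DeltaQueue` and a single delta kind, `Delta_ToggleDarkTheme`; the function names echo the callstack above but are illustrative, not taken from a real codebase. Both the hotkey path and the checkbox path only push a delta, and `apply_deltas` is the one place where the theme flag actually changes:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum DeltaKind { Delta_ToggleDarkTheme } DeltaKind;

#define MAX_DELTAS 64

typedef struct DeltaQueue {
    DeltaKind items[MAX_DELTAS];
    int count;
} DeltaQueue;

typedef struct AppState {
    bool dark_theme;
    DeltaQueue deltas;
} AppState;

void push_delta(DeltaQueue *q, DeltaKind kind) {
    assert(q->count < MAX_DELTAS);
    q->items[q->count++] = kind;
}

// Low in the callstack (window or application level): hotkey consumption.
void consume_hotkeys(AppState *app, bool theme_hotkey_pressed) {
    if (theme_hotkey_pressed) {
        push_delta(&app->deltas, Delta_ToggleDarkTheme);
    }
}

// High in the callstack: the settings view builder and its checkbox.
void build_settings_view(AppState *app, bool checkbox_clicked) {
    if (checkbox_clicked) {
        push_delta(&app->deltas, Delta_ToggleDarkTheme);
    }
}

// The single codepath that mutates state, run before any building.
void apply_deltas(AppState *app) {
    for (int i = 0; i < app->deltas.count; i++) {
        switch (app->deltas.items[i]) {
            case Delta_ToggleDarkTheme:
                app->dark_theme = !app->dark_theme;
                break;
        }
    }
    app->deltas.count = 0;
}
```

Neither producer needs to know how the theme is stored or what else must happen when it changes; that knowledge lives only in `apply_deltas`.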

Finally, it naturally follows that the application of state delta must occur at a lower level in the callstack than all productions of state delta. The application of state delta would occur in places like UpdateAndRenderApplication, UpdateAndRenderWindow, or perhaps both. The fact that state delta production and state delta application are decoupled with delta-based state mutation fits this aspect of the problem perfectly—it allows state delta application to occur at a very low level in the callstack, and it allows state delta production to occur anywhere higher in the callstack.

This also opens the door for codepaths other than hotkey consumption and builder codepaths to produce state deltas. For instance, many games, engines, or tools offer “developer consoles”—it may be quite useful to be able to encode these state deltas textually so that you may manually type them into such a console. That also allows the ability to textually log these state deltas, copy and paste them, and perhaps “replay” a recorded log of state deltas to recreate a program’s state easily. Furthermore, requiring that state deltas can be expressed textually forces the ability to also refer to some concepts in your program textually. This may seem like extra work at first, but in my experience, the requirement to textually identify or refer to entities in your program inevitably arises, be it for debugging, logging, serialization, and so on.

Note that this does not necessarily need to be the mechanism for literally all state mutations. You may use your best judgment to simply mutate certain state in-place during a builder codepath, for example—it is not necessarily a problem if that state is truly only controlled-by and related-to that builder codepath. In such a case, the approach I mentioned earlier may come in handy—explicitly reserving a “next state”, and treating the initial state as read-only. Delta-based state mutation is simply a tool to use for a bulk of “important state mutations”, where “important” can be up to your design intuition and judgment.

An Example Frame

With delta-based state mutation, the top-level of a program’s frame loop may, then, look like the following:

  • Get events from the operating system: * → List(Event)

  • Take hotkey events, and add new state deltas to those produced last frame: (List(Event), StateDelta) → (List(Event)', StateDelta')

  • Apply state deltas: (State, StateDelta') → (State')

  • Build user interface: (State', List(Event)') → (UITree, StateDelta'', List(Event)'')

  • Layout user interface: UITree → UITree'

  • Render user interface: UITree' → Render
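
The ordering of the first four steps above can be sketched roughly as follows. Everything here is a stand-in chosen for illustration: the key codes, the fixed-size `Events` and `Deltas` arrays, and a `build_ui` that merely reports which theme it built with. Layout and rendering are omitted, and the events are assumed to have already been gathered from the operating system by the caller.

```c
#include <assert.h>

enum { KEY_TOGGLE_THEME = 1, KEY_OTHER = 2 };   // hypothetical key codes
enum { DELTA_TOGGLE_THEME = 1 };                // hypothetical delta kind
enum { MAX_EVENTS = 16, MAX_DELTAS = 16 };

typedef struct Events { int keys[MAX_EVENTS]; int count; } Events;
typedef struct Deltas { int kinds[MAX_DELTAS]; int count; } Deltas;
typedef struct State  { int dark_theme; } State;

// Step 2: consume hotkey events, appending deltas to those left over from
// the previous frame. Unconsumed events are compacted and kept.
void take_hotkeys(Events *ev, Deltas *d) {
    int kept = 0;
    for (int i = 0; i < ev->count; i++) {
        if (ev->keys[i] == KEY_TOGGLE_THEME) {
            assert(d->count < MAX_DELTAS);
            d->kinds[d->count++] = DELTA_TOGGLE_THEME;
        } else {
            ev->keys[kept++] = ev->keys[i];
        }
    }
    ev->count = kept;
}

// Step 3: the only place where State is mutated.
void apply_deltas(State *s, Deltas *d) {
    for (int i = 0; i < d->count; i++) {
        if (d->kinds[i] == DELTA_TOGGLE_THEME) {
            s->dark_theme = !s->dark_theme;
        }
    }
    d->count = 0;
}

// Step 4 stand-in: a real builder would read State, consume events, and
// push new deltas for the next frame. This one only reports the theme.
int build_ui(const State *s, Events *ev, Deltas *d) {
    (void)ev;
    (void)d;
    return s->dark_theme;
}

// One frame: hotkeys, then delta application, then building.
int run_frame(State *s, Events *ev, Deltas *d) {
    take_hotkeys(ev, d);
    apply_deltas(s, d);
    return build_ui(s, ev, d);
}
```

Because `build_ui` receives a `const State *`, the compiler helps enforce that the builder cannot mutate state directly in this sketch.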

A Concrete Example

Now that I’ve explained the reasoning and ideas behind “delta-based state mutation”, I’ll walk through some more concrete details of the style of implementation I normally go for.

What I’ve been calling “state delta” can be called by another name: “command buffer”, and that hopefully sheds some light on how you’d encode state deltas—it can merely be an ordered list of “commands”. I’ve had success setting up the encoding for state deltas this way.

I prefer to encode all of these commands textually. I’ve mentioned some of the reasons why this is advantageous already—I can easily log, visualize, type, record, or replay state deltas. Even in the presence of a complex user interface, I can always fall back to using a developer terminal. Encoding these commands textually means that it’s very easy to throw in a simple user interface right away that is capable of queueing up these commands. So, even if you lack the more sophisticated user interfaces that allow graceful and well-designed mechanisms to queue up commands behind the scenes, you always have a slow path available from day one.

Encoding these textual commands is trivial. We can start with a “value type” for the commands:

struct Cmd
{
  // any non-textual "parameters" to the command can go here...
  String8 string;
};

We can then wrap that in a linked-list node type to chain multiple Cmds together:

struct CmdNode
{
  CmdNode *next;
  Cmd cmd;
};

Finally, in our program state, we can store a linked list of these commands:

struct State
{
  // ...
  Arena *cmd_arena;
  CmdNode *first_cmd_node;
  CmdNode *last_cmd_node;
  U64 cmd_count;
  // ...
};

Note: An Arena is just a growing linear allocator. Arenas deserve their own post that’s not coupled to discussion about user interface programming, so I’ll leave it at that for now, but if you’d like to learn more, I’d recommend searching around for “bump”, “stack”, or “linear” allocators.

We can then push a command to a State like so:

void PushCmd(State *state, String8 string)
{
  CmdNode *n = PushArrayZero(state->cmd_arena, CmdNode, 1);
  n->cmd.string = PushStr8Copy(state->cmd_arena, string);
  
  if(state->last_cmd_node == 0)
  {
    state->first_cmd_node = state->last_cmd_node = n;
  }
  else
  {
    state->last_cmd_node->next = n;
    state->last_cmd_node = n;
  }
  state->cmd_count += 1;
}

And, finally, we can process each command—before the user interface builder codepaths—like so:

// given State *state

// perform all commands
for(CmdNode *cmd_node = state->first_cmd_node;
    cmd_node != 0;
    cmd_node = cmd_node->next)
{
  Cmd *cmd = &cmd_node->cmd;
  
  // parse string. I parse the first part to map to a
  // certain "command kind", and then use the rest of
  // the string as parameters.
  String8 cmd_name = ...;
  String8 cmd_args = ...;
  CmdKind cmd_kind = ...;

  switch(cmd_kind)
  {
    default:
    {
      // unrecognized command!
    }break;

    case CmdKind_A:{ ... }break;
    case CmdKind_B:{ ... }break;
    // etc.
  }
}

// clear command state
state->first_cmd_node = state->last_cmd_node = 0;
state->cmd_count = 0;
ArenaClear(state->cmd_arena);

That is the basic idea. I’ve left a few parts out that naturally follow—it is not very complicated.
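
For the parsing step, one possible shape is sketched below. It is an assumption-laden illustration rather than the implementation elided above: it uses plain NUL-terminated C strings instead of String8, and the command names and the `CmdKind_Null` / `CmdKind_COUNT` members are made up for the example. The first space-delimited word selects the command kind, and the rest of the string is handed back as the arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum CmdKind {
    CmdKind_Null,          // unrecognized command
    CmdKind_ToggleTheme,
    CmdKind_GoToLine,
    CmdKind_COUNT
} CmdKind;

// Hypothetical textual names, indexed by CmdKind.
static const char *cmd_kind_names[CmdKind_COUNT] = {
    "", "toggle_theme", "goto_line",
};

// Splits "name args..." at the first space. Returns the command kind and
// points *args_out at the argument text (an empty string if none).
CmdKind parse_cmd(const char *text, const char **args_out) {
    assert(text != NULL && args_out != NULL);
    size_t name_len = strcspn(text, " ");
    const char *args = text + name_len;
    while (*args == ' ') {
        args++;
    }
    *args_out = args;
    for (int k = CmdKind_Null + 1; k < CmdKind_COUNT; k++) {
        if (strlen(cmd_kind_names[k]) == name_len &&
            strncmp(text, cmd_kind_names[k], name_len) == 0) {
            return (CmdKind)k;
        }
    }
    return CmdKind_Null;
}
```

A linear scan over the name table is enough here, since only a handful of commands are expected per frame.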

Intricacies

While the basic idea is simple, there are a few moments I’ve had while using this approach where some questions and concerns did not have trivial answers. So, I’d like to cover some of those here, so that if you have any similar concerns, I can hopefully address them.

Parsing Work

Encoding commands textually may seem computationally wasteful, especially when encoded parameters require nontrivial decoding—for example, baking floats into the textual contents of the string, only to parse them back out later and use them.

It certainly seems silly at first, but recall the way I introduced “delta-based state mutations”—they are useful because user-interface-driven changes to state are normally small and incremental. So, you’ll likely only be running a handful of commands per-frame, if any at all. While it is, in a sense, computationally wasteful, the overall effect will be negligible, and the wins—in my experience—are quite worth it.

It is nevertheless possible to imagine piping a nontrivial number of commands through this system on any given frame, in which case I’d recommend reconsidering the granularity of your command definition. Instead of sending a thousand commands that each do a single thing, reformulate the command to do a thousand things once.

Non-Textual Parameters

Sometimes, it is not worth it to develop a textual encoding scheme for some concepts in your program. Additionally, it can sometimes be prohibitively expensive to encode all of those concepts textually. For cases like this, it’s useful to have a mechanism by which you can bundle non-textual parameters with a command.

The simple way to solve this, and the one I’ve stuck with, is to include an extra “registers” struct with every command. The value of that struct for any given command is just whatever value was written into the global registers struct last; it is very much like CPU registers. That struct’s type just needs to be expressive enough to pack the data you need to send along with the command in builder code. Conveniently, command processing only happens in the layer that is also responsible for builder code, so if you’re writing builder code and need a new way to pack extra non-textual metadata with a command, you can simply fatten the struct in whatever way seems fit. Then, the codepath that applies those commands can simply read out of the registers that were pushed along with the command.
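
A rough sketch of the registers idea follows. The fields of `CmdRegs`, the fixed-size `CmdBuffer`, and the `close_panel` command used in the usage note are all hypothetical and chosen only for illustration; a real version would use the arena-backed list shown earlier.

```c
#include <assert.h>
#include <string.h>

// Hypothetical non-textual parameters. Fatten this struct as needed.
typedef struct CmdRegs {
    int window_id;
    int panel_id;
    void *entity;
} CmdRegs;

typedef struct Cmd {
    CmdRegs regs;      // snapshot taken when the command was pushed
    char text[64];
} Cmd;

enum { MAX_CMDS = 32 };

typedef struct CmdBuffer {
    CmdRegs regs;      // "current" registers, written by builder code
    Cmd cmds[MAX_CMDS];
    int count;
} CmdBuffer;

// Pushes a textual command along with a copy of the current registers.
void push_cmd(CmdBuffer *buf, const char *text) {
    assert(buf->count < MAX_CMDS);
    assert(strlen(text) < sizeof buf->cmds[0].text);
    Cmd *c = &buf->cmds[buf->count++];
    c->regs = buf->regs;
    strcpy(c->text, text);
}
```

Builder code would set, say, `buf.regs.panel_id` before calling `push_cmd(&buf, "close_panel")`, and the code applying the command later reads `cmd->regs.panel_id` rather than parsing an identifier out of the text.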

Incomplete Information

The most subtle problem I encountered with this approach arose when writing implementations for commands that require some information from later in the frame.

For example, consider a text editor—it may have a command like “Go To Line”.

What “Go To Line” actually does depends on a number of factors. It needs to place the cursor on the specified line number, of course, but it also must set the scroll position of the editor, so that the user can actually see whatever line the cursor moved to. If the cursor is to be vertically-centered on the screen after it has been moved, the scroll position depends on what font is being used, and how big the panel is in which the code editing user interface is built.

Because command application runs before any builder codepaths, this information is not well-known at the time of the command’s execution. So, “Go To Line” cannot be easily implemented in-place.

While this may initially seem like a show-stopper, it is actually not much of a problem. The main loop can, instead, have the effect of always going to a line on every iteration (for every code editor on the screen). In most cases, this line number will just be 0 (an invalid value, so nothing will happen). The “Go To Line” command can, then—instead of performing 100% of the work related to jumping to a certain line number—simply “program the main loop iteration” by setting a per-frame state which controls the line-to-be-jumped-to.
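
Here is a sketch of how that might look, under some assumptions of mine: a hypothetical `EditorView` struct, 1-based line numbers with 0 meaning "no pending jump", and a builder that is told the line height and panel height once they are known. The centering math is one reasonable choice, not the only one.

```c
#include <assert.h>

typedef struct EditorView {
    int cursor_line;   // 1-based
    int goto_line;     // per-frame request; 0 means "nothing to do"
    float scroll_y;    // pixels scrolled from the top of the buffer
} EditorView;

// Command application, early in the frame. Font and panel size are not
// known here, so this only records the request.
void cmd_goto_line(EditorView *v, int line) {
    assert(line >= 1);
    v->goto_line = line;
}

// Later in the frame, once line height and panel height are known. This
// runs every frame; with goto_line == 0 the jump is skipped.
void build_editor_view(EditorView *v, float line_height, float panel_height) {
    if (v->goto_line != 0) {
        v->cursor_line = v->goto_line;
        float line_top = (float)(v->cursor_line - 1) * line_height;
        float centered = line_top - (panel_height - line_height) * 0.5f;
        v->scroll_y = centered > 0.0f ? centered : 0.0f;  // clamp at top
        v->goto_line = 0;                                 // request served
    }
    // ... build the visible lines using v->scroll_y ...
}
```

For example, jumping to line 100 with 20-pixel lines in a 400-pixel panel puts the top of the line at 1980 pixels and the scroll position at 1790, which centers the line vertically.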

The full generalization of this solution is just another command buffer, which is responsible for state mutations that occur, for example, after a user interface build. In my experience, this fully generalized solution adds a fair amount of complexity and offers little benefit, but I figured I’d mention it in case it helps with your mental model, or with a problem I’ve not experienced.

Closing Thoughts

When I first wrote user interfaces, the problems I’ve described in this post seemed weird, frustrating, subtle, and daunting. Dealing with hotkeys, state disagreements, developer terminals, all while being concerned about code deduplication felt like a pretty nasty problem. But, I’m happy to report that what I’ve described in this post has more-or-less entirely resolved the problem for me.

I cannot overstate how useful it was to put all of these seemingly-at-odds constraints on the table, break the problem down into its pure-functional mathematical description, and then project the “information flow” down onto data transforms. It is a useful strategy, and I hope this post demonstrates why it’s useful and how you might go about using it.

That’ll wrap this post up. I hope this was helpful—good luck programming!


If you enjoyed this post, please consider subscribing. Thanks for reading.

-Ryan
