Juan Patten

Let’s Build a Reactive UI Framework – Part 1: Data Flow

Reactive dataflow libraries like Solid and MobX are a joy1 to use. They can feel like magic in the same way a spreadsheet feels like magic. Change one value and watch the effects ripple through the UI.

The cool thing is that the concepts behind them are actually pretty simple! In this series of articles, I want to see if we can distill the reactive dataflow paradigm down to its essence, and implement it in a straightforward, readable way while still being correct (“glitch-free”2 as they say) and performant enough to build real apps.

We’ll start by building a simple and correct library, and then we’ll tackle some low-hanging optimization opportunities.

A Preview

Here’s a demo of what we’ll be able to make at the end of the article. Everything is vanilla javascript, and the only library used is the one we’re about to build. You can view it full-screen if you’d rather.

Usage Notes

  • Cell formulas are just Javascript expressions.
  • Inside a cell's formula, you can refer to other cells by name, ie a4.
  • To finish editing a cell, press Enter or Tab or click out of it.
  • Math functions are exposed directly. Do abs(a4), not Math.abs(a4).
  • If a cell's formula results in an error, the cell will show a ‼️ which you can hover over to see the error.
  • The ☰ icon opens a sidebar where you can resize the grid.

The Core Idea

There are basically two types of data3 in any system:

The distinction between the two can be fuzzy4 in theory, but in practice it’s a useful distinction to guide how we design our data and shared state.

If that makes intuitive sense, feel free to skip the following examples.

Example 1: The Canonical Rectangle Example

Pretend we are making an app which displays some facts about a rectangle. Namely, the width, height, area, and diagonal of the rectangle.

The rectangle's width and height are input data, because the only way we can know these facts is to be told them. On the other hand, the area and diagonal are derived data, because they can be calculated from the width and height.

Example 2: Filling Out a Form

Think of filling out a typical web form. There are some form fields, which may be valid or invalid. If they are valid, Submit button is enabled. If they are invalid, the Submit button is disabled, and an error message is displayed.

The form fields are the input data. Whether they are valid or invalid is derived data. Additionally, the state of the Submit button and whether the error message is displayed are also derived data.

Example 3: A Spreadsheet

The most direct & visceral example is a spreadsheet. Some cells are just numbers (input data), and some are equations which reference other cells (derived data). When one of the input cells is updated, all the cells that are derived from that cell update automatically.

Reactive framework people want all UIs to work like spreadsheets.

Derived Data, Flowing

The goal of a reactive dataflow framework is that the only mutable state in our app is input data. Everything else is derived from that, and updates automatically when the inputs change.

The idea behind libraries like React is that the entire UI is derived data.

Glossary

I’ll use the following terms throughout:


Library API

Let’s sketch out how we might turn these ideas into a library. First we’ll think about the API we want, and work backwards from there to the implementation.

Atom

An Atom starts with an initial value:

let x = Atom(5)  // x's value is 5

Setting an Atom’s value will cause all its dependents to recalculate:

x.set(10)  // x's value is 10

To get the current value of an Atom, call it like a function:

x()  // => 20

If a Calc uses the value of an Atom, it observes that Atom. This is usually what we want, but if we want to look without observing, we can peek:

x.peek()  // => 20

Calc

A Calc is just a function. Ideally a pure function:

let xSquared = Calc(() => x() ** 2)

Notice that this Calc observes the value of the Atom we created earlier.

To get the current value of a Calc, call it like a function:

xSquared()  // => 400

Calcs can also observe other Calcs:

let xSquaredPlus2 = (() => xSquared() + 2)

We can peek at a Calc without observing it:

xSquaredPlus2.peek()  // => 402

Effect

An Effect is like a Calc, but instead of calculating a new value when a fact changes, an effect does something when a fact changes: logs to the console, makes a network request, updates the UI, etc…

An Effect is just a function that will be re-run whenever any of its observed facts change:

let logger = Effect(() => console.log(`x == ${x()}`))

Is a Calc just a special case of an Effect? Kind of.5


Usage Example

To get a sense for how the library feels “on the page”:

Example Code
let fullName = Atom('James Bond')
let intro = Atom("The name's")
let punct = Atom('.')

let first = Calc(() => fullName().split(' ')[0])
let last = Calc(() => fullName().split(' ')[1])

let sentence = Calc(() =>
  `${intro()} ${last()}${punct()} ${first()} ${last()}${punct()}`)


let logger = Effect(() => {
  console.log(sentence())
})


fullName.set('Mary Oliver')

intro.set(intro.peek() + ' still')

setTimeout(() => {
  punct.set('?')
  intro.set('Wait… is my name')
}, 1000)
Example Console Output
The name's Bond. James Bond.
The name's Oliver. Mary Oliver.
The name's still Oliver. Mary Oliver.
The name's still Oliver? Mary Oliver?
Wait… is my name Oliver? Mary Oliver?

Implementation Concepts

Now that we’ve got an API in mind, let’s start working on an implementation. There are a few things to think about.

Automatic Dependencies

Part of the “magic” feeling of a reactive framework is that it we never have to explicitly specify dependencies. The system figures them out for us. This is not only convenient, but important.6

And it’s surprisingly simple:

That’s all there is to it.7

Reacting to Changes

Every time an Atom is updated, we need to figure out two things:

  1. Every Calc8 that could possibly change based on the Atom’s new value. We’ll call these stale.
  2. The order in which to recalculate affected Calcs. A Calc can recalculate if and only if all of its dependencies are fresh (ie. not stale).

The Algorithm9

Imagine a dependency graph that looks like this:

A1 A1 A2 A2 A3 A3 C3 C3 C4 C4 C5 C5 C6 C6 C7 C7 C8 C8 C1 C1 C2 C2

Squares are Atoms. Circles are Calcs. Arrows mean “depends-on”.

  1. When an Atom is told to change its value, it will first notify all its dependents that it is stale. They will in turn notify all their dependents, and so forth. When this is done, every fact that could possibly be affected by the change has been marked stale.
A1 A1 A2 A2 A3 A3 C3 C3 C4 C4 C5 C5 C6 C6 C7 C7 C8 C8 C1 C1 C2 C2
  1. The Atom stores its new value.

  2. The Atom notifies all its dependents that it is fresh. When a Calc is notified that one of its dependencies has become fresh, it recalculates if and only if all of its dependencies are now fresh.

A1 A1 A2 A2 A3 A3 C3 C3 C4 C4 C5 C5 C6 C6 C7 C7 C8 C8 C1 C1 C2 C2

At this point, C1 can recalculate, because all of its dependencies (just A1) are fresh. However, C3 cannot recalculate yet because one of its dependencies (C1) is still stale.

  1. Once a Calc recalculates, it is now fresh, and it notifies its dependents.
A1 A1 A2 A2 A3 A3 C3 C3 C4 C4 C5 C5 C6 C6 C7 C7 C8 C8 C1 C1 C2 C2

C3 may now recalculate – as can C4 and C6 – because all of their dependencies are fresh.

  1. This process continues until all facts are fresh.
A1 A1 A2 A2 A3 A3 C3 C3 C4 C4 C5 C5 C6 C6 C7 C7 C8 C8 C1 C1 C2 C2

And that’s it. Now all the Calcs that were affected by the change have recalculated, and we’re awaiting another change.

A1 A1 A2 A2 A3 A3 C3 C3 C4 C4 C5 C5 C6 C6 C7 C7 C8 C8 C1 C1 C2 C2

Dynamic Dependencies

It’s worth noting that when a Calc recalculates, it might end up with a new set of dependencies. Perhaps when C7 recalculated, it no longer read the value of C4, but instead read the value of C5. We would end up with a graph like this:

A1 A1 A2 A2 A3 A3 C3 C3 C4 C4 C5 C5 C6 C6 C7 C7 C8 C8 C1 C1 C2 C2

If so, the next time A1 changes, C7 will not be recalculated because it no longer transitively depends on the value of A1.

Errors

In the case where a Calc’s function throws an error, we want to treat that Error as if it were the result of the function. That is, after badCalc throws an error, badCalc.peek() should return that error.


And Now, Some Code

Now we have a sense of how things should work, and we can start making the ideas concrete in code.

Graph Management

Notice that Atom and Calc share a lot of functionality around maintaining the dependency graph and propagating information through it. Let’s factor that functionality out into a base. We could call it “GraphNode”, but for the sake of style let’s call it Reactor.

In addition to its graph management duties, a Reactor has a latest value, and – optionally – an effect which will run when its dependencies become fresh.

A Note on Nomenclature

I use the terms “input” and “output” here instead of “dependency” and “dependent”, respectively. I find the latter tend to blur together on the page, while the former make things a bit more obvious and intuitive.

class Reactor {
  static active = null

  latest = undefined
  effect = undefined

  _staleInputs = 0
  _inputs = new Set()
  _outputs = new Set()

  // Informs the currently-running reactor that this
  // is one of its inputs. Returns the latest value.
  observe() {
    let running = Reactor.running
    if (running) {
      this._outputs.add(running)
      running._inputs.add(this)
    }
    return this.latest
  }

  // Records a stale input. If not already stale,
  // becomes stale and notifies its outputs.
  stale() {
    if (++this._staleInputs == 1) {
      for (const o of this._outputs) { o.stale() }
    }
  }

  // Records a fresh input. If all are fresh,
  // runs effect (if any) and notifies outputs.
  fresh() {
    if (--this._staleInputs == 0) {
      this.effect?.()
      for (const o of this._outputs) { o.fresh() }
    }
  }

  // Executes `fn()` and records its inputs.
  // Returns the result, or any thrown error.
  track(fn) {
    const oldInputs = this._inputs
    this._inputs = new Set()
    const oldActive = Reactor.active
    Reactor.active = this
    try {
      return fn()
    } catch(err) {
      return err
    } finally {
      Reactor.active = oldActive
      for (const i of oldInputs.difference(this._inputs)) {
        i._outputs.delete(this)
      }
    }
  }
}

NOTE

As of 2024-03-20, we’ll need to include a shim10 for Set.prototype.difference for Firefox.

The API

Implementing our API on top of Reactor is straightforward.

Atom

An Atom is a Reactor that doesn’t have an effect. Notice that its set method is – precisely – steps 1-3 from the algorithm.

export function Atom(value) {
  const r = new Reactor()
  r.latest = value
  const atom = () => r.observe()
  atom.set = (newValue) => {
    r.stale()
    r.latest = newValue
    r.fresh()
  }
  atom.peek = () => r.latest
  return atom
}

Calc

A Calc is a Reactor whose effect is to set its own latest value to the result of its calculation.

export function Calc(fn) {
  const r = new Reactor()
  r.effect = () => r.latest = r.run(fn)
  r.effect() // calculate the initial value
  const calc = () => r.observe()
  calc.peek = () => r.latest
  return calc
}

Effect

An Effect is a Reactor that exists only for its effect. Instead of repeating code from Calc, we’ll return a Calc whose value is always undefined.

export function Effect(action) {
  return Calc(() => { action() })
}

One Two More Thing s

The library is fully functional, but there are a couple of gaps which I’ve omitted up til now for simplicity’s sake.

Cancellation / Disposal

How would we stop an Effect from running? Right now there’s no good way. Even if we try to have it garbage-collected11, it will keep effect-ing. Since dependencies keep a reference to their dependents, an Effect will stay it alive even if we try to destroy it. The same is true for Atom and Calc as well.

We need a way to sever the edge between nodes in the graph. Let’s add the following method to the Reactor class definition:

class Reactor {
  // ...

  dispose() {
    for (const i of this.#inputs) { i.#outputs.delete(this) }
    for (const o of this.#outputs) { o.#inputs.delete(this) }
    this.#inputs.clear()
    this.#outputs.clear()
  }
}

and expose it on each of our API primitives:

export function Atom(value) {
  // ...
  atom.dispose = r.dispose.bind(r)
  return atom
}

export function Calc(fn) {
  // ...
  calc.dispose = r.dispose.bind(r)
  return calc
}

// Effect returns a Calc, so we
// don't need to change anything.

Now users of the library can – for example – cancel a logger, etc…

let logger = Effect(() => console.log(myAtom()))
setTimeout(() => { logger.dispose() }, 10_000)

Cycles

A Calc could end up depdending on itself, forming a cycle in the dependency graph. For example:

let cycle = [Atom('')]

// On first execution, cycle[1] will be undefined.
cycle[1] = Calc(() => cycle[0]() + cycle[1]?.())

// But this will cause cycle[1] to run again
// and this time it will refer to itself.
cycle[0].set('!')

Detecting Cycles

In the above example, the Calc directly depends on itself. This case is easy to detect: the Calc will be active when it tries to observe itself.

However a Calc directly depending on itself is a bit contrived, and unlikely in real-world cases. It’s much more likely that a cycle would arise when a Calc indirectly depends on itself, eg:

A1A1C1C1C2C2C3C3

In this case, at the point when the cycle is established, one of the Calcs will try to observe a Calc which is stale.

Handling Cycles

How exactly we should handle a cycle is an open question, but there are a few things to consider:

We know we can detect a cycle at the time a Calc is observed, so the approach we’ll take is to throw an error in the observe() method, after the dependencies are set up:

observe() {
  let active = Reactor.active
  if (active) {
    this._outputs.add(active)
    active._inputs.add(this)
    // ↓↓ NEW CODE ↓↓
    if ((Reactor.active == this) || (this._staleInputs > 0)) {
      throw Error('Cycle detected')
    }
    // ↑↑ NEW CODE ↑↑
  }
  return this.latest
}

There are two other things to note:

  1. C1’s stale notification will end up back at itself: C1 → C3 → C2 → C1
  2. C1’s fresh notification will end up back at itself: C1 → C3 → C2 → C1

Both of these cases will cause bugs if not handled, so let’s handle them:

class Reactor {
  // ...
  _becomingStale = false  // ← so we can detect a stale() cycle
  // ...
  stale() {
    if (this._becomingStale) { return } // ← avoid modifying state
    if (++this._staleInputs == 1) {
      this._becomingStale = true  // ← set detector
      for (const o of this._outputs) { o.stale() }
      this._becomingStale = false // ← unset detector
    }
  }
  // ...
  fresh(changed = true) {
    if (this._staleInputs == 0) { return } // ← avoid modifying state
    // ...
  }
  // ...
}

Now when a cycle happens, the latest value of all Reactors in the cycle will be Error('Cycle detected'), and if we later break the cycle (by ie. changing a spreadsheet cell’s formula), all the Reactors which previously made up the cycle will correctly recalculate.12


Optimizations

There are many optimizations we could make to improve speed and memory usage, but one in particular calls out to be addressed.

Minimizing Recalculations

If a Calc recalculates but produces the same value as before, all of its dependents will still recalculate, even though they don’t need to. It would be good to avoid them if possible, especially because we anticipate that the effects of some dependents (like UI rendering) will be expensive.

To do this, we’ll need to pass some additional information along with the fresh notification, and add a bit of additional logic:

class Reactor {
  // ...

  // Add an instance variable
  _inputChanged = false

  // ...

  // Add a parameter to fresh(), and a bit of additional logic
  // to avoid running the effect if no inputs have changed.
  fresh(changed = true) {
    if (changed) { this._inputChanged = true }
    if (--this._staleInputs == 0) {
      if (this._inputChanged && (this.effect != null)) {
        this._inputChanged = false // reset for next time
        let oldValue = this.latest
        this.effect()
        changed = (this.latest !== oldValue)
      }
      for (const o of this._outputs) { o.fresh(changed) }
    }
  }
}

And let’s add a check at the beginning of Atom’s set method:

export function Atom(value) {
  // ...
  atom.set = (newValue) => {
    if (newValue === r.latest) { return }
    // ...
  }
  // ...
}

Now a Reactor will only run its effect if at least one of its inputs changed value13.


Dev Tooling

When debugging the dataflow graph, we’ll frequently find ourself wondering “which Reactor am I looking at?”, so let’s add a way to label them so we can tell them apart in the debugger.

In lieu of a more comprehensive custom formatter, it will suffice to just be able to see the label from the default devtools object summary without having to click the object to expand it.

Chrome, Safari, and Firefox summarize an instance of a class by listing the instance variables in order of definition, so let’s add an instance variable at the top of the Reactor definition:

class Reactor {
  static active = null  // ← not an instance variable

  _label = undefined  // ← add this before other variables

  // ...
}

Now the label will show in the object summary, like this:

And sometimes when doing more complex debugging, we’ll need access to the Reactor that powers a given Atom or Calc.

Let’s modify our API primitives like so:

export func Atom(value) {
  // ...
  atom.label = (lbl) => ((r._label = lbl), atom)
  atom.$r = r
  return atom
}

export func Calc(fn) {
  // ...
  calc.label = (lbl) => ((r._label = lbl), calc)
  calc.$r = r
  return calc
}

Now we can label a primitive at the time of creation:

const foo = Atom(123).label('foo')

and access its Reactor by doing foo.$r


That’s It

That’s it for the first article in the seri— wait! —we forgot the most crucial thing! If we’re going to make a library, it has to have a clever name 😉

Let’s see…

urx.js — not bad.

And it’s … type type type … available on GitHub, so – ✔ that’ll do.

https://github.com/jrpat/urx.js

That’s Really It

Phew, that was a lot to read, but the full implementation is actually pretty simple. In about 100 lines of easy-to-read code, we have an ergonomic, glitch-free, performant reactive dataflow library that robustly handles cycles, and has affordances for debugging! And you know how it all works, from soup to nuts.

We can already use it to build real apps like the spreadsheet demo from the beginning of the article.

But as you build with it, you’ll quickly realize that it still leaves a lot to be desired: it’s not easy to combine Atoms into nested structures, for example. It’s a bit clunky to work with Atoms whose values are lists, for another.

And we have all this derived state but no way to declaratively code our UI, so we end up doing a lot of manual DOM manipulation (check out the code for the spreadsheet demo for example).

We’ll address all those and more in due time.


Next Up

In the next few posts in this series, we’ll use urx.js as a foundation on which to build:


Recommendations

If you liked this post, you might enjoy reading:


Thanks for reading! See you next time.

Juan



  1. Making UI development fun is important. If we want to build enjoyable things, we should enjoy working on them.

    That’s been a driving force behind many of the frameworks I’ve developed over the years. We even made it a central, explicit goal in the one I developed for my old company (closed-source, but I dug up the original README recently)  

  2. A “glitch” is reactive jargon for an inconsistency in derived state. More Info 

  3. The terminology comes from the classic Out of the Tar Pit (Section 7.1.1), which further granularizes these types. 

  4. Is the mouse position input data? Or is it derived from a starting position and a list of every movement it has made since the program started? In theory, the latter. In practice, the former. Caching is another place where the line blurs. Out of the Tar Pit has a pretty good treatment of this question in section 7.1.1 

  5. We could say that a Calc is just an Effect whose action is to recompute & store its value. Or is it the other way around? Maybe an Effect is just a Calc whose value is always undefined.

    Both are valid ways of looking at it, and in our implementation, in fact both are true. An Effect is just a Calc. And a Calc is just a “reactor” whose effect is to recompute and store its value. 

  6. If we have to manually specify dependencies, it’s easy to make mistakes. For any given Calc, we might “oversubscribe” (continue observing a fact that is no longer needed) or “undersubscribe” (forget to observe a fact, and thus don’t react when it changes). 

  7. At least, that’s all there is to it in single-threaded code. This becomes quite a bit more tricky if we’re in a multithreaded environment. 

  8. Since Atoms don’t depend on anything, no other Atom could possibly change. 

  9. The original idea for this algorithm comes from this post about MobX. I’m sorry in advance for the color scheme 🤦 

  10. It’s trivial

  11. By – for example – creating it as a local variable in a function and then letting it go out of scope. 

  12. I know that’s a bit of a hand-wavy explanation, but this post is long enough already. Cycle detection in a reactive dataflow graph is a complex topic, which perhaps I’ll revisit with a more thorough treatment at some point in the future. 

  13. We use === equality, but it might make sense to use same-value-zero equality like many Javascript builtins do. Or perhaps we should make it configurable per Reactor