In short: As a case study, I reduce the code complexity of the one-line text editor with categorical combinators.
Suppose that we begin to use a new, user-friendly application, and we get familiar with it under a few days. Suppose moreover, that someone explains us the application logic under a few weeks. More or less this means that the ‘what’ and ‘how’ of the application has a few days and a few weeks of perceived complexity, respectively. We still don’t know the perceived complexity of ‘how to do efficiently’ because we have no clue how the application works on the actual hardware:
perceived complexity of the application
┌─┬─────┬──────────────────────── ─── ─── ───
│d│weeks│ ?
│a│ │
│y│ │
│s│ │
│ │ │
│W│ HOW │ how to do efficiently
│H│ │
│A│ │
│T│ │
└─┴─────┴──────────────────────── ─── ─── ───
In my experience more or less the code complexity of the same application looks like the following:
code complexity of the application
┌─── ─── ─── ─── ─── ─── ─── ─── ─── ─── ─── ─── ───┬─── ─── ─── ─── ───┐
│months or years │ │
│ │ │
│ │ efficient │
│ │ │
│ user interface program logic │ data structures │
│ │ & │
│ │ algorithms │
│ │ │
│ │ │
└─── ─── ─── ─── ─── ─── ─── ─── ─── ─── ─── ──────┴─── ─── ─── ─── ───┘
There is a huge gap between learning the ‘what’ and ‘how’ in weeks vs. understanding the application’s user interface and program logic code in months or years. I see the following possible causes:
As a part of my bigger project, I try to narrow the gap between the perceived complexity and code complexity of applications. In this blog post I present a case study about cause 2. I plan to discuss cause 3. at another time.
The text editor is probably the most iconic interactive application. The one-line editor can edit a single line of text:
As a case study, I reimplement the one-line text editor. My goal is to minimise its code complexity such that I don’t count the complexity of independent, reusable parts.
The user input is one of the following events:
event | function | details |
---|---|---|
Del | Backspace | Left | Right | ShiftLeft | ShiftRight |
control key | |
Char p |
printable character | p : 'a' | 'b' | 'c' | … |
Resize w |
window resize | w : width (in characters) |
I have selected the following editor features for the case study:
Char 'y'
= Heyllo world!Backspace
= Hello world!Del
= Hello world!Right
= Hello world!Left
= Hello world!ShiftRight
= Hello world!ShiftLeft
= Hello world!Resize 5
= Hello world!The set of corner cases is more vague than the set of features. A corner case for someone may be a feature or a common case for someone else. For me the editor has the following corner cases:
Backspace
Left
ShiftLeft
Del
Right
ShiftRight
Resize 10
Left
= Hello world! Backspace
= Hello orld! Right
= Hello world!Char 'o'
= Hello worold!Resize 2
= Hello world!ShiftRight
= Hello world! ShiftLeft
= Hello world!Users predict the behaviour of the application in the corner cases and they note the surprising cases only. I conclude that the perceived complexity of corner cases correlates with the number of surprising corner cases – which should be minimal – plus the complexity of the inference process by which users predict the behaviour of the application.
There is an asymptotic (huge) difference between the number of corner cases and the perceived complexity of corner cases. The number of corner cases grows probably exponentially with the number of features because corner cases may arise as interactions between different features. On the other hand, in my experience the perceived complexity of corner cases grows not much faster than the number of features.
I want the code complexity to correlate with the pereceived complexity of the corner cases. For this I want to reflect and distill the inference process by which users predict the behaviour in corner cases. As a side-effect the number of surprising corner cases will be minimized too.
For further analysis, let’s introduce the following definitions about the state of the editor:
t | the editable text |
c | cursor position inside t (beginning with 0) |
b | start position of the visible area (in characters, counted from the beginning of t) |
e | end position of the visible area (in characters, counted from the beginning of t) |
w | width of the visible area (in characters) |
The extended screenshot is a compact visualisation of these values. For example,
Hello world!
is a visualisation of
{t = "Hello world!"
, c = 7
, b = 5
, e = 15
, w = 10
}
User input can be mapped to editor state perturbations. Some examples:
user input | perturbation | meaning |
---|---|---|
Left |
∆c = (-1) |
decrement c |
Right |
∆c = (+1) |
increment c |
ShiftLeft |
∆b = (-1) |
decrement b |
ShiftRight |
∆b = (+1) |
increment b |
The editor state should always fulfil these so-called invariant properties:
invariant | explanation/feature |
---|---|
b ≤ c < e | the cursor is inside the visible area |
0 ≤ b | the visible area cannot be shifted before the editable text |
c ≤ length t | the cursor is inside the editable text |
e = w + b | the meaning of ‘width’ |
These constraints are inherent parts of the editor state.
Hello world!
can be also seen as a visualisation of
{t = "Hello world!"
, c = 7
, b = 5
, e = 15
, w = 10
| 0 ≤ b ≤ c < e, c ≤ length t, e = w + b}
Side-note about the notation
The part after the vertical bar is static information which can be expressed by refinement types or dependent types. Expressing it by refinement types would look like this:
{t = "Hello world!"
, c = 7
, b = 5
, e = 15
, w = 10
}
: {t ∊ String; c, b, e, w ∊ ℤ | 0 ≤ b ≤ c < e, c ≤ length t, e = w + b}
I use a bit more compact notation in the examples.
The editor’s corner cases are closely connected to the constraints. Consider the corner case when the cursor is at the right end of the visible area and the user hits Right
:
Hello world! & Right
This stands for
{c = 9
, e = 10
, … | c < e, …} & (∆c = (+1)
)
Here the application of the perturbation would invalidate c < e which is a signal for a corner case.
The common approach to handle the editor corner cases is to perturb the state and adjust it afterwards by a dedicated program logic which is called adjustVisibleArea
in the following example:
Hello world! & Right
= {c = 9
, b = 0
, e = 10
, … | c < e, e = w + b, …} & (∆c = (+1)
)
= adjustVisibleArea
{c = 10
, b = 0
, e = 10
, … | c < e, e = w + b, …}
= {c = 10
, b = 1
, e = 11
, … | c < e, e = w + b, …}
= Hello world!
This approach has the following shortcomings:
Perturbation propagation is also known as constraint propagation. Maybe it is easier to understand perturbation propagation if we split it into two distinct phases first.
In the first phase the perturbations are propagated in all possible ways:
Hello world! + Right
= {c = 9
, b = 0
, e = 10
, w = 10
, … | c < e, e = w + b, …} & (∆c = (+1)
)
= {c = 9
, b = 0
, e = 10
, w = 10
, … | c < e, e = w + b, …}
= Hello world!
(the user input is rejected)
= {c = 10
, b = 0
, e = 10
, w = 10
, … | c < e, e = w + b, …} & (∆e = (+1)
)
= {c = 10
, b = 0
, e = 11
, w = 10
, … | c < e, e = w + b, …} & (∆b = (+1)
)
= {c = 10
, b = 1
, e = 11
, w = 10
, … | c < e, e = w + b, …}
= Hello world!
(the cursor moved and visible are is shifted)
= {c = 10
, b = 0
, e = 11
, w = 10
, … | c < e, e = w + b, …} & (∆w = (+1)
)
= {c = 10
, b = 0
, e = 11
, w = 11
, … | c < e, e = w + b, …}
= Hello world!
(the cursor moved and the width of the visible area is increased)
After the first phase we get several possible results. In the second phase the results are filtered and the first good result is chosen.
This two-phase approach is tightly connected with the constraints and avoids the ad-hoc adjusting logic but it has at least two drawbacks:
In the editor application we know in advance (= statically) the constraints and the valid propagation of different perturbations.
Examples of such static knowledge:
With the static knowledge perturbation propagation can done in one phase. The running example looks like now this:
Hello world! & Right
= {c = 9
, b = 0
, e = 10
, w = 10
, … | c < e, e = w +’ b, …} & (∆c = (+1)
)
= {c = 10
, b = 0
, e = 10
, w = 10
, … | c < e, e = w +’ b, …} & (∆e = (+1)
)
= {c = 10
, b = 0
, e = 11
, w = 10
, … | c < e, e = w +’ b, …} & (∆b = (+1)
)
= {c = 10
, b = 1
, e = 11
, w = 10
, … | c < e, e = w +’ b, …}
= Hello world!
Either compilers may generate efficient code from the static knowledge, or better, we can structure the editor’s code such that it directly reflects the static knowledge. Let’s see how.
First I show how to avoid temporary invalidation of constraints.
The main idea is to compose complex constraints from basic ones. For this, constraints are represented as bidirectional stateful algorithms (see later). Each constraint has its own local state on which the constraint always holds. During the propagation of perturbations, the perturbations change the local state of the constraints as they are propagated by them.
Here are three steps of an example propagation:
│c ∆c = (+1)
╔══════╧══════╗
║ c < e ║ c = 9, e = 10
╚══════╤══════╝
│e
╔══════╧══════╗
║ e = w +' b ║ e = 10, w = 10, b = 0
╚═══╤═════╤═══╝
│w │b
The initial perturbation is ∆c = (+1)
. There are two basic constraints composed together. The local states on the right of the boxes are in sync and the constraints hold on them.
│c
╔══════╧══════╗
║ c < e ║ c = 10, e = 11
╚══════╤══════╝
│e ∆e = (+1)
╔══════╧══════╗
║ e = w +' b ║ e = 10, w = 10, b = 0
╚═══╤═════╤═══╝
│w │b
∆c = (+1)
was propagated by the c < e constraint whose local state was changed. The new perturbation to be propagated is ∆e = (+1)
. At this point the values of e are different in the two local states, but the constraints still hold on the local states.
│c
╔══════╧══════╗
║ c < e ║ c = 10, e = 11
╚══════╤══════╝
│e
╔══════╧══════╗
║ e = w +' b ║ e = 11, w = 10, b = 1
╚═══╤═════╤═══╝
│w │b ∆b = (+1)
At the end of the propagation the local states are in sync again.
At this point I am ready to show you the final editor structure which looks like this:
┌─user input─┐
╭──────────────┬───◂ const w ↤ Resize w ╎
│ ╰──▸ screen ╎ ╎
│ ╭───▸ screen ╎ ╎
w│ ╭──────┬──┴──┬────◂ (-1) ↤ ShiftLeft ╎
╔═╧═══╧═╗ │b ╰────◂ (+1) ↤ ShiftRight ╎
║ +' ║ │ ╎ ╎
╚═══╤═══╝ ╔══╧══╗ ╎ ╎
│e ║ ≤ ║ ╎ ╎
╔══╧══╗ ╚══╤══╝ ╎ ╎
║ > ║ │ ╎ ╎
╚══╤══╝ │c ╭────◂ (-1) ↤ Left ╎
├────────┴─────┴────◂ (+1) ↤ Right ╎
c│ ╭──────◂▸ co-editing ╎ ╎
╔═╧═══╧═╗ t ╎ ╎
║splitAt║ ╎ ╎
╚═╤═══╤═╝ t₂ ╎ ╎
t₁│ ╰──────┬──◂ deleteAt 0 ↤ Del ╎
├───────╮ ╰────▸ screen ╎ ╎
╔═══╧═══╗ ╰───────▸ screen ╎ ╎
║reverse║ ╎ ╎
╚═══╤═══╝ t₁'╭─◂ insertAt 0 c ↤ Char c ╎
╰─────────┴───◂ deleteAt 0 ↤ Backspace ╎
└────────────┘
Explanation:
splitAt
(c, t) is a constraint which holds if t is the concatenation of t₁ and t₂, and length t₁ = c. From this follows that the property c ≤ length t always holds.insertAt
i c – insert the character c at position ideleteAt
i – delete the character at position ireverse
t₁ is a constraint which holds if t₁′ is the reverse of t₁.Backspace
is mapped to deleteAt 0
which deletes the character before the cursor. Char c
is mapped to insertAt 0 c
which inserts c
right before the cursor.Wires are translated to constraint combinators. The translation process is automatic and it can be done by the compiler. The translation process is stable if certain conditions hold. You can read more about this in the Appendix.
The circle in the wiring could have many interpretations. The interpretation of circles is connected to the scheduling of propagation. I allow only scheduling-independent networks, for which any scheduling gives the same result. Scheduling-dependent networks can be ruled out either by the programmer or by strong static typing which I don’t discuss here. The actual scheduling is determined by the translation process of networks.
This architecture handles all corner cases correctly. Let’s see the running example first:
Hello world! & Right
The relevant part of the network looks like this after perturbation propagation is finished:
w = 10
╭──────────────▸ -
│ ╭──────┬───▸ (+1)
╔═╧═══╧═╗ │b = 0 & (+1)
║2. +' ║ │
╚═══╤═══╝ ╔══╧══╗
e=10│&(+1)║3. ≤ ║
╔══╧══╗ ╚══╤══╝
║1. > ║ │
╚══╤══╝ │c = 9 & (+1)
├────────┴──────────────────◂ (+1) ↤ Right
│ ╭────────▸ -
╔══╧═══╧══╗ t = "Hello world!"
║4.splitAt║
╚══╤═══╤══╝ t₂ = "ld!" & deleteAt 0
│ ╰────────▸ deleteAt 0
╰────────────▸ insertAt 9 'l'
t₁ = "Hello wor" & insertAt 9 'l'
Remarks:
(+1)
) as expected.(+1)
) but this information is not directly used in the screen rendering. The screen will be updated because t₁ and t₂ (the texts before and after the cursor) are perturbed.In the second example the user tries to delete the character after the end of the text:
Hello world! & Del
The relevant part in this case is just the splitAt
constraint:
c=│12 ╭──────▸ -
╔═╧═══╧═╗ t = "Hello world!"
║splitAt║
╚═╤═══╤═╝ t₂ = ""
t₁│ ╰──────◂ deleteAt 0 ↤ Del
Here the splitAt
halts the propagation process because it is not possible to propagate the deletion of the first character of the empty text. Because the propagation is halted, nothing will happen, which is the right behaviour. Halting the propagation is an optional feature of constraint combinators. You can read more about halting in the Appendix.
The editor can be rendered on the screen from w (the editor width), b (the begin position of the visible area), t₁ (the text before the cursor) and t₂ (the text after the cursor).
Just for illustration, here is the Haskell code for rendering the output:
render w b t₁ t₂ =
take w ( -- drop invisible glyps at the end
drop b ( -- drop invisible glyps at the beginning
map normalColor t₁ -- colour the characters before the cursor
++ mapHead invertColors ( -- invert the colours of the character right after the cursor
map normalColor t₂ -- colour the characters after the cursor
++ repeat (reddishColor ' ') -- unlimited reddish space after the text
)
)
)
The functions normalColor
and reddishColor
map characters to glyphs (colored characters). invertColors
maps a glyph to a glyph by exchanging it’s foreground and background colors. mapHead
applies a function to the first element of a list.
We don’t want waste computation power by re-rendering the whole screen for each user input. This is even more important for GUI programs. Incremental rendering is important for co-editing too: if another user edits part of the text which is not in the visible area of the user, then the user’s screen should not be touched at all.
Fortunately, the render
function can be converted to a ∆render
function which maps the perturbations from the editor core into perturbation of the screen, which is what incremental rendering means. ∆render
can also be constructed from combinators. For this we need simpler combinators than in perturbation propagation because ∆render
contains one-way wires only.
Let’s see the running example:
Hello world! & Right
The relevant output of the editor core is:
∆b = (+1)
, ∆t₂ = deleteAt 0
, ∆t₁ = insertAt 9 'l'
Without going into further details, ∆render
will produce the following output:
insertAt 8 (normalColor 'l')
. insertAt 8 (invertColors (normalColor 'd'))
. deleteAt 8
. deleteAt 0
Here the dot is the (reversed) composition of perturbations. If we apply the individual perturbations one by one we get to the final rendering from the previous rendering:
Hello worl
ello worl – deleteAt 0
ello wor – deleteAt 8
ello word – insertAt 8 (invertColors (normalColor 'd'))
ello world – insertAt 8 (normalColor 'l')
You have to go through the Appendix if you want to understand the actual implementation. I encourage you to read it also if you have not heard about categorical combinators yet.
You can find the source code of the implementation here. Note that the implementation is not in sync with this blog post and incremental rendering is not implemented which should be fixed soon.
As a part of my bigger project, I am going to implement a multi-line text editor and a basic TUI (textual user interface) email client and browser with as small code complexity as I can.