Making Sense of Lambda Calculus 5: Bring Computation to (Aggregate) Data

A bright thumbnail with black text. The text is arranged in a schematic tree. The tree first branches to LAM and BDA, then to CAL, and finally to CULUS. The branches are soft academic purple. Attributions to AARTAKA.me and Artyom Bologov are in the bottom corners.

So we’ve covered numbers as compositions and booleans as branches in this series of posts. But both of these are “primitive” data types. We need aggregate types to build the whole world out of! This post goes over techniques to build data types in Lambda Calculus. And some conventional types too.

Note on syntax

Starting with this post, I’ll use the syntax of Lamber, my Lambda Calculus-based programming language. I find it more readable than conventional single-letter-single-argument lambdas.

Conses/Tuples/Pairs: All You Need

An idea that generalizes to all of this post (and beyond): Bring computation to data, not the other way around. It’s a bit of a brain teaser, but let’s see for ourselves. Conses (or pairs, or tuples) are two-element collections, with a “car” (head) and a “cdr” (tail):

def cons fn (car cdr)
  (fn (f) f car cdr) .

Cons type definitions

So cons takes two things: car and cdr. Then it takes a third argument, a function, and applies that function to the first two arguments. It’s curried, so we don’t have to provide the function upfront. That’s what I was talking about: make the cons user provide their action on the data. Don’t just hand the data over to them.

Now the simplest action might be fetching values. Here are two functions for fetching cons parts.

def car fn (pair)
  pair (fn (a b) a) .

def cdr fn (pair)
  pair (fn (a b) b) .

Cons accessors

See that? We apply the data structure to the accessor function. Not the other way around. It’s counter-intuitive, but useful and generalizes well to other aggregate types.
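
For readers who want to poke at this outside of Lamber, here’s a minimal sketch of the same pair in Python. The curried lambdas and the parameter names head and tail are my assumptions for illustration, not Lamber’s actual internals:

```python
# Curried pair: hold two values, then wait for a function to apply to them.
cons = lambda head: lambda tail: lambda f: f(head)(tail)

# Accessors hand the pair a selector instead of reaching inside it.
car = lambda pair: pair(lambda a: lambda b: a)
cdr = lambda pair: pair(lambda a: lambda b: b)

pair = cons(1)(2)
print(car(pair), cdr(pair))  # prints: 1 2
```

The pair itself decides nothing. It only feeds both parts to whatever function it’s given.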

On NIL

An important part of all this cons business is NIL: the cons terminator, the final list/tree element. NIL is useful because it’s also a representation of an empty list. Data structures are useless if you don’t have a way to represent empty states, right?

NIL has two canonical representations with separate handling routines:

def nil false .
def alt-nil (fn x true) .

Two different representations of NIL

The former seems to be the consensus and it has a nice property: it’s 0, False, and NIL, all at the same time. Consistent and beautiful. With it, functions for NIL-aware cons iteration are elegant enough: just provide two values:

A function to apply to the head and tail if this is a cons;
And a default value if it’s a NIL.

A null check, for example, hands the pair a False-returning function (plus some formalities) and a True default:

def null fn (pair)
  pair (fn (h t x) false) true .

Definition of null function with NIL = False
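
The same null check can be sketched in Python, assuming curried Church booleans. The to_bool helper is a hypothetical convenience for printing and isn’t part of the encoding:

```python
# Church booleans: True picks the first argument, False picks the second.
true = lambda a: lambda b: a
false = lambda a: lambda b: b
nil = false  # NIL is False: it ignores the function and returns the default.

cons = lambda head: lambda tail: lambda f: f(head)(tail)

# A cons calls the function with head and tail; the curried result then
# swallows the default (x) and returns False. NIL just returns the default.
null = lambda pair: pair(lambda h: lambda t: lambda x: false)(true)

# Hypothetical helper: convert a Church boolean to a Python bool.
to_bool = lambda b: b(True)(False)

print(to_bool(null(nil)), to_bool(null(cons(1)(nil))))  # prints: True False
```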

This idiom notably includes a second argument to the pair. And (respectively) a third argument to the cons-processing function. This argument (True) is what the NIL branch returns. But if it’s not NIL, this argument has to go somewhere, and that’s the purpose of the x parameter. An elegant application of this pattern is the foldr function:

def foldr fn (f init list)
  local %foldr = fn (h t z)
    f h (t %foldr z)
  end
  list %foldr init
end

Right fold implementation with False-encoded NIL

We feed %foldr, a recursive function, to a cons list, alongside the initial value (a.k.a. init a.k.a. z). If the list is NIL, it returns the init. If it’s not, the list applies %foldr to its head, its tail, and the init; %foldr in turn calls f on the head and on the result of folding the tail. Init is passed deep into the list recursion until there’s NIL. NIL returns this init and then the whole recursion chain collapses into a final value: the folded function is applied to the last element and init, then to the penultimate element and the result of that, then to the element before that and the new result. Et cetera.

To illustrate, here’s a control flow for foldr + 0 [1 2 3]:

Apply [1 2 3] to fold and 0
It’s not NIL, so apply recursive function to head and tail
Apply + to 1 and...
- Apply [2 3] to fold and 0
- It’s not NIL, so apply recursive function to head and tail
- Apply + to 2 and...
  - Apply [3] to fold and 0
  - It’s not NIL, so apply recursive function to head and tail
  - Apply + to 3 and...
    - Apply []/NIL to fold and 0
    - It’s NIL, so just return 0
  - Get 3 + 0 = 3
- Get 2 + 3 = 5
Get 1 + 5 = 6
Done, foldr + 0 [1 2 3] is 6, all thanks to the beauty of NIL!
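
The trace above can be replayed with a small Python sketch. It assumes curried lambdas for the list and, for brevity, native Python integers and + instead of Church numerals. The helper name step stands in for %foldr:

```python
nil = lambda a: lambda b: b  # NIL = False: returns its second argument
cons = lambda head: lambda tail: lambda f: f(head)(tail)

def foldr(f, init, lst):
    # step plays the role of %foldr: it receives head, tail, and init (z).
    def step(h):
        return lambda t: lambda z: f(h)(t(step)(z))
    # A cons applies step to its parts; NIL ignores step and returns init.
    return lst(step)(init)

add = lambda x: lambda y: x + y
one_two_three = cons(1)(cons(2)(cons(3)(nil)))

print(foldr(add, 0, one_two_three))  # prints: 6
```

There’s no explicit NIL check anywhere in foldr. The list itself picks between “recurse” and “return init”.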

Wait, but what about alt-nil?! It’s nice in that it allows a short and neat null check:

def alt-null fn (pair)
  pair (fn (h t) false) .

Alternative NIL check

If the cons is NIL, it just returns True. If it’s non-NIL, it applies the function to head and tail and returns False. Which is quite elegant... But that’s about it for its benefits—and e.g. foldr looks too ugly and explicit with it. So I prefer the 0/False/Nothing/NIL equality. All of Lamber list-processing is based on this assumption.
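
For comparison, here’s a hedged Python sketch of the alternative encoding. The underscore names (alt_nil, alt_null) and the to_bool helper are mine, since Python identifiers can’t contain dashes:

```python
true = lambda a: lambda b: a
false = lambda a: lambda b: b
cons = lambda head: lambda tail: lambda f: f(head)(tail)

# alt-nil takes the function like a cons would, ignores it, and answers True.
alt_nil = lambda f: true

# A real cons applies the function to head and tail, which answers False.
alt_null = lambda pair: pair(lambda h: lambda t: false)

# Hypothetical helper: convert a Church boolean to a Python bool.
to_bool = lambda b: b(True)(False)

print(to_bool(alt_null(alt_nil)), to_bool(alt_null(cons(1)(alt_nil))))  # prints: True False
```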

Now, can we generalise all this apply-data-to-function business to something else?

Triples

Triples are like tuples/conses, but with three elements.

def triple fn (first second third)
  (fn (f) f first second third) .

def first fn (triple)
  triple (fn (a b c) a) .

;; Et cetera for second and third

Triple definition

If you want to represent an N-element collection, pass these elements to the constructor. And wait for the last argument—the function to apply to these.
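
A Python sketch of the triple, under the same curried-lambda assumption as before. The second and third accessors are my guess at what “et cetera” expands to:

```python
# Three stored values, then the function that consumes them.
triple = lambda a: lambda b: lambda c: lambda f: f(a)(b)(c)

first = lambda t: t(lambda a: lambda b: lambda c: a)
second = lambda t: t(lambda a: lambda b: lambda c: b)
third = lambda t: t(lambda a: lambda b: lambda c: c)

point = triple(1)(2)(3)
print(first(point), second(point), third(point))  # prints: 1 2 3

# Any three-argument action works, not only accessors.
print(point(lambda a: lambda b: lambda c: a + b + c))  # prints: 6
```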

Trees

With these two or three element collections, we can build arbitrarily complex structures. Cons trees (binary, ternary, or whatever else you come up with.) That’s what Lisp code is all about—trees of symbols.

cons (cons 1 2)
     (cons (cons 3 (cons 4 nil)) (cons 5 (cons 6 7)))

Cons tree

This represents a tree with:

1 and 2 (as one pair)
3 and 4 (as a so-called proper NIL-terminated list)
and 5, 6, and 7 (as a chain of conses)
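
To check that reading of the tree, here’s a sketch that rebuilds it in Python (curried lambdas and native integers are assumptions of the sketch) and walks it with car and cdr:

```python
nil = lambda a: lambda b: b
cons = lambda head: lambda tail: lambda f: f(head)(tail)
car = lambda pair: pair(lambda a: lambda b: a)
cdr = lambda pair: pair(lambda a: lambda b: b)

tree = cons(cons(1)(2))(
    cons(cons(3)(cons(4)(nil)))(cons(5)(cons(6)(7))))

# 1 and 2 live in the first pair.
print(car(car(tree)), cdr(car(tree)))                 # prints: 1 2
# 3 and 4 form a proper list that ends in NIL.
proper = car(cdr(tree))
print(car(proper), car(cdr(proper)))                  # prints: 3 4
# 5, 6, and 7 are a chain of conses with 7 in the last cdr.
chain = cdr(cdr(tree))
print(car(chain), car(cdr(chain)), cdr(cdr(chain)))   # prints: 5 6 7
```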

Or, visualizing it with a little piece of ASCII art (generated by draw-cons-tree):

[o|o]---[o|o]-------[o|o]-[o|o]-7
 |       |           |     |
[o|o]-2 [o|o]-[o|/]  5     6
 |       |     |
 1       3     4

Cons tree, visualized

See the tree there? Not the most readable thing, but the benefit is apparent: It allows creating arbitrarily complex structures. So we only need conses, right? In a sense yes, but there are more tricks and types we can use:

Union Types

We can increase the number of values we store in the structure, like with triples. But we can also increase the number of actions! That’s how union types actually work in LC. Get multiple functions, one per possible type, and only run the one matching the type.

The most representative union type is Either. Basically, it’s either Left with one value (usually an error) or Right with the other value (usually a result).

def left fn (leftval)
  fn (leftfn rightfn)
    leftfn leftval .

def right fn (rightval)
  fn (leftfn rightfn)
    rightfn rightval .

Either constructors

There’s only one value now, but two action functions! If the value is a Right, call rightfn; otherwise call leftfn. Now to actual functions processing these:

def isleft fn (either)
  either (fn _ true) (fn _ false) .

def leftval fn (either)
  either identity (fn _ nil) .

Left-related functions

The pattern repeats: we pass actions to data, not the other way around. But this time, there are multiple actions we pass. In this case, checking that the data is Left means passing a True-returning function as leftfn. And getting the left value is basically an identity leftfn to return whatever is stored there. Corresponding rightfn-s are obvious: return False/NIL when Right.
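
Here’s how Either might look in Python, as a sketch with curried lambdas. The constructor parameter is renamed to value so it doesn’t clash with the leftval accessor, and to_bool is a hypothetical printing helper:

```python
true = lambda a: lambda b: a
false = lambda a: lambda b: b
nil = false
identity = lambda x: x

# One stored value, two candidate actions; each constructor runs its own.
left = lambda value: lambda leftfn: lambda rightfn: leftfn(value)
right = lambda value: lambda leftfn: lambda rightfn: rightfn(value)

isleft = lambda either: either(lambda _: true)(lambda _: false)
leftval = lambda either: either(identity)(lambda _: nil)

# Hypothetical helper: convert a Church boolean to a Python bool.
to_bool = lambda b: b(True)(False)

print(to_bool(isleft(left("oops"))), to_bool(isleft(right(42))))  # prints: True False
print(leftval(left("oops")))                                      # prints: oops
```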

Again, this example can be generalized to an arbitrary number of action functions. That’s what Mogensen–Scott encoding is about. Store the data, get action functions, run the matching one, get the result.
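
As an illustration of that generalization, here’s a made-up three-variant type in Python. The shape type, its constructors, and the area function are all hypothetical and only meant to show three action slots at once:

```python
# Each constructor takes its data, then one handler per variant,
# and runs only the handler that matches itself.
point = lambda on_point: lambda on_square: lambda on_rect: on_point
square = lambda side: (
    lambda on_point: lambda on_square: lambda on_rect: on_square(side))
rect = lambda w: lambda h: (
    lambda on_point: lambda on_square: lambda on_rect: on_rect(w)(h))

# A point carries no data, so its "handler" is just a plain value,
# much like the default that NIL returns.
area = lambda shape: shape(0)(lambda s: s * s)(lambda w: lambda h: w * h)

print(area(point), area(square(3)), area(rect(2)(5)))  # prints: 0 9 10
```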

Maybe and Nothing (spoiler: there’s no Nothing)

July 2025: Originally, the post was coming to an abrupt end right after the previous section. But I decided to also list my implementation of the Maybe monad, in part because it diverges from Marvin Borner’s in a significant way. The way Marvin defines Maybe is:

Nothing = nothing => just => nothing
Just    = v => nothing => just => just(v)

isNothing = maybe => maybe(true)(_ => false)
isJust    = maybe => maybe(false)(_ => true)

getValue = just => just()(v => v)

Maybe, Just, and Nothing definitions

Notice the order of arguments to Nothing/Just: “nothing” clause goes first, “just” second. Which makes as much sense as any other placement, actually:

Personally I often choose the argument/tag order such that the de Bruijn indices make some intuitive sense to me

Marvin Borner on Nothing/Maybe parameter order

But, for my implementation, I picked the opposite order, and here’s how it ended up:

alias nothing nil .

def maybe fn (x)
  fn (just none)
    just x .

def isnothing fn (mb)
  mb (fn _ false) true .

def isjust = complement isnothing .

def just fn (default mb)
  mb identity default .

Lamber implementation of Maybe

So Nothing is no longer a random-looking function, it’s merely an alias for NIL. Which is extending a 0=False=NIL=Nothing property I strive for so much. And then, the order of arguments to isnothing and just is reversed too. Which makes a lot of sense for just, kind of saying “if maybe, then take identity, otherwise return default value.” NIL/Nothing returns the second argument automatically, so no need for any special logic. I like it when the tech solves its own problems. As my implementation of Maybe did!
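
A Python sketch of the same Maybe, for experimenting. It assumes that complement negates a predicate returning Church booleans, which is my reading of the name rather than something confirmed here, and to_bool is again a hypothetical helper:

```python
true = lambda a: lambda b: a
false = lambda a: lambda b: b
nil = false
nothing = nil  # Nothing is just NIL: it returns its second argument.
identity = lambda x: x

# A present value runs the "just" action on itself and ignores the default.
maybe = lambda x: lambda just: lambda none: just(x)

isnothing = lambda mb: mb(lambda _: false)(true)

# Assumed meaning of complement: flip a Church-boolean predicate.
complement = lambda pred: lambda x: pred(x)(false)(true)
isjust = complement(isnothing)

# "If maybe, then take identity, otherwise return the default value."
just = lambda default: lambda mb: mb(identity)(default)

# Hypothetical helper: convert a Church boolean to a Python bool.
to_bool = lambda b: b(True)(False)

print(just(0)(maybe(5)), just(0)(nothing))  # prints: 5 0
```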

I guess that’s a lot of screen time taken for a small hack. But at least you’ve gotten a taste of what writing Lambda Calculus monads sometimes involves: parameter juggling! Now to actually good material:

Directions to More Complex Structures

I’m too stupid for more complex data types. Luckily, Marvin Borner isn’t. So go read his “Tiny, Untyped Monads” instead of a conclusion to this post. It’s worth the read! I’ll get to that too, and maybe I can steal some more of it into Lamber, my Lambda Calculus-compiling language.

As an additional reference: the “bring computation to data” technique is used by Christian Queinnec in his Lisp in Small Pieces book, in the part about emulating object-orientation when all you have are lambdas. The solution he suggests is making a constructor lambda closing over initial arguments. And then passing in a symbol matching the respective “field” to fetch/modify it. It can be generalized by passing in a function instead of a symbol, like most structures in this post do!