Making Sense of Lambda Calculus 2: Numerous Quirks of Numbers

With basic terms and order of evaluation out of the way, we can finally get to practical things. Like numbers.

Numbers in Lambda Calculus (LC) are usually represented by Church Numerals. These are usually two-argument functions taking a function and a value. Numbers apply the function to the value N times and return the result. Here are some numbers:

0 = λfx.x
1 = λfx.f x
2 = λfx.f (f x)
5 = λfx.f (f (f (f (f x))))
...

Examples of Church Numerals

There’s also an encoding of numbers with pairs. But that’s too much overhead for questionable benefit. So let’s stick with regular Church numerals and see that...

Numbers Are Compositions #

This wasn’t immediately clear: numbers are compositions. A Church numeral is applying a function it’s given a certain number of times. So you can compose a function with itself by passing it to a number. Elegant, yet non-obvious.

Imagine you want a function taking the fifth (zero-indexed) element of the linked list. So you strip off list beginnings five times and then get the fifth element directly. Implying that tail returns the tail of the list and head returns the head:

head ((5 tail) list)
head (((λfx.(f (f (f (f (f x)))))) tail) list)
head ((λx.(tail (tail (tail (tail (tail x)))))) list)
head (tail (tail (tail (tail (tail list)))))

Example of numbers as function compositions

Successor #

So numbers are clearly compositions. If we want to increase some number, we can simply take this number and compose another function call on top of it. That’s the logic of Successor function:

succ = λnfx.f (n f x)
succ 5
succ (λdy.(d (d (d (d (d y)))))) // α-substitution fx -> dy
λfx.f ((λdy.(d (d (d (d (d y)))))) f x)
λfx.f (f (f (f (f (f x))))) // η-reduction

Successor function

Form-closing Bracket #

These closing parentheses are getting unruly. I have to introduce another notational element to handle them. I’ll use ] (as the Interlisp homage) at the end of deeply nested form. That’ll mean “arbitrary number of closing parentheses”. The final function in the example above can be rewritten as:

λfx.f (f (f (f (f (f x]

6 in the bracket-terminated form

Hopefully that’s not too off-putting.

Addition and Peano Arithmetics #

Peano arithmetics—as far as I was able to understand it— is all about building a complete set of numeric operations out of constants and successor operator. It’s also possible to implement these arithmetics in the toolkit Lambda Calculus provides. Building on the foundations:

Constants are Church numerals like λfx.f (f (f (f (f (f x].
Successor operation is succ.
m+n Addition is applying succ to m an n number of times.
Multiplication is... let’s leave it for later.

So here’s how addition formula based on Peano axioms looks in Lambda Calculus:

// Peano
a + 0 = a
a + S(b) = S(a + b)
// Lambda Calculus:
add = λmn.n succ m,

Translating Peano addition to Lambda Calculus

While these look quite different, the idea is the same: apply the successor n times to get n+m. Wikipedia page has a good example that I won’t be able to reproduce here. But, believe me, these are the same.

Deadly Predecessor & Subtraction #

Now what Peano arithmetics don’t cover is subtraction. There are no negative numbers, so there’s no “subtraction is addition of negative number”. So we need to somehow strip off some function applications from the number. The smart piece of code that Wikipedia lists is:

pred = λnfx.n (λgh.h (g f)) (λu.x) (λu.u)

Predecessor function

Here’s some reduction to show how it works. Beware: it’s quite overwhelming!

pred 3
(λnfx.n (λgh.h(g f)) (λu.x) (λu.u)) 3
(λnfx.n (λgh.h(g f)) (λu.x) (λu.u)) (λfx.f(f(fx]
λfx.(λgh.h(g f)) ((λgh.h(g f)) ((λgh.h(g f)) (λu.x))) (λu.u)
λfx.(λgh.h(g f)) ((λgh.h(g f)) (λh.h x)) (λu.u)
λfx.(λgh.h(g f)) ((λgh.h(g f)) (λh.h x)) (λu.u)
λfx.(λgh.h(g f)) ((λgh.h(g f)) (λd.d x)) (λu.u)
λfx.(λgh.h(g f)) (λh.h((λd.d x) f)) (λu.u)
λfx.(λgh.h(g f)) (λh.h(f x)) (λu.u)
λfx.(λgh.h(g f)) (λv.v(f x)) (λu.u)
λfx.(λh.h((λv.v(f x)) f)) (λu.u)
λfx.(λh.hf(f x)) (λu.u)
λfx.((λu.u) f(f x))
λfx.f(f x)

Reduction of “pred 3” expression

Essentially (you probably got it just from looking at it, right?) what it does is introduce a proxy function that gets passed to the number. The first application of this function is returning x. And the rest of the applications just apply f to it. So it’s stripping off one innermost call to f from the number. Scary, but useful.

Now that the predecessor is covered, subtraction is a piece of cake. The principle is the same as addition: just apply predecessor N times:

sub = λmn.n pred m,

Subtraction function

Quite elegant... if we don’t consider the underlying horrors of pred.

Multiplication and Exponentiation #

Multiplication and exponentiation are beautiful in LC. But they are not necessarily apparent in how they work. So I’ll try to make some reductions and explain how these make their magic. Here’s multiplication:

mult = λmnf.m (n f)

Multiplication of m×n in Lambda Calculus

The idea is clear: take n, apply it to f, and repeat that m times. But once you squint at it, it unfolds into a series of questions:

Where is x in it? The zero/init/base value, I mean.
How does currying work in there?
Is applying one number to another going to be useful?

To deepen the feeling of displacement, here’s the exponentiation function:

pow = λbe.e b

Raising b to the power of e in Lambda Calculus

This is basically this application of one number to another one. Quite intuitive on the surface. But it gives me a stack overflow when I try to mentally reduce it. So here are written reduction to understand mult and pow:

mult 2 3
(λmnf.m (n f)) 2 3
(λmnf.m (n f)) (λfx.f (f x)) 3
(λmnf.m (n f)) (λfx.f (f x)) (λfx.f (f (f x]
(λnf.(λfx.f (f x)) (n f)) (λfx.f (f (f x]
λf.(λfx.f (f x)) ((λfx.f (f (f x))) f)
λf.(λgy.g (g y)) ((λhz.h (h (h z))) f) // α-substitution
λf.(λgy.g (g y)) (λz.f (f (f z)))
λfy.(λz.f (f (f z))) ((λz.f (f (f z))) y) // NB
λfy.(λz.f (f (f z))) (f (f (f y]
λfy.f (f (f (f (f (f y]
λfx.f (f (f (f (f (f x] // SIX!!!!!

Reduction of multiplication over 2 and 3

So what happens here is indeed this curried number thing. n is applied to the base value (implied here) m times. The biggest plot twist there is the line marked with NB: the number with f being curried inside it is getting applied exactly m times. Which is the same as I said earlier, but now proven by this small reduction (at least compared to the predecessor one.) Now the tasty stuff with even more levels of implied currying:

pow 3 2
(λbe.e b) 3 2
(λbe.e b) (λfx.f (f (f x))) 2
(λbe.e b) (λfx.f (f (f x))) (λfx.f (f x))
(λe.e (λfx.f (f (f x))))  (λfx.f (f x))
(λf.λx.f (f x)) (λfx.f (f (f x)))
λx.(λfx.f (f (f x))) ((λfx.f (f (f x))) x)
λx.(λfx.f (f (f x))) ((λfy.f (f (f y))) x) // α-substitution
λx.(λfx.f (f (f x))) (λy.x (x (x y))) // NB
λx.(λfz.f (f (f z))) (λy.x (x (x y))) // α-substitution
λxz.(λy.x (x (x y))) ((λy.x (x (x y))) ((λy.x (x (x y))) z))
λxz.(λy.x (x (x y))) ((λy.x (x (x y))) (x (x (x z)))
λxz.(λy.x (x (x y))) (x (x (x (x (x (x z]
λxz.x (x (x (x (x (x (x (x (x z]
λfz.f (f (f (f (f (f (f (f (f z]  // α-substitution
λfx.f (f (f (f (f (f (f (f (f x]  // α-substitution
λfx.f (f (f (f (f (f (f (f (f x] // NINE!!!

Reduction of exponentiation of 3 by 2

Now what happens here, especially on the line marked with NB, is a crime. Crime against static type systems and fundamentalist minds. The value that’s supposed to be a base/zero/init value... Is used as a function. This makes Lambda Calculus both more cryptic and beautiful:

Functions are all there is to LC.
Even the ones supposed to be scalar values.
Even the ones supposed to be “functions”.
And even the ones that oscillate between the two!

This is a takeaway that the exponentiation reduction refreshed in my mind: (After all, it’s been almost 4 months since the last Making Sense of Lambda Calculus post.) Don’t be afraid to break the supposed contract your lambdas are establishing. They are just that: lambdas. Nothing else.

But there are cases when even the smart currying, re-interpretation of data as functions and vice versa, and other arcane stuff doesn’t help. Like division:

Is There Division? #

Division is always a problem. It’s not following many rules that addition and multiplication establish. And there’s a whole page on division algorithms on Wikipedia.

The simplest algo is the subtraction-based one. Basically subtracting the denominator from the numerator until it’s below the denominator. Here’s how stdlambda implementation of division works:

div = λnd.Y (λr.λq.(lt q d) 0 (succ (r (- q d)))) n

Division implementation based on recursive subtration

A lot of things there:

Y: Y combinator for recursion on r
lt: Less than function working as a termination condition.
succ: Successor function from above.
sub: Subtraction.

Putting it all together:

Apply the function to numerator (n) and denominator (d)
If the quotient (q, numerator initially) is less than the denominator, then return 0.
Otherwise, subtract the denominator from the quotient and recurse.
Add one to the recursive call result every time we subtract the denominator.
Once the recursion terminates, 0 turns into the number of denominator subtractions.
Which is basically what a division is.

And yes, that’s the simplest implementation I can think of. No elegant stuff like applying numbers to each other. Justine Tunney lists a division function in her “Lambda Calculus in 383 Bytes”, but I was not able to decipher how it works. I’m stuck with advanced things like Y combinator for the seemingly simple operation: division.

Up Next: Truth or Dare With Booleans #

Summarizing all of the above:

Numbers are compositions.
Increasing them is a matter of composing yet another function on top.
Decreasing/subtracting them is... complicated.
Multiplication and exponentiation are beautiful.
While division is... complicated.

One type of numeric operations that I haven’t mentioned is comparisons. Because comparisons get us into the uncharted lands of booleans. Reviewing booleans is a big topic beyond this (already huge) post. So booleans come next, in Making Sense of Lambda Calculus 3!