The Unlambda Programming Language

Unlambda: Your Functional Programming Language Nightmares Come True

What's New in Unlambda World?

(If you don't know what Unlambda is, skip this section and move directly to the introduction below.)

[1999/11/04] Unlambda II is coming out soon! Distribution 2.0.0 will have the following changes over version 1.0.0:

Distribution 2.0.0 is not available yet. See below for details.

Introduction

Unlambda is a programming language that is specially designed to allow for obfuscation. Whereas other attempts towards the same goal (such as Intercal or Brainf***) are imperative, Unlambda is a purely functional language.

Unlambda is very minimalistic. However, contrary to most such languages, it does not attempt to mimic the Turing Machine paradigm: Unlambda does not use a tape, array or stack. Nor is it binary-oriented; as a matter of fact, it does not manipulate integers in any way. Other remarkable (un)features of Unlambda are the fact that it does not have any variables, data structures or code constructs (such as loops, conditionals and such like).

Rather, Unlambda uses a functional approach to programming: the only objects it manipulates are functions. Each function takes a function as argument and returns a function. Apart from a binary ``apply'' operation, Unlambda provides several builtin functions (the most important ones being the K and S combinators). User-defined functions can be created, but not saved or named, because Unlambda does not have any variables.

Despite all these apparently insurmountable limitations, Unlambda is fully Turing-equivalent.

Mathematically, the core of the language can be described as an implementation of the lambda-calculus without the lambda operation, relying entirely on the K and S combinators. Hence the name ``Unlambda''. It uses head (eager, by value) evaluation.

To give an example of Unlambda's unique elegant style, here is the first program I wrote in it:

# This unlambda program prints the integers consecutively.  Each
# integer n is printed as a line of n asterisks.
````s``s`ks``s`k`si``s`kk``s`k
                             `s``s`ksk # Increment
                            ``s`k
                                ``s``s``si`k.*`kri # Print n (and return n)
                               i`ki
  `ki # The number zero (replace by i to start from one)
 ``s``s`ks``s`k`si``s`kk``s`k # Ditto (other half-loop)
                            `s``s`ksk
                           ``s`k
                               ``s``s``si`k.*`kri
                              i`ki

Admittedly, this program is not the shortest that will achieve this goal. In fact, I have since discovered that the same effect can be obtained with the following much shorter program:

``r`ci`.*`ci

However, the latter program is probably far more difficult to understand than the former. (The former took me about two hours to write. The latter I stumbled upon by pure luck and it took me two hours to read.)

Tutorial

Although the very idea of a tutorial for such an obfuscated language as Unlambda is patently absurd, I shall try to give a brief introduction to the concepts before delving into the details of the reference section (which is also very short considering how small Unlambda is as a whole).

As has been mentioned in the introduction, the only objects that the Unlambda programming language manipulates are functions. Every function takes exactly one argument (that is also a function) and returns one value (that is also a function).

The basic building blocks for Unlambda programs are the primitive functions and the application operation. There are seven primitive functions: k, s, i, v, d, c and .x (where x is an arbitrary character, so that actually makes 6+256 primitive functions, but we shall consider .x as a single function; the r function is but a convenient synonym for .x where x is the newline character).

Function application is designated with the backquote (ASCII number 96=0x60) character. The notation is prefix, in other words, `FG means F applied to G. (Note that this is not the same thing as the composite of F and G (namely the function taking X to `F`GX, that is, F applied to `GX); the latter can be, as we will later see, written as ``s`kFG.)

The fact that every Unlambda function is unary (takes exactly one argument) means that the notation is unambiguous, and we do not need parentheses (or, if you prefer, the backquote plays the role of the open parenthesis of Lisp, but the closed parenthesis is unnecessary). For example, ``FGH means (F applied to G) applied to H whereas `F`GH means F applied to (G applied to H). In fact, to check whether an expression is a valid Unlambda expression, there is a simple criterion: start at the left with a counter equal to the number 1, and move from left to right: for every backquote encountered, increment the counter, and for every primitive function encountered, decrement it; the counter must always remain positive except at the very end when it must reach zero.
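The counter criterion above is easy to mechanize. Here is a minimal Python sketch of it, under a few assumptions of mine: the function name is_valid_unlambda is invented for the illustration, only the lowercase version 1 primitives are accepted, and whitespace and # comments are skipped in the way the reference section describes.

```python
def is_valid_unlambda(src: str) -> bool:
    """Apply the counter criterion to a version-1 Unlambda expression."""
    counter = 1
    pos = 0
    while pos < len(src):
        ch = src[pos]
        if ch.isspace():                    # whitespace between tokens is ignored
            pos += 1
            continue
        if ch == "#":                       # comment: skip to the end of the line
            while pos < len(src) and src[pos] != "\n":
                pos += 1
            continue
        if counter == 0:
            return False                    # counter reached zero before the end
        if ch == "`":
            counter += 1
            pos += 1
        elif ch == ".":                     # .x is one primitive, two characters
            if pos + 1 >= len(src):
                return False                # dangling dot
            counter -= 1
            pos += 2
        elif ch in "ksivdcr":
            counter -= 1
            pos += 1
        else:
            return False                    # not a version-1 token
    return counter == 0


print(is_valid_unlambda("``r`ci`.*`ci"))    # the short program from the introduction
print(is_valid_unlambda("`r"))              # an application missing its operand
```

The check "counter == 0 before reading a token" is what enforces the "always remain positive except at the very end" part of the rule: a second expression following a complete one is rejected.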

Since all Unlambda functions take exactly one argument, when we wish to handle a function of several arguments, it is necessary to ``currify'' that function, that is, to read the arguments one after another. For example, if F is a function that should take three arguments, it will be applied thus: ```FG1G2G3. The idea is that F does nothing but read the first argument and return (without side effects) a function that reads the second argument and returns a function that reads the third argument and finally performs whatever calculation F was supposed to perform. Thus, both ``FG1G2 and `FG1 are legal, but they don't do much except wait for more arguments to come.

The previous discussion is not so theoretical. Of course, when the user is defining his own functions, he may use whatever mechanism he sees fit for reading the functions' arguments (but such a currification is certainly the best because pairs and lists are so horribly difficult to define in Unlambda). But the builtin k and s functions take respectively 2 and 3 arguments, and the several arguments are passed in the manner which we have just described. (As a side note, I remark that it is, if not impossible, at least inconvenient, to construct functions that take zero arguments because preventing evaluation until all arguments have been read is good but when there are no arguments to be read, the situation is not pleasant; in the pure lambda calculus there is no problem because evaluation order is unspecified and irrelevant, but in Unlambda we have a bigger problem. Here the d function might help.)

A note about evaluation order: when Unlambda is evaluating an expression `FG, it evaluates F first, and then G (the exception being when F evaluates to d), and then applies F to G. Evaluation is idempotent: that is, evaluating an already evaluated expression in Unlambda does not have any effect (there is no level-of-quotation concept as in m4 or SIMPLE).

We now turn to the description of the Unlambda builtins.

The k and s builtins are the core of the language. Just these two suffice to make Unlambda Turing complete (although .x is also necessary if you want to print anything). The k builtin is easy enough to describe: it takes two arguments (in currified fashion, as explained above) and returns the first. Thus, ``kXY evaluates to X (evaluated). Note that Y is still evaluated in the process. The s builtin is slightly more delicate. It takes three arguments, X, Y and Z, and evaluates as ``XZ`YZ.

We also mention immediately the i function: it is the identity function, in other words, it takes an argument and returns it intact. The i function is not strictly necessary but it is practical. It could be replaced by ``skk. Indeed, ```skkX evaluates as ``kX`kX, which in turn evaluates as X.
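For readers more at ease with a conventional language, the three builtins can be imitated with curried Python functions, where the application `FG becomes the call F(G). This is only an analogy (Python values are not restricted to functions, and the names k, s, i and skk below are merely chosen to mirror the text), but Python also evaluates arguments eagerly, so the behaviour is close.

```python
def k(x):
    """``kXY evaluates to X: k takes X and returns the constant function `kX."""
    return lambda y: x


def s(x):
    """```sXYZ evaluates as ``XZ`YZ."""
    return lambda y: lambda z: x(z)(y(z))


def i(x):
    """The identity function."""
    return x


# ``skk behaves like i: ```skkX -> ``kX`kX -> X
skk = s(k)(k)
print(skk("anything") == i("anything"))
```

Note that in skk(X) the inner call k(X) is made twice, once for each occurrence in ``kX`kX, exactly as the reduction in the text suggests.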

The k builtin is a ``constant function constructor''. That is, for all X, `kX is the constant function with value X. The s builtin corresponds to ``substituted application'': that is, ``sXY is a function that, instead of applying X to Y directly, will apply each of them to Z (the argument) first, and then one to the other. Finally, i is the identity function.

A method for ``lambda removal'' (or ``elimination of abstraction'') follows from the remarks of the previous paragraph. Suppose we are given an expression F that, apart from applications and the primitive functions, may also contain occurrences of one variable, call it $x. We want to construct the function which, when applied to a given expression X, returns the value of F with X substituted for $x. Let us write ^xF for this function (where ^ is supposed to be a lambda character). Since lambda (abstraction, that is) is not part of the Unlambda language (hence its name), we need a way to remove it (to perform elimination of abstraction).

The algorithm reduces to three cases: either F is some builtin (or, more generally, anything primitive other than $x), or it is $x, or it is `GH, where G and H are simpler expressions. In the first case, ^xF is merely `kF (since F does not depend on $x). In the second case, ^xF is ^x$x, that is, i. In the third case, we have precisely a substituted application, and ^x`GH is ``s^xG^xH. These rules allow progressive reduction of a lambda expression and, starting from the innermost lambda, of any expression containing lambdas (a term of the untyped lambda calculus).

It is sometimes possible to speed things up a little. Notably, ^xF can be replaced by `kF whenever F does not contain $x. However, such departures from the Canon are not without their dangers: in this case F is evaluated before the function is ever applied, which amounts to partially evaluating, within the body of the lambda expression, the subexpressions that do not contain the variable; this eagerness can introduce early loops or other nasty surprises (as a rule, replacing with `d`kF will work, but it may be considered cheating).
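The three rules translate almost word for word into code. The following Python sketch is an illustration only, not the unlambdaify.scm tool from the distribution: the representation (strings for primitives and variables, pairs for applications) and the names contains, eliminate and unparse are assumptions of mine. The shortcut flag enables the `kF optimization discussed above.

```python
def contains(expr, var):
    """True if the variable var occurs anywhere in expr."""
    if isinstance(expr, tuple):
        return contains(expr[0], var) or contains(expr[1], var)
    return expr == var


def eliminate(var, expr, shortcut=False):
    """Return an expression equivalent to ^var expr, without the lambda.

    expr is a string (a primitive or a variable such as "$x") or a pair
    (G, H) standing for the application `GH.
    """
    if shortcut and not contains(expr, var):
        return ("k", expr)                   # non-canonical: `kF for any F without var
    if expr == var:
        return "i"                           # ^x$x is i
    if isinstance(expr, tuple):
        g, h = expr                          # ^x`GH is ``s^xG^xH
        return (("s", eliminate(var, g, shortcut)), eliminate(var, h, shortcut))
    return ("k", expr)                       # a primitive other than var


def unparse(expr):
    """Write an expression back in backquote notation."""
    if isinstance(expr, tuple):
        return "`" + unparse(expr[0]) + unparse(expr[1])
    return expr


# ^h^x`$h$h, eliminating the innermost lambda first
body = ("$h", "$h")
canonical = eliminate("$h", eliminate("$x", body))
eager = eliminate("$h", eliminate("$x", body, shortcut=True), shortcut=True)
print(unparse(canonical))
print(unparse(eager))
```

The two results differ only in how much is evaluated ahead of time, which is exactly where the shortcut becomes dangerous in an eager language.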

The v function is not very important and rarely useful. It takes an argument, ignores it and returns v. It can be used to swallow any number of arguments. The v function can be implemented using s, k and i (and hence, using s and k only). Indeed, it can be written using the lambda expression `^h^x`$h$h^h^x`$h$h (which evaluates to ^x of the same thing), and abstraction elimination shows that this is ` ``s``s`ks``s`kki``s`kki ``s``s`ks``s`kki``s`kki (here is an example when early evaluation in an abstraction elimination can be disastrous, for example if ` ``s`kk``sii ``s`kk``sii were used instead).

The .x function is the only way to perform output in Unlambda (note that version 1 of the language has no way to perform input). This function takes an argument and, like the identity function, returns it unchanged. Unlike the identity function, however, it has a side effect, namely to print the character x on the standard output (the printing takes place when .x is applied). Note that while this function is written with two characters, it is still one function; on no account should .x be thought of as something applied to x (and, just to insist, there is no such function as . (dot), only .x (dot x)). The r function is just one instance of the .x function, namely when x is the newline character. Thus, the `ri program has the effect of printing a newline (so would `rv or `rr or `r(anything), but r alone doesn't do it, because here the r function isn't applied; this should make clearer my earlier note about the difficulty of currifying functions of zero arguments).

The d function is an exception to the normal rules of evaluation (hence it should be called a special form rather than a function). When Unlambda is evaluating `FG and F evaluates to d (for example when F is d) then G is not evaluated. The result `dG is a promise to evaluate G: G is kept unevaluated until the promise is itself applied to an expression H. When that happens, G is finally evaluated (after H is), and it is applied to H. This is called forcing the promise.

For example, `d`ri does nothing (and remains unevaluated), and ``d`rii prints a blank line. Another point to note is that ``dd`ri prints a blank line: indeed, `dd is first evaluated, and since it is not the d function, it does not prevent the `ri expression from being evaluated (to i, with the side effect of printing a newline), so that when finally d is applied, it is already too late to prevent the newline from being printed. To summarize, the d function can delay the d function itself. On the other hand, ``id`ri does not print a blank line (because `id does evaluate to d). Similarly, ```s`kdri is first transformed to ```kdi`ri, in which ``kdi is evaluated to d, which then prevents `ri from being evaluated so no newline gets printed.

Writing `d`kF is another form of promise (perhaps more customary but at the same time less transparent): when it is applied to an arbitrary argument Y, then Y is ignored and F is evaluated and returned.
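The d examples above can be checked with a toy evaluator. What follows is a rough Python sketch, not the interpreter from the distribution: it handles only k, s, i, v, d, .x and r (no c, no comments), and the names parse, evaluate, apply_fn and run, as well as the tuple encoding of values, are assumptions made for the illustration. Values are encoded so that they can be embedded back into expressions, which is what lets ``XZ`YZ and forced promises go through the ordinary evaluation rule.

```python
import io


def parse(src, pos=0):
    """Parse one expression starting at pos; return (tree, next_pos)."""
    while src[pos].isspace():
        pos += 1
    ch = src[pos]
    if ch == "`":
        f, pos = parse(src, pos + 1)
        g, pos = parse(src, pos)
        return ("app", f, g), pos
    if ch == ".":
        return ("print", src[pos + 1]), pos + 2
    if ch == "r":
        return ("print", "\n"), pos + 1
    if ch in "ksivd":
        return (ch,), pos + 1
    raise ValueError(f"unexpected character {ch!r}")


def evaluate(tree, out):
    """Evaluate `FG: F first, then G unless F evaluated to d."""
    if tree[0] != "app":
        return tree                          # already a value
    f = evaluate(tree[1], out)
    if f == ("d",):
        return ("promise", tree[2])          # operand is kept unevaluated
    g = evaluate(tree[2], out)
    return apply_fn(f, g, out)


def apply_fn(f, g, out):
    """Apply the value f to the value g."""
    tag = f[0]
    if tag == "k":
        return ("k1", g)                     # `kX
    if tag == "k1":
        return f[1]
    if tag == "s":
        return ("s1", g)                     # `sX
    if tag == "s1":
        return ("s2", f[1], g)               # ``sXY
    if tag == "s2":                          # evaluate ``XZ`YZ by the normal rule
        return evaluate(("app", ("app", f[1], g), ("app", f[2], g)), out)
    if tag == "i":
        return g
    if tag == "v":
        return f
    if tag == "print":
        out.write(f[1])
        return g
    if tag == "promise":                     # force: evaluate G, then apply it to g
        return evaluate(("app", f[1], g), out)
    if tag == "d":
        return ("promise", g)
    raise ValueError(f"cannot apply {f!r}")


def run(src):
    """Evaluate a program and return whatever it printed."""
    out = io.StringIO()
    tree, _ = parse(src)
    evaluate(tree, out)
    return out.getvalue()


for program in ["`d`ri", "``d`rii", "``dd`ri", "``id`ri", "```s`kdri"]:
    print(program, repr(run(program)))
```

Only ``d`rii and ``dd`ri should print a newline here; the other three leave `ri inside an unforced promise.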

The c (``call with current continuation'') function is probably the most difficult to explain (if you are familiar with the corresponding function in Scheme, it will help a lot). c called with an argument F will apply F to the current continuation. The current continuation is a special function which, when it is applied to X, has the effect of making c return immediately the value X. In other words, c can return in two ways: if F applied to the continuation evaluates normally, then its return value is that of c; but if F calls the continuation at some point, c will immediately return the value passed to the continuation. Note that the continuation can even escape from the c call, in which case calling it will have the effect of going ``back in time'' to that c call and making it return whatever value was passed to the continuation. For a more detailed discussion, see any book on Scheme.

Examples of c include ``cir: here, `ci evaluates to the continuation of the c which we shall write <cont>, and we have `<cont>r: here, the continuation is applied, so it makes the c call return r, and we are left with `rr which prints a newline. Another interesting example is `c``s`kr``si`ki: in this expression, the argument ``s`kr``si`ki (which does not evaluate any further) is applied to the continuation of the c, giving ```s`kr``si`ki<cont> (where we have written <cont> for the continuation in question); this gives ` ``kr<cont> ```si`ki<cont> which evaluates to `r``i<cont>``ki<cont>, hence to `r`<cont>i (this was where we wanted to get), and in this expression, the continuation is applied, so that the c in the initial expression immediately returns i, and the remaining calculations are lost (in particular, the r is lost and no newline gets printed).
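One way to make these two traces concrete is an evaluator written in continuation-passing style, where "the rest of the computation" is an ordinary Python function and <cont> is simply a captured copy of it. The sketch below is an illustration under my own assumptions, not the distributed interpreter: it leaves out d to stay short, and the names parse, evaluate, apply_fn and run and the tuple encoding of values are invented here.

```python
import io


def parse(src, pos=0):
    """Parse one expression (no whitespace or comments) starting at pos."""
    ch = src[pos]
    if ch == "`":
        f, pos = parse(src, pos + 1)
        g, pos = parse(src, pos)
        return ("app", f, g), pos
    if ch == ".":
        return ("print", src[pos + 1]), pos + 2
    if ch == "r":
        return ("print", "\n"), pos + 1
    if ch in "ksivc":
        return (ch,), pos + 1
    raise ValueError(f"unexpected character {ch!r}")


def evaluate(tree, out, cont):
    """Evaluate tree, then pass its value to cont (a Python callable)."""
    if tree[0] != "app":
        return cont(tree)
    return evaluate(tree[1], out,
                    lambda f: evaluate(tree[2], out,
                                       lambda g: apply_fn(f, g, out, cont)))


def apply_fn(f, g, out, cont):
    """Apply the value f to the value g, then continue with cont."""
    tag = f[0]
    if tag == "k":
        return cont(("k1", g))
    if tag == "k1":
        return cont(f[1])
    if tag == "s":
        return cont(("s1", g))
    if tag == "s1":
        return cont(("s2", f[1], g))
    if tag == "s2":                          # evaluate ``XZ`YZ
        return evaluate(("app", ("app", f[1], g), ("app", f[2], g)), out, cont)
    if tag == "i":
        return cont(g)
    if tag == "v":
        return cont(f)
    if tag == "print":
        out.write(f[1])
        return cont(g)
    if tag == "c":                           # pass the current continuation to g
        return apply_fn(g, ("cont", cont), out, cont)
    if tag == "cont":                        # drop the pending work, resume the saved one
        return f[1](g)
    raise ValueError(f"cannot apply {f!r}")


def run(src):
    """Return (printed output, final value) for a program."""
    out = io.StringIO()
    tree, _ = parse(src)
    result = evaluate(tree, out, lambda value: value)
    return out.getvalue(), result


print(run("``cir"))
print(run("`c``s`kr``si`ki"))
```

In the "cont" case the continuation passed in as cont is simply ignored: that is the whole of the ``remaining calculations are lost'' behaviour, and it is why the r in the second example never prints.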

Expressions including c function calls tend to be hopelessly difficult to track down. This was, of course, the reason for including it in the language in the first place.

A note about the Unlambda Quine Contest

Recall that a quine is a program that prints its own listing. By the fixed point theorems in logic, such a program exists in any Turing-complete language in which printing an arbitrary string is possible (by a computable program of the string, a technical criterion which is satisfied in all programming languages in existence). Although the fixed point theorem is constructive (and thus actually algorithmically produces a quine), actually writing down the program can be difficult. See my personal collection of quines for examples of quines in (ordinary, non-obfuscated) programming languages.

From 1999/10/27 to 1999/11/03, I opened the Unlambda Quine Contest: I had written a quine in Unlambda myself, and I invited anyone else to do so. During that week, the quines were kept secret (only their md5 fingerprint was revealed so that it could be later checked), in order that their independence be guaranteed. I offered a copy of the Wizard Book to the first person to produce a quine (retrospectively I find that I should have offered it to the best quine, or to the shortest one, or some such thing, but no matter).

The contest is now over. Olivier Wittenberg (olivier.wittenberg@ens.fr) won the prize with his one megabyte quine that he sent me within a few hours of the contest's opening. Subsequent quines were written by Panu Kalliokoski (Panu.Kalliokoski@nokia.com), Jean Marot (jean.marot@ens.fr), Denis Auroux (denis.auroux@ens.fr) and Jacob Mandelson (jlm@ghs.com).

All these quines are truly gems (and, once again, I congratulate all the authors). The shortest one is only 491 bytes long, and was written by Jean Marot. The most efficient one (by a definition of efficiency which I will explain later on if I find the time) was written by Denis Auroux. Jacob Mandelson's quine is also very remarkable in that it minimizes the number of dots (dots are printing functions in Unlambda) to only 60.

The full list of quines can be found in the quine/ directory on the FTP repository of the Comprehensive Unlambda Archive Network.

Unlambda reference

First we must specify that whitespace is ignored in an Unlambda program (wherever it may be, except, naturally, between the period and the character in the .x function name). Comments are also ignored, a comment being anything starting from the # character to the end of the line.

If F and G are two Unlambda expressions, then the expression `FG is also an expression (called the application of F to G). It is evaluated as follows: first, F is evaluated (and its value is a function, since there is no other kind of values in Unlambda); if the value of F is not d, then, G is evaluated, and finally the value of F is applied to the value of G.

To complete the description of Unlambda, we need therefore only specify what happens when F is applied to G, and to do that we consider each possible value of F.

k (``constant generator'')
The k function takes an argument X and returns the function `kX (see below).
`kX (``constant function'')
The `kX function (which is not primitive but obtained by applying the primitive function k to some function X) takes an argument, ignores it and returns X.
s (``substitution'')
The s function takes an argument X and returns the function `sX (see below).
`sX (``substitution first partial'')
The `sX function (which is not primitive but obtained by applying the primitive function s to some function X) takes an argument Y and returns the function ``sXY (see below).
``sXY (``substituted application'')
The ``sXY function (which is not primitive but obtained by applying the primitive function s to two functions X and Y successively) takes an argument Z and returns the evaluation of ``XZ`YZ.
i (``identity'')
The i function takes an argument and returns that argument.
v (``void'')
The v function takes an argument X and returns v itself.
c (``call with current continuation'')
The c function takes an argument X and returns either the evaluation of `X<cont> where <cont> is c's current continuation (see below), or else the value passed to <cont> if the latter was applied (with the effect of making c return immediately).
<cont> (a continuation)
Continuations take an argument and non-locally jump to the point in history when the evaluator was waiting for the corresponding c to return, making that c return that argument.
d (``delay'')
The d function is never truly applied (it is a special form). It only occurs in the form `dF where F is an Unlambda expression (see below).
`dF (``promise'')
The `dF function takes an argument Y and evaluates F, giving a function X, and returns the evaluation of `XY.
.x (``print'') and r (``carriage return'')
The .x function is written using two characters. The first character is a period and the second is any character. Nevertheless, .x is a single function in Unlambda, and x in this expression is merely a character (read during parsing), not a parameter to the function. The r function is exactly equivalent to .(newline). The .x function behaves like the i (identity) function, with the side effect that it prints the character x (to the standard output) when it is applied. The r function also behaves like the identity and prints a newline character.
e (``exit'') only in Unlambda version 2 and greater
The e function takes an argument X. It exits immediately, pretending (if the interpreter cares) that the result of the evaluation of the program is X.
@ (``read'') only in Unlambda version 2 and greater
The @ function takes an argument X. It reads one character from the standard input, making it the ``current character'' and returns the evaluation of `Xi or of `Xv according as one character has been read successfully or not (for example on EOF).
?x (``compare character read'') only in Unlambda version 2 and greater
The ?x function (where x is a character, as in the .x function) takes an argument X. It returns the evaluation of `Xi or of `Xv according as the current character (the one read by the last application of @) is x or not (if @ has not been applied or if it has encountered an EOF, there is no current character, and x is deemed not to be equal to the current character).
| (``reprint character read'') only in Unlambda version 2 and greater
The | function takes an argument X. It returns the evaluation of `X.x, where x is the current character (the one read by the last application of @) or of `Xv if there is no current character (i.e. if @ has not yet been applied or if it has encountered an EOF).

Unlambda distribution

[1999/11/04] The Unlambda 2.0.0 distribution is not available yet (expect it around mid-November). If you're really impatient you can try the 1.92.1 version of the distribution, but it is provided with no explanations or comments of any kind. The following concerns the 1.0.0 distribution:

You can download the latest Unlambda distribution tarball either via FTP from ftp://quatramaran.ens.fr/pub/madore/unlambda/unlambda.tar.gz or via HTTP from http://www.eleves.ens.fr:8080/home/madore/programs/unlambda.tar.gz. You can download older versions as well (look into the FTP directory or append the version number, with a hyphen, before the .tar.gz suffix).

The Unlambda interpreter, as well as all the accompanying files, is distributed under the terms of the GNU General Public License, either version 2 of this license, or, at your option, any later version. Since Unlambda is Free Software, it comes with absolutely no warranty: see the GNU General Public License for more details.

(Note that this concerns the interpreter. There is no copyright on the language itself: you do not need to ask for my permission to write an Unlambda interpreter, and you are permitted (but not encouraged of course) to write a closed-source one.)

This document is included in the Unlambda distribution. You can also find it on the World Wide Web at http://www.eleves.ens.fr:8080/home/madore/programs/unlambda/unlambda.html.

The Unlambda interpreter is written in Scheme. It is included in the Unlambda distribution tarball, and it can also be downloaded directly from the World Wide Web at http://www.eleves.ens.fr:8080/home/madore/programs/unlambda/unlambda.scm. There is no makefile because there are so many different implementations of Scheme and no command line standard. In fact, it may even be necessary (or advisable) to slightly modify the interpreter before running it or compiling it; however, this should be completely straightforward (as the entire interpreter is, in fact).

With guile, the interpreter can be run with

guile -l unlambda.scm -c '(ev (parse))'

The Unlambda distribution also includes a compiled version of the interpreter for an Intel Pentium Pro platform running GNU/Linux with version 2.1 of the GNU libc. It was compiled using Bigloo, and is statically linked with the Bigloo and GC libraries, so that having Bigloo (or indeed any kind of Scheme) is not necessary to run the Unlambda interpreter.

The Unlambda distribution also includes a tool program, also written in Scheme, that can be used to perform abstraction elimination as described above, and with the syntax used above. This program is also available on the World Wide Web, at http://www.eleves.ens.fr:8080/home/madore/programs/unlambda/unlambdaify.scm. To run that program using guile, try

guile -l unlambdaify.scm -c '(begin (unparse (remove-lambdas (parse))) (newline))'

Finally, the Unlambda distribution includes a few example Unlambda programs. We encourage users of Unlambda to donate their programs to the distribution.

Please send comments and suggestions about Unlambda and its interpreter to david.madore@ens.fr.

Happy hacking!

Comprehensive Unlambda Archive Network

I have opened the Comprehensive Unlambda Archive Network: its goal is to gather all the Unlambda programs that are written (provided their authors agree, of course). Since there are very few programs in Unlambda altogether, it is convenient to centralize everything in one place: it will not take too much disk space, and a copy of the archive will probably be included in the next Unlambda distribution.

You can find the archive in the directory /pub/madore/unlambda/CUAN/ on the ``Quatramaran'' FTP site. See the MANIFEST file for a list of the programs in the CUAN. Please drop me a note if you have a program you want to add to the archive.


David Madore