Unlambda: Your Functional Programming Language Nightmares Come True
(If you don't know what Unlambda is, skip this section and move directly to the introduction below.)
[1999/11/04] Unlambda II is coming out soon! Distribution 2.0.0 will have the following changes over version 1.0.0:
New functions: e, @, ?x and |. The e function is ``exit'' and terminates the execution immediately. The other three functions are input functions (@ reads a character, ?x compares the current character with x, and | returns the printing function for the current character). The exact behavior is (or will soon be) described in the reference section on this page.
call/cc function in various programming languages (including C, which is imperative and has neither continuations nor first-class citizenship of functions; the C version uses the Hans Boehm conservative C/C++ garbage collector).
Distribution 2.0.0 is not available yet. See below for details.
Unlambda is a programming language that is specially designed to allow for obfuscation. While other attempts towards the same goal (such as Intercal or Brainf***) are imperative, in contrast Unlambda is a purely functional language.
Unlambda is very minimalistic. However, contrary to most such languages, it does not attempt to mimic the Turing Machine paradigm: Unlambda does not use a tape, array or stack. Nor is it binary-oriented; as a matter of fact, it does not manipulate integers in any way. Other remarkable (un)features of Unlambda are the fact that it does not have any variables, data structures or code constructs (such as loops, conditionals and such like).
Rather, Unlambda uses a functional approach to programming: the only form of objects it manipulates are functions. Each function takes a function as argument and returns a function. Apart from a binary ``apply'' operation, Unlambda provides several builtin functions (the most important ones being the K and S combinators). User-defined functions can be created, but not saved or named, because Unlambda does not have any variables.
Despite all these apparently unsurmountable limitations, Unlambda is fully Turing-equivalent.
Mathematically, the core of the language can be described as an implementation of the lambda-calculus without the lambda operation, relying entirely on the K and S combinators. Hence the name ``Unlambda''. It uses head (eager, by value) evaluation.
To give an example of Unlambda's unique elegant style, here is the first program I wrote in it:
# This unlambda program prints the integers consecutively.  Each
# integer n is printed as a line of n asterisks.
````s``s`ks``s`k`si``s`kk``s`k
 `s``s`ksk  # Increment
 ``s`k
 ``s``s``si`k.*`kri  # Print n (and return n)
 i`ki
 `ki  # The number zero (replace by i to start from one)
 ``s``s`ks``s`k`si``s`kk``s`k  # Ditto (other half-loop)
 `s``s`ksk
 ``s`k
 ``s``s``si`k.*`kri
 i`ki
Admittedly, this program is not the shortest one that achieves this goal. In fact, I have since discovered that the same effect can be obtained with the following much shorter program:
``r`ci`.*`ci
However, the latter program is probably far more difficult to understand than the former. (The former took me about two hours to write. The latter I stumbled upon by pure luck and it took me two hours to read.)
Although the very idea of a tutorial for such an obfuscated language as Unlambda is patently absurd, I shall try to give a brief introduction to the concepts before delving into the details of the reference section (which is also very short considering how small Unlambda is as a whole).
As has been mentioned in the introduction, the only objects that the Unlambda programming language manipulates are functions. Every function takes exactly one argument (that is also a function) and returns one value (that is also a function).
The basic building blocks for Unlambda programs are the primitive functions and the application operation. There are seven primitive functions: k, s, i, v, d, c and .x (where x is an arbitrary character, so that actually makes 6+256 primitive functions, but we shall consider .x as a single function; the r function is merely a convenience synonym for .x where x is the newline character).
Function application is designated with the backquote (ASCII number 96=0x60) character. The notation is prefix; in other words, `FG means F applied to G. (Note that this is not the same thing as the composite of F and G, namely the function taking X to `F`GX, that is, F applied to `GX; the latter can be written, as we will see later, as ``s`kFG.)
The fact that every Unlambda function is unary (takes exactly one argument) means that the notation is unambiguous, and we do not need parentheses (or, if you prefer, the backquote plays the role of the open parenthesis of Lisp, but the closed parenthesis is unnecessary). For example, ``FGH means (F applied to G) applied to H whereas `F`GH means F applied to (G applied to H). In fact, to check whether an expression is a valid Unlambda expression, there is a simple criterion: start at the left with a counter equal to the number 1, and move from left to right: for every backquote encountered, increment the counter, and for every primitive function encountered, decrement it; the counter must always remain positive except at the very end when it must reach zero.
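As an illustration, the counter criterion can be sketched in a few lines of Python (Python is not part of the Unlambda distribution, and the function name is_valid_unlambda is made up for this example). The sketch assumes the Unlambda 1 primitives only, treats .x as one primitive written with two characters, and skips whitespace and # comments as described in the reference section:

```python
def is_valid_unlambda(source: str) -> bool:
    """Check well-formedness with the counter criterion (Unlambda 1 primitives only)."""
    counter = 1
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if ch == "#":                      # comment: skip to the end of the line
            while i < n and source[i] != "\n":
                i += 1
            continue
        if ch.isspace():                   # whitespace is ignored
            i += 1
            continue
        if counter == 0:                   # expression already complete, yet more follows
            return False
        if ch == "`":                      # backquote: one more expression is expected
            counter += 1
            i += 1
        elif ch == ".":                    # .x counts as a single primitive
            if i + 1 >= n:
                return False
            counter -= 1
            i += 2
        elif ch in "ksivdcr":
            counter -= 1
            i += 1
        else:
            return False                   # not a token this sketch knows about
    return counter == 0
```

With this sketch, ``r`ci`.*`ci is accepted, whereas `r alone leaves the counter at 1 and is rejected.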
Since all Unlambda functions take exactly one argument, when we wish to handle a function of several arguments, it is necessary to ``currify'' that function, that is, to read the arguments one after another. For example, if F is a function that should take three arguments, it will be applied thus: ```FG1G2G3. The idea is that F does nothing but read the first argument and return (without side effects) a function that reads the second argument and returns a function that reads the third argument and finally does whatever calculation F was supposed to perform. Thus, both ``FG1G2 and `FG1 are legal, but they don't do much except wait for more arguments to come.
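For readers more at ease with a conventional language, the same idea can be sketched in Python. The names curried_f, partial and result are invented for this example, and the ``calculation'' performed at the end (building a triple) is only a placeholder:

```python
def curried_f(g1):
    # Reads the first argument and returns a function waiting for the second.
    def wait_for_second(g2):
        # Reads the second argument and returns a function waiting for the third.
        def wait_for_third(g3):
            # Only now, with all three arguments read, is the real work done.
            return (g1, g2, g3)
        return wait_for_third
    return wait_for_second

partial = curried_f("G1")("G2")   # legal, but it only waits for a third argument
result = partial("G3")            # plays the role of ```FG1G2G3
```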
The previous discussion is not so theoretical. Of course, when users define their own functions, they may use whatever mechanism they see fit for reading the functions' arguments (but such a currification is certainly the best because pairs and lists are so horribly difficult to define in Unlambda). But the builtin k and s functions take respectively 2 and 3 arguments, and the several arguments are passed in the manner which we have just described. (As a side note, I remark that it is, if not impossible, at least inconvenient, to construct functions that take zero arguments: preventing evaluation until all arguments have been read is good, but when there are no arguments to be read, the situation is not pleasant. In the pure lambda calculus there is no problem because evaluation order is unspecified and irrelevant, but in Unlambda we have a bigger problem. Here the d function might help.)
A note about evaluation order: when Unlambda is evaluating an expression `FG, it evaluates F first, and then G (the exception being when F evaluates to d), and then applies F to G. Evaluation is idempotent: that is, evaluating an already evaluated expression in Unlambda does not have any effect (there is no level-of-quotation concept as in m4 or SIMPLE).
We now turn to the description of the Unlambda builtins.
The k and s builtins are the core of the language. Just these two suffice to make Unlambda Turing complete (although .x is also necessary if you want to print anything). The k builtin is easy enough to describe: it takes two arguments (in currified fashion, as explained above) and returns the first. Thus, ``kXY evaluates to X (evaluated). Note that Y is still evaluated in the process. The s builtin is slightly more delicate. It takes three arguments, X, Y and Z, and evaluates as ``XZ`YZ.
We also mention immediately the i function: it is the identity function; in other words, it takes an argument and returns it intact. The i function is not strictly necessary but it is practical. It could be replaced by ``skk. Indeed, ```skkX evaluates as ``kX`kX, which in turn evaluates as X.
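The behaviour of k, s and i on their arguments can be mimicked with ordinary Python closures. This is only a sketch: it ignores Unlambda's evaluation order and side effects, and the names K, S, I and skk are chosen for this example.

```python
def K(x):
    # ``kXY evaluates to X: the second argument is ignored.
    return lambda y: x

def S(x):
    # ```sXYZ evaluates as ``XZ`YZ.
    return lambda y: lambda z: x(z)(y(z))

def I(x):
    # The identity function.
    return x

# ``skk behaves like i: applying it to X gives ``kX`kX, which is X.
skk = S(K)(K)
```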
The k builtin is a ``constant function constructor''. That is, for all X, `kX is the constant function with value X. The s builtin corresponds to ``substituted application'': that is, ``sXY is a function that, instead of applying X to Y directly, will apply each of them to Z (the argument) first, and then one to the other. Finally, i is the identity function.
From the remarks of the previous paragraph follows a method for ``lambda removal'' (or ``elimination of abstraction''). Indeed, suppose we are given an expression F that, apart from applications and the primitive functions, also (possibly) contains occurrences of one variable, call it $x. We want to construct the function which, when applied to a given expression X, will return the value of F with X substituted for $x. Let us write ^xF for this expression (where ^ is supposed to be a lambda character). Since lambda (abstraction, that is) is not part of the Unlambda language (hence its name), we need a way to remove it (to perform elimination of abstraction). The algorithmic method is as follows. The problem reduces to three cases: either F is some builtin (or, more generally, anything primitive other than $x), or it is $x, or it is `GH, where G and H are simpler expressions. In the first case, ^xF is merely `kF (since F does not depend on $x). In the second case, ^xF is ^x$x, that is, it is i. In the third case, we have precisely a substituted application, and ^x`GH is ``s^xG^xH.
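The three cases translate almost word for word into a short recursive procedure. The following Python sketch is not the tool shipped with the distribution; the term representation (strings for primitives and variables, pairs (G, H) for the application `GH) and the names eliminate and unparse are assumptions made for this example.

```python
def eliminate(var, term):
    """Return a lambda-free term equivalent to ^var term."""
    if term == var:
        return "i"                                  # ^x$x is i
    if isinstance(term, tuple):
        g, h = term                                 # ^x`GH is ``s^xG^xH
        return (("s", eliminate(var, g)), eliminate(var, h))
    return ("k", term)                              # anything primitive other than var: `kF

def unparse(term):
    """Write a term back in Unlambda's prefix notation."""
    if isinstance(term, tuple):
        return "`" + unparse(term[0]) + unparse(term[1])
    return term

# Innermost lambda first: ^h^x`$h$h
inner = eliminate("$x", ("$h", "$h"))
outer = eliminate("$h", inner)
```

Here unparse(inner) is ``s`k$h`k$h and unparse(outer) is ``s``s`ks``s`kki``s`kki.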
These rules allow progressive reduction of a lambda expression, and, starting from the innermost lambda, of any expression containing lambdas (a term of the untyped lambda calculus). It is sometimes possible to speed things up a little. Notably, ^xF can be replaced by `kF as soon as F does not contain $x. However, such departures from the Canon are not without their dangers: in this particular case, F will be evaluated, which amounts to doing a partial evaluation of subexpressions not containing the variable within the body of the lambda expression; and this eagerness of evaluation can introduce early loops or such nasty surprises (as a rule, replacing with `d`kF will work, but it may be considered cheating).
The v function is not very important and rarely useful. It takes an argument, ignores it and returns v. It can be used to swallow any number of arguments. The v function can be implemented using s, k and i (and hence, using s and k only). Indeed, it can be written using the lambda expression `^h^x`$h$h^h^x`$h$h (which evaluates to ^x of the same thing), and abstraction elimination shows that this is ` ``s``s`ks``s`kki``s`kki ``s``s`ks``s`kki``s`kki (here is an example where early evaluation in an abstraction elimination can be disastrous, for example if ` ``s`kk``sii ``s`kk``sii were used instead).
The .x function is the only way to perform output in Unlambda (note that there is for the moment no way to perform input). This function takes an argument and, like the identity function, returns it unchanged. Unlike the identity function, however, it has a side effect, namely to print the character x on the standard output (this writing takes place when .x is applied). Note that while this function is written with two characters, it is still one function; on no account should .x be thought of as something applied to x (and, just to insist, there is no such function as . (dot), only .x (dot x)). The r function is just one instance of the .x function, namely when x is the newline character. Thus, the `ri program has the effect of printing a newline (so would `rv or `rr or `r(anything), but r alone doesn't do it, because here the r function isn't applied: this should make clearer my note about the impossibility of currifying functions of zero arguments).
The d function is an exception to the normal rules of evaluation (hence it should be called a special form rather than a function). When Unlambda is evaluating `FG and F evaluates to d (for example when F is d) then G is not evaluated. The result `dG is a promise to evaluate G: G is kept unevaluated until the promise is itself applied to an expression H. When that happens, G is finally evaluated (after H is), and it is applied to H. This is called forcing the promise.
For example, `d`ri does nothing (and remains unevaluated), and ``d`rii prints a blank line. Another point to note is that ``dd`ri prints a blank line: indeed, `dd is first evaluated, and since it is not the d function, it does not prevent the `ri expression from being evaluated (to i, with the side effect of printing a newline), so that when finally d is applied, it is already too late to prevent the newline from being printed. To summarize, the d function can delay the d function itself. On the other hand, ``id`ri does not print a blank line (because `id does evaluate to d). Similarly, ```s`kdri is first transformed to ```kdi`ri, in which ``kdi is evaluated to d, which then prevents `ri from being evaluated so no newline gets printed.
Writing `d`kF is another form of promise (perhaps more customary but at the same time less transparent): when it is applied to an arbitrary argument Y, then Y is ignored and F is evaluated and returned.
The c (``call with current continuation'') function is probably the most difficult to explain (if you are familiar with the corresponding function in Scheme, it will help a lot). c called with an argument F will apply F to the current continuation. The current continuation is a special function which, when it is applied to X, has the effect of making c return immediately the value X. In other words, c can return in two ways: if F applied to the continuation evaluates normally, then its return value is that of c; but if F calls the continuation at some point, c will immediately return the value passed to the continuation. Note that the continuation can even escape from the c call, in which case calling it will have the effect of going ``back in time'' to that c call and making it return whatever value was passed to the continuation. For a more detailed discussion, see any book on Scheme.
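The two ways in which c can return may be imitated in Python with an exception. This is a deliberately limited sketch, and not how an Unlambda interpreter implements c: it only models a continuation that is called while the c call is still in progress, so the ``back in time'' case is out of its reach. The names call_cc, _Escape, r, identity and printed are invented for the example.

```python
class _Escape(Exception):
    """Carries the value passed to a continuation back to its call_cc."""
    def __init__(self, token, value):
        super().__init__()
        self.token = token
        self.value = value

def call_cc(f):
    token = object()                    # identifies this particular call

    def cont(value):
        raise _Escape(token, value)     # abandon whatever was being computed

    try:
        return f(cont)                  # first way: f returns normally
    except _Escape as esc:
        if esc.token is token:
            return esc.value            # second way: the continuation was called
        raise                           # belongs to an enclosing call_cc

printed = []

def r(x):
    printed.append("\n")                # stands in for printing a newline
    return x

def identity(x):
    return x

# Mirrors the situation `r`<cont>i: the continuation is applied before r is,
# so the remaining calculation (including the r) is lost.
result = call_cc(lambda cont: r(cont(identity)))
```

After this runs, result is the identity function and printed is still empty.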
Examples of c include ``cir: here, `ci evaluates to the continuation of the c, which we shall write <cont>, and we have `<cont>r: here, the continuation is applied, so it makes the c call return r, and we are left with `rr which prints a newline. Another interesting example is `c``s`kr``si`ki: in this expression, the argument ``s`kr``si`ki (which does not evaluate any further) is applied to the continuation of the c, giving ```s`kr``si`ki<cont> (where we have written <cont> for the continuation in question); this gives ` ``kr<cont> ```si`ki<cont> which evaluates to `r``i<cont>``ki<cont>, hence to `r`<cont>i (this was where we wanted to get), and in this expression, the continuation is applied, so that the c in the initial expression immediately returns i, and the remaining calculations are lost (in particular, the r is lost and no newline gets printed).
Expressions including c function calls tend to be hopelessly difficult to track down. This was, of course, the reason for including it in the language in the first place.
Recall that a quine is a program that prints its own listing. By the fixed point theorems in logic, such a program exists in any Turing-complete language in which printing an arbitrary string is possible (by a computable program of the string — a technical criterion which is satisfied in all programming languages in existence). Although the fixed point theorem is constructive (and thus actually algorithmically produces a quine), actually writing down the program can be difficult. See my personal collection of quines for examples of quines in (ordinary, non obfuscated) programming languages.
From 1999/10/27 to 1999/11/03, I opened the Unlambda Quine Contest: I had written a quine in Unlambda myself, and I invited anyone else to do so. During that week, the quines were kept secret (only their md5 fingerprint was revealed so that it could be later checked), in order that their independence be guaranteed. I offered a copy of the Wizard Book to the first person to produce a quine (retrospectively I find that I should have offered it to the best quine, or to the shortest one, or some such thing, but no matter).
The contest is now over. Olivier Wittenberg (olivier.wittenberg@ens.fr) won the prize with his one megabyte quine that he sent me within a few hours of the contest's opening. Subsequent quines were written by Panu Kalliokoski (Panu.Kalliokoski@nokia.com), Jean Marot (jean.marot@ens.fr), Denis Auroux (denis.auroux@ens.fr) and Jacob Mandelson (jlm@ghs.com).
All these quines are truly gems (and, once again, I congratulate all the authors). The shortest one is only 491 bytes long, and was written by Jean Marot. The most efficient one (by a definition of efficiency which I will explain later on if I find the time) was written by Denis Auroux. Jacob Mandelson's quine is also very remarkable in that it minimizes the number of dots (dots are printing functions in Unlambda) to only 60.
The full list of quines can be found in the quine/ directory on the FTP repository of the Comprehensive Unlambda Archive Network.
First we must specify that whitespace is ignored in an Unlambda program (wherever it may be, except, naturally, between the period and the character in the .x function name). Comments are also ignored, a comment being anything starting from the # character to the end of the line.
If F and G are two Unlambda expressions, then the expression `FG is also an expression (called the application of F to G). It is evaluated as follows: first, F is evaluated (and its value is a function, since there is no other kind of value in Unlambda); if the value of F is not d, then G is evaluated, and finally the value of F is applied to the value of G.
To complete the description of Unlambda, we need therefore only specify what happens when F is applied to G, and to do that we consider each possible value of F.
k (``constant generator'')
The k function takes an argument X and returns the function `kX (see below).
`kX (``constant function'')
The `kX function (which is not primitive but obtained by applying the primitive function k to some function X) takes an argument, ignores it and returns X.
s (``substitution'')
The s function takes an argument X and returns the function `sX (see below).
`sX (``substitution first partial'')
The `sX function (which is not primitive but obtained by applying the primitive function s to some function X) takes an argument Y and returns the function ``sXY (see below).
``sXY (``substituted application'')
The ``sXY function (which is not primitive but obtained by applying the primitive function s to two functions X and Y successively) takes an argument Z and returns the evaluation of ``XZ`YZ.
i (``identity'')
The i function takes an argument and returns that argument.
v (``void'')
The v function takes an argument X and returns v itself.
c (``call with current continuation'')
The c function takes an argument X and returns either the evaluation of `X<cont> where <cont> is c's current continuation (see below), or else the value passed to <cont> if the latter was applied (with the effect of making c return immediately).
<cont> (a continuation)
The <cont> function takes an argument and causes the corresponding c to return, making that c return that argument.
d (``delay'')
The d function is never truly applied (it is a special form). It only occurs in the form `dF where F is an Unlambda expression (see below).
`dF (``promise'')
The `dF function takes an argument Y and evaluates F, giving a function X, and returns the evaluation of `XY.
.x (``print'') and r (``carriage return'')
The .x function is written using two characters. The first character is a period and the second is any character. Nevertheless, .x is a single function in Unlambda, and x in this expression is merely a character (read during parsing), not a parameter to the function. The r function is exactly equivalent to .(newline).
The .x function behaves like the i (identity) function, with the side effect that it prints the character x (to the standard output) when it is applied. The r function also behaves like the identity and prints a newline character.
e (``exit'') (only in Unlambda version 2 and greater)
The e function takes an argument X. It exits immediately, pretending (if the interpreter cares) that the result of the evaluation of the program is X.
@ (``read'') (only in Unlambda version 2 and greater)
The @ function takes an argument X. It reads one character from the standard input, making it the ``current character'', and returns the evaluation of `Xi or of `Xv according as one character has been read successfully or not (for example on EOF).
?x (``compare character read'') (only in Unlambda version 2 and greater)
The ?x function (where x is a character, as in the .x function) takes an argument X. It returns the evaluation of `Xi or of `Xv according as the current character (the one read by the last application of @) is x or not (if @ has not been applied or if it has encountered an EOF, there is no current character, and x is deemed not to be equal to the current character).
| (``reprint character read'') (only in Unlambda version 2 and greater)
The | function takes an argument X. It returns the evaluation of `X.x, where x is the current character (the one read by the last application of @), or of `Xv if there is no current character (i.e. if @ has not yet been applied or if it has encountered an EOF).
[1999/11/04] The Unlambda 2.0.0 distribution is not available yet (expect it around mid-November). If you're really impatient you can try the 1.92.1 version of the distribution, but it is provided with no explanations or comments of any kind. The following concerns the 1.0.0 distribution:
You can download the latest Unlambda distribution tarball either via FTP from ftp://quatramaran.ens.fr/pub/madore/unlambda/unlambda.tar.gz or via HTTP from http://www.eleves.ens.fr:8080/home/madore/programs/unlambda.tar.gz. You can download older versions as well (look into the FTP directory or append the version number, with a hyphen, before the .tar.gz suffix).
The Unlambda interpreter, as well as all the accompanying files, is distributed under the terms of the GNU General Public License, either version 2 of this license, or, at your option, any later version. Since Unlambda is Free Software, it comes with absolutely no warranty: see the GNU General Public License for more details.
(Note that this concerns the interpreter. There is no copyright on the language itself: you do not need to ask for my permission to write an Unlambda interpreter, and you are permitted (but not encouraged of course) to write a closed-source one.)
This document is included in the Unlambda distribution. You can also find it on the World Wide Web at http://www.eleves.ens.fr:8080/home/madore/programs/unlambda/unlambda.html.
The Unlambda interpreter is written in Scheme. It is included in the Unlambda distribution tarball, and it can also be downloaded directly from the World Wide Web at http://www.eleves.ens.fr:8080/home/madore/programs/unlambda/unlambda.scm. There is no makefile because there are so many different implementations of Scheme and no command line standard. In fact, it may even be necessary (or advisable) to slightly modify the interpreter before running it or compiling it; however, this should be completely straightforward (as the entire interpreter is, in fact).
With guile, the interpreter can be run with
guile -l unlambda.scm -c '(ev (parse))'
The Unlambda distribution also includes a compiled version of the interpreter for an Intel Pentium Pro platform running GNU/Linux with version 2.1 of the GNU libc. It was compiled using Bigloo, and is statically linked with the Bigloo and GC libraries, so that having Bigloo (or indeed any kind of Scheme) is not necessary to run the Unlambda interpreter.
The Unlambda distribution also includes a tool program, also written in Scheme, that can be used to perform abstraction elimination as described above, and with the syntax used above. This program is also available on the World Wide Web, at http://www.eleves.ens.fr:8080/home/madore/programs/unlambda/unlambdaify.scm.
To run that program using guile, try
guile -l unlambdaify.scm -c '(begin (unparse (remove-lambdas (parse))) (newline))'
Finally, the Unlambda distribution includes a few example Unlambda programs. We encourage users of Unlambda to donate their programs to the distribution.
Please send comments and suggestions about Unlambda and its interpreter to david.madore@ens.fr.
Happy hacking!
I have opened the Comprehensive Unlambda Archive Network: its goal is to gather all the Unlambda programs that are written (provided their authors agree, of course). Since there are very few programs in Unlambda altogether, it is convenient to centralize everything in one place, it will not take too much disk space, and a copy of the archive will probably be included in the next Unlambda distribution.
You can find the archive in the directory /pub/madore/unlambda/CUAN/ on the ``Quatramaran'' FTP site. See the MANIFEST file for a list of the programs in the CUAN. Please drop me a note if you have a program you want to add to the archive.