The version of simple described in this document is version 1.1.0. This document is quite alpha right now, but I have good hopes that it will get beyond that stage some day.
I once read a fortune joke about ``the lesser-known programming languages'' which described the language SIMPLE as being composed of only two instructions, ``BEGIN'' and ``STOP'', neither of which did anything: in that way the same things can be achieved as with other programming languages but without any need for frustration and tedious debugging. As version 1.0.0 of simple had exactly two instructions, @id@ and @void@, I thought the name SIMPLE was quite appropriate.
More seriously, when I first started writing simple, I intended to write something very simple, as I had very modest means. It turns out that I produced a language far more complicated and powerful than I had first expected. In fact, the syntax is so unbelievably strange (albeit completely logical) that SIMPLE programs are the most complicated thing in the world to understand. So the name now remains as a piece of irony.
simple is of the ``macro processing'' kind, as are for example cpp and m4. In other words, its essential action consists of evaluating and expanding macros which are defined by the user (generally). In fact, SIMPLE is similar to m4 in its functioning (I was strongly inspired by the sources of m4 when I wrote simple); but it is very different in its syntax.
It then appeared that I needed a macro processor to convert whatever meant "fraction with numerator 2 and denominator 3" into whatever output I needed (like {2\over 3}). I first thought of using m4 (cpp is out of the question of course). Unfortunately, m4 is made to handle mainly programs and not text files, so I encountered all sorts of difficulties. First of all, I would have had to use the ``m4_ prefix on all builtins'' option because m4 interprets macros wherever they are found (there is no special macro invocation character) and that can be a pain. But most annoying was the problem with the backtick (`) character: apparently the only way in m4 to write a macro which will produce a backtick (without permanently changing the quote characters, because otherwise the same problem would occur for whatever happens to be the open quote character) is to write (with m4_ prefixes):
m4_changequote(`[',`]')m4_define(__lq,`)m4_changequote([`],['])m4_dnl
m4_define(_lq,`m4_changequote(`[',`]')__lq[]m4_changequote([`],['])')m4_dnl

which makes the macro _lq produce a single left quote (exercise for those who know m4 a bit: why did I need the __lq macro and why can't I just use that?). Another thing I do not like about m4 is that it does not gobble comments (one would wonder why they're called comments, then), so either one has to use dnl to produce comments or one has to change the comment character to that of TeX, which involves making an assumption as to what it is, precisely the sort of thing I was trying to avoid. Anyhow, m4 did not suit my needs, so I just had to put my hands in the dirt and write my own macro preprocessor, which is what I did. Et dixi ``fiat SIMPLE''. Et SIMPLE fit.
As to my ``universal TeX'' project, it is not even started yet. But my current idea is to have the files processed by SIMPLE, and even before that by a tiny program which will change all ISO8859-1 characters (which I use a lot because I occasionally write in French) to SIMPLE macros (because SIMPLE does not permit invocation of macros by a single character - and on the other hand it's a pain to have to write a SIMPLE macro invocation for every accented character). SIMPLE, of course, might change these macros back to the ISO8859-1 character in question, if ISO8859-1 input is recognized by whatever form of TeX (or other) is sought.
There was a time when I attempted to classify programming languages - it proved fruitless: each programming language seems to occupy its very own class. This applies to macro processing languages. They seem to be closer to functional languages (such as caml or Miranda) than to imperative languages (such as C or Pascal), but the issue is not altogether clear.
One thing that can help distinguish programming languages is the kind of calling mechanism which they use. The kind of calling mechanism which macro processors use is ``call-by-value'', which means that the arguments to a function (macro) are evaluated first, before the macro is itself expanded, and even if the macro does not need these arguments. So, essentially, if you write ignorearg(screwupall()), everything does get screwed up, contrary to what would happen if call-by-name were used (this obviously illustrates the infinite superiority of macro processing languages :-). Still, macro processing languages provide ways to inhibit evaluation: that is called ``quoting'', and we will have much more to say on the subject.
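To make the difference between the two calling mechanisms concrete, here is a small sketch in Python (not in SIMPLE, and not taken from simple's sources). The names log, ignorearg_by_value and ignorearg_by_name are made up for the illustration; screwupall and ignorearg are modelled as ordinary Python functions, and call-by-name is imitated by passing an unevaluated lambda, which plays roughly the role that quoting plays in a macro processor.

```python
# Toy model of the two calling mechanisms. Every name here is hypothetical
# and chosen for illustration only.
log = []


def screwupall():
    """A 'macro' with a visible side effect."""
    log.append("screwed up")
    return "whatever"


def ignorearg_by_value(arg):
    # By the time we get here the argument has already been evaluated.
    return ""


def ignorearg_by_name(thunk):
    # The argument is an unevaluated thunk, and we never call it.
    return ""


ignorearg_by_value(screwupall())          # argument evaluated first
effects_by_value = list(log)

log.clear()
ignorearg_by_name(lambda: screwupall())   # argument never evaluated
effects_by_name = list(log)

print(effects_by_value)
print(effects_by_name)
```

With call-by-value the side effect is recorded even though the argument is ignored; with the thunk it is not.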
In an ideal functional language, functions cannot have global effects, so that calling the same function twice with the same arguments should produce the same result. That restriction does not apply to macro processing languages: a macro may modify a variable (that is, redefine a macro), so that applying it twice may yield completely different results.
Macro processors resemble functional languages in that there is, really, no such thing as an ``instruction'', at least no difference between ``expressions'' and ``instructions''. A functional language (say, pure lambda-calculus) may be completely untyped, everything being of the ``function'' type. As far as macro processors go, everything is of the ``list'' type, where ``list'' means ``list of tokens'' or ``character string'' as the case may be.
The central idea behind a macro processor is that of ``re-evaluation'': when a macro has been evaluated (expanded), the expansion obtained is fed back to the input so that it will be evaluated again. Only non-macro tokens and quoted elements are not (re)evaluated. As a very simple example, suppose that the macro infiniteloop evaluates to infiniteloop; then that expansion will be re-evaluated, causing an infinite loop. Wonderful invention, the wheel.
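The re-evaluation loop can be sketched in a few lines of Python. This is only a toy model under strong assumptions: tokens are whitespace-separated words, macros take no arguments, there is no quoting, and the function expand and its max_steps guard are my own inventions for the illustration (a real macro processor has no such guard, which is precisely why infiniteloop loops forever).

```python
from collections import deque


def expand(text, macros, max_steps=1000):
    """Toy rescanning expander: a word found in `macros` is replaced by its
    expansion, which is pushed back onto the input and scanned again."""
    pending = deque(text.split())
    output = []
    steps = 0
    while pending:
        word = pending.popleft()
        if word in macros:
            steps += 1
            if steps > max_steps:
                # Hypothetical safety net so that the sketch terminates.
                raise RecursionError("expansion does not terminate")
            # Feed the expansion back to the *front* of the input.
            pending.extendleft(reversed(macros[word].split()))
        else:
            # Non-macro tokens are copied to the output, never re-evaluated.
            output.append(word)
    return " ".join(output)


macros = {"greeting": "hello name", "name": "world"}
result = expand("greeting !", macros)
print(result)
```

Calling expand("infiniteloop", {"infiniteloop": "infiniteloop"}) runs into the step limit and raises RecursionError, which is the toy's way of reinventing the wheel safely.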
As a slightly more sophisticated example of perpetual motion, let us suppose we have a macro double which takes a parameter and evaluates to that parameter applied to itself. Then we might apply the macro double to itself, which will result in double being applied to itself, and so on, perpetually re-evaluating the same thing. Now there is one important thing to note: we should not write double(double) (if the syntax is m4-like, say) to mean ``double applied to itself'', because if we write that, then the ``inner'' double gets evaluated first (as are any arguments), resulting in either nothing at all or in an error, as it was not given any arguments. Rather, we should quote the inner double to prevent its evaluation and pass the double object itself (rather than its evaluation) to the ``outer'' double. So in m4 we would write double(`double'). In fact, the complete program in m4 is:
define(`double',`$1(`$1')')double(`double')

Try it and watch your computer start spinning like mad (note that there are three pairs of quotes in the definition of double, the really interesting one being the inner one which sees to it that double(`double') does indeed evaluate to double(`double') and not simply to double(double)). The corresponding program in SIMPLE is:
@def@<@double@>|<@1@<@1@>">"@double@<@double@>"
@def@<@greet@>|<Hello, @1@!>"%
@greet@world"
->
Hello, world!

The part before the arrow (->) is the input which is presented to simple and the second part is the output produced by it.
We encourage readers to try all the examples.
DON'T PANIC
->
DON'T PANIC

In other words, simple just copies to the output whatever it is fed; that is true so long as the input does not contain any of the eight special characters, which are @, ", |, #, <, >, % and `.
Santa Claus `<santa.claus`@toys.np`>
->
Santa Claus <santa.claus@toys.np>

Note that it is not an error to escape an ordinary (i.e. not special) character: it just leaves the ordinary character in question unaltered.
Wonderful`!
->
Wonderful!
This is ordinary text %and this is a comment.
and this is the continuation of it. Note how the new line was swallowed by the comment.
10`% of 90 is 9.
`%This is not a comment %but this is.
so it should appear on the output.
->
This is ordinary text and this is the continuation of it. Note how the new line was swallowed by the comment.
10% of 90 is 9.
%This is not a comment so it should appear on the output.
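The two rules seen so far (a backtick makes the next character ordinary, and a % discards everything up to and including the end of the line) are easy to mimic. The following Python function is a sketch of just those two rules; the name strip_escapes_and_comments is mine, the other six special characters are simply copied through, and the treatment of a lone backtick at the very end of the input is an assumption rather than something the text above specifies.

```python
def strip_escapes_and_comments(text):
    """Toy model handling only the backtick escape and the % comment."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "`" and i + 1 < len(text):
            # Escaped character: copy it literally, special or not.
            out.append(text[i + 1])
            i += 2
        elif ch == "%":
            # Comment: skip to the end of the line, newline included.
            end = text.find("\n", i)
            i = len(text) if end == -1 else end + 1
        else:
            # Anything else is copied unchanged (assumption: a trailing
            # lone backtick is copied too).
            out.append(ch)
            i += 1
    return "".join(out)


print(strip_escapes_and_comments("10`% of 90 is 9.\n"), end="")
print(strip_escapes_and_comments("ordinary text %a comment\ncontinued"))
```

Because the comment takes its newline with it, the text before the % and the text on the following line end up joined on one output line.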
The `@id`@ builtin just evaluates to its first argument:
@id@First argument|Second argument|Third argument"
Of course, if there is only one argument, it evaluates to that:
@id@(of course)"
As for the `@void`@ builtin, it is even less useful: it evaluates to nothing:
@void@SIMPLE is really stupid!"
->
The @id@ builtin just evaluates to its first argument:
First argument
Of course, if there is only one argument, it evaluates to that:
(of course)
As for the @void@ builtin, it is even less useful: it evaluates to nothing:


Perhaps you don't see it, but there's an empty line at the end of the output in the previous example. That is because the linefeed character after the last double quote character in the input was copied to the output.
Note that it is not possible to call a function with no argument. That is because the double quote character is required to finish a function call. The next best thing one can do is call a macro with a single argument and let that argument be empty, as in @mymacro@".
Note that an argument to a macro may perfectly well itself contain a macro call. That constitutes a nested macro call, and it works just like you'd think:
@id@This @id@is"@void@ stupid, @id@really"" a @id@@id@nested"" macro call.|No"
->
This is a nested macro call.
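For those who prefer to see the mechanism spelled out, here is a toy evaluator in Python for nested calls of @id@ and @void@ only. It is a sketch under stated assumptions, not a description of how simple is implemented: the names evaluate, _scan and _apply are mine, quoting, escapes and comments are ignored, and the result of a call is spliced in without being rescanned (which makes no difference for these two builtins, since their arguments are already evaluated).

```python
import re

NAME = re.compile(r"@(\w+)@")


def _apply(name, args):
    """The only two builtins this toy knows about."""
    if name == "id":
        return args[0]
    if name == "void":
        return ""
    raise ValueError(f"unknown macro @{name}@")


def _scan(text, pos, top_level):
    """At top level, scan to the end of input and return (string, pos).
    Inside a call, collect '|'-separated arguments up to the closing
    double quote and return (list_of_arguments, pos)."""
    args, current = [], []
    while pos < len(text):
        ch = text[pos]
        match = NAME.match(text, pos) if ch == "@" else None
        if match:
            # Nested call: its arguments are evaluated first, then applied.
            inner_args, pos = _scan(text, match.end(), top_level=False)
            current.append(_apply(match.group(1), inner_args))
        elif not top_level and ch == "|":
            args.append("".join(current))
            current = []
            pos += 1
        elif not top_level and ch == '"':
            args.append("".join(current))
            return args, pos + 1
        else:
            current.append(ch)
            pos += 1
    if not top_level:
        raise ValueError("unterminated macro call")
    return "".join(current), pos


def evaluate(text):
    result, _ = _scan(text, 0, top_level=True)
    return result


nested = '@id@This @id@is"@void@ stupid, @id@really"" a @id@@id@nested"" macro call.|No"'
print(evaluate(nested))
```

Note that the closing double quote always contributes one (possibly empty) argument, which matches the remark that a macro cannot be called with no argument at all.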
@def@<@macro@>|<This is a simple macro.>"
@macro@"
->

This is a simple macro.

Note the empty line in the output (before the ``This is a simple macro.'' line). That is because the line feed character on the first line of input was not gobbled by anything. To avoid this, one generally uses a comment character. So one would have:
@def@<@macro@>|<This is a simple macro.>"%
@macro@"
->
This is a simple macro.
Now how about parameters? We have seen that builtin macros can take parameters. How about user-defined macros? Well, they can take parameters also. To use parameters in a user-defined macro, the definition may contain the special arguments @1@, @2@ (and so on) which get replaced by the first, second (and so on) argument when the macro is called. Here are a few examples:
@def@<@greet@>|<Hello, @1@!>"%
@greet@world"
@def@<@introduce@>|<Dear @1@, let me introduce you to @2@.>"%
@introduce@Peter|Paul"
@def@<@exch@>|<@2@,@1@>"%
@exch@First|Second"
@exch@First|Second|Third"
@exch@First"
->
Hello, world!
Dear Peter, let me introduce you to Paul.
Second,First
Second,First
,First

Note from the last line that when a user macro is called with fewer arguments than it was intended for then the missing arguments get replaced by empty strings. Conversely, the next-to-last line shows that when a user macro is called with more arguments than intended then the extra arguments are simply discarded. This, however, does not apply to builtins: a builtin makes precise assumptions about its number of arguments, and when these assumptions are not met, an error will occur. For example, @def@ expects exactly two arguments (and also that the first is exactly one token long), and if this is not the case, simple will complain.
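The rule for user macros (missing arguments become empty strings, extra ones are discarded) can be modelled in Python as a plain textual substitution. This is a sketch only: the function substitute is a hypothetical helper of mine, it handles nothing but the @1@, @2@, ... placeholders, and it treats any number that does not correspond to a supplied argument as empty, which is an assumption for numbers the text above does not discuss.

```python
import re


def substitute(body, args):
    """Replace @1@, @2@, ... in `body` by the corresponding argument.
    Missing arguments become empty strings; extra ones are never used."""
    def replace(match):
        index = int(match.group(1)) - 1
        return args[index] if 0 <= index < len(args) else ""
    return re.sub(r"@(\d+)@", replace, body)


exch_body = "@2@,@1@"
print(substitute(exch_body, ["First", "Second"]))
print(substitute(exch_body, ["First", "Second", "Third"]))
print(substitute(exch_body, ["First"]))
```

The three calls correspond to the three @exch@ invocations in the example above.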
@def@<@call@>|<@1@">"

Thus, @call@ with one parameter will get replaced by this parameter followed by the double-quote token, thus calling the macro with no parameters. Now suppose the macro @macro@ is defined as previously to expand to This is a simple macro.; then one may wish to call the macro @call@ with the macro @macro@ as parameter (which is a very roundabout way of calling @macro@ itself with no parameter!). One might, naïvely, write @call@@macro@". However, a single look at that shows immediately that it is just not right; for if we write that, then the double-quote terminates the parameter list to @macro@ and nothing terminates the parameter list to @call@. In other words, when SIMPLE sees @macro@, it immediately starts collecting its arguments in order to evaluate it, and that is what we want to avoid, because we want the @macro@ token itself, rather than its evaluation, to be passed to @call@. Or: we want to prevent the evaluation of @macro@. This is called quoting the token @macro@. There are two ways to do that, and we start with the simpler: it consists of putting @macro@ between a pair of angles, < and >. So to summarize, we have the following:
% First we define @macro@ itself:
% it expands to ``This is a simple macro.''
@def@<@macro@>|<This is a simple macro.>"%
% Now we define the @call@ macro:
% @call@ with one argument, a macro, evaluates to that macro
% called with no arguments.
@def@<@call@>|<@1@">"%
% Now we perform a simple call:
@macro@"
% And now a more complicated call:
@call@<@macro@>"
->
This is a simple macro.
This is a simple macro.
Beyond that, it gets complicated. An MS-DOS port seems hopeless because I make very intensive use of realloc(), sometimes with rather large memory blocks, so the huge model would be a necessity, and the poor thing would probably choke itself out of memory real fast even if it could be compiled.