Discovering The Arcane World Of Esoteric Programming Languages

An esoteric programming language is a programming language designed to challenge the norms of language design. It is a language designed to make a point. The point could be anything: the challenge itself of creating something unusual, making an elaborate joke, creating a language as an artistic expression, testing promising ideas for programming, etc.

This should make clear that there is no easy way to categorize or organize all the different esoteric programming languages, since by definition they defy the rules. But we can use some concepts and organizing principles to get a better sense of the community around them, so that we can understand what people use them for, or use this information as inspiration to create our own esoteric language.

Some Useful Concepts

Turing Tarpit

A Turing tarpit is a programming language that is Turing-complete, but practically unusable. A Turing-complete language can simulate any Turing machine. In practical terms, you can write any program with it. So in theory all such languages are equally powerful, but the ones that are also Turing tarpits make it incredibly difficult to write any kind of program. A common way to do that is to reduce the number of available symbols: try writing even a Hello World program with only 8 characters.

Quine

A quine is a program that produces as output a copy of its own source code. The program cannot be the empty program, even if the empty program is valid in the specific language. Reading the source code from disk or memory is also considered cheating. The name “quine” was coined by Douglas Hofstadter, in his book Gödel, Escher, Bach: An Eternal Golden Braid, in honor of the philosopher Willard Van Orman Quine. So the concept of a quine is not unique to esoteric programming languages, but writing one is nonetheless a typical first challenge to overcome in an esoteric language.
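As a concrete illustration, here is a minimal sketch in Python, which is not an esoteric language and is used only because it is easy to read. The names QUINE_SOURCE and run_and_capture are invented for this example. The two-line quine is kept in a string so that it can be executed and its output compared with its own text.

```python
import contextlib
import io

# A classic two-line Python quine, stored as a string so it can be run
# and checked. Inside the quine, `s` is a template of the whole program;
# the program prints the template filled in with its own repr().
QUINE_SOURCE = "s = 's = %r\\nprint(s %% s)'\nprint(s % s)\n"


def run_and_capture(source: str) -> str:
    """Execute `source` and return everything it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


if __name__ == "__main__":
    # The output of the program should be identical to its source text.
    print(run_and_capture(QUINE_SOURCE) == QUINE_SOURCE)
```

Note that the quine never opens its own file: it rebuilds its text from data it carries around, which is what separates a real quine from the "cheating" variants.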

Code Golf

Code golf is a type of competition in which the objective is to write the shortest possible program that implements a certain algorithm. In this context, shortest means having the shortest source code. It does not mean having the smallest binary executable. Playing code golf is called “golf scripting”. There are also all sorts of other programming competitions, which, given the audience, frequently encourage creative cheating. Generally speaking, a challenge is a motivating factor for many esoteric programming languages.

Obfuscation

The concept of obfuscation, that is to say hiding the real meaning of a message, is obviously not exclusive to esoteric programming languages or even programming in general. But it has great relevance in many esoteric programming languages, especially if you take it in the broader sense of misleading or confusing the recipient. You are not hiding the message because the message itself is important, but because the hidden form is, in itself, the objective. A typical way to achieve this is by minimizing the number of symbols of the language, but you can also camouflage a program as what looks like normal text or an image.

Non-Determinism

In this larger meaning of misleading, or confusing the audience, non-determinism can also be used to achieve the same objective. A non-deterministic language is one for which, given the current state of a program, the next state cannot always be predicted. The concept has some use in normal programming languages, essentially due to unpredictable conditions at run time. But it is taken to the extreme by certain esoteric programming languages, by randomizing variables or even randomizing the instructions themselves.

Groups Of Esoteric Programming Languages

As we have already said, we cannot provide an exhaustive way of organizing all the esoteric programming languages. And even if somebody could find one, somebody else would soon invent a new esoteric programming language just to make it invalid. So what we are trying to do is simply to offer some groups, or categories, to better understand and explore the world of esoteric programming languages.

  • Languages with an objective value, although this value is not necessarily an extrinsic or typical one. A basic example is a language designed to win a code golf competition: its value can be measured, but it has no meaning outside of the community of esoteric programming languages. But the more useful kind are the ones designed to achieve an atypical objective, such as to bypass security measures (see JSFuck) or to define Type-0 languages of the Chomsky hierarchy (see Thue).
  • Unusable languages. These are languages that are very challenging to use. This can be an explicit objective of the designer or simply an accident of the design itself. The user may even find it fun to use them for some time. A typical way to achieve an unusable language, both by design and by accident, is with minimalism (see Brainfuck). But if you want something almost impossible to use you need something like a language that uses a ternary system and purposefully alters itself, among other things (see Malbolge).
  • Languages for testing an idea or proving something. These are languages that may not do anything useful, but they are a good testing ground for a new concept or a way to prove something. For instance a language in which programs are arranged in a two-dimensional grid (see Befunge) or an almost pure functional language (see Unlambda).
  • Artistic languages. These are languages that are designed with the idea of having some artistic value. You could argue that they are effectively part either of the group of the testing languages or the ones with an objective value. We put them in a different category because they don’t look like programs, yet they tend to look interesting, even if you may not want to use them. For instance they look like abstract art (see Piet) or like a Shakespearean play (see Shakespeare).
  • Joke languages. These are languages invented by people with a weird sense of humour. Like a person that says: «do you want to hear a joke? – then it reads War And Peace, before concluding – Our whole lives are a joke!». An example is a satirical language, where even the reference manual is a joke (see INTERCAL).

Some Notable Esoteric Languages

Of course this is not an exhaustive list of notable esoteric programming languages, but more modestly a list of a few notable for their success or peculiarity.

Befunge

The main esoteric feature of Befunge is that programs are arranged in a two-dimensional grid. It is also a stack-based, reflective language, so it allows a program to alter itself. The main objective was to design a language for which it was as hard as possible to create a compiler.

The original version, now called Befunge-93, is not Turing complete, because it limits the grid size to 80×25. But a newer version, called Befunge-98, removes this limit and should be Turing complete. Befunge has spawned a whole class of multidimensional languages, called Fungeoids.

Befunge has commands that control the direction up, down, left, right and thus can also create a cycle, together with more traditional commands for output, binary operations, etc.

A Hello World program looks like this.

>              v
v  ,,,,,"Hello"<
>48*,          v
v,,,,,,"World!"<
>25*,@
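To see how the instruction pointer travels through the grid, here is a rough Python sketch of an interpreter for a small subset of Befunge-93: only the direction commands, digits, string mode, the operators + - *, the character output command and @ are handled. The function name run_befunge and the list HELLO_BEFUNGE are invented for this example, and the grid simply wraps around at the edges of the program instead of using the full 80×25 playfield.

```python
# The Hello World program shown above, with the padding written out
# explicitly so that the arrows line up in column 15.
HELLO_BEFUNGE = [
    ">" + " " * 14 + "v",
    'v  ,,,,,"Hello"<',
    ">48*," + " " * 10 + "v",
    'v,,,,,,"World!"<',
    ">25*,@",
]

MOVES = {">": (1, 0), "<": (-1, 0), "^": (0, -1), "v": (0, 1)}
BINARY_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def run_befunge(lines):
    """Run a tiny Befunge-93 subset and return the text it prints."""
    width = max(len(line) for line in lines)
    grid = [line.ljust(width) for line in lines]
    x, y, dx, dy = 0, 0, 1, 0  # pointer position and direction (starts moving right)
    stack, output, string_mode = [], [], False
    while True:
        cell = grid[y][x]
        if string_mode:
            if cell == '"':
                string_mode = False        # closing quote ends string mode
            else:
                stack.append(ord(cell))    # push each character code
        elif cell == '"':
            string_mode = True
        elif cell in MOVES:
            dx, dy = MOVES[cell]           # change direction
        elif cell.isdigit():
            stack.append(int(cell))
        elif cell in BINARY_OPS:
            b, a = stack.pop(), stack.pop()
            stack.append(BINARY_OPS[cell](a, b))
        elif cell == ",":
            output.append(chr(stack.pop()))  # pop and print as a character
        elif cell == "@":
            return "".join(output)           # end of program
        elif cell != " ":
            raise ValueError(f"unsupported command {cell!r}")
        x, y = (x + dx) % width, (y + dy) % len(grid)


if __name__ == "__main__":
    print(run_befunge(HELLO_BEFUNGE), end="")
```

Because the pointer reads "Hello" while moving left, the characters are pushed in reverse and come off the stack in the right order.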

Brainfuck

Probably the most famous of all esoteric programming languages, it is notable for its extreme minimalism, both in the number of commands available and in having a very small compiler. In fact the second condition is the reason for the first, because the author wanted to create a language with the smallest possible compiler. The author created a compiler that used only 240 bytes and some time later somebody created one with just 100 bytes. A natural consequence of this objective is that the language is hard to use, although it is Turing complete. In short, it is a Turing tarpit.

It is sometimes referred to with a censored spelling, such as Brainf*ck or many other variations.

The language consists of eight different commands that manipulate a data pointer and two streams of bytes for input and output. It also supports loops. Other characters, besides the ones representing the commands, are considered comments and ignored by the compiler.

This is an example of how the language works (taken from the Esolangs wiki).

Code:   Pseudo code:
>>      Move the pointer to cell2
[-]     Set cell2 to 0
<<      Move the pointer back to cell0
[       While cell0 is not 0
  -       Subtract 1 from cell0
  >>      Move the pointer to cell2
  +       Add 1 to cell2
  <<      Move the pointer back to cell0
]       End while

A Hello World program looks like this.

++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
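The eight commands are simple enough that an interpreter fits in a few lines. The following Python sketch is only an illustration: the function name run_brainfuck is invented here, and the 30,000-cell tape, the wrap-around of cell values at 256 and the choice of returning 0 at end of input are common conventions assumed for this example, not requirements stated in this article.

```python
HELLO_BRAINFUCK = (
    "++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++."
    ">>.<-.<.+++.------.--------.>>+.>++."
)


def run_brainfuck(code: str, input_bytes: bytes = b"") -> str:
    """Interpret `code` and return its output as text."""
    # Pre-compute the position of the matching bracket for every loop.
    jumps, open_brackets = {}, []
    for pos, ch in enumerate(code):
        if ch == "[":
            open_brackets.append(pos)
        elif ch == "]":
            start = open_brackets.pop()
            jumps[start], jumps[pos] = pos, start

    tape = [0] * 30000          # assumed tape size
    ptr = pc = 0                # data pointer and program counter
    pending = list(input_bytes)
    output = []
    while pc < len(code):
        ch = code[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            output.append(chr(tape[ptr]))
        elif ch == ",":
            tape[ptr] = pending.pop(0) if pending else 0  # assumed EOF value
        elif ch == "[" and tape[ptr] == 0:
            pc = jumps[pc]      # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jumps[pc]      # go back to the start of the loop
        # Any other character is a comment and is ignored.
        pc += 1
    return "".join(output)


if __name__ == "__main__":
    print(run_brainfuck(HELLO_BRAINFUCK), end="")
```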

If you want to find out more, see the corresponding Wikipedia page or Esolangs page.

INTERCAL

INTERCAL may not have been the first esoteric programming language, but it is certainly the first famous one. How old is it? It was created in 1972 and the first implementation was made with punched cards. The language was intended as a parody, but also as something completely new, something alien to the programming world. The name is not an acronym, according to the authors:

The full name of the compiler is “Compiler Language With No Pronounceable Acronym,” which is, for obvious reasons, abbreviated “INTERCAL.”

The rest of the INTERCAL Reference Manual (PS format) is also full of nonsensical or humorous statements. This applies both to the language it describes and to how it describes it. For instance, this is the section on variables:

INTERCAL allows only 2 different types of variables, the 16-bit integer and the 32-bit integer. These
are represented by a spot ( . ) or two-spot ( : ), respectively, followed by any number between 1 and 65535,
inclusive. These variables may contain only non-negative numbers; thus they have the respective ranges of
values: 0 to 65535 and 0 to 4294967295. Note: .123 and :123 are two distinct variables. On the other hand,
.1 and .0001 are identical. Furthermore, the latter may not be written as 1E-3 .

The language also allows the use of a modifier, PLEASE. However, it also requires the proper quantity of this keyword. It cannot be used too little or too much, that is to say you need to show the proper amount of politeness. The really unfunny part is that it was an undocumented feature in the original manual. A joke manual for a joke language may be okay, but one that is incomplete is unacceptable.

This is a Hello World in a version of INTERCAL implemented in C: C-INTERCAL.

DO ,1 <- #13
PLEASE DO ,1 SUB #1 <- #238
DO ,1 SUB #2 <- #108
DO ,1 SUB #3 <- #112
DO ,1 SUB #4 <- #0
DO ,1 SUB #5 <- #64
DO ,1 SUB #6 <- #194
DO ,1 SUB #7 <- #48
PLEASE DO ,1 SUB #8 <- #22
DO ,1 SUB #9 <- #248
DO ,1 SUB #10 <- #168
DO ,1 SUB #11 <- #24
DO ,1 SUB #12 <- #16
DO ,1 SUB #13 <- #162
PLEASE READ OUT ,1
PLEASE GIVE UP
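The politeness rule can be checked mechanically. The Python sketch below counts how many statements of the program above start with PLEASE. The bounds used here (at least one fifth and at most one third of the statements) are the ones I recall being documented for C-INTERCAL; treat them as an assumption rather than a quotation from the manual. The function names are invented for this example, and the sketch assumes one statement per line.

```python
from fractions import Fraction

HELLO_INTERCAL = """\
DO ,1 <- #13
PLEASE DO ,1 SUB #1 <- #238
DO ,1 SUB #2 <- #108
DO ,1 SUB #3 <- #112
DO ,1 SUB #4 <- #0
DO ,1 SUB #5 <- #64
DO ,1 SUB #6 <- #194
DO ,1 SUB #7 <- #48
PLEASE DO ,1 SUB #8 <- #22
DO ,1 SUB #9 <- #248
DO ,1 SUB #10 <- #168
DO ,1 SUB #11 <- #24
DO ,1 SUB #12 <- #16
DO ,1 SUB #13 <- #162
PLEASE READ OUT ,1
PLEASE GIVE UP
"""


def politeness(source: str) -> Fraction:
    """Fraction of statements (one per line here) that say PLEASE."""
    statements = [line.strip() for line in source.splitlines() if line.strip()]
    polite = sum(1 for line in statements if line.startswith("PLEASE"))
    return Fraction(polite, len(statements))


def check_politeness(source: str,
                     low: Fraction = Fraction(1, 5),    # assumed lower bound
                     high: Fraction = Fraction(1, 3)):  # assumed upper bound
    """Classify a program as too rude, too polite, or acceptable."""
    ratio = politeness(source)
    if ratio < low:
        return "insufficiently polite"
    if ratio > high:
        return "overly polite"
    return "acceptable"


if __name__ == "__main__":
    print(politeness(HELLO_INTERCAL), check_politeness(HELLO_INTERCAL))
```

With four PLEASE statements out of sixteen, the Hello World above sits comfortably between the two assumed bounds.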

In short, INTERCAL is a very elaborate joke. The kind of joke that may lead you to admire its creators for their perseverance or to doubt their sanity, or maybe both.

JSFuck

The success of Brainfuck has spurred the creation of many derivative languages, too many to mention them all. But the most notable one is JSFuck.

JSFuck is not a proper language, but more of an esoteric programming style of JavaScript that uses only 6 characters. The resulting programs are valid JavaScript programs and in fact it was invented to bypass security techniques like malware detection. Its properties also make it useful for code obfuscation.

It works because you can evaluate any expression in JavaScript as any type. So [] represents an empty array, but by prepending it with a + you can force it to evaluate as the number 0. To obtain a letter like a you can manipulate the value false, etc. The end result is an extremely verbose language: the equivalent of alert("Hello World!") is 22948 characters long.

Malbolge

Malbolge is a language specifically designed to be almost impossible to use: «via a counter-intuitive ‘crazy operation’, base-three arithmetic, and self-altering code», in the words of its own creator. In fact even the author never wrote a working program with it. The first program was generated by another program that implemented a heuristic search algorithm. The language is not Turing complete and its name is a misspelling of Malebolge, the eighth circle of hell in Dante Alighieri’s Inferno. So it really does not have any redeeming quality.

I will not attempt to describe it because the whole thing is explicitly an exercise in frustration. However there is a notable “crazy operation” that is worth mentioning: encryption. The language is supposed to work on a ternary virtual machine and has three registers, one of which contains a pointer to the current instruction. After the instruction is executed, the value at that pointer is replaced with itself modulo 94, then the result is encrypted according to an encryption table. Welcome to hell, indeed.

Some brave soul has created a Hello World program.

(=<`#9]~6ZY32Vx/4Rs+0No-&Jk)"Fh}|Bcy?`=*z]Kw%oG4UUS0/@-ejc(:'8dc

Piet

Piet is a language in which programs are represented as abstract art paintings, in the style of Piet Mondrian. It is stack based and Turing complete.

There are 18 colors ordered according to hue and brightness, plus black and white. These last two have special meaning and are used for control flow. The program execution relies on two “pointers”: a Direction Pointer (DP) and a Codel Chooser (CC). The DP may point up, down, left and right, while the CC can only point left and right. The combination of these two pointers governs the execution of the program: basically which block of color is executed next. The size of a block of color represents an integer. When the program transitions between different blocks of color, their difference in hue and brightness determines the kind of command that is executed. The commands are the usual ones, like output a value, multiply, etc.
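The way a color transition selects a command can be sketched as a table lookup. The Python below reproduces the command table as I remember it from the Piet specification, so treat the exact table as an assumption to be checked against the official description; the function name piet_command and the (lightness, hue) tuples are invented for this example. Both the hue cycle and the lightness cycle wrap around.

```python
HUES = ["red", "yellow", "green", "cyan", "blue", "magenta"]
LIGHTNESS = ["light", "normal", "dark"]

# Rows: hue steps (0-5). Columns: lightness steps darker (0-2).
COMMANDS = [
    [None,        "push",        "pop"],
    ["add",       "subtract",    "multiply"],
    ["divide",    "mod",         "not"],
    ["greater",   "pointer",     "switch"],
    ["duplicate", "roll",        "in(number)"],
    ["in(char)",  "out(number)", "out(char)"],
]


def piet_command(from_color, to_color):
    """Return the command triggered by moving between two colored blocks.

    Each color is a (lightness, hue) pair such as ("normal", "red").
    Black and white are control-flow colors and are not handled here.
    """
    from_light, from_hue = from_color
    to_light, to_hue = to_color
    hue_steps = (HUES.index(to_hue) - HUES.index(from_hue)) % len(HUES)
    dark_steps = (LIGHTNESS.index(to_light) - LIGHTNESS.index(from_light)) % len(LIGHTNESS)
    return COMMANDS[hue_steps][dark_steps]


if __name__ == "__main__":
    print(piet_command(("normal", "red"), ("dark", "red")))
    print(piet_command(("normal", "red"), ("normal", "yellow")))
```

Since only differences matter, the same painting can be recolored by shifting every hue by the same amount without changing what the program does.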

The Hello World program has a certain beauty.

Hello World in Piet by Thomas Schoch

Shakespeare

Shakespeare is a language designed to have: «beautiful source code that resembled Shakespeare plays». The language has few commands and it is functionally similar to assembly language, but it is unsurprisingly quite verbose.

The variables must be declared in an initial section and their names must be valid Shakespearean characters, such as Romeo or Juliet. These variables are stacks on which operations like pop, push and input/output are later executed. The names of acts and scenes work as goto labels and as the destinations of certain conditional statements. Characters, that is to say variables, must be called on stage to be manipulated and there can be only two at a time on the stage.

The lines usually represent numerical constants: some nouns and adjectives are converted into numerical values.

Hamlet:
 You lying stupid fatherless big smelly half-witted coward!
 
Juliet:
 You are as villainous as the square root of Romeo!
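The conversion rule is simpler than it looks: a noun is worth 1, or -1 if it is an unpleasant one, and every adjective in front of it doubles the value. Here is a small Python sketch of that rule. The word list is a tiny hand-picked subset used only for illustration, the function name is invented, and the sketch makes the simplifying assumption that every word before the noun that is not an article or a possessive is an adjective.

```python
# Illustrative subset of nouns: pleasant or neutral ones are worth 1,
# unpleasant ones are worth -1.
NOUNS = {"hero": 1, "rose": 1, "kingdom": 1, "horse": 1, "coward": -1}

# Words that do not change the value of a phrase.
FILLER = {"a", "an", "the", "my", "your", "his", "her"}


def noun_phrase_value(phrase: str) -> int:
    """Value of a phrase such as 'a handsome rich brave hero'."""
    words = [word.strip("!.,").lower() for word in phrase.split()]
    words = [word for word in words if word and word not in FILLER]
    *adjectives, noun = words
    # Each adjective multiplies the value of the noun by two.
    return NOUNS[noun] * 2 ** len(adjectives)


if __name__ == "__main__":
    insult = noun_phrase_value("lying stupid fatherless big smelly half-witted coward!")
    praise = noun_phrase_value("a handsome rich brave hero")
    # "the difference between a handsome rich brave hero and thyself"
    print(insult, praise, chr(praise - insult))
```

So Hamlet's first insult assigns -64 to the character he is addressing, and subtracting that from a suitably flattering hero gives 72, the character code of the letter H.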

Special words, like “Remember”, or combinations thereof, like “Speak your mind”, perform commands. These are normal commands such as pushing values on the stack-character, outputting the value of the current variable or the corresponding ASCII character.

A conditional statement is represented by a question from one character and an answer from the other character, which determines where to go if the condition in the question evaluates to true.

Juliet:
 Am I better than you?
 
Hamlet:
 If so, let us proceed to scene III.

The Hello World is as long and productive as a Shakespearean play.

The Infamous Hello World Program.
 
Romeo, a young man with a remarkable patience.
Juliet, a likewise young woman of remarkable grace.
Ophelia, a remarkable woman much in dispute with Hamlet.
Hamlet, the flatterer of Andersen Insulting A/S.
 
 
                    Act I: Hamlet's insults and flattery.
 
                    Scene I: The insulting of Romeo.
 
[Enter Hamlet and Romeo]
 
Hamlet:
 You lying stupid fatherless big smelly half-witted coward!
 You are as stupid as the difference between a handsome rich brave
 hero and thyself! Speak your mind!
 
 You are as brave as the sum of your fat little stuffed misused dusty
 old rotten codpiece and a beautiful fair warm peaceful sunny summer's
 day. You are as healthy as the difference between the sum of the
 sweetest reddest rose and my father and yourself! Speak your mind!
 
 You are as cowardly as the sum of yourself and the difference
 between a big mighty proud kingdom and a horse. Speak your mind.
 
 Speak your mind!
 
[Exit Romeo]
 
                    Scene II: The praising of Juliet.
 
[Enter Juliet]
 
Hamlet:
 Thou art as sweet as the sum of the sum of Romeo and his horse and his
 black cat! Speak thy mind!
 
[Exit Juliet]
 
                    Scene III: The praising of Ophelia.
 
[Enter Ophelia]
 
Hamlet:
 Thou art as lovely as the product of a large rural town and my amazing
 bottomless embroidered purse. Speak thy mind!
 
 Thou art as loving as the product of the bluest clearest sweetest sky
 and the sum of a squirrel and a white horse. Thou art as beautiful as
 the difference between Juliet and thyself. Speak thy mind!
 
[Exeunt Ophelia and Hamlet]
 
 
                    Act II: Behind Hamlet's back.
 
                    Scene I: Romeo and Juliet's conversation.
 
[Enter Romeo and Juliet]
 
Romeo:
 Speak your mind. You are as worried as the sum of yourself and the
 difference between my small smooth hamster and my nose. Speak your
 mind!
 
Juliet:
 Speak YOUR mind! You are as bad as Hamlet! You are as small as the
 difference between the square of the difference between my little pony
 and your big hairy hound and the cube of your sorry little
 codpiece. Speak your mind!
 
[Exit Romeo]
 
                    Scene II: Juliet and Ophelia's conversation.
 
[Enter Ophelia]
 
Juliet:
 Thou art as good as the quotient between Romeo and the sum of a small
 furry animal and a leech. Speak your mind!
 
Ophelia:
 Thou art as disgusting as the quotient between Romeo and twice the
 difference between a mistletoe and an oozing infected blister! Speak
 your mind!
 
[Exeunt]

Thue

Thue is a programming language based upon a string rewriting system, called a semi-Thue system. It is non-deterministic and follows the constraint programming paradigm, which means that the variables are defined in terms of constraints (e.g. something is true or false). It is a Turing tarpit.

In the case of Thue the constraints are represented by a list of substitution rules in the form:

<string>::=<replacement>

Special formats of this form represent input, output or the ending of the list of rules.

The non-deterministic nature of the language can be shown by indicating two possible replacements for the same string. For example:

  • you write a rule that says that the string a can be replaced with stupid
  • you also write a rule that says that the string a can be replaced with you are

When a program is executed the string could be replaced by either of the two options.

The list of rules is followed by a string that represents the initial state.

While writing a useful program can be hard, the typical Hello World is quite easy and understandable, at least by the standard of esoteric programming languages.

a::=~Hello World!
::=
a
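To make the rewriting loop concrete, here is a rough Python sketch of a Thue interpreter. It assumes the usual conventions: a line containing only ::= ends the rule list, and a replacement starting with ~ prints the rest of the replacement and substitutes the empty string. Input rules are not handled, and the function name run_thue is invented for this example. The non-determinism is modelled by picking a random applicable rule and position at every step.

```python
import random

HELLO_THUE = "a::=~Hello World!\n::=\na\n"


def run_thue(source: str, rng: random.Random = None):
    """Run a Thue program; return (printed output, final state)."""
    rng = rng or random.Random()
    lines = source.splitlines()
    rules, end_of_rules = [], len(lines)
    for index, line in enumerate(lines):
        if line.strip() == "::=":          # empty rule: end of the rule list
            end_of_rules = index
            break
        if "::=" in line:
            lhs, rhs = line.split("::=", 1)
            rules.append((lhs, rhs))
    state = "".join(lines[end_of_rules + 1:])  # the initial state string
    output = []
    while True:
        # Every (rule, position) pair that could be applied right now.
        matches = [
            (lhs, rhs, pos)
            for lhs, rhs in rules
            for pos in range(len(state))
            if state.startswith(lhs, pos)
        ]
        if not matches:
            return "".join(output), state   # nothing applies: the program ends
        lhs, rhs, pos = rng.choice(matches)  # the non-deterministic step
        if rhs.startswith("~"):
            output.append(rhs[1:])           # output rule: print, replace with nothing
            rhs = ""
        state = state[:pos] + rhs + state[pos + len(lhs):]


if __name__ == "__main__":
    print(run_thue(HELLO_THUE)[0])
    # Two rules for the same string: either result is a valid run.
    print(run_thue("x::=heads\nx::=tails\n::=\nx\n")[1])
```

A rule whose replacement contains the string it replaces can fire again on its own result, so two competing rules may produce more than two possible outcomes, or never stop at all.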

Unlambda

Unlambda is an (almost) pure functional language designed to demonstrate purely functional programming (and probably how impractical that would be). It is based on combinatory logic. It is Turing complete and the first functional Turing tarpit. It relies on a few functions and an apply operator ` (the backquote character), and it also supports input/output. Technically it works only on functions with a single argument, but multi-argument functions can be translated into a sequence of functions.
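The two ideas in this description, combinators and single-argument functions, can be illustrated with a few lines of Python. This is only an analogy, not an Unlambda interpreter: K and S below mirror the k and s combinators that Unlambda builds on, and curried_add is an invented example of turning a two-argument function into a sequence of one-argument functions.

```python
def K(x):
    """Constant combinator: K(x)(y) always returns x."""
    return lambda y: x


def S(x):
    """Substitution combinator: S(x)(y)(z) applies x(z) to y(z)."""
    return lambda y: lambda z: x(z)(y(z))


# The identity function does not need to be a primitive:
# S(K)(K)(z) = K(z)(K(z)) = z
I = S(K)(K)


def curried_add(a):
    """A two-argument addition rewritten as a chain of one-argument functions."""
    return lambda b: a + b


if __name__ == "__main__":
    print(I(42), K("kept")("dropped"), curried_add(2)(3))
```

Every call here takes exactly one argument, which is what the backquote does in Unlambda: it applies one function to one argument.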

A Hello World program almost looks readable.

`r```````````.H.e.l.l.o. .w.o.r.l.di

Other Interesting Esoteric Languages

Fugue is a language that uses MIDI files as source code. The interval between consecutive notes is translated into specific traditional commands, such as input/output or addition.

Beatnik is a stack-based language which consists of a series of English words; whitespace and punctuation are ignored. The words are converted into traditional commands according to their value in the game Scrabble.
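The first half of that conversion, scoring a word, is easy to sketch in Python using the standard English Scrabble letter values. The mapping from scores to actual commands belongs to the Beatnik specification and is deliberately left out here; the function name scrabble_score is invented for this example.

```python
# Standard English Scrabble letter values.
LETTER_GROUPS = {
    1: "AEILNORSTU",
    2: "DG",
    3: "BCMP",
    4: "FHVWY",
    5: "K",
    8: "JX",
    10: "QZ",
}
LETTER_VALUES = {
    letter: value
    for value, letters in LETTER_GROUPS.items()
    for letter in letters
}


def scrabble_score(word: str) -> int:
    """Sum of the letter values; anything that is not a letter A-Z is ignored."""
    return sum(LETTER_VALUES.get(ch, 0) for ch in word.upper())


if __name__ == "__main__":
    for word in "Hello esoteric world".split():
        print(word, scrabble_score(word))
```

An interpreter would then look up each score in the language's command table, which is why Beatnik programs read like strange free verse.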

Whitespace is a joke language in which only whitespace characters, such as tabs and spaces, are valid and any other character is ignored. This is, of course, the inverse of the usual behaviour of compilers that ignore whitespace. Given these characteristics it can be used in a polyglot program, that is to say a program that is valid in more than one language, as long as the other language is not Python or any other language where whitespace matters.

GolfScript is a concatenative programming language designed to win code-golf competitions that is also Turing complete.

Snowflake is a reversible self-modifying language in which both the interpreter and the program are modified at each run.

FRACTRAN is a language in which programs are a list of fractions and an initial number. I am not sure what it does, but it does something, since it has a Wikipedia page.
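For the curious, the execution rule is short: multiply the current number by the first fraction in the list that gives an integer result, repeat, and stop when no fraction does. The Python sketch below follows that rule; the function name run_fractran and the step limit are additions made for this example. The one-fraction program 3/2 adds two numbers a and b when they are encoded as the starting value 2^a × 3^b.

```python
from fractions import Fraction


def run_fractran(program, n, max_steps=10_000):
    """Run a FRACTRAN program given as (numerator, denominator) pairs.

    Returns the final integer once no fraction yields an integer product.
    `max_steps` is a safety limit added for this sketch.
    """
    fractions = [Fraction(num, den) for num, den in program]
    for _ in range(max_steps):
        for fraction in fractions:
            product = n * fraction
            if product.denominator == 1:   # first fraction giving an integer
                n = product.numerator
                break
        else:
            return n                       # no fraction applies: halt
    raise RuntimeError("step limit reached")


if __name__ == "__main__":
    a, b = 3, 4
    result = run_fractran([(3, 2)], 2**a * 3**b)
    print(result == 3 ** (a + b))
```

Each step trades one factor of 2 for one factor of 3, so the exponent of 3 in the result is the sum of the two inputs.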

Iota and Jot are two formal languages, each one designed to be the simplest formal system. By their nature they can also be considered equally minimalist Turing tarpit programming languages. Both use only two symbols and perform two operations. A successor, called Zot, also supports input/output.

Entropy is a very aptly named programming language designed to accept the idea of giving up control. Any output of the program will be approximate and each time the data is accessed it is randomly modified. The language does not modify the original program, so each time it is run the output will be different, but the initial state will be preserved. A curious side effect of the randomization of data is Drunk Eliza, a web version of the classic Eliza program in which the therapist seems to be drunk.

Monicelli is a joke language based on the comedy film My Friends. A typical program looks like a series of nonsensical Italian phrases. Surprisingly this makes sense, since the film used as inspiration also features nonsensical phrases. It is the circle of madness.

Summary

The world of esoteric programming languages is as exciting as it is maddening: there are no rules, but a lot of interesting things. I hope to have given you a slightly sane window on what you could expect.

If you want to know more or to participate in this community I suggest looking at Esolangs. That website, together with Wikipedia, is the source of most examples shown in this article. There you can also find some inspiration in a list of ideas for an esoteric programming language.

You may also find interesting the blog of Mark C. Chu-Carroll in its many incarnations over the years. As the name Good Math, Bad Math implies, it is mainly dedicated to mathematics. Nonetheless it also has many analyses of esoteric programming languages in a series called Pathological Programming (Language). You can see, for instance, one dedicated to the smallest programming language.

Reference: Discovering The Arcane World Of Esoteric Programming Languages from our JCG partner Federico Tomassetti at the Federico Tomassetti blog.

Federico Tomassetti

Federico has a PhD in Polyglot Software Development. He is fascinated by all forms of software development with a focus on Model-Driven Development and Domain Specific Languages.