Stack Based Language

See also: Stack Computers


A stack-based language is one in which a stack, implicitly accessed by most operations, is a fundamental part of the programming model. Examples include Forth Language, Factor Language, Joy Language, Post Script, and the Java Virtual Machine (which, since one can write in assembly for it, should be considered a language albeit a low-level one).

Languages which maintain a program stack ("The Stack") for storing of Activation Records and/or parameter passing as an implementation detail, but keep the programmer from manipulating The Stack directly, don't count. This includes most imperative programming languages (Cee Language, Cee Plus Plus, Java Language)--in each of these, the program stack could be replaced with an alternative data structure (for example, heap-allocated activation records, like Smalltalk Language has).

Likewise, providing a stack data structure in the library also doesn't count.


Actually, the Forth Language uses two separate stacks: the data stack for parameter and result passing and the return stack for storing activation records. This separation makes it easy and natural for a Forth word to return multiple values. (It is possible and sometimes useful to temporarily push data on to the return stack to reduce stack juggling.)


Another common feature of stack-based languages is Postfix Notation, also called Reverse Polish Notation. Rather than writing an expression such as

+ 4

in many Stack Based Languages you write

4 +

A bit unusual, until you get used to it. Postfix notation has the nice property that it doesn't require parentheses for associativity. The following expression is ambiguous:

+ 4 * 5

Is it (3+4) * 5 = 35, or 3+(4*5) = 23? The rules of mathematics and most programming languages say the latter; Smalltalk Language says the former. To override these defaults, parentheses must be added. In postfix notation, that's not necessary--you would write either

4 + 5 *

if you mean the first, or

4 5 * +

if you mean the second.

One drawback with postfix notation is that operators must either be unary or binary, not both. If you want negation, you cannot overload - to be a unary operator; you either have to provide a new operator (say neg) or use subtraction from zero.

5 - +

is an error.

5 neg +

or

0 5 - +

both produce -2.


I generally consider stack-based languages to be better suited for Intermediate Forms or Virtual Machines (the Java Virtual Machine is a fine example) then they are for source languages. Elimination of variables/Let Bindings is theoretically interesting, and certainly elimination of variables which are only used as temporaries is nice; however, stack-based languages can increase the chance of programmer error. Stack-based languages frequently require the programmer to keep track of bookkeeping details that a compiler should handle instead ("Where on the stack is the result of evaluating the foo function?"). Variable names are also a key part of a program's documentation (well-chosen ones, anyway); eliminating them tends to obfuscate code. In my opinion, at least.

A program stack (visible to the programmer, rather than just a convenient way to deal with Activation Records) can be a useful thing; especially when it augments rather than replaces other things.


I agree; Forth Language novices are apt to make many stack-balancing errors when programming in their accustomed style (big functions, lots of conditionals, lots of intermediate values on the stack). The lesson learned is to Refactor Mercilessly and Simplify Vigorously. The ideal Forth word definition is seven words long. Easy to understand, easy to Unit Test interactively and exhaustively. Proverb: "Refactor until there is no place left for the bug to hide."

The more words you factor your code into, the more the words themselves become the built-in documentation. In the few cases where persistent stack intermediates are unavoidable, one can insert a stack comment: ( foo bar -- ), or use a local variable language extension to define symbolic names for those intermediates.

As with Assembly Language programming, you can get around the lack of a safety net in Forth (which has been described as a macro assembler for a virtual stack machine) by using good programming practices.


If you are asking Where on the stack is the result of evaluating the foo function?, you probably need to refactor in simpler terms. Unavoidably, that happens all the time.

-- iru


The built-in language in HP 48 series graphic calculators (and other HP RPN calculators) is also stack-oriented.


The Concatenative Languages Wiki located at factor.sourceforge.net provides a wealth of information about (stack-based) Concatenative Languages. -- Slava Pestov


See original on c2.com