The Original Pile Engine Demystified

Ralf Westphal's post , pdf (2006), code zip

The Original Pile Engine (OPE) as implemented by Erez Elul and Miriam Bedoni has boggled the minds of the Pile community for quite some time. Several attempts have been undertaken to understand its implementation – unfortunately without really succeeding. Recently Dirk Reuter with the help of Miriam has taken a stab at it again and written up a document trying to explain the algorithms and data structures used in the OPE (see his posting on Jan 16 2006 to the pileworks mailing list). Although this explanation is somewhat easier to understand than earlier writings it almost fully remains in the peculiar pileworks terminology.

To make Pile and its first implementation easier to understand and to pave the way for future implementations the following explanation will undertake a translation of the OPE algorithms and data structures to more usual software development terminology.

# What this Paper is about – and what not

This paper is about implementation and not about concept. It tries to explain a certain software design which realizes the Pile idea. It thus does not make any statement about the quality or feasibility of Pile concepts. Rather it wants to help make future discussions about Pile concepts easier, by getting implementational details out of the way.

Pile as a whole is like a coin with two faces: One is the conceptual side of Pile, the other the implementational.

Conceptually Pile for sure is very interesting and worthy to invest research into. Where can a Pile view on the world of information processing make a difference? That ́s an important question to answer. Erez Elul ́s “three equations” are part of the Pile concept. The very notion of relation as opposed to data is part of the Pile concept.

But the concept must not be confused with its implementation. Because if implementational details creep into a discussion about the concept, the concept might look less attractive or infeasible. So whether a relation is implemented as an integer or whether relations are managed in buffers or trees in a Pile Engine should be of no concern to whoever is interested in the Pile concept.

So far, though, many discussions about the Pile concept sooner or later referred to Erez ́ OPE, attributing it some capabilities without which Pile as a concept would not work. Thus since this implementation was so hard to understand, Pile ́s success so far was limited by it.

Now, by demystifying the OPE ́s inner workings this paper is trying to make the valuable Pile concept independent of any implementation. Of course, in the end Pile needs to be implemented somehow. But there can and should be more than one implementation with different qualities.

Pile implementations can differ in speed or memory consumption, but as is shown in this paper, they do not need to rely on some “magical algorithm” or the like in order to be called Pile Engines. Certainly it ́s a challenge to find the right data structures and algorithms for an implementation be fast and have small memory footprint. But that ́s primarily matters for software developers.

“Pile theorists” on the other hand should not be concerned too much about how a Pile Engine looks inside. As this paper will show, they can rest assured there is nothing in the OPE (or any other implementation so far) that needs to be understood first.

Pile theory is about a relational space with certain basic operations to build it and navigate through it. But whether this relational space is spanned by the OPE or another implementation, whether internally relations are integer values, or whether they are arranged in a sparsely populated 2D coordinate system, is not important for the Pile theory.

To make clear the distinction between concept and implementation of Pile, to direct research towards applications of Pile (Pile Agent) instead of implementation (Pile Engine), and to make the Pile idea independent of any particular realization in programming code: that ́s the purpose of this paper.