Marking the Text: A Necessary Distinction

At the 2002 Extreme Markup Conference, Michael Sperberg-McQueen offered these observations on the problem of overlapping structures for SGML-based markup systems.

It is an interesting problem because it is the biggest problem remaining in the residue. If we have a set of quantitative observations, and we try to fit a line to them, it is good practice to look systematically at the difference between the values predicted by our equation (our theory) and the values actually observed; the set of these differences is the residue …. In the context of SGML and XML, overlap is a residual problem.2

But in any context other than SGML and XML, this formulation is a play of wit, a kind of joke – as if one were now to say that the statistical deviations produced by Newtonian mathematical calculations left a "residue" of "interesting" matters to be cleared up by further, deeper calculations. But those matters are not residual, they are the hem of a quantum garment.

My own comparison is itself a kind of joke, of course, for an SGML model of the world of textualities pales in comprehensiveness before the Newtonian model of the physical world. But the outrageousness of the comparison in each case helps to clarify the situation. No autopoietic process or form can be simulated under the horizon of a structural model like SGML, not even topic maps. We see this very clearly when we observe the inability of a derivative model like TEI to render the forms and functions of traditional textual documents. The latter, which deploy markup codes themselves, supply us with simulations of language as well as of many other kinds of semeiotic processes, as Peirce called them. Textualized documents restrict and modify, for various kinds of reflexive purposes, the larger semeiotic field in which they participate. Nonetheless, the procedural constraints that traditional textualities lay upon the larger semeiotic field that they model and simulate are far more pragmatic, in a full Peircean sense, than the electronic models that we are currently deploying.

Understanding how traditional textual devices function is especially important now when we are trying to imagine how to optimize our new digital tools. Manuscript and print technologies – graphical design in general – provide arresting models for information technology tools, especially in the context of traditional humanities research and education needs. To that end we may usefully begin by making an elementary distinction between the archiving and the simulating functions of textual (and, in general, semeiotic) systems. Like gene codes, traditional textualities possess the following as one of their essential characteristics: that as part of their simulation and generative processes, they make (of) themselves a record of those processes. Simulating and record keeping, which are co-dependent features of any autopoietic or semeiotic system, can be distinguished for various reasons and purposes. A library processes traditional texts by treating them strictly as records. It saves things and makes them accessible. A poem, by contrast, processes textual records as a field of dynamic simulations. The one is a machine of memory and information, the other a machine of creation and reflection. Each may be taken as an index of a polarity that characterizes all semeiotic or autopoietic systems. Most texts – for instance, this chapter you are reading now – are fields that draw upon the influence of both of those polarities.

The power of traditional textualities lies exactly in their ability to integrate those different functions within the same set of coding elements and procedures.

SGML and its derivatives are largely, if not strictly, coding systems for storing and accessing records. They possess as well certain analytic functions that are based in the premise that text is an "ordered hierarchy of content objects." This conception of textuality is plainly non-comprehensive. Indeed, its specialized understanding of "text" reflects the pragmatic goal of such a markup code: to store objects (in the case of TEI, textual objects) so that they can be quickly accessed and searched for their informational content – or more strictly, for certain parts of that informational content (the parts that fall into a hierarchical order modeled on a linguistic analysis of the structure of a book).
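A schematic example may make the point concrete. The fragment below uses common TEI element names, though the sample itself is invented for illustration; what matters is that every content object nests inside exactly one parent, so the whole document resolves into a single ordered hierarchy:

    <text>
      <body>
        <div type="chapter" n="1">
          <head>Marking the Text</head>
          <p>Each content object sits inside exactly one parent element.</p>
          <p>The hierarchy orders the objects for storage and search.</p>
        </div>
      </body>
    </text>

Anything that runs across or outside that nesting order – a metaphor binding two paragraphs, a page design cutting against the syntax – has no native place in the scheme.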

These limitations of electronic markup codes are not to be lamented, but for humanist scholars they are to be clearly understood. A markup code like TEI creates a record of a traditional text in a certain form. Especially important to see is that, unlike the textual fields it was designed to mark up, TEI is an allopoietic system. Its elements are unambiguously delimited and identified a priori, its structure of relations is precisely fixed, it is non-dynamical, and it is focused on objects that stand apart from itself. Indeed, it defines what it marks not only as objective, but as objective in exactly the unambiguous terms of the system's a priori categories. This kind of machinery will therefore serve only certain, very specific, purposes. The autopoietic operations of textual fields – operations especially pertinent to the texts that interest humanities scholars – lie completely outside the range of an order like the TEI.
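The a priori character of such a system shows plainly in its document type declarations. The sketch below is drastically simplified – it is not the actual TEI declaration – but it exhibits the general form: every permissible element, and every relation among elements, is fixed before any text is encountered.

    <!ELEMENT text (body)>
    <!ELEMENT body (div+)>
    <!ELEMENT div  (head?, p+)>
    <!ELEMENT head (#PCDATA)>
    <!ELEMENT p    (#PCDATA)>

Whatever a marked-up document turns out to be, the declaration has already decided what it can be.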

For certain archival purposes, then, structured markup will serve. It does not unduly interfere with, or forbid implementing, some of the searching and linking capacities that make digital technology so useful for different types of comparative analysis. Its strict formality is abstract enough to permit implementation within higher-order formalizations. In these respects it has greater flexibility than a stand-off approach to text markup, which is more difficult to integrate into a dispersed online network of different kinds of materials. All that having been recognized and said, however, these allopoietic text-processing systems cannot access or display the autopoietic character of textual fields. Digital tools have yet to develop models for displaying and replicating the self-reflexive operations of bibliographical tools, which alone are operations for thinking and communicating – which is to say, for transforming data into knowledge.
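Stand-off markup, for comparison, keeps annotations in separate files and points back into an unaltered base text, typically by character offsets. The sketch below is schematic – the element and attribute names are invented for illustration, and the offsets are assumed to be zero-based with exclusive ends – but it suggests why such dispersed pointers are hard to keep synchronized across a network of different materials:

    base.txt (stored separately, never altered):
    Of Man's First Disobedience, and the Fruit

    annotations (a separate stand-off file):
    <span target="base.txt" from="0" to="42" type="line" n="1"/>
    <span target="base.txt" from="15" to="27" type="emphasis"/>

The second pointer picks out the word "Disobedience" by position alone; any change to the base text silently invalidates every offset that refers to it.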

We have to design and build digital environments for those purposes. A measure of their capacity and realization will be whether they can integrate data-function mechanisms like TEI into their higher-order operations. To achieve that will entail, I believe, the deployment of dynamic, topological models for mapping the space of digital operations. But these models will have to be reconceived, as one can see by reflecting on a remark about textual interpretation that Stanley Fish liked to make years ago. He would point out that he was able to treat even the simplest text – road signage, for example – as a poem and thus develop from his own "response" and commentary its autopoietic potential. The remark underscores a basic and almost entirely neglected (undertheorized) feature of discourse fields: that to "read" them – to read "in" them at any point – one must regard what we call "the text" and "the reader" as co-dependent agents in the field. You can't have one without the other.

Fish's observation, therefore, while true, signals a widespread theoretical and methodological weakness in our conceptions of textuality, traditional or otherwise. This approach figures "text" as a heuristic abstraction drawn from the larger field of discourse. The word "text" is used in various ways by different people – Barthes's understanding is not the same as a TEI understanding – but in any case the term frames attention on the linguistic dimension of a discourse field. Books and literary works, however, organize themselves along multiple dimensions of which the linguistic is only one.

Modeling digital simulations of a discourse field requires that a formal set of dimensions be specified for the field. This is what TEI provides a priori, though the provision, as we know, is minimal. Our received scholarly traditions have in fact passed down to us an understanding of such fields that is both far more complex and reasonably stable. Discourse fields, our textual condition, regularly get mapped along six dimensions (see below, and Appendix B). Most important of all in the present context, however, are the implications of cognizing a discourse field as autopoietic. In that case the field measurements will be taken by "observers" positioned within the field itself. That intramural location of the field interpreter is in truth a logical consequence of the co-dependent character of the field and its components. "Interpretation" is not undertaken from a position outside the field; it is an essential part of a field's emergence and of any state that its emergence might assume.

This matter is crucial to understand when we are reaching for an adequate formalizing process for textual events like poetry or other types of orderly but discontinuous phenomena. René Thom explains very clearly why topological models are preferable to linear ones in dynamic systems:

it must not be thought that a linear structure is necessary for storing or transmitting information (or, more precisely, significance); it is possible that a language, a semantic model, consisting of topological forms could have considerable advantages from the point of view of deduction, over the linear language that we use, although this idea is unfamiliar to us. Topological forms lend themselves to a much richer range of combinations … than the mere juxtaposition of two linear sequences. (Thom 1975: 145)

These comments distinctly recall Peirce's exploration of existential graphs as sites of logical thinking. But Thom's presentation of topological models does not conceive field spaces that are autopoietic, which seems to have been Peirce's view. Although Thom's approach generally eschews practical considerations in favor of theoretical clarity, his models assume that they will operate on data carried into the system from some external source. If Thom's "data" come into his studies in a theoretical form, then they have been theorized in traditional empirical terms. A topological model of a storm may therefore be taken as a description of the storm, as a prediction of its future behavior, or as both. But when a model's data are taken to arise co-dependently with all the other components of its system, a very different "result" ensues. Imagined as applied to textual autopoiesis, a topological approach carries itself past an analytic description or prediction over to a form of demonstration or enactment.

The view taken here is that no textual field can exist as such without "including" in itself the reading or measurement of the field, which specifies the field's dataset from within. The composition of a poem is the work's first reading, which in that event makes a call upon others. An extrinsic analysis designed to specify or locate a poetic field's self-reflexiveness commonly begins from the vantage of the rhetorical or the social dimension of the text, where the field's human agencies (efficient causes) are most apparent. The past century's fascination with structuralist approaches to cultural phenomena produced, as we know, a host of analytic procedures that chose to begin from a consideration of formal causation, and hence from either a linguistic or a semiotic vantage. Both procedures are analytic conventions based in empirical models.

Traditional textuality provides us with autopoietic models that have been engineered as effective analytic tools. The codex is the greatest and most famous of these. Our problem is imagining ways to recode them for digital space. To do that we have to conceive formal models for autopoietic processes that can be written as computer software programs.