Notes

Thanks are due to the Packard Humanities Institute for providing the text of Ovid in electronic form; to John Bradley for several minor repairs to TACT at critical moments; to Lidio Presutti for the Anagrams program; to the Computer Studies Programme at Trent University for the opportunity to give a talk on this subject prior to the Workshop; and to Russ Wooldridge for the encouragement and further opportunity provided by the TACT Workshop itself. (The research described here has developed considerably since the original CCHWP version of the paper was written. For more recent reports see the references on W. McCarty's homepage.) [Editorial note. The reader should keep in mind a few changes that have been made since this article was written to the names of TACT components. The term TACT is now used for the overall system of text retrieval programs, the two main components of which are MakeBase (formerly MakBas), which converts a text file into a textual database, and UseBase (formerly called TACT), which allows one to search the database. The Index display is now called KWIC; Rules are now known as Queries, Categories as Groups.]


"People are curious. A few people are. They will be driven to find things out, even trivial things. They will put things together. You see them going around with notebooks, scraping the dirt off gravestones, reading microfilm, just in the hope of seeing this trickle in time, making a connection, rescuing one thing from the rubbish. And they may get it wrong, after all."

Alice Munro, "Meneseteung", in Friend of My Youth (Harmondsworth: Penguin Books, 1991): 73.

1. Introduction

In the following paper I explore the analysis of a highly complex text with TACT, emphasizing its extendible markup scheme and means for defining groups of related words and expressions. My text is the first six books of Ovid's Metamorphoses. Using it, I show how TACT markup allows the scholar to describe a large number of complex literary structures as well as to record the locations of notable events for later recall. I also demonstrate how a thesaurus of verbal phenomena, such as images and themes, can be built with TACT "rules" and "categories". Finally, with other tools in TACT I suggest various additional ways of probing the variety of patterns on which a major work of literature is apt to depend.[*]

In what follows, I assume familiarity with TACT and use its terminology freely. A knowledge of Latin and of the Metamorphoses is not required, but will prove very useful.

Apart from showing how a literary critic uses TACT, my intention is to take stock of work in progress towards an electronic edition of the entire Metamorphoses. Thus a caveat: because I have not yet thoroughly checked either the encoding or the text itself, my results may in some cases be inaccurate; furthermore several of the ideas I illustrate have continued to evolve, introducing minor inconsistencies into some of the figures. I have two excuses for presenting my ideas so imperfectly at this early stage. First, they nevertheless indicate the kinds of questions that may usefully be asked through a medium like TACT at a time when the program itself is still in development. Second, reporting on them prior to publishing the edition invites others to influence it before its features are completely formed. I invite that influence wholeheartedly.

Beyond the more immediate aims of applying TACT to Ovidian criticism, I want also to raise the question of what a fully realized (as opposed to merely imitative) electronic edition of a literary work might be like. I will attempt a tentative answer by example and will conclude with some suggestions.

Finally, a note about the illustrative figures on which most of the remainder of this paper is a commentary. Although faithfully based on the data obtained from TACT, many of these figures have been heavily modified and extensively formatted. Such enhancements to the standard printouts of TACT displays have been necessary in order to translate a dynamic, interactive medium into a static form.

2. The Metamorphoses

In many ways, Ovid's Metamorphoses is ideal as a test case for the application of TACT to literary studies. The length of the text (approximately 12,000 lines of Latin hexameters in 15 books) is sufficient to make the whole of it difficult for most scholars to remember in any detail. Memory is further taxed by the intricate organization of its contiguous, interlocked, and often intercalated stories of widely varying length and indeterminate number.[3] Although its complexity has been the subject of much scholarly attention, there is no general agreement as to whether the work is, as Ovid declares, a perpetuum carmen ('continuous poem', Met. 1.4), and if it is, exactly what makes it so, and how.

What is the Metamorphoses? Simply, it is a compendium of stories that define a culture, in many ways the classical equivalent of the Bible, though without canonical authority. Its influence on subsequent ages, including our own, is enormous, perhaps more than any of us realise. Like the Bible, it begins at the creation of the world, ends in a kind of apocalypse, and in between charts movement through time and space from primeval beginnings to a city and story with lasting significance — however ironically and playfully intended. It sums up and shapes the lore of its culture, remaining close to the welter of conflicting tales available to the poet, from which he selected and, to a certain extent, which he apparently supplemented by invention. Significantly, and not unlike the authors of the Bible, Ovid has paid much less regard to smooth, logical transition between stories than to complex repetition of themes and images, even to extensive wordplay. As critics have noted, Ovid seems deliberately to spoof the whole idea of narrative transition — in order, I would argue, to point us elsewhere for the principle by which numerous stories become something like a single story — a perpetuum carmen.

What is that principle? The objectives of this paper exclude the possibility of an argument,[4] but let me offer a suggestion: that our investigations begin by gathering together related images and themes in order to discover what patterns they make. These patterns in turn allow us to see how various pieces of the literary puzzle fit together. Such collecting of patterns across the poem has been in progress for some time.[5] Nevertheless, many remain unexplored, perhaps even undiscovered; despite growing interest, the work towards a comprehensive view of the poem remains far from complete because it demands so much of us.

Before computers, the explorer's basic instrument was the concordance. As an evolutionary development of the concordance generator, TACT gives us several ways to circumvent the limitations of a concordance or other device for simple word searches. Chiefly, it gives us perfect recall of multiple kinds of detail and the ability to experiment with various models of organization. Its abstract powers are only the beginning, however. Like the ancient suppliant, praying to god do ut des ('I give that you may give'), the modern user of TACT, and even more the editor of an electronic edition, must know that great benefits are obtained only in return for a considerable investment of time and effort. TACT is a means of getting in deeper, for longer, potentially with better results. It is not a labour-saving device.

3. Markup

We begin with markup, the metatextual commentary imposed on a text but always distinguishable from it. Broadly speaking, this markup allows the user selectively to reduce a complex work to its major elements, to view each of them in isolation from the others.

The reader will recall that TACT markup consists of tags placed at specific locations within the text, and that these tags may be of any number and may signify as wide a range of textual phenomena as the editor wishes to identify. Essentially tags record the editor's understanding of the text, enabling certain kinds of questions to be asked. In general the more complex the text, the more extensive the markup must be for TACT to yield useful results. Wisdom in, wisdom out. For purposes of this paper, tags have two forms:

"<name value>", in which "name" is a parameter signifying a type of entity within the text, and "value" the current value of that parameter, e.g. "<loc Rome>";

"{value}", in which the curly braces themselves signify the parameter (here, a sentient being) and "value" the particular name of that being, e.g. "{=Apollo}".

Figure 1 lists the tags I have used for the Metamorphoses and two brief samples of marked-up text. Note that all except the "name" tag use the format of the first type shown above, the so-called COCOA format; when the text is viewed in TACT, these appear only as reference information, not as text. Only the "name" tags are considered as text for recall, but as I will explain, this text — more properly, metatext — is always distinguishable from the original.
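The behaviour of the two tag forms can be sketched in Python. This is only an illustration, not TACT's own implementation: the function name parse_markup and the sample line are assumptions, and only the tags "<book 3>", "<loc Rome>", and "{=Apollo}" are taken from the examples above. The sketch shows tags of the first form updating reference information, while name tags and ordinary words are kept as retrievable tokens.

```python
import re

# One alternative per token type: <name value>, {value}, or a plain word.
TOKEN = re.compile(r"<(\w+)\s+([^>]*)>|\{([^}]*)\}|([^\s<>{}]+)")

def parse_markup(marked_up):
    """Split marked-up text into tokens, each carrying the reference
    values (book, loc, ...) in force at that point."""
    refs = {}      # current value of each parameter of the first tag form
    tokens = []    # (kind, text, snapshot of refs)
    for match in TOKEN.finditer(marked_up):
        name, value, label, word = match.groups()
        if name is not None:
            refs[name] = value.strip()            # reference information only
        elif label is not None:
            tokens.append(("name", label, dict(refs)))
        else:
            tokens.append(("word", word, dict(refs)))
    return tokens

# Invented sample line, not a quotation from the marked-up text.
sample = "<book 3> <loc Rome> {=Apollo} Phoebus"
for kind, text, refs in parse_markup(sample):
    print(kind, text, refs)
```

Because every token carries a snapshot of the reference values, a later query can be restricted to a book or location without consulting the tags again.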

I will comment on all these tags briefly in the order in which they are given in figure 1.

3.1. Books

The division of the poem into the present 15 books is an undisputed convention and so needs to be marked. (Note the "<book 3>" tag in the example in fig. 1.) TACT automatically supplies the line numbers, so these do not have to be tagged.

3.2. Alternative divisions

Other ways of dividing up the poem are possible, however.[6] Several sections, e.g. the Cadmus Cycle and the Ovidian Aeneid, do not conform to book divisions exactly, but cohere nevertheless. In my initial scheme, I have tentatively partitioned the poem into twelve unequal divisions, denoted by the "div" tag as shown.[7] Some of these divisions are more contentious than others, but here the important point is that any scheme, or any number of schemes, may be represented by tags and easily modified by moving the tags. (I will return to this point later.) In figure 1 you will see the beginning of the Cadmus Cycle, for example, marked with "<div 3>"; this tag allows us to ask questions only of the Cadmus Cycle as well as to know quickly, through the reference field of a TACT display, when something occurs there.

3.3. Stories and story levels

For ease of reference, I have given each story in the Metamorphoses a name and recorded it with the "s" tag, which marks its beginning; its ending is denoted by a dummy value, the dash ("-").[8] There are several kinds of "s" tags ("s1", "s2", and so forth) to denote the various 'levels' on which stories may be said to occur.

In my initial experiments, I have broadly defined the notion of 'level' to mean any way in which one story is subsumed by another. For example, in figure 2 (a schematic representation of the Cadmus Cycle) I distinguish three levels, but make no attempt to discriminate amongst the kinds of transitions.[9] Further experiments will determine if such discrimination should be made, and indeed how many levels of storytelling it is productive to mark. Experience so far suggests that there is little benefit to identifying more than three or perhaps four levels, although in some places at least seven are distinguishable.[10]

Like the "div" tag, "s" allows us to see at a glance which story a given occurrence of something is in, and to ask a question restricted to a particular story or set of stories. In the experiments documented here, only the former option has been exercised.
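How story tags on several levels resolve into nested spans can be sketched as follows. The sketch is in Python and rests on assumptions not stated in the text: that a tag with the dash value sits on the first line after the story it closes, that a new value on a level implicitly closes any story still open on that level, and that the line numbers are purely illustrative. The function names story_spans and stories_at are hypothetical.

```python
def story_spans(events, last_line):
    """events: (line, level, value) tuples in text order.
    Returns (level, story, first_line, last_line) tuples."""
    open_stories = {}   # level -> (story name, first line)
    spans = []
    for line, level, value in events:
        if level in open_stories:                 # close whatever is open
            name, start = open_stories.pop(level)
            spans.append((level, name, start, line - 1))
        if value != "-":                          # the dash only closes
            open_stories[level] = (value, line)
    for level, (name, start) in open_stories.items():
        spans.append((level, name, start, last_line))
    return sorted(spans, key=lambda s: (s[2], s[0]))

def stories_at(spans, line):
    """All stories, on any level, that contain the given line."""
    return [(level, name) for level, name, first, last in spans
            if first <= line <= last]

# Illustrative events only; the line numbers are not taken from the edition.
events = [(1, "s1", "Cadmus"), (138, "s2", "Actaeon"),
          (253, "s2", "-"), (400, "s1", "-")]
spans = story_spans(events, last_line=733)
print(stories_at(spans, 200))
```

A lookup of this kind corresponds to seeing "at a glance" which stories enclose an occurrence; restricting a question to a story amounts to filtering occurrences by the same spans.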

3.4. Unifying presence

The "pr" tag indicates a kind of inclusion not covered by the "s" tag: namely, the 'presence' of a sentient being who by that presence can be said to unify a group of stories. Two such unifying presences are indicated in figure 2, Tiresias and Bacchus, who happen to overlap. Two kinds of "pr" are allowed, "pr1" and "pr2", to handle just such a case.

I have reached no conclusion yet about the utility of the "pr" tag, although clearly 'presence' is a kind of unifying device. Note that 'presence' need not be always or even frequently explicit in the original text; as with the Tiresian and Bacchic stories, an occasional assertion is sufficient. Theseus, for example, is present from Met. 7.404 to 9.100 (he overlaps with Hercules from 9.1 to 9.100), but for long stretches of the narrative he is quite invisible.

3.5. Location

All geographical locations are marked with the "loc" tag so that at some later stage I can chart Ovidian translatio — the spatial movement from primeval beginnings to the conclusion in Rome. Initially I have marked all locations in the same way, whatever function they have in the text; thus the place where action happens is not distinguished from a place to which reference is made, e.g. in an epithet such as Sidonius (lit. 'he from Sidon', for Cadmus). Experiments will reveal whether such distinctions are useful to have. In any case, the "loc" tag as currently defined has little immediate use in TACT. Rather "loc" is there for extraction by other software working directly on the text files; if place-of-action can be reasonably well defined, it may become useful within TACT.

3.6. Names

All names of sentient beings in the Metamorphoses and all direct references to them other than verb endings are explicitly tagged with TACT "label" markup. Thus proper names, pronouns, possessive adjectives, epithets, and substantives with clear reference to an individual (e.g. pater) are routinely tagged. Ontologically uncertain figures, such as terra or sol, are also tagged even when they do not appear as sentient. Consequently, with the tools provided by TACT the user can determine for any being the relative density of occurrence as well as chart his or her distribution across the poem.

Name tags are designed to capture the form and content of each occurrence as economically as possible. At a minimum a name tag consists of a single element, the grammatically normalized form of a proper name. Most name tags, however, are compound expressions consisting of two or more parts with separators indicating roughly the relationship between parts. A typical name tag will, for example, specify a standardized form in English spelling, the nominative singular (or plural) of the form that occurs in the text, and any relevant attributes or objects attached to the occurrence of the name. In addition, the names of other beings implied by the given occurrence are included.

My scheme is described by example in figure 3; see also figure 2 for examples in context and figure 4 for a sample name list.

Let us consider a few examples from figure 3. The first is the simplest case: any inflected form of Achilles is represented inside the curly braces of the name tag as "=Achilles". (The equals sign accomplishes several tasks: it suggests an identification of terms; it ensures that all such tags will be grouped together by TACT in the same part of the alphabet; and it makes them consistently distinguishable from the original text.[11]) In the second example, Peliades refers simultaneously to two individuals, Achilles and his father Peleus, and so generates two tags. The first, "=Achilles=Peliades", may be read 'Achilles occurring as Peliades'; "=Peleus-son/Achilles/", the second, 'Peleus' son, who is Achilles'. As a result the reference to father Peleus in the patronymic is alphabetized along with other references to him, the fact that it is a patronymic is not obscured, and for convenience the name of the son is given. In all cases, the dash (as between "Peleus" and "son") indicates the subordination of a related object or attribute; the slashes always enclose the name of a being to whom reference is made. Thus when Mercury is referred to as natus Iove, 'son of Jupiter', the resulting tags are "=Mercury=natus_Iove" and "=Jupiter-natus/Mercury/".

As illustrated above, English words like "son" or "daughter" are used when the Latin provides no equivalent; otherwise, as in phrases like natus Iove, the tag uses normalized forms of the original words. Because in TACT, as well as in most other software, spaces are taken to separate rather than join words, the underscore replaces the space wherever a connected phrase is required, as in the tag "=Ajax_clipei_dominus_septemplicis", 'Ajax, lord of the sevenfold shield', and other extended epithets.

Names connecting a being with a place (such as Thermodontiaca for Penthesilea, Queen of the Amazons) generate location tags as well as name tags. As in this example from figure 3, a being likely to be sought under the name of a group is also tagged with the name of that group.

In most instances, however, beings who appear under different names are given one primary name, more or less arbitrarily. Thus Apollo, Phoebus, and Sol are all given tags that begin with "=Apollo"; Minerva and Pallas with "=Minerva"; and so forth. Consistent with the scheme already described, the remainder of the tag preserves whatever name is actually used in the text, e.g. "=Apollo=Phoebus". Moreover, when for example Phoebus is clearly the current name, tags generated by pronouns, possessive adjectives, and substantives preserve this fact, as in "=Apollo=Phoebus=ille".
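The separators of the name-tag scheme lend themselves to mechanical decomposition. The Python sketch below is an assumption-laden illustration rather than part of TACT: the function name parse_name_tag and the field names are hypothetical, and it handles only the simple shapes quoted above (a primary name, optional forms joined by the equals sign, an optional attribute after a dash, and an optional referent between slashes). Extended epithets such as the Ajax example are not covered.

```python
import re

def parse_name_tag(tag):
    """Decompose a name tag such as '=Peleus-son/Achilles/'."""
    body = tag.strip("{}")
    if not body.startswith("="):
        raise ValueError("name tags are expected to begin with '='")
    referent = None
    match = re.search(r"/([^/]*)/$", body)        # trailing /Name/
    if match:
        referent = match.group(1)
        body = body[:match.start()]
    parts = body[1:].split("=")
    primary, _, attribute = parts[0].partition("-")
    forms = [p.replace("_", " ") for p in parts[1:]]   # underscores join phrases
    return {"primary": primary, "forms": forms,
            "attribute": attribute or None, "referent": referent}

for tag in ["=Achilles=Peliades", "=Peleus-son/Achilles/",
            "=Mercury=natus_Iove", "=Apollo=Phoebus=ille"]:
    print(tag, parse_name_tag(tag))
```

Grouping on the "primary" field gathers all occurrences of a being, while the remaining fields keep the form actually used in the text.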

4. Character studies

Once names are tagged as above, characters may be studied in several different ways. Some of these are suggested by figure 4, figure 5 and figure 6.

4.1. Variant names and attributes

In figure 4 is a selected list of name tags, taken directly from a TACT word list, with the frequency of occurrence shown beside each form. Studying this list one can ask, for example, what attributes constitute a major as opposed to a minor character: not merely frequency of occurrence, although that is a factor, but also relationship to things and to people, especially parents and offspring; possession of variant names and titles; and appearance in a type or role, such as deus, rex, heros, vir, monstrum ('god', 'king', 'hero', 'man', 'monster'). Note that Medusa is particularly interesting in this regard: although minor as a hapless young woman (who in Ovid's version is raped, then punished for her sexual offense by being transformed), she becomes major by these criteria in her death.

Such a name list, used in TACT, can allow us to ask which attributes and variant names appear in which contexts. To pick a simple example, is Jupiter's obsessively watchful consort portrayed differently when she appears as Saturnia, formidable daughter of Saturn, instead of Iuno or coniunx Iuppiter? This remains to be seen, and with TACT it easily can be.

One can also examine the use of personal pronouns, asking for example who is empowered to say ego, who to address a god or goddess as tu.

In sum, from the perspective of these tagged names, the stories are radically simplified to the characters who appear in them. Certain aspects of the stories can be more quickly, if not more clearly, seen that way. Looking at Cadmus in figure 4, for example, we notice immediately his dynastic and domestic character — well attested throughout classical literature, but not so obvious from the Metamorphoses directly, perhaps.

For some kinds of questions, we need to simplify the name list by reducing the variants of one or more names to a few major groups or even to one. We might, for example, want to see how occurrences of Diana are distributed across the poem, however she is named, or (recalling the example just given) how Juno as Saturnia compares with her when she is called by another name. TACT allows any combination of variants to be selected ad hoc; by creating a TACT "category", the user preserves such combinations and simplifications of the name list for reuse.

In the same way related figures can be combined and studied together, e.g. all those who are said to be audacious or contemptuous of the gods, like Pentheus.

4.2. Frequency

In figure 5 the variety of names has been radically simplified so that we can observe how often various characters appear. Such a list is exceedingly sensitive to consistency of markup, which in my initial experiments is not reliable, so again the results here cannot be trusted. The reader should nevertheless be able to see the usefulness of the tool. If in fact Procne, Philomela, and Tereus are named with anything close to the relative frequency shown in figure 5, then the interpreter has a very interesting phenomenon to account for.
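The simplification behind a list like figure 5 can be sketched as a collapse of every variant tag onto its primary name, with frequencies summed. The Python below is a sketch under stated assumptions: the names primary_name and collapse are hypothetical, and the frequencies in the sample word list are invented for illustration and are not the figures reported in this paper.

```python
from collections import Counter

def primary_name(tag):
    """'=Apollo=Phoebus=ille' -> 'Apollo'; '=Peleus-son/Achilles/' -> 'Peleus'."""
    body = tag.lstrip("=")
    for separator in "=-/":
        body = body.split(separator, 1)[0]
    return body

def collapse(word_list):
    """Sum the frequencies of all variant tags under their primary name."""
    totals = Counter()
    for tag, count in word_list.items():
        totals[primary_name(tag)] += count
    return totals

# Invented frequencies, for illustration only.
word_list = {"=Apollo": 3, "=Apollo=Phoebus": 2, "=Apollo=Phoebus=ille": 1,
             "=Minerva=Pallas": 4}
print(collapse(word_list).most_common())
```

As the text cautions, totals of this kind are only as trustworthy as the consistency of the underlying markup.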

4.3. Distribution

With important exceptions, the influence or importance of a being in the Metamorphoses is proportional to distribution: minor mortals tend to be confined to single tales, major ones to sequences or cycles, and gods to larger segments. The TACT Distribution Display, then, allows us a convenient representation of influence in this sense.

An interpretively significant example begins with the observation that by about book 6 in the Metamorphoses the major gods have all but disappeared from the action. How accurate is this observation, however? Thorough tagging of names allows us not only to answer the question, but also to see if our observation conceals different rates or patterns of disappearance. Figure 6 shows distributions for a few combinations of major gods.

In the third graph of figure 6, combined occurrences of Jupiter, Juno, Apollo, and Diana are shown across the first four and into the fifth of the 'thematic divisions' (those alternative units designated by the "div" tag).[12] Apart from the first division, which is short and is concerned with primeval times, the pattern of disappearance seems clear, confirming the initial impression. When the distributions of individual gods are examined, however, hidden complexities come to light. These complexities raise questions about the nature of the gods in question. Furthermore, other gods show very different patterns in harmony with their natures. Note how in the fourth and fifth graphs of figure 6, for example, first Venus & Cupid, then Minerva are shown to be increasing in influence. This is no surprise: in the former instance, the gods of love represent an upsetting, if not revolutionary, force, as the poet repeatedly notes (see e.g. Met. 2.846 ff.); in the latter, Minerva is viri fautrix, 'the helper of man', and even when she isn't exactly helpful, she is involved far more with humans than with gods — and isn't clearly superior to humans except in power.

So, investigation by TACT confirms the reader's impression overall, but illuminates exceptions to the rule that are in accord with what we know otherwise.

Once the entire poem is available for analysis, results from such a question will be more comprehensive, and perhaps also more surprising. What is already surprising is that the first six books of the Metamorphoses, a relatively large chunk of highly complex poetry, seem such a limiting amount when viewed with TACT. Granted, overview is gained at the cost of detail, but detail can always be quickly recovered. Maps are no substitute for the actual terrain, but they put that terrain into perspective.

5. Images, Themes, Ideas

Knowing where images, themes, and ideas recur is often central to interpretation of the Metamorphoses. The student may want to know, for example, where the stretching out of arms (a very common gesture in the poem) precedes metamorphosis, and what the assembled contexts of this image say about its significance; in what stories the imagery of hunting coincides with that of sexual passion, and what the coincidences tell us about the nature and role of Diana; where the vocabulary of hunger and thirst, or of devouring and drinking, is associated with acts of violence; and so forth. The answers to such questions imply what I call a thesaurus of semantic categories, for which TACT provides a set of tools and procedures.

In this section I describe the specific steps I have taken towards constructing such a thesaurus for the Metamorphoses.

The first step is to define for each image, theme, or idea a specific group of words and phrases that evoke it: the theme of 'death' for example, by such words as bustum, 'tomb, grave'; caedes, 'murder, massacre'; corpus, 'body, corpse'; and so on. The second step is to test the group against the text, eliminating all irrelevant occurrences, e.g. of corpus that refer to a living rather than a dead body. The third step is to test the resulting TACT "category" against stories which are known to contain the image, theme, or idea in question, resolving discrepancies wherever they occur. (Discrepancies may result from a mistaken reading, but usually they indicate overlooked words or phrases; occasionally they may point to ideas evoked indirectly, through broader associations.[13]) This third step is recursive but not endless, and it certainly refines the editor's understanding of the text.
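The third step, testing a category against stories known to contain the theme, amounts to comparing two sets of story names. The Python sketch below is illustrative only: the function name discrepancies is hypothetical and the hit counts are invented, although the story names are those discussed in this paper.

```python
def discrepancies(expected_stories, hits_by_story):
    """Compare stories expected to show a theme with stories where the
    category actually retrieves something."""
    hit_stories = {story for story, n in hits_by_story.items() if n > 0}
    missing = sorted(set(expected_stories) - hit_stories)      # vocabulary overlooked?
    unexpected = sorted(hit_stories - set(expected_stories))   # reading mistaken?
    return missing, unexpected

# Invented counts, for illustration only.
expected = {"Actaeon", "Apollo and Daphne"}
hits = {"Apollo and Daphne": 12, "Actaeon": 0, "Pentheus": 3}
print(discrepancies(expected, hits))
```

Each name in the first list sends the editor back to the text to look for overlooked words or phrases; each name in the second invites a second look at the reading itself.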

However carefully built, such a thesaurus has an inherent tendency to binary categorization: for a given category, words and specific occurrences may be either included or excluded, whereas imaginative language depends on evoking (rather than defining) states between the two. Thus the final result of consulting an electronic thesaurus must be a return to the text rather than unqualified acceptance of external proof.

Figure 7 shows the present structure of my thesaurus for Ovid; items marked by an asterisk are relatively well developed, the remainder are merely lists of words awaiting systemization. The categories listed here reflect my interests as a critic, but they are not, I hope, idiosyncratic. They are intended to be sufficiently diverse so as to suggest a paradigm towards further work.

5.1. The Rule File

Figure 8 shows a representative sample from a TACT rule file, the first step in constructing the thesaurus. Words are grouped by broad semantic classes ('battle', 'blood', 'death', and so forth), within which there are subdivisions as required. Observe that each semantic grouping contains three-part entries with the following structure: (1) a headword of two or more segments, the last of which is normally a Latin word; (2) a TACT "rule" consisting of one or more lines that specify the inflected forms of this word in the pattern-description language available within TACT; and (3) one or more lines of commentary to remind the reader of the English equivalents of the word. In a certain number of cases an entry specifies more than one word ("Battle=general=violo", for example), but all words with different roots, and many with the same root, are specified separately to ensure maximum flexibility. Several major entries include a collection of phrases where these are required to capture an idea, e.g. "Battle=siege((phrases))".
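The three-part structure of an entry can be modelled in a few lines of Python. Note the assumptions: the pattern below is an ordinary Python regular expression standing in for a TACT rule, not TACT's pattern-description language; the class name Rule, the English gloss, and the sample word list are all supplied here for illustration. Only the headword "Blood=blush=rubesco" comes from the text.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    headword: str   # semantic class, subdivision, Latin word
    pattern: str    # Python regex standing in for a TACT rule
    gloss: str      # English reminder for the reader

    def matches(self, tokens):
        """Positions of the tokens that are inflected forms of the word."""
        compiled = re.compile(self.pattern, re.IGNORECASE)
        return [i for i, tok in enumerate(tokens) if compiled.fullmatch(tok)]

rubesco = Rule(
    headword="Blood=blush=rubesco",
    pattern=r"e?rub(?:esc\w*|u(?:i|it|erunt|ere))",   # rubesco and erubesco
    gloss="to grow red, to blush",
)

# Invented word list, not a quotation from the poem.
tokens = ["Aurora", "rubescebat", "ruber", "erubuit", "rubescere"]
print(rubesco.matches(tokens))
```

Keeping each root in its own entry, as the text recommends, means that entries can later be recombined freely without rewriting any pattern.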

5.2. Building Categories

In the second step of thesaurus building, the editor translates rules into TACT categories. First, for each entry in a semantic class (e.g. certo in the class 'battle', fig. 8), he or she calls up the rule, retrieves all occurrences thus defined, and saves them as a corresponding TACT category. Then, once all the categories have been constructed for the individual entries, they are combined to form a meta-category for their semantic class. In fact, any number of meta-categories may be defined on the basis of any mixture of individual categories, including other meta-categories. Thus, for example, questions may conveniently be asked about all 'battle' words less those for the gods of battle; all words involving both 'battle' and 'death'; all words involving 'love' solely as 'desire'; and so forth.
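If each category is thought of as a set of positions in the text, the meta-categories described here correspond to ordinary set operations. The Python sketch below makes that assumption explicit; the variable names and the positions are invented for illustration and say nothing about how TACT stores categories internally.

```python
# Each category is modelled as a set of word positions (invented values).
battle = {12, 40, 41, 97, 130}
battle_gods = {40, 130}          # 'battle' words naming the gods of battle
death = {41, 97, 210}

# All 'battle' words less those for the gods of battle: set difference.
battle_without_gods = battle - battle_gods

# All words involving both 'battle' and 'death': set intersection.
battle_and_death = battle & death

# A meta-category combining two semantic classes: set union.
violence = battle | death

print(sorted(battle_without_gods), sorted(battle_and_death), sorted(violence))
```

Since the result of each operation is itself a set, meta-categories can be built from other meta-categories without any special handling.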

In several cases the words comprising the individual categories are unambiguous — bellum, for example, always has something to do with warfare or strife — but in others this is not true, as I suggested earlier for corpus. Note in figure 8 that some of the entries, e.g. "Blood=blush=rubesco", contain the TACT operator "pos". This operator signifies that the editor wishes to examine each occurrence of the defined word or words to decide which are to be retained in the category and which eliminated. (As it happens, rubesco and erubesco can refer simply to the reddening of the dawn rather than the flushing of human skin, although sometimes there may be a connection between the two.) After occurrences have been sorted, those retained are saved as a TACT category.
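The occurrence-by-occurrence review can be pictured as a filter in which the editor's judgment is supplied from outside. In the Python sketch below, the function names kwic and review are hypothetical, the word list is invented, and the decision function is a crude stand-in for a human decision; it is not a description of how the "pos" operator is implemented.

```python
def kwic(tokens, position, width=3):
    """Left context, keyword, and right context for one occurrence."""
    left = tokens[max(0, position - width):position]
    right = tokens[position + 1:position + 1 + width]
    return " ".join(left), tokens[position], " ".join(right)

def review(tokens, candidates, decide):
    """Keep only the occurrences that the decision function accepts."""
    return [p for p in candidates if decide(*kwic(tokens, p))]

# Invented word list, not a quotation from the poem.
tokens = ["Aurora", "rubescebat", "caelo", "virgo", "erubuit", "pudore"]
candidates = [1, 4]   # positions retrieved by a rule for rubesco and erubesco

def not_the_dawn(left, keyword, right):
    # Stand-in for the editor: reject reddening attributed to the dawn.
    return "Aurora" not in left

print(review(tokens, candidates, not_the_dawn))
```

In practice the decision is made by a reader looking at each context, which is why the process resists automation.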

For a large, complex work such as the Metamorphoses, many hundreds, perhaps even thousands of occurrences must be individually inspected. Doing so provides an excellent though demanding way of studying the work at first hand. The editor's desire to automate this process may be balanced by the suspicion that he or she is likely to get more benefit from constructing the thesaurus than most of its users are likely to realise from consulting it. Of course we hope that this suspicion turns out to be unfounded.

5.3. Distributions

The third step in constructing the thesaurus, testing and refining categories, is not illustrated directly here, but will be mentioned in passing as we consider several TACT distribution graphs (fig. 9, fig. 10, fig. 11, fig. 12, fig. 13, fig. 14). These graphs are based on two relatively well-defined semantic groupings, one for 'battle' and the other for 'love'. My chief purpose here is to show how such graphs can be applied to study the patterning and interrelation of ideas.

Two quite different kinds of graphs are illustrated. Figure 9, figure 11, figure 13, and figure 14 show distributions by percentage of the text (here books 1 through 6, each 2% representing about 100 lines of the poem); figure 10 and figure 12 show distributions according to the individual stories, regardless of length. The former is preferable for seeing the rising and falling rhythms of violence and eros, the latter for determining which stories are notable for these qualities. A quick glance at figure 9, for example, shows three major areas of violence as defined by the 'battle' category: at the foundation of the world; at the foundation of the first major city by Cadmus, the inaugurator of the heroic age; and around the appearance of the first non-domestic hero, Perseus. I use the word rhythm advisedly, as the graph invites us to ask if the peaks of violence are both anticipated by what comes before and echoed in what follows them.
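The difference between the two kinds of graph is a difference in how occurrences are binned. The Python sketch below is an illustration under assumptions: the function names are hypothetical, the text length of 5,000 lines is a round figure chosen so that each 2% band covers 100 lines, and the positions and story spans are invented.

```python
from collections import Counter

def by_percentage(lines, total_lines, step=2):
    """Count occurrences in successive bands of `step` percent of the text."""
    bands = 100 // step
    counts = [0] * bands
    for line in lines:                               # line numbers start at 1
        counts[(line - 1) * bands // total_lines] += 1
    return counts

def by_story(lines, spans):
    """Count occurrences per story, regardless of each story's length."""
    counts = Counter()
    for line in lines:
        for name, first, last in spans:
            if first <= line <= last:
                counts[name] += 1
    return counts

# Invented positions and story spans, for illustration only.
hits = [1, 100, 101, 5000]
spans = [("story A", 1, 200), ("story B", 201, 5000)]
print(by_percentage(hits, total_lines=5000)[:3])
print(by_story(hits, spans))
```

Binning by percentage preserves the rhythm of the text as it is read; binning by story trades that rhythm for sharper distinctions between tales.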

Figure 11 shows a peak of erotic activity in the regeneration that follows the foundation of the world, and interestingly one that occurs between the violence of Cadmus and that of Perseus, in the stories about the downfall of Cadmus' house. What makes the latter peak especially interesting is the structural fact that these stories also contain, and in some cases conceal, great violence. Amongst these stories are those told by the daughters of Minyas (from the tale of Pyramus and Thisbe to that of Hermaphroditus and Salmacis), which echo the same structural model: while the frenzy of Bacchus rages all around them, they sit primly inside telling erotic tales in which destructive frenzy seethes just below the surface, then violently erupts.

In the graphs of distribution by story (fig. 10 and fig. 12), distinctions are much sharper. At least in the initial stage of research, one is likely to be surprised by the absence of results for certain stories more than by the peaks, unless of course the poem is unfamiliar — as it would be to an undergraduate student using the edition as a study aid. In figure 10, for example, the story of Tereus and Philomela shows less violence than one would expect, especially at the beginning, although the reader's impression is likely to be of the potential for terrible violence seething just below the surface (cf. fig. 14). Similarly, in figure 12, the erotically charged tale of Actaeon hardly shows up as erotic at all. Clearly, either the thesaurus is missing some relevant vocabulary or the imagery in question is implied somehow rather than expressed. Again, the next step is to return to the text for closer examination of the vocabulary, then modify the TACT thesaurus as required, and try again.

Whether building a thesaurus or just consulting one, the user's impressions and understanding of the text are put to the test. Any attentive reading is also a test of understanding, of course, but the computer — again, a perceptual agent — presents a new perspective. This perspective raises in a new and interesting way the question of where the reader's impressions come from. For a text of any complexity and imaginative appeal, the answer is not likely to be easy.

5.4. Co-occurrence: 'sex' and 'death' in the Metamorphoses

In figure 13 and figure 14, we can observe the overlap or 'co-occurrence' of eros and violence, first by using the proximity operator in TACT, then by superimposing two ordinary distribution graphs.

In figure 13 the significant co-occurrence is of the two semantic groupings we have already examined, 'battle' and 'love', as you can see by comparing the second graph in the top row against the last one at the bottom right. (The co-occurrences of 'love' with the other semantic groupings, introduced in fig. 13 for the first time, hardly make a difference to the overall result.) In the latter graph, I have substituted letters of the alphabet for asterisks so as to show in the important cases which story is involved. The most prominent area of the graph where sex and death overlap marks the stories told, again, by the daughters of Minyas. The other two areas point to the tale of Apollo and Daphne (in which an initial argument breaks out between Apollo and Cupid as to whose arrows are more effective), and the story of Proserpina's rape by the infernus raptor himself, including his incidental violation of Cyane. (Proserpina's undoing is one of the classical equivalents of the biblical Fall of Man, the archetypal event that forever intermingles sex with death.[14]) The proximity operator has serious limitations, however, since in effect it requires that words or groups of words be relatively close to each other.[15] Yet a story involving the interrelation of two themes may manifest them in words that are quite far apart. In figure 14 I circumvent the difficulty by overprinting the distribution of words in the 'love' category (indicated by horizontal brackets) with those for 'battle' (asterisks). Here a much higher degree of overlap is visible. Note, for example, the story of Pentheus, which quite properly is shown in figure 14 to be very much about the collision of eros and violence.
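The dependence of proximity searching on the size of the window can be illustrated with a minimal Python sketch. It is not TACT's implementation: the function name cooccurrences, the one-word stand-ins for the 'love' and 'battle' groups, and the toy lines are all assumptions made for the example, and the tokenizer assumes plain unaccented text.

```python
import re
from bisect import bisect_left, bisect_right

def cooccurrences(lines, group_a, group_b, max_lines=5):
    """Return (line_a, line_b) pairs in which a word from group_a and a word
    from group_b occur no more than max_lines lines apart."""
    hits_a, hits_b = [], []
    for number, line in enumerate(lines, start=1):
        words = set(re.findall(r"[a-z]+", line.lower()))
        if words & group_a:
            hits_a.append(number)
        if words & group_b:
            hits_b.append(number)
    pairs = []
    for a in hits_a:
        # hits_b is already sorted, so the window can be cut out by bisection.
        low = bisect_left(hits_b, a - max_lines)
        high = bisect_right(hits_b, a + max_lines)
        pairs.extend((a, b) for b in hits_b[low:high])
    return pairs

# Toy lines (not quotations): one 'love' word, two 'battle' words far apart.
toy_lines = ["amor", "nihil", "caedes"] + ["nihil"] * 6 + ["vulnus"]
love, battle = {"amor"}, {"caedes", "vulnus"}
print(cooccurrences(toy_lines, love, battle, max_lines=2))
print(cooccurrences(toy_lines, love, battle, max_lines=9))
```

With the narrow window only the nearer 'battle' word is paired with amor; the more distant one appears only when the window is widened, which is the situation that superimposing two distribution graphs is meant to address.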

6. Phrases

Known phrases appear in the thesaurus, as you can see from figure 8, where they are used to identify ideas that cannot be attached to a single word without causing many irrelevant occurrences to be selected (e.g. siege warfare). More importantly, however, TACT provides tools for discovering repetitions of phrases not already identified. These repetitions provide clues for interconnections between separate stories.

Repetitions can be of two kinds: exact and inexact.

6.1. Exact phrasal repetition

Because Latin is highly inflected, exact repetition is much less common than it is in, say, modern orthographic English.[16] The phrasal collocation tool CollGen (a separate program that works with the TACT database) therefore misses many repetitions because of inflectional variations and differences in word order. This means, however, that the phrasal collocations CollGen does find are, or may be, particularly significant.
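What exact phrasal repetition amounts to, and why inflection defeats it, can be shown in a few lines of Python. This is a sketch under stated assumptions, not a description of how CollGen works: the function name repeated_phrases is hypothetical, phrases are not allowed to cross line boundaries, and the toy lines are invented around the one phrase taken from the text, me copia fecit.

```python
import re
from collections import defaultdict

def repeated_phrases(lines, length=3):
    """Map each sequence of `length` consecutive words to the line numbers
    where it occurs, keeping only sequences found more than once."""
    positions = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        words = re.findall(r"[a-z]+", line.lower())
        for i in range(len(words) - length + 1):
            positions[tuple(words[i:i + length])].append(number)
    return {phrase: where for phrase, where in positions.items() if len(where) > 1}

# Invented lines, not quotations; only 'me copia fecit' comes from the poem.
toy_lines = [
    "alpha me copia fecit",
    "beta gamma delta",
    "epsilon me copia fecit",
    "zeta me copia fecerat",   # an inflectional variant, so no exact match
]
print(repeated_phrases(toy_lines))
```

The fourth toy line differs from the others only in the ending of the verb, yet it goes undetected, which is the kind of loss described above.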

Figure 15 shows fragments of a CollGen listing. I have marked for consideration four collocations. The text for these is shown in figure 16, where passages are grouped accordingly. The passage at the top of figure 16 is given particular emphasis to suggest how repetitions commonly work: Echo, unable to originate speech herself, merely repeats what others say, but in Ovid's story her repetitions are selectively partial and, in fact, reveal the erotic undercurrent in Narcissus' speech.

As an example of how Ovid uses repetitions to connect stories, consider the link made by me copia fecit ('abundance makes me...') between Narcissus in Met. 3.466 and Niobe in 6.194. Narcissus, looking at himself reflected in a clear pool, has an abundance of what he has always secretly wanted, a beautiful lover, but he cannot possess him because he is him; abundance therefore makes him poor. Niobe in contrast declares that her abundance (in the form of sons and daughters) makes her rich; she boasts inordinately about it, slandering the goddess Latona and so provoking her children, Apollo and Diana, to slaughter all Niobe's offspring. Both stories revolve around the paradox that having is not having, especially when the 'possessed' object is a sentient being. From the perspective of Niobe's story, we might say that Narcissus' reflection is his progeny; from Narcissus', that Niobe's children are to her only reflections of herself, therefore already shades in an underworld. Following this and other clues, we can begin to resolve two very distinct stories into a bridging meta-story.

The reader who knows the Metamorphoses is encouraged to work out the other examples in a similar way.

6.2. Inexact phrasal repetition

A very simple tool, the TACT Index display, can be quite effective for catching inexact repetitions and, in general, for becoming acquainted with the immediate verbal neighborhood of a given word. Again because of the nature of Latin, the span of context may have to be considerably greater than is usual for an English text.
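A display of this kind is simple enough to sketch in Python. The following is an illustration only, not the TACT Index display itself: the function name kwic and the parameter context (the number of lines shown on either side of a hit) are assumptions, as are the toy lines.

```python
import re

def kwic(lines, forms, context=1):
    """Return (line_number, passage) for each line containing one of `forms`,
    where the passage includes `context` lines before and after the hit."""
    hits = []
    for i, line in enumerate(lines):
        if set(re.findall(r"[a-z]+", line.lower())) & forms:
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            hits.append((i + 1, " / ".join(lines[start:end])))
    return hits

# Invented lines, not quotations from the poem.
toy_lines = ["candor alpha", "beta rubor", "gamma delta", "epsilon"]
for number, passage in kwic(toy_lines, {"rubor", "rubeo", "erubesco"}, context=1):
    print(number, passage)
```

Raising context widens the verbal neighborhood that can be inspected at a glance, at the cost of a longer listing.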

In figure 17 two word groups are shown, rubor / rubeo / erubesco ('red, become red, flush, blush') and sanguis / exsanguis ('blood, bloodless'). In the former case, we can see at a glance three associations with redness and flushing of the skin: dazzling white (candor), beating of the breast (pectora...percussa), and ignorance (nescio). All three happen to be highly significant in context, as critics have noted.

Inspecting the contexts of sanguis alerts us, for example, to the range of meanings taken on by the idea of familial relation, which in Latin as in English uses the metaphor of common blood. Thus in the line marked as example 1, genitum...sanguine means simply 'related by blood' and refers to Apollo's paternity of Phaethon. In example 3, however, matris de sanguine natos, 'born from the blood of their mother', refers to the genesis of the supernatural horses Pegasus and Chrysaor when Perseus decapitates Medusa. In example 4, Pyramus, about to commit suicide, invites the earth, accipe nunc...nostri...sanguinis haustus, 'accept now the drink of my blood', an ancient image of the blood-thirstiness of maternal nature. In example 5, after Actaeon has been transformed into a stag by Diana, whom he caught bathing, he is torn apart by his own dogs, who are satiatae sanguine erili, 'sated with the blood of their master'. Truly, as Pythagoras reminds us at the end of the poem, Iron-Age humans are e sanguine nati, 'sons of blood' (example 6) in all senses; and thus we begin to see how the image of blood draws the poem together.

6.3. Recurrent verbal context

The "span" operator in TACT allows us to get a list of words commonly found within the vicinity of a given word. In figure 18 the word defining the context is again sanguis. I have lemmatized the results obtained from "span", then sorted them in reverse order of frequency. Many of the highest frequency words in proximity to sanguis are to be expected: os ('mouth, face'), corpus ('body'), mater ('mother'), caedes ('gore, slaughter'), and so forth. Some, however, are interesting, e.g. penna / pluma / volucer ('feather', 'down', 'bird'), which point to the image of the bloodied bird fluttering anxiously at the feet of a predator, commonly in Ovid a simile for rape. We wonder in this context about the collocations of 'Jupiter' with sanguis. The occurrences of the word lumen ('light, eyes') also raise an obvious question.
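The kind of list shown in figure 18 can be approximated by the following Python sketch. It is offered as an illustration, not as the "span" operator itself: the function name collocates is hypothetical, the span is measured in words on either side, no lemmatization is attempted, and the toy text is an invented string of words, not a quotation.

```python
import re
from collections import Counter

def collocates(text, node_forms, span=5):
    """Count the words found within `span` words of any form in node_forms,
    returned as (word, count) pairs, most frequent first."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, word in enumerate(words):
        if word in node_forms:
            window = words[max(0, i - span):i] + words[i + 1:i + 1 + span]
            # The node forms themselves are left out of the tally.
            counts.update(w for w in window if w not in node_forms)
    return counts.most_common()

toy_text = "penna sanguine pluma corpus sanguis penna"
print(collocates(toy_text, {"sanguis", "sanguine"}, span=1))
```

Because each occurrence of the node word contributes its own window, a word lying between two nearby occurrences is counted twice; whether that is desirable depends on the question being asked.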

7. Sound and wordplay

Amongst the literary phenomena hardest for humans to detect consistently over a large amount of text are the least rational and perhaps most common of poetic devices: sound and wordplay. Few of us can remain undistracted by the sense of the text long enough to pay attention to the subliminal messages, and even if we do, how do we represent them?

In the early 18th century Alexander Pope noted that sound can be an echo to the sense, but it can also undermine apparent sense. In figure 19, as an exercise for the reader, I give three passages with significant repetition of sound. The second and third have quite similar incantatory passages (one about the labyrinth, the other about Circe's magic spell) in which sound indeed echoes sense. In the first passage the infatuated god's obsession is represented as much in the repetitions as in the surface meaning, or more.

7.1. 'Similar' words

In figure 20 I have used the TACT "simil" operator, which is based on a pattern-recognition algorithm,[17] to calculate selected degrees of similarity between each of the two words in the phrase imagine vocis (lit. 'by the image of [her] voice') and the passage in which they are found. By tying the results to size of type in the printout, I attempt to suggest visually how the echoing of the targeted phrase throughout the surrounding text might be determined.

Note, however, that the "simil" algorithm is limited in what it can detect: it is based solely on repetitions of individual letters, and considers all letters the same, taking no account of differences between vowels and consonants or within groups of related sounds (dental, labial, palatal, etc.). Furthermore, we have no way to measure the density of repetition it does detect against what is common for the poem, author, genre, or language in question. It is therefore difficult to know how to interpret the results. The difficulties are not in principle insuperable, however.
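Both the measure and its limitations can be tried out with Python's standard difflib module, whose SequenceMatcher is documented as a close relative of Ratcliff and Obershelp's 'gestalt pattern matching'. It is used here as a stand-in, on the assumption that it behaves similarly to "simil"; it is not the TACT code, and the function name simil below is merely borrowed for the example.

```python
from difflib import SequenceMatcher

def simil(a, b):
    """Similarity of two words between 0.0 and 1.0: twice the number of
    matching characters divided by the combined length of both words."""
    return SequenceMatcher(None, a.lower(), b.lower(), autojunk=False).ratio()

def similar_words(target, words, threshold=0.6):
    """Return (word, score) for each word at least `threshold` similar to target."""
    scored = ((word, simil(target, word)) for word in words)
    return [(word, round(score, 2)) for word, score in scored if score >= threshold]

print(similar_words("imagine", ["imago", "imagine", "vocis"]))
print(simil("t", "d"))   # two dentals, yet treated as wholly different letters
```

The pair imagine / imago shares the four letters imag out of twelve in all, giving 8/12 or about 0.67. The second call scores zero because the measure has no notion of related sounds, the limitation noted above.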

7.2. Anagrams

Another attempt to detect verbal repetition is represented partially in figure 21 and figure 22. It begins with a recent and still unofficial program in the TACT system, an anagram-finder appropriately called Anagrams. Running the program on the Metamorphoses yields over 1400 exact anagrams, but only a few of these are close enough to each other to form significant pairs.[18]

Amongst these are the pairs et/te ('and / [to,for] you') and ut/tu ('so that / you') — seemingly a trivial case. Assuming the contrary, however, I used the proximity operator in TACT to locate all such pairs in which the words were five or fewer lines apart, then charted their distribution across books 1 to 6. As figure 21 shows, an unusually high concentration of the pairs occurs in the story of Phaethon. In figure 22 et, te, ut, tu and the semantically related forms tui and tibi are highlighted as they occur in a short passage from that story together with the corresponding letter sequences within other words of the text. The distinction between these sequences as independent words and as parts of other words is shown by underlining.
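The two steps just described, finding exact anagrams and then keeping only pairs that lie close together, can be sketched in Python as follows. This is a hypothetical reconstruction for illustration, not the Anagrams program or the TACT proximity operator; the function names and the toy lines are assumptions.

```python
import re
from collections import defaultdict

def anagram_sets(text):
    """Group the distinct words of a text that are exact anagrams of each other."""
    by_letters = defaultdict(set)
    for word in set(re.findall(r"[a-z]+", text.lower())):
        by_letters["".join(sorted(word))].add(word)
    return sorted(sorted(group) for group in by_letters.values() if len(group) > 1)

def nearby_pairs(lines, first, second, max_lines=5):
    """Return (i, j) where `first` is on line i and `second` on line j,
    the two lines being no more than max_lines apart."""
    where = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        for word in set(re.findall(r"[a-z]+", line.lower())):
            where[word].append(number)
    return [(i, j) for i in where[first] for j in where[second]
            if abs(i - j) <= max_lines]

# Invented lines, not quotations from the poem.
toy_lines = ["et alpha", "beta", "te gamma"] + ["delta"] * 6 + ["te ut tu"]
print(anagram_sets(" ".join(toy_lines)))
print(nearby_pairs(toy_lines, "et", "te", max_lines=5))
```

Counting the pairs returned by nearby_pairs for each story or book would give the raw figures for a distribution chart of the kind shown in figure 21.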

The possible significance of the results, which suggest a kind of accusatory murmur, may be indicated by the action of the story. In it Phoebus, the sun-god, attempts to convince Phaethon not to drive the sun-chariot — a privilege Phoebus had foolishly granted him as proof of paternity. Throughout a large part of the story, including the quoted passage, Phoebus describes the horrors Phaethon will encounter, with the question (both expressed and implied), 'And what will you do about this?' Thus an accusatory murmur in the subtext serves the narrative very well. Since we are concerned with a subliminal message, we are justified in considering mere letter sequences as well as the corresponding words.

Again, however, we do not know precisely how unusual the repetition is.

7.3. Unspecified repetitions

CollGen and Anagrams represent a step towards a more general pattern-recognition device in that they find correspondences without our having to specify a pattern or exemplar beforehand. Likewise, we can imagine how the "simil" or other such operator might compute the density of repetition, however that is defined.[19] Once we had an accurate notion of the density of repetition, we could then easily locate unusual passages within a text, or unusual texts within a tradition, and study the relationship between sound and sense more precisely than has been possible before.
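The elaboration of "simil" proposed in note 19 might be sketched in Python as below. This is one possible reading of that proposal, not an existing TACT facility: the function name repetition_density is hypothetical, difflib's SequenceMatcher stands in for "simil", the span is measured in words, and the average is left unweighted although the note suggests that weighting by distance might be preferable.

```python
from difflib import SequenceMatcher

def repetition_density(words, span=3):
    """For each word, average its similarity to the words within `span`
    positions on either side; the result has one value per position."""
    density = []
    for i, word in enumerate(words):
        neighbours = words[max(0, i - span):i] + words[i + 1:i + 1 + span]
        if not neighbours:
            density.append(0.0)
            continue
        scores = [SequenceMatcher(None, word, other, autojunk=False).ratio()
                  for other in neighbours]
        density.append(sum(scores) / len(scores))
    return density

# An invented sequence of words, not a quotation.
toy_words = ["tibi", "tui", "tu", "corpus"]
for word, value in zip(toy_words, repetition_density(toy_words, span=2)):
    print(word, round(value, 2))
```

Averaging these values over a line, a story, or a whole book would give the baseline against which an unusual passage could be recognized, which is the comparison the argument above calls for.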

The literary problem of what we mean by 'repetition' remains in the shadows. The possibility of defining it exactly and obtaining useful results on a major scale could, however, be a mighty stimulus to research.

8. Conclusions

In the early stages of a new technology, people tend to think that its purpose is merely to replace and improve on something they already know. The promise of the new is thought to be quantitative: the new thing will do the old job faster, more efficiently, and more cheaply; it will increase 'productivity'; and so forth. Tools, however, are perceptual agents. A new tool is not just a bigger lever and more secure fulcrum, rather a new way of conceptualizing the world, e.g. as something that can be levered. Thus the final question here, as I promised at the beginning, is what a fully realized — as opposed to merely imitative — electronic edition of a literary work such as the Metamorphoses would be like.

Certainly an electronic edition of the kind whose progress I document is not for reading, like a book on paper, rather for reference. Experience with such a reference tool shows immediately how silly is the technophobe's fear of a resulting decline in reading. As I suggested earlier, the old saw about computers — 'garbage in, garbage out' — has a more positive and inspiring inverse that applies to the user (as well as the editor) of an electronic edition: 'wisdom in, wisdom out'. Putting the wisdom in means more, not less, study of the printed page.

The mutability of an electronic edition, which we frequently see only as a problem, is certainly one of the most significant facts to be considered. There can be no doubt that the untouchable stability of a conventional edition is a great virtue: it makes the edition a necessary foundation for scholarly progress, liberating those who follow by allowing them to turn their attention elsewhere. At the same time, however, its immutability, enshrining judgments, discourages further experimentation. Thus, perhaps, the very failing of its electronic counterpart opens the door to great promise.

Because it is so easily changed, the electronic edition is inherently tentative and experimental, therefore paradigmatic in a new sense. If the editor does the job well, that is, the electronic metatext offers access not only to the evidence, but also to the editor's processes of reasoning. (It is thus something new, an intermediate form between the established text and the criticism it engenders.) A good electronic edition points the way to further experiments with the text, and it offers its user both the means with which to conduct them and some guiding examples. For this reason, I would argue that whenever possible the electronic edition should contain all marked-up text files as well as the compiled database.

The mutability we need protection from is the accidental, irresponsible, or malicious kind. Rather than protecting the text by sealing it within the hermetic container of some retrieval package or encrypted CD-ROM, what we require are reliable means of review, publication, and distribution. The problems entailed in making these means reliable are, however, outside the scope of this paper.


[2] As of 1 July 1996, the author will be a member of King's College, London; e-mail: willard.mccarty@kcl.ac.uk.

[*] Cf. J. Bradley, "TACT Design", CHWP, B.1; cf. the notice on availability of TACT.

[3] The number of stories is indeterminable in principle because the distinction between a story as such and a reference to a story is arbitrary. We cannot even be certain of recognizing references when they occur. When, for example, should we regard a place-name derived from an eponymous founder as a reference to that figure and to any story the author may have intended him to evoke? Since these intentions are unknowable, and since our knowledge of background material is incomplete, no definite answer seems possible.

[4] I make that argument in a forthcoming book, Narcissus and his Relations in the Metamorphoses of Ovid, towards the completion of which the electronic edition described here is an aid.

[5] E.g. by Charles P. Segal in Landscape in Ovid's Metamorphoses: A Study in the Transformations of a Literary Symbol, Hermes Einzelschriften, 23 (Wiesbaden: Franz Steiner, 1969), where a reasonably obvious pattern is explored.

[6] Amongst the many see, for example, those proposed by Brooks Otis, Ovid as an Epic Poet, 2nd ed. (Cambridge: Cambridge Univ. Press, 1970): 83-90 ff.; Walther Ludwig, Struktur und Einheit der Metamorphosen Ovids (Berlin: Walter de Gruyter, 1965).

[7] My division of the poem results in the same number of segments as Ludwig's, although the two schemes only occasionally coincide. Mine are as follows: (1) Beginnings: Met. 1.1-1.451; (2) Loves of the gods: 1.452-2.875; (3) Cadmus Cycle: 3.1-4.603; (4) Perseus: 4.604-6.145; (5) Niobe, Philomela & Procne, Medea: 6.146-7.403; (6) Theseus: 7.404-8.884; (7) Hercules: 9.1-9.797; (8) Orpheus: 10.1-11.66; (9) Prelude to war: 11.67-11.795; (10) Trojan War: 12.1-13.622; (11) Aeneid: 13.623-14.608; (12) Roman Apocalypse: 14.609-end.

[8] In more sophisticated markup schemes (such as SGML) all tags have both initial and final forms, e.g. for every "<tag>" there is a corresponding "</tag>" that marks the end of the segment. Tags in TACT, by contrast, mark only locations. Assigning a dummy value to a tag thus circumvents a limitation in the design of the program.

[9] The story of Cadmus, for example, occurs in two distinct segments; at the end of the first, the narrative simply breaks off to tell the stories of his progeny and others, then finally returns to describe Cadmus' fate. In the case of Pentheus, the Bacchic acolyte Acoetes actually tells his story within the embracing story of Pentheus.

[10] Strictly speaking, any reference to a story outside the one in progress, no matter how brief, signifies transition to and from another 'level', but there seems little point in marking this transition as such unless it is reasonably lengthy. I have not yet settled on a minimum length at which a mere reference or allusion may be said to become a story. Again, the distinction is arbitrary (see note 3, above).

[11] In the MakBas ".MAK" file, the equals sign has been declared the first letter of the alphabet. Thus when the user obtains a word list, the contents of all name tags occur together at the beginning of the list.

[12] At the time of these experiments, the database included only books 1-6, whereas the fifth division runs from 6.146 to 7.403. Thus, again, caveat utor.

[13] Erotic vocabulary makes a particularly interesting case. In The Latin Sexual Vocabulary, for example, J. N. Adams notes that "In a suggestive context almost any object or activity may be interpreted as a sexual image" (Baltimore: Johns Hopkins, 1982): 3. The question — indeed a difficult one to answer systematically — is how such contexts are established, what words carry the burden of suggestion.

[14] See Willard McCarty, "The Catabatic Structure of Satan's Quest", University of Toronto Quarterly 56 (1986/7): 286.

[15] The problem lies in the DOS limit of the standard 640K of memory: requests for co-occurrences of words very far apart (say, 50 lines before and after) exhaust the available memory and so cause the program to fail. It is hoped that the next full release of TACT, with improved memory management, will overcome the problem in many if not all cases.

[16] I have taken preliminary steps to lemmatize the Metamorphoses, but a significant amount of work must be done to resolve the multitude of ambiguities detected by the lemmatization software, kindly applied to the poem for me by Dr. Pieter Masereeuw (Amsterdam).

[17] This is the Ratcliff/Obershelp algorithm, described in John W. Ratcliff and David E. Metzener, "Pattern Matching: The Gestalt Approach", Dr. Dobb's Journal (July 1988): 46-51.

[18] Anagrams also is able to detect 'partial anagrams', i.e. words formed out of some of the letters of other words. The number of these is so great, however, that I have not yet been able to explore the verbal patterns they may reveal.

[19] For example, an elaboration of "simil", working on a TACT database, might take each word within a given span, calculate its similarity to each of the other words, averaging the results (perhaps weighting the average as a function of distance between words), and attach the resulting quantity to the position in the text. Then density could be displayed for any given unit of the text, e.g. line, stanza, narrative entity.