1. Introduction

The Orlando Project begins in a book too big for its bindings. In 1991, The Feminist Companion to Literature in English, a reference book on women writers, was published by some of us who are now part of the Orlando Project (Blain, Clements & Grundy 1991). Isobel Grundy, Patricia Clements and Virginia Blain started that work believing that its discoveries would easily fit within the covers of a chunky little reference book. They were wrong. During the course of their research, becoming increasingly aware of the extent and range of women's writing, they several times renegotiated the length of The Feminist Companion with its generously flexible publisher. The hard work of condensation and the publisher's flexibility notwithstanding, however, when the book was published it had no index. They'd had to leave it out. The index to the Companion -- which offered readers a few basic pathways into the several inches of densely packed information in the book -- was a hundred A4 pages in typescript. Printing it would have broken the bindings. Literally. Had the book been bound at that length, it would have fallen apart. For scholarly users of the Companion, this means that there are only two ways of accessing the book's information: the alphabetical order of biographical entries, and the rough chronological groupings of writers. The image of the bursting book links with another -- that of Feminist Companion filing cabinets -- on three continents -- also full of information about women's writing in English, not published because there wasn't room for it in the book.

Those two images have a lot to do with the genesis and the character of The Orlando Project. This research tool won't burst the bindings, and it won't leave relevant research in the filing cabinets. And this one will offer its readers -- or end-users, as we also call them -- many different ways of accessing the information it contains. In what follows, we will outline the major principles on which the project is built, provide some examples of the work we are doing, and consider some of the wider implications of the kind of work the Orlando Project has undertaken.

2. Project Team and Funding

The Orlando Project, the full title of which is "An Integrated History of Women's Writing in the British Isles", is based at the University of Alberta and directed by Patricia Clements. There are currently more than twenty participants in the project, including two principal investigators, four co-investigators, three postdoctoral fellows, a project librarian, a research collaborator, and eight graduate research assistants. One co-investigator and two graduate students work at the University of Guelph. The Social Sciences and Humanities Research Council of Canada is supporting the project with a Major Collaborative Research Initiatives Grant for the five years from 1995 to 2000. Both host universities have also supplied key financial support. [1]

3. Project Aims

The Orlando Project aims to produce the first full scholarly account of women's writing in the British Isles, and to do so in two formats. It will finish up with five printed volumes of literary history, four of which will be individually authored, and it will create electronic products to be delivered on CD-ROMs or on the Internet or both. The project is named after the panhistoric character in Virginia Woolf's historical fantasy of 1928 (Woolf 1928). This "escapade", as Woolf called it, figures the development of women's writing, and the conditions under which such writing has been possible, in a fantasy-biography which gives a shifting identity but a changeless name to the writing woman, and registers the flux of history as the ground on which she develops. Orlando, beginning to write as an Elizabethan, emerges as a fully developed writer in Woolf's own time. The work she is writing throughout this whole span of time is a poem called The Oak Tree. At first male, Orlando becomes scandalously female in the eighteenth century, when, by Woolf's account, the woman writer became part of the record. Over the course of this modern history, Orlando is the representative woman writer. The Oak Tree is the representative collective text.

We've named ourselves after Woolf's historical speculation because of its double focus on women's writing and history, and because of its feminism and its sensitive registration of the complexity of the shifting conditions under which women have written. Our vision of literary history assumes the complex integration of a number of fields: writing by women, the changing conditions of women's lives, writing by men, historical processes, and the broader cultural environment. We are paying particular attention to the construction of gender, looking at the ways in which it is always uneven and contested, always in flux, always in dialogue with and constituted inextricably from such factors as class, national identity and sexual orientation. The interactions among the elements of what we are calling cultural formation -- both of the social, historical, and writing culture, and of individual women writers and groups of women writers -- are crucial to our sense of how literary history needs, now, to be investigated.

Traditional literary history has been charged with contributing to a totalizing or linear view of the past, and recent historical projects, such as The Columbia Literary History of the United States (Elliot 1987), have sought to articulate more fully a sense of the untidiness, the raggedness, of literary and social change, often opting, as a result, for depth or thick description of historical moments rather than for broad and sweeping slices of chronology, periods which are assumed to have a stable character -- the kind of history Virginia Woolf satirized in stating "in or about December, 1910, human character changed" (Woolf 1950: 91). We've opted for chronological sweep, but we intend to give a sense of the thickness, the layering and the untidy multiplicity of the historical moment together with the larger temporal shifts that differentiate those moments from one another. The complexity of women's relationship to writing demands multiple narratives with many different focal points, like the branching Oak Tree in Orlando.

4. Use of Technology

We have chosen, despite the fact that the scholars who initiated the project had no prior experience in humanities computing, to make computing technology the intimate companion of this history which aims to account for both the complexity of the moment and for larger temporal shifts. We opened this paper with the image of a bursting book, since the roominess, the sheer information-bearing capacity of the electronic medium, together with its ability to enable complex indexing, provided our initial motivation for turning to computing. But the project is now well beyond the purposes of indexing and deeply involved in something which has far more impact on the way we conduct our work as literary historians. That is the structuring of the information we are gathering.

For the Orlando Project, technology has become much more than a simple tool. It is altering the way we are conducting our research and changing the ways in which we approach the problems of literary history. We think the technology will allow us to do a different, and in some ways a better, kind of history. We want computers to help us to bring together and into focus the complex relationships that inform literary history. The hype of hypertext's ability to create multiple pathways for users through electronic material has become a bit stale, but the possibility of offering multiple rather than single trajectories, of fracturing the single, monological narrative, retains its freshness for us. The possibility of offering parallel and intersecting narratives, interlinked with non-narrative material, allows us to make the user of our electronic history an active partner -- another collaborator -- in the history. Electronic forms permit, and may indeed embrace, the display of difference, of competing narratives or explanatory paradigms, or even of contradiction in the materials presented, offering alternatives to more linear paradigms. We don't, of course, think that books are merely linear: the footnoted, annotated scholarly text is its own kind of hypertext. But we do believe, with Jerome McGann, that computers can help to dispel "the illusion that eventual relations are and must be continuous, and that facts and events are determinate and determinable" (McGann 1991: 197). We think that the computing tools we are using should help to make evident the patterns and meanings immanent in massed historical detail. We also hope to find new ways of addressing the gaps, discontinuities and unknowable silences in the history of women's writing by devising new ways of seeing and taking stock of what isn't there as well as what is.
So we want our logo, an image of an oak tree, to be a pun, a technologized paronomasia, underlining the points of similarity and the points of difference between the organic/literary Oak Tree with which Orlando struggles for centuries and the branching structures of the literary/technological tools we are building.

5. Choice of Standard Generalized Markup Language

Before we give some examples, we need briefly to outline some features of the encoding language we are using, though these will be familiar to some readers of this paper. Standard Generalized Markup Language -- SGML -- is an international standard that exists independently of proprietary computer programs. Our use of it means that the material we are encoding will remain available to later generations of programs, and ensures that the usable life of our history does not depend on Bill Gates's marketing plans. The Text Encoding Initiative is a project which is developing protocols for encoding electronic texts in SGML. [2] TEI-conformant SGML is increasingly the choice of academic electronic edition projects. Our use of SGML, then, will make it that much easier for us to link our project to other humanities computing projects, such as the growing corpus of on-line texts, including the Women Writers Project at Brown University and the Victorian Women Writers Project at Indiana.

SGML adds "descriptive markup" in the form of tags -- or "elements" -- that describe various features of the text which is being tagged. The different tags are related to each other according to a set of rules that governs their hierarchical, branching relationship. Those rules are called a Document Type Definition or DTD. How a DTD is conceived has an immense impact on the shape of an encoded text, since the DTD structures the encoding of the information and largely determines what may be gleaned from the material much further down the line. Different DTDs can be created to suit different types of documents and different tagging purposes. Hypertext Markup Language, the language used by the World Wide Web, is itself defined by an SGML DTD, and that DTD governs what you can and cannot do with HTML tags. SGML thus offered us an excellent compromise between standardization and customization, by allowing us to develop tools tailored to the project's needs.
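
To make the role of a DTD more concrete, the following Python sketch treats a DTD as a table stating which child elements each element may contain, and checks a small tree of tags against that table. The element names and rules are invented for illustration; they are not drawn from the HTML DTD or from any Orlando DTD, and real SGML content models are considerably richer (they also govern order, repetition and optionality).

```python
# Hypothetical content rules: each element maps to the set of child
# elements it may directly contain. Invented for illustration only.
RULES = {
    "entry": {"heading", "paragraph"},
    "heading": set(),
    "paragraph": {"name", "date"},
    "name": set(),
    "date": set(),
}


def violations(tree, rules=RULES):
    """Return a list of (parent, child) pairs that the rules do not allow.

    A tree is a (tag, children) tuple, where children is a list of trees.
    """
    tag, children = tree
    found = []
    for child in children:
        child_tag = child[0]
        if child_tag not in rules.get(tag, set()):
            found.append((tag, child_tag))
        # Recurse so that problems deeper in the tree are reported as well.
        found.extend(violations(child, rules))
    return found


valid_doc = ("entry", [("heading", []),
                       ("paragraph", [("name", []), ("date", [])])])
invalid_doc = ("entry", [("heading", [("paragraph", [])])])

print(violations(valid_doc))    # nothing is nested where it should not be
print(violations(invalid_doc))  # a paragraph inside a heading is reported
```

The sketch shows only the yes-or-no character of the check: a given nesting is either permitted by the rules or it is not.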

5.1 Orlando's Use of SGML

The Orlando Project's work in humanities computing is experimental to the extent that the work we are doing in SGML is different from what is being done in other SGML projects we know of. For one thing, we are not encoding preexisting texts. SGML has typically been used by scholars to describe the structural features of texts -- their titles, annotations, stanzas, paragraphs, and so on -- for the purposes of making such structural features available for various kinds of scholarly analysis and of governing their presentation in print or on the Web. The Brown Women Writers Project, the Victorian Women Writers Project, the Perseus Project and the Model Editions Partnership are all using SGML for such purposes. [3] Unlike them, however, we are not creating an electronic archive based on existing text: the text we are tagging is the one we are writing.

Our attempt to produce the tools and the text simultaneously has far-reaching impacts on our research and writing practices. It means that the tagging -- the whole process of tagging from the initial development stage to the final writing up of research -- is in a reciprocal creative relationship with the research and the writing. The tagging embodies a set of judgments about which information we wish to present and the ways in which we wish to describe it. Once the tagging is structured, the tagging directs the research. Neither the text being tagged nor the tagging method has been imported, predetermined or preproven, into the Orlando Project, so we face a different challenge from that presented by a prefab computing program. We think that this is what must be done to domesticate the tools of computing for use in humanities research. Projects like the Text Encoding Initiative are addressing the problem of how best to translate products of the pen or press into electronic form. We are participating in that large project from the slightly different angle of exploring how best to produce the texts that we want to write in this new medium.

For those of us who work in the language, the restriction of the tagging process is initially -- and continuingly -- uncomfortable. It is an experience which sometimes makes the problem of the bursting book look elementary. Though this forest is endless, the tracks through it must be set in advance, and though the tracks will be very numerous, there is no such thing as free navigation. The idea of total freedom in electronic space is a marketing myth of hypertext, and the structures we put in place now will help determine which tracks are and aren't available when our work is done. The structures for tagging, the DTDs, are the central site of the integration of computing and literary history on The Orlando Project. We have developed three DTDs to date: a Biography DTD, a Writing DTD and an Events DTD. These present us with the means of gathering and entering information about writers' lives, their texts and the historical conditions in which they worked and with which they constantly interacted.

5.2 DTD Development

Construction of these Biography, Writing, and Events DTDs is at the heart of the collaboration: we have spent countless hours together, deciding exactly what it is we find important about, for instance, a woman's life. What elements of women's lives do we want to be accessible when the writing is complete? What kinds of questions do we want the materials to be able to answer? In this collaboration, the literary members of the team try to make their processes of evaluation and judgment as explicit and as concrete as possible; the computing members of the team push for specificity and attempt to systematize the process of structuring the texts we write. The DTD which emerges from our Document Analysis Sessions must amply and accurately reflect the sense of values of the research team (each member of which entered the conversation with an individually shaped set of literary and scholarly values, not to mention individually defined feminisms) and, at the same time, it must be workable for the people who are building the DTDs. Our analysis and the DTD construction are presided over by our sense of our multiple hypothetical future reader, who will need to find in what we construct a critically intelligent, usable instrument, serviceable in her research. The process of DTD development has forced us to make the conceptual organization of the project -- from our initial, strategic, if uncomfortable, division of its work into the categories of biography, writing and events -- explicit. We have had to perform close, careful analysis of the categories of significance in the kinds of interpretive statements we anticipate wanting to make as a result of the very research we want to use these tools to undertake.

Struggling to make the language of computing serve the needs of our scholarship, we are acutely aware of the ceaseless shifting of both language and history. As previously noted by a non-computing worker in the language, words "slip, slide, perish, / Decay with imprecision, will not stay in place, / Will not stay still", and everything we are working with changes its meaning through history (Eliot 1963: 194). In this collaborative project, specialists in different periods of women's writing have different perspectives on the meanings of key critical matters, such as class or race. These differences present our computing colleagues with the challenge of making the language of encoding able not only to reflect nuanced judgments but also to represent these historical changes of value and these multiplicities of critical perspective. Not a small challenge for a medium apparently so situated on the yes/no divide, so fixed, static in structure.

5.3 Limitations of SGML

We ask ourselves constantly what it means to be dependent on a language of binaries. The language of computers is, of course, in its most fundamental aspect, dependent on binaries, and we feel that the rigidity, exclusivity and lack of ambiguity this implies carry over into the tools of the project. Though we know that nothing is ever only one thing, in SGML we are confronted with exclusive choices. Does this tag apply: yes or no? A piece of text is either in the tag or it is not. If a woman writer has worked for the Women's Social and Political Union, do we wrap our discussion of that fact in a tag labeled "Politics" or a tag labeled "Occupation"? There are no shades of gray, and we need to make choices about how deeply to tag for the sake of productivity. Sometimes we double tag, spreading our discussion of an issue across more than one element because both aspects of it are important to capture for retrieval purposes, but in other cases we choose not to.

As we seek to overcome some of the limitations of literary history, then, we are confronted in various ways with the need to overcome limitations in computer encoding. The technology is both enabling and confining. For instance, we have three DTDs which permit us to encode important information and critical analysis. But we have had to sort our historical writing into three distinct areas, despite our view that such division is a distortion. Our division among biography, writing and world events is a strategic choice arising from the complexity of the tagging we want to do. We couldn't design a DTD that would allow us to talk about everything at once -- indeed, that would frustrate many of the aims of our computer encoding -- so we were faced with a trade-off.

We have devised a number of interlocking strategies for working against these difficulties. We allow for some nesting of tags so some things can be more than one thing at once. We will, of course, use hypertext links between different documents, regardless of those three divisions. We can use keywords such as "activism" or organizational names, such as the Women's Social and Political Union, to group documents together, and we are working towards a structured vocabulary or thesaurus that will serve as an entry-point into the material. Nor will the documents we are creating appear as isolated texts in the final version: they will be linked sequentially or interleaved in various ways. We are currently working on a system to put our SGML texts into "database" form, so we can pull together bits and pieces from different types of documents. Thus, for instance, we might produce a WSPU mini-web, which traces the relationships between women writers and the WSPU through social and political events, events in the lives of women writers, and the texts that they wrote, thus making visible the reciprocal relationship between textual, material, and ideological factors. We are already at the stage where we can draw together events from all three kinds of documents, so that biographical, publishing, and social and political material appears in a single chronology.
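
The single chronology described above can be pictured as a merge of dated records drawn from the three kinds of documents, optionally narrowed by a keyword. The Python sketch below is an illustration under stated assumptions only: the record structure, the field names and the keyword filter are hypothetical, and the sample records are placeholders rather than project data or verified dates.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    when: date
    source: str          # document type: "biography", "writing" or "events"
    text: str
    keywords: frozenset  # hypothetical keyword field used for grouping


# Placeholder records, one list per document type. Not project data.
biography = [
    Event(date(1900, 5, 1), "biography", "Writer A joins Organization X",
          frozenset({"organization-x"})),
]
writing = [
    Event(date(1901, 3, 15), "writing", "Writer A publishes Text B",
          frozenset({"organization-x"})),
    Event(date(1899, 1, 10), "writing", "Writer C publishes Text D",
          frozenset()),
]
events = [
    Event(date(1900, 11, 20), "events", "Organization X holds a public meeting",
          frozenset({"organization-x"})),
]


def chronology(*collections, keyword=None):
    """Merge events from any number of collections into date order.

    If a keyword is given, keep only the events that carry it.
    """
    merged = [e for coll in collections for e in coll
              if keyword is None or keyword in e.keywords]
    return sorted(merged, key=lambda e: e.when)


for e in chronology(biography, writing, events, keyword="organization-x"):
    print(e.when.isoformat(), f"[{e.source}]", e.text)
```

Filtering on a keyword before sorting is one simple way to imagine a "mini-web" such as the WSPU example: the same records serve both the full chronology and the narrower view.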

The yes-no language of computing is, we hope, overridden by the structural dynamics of these tools as a whole; they create multiple binaries in various relationships to each other which are in some sense analogous to the layerings of ideologies or discourses. We can't operate without discursive structures or ideological categories, nor would we wish to, but the computing language we are learning to use has some strengths for us. It pushes us to articulate our positions in as clear and detailed a way as possible and to make it possible for our users to assess more easily the steps in the process by which we have made our history. That commitment to ongoing collaborative process defines many aspects of the Orlando Project's computer work, down to the fact that each document has attached to it a history of responsibility which records who worked on a document in its various stages. Letting the user into this process is one of the ways in which what we are creating differs markedly from a book: the user is more active in the process of creating and assessing meaning, and indeed may view herself as the latest in a series of collaborators who produce the text that she reads.

6. Cultural Formation

To show what we mean here we want to explain how we've dealt with race and ethnicity in tagging the lives of women writers. Race and ethnicity are part of the cultural formation element within our Biography documents in which we note crucial influences on the constitution of a British woman writer's social positioning. We want to make the complexities of questions of race and ethnicity emerge so as to emphasize that these are shifting, historically constituted and interestedly deployed categories whose use must be understood contextually. One strategy for doing this has to do with the relationship between the elements of the biography DTD. In our first working version of this DTD, such factors as race, language and religion were structurally separate, so that discussing their interrelation was awkward. In our revision, we created a single cultural formation element to allow for the simultaneous discussion of such factors. This structure aims to reflect current understandings of subjectivity as constituted not by transhistorical or isolated categories but through multiple categories intersecting at an historically specific moment for a particular individual or group.

Within cultural formation, we have the following elements for race and ethnicity: <race/colour> <ethnicity> <nationality> <geography> <national-heritage> <geographical heritage>. The first four elements are used for categories directly associated with the woman writer herself; the last two provide for ways of talking about the national and geographical backgrounds of her forbears. To account for the different ways categories get mobilized, the terms used in each tag will not be exclusive or even internally consistent across what we write. Depending on the context, for instance, "Jewish" may be tagged as one or more of "religious denomination", "race/colour" or "ethnicity", and "West Indian" may be tagged with "geography", "nationality", "race/colour" or "ethnicity" tags. We plan to make it possible for the user who is interested in race/ethnicity to constitute her own search group. This could be done, for instance, as a kind of search menu with pull-down boxes, so that if she wanted to search on "black" authors, she would be presented with a list of possible constituencies which could be included in the search or not, as she chose. We would make it clear that "black" and "African", for instance, are not commensurate: that a search on all the women writers whose cultural formation element indicates an association with the African continent (i.e. contains <geography>African</geography> or <geography>Africa</geography>) would include Doris Lessing, and that a search on "black" might include people of East Indian background in the UK, especially if self-identified as such.
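
One way to picture the user-constituted search group is as a set of (element, term) pairs that the reader switches on or off. The Python sketch below assumes a much simplified index in which each writer is associated with the terms tagged in her cultural formation element. The writers, the index structure and the function name are hypothetical placeholders, not the project's data or design; only the element names are taken from the examples in this paper.

```python
# Hypothetical index: writer -> set of (element, term) pairs found in her
# cultural formation element. Placeholder writers, not project data.
INDEX = {
    "Writer A": {("geography", "Africa"), ("raceColour", "white")},
    "Writer B": {("raceColour", "black"), ("nationality", "British")},
    "Writer C": {("ethnicity", "East Indian"), ("raceColour", "black")},
    "Writer D": {("nationality", "English")},
}


def search(index, group):
    """Return, in sorted order, the writers tagged with any pair in the group."""
    return sorted(w for w, tagged in index.items() if tagged & group)


# The reader decides which constituencies make up her search.
by_geography = {("geography", "Africa"), ("geography", "African")}
by_race_colour = {("raceColour", "black")}

print(search(INDEX, by_geography))
print(search(INDEX, by_race_colour))
print(search(INDEX, by_geography | by_race_colour))
```

In this toy index the geographical search and the race/colour search return different writers, and the combined group returns more than either alone, which is the sense in which such categories overlap without being commensurate.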

In other words, we don't think that we could, or should, come up with an exact, fully defined, or mutually exclusive set of categories: the point is the overlap between them. Within the system we are creating, counting per se -- for instance, trying to determine the number of women writers in a particular identity category -- will be highly problematic, that is, purposefully problematized. As we move along in the project, we are carefully building sets of associations which can be accessed either through indexes or by some kind of specialized search function. Cultural formation, then, makes the definition and demarcation of such categories as race and ethnicity the shared responsibility of taggers and readers. By becoming active in the process of deciding how to search for and group the material in our history, our reader will become an active collaborator in the process. And to the extent that we are doing this tagging and thinking through these issues in order to push forward our own understanding of the material, the line between researcher and reader or user is continually blurred.

6.1 Example of Cultural Formation

The case of Anna Leonowens, the writer whose memoirs of her time in Siam became the basis for the musical The King and I, provides an example of how the cultural formation element works in practice. Leonowens is of interest in this context because recent historians have suggested that her subject position may be rather more complex than she represented it in her description of her life as an "English governess". The bare text of the cultural formation element for Leonowens is simply standard prose:

Although Leonowens herself, in attempting to adopt an unequivocally English identity, implicitly claimed that she was white, evidence suggests that although her father was probably Welsh and presumably white, her mother was quite possibly Eurasian. As the daughter of a low-ranking soldier for the East India Company, she would not have held a high position in Anglo-Indian society (Bristowe 1976: 23-31).

Revealing the pertinent SGML tags shows the depth of the tagging embedded in the prose:

Although Leonowens herself, in attempting to adopt an unequivocally <nationality self-defined="selfYes"> English </nationality> identity, implicitly claimed that she was <raceColour self-defined="selfYes"> white </raceColour>, evidence suggests that although her father was probably <nationalHeritage forbear="father"> Welsh </nationalHeritage> and presumably white, her mother was quite possibly <raceColour forbear="mother" historicalTermContext="Victorian British"> Eurasian </raceColour>. As the <class self-defined="selfNo"> daughter of a low-ranking soldier for the East India Company </class>, she would not have held a high position in Anglo-Indian society (Bristowe 1976: 23-31). [4]

The tagging marks Leonowens' self-construction as English and white while acknowledging the possibility, based on scholarly evidence, that her position in relation to English constructions of race and nation was rather ambiguous. The term Eurasian is marked as specific to Victorian British usage and would be contextualized by a gloss noting its use "in the context of the ... colonization of India, to connote a mixture of white, European and Asiatic, usually East Indian, parentage" (OED). In addition, we will link such similar terms as "Eurasian" and "Anglo-Indian" and would include "Eurasian" in results for searches on the word "Indian". The end result of such tagging, we hope, will be to complicate and interrogate, both for ourselves and our users, the identity categories associated with women's writing in the British Isles.
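
As a rough illustration of how tagged terms might be pulled out of such a passage and how linked terms might widen a search, the Python sketch below reads an excerpt of the Leonowens example with two simple patterns. It is not an SGML parser and does not reflect the project's software: the patterns handle only non-nested tags, attribute values are written in double quotes here so that the simple pattern can read them, and the table of related terms and the function names are assumptions made for the example.

```python
import re

# Excerpt adapted from the tagged Leonowens passage; attribute values are
# quoted here so that the simple pattern below can read them.
PASSAGE = (
    'claimed that she was <raceColour self-defined="selfYes"> white </raceColour>, '
    'evidence suggests that although her father was probably '
    '<nationalHeritage forbear="father"> Welsh </nationalHeritage> '
    'and presumably white, her mother was quite possibly '
    '<raceColour forbear="mother" historicalTermContext="Victorian British"> '
    'Eurasian </raceColour>.'
)

ELEMENT = re.compile(r"<(\w+)([^>]*)>(.*?)</\1>", re.S)
ATTRIBUTE = re.compile(r'([\w-]+)="([^"]*)"')

# Hypothetical table of linked terms, used to widen a search.
RELATED = {"Indian": {"Eurasian", "Anglo-Indian"}}


def tagged_terms(text):
    """Return (element, attributes, term) for each non-nested tagged span."""
    return [(name, dict(ATTRIBUTE.findall(attrs)), body.strip())
            for name, attrs, body in ELEMENT.findall(text)]


def matches(text, word):
    """Return tagged terms equal to the word or to a term linked to it."""
    wanted = {word} | RELATED.get(word, set())
    return [item for item in tagged_terms(text) if item[2] in wanted]


for name, attrs, term in tagged_terms(PASSAGE):
    print(name, attrs, term)
print(matches(PASSAGE, "Indian"))
```

Because each term is returned together with its attributes, a search result can carry its context with it, for example that "Eurasian" refers here to a forbear and is marked as a Victorian British usage.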

The Orlando Project is still very much in progress. In addition to the problems of history-writing, of focussing our energies on a vast project of research and writing, we have major computing challenges before us. We need to figure out a means of providing access to the project's language in a way that is context-sensitive and which will help link related terms and clarify their relationships. We need to create documentation and design a delivery system which will make the structures of our encoding as transparent as possible. We need to enable our readers to engage in productive questioning, in problematization, of the materials we have researched. And we need to continue to ensure that the processes of the computing do not distort the processes of historical selection and interpretation.

7. Team Tag

We have been discussing tagging. Before we conclude, we must turn briefly to the tag team. As the above discussion of the lengthy processes of document analysis and DTD development will have made clear, Orlando is a genuinely collaborative project. There are some twenty-one of us working out the various intellectual frameworks of the project. Negotiation abounds: between the volume authors; between the volume authors and the computing planners; between the computing designers and the computing technicians building the connective tissues; between all these and the graduate research assistants who are the front-line users of the systems we are developing. This is not usual in the humanities -- certainly collaboration of this kind is new to most of us -- and the implications of this kind of collaboration are institutional as well as intellectual. Given evaluation practices which are tightly bolted to the idea of individual authorship, or, at a pinch, joint authorship, our institutions will be hard put to evaluate work which sometimes may not be authorship at all. Who, for instance, produced this paper? This is surely a challenge which as a profession we will be facing with increasing frequency. But the single most interesting institutional feature of the Orlando Project is the involvement of graduate students. Humanities PhDs are long, lonely experiences. Graduate students, of whom there are always eight to ten at work in Alberta and Guelph, create a different kind of environment in their work on the Orlando Project. In an important sense, they are the project: they do a great deal of the library work, the checking, the first tests of the computing processes; they comprise a large part of the tag team; and they write materials for the text base. [5] To them, Orlando provides both intellectual community and a real role in a major project, with all of the training that implies.
Some twenty-four or so graduate students will emerge from this project with experience of collaborative work and with an understanding of their research as part of the larger work of the discipline. They will also have confident knowledge in humanities computing and an understanding that we can adopt an active and shaping attitude to these tools, creating ways in which they serve our purposes.

We feel, as a team, that we are involved in the domestication of computing for the humanities. For us, this gendered metaphor highlights the obvious and important fact that we are a team predominantly composed of women, working to reshape, in the interests of feminist inquiry, the tools of a field very markedly dominated by men. That we are doing this with the major backing of the Social Sciences and Humanities Research Council and our universities is an important part of the picture. We understand gender as cutting across every element of our project, from the time past we are constructing together to the time present in which we are working together to construct it. So while this domesticating labour is also productively underway in many other arenas in the humanities, we think we have a particular angle on the potential servitudes and freedoms of writing with computers, on the politics of knowledge this work produces, and on the ways that we may seek to possess the means of electronic production rather than being dispossessed by them. We find this work both daunting and exciting, and we hope that our collaboration will lead to many others -- both with the future users whose roles we see as continually blurring into our own, and with other attempts to use computers to open new possibilities for the conduct of research in the humanities.


[*] This paper was first presented June 1, 1997 at a joint session of the Association of Canadian College and University Teachers of English (ACCUTE) and Canadian Consortium for Computing in the Humanities / Consortium pour Ordinateurs en Sciences Humaines (COCH/COSH). We would like to thank the organizers from both of these organizations, as well as the audience, for stimulating discussion. Susan Brown and Patricia Clements would like to thank Christine Bold, Donna Palmateer Pennee, and Ann Wilson for their generous assistance with an early draft of this paper.

[1] The Orlando Project web site contains further information on project members and activities, <URL: http://www.ualberta.ca/ORLANDO/>.

[2] Text Encoding Initiative Home Page is found at <URL: http://www.uic.edu:80/orgs/tei/>.

[3] Information about these projects can be found via their home pages: The Brown University Women Writers Project <URL: http://www.wwp.brown.edu/>; Victorian Women Writers Project <URL: http://www.indiana.edu/~letrs/vwwp/>; The Perseus Project <URL: http://www.perseus.tufts.edu/>; The Model Editions Partnership <URL: http://mep.cla.sc.edu/MEP-Home.HTM>.

[4] The reference is to Louis and the King of Siam (Bristowe 1976), a study of Leonowens' son Louis and his career in Siam which casts doubt on Leonowens' claim to be English born and bred. For the sake of clarity, some tagging has been omitted from this example.

[5] The graduate students who have contributed to date to the work of the Orlando Project are Shauna Barry, Jocelyn Brown, Pippa Brush, Kathryn Carter, Jennifer Chambers, Tina Cheng, Karen Chow, Paul Dyck, Sarah Gibson, Jane Haslett, Carolyn Lee, Mary-Elizabeth Leighton, Heather McAsh, Margaret McCutcheon, Catherine Nelson-McDermott, Andrew Mactavish, Aimée Morrison, Sarah Timleck and Samantha Wrigley.