In keeping with the speed at which change comes upon us in the digital world, the term "born digital" developed a second meaning within four years of first being used to describe artefacts that, to quote Marcia Stepanek, "never . . . existed on paper" (1998). Stepanek’s meaning is the one with which educators and scholars are probably most familiar. In this use, "born digital" describes a work that is not, in Bolter and Grusin’s term, "remediated from print" (2000: 44-62). That is, it describes a human creation that originates in the digital realm, one created in and for cyberspace, as opposed to one that was born in print culture, created for life on paper. But by 2002, Wired magazine had renovated the meaning of "born digital" into a signification that is probably still more familiar to administrators than to teachers in higher education; but, as the sub-title of the Wired article, "Born Digital: Children of the Revolution," would have it, if we think of "born digital" as being a reference to an emerging demographic of readers, we open the door to new ways of thinking and new ways of teaching.
"Born digital" in this second, newer sense describes not human creativity, but human beings. As of 2007, few of those who constitute the primary readership of this paper would have been "born digital." But many who have taught in a first-year university or college classroom within the past year or two will have encountered students "born digital," that is, people born into a world in which the computer has been no less a constant feature of their lives than school, television, or radio.1 Most educators would probably wish the same could be said for books in the lives of the current generation of students.
But the purpose of this paper is not to lament the arrival of digital culture. Far from it. The scholars on whose work this paper is based have much the same faith in the positive potential of digital communication that Johannes Gensfleisch zur Laden zum Gutenberg and Johann Fust must have had in Gutenberg’s adaptation of the wine-press. That being said, they also understand well enough the opposite position, articulated by Sven Birkerts more than a decade ago in his book with the evocative title, The Gutenberg Elegies, and summarized more viscerally at about the same time by novelist Annie Proulx’s pronouncement in the New York Times that "Nobody is going to sit down and read a novel on a twitchy little screen. Ever."2 Our students, according to this line of thinking, are the poorer for being born digital. They are deprived of the necessity of confronting the logical structure of the book, and they are deprived of the pleasure of experiencing the comfortable tactile and olfactory sensations of a physical (analogue) book. Furthermore, such thinking goes, students could learn patience in the absence of hyperlinks and fast-forward buttons, and they could develop, to paraphrase Canadian poet Alden Nowlan, a sense of history to accompany any knowledge of history they might pick up, by design or by accident, along the way.3 But whether one embraces digital culture or rues it, this much can be agreed upon: book history and book culture were never more in need of being taught than they are now, when the conscious memory of the typical first-year university or college student extends no further back than to a time when the Internet was available in the home.
What the emergence of the born-digital generation means to educators is more profound than might initially appear. Quite simply, when educators turn to technologies of the digital to teach book history and bibliography to students born digital, we do so not only for purposes of preservation, but for purposes of revelation and explication. If we set about remediating a book into a digital artefact for an audience of those born into any part of Gutenberg’s long-lived print culture, our primary concern will be preservation of the known and familiar. What we create in this case must not violate the user’s sense of what a book is and what its parts are, but we need not tell the whole story: we can assume the user has a shared history of more or less similar book use. The terms "chain line," "watermark," and possibly even "gutter" will likely have been heard before, even if the student reader cannot point to these physical features in early modern books because, in the first two instances, she has never seen them in the books she commonly reads, and, in the last instance, we are dealing with specialized and rarely used terminology.
However, if we set about remediating a book into a digital artefact for an audience of those born digital, our primary concern will no longer be preservation; rather, it will be representation of the (ironically) new and unfamiliar.4 That’s the revolutionary change we need now to plan for: students who encounter the book as new and unfamiliar rather than as a practically invisible medium. Teaching book history and bibliography to students born into digital rather than print culture will require us to re-present the absent printed artefact so that even someone wholly unfamiliar with such items will be able to comprehend what they are seeing and eventually what they might be able to simulate touching.5 But while we still must wait for haptic interfaces that enable an answer to the question "What did an early book feel like?" we can nonetheless move forward with the visual presentation of the early modern book. It is worth recalling here Katherine Hayles’ observation in "Translating Media," that the digital revolution has made the book available for consideration "as a physical object" in ways it formerly had not been (2003: 269). In the language of the Russian Formalists, the digital revolution has "defamiliarized" the book such that we are able to see it as though for the first time. This is worth recalling because it provides an opportunity further to recall the absence of, and hence the unfamiliarity to modern readers of, the qualities of the book that were commonplace to readers of books in the long-lived hand press era: qualities such as paper that is not uniform in shape, thickness, or texture, deckled edges, pages not yet separated, and, again, chain lines, wire lines, and watermarks. Thus, as we prepare to meet the needs of students born digital, we remind ourselves also of the needs of non-specialist readers, the lay reader as it were, of the late-Gutenberg period.
Mindful of the needs of the teachers of and researchers in bibliography and book history, Canada Research Chair in Humanities Computing Dr. Ray Siemens invited Drs. Claire Warwick (University College, London), Brent Nelson (University of Saskatchewan), Alan Galey (University of Toronto), Paul Dyck (Canadian Mennonite University), and Richard Cunningham (Acadia University), to the University of Victoria’s Electronic Textual Cultures Laboratory (ETCL) to dismantle an early modern book for the purposes of working toward the creation of an electronic resource for research in, and the teaching of, bibliography and book history. In the ETCL, the progress made would not have been possible without the substantial assistance of ETCL project manager Karin Armstrong and programmers Mike Elkink and Greg Newton. In the balance of this paper I will describe the planning process we undertook before entering the lab, our experience in the lab, and our results.
Bad enough we had all agreed, before gathering in Victoria, to disassemble one book; in the end we discovered we would need to dismantle two books to achieve our representational goals. This need to use (or perhaps more appropriately abuse) two books rather than one was a direct result of the planning we undertook prior to entering the ETCL. We began with the basic idea of digitizing an early modern book and defining the project so that it could be completed in a single day. The opportunity for this project came in the form of a selection of early modern books that had been rescued from the discount bins of a couple of London’s antiquarian book stores. These books were inexpensive because undesirable either to individual or institutional collectors.6 Since we had only one day to work in the ETCL toward our goal of dismantling a print object for re-creation as a set of digital artefacts, we recognized the need to scale back from any thought of planning to digitize an entire book. We determined, therefore, to limit our scope to only a single gathering, rather than the whole book.
Both to maximize our productivity during lab time and to keep our plans as uninflected as possible by the (admittedly irrational) bibliophilia we feared might cloud our judgement once the actual books were in front of us, we met for a planning session the day before we entered the lab and emerged with the following principles of procedure.
1) Because we had only one day, we would emphasize completion over complexity.
2) With the educational value of reconstituting a printed sheet from a bound gathering in mind, we decided that we would need a fairly complex gathering (more than merely a folio or quarto) so that reconstructing it would be a challenge, but not so complex that it would be too difficult for a student with only rudimentary knowledge of early modern book construction. That is, we felt a duodecimo would be too complex, a quarto too simple, and that an octavo would be ideal.
3) Because we imagined reconstructing a complete printer’s sheet, we decided we would prefer a book with untrimmed pages, to provide one more level of information (the deckled edge) to introductory-level students.
4) For the sake of providing extra visual information, but also in the interests of providing as much bibliographic data as possible, we decided to prefer books in which we could locate a clear watermark and chain lines.
5) Despite the comparative undesirability and unimportance of these books to collectors (and implicitly therefore to scholars), we were not certain we could easily cross the psychological barrier of "bibliosection," so we decided that the final criterion for selecting our specimen would be the looseness or deterioration of the binding. We would then choose the gathering that would require the least amount of cutting to remove.
6) To the best of our abilities (given that we were all involved in the work itself) we agreed to keep as much of a record as we could of our procedure and of each decision made in-process.7
In hindsight it seems obvious that knowing much about how early modern books came into being offers but little in the way of training for dismantling a book; nonetheless, we were all surprised by our own ignorance of how best to take a book apart in a manner that would avoid damaging its component parts and enable those components to be re-assembled at a later date, e.g. by a single curator, another team, or by students in a book history class. We all were deeply concerned over the risks we posed to books that had, until now, survived hundreds of years of use and abuse at the hands of either well- or indifferently-intentioned people. Nonetheless, the next day, after assembling at the lab, we immediately dismissed a folio and a quarto (though the quarto was the most interesting book per se).8 All of the books were trimmed (i.e. no deckled edges). Of the remaining books, we preferred a sixteenth-century edition of Terence in octavo, which fit our preferred-format criterion; but though the paper was an interesting mixture of material, it had no sign of a visible watermark. The book on which we chose to focus the majority of our efforts was Jacques-Bénigne Bossuet's Discours sur l'histoire universelle (1771) in duodecimo, referred to hereafter as the Discourse. It has very clear chain lines and two (as it appeared at the time) identifiable watermarks in a single gathering, both along the outer edge of the page, where we would easily be able to capture them digitally. In choosing this book, however, we had to sacrifice our principle of choosing a comparatively simple collation.
We started by laying Bossuet’s Discourse flat, open at the gathering in which we had discerned what we originally mistook to be two watermarks but subsequently recognized as a watermark and a countermark.9 First we determined the start and the end of the gathering, then we very gently pulled at the gathering while holding the book securely in order to expose the cord (also sometimes called a thong: cf. Gaskell, 2006: 148) to which each gathering is sewn during the binding process. We documented this process as well as we could for the disassembly of both Bossuet’s Discourse and Terence’s Comedies by recording it with one of the ETCL’s digital video cameras. The result can be viewed at < http://etcl-dev.uvic.ca/public/book/movies/ >. The gathering chosen to represent the Discourse had pages stuck together due to water damage, and although this proved to be only a minor irritant for our experiment, it calls attention to the care required in choosing the right book or books to take apart. While one would not want to disassemble a book in pristine condition, choosing one that is too dishevelled might pose other, possibly insurmountable, problems. Removing the gathering left an obvious hole in the book, so we found it unnecessary to mark the spot from which we had removed pages. Removing the gathering took us slightly less than one hour, and once it was out we were not confronted with an obvious next step.
Although we knew we wanted an electronic version of the gathering, it was not obvious even what such a version would be, much less what steps we should take to create it. Would we be better served by scanning the printed sheets that formed the gathering, or photographing them? If we decided to scan, should we employ optical character recognition (OCR) software, or settle for the reproductive power of a simple image? These questions forced us into a conversation about the nature of the objectives we chose to pursue. Ultimately, we decided we were more concerned with book history than with the history of this particular book, still less with the history of its text (i.e. of its printed content), and so we opted for an image of the artefactual pages.
Through experimentation we found that the ETCL’s Epson scanner was able to out-perform the Laboratory’s 8 mega-pixel camera.10 Scanning at 1200 dots per inch (dpi) in tagged image file (tif) format produced extremely high resolution images, examples of which can be seen at < http://etcl-dev.uvic.ca/public/book/images/ >; but due to their high resolution, downloading them across the Internet will often be impractical and will always be difficult for those using a low speed (i.e. not broadband) connection. At 1200 dpi, scanning took twelve minutes per exposure (not including image processing). With the duodecimo format of the Discourse, this meant that scanning the complete gathering would take a total of nearly two hours (8 scans x 12 minutes = 96 minutes + time between scans to place and remove the sheets). Ideally, we would have liked to scan at an even higher resolution, probably somewhere between 2,000 and 4,000 dpi, but our self-imposed one-day limit meant that this was out of the question. Had we been willing to reduce the quality of our scans to 600 dpi, we could have used a second available scanner, thereby speeding up the digitization process; but the quality of 1,200 dpi scans was too much for us to resist, so we decided to allocate extra time to scanning for the sake of quality.
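The scanning arithmetic above can be restated as a minimal sketch. The function name and the three-minute handling allowance are illustrative assumptions; only the eight scans and the twelve minutes per scan come from our lab notes.

```python
def total_scan_minutes(num_scans, minutes_per_scan, handling_minutes=0.0):
    """Estimate total minutes for a scanning session.

    handling_minutes is the (assumed) time spent placing and removing
    a sheet between two consecutive scans.
    """
    if num_scans <= 0:
        return 0.0
    scanning = num_scans * minutes_per_scan
    handling = (num_scans - 1) * handling_minutes
    return scanning + handling


# Figures reported above: 8 scans at 12 minutes each, at 1200 dpi.
scan_only = total_scan_minutes(8, 12)

# Hypothetical allowance of 3 minutes of handling between scans.
with_handling = total_scan_minutes(8, 12, handling_minutes=3)

print(scan_only, with_handling)
```

With the assumed handling allowance the estimate comes to just under two hours, which agrees with the figure we report; a different allowance would of course shift the total.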
We envision the results of our day’s labour as the first deposit to an archive of digital material intended to contribute to studies in bibliography and book history. As it stood within days after our time in the lab, that deposit was sub-divided into seven component parts: 1) casting_off; 2) data; 3) discourse_reconstruction; 4) images; 5) impression; 6) lightbox; 7) movies. All of these and an early draft of this paper, as delivered at a joint session of the annual conferences of the Canadian Association for the Study of Book Culture and the Society for Digital Humanities / Société pour l’étude des médias interactifs in Saskatoon, in May 2008, can be accessed, thanks to the University of Victoria, their Humanities Computing and Media Centre and the ETCL, at < http://etcl-dev.uvic.ca/public/book/ >. Not all of these seven sub-directories will prove to be of equal value in the classroom, and similarly not all of them will be of equal value to those conducting research into book history or into practices of digitization. I will not describe the contents of any of the sub-directories in great detail, but some sense of each of them must be given.
In the "casting_off" directory we provide some extremely basic text analysis under the heading "View the average line length in the top half vs. the bottom half of a page." This directory represents a tangential line of investigation undertaken in the spirit of pure, unplanned experimentation. The series of numbers provided therein was seen to have value in considering the work of a compositor. How hard would the compositor(s) have worked to keep the number of characters and spaces (each representing 1 in our calculations) consistent from line to line, from page-half to page-half, and from page to page? To a great extent such consistency is demanded by the printing press’s forme, but might aesthetic considerations have come into play, too? We do not pretend to have gathered all the data necessary to address fully questions of production demands versus production desires, and in fact are not even sure this is the appropriate direction in which to set off in search of such data, but within the confines of time, expertise, and temporally-compressed imagination available to us, we offer this as a start.
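For readers who wish to repeat this kind of count on their own transcriptions, the calculation can be sketched as follows. This is a minimal illustration and not the script used in the lab: the function name is hypothetical, blank lines are skipped, trailing whitespace is ignored as likely OCR residue, and when a page has an odd number of lines the extra line is assigned to the top half. As in our figures, every character and every space counts as 1.

```python
def average_line_lengths(page_text):
    """Return (top_half_mean, bottom_half_mean) line lengths for one page.

    Each character and each space counts as 1. Blank lines are skipped
    and trailing whitespace is dropped (assumptions of this sketch).
    """
    lines = [line.rstrip() for line in page_text.splitlines() if line.strip()]
    if not lines:
        return (0.0, 0.0)

    # With an odd number of lines, the top half receives the extra line.
    middle = (len(lines) + 1) // 2
    top, bottom = lines[:middle], lines[middle:]

    def mean_length(group):
        return sum(len(line) for line in group) / len(group) if group else 0.0

    return (mean_length(top), mean_length(bottom))


# A four-line stand-in for a transcribed page.
sample_page = "abcd\nab cd\nabc\na"
top_mean, bottom_mean = average_line_lengths(sample_page)
print(top_mean, bottom_mean)
```

Comparing the two means page by page gives one rough indication of how evenly a compositor filled the measure, though it cannot by itself distinguish the demands of the forme from aesthetic choice.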
Also to be found at < http://etcl-dev.uvic.ca/public/book/casting_off/ > are edited versions of OCR-generated pages of the French text of the Discourse. I should note at this point that we realized by midday that we would only have time to deal with one of the two gatherings we had dis-covered, and that the Discourse was felt to have more representative power for our purposes, so we abandoned Terence. Upon clicking on "View the OCR text by page" a reader is liable to suspect browser incompatibility because all that appears is a drop-down menu form with "97" in it, and a "Submit" button. But the menu will allow the reader to choose numbers from 97 to 120, and pressing "Submit" after choosing will cause a narrow column of left-justified text to be displayed. This column, narrow as it is, presents the full text of each page from 97 to 120. We hope that the seemingly unnaturally narrow display will help students realize, especially when they compare it to the pdf images to be found in the next sub-directory, "data," how much thought went into the early presentation of text, and that in printing and book-making our sense of what is natural is entirely constructed, socially conditioned, and the result of surprisingly high levels of abstract thought.
Under "data" < http://etcl-dev.uvic.ca/public/book/data/ > the reader will find the raw notes taken during our day in the lab, the raw OCR text, and a folder containing pdf images of the Discourse. In this last folder are comparatively low resolution images of the entire gathering and higher resolution images of only some pages from the gathering. It is worth noting that the images of the entire gathering are low resolution only in comparison with the extraordinarily high resolution we supply elsewhere.
The third sub-directory, < http://etcl-dev.uvic.ca/public/book/discourse_reconstruction/ >, provides the lure that I have found hooks students into paying attention and wanting to see more of this valuable resource. It is a Flash animation of the Discourse gathering assembled and mounted by Mike Elkink from a page-turning source file provided by the O’Reilly Web DevCenter.11 The "Wow!" factor of seeing an effective digital recreation of the familiar medium of the book seems to impress students in a way that seeing the individual parts laid out does not. In all likelihood this has more to do with the dynamism and interactivity of the page-turning effect in comparison to the static presentation offered by the pdfs and html displays than it does with the whole being greater than the sum of its parts; but when considered in that light, it does provide a useful point of departure for classroom discussion of the value of book-length versus broadsheet printed artefacts (then and now).12
The "images" sub-directory offers numerous images produced from our work. The images are offered as Joint Photographic Experts Group files (more commonly jpeg), and as Sony Raw Files (or srf): the latter files are far too large for any but the fastest connections or most patient readers to use. There are also further sub-directories within "images," and each of these folders is descriptively named. The images, even in jpeg format, re-size extraordinarily well in Firefox, Opera, and IE, enabling easy and effective display of the chain lines, watermarks and countermarks, moisture stains, bleed through (of the ink, overleaf), and any other pertinent details. In truth, the technology makes study of many bibliographic features easier and more certain because it renders minuscule visual qualities of paper, ink, and binding string easier to see and examine.13 One can only hope that as haptic research progresses the same can be said for the tactile features it currently renders inaccessible.
Reproductions of the impositions of various formats can be found in the "impression" sub-directory.14 These provide means for teaching students how formes had to be set for printing and how paper was folded after being imprinted. These impressions will also assist those who, like those of us who worked on this project, can benefit from a reminder of how to arrange the loose leaves when they access the sixth (and last) interactive sub-directory, labelled "lightbox."
We owe the functionality of our penultimate sub-directory to Greg Newton, who adopted the Virtual Lightbox developed by Amit Kumar and Matthew Kirschenbaum, and currently supported by MITH.15 In our "lightbox" the reader can choose to re-arrange the leaves of either the outer or the inner forme of the duodecimo imprint that forms the gathering we removed from the Discourse. Upon choosing one of these options, the reader will be prompted to accept the "unverified" Java applet published by Amit Kumar. Do so, and you will see a menu (which I will not expand upon here) across the top of the browser window, with a diagram of the duodecimo imposition just below that, in the upper left-hand position. The other six images on the lightbox’s stage are the six imprints that go into the creation of the Discourse sheet. With the diagram as a guide, the user can rearrange these impressions to recompose the original printed sheet. This activity provides a sense of the complexity involved in arranging, printing, folding, and cutting the original printed sheet.16 Having immediate access to the imposition diagram usually proves invaluable for all but the most expert bibliographers. By demonstrating the difficulty of reconstruction, we believe we have provided the means to teach students about the difficulty of composing print-ready pages, and thus to disabuse them of the commonly held belief that the past was a simpler time with technologies that pale in comparison to the complexity of technologies in our own time. It also, of course, offers the opportunity to appreciate the printer’s art through the re-construction of a virtual print artefact at a medial stage between the printed sheet of paper and the folded gathering, thus preparing the way for a possible future addition to our archive in which we deal with the construction of a book out of a set of assembled gatherings.
The final sub-directory contains movie files of some of the actual work performed in the lab, and is really of interest only in the strictest sense of archiving in the digital era.
To conclude, I will return to where I started: as more texts and more people are born digital, there will be an ever increasing need to teach bibliography and the history of the book. This is true for reasons of presentation and cultural preservation, but also for reasons of information architecture. The book is not a naturally occurring form; it is not the natural way for information and knowledge to be developed, contained, and preserved, nor even a natural way. The book evolved through a series of historical and cultural accidents and decisions described by Carla Hesse as "the modern literary system" (1996: 22; cf. Nunberg, 1996: 16). As we now find ourselves in a time similar to that in which the well-evolved codex form came more or less permanently under the influence of Gutenberg’s printing press, we need to be fully conversant in the technical as well as the cultural and economic forces that made the book such an influential and determining facet in the evolution of modernity. Thus, building an archive that preserves and presents features of the early printed book in a way that enables those features to be accessed and manipulated in ways that would not be possible using a physical book offers a means by which scholars and students can remind themselves of the past as they design for the future. The researchers involved with this project see their work and the results it produced as a first step toward the creation of such an archive.
1 A further classification, beyond chronological age, can be assumed to distinguish between those born into the industrialized world and those born into what is described as the developing world (as though industrialism is a higher evolutionary state), but as this can be assumed, it will not be dwelt on in the body of the paper. A similar glossing-over of socio-economic class is justified by the comparatively small numbers of those who will have entered higher education from backgrounds too poor to own a computer and too ignorant to make use of a public library computer terminal.
2 Proulx, 1994: A13; qtd. in O’Donnell, 1996: 37.
3 "Nothing is more likely to stunt the intellect than a knowledge of history unaccompanied by a sense of history" (Nowlan, 1971: 102).
4 I do not mean to assert that the book will disappear any time soon. It seems to me unlikely it will disappear ever. What I mean to suggest is that intimate experience of the physical codex form is likely to diminish in the lives of even economically advantaged students in whose homes books have long been a staple of childhood experience. Books will remain part of the furniture of their experience, but as e-paper and other alternatives become more mainstream, the paper-based book may become more and more unfamiliar to students, and as less familiar objects, they will need to be introduced as though they are new to the life of the student.
5 Given the advanced state of work on electronic paper it seems not too fanciful to imagine an e-paper able to change its texture, to assume the bumps and ridges of a watermark or a section of chain lines, in response to a reader’s instruction to do so.
6 Undesirability in the antiquarian book trade is typically due to one, or some combination, of three things: damage, surplus (higher supply than demand), or inherent lack of interest in the book’s form or content.
7 See sub-directories "data -> Book Digitization Re. . ." and "movies" for the written and videographic records in their rawest forms. Below I will discuss these and the other sub-directories made available at http://etcl-dev.uvic.ca/public/book/.
8 The books from which we were able to choose were the following:
Jacques-Bénigne Bossuet, Discours sur L'histoire Universelle (1771).
Willem Hessels van Est, Theologiæ Doctoris (1696).
John Suckling, Fragmenta Avrea (1648).
Terence, a M. Antonio Mvreto Locis Prope Innvmerabilibvs Emendatus & Lectionum Varietate Recèns Auctus (1592).
Cicero, Oraisons Choisies: Latines et Françoises (1728); and another, undated edition. Both editions of Cicero are in quarto and are interesting not just because they are Cicero and are transliterated, but also because there is evidence in both texts of gatherings having been inserted from more than one imprint. For example, there seems to be a gathering printed in 1723 in the 1728 First Book.
9 On the countermark, see Gaskell, 2006: 62.
10 By comparison, the Monastery of St. Catherine was using, in 2004, photographic equipment that enabled it to produce images with a resolution of 75 megapixels. See Gauch, 2004.
12 Cf. Hesse, 1996: 27 on the book’s "mode of temporality" as compared to other print artefacts.
13 "In a 64-exposure image of a 10th-century manuscript of the Gospels, written in gold leaf, the luster on the Greek letters seems as realistic as if one were viewing the actual book, and in a close-up the uneven texture of the gold leaf is crystal clear" (Gauch, 2004). http://www.nytimes.com/2004/03/04/technology/circuits/04monk.html?pagewanted=1&ei=5007&en=336680830314c574&ex=1393736400&partner=USERLAND Accessed January 7, 2008.
14 Reproduced under conditions of fair use from Gaskell, 2006: 88-101.