[1] This is an edited version of my Institute lecture delivered at DHSI 2012, Victoria, BC. The original text of the Institute lecture can be found at http://www.themediares.com/pages/parlance/reading-machines.html (Van der Weel 2012).

Chapter 1
Feeding our reading machines: From the typographic page to the docuverse[1]

Adriaan van der Weel, Leiden University: a.h.van.der.weel@hum.leidenuniv.nl

Accepting Editors: Brent Nelson and Richard Cunningham


Abstract / Résumé

Digital textuality will be defining the nature and uses of literacy to the same degree as printing has done since Gutenberg's invention. The article explores the implications of one of the fundamental differences between the screen and paper substrates: digital fluidity versus the fixity of paper.

La textualité numérique définira la nature et les usages de la culture dans la même mesure que le fait l'imprimerie depuis l'invention de Gutenberg. L'article explore les implications de l'une des différences fondamentales entre les supports écran et papier: le caractère fluide du numérique par rapport au caractère fixe du papier.

Keywords / Mots clés

Digital textuality, reading on screens versus paper, semantics of typography, computer-aided reading



Introduction

Πάντα ῥεῖ or "everything flows," Heraclitus is supposed to have pronounced in the days when writing was not yet common and the force of the living voice was golden. Two millennia later our bookish world became characterized by the unrelenting pursuit of fixity, so much so that by the end of the nineteenth century readers truly deserved to be called Homo typographicus (Van der Weel 2011). Now, another half a millennium later, Heraclitus' maxim should be raked up again, for the world has left fixity behind and irrevocably entered a new era of textual fluidity. Digital textuality is characterized by a continuous and ongoing process of text-constitution without a natural endpoint: the digital reading substrate—the screen—does not fix the text the way the printed page does, but rather preserves the text's liquidity below the reading surface. Digital text technology thus represents a radical break with the history of text production. Just as centuries of books and printing have conditioned us to read in a particular, typographic way, our screens are now equally set to condition us for a particular, but obviously very different, way of reading. Much of this conditioning is already happening. It takes place in two distinct ways. The first is the usual indirect and delayed way that can be observed retrospectively in the case of all former text technologies. The other way is a much more direct and immediate one, resulting from the fact that this persistent liquidity of all digital text makes the computer into a reading machine in the literal sense (in contrast to the figurative sense in which Paul Valéry has called the book a "machine à lire," Valéry 1960).

Writing and printing have always been primarily aids in the production of text, and in its first few decades, digital text processing was also used predominantly in the service of print production. This "servant" use was gradually supplemented, and eventually superseded, by wide-ranging explorations of the "true" potential of the digital medium in, first, digital text processing and, especially from the advent of the World Wide Web, digital distribution and access. Since then ambitious bulk digitization programmes have been launched, and Web 2.0 has radically and unconditionally democratized text production. The frenetic first decades of transformations in text production culminated in the broad availability and accessibility of digital text. Now the emphasis is clearly shifting to consumption practices, with textual liquidity promoting unsuspected new ways of "reading." This is fundamentally changing those consumption practices—indeed the very concept of reading. Inevitably, the dominance of the current ideal of deep long-form reading will be challenged by these new ways of reading, thereby causing changes in our literate mentality. As is well known, old mediums do not just die; they live on in various reincarnations. Similarly, I am not suggesting that reading habits shaped by centuries of paper use will disappear, but that they are about to be relegated to a different position.

In what follows I will first discuss how the currently dominant cultural practice of reading evolved. I will then discuss a few new ways of digital reading that appear to be already occurring, both at large and specifically in a scholarly context, speculating on their significance and suggesting some reasons why they will have radical effects on our literate mentality.

Textual forms and reading practices

After language, the most decisive event in human cultural evolution was the invention of writing more than five thousand years ago. Writing implements and writing surfaces have evolved extensively over those five thousand plus years, but the "technical" act of reading, involving our eyes and brain, has remained basically constant since the world's major writing systems (i.e., scripts) received roughly their present form (for the Roman alphabet, in about 700 BC). The workings of this remarkable visual–linguistic–cognitive process have been described in fascinating detail by Stanislas Dehaene, who noted that the "reading circuits" in the human brain, which result from "neuronal recycling" of suitable brain areas, are much more invariant across cultures than they appear at first sight (Dehaene 2009, 302-304). Nothing much may have changed in this process during the five or so millennia that have passed since the invention of writing, yet a vast number of histories of reading have been written, with probably more appearing now than ever. They are histories of what people read, where, how, how much, and why. That is to say, they are histories of the social uses of reading: what it means for the individual and for society to be able to read and disseminate knowledge and culture through writing. If these histories of reading make one thing clear it is how central the history of literacy is to human cultural (as opposed to biological) evolution. David Olson has argued the tremendous social significance of literacy brilliantly in his cultural history of literacy The world on paper: The conceptual and cognitive implications of writing and reading (Olson 1994). This is a gripping account of how our relationship to text has changed over time in the Western world, and how the development of a literate mentality has shaped our cognitive abilities. 
However natural it is to us today to engage textually with the whole of our natural and mental world, the magnificent feat of exploring and mapping that world in writing was never a self-evident trajectory (Ong 2004; Reiss 2000).

Not surprisingly of course, histories of reading follow closely the history of the changing means of text production. The changes in our reading habits may not actually be determined by the preceding (and corresponding) history of text production and distribution, but they are at the very least made possible by it. In this sense the history of reading, or text consumption, can be said to be the chiefly social counterpart of the chiefly technological phenomenon of text production. The most familiar example of technological change in text production is no doubt the invention and rapid subsequent dissemination of the printing press, but other technologies, such as the succession of writing surfaces from clay tablets to paper books and other print products, each with their own characteristics and affordances, have also deeply affected the social role of writing—and thus of reading (see Vandendorpe 2009).

The history of reading could in other words be understood as the history of the discovery of the social uses of consecutive text technologies. In the case of the printing press this discovery process led eventually to what I like to refer to as "The order of books." [2] This is a literate mindset characterized by the widespread recognition of textual conventions, standardizations, codifications, and various other types of fixity in recording and transferring knowledge and culture in printed form—and by the ever-growing expectation of such fixity. Fixity of course means much more than the representation of words on paper in unchanging form and content. In the long history of getting text to represent the richness of speech, much effort has been expended, for example, on the formal and painstaking definition of words through dictionaries, and on describing—and prescribing—the way they can be used, through grammars and syntaxes. More informally, too, the way words were used in written and printed sources has helped to circumscribe their meaning: a form of codification through sheer repetition, in the same and similar contexts, from generation to generation, of the same unchanging (or barely changing) texts.

Like fixity, precision is a vital precondition for the emergence of The order of books. That written or printed text is not simply speech set down is only a twentieth-century realization. As literacy has increased, writing has proven an excellent aid in thinking. Authors have learned to express themselves in writing ever more precisely and ever more independently of the sort of illocutionary force that accompanies speech acts (Olson 1994, see esp. chapter 5). This precision is, however, not just a matter of formulation, i.e. of the refinement of linguistic meaning. For the purpose of expressing written language ever more precisely, and capturing as much as possible of its semantic richness, we have eagerly exploited the extreme precision with which the printing press can dispose text on the typographic page. In other words, the mise-en-page and the mise-en-livre (the way we place text two-dimensionally on the page and three-dimensionally in books) have actually become integral to all textual meaning. As a matter of fact, to think of typography as catching a part of the semantic richness of language is misleading. What we have in fact done is extend the semantic richness of language by enlisting the possibilities of typographic form. We have evolved conventions that enable us to encode and again decode, all largely subconsciously, an enormous amount of extra, non-verbal meaning in the mise-en-page and the mise-en-livre. As the species inhabiting The order of books, Homo typographicus can instantly recognize the genre to which a text belongs (say, poetry or a footnoted scholarly argument), and grasp the meaning and significance of the particular way a text has been articulated or segmented. That is to say, we unconsciously apprehend the way the text has been divided into individual blocks of type with a structural function: the footnotes, running heads, and book titles of a scholarly text, or the lines and stanzas of poetry (cf. what McGann 2001 calls the "bibliographic code").

Homo typographicus, then, is the product of a long evolution of conventions, first in manuscript but then a fortiori in print, that have imprinted themselves on his collective consciousness and, especially, unconsciousness. From our earliest exposure to books and print we learn to read the page not just for the meaning of the actual words it contains, but also for the typographic form they have been given.

We have come to function as textually literate beings to the extent that the printed word assumes an almost tangible reality. As Ernst Cassirer has phrased it, rather than "things themselves" we manipulate the words that stand in for these things:

Man cannot escape from his own achievement. He cannot but adopt the conditions of his own life. No longer in a merely physical universe, man lives in a symbolic universe. Language, myth, art, and religion are parts of this universe. They are the varied threads which weave the symbolic net, the tangled web of human experience. All human progress in thought and experience refines upon and strengthens this net. No longer can man confront reality immediately; he cannot see it, as it were, face to face. Physical reality seems to recede in proportion as man's symbolic activity advances. Instead of dealing with the things themselves, man is in a sense constantly conversing with himself. He has so enveloped himself in linguistic forms, in artistic images, in mythical symbols or religious rites that he cannot see or know anything except by the interposition of this artificial medium. (Cassirer 1972, 25; see also Harari 2014, esp. chapter 3)

Rather than diminishing, the reality-shaping function of symbolic systems is migrating to new environments: Cassirer's description clearly prefigures our present screen world, mediated to the full, in which ever larger chunks of people's lives are relegated to a second-order reality largely built from words.

The typographical habit at the dawn of the digital text

The joint (and inseparable) history of text production/distribution and text consumption is the history of The order of books and the making of Homo typographicus. It is against this background that we have to regard the history of that latest of text technologies: the computer. The first groping explorations in the 1940s of how the computer could be used for the processing of text marked the beginning of another such cycle of technological invention followed by social discovery. Just as developments in print production had major social consequences for reading, we are once again heading for major changes in text consumption, this time caused by text being produced and distributed digitally. We can surmise that this time, too, the new forms of text production will be followed at some distance by a more gradual discovery of the manifold social uses of this new technology. Thus a primarily technological history of digital text production will again be paralleled eventually by a primarily social history of digital reading practices. Again, the "technical" act of reading as described by Stanislas Dehaene may well remain essentially unchanged, although we now know that the mind is in a two-way relationship with technology, so that the very way our brain is wired may change, affecting how the brain is used for reading. Will the social change associated with digital text technology, then, be as fundamental and pervasive as it was in the case of printing?

In fact I have little doubt that the effects of digital textuality on reading practices and literacy will be even more dramatic than they were in the case of print. Moreover, these effects will happen faster. The major changes in digital text transmission that we have already witnessed occurred over just a few decades. But the dramatic difference is that this time the new substrate—the digital screen—has fundamentally different technological properties than paper—the dominant substrate of the last few centuries—or papyrus, or wax tablets before them. Simply put, the digital text displayed on the screen always remains machine processable below the reading surface. It continues to be available to any operations we wish to perform on it, from simply locating it in a search action to subjecting it to complex computational scholarly analysis. What is more, as a Universal Machine, the computer is not a fixed technology whose potential simply lies waiting to be discovered. It is impossible to even begin to predict how the computer as a text technology will develop beyond the way it is being used now. That this must have far-reaching implications for reading is obvious.

So what are the major changes in digital textual transmission we have already witnessed? At first it did not look as if change would be so rapid. Initially, in what we might term the offline era, the computer was used first and foremost in a variety of ways that furthered the long-established habits of Homo typographicus. That is to say, a major use of computers was as digital aids for the production of print books and other paper textual products. Never mind that this required an awareness of abstruse markup codes which, on the face of it, could not be further removed from the dictates of Homo typographicus. In the mid-eighties, such markup codes were also widely used in word processing on the personal computer. In programs like WordStar and MultiMate, the control characters required for italics, boldface, underlining, and so on, would actually appear on screen. In WordPerfect the codes could always be brought up in its "underwater screen." The use of codes in the typesetting industry (from which they derived) was one thing, but their acceptance by reluctant, non-specialist, and often incidental users was quite another. Not surprisingly, therefore, it was not long before such codes were discarded or, more precisely, moved into a black box. The decisive solution came in the shape of a major capitulation to Homo typographicus: the Graphical User Interface (GUI) and WYSIWYG (What You See Is What You Get). These interfaces introduced familiar typography, however basic, on-screen for the ordinary user. The Apple Macintosh blazed the trail, and after Microsoft's Windows took the world by storm, the rest is textual history: the world fell hook, line, and sinker for the lures of the GUI and WYSIWYG.

The victory of WYSIWYG closed that brief window in which an awareness of codes seemed to be necessary for anyone who wanted to do something textual on a computer, at least as far as the general user was concerned. Yet more or less simultaneously with this move from (a mild form of) coding to victorious WYSIWYG, the first seeds were sown of an alternative approach to textual form. "Text professionals," especially in the publishing world, wishing to break out of the typographic mould, were developing the code idea further. This gave rise to the world of markup, in which SGML (Standard Generalized Markup Language) was the pinnacle of achievement, later superseded by XML (eXtensible Markup Language). Though SGML was presented as a departure from the book-derived typographic mindset, it may well be asked how fundamental a departure it really was. In retrospect it could be argued that as an alternative representation of typographic structuring principles, markup still in large part pandered, and continues to pander, to our typographic condition. Though they replace visible typographic structure with a more abstract hierarchical one, SGML and XML markup are characterized by a similar sort of precision and attention to detail in catching that part of the textual semantic richness that lies outside the linguistic content proper (of course, markup can also be used to record typographic presentation as well as the underlying structure that it implies). Yes, markup aids the machine processing of text, but it continues to place major emphasis on the act of human interpretation that precedes it. It still involves humans telling the reader how to interpret a particular segment of text, which is precisely what we do when we give typographical shape to a piece of text, be it a poem, a book, or a business letter. The difference is that the markup instructs the computer on behalf of the reader instead of being addressed to the reader directly. However, the computer still cannot read the meaning of the actual text unaided; it cannot "understand" its sentences or paragraphs the way humans can.
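The point that markup instructs the computer on behalf of the reader can be made concrete with a minimal sketch. The XML fragment and its element names (poem, title, stanza, line) are invented for illustration and follow no real schema; the two functions are likewise hypothetical. The first answers structural questions from the markup alone, the second derives one possible typographic rendering from the same source.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the element names are illustrative only,
# not taken from SGML, TEI, or any other real schema.
SOURCE = """<poem>
  <title>Fragment</title>
  <stanza>
    <line>Everything flows</line>
    <line>and nothing stays</line>
  </stanza>
  <stanza>
    <line>The page holds still</line>
  </stanza>
</poem>"""


def describe(xml_text):
    """Answer structural questions from the markup, without 'reading' the words."""
    root = ET.fromstring(xml_text)
    stanzas = root.findall("stanza")
    return {
        "title": root.findtext("title"),
        "stanzas": len(stanzas),
        "lines": sum(len(s.findall("line")) for s in stanzas),
    }


def render_plain(xml_text):
    """Derive one possible typographic form: capitalized title, blank line between stanzas."""
    root = ET.fromstring(xml_text)
    out = [root.findtext("title").upper(), ""]
    for stanza in root.findall("stanza"):
        out.extend(line.text for line in stanza.findall("line"))
        out.append("")
    return "\n".join(out).rstrip()


summary = describe(SOURCE)
page = render_plain(SOURCE)
print(summary)
print(page)
```

In this sketch the machine can count stanzas and lines only because a human has already decided what counts as a stanza and a line. The interpretation precedes the processing, just as it does when a typesetter shapes a page.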

Initially, most of the computing involved was thus primarily aimed at text constitution, first with a view to eventual printing, but even when texts were already being prepared for digital distribution, achieving fixity was a pressing concern. Computability, that essential characteristic of digital text below the visible surface, was experienced as evincing a number of decidedly undesirable side effects: lack of fixity in form, substance, and even existence. The creation in the early 1990s of the PDF file format by Adobe is a good illustration of the continuing focus on recognizable and dependable typographic shape to preserve a sense of fixity.

I dwell on this history because it is illustrative of the force of our typographic conditioning and thus offers insight into a very basic aspect of our textuality. Our history as denizens of The order of books has determined, and to a large extent continues to determine, the way we program our computers to deal with text—perhaps to a larger extent than we have so far realized. I also suspect that in understanding this typographic heritage we might achieve a better understanding not only of where we find ourselves today, but also of where we might be heading tomorrow. This long typographic heritage has proven particularly tenacious, and is obviously not easily cast aside.

The first stages of digital textuality, being so much in the mould of the hierarchical book, are not likely to have had much impact on the "common reader." However, when we survey the digital textual landscape of today, all our academic and personal efforts at digital mediation and remediation, of scanning, interpreting and coding text have been dwarfed by two recent developments that certainly do affect the common reader: the most ambitious scanning programme the world has ever seen—Google Books—and Web 2.0. The amazing expansion of the docuverse by the industrial-scale production of digital text by Google Books, and other mass digitization programmes like it, perhaps represents no more than a quantitative leap in access, but Web 2.0, while falling far short of the ideals and expectations of people like Ted Nelson and the proponents of the semantic Web, has nonetheless transformed the very nature of text production. Enabling anyone, potentially, to be an author, it is replacing the hierarchical mindset of the printed book with a democratic one. (As many commentators have recently pointed out, the term "democratising" should properly be taken as a descriptor of the technological potential rather than of actual effects, political or otherwise. See, for example, Taylor 2014 and Weinberger 2015.) For the first time the hold of print on our textual practices has begun to weaken.

Of course it is too early to judge the social effects of such recent developments. We certainly lack the advantage of the hindsight we have in the case of printing. In any event, as we have seen, the effects on consumption of these changes in production will always show a time lag. One notable reason for this lag is that the innate human desire for cultural continuity causes a period of "imitation," be it conscious or unconscious. The first printed books looked very much like manuscripts; the last few decades similarly saw a noticeable reluctance to abandon the typographic form of print. However, the cycle does not appear to be taking as long this time around, and history is speeding up. The chief stumbling block to our critical appraisal of social effects is not their delayed appearance; it is the elusiveness of the social processes involved. Identifying and describing concrete technological developments is one thing; describing the more drawn-out and diffuse process of discovering their social uses is quite another. This goes a fortiori for identifying cause-and-effect relationships between technological inventions and social developments. Such relationships are tenuous at best. As Donald Norman posits in Things that make us smart: Defending human attributes in the age of the machine:

In my opinion the easy part of the prediction is the technology. The hard part is the social impact; the effect on the lives, living patterns, and work habits of people; the impact upon society and culture… It is the social impact of technology that is least well understood, least well predicted. (Norman 1993, 186)

Even historiographically, with the clear benefit of hindsight, changes in production are generally better documented, and easier to describe, than are their links with social effects, even more so when the phenomena in question are still in process. Nevertheless, I would like to try to identify a few observable social phenomena that I think could be interpreted as effects of the digital production, distribution, and consumption cycle of text, notably the way consumption is technologically enabled by the digital substrate. In doing so, I would especially like to focus on one aspect of digital textuality, viz. textual liquidity, or the fact that text displayed on the digital screen remains machine processable below the reading surface. Textual liquidity means that the computer can be used not just for text production, but also as a reading machine.[3] I would like to speculate about the social effects of digital text technology on reading, taking into account both the delayed effects (comparable to those of the printing press) and the direct effect of the computer as a reading machine. My interest is ultimately in reading as a social practice at large more than in professional, academic "power" reading. Yet because it makes sense to assume that experimentation that begins in the academic sphere eventually percolates down into society at large, I would like to look also at some nascent scholarly practices.

The social impact of the computer as reading machine

The new type of text consumption born from digital textual transmission is characterized nicely by Alain Giffard's term "industrial" reading (Giffard 2011, 99; see also Giffard 2009). Barring a few exceptions, digital reading takes place online and involves the docuverse: the entirety of documents (or nodes) available online, comprising all of the new text formats the WWW has spawned in addition to every single genre that ever existed in a paper format. The docuverse also integrates the realms of moving images, still images, and sound, all of which can be accessed by anyone who is connected to the network every moment of the day, every day of the year. In the docuverse the collection level (that is to say, the infrastructure) and the content level merge. To be connected to the network is to be able to search not only for any resource that is linked to it, but directly inside the content of that resource. Further, digital reading is unlike reading from paper in that there is no text on the screen when it is turned on. The liquidity of the digital text requires the reader always to select which characters, words, and sentences are brought up on the screen to begin with. Reading in the docuverse thus involves a greater amount of what Giffard calls "pre-reading." Finding and selecting, that is, navigating through the entirety of connected nodes and documents, are intrinsically and continuously part of the actual act of reading. Information is often found and accessed directly inside the text, by full-text searching. But selection in the pre-reading stage may also result from programmed algorithms, such as those of the Google search engine, or personalized news pages or RSS feeds. Either way, this means that books or articles are often not engaged in the way they were intended or imagined by their authors, i.e. as presenting integral arguments to be read as a whole. This is not a new phenomenon; scholars, for example, have always read discontinuously.[4] However, reading books from cover to cover is becoming a less standard practice. Moreover, conventional controlled, hierarchical "knowledge systems" such as library catalogues and subject bibliographies like Current Contents or that of the Modern Language Association (MLA) are increasingly being abandoned in favour of the full-text search or the recommendations of any of the many peer-groups to which one might like to belong online, depending on the subject (Van der Weel 2011). The corollary of forgoing the use of conventional bibliographical expert systems is that readers take on increasing responsibility for what and how they read. Apart from the more philosophical question about readers' responsibility for their own future as a reading audience, Giffard identifies three main categories of reader responsibility: (1) for the closure of the text: the reader selects, collects, and binds together the different fragments of texts found; (2) for the actual reading process: overcoming the many obstacles to maintaining one's attention on the text; (3) for the technology: computing demands familiarity with, and control over, a host of widely varying new technologies in the form of devices and their interfaces (Giffard 2013). To these we might add what I have termed "the deferral of the interpretive burden, which [is shifting] more and more from the instigator of the […] communication to its recipient" (Van der Weel 2010). The more readers choose their own route through the docuverse, the less use they will make of authorial guidance, ranging from the merest narrative exposition to full analysis of facts.

That the docuverse comes with a bottomless supply of built-in distraction (Giffard's second point) does not mean that it is not possible to engage in immersive reading on a device that is online 24/7, just that it requires a great deal more discipline than reading offline. Distraction is built into the digital reading environment and is often built into the very text, in the form of hyperlinks, ads, and navigational aids. As a consequence, there is a tendency towards reading shorter texts or snippets of longer ones, and, as a result, a more superficial engagement with text. (This is convincingly argued by Nicholas Carr in The shallows: How the Internet is changing the way we think, read and remember, Carr 2010.) Research shows, moreover, that people do not bring as much mental effort to reading on screens as on paper to begin with (Ackerman and Goldsmith 2011; Liu 2005; Nielsen 2006).

I have speculated elsewhere that as permanent and reliable online access is becoming the rule rather than the exception, an interesting change might be about to take place in the economy of reading (see Van der Weel 2014). In the drive to make as much text as possible accessible digitally, one of the unintended side effects of the immaterial (or virtual) nature of digital text may well be that, following music and films, e-books might be moving out of an ownership and into an access paradigm. Access has always been the paradigm in the case of the mass media, such as radio, television, newspapers (whose ephemeral materiality hid the fact that they were about access more than ownership), and, before videotape, also film. It would not surprise me if this development away from material ownership were in time to be matched by its mental counterpart. That is to say, it seems conceivable that not owning a text materially might tend toward making its intellectual content less one's own figuratively. In this scenario reading practices might also become less aimed at learning or remembering. There are indeed signs that the nature of the digital substrate is furthering such a tendency away from storing information in our brain. Two characteristics of the docuverse are especially relevant in this regard. Firstly, there is the continuous and increasingly reliable accessibility of the docuverse. This makes it practicable to relegate more memory functions to it, as is indeed occurring (for example, see Hooper and Herath 2014; Sparrow 2011). Note, incidentally, that this phenomenon has reinforced the notion among some educators that it might be feasible to burden students less with learning facts, emphasizing fact-finding strategies instead. Secondly, digital screens offer less help with longer-term storage of information in the brain. This is because, in contrast to the printed page, the liquidity of screens plays havoc with the brain's need to map its reading in the same way as it maps its physical environment. This need derives from the "neuronal recycling" mentioned above (Dehaene 2009), which involves the repurposing of natural shapes in reading. (Mark Changizi has called this phenomenon "harnessing," Changizi 2011b; see also Changizi 2011a; Jabr 2013; Morineau et al. 2005; Szalavitz 2012.)

The ability to turn the docuverse into external human memory is only one of the more obvious ways in which the very availability and accessibility of text is fostering the use of the computer as a reading machine, but it has fascinating potential ramifications for the future of reading. In an academic context it is, for example, stimulating the "reading" of "big data." Just as bankers might analyze large amounts of numerical data to detect fluctuations in share values or currency that are invisible to humans, humanists are similarly learning to extract invisible knowledge from very large corpora of text using pattern recognition. Such "distant reading," to use Franco Moretti's term (Moretti 2007; other terms mooted for the same phenomenon have been "machine reading" in Hayles 2012 and "macroanalysis" in Jockers 2013), includes, for example, "culturomics," the study of cultural trends through quantitative analysis of digital texts. (Michel and Lieberman Aiden 2011 helped Google build its Ngram Viewer, which uses n-grams to discover cultural patterns in language use over time on the basis of the Google Books corpus.) One current research project in the Netherlands involves statistical analysis of texts identified as "literary" in the hope that this may yield tools that can determine the "literariness" of other texts. One may imagine that such software would, for example, allow publishers to trawl writing blogs on the Web automatically for promising literary talent. Distant reading is regarded by many, including, crucially, many funding bodies, as the next big thing in humanities research.[5]
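To make the notion of reading by n-gram a little more concrete, the following minimal Python sketch counts how often a phrase occurs, relative to all phrases of the same length, in texts grouped by year. It is an illustration only and makes no claim to reproduce how the Ngram Viewer works: the function names `ngrams` and `relative_frequency_by_year` are hypothetical, and the toy corpus consists of invented sentences attached to invented years.

```python
from collections import Counter


def ngrams(text, n):
    """Return the word n-grams (as tuples) of a lower-cased text."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def relative_frequency_by_year(corpus, phrase):
    """Map each year to the share of that year's n-grams equal to `phrase`."""
    target = tuple(phrase.lower().split())
    n = len(target)
    result = {}
    for year, texts in sorted(corpus.items()):
        counts = Counter()
        for text in texts:
            counts.update(ngrams(text, n))
        total = sum(counts.values())
        # Avoid dividing by zero when a year has no n-grams of this length.
        result[year] = counts[target] / total if total else 0.0
    return result


# Hypothetical toy corpus: invented sentences keyed by invented years.
toy_corpus = {
    1900: ["the printed page fixes the text", "the printed page is fixed"],
    2000: ["the screen keeps the text liquid", "the printed page is rare"],
}

trend = relative_frequency_by_year(toy_corpus, "printed page")
for year, share in trend.items():
    print(year, round(share, 3))
```

Dividing by the yearly total rather than reporting raw counts is what makes years with very different amounts of surviving text comparable. A real study would in addition need proper tokenization and a corpus many orders of magnitude larger.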

Reading big data is a thriving field whose meteoric rise has only just begun. Currently it remains very much a scholarly pursuit, but its appeal is such that it will no doubt end up being used more in the popular domain as well. In the meantime, much discussion has been devoted to the nature of the research that can be performed with the use of statistical methods and, notably, the extent to which such research might or might not be antithetical to the pursuits of the humanities (see Drucker 2012 and McCarty 2010). This is part of the wider question of how the use of the computer affects scholarship epistemologically. The need to adjust our concept of knowledge as a result of the use of computers is a constructivist notion that is fairly widespread (see, e.g., Weinberger 2016). (That the increasing use of computers would affect the nature of human knowledge was an issue already noted by Jean-François Lyotard in The postmodern condition: A report on knowledge, Lyotard 1984, and addressed acerbically by Roy Harris in his provocative essay The language machine, Harris 1987.) What has received less attention, however, is the nature of the corpus, or corpora, that serve as its input. It seems important to note that there is a watershed between the corpus of digitized legacy materials and the corpus of digital-born texts and other data. Both can be used (and are advocated to be used) as research objects for distant reading. The difference between them is partly a matter of sheer quantity: the amount of digital material (digitized and born-digital) is set to grow by sixty per cent this year, and non-digital information has already been dwarfed, reduced to a tiny percentage of the total amount. But more important is the qualitative difference between the digital data that are now being created and generated and the analogue data of yore. Two factors are especially relevant here. 
The first is the growing proportion of this wealth of digital data that concerns information about the use that individuals make of data and services, including textual corpora: in other words, consumption metadata. The second is the changing nature of the primary data as a result of the democratization of textual production mentioned earlier. From culling odd bits of information from written and printed sources we are now rapidly moving to a situation in which we have access to the full digital documentation of the minutiae of people's daily lives. This also draws attention to the inevitable tension between what is thought to be intrinsically important to researchers and what research interests may be triggered by the contingent availability of material or methods. I will return to this later. (Another interesting question is to what extent the use of datasets and quantitative methods vs the interpretative ways of the humanities might introduce a two-culture division within the humanities.)

As a counterpart to distant reading, close reading is another potential function of the computer as a reading device. This is not different in principle from the sort of close reading propagated by the New Critics, but it is a closer form of reading than humans can engage in without instrumental assistance. A nice example is the discovery by Tanya Clement of a patterning in Gertrude Stein's The making of Americans so subtle as to have eluded all readers so far (Clement 2008). Here the use of the computer has yielded new insight into the structural "virtuosity" of Stein's novel that would simply not have been available without the affordances of the digital substrate: in casu, the computer as a reading device. With its already superhuman, and ever growing, powers of memory and retention as well as processing, the computer as a reading machine looks like it is achieving the same status as the telescope had for seventeenth-century observers, "enlarging 'the whole range of what its possessors could see and do [so] that, in the end, it was a factor in changing their whole picture of the world'" (as quoted in McCarty 2012, 113). Used as an extension of the reading brain, the computer, like the microscope and telescope before it, extends the bandwidth of the human senses, in line with McLuhan's observations about media. Just as cosmology and particle physics represent two extremes outside the range of the human eye, distant reading means reading outside of the normal human range, enabling one "to see" things in texts that cannot be seen by the naked eye/brain.[6] It is not unthinkable that (as an example of experimentation in the academic sphere percolating down into society at large) the algorithms enabling such glances at the extremes of our textual world might eventually become available as apps or other commonplace forms.

How far will the computer affect reading practices?

Using the computer as an extension of the reading brain changes the dimensions of our textual universe. But does it also mean that we are finally realizing that age-old dream of "calculating" language? As we saw (and as David Olson described so masterfully), with the cultural evolution of writing technology, we managed over time to extend our linguistic communication to writing. As humans we are not likely to be interested in what the computer might have to say to us (at least not until it shows signs of a great deal more mental activity). Aristotle called humans ζῷον λόγον ἔχον ("zoon logon echon"), or an animal with language. What we are interested in is that speaking animal's thinking, and how that animal communicates the results of its cogitations to other speaking animals. So any calculating of language we would like the computer to do is necessarily tied to human-produced utterances.

Moreover, the textual resources of print and typography extended the semantic possibilities of speech in a number of ways (see Ong 2004 for some examples of particularly sophisticated use of typography). Currently, such typographically-enriched written forms of human expression are beyond being machine readable. The machine cannot interpret typographic conventions the way humans can. This is the chief problem that markup has been designed to solve, but it can do so only through human intervention. But even the naked words themselves remain by and large as challenging as ever when it comes to calculating language. It is a challenge that goes back to the early modern era, and the sheer length of its history shows not only how deeply felt is the need to meet it, but also the magnitude of the task. The problematics of the use of language for human communication began to be felt particularly acutely in the seventeenth century, as the need to communicate the results of new scientific observations and experiments grew more pressing and demanding. The search for a suitable scientific discourse led thinkers to mathematics, whose rational austerity provided a model to counteract, or even subdue, the unruly subjectivity of language. Descartes placed mathematical truths above all suspicion, "on account of the certitude and evidence of their reasoning," and wrote his Meditationes de prima philosophia (1641) "more geometrico": according to the geometrical method. Spinoza used the method to great effect for his famous Ethica (published 1677 in Opera posthuma). But the appeal of mathematics went beyond the great rationalist philosophers.

After Pascal presented his first mechanical calculator in 1642, the mechanical manipulation of tokens presented itself as an alternative to the mathematical style in discursive prose. The idea of a "reasoning engine" took hold. Thomas Hobbes had already thought of the reasoning brain as a form of computing: "By ratiocination, I mean computation," he wrote in De corpore (1655), and he meant this literally as (mentally) adding and subtracting thoughts (Sawday 2007, 239). Leibniz, too, was looking for a way to reduce the natural semantic ambiguity of language and embraced the idea of "calculating" language. In 1673 he demonstrated to the Royal Society what he called, with a reference to Hobbes, a calculus ratiocinator. The instrument could perform all of the four arithmetical operations (addition, subtraction, multiplication, and division). This was two more than the addition and subtraction of Pascal's machine, by which it had been inspired, but his sights were set higher still. Leibniz envisaged eventually substituting mathematical symbols for the constituent parts of any logical argument. The machine he had in mind would thus allow the solution of problems of knowledge and philosophy by purely mechanical computation, once "all truths of reason [had been] reduced to a kind of calculus" (letter from Leibniz to Nicholas Remond, 10 January 1714, as quoted in Sawday 2007, 241, from Dyson 1997, 36). The needs of science and those of everyday conversation obviously diverged, not to say clashed, and everyday conversation won. Language being the cultural tool it is, and being intended to convey shades of grey that are likely to resist the mechanical manipulation of tokens, it is unlikely that Leibniz's vision will ever come to pass.

The history of the codification of semantics in typography on the one hand, and the endeavours to reduce semantic ambiguity through recourse to logic and mathematics on the other, can thus be regarded as parallel advances toward the same larger purpose of calculating language to achieve greater precision in textual communication. But ironically, while the aim of the codification of semantics in typography was to improve the precision of written communication by making the subjective part of language more objective, typography actually extended the non-linguistic semantic richness of language, adding to rather than diminishing the challenge presented by that elusive goal of computing language. Curiously, both typographic and computer "processing" of text can be said to involve a reduction of language to manipulable tokens. Just as a typesetter need not understand the linguistic meaning created by the characters he strings together, the computer cannot understand the linguistic meaning created by the tokens it strings together.

The answer to the question of whether the computer as a reading machine might begin to realize that age-old dream of "calculating" language must at this stage be "no." Yes, the computer is, in I.A. Richards' phrase, "a machine to think with," even more sophisticated than the book, but it does not help us with the semantic challenge, much less deliver the mathematical certainty that the seventeenth century was looking for.

It may ultimately be impossible to evade the semantic problems inherent in the communication between two human beings, but that is not to say that no progress has been made at all. Even if the semantic problem has not been solved, it could be argued that on the "reading" (or consumption) side of the equation it can now be established, at least probabilistically, what a text or group of texts could be saying. What is more, while in terms of the input (or production) side, computers cannot do much beyond giving us a little help, what they can do is give intelligible shape to the outcome of their "reading," be it distant or close. And interestingly, the presentation of the results of their "reading" does not necessarily have to be textual. In the form of maps, graphs and trees (Moretti 2007), computer output can evade the need to use language altogether.

Some concluding questions

Does the discovery of the computer's reading potential mean that digital textuality is now coming into its own? And if digital textuality is now coming into its own, could it be that just as long centuries of books and printing have conditioned us to read in a particular way, our screens are now set to condition us to read in the particular way suggested by the inherent characteristics of the computers that drive them? And if our screens should indeed be nudging us into directions away from our typographic past, what consequences will that have for our textual future? In slightly different words, these questions are variations on one particularly relevant, perhaps even burning, question that we as humanists have been asking ourselves for decades without ever receiving a satisfactory answer: do we keep doing what we always did, only now with the help of the computer, or does the computer lead us to do different things altogether? One of the most persistent discussions in digital humanities has been about the question of whether the computer is really permitting us to pursue a qualitatively different sort of research, or whether its chief merit is that it allows the same things to be done faster, more smartly, or more efficiently. And when we ask ourselves this, we ought also to ask ourselves, why are we even looking for qualitative changes? In the sciences new tools are invented to be able to get closer to answering basically the same questions. If a change of tool sends us humanists to ask different questions, does that mean that the old questions were not good enough (or, heaven forbid, that they were the wrong ones)? If we should find ourselves not doing the same thing, we should at least pause to ask why not, and again, what might be the consequences of that.

This leads me to ask the equally burning, more general question: do we shape the technology to help us do what we already wanted to achieve, or does the available technology instead steer us in a direction which, whether desirable or undesirable, was not one that we chose ourselves? In other words, do we build the infrastructure to help us answer the questions we want to ask or do we end up asking the questions the infrastructure suggests we ask? As humans we like to think of ourselves as firmly in control of our inventions. But how much control do we really have? It was not a vision of the Culturomics project that moved Google to attempt to digitize the world's books. Rather, it was the existence of the infrastructural resource of Google Books that inspired people (with the aid of Google grants) to start thinking of how to use it. Ultimately, I wonder if our new reading machines do not demand to be fed just like nineteenth-century printing machines, which led to the exploitation of new, more popular genres (Erickson 1996). That is to say, are our machines not asking us to adapt our reading and research behaviour? Though we devise them as our servants, might we not end up obeying them as the Molochs they have become?

These are very big questions to be asking, but even if it might be too early to answer them, I think we should not shy away from asking. As we have seen, the properties of the digital medium and digital textuality are only gradually being discovered, and so the social implications for reading are also only gradually becoming clearer. As in the case of printing, the introduction of digital technology is merely where the real revolution, which is social, begins to unfold. Yet the way the computer is used to read liquid digital texts is already affecting current reading practices. All modes of textual production are historically contingent, along with the social mode of reading they foster. The liquidity of text is not only fostering new ways of "industrial" reading, but also a new perspective on text. It looks like we are about to relegate the familiar paradigm of typographic reading (deep and attentive reading of texts resulting from a hierarchical mode of production), developed over centuries of print, to a secondary place. Typographic ways of reading, dominant for centuries, may turn out to be a historical contingency. That is not to say that we can simply lay to rest the long inheritance of Homo typographicus. As long as we keep reading fiction for entertainment, forms of typographic reading will probably persist. But, for better or for worse, the digital textual medium will encourage other, very different types of reading. The computer's potential as a reading machine has little need for typographical conventions. Beyond access lies not only a continuation of what we did before computers and have been doing with the help of computers in the phase when we used them to imitate typography: increasingly, computers will steer our efforts in directions that are inspired by the inherent characteristics of the machine. 
This may not help our memory; it may make us worse readers (Wilkens 2012 for one has suggested that it will "almost certainly make us worse close readers"). But for good or ill, a new reading paradigm will ineluctably change our literate mentality.


Notes

[1] This is an edited version of my Institute lecture delivered at the DHSI 2012, Victoria, BC. The original text of the Institute lecture can be found at http://www.themediares.com/pages/parlance/reading-machines.html (Van der Weel 2012).

[2] My use of this phrase was inspired by the title of Roger Chartier's L'Ordre des livres, published in French in 1992 and translated into English as The order of books: Readers, authors, and libraries in Europe between the fourteenth and eighteenth centuries (Chartier 1994; see also Van der Weel 2011, 67-103).

[3] In Disappearing through the skylight: Culture and technology in the twentieth century, O.B. Hardison Jr. identifies two ways in which technologies can be used. "Classic" use is when "the technology is being used to do more easily or efficiently or better what is already being done without it ... The alternative is to use the capacities of the new technology to do previously impossible things, and this second use can be called 'expressive'" (Hardison 1990, 236). Clearly, here we are witnessing the transition from "classic" to "expressive" use of computers.

[4] Much has been written about the importance of discontinuous reading for scholarship. The bookwheel, invented by the Italian military engineer Agostino Ramelli, is its emblematic evocation. Anthony Grafton, The footnote: A curious history can be read as an exquisite ode to the pleasures of discontinuous reading—and writing (Grafton 1997).

[5] Digging into Data (http://www.diggingintodata.org/), funded by research organizations in Canada, the Netherlands, the United Kingdom, and the United States, is one of the more prominent examples of such funding initiatives. For an influential recent book about the promise of Big Data research, see Stephen Ramsay's Reading machines: Toward an algorithmic criticism (Ramsay 2011).

[6] Katherine Hayles has suggested a spectrum of forms of reading, one end of which is machine reading. She suggests, however, that the extension of the spectrum is accompanied also by qualitative change, requiring greater efforts at interpretation: "If events occur at a magnitude far exceeding individual actors and far surpassing the ability of humans to absorb the relevant information … 'machine reading' might be a first pass toward making visible patterns that human reading could then interpret" (Hayles 2012, 29).


Works cited / Liste de références

Ackerman, Rakefet and Morris Goldsmith. 2011. "Metacognitive regulation of text learning: On screen versus on paper." Journal of Experimental Psychology: Applied 17.1: 18-32.

Carr, Nicholas. 2010. The shallows: How the internet is changing the way we think, read and remember. London: Atlantic Books.

Cassirer, Ernst. 1972. An essay on man. New Haven and London: Yale University Press.

Changizi, Mark. 2011a. Harnessed: How language and music mimicked nature and transformed ape to man. Dallas, TX: BenBella Books.

---. 2011b. "The problem with the Web and e-Books is that there's no space for them." Psychology Today, February 7. Accessed February 2, 2016. http://www.psychologytoday.com/blog/nature-brain-and-culture/201102/the-problem-the-web-and-e-books-is-there-s-no-space-them.

Chartier, Roger. 1994. The order of books: Readers, authors, and libraries in Europe between the fourteenth and eighteenth centuries. Stanford, CA: Stanford University Press.

Clement, Tanya. 2008. "'A thing not beginning and not ending': Using digital tools to distant-read Gertrude Stein's The making of Americans." Literary and Linguistic Computing 23.3: 361-381.

Dehaene, Stanislas. 2009. Reading in the brain: The science and evolution of a human invention. New York: Viking.

Descartes, René. 1641/1960. Discourse on method and meditations. Translated by Laurence J. Lafleur. New York: The Liberal Arts Press.

Drucker, Johanna. 2012. "Humanistic theory and digital scholarship." In Debates in the digital humanities, edited by Matthew K. Gold, 85-95. Minneapolis: University of Minnesota Press.

Dyson, George. 1997. Darwin among the machines. London: Penguin.

Eisenstein, Elizabeth. 1979. The printing press as an agent of change: Communications and cultural transformations in Early-Modern Europe. Cambridge: Cambridge University Press.

Erickson, Lee. 1996. The economy of literary form: English literature and the industrialization of publishing, 1800-1850. Baltimore and London: Johns Hopkins University Press.

Giffard, Alain. 2009. "Des lectures industrielles." In Pour en finir avec la mécroissance: Quelques réflexions d'Ars industrialis, edited by Bernard Stiegler, Alain Giffard, and Christian Fauré, 115-216. Paris: Flammarion.

---. 2011. "Digital reading and industrial readings." In Going digital: Evolutionary and revolutionary aspects of digitization, edited by Karl Grandin, 96-104. Stockholm: Centre for History of Science.

---. 2013. "Digital readers' responsibilities." In The unbound book, edited by Joost Kircz and Adriaan van der Weel, 81-89. Amsterdam: Amsterdam University Press.

Grafton, Anthony. 1997. The footnote: A curious history. London: Faber and Faber.

Harari, Yuval Noah. 2014. Sapiens: A brief history of humankind. London: Harvill Secker.

Hardison, O.B. Jr. 1990. Disappearing through the skylight: Culture and technology in the twentieth century. Penguin Books.

Harris, Roy. 1987. The language machine. Ithaca: Cornell University Press.

Hayles, Katherine. 2012. How we think: Digital media and contemporary technogenesis. Chicago: University of Chicago Press.

Hooper, Val and Channa Herath. 2014. "Is Google making us stupid? The impact of the Internet on reading behavior." In Bled 2014 Proceedings 1, AIS Electronic Library (AISeL). http://aisel.aisnet.org/bled2014/1.

Jabr, Ferris. 2013. "The reading brain in the digital age: The science of paper versus screens." Scientific American, April 11. Accessed February 2, 2015. http://www.scientificamerican.com/article.cfm?id=reading-paper-screens.

Jockers, Matthew L. 2013. Macroanalysis: Digital methods and literary history. Champaign, IL: University of Illinois Press.

Liu, Ziming. 2005. "Reading behavior in the digital environment: Changes in reading behavior over the past ten years." Journal of Documentation 61.6: 700-712.

Lyotard, Jean-François. 1984. The postmodern condition: A report on knowledge. Translated by Geoff Bennington and Brian Massumi. Manchester: Manchester University Press.

McCarty, Willard. 2010. "Introduction." In Text and genre in reconstruction: Effects of digitalization on ideas, behaviours, products, and institutions, edited by Willard McCarty, 1-11. Cambridge: Open Book Publishers.

---. 2012. "A telescope for the mind?" In Debates in the digital humanities, edited by Matthew K. Gold, 113-123. Minneapolis: University of Minnesota Press.

McGann, Jerome J. 2001. Radiant textuality: Literature after the World Wide Web. New York: Palgrave.

Michel, Jean-Baptiste and Erez Lieberman Aiden. 2011. "Quantitative analysis of culture using millions of digitized books." Science 331.6014: 176-182.

Moretti, Franco. 2007. Graphs, maps, trees: Abstract models for literary history. London: Verso.

Morineau, Thierry, Caroline Blanche, Laurence Tobin, and Nicolas Guéguen. 2005. "The emergence of the contextual role of the e-Book in cognitive processes through an ecological and functional analysis." International Journal of Human-Computer Studies 62.3: 329-348.

Nielsen, Jakob. 2006. "F-Shaped pattern for reading web content." NN/g Nielsen Norman group: Evidence-Based User Experience Research, Training, and Consulting, April 17. Accessed February 2, 2016. http://www.nngroup.com/articles/f-shaped-pattern-reading-web-content/.

Norman, Donald. 1993. Things that make us smart: Defending human attributes in the age of the machine. Reading, MA: Addison-Wesley.

Olson, David. 1994. The world on paper: The conceptual and cognitive implications of writing and reading. Cambridge: Cambridge University Press.

Ong, Walter J. 2004. Ramus, method, and the decay of dialogue from the art of discourse to the art of reason. Chicago: University of Chicago Press.

Ramsay, Stephen. 2011. Reading machines: Toward an algorithmic criticism. Champaign: University of Illinois Press.

Reiss, Timothy J. 2000. "From trivium to quadrivium: Ramus, method, and mathematical technology." In The renaissance computer: Knowledge technology in the first age of print, edited by Neil Rhodes and Jonathan Sawday, 45-58. London: Routledge.

Sawday, Jonathan. 2007. Engines of the imagination: Renaissance culture and the rise of the machine. Abingdon: Routledge.

Sparrow, Betsy, Jenny Liu, and Daniel M. Wegner. 2011. "Google effects on memory: Cognitive consequences of having information at our fingertips." Science 333.6043: 776-778.

Szalavitz, Maia. 2012. "Do e-Books make it harder to remember what you just read? Digital books are lighter and more convenient to tote around than paper books, but there may be advantages to old technology." Time, March 14. Accessed February 2, 2016. http://healthland.time.com/2012/03/14/do-e-books-impair-memory/.

Taylor, Astra. 2014. The people's platform: Taking back power and culture in the digital age. New York: Metropolitan Books.

Vandendorpe, Christian. 2009. From papyrus to hypertext: Toward the universal digital library. Translated by Phyllis Aronoff and Howard Scott. Urbana, IL: University of Illinois Press.

Valéry, Paul. 1960. "Les deux vertus d'un livre." In Oeuvres, edited by Jean Hytier. Vol. 2: 1249. Paris: Gallimard.

Van der Weel, Adriaan. 2009. "Explorations in the Libroverse." Paper published in Going digital: Evolutionary and revolutionary aspects of digitization, edited by Karl Grandin, June 23-26, Stockholm, Sweden: Centre for History of Science. 32-46.

---. 2010. "New mediums: New perspectives on knowledge production." In Text comparison and digital creativity, edited by Wido van Peursen, Ernst D. Thoutenhoofd and Adriaan van der Weel, 253-268. Leiden: Brill.

---. 2011. Changing our textual minds: Towards a digital order of knowledge. Manchester: Manchester University Press.

---. 2012. "Feeding our reading machines: Digital textuality 3.0." Parlance, July 2. Accessed June 22, 2016. http://www.themediares.com/pages/parlance/reading-machines.html

---. 2014. "From an ownership to an access economy of publishing." Logos 25.2: 39-46.

Weinberger, David. 2015. "The internet that was (and still could be)." The Atlantic, June 22. Accessed February 2, 2016. http://www.theatlantic.com/technology/archive/2015/06/medium-is-the-message-paradise-paved-internet-architecture/396227/.

---. 2016. "Rethinking knowledge in the internet age." Los Angeles Review of Books, May 2. https://lareviewofbooks.org/article/rethinking-knowledge-internet-age/#!

Wilkens, Matthew. 2012. "Canons, close reading, and the evolution of method." In Debates in the digital humanities, edited by Matthew K. Gold, 249-258. Minneapolis: University of Minnesota Press.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.