On 2 April 1750 George Washington recorded the first entry in his ledger book, a practice he would continue until his death. Though only eighteen at the time, he clearly took pride in maintaining meticulous, accurate, and clear financial records. It is also evident that Washington was aware of best accounting practices, including those outlined in the accounting textbook Book-keeping methodiz’d by John Mair, a book he purchased for his stepson, John Parke Custis, in 1762 (Gambino and Palmer 1976, 14-15; Abbot 1990, 168). Over the course of his life, Washington accumulated thousands of financial documents in which he recorded every transaction, from money lost at cards to money spent on food and textiles for Mount Vernon, the cost of tutors for Martha’s children, and medical treatments for slaves. He detailed his shares of the Potomac Company, money lent to friends, and rents collected from his western landholdings. Washington was a savvy entrepreneur and successful businessman who sought innovation, took calculated risks, and, because of his methodical recordkeeping, had a broad understanding of the American economy. The Washington financial papers will enrich our understanding of George Washington and offer insight into his times, on topics ranging across the histories of economics, material culture, manufacturing, and agriculture.

The ledgers, journals, account books, invoices, waste books, expense accounts, and bills of exchange on deposit at the Library of Congress, Mount Vernon, and repositories throughout the United States together represent one of the richest corpora of early American financial documents. And while many of the manuscripts are available for viewing at the archives or online, historians of Washington and Mount Vernon have only touched on the rich information the documents contain. An article by Helen Cloyd briefly examines Washington the accountant, noting that his interest in bookkeeping began in 1747 and illustrating this point with a few excerpts from his correspondence and financial documents (1979). A dissertation has also appraised Washington’s business affairs and resources (Peterson 1970). Later works touch on the same themes of Washington the businessman, farmer, and accountant, as well as his interest in the economic situations both at Mount Vernon and beyond (Smith 1993, 65-66; Achenbach 2004, 24). Shenkir, Welsch, and Bear, in an article on Thomas Jefferson, summarise the utility of his financial documents, describing them as being "of immeasurable value to historians and their biographical work on Jefferson." Specifically, "the use of [Jefferson’s] financial records as an invaluable resource in the writing of history can be observed by juxtaposing the historian’s writings with the entries in the Account Book"; most of the works examining Washington and his finances take the same approach (1972, 35).

Missing from the literature, though, is a thorough examination of the detailed transactions, Washington’s system of accounting, and fluctuations and changes in prices, currencies, and goods. This is not surprising considering the difficulties encountered while working with the documents: dense, encrypted content; multiple entries of single transactions; a complex web of documents; multiple currencies within the same document; as well as transcription issues. Despite these challenges, technologies now exist that will enable us to make these financial documents intellectually accessible; specifically, they hold the potential for content mining, textual analysis, currency valuations, tracking of purchased and sold goods, and examinations of relationships, both business and personal. From these records we can learn the amount of money Washington spent in a given year, with whom he had financial dealings, how the price of certain commodities fluctuated over time, the types of goods he purchased and how much was spent on specific items, and how various currencies (such as those of Maryland and Virginia) were valued.

Making Washington’s financial papers accessible had been an early goal of ours at the Papers of George Washington project (PGW), but given their complexity and our means of publication, very little had been done. In the 1980s PGW published, in print, several pages from the ledgers in the Colonial Series of the Papers of George Washington (for example, see Abbot 1998, 182; Figure 1).

Figure 1: Cash Account.

Over the years we have included a few other cash accounts as documents and occasionally used excerpts from various financial documents for our annotations. Ideas about how we might make these documents available began to take shape as PGW moved forward with the digital rendition of the letterpress volumes. We began to think about accessibility solutions for the financial papers, and our ideas grew and evolved with the huge advances made in the field of digital humanities in the last few years. In 2009 we decided the financial papers should be a digital-only edition, but the questions of how and where to start remained. First and foremost our goal has been to make these documents available online not only to broaden the potential audience, but perhaps more importantly to publish in an environment where an accurate transcription could be displayed while at the same time facilitating a variety of search and browse options as well as data mining and analysis. Developing a wish list for what we wanted was relatively easy, but answering the consequent questions about how to accomplish these goals proved challenging. We realised that how the documents are structured, fielded, and/or tagged is a critical consideration in the analytic process and not simply a matter of rendering handwriting as type on a screen. The means to an end would shape the end, determining much about how and what editors and historians can understand about these texts.

Issues surrounding an initial XML-based approach

When we first started thinking about how to digitise the financial papers, our initial thought was to add them to The Papers of George Washington Digital Edition (PGWDE). Published by Rotunda, the electronic imprint of the University of Virginia Press, this edition contains all previously published letterpress (print) volumes. The PGWDE was the first publication in Rotunda’s American Founding Era Collection to move from print to digital. The schema (using the TEI P5 with some customisations) used in this XML-based publication was very much informed by the letterpress’s structure, defining the visual presentation and organisation of the documents. But while it was important to mirror the letterpress edition, this remediated edition also exploited the searching and displaying capabilities of the digital medium. It was and remains a very good solution for moving legacy publications from letterpress to digital and to date has been the model for subsequent remediations of edited primary materials in the Founding Era collection. The markup captures document structure: salutation, dateline, body text, signature, and postscripts. Other PGW-specific elements are also tagged, including editorial and source notes, annotation, repository codes, short titles, cross-references, as well as metadata elements that are used in indexing and searching. In the digital edition users can browse by series, volume, and document, as well as peruse the cumulative index. Text, author/recipient, and date searches can also be performed.

This XML-based solution worked well for the majority of the Washington edition, as well as for other editions based on personal papers (including Thomas Jefferson and the Adams Family), because these papers consist mostly of letters and almost entirely of documents that are narrative and discursive. Moreover, letters and other dated documents (such as orders and proclamations) can easily be arranged hierarchically: chronologically, then alphabetically. This is not to suggest that all individual documents can stand alone: most cannot. Therefore, in order to contextualise and connect related documents, cross-references within the documents are noted (and linked in the digital edition) and ancillary information and associated documents are mentioned in the annotation. But unlike these narrative and discursive documents, financial documents are tabular in structure, span decades-long periods of time, are not particularly useful when read from beginning to end, and are uniquely dependent on other documents for full understanding.

It is also instructive that the very question of just what constitutes a single "document" in the financial papers is problematic. This "document" designation, with its corresponding markup, while adequate for presenting transcriptions of letters, reports, and circulars, seemed too confining for complicated financial records. For instance the three ledger books, each of which could be considered an individual document, total around 850 folios in length (about 1,700 pages). That would mean, in the current digital edition structure, once users navigated to the "document" they would have to scroll through hundreds of pages. Treating individual folios as documents is also problematic because one ledger folio can contain entries from multiple accounts, whether with an individual or commercial entity, noting cash expenditures and income, rents collected, and dealings in tobacco. These accounts were frequently carried forward to other folios and ledgers. Furthermore, a folio page with multiple accounts contains transaction dates that are chronological only within each individual account, not the folio. Next we discussed whether individual transaction lines could be treated as documents, modelling our document structure loosely on the structure adopted for Washington’s Diaries. But unlike diary entries, a transaction line taken out of its context–what financial document it comes from, the account it is associated with, where it is located within the financial document (in the case of a ledger, the debit or credit side)–is not only problematic but in many cases unintelligible.

With these challenges and realisations in mind, we revisited the issue of how to balance the two desired outcomes: a searchable transcription and calculable data. Given that we are not only concerned with preparing accurate transcriptions and presenting them logically, but with making these documents intellectually accessible, we reevaluated our approach. Instead of initially focusing on what constitutes a "document" in the financial papers, we decided it would be better to explore how these documents will be used and researched, and what technologies would help us accomplish that outcome. In order to make these records not just available but truly accessible, we needed a solution that would enable us to develop a behind-the-scenes index and glossary that would power complex searching and browsing features, capture and define the financial content so that users could run calculations, and connect related records (the essence of Washington’s system of accounting) so that this complex web of documents could be understood.

The tabular structure of financial records immediately brings to mind the possibilities of a relational database. Much has been written about double and single entry accounting and relational databases in the fields of accounting, business, and information systems (Wang, Du, and Lee 2002; Wigley 2013; Perry and Newmark 2012); however, absent from the literature is the question of how a relational database might be useful for historical financial records. Nevertheless, we found the theories of entity-relationship modelling (an abstract way of describing a database in which there are entities and relationships) to be applicable and useful when conceptualising our plan for dealing with Washington’s financial records (Chen 1976). After consulting with Peter McMickle, professor in the School of Accountancy at the University of Memphis, who has designed and implemented several accounting systems, the idea of a relational database became even more appealing. It was not until we partnered with a group working on developing a content management system for the editing and publishing of digital documentary editions (DocTracker) that we were able to fully explore the possibilities of a database solution.
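To make the entity-relationship idea concrete, the following is a minimal sketch in Python using the standard sqlite3 module. It is not DocTracker's actual structure (DocTracker is built in FileMaker Pro): every table name, column name, and sample value here is a hypothetical chosen for illustration. It shows three entities (documents, accounts, and entries) and the relationships that let an entry be located by document, account, folio, and ledger side.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative
# assumptions, not DocTracker's real data model.
SCHEMA = """
CREATE TABLE document (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    doc_type TEXT NOT NULL            -- e.g. 'ledger', 'journal', 'waste book'
);
CREATE TABLE account (
    id INTEGER PRIMARY KEY,
    holder TEXT NOT NULL              -- person or entity named in the header
);
CREATE TABLE entry (
    id INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES document(id),
    account_id INTEGER NOT NULL REFERENCES account(id),
    folio INTEGER,
    side TEXT CHECK (side IN ('Dr', 'Cr')),
    entry_date TEXT,                  -- ISO date string
    description TEXT,
    amount_pence INTEGER              -- amount held in a single unit
);
"""


def build_demo_db():
    """Create an in-memory database holding a few invented entries."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO document VALUES (1, 'Demo ledger', 'ledger')")
    conn.execute("INSERT INTO account VALUES (1, 'Demo account holder')")
    conn.executemany(
        "INSERT INTO entry VALUES (?, 1, 1, ?, ?, ?, ?, ?)",
        [
            (1, 5, "Dr", "1760-01-02", "invented debit entry", 240),
            (2, 5, "Dr", "1760-01-09", "invented debit entry", 120),
            (3, 5, "Cr", "1760-02-01", "invented credit entry", 300),
        ],
    )
    return conn


def totals_by_side(conn, account_id):
    """Sum the amounts on each side (Dr/Cr) of one account."""
    rows = conn.execute(
        "SELECT side, SUM(amount_pence) FROM entry "
        "WHERE account_id = ? GROUP BY side ORDER BY side",
        (account_id,),
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    print(totals_by_side(build_demo_db(), 1))
```

Because each entry row carries its own document, account, folio, and side, a single transaction line keeps the context it would lose if it were treated as a free-standing document.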

The database/DocTracker solution

DocTracker (DT2) is a content management database for documentary editions built in FileMaker Pro by Bob Oeste of Dataccompli Software Solutions and Mary MacNeil of the Dolley Madison Digital Edition and the University of Virginia Press. Initially crafted for a born-digital edition of a collection of mainly letters, DocTracker centralises and facilitates the complete editorial process: document search and collection, document cataloguing, transcription and markup, source-image storage, editorial workflow management, gathering of metadata, annotation, and XML output. Beginning in July 2012, DT2 partnered with the Papers of George Washington Financial Papers Project and the Civil War Governors of Kentucky Digital Documentary Edition to expand DocTracker’s "menu of core functions, and to tackle data entry and output requirements of complex documentary materials" (DocTracker2 2013). A significant part of this expansion included the development of solutions for editing, representing (both transcription and data), and publishing financial documents. The National Historical Publications and Records Commission (NHPRC) provided funding for the development work and eventual release of the beta and final versions of DocTracker, which will be freely available by June 2014 at http://www.doctracker.org. The Papers of George Washington also recently (July 2013) received a three-year grant from the NHPRC to further develop DocTracker’s financial records functionality, develop a Web interface for financial documents, build an open-access web prototype with basic search and download features, and to write and publish a guide for developing digital editions of financial papers projects.

Our partnership with the DocTracker team gave us an opportunity to explore how a database might negotiate the challenges presented by financial documents, such as managing and interpreting multiple currencies, fluctuating valuations, and the system of barter, as well as handling complex document relationships. But the most critical questions, at least as we make the first pass in preparing these documents for digital publication, come down to two. First, what is the best way to preserve and present a document as an object that has been accurately transcribed to mirror the handwritten original, while also providing a way for scholars to analyse the financial content in these records? Quite simply, we need to preserve text and create data. Second, how can the database be used to define and organise different types of financial documents such as ledgers, journals, waste books, and receipts? Similarly, are the documents within Washington’s corpus unique or regularised according to type? And if the latter, do the relationships among them follow standard accounting practices?

Issue one: transcribing text and creating data

Creating an accurate transcription in DocTracker is straightforward; document structure and content definitions are captured using fields for transcribed text within the database. The issue that needed to be addressed in the database development early on was what to do with instances of incomplete data such as dittos and abbreviations, missing dates and amounts, and partial headings. In Figure 2 the account holder named in the header (George Fairfax), a date, and a few examples of "ditto" have been highlighted.

Figure 2: Page from George Washington, 1750-72, Ledger Book 1, Library of Congress, Manuscript Division.

As mentioned earlier, one goal of the project is to provide accurate transcriptions of the documents with very little editorial intervention; however, in order for the text to be fully searchable from the beginning, something had to be done with the incomplete data. The solution is something we termed "docs v. data." Essentially there are two different versions of a document in DocTracker: literal transcription (document) and expanded transcription (data). On the data side dittos, abbreviations, and shorthand have been expanded. This includes everything from abbreviations for tobacco and locations to shorthand for people mentioned in the transactions. Dates have also been added to all ledger lines, as it was an accepted accounting practice to list a date once and not again until a date change occurred (Mair 2012, 7). We also decided to substitute the "Contra" on the right (or credit) side of the ledger with the person or entity associated with the account and added the header terms "Dr" (debit) and "Cr" (credit) to every line on the left and right side of the ledger respectively, because the side of the folio on which an entry appears carries meaning. And finally the transaction totals were decimalised. Other known issues that will eventually be addressed on the data side include dates (or ledger lines) with multiple transactions, currency differences, valuation, barter, and subtotals, as well as missing or incorrect data. The screenshot in Figure 3 is the data view of the manuscript page in Figure 2 where the account holder, date, and ditto expansions have been highlighted.

Figure 3: Data View in DocTracker.
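The "docs v. data" expansion can be sketched in a few lines of Python. This is an illustrative approximation only, not DocTracker's implementation: the `Line` class, the `DITTO_MARKS` set, and the function names are hypothetical, and the ditto rule is deliberately simplified (a line consisting solely of a ditto mark repeats the previous line's text, whereas real ledger dittos often stand in for only part of an entry). The decimalisation assumes amounts in pounds, shillings, and pence, with 20 shillings to the pound and 12 pence to the shilling.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

# Assumed set of ditto marks; a real ledger would need a fuller list.
DITTO_MARKS = {"ditto", "do", "do.", "dº"}


@dataclass(frozen=True)
class Line:
    """One literally transcribed ledger line (hypothetical structure)."""
    date: Optional[str]   # None where the manuscript leaves the date blank
    text: str
    pounds: int
    shillings: int
    pence: int


def to_decimal_pounds(pounds, shillings, pence):
    """Decimalise a pounds/shillings/pence amount (240 pence to the pound)."""
    total_pence = (pounds * 20 + shillings) * 12 + pence
    return (Decimal(total_pence) / Decimal(240)).quantize(Decimal("0.0001"))


def expand_lines(lines, side, account_holder):
    """Build the 'data' version of a run of lines from one side of an account.

    Carries the last stated date forward, repeats the previous text for
    whole-line dittos, stamps every line with 'Dr' or 'Cr' and the account
    holder, and decimalises the amount.
    """
    expanded, last_date, last_text = [], None, None
    for line in lines:
        date = line.date if line.date is not None else last_date
        is_ditto = line.text.strip().lower() in DITTO_MARKS
        text = last_text if (is_ditto and last_text is not None) else line.text
        expanded.append({
            "side": side,
            "account": account_holder,
            "date": date,
            "text": text,
            "amount": to_decimal_pounds(line.pounds, line.shillings, line.pence),
        })
        last_date, last_text = date, text
    return expanded
```

The literal transcription (the list of `Line` objects) is left untouched, so the document and data versions can coexist as the text describes.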

Issue two: financial document types

The collection of Washington’s financial documents numbers in the thousands. The three main documents – Ledgers A, B, and C, ranging from 1750 to 1799 – contain basic business accounts where Washington recorded everything from the purchase of farm and household products to the sale of slaves, the work of artisans, and farm operations. There are also a number of account books kept both by Washington and his various farm managers, cash memorandum books where Washington recorded personal and business expenditures (Figure 4), receipts, bills, and household expense accounts (Figure 5), along with numerous other miscellanea.

Figure 4: Page from George Washington, 1772-73, Cash Memorandum Book, Library of Congress, Manuscript Division.

Figure 5: Page from George Washington, May 9, 1787, Daily Expenses, Library of Congress, Manuscript Division.

Though there are thousands of documents, because Washington seems to have followed common accounting practices, most can be categorised according to type: ledger, cash memoranda, and journal of accounts. Metadata and templates structured on the different types of financial documents will be developed and applied to every new document brought into the system. How documents relate to each other also needs to be addressed: for instance, transactions were quickly recorded in a waste book, moved to a journal, where a bit more information was added, and finally recorded in the ledger, so that a single transaction would usually have been recorded in several places. This set of relationships will have to be managed in the database, most likely using links, and carefully organised and displayed in the resulting publication.
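One way to picture such links is as a pointer from each record to the earlier record it was copied from. The Python sketch below is a hypothetical illustration, not a description of how DocTracker stores relationships: the `Record` class, the `copied_from` field, and the identifiers are all invented for the example. It follows the chain from a ledger entry back through the journal to the waste book.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Record:
    """A single recorded transaction in one financial document."""
    record_id: str
    doc_type: str                      # 'waste book', 'journal', or 'ledger'
    text: str
    copied_from: Optional[str] = None  # id of the earlier record, if known


def trace_chain(records: Dict[str, Record], record_id: str) -> List[Record]:
    """Follow copied_from links back to the earliest known record."""
    chain: List[Record] = []
    seen = set()
    current: Optional[str] = record_id
    while current is not None and current not in seen:
        seen.add(current)              # guard against accidental link cycles
        record = records[current]
        chain.append(record)
        current = record.copied_from
    return chain


if __name__ == "__main__":
    # Invented sample data for demonstration only.
    demo = {
        "w1": Record("w1", "waste book", "brief note of a sale"),
        "j1": Record("j1", "journal", "sale with added detail", "w1"),
        "l1": Record("l1", "ledger", "posted sale", "j1"),
    }
    for rec in trace_chain(demo, "l1"):
        print(rec.doc_type, "-", rec.text)
```

A publication interface could use the same chain in either direction, showing a reader every place in which one transaction was recorded.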

Though the primary focus of our development and editorial work has thus far been on the ledgers (double-entry accounts), the DocTracker development team has built additional interfaces to work with other financial records. A "Simple" and flexible table interface will be used for documents that do not fit into the single- or double-entry account structure (Figure 6). Values can be entered into unspecified fields and converted to XML by clicking on the XML tab; the fields can then be defined in the resulting XML (Figure 7). The "Single-entry table" interface contains both transcription and data and will be used for documents such as journals of accounts and daybooks (Figure 8). And the "Double-entry table" interface will be used for the ledgers (Figures 9 and 10).

Figure 6: View of the Simple table interface in DocTracker.

Figure 7: View of the XML created from the table in Figure 6.

Figure 8: Single entry table in DocTracker.

Figure 9: Double entry table displaying the debit/left side of a ledger in DocTracker.

Figure 10: Double entry table displaying the credit/right side of a ledger in DocTracker.
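The conversion performed by the Simple table interface, in which values sit in unspecified fields that are defined only after conversion to XML, can be approximated with Python's standard xml.etree.ElementTree module. This is a sketch under stated assumptions: the element names (`table`, `row`, `cell`) and the attributes (`n`, `role`) are placeholders chosen for the example, and they do not reproduce the XML that DocTracker actually generates or claim conformance to any schema.

```python
import xml.etree.ElementTree as ET


def table_to_xml(rows, field_names=None):
    """Serialise a list of rows (lists of strings) as simple XML.

    Cells without a supplied field name are only numbered ('n'), mirroring
    fields left unspecified at data entry; cells with a supplied name carry
    it in a 'role' attribute, mirroring fields defined afterwards.
    """
    table = ET.Element("table")
    for row_values in rows:
        row = ET.SubElement(table, "row")
        for index, value in enumerate(row_values):
            cell = ET.SubElement(row, "cell")
            if field_names and index < len(field_names):
                cell.set("role", field_names[index])
            else:
                cell.set("n", str(index + 1))
            cell.text = value
    return ET.tostring(table, encoding="unicode")


if __name__ == "__main__":
    # Invented sample row for demonstration only.
    print(table_to_xml([["invented item", "0.25"]]))
    print(table_to_xml([["invented item", "0.25"]], ["description", "amount"]))
```

Deferring the field definitions in this way keeps data entry flexible for documents that fit neither the single- nor the double-entry structure.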


As we move forward with development and spend more time working with the financial documents, we will undoubtedly come across other issues that need to be examined and resolved. In fact we are already starting to think about what the next phase might consist of: creating a glossary of people, places, things, and concepts; annotating the documents; and linking the financial documents with related documents in the digital edition. Another topic of interest as the project develops is how the cumulative index of the Papers of George Washington might provide an intellectual framework for the financial documents, not only in making decisions about what types of content should be tagged, but also in how we might use index entry structures to help categorise important topics in the financial papers, such as agricultural activities, manufacturing, and farm management. Another challenge will be figuring out what kind of XML output we need for displaying the transcriptions in the digital publication and whether a schema based on the Text Encoding Initiative (TEI) guidelines will work for financial documents. Kathryn Tomasek and Syd Bauman are currently tackling this issue with the development of "transactionography," which "models transactions as a sequence of one or more transfers of anything of value from one account to another" (2012).

There is still much to be done both in thinking about these documents and in developing DocTracker, but it is this process of trial and error, creating and rethinking, challenge and innovation that makes this project both exciting and rewarding. Indeed, these are certainly exhilarating times to be involved in the field of digital humanities. And though George Washington probably never imagined future generations would be interested in his financial records, we are confident that he would be pleased with our meticulousness, creativity, and innovation.

