Introduction

The Web of 2009 tantalizes the user who approaches it with an interest in history. Hughes and Greengrass's recently published The Virtual Representation of the Past makes clear the variety of newly developed resources that are available. Nevertheless, the ever-increasing array of historical source material, archival records, inscriptions, and reports now published on the Web, including online topic-based websites and online journals, is all too often just beyond the grasp of the non-specialist scholar. Even today, professional historical research on the Web depends on the researcher knowing the right websites ahead of time, rather than on simple queries of the Web as a whole. Some component of this global information network should be capable of supporting a query for all online historical evidence relating to a given time and place. For instance, it should be possible to enter the query “1767 AD” and be brought into contact with the newspaper transcriptions from that year provided by Costa’s Geography of Slavery, with the pertinent proceedings of the Old Bailey courthouse in London published online by Emsley et al., and with the large number of other online sources associated with that year. It should be possible to receive highly relevant results from such queries instead of the largely irrelevant results often generated by a Google search. Such a service would, moreover, make online historical research more useful and more pertinent to the interested layperson. Imagine, for instance, a family visiting Brittany to trace their family roots. What texts, artefacts and scholarly discussions exist, they might ask, that pertain to Brittany at the time when their ancestors came to North America? The online summaries of a local tourist bureau or Wikipedia articles are unlikely to suffice, but if the online digital contributions of local and national museums and archives were made available, such a family could plan their personal historical journey of discovery far more effectively.

There is a second audience that would potentially be interested in using a unified entry point for historical research: the growing number of people who willingly contribute their expertise online through wikis, blogs, and discussion fora (Sunstein 149-164). Such a group, provided with a general outline or index and simple digital tools for associating secondary materials with this index, would become a powerful adjunct to the core of professional researchers for a given topic, filling in helpful ancillary materials such as references to journal articles, online discussions, and references in the popular media. In short, a usable, global overview of historical resources on the Web might become a sort of seed crystal, whose introduction facilitates the seemingly spontaneous creation of a much larger matrix of interconnected information.

The requirements for such a system are demanding, but this paper will show that a combination of Web 2.0 programming techniques and Semantic Web technologies and standards will go a long way toward meeting this challenge. First, a common schema specifying the basics of historical information is required. Moreover, individual projects must be able to provide their data in this schema, or related ones, without being required to use them internally. For the application of such a schema to be possible, researchers will require a means to translate the metadata governing documents from one schema to another, ideally through a method that requires nothing from the documents’ original authors. The first part of this paper, therefore, shows how, with slight modifications, the applications and standards of the Resource Description Framework (RDF) support these requirements.

Second, historians need to consider the kinds of tools that will effectively enable users to browse thousands, even hundreds of thousands, of historical events. Supporting such a task not only entails questions relating to user interface design, but also raises problems pertaining to the volume of data that may be exchanged. In order to illustrate these issues, the second part of this paper introduces the Fawcett Toolkit, a computer application built upon Web 2.0 techniques and current standards relating to the markup of humanities documents. Published under a free and open source licence, and built upon similarly licensed software, the Fawcett Toolkit represents the current state of the Historical Event Markup and Linking Project, ongoing research into markup languages designed to aggregate historical event data and represent them as computer-generated visualizations such as maps, timelines, and animations (Robertson 1051-1052; Robertson).

Each advance in work such as this brings into view new opportunities and new technical and social impediments. The third part of this paper outlines two such issues. First, it demonstrates that historical data acquired online require developers to undertake a more careful approach to the selection and transmission of text expressed in languages other than English. Second, I suggest that one implication stemming from projects such as the Fawcett Toolkit is the following: if the digital historical community wishes to support the powerful and precise aggregation of historical content on-line, it must make open licensing of raw data a fundamental requirement for academic publishing. Only then will it be possible to build a network of historical information that enables specialist and non-specialist users to access the full array of content and analyses created by digital historians.


1.0 Schemas for Historical Events

A reasonable consensus has emerged regarding the data types and relationships that are needed to categorize the past online. The Heml project provides XML schemas, first devised in 2001 and revised in 2003, that outline the past as a series of events (Robertson). In these schemas, “events” minimally comprise a textual label, such as the “Battle of Actium,” and a temporal label that can be resolved to machine-readable formats. Optionally, an event may comprise participants (who in turn might have roles in the event) and locales. Finally, these schemas connect the event model with evidence, either online or in print. The Heml XML Schema was designed to provide a missing component within the context of a much more comprehensive set of specifications for XML, the Text Encoding Initiative's (TEI's) P4 Guidelines. The P4 Guidelines, published in The TEI Consortium: guidelines for electronic text encoding and interchange, included every other component required to compose a digital historical commentary (TEI Consortium, 2002). They did not, however, offer the capability to encode historical events, a practice that makes it easier to generate timelines, maps, and other guides for the reader.

TEI's P5 specification fills this gap (TEI Consortium, 2007). It includes event tags, though the exact modalities of their use are slightly different from those of Heml. At present, TEI event tags appear to have a primarily descriptive function, since they must appear within a tag describing a person (or a place), and, according to common convention, nested XML tags imply that the enclosing tag has ownership over the enclosed elements. However, as we shall see, TEI P5 event tags have an excellent array of possible qualifiers, and can be associated with places defined elsewhere in the text and with anchor tags indicating the evidence within the text for the event.

In a much-quoted turn of phrase, Dempsey notes the “recombinant potential” of reusable and pervasive cultural information on the Web when encoded in the more formal language of propositional logic. Indeed, recent research shows that historical data encoded in this manner can make good use of artificial intelligence research through data-mining techniques (Ciravegna et al.). Somewhat overlooked, perhaps, is the potential of Semantic Web technologies to represent broad, not necessarily deep, knowledge about the past: to link together data from highly disparate sources and schemas into a common schema, and then to make this pool easily searchable.

The specification that appears most likely to meet humanists’ requirements in this respect is the CIDOC Conceptual Reference Model, or CIDOC-CRM (Doerr, Hunter, and Lagoze 169). It, too, models historical events. While the TEI appears to use historical events as descriptors for places and persons, the CIDOC-CRM was built to meet the needs of the field of cultural heritage, and, in particular, the needs of researchers and practitioners describing its objects. It defines these objects largely in terms of the events they undergo, and thereby through the people and places associated with them. It has achieved the status of a standard approved by the International Organization for Standardization (ISO), the leading international standards-setting body (ISO 21127:2006). In the digital humanities literature, the potential applicability of the CIDOC-CRM standard has generated considerable interest (Eide).

The CIDOC-CRM has not been without its detractors. Rejecting its suitability as a set of specifications to support an ancient-world data-mining project, Gillies points out that the CIDOC-CRM’s orientation toward historical events puts entities such as people, places, and inscriptions at a distance from each other. And he notes that there are large problems left unsolved by the CIDOC-CRM, particularly those of reference and evidence. As a result, Gillies proposes to delay its implementation in his own informatics; in this paper I hope to show that the incompleteness of the CIDOC-CRM for any given task need not deter us from using it to describe many things and relations (Gillies).


2.0 Historical RDF

The Semantic Web is based on a technology known as the Resource Description Framework, or RDF (Beckett). Although an XML representation of RDF data exists, the scholar accustomed to XML should avoid viewing RDF through an XML frame of reference, and instead understand an RDF database or document as one or more statements, with each statement comprising three parts: a subject, a predicate (or property), and an object. This is illustrated in Table 1. Each row of that table is a single statement, sometimes called a “triple.” In the subject column, strings with underscores, such as john_hammond, represent variables; in common practice these are encoded with Uniform Resource Identifiers (URIs) and represented to users through the string associated with the variable, using a predicate such as has_label.


    Subject                        Predicate/Property    Object
 1  john_hammond                   has_label             “John Hammond”
 2  founding_of_owens_gallery      has_label             “Founding of Owens Art Gallery”
 3  founding_of_owens_gallery      has_participant       john_hammond
 4  founding_of_owens_gallery      has_date
 5  hammond_visits_san_frans       has_label             “Hammond Visits San Francisco”
 6  hammond_visits_san_frans       has_participant       j_hammond
 7  j_hammond                      owl:sameAs            john_hammond

Table 1: Historical Statements in RDF


Such “raw” data is usually expressed to the user only in the format shown in the Object column. It is usually “machine readable” material – dates, names, latitude and longitude markers – that the computer uses to compute and then form visualizations for the user. This information is not necessarily contained in one document, repository, or library. As long as all the documents and data use the same URIs when referring to the same concept, the statements may originate from multiple databases, documents, and even web services. Moreover, the order of the statements does not change the meaning of the set of statements.
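Written out in the Turtle serialization of RDF, the statements of Table 1 might read as follows. The ex: namespace and property URIs here are illustrative assumptions rather than any project's actual vocabulary, and the incomplete has_date statement is omitted:

    @prefix ex:  <http://example.org/history#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .

    # Each line is one statement, or "triple".
    ex:john_hammond              ex:has_label       "John Hammond" .
    ex:founding_of_owens_gallery ex:has_label       "Founding of Owens Art Gallery" .
    ex:founding_of_owens_gallery ex:has_participant ex:john_hammond .
    ex:hammond_visits_san_frans  ex:has_label       "Hammond Visits San Francisco" .
    ex:hammond_visits_san_frans  ex:has_participant ex:j_hammond .

    # Statement seven: the two URIs denote one and the same person.
    ex:j_hammond owl:sameAs ex:john_hammond .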

Scholars in the Humanities who have worked with digital documents are probably familiar with technologies that define a markup specification, such as Document Type Definitions (DTDs) or XML Schemas. These formally state which markup elements may appear and constrain their location and relationships within the document, indicating, for instance, that an element comprising a quotation may not appear within an element denoting a personal name, or that an element denoting an historical event must have some chronological information associated with it. In common use, these are given as input to computer programs that declare whether a document “conforms” to the DTD or XML Schema. Most documents “declare” their DTD or XML Schema at the beginning of the document, and the processor will fail with an error if the document does not conform to the specification to which it professes to adhere. Given this stricture, general-purpose tools for markup languages can be applied to conforming documents with good effect.

When dealing with XML documents, one common application that relies on document specifications is a program that transforms a first document, such as one conforming to the TEI specification expressed in a DTD or XML Schema, into another kind of XML document, say an XHTML file that is specified in its own DTD or XML Schema, and can be read by a web browser. The technology most commonly used for this is “Extensible Stylesheet Language Transformations” (XSLT). Because the source document conforms to an XML specification, the author of the XSLT program can safely assume that the locations and governing relations specified for the source document will pertain in all conforming documents, and can therefore reliably replicate them in the output document. Elsewhere, the conforming XML data is used as the input to a program that generates a visual component, such as a chart. In most cases, the specification exists as a kind of contract between the source of data and the processes that consume it, and so it might be said that the tools for specifications commonly employed by Humanities scholars define data in order to constrain it.

While it is possible to use constraining specifications for RDF, it and other features of the Semantic Web are driven by technologies that require a shift in thinking for developers and users. In general, it may be said that Semantic Web tools specify data not in order to constrain it, but rather to permit its discovery and interpolation. We may illustrate the potential of this approach with an example. Consider statement three in Table 1, and assume that has_participant is defined explicitly as a predicate that associates historical events with the human agents that participate in them. (In an XML definition, this “has a participant” relationship is usually implied through a nested structure, just as an XML paragraph implies a “has a sentence” predicate because it has sentences nested within it. In RDF, the “has a participant” relationship is expressed through the three-part statement, or “triple.”) Using common XML approaches, this information alone would not pass muster: minimally, an historical event should have chronological information associated with it, and this requirement would be defined in the DTD or schema.

Applications that use specifications for the Semantic Web, such as RDF Schemas, take a much more liberal view of this data. If, as we stated above, the predicate has_participant is defined to associate historical events with human agents, then associated technologies will deduce that the URI founding_of_owens_gallery is an historical event, unless otherwise informed. The fact that founding_of_owens_gallery does not yet have a date or a labelling text associated with it does not impede this deduction, because such associations can be provided by data from another, as yet unknown, source. In terms of formal logic, Semantic Web technologies employ the “open world assumption”: they do not reject a statement as false even if a pertinent datum such as a specified date is missing. As a result, computational tools cannot rely on such RDF specifications to act as a guarantee of the completeness of incoming data.
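A minimal sketch of such a specification in RDFS, with assumed class and property names, shows the difference in spirit:

    @prefix ex:   <http://example.org/history#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # Declaring a domain and range does not reject non-conforming data;
    # a reasoner instead uses the declarations to infer new statements.
    ex:has_participant rdfs:domain ex:HistoricalEvent ;
                       rdfs:range  ex:HumanAgent .

    # From statement three alone, a reasoner concludes that
    # ex:founding_of_owens_gallery is an ex:HistoricalEvent and that
    # ex:john_hammond is an ex:HumanAgent, even though no date is given.
    ex:founding_of_owens_gallery ex:has_participant ex:john_hammond .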

Table 1 illustrates another powerful and simple RDF tool that supports the discovery of historical events. With reference to lines one and six, the reader may suspect that two variables are being used to represent one person, John Hammond. This is a likely scenario, since the aggregation of historical data will result in multiple URIs that, in fact, refer to the same person, place or event. However, the statement in line seven makes the two URIs, john_hammond and j_hammond, equivalent, so that the same person is represented as participating in events where either URI appears. [1]


3.0 The Fawcett Toolkit

The remainder of this paper presents the Fawcett Toolkit, a complementary suite of general-purpose digital tools to support work on the historical Semantic Web. The toolkit first provides components for “servers,” computer programs that collect and reconcile the data described above and then send out parts of that data in response to queries. It also provides components for “clients,” typically web browsers running on computing devices across the Internet. These browsers access web pages containing components that query a server. The server replies to the query with the appropriate data, and the clients use these to produce historical maps or other visualizations. The servers and clients must of course agree on a common language for the queries and replies. The World Wide Web Consortium has defined just such a query language for RDF, called SPARQL, and it has specified a standard response format for replies to queries. Though in some respects SPARQL is less powerful than other RDF query languages, it has the advantage of being well implemented, especially in the Joseki server software available from Hewlett-Packard. The version of Joseki provided in the Fawcett Toolkit is equipped with an RDF schema file that allows it to discover CIDOC-CRM events in heterogeneous RDF data. Data is loaded into the server after being transformed with, for example, an XSLT stylesheet that converts conforming TEI P5 documents into the CIDOC-CRM-based markup that the server uses internally. Thus, besides its original purpose as a means of visualizing large numbers of historical events from across the Web, the toolkit can be used as an adjunct to a digital library that comprises historical material or as “scaffolding” while marking up TEI P5 event tags.
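For instance, a client's most basic request might resemble the following SPARQL query. The CIDOC-CRM namespace and class name are shown schematically with placeholders, since the toolkit's internal vocabulary is not reproduced here:

    # Placeholder namespace; the toolkit's actual CIDOC-CRM URI may differ.
    PREFIX crm:  <http://example.org/cidoc-crm#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT ?event ?label
    WHERE {
      ?event a crm:E5_Event ;      # anything the server has discovered to be an event
             rdfs:label ?label .
    }
    ORDER BY ?label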


4.0 A Simple Event Schema Based on the CIDOC-CRM

The CIDOC-CRM is a general and abstract model, one that intentionally avoids specifying how to encode machine-actionable data pertaining to time and location. It also does not provide a model for associating evidence with events. In modelling events for the current Fawcett Toolkit, we have taken a pragmatic approach to these matters, using the CIDOC-CRM's classes and properties where appropriate and supplementing them with simple relations where necessary. Illustration one presents part of a sample event in graphical form, with the blue elliptical nodes representing URIs, the white rectangles representing strings, numbers and other machine-actionable data, and the arcs between nodes representing the property relationships between RDF resources. The illustration shows that some shortcuts have been taken to simplify the markup in comparison to the complete CIDOC-CRM. For example, the schema here assumes that all locations are points, and so it uses the common geo: namespace to identify latitude and longitude. The representation of time is also simplified compared to the complete CRM schema, though the Fawcett Toolkit can still process all the chronological relations offered by the Heml XML Schema.[2]



Figure 1: RDF Representation of an Event Originally Encoded in TEI P5
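In the Turtle notation, the kind of event shown in the illustration might read roughly as follows. The geo: namespace is the W3C Basic Geo vocabulary; the ex: namespace, the event and property names, and the specific values are illustrative assumptions, not the toolkit's exact terms:

    @prefix ex:   <http://example.org/fawcett#> .               # placeholder namespace
    @prefix geo:  <http://www.w3.org/2003/01/geo/wgs84_pos#> .  # W3C Basic Geo
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # An event node (an ellipse in the illustration) and its literals.
    ex:hammond_at_kamloops
        rdfs:label   "Hammond at Kamloops" ;   # hypothetical label
        ex:has_date  "1871-08-15" ;            # simplified date relation
        ex:has_place ex:kamloops .

    # Locations are treated as points, so latitude and longitude suffice.
    ex:kamloops
        rdfs:label "Kamloops" ;
        geo:lat    "50.67" ;
        geo:long   "-120.33" .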

The project's SPARQL server, packaged as part of the toolkit release, contains 5,600 events and over 115,000 statements. The “reasoners” built into the Joseki server are provided with simple rules for interchange between specifications for historical data, and using these they derive many of the events and statements from other RDF schemas. For instance, some events were encoded by students using “Semantic MediaWiki,” an extension to the software that runs Wikipedia (Krötzsch, Vrandecic, and Völkel). Using this extension, the MediaWiki environment can function as a user-friendly RDF editor. However, the data encoded by the wiki environment employed a different schema from the modified CIDOC-CRM schema described here and were therefore adapted to this schema by using the techniques described above.

Other events were rendered into conforming RDF, represented in XML, through XSLT transformations of XML source documents. Because of their novelty and general utility to Humanities scholars, special attention was given to the XML event tags in the TEI P5 specification, with the aim of exploring how the event visualization tools could interact efficiently with digital library software such as the Perseus Digital Library, which is now available under a free and open-source license. The goal was to augment the users' experience of the library without adding code to the library’s software. The Perseus Digital Library was installed locally, and a summer research student converted the Perseus text of Sallust's Bellum Catilinae from the older P4 standard of the TEI to the more recent P5 standard. He added appropriate P5 event tags, as well as place and person tags in Latin and English. An XSLT stylesheet was written to convert this XML-encoded event data into RDF conforming to the CIDOC-CRM specification, and the data, stored alongside other datasets in the server software, could then be queried and visualized on Web clients.

The results were satisfactory enough to inspire us to encode a wholly new text, the diary of John Hammond as he traveled from Montreal to the West Coast in 1871, a holding of the Mount Allison University Archives (Hammond).[3] The historical markup of this text was not completed, but enough was done to confirm the results obtained from the ancient text. This finding is significant because the Hammond diary uses a different means of sectioning the text and provides temporal descriptions of events that are sometimes precise to the hour. Examples from the Hammond diary are used to illustrate visualization techniques in the following two sections.


5.0 A Historical Mapping Widget


Figure 2: The Mapping Widget

Although there are many experimental visualizations of the RDF data available in the Fawcett Toolkit, the most developed of these is the mapping widget appearing in Illustration two. The visualization shown here exemplifies the approach we have taken in order to maximize the amount of historical information available in one viewing while minimizing the demands on the server. In the Fawcett Toolkit, the SPARQL query engine is the only server software associated with the RDF data, and its only task is to reply to the client's queries, encoding the data in the lightweight JSON format. The construction of queries and the decoding of JSON responses are encapsulated in a common JavaScript library used by all Fawcett tools, and this library itself makes use of the sparql.js JavaScript query library (Feigenbaum, Torres, and Yung).

The JSON-only approach benefits the process in two ways. First, it offloads the tasks of drawing the map and its dynamic components onto the client computer, a fair exchange since there are far more clients than servers. Second, it reduces the size of the files that must be communicated from server to client.

Further reductions in transmitted data are gained because the mapping widget uses AJAX-like communication to add further information to the map as the user interacts with it.[4] On its first drawing, the map widget builds a layer within the OpenLayers JavaScript mapping toolkit. This layer does not include the event information for each location (some of which might be expected to be associated with a large number of events). Rather, each point is encoded with its co-ordinates, label, and the URI of the location in the RDF database. When the user mouses over a location, a SPARQL query is constructed to request the corresponding list of events, which are then rendered in chronological order below the map. In Illustration two, the user has moused over the location corresponding to the mouth of the Columbia River, and the single event that takes place at this location, according to the original TEI document, is “Hammond heads toward Portland, OR ....” In this way, AJAX programming allows the user to navigate a potentially very large dataset with very few delays.
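Schematically, the query issued on mouseover might resemble the following, reusing the assumed vocabulary from the earlier sketches; the location URI is the one stored with the map point when the layer was first drawn:

    PREFIX ex:   <http://example.org/fawcett#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT ?event ?label ?date
    WHERE {
      ?event ex:has_place ex:columbia_river_mouth ;   # the moused-over point
             rdfs:label   ?label ;
             ex:has_date  ?date .
    }
    ORDER BY ?date

The server's reply arrives in the standard SPARQL results JSON format, which the common JavaScript library decodes before the events are rendered below the map.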


6.0 Rendering Source Documents Inline

In order to improve user experience, the same interactive approach was extended to the display of textual evidence associated with the event. In previous work with Heml-Cocoon software, source documents were accessed through labelled hyperlinks. This was not an optimal solution, however, because it drew users away from the central navigating tool, the map or the timeline, and it prevented side-by-side comparison of source documents. In print media, it is common to list citations inline or to provide a brief quotation that is embedded in the sentence. For the Fawcett Toolkit, it was hoped that we might find an analogue that could apply in the digital world.

One of the important roles of digital library software like Perseus is so-called “chunking,” a function that divides the text into suitable sections such as books, lines of poetry, or pages of text (Crane and Wulfman 79). The Perseus software serves these chunks not only as fully rendered web pages but also, given the proper address, as fragments of the raw TEI file in XML. Because the chunking pattern of the text is encoded in the TEI file, the XSLT file that creates the RDF can discern that pattern and create conforming addresses for the page links. Thus it is possible for the mapping widget to provide textual evidence drawn from the digital library and render it inline, in a way that does not draw the reader away from the map.

Illustration three shows the relations necessary to make this possible. As was the case with the example shown in Illustration one, the evidence, here given the URI hammond_diary#KamloopsLetter, has a label. Its web address, though, points to the Perseus 'xmlchunk.jsp' service. In addition, the evidence object is associated with an appropriate XSLT file available on the web.[5] This transforms the TEI-encoded XML chunk into a fragment of a web page encoded in XHTML.



Figure 3: Graph of Relations Needed to Render a Reference Inline
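Expressed in Turtle, these relations might look roughly like the following. Only the URIs hammond_diary#KamloopsLetter and hammond_diary#TeiFragmentRenderer and the citation label are drawn from the text; the property names and web addresses are assumptions for illustration:

    @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix hemlRDF: <http://example.org/hemlRDF#> .          # placeholder namespace
    @prefix diary:   <http://example.org/hammond_diary#> .    # placeholder namespace

    # The evidence object: a chunk of the diary served by Perseus.
    diary:KamloopsLetter
        rdfs:label          "The Diary of John Hammond p. 10b" ;
        hemlRDF:webAddress  <http://localhost/xmlchunk.jsp?page=10b> ;  # hypothetical chunk address
        hemlRDF:renderedBy  diary:TeiFragmentRenderer .

    # The renderer names the XSLT file that turns the TEI chunk into XHTML.
    diary:TeiFragmentRenderer
        hemlRDF:xsltAddress <http://localhost/tei-fragment-to-xhtml.xsl> .  # hypothetical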

The result of providing these two relations is a series of user interactions best explained with reference to Illustration four. When the user clicks on the text “Hammond heads toward Portland, OR ...”, the words “Evidence” and “Referred To In” appear below. These are the labels of two kinds of relations between the event and a text. RDF and the software allow for the creation of a custom taxonomy describing the relationships between event and text, which may include categories such as “Eyewitness Account” and “Memoir,” and even types of secondary resources such as “Discussion” and “Refutation.”[6]

When a user clicks on either the “Evidence” or “Referred To In” headings, the citation for each applicable source appears. In the case of our example, only one source appears, “The Diary of John Hammond p. 10b.” However, because a large number of citations could potentially be displayed at the instigation of the user, the data is again rendered through an asynchronous update of the page. Finally, when the user clicks on this citation, the browser fetches both the TEI fragment and the XSLT file whose addresses are provided in the RDF associated with the reference. It then renders the TEI fragment into XHTML using the XSLT file and inserts the resulting XHTML inline. Subsequent clicks on the text, citation, or resource class label will hide and show all enclosed materials, allowing the user to explore this material in place.


Figure 4: Evidence Rendered Inline

Some scholars might be concerned that this approach to rendering an historical text is part of an unfortunate trend in the online publication of documents. It is sometimes claimed that the digital medium excessively fragments texts, doing damage to their cohesiveness or to their narrative integrity. However, it should be noted that texts were fragmented for the sake of historical information many years before the advent of the Internet: the historical sourcebook is a well-established genre. In fact, this inline rendering technique is a substitute not for quoting the text extensively, but rather for citing it and, in all likelihood, leaving it unread. As a result, a technique such as this makes the text more accessible to the person with historical interests, not less. Finally, the digital medium makes it possible to offer multiple paths of research. In this case, it would be a simple improvement to add a link that allows the researcher to load the text in its own browser tab or window in order to explore the context of a particular source or to read a text that has been discovered through a geo-temporal search process.


7.0 Other Experiments

The Fawcett Toolkit includes some less well-developed experiments in historical encoding and visualization. We used a hemlRDF:comprisesEvent property to nest events within each other. For instance, an event labelled “John Hammond's Trip to Western North America” could comprise all the events marked up in the diary, and the single event referring to Hammond’s trip would itself be one of several events nested under an event labelled “John Hammond’s Life,” using the same hemlRDF:comprisesEvent relationship. This hierarchy of events was used to draw a tree list using a component from the Yahoo user interface library. In this form of visualization, the events in the tree that have hemlRDF:comprisesEvent properties are indicated with an expand icon, shown in the form of a “+” sign that, when clicked, retrieves and renders the child events. The nesting is not restricted to fixed ranks, so it could continue through a great number of levels. It is yet another way that large numbers of events can be displayed together.
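A sketch of such a nesting in Turtle, assuming a placeholder namespace URI for hemlRDF: and hypothetical child-event names:

    @prefix hemlRDF: <http://example.org/hemlRDF#> .   # placeholder namespace URI
    @prefix ex:      <http://example.org/fawcett#> .
    @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .

    ex:hammonds_life
        rdfs:label "John Hammond's Life" ;
        hemlRDF:comprisesEvent ex:hammonds_trip .

    ex:hammonds_trip
        rdfs:label "John Hammond's Trip to Western North America" ;
        hemlRDF:comprisesEvent ex:arrival_at_kamloops ,      # child events
                               ex:departure_for_portland .   # from the diary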

Users might want to add to the pool of references associated with an event. The JavaScript experiment named “Libenter” shows how the client’s browser can capture a description of the location on the page of a portion of text that has been highlighted with the browser’s “select” tool. Because SPARQL now includes commands to “update,” or add to, RDF databases, it would be possible to associate with an event the URI of a pertinent webpage and the description of the important part of that page, similar to what was done above with the TEI chunks from the Perseus server.
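Such an association might, as a sketch, take the form of a SPARQL update like the one below; the property names and page URI are hypothetical, and the syntax shown is the later standardized INSERT DATA form:

    PREFIX ex: <http://example.org/fawcett#>

    INSERT DATA {
      ex:departure_for_portland
          ex:referredToIn  <http://example.org/blog/hammond-post.html> ;  # hypothetical page
          ex:selectionHint "second paragraph, highlighted sentence" .     # captured by Libenter
    }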

Finally, the Joseki server used in this work required an important alteration in order to serve historical data successfully. A property function was written for it to translate the various temporal representations applied to events into plain numbers. These numbers allow the client to request a temporal range or data sorted temporally.
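A client query might then take a form like the following, where ex:numericDate stands in for whatever term the property function actually answers to:

    PREFIX ex:   <http://example.org/fawcett#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT ?event ?label
    WHERE {
      ?event rdfs:label     ?label ;
             ex:numericDate ?n .            # derived by the property function
      FILTER (?n >= 1871 && ?n < 1872)      # restrict to a temporal range
    }
    ORDER BY ?n                             # sort temporally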


8.0 Future Work

The examples given above were all drawn from an English source and given English metadata labels, but historical research often involves many languages. It is, of course, considered best for a researcher to read evidence and discussions in the author's original language. However, even the most linguistically adept professional historian, not to mention the interested layperson, cannot be expected to meet this standard at all times. When the historical researcher cannot read the pertinent document in its original language, she would naturally prefer a translation rendered in a language with which she is familiar. For these reasons, a system such as the one described here should, when possible, provide researchers with translated sources and discussions that suit their abilities.

The process of altering the responses of a computer program to suit a user's linguistic and cultural preferences is commonly known as "internationalization." In recent years, several standards have arisen to support this process in the networked world of the Web. First, there is BCP 47, a standard set of abbreviations for languages, their variants, and their computer encodings (Phillips and Davis). Second, documents encoded in XML languages, such as XHTML, identify the language of the text enclosed within an element through an xml:lang attribute set to the appropriate abbreviation. Finally, the hypertext transfer protocol (that is, the language that a Web browser speaks to a Web server when requesting materials) includes a line in the “header” of the request labelled “Accept-Language,” whose list of language tags “restricts the set of natural languages that are preferred as a response to the request” (Fielding et al.). When a server receives a request with this header line, it uses this information to match the text resources available to it, choosing the alternative that is highest on the list of user preferences.
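For example, the header line and the corresponding RDF usage might appear as follows; the event URI is hypothetical, while the header follows the HTTP/1.1 specification and the tags follow BCP 47:

    # An HTTP request header listing preferred languages:
    #
    #   Accept-Language: fr-CA, fr;q=0.8, en;q=0.5
    #
    # In RDF, the same tags can qualify literals directly:
    @prefix ex:   <http://example.org/fawcett#> .   # placeholder namespace
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    ex:battle_of_actium
        rdfs:label "Battle of Actium"@en ,
                   "Bataille d'Actium"@fr .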

This process is appropriate to, for instance, a commercial Web service, but not to the scholarly processes we have in mind, because it implies that all the linguistic representations of a given text are equal in value. At the very least, a text in the author's original language should be preferred if the user's Accept-Language header lists that language among those he can read. Ideally, the server might rate the linguistic fidelity of the source – for instance, ranking a translation by the author above other translations – and serve the best text with this and the user's language abilities in mind. Finally, if the only document available to the browser is in a language that the researcher cannot read, it is important that the server provide this source nonetheless, because its absence would suggest that there is no asserted evidence for the event. A determined researcher would, after all, contact a colleague who could offer assistance with the translation.

The web servers built by the Heml project upon the Cocoon XML publishing engine were able to negotiate language content in this manner, but an unmodified server that only returns RDF in response to SPARQL queries cannot. This lack of negotiation occurs because, in common RDF practice, the metadata that describes the language of resources does not appear within the propositional logic on which SPARQL queries are based. Thus, even though SPARQL makes it possible to filter a query based on language (for example, excluding event labels that are not within a given set), SPARQL does not make it possible to perform queries of the sort “what languages are used to label this event?” Yet it is exactly the latter form of query that is necessary for the more advanced language-matching required by a historian who would like the query to fall back to resources in languages not listed in the Accept-Language list sent by his or her browser. In order to perform these sorts of functions, the metadata that encodes the language of resources must be declared within RDF statements. Since it is not reasonable to hope that those producing the RDF data would adhere to such a language markup scheme, the solution seems to be to modify the RDF server so that it rewrites the RDF graph to express linguistic information in this manner as each RDF source is loaded into the server. Such a system would generate properties defined as subproperties of rdfs:label, one for each linguistic option encoded in BCP 47. These could then be queried through the rdfs:label property using an RDFS reasoner.
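A sketch of the rewritten graph, assuming placeholder property names, makes the idea concrete:

    @prefix ex:   <http://example.org/fawcett#> .   # placeholder namespace
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # One generated subproperty per BCP 47 language tag.
    ex:label_en rdfs:subPropertyOf rdfs:label .
    ex:label_fr rdfs:subPropertyOf rdfs:label .

    # The rewritten data makes the language visible to SPARQL as a property.
    ex:battle_of_actium ex:label_en "Battle of Actium" ;
                        ex:label_fr "Bataille d'Actium" .

    # "What languages label this event?" can now be asked directly:
    #   SELECT ?p WHERE { ex:battle_of_actium ?p ?o .
    #                     ?p rdfs:subPropertyOf rdfs:label . }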

RDFS subproperties can also be used to indicate the identity and authority of the person associating a given property with a given event. If, for instance, scholars disagreed over whether the passage cited in Illustration one (labelled with the URI #KamloopsLetter and expanded in Illustration three) actually provides evidence for that event, it would be possible to indicate that the relation between the event and the date “August 15, 1871” is asserted by Chung but disputed by Lebans, and to generate different chronologies and event lists according to their opinions. This paves the way for satisfactorily using the more complex relations of the CIDOC-CRM, such as causation, that are only acceptable when ascribed to a scholar's opinion. This work is planned for the Summer of 2009.
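A minimal sketch of such attribution, with hypothetical names throughout:

    @prefix ex:   <http://example.org/fawcett#> .   # placeholder namespace
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # Dating properties specialized by the scholar asserting them.
    ex:has_date_per_chung  rdfs:subPropertyOf ex:has_date .
    ex:has_date_per_lebans rdfs:subPropertyOf ex:has_date .

    # Chung's assertion; a client building a chronology according to
    # Lebans would simply omit the properties that Lebans disputes.
    ex:kamloops_event ex:has_date_per_chung "1871-08-15" .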

These, then, are the pitfalls and potentials of a global index of historical resources based on a simplified use of the CIDOC-CRM, served with slightly modified SPARQL servers, and visualized with in-browser JavaScript programs. There remains, however, one paramount issue on which the success of this approach depends: a changing culture of research in the Humanities. It has long been observed that the emphasis in the humanities on solitary research betrays its roots in monastic scholarship. But this lack of a collaborative spirit is perhaps most obvious in the realm of digital publication, where the real basis of the interchange of ideas, such as the data and metadata encoded in databases and TEI-encoded XML files, is rarely published. Indeed, in today's security-minded Internet there are ever-increasing ways for a Humanities project to impede the aggregation of its resources. It should be understood that, as a discipline, History has more to gain from the responsible popularization of its subject matter and from the free interchange of ideas than any scholar stands to lose if his or her data are not always associated with his or her own websites. The biological sciences recognize the benefits of the free flow of data. For example, the Broad Institute, a leading genome research lab, publishes complete mammal genome sequences and makes them available to fellow researchers on the following conditions: “1. The data may be freely downloaded, used in analyses, and repackaged in databases. 2. Users are free to use the data in scientific papers analyzing particular genes and regions if the provider of these data ... is properly acknowledged” (Broad Institute of MIT and Harvard). The rising tide of online historical information is the historians' equivalent of the genome database, and it should be made as widely available, searchable, and interchangeable as possible.



Works Cited

Beckett, Dave. “RDF/XML Syntax Specification (Revised): W3C Recommendation 10 February 2004.” W3C: World Wide Web Consortium Website. 2004. Web. 6 June 2009. <http://www.w3.org/TR/rdf-syntax-grammar>.

Broad Institute of MIT and Harvard. “Horse Genome Project.” Broad Institute Website. 2009. Web. 6 June 2009. <http://www.broad.mit.edu/mammals/horse>.

Ciravegna, Fabio et al. “Finding Needles in Haystacks: Data-mining in Distributed Historical Datasets.” The Virtual Representation of the Past. Ed. Lorna Hughes and Mark Greengrass. Farnham, UK: Ashgate, 2008. 65-79. Print.

Costa, Tom. “The Geography of Slavery in Virginia.” Virginia Center for Digital History Website. 2005. Web. 6 June 2009. <http://www2.vcdh.virginia.edu/gos>.

Crane, Gregory, and Clifford Wulfman. “Towards a cultural heritage digital library.” JCDL ’03: Proceedings of the 3rd ACM / IEEE-CS joint conference on Digital libraries. Houston: IEEE Computer Society, 2003. 75-86. Print.

Dempsey, L. “Divided by a Common Language: Digital Library Developments in the US and UK.” The 4th International JISC/CNI Conference Website, Edinburgh, 26-27 June 2002. 2002. Web. 6 June 2009. <http://www.ukoln.ac.uk/events/jisc-cni-2002/presentations/ppt-2000-html/lorcan-dempsey_files/v3_document.htm>.

Doerr, M., J. Hunter, and C. Lagoze. “Towards a Core Ontology for Information Integration.” Journal of Digital Information 4.1 (2003): 169. Print.

Eide, Oyvind. “The Exhibition Problem. A Real-life Example with a Suggested Solution.” Literary and Linguistic Computing 23.1 (2008): 27-37. Print.

Emsley, Clive, Tim Hitchcock, and Robert Shoemaker. Old Bailey Online. 2009. Web. 6 June 2009. <http://www.oldbaileyonline.org>.

Feigenbaum, Lee, Elias Torres, and Wing Yung. Sparql.js. 2007. Web. 6 June 2009. <http://thefigtrees.net/lee/sw/sparql.js>.

Fielding, R., et al. “HTTP/1.1: Header Field Definitions, June 1999.” W3C: World Wide Web Consortium Website. 1999. Web. 6 June 2009. <http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html>.

Gillies, Sean. “Concordia, Vocabularies, and CIDOC CRM.” 2008. Web. 6 June 2009. <http://concordia.atlantides.org/docs/concordia-crm.html#on-the-cidoc-crm>.

Hammond, John. Mount Allison Archives. John Hammond Fonds, 2008. 8800/2. Print.

Harper, J. Russell. Early Painters and Engravers in Canada. Birkenhead, England: U of Toronto P, 1970. Print.

Hughes, Lorna, and Mark Greengrass, eds. The Virtual Representation of the Past. Farnham, UK: Ashgate, 2008. Print.

Krötzsch, Markus, et al. “Semantic Wikipedia.” Web Semantics: Science, Services and Agents on the World Wide Web 5.4 (2007): 251-261. Print.

Phillips, A., and M. Davis. “BCP 47: Tags for Identifying Languages.” RFC Editor Website. 2006. Web. 6 June 2009. <http://www.rfc-editor.org/rfc/bcp/bcp47.txt>.

Robertson, Bruce. “Exploring Historical RDF with Heml.” Digital Humanities Quarterly 3.1 (2009). Web. 6 June 2009. <http://www.digitalhumanities.org/dhq/vol/003/1/000026.html>.

───. “DTDs and Schemata.” The Historical Event Markup and Linking Project Website. 2009. Web. 6 June 2009. <http://heml.mta.ca/samples/blocks/heml/schemata>.

───. “Visualizing an Historical Semantic Web with Heml.” Proceedings of the 15th International Conference on World Wide Web (WWW’06), Edinburgh, Scotland, May 23-26, 2006. New York: ACM, 2006. 1051-1052. Web. 6 June 2009. <http://www2006.org/programme/files/xhtml/p199/pp199-robertson-xhtml.html>.

Sunstein, Cass R. Infotopia: How Many Minds Produce Knowledge. Oxford: Oxford UP, 2008. Print.

TEI Consortium. The TEI Consortium: guidelines for electronic text encoding and interchange. Oxford: Humanities Computing Unit, University of Oxford, 2002. Print.

TEI Consortium. TEI P5: Guidelines for Electronic Text Encoding and Interchange. 2007. Web. 6 June 2009. <http://www.tei-c.org/release/doc/tei-p5-doc/en/html>.

Thaller, Manfred. “Which? What? When? On the Virtual Representation of Time.” The Virtual Representation of the Past. Eds. Lorna Hughes and Mark Greengrass. Farnham, UK: Ashgate, 2008. 115-124. Print.



Endnotes

[1] In fact, this example is somewhat simplified, since it would likely result in two John Hammonds being associated with all the events with which each is associated elsewhere in the RDF. A better approach to this problem is to use RDF Schema’s “subclassing.”

[2] The text highlighted in this paper is admittedly not representative of the many problems of encoding time. A recent discussion of these issues can be found in (Thaller, 2008).

[3] Hammond was an artist and photographer who studied with Whistler in 1885-6. He was instrumental in establishing the collection of the Owens Art Gallery of Mount Allison University, and was director of the art school at Mount Allison College 1907-1920 (Harper, 1970: 144-45).

[4] AJAX stands for “Asynchronous Javascript and XML.” This process is only AJAX-like because the mode of communication between the server and the client is not XML, but rather JSON responses to SPARQL queries.

[5] The XSL file is provided indirectly through the hammond_diary#TeiFragmentRenderer so that more than one address could be given for the file, and so that the client can more easily cache the compiled XSLT file.

[6] Even though the reference is defined with the same hemlRDF:Evidence element as illustrated in Illustration 1, the “Referred To In” reference type also appears because “Evidence” is encoded as a subclass of “Referred To In,” and all references appear in both their own class and all superclasses. It is not clear if this is the preferred behaviour.