Introduction

“Digital Humanities” is a fraught term, on whose definition rest funding decisions, tenure lines, and institutional power dynamics. Its (or their) public face is multifaceted: New York Times articles (Cohen 2010), museum exhibits (Quirk 2015), popular tools (DiRT Directory 2016), and tech industry partnerships (Google Research Blog 2010; Kirschenbaum 2007) all contribute to how the Digital Humanities (DH) interact with the wider world.1 In academic circles, the term is often associated with backchannel chatter (Holmberg and Thelwall 2014), grey literature (Huggett 2012), and informal workshops and conferences (French 2015). DH has too many definitions to be well defined (Terras, Nyhan, and Vanhoutte 2013), but its influence is great enough to warrant an exploration of how it appears to newcomers, to scholars, and to the world. The annual Alliance of Digital Humanities Organizations (ADHO) conference provides one important vantage point from which to launch such an exploration (Earhart 2015; Sugimoto and Weingart 2015). As the largest and most public DH-labeled event,2 the conference reflects and constructs many of the visible contours of DH, even (or especially) when it fails to adequately represent all aspects of the community, the scholarship, or the pedagogy.

The first Digital Humanities conference was held in 2006 following the founding of ADHO, but its roots are in the joint Association for Literary and Linguistic Computing (ALLC)/Association for Computers and the Humanities (ACH) conference first held in 1989 (ADHO.org 2016).3 This essay reflects on an ongoing quantitative analysis of this conference to trace its changing shape since 1989. The analysis uses data-driven methods to investigate whether the common characterization of DH as collaborative, inclusive, and globally minded holds true, keeping in mind that while ADHO’s conference is not a synecdoche for the entire digital humanities community, the conference does represent the community’s most public face. We present results in modest visualizations and simple statistics for greatest accessibility. Preliminary results reveal a growing conference, growing research team sizes, poor gender diversity, poor (but recently improving) regional diversity, and some shifts in topical focus of presentations. In light of recent controversies in which self-identified digital humanists have become increasingly worried that they and their work are not adequately represented, a topic discussed at length at DH2015 (Terras 2015 and 2011), we conclude that the annual DH conference has more work to do in reflecting its broad constituency and ethos of inclusion and diversity, though we save improvement suggestions for the companion piece referenced in Footnote 1.

Methods and Data

The DH conference and its joint ALLC/ACH predecessor began in 1989. We have collected schedules or programs from each year, and have entered their contents into a spreadsheet to analyze trends across geography and time. At the time of writing, we have entered no data from before 2004. For DH2004 through DH2013, we entered presentation title, author names, author institutional affiliations (if provided), author country affiliations (if provided), author academic departments (if provided), presentation type (panel, poster, plenary, etc.), presentation text (abstract or full paper depending on availability), and keywords (if provided). In addition to this 2004–2013 dataset of publicly available conference information, we created a second dataset from conference submissions for 2013, 2014, and 2015, which contains the same fields. By checking submissions against the final programs for 2013–2015, we could analyze acceptance rates across several variables.
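As an illustration of the last step, checking submissions against a final program amounts to a grouped count. The following is a minimal Python sketch, not the workflow used for this study (which relied on spreadsheets, OpenRefine, and R); the field names `id` and `type` and the toy records are hypothetical.

```python
from collections import defaultdict

def acceptance_rates(submissions, accepted_ids, key):
    """Return {group: accepted / submitted} for one grouping field.

    `submissions` is a list of dicts; `accepted_ids` is the set of
    submission ids that appear in the final program.
    """
    submitted = defaultdict(int)
    accepted = defaultdict(int)
    for sub in submissions:
        group = sub[key]
        submitted[group] += 1
        if sub["id"] in accepted_ids:
            accepted[group] += 1
    return {group: accepted[group] / submitted[group] for group in submitted}

# Hypothetical toy records, not drawn from the conference data.
toy_submissions = [
    {"id": 1, "type": "poster"},
    {"id": 2, "type": "poster"},
    {"id": 3, "type": "panel"},
    {"id": 4, "type": "paper"},
]
toy_program_ids = {1, 3}

for group, rate in acceptance_rates(toy_submissions, toy_program_ids, "type").items():
    print(f"{group}: {rate:.0%} accepted")
```

The same function can be called with any other field (region, gender, topic) as `key`, which is what comparing acceptance rates "across several variables" would require.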

During and after data collection, we hand-cleaned names, institutions, and departments, ensuring as far as possible that different people with similar names were given separate unique IDs, and that a single person whose name appeared in several spellings was given one unique ID. We did the same for departments and institutions. We appended gender information (m/f/other/unknown) to authors by a combination of hand-entry and automated inference using Lincoln Mullen’s “gender” package for R (Mullen 2016). This approach is problematic for many reasons, including a lack of possible gender options, the inability to encode gender changes over time, and the possibility of our matching incorrect genders to authors, especially those with names poorly represented in U.S. census and birth records (Posner 2015). We are working to improve this process (see an extended discussion in our forthcoming companion piece with Jeana Jorgensen), but feel even uncertain information is better than no information in this context.
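To make the name-cleaning step concrete, the sketch below shows one way spelling variants could be flagged for manual review. It is an assumption-laden illustration in Python, not the procedure used here (the cleaning was done by hand and in OpenRefine); the function names and example names are hypothetical, and the grouping only proposes candidates, since two different people can share a name.

```python
import unicodedata
from collections import defaultdict

def name_key(name):
    """Reduce a name to a comparison key: no diacritics, no periods,
    case-folded, single spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.replace(".", " ").casefold().split())

def candidate_groups(names):
    """Group names that share a key; only groups with more than one
    spelling are returned, as candidates for a human to confirm or split."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}

# Hypothetical example names.
toy_names = ["José García", "Jose Garcia", "JOSE  GARCIA", "A. Smith"]
print(candidate_groups(toy_names))
```

A key match is only a prompt for inspection: deciding whether the variants refer to one person or several still requires checking affiliations and co-authors by hand.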

Finally, we used a combination of Google Spreadsheets, Microsoft Excel, Notepad++, OpenRefine, and the R and RStudio development environment to collect and analyze the data for trends. We opt to present simple visualizations, counts, and comparisons rather than more rigorous statistical results in the interest of clarity, but at the expense of certainty. Readers should interpret these results as indicative rather than conclusive.

Findings

The number of presentations and unique authors at the annual conference has increased nearly every year in the last decade (see Figure 1). Although the data do not appear in Figure 1, preliminary analysis shows even greater acceleration in 2014 and 2015.

Figure 1 

Rate of DH conference growth over 10 years (2004–2013).

This matches other analyses of digital humanities (Terras 2012), showing increasing DH activity and participation across the board, with no signs of slowing down. The conference is healthy and attendance rotates, with 60 ± 10% of each year’s authors never having attended previously. This suggests a core of about 200 authors, as of 2013, orbited by a constellation of digital humanists who do not regularly attend the conference, disciplinary tourists (perhaps humanities or computer science researchers or librarians with one-off DH projects), and short-term collaborators on multi-authored projects. Such a large portion of attendees appearing only once raises the question of whether “big tent digital humanities” itself should be considered a discipline in its own right, or simply a meeting place that some steer closer to than others. That is: is DH made up entirely of tourists?
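The share of first-time authors in a given year can be computed by tracking which author IDs have already appeared. The Python sketch below is illustrative only; the function name and toy records are hypothetical. Note one assumption it makes visible: because the dataset begins in 2004, anyone who presented only before 2004 is counted as new, so early years overstate turnover.

```python
def new_author_share(records):
    """Return {year: share of that year's authors not seen in any earlier year}.

    `records` is an iterable of (year, author_id) pairs.
    """
    authors_by_year = {}
    for year, author in records:
        authors_by_year.setdefault(year, set()).add(author)

    seen = set()
    shares = {}
    for year in sorted(authors_by_year):
        authors = authors_by_year[year]
        shares[year] = len(authors - seen) / len(authors)
        seen |= authors
    return shares

# Hypothetical toy records, not drawn from the conference data.
toy_records = [
    (2004, "a"), (2004, "b"),
    (2005, "a"), (2005, "c"),
    (2006, "a"), (2006, "c"), (2006, "d"), (2006, "e"),
]
print(new_author_share(toy_records))
```

The first year in any such dataset always reports a share of 1.0, so it should be excluded when summarizing the series.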

Although data for earlier years are unavailable due to privacy standards in many countries, data from the conference in Sydney, Australia in 2015 show that attendance and author lists do not perfectly overlap. Only 70% of pre-registered attendees were also authors of conference presentations. The other 30% of attendees, nearly 150 people, likely included local participants, ADHO committee members, university administrators, and industry professionals. Between attendees and authors, by 2015 we suspect a core community of around 300 returning participants, and a periphery numbering several thousand (THATCamp n.d.; @DHNow n.d.).4

That not every author attends, and not every attendee is an author, is itself unsurprising. The demographic difference between the two groups is worth mentioning, however. We found at DH2015 that ≈35% of authors were women, yet women comprised ≈46% of attendees (Weingart 2015).5 Work must be done to improve representation at future conferences to combat this disparity.

Topics

When submitting to the DH conference, authors must attach author-supplied keywords and ADHO-assigned topics to their presentations. Conference committees rarely made these data public before 2013, meaning topical analysis over the last few decades requires hand-coding or algorithmic assistance, neither of which is complete at the time of this writing. Preliminary results are available, however, combining coded data after 2013 (see Figure 2 and Weingart n.d.)6 with anecdotal evidence from preceding years.

Figure 2 

Topical change at DH Conferences 2013–2015.

In recent years, DH presentations have shifted away from project-based to principle- and skill-focused topics. For instance, interface and user-experience design, scholarly editing, and information architecture, among other project-based topics, have declined. Conversely, text analysis, visualization, and data modeling have increased, especially in the last few years. The exception to this is the rise of topics associated with digitization and GLAM (Galleries, Libraries, Archives, & Museums).

The most prominent topics covered recently have related to literary studies, text analysis/mining, visualization, archives, and interdisciplinary collaboration. History, linguistics, philosophy, and gender studies have found a home at DH in the past, but their presence fluctuates, especially in comparison with the dominance of literary studies. This dominance should not be surprising given digital humanities’ cultural origins (Schreibman, Siemens, and Unsworth 2004),7 though it often comes at the expense of representing other equally rich traditions combining technology with the humanities (Leon 2015; Sloman 1978).8 Historical studies jumped from comprising 10% of presentations in 2013 to 17% in 2014, then fell to 15% in 2015. It remains unclear whether this indicates random fluctuations, trends over time, or differing regional profiles of DH. Other recently growing topics include semantic analysis and cultural studies.

The most visible drops in coverage came in topics related to pedagogy, scholarly editions, user interfaces, and research involving social media and the web. Between 2013 and 2015, the conference lost a quarter of its coverage related to pedagogy. “Scholarly Editing” dropped from 11% to 7% of the conference proceedings, and “Interface and User Experience Design” from 13% to 8%. Among the more surprising drops were those in “Internet/World Wide Web” (12% to 8%) and “Social Media” (8.5% to 5%). We mention these specifically because the trends are fairly clear across the three years for which we have data, and conform to our anecdotal awareness of previous years. That said, three years of analysis is not enough to form solid conclusions about shifts in topical coverage, and more collection will be required to confirm these results.
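Because the figures above mix a relative change (“a quarter of its coverage”) with before-and-after shares, it can help to separate the two measures. A fall from 11% to 7%, for example, is four percentage points but roughly a 36% relative decline. The short Python sketch below applies that arithmetic to the shares quoted in the preceding paragraph; the function name is hypothetical and the code is an illustration, not part of the study’s analysis.

```python
def share_change(before, after):
    """Return (percentage-point change, relative change) for two shares
    expressed in percent."""
    points = after - before
    return points, points / before

# Shares (in percent of presentations) quoted in the paragraph above.
quoted_shares = {
    "Scholarly Editing": (11, 7),
    "Interface and User Experience Design": (13, 8),
    "Internet/World Wide Web": (12, 8),
    "Social Media": (8.5, 5),
}

for topic, (before, after) in quoted_shares.items():
    points, relative = share_change(before, after)
    print(f"{topic}: {points:+.1f} points, {relative:+.0%} relative")
```

Relative change exaggerates movement in small topics, so with only three years of data the percentage-point figure is the more cautious one to report.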

Authorship

Between 2004 and 2013, nearly 2,000 total authors presented at DH, with the most rapid introduction of new authors after 2010 (see Figure 3). Even after taking the growth of the conference itself into account, new authors are appearing faster than we might expect. Figure 4 shows the rate of introduction of new authors normalized by the growth of the conference itself, such that values above 1 mean authors are entering the conference faster than the conference is growing. The rate of new authors is increasing, suggesting the conference is becoming less insular, or perhaps there are more disciplinary tourists, submitting one presentation and never doing so again. The percentage of returning authors is consequently decreasing, while the sheer volume of core authors is still slowly increasing. This suggests, possibly, that the DH conference is growing in popularity and encouraging more tourists faster than it is growing in core members.

Figure 3 

Increasing number of authors at DH conferences who never authored at the conference before.

Figure 4 

Rate of first-time authors normalized by conference growth.
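The normalization behind Figure 4 is not spelled out in the text. One plausible reading, offered as an assumption rather than as the exact computation used, compares year-over-year growth in first-time authors with year-over-year growth in the conference. The symbols N_t (first-time authors in year t) and P_t (presentations in year t) are introduced only for this sketch:

```latex
r_t = \frac{N_t / N_{t-1}}{P_t / P_{t-1}}
```

Under this reading, r_t > 1 means first-time authors grew faster than the conference between years t-1 and t, and r_t < 1 means they grew more slowly. Using the count of unique authors in place of P_t would be an equally defensible choice of denominator.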

DH often self-identifies as innately collaborative, yet our study indicates that over one-third of presenters at the DH conference remain close to their disciplinary humanistic roots by adhering to the single-authorship tradition (Spiro 2009).9 It is unclear whether other humanities conferences hold a similar co-authorship ratio. Even so, with nearly two-thirds of DH presentations signed by multiple authors, the data indicate a tendency toward collaboration, whether or not that collaboration is innate to all DH work.

The co-authorship rate does not likely represent a true account of collaborative work, but rather a lower bound. Collaboration in digital humanities research may often go uncredited, with invisible work contributed by students, interns, or hired assistance. Given this, single-authored DH presentations may have uncredited authors, and perhaps multi-author presentations do not represent their full collaborative scope in the authorship credits. This confusion will continue as long as DH lacks an agreed-upon standard for credit, though work is being done in this direction (Crymble and Flanders 2013).

While the insular nature of humanities research is unlikely to disappear from DH, a time-based analysis shows that the number of single-authored presentations is decreasing, as the average number of authors per presentation steadily grows (see Figure 5).

Figure 5 

Average number of co-authors on a single presentation in a given DH conference year.
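The two quantities discussed in this section, the mean number of authors per presentation and the share of single-authored presentations, follow directly from the author lists. The Python sketch below is illustrative; the function name and toy records are hypothetical, and it counts credited authors only, so it inherits the lower-bound caveat raised above.

```python
def coauthorship_summary(presentations):
    """Return {year: {"mean_authors": ..., "single_author_share": ...}}.

    `presentations` is an iterable of (year, list_of_author_ids) pairs.
    """
    counts_by_year = {}
    for year, authors in presentations:
        counts_by_year.setdefault(year, []).append(len(authors))

    return {
        year: {
            "mean_authors": sum(counts) / len(counts),
            "single_author_share": sum(1 for n in counts if n == 1) / len(counts),
        }
        for year, counts in counts_by_year.items()
    }

# Hypothetical toy records, not drawn from the conference data.
toy_presentations = [
    (2004, ["a"]),
    (2004, ["b", "c", "d"]),
    (2013, ["a", "b"]),
    (2013, ["c", "d", "e", "f"]),
]
print(coauthorship_summary(toy_presentations))
```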

Regional Diversity

Since ADHO is a collection of international organizations, we were interested in the regional diversity of conference authors. We inferred author countries based on their institutional affiliations (e.g., University of Victoria is coded as Canada) and clustered them by U.N. macro regional standards (e.g., Canada = Americas). This analysis shows the conference lacks regional diversity, which may be attributed to the locations in which the conference is held.10 Between 2004 and 2013, 1,056 authors originated from the Americas (US: 851; Canada: 202; Mexico: 1; Peru: 1; Uruguay: 1), and 794 were from Europe (see Figure 6). Figure 7 shows the prominence of American authors occurred not only in the odd years when the conference was held in the Americas (with ≈65% American authors), but also in the even years when it was held in Europe (with ≈50% American authors). While the conference remains Americas-centric overall, regional diversity is on the rise, with notable increases of authors from Asia and Oceania, although no scholars affiliated with African countries appeared in this analysis.
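The two-step coding described above (institution to country, country to U.N. macro region) can be sketched as a pair of lookups. This Python example is a minimal illustration with hypothetical, deliberately tiny lookup tables; the real tables were built by hand. Authors who cannot be located fall under “(blank)”, following the label used in Figure 6.

```python
from collections import Counter

# Tiny illustrative lookups; a real table would cover every institution.
INSTITUTION_COUNTRY = {
    "University of Victoria": "Canada",
    "University of Sydney": "Australia",
}
COUNTRY_REGION = {
    "Canada": "Americas",
    "United States": "Americas",
    "Germany": "Europe",
    "Japan": "Asia",
    "Australia": "Oceania",
}

def region_of(institution):
    """Map an institution name to a macro region, or "(blank)" if unknown."""
    country = INSTITUTION_COUNTRY.get(institution)
    return COUNTRY_REGION.get(country, "(blank)")

def region_counts(author_institutions):
    """Count authors per macro region from their institutional affiliations."""
    return Counter(region_of(inst) for inst in author_institutions)

# Hypothetical affiliations, including one unknown and one missing value.
toy_affiliations = ["University of Victoria", "University of Sydney",
                    "Unknown College", None]
print(region_counts(toy_affiliations))
```

Coding by institution records where an author works, not where they are from, which is worth keeping in mind when reading the regional counts.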

Figure 6 

Authors per region 2004–2013. Authors we were unable to locate are aggregated under “(blank)”.

Figure 7 

Country of author institutions at DH conferences 2004–2013.

Preliminary analysis shows greater regional diversity in 2014, and unsurprisingly the most diverse yet in 2015, when the conference was held in Sydney. We feel ADHO’s decision to bring the conference farther afield was a step in the right direction.

Gender Distribution

With women playing increasingly central leadership roles in the DH community, we hoped to see similarly improved representation among DH authors. After coding for author gender, we looked at the percentage of authors each year who were women (or at least who registered as women according to our hand-corrected algorithmic approach), as well as the percentage of first authors who were women (see Figure 8). With minor fluctuations per year but an unchanging average over time, about a third of all authors from 2004–2013 were women. The ratio is only slightly (though consistently) better for first-authorships, such that a higher percentage of first authors were women.

Figure 8 

Percentage of female authors at each annual ADHO conference 2004–2013.

The critique may be raised that this is not a problem of representation, but of interest. Even if this were a broadly valid criticism, it is not true in this case. As mentioned earlier, ≈35% of DH2015 authors appear to be women, compared with ≈46% of attendees. Thus women who attend the conference are not proportionally represented among its authors. From 2004–2013, North American men seem to represent the largest share of authors by far.

Conclusions and Future Analysis

The data show that over the last decade, ADHO’s international conference has become slightly more collaborative and regionally diverse, that text and literature currently reign supreme, and that women are underrepresented with no signs of improvement thus far. This is at odds with many of our anecdotal experiences with colleagues online and at home, a group that is more diverse and multidisciplinary than the annual conference reflects. We hope for ADHO to take this disparity into account when organizing future conferences. For instance, if conference location correlates with regional diversity of authors, ADHO might consider hosting the DH conference less often in North America and more often in non-Anglocentric countries. Certainly to some extent, the onus is on the authors and reviewers themselves to promote diversity and broader representation in their panels and projects, and ADHO might find ways to encourage diverse panels and multi-author presentations, or discourage many presentations from the same author. Finally, diversifying the reviewer pool could broaden the topical scope and geographic representation of presentations and attendees. These suggestions reflect efforts already underway in ADHO, which we applaud. We do not make these suggestions as a gesture towards reaching an international conference whose demographics exactly match the global population, but to ensure DH scholarship remains healthy through the inclusion of a broad range of perspectives and approaches.

While the preliminary results are useful and telling, we continue to expand our dataset to include DH abstracts since 1989, and with that, we will look deeper into our initial findings. For instance, while we can anecdotally conclude that there has been a shift in the focus of topics presented at DH, from project- to skill-based, we plan to provide a quantitative assessment of these shifts over time and space. It would be interesting to see how topics distribute geographically, to determine whether regional differences contribute to differing self-definitions of digital humanities. Furthermore, we hope to examine authorship with more granularity, to interrogate the diversity of multi-authored presentations for cross-institutional and international collaboration. We also plan to analyze the relationships between new and repeat authors with topics and the fields they come from, as well as correlating topic with gender. Preliminary results suggest that gender does skew which topics are discussed, with topics more often written about by women less likely to appear in the conference. Finally, we will open our dataset so authors can edit their own information, allowing a more sensitive gender analysis beyond the male/female binary and taking into account the fluidity of the category over time.