1. Limitations of TACT
What are the problems with traditional tools like TACT?
TACT [1] is a second generation tool. First generation tools were batch-oriented and designed originally for mainframes. They were not interactive; the user had to run a job and then examine the results. This meant that exploring a text was a long iterative process, but the batch files did offer a record of what was done to get a particular result. TACT, on the other hand, is one of the best second generation tools. It was designed from the beginning for interactive use on a microcomputer. It even employs primitive windows in which the user can view different data displays. Because TACT is interactive, the process of trying queries, checking the results, and refining the queries is much faster. One of the things that was lost, however, was a way to track the project so that you could reconstruct how you arrived at a result after the interaction.
While there are many problems specific to TACT, we are going to focus here on those problems which TACT shares with similar interactive tools. We take TACT as an example of a text-analysis tool, not because we wish to criticise TACT, but because we have used it for many years and have an intimate knowledge of its design -- one of us was involved in its creation and development. The major limitations we encountered when we used TACT to study Hume's Dialogues Concerning Natural Religion were:
- 1.1 It cannot be extended.
- 1.2 It is difficult to record one's work.
- 1.3 It is difficult to share one's research results.
1.1 Extension
One of the first problems we encountered with TACT was that it is a closed program; it cannot be modified or extended except by those who own the code. Not only is the program closed to extension, but the output it generates cannot be passed to other programs dynamically. The result files created by TACT must be manually exported and then opened in another program for further processing.
One solution that occurred to us was to repackage TACT so that it could be accessed by other programs. TACTweb [2], from one perspective, is such a repackaging for a WWW server to access. In our experience, however, the extension problem is not simply a matter of providing hooks so that a program can be called or call others. In original research one moves quickly beyond what has been done before in ways not foreseen. The unforeseeable nature of original research means that a research tool cannot be extended in a predictable way to meet the demands of research. What is needed are research tools whose every aspect can be modified and extended through easily added modules. In other words, to be truly extensible, a tool should be capable of having any component replaced. Eye-ConTact is one model for how such extensibility can be achieved based on similar tools in the sciences. [3]
1.2 Record of the Experiment
According to one model of what happens in a computer-assisted research project, the researcher comes to a text with questions formulated in terms of queries the computer can answer. If the answers are interesting, the researcher records the answers and how they were derived and then publishes a description of the methods used and the results. For the results to be convincing, others have to be able to repeat the research and arrive at similar results by following the described methods. As readers of such publications we expect the research to be described in sufficient detail to allow us to test the results ourselves. Programs like TACT are unfortunately missing mechanisms to record or log one's work in order to accurately describe the methods used for oneself and others. (How often have you saved results and found that a few months later you cannot remember how they were arrived at!) We need tools that ensure that our computing work is clearly and completely logged while it is developing, so that it can be accurately described and later recreated by others. Specifically, we need tools that:
- Keep an accurate log of our progress so that we can understand it later and describe it to others.
- Keep an accurate log of our work so that we and others can repeat it.
This leads to the next problem.
1.3 Sharing Results
Existing text-analysis tools have a fundamental problem: they do not assist the researcher to share his or her research in a fashion that makes the text analysis accessible. TACT is a private tool; you study a text and then you publish your results as a separate act, preferably hiding the fact that a mere computer assisted you in any way. TACT does not help you keep track of how you reached your conclusions, nor does it help you show others how you arrived at those conclusions. It fosters private exploration, not open reproducible interactive research.
Research is by nature something others can recover (or re-search) if they are sceptical. To be convincing, research results have to be open to examination, so that others can traverse the logic of the research themselves. Current text-analysis tools do not allow one to share results in an interactive form for verification; they force us to either give colleagues the complete environment or nothing at all. Many papers based on text analysis simply tell us that the computer assisted a given insight, without showing us how the insight was arrived at. The methods used are described all too often in a truncated fashion, because there is really no graceful way to share them.
In short, we need tools that make the research accessible, not the technology. Such tools should show how the results were arrived at and what decisions were made along the way, instead of showing the toys used. Paradoxically, what we need are research tools aimed, not at the researcher for his or her private exploration, but at the researcher's audience, who want to test the insight. We need a tool that allows people to package their research easily for distribution in an interactive form and which highlights the research, not the technology.
2. The Eye-ConTact Design
The design philosophy of Eye-ConTact that grew out of TACT's limitations can be summarized as follows:
- 2.1 It is a visual programming environment.
- 2.2 It has a modular architecture.
- 2.3 Any component can be replaced.
- 2.4 It may be slower than simple tools.
2.1 Visual Programming
Eye-ConTact is first and foremost a prototype of a visual programming environment in which one creates text analysis applications. It is, in a sense, a thought piece meant to illustrate some of the ideas it implements, and to provoke further thought about these issues.
More precisely, Eye-ConTact is a domain-specific visual programming environment. Visual programming environments need to be distinguished from program visualizations and data visualizations. Brad Myers in "Taxonomies of Visual Programming and Program Visualization" nicely distinguishes the first two, defining visual programming as "any system that allows the user to specify a program in a two (or more) dimensional fashion" (Myers 1990: 98). By contrast, program visualization is a graphical representation of aspects of a program that may have been written in a conventional text programming language. Data visualization has nothing to do with the program, but rather has to do with the information managed by a program. A graph produced by a spreadsheet would be an example of a data visualization - it offers a view of the data manipulated by the program, rather than illustrating some aspect of the program. We will not discuss here the variety of such visual programming and visualization tools used in the sciences (see Price 1993 for a taxonomy of software visualization).
It is easy to confuse visual programming environments with data visualization because several examples of scientific visualization programs that produce data visualizations are at the same time themselves visual programming environments. For example, IRIS Explorer, which runs on Silicon Graphics workstations, is an environment in which the user creates maps that show how data should be piped, transformed and eventually graphed. (The reader familiar with this and similar programs will appreciate how Eye-ConTact is an attempt to explore the application of this model to textual visualization.) These maps are visual programs - they describe a complex process that the user wants performed on his or her data. What is confusing is that the final results of typical Explorer programs are data visualizations that graphically show some aspect of the data. Thus a map might show how data from a flight simulation would be processed in order to produce a graphic representing the air-flow over the wing of the simulation. The map is a visual program and the air-flow image is a data visualization. Eye-ConTact likewise has maps that are visual programs that can be run and could produce data visualizations.
Unlike general purpose visual programming environments like Prograph, Eye-ConTact is a domain-specific environment; it has been designed for creating programs in a specific domain, i.e. text analysis. There have been questions raised about the application of visual programming. It was assumed that visual programming would make programming accessible to everyone, but M. Petre and colleagues (Petre & Green 1993 and Petre 1995), among others, have suggested that visual programming benefits the domain expert more than the novice. Visual programming, like textual programming, is more effective when the users have acquired visual "reading" skills for the graphical conventions of the domain. A picture might be worth a thousand words, but without conventions and experienced viewers the same picture isn't worth the same thousand words. While general purpose visual programming may not be terribly efficient, domain-specific visual programming does have promise, precisely because of the restricted domain (see Raymond 1991). In a specific domain there is likely to be an informal consensus about the operations needed and the domain expertise that can be enhanced in a visual programming environment. Graphical elements, rather than being so abstract that they cannot be understood, can represent the common, conceptually simple operations of the domain. The expertise of those working in the domain can be harnessed by providing them with a programming tool aimed directly at their work. Ideally a domain-specific programming environment should hide the mysteries of programming and present the expert in the domain with operations that correspond to their research. This is the promise of a domain-specific environment such as Eye-ConTact - it should allow those interested in text analysis to build applications that correspond to their research questions without having to master a new discipline.
What follows is a narrative of how you might use Eye-ConTact:
- When you run Eye-ConTact it opens a Toolbox and a Map window.
- To create an application with which to study a text, you click on icons in the Toolbox and drop them on the Map to represent the operations you want done.
- To connect the operations you click on the (red) output box of an icon and then click on the (blue) input box of the icon to which you want to connect it. After laying out an application the Map might look like this (without the labels):
- You then double-click on the operation icons which need to be fed information. This opens the form for that operation. Here is a form for specifying the text database to open:
- Once all the relevant information has been entered, you run the application by clicking on the Run & Stop button in the Toolbox. You can set the program to run continuously, even while you are editing the Map. When that is the case, any time you make a significant change, the program reruns the relevant modules.
- If you find results that are interesting you can annotate the Map and save it. A saved Map keeps all the settings so that running the application again will deliver the same results.
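The steps in this narrative can be read as building and running a small dataflow graph. The following is only a minimal sketch of that idea in Python, not the Eye-ConTact implementation (which is written in Visual Basic); the names `Operation`, `Map`, `open_text`, `word_list`, `count_word` and `build_demo_map` are hypothetical, and the sample text is invented for illustration.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable


@dataclass
class Operation:
    """One icon on the Map: a named step plus the settings entered in its form."""
    name: str
    run: Callable[[list, dict], object]  # (upstream outputs, settings) -> output
    settings: dict = field(default_factory=dict)


class Map:
    """A hypothetical Map: operations plus output-to-input connections."""

    def __init__(self):
        self.operations = {}
        self.inputs = {}  # operation name -> names of the operations feeding it

    def add(self, operation):
        self.operations[operation.name] = operation
        self.inputs.setdefault(operation.name, [])

    def connect(self, source, target):
        """Connect the output box of `source` to the input box of `target`."""
        self.inputs[target].append(source)

    def run(self):
        """Run each operation only after the operations that feed it."""
        results = {}
        for name in TopologicalSorter(self.inputs).static_order():
            operation = self.operations[name]
            upstream = [results[source] for source in self.inputs[name]]
            results[name] = operation.run(upstream, operation.settings)
        return results


# Three illustrative operations (assumptions, not actual Eye-ConTact modules).
def open_text(_upstream, settings):
    return settings["text"]


def word_list(upstream, _settings):
    return upstream[0].lower().split()


def count_word(upstream, settings):
    return upstream[0].count(settings["word"])


def build_demo_map():
    demo = Map()
    demo.add(Operation("open", open_text,
                       {"text": "The cat saw the dog and the dog saw the cat"}))
    demo.add(Operation("words", word_list))
    demo.add(Operation("count", count_word, {"word": "the"}))
    demo.connect("open", "words")
    demo.connect("words", "count")
    return demo


results = build_demo_map().run()
print(results["count"])
```

Because the settings live in the operations rather than in the user's memory, rebuilding and rerunning the same map gives the same results, which is the property a saved Map is meant to guarantee.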
2.2 Eye-ConTact Architecture
The underlying architecture is two-fold: the Eye-ConTact environment is made up of modules and files. Specifically:
- A Framework module which controls the user interface. In Eye-ConTact as it stands there is only one framework module, a visual programming environment called the Eye-ConTact Framework. We anticipate two others to provide publication interfaces.
- Process modules which process data. In the present version of Eye-ConTact the most important process module is BTACT, a variant of the TACTweb text engine.
- Files used by Eye-ConTact or produced by Eye-ConTact. In particular there is the Map file which saves a project, the TDB files used by BTACT, and the temporary files created by the various processes.
2.3 Advantages
Some of the advantages of this design are:
- You can replace any module as your project progresses, including the framework module.
- Those who already have functioning code can adapt their programs for use in a framework like Eye-ConTact.
- A distributed community of developers can expand such an environment as long as they agree on an inter-process communication protocol and data formats.
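To make the first and third advantages concrete, here is a minimal Python sketch of modules that agree on a shared data format and can therefore be swapped freely. The dictionary-based format and the names `ProcessModule`, `WhitespaceTokenizer`, `LowercaseTokenizer`, `TypeCounter` and `run_pipeline` are assumptions made for illustration; they are not the protocol or modules Eye-ConTact actually uses.

```python
from typing import Protocol


class ProcessModule(Protocol):
    """The agreed contract: take a data dictionary, return an enriched one."""

    def process(self, data: dict) -> dict: ...


class WhitespaceTokenizer:
    def process(self, data: dict) -> dict:
        return {**data, "tokens": data["text"].split()}


class LowercaseTokenizer:
    """A drop-in replacement that folds case before splitting."""

    def process(self, data: dict) -> dict:
        return {**data, "tokens": data["text"].lower().split()}


class TypeCounter:
    """Counts distinct word forms; works with either tokenizer."""

    def process(self, data: dict) -> dict:
        return {**data, "types": len(set(data["tokens"]))}


def run_pipeline(modules: list[ProcessModule], data: dict) -> dict:
    # Pass the shared data format from one module to the next.
    for module in modules:
        data = module.process(data)
    return data


sample = {"text": "The the THE"}
print(run_pipeline([WhitespaceTokenizer(), TypeCounter()], sample)["types"])
print(run_pipeline([LowercaseTokenizer(), TypeCounter()], sample)["types"])
```

Replacing one tokenizer with the other changes the result without touching `TypeCounter`, which is possible only because both honour the same contract. The third disadvantage listed below is the reverse case: without such an agreement the modules would not compose.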
2.4 Disadvantages
Some of the disadvantages are:
- Such an environment is slower than a single well-designed program.
- You can end up with a mess of uncooperative modules that would be even less accessible and more frustrating than a single closed tool.
- Without communication and data standards a distributed community of developers could end up with incompatible modules.
3. How Eye-ConTact Deals with the Identified Limitations
How does Eye-ConTact deal with the limitations identified in the first part of this paper?
- 3.1 Any module can be replaced, as can the framework.
- 3.2 Visual maps record and show the logic of one's exploration.
- 3.3 Alternative frameworks allow one to share results.
3.1 Extending with Modularity
Eye-ConTact deals with the issue of extension by encouraging modularity. Not only does Eye-ConTact consist of a collection of modules, including the framework module, but it also presents the user with a modular interface paradigm. The user is encouraged to think of a text analysis project as a flow of data from one module to another. Eye-ConTact is, in effect, a tool for managing smaller modules and passing data from one to the other.
Eye-ConTact goes further in the support of modularity. It has built-in tools for adding modules or repurposing existing code to act as modules. The extension tools include:
- A generic DOS process tool
- If a DOS program performs a needed task, one can call it from within Eye-ConTact. Eye-ConTact can even pass parameters to the program in certain circumstances.
- A module builder
- We will offer a tool within Eye-ConTact for creating the interface to modules written by others for Eye-ConTact. The module builder allows items to be added to the toolbar and simple forms to be built for the control of the module.
- Accessible framework design
- Finally, the Eye-ConTact framework program is built in Visual Basic in a fashion that allows it to be extended easily by those who are familiar with Visual Basic. We hope that add-on modules in Visual Basic can eventually be integrated into the system without having to consult the original code.
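The generic process tool described above amounts to calling an external program with parameters and capturing what it writes. The sketch below illustrates that pattern with Python's standard `subprocess` module as a stand-in; it is not how Eye-ConTact calls DOS programs, and the helper name `run_external` is hypothetical. For portability the example calls the Python interpreter itself rather than a DOS program.

```python
import subprocess
import sys


def run_external(program, parameters, input_text=""):
    """Run an external program, feed it text, and return what it prints."""
    completed = subprocess.run(
        [program, *parameters],
        input=input_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the program exits with an error
    )
    return completed.stdout


# Stand-in for an existing command-line tool: a one-line program that
# upper-cases whatever text it is given on standard input.
output = run_external(
    sys.executable,
    ["-c", "import sys; print(sys.stdin.read().upper())"],
    "tact",
)
print(output.strip())
```

A wrapper of this kind lets existing programs serve as modules without being rewritten, provided their input and output formats are known.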
3.2 Recording with Maps
Eye-ConTact deals with the problem of recording the logic of an exploration by encouraging the user to lay out the fundamental steps in a visual environment. The user creates the logic by dragging out icons and "connecting the dots". This has the advantage of acting as both a record of the flow of choices made and a synoptic description of that flow, which should make the research easier to grasp. John Bradley experimented in TACT with a macro language that could record activities and replay them. This had the disadvantage that it was hard to "read" the macro file, let alone change the process. Graphical representation shows the logic in a more intuitive way. With the help of these records, users can build new projects, show and exchange them, and, finally, create custom applications that hide the logic.
Eye-ConTact also has an annotation tool that allows one to insert comments on the Map and attach them to particular operations. This is to encourage verbose and contextualized discussion of the project. Such annotations are particularly useful when the Map is to be shared.
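As an illustration of a Map serving as a shareable record, the sketch below stores operations, connections and annotations in one JSON document and reads them back. The layout is purely hypothetical (the actual Map file format is not described here), as are the function names and the sample file name `dialogues.tdb`.

```python
import json


def save_map(operations, connections, annotations):
    """Serialize a map, with its settings and comments, to a JSON string."""
    return json.dumps(
        {
            "operations": operations,
            "connections": connections,
            "annotations": annotations,
        },
        indent=2,
        sort_keys=True,
    )


def load_map(text):
    """Read a saved map back into plain dictionaries and lists."""
    return json.loads(text)


def annotations_for(saved, operation_name):
    """Return the comments attached to one operation."""
    return [
        note["comment"]
        for note in saved["annotations"]
        if note["operation"] == operation_name
    ]


saved_text = save_map(
    operations={
        "open": {"database": "dialogues.tdb"},  # hypothetical file name
        "search": {"word": "design"},
    },
    connections=[["open", "search"]],
    annotations=[
        {"operation": "search", "comment": "Start with the most frequent term."},
    ],
)
reloaded = load_map(saved_text)
print(annotations_for(reloaded, "search"))
```

Keeping the comments in the same file as the settings means that whoever receives the Map also receives the reasoning behind each step.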
One outcome of the Eye-ConTact approach is that it forces the user to map out the logic of his or her exploration before generating any results. From one perspective this is a disadvantage. The novice who does not have a research agenda, but is simply testing the tool, will have to make a map before he or she can see anything. By contrast, interactive tools can present the user with a default collection of displays (the word list, the KWIC list, and a full text display) all ready to be clicked on. The user can learn about text analysis by clicking on any of the displays and watching what happens in the others. While Eye-ConTact does not offer this sort of immediate feedback, it can be used to create such interactive tools. If we think of Eye-ConTact as a visual programming tool, it can be used by experts to create applications that are interactive and immediate. In fact, in Eye-ConTact one can, in theory, create any other type of text analysis tool.
One design issue that the Eye-ConTact prototype has raised is the degree of detail to be shown on the Map display. If the Map is to show the logic of a project it needs to show more than generic icons for operations whose details are hidden in forms. At the same time, there are operations (especially displays) whose details would be too verbose to be shown on the Map. What we need is to find a way to let the user spill out the details and some of the resulting displays onto the Map so that it can serve as a reasonably complete representation of the whole.
3.3 Sharing with Alternative Frameworks
With its visual programming paradigm, the Eye-ConTact Framework is only one possible user interface. If the framework module which manages the modules and the user interface is treated as one more replaceable module then one can share applications with alternative frameworks that present alternative user interfaces. The researcher should be able to create an interactive package for the audience, once interesting results have been mapped out. The audience could be students or scholars. Such interactive publications would have the following features:
- Protection of Intellectual Property: A research package that is to be distributed widely should have built-in protection to make it awkward for the original electronic text to be stripped out. If one wants to share research based on texts that are not in the public domain one needs tools that protect the owner's rights. If such protection existed in tools we might be able to negotiate appropriate licenses from owners.
- Appropriate Features: A research package that is designed to forefront the research in an accessible fashion should have only the minimum features needed for the research issue at hand. This means the packaging should hide unnecessary features.
- Interactivity: A research publication should invite interactive exploration. Unlike the research development environment where one is encouraged to map out a project in a disciplined fashion, a research publication should first encourage exploration of the results. If research based on computer-assisted text analysis has not made it out of the humanities computing ghetto it is because our colleagues cannot try out what we have done. A research publication should be like a research play-pen where our colleagues can understand what we have achieved without having to build everything afresh.
We envisage two types of publications that Eye-ConTact is being designed to support:
- Network Publications: An increasingly important medium for the publication of research is the World Wide Web. Eye-ConTact, as it uses a relative of TACTweb (a CGI version of TACT), should be capable of saving projects as specialized CGI programs. One of the difficulties of using TACTweb today is setting it up. We hope to adapt Eye-ConTact to allow a developer to save an application for mounting on the web as a combination of HTML pages and CGI modules. The HTML pages could then be expanded into a full on-line paper with embedded forms to interact with the CGI program.
- Stand-Alone Publications: Like any interpreted programming environment Eye-ConTact will have a feature that will allow the developer to save a project in a form where the Map will disappear and only the relevant forms will be accessible to the user. This is how we envisage creating interactive publications for student or collegial use. By replacing the Eye-ConTact Framework with a run-time version, a researcher could distribute an application while hiding the displays and features that are not needed for publication.
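Both kinds of publication share one idea: the audience sees only the forms the researcher chooses to expose, not the whole Map. The following Python sketch illustrates this for the network case by generating an HTML form from a chosen subset of settings. It is an assumption about how such packaging might look, not a description of a planned Eye-ConTact feature; `render_publication_form` and the path `/cgi-bin/search` are hypothetical.

```python
from html import escape


def render_publication_form(title, action, exposed_settings):
    """Build an HTML form that exposes only the chosen settings to readers."""
    fields = []
    for name, default in exposed_settings.items():
        fields.append(
            f"  <label>{escape(name)}: "
            f'<input type="text" name="{escape(name)}" '
            f'value="{escape(str(default))}"></label>'
        )
    return "\n".join(
        [
            f"<h1>{escape(title)}</h1>",
            f'<form method="get" action="{escape(action)}">',
            *fields,
            '  <input type="submit" value="Run">',
            "</form>",
        ]
    )


# Only the search word is exposed; every other setting stays fixed and hidden.
page = render_publication_form(
    "Word search",
    "/cgi-bin/search",  # hypothetical location of the saved CGI module
    {"word": "design"},
)
print(page)
```

Settings left out of `exposed_settings` simply never reach the reader, which corresponds to the "Appropriate Features" requirement above.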
Conclusion
Tact is after all a kind of mind reading. (Sarah Orne Jewett)
Eye-ConTact is both a design philosophy and a crude environment cobbled together out of modified existing tools like TACT and some new tools like the Eye-ConTact Framework program. This last program is a prototype designed to test the interface paradigm proposed here where the user maps out experiments with texts. A modular design encourages one to cobble together such prototypes in the hopes of being able to create more robust modules later. We hope the reader will not condemn the design because the prototype is flawed. In our defense, we believe that the best way to test a design philosophy is to build prototypes that can be tried by scholars on real projects. The project has already revealed some interesting issues regarding the design, which we will note here by way of conclusion:
- What should be represented in a visual development environment such as this: a) the flow of research decisions, or b) the flow of data between actual modules? The same project could look very different when laid out, depending on what one represents.
- Can we design communication protocols and data formats for text analysis modules that would not constrain future developments? If such a protocol and formats could be standardized, would they encourage distributed development of tools?
- What level of detail should be shown on a Map to make it a reasonable representation of the logic of a project? How can details hidden in the forms be shown on the Map when they are important?
Credits
We would like to acknowledge the following people and organizations:
The BTACT module reuses TACT code originally written by John Bradley and Lidio Presutti.
The McMaster Arts Research Board provided financial and programming support for this project.
The Eye-ConTact Framework program is being created in Visual Basic by Patricia Monger, Visualization Specialist, McMaster University.
The TG graphics module was programmed by Mark Janoska.
Notes
[1] TACT was developed originally by John Bradley and Lidio Presutti at the University of Toronto starting in 1984. TACT is available for download. In 1995 TACT was adapted so that it could be a CGI (Common Gateway Interface) program. The result was TACTweb, which can be tried or downloaded through the World Wide Web. The manual for TACT, a CD-ROM with TACT, and an extensive collection of texts are now published by the MLA under the title Using TACT with Electronic Texts (Lancashire et al. 1996).
In this paper we will use TACT as an example of a traditional text analysis tool, partly because one of us was involved in its design and development, and partly because it is still one of the best tools of its kind.
[2] TACTweb connects TACT to the World Wide Web -- making a TACT TDB database accessible to the entire WWW community. By using WWW forms, users have access to some of the interactive services that TACT provides them, but without requiring them to use TACT itself, or have a copy of the TACT database on their own machine. TACTweb can also be thought of as a text engine module called by other programs like a WWW server or another text-analysis program. In fact Eye-ConTact uses TACTweb in just this fashion: as a module that is called when needed.
[3] One might ask whether such extensibility is a reasonable goal. Perhaps we should not set our sights so high given the modest programming resources in the humanities. We believe such extensibility is not only feasible, it is the best way to ensure the long term survival of research tools. Like Jason's boat, Eye-ConTact is designed to be a collection of tools which can be slowly replaced, component by component, over time, as research in humanities computing evolves. Any research tool project that is closed may not survive the natural curiosity of researchers.
Bibliography
- ARNHEIM, R. (1996). Visual Thinking, Berkeley: University of California Press.
- BERTIN, J. (W.J. Berg, tr.) (1983). Semiology of Graphics, Madison: University of Wisconsin Press.
- BRAINERD, B. (1987). "Textual Analysis and Synthesis by Computer", Abacus 4.2: 8-18.
- BRUNET, E. (1991). "What Do Statistics Tell Us?", in S. Hockey, N. Ide & I. Lancashire (eds.), Research in Humanities Computing, Vol. 1, Oxford: Clarendon Press.
- EARNSHAW, R.A. & N. WISEMAN (1992). An Introductory Guide to Scientific Visualization, Berlin: Springer-Verlag.
- FOULSER, D. (1995). "IRIS Explorer: A Framework for Investigation", Computer Graphics, 29.2: 13-6.
- HUME, David (Stanley Tweyman, ed.) (1991). Dialogues Concerning Natural Religion, New York: Routledge.
- IDE, N. (1989). "Computer-Assisted Analysis of Blake", in R. Potter (ed.), Literary Computing and Literary Criticism: Theoretical and Practical Essays on Theme and Rhetoric, Philadelphia: University of Pennsylvania Press.
- LANCASHIRE, I., J. BRADLEY, W. MCCARTY, M. STAIRS & T.R. WOOLDRIDGE (1996). Using TACT with Electronic Texts, New York: The Modern Language Association of America.
- MCKINNON, A. (1989). "Mapping the Dimensions of a Literary Corpus", in Literary and Linguistic Computing, 4.2: 73-84.
- MYERS, B.A. (1990). "Taxonomies of Visual Programming and Program Visualization", Journal of Visual Languages and Computing, 1: 97-123.
- PETRE, M. (1995). "Why Looking Isn't Always Seeing: Readership Skills and Graphical Programming", Communications of the ACM, 38.6: 33-44.
- PETRE, M. & T.R.G. GREEN (1993). "Learning to Read Graphics: Some Evidence that 'Seeing' an Information Display is an Acquired Skill", Journal of Visual Languages and Computing, 4: 55-70.
- POTTER, R.G. (1988). "Literary Criticism and Literary Computing: The Difficulties of a Synthesis", Computers and the Humanities, 22: 91-7.
- PRICE, B., I. SMALL & R. BAECKER (1993). "A Principled Taxonomy of Software Visualization", Journal of Visual Languages and Computing, 4.1: 211-66.
- RAYMOND, D.R. (1993). "Visualizing Texts" in Making Sense of Words: Proceedings of the Ninth Annual Conference of the UW Centre for the New OED and Text Research, Waterloo, Ontario: UW Centre for the New OED: 19-32.
- RAYMOND, D.R. (1991). "Characterizing Visual Languages", IEEE Workshop on Visual Languages, 176-82.
- REPENNING, A. & T. SUMNER (1995). "Agentsheets: A Medium for Creating Domain-Oriented Visual Languages", IEEE Computer, March: 17-25.
- TUFTE, E.R. (1983). The Visual Display of Quantitative Information, Cheshire, Connecticut: Graphics Press.