Killer Applications in Digital Humanities
Patrick Juola
Duquesne University
Pittsburgh, PA 15282
August 31, 2006
The emerging discipline of “digital humanities” has been plagued by
a perceived neglect on the part of the broader humanities community.
The community as a whole tends not to be aware of the tools developed
by DH practitioners (as documented by the recent surveys by Siemens et
al.), and tends not to take seriously many of the results of scholarship
obtained by DH methods and tools. This paper argues for a focus on
deliverable results in the form of useful solutions to common problems
that humanities scholars share, instead of simply new representations.
The question to address is what needs the humanities community has
that can be dealt with using DH tools and techniques, or equivalently
what incentive humanists have to take up and to use new methods. This
can be treated in some respects like the computational quest for the “killer
application” – a need of the user group that can be filled, and by filling
it, create an acceptance of that tool and the supporting methods/results.
Some definitions and examples are provided both to illustrate the idea
and to support why this is necessary. The apparent alternative is the
status quo, where digital research tools are brilliantly developed, only to
languish in neglect and disuse.
1 Introduction
“The emerging discipline of digital humanities”. . . . Arguably, “digital humani-
ties” has been emerging for decades, without ever having fully emerged. One of
the flagship journals of the field, Computers and the Humanities, has published
nearly forty volumes without having established the field as a mainstream sub-
discipline. The implications of this are profound: tenure-track opportunities for
DH specialists are rare, publications are not widely read or valued, and, perhaps
most seriously in the long run, the advances made are not used by mainstream
scholars.
This paper analyzes some of the patterns of neglect, the ways in which
mainstream humanities scholarship fails to value and participate in the digital
humanities community. It further suggests one way to increase the profile of
this research, by focusing on the identification and development of “killer” ap-
plications (apps), computer applications that solve significant problems in the
humanities in general.
2 Patterns of Neglect
2.1 Patterns of participation
A major indicator of the neglect of digital humanities as a humanities discipline
is the lack of participation, particularly by influential or high-impact scholars.
As an example, the flagship (or at least, longest running) journal in the field
of “humanities computing” is Computers and the Humanities, which has been
published since the 1960s. Despite this, the impact of this journal has been
minimal. The Journal Citation Reports database suggests that for 2005, the
impact factor of this journal (defined as “the number of current citations to
articles published in the two previous years divided by the total number of
articles published in the two previous years”1) is a relatively low 0.196. (This
is actually a substantial improvement from 2002’s impact factor of 0.078.) In
terms of averages from 2002–4, CHum was the 6494th most cited journal out
of a sample of 8011, scoring in only the 20th percentile. By contrast, the most
influential journal in the field of “computer applications,” Bioinformatics, scores
above 3.00; Computational Linguistics scores at 0.65; the Journal of Forensic
Science at 0.75. None of Literary and Linguistic Computing, Text Technology,
or the Journal of Quantitative Linguistics even made the sample.
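The arithmetic behind the JCR definition quoted above is simple to make concrete. The sketch below uses hypothetical counts (8 citations to 41 articles), chosen only because they land near CHum's reported 2005 figure of 0.196; the actual JCR counts are not given here.

```python
# Impact factor per the JCR definition quoted above: current citations
# to articles published in the two previous years, divided by the total
# number of articles published in those two years.
def impact_factor(citations_to_prev_two_years, articles_prev_two_years):
    return citations_to_prev_two_years / articles_prev_two_years

# Hypothetical counts: 8 citations to 41 articles comes out near
# CHum's reported 2005 figure.
print(round(impact_factor(8, 41), 3))
```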
In other words, scholars tend not to read, or at least cite, work published
under the heading of humanities computing. Do they even participate? In six
years of publication (1999-2004; volumes 33–38), CHum published 101 articles,
with 205 different authorial affiliations (including duplicates) listed. Who are
these authors, and do they represent high-profile and influential scholars? The
unfortunate answer is that they do not appear to. Of the 205 affiliations, only
5 are from “Ivy League” universities, the single most prestigious and influential
group of US universities. Similarly, of the 205 affiliations, only 16 are from
universities recognized by US News and World Report [USNews, 2006] as among
the top 25 departments in any of the disciplines of English, history,
or sociology. Only two affiliations are among the top ten in those disciplines.
While it is of course unreasonable to expect any group of American universities
to dominate a group of international scholars, the conspicuous and almost total
absence of faculty and students from top-notch US schools is still important.
Nor is this absence confined to US scholars; only one affiliation from the top
5 Canadian doctoral universities (according to the 2005 MacLean’s ranking)
appears. (Geoff Rockwell has pointed out that the MacLean’s rankings are
1, accessed June 15, 2006
School                        Papers (2005)             Papers (2006)
USNews Top 10                 7                         4
  Cal-Berkeley                1                         1
  Princeton                   1
  Stanford                    1                         2
  Columbia                    1
  Johns Hopkins
  Michigan-Ann Arbor          2
  UNC-Chapel Hill             1                         1
MacLean’s top 5               2                         3
  Toronto                     1 (3 authors)             1
  Western                                               1
  UBC                         1                         1
Ivies not otherwise listed    4                         6
  Brown                       4 (one paper, 2 authors)  6

Table 1: Universities included for analysis of the 2005 ACH/ALLC and 2006 DH conferences
not necessarily the “best” research universities in Canada, and that a better
list of elite research universities would be the so-called “Group of 10” or G–10
schools. Even with this list, only three papers — two from Alberta, one
from McMaster — appear.) Australian elite universities (the Go8) are slightly
better represented; three affiliations from Melbourne, one from Sydney. Only in
Europe is there broad participation from recognized elite universities such as the
LERU. The English-speaking LERU universities (UCL, Cambridge, Oxford, and
Edinburgh) are all represented, as are the universities of Amsterdam, Leuven,
Paris, and Utrecht despite the language barrier. However, students and faculty
from Harvard, Yale, Berkeley, Toronto, McGill, and Adelaide — in many cases,
the current and future leaders of the fields — are conspicuously absent.
Perhaps the real heavyweights are simply publishing their DH work else-
where, but are still a part of the community? A study of the 118 abstracts
accepted to the 2005 ACH/ALLC conference (Victoria) shows that only 7 in-
cluded affiliations from universities in the “top 10” of the USNews ranking.
Only two came from universities in the “top 5” of the Maclean ranking, and
only six from Ivies (four of those six were from the well-established specialist DH
program at Brown, a program unique among the Ivies). A similar analysis shows
low participation among the 151 abstracts at the 2006 DH conference (Paris).
The current and future leaders seem not to participate in the community, either.
2.2 Tools and awareness
People who do not participate in a field cannot be expected to be aware of
the developments it creates, an expectation sadly supported by recent survey
data. In particular, [Siemens et al., 2004, Toms and O’Brien, 2006] reported on
a survey of “the current needs of humanists” and announced that, while over
80% of survey respondents use e-text and over half use text analysis tools, they
are not even aware of “commonly available tools such as TACT, WordCruncher
and Concordancer.” The tools of which they are aware seem to be primarily
common Microsoft products such as Word and Access. This lack of awareness
is further supported by [Martin, 2005] (emphasis mine):
Some scholars see interface as the primary concern; [electronic]
resources are not designed to do the kind of search they want. Oth-
ers see selection as a problem; the materials that databases choose
to select are too narrow to be of use to scholars outside of that
field or are too broad and produce too many results. Still others
question the legitimacy of the source itself. How can an electronic
copy be as good as seeing the original in a library? Other, more
electronically oriented scholars, see the great value of accessibility of
these resources, but are unaware of the added potential for research
and teaching. The most common concern, however, is that schol-
ars believe they would use these resources if they knew they existed.
Many are unaware that their library subscribes to resources or that
universities are sponsoring this kind of research.
Similarly, [Warwick, 2004a] describes the issues involved with the Oxford
University Humanities Computing Unit (HCU). Despite its status as an “inter-
nationally renowned centre of excellence in humanities computing,”
[P]ersonal experience shows that it was extremely hard to con-
vince traditional scholars in Oxford of the value of humanities com-
puting research. This is partly because so few Oxford academics
were involved in any of the work the HCU carried out, and had little
knowledge of, or respect for, humanities computing research. Had
there been a stronger lobby of interested academics who had a vested
interest in keeping the centre going because they had projects asso-
ciated with it, perhaps the HCU could have become a valued part
of the humanities division. That it did not, demonstrates the con-
sequences of a lack of respect for digital scholarship amongst the [. . . ]
3 Killer Apps and Great Problems
One possible reason for this apparent neglect is a mismatch between the
expected needs of the audience (market) for the tools and the com-
munity’s actual needs. A recent paper [Gibson, 2005] on the development of
an electronic scholarly edition of Clotel may illustrate this. The edition itself
is a technical masterpiece, offering, among other things, the ability to compare
passages among the various editions and even to track word-by-word changes.
However, it is not clear who among Clotel scholars will be interested in using
this capacity or this edition; many scholars are happy with their print copies
and the capacities print grants (such as scribbling in the margins or reading on
a park bench). Furthermore, the nature of the Clotel edition does not lend itself
well either to application to other areas or to further extension. The knowledge
gained in the process of annotating Clotel does not appear to generalize to the
annotation of other works (certainly, no general consensus has emerged about
“best practices” in the development of a digital edition, and the various pro-
posals appear to be largely incompatible and even incomparable). The Clotel
edition is essentially a service offered to the broader research community in the
hope that it will be used, and runs a great risk of becoming simply yet another
tool developed by the DH specialists to be ignored.
Quoting further from [Martin, 2005]:
[Some scholars] feel there is no incentive within the university
system for scholars to use these kinds of new resources.
— let alone to create them.
This paper argues that for a certain class of resources, there should be no
need for an incentive to get scholars to use them. Digital humanities specialists
should be in a unique position both to identify the needs of mainstream hu-
manities scholars and to suggest computational solutions that the mainstream
scholars will be glad to accept.
3.1 Definition
The wider question to address, then, is what needs the humanities community
has that can be dealt with using DH tools and techniques, or equivalently what
incentive humanists have to take up and to use new methods. This can be
treated in some respects like the computational quest for the “killer applica-
tion” – a need of the user group that can be filled, and by filling it, create an
acceptance of that tool and the supporting methods/results. Digital Humanities
needs a “killer application.”
“Killer application” is a term borrowed from the discipline of computer sci-
ence. In its strictest form, it refers to an application program so useful that
users are willing to buy the hardware it runs on, just to have that program.
One of the earliest examples of such an application was the spreadsheet, as
typified by VisiCalc and Lotus 1-2-3. Having a spreadsheet made business
decision-making so much easier (and more accurate and profitable) that businesses
were willing to buy the computers (Apple IIs or IBM PCs, respectively) just to
run spreadsheets. Gamers by the thousands have bought Xbox gaming consoles
just to run Halo. A killer application is one that will make you not only buy
the product itself, but also invest in the necessary infrastructure to make the
product useful.
For digital humanities, this term should be interpreted in a somewhat broader
sense. Any intellectual product — a computer program, an abstract tool, a
theory, an analytic framework — can and should be evaluated in terms of the
“affordances” [Gibson, 2005, Ruecker and Devereux, 2004] it creates. In this
framework, an “affordance” is simply “an opportunity for action” [Ruecker and Devereux, 2004];
spreadsheets, for instance, create opportunities to make business decisions quickly
on the basis of incomplete or hypothesized data, while Halo creates the opportu-
nity for playing a particular game. Ruecker provides a framework for comparing
different tools in terms of their “affordance strength,” essentially the value of-
fered by the affordances of a specific tool.
In this broader context, a “killer app” is any intellectual construct that
creates sufficient affordance strength to justify the effort and cost of accepting,
not just the construct itself, but the supporting intellectual infrastructure. It is
a solution sufficiently interesting to, by itself, retrospectively justify looking at
the problem it solves — a Great Problem that can both empower and inspire.
Three properties appear to characterize such “killer apps”. First, the prob-
lem itself must be real, in the sense that other humanists (or the public at large)
should be interested in the fruits of its solution. For example, the organizers of
a recent NSF summit on “Digital Tools for the Humanities” identified several
examples of the kinds of major shifts introduced by information technology in
various areas. In their words,
When information technology was first applied [to inventory-
based businesses], it was used to track merchandise automatically,
rather than manually. At that time, the merchandise was stored
in the same warehouses, shipped in the same way, depending upon
the same relations among producers and retailers as before [. . . ]. Today,
a revolution has taken place. There is a whole new concept
of just-in-time inventory delivery. Some companies have eliminated
warehouses altogether, and the inventory can be found at any instant
in the trucks, planes, trains, and ships delivering sufficient inventory
to re-supply the consumer or vendor — just in time. The result
of this is a new, tightly interdependent relationship between sup-
pliers and consumers, greatly reduced capital investment in “idle”
merchandise, and dramatically more responsive service to the final [consumer].
A killer application in scholarship should be capable of effecting similar
change in the way that practicing scholars do their work. Only if the prob-
lem is real can an application solving it be a killer. The Clotel edition described
above appears to fail under this property precisely because only specialists in
Clotel (or in 19th-century or African-American literature) are likely to be inter-
ested in the results; a specialist in the Canterbury Tales will not find her work
materially affected.
Second, the problem must get buy-in from the humanities computing com-
munity itself, in that humanities computing specialists will be motivated to do
the actual work. The easiest and probably cheapest way to do this is for the
process of solution itself to be interesting to the participating scholars. For
example, the compiling of a detailed and subcategorized bibliography of all ref-
erences to a given body of work would be of immense interest to most scholars;
rather than having to pore through dozens of issues of thousands of journals,
they could simply look up their field of interest. (This is, in fact, very close to
the service that Thomson Scientific provides with the Social Science Citation
Index, or that Penn State provides with CiteSeer.) The problem is that though
the product is valuable, the process of compiling it is dull, dreary, and unre-
warding. There is little room for creativity, insight, and personal expression
in such a bibliography. Most scholars would not be willing to devote substan-
tial effort — perhaps several years of full-time work — to a project with such
minimal reward. (By contrast, the development of a process to automatically
create such a bibliography could be interesting and creative work.) The process
of solving interesting problems will almost automatically generate papers and
publications, draw others into the process of solving it, and create opportuni-
ties for discussion and debate. We can again compare this to the publishing
opportunities for a bibliography — is “my bibliography is now 50% complete”
a publishable result?
Third, the problem itself must be such that even a partial solution or an
incremental improvement will be useful and/or interesting. Any problem that
meets the two criteria above is unlikely to submit to immediate solution (oth-
erwise someone would probably already have solved it). Similarly, any such
problem is likely to be sufficiently difficult that solving it fully would be a ma-
jor undertaking, beyond the resources that any single individual or group could
likely muster. On the other hand, being able to develop, deploy, and use a par-
tial solution will help advance the field in many ways. The partial solution, by
assumption, is itself useful. Beyond that, researchers and users have an incen-
tive to develop and deploy improvements. Finally, the possibility of supporting
and funding incremental improvements makes it more likely to get funding, and
enhances the status of the field as a whole.
3.2 Some historical examples
To more fully understand this idea of a killer app, we should first consider the
history of scholarly work, and imagine the life of a scholar c. 1950. He (probably)
spends much of his life in the library, reading paper copies of journal articles and
primary sources to which he (or his library) has access, taking detailed notes by
hand on index cards, and laboriously writing drafts in longhand which he will
revise before finally typing (or giving to a secretary to type). His new ideas are
sent to conferences and journals, eventually to find their way into the libraries
of other scholars worldwide over a period of months or years. Collaboration
outside of his university is nearly unheard-of, in part because the process of
exchanging documents is so difficult.
Compare that with the modern scholar, who can use a photocopier or scan-
ner to copy documents of interest and write annotations directly on those copies.
She can use a word processor (possibly on a portable computer) both to take
research notes and to extend those notes into articles; she has no need to write
complete drafts, can easily rearrange or incorporate large blocks of text, and
can take advantage of the computer to handle “routine” tasks such as spelling
correction, footnote numbering, bibliography formatting, and even pagination.
She can directly incorporate the journal’s formatting requirements into her work
(so that the publisher can legitimately ask for “camera-ready” manuscripts as
a final draft), eliminating or reducing the need both for typists and typesetters.
She can access documents from the comfort of her own office or study via an
electronic network, and use advanced search technology to find and study docu-
ments that her library does not itself hold. She can similarly distribute her own
documents through that same network and make them available to be found by
other researchers. Her entire work-cycle has been significantly changed (for the
better, one hopes) by the availability of these computation resources.
We thus have several historical candidates for what we are calling “killer
apps”: xerographic reproduction and scanning, portable computing (both ar-
guably hardware instead of software), word processing and desktop publishing
(including subsystems such as bibliographic packages and spelling checkers), net-
worked communication such as Email and the Web, and search technology such
as Google. These have all clearly solved significant issues in the way humanities
research is generally performed (i.e., met the first criterion). In Ruecker’s terms,
they have all created “affordances” of the sort that no modern scholar would
choose to forego. The amount of research work — journals, papers, patents,
presentations, and books — devoted to these topics suggests that researchers
themselves are interested in solving the problems and improving the technolo-
gies, in many cases incrementally (e.g., “how can a search engine be tuned to
find documents written in Thai?”).
Of course, for many of these applications, the window of opportunity has
closed, or at least narrowed. A group of academics is unlikely to have the
resources to build or deploy a product competing with Microsoft or Google.
On the other hand, the very fact that humanities scholars are something
of a niche market may open the door to incremental killer apps based upon (or
built as extensions to) mainstream software, applications focused specifically on
the needs of practicing scholars. The next section presents a partial list of some
candidates that may yield killer applications in the foreseeable future. Some of
these candidates are taken from my own work, some from the writings of others.
3.3 Potential current killer apps
3.3.1 Back of the Book Index Generation
Almost every nonfiction book author has been faced with the problem of index-
ing. For many, this will be among the most tedious, most difficult, and least
rewarding parts of writing the book. The alternative is to hire a professional
indexer (perhaps a member of an organization such as the American Society of
Indexers) and pay a substantial fee, which simply shifts the uncomfortable
burden to someone else, but does not substantially reduce it.
A good index provides much more than the mere ability to find information
in a text. The Clive Pyne book indexing company2 lists some aspects of what
a good index provides. According to them, “a good index:

- provides immediate access to the important terms, concepts and names
  scattered throughout the book, quickly and efficiently;
- discriminates between useful information on a subject, and a passing mention;
- has headings which are concise, accurate and unambiguous, reflecting the
  contents and terminology used in the text;
- has sufficient cross-references to connect related terms;
- anticipates how readers will search for information;
- reveals the inter-relationships of topics, concepts and names so that the
  reader need not read the whole index to find what they are looking for;
- provides terminology which might not be used in the text, but is the
  reference point that the reader will use for searching through the index;
- can make the difference between a book and a very good book”
A traditional back-of-the-book (BotB) index is a substantial intellectual ac-
complishment in its own right. In many ways, it is an encapsulated and stylized
summary of the intellectual structure of the book itself. “A good index is an
objective guide to the text, a link between the author’s ideas and the reader.
It should be a road map that leads readers to every relevant idea without frus-
trating detours and dead ends.”3 And it is specifically not just a concordance
or a list of terms appearing in the document.
It is thus surprising that a tedious task of such importance has not yet been
computerized. This is especially surprising given the effectiveness of search en-
gines such as Google at “indexing” the unimaginably large volume of information
on the Web. However, the tasks are subtly different; a Google search is not ex-
pected to show knowledge of the structure of the documents or the relationships
2 makes a good index.htm, accessed 5/31/2006
3Kim Smith, accessed 5/31/2006.
among the search terms. As a simple example, a phrasal search on Google (May
31, 2006) for “a good index,” found, as expected, several articles on back of the
book indexing. It also found several articles on financial indexing and index
funds, and a scholarly paper on glycemic control as measured (“indexed”) by
plasma glucose concentrations. A good text index would be expected to identify
these three subcategories, to group references appropriately, and to offer them
to the reader proactively as three separate subheadings. A good text index is
not simply a search engine on paper, but an intellectual precis of the structure
of the text.
This is therefore an obvious candidate for a killer application. Every hu-
manities scholar needs such a tool. Indeed, since chemistry texts need indexing
as badly as history texts do, scholars outside of the humanities also need it.
Unfortunately, not only does it not (yet) exist, but it isn’t even clear at this
writing what properties such a tool would have. Thus there is room for fun-
damental research into the attributes of indices as a genre of text, as well as
into the fundamental processes of compiling and evaluating indices and their
expression in terms of algorithms and computation.
I have presented elsewhere [Juola, 2005, Lukon and Juola, 2006] a possible
framework to build a tool for the automatic generation of such indices. With-
out going into technical detail, the framework identifies several important (and
interesting) cognitive/intellectual tasks that can be independently solved in an
incremental fashion. Furthermore, this entire problem clearly admits of an in-
cremental solution, because a less-than-perfect index, while clearly improvable,
is still better than no index at all, and any time saved by automating the more
tedious parts of indexing will still be a net gain to the indexer. Thus all three
components of the definition of killer app given above are present, suggesting
that the development of such an indexing tool would be beneficial both inside
and outside the digital humanities community.
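To make the incremental-solution point concrete, here is a deliberately naive sketch of the easy half of the problem: formatting grouped index entries once headings, subheadings, and page occurrences are already known. This is illustrative only — it is not the framework of [Juola, 2005]; the intellectually hard steps (choosing the terms and subheadings) are exactly what remain open.

```python
# Naive back-of-the-book index formatter: given (heading, subheading,
# page) occurrences, group pages under subheadings and sort everything
# the way a printed index would. The hard, unsolved problem discussed
# above is producing these tuples from raw text in the first place.
from collections import defaultdict

def build_index(occurrences):
    index = defaultdict(lambda: defaultdict(set))
    for heading, subheading, page in occurrences:
        index[heading][subheading].add(page)
    lines = []
    for heading in sorted(index):
        lines.append(heading)
        for sub in sorted(index[heading]):
            pages = ", ".join(str(p) for p in sorted(index[heading][sub]))
            lines.append(f"  {sub}, {pages}")
    return "\n".join(lines)

# Mirrors the Google example above: three distinct senses of "index"
# that a good text index would present as separate subheadings.
sample = [("index", "back-of-the-book", 12), ("index", "financial", 47),
          ("index", "back-of-the-book", 3), ("index", "glycemic", 90)]
print(build_index(sample))
```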
3.3.2 Annotation tools
As discussed above, one barrier to the use of E-texts and digital editions is the
current practices of scholars with regard to annotation. Even when documents
are available electronically, many researchers (myself included) will often choose
to print them and study them on paper. Paper permits one not only to mark
text up and to make changes, but also to make free-form annotations in the
margins, to attach PostIt notes in a rainbow of colors, and to share commentary
with a group of colleagues. Annotation is a crucial step in recording a reader’s
encounter with a text, in developing an interpretation, and in sharing that
interpretation with others.
The recent IATH Summit on Digital Tools for the Humanities [IATH Summit, 2006]
identified this process of annotation and interpretation as a key process underly-
ing humanistic scholarship, and specifically discussed the possible development
of a tool for digital annotation, a “highlighter’s tool,” that would provide the
same capacities of annotation of digital documents, including multimedia doc-
uments, that print provides. The flexibility of digital media means, in fact, that
one should be able to go beyond the capacities of print — for example, instead
of doodling a simple drawing in the margin of a paper, one might be able to
“doodle” a Flash animation or a .wav sound file.
Discussants identified at least nine separate research projects and communi-
ties that would benefit from such a tool. Examples include “a scholar currently
writing a book on Anglo-American relations, who is studying propaganda films
produced by the US and UK governments and needs to compare these with
text documents from on-line archives, coordinate different film clips, etc.”; “an
add-on tool for readers (or reviewers) of journal articles,” especially of electronic
journal systems (The current system of identifying comments by page and line
number, for example, is cumbersome for both reviewers and authors.); and “an
endangered language documentation project that deals with language variation
and language contact,” where multilingual, multialphabet, and multimedia re-
sources must be coordinated among a broad base of scholars. Such a tool has
the potential to change the annotation process as much as the word processor
has changed the writing and publication process.
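As a thought experiment, one can sketch the minimum such a "highlighter's tool" would need to record: an anchor into an arbitrary media object, and a body that may itself be any medium. All field names below are hypothetical illustrations, not a design from the summit report.

```python
# Hypothetical data model for a digital annotation: an anchor into any
# media object (character offsets for text, seconds for film or audio)
# plus an annotation body that may itself be any medium, plus sharing
# metadata so commentary can circulate among colleagues.
from dataclasses import dataclass, field

@dataclass
class Anchor:
    document_id: str   # which text, film clip, or recording
    media_type: str    # "text", "video", "audio", "pdf", ...
    start: float       # character offset or seconds
    end: float

@dataclass
class Annotation:
    anchor: Anchor
    body_media_type: str   # the note itself may be text, audio, a doodle...
    body: bytes
    author: str
    shared_with: list = field(default_factory=list)

# Echoes the summit's propaganda-film example: a note on a film segment,
# shared with a collaborator.
note = Annotation(
    anchor=Anchor("propaganda_film_03", "video", 12.5, 31.0),
    body_media_type="text",
    body=b"Compare with the text documents in the on-line archive.",
    author="scholar_a",
    shared_with=["scholar_b"],
)
print(note.anchor.media_type)
```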
Can community buy-in be achieved? There is certainly room for research
and for incremental improvements, both in defining the standards and capacities
of the annotations and in expanding those capacities to meet new requirements
as they evolve. For example, early versions of such a project would probably not
be capable of handling all forms of multimedia data; a research-quality prototype
might simply handle PDF files and sound, but not video. It is not clear that the
community support is available for building early, simple versions. Although “a
straw poll showed that half of [the discussants] wanted to build this kind of tool,
and all wanted to use it” [IATH Summit, 2006], responding to a straw poll is
one thing and devoting time and resources is another altogether; it is not clear
that any software development on this project has yet happened. However, given
the long-term potential uses and research outcomes from this kind of project, it
clearly has the potential to be a killer application.
3.3.3 Resource exploration
Another issue raised at the summit is that of resource discovery and explo-
ration. The huge amount of information on the Web is, of course, a tremendous
resource for all of scholarship, and companies such as Google (especially with
new projects such as Google Images and Google Scholar) are excellent at finding
and providing access. On the other hand, “such commercial tools are shaped
and defined by the dictates of the commercial market, rather than the more
complex needs of scholars.” [IATH Summit, 2006] This raises issues about ac-
cess to more complex data, such as textual markup, metadata, and data hidden
behind gateways and search interfaces. Even where such data is available, it is
rarely compatible from one database to another, and it’s hard to pose questions
to take advantage of the markup.
In the words of the summit report,
What kinds of tools would foster the discovery and exploration
of digital resources in the humanities? More specifically, how can we
easily locate documents (in multiple formats and multiple media),
find specific information and patterns in across [sic] large numbers
of scholarly disciplines and social networks? These tasks are made
more difficult by the current state of resources and tools in the hu-
manities. For example, many materials are not freely available to
be crawled through or discovered because they are in databases that
are not indexed by conventional search engines or because they are
behind subscription-based gates. In addition, the most commonly
used interfaces for search and discovery are difficult to build upon.
And, the current pattern of saving search results (e.g., bookmarks)
and annotations (e.g., local databases such as EndNote) on local
hard drives inhibits a shared scholarly infrastructure of exploration,
discovery, and collaboration.
Again, this has the potential to effect significant change in the day-to-day
working life of a scholar, by making collaborative exploration and discovery
much more practical and rewarding, possibly changing the culture by creating
a new “scholarly gift economy in which no one is a spectator and everyone can
readily share the fruits of their discovery efforts." "Research in the sciences has
long recognized team efforts. . . . A similar emphasis on collaborative research
and writing has not yet made its way into the thinking of humanists.”
But, of course, what kind of discovery tools would be needed? What kind of
search questions should be supported? How can existing resources such as lexi-
cons and ontologies be incorporated into the framework? How can it take advan-
tage of (instead of competing with) existing commercial search utilities? These
questions illustrate many of the possible research avenues that could be explored
in the development of such an application. Jockers' idea of "macro lit-o-nomics
(macro-economics for literature)" [Jockers, 2005] is one approach that has been
suggested for developing useful analyses from large datasets; Ruecker and
Devereux's [Ruecker and Devereux, 2004] "Just-in-Time" text analysis is
another. In both projects, the researchers showed that interesting conclusions
could be drawn by analyzing the large-scale results of automatically-discovered
resources and looking at macro-scale patterns of language and thought.
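The flavour of such macro-scale analysis can be sketched in a few lines. The
function and corpus below are invented for illustration and are not the actual
tools of Jockers or of Ruecker and Devereux; the idea is simply that relative
word frequencies, tracked across a large set of dated documents, can reveal
broad patterns that no single close reading would show:

```python
# Illustrative sketch: track the relative frequency of one term across
# a corpus of (year, text) pairs, to expose macro-scale patterns.
from collections import Counter, defaultdict

def frequency_trend(corpus, term):
    """corpus: iterable of (year, text). Returns {year: relative frequency}."""
    totals, hits = defaultdict(int), defaultdict(int)
    for year, text in corpus:
        words = text.lower().split()
        totals[year] += len(words)
        hits[year] += Counter(words)[term]
    return {y: hits[y] / totals[y] for y in sorted(totals)}

# Toy data, invented for the example.
docs = [(1840, "the railway changed the nation"),
        (1840, "a quiet village morning"),
        (1890, "the railway and the telegraph bind the empire"),
        (1890, "the railway timetable rules the day")]
trend = frequency_trend(docs, "railway")
# Rising values across years hint at a theme gaining prominence.
```

At the scale of thousands of documents rather than four, the same arithmetic
surfaces exactly the kind of large-scale patterns of language and thought that
these projects analyze.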
3.3.4 Automatic essay grading
The image of a bleary-eyed teacher, bent over a stack of essays far past
her bedtime, is a traditional one. Writing is a traditional and important part
of the educational process, but most instructors find the grading of essays
time-consuming, tedious, and unrewarding. This applies regardless of the subject;
essays on Shakespeare are not significantly more fun to grade than essays
on the history of colonialism. The essay grading problem is one reason that
multiple choice tests are so popular in large classes. We thus have another po-
tential “killer app,” an application to handle the chore of grading essays without
interfering with the educational process.
Several approaches to automatic essay grading have been tried, with rea-
sonable but not overwhelming success. At a low enough level, essay grading
can be done successfully just by looking at aspects of spelling, grammar, and
punctuation, or at stylistic continuity [Page, 1994]. Foltz [Foltz et al., 1999] has
also shown good results by comparing semantic coherence (as measured, via Latent
Semantic Analysis, from word co-occurrences) with that of essays of known quality:
LSA's performance produced reliabilities within the range of their
comparable inter-rater reliabilities and within the generally accepted
guidelines for minimum reliability coefficients. For example, in a set
of 188 essays written on the functioning of the human heart, the av-
erage correlation between two graders was 0.83, while the correlation
of LSA’s scores with the graders was 0.80. . . .
In a more recent study, the holistic method was used to grade
two additional questions from the GMAT standardized test. The
performance was compared against two trained ETS graders. For
one question, a set of 695 opinion essays, the correlation between
the two graders was 0.86, while LSA’s correlation with the ETS
grades was also 0.86. For the second question, a set of 668 analysis
of argument essays, the correlation between the two graders was 0.87,
while LSA’s correlation to the ETS grades was 0.86. Thus, LSA was
able to perform near the same reliability levels as the trained ETS graders.
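The core of the LSA approach can be sketched briefly. The code below is a
minimal illustration, not the system of Foltz et al.: it builds a term-document
matrix, reduces it with a truncated SVD (the heart of Latent Semantic Analysis),
and then assigns an ungraded essay the grade of its most semantically similar
pregraded neighbour, a deliberate simplification of the holistic method. All
names and data are invented for the example:

```python
# Sketch of LSA-style essay scoring on toy data (not Foltz et al.'s system).
import numpy as np
from collections import Counter

def term_doc_matrix(docs, vocab):
    """Raw term-by-document count matrix."""
    M = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        counts = Counter(doc.lower().split())
        for i, term in enumerate(vocab):
            M[i, j] = counts[term]
    return M

def lsa_grade(graded, grades, essay, k=2):
    docs = graded + [essay]
    vocab = sorted({w for d in docs for w in d.lower().split()})
    M = term_doc_matrix(docs, vocab)
    # Truncated SVD: keep only the k strongest "semantic" dimensions.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    coords = (np.diag(s[:k]) @ Vt[:k]).T        # one row per document
    target, rest = coords[-1], coords[:-1]
    # Cosine similarity between the new essay and each graded essay.
    sims = rest @ target / (np.linalg.norm(rest, axis=1)
                            * np.linalg.norm(target) + 1e-12)
    # Predict the grade of the most semantically similar graded essay.
    return grades[int(np.argmax(sims))]
```

For example, an essay about blood and the heart will land nearer, in the
reduced space, to a pregraded essay on the same topic than to one about an
unrelated subject, and inherits its grade accordingly.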
Beyond simply reducing the workload of the teacher, this tool has many
other uses. It can be used, for example, as a method of evaluating a teacher for
consistency in grading, or for ensuring that several different graders for the same
class use the same standards. More usefully, perhaps, it can be used as a teach-
ing adjunct, by allowing students to submit rough drafts of their essays to the
computer and re-write until they (and the computer) are satisfied. This will also
encourage the introduction of writing into the curriculum in areas outside of tra-
ditional literature classes, and especially into areas where the faculty themselves
may not be comfortable with the mechanics of teaching composition. Research
into automatic essay grading is an active area among text categorization scholars
and computer scientists, for the reasons cited above [Valenti et al., 2003].
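The consistency check mentioned above is, in the simplest case, just the
Pearson correlation between two graders' scores, the same statistic reported in
the Foltz et al. comparisons quoted earlier. A short sketch, with toy scores
invented for the example (not ETS data):

```python
# Pearson correlation between two graders' scores; values near 1.0
# indicate that the graders apply similar standards.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

# Toy scores on a six-essay assignment, invented for the example.
grader_a = [4, 3, 5, 2, 4, 3]
grader_b = [4, 3, 4, 2, 5, 3]
```

Substituting the machine's scores for one of the human graders gives exactly
the human-machine reliability figures (0.80, 0.86) quoted above.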
From a philosophical point of view, though, it’s not clear that this approach
to essay grading should be acceptable. A general-purpose essay grader can do
a good job of evaluating syntax and spelling, and even (presumably) grading
"semantic coherence" by checking whether an acceptable percentage of the words lie
close enough together in the abstract space of ideas. What such a grader cannot do
is evaluate factual accuracy or provide discipline-specific information. Further-
more, the assumption that there is a single grade that can be assigned to an
essay, irrespective of context and course focus, is questionable. Here is an area
where a problem has already been identified, applications have been and con-
tinue to be developed, uptake by a larger community is more or less guaranteed,
but the input of humanities specialists is crucially needed to improve the service
quality provided.
4 Discussion
The list of problems in the preceding section is not meant to be either exclusive
or exhaustive, but merely to illustrate the sort of problems for which killer apps
can be designed and deployed. Similarly, the role for humanities specialists to
play will vary from project to project – in some cases, humanists will need to
play an advisory role to keep a juggernaut from going out of control (as might
be needed with the automatic grading), while in others, they will need to create
and nurture a software project from scratch. The list, however, shares enough
to illustrate both the underlying concept and its significance. In other words,
we have an answer to the question “what?” — what do I mean by a “killer
application,” what does it mean for the field of digital humanities, and, as I
hope I have argued, what can we do to address the perennial problem of neglect
by the mainstream.
An equally important question, of course, is “how?” Fortunately, there
appears to be a window opening, a window of increased attention and avail-
able research opportunities in the digital humanities. The IATH summit cited
above [IATH Summit, 2006] is one example, but there are many others. Re-
cent conferences such as the first Text Analysis Developers Alliance (TADA),
in Hamilton (2005), the Digital Tools Summit for Linguistics in East Lansing
(2006), the E-MELD Workshops (various locations, 2000–6), the Cyberinfras-
tructure for Humanities, Arts, and Social Sciences workshop at UCSD (2006),
and the recent establishment of the Working Group on Community Resources
for Authorship Attribution (New Brunswick, NJ; 2006) illustrate that digital
scholarship is being taken more seriously. The appointment of Ray Siemens in
2004 as Canada Research Chair in Humanities Computing is another important
milestone, marking perhaps the first recognition by a national government
of the significance of Humanities Computing as an acknowledged discipline.
Perhaps most important in the long run is the availability of funding to
support DH initiatives. Many of the workshops and conferences described above
were partially funded by competitively awarded research grants from national
agencies such as the National Science Foundation. The Canadian Foundation
for Innovation has been another major source of funding for DH initiatives. But
perhaps the most significant development is the new (2006) Digital Humanities
Initiative at the (United States) National Endowment for the Humanities. From
the initiative's website (accessed June 18, 2006):
NEH has launched a new digital humanities initiative aimed
at supporting projects that utilize or study the impact of digital
technology. Digital technologies offer humanists new methods of
conducting research, conceptualizing relationships, and presenting
scholarship. NEH is interested in fostering the growth of digital humanities
and lending support to a wide variety of projects, including
those that deploy digital technologies and methods to enhance our
understanding of a topic or issue; those that study the impact of
digital technology on the humanities–exploring the ways in which it
changes how we read, write, think, and learn; and those that digitize
important materials thereby increasing the public’s ability to search
and access humanities information.
The list of potentially supported projects is large:
- apply for a digital humanities fellowship (coming soon!)
- create digital humanities tools for analyzing and manipulating humanities data (Reference Materials Grants, Research and Development Grants)
- develop standards and best practices for digital humanities (Research and Development Grants)
- create, search, and maintain digital archives (Reference Materials Grants)
- create a digital or online version of a scholarly edition (Scholarly Editions Grants)
- work with a colleague on a digital humanities project (Collaborative Research Grants)
- enhance my institution's ability to use new technologies in research, education, preservation, and public programming in the humanities (Challenge Grants)
- study the history and impact of digital technology (Fellowships, Faculty Research Awards, Summer Stipends)
- develop digitized resources for teaching the humanities (Grants for Teaching and Learning Resources)
Most importantly, this represents an agency-wide initiative, and thus illus-
trates the changing relationship between the traditional humanities and digital
scholarship at the very highest levels.
Of course, just as windows can open, they can close. To ensure continued
access to this kind of support, the supported research needs to be successful.
This paper has deliberately set the bar high for "success," arguing that digital
products can and should result in substantial uptake and effect significant
changes in, as NEH put it, "how we read, write, think, and learn."
The possible problems discussed earlier are an attempt to show that we can
effect such changes. But the most important question, of course, is "why?"
Why should scholars in the digital humanities try to develop this
software and make these changes? The first obvious answer is simply one of self-
interest as a discipline. Solving high-profile problems is one way of attracting
the attention of mainstream scholars and thereby getting professional advance-
ment. Warwick [Warwick, 2004b] illustrates this in her analysis of the citations
of computational methods, and the impact of a single high-profile example. Of
all articles studied, the only ones that cited computational methods did so in the
context of Don Foster's controversial attribution of "A Funeral Elegy" to Shakespeare:
The Funeral Elegy controversy provides a case study of circum-
stances in which the use of computational techniques was noticed
and adopted by mainstream scholars. The paper argues that a com-
plex mixture of a canonical author (Shakespeare) and a star scholar
(Foster) brought the issue to prominence. . . .
The Funeral Elegy debate shows that if the right tools for tex-
tual analysis are available, and the need for, and use of, them is
explained, some mainstream scholars may adopt them. Despite the
current emphasis on historical and cultural criticism, scholars will
surely return in time to detailed analysis of the literary text. There-
fore researchers who use computational methods must publish their
results in literary journals as well as those for humanities computing
specialists. We must also realize that the culture of academic disci-
plines is relatively slow to change, and must engage with those who
use traditional methods. Only when all these factors are understood
and are working in concert, may computational analysis techniques
truly be more widely adopted.
Implicit in this, of course, is the need for scholars to find results that are
publishable in mainstream literary journals as well as to do the work resulting
in publication, the two main criteria of killer apps.
On a less selfish note, the development of killer applications will improve the
overall state of scholarship as a whole, without regard to disciplinary boundaries.
While change for its own sake may not necessarily be good, solutions to genuine
problems usually are. Creating the index to a large document is not fun —
it requires days or weeks of painstaking, detailed labor that few enjoy. The
inability to find or access needed resources is not a good thing. By eliminating
artificial or unnecessary restrictions on scholarly activity, scholars are freed to
do what they really want to do — to read, to write, to analyze, to produce
knowledge, and to distribute it.
Furthermore, the development of such tools will in and of itself generate
knowledge, knowledge that can be used not only to generate and enhance new
tools but to help understand and interpret the humanities more generally. Soft-
ware developers must be long-term partners with the scholars they serve, but
digital scholars must also be long-term partners, not only with the software de-
velopers, but with the rest of the discipline and its emerging needs. In many
cases, digital scholars are uniquely placed to identify and to describe the
emerging needs of the discipline as a whole. With a foot in two camps, the
digital scholars will be able to speak to the developers about what is needed,
and to the traditional scholars about what is available as well as what is under
development.
5 Conclusion
Predicting the future is always difficult, and predicting the effects of a newly-
opened window is even more so. But recent developments suggest that digital
humanities, as a field, may be at the threshold of a new series of significant
developments that can change the face of humanities scholarship and allow the
“emerging discipline of humanities computing” finally to emerge.
For the past forty years, humanities computing has more or less languished
in the background of traditional scholarship. Scholars lack incentive to participate
in (or even to learn about) the results of humanities computing. This paper
argues that DH specialists are placed to create their own incentives by develop-
ing applications with sufficient scope to materially change the way humanities
scholarship is done. I have suggested four possible examples of such applications,
knowing well that many more are out there. I believe that by actively
seeking out and solving such Great Problems – by developing such killer apps –
scholarship in general, and digital humanities in particular, will be well served.
[Foltz et al., 1999] Foltz, P. W., Laham, D., and Landauer, T. K. (1999). Auto-
mated essay scoring: Applications to educational technology. In Proceedings
of EdMedia ’99.
[Gibson, 2005] Gibson, M. (2005). Clotel: An electronic scholarly edition. In
Proceedings of ACH/ALLC 2005, Victoria, BC CA. University of Victoria.
[IATH Summit, 2006] IATH Summit (2006). Summit on digital tools for the
humanities : Report on summit accomplishments.
[Jockers, 2005] Jockers, M. (2005). XML aware tools — catools. In Presentation
at Text Analysis Developers Alliance, McMaster University, Hamilton, ON.
[Juola, 2005] Juola, P. (2005). Towards an automatic index generation tool. In
Proceedings of ACH/ALLC 2005, Victoria, BC CA. University of Victoria.
[Lukon and Juola, 2006] Lukon, S. and Juola, P. (2006). A context-sensitive
computer-aided index generator. In Proceedings of DH 2006, Paris. Sorbonne.
[Martin, 2005] Martin, S. (2005). Reaching out: What do scholars want from
electronic resources? In Proceedings of ACH/ALLC 2005, Victoria, BC CA.
University of Victoria.
[Page, 1994] Page, E. B. (1994). Computer grading of student prose using mod-
ern concepts and software. Journal of Experimental Education, 62:127–142.
[Ruecker and Devereux, 2004] Ruecker, S. and Devereux, Z. (2004). Scraping
Google and Blogstreet for Just-in-Time text analysis. In Presented at CaSTA-
04, The Face of Text, McMaster University, Hamilton, ON.
[Siemens et al., 2004] Siemens, R., Toms, E., Sinclair, S., Rockwell, G., and
Siemens, L. (2004). The humanities scholar in the twenty-first century: How
research is done and what support is needed. In Proceedings of ALLC/ACH
2004, Gothenburg. U. Gothenburg.
[Toms and O’Brien, 2006] Toms, E. G. and O’Brien, H. L. (2006). Understand-
ing the information and communication technology needs of the e-humanist.
Journal of Documentation, (accepted/forthcoming).
[USNews, 2006] USNews (2006). U.S. News and World Report : America’s best
graduate schools (social sciences and humanities).
[Valenti et al., 2003] Valenti, S., Neri, F., and Cucchiarelli, A. (2003). An
overview of current research on automated essay grading. Journal of In-
formation Technology Education, 2:319–330.
[Warwick, 2004a] Warwick, C. (2004a). No such thing as humanities computing?
An analytical history of digital resource creation and computing in the
humanities. In Proceedings of ALLC/ACH 2004, Gothenburg. U. Gothenburg.
[Warwick, 2004b] Warwick, C. (2004b). Whose funeral? A case study of computational
methods and reasons for their use or neglect in English studies. In
Presented at CaSTA-04, The Face of Text, McMaster University, Hamilton, ON.
... La tâche compliquée de l'adaptation mutuelle (adaptation de l'outil aux besoins spécifiques des utilisateurs et l'apprentissage de ce nouvel outil par ses utilisateurs) est aussi exposée. À cela s'ajoute souvent la difficulté d'intéresser les IT aux SHS, avec pour conséquence une absence de vraie réciprocité dans les échanges entre ces deux domaines [Juola 2008]. ...
Article Cet article reprend les potentialités et les difficultés rencontrées lors de l’élaboration d’outils numériques dans le cadre d’un projet interdisciplinaire investiguant des dessins de dieux chez l’enfant et l’adolescent. Il se concentre sur les réorientations du projet qui en ont découlé par rapport aux attentes de départ. En particulier, nous décrirons comment des solutions ont été trouvées pour profiter des potentialités des outils numériques et quels choix ont dû être effectués concernant ces outils, en fonction de quatre types de défis : l’introduction et l’adaptation d’outils numériques aux questions de recherche des sciences humaines et sociales, le développement simultané des outils et des questions de recherche, les limitations techniques dues à la nouveauté du sujet du point de vue numérique, et la collaboration entre les sciences humaines et les spécialistes du numérique.
... Pliny (, and the Open Annotation Collaboration ( (Bradley, 2012;Juola, 2008). Yet however many technical teams work on the problem, and however many interesting digital solutions are offered, the majority of users continue to make use of the method used by Cosin himself: writing notes by hand, sometimes in the text of the book itself. ...
Certain problems in the design of digital systems for use in cultural heritage and the humanities have proved to be unexpectedly difficult to solve. For example, Why is it difficult to locate ourselves and understand the extent and shape of digital information resources? Why is digital serendipity still so unusual? Why do users persist in making notes on paper rather than using digital annotation systems? Why do we like to visit and work in a library, and browse open stacks, even though we could access digital information remotely? Why do we still love printed books, but feel little affection for digital e-readers? Why are vinyl records so popular? Why is the experience of visiting a museum still relatively unaffected by digital interaction? The article argues that the reasons these problems persist may be due to the very complex relationship between physical and digital information and information resources. I will discuss the importance of spatial orientation, memory, pleasure, and multi-sensory input, especially touch, in making sense of, and connections between physical and digital information. I will also argue that, in this context, we have much to learn from the designers of early printed books and libraries, such as the Priory Library and that of John Cosin, a seventeenth-century bishop of Durham, which is part of the collections of Durham University library.
... Additional question is whether the field of digital humanities represents the best route of that transition. While discourses surrounding digital humanities assert expectations about revolutionizing and/or saving humanities (see 4Humanities, Mission), studies suggest that this field still resides on the margins of humanities scholarship (see Juola, 2008;Thaller, 2012). At the same time, digital technologies have become prevalent in humanists' daily practices, revealing an evolution of humanities scholarship that has been slowly unfolding in the background, transforming humanists' research and teaching practices (see Liu, 2009). ...
... Tools für Dokumentationszwecke, Evaluation und die Überprüfung von Qualität leisten ihren Beitrag zu valider Wissenschaft, haben aber eine geringere Verbreitung (Bosman und Kramer, 2015). Juola (2008) schreibt, dass vor allem Tools, die drängende Probleme der Geisteswissenschaften lösen, das Potential haben zu "Killer applications" zu werden. Er führt hier beispielsweise einen Generator für Indizes sowie Tools zur Annotation von Texten an, als auch Tools, um Ressourcen zu entdecken und näher zu erforschen. ...
... 6 In a self-reflective paper, Peter Robinson -on behalf of digital editorial philology -ponders the slow changeover from printed critical editions to digital text editions, 7 and similarly Patric Juola examines how "the emerging discipline of 'digital humanities' has been plagued by a perceived neglect on the part of the broader humanities community." 8 As John Bradley points out, only a small percentage of humanist scholars "go beyond general purpose information technology and use digital resources and more complex digital tools in their scholarship." 9 For this situation to improve, Robinson and Juola call for new tools and killer applications; Robinson for collation tools and Juola tools for index generation, resource exploration and collaborative research. ...
Full-text available
In this study, interviews reveal that sustained discontinuous reading constitutes a distinctive reading characteristic among humanist scholars. Preferring paper to screen, scholars actively use their hands in flicking, jumping, underlining and annotating when studying long-form academic and literary texts. The study has been carried out against a background of complaints amongst digital humanities scholars on the rather moderate use of digital resources and a general neglect on the part of the broader humanities community, a situation which some researchers argue can only be improved by designing tools that resemble the ordinary study habits of humanist scholars. In the digital humanities, critical interest in reading seems to have waned, and some scholars are now calling for more user-oriented basic research. Outside the digital humanities, however, research on reading is immense, and a brief review confirms that a great deal of knowledge has been acquired, much of which is highly relevant for the digital humanities. Focusing on physical aspects, the study examines continuousness in academic reading, text immersion, multimodality, hypertext, and the use of body, hands and fingers, while at the same time comparing paper and screen reading. The study describes how participants use the Web for searching and accessing literature, whereas sustained reflective reading is done on paper and dominated by unfaithful and discontinuous reading characterised by manual handling, underlining and writing.
... Authors in other fields have suggested the use of computational methods to aid literature reviews (e.g. Kostoff et al 2001, Juola 2008, Ananiadou et al 2009, but the utility has not yet been demonstrated in environmental sciences, and use of text mining techniques for environmental analyses remains limited. ...
Full-text available
Digitally-aided reviews of large bodies of text-based information, such as academic literature, are growing in capability but are not yet common in environmental fields. Environmental sciences and studies can benefit from application of digital tools to create comprehensive, replicable, interdisciplinary reviews that provide rapid, up-to-date, and policy-relevant reports of existing work. This work reviews the potential for applications of computational text mining and analysis tools originating in the humanities to environmental science and policy questions. Two process-oriented case studies of digitally-aided environmental literature reviews and meta-analyses illustrate potential benefits and limitations. A medium-sized, medium-resolution review (~8000 journal abstracts and titles) focuses on topic modeling as a rapid way to identify thematic changes over time. A small, high-resolution review (~300 full text journal articles) combines collocation and network analysis with manual coding to synthesize and question empirical field work. We note that even small digitally-aided analyses are close to the upper limit of what can be done manually. Established computational methods developed in humanities disciplines and refined by humanities and social science scholars to interrogate large bodies of textual data are applicable and useful in environmental sciences but have not yet been widely applied. Two case studies provide evidence that digital tools can enhance insight. Two major conclusions emerge. First, digital tools enable scholars to engage large literatures rapidly and, in some cases, more comprehensively than is possible manually. Digital tools can confirm manually identified patterns or identify additional patterns visible only at a large scale. Second, digital tools allow for more replicable and transparent conclusions to be drawn from literature reviews and meta-analyses. 
The methodological subfields of digital humanities and computational social sciences will likely continue to create innovative tools for analyzing large bodies of text, providing opportunities for interdisciplinary collaboration with the environmental fields.
Humanities scholars face many problems when trying to design, build, present, and maintain digital humanities projects. To mitigate these problems and to improve the user experience of digital humanities collections, it is essential to understand the problems in detail. However, we currently have a fragmented and incomplete picture of what these problems actually are. This study presents a wide systematic literature review (SLR) on the problems encountered by humanities scholars when adopting particular software tools in digital humanities projects. As a result of this review, this paper finds problems in different categories of tools used in digital humanities. The practice barriers can be divided into four types: content, technique, interface, and storage. These results draw a full picture of problems in tools usage, suggest digital humanities discipline further improve tools application and offer developers of software designed for humanities scholars some feedback to make them optimize these tools.
Full-text available
Digital humanities has become an influential and widely adopted term only in the past decade. Beyond the rapid multiplication of associations, centres, conferences, journals, projects, blogs, and tweets frequently used to signal this emergence, if anything characterizes the field during this time it is a concern with definition. This focus is acknowledged and reflected, for instance, in Matthew Gold’s 2012 edited collection, Debates in Digital Humanities. The debates surveyed are overwhelmingly definitional: ‘As digital humanities has received increasing attention and newfound cachet, its discourse has grown introspective and self-reflexive’ (x). Questions that Gold identifies as central to and expressive of the emerging field include: Does one need to build or make things to be part of the digital humanities? ‘Does DH need theory? Does it have a politics? Is it accessible to all members of the profession’, or only those working at elite, well-funded institutions? ‘Can it save the humanities? The university?’ (xi).
Libraries and cultural institutions have been proactive in adopting different policies for preservation of culture. This is evident by the growing number of cultural repositories and digital libraries set for managing and making accessible different forms of cultural assets ranging from folklore, custom documentaries, craft designs and patterns, architectural setups etc. These procedures not only help them to preserve valuable indigenous knowledge but explore the richness in the cultural values of different nations. The proliferation of Information communication technology (ICT) has resulted in the merging of different forms of digitalized information which combine print, voice, video, and graphics for educational and recreational purposes. The application of Digital Humanities in preservation, management and accessibility of cultural resources ranging from curating online collections to data mining large cultural data sets cannot be neglected. The chapter discusses the concept of Digital Humanities in the light of its rich background and importance in present times for preserving human culture by acquiring, managing and making available cultural assets for further research. The chapter also attempts to explore and identify the recent contributions to the concept by analyzing ongoing Digital Humanities initiatives and projects by different organizations and information centers to stimulate future Research and development trend in the field.
Full-text available
Today while historians producing interactive maps and linguists use computer technology in order to determine the word pattern used in texts, it is observed that researchers from almost every discipline use internet and web interactively. It can be claimed that boundaries among the disciplines have started to become uncertain with convergence. As a result of these developments, digital humanities emerged by the beginning of using computer technologies in traditional humanities. The aim of this study was to determine the general characteristics of Digital Humanities. In this way, it is aimed to identify new types of resources and to identify the challenges of new methods and research behaviors for libraries. In this study, the scientometrics was conducted by considering scientific papers published in the field of digital humanities. Accessed publications were evaluated according to the concepts/terms used in keyword plus, document abstracts and titles. Multiple Correspondence Analysis was used in the analysis of the concepts (keywords, title or abstracts). Scope/field-oriented findings as a result of the analysis were discussed. This study tried to answer the questions such as; which kind of services do new resource types appeared in digital humanities require in libraries? Do methods used in digital humanities require new infrastructures for libraries and organizations providing information services? And points out the new challenges posed by digital humanities for libraries and information services. This study presented findings related to emerging new resource types, needs and research behaviors appeared because of developing digital humanities fields for organizations, especially libraries, providing information services.
The potential for teaching and learning with technology is tremendous. Now, more than ever before, computers have the ability to spread scholarship around the globe, teach students with new methodologies, and engage with primary resources in ways previously unimaginable. Interest among humanities computing scholars has also grown; at ACH/ALLC last year, Claire Warwick and Ray Siemens et al. gave excellent papers on the humanities scholar and humanities computing in the 21st century. Additionally, in the most recent issue of College and Research Libraries (September 2004), a survey was conducted specifically among historians to determine what electronic resources they use. Interest is obviously growing, and the University of Michigan, as both a producer of large digital projects and a user of such resources, is an interesting testing ground for this kind of survey data. Theoretically, Michigan should be a potential model for high usage and innovative research and teaching. In many cases it is; nevertheless, when one looks at the use of electronic resources in the humanities across campus, in both the classroom and innovative research, it is not what it could be. The same is true at other universities. At many universities across the U.S. and Canada, including those with similar large-scale digitization efforts, use remains relatively low and the new potential of electronic resources remains untapped. Why?
This chapter starts by asking why people are making electronic editions, and to some extent the discussion has focused on the challenges of electronic editing. The author argues that these very challenges contribute to the attraction of working in this medium. Fundamental aspects of literary editing are up for reconsideration: the scope of what can be undertaken, the extent and diversity of the audience, and the query potential that – through encoding – can be embedded in texts and enrich future interpretations. Electronic editing can be daunting – financially, technically, institutionally, and theoretically – but it is also a field of expansiveness and tremendous possibility. University presses and digital centers are other obvious places one might look for resources to support digital publication, and yet neither has shown itself to be fully equipped to meet current needs.
In earlier work on Project Essay Grade (PEG), we used computers to evaluate the prose of high school students. In major experiments, PEG successfully imitated single human ratings, despite the crude hardware and software of the late 1960s. Today, computers are common in home and school, and advanced software packages permit much more powerful analysis. In the present research we analyzed recent federal samples of 495 and 599 essays and simulated groups of human judges, reaching multiple Rs as high as .87, close to the apparent reliability of the targeted judge groups. We also generated weights from formative samples of two-thirds of each data set, which predicted the held-out one-third samples well (with simple rs higher than .84). Another cross-validation predicted across different years, students, and judge panels, with an r of .83. Thus, the computer surpassed two judges, the usual size of a human panel. The results appear encouraging for further research, and indeed for early application to large programs of essay evaluation and reporting.
Purpose – The purpose of this paper is to understand the needs of humanists with respect to information and communication technology (ICT) in order to prescribe the design of an e-humanist's workbench.
Design/methodology/approach – A web-based survey comprising over 60 questions gathered the following data from 169 humanists: profile of the humanist, use of ICT in teaching, e-texts, text analysis tools, access to and use of primary and secondary sources, and use of collaboration and communication tools.
Findings – Humanists conduct varied forms of research and use multiple techniques. They rely on the availability of inexpensive, quality-controlled e-texts for their research. The existence of primary sources in digital form influences the type of research conducted. They are unaware of existing tools for conducting text analyses, but expressed a need for better tools. Search engines have replaced the library catalogue as the key access tool for sources. Research continues to be solitary, with little collaboration among scholars.
Research limitations/implications – The results are based on a self-selected sample of humanists who responded to a web-based survey. Future research needs to examine the work of the scholar at a more detailed level, preferably through observation and/or interviewing.
Practical implications – The findings support a five-part framework that could serve as the basis for the design of an e-humanist's workbench.
Originality/value – The paper examines the needs of the humanist, founded on an integration of information science research and humanities computing for a more comprehensive understanding of the humanist at work.
[IATH Summit, 2006] IATH Summit (2006). Summit on digital tools for the humanities: Report on summit accomplishments.
[Ruecker and Devereux, 2004] Ruecker, S. and Devereux, Z. (2004). Scraping Google and Blogstreet for Just-in-Time text analysis. Presented at CaSTA-04, The Face of Text, McMaster University, Hamilton, ON.