The Tesserae Project: intertextual analysis of Latin poetry
Neil Coffee, Jean-Pierre Koenig, Shakthi Poornima, Christopher W. Forstall, Roelant Ossewaarde and Sarah L. Jacobson
The University at Buffalo, The State University of New York, USA
Abstract
Tesserae is a web-based tool for automatically detecting allusions in Latin poetry. Although still in the start-up phase, it already is capable of identifying significant numbers of known allusions, as well as similar numbers of allusions previously unnoticed by scholars. In this article, we use the tool to examine allusions to Vergil’s Aeneid in the first book of Lucan’s Civil War. Approximately 3,000 linguistic parallels returned by the program were compared with a list of known allusions drawn from commentaries. Each was examined individually and graded for its literary significance, in order to benchmark the program’s performance. All allusions from the program and commentaries were then pooled in order to examine broad patterns in Lucan’s allusive techniques which were largely unapproachable without digital methods. Although Lucan draws relatively constantly from Vergil’s generic language in order to maintain the epic idiom, this baseline is punctuated by clusters of pointed allusions, in which Lucan frequently subverts Vergil’s original meaning. These clusters not only attend the most significant characters and events but also play a role in structuring scene transitions. Work is under way to incorporate the ability to match on word meaning, phrase context, as well as metrical and phonological features into future versions of the program.
1 Introduction
The study of allusion has grown to become a core
interest of classical—particularly Latin—literary
studies over the past several decades. Beyond
simply documenting instances of textual reuse,
scholars such as Conte (1986), Hinds (1998), and
Edmunds (2001) have enlarged the scope in which
allusion is understood to create meaning and pre-
sented several theoretical models for how allusion is
both written and read.
A number of recent digital humanities projects
have examined various aspects of text reuse.
Bamman and Crane (2008) presented a model for
identifying allusions based on multiple parameters
and detailed their methods for measuring two texts’
similarity in terms of words, word order, and syntax. Horton
et al. (2010) created an algorithm for detecting text
reuse in French and other languages, based solely on
string similarity, which they have released under an
open source licence. Büchler et al. (2010) examined
larger scale patterns of text reuse in the treatment of
Plato by later Greek authors. Tesserae draws on
these and other projects for models, yet distin-
guishes itself as an integrated effort to develop allu-
sion detection software, undertake detailed case
studies, and bring the understanding of allusion to
a non-specialist audience.
Correspondence: Neil Coffee, Department of Classics, 338 MFAC, University at Buffalo, Buffalo, NY 14261-0026, USA. E-mail: ncoffee@buffalo.edu
Literary and Linguistic Computing © The Author 2012. Published by Oxford University Press on behalf of ALLC. All rights reserved. For Permissions, please email: journals.permissions@oup.com
doi:10.1093/llc/fqs033
Literary and Linguistic Computing Advance Access published July 20, 2012
Users of Tesserae’s web-based interface select two
texts from simple drop-down lists (Fig. 1). A list of
parallel phrases is then returned (Fig. 2); this may
be downloaded as an XML document or a list of
comma separated values. The current version of the
program is already online and freely accessible
(http://tesserae.caset.buffalo.edu) and has received
positive feedback from practising scholars of Latin
allusion, including writers of textual commentaries
who customarily note allusions.
In the remainder of this article, we present some
preliminary results from our application of the cur-
rent version of the search tool to a case study of
the Roman poet Lucan. Lucan was a poet of the
time of Nero and left unfinished at his death an
8,000-line epic on the subject of Rome’s civil war
known as the Bellum Civile (BC). In writing such an
epic, it would have been impossible for Lucan to
avoid comparison with the figure of Vergil,
approximately 100 years his senior, whose monu-
mental work, the Aeneid, had already become a clas-
sic. Lucan’s relationship with his predecessor is far
from simple: at times he relies on and reinforces
Vergil’s authority; at times he draws out ambiguity
and paradox latent in Vergil’s work; and at times he
deliberately opposes Vergil’s artistic and ideological
programs.
We formulated five questions to frame our
analysis:
(1) How often does Lucan refer to the Aeneid?
(2) What kinds of reference does he make?
(3) Where in the Aeneid does he turn most often,
and for what kinds of references?
(4) How are these references distributed within
Lucan’s text?
(5) How do these results change our present
understanding of the relationship between
the BC and the Aeneid?
Fig. 1 Tesserae user interface.
2 Method
Tesserae considers two passages from different
poems to constitute a parallel if they share two or
more words. Results reported here combine the
output of two successive versions of the program,
an earlier one in which a passage was any six
consecutive words, and a later one which divided
the text into grammatical phrases based on editorial
punctuation. Word order and syntax were not con-
sidered. Word identity was judged not only by
the word’s form in the text but also by its dictionary
headword. We used the Archimedes Morphology
Service of the Max Planck Institute for the
History of Science (http://archimedes.mpiwg-berlin.mpg.de/arch/doc/xml-rpc.html)
to retrieve headword information for our texts. Texts themselves
were drawn from the Latin Library (http://thelatinlibrary.com)
and the Perseus Project (http://www.perseus.tufts.edu).
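The matching rule described above can be illustrated with a minimal Python sketch. It is not the Tesserae code: the function names and the tiny headword table are hypothetical, standing in for the lookups the project obtained from the Archimedes Morphology Service, and each word is simply mapped to a single headword when one is known. The two Latin lines are the BC 1.676 and Aeneid 4.666 pair discussed later in this article.

```python
import re
from itertools import product

# Hypothetical toy headword table; a real run would need a full
# morphological lookup for every word form in both texts.
TOY_HEADWORDS = {
    "urbem": "urbs", "urbis": "urbs", "urbes": "urbs",
    "arma": "arma", "armis": "arma",
}


def tokenize(text):
    """Lowercase the text and return its alphabetic word forms."""
    return re.findall(r"[a-z]+", text.lower())


def phrases(text):
    """Later approach: split a text into phrases at editorial punctuation."""
    return [p.strip() for p in re.split(r"[.,;:!?]", text) if p.strip()]


def windows(text, size=6):
    """Earlier approach: every run of `size` consecutive words."""
    words = tokenize(text)
    if len(words) <= size:
        return [" ".join(words)] if words else []
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


def headwords(passage, table):
    """Set of headwords in a passage; unknown forms stand for themselves."""
    return {table.get(word, word) for word in tokenize(passage)}


def find_parallels(target_units, source_units, table, min_shared=2):
    """Pair up passages sharing at least `min_shared` words.

    Word order and syntax are ignored, as in the method described above.
    """
    results = []
    for target, source in product(target_units, source_units):
        shared = headwords(target, table) & headwords(source, table)
        if len(shared) >= min_shared:
            results.append((target, source, sorted(shared)))
    return results


bc_line = "attonitam rapitur matrona per urbem"
aeneid_line = "concussam bacchatur Fama per urbem"
for target, source, shared in find_parallels([bc_line], [aeneid_line], TOY_HEADWORDS):
    print(shared)  # ['per', 'urbs']
```

Comparing every passage with every other passage is quadratic in the number of passages. That is acceptable for a sketch, but a production system would more plausibly index passages by headword first.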
To explore the contact between Lucan and
Vergil, we examined a list of parallels between the
BC and the Aeneid. We concentrated our attention
on BC Book 1 (695 lines), considering parallels
found anywhere in the entirety of the Aeneid
(9,896 lines). We ran Tesserae on these texts, then
compared the results with a list of parallels collated
from four modern commentaries: Heitland and
Haskins (1887), Thompson and Bruère (1968),
Viansino (1995), and Roche (2009).
Fig. 2 Tesserae results.
Each parallel identified either by the program or
by the commentators was examined individually
and given a type number between 1 and 5 according
to its literary significance. Although this was necessarily
a subjective procedure, we formulated a general
set of criteria for our classification (Table 1).
The principal distinction was between meaningful
(Types 3–5) and not meaningful (Types 1 and 2)
parallels. This distinction follows the argument of
Thomas (1986, p. 117) that references either are or
are not ‘susceptible to interpretation or meaningful’.
The set of meaningful parallels was further
divided into those that simply reused distinctive
language, and those that in doing so created new
literary significance. Conte (1986, p. 31) proposed
that an earlier work could provide either a ‘code
model’ or an ‘exemplary model’ for a later one. In
the first case, the model as a whole defines the idiom
in which the later text speaks. In the second case, the
referring author directs the reader’s attention to a
particular moment in the earlier work. This distinc-
tion separates our Type 3 from Types 4 and 5. The
final distinction, between Types 4 and 5, less and
more significant allusions, was the most subjective.
Other schemas are possible, but ours proved useful
for broadly categorizing parallels to analyze the
large-scale questions posed above, to which we
now turn.
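The nested distinctions in this schema can be written down compactly. The following is an illustrative sketch only: the `Grade` record and `describe` function are hypothetical names, and the thresholds simply restate the groupings given above (Types 3–5 meaningful, Types 4 and 5 interpretable, Type 3 as code model and Types 4 and 5 as exemplary model).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grade:
    type_number: int
    meaningful: bool      # Types 3-5
    interpretable: bool   # Types 4 and 5
    model: str            # "code", "exemplary", or "none"


def describe(type_number):
    """Translate a type number (1-5) into the schema's distinctions."""
    if type_number not in (1, 2, 3, 4, 5):
        raise ValueError("type number must be between 1 and 5")
    meaningful = type_number >= 3
    interpretable = type_number >= 4
    if interpretable:
        model = "exemplary"   # points to a particular moment in the source
    elif meaningful:
        model = "code"        # reuses the shared epic idiom
    else:
        model = "none"        # common words or a matching error
    return Grade(type_number, meaningful, interpretable, model)


for n in range(1, 6):
    print(describe(n))
```

The sketch only encodes the categories. Assigning a type to an actual parallel remained a matter of scholarly judgment in the study.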
3 Results
3.1 Numbers of parallels
The automated search returned a list of 3,100
parallels across all types, while the combined efforts
of the four commentaries produced 419 parallels of
Types 2–5. A comparison of results by type is given
in Table 2. The number of Type 3–5 parallels
returned by the program was comparable to the
work of the commentators, but the program re-
ported vastly more of Types 1 and 2 than did the
commentaries. These results show that, with manual
examination of the program’s output to filter out
false positives, our automated search can already
identify a significant portion of the parallels most
interesting to literary scholars. Comparing the
program to individual commentaries, we see
that for interpretable allusions (Types 4 and 5) it
reports 93 (57 of Type 4 and 36 of Type 5 in Table 2)
to Viansino’s 48, though still fewer than Roche’s 151.
These numbers tell only half the story, however.
Although Tesserae returned numbers of valuable
parallels at similar rates to the commentators, the
parallels themselves were often different from those
found by the commentators. Only half of the
interpretable allusions detected by Tesserae were
listed in the commentaries (Fig. 3). Thus, although
Tesserae returned only 25% of the commentators’
allusions, it also increased the total number of allusions
found by 25%.

Table 1 The schema used to grade parallels reported by Tesserae and the commentaries
Type 5 (meaningful, interpretable, more significant): High formal similarity to analogous context
Type 4 (meaningful, interpretable, less significant): Moderate formal similarity to analogous context, or high formal similarity in moderately analogous context
Type 3 (meaningful, not interpretable): High/moderate formal similarity to very common phrase or words, or high/moderate formal similarity to no analogous context, or moderate formal similarity to moderate/highly analogous context
Type 2 (not meaningful): Very common words in very common phrase, or words too distant to form a phrase
Type 1 (not meaningful): Error in discovery algorithm, words should not have matched

Table 2 All parallels reported by Tesserae and four commentaries, by type
Type    Tesserae   All commentaries   Roche   Viansino   T and B   H and H   Total
1       486        0                  0       0          0         0         486
2       2,241      55                 50      8          1         1         2,289
3       280        192                168     33         13        6         425
4       57         79                 66      18         12        3         115
5       36         93                 85      30         14        4         103
Total   3,100      419                369     89         40        14        3,418
The commentaries used were Roche (2009), Viansino (1995), Thompson and Bruère (1968) (T and B), and Heitland and Haskins (1887) (H and H). In adding columns, each unique parallel is only counted once; combined totals may be less than the sum of individual values.
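The proportions quoted for interpretable allusions can be re-derived from Table 2 by inclusion and exclusion. The sketch below assumes, as the table note states, that the Total column counts each unique parallel once; the variable names are ours. The results come out slightly above the rounded 25% given in the text.

```python
# Types 4 and 5 from Table 2 (Tesserae, all commentaries, combined total).
tesserae = 57 + 36          # 93 interpretable parallels found by Tesserae
commentaries = 79 + 93      # 172 found by the four commentaries together
combined = 115 + 103        # 218 unique parallels overall

# Parallels found by both = |A| + |B| - |A union B|
shared = tesserae + commentaries - combined
only_tesserae = tesserae - shared

print(shared, only_tesserae)                      # 47 46
print(round(shared / tesserae, 2))                # 0.51 of Tesserae's finds were already known
print(round(shared / commentaries, 2))            # 0.27 of the commentators' allusions recovered
print(round(only_tesserae / commentaries, 2))     # 0.27 increase over the commentators' total
```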
3.2 Parallels by type
The most obvious difference between our auto-
mated search and the commentaries was the
number of less meaningful parallels returned.
Among the commentaries, there is already a trend
in this direction, with Roche (2009) surpassing his
predecessors in the number of Type 2 and 3 parallels
reported. Unlike the other commentators, Roche
examined only Book 1 of Lucan’s poem, effectively
concentrating his efforts. He also used digital
searches along with more traditional philological
tools. These methods enabled Roche to look
beyond the exemplary model allusions most familiar
to Latinists and begin to represent the level of code
model reference which underwrites Lucan’s posture
as an epic poet. Tesserae expands this perspective
considerably.
In what proportions does Lucan use the various
types of parallels? Combining results from Tesserae
and the commentators, we start to get a compre-
hensive picture of the author’s practice. The data
presented in Table 2 suggest that in BC 1, Lucan
relies on Vergil’s generic epic language about twice
as frequently as he alludes to specific passages in the
Aeneid.
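The "about twice" estimate follows directly from the Total column of Table 2, if Type 3 is taken as code model reference and Types 4 and 5 as exemplary model allusion. A small sketch of that arithmetic, with a dictionary layout of our own choosing:

```python
# Unique parallels per type, from the Total column of Table 2.
TABLE_2_TOTALS = {1: 486, 2: 2289, 3: 425, 4: 115, 5: 103}

code_model = TABLE_2_TOTALS[3]                        # generic epic language
exemplary = TABLE_2_TOTALS[4] + TABLE_2_TOTALS[5]     # allusions to specific passages

ratio = code_model / exemplary
print(code_model, exemplary, round(ratio, 2))  # 425 218 1.95

# Sanity check against the table's grand total.
assert sum(TABLE_2_TOTALS.values()) == 3418
```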
3.3 Parallels by location in source text
Lucan does not draw evenly from all books of the
Aeneid. Figure 4 shows the distribution of all parallels
in the Aeneid, by type. Although Lucan draws
relatively evenly on all books of the Aeneid for Type
3 parallels, he clearly favors certain books for Types
4 and 5. His most meaningful allusions are drawn
above all from Aeneid 2, followed by Books 4, 11,
and 3.
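A distribution of this kind amounts to a tally of graded parallels by source book. The sketch below shows one way to build it. The three line references are pairs mentioned in this article, but the type numbers attached to them are placeholders for illustration, not the grades assigned in the study, and the record layout is an assumption.

```python
from collections import Counter, defaultdict

# (BC reference, Aeneid reference, type); the types here are hypothetical.
SAMPLE_PARALLELS = [
    ("1.676", "4.666", 5),
    ("1.547", "3.418", 4),
    ("1.178", "6.622", 4),
]


def book_of(reference):
    """Return the book number from a 'book.line' reference."""
    return int(reference.split(".")[0])


def tally_by_book(parallels, min_type=3):
    """Count parallels of at least `min_type` per Aeneid book and type."""
    counts = defaultdict(Counter)
    for _bc_ref, aeneid_ref, type_number in parallels:
        if type_number >= min_type:
            counts[book_of(aeneid_ref)][type_number] += 1
    return counts


for book, by_type in sorted(tally_by_book(SAMPLE_PARALLELS).items()):
    print(book, dict(by_type))
```

With the full graded list in place of the sample, the per-book counters would supply the bar heights for a chart like Figure 4.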
It is natural that, in presenting the destruction of
Rome as the major theme of the Roman civil war,
Lucan should draw upon Vergil’s portrayal of the
fall of Troy in Aeneid Book 2. Aeneid 11 describes
hard fighting and internal conflict in the Latin as-
sembly and is also thematically apropos. The choice
of Aeneid 4, the story of Dido, is less obvious.
Although Lucan uses material from this book for
several purposes, a significant complex of allusions
borrows notions of madness and ill rumor from the
Dido story to suggest ill-starred similarities between
Carthage and Rome. Thus, BC 1.676, attonitam
rapitur matrona per urbem (The [prophetic]
matron is swept through the awestruck city),
draws on Aeneid 4.666, concussam bacchatur Fama
per urbem (Rumor runs riot through the stunned
city) to suggest that Romans of the civil war
period were as mad and rumor driven as Dido
and her Carthaginians.
The wealth of allusions to the less-studied Book 3
of the Aeneid gives further clues to Lucan’s unique
reading of Vergil. One significant strand of Lucan’s
use of this book involves reversing its optimistic
prophecies of a new land for the Trojans in order
to suggest the woeful future in store for the Romans.
Thus, in a parallel identified only by Tesserae, Vergil
uses the image of Sicily’s separation from Italy at the
straits of Messina to foretell Aeneas’ successful jour-
ney to found Rome (Aeneid 3.418), an image Lucan
recalls and reverses when he depicts Sicily rejoined
to Italy in an eruption of Mt Aetna as a portent of
the coming war (BC 1.547).
3.4 Parallels by location in the referring text
We combined the automated results with those
collated from commentaries to ask what large-scale
patterns could be seen in Lucan’s use of allusion
within his own poem. Figure 5 shows Type
3–5 parallels by location in BC 1. Again, the
baseline of code model references is relatively constant,
punctuated by clusters of more significant
allusions.
Fig. 3 Types 4–5 parallels reported by Tesserae and four commentators. Tesserae returned a significant number of matches unremarked by commentators.
Lucan clusters significant references throughout
the opening and closing sections, and in establishing
the principal characters: at the outset of Lucan’s
text, where he sets out his theme and the artistic
program for the work; in the opening descriptions
of Caesar and Pompey, the principal belligerents;
and in the prophecy of the matrona, which closes
the book. In contrast, at the heart of Lucan’s praise
of Nero (Lines 39–59), we find a pause in references
to the Aeneid. Here, Lucan forgoes an obvious
opportunity to ennoble Nero by association with
the grandeur of the epic tradition, and instead
creates a prosaic tone that flattens what should be
the culmination of his praise.
Consideration of large-scale patterns also reveals
how Lucan uses allusion to structure his narrative.
He shows a tendency to cluster references at the
beginning and ending of sections. More specifically,
he often closes a section with a Vergilian allusion,
capped by his own pithy or moralizing statement.
The next scene then opens with a fresh allusion to
anchor and authorize it in the Vergilian tradition.
Thus, in the transition from Rome’s decline to
Caesar’s delay at the Rubicon (BC 1.178–205),
Lucan describes the prevalence of bribery (1.178)
using language from Vergil’s depiction of sinners
in the underworld (Aeneid 6.622). He closes the
scene with his own vision of avidity leading to war
(1.182), before opening his section on Caesar’s march
with several new references to the Aeneid. Lucan
draws on Vergil’s authority to bring density of
meaning to his transitions, yet he reserves the crucial
end of the section to finish with his own master
strokes.
Fig. 4 All Types 3–5 parallels by book in the Aeneid. [Chart: number of parallels (axis 0–60) by Aeneid book number (1–12), with separate series for Types 3, 4, and 5.]
3.5 Lucan’s BC 1 and the Aeneid
Do the results of automated allusion detection
change our understanding of Lucan’s relationship
to the Aeneid? A full answer to this question will
require analysis of the remaining books of Lucan’s
epic, but our results provide some initial responses.
In existing scholarship, Lucan’s references to the
Aeneid have generally been taken as oppositional,
subverting the imagery and language of Rome’s
founding to suggest that the construction of
empire inevitably becomes a corrupt enterprise.
Our study supports this picture, but also adds im-
portant detail. The constancy of Type 3 parallels
shows to what degree Lucan relied on Vergil even
for the basic idiom of epic. At the same time, Lucan
uses allusions to frame scenes, uses clusters of allu-
sions to different themes within Vergil’s poem, and
shifts markedly away from allusions to the Aeneid in favor
of allusions to other works in his praise of Nero.
These gestures all represent distinctive patterns in
Lucan’s large-scale use of meaningful allusions for
artistic effect.
Fig. 5 All Types 3–5 parallels by line in the BC, Book 1. [Chart: parallel type (3–5) plotted against BC line number (0–700), with the sections of the book marked: proem and apostrophe to Rome; praise of Nero; causes of war; description of Caesar and Pompey; Rome’s moral decline; Caesar at the Rubicon; panic at Ariminum; speech of Curio; speech of Caesar; speech of Laelius; list of Gallic tribes unguarded; evacuation of Rome; prodigies; purification of Rome; Figulus’ astrology; matrona’s prophecy.]
4 Future Work
The process of evaluating each of the 3,000 results
collected by the Tesserae program and commentators
has created a benchmark set of parallels, including
positive and negative examples, for training and test-
ing future algorithms. It has also given us insight into
which new feature sets would allow us to capture the
greatest number of allusions currently missed by the
program. Among these are the ability to match syno-
nyms and the ability to match paragraph-level con-
text, both of which seem to turn on semantics.
Although sound-based allusions were not prevalent
among the current test set, other examples have con-
vinced us that the ability to match on character-level
similarities and metrical shape would bring in add-
itional high-grade allusions. As such feature sensitiv-
ity is incorporated, automatic detection of allusion,
and of style and theme generally, will increasingly
come to replicate the results of traditional scholar-
ship and open up further new perspectives on literary
meaning and artistry.
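As one illustration of what matching on character-level similarity might involve, the sketch below compares two passages by the overlap of their character trigrams (Jaccard similarity). This is a generic technique chosen for the example, not a description of the feature planned for Tesserae, and the function names are hypothetical.

```python
def char_ngrams(text, n=3):
    """Set of character n-grams over lowercase letters and single spaces."""
    kept = "".join(ch for ch in text.lower() if ch.isalpha() or ch == " ")
    normalized = " ".join(kept.split())
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard(a, b):
    """Size of the intersection over size of the union; 0.0 if both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


bc_line = "attonitam rapitur matrona per urbem"
aeneid_line = "concussam bacchatur Fama per urbem"
score = jaccard(char_ngrams(bc_line), char_ngrams(aeneid_line))
print(round(score, 3))
```

A measure like this responds to shared sound sequences even where no headword matches, which is why it could surface parallels that a purely lexical search misses. It would also raise many spurious matches, so it would need the same manual grading applied above.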
Funding
This work was supported by funding from the
Digital Humanities Initiative at Buffalo for its
Textual Analysis Working Group; and from the
University at Buffalo’s Department of Classics.
References
Bamman, D. and Crane, G. (2008). The Logic and Discovery of Textual Allusion, Proceedings of the Second Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2008). Marrakesh, Morocco.
Büchler, M., Geßner, A., Eckart, T., and Heyer, G. (2010). Unsupervised detection and visualisation of textual reuse on ancient Greek texts. Journal of the Chicago Colloquium on Digital Humanities and Computer Science, 1(2).
Conte, G. B. (1986). The Rhetoric of Imitation: Genre and Poetic Memory in Virgil and Other Latin Poets. Ithaca, NY: Cornell University Press.
Edmunds, L. (2001). Intertextuality and the Reading of Roman Poetry. Baltimore, MD: Johns Hopkins University Press.
Heitland, W. E. and Haskins, C. E. (1887). M. Annaei Lucani Pharsalia. London: G. Bell.
Hinds, S. (1998). Allusion and Intertext: The Dynamics of Appropriation in Roman Poetry. New York: Cambridge University Press.
Horton, R., Olsen, M., and Roe, G. (2010). Something borrowed: sequence alignment and the identification of similar passages in large text collections. Digital Studies / Le champ numérique, 2(1).
Roche, P. (2009). Lucan: De Bello Civili: Book 1. Oxford: Oxford University Press.
Thomas, R. F. (1986). Virgil’s Georgics and the art of reference. Harvard Studies in Classical Philology, 90: 171–98.
Thompson, L. and Bruère, R. T. (1968). Lucan’s use of Vergilian reminiscence. Classical Philology, 63: 1–21.
Viansino, G. (1995). Marco Annaeo Lucano: La Guerra Civile Volume 1 Libri I-V. Milan: Mondadori.
N. Coffee et al.
8of 8Literary and Linguistic Computing, 2012
by guest on July 28, 2012http://llc.oxfordjournals.org/Downloaded from
... Diese einzelnen Teilschritte können auf den 367 Mit den Epitheta ‚inhaltlich bedeutungstragend' und ‚hermeneutisch interessant' werden im Folgenden Funde bezeichnet, die in methodisch vergleichbaren Untersuchungen der computerbasierten Intertextualitätsforschung als ‚bedeutungstragend' oder ‚Sinn produzierend' bezeichnet werden, vgl. Coffee et al. (2012), Coffee et al. (2013) Zuvor sei darauf hingewiesen, dass die Entwicklung der Analysestruktur keinesfalls so teleologisch verläuft, wie es die folgenden Ausführungen suggerieren. Da die methodische Vorgehensweise der automatisierten Zitatanalyse in der antikebezogenen Literaturwissenschaft grundsätzlich noch eher neu ist, bedarf es zunächst vieler Vorarbeiten, um die jeweiligen Potentiale und Grenzen der einzelnen methodischen Umsetzungsmöglichkeiten auszuloten. ...
... In einer früheren Code-Version wurde demgegenüber eine Textpassage noch als sechs konsekutiv aufeinander folgende Wörter definiert, vgl. Coffee et al. (2013) 223. Gerade die Wahl der Satzzeichen als Trennungsmarker erweist sich in der vorliegend notwendig gewordenen Weiterverarbeitung der Tesserae-Ergebnisse als hinderlich: Zwar ist dieses Vorgehen verfahrenstechnisch einfach, doch ergeben sich daraus inhaltlich nicht immer sinnvolle Texteinschnitte, an die sich Hieronymus (zu dessen Zeit es die modernen Einfügungen noch nicht gab und für den daher viel eher die Versstruktur und Metrik eine Rolle spielte) nicht immer hält. ...
... alignment). Examples of this workflow include The same general workflow has been applied to poetry in the Tesserae project (Coffee et al., 2013). Originally developed for studying Latin poetry, it has been applied to quantitative studies (Bernstein et al., 2015) and an attempt to adapt the method to English texts has recently been made by Shang and Underwood (2021). ...
Article
Full-text available
Suomen Kansan Vanhat Runot (Old Poems of the Finnish People) is a collection of nearly 90,000 oral folk poems written down between 1564 and the early 20th century. It is characterized by frequent reoccurrence of similar pieces of text on various levels (from entire poems, through passages to single verses and collocations). However, finding these similarities is challenging due to a high degree of orthographical, morphological, and compositional variation. In this article, we propose a method for automatically identifying equivalent verses, i.e. verses conveying the same meaning with the same words, using a clustering based on cosine similarity of character bigram vectors. The method achieves around 81% F-score and has been successfully used for identifying similarities across the entire SKVR corpus on the level of verse, passage, and poem. The results can be browsed through a Web interface.
... (2020);Coffee et al. (2012);Wang et al. (2020) do not report inter-annotator agreement and don't investigate the factors contributing to the disagreements. We have identified three such possible factors and conducted additional experiments to investigate their impact. ...
Preprint
Full-text available
Peer review is a key component of the publishing process in most fields of science. The increasing submission rates put a strain on reviewing quality and efficiency, motivating the development of applications to support the reviewing and editorial work. While existing NLP studies focus on the analysis of individual texts, editorial assistance often requires modeling interactions between pairs of texts -- yet general frameworks and datasets to support this scenario are missing. Relationships between texts are the core object of the intertextuality theory -- a family of approaches in literary studies not yet operationalized in NLP. Inspired by prior theoretical work, we propose the first intertextual model of text-based collaboration, which encompasses three major phenomena that make up a full iteration of the review-revise-and-resubmit cycle: pragmatic tagging, linking and long-document version alignment. While peer review is used across the fields of science and publication formats, existing datasets solely focus on conference-style review in computer science. Addressing this, we instantiate our proposed model in the first annotated multi-domain corpus in journal-style post-publication open peer review, and provide detailed insights into the practical aspects of intertextual annotation. Our resource is a major step towards multi-domain, fine-grained applications of NLP in editorial support for peer review, and our intertextual framework paves the path for general-purpose modeling of text-based collaboration.
... Classical Latin literature is a highly influential tradition characterized by an extraordinary density of allusions and other forms of text reuse (Hinds, 1998). The most widely used tools for the detection of Latin intertextuality, such as Tesserae and Diogenes, rely on lexical matching of repeated words or phrases (Coffee et al., 2012(Coffee et al., , 2013Heslin, 2019). In addition to these core methods, other research has explored the use of sequence alignment (Chaudhuri et al., 2015;Chaudhuri and Dexter, 2017), semantic matching (Scheirer et al., 2016), and hybrid approaches (Moritz et al., 2016;Manjavacas et al., 2019) for Latin intertextual search, complementing related work on English (Smith et al., 2014;Zhang et al., 2014;Barbu and Trausan-Matu, 2017). ...
... Much progress has been made in this area and a number of highly useful tools are now available-e.g. Tracer (Büchler, 2013) or Tesserae (Coffee et al., 2012). This paper, however, aims to contribute to a number of open issues that still present significant challenges to the further development of the field. ...
Thesis
Full-text available
This study sees itself as part of the larger field of intertextuality studies and examines, using the latest digital text reuse technology, the reuse of biblical texts in the writings of two Late Antique Christian authors from Egypt, the abbots Shenoute and Besa. It explores, on the basis of selected writings by these authors, the advantages and limitations of using digital methods to study the form and function of biblical intertexts in monastic literature written in Coptic. In particular, it seeks answers to a number of specific research questions, e.g., the extent to which quotations are faithful to the original biblical sources, the influence of quotation-introducing formulae or the question how digital technologies can be used to facilitate intertextuality studies. To pursue these topics, after an introduction to the life and works of both authors (Chapter 1), it describes the state of research for intertextuality studies with a focus on biblical and Early Christian and Coptic studies, and Shenoute in particular (Chapter 2). Text reuse detection technology, which was developed in the field of computer science, is introduced, and its practical applications are described. Specific focus is placed on the history of text reuse detection and the subtle differences between intertextuality and text reuses. In addition, the current progress in studies on intertextuality in Shenoute’s works is explored. Digital text reuse technology is described in detail (Chapter 3), in particular the technology that underpins the processing mechanisms employed by the latest text reuse detection software, TRACER, and pre-processing features such as optical character recognition, Unicode conversion, tokenization, lemmatization, and part-of-speech tagging. 
The case study for examining the application of digital text reuse technology has to focus on a limited selection of biblical texts, and specifically, on the best attested and most well-known book of the Old Testament, the Book of Psalms. Chapter 4 presents the philological and codicological information on the corpora used, the Sahidic translation of the Psalms, and the selected works by Shenoute and Besa, while Chapter 5 is dedicated to the case study and its results. It analyzes text reuses newly identified by TRACER, discusses instances of idiomatic text reuse and the question of quotation-introducing signals. In summary, this study confirms observations by previous research that the monastic authors built on the audience’s collective memory of the Bible by blending biblical phrases and concepts with their own monastic ideals. For the purpose of recontextualizing the source texts and fitting them to the current situation, unmarked changes may be applied, mostly of a grammatical nature. An interesting difference between the two monastic authors may be noted in their use of quotation-introducing signals, which merits further exploration, as does the question of the relation between the introduction of a quotation and its faithfulness. Finally, it needs to be stressed that ongoing and future digitization of the corpus of monastic authors and Coptic literature in general will very much widen the scope of digital text reuse methods and lead to new research questions and discoveries.
Article
A corpus collecting rewritings of the myth of Orpheus and Eurydice allows us to study intertextuality and to represent it through a network of correspondences and within a comparative digital edition. For this purpose, we examine the output of two text reuse detection tools, TextPAIR and Tracer, in order to combine their treatments and exploit each one’s best potential. We propose a series of treatments to enrich the results and to overcome specific challenges observed. These technical manipulations allow us to interpret the results obtained for an essay by Pierre-Simon Ballanche and compare them with empirical analyses of the Orphic topos. Thus, we show that analysis supported by computational techniques is all the more useful for the study of intertextuality as it is adapted to the corpus.
Article
How can computational methods illuminate the relationship between a leading intellectual and their lifetime library membership? We report here on an international collaboration that explored the interrelation between the reading record and the publications of the British philosopher and economist John Stuart Mill, focusing on his relationship with the London Library, an independent lending library of which Mill was a member for 32 years. Building on detailed archival research of the London Library’s lending and book donation records, a digital library of texts borrowed and publications produced was assembled, which enabled natural language processing approaches to detect textual reuse and similarity, establishing the relationship between Mill and the Library. Text mining the books Mill borrowed and donated against his published outputs demonstrates that the collections of the London Library influenced his thought, transferred into his published oeuvre, and featured in his role as political commentator and public moralist. We reconceive archival library issue registers as data for triangulating against the growing body of digitized historical texts and the output of leading intellectual figures. We acknowledge, however, that this approach is dependent on the resources and permissions to transcribe extant library registers, and on access to previously digitized sources. Related copyright and privacy restrictions mean our approach is most likely to succeed for other leading eighteenth- and nineteenth-century figures.
Chapter
Text reuse measurement is important for both LIS and literary studies, where it is mainly used to study influence between authors. Although projects such as Tesserae have already adopted computational methods for investigating text reuse in Latin poetry, their potential applications to the rich collections of English poetry have not been realized. This research proposes a modified version of the Tesserae Project’s measure based on the insight embodied in TF–IDF to study English poetry. Using the Irish poet Yeats’ relationship to five English Romantic poets as a test case, three parallel experiments were conducted in order to evaluate the suitability of this method for English poetry. The results show that this new method is effective in measuring text reuse in English poetry, and the TF–IDF based modification is more sensitive to known cases of text reuse than the original method. This method can also be adapted to noncanonical literary works in the future, providing an example of the significance of LIS for digital humanities.
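As a rough illustration of the TF–IDF insight mentioned in this abstract, the Python sketch below weights each word shared by two passages by its inverse document frequency, so that rare shared words count for more than common ones. It is not the measure used in the chapter or by Tesserae; the function names `idf_weights` and `reuse_score` and the toy corpus are assumptions made for illustration only.

```python
import math
from collections import Counter


def idf_weights(passages):
    """Return log(N / df) for every word, where df is the number of
    passages (out of N) in which the word occurs at least once."""
    n = len(passages)
    df = Counter()
    for tokens in passages:
        df.update(set(tokens))  # count each word once per passage
    return {word: math.log(n / count) for word, count in df.items()}


def reuse_score(source, target, idf):
    """Sum the IDF weights of the words that both passages share."""
    shared = set(source) & set(target)
    return sum(idf.get(word, 0.0) for word in shared)


# Hypothetical toy corpus: each passage is a list of lowercase tokens.
corpus = [
    "the bright star of the morning".split(),
    "the pale star above the sea".split(),
    "a bright and lonely morning".split(),
    "the sea and the shore".split(),
]
idf = idf_weights(corpus)

# Passages 0 and 1 share "the" and "star"; passages 0 and 2 share
# "bright" and "morning". The second pair scores higher because
# neither of its shared words is as common as "the".
print(reuse_score(corpus[0], corpus[1], idf))
print(reuse_score(corpus[0], corpus[2], idf))
```

In this sketch a very frequent word such as "the" contributes little to the score, which is one plausible reason an IDF-style weighting could be more sensitive to distinctive shared vocabulary than a raw count of shared words.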
Article
Preface; List of abbreviations; 1. Reflexivity: allusion and self-annotation; 2. Interpretability: beyond philological fundamentalism; 3. Diachrony: literary history and its narratives; 4. Repetition and change; 5. Tradition and self-fashioning; Bibliography; Index.
Article
We describe here a method for discovering imitative textual allusions in a large collection of Classical Latin poetry. In translating the logic of literary allusion into computational terms, we include not only traditional IR variables such as token similarity and n-grams, but also incorporate a comparison of syntactic structure. This provides a more robust search method for Classical languages since it accommodates their relatively free word order and rich inflection, and has the potential to improve fuzzy string searching in other languages as well.
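To make the notion of n-gram-based token similarity in this abstract concrete, the Python sketch below compares two word forms by the Jaccard overlap of their character trigrams, which gives inflected forms of the same stem a non-zero score. This is only one common way to measure token similarity and is not claimed to be the method of the paper; the helper names `char_ngrams`, `jaccard`, and `token_similarity` are hypothetical, and the syntactic comparison described in the abstract is not modeled here.

```python
def char_ngrams(token, n=3):
    """Return the set of character n-grams of a token, padded with '#'
    so that word-initial and word-final n-grams are distinguishable."""
    padded = f"#{token}#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def token_similarity(t1, t2, n=3):
    """Similarity of two tokens based on their shared character n-grams."""
    return jaccard(char_ngrams(t1, n), char_ngrams(t2, n))


# Inflected forms of the same stem share their initial trigrams,
# while unrelated words share none.
print(token_similarity("arma", "armis"))
print(token_similarity("arma", "virum"))
```

A fuzzy measure of this kind tolerates rich inflection without a lemmatizer, at the cost of also rewarding accidental overlaps between unrelated words that happen to share letter sequences.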
Article
This thesis represents the first full-scale, English commentary on the opening book of Lucan's epic poem, De Bello Ciuili, in sixty-five years. Its fundamental purpose is to explain the language and content of the Latin text of the book. The subject matter of the thesis beyond the introduction is naturally dependent upon the content of each individual line under consideration, but the following questions may help establish some of the larger issues I have prioritised throughout my response to the Latin text of book one. These questions may be variously relevant to an episode within book one of De Bello Ciuili, or else a sentence, a line, a word, a metrical issue, or a combination of these. How does it help locate the text within the genre of epic? What does it contribute to the overall meaning of the poem? What does it contribute to our understanding of epic narrative technique? What does it contribute to our understanding of Lucan's poetic usage and technique? How does it interact with the rest of the poem (i.e. what are the structural or intratextual markers advertised and what do they contribute to the meaning of the passage under consideration or the structure of the book or poem as a whole)? How does it interact with its (especially epic) models (i.e. what intertextual markers are at work and how does the invocation of earlier models affect the meaning of the passage under consideration)? How does it behave in relation to what we know of the norms espoused by Classical literary criticism? What are the programmatic issues, themes, and images explored or established by book one? "20 October 2005." Thesis (Ph. D.)--University of Otago, 2006. Includes bibliographical references.
Viansino, G. (1995). Marco Annaeo Lucano: La Guerra Civile Volume 1 Libri I-V. Milan: Mondadori.