Modelling Enlightenment: reassembling
intertextual networks through data-driven
research (ModERN)
Dario Maria Nicolosi, Sorbonne Université, dario.nicolosi.92@gmail.com
Glenn Roe, Sorbonne Université, glenn.roe@sorbonne-universite.fr
Nicolosi D. M. and Roe G. 2024. ‘Modelling Enlightenment: reassembling intertextual networks through data-driven
research (ModERN)’. In: Digital Enlightenment Studies 2, 62–84.
DOI: 10.61147/des.22
Digital Enlightenment Studies is a peer-reviewed open access journal. © 2024 The Author(s). This is an open-access article distributed
under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use,
distribution, and reproduction in any medium, provided the original author and source are credited. See
http://creativecommons.org/licenses/by/4.0/.
The European Research Council’s ModERN project (Modelling Enlightenment: reassembling networks of
modernity through data-driven research) is a pioneering five-year research initiative. This programme
seeks to redefine the conventional understanding of 18th-century literary history by employing
advanced data-modelling and analysis techniques. By developing a comprehensive corpus of
18th-century French texts and leveraging a range of data-science methodologies such as text-reuse
detection and network analysis, the project aims to uncover novel research avenues and provide fresh
insights into early-modern French print culture and its intertextual dynamics.
In this report, we discuss some theoretical points underlying our research; we explain the choices
made in constructing our corpus and their implications; and we present some case studies to show the
potential of our research and the most prudent methodologies to adopt.
Keywords: intertextuality, network analysis, text reuse, SNA
1. Introduction and scope
The European Research Council project ModERN (Modelling Enlightenment: reassembling networks
of modernity through data-driven research) is a five-year research programme that aims to challenge
received notions of 18th-century literary history through large-scale data modelling and analysis. By
creating a new, extensive corpus of 18th-century French texts and mobilising various data-science
methods such as text-reuse detection and network analysis, we intend to open up new lines of research in
early-modern French print culture and its intertextual connections.
Intellectually, our project aims to align itself with previous work that has highlighted the highly
interconnected nature of 18th-century culture, analysing the social networks of intellectual figures (e.g.
Brockliss 2002), the correspondence networks between members of the Republic of Letters (Comsa et al.
2016; Edmondson and Edelstein 2019) or focusing on the diffusion and reception of literary and philosophical
works (Burrows 2018; Burrows 2020; Darnton 1982; Darnton 2021). However, while these important and
stimulating studies have emphasised the circulation of people and ideas that contributed to defining the
main ideological axes and the European scope of the Enlightenment, they analyse these exchanges from a
purely material perspective (letters, books, sales records, etc.), reducing the textual content of these objects
to data points. Our project will instead employ new techniques for large-scale text analysis to identify and
analyse conceptual and intertextual networks over an unprecedented collection of 18th-century texts. As
we know, 18th-century authors demonstrated great agility in their appropriation and reappropriation of
both ancient and modern sources, relying on the shared cultural knowledge of their readers to identify
these borrowings (Edelstein, Morrissey and Roe 2013). Today, however, given that most of these references
remain hidden to contemporary readers, the identification of these intertextual relationships can provide
new insights into the reciprocal influences, models and authorities that shaped the evolution of various
ideological, political and aesthetic discourses of the period.
At its core, the project seeks to understand how the modern constellation of Enlightenment authors
we have inherited today came into being; to uncover the cultural and ideological processes by which these
writers, and by extension the texts and concepts they helped disseminate, became so indissolubly linked to
the Enlightenment as an idea, while others – mostly forgotten today – were gradually excluded from these
same assemblages. In order to re-establish these lost voices, or, to put it another way, to reassemble these lost
networks, we need to drastically expand the corpus of texts on which we have traditionally drawn to understand
the French Enlightenment and its reception.1 Thankfully, this process of expansion is already underway, as the
18th century has benefited greatly, perhaps more than any other historical period, from the past two decades of
digital transformation and the subsequent rise of the digital humanities (Burrows and Roe 2020; Paige 2021).
1 For further insights into the implications of the large-scale analysis of ‘non-canonical’ texts, on their heuristic potential as well
as the risks behind such broad analyses, see Moretti (2017).
Digital projects, databases and collections in 18th-century studies are now reaching a point of maturity
in their elaboration, as well as a critical mass in number, such that we can now begin to think in terms of
the literary or cultural systems, rather than individual works or authors, that inhabit our growing digital
archives, at least in the francophone context. The main ModERN corpora have thus been built to include not
only canonical works and printed books that have been progressively digitised over the past two decades, but
also more ephemeral texts such as private correspondence, pamphlets, newspapers and journals: the breadth
of these collections provides a unique opportunity to trace a much broader range of intertextual practices
than has traditionally been conceivable (Kristeva 1969; Barthes 1984; Genette 1992). By identifying and
scrutinising a large swathe of these intertextual practices – from borrowings, citations, mentions
and references to paraphrases and allusions – we not only gain a deeper understanding of the intricate web of
influences that shaped literary and philosophical works but also enrich our comprehension of the
historical context in which these texts were produced.
Which 18th-century texts were most frequently ‘cited’, in what form and why? Which authors prove to
be the most ‘influential’ – their words resonating and circulating the most within other texts of the same
period? What verses, maxims and concepts seemed to gain (or lose) popularity over the course of the long
18th century? Do specific communities emerge, centred around an authority or a foundational text, or are
textual exchanges rather to be found transversally, cutting across numerous literary and cultural fields?
Similar reflections can be made regarding the reception of Antiquity. As we know, Greek and Latin literature
formed the cultural and educational foundations of the time and, starting with the Querelle des Anciens
et des Modernes, eventually became an instrument for discussing and proposing new philosophical and
aesthetic theories (Grell 1995; Norman 2011). Which ancient authors are most frequently cited, and in what
context? In the original language or in translation? While the influence of the most important authors (e.g.
Aristotle, Cicero, Plutarch) is well known, is it possible to unearth ‘secondary’ figures whose reception is
nevertheless crucial to understanding strategies of reusing Antiquity during the Enlightenment?
From a technical and methodological standpoint, we approach these intertextual
exchanges as a specific application of network modelling and analysis methodologies,
particularly social network analysis (SNA), to literary-historical studies. Over the past decade or so, several
research groups have begun to exploit the potential of applying the heuristic tools offered by SNA, developed
mainly in the social and data science fields, to humanities projects.2 But what are the implications of
applying this model to a phenomenon like intertextuality? What kind of information can be extracted, and
how is it influenced by our choices of formalisation and representation? In general, it is always important
to remember that any modelling attempt is, by definition, constructed with inherent biases related to the
dataset in question (the extent and quality of the sources it is derived from), the tools used to create it
2 On the ‘network turn’ in the humanities, see Ahnert et al. (2021). On the use of SNA in 18th-century studies, see Edmondson
and Edelstein (2019).
(computational techniques and their performance), and the specific research objectives (which questions
are posed, and how the results are interpreted). However, these potential obstacles remain surmountable
if it is clear to the researcher that each model is more of a heuristic tool for analysis than a repository of
immutable truth: models serve less to find definitive answers than to propose new questions. The ability to
engage with an unprecedented amount of data offers specialists a new way to look at phenomena, in our
case, intertextuality and influence in the 18th century.
We are confident that such large-scale projects are worthwhile, and that avenues of research opened up
by such programmes can lead to new knowledge and, eventually, new historical paradigms (Moretti 2008;
McCarty 2018). We will thus present the corpus construction and alignment methodologies that inform our
models, and then conclude with some practical use-cases for exploring and analysing intertextual networks
at scale.
2. Corpus
Thanks to institutional agreements with the University of Chicago, Gale Primary Sources, the University
of Oxford and the Bibliothèque nationale de France (BnF), our primary corpora are drawn from four main
sources: transcribed and curated data holdings from the ARTFL Project and the Voltaire Foundation;3 Paul
Fièvre’s Théâtre Classique database (transcriptions) of French theatre;4 digitised texts in French (via OCR)
taken from the Gale Eighteenth Century Collections Online (ECCO) and The Goldsmiths’-Kress Library of
Economic Literature;5 and texts drawn from the Gallica digital library housed at the BnF (OCR).6 Overall,
our ingestion policy included the following criteria: digitised texts in French published roughly between
1685 and 1800 and, in the case of multiple editions of the same text, the earliest edition available across
collections. Preference was given to transcribed versions of texts regardless of publication date.
Our main corpus of texts is thus the result of an amalgamation of several independent collections
derived from distinct digitisation campaigns, the combination of which led to a series of inevitable problems
that we had to address. From the form and content of the metadata to the text formats and TEI-XML file
structures, each collection was unique and followed no real standard. Being conceived at different times and
in response to specific research objectives, each digitisation project adopts its own encoding and classification
logic, which introduces significant variability into the combined metadata of any large-scale corpus.
3 ARTFL holdings include relevant texts from the ARTFL-Frantext database as well as other open-access and subscription-
based collections, including the Bibliothèque bleue de Troyes, the ARTFL Encyclopédie and Dictionnaires d’autrefois, see
https://artfl-project.uchicago.edu/; in collaboration with ARTFL, the Voltaire Foundation has developed the TOUT Voltaire
dataset, which was made available to our project, see https://www.voltaire.ox.ac.uk/voltaire-lab/tout-voltaire/.
4 See https://www.theatre-classique.fr/.
5 See https://www.gale.com/c/eighteenth-century-collections-online-part-i and
https://www.gale.com/c/making-of-the-modern-world-part-i.
6 See https://gallica.bnf.fr/.
This variability also stems from the editorial, literary and historical specificity of each corpus, which
influences the choices of the researchers who compiled and encoded our corpora. For example, the issue
of paratextual elements takes on greater significance depending on the historical or linguistic nature of the
corpus analysis one seeks to enact, raising the question of whether they should be included or excluded
from digital editions. The ARTFL-Frantext database, for instance, has removed all non-authorial elements
from its texts, in an effort to better represent the linguistic context in which they were produced.7 Other
collections, such as ECCO, reproduce texts in their entirety, including any and all paratextual elements
whether originating from the author or not.
These tensions are indicative of larger debates around digitisation protocols and, more specifically,
digital scholarly editions and the TEI-XML encoding standard: how much to encode, and at what levels
of granularity, are decisions often made by previous editors whose justifications may no longer be legible
at the time of corpus construction, leading to editorial artefacts that can skew results downstream if one
is not careful. And yet, the first (and perhaps only) necessary condition behind the construction of an
analytical model is that the data within it are consistent with each other: even if every choice and selection is
justifiable in itself, a model is valid only if its elements are homogeneous and functional to the type of
analysis envisioned.
Given the diversity of text-encoding options, it thus becomes necessary to seek automated or semi-
automated methods to harmonise corpus metadata, and in particular titles and author names. The approach
we adopted combines three digital methods that assess the similarity between strings of characters. In a
first instance, we deployed two well-established methods developed in natural language processing (NLP) –
Levenshtein distance and cosine similarity – in order to score the lexical similarity of author labels.8 The
results were fairly promising, allowing us to ascertain that ‘CARMONTELLE, Louis Carrogis, dit Louis de
Carmontelle (1717–1806)’ and ‘CARMONTELLE, Louis Carrogis de (1717–1806)’ were indeed the same
author. This may seem intuitive to the human eye, and indeed it is, but for the computer they are two
very distinct strings that inhabit the same XML field and are therefore treated as separate entities. More
uncertain cases – author names that were too lexically dissimilar to be caught, such as ‘Anne-Claude-Philippe
de Tubières, comte de Caylus’ and ‘Comte de Caylus (1692–1765)’ – required the use of more
direct techniques, such as comparing the two longest words in a string or systematically removing all dates
before comparison. Thanks to these approaches, we were able to disambiguate our author names, which were
then standardised across our corpus. The same semi-automated standardisation process was employed for
the titles of texts, grouping the volumes of the same work under a single label, often distinguished by the
mention of the volume number.
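The combination of measures described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the character-trigram representation used for cosine similarity, the 0.8 threshold and the exact form of the 'two longest words' rule are all assumptions made for the example.

```python
import re
from collections import Counter
from math import sqrt


def normalise(name: str) -> str:
    """Lower-case a label and strip parenthesised dates and punctuation."""
    name = re.sub(r"\([^)]*\d[^)]*\)", " ", name)   # drop e.g. "(1717–1806)"
    name = re.sub(r"[^\w\s-]", " ", name.lower())   # drop commas, quotes, etc.
    return " ".join(name.split())


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def levenshtein_ratio(a: str, b: str) -> float:
    """Edit distance rescaled to a 0..1 similarity."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def cosine(a: str, b: str, n: int = 3) -> float:
    """Cosine similarity over character n-gram counts (assumed representation)."""
    def grams(s: str) -> Counter:
        return Counter(s[i:i + n] for i in range(max(len(s) - n + 1, 1)))
    va, vb = grams(a), grams(b)
    dot = sum(va[g] * vb[g] for g in va.keys() & vb.keys())
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


def longest_words(name: str, k: int = 2) -> set:
    """The k longest distinct words of a normalised label (ties broken alphabetically)."""
    words = set(re.findall(r"\w+", name))
    return set(sorted(words, key=lambda w: (-len(w), w))[:k])


def candidate_same_author(a: str, b: str, threshold: float = 0.8) -> bool:
    """Flag two labels as a candidate match, to be confirmed by a human reader."""
    na, nb = normalise(a), normalise(b)
    if levenshtein_ratio(na, nb) >= threshold or cosine(na, nb) >= threshold:
        return True
    # Hypothetical fallback: do the labels share one of their two longest words?
    return bool(longest_words(na) & longest_words(nb))
```

With these assumed settings the two Carmontelle labels are flagged as a candidate match, but so are ‘Pierre Corneille (1606–1684)’ and ‘Thomas Corneille (1625–1709)’, which share their longest word. Output of this kind is therefore best treated as a list of candidates for manual review rather than as a final decision.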
7 ARTFL-Frantext, like its French counterpart Frantext, was originally constructed by lexicographers compiling the Trésor de la
langue française dictionary in the 1970s. See https://artfl-project.uchicago.edu/content/artfl-frantext.
8 On these two measures of textual ‘similarity’, see Buscaldi et al. (2020).
Here we encountered the thorny issue of likely duplicates, i.e. those texts with similar, but not identical,
titles that in fact represent two (or more) versions of the same text. Given
the analytical model we have chosen – modelling intertextual exchanges using graphs and SNA tools – the
presence of a duplicate text can significantly alter the results. Two or more copies of the same text, identical
or very similar, will tend to generate a very high number of co-occurrences among themselves (or just one,
but concerning the entire text), strongly influencing the quantitative metrics derived from graph
analysis; and the same happens for every single text that cites (or is cited by) this same multiple source,
creating double, triple, etc., intertextual links. For the reasons previously mentioned, simple comparison of
metadata is not effective, given the wide range of variables in the indication of authors, titles, dates, etc. The
solution we found consists of exploiting precisely these two characteristics of duplicates: if the automatically
detected co-occurrence is extremely long, or if it coincides with the near entirety of a document, it is highly
likely that it is a duplicate; the same applies if two texts present an anomalous number of co-occurrences,
which require inspection to explain.
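A minimal sketch of this duplicate heuristic is given below. The record layout, the function name and both thresholds (a single alignment covering 80 per cent of the shorter document, or 50 or more alignments between the same pair) are hypothetical values chosen for illustration; they are not the settings used by the project.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    source_id: str   # identifier of the first document
    target_id: str   # identifier of the second document
    length: int      # size of the aligned passage, in words


def flag_duplicate_candidates(alignments, doc_lengths,
                              coverage=0.8, max_alignments=50):
    """Return {(doc_a, doc_b): reason} for pairs that deserve manual inspection."""
    per_pair = defaultdict(list)
    for al in alignments:
        pair = tuple(sorted((al.source_id, al.target_id)))
        per_pair[pair].append(al.length)

    flagged = {}
    for pair, lengths in per_pair.items():
        shorter = min(doc_lengths[pair[0]], doc_lengths[pair[1]])
        if max(lengths) >= coverage * shorter:
            # One alignment spans almost the whole of the shorter document.
            flagged[pair] = "alignment covers most of the shorter document"
        elif len(lengths) >= max_alignments:
            # Many separate alignments between the same two documents.
            flagged[pair] = "anomalous number of alignments"
    return flagged
```

The function only flags pairs and never deletes anything, so that anthologies and similar compilations can be kept once a specialist has looked at them.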
This last case gives us cause to recall and insist on two key points. First, any automatic process must
always be conducted in a supervised manner (hence our use of the term ‘semi-automated’): ambiguous
cases are always numerous, given that the literary, editorial, and cultural reality of any period is always more
complex and elusive than what can be strictly formalised. In the case of metadata, cases of homonymy can
occur: often, different works that address the same theme may have similar titles (Essai sur la poésie
épique, Essai sur les intérêts du commerce maritime...). In the case of duplicate detection, some editorial
forms, such as anthologies or encyclopaedic collections, exist precisely because they serve as repositories of
long intertextual excerpts, which quantitative analysis alone would tend to indicate as duplicates. Therefore,
each algorithmic intervention must be evaluated by a domain specialist.
Secondly, there are always margins of error, independent of the researcher’s intentions, which cannot be
avoided because they are inherent to the nature of the data. For example, our digitised texts are fundamentally
different in terms of the underlying quality of the textual data, transcribed to near 100 per cent accuracy
in some cases, while others are the result of an automatic OCR process whose accuracy can vary greatly
depending on the source and digitisation campaign. Automatic correction often introduces more errors than
it corrects, and as a result, most of these texts retain high levels of OCR errors that can affect the performance
of even the most robust text-reuse detection system. But, beyond the incompleteness of results (a common
problem in both ‘traditional’ and digital research), it is the introduction of a non-homogeneous dimension
(corrected texts versus OCR) that can skew data analysis: the co-occurrences of a corrected text will be much
more easily retrievable than those of an OCR-generated text, and this can create significant variations in
the results. However, this should not be considered an impediment to proceeding, but rather a warning to
exercise caution across the entire data-processing pipeline. Understanding one’s dataset and its limitations
is a first, necessary, step to ensuring that downstream tasks and results are not influenced by outside factors.
The most prevalent authors in our corpus, i.e. those with 30 or more titles attributed to them, can be
seen in Table 1.
A brief look at this list may raise some doubts, which need to be discussed, even if merely to draw
some general methodological observations. At first glance, there seems to be a rather significant over-
representation of Voltaire in our corpus: due to the editorial decisions of the Voltaire Foundation at the
University of Oxford, the source for all of our Voltaire files, single poems, some just a few lines long, are
considered individual works, alongside more substantial texts, such as the Essai sur les mœurs and Candide.
In this case as well, therefore, this bias must be taken into account during interpretation: each individual
poem must always be contextualised by taking into account the actual editorial history of the composition
and its dissemination.
Figure 1 Distribution of text genre as percentage of total ModERN corpus.
With the above observations in mind, we began the corpus construction phase of our project: once
duplicates had been removed and the metadata standardised, we were left with a main research corpus of
13133 documents (mainly books) totalling over 511 million words. Of these 13133 documents,
3385 came from curated or transcribed sources while the remaining 9748 were the result of automatic
OCR. A rough distribution of text genres in the corpus can be seen in Figure 1. These classifications
were applied by the project team using a simplified version of the duc de La Vallière’s 18th-century
classification scheme for his private library, one of the most extensive in late Enlightenment France (see
Van Praet 1783).
Also of note is the strong presence of classical authors, those who were constantly edited and re-edited in
the 17th and 18th centuries. For the most part, these are translations that form a coherent sub-corpus of 527
texts, identified by our research team and mostly taken from Google Books as EPUB files which were then
cleaned, corrected and transformed into TEI-XML files. The logic behind including these texts, even if they
were not present in the initial corpora, is clear: the importance of the classical world in 18th-century culture
(from the Querelle des Anciens et des Modernes to the French Revolution, passing through neoclassicism and
the literary ‘retour à l’antique’ in the 1750s–1760s) is well known, and excluding them from our research
Author name | Number of works
Voltaire (1694–1778) | 1057
Carmontelle (1717–1806) | 76
Cicero (106–43 BCE) | 70
Bernard de Fontenelle (1657–1757) | 69
Horace (65–8 BCE) | 58
Plutarch (c.46–c.120) | 56
Denis Diderot (1713–1784) | 56
Florent Carton Dancourt (1661–1725) | 51
Henri-Louis Duhamel Du Monceau (1700–1782) | 47
Jean-Jacques Rousseau (1712–1778) | 44
Honoré-Gabriel Riquetti comte de Mirabeau (1749–1791) | 44
Pierre de Marivaux (1688–1763) | 41
Pierre Corneille (1606–1684) | 39
Jacques Necker (1732–1804) | 39
Étienne Clavière (1735–1793) | 39
Louis-Sébastien Mercier (1740–1814) | 38
Louis Petit de Bachaumont (1690–1771) | 37
Claude-Louis-Michel de Sacy (1746–1794) | 37
Molière (1622–1673) | 36
Jean-Antoine-Nicolas de Caritat marquis de Condorcet (1743–1794) | 35
Antoine François Prévost (1697–1763) | 34
Olympe de Gouges (1748–1793) | 34
Charles-Simon Favart (1710–1792) | 33
Thomas Corneille (1625–1709) | 32
Tacitus (c.55–c.120) | 32
Pierre-Samuel Dupont de Nemours (1739–1817) | 32
Nicolas-Edme Rétif de La Bretonne (1734–1806) | 30
Table 1 Authors with 30 or more texts in the ModERN corpus
would have meant losing a large quantity of citations and references that contributed to shaping the political
and aesthetic thought of the period.
Finally, we decided to include various canonical and indispensable texts composed before the 18th
century. In this case as well, we must consider the specifics of our project: if, for example, Montaigne or
Pascal were absent from our corpus, an incredibly high number of references to their texts, crucial for the
18th century, would be untraceable for us, and our representation of intertextual patterns would be severely
compromised. In fact, it would not only be a serious lack of information but a structural problem of our
network: if two texts cite Montaigne without his work being present, they would appear as citing each
other when, in reality, they are both independently referring to a previous work. Thus, while the intertextual
networks we aim to produce will be bounded chronologically between 1685 and 1800 as beginning- and
end-dates, the corpus of texts used to generate the reuses must necessarily include works that fall outside
these somewhat arbitrary markers. If earlier texts were available in our base collections, then we tried to
include as many of them as possible.9
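The structural problem described here can be illustrated with a toy example. The sketch assumes a simple attribution rule, in which each reused passage is credited to the earliest dated text that contains it; that rule, the function name and the identifiers `text_a` and `text_b` (with their dates) are hypothetical and serve only to show the effect of a missing source.

```python
def reuse_edges(witnesses, dates):
    """Build directed (source, borrower) edges from shared passages.

    witnesses: {passage_id: set of document ids containing the passage}
    dates:     {document id: year}; documents absent from this mapping
               are treated as absent from the corpus.
    """
    edges = set()
    for docs in witnesses.values():
        present = sorted(d for d in docs if d in dates)
        if len(present) < 2:
            continue
        # Assumed rule: the earliest text in the corpus is taken as the source.
        source = min(present, key=lambda d: dates[d])
        edges.update((source, d) for d in present if d != source)
    return edges


# One passage from Montaigne's Essais, reused by two hypothetical later texts.
witnesses = {"passage_1": {"montaigne_essais", "text_a", "text_b"}}
full_corpus = {"montaigne_essais": 1580, "text_a": 1750, "text_b": 1770}
# The same corpus restricted to texts published from 1685 onwards.
bounded_corpus = {d: y for d, y in full_corpus.items() if y >= 1685}

with_source = reuse_edges(witnesses, full_corpus)
without_source = reuse_edges(witnesses, bounded_corpus)
```

With Montaigne present, both later texts point back to the Essais; once he is removed, the only remaining edge links `text_a` to `text_b`, a connection that does not reflect any real borrowing between them.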
Having established our corpus and outlined the reasons and choices that underlie its creation, and after
considering its limitations and their implications in formulating our working hypotheses for interpreting
our results, we were ready to proceed to the identification of intertextual connections within this corpus and
then to explore some possible research avenues and various types of analyses that can be marshalled even
with preliminary results.
3. Alignment and use-cases
Today, multiple software applications are available for identifying text reuses in various datasets. Among
the freely available tools for extracting textual reuses in large corpora, we considered those that use
programming languages such as R (R textreuse package10), Java (TRACER11), PHP/Perl (Tesserae12) and
Python (Passim;13 BLAST,14 a tool designed for DNA sequence analysis; and Text-PAIR15). Although these
tools offer similar functionalities, we ultimately opted for Text-PAIR as it was specifically designed to meet
the needs of literary-historical research, scales well to large corpora and can be compiled as part of the
9 For English publications the choice of potentially interesting early modern texts that predate the 18th century is much easier:
one can simply leverage the texts made available by the EEBO (Early English Books Online) project:
https://proquest.libguides.com/eebopqp. No such resource yet exists for publications in French.
10 https://github.com/ropensci/textreuse/. See also Li and Mullen (2020).
11 https://www.etrap.eu/research/tracer/. See also Büchler et al. (2014) and Franzini et al. (2019).
12 https://github.com/tesserae/tesserae/. See also Coffee et al. (2013).
13 https://github.com/dasmiq/passim/. See also Romanello and Hengchen (2021).
14 Basic Local Alignment Search Tool: https://blast.ncbi.nlm.nih.gov/Blast.cgi. See also Vesanto et al. (2017) and Salmi et al.
(2020).
15 Pairwise Alignment for Intertextual Relations: https://github.com/ARTFL-Project/text-pair/. See also Olsen, Horton and Roe
(2011).
PhiloLogic search and retrieval corpus analysis system.16 PhiloLogic creates full-text indices of corpora,
leveraging metadata and other textual elements from TEI-XML files, and organises them into a database
that can be easily queried. PhiloLogic word indices then form the basis on which Text-PAIR
runs its sequence alignment matching algorithm. Additionally, Text-PAIR is easy to configure and relatively
fast, which allowed us to experiment with several key matching parameters and pre-processing options,
including lemmatisation and stemming.17 Finally, Text-PAIR is particularly well suited for extracting noisy
reuses – i.e. those that because of OCR or other factors may include broken or highly dissimilar sequences.
Once our main corpus was built as a PhiloLogic instance, and after having settled on our text pre-
processing and matching parameters, we compared the entire corpus to itself using Text-PAIR. This initial
pass generated almost two million potential text reuses – i.e. similar passages that co-occur in at least
two different texts. While impressive, these results should be taken with an appropriate measure of salt,
as they include many, many ‘noisy’ alignments: that is, passages that are indeed similar but that do not
necessarily constitute a ‘reuse’ in its fullest sense: formulaic expressions, legal boilerplate, publishing and
print privileges, commonplaces, and so on. We are actively developing a semi-automatic alignment filter
designed to eliminate much of this ‘noise’, although many cases will require human intervention at a finer-
grained level of filtering. Based on initial estimates, around 80 per cent of the identified alignments will
likely be eliminated as ‘noise’, leaving us with roughly 200000 to evaluate further. We also plan to compare
our main corpus with several secondary corpora, including dictionaries, private correspondences, printed
pamphlets and the 18th-century press.18 In the meantime, we were eager to demonstrate the utility of our
approach and present some preliminary results and possible use-cases leveraging these filtered alignments.
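One simple way to pre-sort such ‘noisy’ alignments is sketched below. It is not the filter under development in the project, whose design is not described here. It rests on a single assumption: formulaic passages such as print privileges tend to recur across many unrelated documents. The function names, the tuple layout and the `max_docs` threshold are hypothetical.

```python
import re
from collections import defaultdict


def passage_key(text: str, n_words: int = 8) -> str:
    """Crude normalisation: the first n lower-cased words of a passage."""
    return " ".join(re.findall(r"\w+", text.lower())[:n_words])


def split_noise(alignments, max_docs: int = 3):
    """Split (source_id, target_id, passage) tuples into (keep, review).

    A passage whose key appears in more than `max_docs` distinct documents
    is set aside for human review as probable boilerplate.
    """
    docs_per_key = defaultdict(set)
    for src, tgt, passage in alignments:
        docs_per_key[passage_key(passage)].update((src, tgt))

    keep, review = [], []
    for al in alignments:
        widespread = len(docs_per_key[passage_key(al[2])]) > max_docs
        (review if widespread else keep).append(al)
    return keep, review
```

A frequency threshold of this kind would also set aside widely quoted maxims, which is one reason the final decision has to remain with a human reader.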
4. Plagiarism
The potential of such large-scale text reuse data is manifold, and the type of studies that can be conducted
with it extremely varied. For instance, it is possible to identify obvious examples of plagiarism, cases in
which an author takes advantage of their source’s relative obscurity to appropriate their text with impunity.
Similarly, one can find references that are difficult to trace for contemporary researchers, drawn from works
which are not considered canonical today but that circulated at the time and participated in the shared
literary culture. The results of an alignment may therefore represent a real surprise for a researcher, who can
(re)discover intertextual connections that would otherwise remain invisible.
16 https://github.com/ARTFL-Project/PhiloLogic4/. See also Tharsen and Gladstone (2020). Since 2015, Clovis Gladstone,
associate director at the University of Chicago’s ARTFL project, has been the lead developer of both the PhiloLogic and Text-
PAIR codebases. We are grateful for his invaluable support of the ModERN project.
17 For a discussion of our text-matching parameters and experimentation, see Fedchenko, Nicolosi and Roe (2024).
18 Our thanks to the ARTFL project for its collection of dictionaries (https://artfl-project.uchicago.edu/content/dictionnaires-dautrefois),
Electronic Enlightenment for its correspondences (https://www.e-enlightenment.com), the Newberry Library for
its collection of French pamphlets (https://www.newberry.org/collection/research-guide/french-pamphlets) and the BnF
DataLab for the 18th-century press (https://www.bnf.fr/fr/bnf-datalab).
To take one example, we came across an unexpected case of plagiarism involving the Greek poet Sappho,
a figure whose reception in the 18th century is somewhat problematic, and whose texts were included in the
sub-corpus of classical translations mentioned above. While the scandal of Sappho’s homosexual relationships
was largely mitigated in the 18th century, leading to a more pathos-driven portrayal of her character in
numerous rewritings that depict her as an unlucky lover, old and exiled, her erotic and passionate dimension
remained ever-present. So much so that Mercier, in his utopian work L’An 2440, included her among the
ancient texts that were unanimously burned as harmful. As Joan DeJean states, ‘the eighteenth century
may have continuously reaffirmed Sappho’s heterosexuality because the fear of her sapphism had not been
eradicated’ (1989, p.118). Her fragments were thus, on the one hand, excluded from the canon of classical
authors, and on the other, frequently translated into French and Latin, often in anthologies of Greek poets
(Sappho 1681; Sappho 1712; Sappho 1758; Sappho 1781). These translations, which lack deep philological
rigour, significantly altered the original text to promote this new characterisation, making the new versions
scarcely recognisable compared to the originals (DeJean 1989, pp.116–67). Sappho is therefore published,
but read with suspicion, or simply misread; like the poet herself, her texts are subject to misinterpretations
and rewritings. Sappho’s case is therefore exemplary in understanding how, without the use of automatic
textual comparison systems, it would be impossible to find all the traces of her literary dissemination.
Given this context, we discovered that in 1788 her poems were the subject of an almost comical
‘appropriation’ in the poetry section of the provincial Journal du Hainaut et du Cambrésis: a certain M. Parent
de Saint-Amand published not just one, but at least three poems by Sappho (‘À ma bouteille’, ‘Ma mort’,
‘À Éléonore’, this last rechristened ‘La Discrétion à Mlle de C.L.’), explicitly presenting himself as the author, with
small changes to the texts that in no way justify this claim (see Parent de Saint-Amand 1788; Parent de Saint-
Amand 1789a; Parent de Saint-Amand 1789b). In Figure 2 we see the text of Sappho compared to the Journal
in the Text-PAIR web interface, and in Figure 3, the original text published in the Journal. Due to OCR errors
present in the two compared texts, Text-PAIR only identified the part highlighted in red and marked the minor
differences in green. However, the researcher can easily observe that the entire poem has been fully copied,
with only the modification of the title and the change of addressee of the poem (Éléonore becomes Constance).
This is one of the main advantages of Text-PAIR, which allows for the identification of textual reuse, even
partial, when the texts are corrupted or fragmentary. Clearly, in this case, this is plagiarism no matter how one
looks at it.
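To illustrate how a partial match can survive local differences, the following minimal sketch aligns two short passages word by word with Python's standard difflib module. It is not Text-PAIR's algorithm, which is not described here; the function name shared_passages, the min_words threshold and the two sample lines are all invented for illustration and are not quotations from the Journal.

```python
from difflib import SequenceMatcher


def shared_passages(text_a, text_b, min_words=3):
    """Return runs of at least `min_words` consecutive words common to both texts."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # autojunk=False stops difflib from discarding frequent words in long texts.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    passages = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_words:
            passages.append(" ".join(words_a[block.a:block.a + block.size]))
    return passages


# Invented toy lines (hypothetical, not the actual poem): only the addressee differs.
source = "je garde pour Éléonore un silence discret"
reprint = "je garde pour Constance un silence discret"
print(shared_passages(source, reprint))
```

The two runs on either side of the changed name are reported separately, which is broadly the situation described above: a tool flags the matching segments, and the researcher recognises that the whole passage has been copied.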
In its exceptionality, this case raises several key points:
First, we must not forget the nature of the texts we are analysing: the 18th century had a very special
relationship with Antiquity, both of proximity and acclimatisation (theories of artistic perfection,
syncretism and the principle of the belle infidèle, for example). This allowed for great liberty in the reuse of
ancient texts, which could be cut, transformed and distorted for any aesthetic purpose (Zuber 1968; Grell
1995, pp.307–24). A project such as ours can thus not only detect extreme cases of plagiarism such as the
one above, but also identify co-occurrences and rewritings that, given the extreme nonchalance with which
sources were often reused in the 18th century, would otherwise have remained hidden.
Figure 2 The text of Sappho as it appears in Text-PAIR.
Figure 3 M. Parent’s plagiarism as it appears in the Journal du Hainaut et du Cambrésis.
Second, and most importantly, the detection of intertextual links is highly dependent on the ‘culture’ of
the analyser: just as the readers of the Journal may not have recognised Sappho’s texts, similarly, we researchers
today would have a difficult time identifying such references if not for their automatic detection across a large and
heterogeneous corpus that includes both classical translations and issues of obscure provincial newspapers. Yet
the importance of these small, almost serendipitous discoveries is significant, as they open up unpredictable and
stimulating fields of research: who was this M. Parent? Why did he choose Sappho? How transparent was this
plagiarism for the reader of the time? etc. Or it can equally serve as a starting point for new, more general research
questions: is it possible to find networks of dissemination of classical texts in the provinces, where the processes
of cultural diffusion were different from those in the capital? Despite moral controversies, how extensive was the
dissemination of Sappho’s texts in the 18th century? etc. It is precisely these sorts of questions, and their potential
answers, that we hope will emerge once the project’s data is analysed and released to the public.
5. Quantitative analysis – uncovering the influence of individual authors
Aside from identifying and uncovering direct (or indirect) examples of text reuse, our project seeks more
generally to understand the notion of authorial or textual ‘influence’ in the 18th century. Confronted
with many lesser-known figures whose reception is ambiguous, this type of analysis is often difficult. It is
undeniable, for instance, that Cicero was a key reference for 18th-century culture, both in terms of oratory
and moral reflexions, but understanding the impact of less prominent figures, or those whose biographies
might significantly affect contemporary judgment, is altogether more complex. Again, it comes down to a
question of scale: Cicero is everywhere, and the possible variations in references to his works tend to lose
their significance; Catullus much less so, and each individual reference takes on much greater importance,
making the discovery of multiple examples particularly valuable.
Take Julius Caesar as one such example: his reception in the 18th century is highly ambiguous, both as
a historical figure and as a writer (Mercier and Bièvre-Perrin 2024). These two aspects are often connected:
for instance, in Rollin’s Traité des études, an important pedagogical text of the time, Caesar is simultaneously
praised for his style as a historian (Grell 1995, pp.100–106) and condemned for his arrogance and for his
political coup that undermined the institutions of the Roman Republic (Bedon 1985). Praised for his military
achievements and for civilising Gaul (Grell 1995, pp.1113–19), Caesar is also highly criticised, on one hand
politically, given his status as a ‘tyrant of usurpation’ and, on the other, as an historian, whose works are
often considered devoid of concrete details (Poignault 1985). We need only think about Voltaire’s equivocal
treatment of Caesar in Rome sauvée, where the character is both an alternative to Cicero’s passivity and one
of the first possible accomplices in Catiline’s thirst for power (Nicolosi 2024), or the different nuances his
image takes in revolutionary speeches (Parent 2022). How should one assess the period’s interest in such an
ambiguous figure? In this case, traditional exegesis can be enriched by the data that a project like ours can
provide, confirming and nuancing existing interpretations.
The simplest method for assessing the influence of authors is to quantify and evaluate their presence in
the texts of the time, either as subjects of theoretical works or as protagonists in literary texts, or to the extent
that their works and words are cited or reused for their exemplarity or appropriateness. However, while
the history of Julius Caesar is evidently the subject of countless comments and analyses (in our corpus, the
expression ‘Jules César’ occurs 620 times), the case is different when examining his ‘active’ presence in the
period’s imagination as a ‘speaking’ subject or ‘agent’, and thus, a more direct definition of influence in terms
of symbolic impact or direct reuse of his statements.
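A count of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: the corpus is represented as a plain dictionary of titles and texts, the function name count_expression is hypothetical, and the two sample texts are invented rather than drawn from our corpus.

```python
import re
import unicodedata


def count_expression(texts, expression):
    """Count case-insensitive occurrences of `expression` in each text."""
    # Normalise to NFC so that accented letters compare equal however they were encoded.
    needle = unicodedata.normalize("NFC", expression)
    pattern = re.compile(re.escape(needle), re.IGNORECASE)
    return {
        title: len(pattern.findall(unicodedata.normalize("NFC", body)))
        for title, body in texts.items()
    }


# Invented sample texts (hypothetical).
texts = {
    "essai": "Jules César passa le Rubicon. On admire JULES CÉSAR.",
    "lettre": "César seul ne compte pas ici.",
}
counts = count_expression(texts, "Jules César")
print(counts, sum(counts.values()))
```

Such a raw frequency only measures how often a figure is named; it says nothing about whether his own words are reused, which is the distinction drawn above.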
We can start by evaluating the number and type of plays that are dedicated to Caesar, presented in
Table 2. Based on Brenner’s catalogue of all 18th-century plays (1947), we notice that Caesar is relatively
marginalised compared to other Roman historical figures (Laplace 1985): out of eight plays about him, two
are translations of Shakespeare, where Caesar is little seen; two others give him larger roles but were not
performed on Parisian or institutional stages; and the rest are primarily about Caesar’s death, where the
protagonists are actually Brutus and Cassius. His appearances in works on other subjects are rare: besides the
aforementioned Rome sauvée, we could also mention Caton d’Utique (1715) by François-Michel-Chrétien
Deschamps. Clearly, we have here a historical character who is much talked about for the greatness of his
deeds, but whom an 18th-century public seemingly does not want to ‘see’ or ‘hear’.
| Author | Title | Genre | Acts | Year (and place if known) of first performance or publication |
|---|---|---|---|---|
| M.-A. Barbier | La Mort de César | tragedy | 5 | 1710 (Paris) |
| Banières | La Mort de Jules César | tragedy | 5 | 1728 (Toulouse) |
| Voltaire | La Mort de César | tragedy | 3 | 1735 (Paris) |
| Abbé Saulx | La Mort de César | tragedy | unknown | 1737 (Reims, Collège des Bons-Enfants) |
| P.-A. de LaPlace | Jules César (translation of Shakespeare) | tragedy | 5 | 1746 (published in Le Théâtre anglais) |
| J.-B.-C. Delisle de Sales | César ou les deux vestales | play | 1 | 1774 (at the residence of the prince d’Hénin) |
| P.-P.-F. LeTourneur | Jules César (translation of Shakespeare) | tragedy | 5 | 1776 (published in Shakespeare traduit de l’anglais) |
| Anonymous | L’Héroïsme sénonais ou le siège de Sens sous Jules César | drama | 3 | 1781 |

Table 2 Table of all 18th-century plays dedicated to Julius Caesar.
But what about the presence of Caesar’s words in other texts? The extracted reuse data from our
project would seem to confirm the hypothesis we have just formulated. Caesar’s two main works are the
Commentarii de Bello Gallico, and the Commentarii de Bello Civili, which we consider first by finding
quotations directly in Latin (the texts in the original language are easily found online). It is immediately
evident that most references to these two historiographical treatises are extracted from the De Bello Gallico
(57 co-occurrences), rather than the De Bello Civili (five co-occurrences), which as the history of an
insurrection was clearly less popular in the context of French absolutism. Quantitatively, the quotations
are not particularly numerous, and often appear in works by non-French authors and military texts, or
those that take an interest in pre-Roman Gaul (Table 3). All told, Caesar seems to be used mainly as
documentary support for historical or proto-ethnological enquiries, and rarely taken up or commented
on for his rhetoric and sentences.
| Author and nationality | Title and date of publication | Number of quotes |
|---|---|---|
| F.-R. Pommereul (French) | Recherches sur l’origine de l’esclavage religieux politique du peuple, en France (1783) | 12 |
| C. Guischardt (French) | Mémoires critiques et historiques sur plusieurs points d’antiquités militaires (1774) | 8 |
| G. Stuart (British) | Dissertation historique sur l’ancienne constitution des Germains, Saxons et habitants de la Grande-Bretagne (1794) | 6 |
| J.-R. Sinner (Swiss) | Voyage historique et littéraire dans la Suisse occidentale (1781) | 5 |
| R. Wallace (British) | Essai sur la différence du nombre des hommes dans les temps anciens et modernes (1754) | 4 |
| J.-B. de Mirabaud (French) | Le Monde, son origine, et son antiquité (1751) | 2 |
| H. Gautier (French) | Traité des ponts (1728) | 2 |
| T. Shaw (British) | Voyages dans plusieurs provinces de la Barbarie et du Levant: contenant des observations géographiques, physiques, philologiques… (1743) | 2 |

Table 3 Table of works that most frequently cite Julius Caesar in Latin
Using our sub-corpus of translations, we find a similar set of practices of reuse concerning Caesar’s works
(Caesar 1678; Caesar 1763; Caesar 1785; Caesar 1786): most of the references appear, again, in military texts,
confirming how Caesar was appreciated in the 18th century as a brilliant general rather than as a politician;
his historical treatises are again used mainly to extract information about ancient Gaul (Table 4).
Interestingly, our alignments also unearthed a maxim attributed to Madame Des Houillères that
recurs frequently in the various translations of Caesar: ‘Nul n’est content de sa fortune, ni mécontent de
son esprit’. The significant presence of this maxim, which has become proverbial, in many peritexts of
Caesar’s translations suggests a negative perception of this character, of the leader who, out of his personal
affirmation, overthrew the legitimate, albeit republican, state. Finally, our data confirm and corroborate what
we had empirically perceived when looking at the theatrical output of the century: Caesar’s ‘voice’ remained
largely unheard in the 18th century, which preferred to discuss his exploits (mainly through Plutarch and
historians of the Imperial period) than to use directly the expressions of a controversial figure in the context
of monarchical France.
Here as above, the ability to compare a large number of texts allows us to confirm a hypothesis that
would be otherwise difficult to prove in absolute terms – remaining, as these so often do, at the level of a
‘hunch’ (in this case, correct). Certainly, due to the nature of our data, it is possible that for technical reasons
(OCR texts with many errors), some of Caesar’s quotations may remain unidentified. But given the very low
number of quotations found in both French and Latin in comparison to other Roman authors, the nature of
the texts reusing the Roman general’s treatises, and the massive presence of Madame Des Houillères’s couplet,
there seems to be little doubt as to Caesar’s scarce presence within the 18th-century literary field. As with
mixed-mode methods in the social sciences, quantitative analysis, applied to narrow or inherently complex
cases, becomes a solid ally of qualitative hypotheses and research.
| Author | Title and year of publication | Number of quotes |
|---|---|---|
| J.-B. Dubos | Histoire critique de l’établissement de la monarchie françoise dans les Gaules (1734) | 11 |
| J. Pagès | Manuscrits de Pagès, marchand d’Amiens, écrits à la fin du 17e et au commencement du 18e siècle, sur Amiens et la Picardie (1820) | 10 |
| A.-F. Boureau-Deslandes | Essai sur la marine des anciens (1768) | 10 |
| C. Guischardt | Mémoires critiques et historiques sur plusieurs points d’antiquités militaires (1774) | 8 |
| M. de Saxe | Les Rêveries dédiées à Messieurs les officiers généraux par Mr. de Bonneville (1757) | 3 |
| D. Lescallier | Vocabulaire des termes de marine anglois et françois (1777) | 3 |
| Anonymous | Un bon François de l’ordre des patriciens, aux bons François de l’ordre des plébéiens (1789) | 2 |

Table 4 Table of works that most frequently cite Julius Caesar translated in French

6. The heuristic potential of networks
Finally, what about network analysis? How can it serve literary studies? At our current stage of research,
the amount of data and its complexity make it difficult to obtain reliable results. The various obstacles
that have appeared, and that we intend to overcome, concern, for example, the difficulties in classifying
co-occurrences: when are they significant? When do they represent a true reuse and not simply a repetition
of common or formulaic language with no intertextual value? While these questions remain very much
open, we are nonetheless encouraged by some of our preliminary results, which confirm known premises of
18th-century literature while hinting at the broader potential of the project as a whole.
Let us take, for example, one of the typical dichotomies of the theatrical and literary world of the
18th century, the subject of countless specialist debates: who is the most important point of reference
for Enlightenment dramatists, Corneille or Racine? And how did these two giants of French classicism
come to influence 18th-century playwriting? The parallel, already posed at the end of the 17th century
(Mortgat-Longuet 2003) and continued in the 18th century (Goldzink 2003), accompanies the entire
history of French literature. Many concessions would need to be made, but in general, anyone who has
dealt with the history of 18th-century theatre would likely answer Racine, insofar as his use of the pathetic
and the spectacular (e.g. Athalie) informs the main aesthetic developments of the century (Perchellet
2004a; Perchellet 2004b; Viala and Tunstall 2015, p.274). Can our current data confirm this first, intuitive
hypothesis? And if so, how? To answer these questions, we need to first take into account our use of graph
metrics for understanding network ‘influence’.
Our dataset of textual reuses allows us to generate graphs in which each text or author (i.e. the totality
of texts attributed to the same author) represents a node, and in which the links between two points indicate
an intertextual exchange that has taken place between two texts (or between the entire production of both
connected authors). Each generated graph will thus have characteristics that can be analysed mathematically,
and which give us information on the function and weight that each node (and thus each text or author)
assumes within the system of intertextual exchanges.
The first metric that can be analysed is the degree of a node, i.e. the number of exchanges in which it is
a protagonist, either as a quoting subject or as a quoted object. Using this degree measure, it is possible to
generate a simple relative ranking by number of intertextual interactions. In our case, and with the current
data at our disposal, Corneille appears in 14th position, while Racine comes in 11th. Thus, in the absolute,
Racine appears in more exchanges than Corneille. A first confirmation of our hypothesis, but from which
no conclusion can be drawn: beyond this purely quantitative measure, it is the quality and importance of
these links in the general context of the intertextual network that can give us more and better indications.
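The degree ranking described above can be sketched as follows. This is a minimal illustration, assuming that the reuse data can be reduced to pairs of (reused author, reusing author); the function name degree_ranking and the sample pairs are hypothetical and do not reflect the project's actual data or pipeline.

```python
from collections import Counter


def degree_ranking(alignments):
    """Rank nodes by degree: the number of reuse links in which they take part."""
    degree = Counter()
    for reused, reusing in alignments:
        degree[reused] += 1   # the node appears as a quoted object
        degree[reusing] += 1  # the node appears as a quoting subject
    return degree.most_common()


# Hypothetical alignment pairs (reused author, reusing author), invented for illustration.
alignments = [
    ("Racine", "Voltaire"),
    ("Racine", "La Harpe"),
    ("Racine", "Mercier"),
    ("Corneille", "Voltaire"),
]
ranking = degree_ranking(alignments)
print(ranking)
```

Because degree adds up both roles, it cannot by itself distinguish an author who is widely quoted from one who quotes widely, which is one reason it is treated here only as a first indication.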
Another measure that can be taken into consideration is PageRank, a measure for directed graphs
which depends on the number and quality of links to a node (Liu et al. 2017; Labatut and Bost 2019). The
underlying hypothesis behind this measure is that the most important nodes are likely to receive more links
from other important nodes. An author has a high PageRank if a large number of authors reuse their text
and these authors are themselves often reused and, therefore, considered important in the system. In other
words, if an author who is ‘widely read’ quotes me, my ‘importance’ and the possibility of other people
reading me increase. There is more chance of readers ‘stumbling across’ my text if Voltaire reuses me, than if
20 minor and little-read authors do. Now, if we take the ranking by PageRank of the authors in our project,
Corneille is in sixth place, while Racine is in seventh: we could therefore deduce that Corneille is less cited
in the absolute, but cited by more ‘important’ authors, and that therefore his impact on the literary world of
the 18th century is slightly stronger than that of Racine. But it is possible to refine this result even further.
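The intuition behind PageRank can be made concrete with a short power-iteration sketch. Several choices here are assumptions rather than a description of our implementation: links run from the reusing author to the reused author, the damping factor is the conventional 0.85, the rank of nodes without outgoing links is redistributed uniformly, and the toy graph (author_a, author_b, hub and their readers) is invented.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph given as (reusing, reused) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for reusing, reused in edges:
        out_links[reusing].append(reused)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is shared equally among all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for node in nodes:
            for target in out_links[node]:
                new_rank[target] += damping * rank[node] / len(out_links[node])
        rank = new_rank
    return rank


# Invented toy graph: author_a is reused by three little-read authors,
# author_b by a single 'hub' that is itself reused by five authors.
edges = (
    [(f"minor{i}", "author_a") for i in range(3)]
    + [(f"reader{i}", "hub") for i in range(5)]
    + [("hub", "author_b")]
)
ranks = pagerank(edges)
print(ranks["author_a"], ranks["author_b"])
```

In this toy graph author_b ends up with the higher score despite receiving a single link, because that link comes from a node that is itself heavily reused.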
Another fundamental measure in network analysis is betweenness centrality, which calculates the
importance of a node with respect to its position in the graph (Labatut and Bost 2019; Grandjean and Jacomy
2019). More specifically, it calculates the number of times a node is on the shortest paths between two other
nodes; the more central a node is and therefore the more it can act as a bridge between other nodes in the
network, the higher its betweenness centrality. The more peripheral and isolated a node is from the rest of the
nodes in the system, the lower its betweenness will be and the lower its impact on potential paths. Now, this
measure emphasises the identification of paths, as in the case of information transmission and distribution
flows (e.g. of people, energy, goods): if to get from point A to point C, the fastest route goes through point
B, then B appears as a central node in the distribution of a resource, and will have a high betweenness – it
becomes a hub. An airport of a large city allows for the connection of many airports of smaller cities, not
otherwise connected to each other: its betweenness and importance in the transportation network are very
high, enabling the passage of travellers between disconnected places. In an intertextual network, however, the
relationships between texts/points do not describe a flow of information or represent a path between various
points. If text B cites text A, and is itself cited by text C, the representation of this interaction (A → B → C)
seems to suggest that B is a bridge between A and C; but in reality, it makes little sense to say that A and C
are connected thanks to B, for numerous reasons (what B cites from A is not necessarily what C cites from B;
and if the citation is the same, it is impossible to establish whether C cited A through B, or whether C directly
cited A). More generally, even though the direction of the arrows may suggest a path from A to C, in reality,
what is represented is only the relationship between A and B, and that between B and C.
In our network, betweenness does not indicate the importance of a text, but rather its ability to be involved
in intertextual exchanges with different groups of texts or literary communities. A high betweenness implies
that the node is connected to many nodes of the network which would be disconnected otherwise: a text that
cites or is cited in politics, economics, theatre, religion, etc. will come into contact with very different parts of our
intertextual graph, even in isolation. Conversely, a low betweenness implies that the node is poorly connected,
or connected to a homogeneous group of texts that tend to quote each other, without connections with other
areas of the network. Typical cases of low betweenness are found in religious texts or legal documents, which
are very present in texts of the same nature, but rarely found in texts dealing with other subjects.
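For readers who wish to see how such scores arise, here is a compact sketch of betweenness centrality for a directed, unweighted graph, following Brandes' well-known algorithm. It returns raw, unnormalised counts; the edge convention (reusing, reused) and the toy edges are assumptions made for illustration, and reversing every edge would leave the scores unchanged.

```python
from collections import deque


def betweenness(edges):
    """Unnormalised betweenness centrality of each node in a directed graph."""
    nodes = sorted({node for edge in edges for node in edge})
    successors = {node: set() for node in nodes}
    for start, end in edges:
        successors[start].add(end)
    score = dict.fromkeys(nodes, 0.0)
    for source in nodes:
        # Breadth-first search counting shortest paths from `source`.
        order, queue = [], deque([source])
        predecessors = {node: [] for node in nodes}
        paths = dict.fromkeys(nodes, 0)
        distance = dict.fromkeys(nodes, -1)
        paths[source], distance[source] = 1, 0
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in successors[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards the source.
        dependency = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    return score


# Invented toy edges (reusing, reused): one node touches two otherwise separate groups.
toy_edges = [
    ("Almanach", "Racine"), ("Sermon", "Racine"), ("Racine", "Port-Royal"),
    ("Voltaire", "Corneille"), ("La Harpe", "Corneille"),
]
scores = betweenness(toy_edges)
print(scores["Racine"], scores["Corneille"])
```

In the toy graph the node that both quotes and is quoted across groups obtains a positive score, while the node that is only quoted within one group scores zero. As noted above, such a score should be read as an indicator of contact with different parts of the graph, not as evidence that material actually travelled along these paths.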
Our network’s betweenness centrality ranking finds Racine in sixth position and Corneille 196th. The
latter’s position in the network is thus much more marginal than that of the former. The explanations for
this large gap, even in the face of similar degree and PageRank measures, may be multiple: Racine seems
to act as a ‘bridge’ between different groups in our network, both as a citing author (think of his links with
the religious world of Port-Royal) and as a quoted author (his verses become part of popular culture, and
are quoted in contexts that do not concern theatrical dramaturgy). On the contrary, Corneille remains a highly
cited author (his degree measure is very high), but by homogeneous groups, or by a few individual authors
in an intense manner. Voltaire’s Commentaires sur Corneille (1764), as well as other poetic texts (La Harpe’s
Lycée, 1739, for example), quote his works extensively, and we thus understand why Corneille’s PageRank is
so high, but these texts all belong to the world of Belles-lettres alone, and links with other groups and areas
of the network remain weak.
While it therefore remains difficult (and probably pointless) to answer definitively the question we have
posed regarding the importance of Corneille or Racine in 18th-century culture, the possibility of enriching
our knowledge through the integration of graph metrics extracted from network analysis has allowed us
to imagine new and different research hypotheses. The two authors seem to participate, in a qualitatively
different way, in the network of transmission of intertextual material: Racine seems to be a more transversal
author, and his verses are extracted from their context and reused in different spheres; Corneille remains a
very important literary reference point, but with a few exceptions, his words (poetic, but also theoretical,
e.g. his Trois discours sur le poème dramatique) resonate mainly in literary circles. Even as we await more
conclusive and extensive data, this simple insight opens up new avenues of research: which categories of
texts quote Racine the most? How do his verses – extracts from tragic texts, and decontextualised – manage
to take on different valences and become meaningful? How does Corneille become a point of reference – by
adherence or contrast – to the dramatic poetics of the 18th century?
7. Conclusion: next steps
In this brief account, we have described the basic workings of our project, and presented some possible
lines of research that it can help to identify and enrich. The analyses of the plagiarism of Sappho and the
dissemination of Caesar’s texts represent an initial demonstration of the advantages of applying digital
methods in the discovery of intertextual connections, which would otherwise be either unrecoverable or
too numerous for traditional close-reading approaches. As such, each piece of data becomes the basis for
returning to the sources, leading to new interpretations that either reaffirm or challenge common critical
assumptions. On the other hand, the Corneille/Racine comparison shows how the use of SNA paradigms
and metrics in literature and its internal intertextual links is both possible and potentially capable of refining
common exegesis, offering new evidence for established theories or uncovering new patterns that only
large-scale distant-reading analyses can reveal (Underwood 2019).
Clearly, we are still at an early stage of our project, and much work remains to be done to make our
results meaningful. Our goal is to create and define profiles for each text/author node in our network, created
on a mathematical basis in relation to the various metrics presented (and others belonging to the field of
graph studies, e.g. closeness centrality, clustering; see Labatut and Bost 2019; Grandjean and Jacomy 2019).
For example, since ours is an oriented graph (the links
connecting the works have a direction, there is a source and a target of each citation), it is possible to define
an author-text node as an Authority (high number of texts citing it) or as an Observer (high number of
texts cited); the same goes for the category of Mediator, applicable to a node whose betweenness centrality
and PageRank measures are both high. Other possible categories will emerge through the combination of
the various metrics we intend to calculate. We also envision analysing the context of each alignment, i.e.
defining by topic modelling or sentiment analysis the ‘intention’ behind each quotation/reuse. For example,
while Corneille is quoted at length and commented on in Voltaire’s Commentaires, the criticisms that the
philosophe levels against the 17th-century playwright certainly contain a different set of value judgements
than those found in La Harpe’s Éloge de Racine (1772).
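The node profiles envisaged above (Authority, Observer, Mediator) could be assigned by a simple rule of the following kind. This is only a sketch: what counts as ‘high’ has not yet been defined, so the three threshold parameters and their default values are placeholders invented for illustration, as is the function name node_profile.

```python
def node_profile(in_degree, out_degree, betweenness, pagerank,
                 high_degree=50, high_betweenness=0.01, high_pagerank=0.001):
    """Return the profile labels that apply to a node, given its metrics.

    The thresholds are hypothetical placeholders, not values used by the project.
    """
    labels = []
    if in_degree >= high_degree:
        labels.append("Authority")   # many texts cite this node
    if out_degree >= high_degree:
        labels.append("Observer")    # this node cites many texts
    if betweenness >= high_betweenness and pagerank >= high_pagerank:
        labels.append("Mediator")    # both betweenness and PageRank are high
    return labels


# Invented metric values for two hypothetical nodes.
print(node_profile(in_degree=120, out_degree=3, betweenness=0.0, pagerank=0.002))
print(node_profile(in_degree=60, out_degree=70, betweenness=0.05, pagerank=0.01))
```

A node may carry several labels at once, which leaves room for the further categories expected to emerge from combining metrics.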
These future perspectives notwithstanding, our general hope is that our large-scale treatment of
intertextual links in the 18th century will allow us to verify, in a novel way, some of the most
widespread literary hypotheses, and to offer the entire scholarly community the tools to conduct such
research themselves. Once organised in a database, all our data will be available online and can be interrogated in,
we hope, an intuitive manner, allowing any researcher to verify their own hypotheses on the circulation and
diffusion of texts and authors in the 18th-century French literary field.
References
Ahnert R., Ahnert S., Coleman C. and Weingart S. 2021. The Network Turn. Changing Perspectives in the Humanities.
Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108866804.
Barron A., Huang J., Spang R. and DeDeo S. 2018. ‘Individuals, institutions, and innovation in the debates of the
French Revolution’. In: Proceedings of the National Academy of Sciences 115:18, 4607–12.
https://doi.org/10.1073/pnas.1717729115.
Barthes R. 1984. Le Bruissement de la langue. Paris: Seuil.
Bedon R. 1985. ‘César dans le Traité des études de Charles Rollin’. In: Chevallier R. (ed.) Présence de César. Paris: Les
Belles Lettres, 275–85.
Brenner C. 1947. A Bibliographical List of Plays in the French Language 1700–1789. Berkeley CA: Edwards brothers.
Brockliss L.W.B. 2002. Calvet’s web: Enlightenment and the Republic of Letters in Eighteenth-century France. Oxford:
Oxford University Press. https://doi.org/10.1093/oso/9780199247486.001.0001.
Büchler M., Burns P., Müller M., Franzini E. and Franzini G. 2014. ‘Towards a historical text re-use detection’. In:
Biemann C. and Mehler A. (eds) Text Mining. Theory and Applications of Natural Language Processing. Cham: Springer,
221–38. https://doi.org/10.1007/978-3-319-12655-5_11.
Burrows S. 2018. The French Book Trade in Enlightenment Europe. London: Bloomsbury Academic.
Burrows S. 2020. ‘The FBTEE revolution: mapping the Ancien Régime book trade and the future of historical bibliometric
research’. In: Burrows S. and Roe G. (eds) Digitizing Enlightenment: Digital Humanities and the Transformation of
Eighteenth-Century Studies. Oxford University Studies in the Enlightenment. Liverpool: Liverpool University Press,
167–94.
Burrows S. and Roe G. (eds) 2020. Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-
Century Studies. Oxford University Studies in the Enlightenment. Liverpool: Liverpool University Press.
Buscaldi D., Felhi G., Ghoul D., Le Roux J., Lejeune G. and Zhang X. 2020. ‘Calcul de similarité entre phrases: quelles
mesures et quels descripteurs?’. In: Cardon R., Grabar N., Grouin C. and Hamon T. (eds) Actes de la 6e conférence conjointe
Journées d’études sur la parole (JEP, 33e édition), Traitement automatique des langues naturelles (TALN, 27e édition),
Rencontre des étudiants chercheurs en informatique pour le traitement automatique des langues (RÉCITAL, 22e édition).
Atelier Défi Fouille de Textes. Nancy: ATALA and AFCP, 14–25. https://aclanthology.org/2020.jeptalnrecital-de.2.
Caesar J. 1678. Les Commentaires de César, de la traduction de N.Perrot, sieur d’Ablancourt. Édition nouvelle revue et
corrigée. Perrot d’Ablancourt N. (trans.). Amsterdam: A.Wolfgang.
Caesar J. 1763. Les Commentaires de César […]. Nouvelle édition augmentée de notes historiques et géographiques, et
d’une carte nouvelle de la Gaule et du plan d’Alise, par M.Danville. Perrot d’Ablancourt N. and Le Mascrier J.-B. (trans.).
Amsterdam: Arkstee & Merkus.
Caesar J. 1785. Commentaires de César, avec des notes historiques, critiques et militaires. Turpin de Crispé L. (trans.).
Montargis: C.Lequatre and Paris: C.-G.Leclerc.
Caesar J. 1786. La Guerre de Jules César dans les Gaules. De Précis (trans.). Paris: Imprimerie royale.
Coffee N., Koenig J.-P., Poornima S., Forstall C., Ossewaarde R. and Jacobson S. 2013. ‘The Tesserae Project: intertextual
analysis of Latin poetry’. In: Literary and Linguistic Computing 28:2, 221–28. https://doi.org/10.1093/llc/fqs033.
Comsa M. T., Conroy M., Edelstein D., Edmondson C. S. and Willan C. 2016. ‘The French Enlightenment network’.
In: The Journal of Modern History 88:3, 495–534. https://doi.org/10.1086/687927.
Darnton R. 1982. The Literary Underground of the Old Regime. Cambridge MA: Harvard University Press.
Darnton R. 2021. Pirating and Publishing: The Book Trade in the Age of Enlightenment. Oxford: Oxford University
Press.
DeJean J. 1989. Fictions of Sappho, 1546–1937. Chicago IL: University of Chicago Press.
Edelstein D., Morrissey R. and Roe G. 2013. ‘To quote or not to quote: citation strategies in the Encyclopédie’. In:
Journal of the History of Ideas 74:2, 213–36. https://www.jstor.org/stable/43291299.
Edmondson C. and Edelstein D. (eds) 2019. Networks of Enlightenment: Digital Approaches to the Republic of Letters.
Oxford University Studies in the Enlightenment. Liverpool: Liverpool University Press.
Fedchenko V., Nicolosi D.M. and Roe G. 2024. ‘À la recherche des réseaux intertextuels: défis de la recherche littéraire
à grande échelle’. In: Humanités numériques 9. https://doi.org/10.4000/11wmw.
Franzini G., Passarotti M., Moritz M. and Büchler M. 2019. ‘Using and evaluating TRACER for an Index fontium
computatus of the Summa contra Gentiles of Thomas Aquinas’. Fifth Italian Conference on Computational Linguistics
(CLiC-it 2018), Turin: Zenodo. https://doi.org/10.5281/zenodo.3362130.
Genette G. 1992. Palimpsestes: la littérature au second degré. Paris: Seuil.
Goldzink J. 2003. ‘Le torrent et la rivière’. In: Declercq G. and Rosellini M. (eds) Jean Racine, 1699–1999. Paris: Presses
Universitaires de France, 719–28.
Grandjean M. and Jacomy M. 2019. ‘Translating networks: assessing correspondence between network visualisation
and analytics’. Digital Humanities 2019. Utrecht: HALSHS. https://shs.hal.science/halshs-02179024.
Grell C. 1995. Le Dix-huitième siècle et l’antiquité en France: 1680–1789. SVEC 330–31. Oxford: Voltaire Foundation.
Hamzehei A., Jiang S., Koutra D., Wong R. and Chen F. 2017. ‘Topic-based social influence measurement for social
networks’. In: Australasian Journal of Information Systems 21. https://doi.org/10.3127/ajis.v21i0.1552.
Kristeva J. 1969. Sēmeiōtikē. Recherches pour une sémanalyse. Paris: Seuil.
Labatut V. and Bost X. 2019. ‘Extraction and analysis of fictional character networks: a survey’. In: ACM Computing
Surveys 52:5, 1–40. https://doi.org/10.1145/3344548.
Laplace R. 1985. ‘Le personnage de César à la Comédie-Française’. In: Chevallier R. (ed.) Présence de César. Paris: Les
Belles Lettres, 293–304.
Li Y. and Mullen L. 2020. textreuse: Detect Text Reuse and Document Similarity. https://docs.ropensci.org/textreuse.
Liu Q., Xiang B., Jing Yuan N., Chen E., Xiong H., Zheng Y. and Yang Y. 2017. ‘An influence propagation view of
PageRank’. In: ACM Transactions on Knowledge Discovery from Data 11:3, 1–30. https://doi.org/10.1145/3046941.
McCarty W. 2018. ‘Modeling the actual, simulating the possible’. In: Flander J. and Joannidis F. (eds) The Shape of Data
in Digital Humanities. London: Routledge, 264–84.
Mercier C. and Bièvre-Perrin F. (eds) 2024. Jules César, construction d’une image de l’Antiquité à nos jours. Besançon:
Presses universitaires de Franche-Comté.
Moretti F. 2008. Graphes, cartes et arbres. Modèles abstraits pour une autre histoire de la littérature. Paris: Les Prairies
ordinaires.
Moretti F. (ed.) 2017. Canon/Archive. Studies in Quantitative Formalism from the Stanford Literary Lab. New York: n+1
Foundation.
Mortgat-Longuet E. 2003. ‘Aux origines du parallèle Corneille-Racine: une question de temps’. In: Declercq G. and
Rosellini M. (eds) Jean Racine, 1699–1999. Paris: Presses Universitaires de France, 703–17.
Most G.W. 2008. ‘Réflexions de Sappho’. Rabau S. and de Gandt M. (trans.). In: Fabula-LhT 5. https://doi.org/10.58282/lht.832.
Nicolosi D.M. 2024. ‘La valeur symbolique de l’espace scénique dans les tragédies romaines et grecques de Voltaire’. In:
Revue Voltaire 22, 121–36.
Norman L.F. 2011. The Shock of the Ancient. Literature and History in Early Modern France. Chicago IL: University of
Chicago Press.
Olsen M., Horton R. and Roe G. 2011. ‘Something borrowed: sequence alignment and the identification of similar
passages in large text collections’. In: Digital Studies/Le Champ numérique 2:1. https://doi.org/10.16995/dscn.258.
Paige N. 2021. Technologies of the Novel: Quantitative Data and the Evolution of Literary Systems. Cambridge: Cambridge
University Press. https://doi.org/10.1017/9781108890861.
Parent H. 2022. Modernes Cicéron. La romanité des orateurs révolutionnaires et de l’Empire (1789–1807). Paris: Classiques
Garnier.
Parent de Saint-Amand [given name unknown] 1788. ‘À ma bouteille’. In: Journal du Hainaut et du Cambrésis, par
M. le Cher de Limoges, membre de plusieurs académies 45, 378.
Parent de Saint-Amand [given name unknown] 1789a. ‘Ma mort’. In: Journal du Hainaut et du Cambrésis, par M. le
Cher de Limoges, membre de plusieurs académies 4, 35–36.
Parent de Saint-Amand [given name unknown] 1789b. ‘La Discrétion à Mlle de C.L.’. In: Journal du Hainaut et du
Cambrésis, par M. le Cher de Limoges, membre de plusieurs académies 7, 60.
Perchellet J.-P. 2004a. L’Héritage classique: la tragédie classique entre 1680 et 1814. Paris: Honoré Champion.
Perchellet J.-P. 2004b. ‘Corneille et ses publics au XVIIIe siècle’. In: Dix-septième siècle 225, 549–57.
https://doi.org/10.3917/dss.044.0549.
Poignault R. 1985. ‘Napoléon Ier et Napoléon III lecteurs de Jules César’. In: Chevallier R. (ed.) Présence de César. Paris:
Les Belles Lettres, 329–45.
Romanello M. and Hengchen S. 2021. ‘Detecting text reuse with Passim’. In: Programming Historian.
https://doi.org/10.46430/phen0092.
Salmi H., Paju P., Rantala H., Nivala A., Vesanto A. and Ginter F. 2020. ‘The Reuse of texts in Finnish newspapers
and journals, 1771–1920: a digital humanities perspective’. In: Historical Methods: A Journal of Quantitative and
Interdisciplinary History 54:1, 14–28. https://doi.org/10.1080/01615440.2020.1803166.
Sappho 1681. Les Poésies d’Anacréon et de Sapho, traduites de grec en françois, avec des remarques. Dacier A. (trans.).
Paris: D. Thierry & C. Barbin.
Sappho 1712. Les Odes d’Anacréon et de Sapho en vers françois, par le poète sans fard. Gacon F. (trans.). Rotterdam:
Fritsch & Böhm.
Sappho 1758. Anacréon, Sapho, Moschus, Bion, Tyrthée, etc., traduits en vers français. Poinsinet de Sivry L. (trans.).
Nancy: P. Antoine.
Sappho 1781. Poésies de Sapho, suivies de différentes poésies dans le même genre. Billardon de Sauvigny E.-L. (trans.).
London: [n.pub.].
Tharsen J. and Gladstone C. 2020. ‘Using Philologic for digital textual and intertextual analyses of the Twenty-Four
Chinese Histories 二十四史’. In: Journal of Chinese History 中國歷史學刊 4:2, 558–63.
https://doi.org/10.1017/jch.2020.27.
Underwood T. 2019. Distant Horizons, Digital Evidence and Literary Change. Chicago IL: University of Chicago Press.
Van Praet J.B.B. 1783. Catalogue des livres de la bibliothèque de feu M. le duc de La Vallière. 4 vol., Paris: Guillaume De
Bure. https://catalogue.bnf.fr/ark:/12148/cb365379105.
Vesanto A., Nivala A., Rantala H., Salakoski T., Salmi H. and Ginter F. 2017. ‘Applying BLAST to text reuse detection
in Finnish newspapers and journals, 1771–1910’. In: Bouma G. and Adesam Y. (eds) Proceedings of the NoDaLiDa
2017 Workshop on Processing Historical Language. Gothenburg: Linköping University Electronic Press, 54–58.
https://aclanthology.org/W17-0510.
Viala A. and Tunstall K. 2015. L’Âge classique et les Lumières. In: Viala A. 2014–2017. Une histoire brève de la littérature
française. Paris: Presses universitaires de France.
Zuber R. 1968. Les Belles infidèles et la formation du goût classique. Paris: A. Colin.