Figure 2 - available via license: Creative Commons Attribution 4.0 International
Key terms ("nakhuda" [upper left], "powers" [upper right], "weapons" [center right], "traffic" [center right], "clerk" [bottom center], and "mistress" [lower left]) appear in red, with their contextual associations in black, plotted in a two-dimensional space generated by applying PCA to our custom-trained WVM. The figure illustrates how words are spatially distributed according to their contextual similarity. The model was trained on 2-grams with 150 dimensions, 20 iterations, a 6-word window, and negative sampling of 5.
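The caption's projection step can be sketched as follows. This is a minimal illustration of reducing 150-dimensional word vectors to two dimensions with PCA, not the authors' pipeline: the vectors are random stand-ins rather than embeddings from the trained model, and `pca_2d` is a hypothetical helper name. The caption does not say which software was used.

```python
import numpy as np


def pca_2d(vectors: np.ndarray) -> np.ndarray:
    """Project row vectors onto their first two principal components."""
    # Center each dimension so the components describe variance, not offset.
    centered = vectors - vectors.mean(axis=0)
    # SVD of the centered matrix: the rows of vt are the principal axes,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Assumption: random 150-dimensional stand-ins, NOT the authors' embeddings.
rng = np.random.default_rng(0)
terms = ["nakhuda", "powers", "weapons", "traffic", "clerk", "mistress"]
vectors = rng.normal(size=(len(terms), 150))

coords = pca_2d(vectors)
for term, (x, y) in zip(terms, coords):
    print(f"{term:10s} {x:8.3f} {y:8.3f}")
```

With real embeddings, each row of `coords` would give the plotted position of one term. Distances in the plane only approximate similarity in the full 150-dimensional space.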
Source publication
In this article we analyze a corpus related to manumission and slavery in the Arabian Gulf in the late nineteenth and early twentieth centuries that we created using Handwritten Text Recognition (HTR). The corpus comes from India Office Records (IOR) R/15/1/199 File 5. Spanning the period from the 1890s to the early 1940s and composed of 977K w...
Contexts in source publication
Context 1
... examining these relationships, we gain insights into the semantic connections between terms, enabling a richer understanding of the corpus's multi-layered nature. Figure 2 represents a query using six key terms: "nakhuda" (a boat captain), "powers," "weapons," "traffic," "clerk," and "mistress." For example, the term "powers," chosen as one of the search terms, is used by colonial forces to refer to themselves, and it can be found in the upper right quadrant of the plot. ...
Context 2
... they emerged through iterative cycles of close and distant reading of IOR File 5. For demonstration purposes, we chose terms that produced a clear and evenly distributed arrangement in Figure 2. This distribution highlights distinct semantic fields within the corpus, with one notable exception. ...
Context 3
... spatial positioning of terms can sometimes reflect broader patterns in the data. In Figure 2, clusters on the right side of the plot largely correspond to colonial infrastructure, Western sovereignty, and the arms and slave trades. In contrast, the left side contains terms related to the lived experiences of enslaved individuals in the Gulf. ...
Context 4
... distinct clusters in Figure 2 were chosen deliberately to illustrate key discursive "neighborhoods" in the data. By using multiterm queries, we can uncover nuanced relationships between terms, offering a more granular understanding of how concepts are interconnected and how they contribute to broader themes in the debates surrounding slavery. ...
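A multiterm query of this kind can be sketched as a nearest-neighbour lookup repeated for each query term. The example below assumes cosine similarity as the measure, which the excerpt does not confirm. The vocabulary beyond the figure's key terms, the random vectors, and the function names are hypothetical placeholders.

```python
import numpy as np


def nearest_neighbours(query, vocab, matrix, k=3):
    """Return the k vocabulary words most cosine-similar to `query`."""
    # Normalise rows so that a dot product equals cosine similarity.
    unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    sims = unit @ unit[vocab.index(query)]
    order = np.argsort(-sims)  # indices from most to least similar
    return [vocab[i] for i in order if vocab[i] != query][:k]


def multiterm_query(queries, vocab, matrix, k=3):
    """Map each query term to its own neighbourhood of k words."""
    return {q: nearest_neighbours(q, vocab, matrix, k) for q in queries}


# Assumption: placeholder vocabulary with random 150-dimensional vectors,
# so the neighbourhoods printed here carry no semantic meaning.
rng = np.random.default_rng(1)
vocab = ["nakhuda", "powers", "weapons", "traffic", "clerk", "mistress",
         "boat", "arms", "office", "house"]
matrix = rng.normal(size=(len(vocab), 150))

neighbourhoods = multiterm_query(["nakhuda", "clerk"], vocab, matrix, k=3)
for term, words in neighbourhoods.items():
    print(term, "->", words)
```

Plotting the query terms together with the union of their neighbourhoods would give one point cloud per "neighborhood", in the manner the passage describes.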