Figure 1 - uploaded by Andrea Bruera
A visualization of the Doppelgänger test. Each of the 59 novels is split into two parts (Part A and Part B). For each part, a word vector is created for each character and for the matched common nouns, using distributional semantics models. By comparing the vectors from Part A and Part B, we check whether co-referring word vectors can be correctly matched.
Source publication
In human semantic cognition, proper names (names which refer to individual entities) are harder to learn and retrieve than common nouns. This seems to be the case for machine learning algorithms too, but the linguistic and distributional reasons for this behaviour have not been investigated in depth so far. To tackle this issue, we show that the se...
Contexts in source publication
Context 1
... show that this is the case, we propose an original referential task, the Doppelgänger test, associated with a new dataset, the Novel Aficionados dataset, made of 59 novels. The Doppelgänger test evaluates whether each entity representation learned in one subcorpus (one half of a novel) can be correctly matched to its co-referring entity representation from another subcorpus (the second half of the same novel), choosing among all the other entity representations (see figure 1). The task is challenging in that the model must distinguish between very similar entities (people and entities engaged in shared activities in a common ...
Footnote 1: Since names of places, objects or events have been reported in cognitive studies to dissociate from proper names of conspecifics (Lyons et al., 2002; Crutch and Warrington, 2004), in order to avoid confounds, these other sorts of names won't be considered.
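The matching step described in this excerpt can be sketched as a nearest-neighbour search. This is a minimal illustration, not the authors' implementation: the function name `doppelganger_accuracy`, the use of cosine similarity, and top-1 accuracy as the score are all assumptions, since the excerpt does not state the similarity measure or the evaluation metric. The toy vectors are invented for the example.

```python
import numpy as np

def doppelganger_accuracy(vectors_a, vectors_b):
    """Share of Part A entities whose most similar Part B vector co-refers.

    Assumes both dicts map entity names to non-zero vectors of equal length.
    """
    # Only entities present in both parts can be matched.
    names = sorted(set(vectors_a) & set(vectors_b))
    a = np.array([vectors_a[n] for n in names], dtype=float)
    b = np.array([vectors_b[n] for n in names], dtype=float)
    # Normalise rows so that the dot product equals cosine similarity.
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    sims = a @ b.T                   # sims[i, j]: entity i (Part A) vs entity j (Part B)
    predicted = sims.argmax(axis=1)  # best Part B candidate for each Part A entity
    return float(np.mean(predicted == np.arange(len(names))))

# Hypothetical toy vectors for three characters.
vectors_a = {"alice": [1.0, 0.1, 0.0], "bob": [0.0, 1.0, 0.2], "carol": [0.1, 0.0, 1.0]}
vectors_b = {"alice": [0.9, 0.2, 0.1], "bob": [0.1, 0.8, 0.1], "carol": [0.0, 0.2, 0.9]}
accuracy = doppelganger_accuracy(vectors_a, vectors_b)
```

Each Part A vector is compared against every Part B candidate, which mirrors the "choosing among all the other entity representations" condition in the text.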
Context 2
... order to reduce confounds, in the Doppelgänger test we take a single document where multiple entities appear (in our case, a novel, where entities are referred to by either proper names or common nouns). Then, the document is split into two sub-corpora (Part A and Part B), both containing mentions of all the entities (see figure 1). Subsequently, for each part, a semantic representation for each entity is obtained by way of a distributional semantics model. ...
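The split-then-represent procedure in this excerpt might look like the sketch below. It rests on several assumptions: the document is split at the token midpoint, and simple windowed co-occurrence counts stand in for the distributional semantics model, which the excerpt does not name. The helper names `split_in_half` and `context_counts`, the window size, and the toy text are hypothetical.

```python
from collections import Counter

def split_in_half(tokens):
    """Split a token list into Part A and Part B at the midpoint (an assumption)."""
    mid = len(tokens) // 2
    return tokens[:mid], tokens[mid:]

def context_counts(tokens, targets, window=2):
    """Count the words occurring within `window` tokens of each target entity."""
    counts = {t: Counter() for t in targets}
    for i, tok in enumerate(tokens):
        if tok in counts:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # skip the target occurrence itself
                    counts[tok][tokens[j]] += 1
    return counts

# Toy "novel" in which both entities are mentioned in each half.
text = ("alice reads books while bob cooks dinner "
        "then alice reads letters while bob cooks soup")
tokens = text.split()
part_a, part_b = split_in_half(tokens)
entities = ["alice", "bob"]
vectors_a = context_counts(part_a, entities)  # one representation per entity, Part A
vectors_b = context_counts(part_b, entities)  # one representation per entity, Part B
```

In this toy case, "reads" appears in the context of "alice" in both halves and "cooks" in the context of "bob", which is the kind of shared signal a matching step would rely on.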