Chapter

Solving Word Analogies: A Machine Learning Perspective


Abstract

Analogical proportions are statements of the form ‘a is to b as c is to d’, formally denoted a:b::c:d. This means that the way a and b (resp. b and a) differ is the same as the way c and d (resp. d and c) differ, as revealed by their logical modeling. The postulates supposed to govern such proportions entail that when a:b::c:d holds, then seven permutations of a, b, c, d still constitute valid analogies. It can also be derived that a:b::b:a does not hold except if a=b. From a machine learning perspective, this provides guidelines to build training sets of positive and negative examples. We then suggest improved methods to classify word analogies and also to solve analogical equations. Viewing words as vectors in a multi-dimensional space, we depart from the traditional parallelogram view of analogy to adopt a purely machine-learning approach. In some sense, we learn a functional definition of analogical proportions without assuming any pre-existing formulas. We mainly use the logical properties of proportions to define our training sets and to design proper neural networks, approximating the hidden relations. Using a GloVe embedding, the results we get show high accuracy and improve the state of the art on word analogy-solving problems.
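The augmentation scheme implied by these postulates is easy to make concrete. Below is a minimal Python sketch (illustrative, not the authors' code) that expands one labeled analogy into the eight equivalent positive forms and a pair of logically invalid negatives:

```python
# Minimal sketch (illustrative, not the authors' code): expanding one labeled
# analogy into training examples using the postulates of analogical proportion.

def positive_forms(a, b, c, d):
    """The 8 equivalent forms of a:b::c:d entailed by the postulates,
    generated by symmetry (a:b::c:d -> c:d::a:b) and central permutation
    (a:b::c:d -> a:c::b:d)."""
    return [
        (a, b, c, d), (c, d, a, b),
        (a, c, b, d), (b, d, a, c),
        (c, a, d, b), (d, b, c, a),
        (b, a, d, c), (d, c, b, a),
    ]

def negative_forms(a, b, c, d):
    """One common choice of negatives: swapping a single pair reverses one
    difference but not the other, so these cannot hold unless a=b or c=d."""
    return [(b, a, c, d), (a, b, d, c)]

# One seed analogy yields 8 positive and 2 negative quadruples:
pos = positive_forms("man", "king", "woman", "queen")
neg = negative_forms("man", "king", "woman", "queen")
```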


... Analogy detection on a single quadruple is less trivial than it may appear, as in many cases the boundary between valid and invalid analogies is not clearly defined. A solution is to use machine learning to learn the boundary from data, as Lim et al. implemented [10] for semantic word analogies and further explored in [11]. Using a dataset of semantic APs, they learn an artificial neural network to classify quadruples A, B, C, D into valid or invalid analogies, using pretrained word embeddings e_A, e_B, e_C, and e_D. ...
... Using a dataset of semantic APs, they learn an artificial neural network to classify quadruples A, B, C, D into valid or invalid analogies, using pretrained word embeddings e_A, e_B, e_C, and e_D. We follow a similar approach to morphological analogies in [13] by replacing the GloVe [35] semantic embeddings used in [10,11] with a morphology-oriented word embedding model. The classifier is detailed in Subsection 5.1 and Fig. 4 under the name Analogy Neural Network for classification (ANNc). ...
... In [10,11], a retrieval approach was proposed for semantic APs and later adapted to morphological APs in [15,17]. Similarly to the analogy detection approach proposed by the authors (see Subsection 2.3), the analogy solving approach relies on pre-trained embeddings. ...
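The ANNc setup described in these excerpts can be pictured with a short sketch. The following PyTorch snippet is an assumption-laden stand-in, not the published architecture: it stacks the four embeddings e_A, e_B, e_C, e_D into a 4 x dim grid and lets a small CNN output a validity probability; kernel size, channel count, and layer widths are illustrative choices.

```python
# Illustrative ANNc-style classifier (a stand-in, not the published model):
# the four embeddings are stacked into a 4 x dim grid and a small CNN outputs
# the probability that the quadruple is a valid analogy.
import torch
import torch.nn as nn

class AnalogyClassifier(nn.Module):
    def __init__(self, dim=50):
        super().__init__()
        # stride-2 2x2 filters: one output row mixes (e_A, e_B), the other (e_C, e_D)
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(2, 2), stride=2),
            nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(nn.Linear(16 * 2 * (dim // 2), 1), nn.Sigmoid())

    def forward(self, ea, eb, ec, ed):
        x = torch.stack([ea, eb, ec, ed], dim=1).unsqueeze(1)  # (batch, 1, 4, dim)
        return self.head(self.conv(x))                         # (batch, 1)

model = AnalogyClassifier(dim=50)
ea, eb, ec, ed = (torch.randn(8, 50) for _ in range(4))  # stand-ins for real embeddings
p_valid = model(ea, eb, ec, ed)
```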
Article
Full-text available
Analogical inference is a remarkable capability of human reasoning, and has been used to solve hard reasoning tasks. Analogy based reasoning (AR) has gained increasing interest from the artificial intelligence community and has shown its potential in multiple machine learning tasks such as classification, decision making and recommendation with competitive results. We propose a deep learning (DL) framework to address and tackle two key tasks in AR: analogy detection and solving. The framework is thoroughly tested on the Siganalogies dataset of morphological analogical proportions (APs) between words, and shown to outperform symbolic approaches in many languages. Previous work has explored the behavior of the Analogy Neural Network for classification (ANNc) on analogy detection and of the Analogy Neural Network for retrieval (ANNr) on analogy solving by retrieval, as well as the potential of an autoencoder (AE) for analogy solving by generating the solution word. In this article we summarize these findings and we extend them by combining ANNr and the AE embedding model, and checking the performance of ANNc as a retrieval method. The combination of ANNr and AE outperforms the other approaches in almost all cases, and ANNc as a retrieval method achieves competitive or better performance than 3CosMul. We conclude with general guidelines on using our framework to tackle APs with DL.
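For reference, the 3CosMul rule that ANNc is compared against as a retrieval method is the standard formulation of Levy and Goldberg (2014); a sketch with a placeholder embedding table:

```python
# The standard 3CosMul retrieval rule of Levy and Goldberg (2014), used above
# as a point of comparison; `emb` is a placeholder dict of unit-normalized vectors.
import numpy as np

def three_cos_mul(emb, a, b, c, eps=1e-3):
    """Answer a:b::c:? by maximizing cos(d,b) * cos(d,c) / (cos(d,a) + eps)."""
    shift = lambda x: (x + 1) / 2          # move cosines from [-1,1] into [0,1]
    best, best_score = None, -np.inf
    for w, e in emb.items():
        if w in (a, b, c):                 # exclude the query words themselves
            continue
        score = (shift(e @ emb[b]) * shift(e @ emb[c])
                 / (shift(e @ emb[a]) + eps))
        if score > best_score:
            best, best_score = w, score
    return best
```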
... Germany, which can be read "Paris is to France as Berlin is to Germany", that simultaneously capture similarities and dissimilarities between objects [28,31]. Analogical reasoning is a remarkable capability of the human mind, that has recently obtained impressive results on NLP tasks when applied on character and word embeddings [23,27,43]. Such a reasoning has also been proposed for KGs [18,25,32,38,49], using KG embeddings, i.e., vector representations of KG entities and relations that preserve as much as possible the properties of the graph [5]. ...
... Analogybased inference has been used to solve hard reasoning tasks and has shown its potential with competitive results in several machine learning tasks such as classification, decision making and recommendation [9,12,13,17], in data augmentation through analogical extrapolation for model learning, especially in environments with few labeled examples [7,8]. Moreover, it has been successfully applied in classical natural language processing (NLP) tasks such as machine translation [21], several semantic [23,24] and morphological tasks [3,27,34], as well as in (visual) question answering [39], solving puzzles and scholastic aptitude tests [37], and target sense verification [50]. ...
... Model. We adopt the supervised machine learning model proposed by Lim et al. [23]. This model is presented in Figure 3 and relies on convolutional neural networks (CNNs). ...
Conference Paper
Full-text available
Knowledge Graph Construction (KGC) can be seen as an iterative process starting from a high quality nucleus that is refined by knowledge extraction approaches in a virtuous loop. Such a nucleus can be obtained from knowledge existing in an open KG like Wikidata. However, due to the size of such generic KGs, integrating them as a whole may entail irrelevant content and scalability issues. We propose an analogy-based approach that starts from seed entities of interest in a generic KG, and keeps or prunes their neighboring entities. We evaluate our approach on Wikidata through two manually labeled datasets that contain either domain-homogeneous or -heterogeneous seed entities. We empirically show that our analogy-based approach outperforms LSTM, Random Forest, SVM, and MLP, with a drastically lower number of parameters. We also evaluate its generalization potential in a transfer learning setting. These results advocate for the further integration of analogy-based inference in tasks related to the KG lifecycle.
... results on NLP tasks when applied on character and word embeddings [23,27,43]. Such a reasoning has also been proposed for KGs [18,25,32,38,49], using KG embeddings, i.e., vector representations of KG entities and relations that preserve as much as possible the properties of the graph [5]. ...
... Analogybased inference has been used to solve hard reasoning tasks and has shown its potential with competitive results in several machine learning tasks such as classification, decision making and recommendation [9,12,13,17], in data augmentation through analogical extrapolation for model learning, especially in environments with few labeled examples [7,8]. Moreover, it has been successfully applied in classical natural language processing (NLP) tasks such as machine translation [21], several semantic [23,24] and morphological tasks [3,27,34], as well as in (visual) question answering [39], solving puzzles and scholastic aptitude tests [37], and target sense verification [50]. ...
... Model. We adopt the supervised machine learning model proposed by Lim et al. [23]. This model is presented in Figure 3 and relies on convolutional neural networks (CNNs). ...
Preprint
Full-text available
Knowledge Graph Construction (KGC) can be seen as an iterative process starting from a high quality nucleus that is refined by knowledge extraction approaches in a virtuous loop. Such a nucleus can be obtained from knowledge existing in an open KG like Wikidata. However, due to the size of such generic KGs, integrating them as a whole may entail irrelevant content and scalability issues. We propose an analogy-based approach that starts from seed entities of interest in a generic KG, and keeps or prunes their neighboring entities. We evaluate our approach on Wikidata through two manually labeled datasets that contain either domain-homogeneous or -heterogeneous seed entities. We empirically show that our analogy-based approach outperforms LSTM, Random Forest, SVM, and MLP, with a drastically lower number of parameters. We also evaluate its generalization potential in a transfer learning setting. These results advocate for the further integration of analogy-based inference in tasks related to the KG lifecycle.
... Furthermore, analogical inference can support data augmentation through analogical extension and extrapolation for model learning, especially in environments with few labeled examples [7]. Also, it has been successfully applied to several classical natural language processing (NLP) tasks such as machine translation [8], several semantic [9][10][11] and morphological tasks [12][13][14][15][16][17][18], (visual) question answering [19], solving puzzles and scholastic aptitude tests [20], and target sense verification (TSV) [21] which tackles disambiguation. ...
... Analogy detection on a single quadruple is less trivial than it may appear, as in many cases the boundary between valid and invalid analogies is not clearly defined. A solution is to use machine learning to learn the boundary from data, as Lim et al. implemented [10] for semantic word analogies and further explored in [11]. Using a dataset of semantic APs, they learn an artificial neural network to classify quadruples A, B, C, D into valid or invalid analogies, using pretrained word embeddings e_A, e_B, e_C, and e_D. ...
... Using a dataset of semantic APs, they learn an artificial neural network to classify quadruples A, B, C, D into valid or invalid analogies, using pretrained word embeddings e_A, e_B, e_C, and e_D. We follow a similar approach to morphological analogies in [13] by replacing the GloVe [35] semantic embeddings used in [10,11] with a morphology-oriented word embedding model. The classifier is detailed in Subsubsection 3.2.1 and Figure 1 under the name Analogy Neural Network for classification (ANNc). ...
Preprint
Full-text available
Analogical inference is a remarkable capability of human reasoning, and has been used to solve hard reasoning tasks. Analogy based reasoning (AR) has gained increasing interest from the artificial intelligence community and has shown its potential in multiple machine learning tasks such as classification, decision making and recommendation with competitive results. We propose a deep learning (DL) framework to address and tackle two key tasks in AR: analogy detection and solving. The framework is thoroughly tested on the Siganalogies dataset of morphological analogical proportions (APs) between words, and shown to outperform symbolic approaches in many languages. Previous work has explored the behavior of the Analogy Neural Network for classification (ANNc) on analogy detection and of the Analogy Neural Network for retrieval (ANNr) on analogy solving by retrieval, as well as the potential of an autoencoder (AE) for analogy solving by generating the solution word. In this article we summarize these findings and we extend them by combining ANNr and the AE embedding model, and checking the performance of ANNc as a retrieval method. The combination of ANNr and AE outperforms the other approaches in almost all cases, and ANNc as a retrieval method achieves competitive or better performance than 3CosMul. We conclude with general guidelines on using our framework to tackle APs with DL.
... Based on a formalism describing analogical proportions [6,7], Lim et al. [8] proposed a deep learning model to tackle such analogies using semantic embeddings. Unlike the previous works, the architecture is adapted to the characteristics of analogies and the model is trained using a dataset of analogies. ...
... To do so, the two crucial steps are the learning of an embedding particularly adapted to words seen as character strings, and the definition of a network adapted to the formal properties of analogy. Lim et al. [8] have shown the performance of a similar deep learning framework on pre-trained semantic word embeddings for analogies on word semantics, and we argue that the framework itself has the potential to be applied to analogies in a wide range of domains and applications. ...
... In particular, in a context of semantic analogies, Bayoudh et al. [15] proposed to use Kolmogorov complexity as a distance measure between words in order to define a conceptually significant analogical proportion and to classify valid or invalid analogies. A completely functional form was implemented by Lim et al. [8], who propose to learn the classifier directly in the form of a neural network. Analogy detection has been used in particular in a context of analogical grids [10], i.e., matrices of transformations of various words, similar to paradigm tables in linguistics [16]. ...
Preprint
Full-text available
Analogical proportions are statements of the form "A is to B as C is to D". They constitute an inference tool that provides a logical framework to address learning, transfer, and explainability concerns and that finds useful applications in artificial intelligence and natural language processing. In this paper, we address two problems, namely, analogy detection and resolution in morphology. Multiple symbolic approaches tackle the problem of analogies in morphology and achieve competitive performance. We show that it is possible to use a data-driven strategy to outperform those models. We propose an approach using deep learning to detect and solve morphological analogies. It encodes structural properties of analogical proportions and relies on a specifically designed embedding model capturing morphological characteristics of words. We demonstrate our model's competitive performance on analogy detection and resolution over multiple languages. We provide an empirical study to analyze the impact of balancing training data and evaluate the robustness of our approach to input perturbation.
... For example, analogies on words can refer exclusively to their morphology ("cats is to cat as trees is to tree") or their semantics ("kitten is to puppy as cat is to dog"). Assessing the correctness of an analogy A:B::C:D is a difficult task that has been tackled using both formal [7,10] and empirical approaches [8,11,12,13,14]. Although the original challenge is to find ...
... some structure from A, B, C, and D, recent empirical works propose data-oriented strategies based on machine learning to learn the correctness of analogies from past observations. In particular, Lim et al. [11] propose a deep learning approach to train models of analogies on corpora of semantic analogies and pretrained GloVe embeddings [15]. This approach achieves competitive results on analogy classification and completion. ...
... In this paper, we adapt the approach developed by Lim et al. [11] for semantic word analogies to morphological analogies. The major difference between the two approaches is that ours relies on a morphological word embedding, which had to be entirely developed and trained. ...
Preprint
Full-text available
Analogical proportions are statements of the form "A is to B as C is to D" that are used for several reasoning and classification tasks in artificial intelligence and natural language processing (NLP). For instance, there are analogy based approaches to semantics as well as to morphology. In fact, symbolic approaches were developed to solve or to detect analogies between character strings, e.g., the axiomatic approach as well as that based on Kolmogorov complexity. In this paper, we propose a deep learning approach to detect morphological analogies, for instance, with reinflexion or conjugation. We present empirical results that show that our framework is competitive with the above-mentioned state of the art symbolic approaches. We also explore empirically its transferability capacity across languages, which highlights interesting similarities between them.
... It departs from case-based reasoning [29]. Beyond different classical works on analogical reasoning such as [15,35,14,13,16], there has been a noticeable renewal of interest in analogical studies with a variety of approaches, ranging from reasoning [2] and machine learning [23,5] to word analogies [6,11,36,37,20,27] and natural language processing [19,12,34]. These approaches have in common that they deal with analogical proportions, i.e., statements of the form "a is to b as c is to d" relating 4 items a, b, c and d [30]. ...
... In fact, we classify a quadruple of sentences as a valid or invalid analogy. We tried two classical methods which have been successfully used for word analogy classification [20]: Random Forest (RF) and Convolutional Neural Networks (CNN). CNNs have been popular for image classification but have also been used for text classification, as they can extract and select important n-grams for classification [17]. ...
Chapter
Analogical proportions hold between 4 items a, b, c, d insofar as we can consider that “a is to b as c is to d”. Such proportions are supposed to obey postulates, from which one can derive Boolean or numerical models that relate vector-based representations of items making a proportion. One basic postulate is the preservation of the proportion by permuting the central elements b and c. However this postulate becomes debatable in many cases when items are words or sentences. This paper proposes a weaker set of postulates based on internal reversal, from which new Boolean and numerical models are derived. The new system of postulates is used to extend a finite set of examples in a machine learning perspective. By embedding a whole sentence into a real-valued vector space, we tested the potential of these weaker postulates for classifying analogical sentences into valid and non-valid proportions. It is advocated that identifying analogical proportions between sentences may be of interest especially for checking discourse coherence, question-answering, argumentation and computational creativity. The proposed theoretical setting backed with promising preliminary experimental results also suggests the possibility of crossing a real-valued embedding with an ontology-based representation of words. This hybrid approach might provide some insights to automatically extract analogical proportions in natural language corpora.
... Such quadruples capture similarities and dissimilarities between objects [16,17]. Here, given a seed entity e_s^u specified by the user and one of its neighbors e_r^u, our model predicts whether they form an analogy e_s^u : e_r^u :: e_s^k : e_r^k with a seed entity e_s^k and one of its neighbors e_r^k for which a "keep" decision is known. This prediction relies on the pre-learned embeddings of the entities and the convolutional model for analogy detection introduced by Lim et al. [13]. With its architecture, the analogy-based model is able to capture relative similarities and dissimilarities between seed entities and their neighbors to keep or to prune, and thus is able to generalize to heterogeneous unseen entities. ...
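A schematic rendering of this keep-or-prune decision, assuming a trained analogy classifier `annc` and an embedding lookup `emb` (both placeholders):

```python
# Schematic sketch of the keep-or-prune rule described above; `annc` (a trained
# analogy classifier returning a validity score) and `emb` (entity embeddings)
# are assumed given, and `labeled` holds (seed, neighbor, decision) triples.
def keep_or_prune(seed_u, neighbor_u, labeled, annc, emb, threshold=0.5):
    votes = {"keep": 0.0, "prune": 0.0}
    for seed_k, neighbor_k, decision in labeled:   # decision in {"keep", "prune"}
        score = annc(emb[seed_u], emb[neighbor_u], emb[seed_k], emb[neighbor_k])
        if score >= threshold:                     # analogous labeled pair found
            votes[decision] += score
    return max(votes, key=votes.get)               # ties default to "keep"
```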
Preprint
Full-text available
Knowledge graphs (KGs) have become ubiquitous publicly available knowledge sources, and are nowadays covering an ever increasing array of domains. However, not all knowledge represented is useful or pertaining when considering a new application or specific task. Also, due to their increasing size, handling large KGs in their entirety entails scalability issues. These two aspects ask for efficient methods to extract subgraphs of interest from existing KGs. To this aim, we introduce KGPrune, a Web Application that, given seed entities of interest and properties to traverse, extracts their neighboring subgraphs from Wikidata. To avoid topical drift, KGPrune relies on a frugal pruning algorithm based on analogical reasoning to only keep relevant neighbors while pruning irrelevant ones. The interest of KGPrune is illustrated by two concrete applications, namely, bootstrapping an enterprise KG and extracting knowledge related to looted artworks.
... Given the encouraging results achieved by analogical proportions in the field of Arabic text classification and summarization (Elayeb et al., 2020) as well as in the field of domain-specific information retrieval (Bounhas & Elayeb, 2019), it would be timely to explore the application of such tools in other related fields such as Arabic NLP or Arabic mono- and cross-language information retrieval (IR/CLIR) (Elayeb & Bounhas, 2021). Analogical learning and reasoning have already been applied for languages other than Arabic, in addition to the domains of NLP (Langlais & Yvon, 2014; Lavallée & Langlais, 2011; Lim et al., 2019) and IR/CLIR (Denoual, 2007; Langlais & Patry, 2007; Moreau et al., 2007). These works can form the basis for developing new analogical Arabic NLP, IR, and CLIR tools which are still open problems. ...
Article
Full-text available
Text classification is the process of labelling a given set of text documents with predefined classes or categories. Existing Arabic text classifiers are either applying classic Machine Learning algorithms such as k‐NN and SVM or using modern deep learning techniques. The former are assessed using small text collections and their accuracy is still subject to improvement while the latter are efficient in classifying big data collections and show limited effectiveness in classifying small corpora with a large number of categories. This paper proposes a new approach to Arabic text classification to treat small and large data collections while improving the classification rates of existing classifiers. We first demonstrate the ability of analogical proportions (AP) (statements of the form ‘x is to y as z is to t’), which have recently been shown to be effective in classifying ‘structured’ data, to classify ‘unstructured’ text documents requiring preprocessing. We design an analogical model to express the relationship between text documents and their real categories. Next, based on this principle, we develop two new analogical Arabic text classifiers. These rely on the idea that the category of a new document can be predicted from the categories of three others, in the training set, in case the four documents build together a ‘valid’ analogical proportion on all or on a large number of components extracted from each of them. The two proposed classifiers (denoted AATC1 and AATC2) differ mainly in terms of the keywords extracted for classification. To evaluate the proposed classifiers, we perform an extensive experimental study using five benchmark Arabic text collections with small or large sizes, namely ANT (Arabic News Texts) v2.1 and v1.1, BBC‐Arabic, CNN‐Arabic and AlKhaleej‐2004. We also compare analogical classifiers with both classical ML‐based and Deep Learning‐based classifiers. Results show that AATC2 has the best average accuracy (78.78%) over all other classifiers and the best average precision (0.77), followed by AATC1 (0.73), NB (0.73) and SVM (0.72), for the ANT corpus v2.1. Besides, AATC1 shows the best average precisions (0.88) and (0.92), respectively for the BBC‐Arabic corpus and AlKhaleej‐2004, and the best average accuracy (85.64%) for CNN‐Arabic over all other classifiers. Results demonstrate the utility of analogical proportions for text classification. In particular, the proposed analogical classifiers are shown to significantly outperform a number of existing Arabic classifiers, and in many cases, compare favourably to the robust SVM classifier.
... Note that the NN was trained for each category. In our previous work [31], the NN model was trained on all categories for the Google dataset, but we feel that it would be a fairer comparison with LRCos if the NN were trained for each category, as the relation in LRCos was extracted per category. ...
Article
Analogical proportions are statements of the form ‘a is to b as c is to d’, formally denoted a:b::c:d. They are the basis of analogical reasoning which is often considered as an essential ingredient of human intelligence. For this reason, recognizing analogies in natural language has long been a research focus within the Natural Language Processing (NLP) community. With the emergence of word embedding models, a lot of progress has been made in NLP, essentially assuming that a word analogy like man:king::woman:queen is an instance of a parallelogram within the underlying vector space. In this paper, we depart from this assumption to adopt a machine learning approach, i.e., learning a substitute of the parallelogram model. To achieve our goal, we first review the formal modeling of analogical proportions, highlighting the properties which are useful from a machine learning perspective. For instance, the postulates supposed to govern such proportions entail that when a:b::c:d holds, then seven permutations of a,b,c,d still constitute valid analogies. From a machine learning perspective, this provides guidelines to build training sets of positive and negative examples. Taking into account these properties for augmenting the set of positive and negative examples, we first implement word analogy classifiers using various machine learning techniques, then we approximate by regression an analogy completion function, i.e., a way to compute the missing word when we have the three other ones. Using a GloVe embedding, classifiers show very high accuracy when recognizing analogies, improving the state of the art on word analogy classification. Also, the regression processes usually lead to much more successful analogy completion than those derived from the parallelogram assumption.
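The contrast between the parallelogram model and its learned substitute can be sketched in a few lines of PyTorch; the small MLP below is an illustrative stand-in for the regression models of the paper:

```python
# Sketch of analogy completion d = f(a, b, c): the parallelogram model fixes
# f, while the learned alternative regresses f from valid proportions. The
# small MLP is an illustrative stand-in for the paper's regression models.
import torch
import torch.nn as nn

def parallelogram(ea, eb, ec):
    return eb - ea + ec                    # the classical completion rule

completion = nn.Sequential(                # learned substitute for it
    nn.Linear(3 * 50, 128), nn.ReLU(), nn.Linear(128, 50))

ea, eb, ec, ed = (torch.randn(32, 50) for _ in range(4))   # stand-in batch
pred = completion(torch.cat([ea, eb, ec], dim=-1))
loss = nn.functional.mse_loss(pred, ed)    # fit on valid quadruples
loss.backward()
# At test time, the predicted vector (or parallelogram(ea, eb, ec)) is mapped
# back to the vocabulary word with the nearest embedding.
```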
Chapter
Analogical proportions are statements of the form “A is to B as C is to D”. They support analogical inference and provide a logical framework to address learning, transfer, and explainability concerns. This logical framework finds useful applications in AI and natural language processing (NLP). In this paper, we address the problem of solving morphological analogies using a retrieval approach named ANNr. Our deep learning framework encodes structural properties of analogical proportions and relies on a specifically designed embedding model capturing morphological characteristics of words. We demonstrate that ANNr outperforms the state of the art on 11 languages. We analyze ANNr results for Navajo and Georgian, languages on which the model performs worst and best, to explore potential correlations between the mistakes of ANNr and linguistic properties. Keywords: Analogy solving, Neural networks, Retrieval, Morphological word embeddings
Chapter
Analogical proportions are statements expressed in the form “A is to B as C is to D” and are used for several reasoning and classification tasks in artificial intelligence and natural language processing (NLP). In this paper, we focus on morphological tasks and we propose a deep learning approach to detect morphological analogies. We present an empirical study to see how our framework transfers across languages, and that highlights interesting similarities and differences between these languages. In view of these results, we also discuss the possibility of building a multilingual morphological model. Keywords: Morphological analogy, Deep learning, Transferability, Analogy classification
Article
Full-text available
Significance: The ability to learn and make inferences based on relations is central to intelligence, underlying the distinctively human ability to reason by analogy across dissimilar situations. We have developed a computational model demonstrating that abstract relations, such as synonymy and antonymy, can be learned efficiently from semantic feature vectors for individual words and can be used to solve simple verbal analogy problems with close to human-level accuracy. The approach illustrates the potential synergy between deep learning from “big data” and supervised learning from “small data.” Core properties of high-level intelligence can emerge from relatively simple computations coupled with rich semantics. The model illustrates how operations on nonrelational inputs can give rise to protosymbolic relational representations.
Article
Full-text available
We propose a new technique for visual attribute transfer across images that may have very different appearance but have perceptually similar semantic structure. By visual attribute transfer, we mean transfer of visual information (such as color, tone, texture, and style) from one image to another. For example, one image could be that of a painting or a sketch while the other is a photo of a real scene, and both depict the same type of scene. Our technique finds semantically-meaningful dense correspondences between two input images. To accomplish this, it adapts the notion of "image analogy" [Hertzmann et al. 2001] with features extracted from a Deep Convolutional Neural Network for matching; we call our technique deep image analogy. A coarse-to-fine strategy is used to compute the nearest-neighbor field for generating the results. We validate the effectiveness of our proposed method in a variety of cases, including style/texture transfer, color/style swap, sketch/painting to photo, and time lapse.
Article
Full-text available
Many AI researchers and cognitive scientists have argued that analogy is the core of cognition. The most influential work on computational modeling of analogy-making is Structure Mapping Theory and its implementation in the Structure Mapping Engine (SME). A limitation of SME is the requirement for complex hand-coded representations. We introduce the Latent Relation Mapping Engine (LRME), which combines ideas from SME and Latent Relational Analysis in order to remove the requirement for hand-coded representations. LRME builds analogical mappings between lists of words, using a large corpus of raw text to automatically discover the semantic relations among the words. We evaluate LRME on a set of twenty analogical mapping problems, ten based on scientific analogies and ten based on common metaphors. LRME achieves human-level performance on the twenty problems. We compare LRME with a variety of alternative approaches and find that they are not able to reach the same level of performance.
Article
Full-text available
The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
Article
Full-text available
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.
Article
Full-text available
In this paper, we advocate a study of analogies between strings of symbols for their own sake. We show how some sets of strings, i.e., some formal languages, may be characterized by use of analogies. We argue that some preliminary “good properties” obtained may plead in favour of the use of analogy in the study of formal languages in relationship with natural language.
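On character strings, the simplest analogical equations can be solved by transposing a suffix rewrite, as in the sketch below; this is a naive illustration only, and the symbolic solvers in the literature handle far more general cases:

```python
# Naive illustration of solving a:b::c:? on character strings when a -> b is a
# pure suffix rewrite; real symbolic solvers handle far more general cases.
def solve_string_analogy(a, b, c):
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1                               # longest common prefix of a and b
    a_suffix, b_suffix = a[i:], b[i:]        # the rewrite a_suffix -> b_suffix
    if c.endswith(a_suffix):
        return c[:len(c) - len(a_suffix)] + b_suffix
    return None                              # rewrite not applicable to c

assert solve_string_analogy("read", "reader", "lecture") == "lecturer"
assert solve_string_analogy("cats", "cat", "trees") == "tree"
```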
Article
Full-text available
There are at least two kinds of similarity. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason:stone is analogous to the pair carpenter:wood. This article introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, and information retrieval. Recently the Vector Space Model (VSM) of information retrieval has been adapted to measuring relational similarity, achieving a score of 47% on a collection of 374 college-level multiple-choice word analogy questions. In the VSM approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. LRA extends the VSM approach in three ways: (1) The patterns are derived automatically from the corpus, (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data, and (3) automatically generated synonyms are used to explore variations of the word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the average human score of 57%. On the related problem of classifying semantic relations, LRA achieves similar gains over the VSM.
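Reduced to its core, the VSM step that LRA extends represents a word pair by the corpus frequencies of joining patterns and compares pairs by cosine; the sketch below uses toy pattern counts rather than real corpus statistics:

```python
# The core of the VSM approach that LRA extends: a word pair is represented by
# corpus frequencies of joining patterns, and two pairs are analogous when the
# vectors have high cosine. Patterns and counts are toy placeholders here.
import numpy as np

patterns = ["X cuts Y", "X works with Y", "X lives in Y"]

def relational_similarity(freq_ab, freq_cd):
    u, v = np.asarray(freq_ab, float), np.asarray(freq_cd, float)
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

mason_stone    = [12, 40, 0]     # toy counts, one per pattern
carpenter_wood = [15, 35, 1]
print(relational_similarity(mason_stone, carpenter_wood))   # close to 1.0
```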
Article
Full-text available
Many AI researchers and cognitive scientists have argued that analogy is the core of cognition. The most influential work on computational modeling of analogy-making is Structure Mapping Theory (SMT) and its implementation in the Structure Mapping Engine (SME). A limitation of SME is the requirement for complex hand-coded representations. We introduce the Latent Relation Mapping Engine (LRME), which combines ideas from SME and Latent Relational Analysis (LRA) in order to remove the requirement for hand-coded representations. LRME builds analogical mappings between lists of words, using a large corpus of raw text to automatically discover the semantic relations among the words. We evaluate LRME on a set of twenty analogical mapping problems, ten based on scientific analogies and ten based on common metaphors. LRME achieves human-level performance on the twenty problems. We compare LRME with a variety of alternative approaches and find that they are not able to reach the same level of performance.
Article
Full-text available
Recognizing analogies, synonyms, antonyms, and associations appear to be four distinct tasks, requiring distinct NLP algorithms. In the past, the four tasks have been treated independently, using a wide variety of algorithms. These four semantic classes, however, are a tiny sample of the full range of semantic phenomena, and we cannot afford to create ad hoc algorithms for each semantic phenomenon; we need to seek a unified approach. We propose to subsume a broad range of phenomena under analogies. To limit the scope of this paper, we restrict our attention to the subsumption of synonyms, antonyms, and associations. We introduce a supervised corpus-based machine learning algorithm for classifying analogous word pairs, and we show that it can solve multiple-choice SAT analogy questions, TOEFL synonym questions, ESL synonym-antonym questions, and similar-associated-both questions from cognitive psychology.
Conference Paper
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.
Article
Introduced a decade ago, analogy-based classification methods constitute a noticeable addition to the set of instance-based learning techniques. They provide valuable results in terms of accuracy on many classical datasets. They rely on the notion of analogical proportions which are statements of the form “A is to B as C is to D”. Analogical proportions have been in particular formalized in Boolean and numerical settings. In both cases, one of the four components of the proportion can be computed from the three others, when the proportion holds. Analogical classifiers look for all triples of examples in the sample set that are in analogical proportion with the item to be classified on a maximal number of attributes and for which the corresponding analogical proportion equation on the class has a solution. In this paper when classifying a new item, we specially emphasize an approach where the whole set of triples that can be built from the sample set is not considered. We just focus on a small part of the candidate triples. Namely, in order to restrict the scope of the search, we first look for examples that are as similar as possible to the new item to be classified. We then only consider the pairs of examples presenting the same dissimilarity as between the new item and one of its closest neighbors. In this way, we implicitly build triples that are in analogical proportion on all attributes with the new item. Then the classification is made on the basis of an additive aggregation of the truth values corresponding to the pairs that can be analogically associated with the pairs made of the target item and one of its nearest neighbors. We then only deal with pairs leading to a solvable analogical equation for the class. This new algorithm provides results as good as previous analogical classifiers with a lower average complexity, both in nominal and numerical cases.
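The general scheme can be condensed as follows over Boolean attributes; the exhaustive triple scan shown here is precisely what the paper's neighbor-restricted search is designed to avoid:

```python
# Compact sketch of analogical classification over Boolean attributes; the
# brute-force cubic scan below is what the neighbor-restricted search avoids.
def ap(a, b, c, d):
    """Boolean analogical proportion: a differs from b as c differs from d."""
    return (a - b) == (c - d)          # on {0,1}, matches the logical definition

def solve(a, b, c):
    """Solve a:b::c:x for x in {0,1}; None if the equation is unsolvable."""
    return next((x for x in (0, 1) if ap(a, b, c, x)), None)

def classify(item, sample):
    """sample: list of (attribute tuple, class in {0,1}) pairs."""
    votes = {}
    for a, ca in sample:
        for b, cb in sample:
            for c, cc in sample:
                if all(ap(x, y, z, t) for x, y, z, t in zip(a, b, c, item)):
                    pred = solve(ca, cb, cc)   # class equation on the triple
                    if pred is not None:
                        votes[pred] = votes.get(pred, 0) + 1
    return max(votes, key=votes.get) if votes else None
```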
Article
Recent trends suggest that neural-network-inspired word embedding models outperform traditional count-based distributional models on word similarity and analogy detection tasks. We reveal that much of the performance gains of word embeddings are due to certain system design choices and hyperparameter optimizations, rather than the embedding algorithms themselves. Furthermore, we show that these modifications can be transferred to traditional distributional models, yielding similar gains. In contrast to prior reports, we observe mostly local or insignificant performance differences between the methods, with no global advantage to any single approach over the others.
Article
In multi-class categorization tasks, knowledge about the classes' semantic relationships can provide valuable information beyond the class labels themselves. However, existing techniques focus on preserving the semantic distances between classes (e.g., according to a given object taxonomy for visual recognition), limiting the influence to pairwise structures. We propose to model analogies that reflect the relationships between multiple pairs of classes simultaneously, in the form "p is to q, as r is to s". We translate semantic analogies into higher-order geometric constraints called analogical parallelograms, and use them in a novel convex regularizer for a discriminatively learned label embedding. Furthermore, we show how to discover analogies from attribute-based class descriptions, and how to prioritize those likely to reduce inter-class confusion. Evaluating our Analogy-preserving Semantic Embedding (ASE) on two visual recognition datasets, we demonstrate clear improvements over existing approaches, both in terms of recognition accuracy and analogy completion.
Conference Paper
While continuous word embeddings are gaining popularity, current models are based solely on linear contexts. In this work, we generalize the skip-gram model with negative sampling introduced by Mikolov et al. to include arbitrary contexts. In particular, we perform experiments with dependency-based contexts, and show that they produce markedly different embeddings. The dependency-based embeddings are less topical and exhibit more functional similarity than the original skip-gram embeddings.
Article
Given a 4-tuple of Boolean variables (a, b, c, d), logical proportions are modeled by a pair of equivalences relating similarity indicators ($a \wedge b$ and $\overline{a} \wedge \overline{b}$), or dissimilarity indicators ($a \wedge \overline{b}$ and $\overline{a} \wedge b$) pertaining to the pair (a, b), to the ones associated with the pair (c, d). There are 120 semantically distinct logical proportions. One of them models the analogical proportion which corresponds to a statement of the form “a is to b as c is to d”. The paper inventories the whole set of logical proportions by dividing it into five subfamilies according to what they express, and then identifies the proportions that satisfy noticeable properties such as full identity (the pair of equivalences defining the proportion hold as true for the 4-tuple (a, a, a, a)), symmetry (if the proportion holds for (a, b, c, d), it also holds for (c, d, a, b)), or code independency (if the proportion holds for (a, b, c, d), it also holds for their negations $(\overline{a}, \overline{b}, \overline{c}, \overline{d})$). It appears that only four proportions (including analogical proportion) are homogeneous in the sense that they use only one type of indicator (either similarity or dissimilarity) in their definition. Due to their specific patterns, they have a particular cognitive appeal, and as such are studied in greater detail. Finally, the paper provides a discussion of the other existing works on analogical proportions.
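The homogeneous definition of analogical proportion, built from the dissimilarity indicators above, can be checked exhaustively; the enumeration below recovers the six Boolean valuations that make it true:

```python
# Exhaustive check of the (homogeneous) analogical proportion built from the
# dissimilarity indicators: (a ∧ ¬b ≡ c ∧ ¬d) ∧ (¬a ∧ b ≡ ¬c ∧ d).
from itertools import product

def analogy(a, b, c, d):
    return ((a and not b) == (c and not d)) and ((not a and b) == (not c and d))

print([t for t in product([0, 1], repeat=4) if analogy(*t)])
# -> [(0,0,0,0), (0,0,1,1), (0,1,0,1), (1,0,1,0), (1,1,0,0), (1,1,1,1)]
```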
Article
In this paper, we try to identify analogical proportions, i.e., statements of the form “a is to b as c is to d”, expressed in linguistic terms. While it is conceivable to use an algebraic model for testing proportions such as “2 is to 4 as 5 is to 10”, or even such as “read is to reader as lecture is to lecturer”, there is no algebraic framework to support statements such as “engine is to car as heart is to human” or “wine is to France as beer is to England”, helping to recognize them as meaningful analogical proportions. The idea is then to rely on text corpora, or even on the Web itself, where one may expect to find the pragmatics and the semantics of the words, in their common use. In that context, in order to attach a numerical value to the “analogical ratio” corresponding to the phrase “a is to b”, we start from the works of Kolmogorov on complexity theory. This is the basis for a universal measure of the information content of a word a, or of a word a with respect to another one b, which, in practice, is estimated in a statistical manner. We investigate the link between a purely logical, recently introduced view of analogical proportions and its counterpart based on Kolmogorov theory. The criteria proposed for testing candidate proportions fit with the expected properties (symmetry, central permutation) of analogical proportions. This leads to a new computational method to define, and ultimately to try to detect, analogical proportions in natural language. Experiments with classifiers based on these ideas are reported, and results are rather encouraging with respect to the recognition of common sense linguistic analogies. The approach is also compared with existing works on similar problems.
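Kolmogorov complexity itself is uncomputable, so any implementation must substitute an estimate; a common surrogate, sketched below with zlib, approximates information content by compressed length. The dissimilarity test here is an illustrative simplification, not the paper's exact criteria:

```python
# Kolmogorov complexity is uncomputable, so any implementation substitutes an
# estimate; here compressed length via zlib stands in for information content.
# The dissimilarity below is an illustrative simplification of the paper's
# criteria, not its exact method.
import zlib

def K(s):
    return len(zlib.compress(s.encode()))

def K_cond(x, given):
    return K(given + x) - K(given)     # extra cost of x once `given` is known

def analogical_dissimilarity(a, b, c, d):
    # "a is to b" and "c is to d" should carry the same conditional information
    return (abs(K_cond(b, a) - K_cond(d, c))
            + abs(K_cond(a, b) - K_cond(c, d)))

print(analogical_dissimilarity("read", "reader", "lecture", "lecturer"))
```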
Article
Analogical learning is a two-step inference process: (i) computation of a mapping between a new and a memorized situation; (ii) transfer from the known to the unknown situation. This approach requires the ability to search for and exploit such mappings, which are based on the notion of analogical proportions, hence the need to properly define these proportions and to efficiently implement their computation. In this paper, we propose a unified definition of analogical proportions, which applies to a wide range of algebraic structures. We show that this definition is suitable for learning in domains involving large databases of structured data, as is especially the case of many Natural Language Processing applications. We finally discuss some issues this approach raises and relate it to other instance-based learning schemes.
Article
An introduction for the general audience to the theories and theorists of information-processing concepts and computer simulation, and the psychological implications of this approach.
Article
A theory of analogical reasoning is proposed in which the elements of a set of concepts, e.g., animals, are represented as points in a multidimensional Euclidean space. Four elements A, B, C, D are in an analogical relationship A:B::C:D if the vector distance from A to B is the same as that from C to D. Given three elements A, B, C, an ideal solution point I for A:B::C:? exists. In a problem A:B::C:D_1, ..., D_i, ..., D_n, the probability of choosing D_i as the best solution is a monotonic decreasing function of the absolute distance of D_i from I. A stronger decision rule incorporating a negative exponential function in Luce's choice rule is also proposed. Both the strong and weak versions of the theory were supported in two experiments where subjects rank-ordered the alternatives in problems A:B::C:D_1, D_2, D_3, D_4. In a third experiment the theory was applied and further tested in teaching new concepts by analogy.
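The spatial model reduces to a few lines; I is the ideal solution point and the decay constant k of the negative exponential is a free parameter:

```python
# The spatial model in a few lines: I = C + (B - A) is the ideal solution
# point, and the "strong" rule turns distances to I into choice probabilities
# via Luce's rule with a negative exponential (decay k is a free parameter).
import numpy as np

def choice_probabilities(A, B, C, alternatives, k=1.0):
    I = C + (B - A)
    dists = np.array([np.linalg.norm(D - I) for D in alternatives])
    weights = np.exp(-k * dists)
    return weights / weights.sum()
```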
Conference Paper
This study relates to the assessment of the argument of the poverty of the stimulus in that we conducted a measure of the number of true proportional analogies between chunks in a language with case markers, Japanese. On a bicorpus of 20,000 sentences, we show that at least 96% of the analogies of form between chunks are also analogies of meaning, thus reporting the presence of at least two million true analogies between chunks in this corpus. As the number of analogies between chunks overwhelmingly surpasses the number of analogies between sentences by three orders of magnitude for this size of corpora, we conclude that proportional analogy is an efficient and undeniable structuring device between Japanese chunks.
Conference Paper
Analogical proportions are statements of the form “A is to B as C is to D” which play a key role in analogical reasoning. We propose a logical encoding of analogical proportions in a propositional setting, which is then extended to different fuzzy logics. Being in an analogical proportion is viewed as a quaternary connective relating four propositional variables. Interestingly enough, the fuzzy formalizations that are thus obtained parallel numerical models of analogical proportions. Potential applications to case-based reasoning and learning are outlined.
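One concrete graded reading of this idea, offered as an assumption for illustration rather than the paper's exact model: plugging Łukasiewicz connectives into an implication-based definition of the proportion yields a degree on [0,1], and on {0,1} it recovers exactly the six valid Boolean patterns:

```python
# One graded reading (an assumption for illustration): Lukasiewicz connectives
# plugged into an implication-based definition of the proportion,
#   a:b::c:d := ((a -> b) <-> (c -> d)) AND ((b -> a) <-> (d -> c)).
def implies(x, y):                 # Lukasiewicz implication
    return min(1.0, 1.0 - x + y)

def equiv(x, y):                   # Lukasiewicz biconditional
    return 1.0 - abs(x - y)

def ap_degree(a, b, c, d):
    return min(equiv(implies(a, b), implies(c, d)),
               equiv(implies(b, a), implies(d, c)))

print(ap_degree(0.2, 0.6, 0.3, 0.7))   # 1.0: the differences match
print(ap_degree(0.2, 0.6, 0.7, 0.3))   # 0.6: the differences are opposite
```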
Article
Analogy is a powerful boundary-transcending process that exploits a conceptual system’s ability to perform controlled generalization in one domain and re-specialization into another. The result of this semantic leap is the transference of meaning from one concept to another from which metaphor derives its name (literally: to carry over). Such generalization and re-specialization can be achieved using a variety of re-representation techniques, most notably abstraction via a taxonomic backbone, or selective projection via structure-mapping over propositional content. In this paper we explore both the extent to which a bilingual lexical ontology for English and Chinese, called HowNet, can support each technique, and the extent to which both are, ultimately, variations of the same process of creative re-representation.
Article
We have designed, implemented and assessed an EBMT system that can be dubbed the 'purest ever built': it strictly does not make any use of variables, templates or patterns, does not have any explicit transfer component, and does not require any preprocessing or training of the aligned examples. It only uses a specific operation, proportional analogy, that implicitly neutralises divergences between languages and captures lexical and syntactical variations along the paradigmatic and syntagmatic axes without explicitly decomposing sentences into fragments. Exactly the same genuine implementation of such a core engine was evaluated on different tasks and language pairs. To begin with, we compared our system on two tasks of a previous MT evaluation campaign to rank it among other current state-of-the-art systems. Then, we illustrated the 'universality' of our system by participating in a recent MT evaluation campaign, with exactly the same core engine, for a wide variety of language pairs. Finally, we studied the influence of extra data like dictionaries and paraphrases on the system performance.
Article
Saussure mentions an important phenomenon in language: analogy. Given a series of three words, human beings can coin a fourth one (Saussure, 1916). We present a linguistic structure analysis method using this type of analogy. In general, conventional parsing techniques have some difficulties (Briscoe, 1995). One is disambiguation, which means selecting a correct analysis from a large number of syntactically legitimate ones returned by a parser. Another is undergeneration, which means dealing with cases of input outside of a system's lexical or syntactic coverage. Overall, these problems are caused by a lack of linguistic knowledge installed in parsers. Recently, for the purpose of using information included in raw or tagged text, example-based methods utilizing probabilities for grammar rules (Fujisaki et al., 1989), semantic similarity (Sumita and Iida, 1991), and so on have been proposed. Our method, which is one of the example-based approaches,
Yvon, F., Stroppa, N., Delhay, A., Miclet, L.: Solving analogical equations on words. Technical report, Ecole Nationale Supérieure des Télécommunications (2004)
Prade, H., Richard, G.: From analogical proportion to logical proportions. Logica Universalis 7, 441-505 (2013)