Peter David TurneyRonin Institute
Peter David Turney
Ph.D.
About
118
Publications
69,384
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
20,605
Citations
Introduction
I am a scientist who works in the areas of computational linguistics, machine learning, artificial life, evolutionary computation, philosophy, and mathematics.
Additional affiliations
September 1989 - present
Publications
Publications (118)
Conway's Game of Life is the best-known cellular automaton. It is a classic model of emergence and self-organization, it is Turing-complete, and it can simulate a universal constructor. The Game of Life belongs to the set of semi-totalistic cellular automata, a family with 262,144 members. Many of these automata may deserve as much attention as the...
Recently we introduced a model of symbiosis, Model-S, based on the evolution of seed patterns in Conway's Game of Life. In the model, the fitness of a seed pattern is measured by one-on-one competitions in the Immigration Game, a two-player variation of the Game of Life. Our previous article showed that Model-S can serve as a highly abstract, simpl...
Biological and cultural evolution show a trend towards increasing hierarchical organization, in which entities at one level combine cooperatively to form a new entity at a higher level of organization. In each case where such a cooperative transition has been studied, we have some understanding of how the transition came about, but it is difficult...
Conway's Game of Life (GoL) is the best-known cellular automaton. It is a classic model of emergence and self-organization, it is Turing-complete, and it can simulate a universal constructor. GoL belongs to the set of semi-totalistic cellular automata, a family with 262,144 members. In such a large family, what makes GoL stand out? Packard and Wolf...
Recently we introduced a model of symbiosis, Model-S, based on the evolution of seed patterns in Conway's Game of Life. In the model, the fitness of a seed pattern is measured by one-on-one competitions in the Immigration Game, a two-player variation of the Game of Life. This article examines the role of autopoiesis in determining fitness in Model-...
We present a computational simulation of evolving entities that includes symbiosis with shifting levels of selection. Evolution by natural selection shifts from the level of the original entities to the level of the new symbiotic entity. In the simulation, the fitness of an entity is measured by a series of one-on-one competitions in the immigratio...
The Immigration Game (invented by Don Woods in 1971) extends the solitaire Game of Life (invented by John Conway in 1970) to enable two-player competition. The Immigration Game can be used in a model of evolution by natural selection, where fitness is measured with competitions. The rules for the Game of Life belong to the family of semitotalistic...
We introduce a dataset for studying the evolution of words, constructed from WordNet and the Google Books Ngram Corpus. The dataset tracks the evolution of 4,000 synonym sets (synsets), containing 9,000 English words, from 1800 AD to 2000 AD. We present a supervised learning algorithm that is able to predict the future leader of a synset: the word...
Maynard Smith and Szathm\'ary's book, The Major Transitions in Evolution, describes eight major events in the evolution of life on Earth and identifies a common theme that unites these events. In each event, smaller entities came together to form larger entities, which can be described as symbiosis or cooperation. Here we present a computational si...
We introduce a dataset for studying the evolution of words, constructed from WordNet and the Google Books Ngram Corpus. The dataset tracks the evolution of 4,000 synonym sets (synsets), containing 9,000 English words, from 1800 AD to 2000 AD. We present a supervised learning algorithm that is able to predict the future leader of a synset: the word...
Evolution by natural selection can be seen an algorithm for generating creative solutions to difficult problems. More precisely, evolution by natural selection is a class of algorithms that share a set of properties. The question we address here is, what are the conditions that define this class of algorithms? There is a standard answer to this que...
While open-domain question answering (QA) systems have proven effective for answering simple questions, they struggle with more complex questions. Our goal is to answer more complex questions reliably, without incurring a significant cost in knowledge resource construction to support the QA. One readily available knowledge resource is a term bank,...
While open-domain question answering (QA) systems have proven effective for answering simple questions, they struggle with more complex questions. Our goal is to answer more complex questions reliably, without incurring a significant cost in knowledge resource construction to support the QA. One readily available knowledge resource is a term bank,...
Given recent successes in AI (e.g., AlphaGo's victory against Lee Sedol in the game of GO), it's become increasingly important to assess: how close are AI systems to human-level intelligence? This paper describes the Allen AI Science Challenge---an approach towards that goal which led to a unique Kaggle Competition, its results, the lessons learned...
What capabilities are required for an AI system to pass standard 4th Grade Science Tests? Previous work has examined the use of Markov Logic Networks (MLNs) to represent the requisite background knowledge and interpret test questions, but did not improve upon an information retrieval (IR) baseline. In this paper, we describe an alternative approach...
We describe two new related resources that facilitate modelling of general knowledge reasoning in 4th grade science exams. The first is a collection of curated facts in the form of tables, and the second is a large set of crowd-sourced multiple-choice questions covering the facts in the tables. Through the setup of the crowd-sourced annotation task...
It is generally believed that a metaphor tends to have a stronger emotional impact than a literal statement; however, there is no quantitative study establishing the extent to which this is true. Further, the mechanisms through which metaphors convey emotions are not well understood. We present the first data-driven study comparing the emotionality...
The importance of a research article is routinely measured by counting how many times it has been cited. However, treating all citations with equal weight ignores the wide variety of functions that citations perform. We want to automatically identify the subset of references in a bibliography that have a central academic influence on the citing pap...
Semantic composition is the task of understanding the meaning of text by
composing the meanings of the individual words in the text. Semantic
decomposition is the task of understanding the meaning of an individual word by
decomposing it into various aspects (factors, constituents, components) that
are latent in the meaning of the word. We take a di...
Inference in natural language often involves recognizing lexical entailment
(RLE); that is, identifying whether one word entails another. For example,
"buy" entails "own". Two general strategies for RLE have been proposed: One
strategy is to manually construct an asymmetric similarity measure for context
vectors (directional similarity) and another...
There have been several efforts to extend distributional semantics beyond
individual words, to measure the similarity of word pairs, phrases, and
sentences (briefly, tuples; ordered sets of words, contiguous or
noncontiguous). One way to extend beyond words is to compare two tuples using a
function that combines pairwise similarities between the co...
Given appropriate representations of the semantic relations between carpenter and wood and between mason and stone (for example, vectors in a vector space model), a suitable algorithm should be able to recognize that these relations are highly similar (carpenter is to wood as mason is to stone; the relations are analogous). Likewise, with represent...
Even though considerable attention has been given to the polarity of words
(positive and negative) and the creation of large polarity lexicons, research
in emotion analysis has had to rely on limited and small emotion lexicons. In
this paper we show how the combined strength and wisdom of the crowds can be
used to generate a large, high-quality, wo...
Knowing the degree of semantic contrast between words has widespread
application in natural language processing, including machine translation,
information retrieval, and dialogue systems. Manually-created lexicons focus on
opposites, such as {\rm hot} and {\rm cold}. Opposites are of many kinds such
as antipodals, complementaries, and gradable. Ho...
Words are associated with emotions. For example, delightful and yummy indicate the emotion of joy, gloomy and cry are indicative of sadness, shout and boiling are indicative of anger, and so on. In this paper, we present a large word-emotion association lexicon we created through a massive online annotation project.
It has been argued that analogy is the core of cognition. In AI research,
algorithms for analogy are often limited by the need for hand-coded high-level
representations as input. An alternative approach is to use high-level
perception, in which high-level representations are automatically generated
from raw data. Analogy perception is the process o...
The idea that language mediates our thoughts and enables abstract cognition has been a key idea in socio-cultural psychology. However, it is not clear what mechanisms support this process of abstraction. Peirce argued that one mechanism by which language enables abstract thought is hypostatic abstraction, the process through which a predicate (e.g....
Metaphor is ubiquitous in text, even in highly technical text. Correct inference about textual entailment requires computers to distinguish the literal and metaphorical senses of a word. Past work has treated this problem as a classical word sense disambiguation task. In this paper, we take a new approach, based on research in cognitive linguistics...
Knowing the degree of semantic contrast, or oppositeness, between words has widespread application in natural language processing, including machine translation, and information retrieval. Manually-created lexicons focus on strict opposites, such as antonyms, and have limited coverage. On the other hand, only a few automatic approaches have been pr...
Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys...
Even though considerable attention has been given to semantic orientation of words and the creation of large polarity lexicons, research in emotion analysis has had to rely on limited and small emotion lexicons. In this paper, we show how we create a high-quality, moderate-sized emotion lexicon using Mechanical Turk. In addition to questions about...
The NLP community has shown a renewed interest in deeper semantic analyses, among them automatic recognition of semantic relations in text. We present the development and evaluation of a semantic analysis task: automatic recognition of relations between pairs of nominals in a sentence. The task was part of SemEval-2007, the fourth edition of the se...
Many AI researchers and cognitive scientists have argued that analogy is the core of cognition. The most influential work on computational modeling of analogy-making is Structure Mapping Theory (SMT) and its implementation in the Structure Mapping Engine (SME). A limitation of SME is the requirement for complex hand-coded representations. We introd...
Recognizing analogies, synonyms, antonyms, and associations appear to be four distinct tasks, requiring distinct NLP algorithms. In the past, the four tasks have been treated independently, using a wide variety of algorithms. These four semantic classes, however, are a tiny sample of the full range of semantic phenomena, and we cannot afford to cre...
Many AI researchers and cognitive scientists have argued that analogy is the core of cognition. The most influential work on computational modeling of analogy-making is Structure Mapping Theory and its implementation in the Structure Mapping Engine (SME). A limitation of SME is the requirement for complex hand-coded representations. We introduce th...
Higher-order tensor decompositions are analogous to the familiar Singular Value Decomposition (SVD), but they transcend the limitations of matrices (second-order tensors). SVD is a powerful tool that has achieved impressive results in information retrieval, collaborative filtering, computational linguistics, computational vision, and other fields....
The NLP community has shown a renewed interest in deeper semantic analyses, among them automatic recognition of relations between pairs of words in a text. We present an evaluation task designed to provide a framework for comparing different approaches to classifying semantic relations between nominals in a sentence. This is part of SemEval, the 4t...
The NLP community has shown a renewed interest in deeper semantic analyses, among them automatic recognition of relations between pairs of words in a text. We present an evaluation task designed to provide a framework for comparing different approaches to classifying semantic relations between nominals in a sentence. This is part of SemEval, the 4t...
There are at least two kinds of similarity. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we sa...
We present an unsupervised learning algorithm that mines large text corpora for patterns that express implicit semantic relations. For a given input word pair X:Y with some unspecified semantic relations, the corresponding output list of patterns <P1,...,Pm> is ranked according to how well each pattern Pi expresses the relations between X and Y. Fo...
In this paper, we propose a named-entity recognition (NER) system that addresses two major limitations frequently discussed in the field. First, the system requires no human intervention such as manually labeling training data or creating gazetteers. Second, the system can handle more than the three classical named-entity types (person, location, a...
It has been argued that a central objective of nanotechnology is to make products inexpensively, and that self-replication is an effective approach to very low-cost manufacturing. The research presented here is intended to be a step towards this vision. We describe a computational simulation of nanoscale machines floating in a virtual liquid. The m...
In this position paper, we propose a first step toward automatic analysis of sentiments in dreams. 100 dreams were sampled from a dream bank created for a normative study of dreams. Two human judges assigned a score to describe dream sentiments. We ran four baseline algorithms in an attempt to automate the rating of sentiments in dreams. Particular...
We present an algorithm for learning from unlabeled text, based on the Vector Space Model (VSM) of information retrieval, that can solve verbal analogy questions of the kind found in the SAT college entrance exam. A verbal analogy has the form A:B::C:D, meaning "A is to B as C is to D"; for example, mason:stone::carpenter:wood. SAT analogy question...
This paper introduces Latent Relational Analysis (LRA), a method for measuring semantic similarity. LRA measures similarity in the semantic relations between two pairs of words. When two pairs have a high degree of relational similarity, they are analogous. For example, the pair cat:meow is analogous to the pair dog:bark. There is evidence from cog...
This paper addresses the task of finding acronym-definition pairs in text. Most of the previous work on the topic is about systems that involve manually generated rules or regular expressions. In this paper, we present a supervised learning approach to the acronym identification task. Our approach reduces the search space of the supervised learning...
It has been argued that a central objective of nanotechnology is to make products inexpensively, and that self-replication is an effective approach to very low-cost manufacturing. The research presented here is intended to be a step towards this vision. In previous work (JohnnyVon 1.0), we simulated machines that bonded together to form self-replic...
Existing statistical approaches to natural language problems are very coarse approximations to the true complexity of language processing. As such, no single technique will be best for all problem instances. Many researchers are examining ensemble methods that combine the output of multiple modules to create more accurate solutions. This paper exam...
This paper introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, machine translation, and information retrieval. Relational similarity is correspondence between relations, in contrast with attributional sim...
This paper describes the National Research Council (NRC) Word Sense Disambiguation (WSD) system, as applied to the English Lexical Sample (ELS) task in Senseval-3. The NRC system approaches WSD as a classical supervised machine learning problem, using familiar tools such as the Weka machine learning software and Brill's rule-based part-of-speech ta...
This paper introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, machine translation, and information retrieval.Relational similarity is correspondence between relations, in contrast with attributional simi...
The evaluative character of a word is called its semantic orientation.
Positive semantic orientation indicates praise (e.g., "honest",
"intrepid") and negative semantic orientation indicates criticism
(e.g., "disturbing", "superfluous"). Semantic orientation
varies in both direction (positive or negative) and degree (mild
to strong). An automated s...
Existing statistical approaches to natural language problems are very coarse approximations to the true complexity of language processing. As such, no single technique will be best for all problem instances. Many researchers are examining ensemble methods that combine the output of successful, separately developed modules to create more accurate so...
Keyphrases are useful for a variety of purposes, including summarizing, indexing, labeling, categorizing, clustering, highlighting, browsing, and searching. The task of automatic keyphrase extraction is to select keyphrases from within the text of a given document. Automatic keyphrase extraction makes it feasible to generate keyphrases for the huge...
We present an algorithm for learning from unlabeled text, based on the Vector Space Model (VSM) of information retrieval, that can solve verbal analogy questions of the kind found in the Scholastic Aptitude Test (SAT). A verbal analogy has the form A:B::C:D, meaning "A is to B as C is to D"; for example, mason:stone::carpenter:wood. SAT analogy que...
JohnnyVon is an implementation of self-replicating machines in continuous two-dimensional space. Two types of particles drift about in a virtual liquid. The particles are automata with discrete internal states but continuous external relationships. Their internal states are governed by finite state machines, but their external relationships are gov...
The evaluative character of a word is called its semantic orientation. A positive semantic orientation implies desirability (e.g., "honest", "intrepid") and a negative semantic orientation implies undesirability (e.g., "disturbing", "superfluous"). This paper introduces a simple algorithm for unsupervised learning of semantic orientation from extre...
JohnnyVon is an implementation of self-replicating automata in continuous two-dimensional space. Two types of particles drift about in a virtual liquid. The particles are automata with discrete internal states but continuous external relationships. Their internal states are governed by finite state machines but their external relationships are gove...
Keyphrases are useful for a variety of purposes, including summarizing, indexing, labeling, categorizing, clustering, highlighting, browsing, and searching. The task of automatic keyphrase extraction is to select keyphrases from within the text of a given document. Automatic keyphrase extraction makes it feasible to generate keyphrases for the huge...
We present an algorithm for learning from unlabeled text, based on the Vector SpaceModel (VSM) of information retrieval, that can solve verbal analogy questions of the kind found in the Scholastic Aptitude Test (SAT). A verbal analogy has the form A:B::C:D, meaning 'A is to B as C is to D'; for example, mason:stone::carpenter:wood. SAT analogy ques...
Evolvability is the capacity to evolve. This paper introduces a simple computational model of evolvability and demonstrates that, under certain conditions, evolvability can increase indefinitely, even when there is no direct selection for evolvability. The model shows that increasing evolvability implies an accelerating evolutionary pace. It is sug...
A large body of research in machine learning is concerned with supervised learning from examples. The examples are typically represented as vectors in a multi-dimensional feature space (also known as attribute-value descriptions). A teacher partitions a set of training examples into a finite number of classes. The task of the learning algorithm is...
This paper addresses the problem of classifying observations when features are context-sensitive, specifically when the testing set involves a context that is different from the training set. The paper begins with a precise definition of the problem, then general strategies are presented for enhancing the performance of classification algorithms on...
This paper addresses the problem of classifying observations when features are context-sensitive, especially when the testing set involves a context that is different from the training set. The paper begins with a precise definition of the problem, then general strategies are presented for enhancing the performance of classification algorithms on t...
In this paper, we review five heuristic strategies for handling context-sensitive features in supervised machine learning from examples. We discuss two methods for recovering lost (implicit) contextual information. We mention some evidence that hybrid strategies can have a synergetic effect. We then show how the work of several machine learning res...
We have analyzed manufacturing data from several different semiconductor manufacturing plants, using decision tree induction software called Q-YIELD. The software generates rules for predicting when a given product should be rejected. The rules are intended to help the process engineers improve the yield of the product, by helping them to discover...
The Inductive Logic Programming community has considered proof-complexity and model-complexity, but, until recently, size-complexity has received little attention. Recently a challenge was issued "to the international computing community" to discover low size-complexity Prolog programs for classifying trains. The challenge was based on a problem fi...
Inductive concept learning is the task of learning to assign cases to a discrete set of classes. In real-world applications of concept learning, there are many different types of cost involved. The majority of the machine learning literature ignores all types of cost (unless accuracy is interpreted as a type of cost measure). A few papers have inve...
This position paper argues that the Baldwin effect is widely misunderstood by the evolutionary computation community. The misunderstandings appear to fall into two general categories. Firstly, it is commonly believed that the Baldwin effect is concerned with the synergy that results when there is an evolving population of learning individuals. This...
This paper presents a theory of error in cross-validation testing of algorithms for predicting real-valued attributes. The theory justifies the claim that predicting real-valued attributes requires balancing the conflicting demands of simplicity and accuracy. Furthermore, the theory indicates precisely how these conflicting demands must be balanced...
This paper begins with a general theory of error in cross-validation testing of algorithms for supervised learning from examples. It is assumed that the examples are described by attribute-value pairs, where the values are symbolic. Cross-validation requires a set of training examples and a set of testing examples. The value of the attribute that i...
This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not rec- ommended (thumbs down). The classifi- cation of a review is predicted by the average semantic orientation of the phrases in the review that contain adjec- tives or adverbs. A phrase has a positive semantic orientation when it...
This paper presents a simple unsupervised learning algorithm for recognizing synonyms, based on statistical data acquired by querying a Web search engine. The algorithm, called PMI-IR, uses Pointwise Mutual Information (PMI) and Information Retrieval (IR) to measure the similarity of pairs of words. PMI-IR is empirically evaluated using 80 synonym...
Robert French has argued that a disembodied computer is incapable of passing a Turing Test that includes subcognitive questions. Subcognitive questions are designed to probe the network of cultural and perceptual associations that humans naturally develop as we live, embodied and embedded in the world. In this paper, I show how it is possible for a...
Keyphrases are useful for a variety of purposes, including summarizing, indexing, labeling, categorizing, clustering, highlighting, browsing, and searching. The task of automatic keyphrase extraction is to select keyphrases from within the text of a given document. Automatic keyphrase extraction makes it feasible to generate keyphrases for the huge...
Many academic journals ask their authors to provide a list of about five to fifteen key words, to appear on the first page of each article. Since these key words are often phrases of two or more words, we prefer to call them keyphrases. There is a surprisingly wide variety of tasks for which keyphrases are useful, as we discuss in this paper. Recen...
This report presents an empirical evaluation of four algorithms for automatically extracting keywords and keyphrases from documents. The four algorithms are compared using five different collections of documents. For each document, we have a target set of keyphrases, which were generated by hand. The target keyphrases were generated for human reade...
This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). The classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs. A phrase has a positive semantic orientation when it has go...
This paper introduces ICET, a new algorithm for cost-sensitive classification. ICET uses a genetic algorithm to evolve a population of biases for a decision tree induction algorithm.
Many academic journals ask their authors to provide a list of about five to fifteen keywords, to appear on the first page of each article. Since these key words are often phrases of two or more words, we prefer to call them keyphrases. There is a wide variety of tasks for which keyphrases are useful, as we discuss in this paper. We approach the pro...
The evaluative character of a word is called its semantic orientation. A positive semantic orientation implies desirability (e.g., "honest", "intrepid") and a negative semantic orientation implies undesirability (e.g., "disturbing", "superfluous"). This paper introduces a simple algorithm for unsupervised learning of semantic orientation from extre...
JohnnyVon is an implementation of self-replicating automata in continuous twodimensional space. Two types of particles drift about in a virtual liquid. The particles are automata with discrete internal states but continuous external relationships. Their internal states are governed by finite state machines but their external relationships are gover...
This paper presents a simple unsupervised learning algorithm for recognizing synonyms, based on statistical data acquired by querying a Web search engine. The algorithm, called PMI-IR, uses Pointwise Mutual Information (PMI) and Information Retrieval (IR) to measure the similarity of pairs of words. PMI-IR is empirically evaluated using 80 synonym...
This paper presents a simple unsupervised learning algorithm for recognizing synonyms, based on statistical data acquired by querying a Web search engine. The algorithm, called PMI-IR, uses Pointwise Mutual Information (PMI) and Information Retrieval (IR) to measure the similarity of pairs of words. PMI-IR is empirically evaluated using 80 synonym...
The idea that there are any large-scale trends in the evolution of biological organisms is highly controversial. It is commonly believed, for example, that there is a large-scale trend in evolution towards increasing complexity, but empirical and theoretical arguments undermine this belief. Natural selection results in organisms that are well adapt...
Diagnosing faults in aircraft gas turbine engines is a complex problem. It involves several tasks, including rapid and accurate interpretation of patterns in engine sensor data. We have investigated contextual normalization for the development of a software tool to help engine repair technicians with interpretation of sensor data. Contextual normal...
. This study is concerned with whether it is possible to detect what information contained in the training data and background knowledge is relevant for solving the learning problem, and whether irrelevant information can be eliminated in preprocessing before starting the learning process. A case study of data preprocessing for a hybrid genetic alg...
This study is concerned with whether it is possible to detect what information contained in the training data and background knowledge is relevant for solving the learning problem, and whether irrelevant information can be eliminated in preprocessing before starting the learning process. A case study of data preprocessing for a hybrid genetic algor...
Many academic journals ask their authors to provide a list of about five to fifteen keywords, to appear on the first page of each article. Since these key words are often phrases of two or more words, we prefer to call them keyphrases. There is a wide variety of tasks for which keyphrases are useful, as we discuss in this paper. We approach the pro...