About
145 Publications
31,087 Reads
5,907 Citations
Introduction
Never-Ending Learning Question Answering Systems
https://megagon.ai/nelqa
Publications (145)
Large Language Models (LLMs) have demonstrated remarkable performance on various tasks, yet their ability to extract and internalize deeper insights from domain-specific datasets remains underexplored. In this study, we investigate how continual pre-training can enhance LLMs' capacity for insight learning across three distinct forms: declarative, s...
Large Language Models (LLMs) have shown impressive capability in language generation and understanding, but their tendency to hallucinate and produce factually incorrect information remains a key limitation. To verify LLM-generated content and claims from other sources, traditional verification approaches often rely on holistic models that assign...
The domain of human resources (HR) includes a broad spectrum of tasks related to natural language processing (NLP) techniques. Recent breakthroughs in NLP have generated significant interest in its industrial applications in this domain, with the potential to alleviate challenges such as the difficulty of resource acquisition and the complexity of problem...
Although many studies have investigated and reduced hallucinations in large language models (LLMs) for single-document tasks, hallucination in multi-document summarization (MDS) tasks remains largely unexplored. Specifically, it is unclear how the challenges arising from handling multiple documents (e.g., repetition and diversity of inf...
Large Language Models (LLMs) have showcased remarkable capabilities surpassing conventional NLP challenges, creating opportunities for adoption in production use cases. Towards this goal, there is a notable shift to building compound AI systems, wherein LLMs are integrated into an expansive software infrastructure with many components like models, retri...
Symbolic knowledge graphs (KGs) play a pivotal role in knowledge-centric applications such as search, question answering and recommendation. As contemporary language models (LMs) trained on extensive textual data have gained prominence, researchers have extensively explored whether the parametric knowledge within these models can match up to that p...
Large Language Models (LLMs) have demonstrated remarkable capabilities in various NLP tasks. However, previous works have shown that these models are sensitive to prompt wording, few-shot demonstrations, and their order, posing challenges to fair assessment of these models. As these models become more powerful, it becomes imperative to understan...
Human-centered AI workflows involve stakeholders with multiple roles interacting with each other and automated agents to accomplish diverse tasks. In this paper, we call for a holistic view when designing support mechanisms, such as interaction paradigms, interfaces, and systems, for these multifaceted workflows.
We present MEGAnno, a novel exploratory annotation framework designed for NLP researchers and practitioners. Unlike existing labeling tools that focus on data labeling only, our framework aims to support a broader, iterative ML workflow including data exploration and model development. With MEGAnno's API, users can programmatically explore the data...
Triplet extraction aims to extract entities and their corresponding relations in unstructured text. Most existing methods train an extraction model on high-quality training data, and hence are incapable of extracting relations that were not observed during training. Generalizing the model to unseen relations typically requires fine-tuning on synthe...
Despite rapid developments in the field of machine learning research, collecting high-quality labels for supervised learning remains a bottleneck for many applications. This difficulty is exacerbated by the fact that state-of-the-art models for NLP tasks are becoming deeper and more complex, often increasing the amount of training data required eve...
We present Ward2ICU, a proxy dataset of vital signs with class labels indicating patient transitions from the ward to intensive care units. Patient privacy is protected using a Wasserstein Generative Adversarial Network to implicitly learn an approximation of the data distribution, allowing us to sample synthetic data. The quality of data gen...
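As a rough illustration of the generative approach described above, the sketch below shows the core Wasserstein GAN losses in PyTorch; the network sizes, input shapes, and training details are assumptions made for illustration, not the actual Ward2ICU architecture.

# Minimal sketch of the WGAN idea behind a synthetic vital-signs generator.
# Shapes and layer sizes are hypothetical placeholders.
import torch
import torch.nn as nn

latent_dim, seq_len, n_signals = 32, 48, 4  # assumed dimensions

generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, seq_len * n_signals),
)
critic = nn.Sequential(
    nn.Linear(seq_len * n_signals, 128), nn.ReLU(),
    nn.Linear(128, 1),  # unbounded score, not a probability
)

def wasserstein_losses(real_batch, z):
    """Critic maximizes score(real) - score(fake); generator maximizes score(fake)."""
    fake_batch = generator(z)
    critic_loss = -(critic(real_batch).mean() - critic(fake_batch.detach()).mean())
    gen_loss = -critic(fake_batch).mean()
    return critic_loss, gen_loss

# Usage idea: alternate critic and generator updates, then sample synthetic
# records with generator(torch.randn(n, latent_dim)) once training converges.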
The Internet and the social Web have made it possible to acquire information to feed a growing number of Machine Learning (ML) applications and, in addition, have brought attention to the use of crowdsourcing approaches, commonly applied to problems that are easy for humans but difficult for computers to solve, giving rise to crowd-powered systems. In this work, we c...
With advances in machine learning, natural language processing, processing speed, and the amount of data storage, conversational agents are being used in applications that were not possible a few years ago. NELL, a machine learning agent that learns to read the web, today has a considerably large ontology and, while it can be used for multi...
This paper describes the process of automatically identifying concepts in different languages using an approach that relies on simple semantic and morphosyntactic characteristics, such as string similarity, difference in word count and translation position in a dictionary (when one exists), and a neural network that has been used as a model of machine learning...
NELL is a system that continuously reads the Web to extract knowledge in the form of entities and relations between them. It has been running since January 2010 and extracted over 450 million candidate statements, 28 million of which remain in iteration 1100. NELL's generated data comprises all the candidate statements, together with detailed metad...
Whereas people learn many different types of knowledge from diverse experiences over many years, and become better learners over time, most current machine learning systems are much more narrow, learning just a single function or data model based on statistical analysis of a single data set. We suggest that people learn better than computers precis...
NELL is a system that continuously reads the Web to extract knowledge in the form of entities and relations between them. It has been running since January 2010 and has extracted over 50,000,000 candidate statements. NELL's generated data comprises all the candidate statements together with detailed information about how they were generated. This information...
The growing use of mobile and wearable devices has opened a broad range of possibilities for building robust Intelligent Personal Assistants (IPAs). Most known IPA implementations have predominantly focused on understanding what the user's immediate intention is; however, there is no evidence of trying to understand the user's context and inten...
One of the challenges for never-ending language learning (NELL) systems is to properly identify different noun phrases that denote the same concept in order to maintain the cohesion of their knowledge base. This paper investigates coupling as an approach to improve coreference resolution in NELL. The obtained results suggest that coupling is a...
Enhancements in the ways a conversational agent can participate in human-computer dialog tasks, as well as in the ability of such agents to answer questions, have been explored in different research areas in recent years. But some limitations are still present in most of the current conversational agents’ models. In this paper we present a new forma...
A new paradigm of Machine Learning named Never-Ending Learning has been proposed through a system known as NELL (Never-Ending Language Learning). The major idea of this system is to learn to read the web better each day and to store the gathered knowledge in a knowledge base (KB), continually and incrementally. This paper proposes a new method that...
The Workshop Program of the Association for the Advancement of Artificial Intelligence’s Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) was held at the beginning of the conference, February 12-13, 2016. Workshop participants met and discussed issues with a selected focus — providing an informal setting for active exchange among rese...
The Workshop Program of the Association for the Advancement of Artificial Intelligence’s Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) was held at the beginning of the conference, February 12-13, 2016. Workshop participants met and discussed issues with a selected focus — providing an informal setting for active exchange among rese...
With the exponential growth of the Web and of data availability, the semantic web area has expanded and each day more data is expressed as knowledge bases. Knowledge bases (KB) used in most projects are represented in an ontology-based fashion, so the data can be better organized and easily accessible. It is common to map these KBs into a graph when trying to...
This tutorial introduces Lifelong Machine Learning (LML) and Machine Reading. The core idea of LML is to learn continuously and accumulate the learned knowledge, and to use the knowledge to help future learning, which is perhaps the hallmark of human learning and human intelligence. By using prior knowledge seamlessly and effortlessly, we humans...
With the exponentially growing amount of available data on the Web over the last years, several projects have been created to automatically extract knowledge from this information set. As the data domains on the Web are too wide, most of these projects store the acquired knowledge in ontological knowledge bases (OKBs). Mapping it into graph-based r...
Supervised algorithms require a set of representative labeled data for building classification models. However, labeled data are usually difficult and expensive to obtain, which motivates the interest in semi-supervised learning. This type of learning uses both labeled and unlabeled data in the training process and is particularly useful in applica...
In recent years, many researchers have been focusing their studies on large, growing knowledge bases. Most techniques focus on building algorithms to help the Knowledge Base (KB) extend automatically (or semi-automatically). In this article, we make use of a generalized association rule mining algorithm in order, especially, to increase the relations...
NELL (Never Ending Language Learning system) is the first system to put the Never-Ending Machine Learning paradigm into practice. It has an inactive component to continually extend its KB: OntExt. Its main idea is to identify and add to the KB new relations which are frequently asserted in huge text data. Co-occurrence matrices are used to struct...
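A deliberately small sketch of the co-occurrence idea mentioned above (not NELL's actual OntExt implementation): count which textual context patterns connect instances of a given pair of categories, and treat frequently recurring patterns as candidate new relations. The observations below are invented for illustration.

# Illustrative sketch only: co-occurrence counts of context patterns per
# category pair; frequent patterns suggest candidate new relations.
from collections import Counter

# Hypothetical (subject_category, context_pattern, object_category) triples
# harvested from sentences such as "Milan defeated Barcelona".
observations = [
    ("SportsTeam", "defeated", "SportsTeam"),
    ("SportsTeam", "beat", "SportsTeam"),
    ("SportsTeam", "defeated", "SportsTeam"),
    ("City", "is the capital of", "Country"),
]

matrix = Counter()
for subj_cat, pattern, obj_cat in observations:
    matrix[(subj_cat, obj_cat, pattern)] += 1

# Patterns seen often for the same category pair suggest a new relation,
# e.g. teamDefeatedTeam(SportsTeam, SportsTeam).
for (subj_cat, obj_cat, pattern), count in matrix.most_common(3):
    print(f"{subj_cat} --[{pattern}]--> {obj_cat}: {count}")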
The first Never-Ending Learning system reported in the literature, which is called NELL (Never-Ending Language Learner), was designed to perform the task of autonomously building a knowledge base as a result of continuously reading the web. NELL is based on a learning paradigm in which the learner, in an autonomous way, manages to constantly, inc...
An alternative to the traditional single-function approximation method is the never-ending learning (NEL) approach, i.e., a learning paradigm in which the learner, in an autonomous way, manages to constantly, incrementally and continuously evolve over time. But, more important than simply continuing to evolve, in this new paradigm acquired knowledge can, in...
Whereas people learn many different types of knowledge from diverse experiences over many years, most current machine learning systems acquire just a single function or data model from just a single data set. We propose a never-ending learning paradigm for machine learning, to better reflect the more ambitious and encompassing type of learning perf...
Large, growing knowledge bases have been an active research field in recent years. Most techniques focus on building algorithms to help the Knowledge Base (KB) extend automatically (or semi-automatically). In this article, we make use of an association (or generalized association) rule mining algorithm in order to populate the KB and...
Whereas people learn many different types of knowledge from diverse experiences over many years, most current machine learning systems acquire just a single function or data model from just a single data set. We propose a never-ending learning paradigm for machine learning, to better reflect the more ambitious and encompassing type of learning perf...
Large and continuously growing knowledge bases (KBs) have been widely studied in recent years. A major challenge in this field is how to develop techniques to help populate such KBs and improve their coverage. In this context, this work proposes an “association rules”-based approach. We applied an association rule mining algorithm to discover new re...
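For intuition only, here is a minimal association-rule sketch over a toy KB; the rule shape, thresholds, and facts are hypothetical and not taken from the paper.

# Toy sketch: mine A => B rules over per-entity fact sets, then use them to
# propose candidate new facts for entities that match A but lack B.
from collections import defaultdict
from itertools import combinations

# Each "transaction" is the set of (relation, object) pairs known for one entity.
kb = {
    "brazil":    {("locatedIn", "southamerica"), ("memberOf", "mercosur")},
    "argentina": {("locatedIn", "southamerica"), ("memberOf", "mercosur")},
    "chile":     {("locatedIn", "southamerica")},
}

min_support, min_confidence = 2, 0.6
pair_counts, item_counts = defaultdict(int), defaultdict(int)
for facts in kb.values():
    for item in facts:
        item_counts[item] += 1
    for a, b in combinations(sorted(facts), 2):
        pair_counts[(a, b)] += 1

# Keep rule A => B when support and confidence are high enough; entities
# matching A but missing B become candidate new facts for the KB.
for (a, b), support in pair_counts.items():
    if support >= min_support and support / item_counts[a] >= min_confidence:
        print(f"{a} => {b}  (support={support})")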
We describe our approach for the SemEval-2014 task 9: Sentiment Analysis in Twitter. We make use of an ensemble learning method for sentiment classification of tweets that relies on varied features such as feature hashing, part-of-speech, and lexical features. Our system was evaluated in the Twitter message-level task.
Twitter is a microblogging site in which users can post updates (tweets) to friends (followers). It has become an immense dataset of so-called sentiments. In this paper, we introduce an approach that automatically classifies the sentiment of tweets by using classifier ensembles and lexicons. Tweets are classified as either positive or negative c...
The goal of sentiment analysis is to determine opinions, emotions, and attitudes presented in source material. In tweet sentiment analysis, opinions in messages can be typically categorized as positive or negative. To classify them, researchers have been using traditional classifiers like Naive Bayes, Maximum Entropy, and Support Vector Machines (S...
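The sketch below illustrates the general ensemble setup described in these abstracts with scikit-learn: feature hashing plus a hard-voting ensemble of Naive Bayes, Maximum Entropy (logistic regression), and a linear SVM. The tiny training set and all parameter choices are placeholders, not the systems evaluated in the papers.

# Hedged sketch of a majority-vote tweet sentiment ensemble.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import VotingClassifier

tweets = ["i love this phone", "worst service ever", "great game tonight", "so disappointed"]
labels = ["positive", "negative", "positive", "negative"]

vectorize = HashingVectorizer(n_features=2**12, alternate_sign=False)  # feature hashing

# Hard (majority) vote over Naive Bayes, Maximum Entropy, and a linear SVM.
ensemble = VotingClassifier(
    estimators=[
        ("nb", make_pipeline(vectorize, BernoulliNB())),
        ("maxent", make_pipeline(vectorize, LogisticRegression(max_iter=1000))),
        ("svm", make_pipeline(vectorize, LinearSVC())),
    ],
    voting="hard",
)
ensemble.fit(tweets, labels)
print(ensemble.predict(["this is awesome"]))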
The volume of complex network data has increased exponentially in recent years, making the graph mining area the focus of many research efforts. Most algorithms for mining this kind of data assume, however, that the complex network fits in primary memory. Unfortunately, this assumption is not always true. Even considering that, in some cases...
In this paper we present an approach to treating the cold-start problem in a recommendation system for environmental education on the Web. Our approach is based on the concepts of coupled learning and bootstrapping. Starting from an initial set of data, we apply traditional machine learning algorithms that cooperate with each other, forming various views on its ou...
This work presents an approach to predicting student performance using recommender system techniques in an educational environment. The approach is based on the concept of coupling: starting from an initial set of data, traditional machine learning techniques that cooperate with each other are applied, forming different views of the...
Fault diagnosis includes the main task of classification. Bayesian networks (BNs) present several advantages in the classification task, and previous works have suggested their use as classifiers. Because a classifier is often only one part of a larger decision process, this article proposes, for industrial process diagnosis, the use of a Bayesian...
The amount of information available on the Web has been increasing daily. However, how might one know what is right or wrong? Can the Web itself be used as a source for verifying information? NELL (Never-Ending Language Learner) is a computer system that gathers knowledge from the Web. Prophet is a link prediction component of NELL that has...
The design and implementation of autonomous computational systems that incrementally learn and use what has been learnt to continually refine their learning abilities over time is still a goal far from being achieved. Such dynamic systems would conform to the main ideas of the automatic learning model conventionally characterized as never-en...
Time plays an important role in the vast majority of problems and, as such, it is a vital issue to be considered when developing computer systems for solving problems. In the literature, one of the most influential formalisms for representing time is known as Allen's Temporal Algebra, based on a set of 13 basic and inverse relations that may hold b...
Bayesian networks (BN) and Bayesian classifiers (BC) are traditional probabilistic techniques that have been successfully used by various machine learning methods to help solve a variety of problems in many different domains. BNs (and BCs) can be considered a probabilistic graphical language suitable for inducing models from data aiming at knowle...
The notion of Contradiction is present in many aspects of the world and human information processing. As a consequence, more and more computer systems have been pushed into dealing with the contradiction detection task. Contradiction Detection (CD) is not a simple task, thus, it is subject to many discussions and approaches in different areas of hu...
The Machine Learning community has been introduced to NELL (Never-Ending Language Learning), a system able to learn from the web and to use its knowledge to keep learning indefinitely. The idea of continuously learning from the web brings concerns about reliability and accuracy, mainly when the learning process uses its own knowledge to improve its lear...
This paper describes a proposal which extends Allen's interval algebra by adapting the formalism for dealing with binary relations involving time periods with uncertain boundaries. The extended formalism has been proposed having in mind the subsequent investigation of the automatic learning of temporal relations using an inductive logic programming...
Our inspiration comes from NELL (Never-Ending Language Learning), a computer program running at Carnegie Mellon University to extract structured information from unstructured web pages. We consider the problem of using a semi-supervised learning approach to extract category instances (e.g. country(USA), city(New York)) from web pages, starting with a handf...
A number of different computational approaches have been applied in many different biology application domains. When such tools are based on conventional computation techniques, they have shown limitations in approaching complex biological problems. In the present study, a genetic algorithm (named GANEL) that is based on some Never-Ending Learning (NE...
The recent growth of virtual communities, the social web and information sharing gives information retrieval and machine learning systems a source of information referred to as the "wisdom of crowds". In this work we show that this information could be used not only as a source of knowledge but as a way to bring intelligent systems closer to users by u...
Link prediction is a task, in graph-based data models as well as in complex networks, of not only predicting edges that will appear in the near future but also finding missing edges. NELL is a never-ending language learner system that has the ability to continuously learn to extract structured information from unstructured text (fetched from web pa...
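To make the link prediction task concrete, here is a tiny, hedged sketch that scores unconnected entity pairs by their shared neighbours in a toy KB graph; Prophet's actual algorithm and features are more elaborate, and the graph below is made up.

# Common-neighbours link prediction over a hypothetical undirected KB graph.
from itertools import combinations

graph = {
    "ronaldo": {"real_madrid", "portugal"},
    "benzema": {"real_madrid", "france"},
    "real_madrid": {"ronaldo", "benzema"},
    "portugal": {"ronaldo"},
    "france": {"benzema"},
}

def common_neighbors_score(u, v):
    return len(graph[u] & graph[v])

# Rank currently missing edges; high scores are candidate new links for the KB.
candidates = [
    (u, v, common_neighbors_score(u, v))
    for u, v in combinations(graph, 2)
    if v not in graph[u]
]
for u, v, score in sorted(candidates, key=lambda x: -x[2])[:3]:
    print(u, "--?--", v, "score:", score)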
Variable Ordering (VO) plays an important role when inducing Bayesian Networks (BNs). Previous works in the literature suggest that it is worth pursuing the use of evolutionary strategies for identifying a suitable VO, when learning a Bayesian Network structure from data. This paper proposes a hybrid adaptive algorithm named VOMOS (Variable Orderin...
Missing values are a critical problem in data mining applications. The substitution of these values, also called imputation, can be performed by several methods. This work describes the application of an optimized version of the Bayesian Algorithm K2 as an imputation tool for a clustering genetic algorithm. The resulting hybrid system is assessed b...
The notion of time permeates every single aspect of the world around us and, as such, it should be taken into account when developing automatic systems that implement many of its processes. In the literature several proposals for representing the notion of time can be found. One of the most popular is Allen's temporal interval algebra, based on a set o...
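For readers unfamiliar with the formalism, the following small helper (written for this summary, not taken from the paper) decides which of Allen's basic interval relations holds between two intervals given as (start, end) pairs; the six inverse relations are collapsed into a single catch-all case.

# Illustrative classifier for Allen's basic interval relations.
def allen_relation(a, b):
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if a1 == b1 and a2 == b2:
        return "equal"
    if a1 == b1 and a2 < b2:
        return "starts"
    if a1 > b1 and a2 == b2:
        return "finishes"
    if a1 > b1 and a2 < b2:
        return "during"
    if a1 < b1 and b1 < a2 < b2:
        return "overlaps"
    return "inverse of one of the above"

# Example: a meeting (9, 10) relative to a workshop (10, 12) -> "meets".
print(allen_relation((9, 10), (10, 12)))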
In the 1950s and the 1960s several computer scientists independently studied evolutionary systems with the idea that evolution could be used as an optimization tool for engineering problems. For these evolutionary-computation researchers, the mechanisms of evolution seem well suited for some of the most pressing computational problems in many field...
This work proposes and discusses an approach for inducing Bayesian classifiers aimed at balancing the tradeoff between the precise probability estimates produced by time consuming unrestricted Bayesian networks and the computational efficiency of Naive Bayes (NB) classifiers. The proposed approach is based on the fundamental principles of the Heuri...
Hybrid intelligent systems which take advantage of Bayesian/fuzzy collaboration have been explored in the literature in recent years. Such collaboration can play an important role mainly in real intelligent systems applications, where accuracy and comprehensibility are crucial aspects to be considered. This paper further explores the Bayes Fuz...
Traditional approaches to Relation Extraction from text require manually defining the relations to be extracted. We propose here an approach to automatically discovering relevant relations, given a large text corpus plus an initial ontology defining hundreds of noun categories (e.g., Athlete, Musician, Instrument). Our approach discovers frequently...
Variable Ordering plays an important role when inducing Bayesian Networks. Previous works in the literature suggest that the use of genetic/evolutionary algorithms (EAs) for dealing with VO, when learning a Bayesian Network structure from data, is worth pursuing. This work proposes a new crossover operator, named Random Multi-point Crossover Operat...
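As background for the variable-ordering work above, this sketch shows a standard order-preserving crossover over permutations of BN variables; it is a generic OX-style operator offered for intuition, not the Random Multi-point Crossover Operator proposed in the paper.

# Generic order crossover for variable orderings (permutations).
import random

def order_crossover(parent_a, parent_b, rng=random):
    n = len(parent_a)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j] = parent_a[i:j]                      # keep a slice from parent A
    fill = [v for v in parent_b if v not in child]  # remaining vars in B's order
    for k in range(n):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child

# Each variable ordering is a permutation of the BN's variables.
print(order_crossover(["A", "B", "C", "D", "E"], ["E", "D", "C", "B", "A"]))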
We consider here the problem of building a never-ending language learner; that is, an intelligent computer agent that runs forever and that each day must (1) extract, or read, information from the web to populate a growing structured knowledge base, and (2) learn to perform this task better than on the previous day. In particular, we propose an app...
Computational approaches have been applied in many different biology application domains. When such tools are based on conventional computation, they have shown limitations in approaching complex biological problems. In the present study, a computational evolutionary environment (CEE) is proposed as a tool to extract classification rules from biological...
Variable Orderings (VOs) have been used as a restriction in the process of Bayesian Network (BN) induction. The VO information can significantly reduce the search space and allow some algorithms to reach good results. Previous works reported in the literature suggest that the combination of Evolutionary Algorithms (EAs) and VOs is worthwhile when lear...
We consider the problem of semi-supervised learning to extract categories (e.g., academic fields, athletes) and relations (e.g., PlaysSport(athlete, sport)) from web pages, starting with a handful of labeled training examples of each category or relation, plus hundreds of millions of unlabeled web documents. Semi-supervised training using only a fe...
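A toy sketch of the bootstrapping loop at the heart of this semi-supervised setting: seed instances yield extraction patterns, which in turn promote new instances. The two-iteration loop below over a made-up corpus also shows how an over-general pattern can drift ("ajax" gets promoted alongside cities), which is exactly the failure mode that coupled training is designed to limit.

# Toy bootstrapping sketch; corpus, seeds, and pattern shape are hypothetical.
corpus = [
    "cities such as Paris are crowded",
    "cities such as Tokyo are crowded",
    "he plays for teams such as Ajax",
]
seeds = {"paris"}
patterns, instances = set(), set(seeds)

for _ in range(2):  # a couple of bootstrapping iterations
    # 1) learn extraction patterns (two words of left context) from instances
    for sentence in corpus:
        words = sentence.lower().split()
        for inst in instances:
            if inst in words:
                i = words.index(inst)
                patterns.add(" ".join(words[max(0, i - 2):i]))
    # 2) apply patterns to promote new candidate instances
    for sentence in corpus:
        words = sentence.lower().split()
        for i in range(2, len(words)):
            if " ".join(words[i - 2:i]) in patterns:
                instances.add(words[i])

print("patterns:", patterns)
print("instances:", instances)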
The use of Bayesian Network Classifiers (BCs) combined with the Fuzzy rule model to explain the learned BCs has been previously presented as the BayesFuzzy approach. This paper follows along BayesFuzzy's lines of investigation, aiming at improving the comprehensibility of a BC model and enhancing BayesFuzzy results by combining new pruning methods. I...
We consider here the problem of building a never-ending language learner; that is, an intelligent computer agent that runs forever and that each day must (1) extract, or read, information from the web to populate a growing structured knowledge base, and (2) learn to perform this task better than on the previous day. In particular, we propose an app...
A key question regarding the future of the semantic web is “how will we acquire structured information to populate the semantic web on a vast scale?” One approach is to enter this information manually. A second approach is to take advantage of pre-existing databases, and to develop common ontologies, publishing standards, and reward systems to make...
This paper describes the modeling of a weed infestation risk inference system that implements a collaborative inference scheme based on rules extracted from two Bayesian network classifiers. The first Bayesian classifier infers a categorical variable value for the weed–crop competitiveness using as input categorical variables for the total density...
The substitution of missing values, also called imputation, is an important data preparation task for many domains. Ideally, the substitution of missing values should not insert biases into the dataset. This aspect has been usually assessed by some measures of the prediction capability of imputation methods. Such measures assume the simulation of m...