Chapter

Extraction of Cognitive Operations from Scientific Texts

Authors:
  • Federal Research Centre Computer Science and Control RAS

Abstract

Rhetorical structure theory defines relations between predicates and larger discourse units, but it does not consider the extralinguistic nature of text writing at all. The writing process, however, is inseparable from the particular goal-directed activity it serves. This paper presents a new approach that models the cognitive activity reflected in a scientific text, rather than treating the text merely as the end product of a researcher's cognitive activity. We also propose and evaluate a framework for detecting text fragments that correspond to cognitive operations in scientific texts. The obtained results confirm the usefulness of the suggested set of cognitive operations for the analysis of scientific texts. Moreover, they justify the applicability of the proposed framework to cognitive operation extraction from scientific texts in Russian.
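As a rough illustration of how such a framework might be realized (a sketch, not the authors' implementation), the following Python fragment classifies clauses by cognitive operation using hand-written cue templates as high-level features; the operation labels and regular expressions are invented for this example.

```python
import re
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical cue templates per cognitive operation (invented for illustration).
CUES = {
    "comparison": re.compile(r"\b(compared to|in contrast|unlike)\b", re.I),
    "hypothesis": re.compile(r"\b(we assume|suppose|hypothes)", re.I),
    "conclusion": re.compile(r"\b(therefore|we conclude|thus)\b", re.I),
}

def clause_features(clause):
    """Binary high-level features: does each cue template fire in the clause?"""
    return {name: bool(p.search(clause)) for name, p in CUES.items()}

clauses = ["We assume the corpus is balanced.",
           "Therefore, the method generalizes.",
           "Unlike prior work, we model the activity itself."]
labels = ["hypothesis", "conclusion", "comparison"]

clf = make_pipeline(DictVectorizer(), LogisticRegression())
clf.fit([clause_features(c) for c in clauses], labels)
print(clf.predict([clause_features("We conclude that templates help.")]))
```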


Article
Full-text available
The aim of this study was to create a tool for the automatic classification of subjects' responses in the Rosenzweig test, developed jointly by a psychologist, a linguist, and an artificial intelligence specialist. The article presents speech constructions specific to four of the nine types of frustration response and, for each reaction type, data on the effectiveness (recall and reliability) of the developed method for automatically classifying subjects' responses. The annotated corpus of frustration-reaction texts used by the linguist consisted of protocols of 462 subjects processed by a psychologist; the linguistic descriptions were formalized and applied to build a feature description of text fragments; these descriptions were then used to build a classifier of frustration reaction types with machine learning methods; classification quality was estimated by three-fold cross-validation. The developed linguistic templates form a high-level feature description of text fragments that makes it possible to detect utterances belonging to different reaction types with high recall (R of at least 0.8). Four reactions, M, M', I, and E, can be reliably identified (F1 > 0.7) without taking the context of the utterances into account. The article presents the results of a linguistic analysis of Rosenzweig test responses belonging to these four types of frustration response; between 4 and 18 linguistic descriptions were obtained for the different reaction types. For psychology, the technology of linguistic templates serves, on the one hand, as a means of professional reflection and, on the other, as a tool for verifying the data of projective text methods, making it possible to build an arsenal of tools for the automatic psychodiagnostics of speech products (for example, online discussions).
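The evaluation protocol described above (template-derived features, a machine-learned classifier, three-fold cross-validation with recall and F1) could be sketched along the following lines; the toy utterances and the two-class setup are stand-ins, not the paper's data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline

# Toy stand-ins for two reaction classes; the real corpus has 462 protocols.
texts = ["It is nobody's fault.", "These things simply happen.",
         "No one could have prevented it.", "This is entirely your fault!",
         "You should have been more careful.", "Why did you break it?"]
y = ["M", "M", "M", "E", "E", "E"]

clf = make_pipeline(CountVectorizer(), LogisticRegression())
# Three-fold cross-validation, reporting macro-averaged recall and F1.
scores = cross_validate(clf, texts, y, cv=3, scoring=["recall_macro", "f1_macro"])
print(scores["test_recall_macro"], scores["test_f1_macro"])
```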
Article
Full-text available
The paper argues that speech genres, as forms of text production and interpretation, are among the main objects of formal linguistic analysis in cognitive modeling, an intensively developing area of artificial intelligence. Understanding a speech genre as a form of spiritual socio-cultural activity (artistic, scientific, political, ideological, etc.), objectified through a system of speech actions in the text as a communication unit, makes it possible to describe the systems of speech genres in various spheres of communication. The authors analyze the speech genres that objectify the main stages of academic theoretical research. Using artificial intelligence methods, the research addresses the problem of recognizing the speaker's intentions behind the cognitive-speech actions that form the genre form of a text, thereby contributing to the fundamental problem of machine "understanding" of the meaning of an utterance. The research is based on an interdisciplinary complex method of text analysis. In terms of software implementation, the proposed approach derives a small set of high-level linguistic features for clauses from templates and then trains classifiers on these features. To create the templates, the authors carry out linguistic and psychological analysis aimed at identifying markers of cognitive-speech actions as accurately as possible, in accordance with the standards of perception. In the course of the study, the authors obtained high identification scores for cognitive-speech actions, ranging from 0.78 to 0.99.
Article
Full-text available
The dominant approach to argument mining has been to treat it as a machine learning problem based upon superficial text features, and to treat the relationships between arguments as either support or attack. However, accurately summarizing argumentation in scientific research articles requires a deeper understanding of the text and a richer model of relationships between arguments. First, this paper presents an argumentation scheme-based approach to mining a class of biomedical research articles. Argumentation schemes implemented as logic programs are formulated in terms of semantic predicates that could be obtained from a text by use of biomedical/biological natural language processing tools. The logic programs can be used to extract the underlying scheme name, premises, and implicit or explicit conclusion of an argument. Then this paper explores how arguments in a research article occur within a narrative of scientific discovery, how they are related to each other, and some implications.
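A minimal sketch of the logic-program idea, with invented predicate and scheme names: an argumentation scheme fires when all of its premise predicates are found in the text-derived fact base, yielding the scheme name, premises, and conclusion.

```python
# Facts as they might be produced by biomedical NLP tools (invented examples).
FACTS = {
    ("observed", "drug_x", "tumor_shrinkage"),
    ("mechanism", "drug_x", "inhibits_growth_pathway"),
}

# One argumentation scheme encoded as premises plus a conclusion (toy names).
SCHEME = {
    "name": "argument_from_effect_to_cause",
    "premises": [("observed", "drug_x", "tumor_shrinkage"),
                 ("mechanism", "drug_x", "inhibits_growth_pathway")],
    "conclusion": ("may_cause", "drug_x", "tumor_shrinkage"),
}

def apply_scheme(scheme, facts):
    """Return (scheme name, premises, conclusion) if all premises hold."""
    if all(p in facts for p in scheme["premises"]):
        return scheme["name"], scheme["premises"], scheme["conclusion"]
    return None

print(apply_scheme(SCHEME, FACTS))
```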
Conference Paper
Full-text available
We investigate neural techniques for end-to-end computational argumentation mining (AM). We frame AM both as a token-based dependency parsing and as a token-based sequence tagging problem, including a multi-task learning setup. Contrary to models that operate on the argument component level, we find that framing AM as dependency parsing leads to subpar performance results. In contrast, less complex (local) tagging models based on BiLSTMs perform robustly across classification scenarios, being able to catch long-range dependencies inherent to the AM problem. Moreover, we find that jointly learning ‘natural’ subtasks, in a multi-task learning setup, improves performance.
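A minimal sketch of the kind of BiLSTM token tagger the paper finds robust, written with tf.keras; the vocabulary size, tag set size, and hyperparameters are assumptions, not those of the paper.

```python
import tensorflow as tf

VOCAB, EMB, TAGS = 10000, 100, 7   # assumed sizes, not from the paper

# Token-level sequence tagging: one softmax over tag labels per token.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, EMB, mask_zero=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128, return_sequences=True)),
    tf.keras.layers.Dense(TAGS, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# x: (batch, seq_len) int token ids; y: (batch, seq_len) int BIO component tags
# model.fit(x, y, epochs=5)
```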
Conference Paper
Full-text available
We introduce a globally normalized transition-based neural network model that achieves state-of-the-art part-of-speech tagging, dependency parsing and sentence compression results. Our model is a simple feed-forward neural network that operates on a task-specific transition system, yet achieves comparable or better accuracies than recurrent models. The key insight is based on a novel proof illustrating the label bias problem and showing that globally normalized models can be strictly more expressive than locally normalized models.
Conference Paper
Full-text available
The aim of argumentation mining is the automatic detection and identification of the argumentative structure contained within a piece of natural language text. In this paper we present the ArgMine Framework: an alignment of tools and processes that facilitate and partially automate argumentation mining research. We also report on a preliminary application of the framework, where we address argumentative zoning, a sub-task of argumentation mining whose aim is to automatically select the zones of the text that contain argumentative content. The target corpus used to train the supervised machine learning algorithms was manually annotated and is composed of Portuguese news articles, to which argumentation mining does not seem to have been applied before. Given the dataset used in our experiments and a critical analysis of the obtained results, we conclude that lexical and syntactic features alone are not enough to successfully address argumentative zoning.
Article
Full-text available
Considering the growing volume of scientific literature, techniques that enable the automatic detection of informational entities in scientific research articles may contribute to the extension of scientific knowledge and its practical uses. Although there have been several efforts to extract informative entities from patent and biomedical research articles, there are few attempts in other scientific domains. In this paper, we introduce an automatic semantic annotation framework for research articles based on entity recognition techniques. Our approach includes tag set modeling for semantic annotation, a semi-automatic annotation tool, manual annotation for training data preparation, and supervised machine learning to develop an entity type recognition module. For the experiments, we chose two domains, information and communication technology and chemical engineering, because of their wide usage. In addition, we provide three application scenarios showing how our annotation framework can be used and extended further, intended to guide researchers who wish to link their own content with external data.
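A hedged sketch of the entity type recognition step: per-token features fed to a supervised classifier. The feature set, toy tokens, and the "TECH" tag are illustrative assumptions, not the paper's tag set model.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def token_features(tokens, i):
    """Simple orthographic and context features for one token."""
    w = tokens[i]
    return {
        "lower": w.lower(),
        "suffix3": w[-3:],
        "is_upper": w.isupper(),
        "is_title": w.istitle(),
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
    }

tokens = ["TensorFlow", "supports", "GPU", "training"]
labels = ["TECH", "O", "TECH", "O"]   # toy entity-type tags, not the paper's

X = [token_features(tokens, i) for i in range(len(tokens))]
clf = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(X, labels)
print(clf.predict(X))
```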
Article
Full-text available
TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with particularly strong support for training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model in contrast to existing systems, and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.
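A minimal example of the dataflow model described above, written against the TensorFlow 1.x graph API (imported via tf.compat.v1 so it also runs under 2.x): the graph declares an input placeholder, a mutable variable as shared state, and an operation; a session then maps the graph onto available devices and executes it.

```python
import tensorflow.compat.v1 as tf  # 1.x-style graph API, as in the paper's era
tf.disable_eager_execution()

g = tf.Graph()
with g.as_default():
    x = tf.placeholder(tf.float32, shape=[None, 4], name="x")  # graph input
    w = tf.Variable(tf.ones([4, 1]), name="w")                 # mutable shared state
    y = tf.matmul(x, w, name="y")                              # dataflow operation
    init = tf.global_variables_initializer()

with tf.Session(graph=g) as sess:  # the runtime places graph nodes on devices
    sess.run(init)
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0, 4.0]]}))
```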
Article
Full-text available
The paper introduces two methods for semantic role labeling of Russian texts. The first method is based on a semantic dictionary that contains information about predicates, roles, and the syntaxeme features that correspond to the roles. It also uses heuristics and integer linear programming to find the best joint assignment of roles. The second method is data-driven semantic-syntactic parsing, implemented using MaltParser. It performs transition-based data-driven parsing, simultaneously building a syntactic tree and assigning semantic roles. It was trained with various feature sets on the SynTagRus Treebank, which was automatically enriched with semantic roles by the dictionary-based parser. We managed to automatically alleviate mistakes in the training corpus using the output of the data-driven parser. We evaluated the performance of the parsers on a subcorpus of SynTagRus that we manually annotated with semantic information. The dictionary-based parser and the data-driven semantic-syntactic parser showed close performance. Although the data-driven parser did not outperform the dictionary-based one, we expect that it can be beneficial in some cases and has potential for further improvement.
Article
Full-text available
A combined use of AQ learning and the JSM method for extracting cause-effect relationships from psychological test data is considered. AQ learning is used to describe a test group by means of rules. The group description serves as the basis for constructing the fact base of the JSM method. The first stage of the JSM method is used for hypothesizing cause-effect relationships in the fact base. A complete algorithm of the developed method is described and recommendations for its use are provided. The results of the proposed method are compared with the results of statistical processing.
Article
Full-text available
A relational-situational method for the analysis of natural language texts is outlined, based on the theory of communicative grammar of the Russian language and the theory of heterogeneous semantic networks. It is shown that the relational-situational method can be used for precise document search in local and global networks and in electronic libraries.
Book
Full-text available
This book provides a rich and accessible account of genre studies by a world-renowned applied linguist. The hardback edition discusses today's research world, its various configurations of genres, and the role of English within the genres. Theoretical and methodological issues are explored, with a special emphasis on various metaphors of genre. The book is full of carefully worded detail and each chapter ends with suggestions for pedagogical practice. The volume closes with evaluations of contrastive rhetoric, applied corpus linguistics, and critical approaches to EAP. Research Genres provides a rich and scholarly account of this key area.
Article
Full-text available
Purpose – The object of this study is to develop methods for automatically annotating the argumentative role of sentences in scientific abstracts. Working from Medline abstracts, sentences were classified into four major argumentative roles: objective, method, result, and conclusion. The idea is that, if the role of each sentence can be marked up, then these metadata can be used during information retrieval to seek particular types of information such as novelty, conclusions, methodologies, or the aims/goals of a scientific piece of work.
Design/methodology/approach – Two approaches were tested: linguistic cues and positional heuristics. Linguistic cues are lexico-syntactic patterns modelled as regular expressions implemented in a linguistic parser. Positional heuristics make use of the relative position of a sentence in the abstract to deduce its argumentative class.
Findings – The experiments showed that positional heuristics attained a much higher degree of accuracy on Medline abstracts, with an F-score of 64 per cent, whereas the linguistic cues attained an F-score of only 12 per cent. This is mostly because sentences from different argumentative roles are not always announced by surface linguistic cues.
Research limitations/implications – A limitation of the study was the inability to test other methods for this task, such as machine learning techniques, which have been reported to perform better on Medline abstracts. Also, to compare the results of the study with earlier studies using Medline abstracts, the different argumentative roles present in Medline had to be mapped onto four major argumentative roles. This may have favourably biased the performance of the sentence classification by positional heuristics.
Originality/value – To the best of the authors' knowledge, this study presents the first instance of evaluating linguistic cues and positional heuristics on the same corpus.
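The positional heuristic lends itself to a compact sketch: assign a role purely from a sentence's relative position in the abstract. The thresholds below are illustrative guesses, not the ones evaluated in the paper.

```python
def positional_role(i, n):
    """Assign an argumentative role from the sentence's relative position."""
    pos = i / max(n - 1, 1)  # 0.0 = first sentence, 1.0 = last sentence
    if pos < 0.2:
        return "objective"
    if pos < 0.5:
        return "method"
    if pos < 0.8:
        return "result"
    return "conclusion"

sentences = ["We aim to ...", "We used ...", "Accuracy was ...", "We conclude ..."]
print([positional_role(i, len(sentences)) for i in range(len(sentences))])
```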
Article
Full-text available
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.
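A minimal usage example in the style the library encourages (a bundled dataset, a train/test split, an estimator with fit/score); the model choice here is arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)  # fit on training split
print(clf.score(X_te, y_te))                             # accuracy on held-out data
```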
Article
Full-text available
Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, backpropagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.
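The gating mechanism can be summarized in a single forward step. The sketch below uses the now-standard formulation that includes a forget gate (added to LSTM after this paper); the shapes and the stacked-weight layout are conventions chosen for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (4h, d), U: (4h, h), b: (4h,), gates stacked."""
    z = W @ x + U @ h_prev + b
    h = h_prev.shape[0]
    i = sigmoid(z[0*h:1*h])       # input gate: admit new information
    f = sigmoid(z[1*h:2*h])       # forget gate (post-1997 addition)
    o = sigmoid(z[2*h:3*h])       # output gate: expose the cell state
    g = np.tanh(z[3*h:4*h])       # candidate cell values
    c = f * c_prev + i * g        # constant error carousel update
    return o * np.tanh(c), c

d, h = 5, 3
rng = np.random.default_rng(0)
W, U, b = rng.normal(size=(4*h, d)), rng.normal(size=(4*h, h)), np.zeros(4*h)
h_t, c_t = lstm_step(rng.normal(size=d), np.zeros(h), np.zeros(h), W, U, b)
print(h_t)
```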
Article
Full-text available
Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient-descent "boosting" paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least-absolute-deviation, and Huber-M loss functions for regression, and multi-class logistic likelihood for classification. Special enhancements are derived for the particular case where the individual additive components are decision trees, and tools for interpreting such "TreeBoost" models are presented. Gradient boosting of decision trees produces competitive, highly robust, interpretable procedures for regression and classification, especially appropriate for mining less than clean data. Connections between this approach and the boosting methods of Freund and Schapire 1996, and Fr...
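For the least-squares case, the gradient-descent boosting paradigm reduces to repeatedly fitting a tree to the current residuals (the negative gradient of the squared loss) and adding a shrunken copy of it to the ensemble. A compact sketch, with illustrative hyperparameters:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbm(X, y, n_trees=100, lr=0.1, depth=3):
    f0 = y.mean()                       # initial constant model
    trees, pred = [], np.full(len(y), f0)
    for _ in range(n_trees):
        residual = y - pred             # negative gradient of squared loss
        t = DecisionTreeRegressor(max_depth=depth).fit(X, residual)
        trees.append(t)
        pred += lr * t.predict(X)       # add a shrunken tree to the expansion
    return f0, trees

def predict_gbm(f0, trees, X, lr=0.1):
    return f0 + lr * sum(t.predict(X) for t in trees)

X = np.random.rand(200, 3)
y = 2 * X[:, 0] + np.sin(X[:, 1])
f0, trees = fit_gbm(X, y)
print(predict_gbm(f0, trees, X[:3]))
```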
Article
Full-text available
In this article we propose a strategy for the summarization of scientific articles that concentrates on the rhetorical status of statements in an article: Material for summaries is selected in such a way that summaries can highlight the new contribution of the source article and situate it with respect to earlier work. We provide a gold standard for summaries of this kind consisting of a substantial corpus of conference articles in computational linguistics annotated with human judgments of the rhetorical status and relevance of each sentence in the articles. We present several experiments measuring our judges' agreement on these annotations. We also present an algorithm that, on the basis of the annotated training material, selects content from unseen articles and classifies it into a fixed set of seven rhetorical categories. The output of this extraction and classification system can be viewed as a single-document summary in its own right; alternatively, it provides starting material for the generation of task-oriented and user-tailored summaries designed to give users an overview of a scientific field.
Article
The automatic extraction of arguments from text, also known as argument mining, has recently become a hot topic in artificial intelligence. Current research has focused only on linguistic analysis. However, in many domains where communication may also be vocal or visual, paralinguistic features too may contribute to the transmission of the message that arguments intend to convey. For example, in political debates a crucial role is played by speech. The research question we address in this work is whether in such domains one can improve claim detection for argument mining by employing features from text and speech in combination. To explore this hypothesis, we develop a machine learning classifier and train it on an original dataset based on the 2015 UK political elections debate.
Article
Argument mining is a core technology for automating argument search in large document collections. Despite its usefulness for this task, most current approaches to argument mining are designed for use only with specific text types and fall short when applied to heterogeneous texts. In this paper, we propose a new sentential annotation scheme that is reliably applicable by crowd workers to arbitrary Web texts. We source annotations for over 25,000 instances covering eight controversial topics. The results of cross-topic experiments show that our attention-based neural network generalizes best to unseen topics and outperforms vanilla BiLSTM models by 6% in accuracy and 11% in F-score.
Article
We describe the task of intention-based text understanding for scientific argumentation. The model of scientific argumentation presented here is based on the recognition of 28 concrete rhetorical moves in text. These moves can in turn be associated with higher-level intentions. The intentions we aim to model operate in the limited domain of scientific argumentation and justification; it is the limitation of the domain which makes our intentions predictable and enumerable, unlike general intentions. We explain how rhetorical moves relate to higher-level intentions. We also discuss work in progress towards a corpus annotated with limited-domain intentions, and speculate about the design of an automatic recognition system, for which many components already exist today.
Book
As one of the most comprehensive machine learning texts around, this book does justice to the field's incredible richness, but without losing sight of the unifying principles. Peter Flach's clear, example-based approach begins by discussing how a spam filter works, which gives an immediate introduction to machine learning in action, with a minimum of technical fuss. Flach provides case studies of increasing complexity and variety with well-chosen examples and illustrations throughout. He covers a wide range of logical, geometric and statistical models and state-of-the-art topics such as matrix factorisation and ROC analysis. Particular attention is paid to the central role played by features. The use of established terminology is balanced with the introduction of new and useful concepts, and summaries of relevant background material are provided with pointers for revision if necessary. These features ensure Machine Learning will set a new standard as an introductory textbook.
Article
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, ***, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
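A brief sketch of the corresponding practice in scikit-learn, where out-of-bag estimates play the role of the internal error monitor and feature importances expose variable importance; the dataset and hyperparameters are arbitrary choices for this example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
rf = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                            oob_score=True, random_state=0)
rf.fit(X, y)
print(rf.oob_score_)            # internal (out-of-bag) accuracy estimate
print(rf.feature_importances_)  # variable importance from random splits
```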
Conference Paper
We present conditional random fields, a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, including the ability to relax strong independence assumptions made in those models. Conditional random fields also avoid a fundamental limitation of maximum entropy Markov models (MEMMs) and other discriminative Markov models based on directed graphical models, which can be biased towards states with few successor states. We present iterative parameter estimation algorithms for conditional random fields and compare the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
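A minimal linear-chain CRF example using the sklearn-crfsuite wrapper around CRFsuite (the implementation cited in the references below); the feature dictionaries and labels are toy stand-ins.

```python
import sklearn_crfsuite

# One training sequence: a list of per-token feature dicts and its label sequence.
X_train = [[{"word": "John", "is_title": True},
            {"word": "runs", "is_title": False}]]
y_train = [["PER", "O"]]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X_train, y_train)       # L-BFGS parameter estimation
print(crf.predict(X_train))     # sequence labeling with the trained model
```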
Article
Discriminative training approaches like structural SVMs have shown much promise for building highly complex and accurate models in areas like natural language processing, protein structure prediction, and information retrieval. However, current training algorithms are computationally expensive or intractable on large datasets. To overcome this bottleneck, this paper explores how cutting-plane methods can provide fast training not only for classification SVMs, but also for structural SVMs. We show that for an equivalent "1-slack" reformulation of the linear SVM training problem, our cutting-plane method has time complexity linear in the number of training examples. In particular, the number of iterations does not depend on the number of training examples, and it is linear in the desired precision and the regularization parameter. Furthermore, we present an extensive empirical evaluation of the method applied to binary classification, multi-class classification, HMM sequence tagging, and CFG parsing. The experiments show that the cutting-plane algorithm is broadly applicable and fast in practice. On large datasets, it is typically several orders of magnitude faster than conventional training methods derived from decomposition methods like SVM-light, or conventional cutting-plane methods. Implementations of our methods are available at www.joachims.org.
Article
"This volume presents an attempt to construct a unified cognitive theory of science in relatively short compass. It confronts the strong program in sociology of science and the positions of various postpositivist philosophers of science, developing significant alternatives to each in a reeadily comprehensible sytle. It draws loosely on recent developments in cognitive science, without burdening the argument with detailed results from that source. . . . The book is thus a provocative one. Perhaps that is a measure of its value: it will lead scholars and serious student from a number of science studies disciplines into continued and sharpened debate over fundamental questions."—Richard Burian, Isis "The writing is delightfully clear and accessible. On balance, few books advance our subject as well."—Paul Teller, Philosophy of Science
Annotation of computer science papers for semantic relation extraction
  • Y Tateisi
  • Y Shidahara
  • Y Miyao
  • A Aizawa
CRFSuite: a fast implementation of conditional random fields (CRFs)
  • N Okazaki
Discourse Analysis in a Cognitive Perspective: Dr-Sci Thesis
  • A A Kibrik
Speech genres in functional and stylistic perspective (scientific academic text)
  • V A Salimovsky