Conference Paper

Automatic Detection of Uncertain Statements in the Financial Domain


Abstract

The automatic detection of uncertain statements can benefit NLP tasks such as deception detection and information extraction. Furthermore, it can enable new analyses in social sciences such as business, where the quantification of uncertainty or risk plays a significant role. We approached the automatic detection of uncertain statements as a binary sentence classification task on transcripts of spoken language, for the first time in the financial domain. We created a new dataset and, besides using bag-of-words, part-of-speech tags, and dictionaries, developed rule-based features tailored to our task. Finally, we systematically analyzed which features perform best in the financial domain as opposed to the previously researched encyclopedic domain.
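
As a concrete illustration of the task setup described above, the following sketch trains a binary sentence classifier combining bag-of-words features with a simple hedge-cue dictionary feature. It assumes scikit-learn and NumPy are available; the cue list, training sentences, and labels are illustrative placeholders, not the paper's dataset or feature set.

```python
# Minimal sketch: binary classification of uncertain vs. certain sentences,
# mixing bag-of-words n-grams with one dictionary-based cue-count feature.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer

# Hypothetical hedge cues; the paper's dictionaries are far larger.
HEDGE_CUES = {"may", "might", "could", "possibly", "appears", "uncertain"}

def cue_counts(sentences):
    # One feature per sentence: how many hedge cues it contains.
    return np.array([[sum(tok in HEDGE_CUES for tok in s.lower().split())]
                     for s in sentences])

pipeline = Pipeline([
    ("features", FeatureUnion([
        ("bow", CountVectorizer(ngram_range=(1, 2))),
        ("cues", FunctionTransformer(cue_counts, validate=False)),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])

train_x = ["Revenue may decline next quarter.",
           "Revenue was 4.2 billion dollars."]
train_y = [1, 0]  # 1 = uncertain, 0 = certain
pipeline.fit(train_x, train_y)
print(pipeline.predict(["Margins could possibly improve."]))
```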

... Furthermore, for the first time in the community, we perform a binary sentence classification task on 10-Ks to assess directly whether our models are indeed suitable to detect linguistic uncertainty. Theil et al. (2017) created the first classifier capable of detecting uncertain sentences in the financial domain. Yet, they sample their sentences from earnings call transcripts, a largely different disclosure type from 10-Ks. ...
Article
Full-text available
This monograph surveys the technology and empirics of text analytics in finance. I present various tools of information extraction and basic text analytics. I survey a range of techniques of classification and predictive analytics, and metrics used to assess the performance of text analytics algorithms. I then review the literature on text mining and predictive analytics in finance, and its connection to networks, covering a wide range of text sources such as blogs, news, web posts, corporate filings, etc. I end with textual content presenting forecasts and predictions about future directions.
Article
Full-text available
We explore the use of speculative language in MEDLINE abstracts. Results from a manual annotation experiment suggest that the notion of a speculative sentence can be reliably annotated by humans. In addition, an experiment with automated methods suggests that reliable automated methods might also be developed. Distributional observations are presented, as well as a discussion of possible uses for a system that can recognize speculative language.
Article
Full-text available
We examine the occurrence of ethics-related terms in 10-K annual reports over 1994–2006 and offer empirical observations on the conceptual framework of Erhard et al. (Integrity: A Positive Model that Incorporates the Normative Phenomena of Morality, Ethics, and Legality (Harvard Business School, Harvard) 2007). We use a pre-Sarbanes-Oxley sample subset to compare the occurrence of ethics-related terms in our 10-K data with samples from other studies that consider virtue-related phenomena. We find that firms using ethics-related terms are more likely to be “sin” stocks, are more likely to be the object of class action lawsuits, and are more likely to score poorly on measures of corporate governance. The consistency of our results across these alternative measures of ethical behavior suggests that managers who portray their firm as “ethical” in 10-K reports are more likely to be systematically misleading the public. These results are consistent with the integrity-performance paradox.
Conference Paper
Full-text available
Our goal is to use natural language processing to identify deceptive and non-deceptive passages in transcribed narratives. We begin by motivating an analysis of language-based deception that relies on specific linguistic indicators to discover deceptive statements. The indicator tags are assigned to a document using a mix of automated and manual methods. Once the tags are assigned, an interpreter automatically discriminates between deceptive and truthful statements based on tag densities. The texts used in our study come entirely from "real world" sources: criminal statements, police interrogations, and legal testimony. The corpus was hand-tagged for the truth value of all propositions that could be externally verified as true or false. Classification and Regression Tree techniques suggest that the approach is feasible, with the model able to identify 74.9% of the T/F propositions correctly. Implementation of an automatic tagger with a large subset of tags performed well on test data, producing an average score of 68.6% recall and 85.3% precision.
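
The tag-density step can be made concrete: count the occurrences of each indicator tag in a passage, normalize by passage length, and feed the resulting densities to a decision tree (standing in here for the CART techniques mentioned). The indicator words and labels below are hypothetical, not the study's tag set.

```python
# Sketch of tag-density discrimination: per-passage densities of a few
# deception-indicator tags, classified with a decision tree.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical indicator tags with tiny word lists.
INDICATORS = {"HEDGE": {"maybe", "possibly"},
              "NEGATION": {"not", "never"},
              "VAGUENESS": {"something", "someone"}}

def tag_densities(passage):
    toks = passage.lower().split()
    return [sum(t in words for t in toks) / max(len(toks), 1)
            for words in INDICATORS.values()]

passages = ["I never saw anything, maybe someone else did.",
            "I left the office at five and drove home."]
labels = [0, 1]  # toy labels: 0 = deceptive, 1 = truthful

clf = DecisionTreeClassifier().fit([tag_densities(p) for p in passages], labels)
print(clf.predict([tag_densities("Possibly something happened, not sure.")]))
```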
Conference Paper
Full-text available
We investigate automatic classification of speculative language ('hedging') in biomedical text using weakly supervised machine learning. Our contributions include a precise description of the task with annotation guidelines, analysis and discussion, a probabilistic weakly supervised learning model, and experimental evaluation of the methods presented. We show that hedge classification is feasible using weakly supervised ML, and point toward avenues for future research.
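
One common form of weak supervision for this task is self-training: fit a classifier on a few seed-labeled sentences, then repeatedly fold its most confident predictions on unlabeled sentences back into the training set. The sketch below is a generic stand-in under that assumption, not the probabilistic model presented in the paper; the data and confidence threshold are invented.

```python
# Self-training sketch for hedge classification with scikit-learn.
import numpy as np
import scipy.sparse as sp
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

seed_x = ["These findings suggest a possible link.",
          "The protein binds the receptor."]
seed_y = np.array([1, 0])  # 1 = hedged, 0 = factual
pool = ["This may indicate an interaction.",
        "The gene encodes a kinase."]

vec = TfidfVectorizer().fit(seed_x + pool)
X_seed, X_pool = vec.transform(seed_x), vec.transform(pool)

clf = MultinomialNB().fit(X_seed, seed_y)
for _ in range(3):                    # a few self-training rounds
    if X_pool.shape[0] == 0:
        break
    probs = clf.predict_proba(X_pool)
    keep = probs.max(axis=1) > 0.6    # arbitrary confidence threshold
    if not keep.any():
        break
    # Move confidently labeled sentences from the pool into the seed set.
    X_seed = sp.vstack([X_seed, X_pool[keep]])
    seed_y = np.concatenate([seed_y, probs[keep].argmax(axis=1)])
    X_pool = X_pool[~keep]
    clf = MultinomialNB().fit(X_seed, seed_y)

print(clf.predict(vec.transform(["The results might be spurious."])))
```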
Article
Full-text available
The Natural Language Toolkit is a suite of program modules, data sets and tutorials supporting research and teaching in computational linguistics and natural language processing. NLTK is written in Python and distributed under the GPL open source license. Over the past year the toolkit has been rewritten, simplifying many linguistic data structures and taking advantage of recent enhancements in the Python language. This paper reports on the simplified toolkit and explains how it is used in teaching NLP.
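
A minimal NLTK usage example follows: sentence splitting and part-of-speech tagging with nltk.pos_tag, which in modern releases is backed by the averaged perceptron tagger cited in the references below. Note that the resource names accepted by nltk.download have varied across NLTK versions; the ones shown match the classic releases.

```python
# Sentence splitting and POS tagging with NLTK.
import nltk
nltk.download("punkt")                       # sentence tokenizer models
nltk.download("averaged_perceptron_tagger")  # default POS tagger weights

text = "Earnings may decline. Management remains confident."
for sentence in nltk.sent_tokenize(text):
    print(nltk.pos_tag(nltk.word_tokenize(sentence)))
```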
Article
Full-text available
Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to solve incremental learning tasks. In this paper, we describe a framework and methodology, called instance-based learning, that generates classification predictions using only specific instances. Instance-based learning algorithms do not maintain a set of abstractions derived from specific instances. This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world databases, its performance degrades rapidly with the level of attribute noise in training instances. Therefore, we extended it with a significance test to distinguish noisy instances. This extended algorithm's performance degrades gracefully with increasing noise levels and compares favorably with a noise-tolerant decision tree algorithm.
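
A compact sketch of the storage-reduction idea described here, in the spirit of the IB2 variant: a training instance is stored only if the instances retained so far would misclassify it. The Euclidean distance and toy data are assumptions for illustration.

```python
# Instance-based learning with storage reduction (IB2-style sketch).
import numpy as np

def ib2_fit(X, y):
    stored_X, stored_y = [X[0]], [y[0]]  # always keep the first instance
    for xi, yi in zip(X[1:], y[1:]):
        dists = [np.linalg.norm(xi - s) for s in stored_X]
        pred = stored_y[int(np.argmin(dists))]
        if pred != yi:                   # store only misclassified instances
            stored_X.append(xi)
            stored_y.append(yi)
    return np.array(stored_X), np.array(stored_y)

def predict(x, stored_X, stored_y):
    # 1-nearest-neighbor prediction over the reduced instance store.
    return stored_y[int(np.argmin(np.linalg.norm(stored_X - x, axis=1)))]

X = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
stored_X, stored_y = ib2_fit(X, y)
print(len(stored_X), predict(np.array([0.95, 0.9]), stored_X, stored_y))
```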
Article
Relative to quantitative methods traditionally used in accounting and finance, textual analysis is substantially less precise. Thus, understanding the art is of equal importance to understanding the science. In this survey, we describe the nuances of the method and, as users of textual analysis, some of the tripwires in implementation. We also review the contemporary textual analysis literature and highlight areas of future research.
Article
We survey the textual sentiment literature, comparing and contrasting the various information sources, content analysis methods, and empirical models that have been used to date. We summarize the important and influential findings about how textual sentiment impacts on individual, firm-level and market-level behavior and performance, and vice versa. We point to what is agreed and what remains controversial. Promising directions for future research are emerging from the availability of more accurate and efficient sentiment measures resulting from increasingly sophisticated textual content analysis coupled with more extensive field-specific dictionaries. This is enabling more wide-ranging studies that use increasingly sophisticated models to help us better understand behavioral finance patterns across individuals, institutions and markets.
Article
In this paper it is shown how ridge estimators can be used in logistic regression to improve the parameter estimates and to diminish the error made by further predictions. Different ways to choose the unknown ridge parameter are discussed. The main attention focuses on ridge parameters obtained by cross-validation. Three different ways to define the prediction error are considered: classification error, squared error and minus log-likelihood. The use of ridge regression is illustrated by developing a prognostic index for the two-year survival probability of patients with ovarian cancer as a function of their deoxyribonucleic acid (DNA) histogram. In this example, the number of covariates is large compared with the number of observations and modelling without restrictions on the parameters leads to overfitting. Defining a restriction on the parameters, such that neighbouring intervals in the DNA histogram differ only slightly in their influence on the survival, yields ridge-type parameter estimates with reasonable values which can be clinically interpreted. Furthermore the model can predict new observations more accurately.
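
In current terminology this is L2-penalized (ridge) logistic regression with the penalty strength chosen by cross-validation. A short scikit-learn sketch follows; note that scikit-learn parameterizes the penalty through C, the inverse of the ridge parameter, and that the synthetic data merely mimics the many-covariates, few-observations setting of the example above.

```python
# Ridge logistic regression with a cross-validated penalty.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV

# More covariates than observations, so regularization is essential.
X, y = make_classification(n_samples=60, n_features=100, n_informative=5,
                           random_state=0)
clf = LogisticRegressionCV(Cs=np.logspace(-3, 3, 13), cv=5, penalty="l2",
                           scoring="neg_log_loss",  # minus log-likelihood
                           max_iter=5000).fit(X, y)
print("chosen C (inverse ridge parameter):", clf.C_[0])
```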
Article
This paper reviews and proposes additional research concerning the textual analysis of corporate disclosures in large-sample settings. I first discuss the motivations and methodology issues for this research area. I then review the recent papers that explore corporate textual disclosures to test economic hypotheses. I also discuss the challenges facing this literature and propose some future research opportunities.
Article
We estimate classification models of deceptive discussions during quarterly earnings conference calls. Using data on subsequent financial restatements (and a set of criteria to identify especially serious accounting problems), we label each call as “truthful” or “deceptive”. Our models are developed with the word categories that have been shown by previous psychological and linguistic research to be related to deception. Using conservative statistical tests, we find that the out-of-sample performance of the models that are based on CEO or CFO narratives is significantly better than random by 6%-16% and statistically dominates or is equivalent to models based on financial and accounting variables. We find that the answers of deceptive executives have more references to general knowledge, fewer non-extreme positive emotions, and fewer references to shareholder value. In addition, deceptive CEOs use significantly more extreme positive emotion and fewer anxiety words.
Article
This paper examines the information content of the forward-looking statements (FLS) in the Management Discussion and Analysis section (MD&A) of 10-K and 10-Q filings using a Naïve Bayesian machine learning algorithm. I find that firms with better current performance, lower accruals, smaller size, lower market-to-book ratio, less return volatility, lower MD&A Fog index, and longer history tend to have more positive FLSs. The average tone of the FLS is positively associated with future earnings even after controlling for other determinants of future performance. The results also show that, despite increased regulations aimed at strengthening MD&A disclosures, there is no systematic change in the information content of MD&As over time. In addition, the tone in MD&As seems to mitigate the mispricing of accruals. When managers “warn” about the future performance implications of accruals (i.e., the MD&A tone is positive (negative) when accruals are negative (positive)), accruals are not associated with future returns. The tone measures based on three commonly used dictionaries (Diction, General Inquirer, and the Linguistic Inquiry and Word Count) do not positively predict future performance. This result suggests that these dictionaries might not work well for analyzing corporate filings.
Article
Previous research uses negative word counts to measure the tone of a text. We show that word lists developed for other disciplines misclassify common words in financial text. In a large sample of 10-Ks during 1994 to 2008, almost three-fourths of the words identified as negative by the widely used Harvard Dictionary are words typically not considered negative in financial contexts. We develop an alternative negative word list, along with five other word lists, that better reflect tone in financial text. We link the word lists to 10-K filing returns, trading volume, return volatility, fraud, material weakness, and unexpected earnings.
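
The underlying tone measure is straightforward to compute: the share of a document's tokens that appear in a negative word list. The sketch below uses a tiny illustrative subset of financial negative words, not the actual word lists developed in the article.

```python
# Dictionary-based negative tone: fraction of tokens in a negative word list.
FIN_NEGATIVE = {"loss", "impairment", "litigation", "adverse", "restated"}

def negative_tone(text):
    tokens = [t.strip(".,;:").lower() for t in text.split()]
    return sum(t in FIN_NEGATIVE for t in tokens) / max(len(tokens), 1)

print(negative_tone("The impairment charge led to a litigation loss."))
```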
Conference Paper
We investigate the automatic detection of sentences containing linguistic hedges using corpus statistics and syntactic patterns. We take Wikipedia as an already annotated corpus using its tagged weasel words, which mark sentences and phrases as non-factual. We evaluate the quality of Wikipedia as training data for hedge detection, as well as shallow linguistic features.
Article
This paper presents a general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies. The procedure essentially involves the construction of functions of the observed proportions which are directed at the extent to which the observers agree among themselves and the construction of test statistics for hypotheses involving these functions. Tests for interobserver bias are presented in terms of first-order marginal homogeneity and measures of interobserver agreement are developed as generalized kappa-type statistics. These procedures are illustrated with a clinical diagnosis example from the epidemiological literature.
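
For two observers, the generalized kappa-type statistics mentioned here reduce to Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the marginal label rates. A short self-contained computation follows; the rater labels are invented for illustration.

```python
# Cohen's kappa for two raters over the same items.
from collections import Counter

def cohens_kappa(a, b):
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n          # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb[k] for k in set(ca) | set(cb)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

rater1 = ["U", "U", "C", "C", "U"]  # U = uncertain, C = certain
rater2 = ["U", "C", "C", "C", "U"]
print(cohens_kappa(rater1, rater2))  # about 0.615
```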
Riloff, E., Wiebe, J., Wilson, T.: Learning subjective nouns using extraction pattern bootstrapping. In: Proceedings of the Seventh Conference on Natural Language Learning, Edmonton (2003) 25-32
Farkas, R., Vincze, V., Móra, G., Csirik, J., Szarvas, G.: The CoNLL-2010 shared task: Learning to detect hedges and their scope in natural language text. In: Proceedings of the Fourteenth Conference on Computational Natural Language Learning: Shared Task, Uppsala (2010) 1-12
Honnibal, M.: Averaged perceptron tagger. https://github.com/nltk/nltk/blob/develop/nltk/tag/perceptron.py (2013) Accessed on January 27, 2017.
Honnibal, M.: A good part-of-speech tagger in about 200 lines of Python. https://explosion.ai/blog/part-of-speech-pos-tagger-in-python (2013) Accessed on January 27, 2017.
John, G.H., Langley, P.: Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence. (1995) 338-345
Cohen, W.W.: Fast effective rule induction. In: Proceedings of the Twelfth International Conference on Machine Learning. (1995) 115-123