Automatic Summarization for Financial News Delivery on Mobile Devices.

Authors:
  • Hong Kong Metropolitan University

Abstract

Wireless access with mobile devices is a promising addition to the WWW and traditional electronic business. Mobile devices provide convenient, portable access to the huge information space on the Internet. Most investors want access to the most up-to-date financial information through mobile devices in order to make critical and urgent decisions. In this paper, we present a financial news delivery system for mobile devices based on the fractal summarization model. Fractal summarization is based on fractal theory. It generates a brief skeleton of the summary at the first stage, and the details of the summary at different levels of the document are generated on demand. Such interactive summarization reduces the computation load compared with generating the entire summary in one batch, as traditional summarization does, which makes it well suited to wireless access.
Automatic Summarization for Financial News Delivery on Mobile Devices
Christopher C. C. Yang and Fu Lee Wang
Department of Systems Engineering and Engineering Management
The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
{yang, flwang}@se.cuhk.edu.hk
Index Terms: Document summarization, financial news delivery, fisheye view, fractal view, handheld devices, mobile commerce.
1. INTRODUCTION
Access to the Internet through mobile phones and other handheld devices has grown significantly in recent years. Many user-centered m-services
applications, such as web surfing, e-mail checking, and stock price quoting, have been developed. However, m-services should not be limited to
user-centered applications but should be extended to knowledge management. A large amount of financial news is generated on the Internet every day.
In a fast-paced economy, an m-commerce organization gains an advantage by accessing the most up-to-date and accurate financial information
available and making decisions as fast as possible. It is therefore desirable to deliver financial news through mobile devices so that investors can
retrieve relevant information anywhere, at any time. Because of the huge volume of news generated every day, most news delivery services provide
summarization tools to help users search for relevant information through a Web browser on PC platforms, such as the Lycos Financial Feed System
with its summarization system from Diyatech and YellowBrix with Inxight’s Summarizer. Unfortunately, mobile devices have many shortcomings,
such as limited screen size, narrow network bandwidth, small memory capacity, and low computing power, so summarizers built for PC
platforms cannot be used on mobile devices directly. To reduce the amount of information displayed and the download time, a WAP gateway is set up to
summarize the news so that users can preview its major content. The wireless handheld devices navigate interactively with the gateway
over the wireless network to retrieve the summary piece by piece. In this paper, we present a financial news delivery system for mobile devices based
on the fractal summarization model. In addition, information visualization techniques are presented to reduce the visual load.
2. FRACTAL SUMMARIZATION MODEL
Traditional automatic text summarization selects sentences from the source document based on their significance to the document [2][9].
The selection is based on salient features of the document, such as the thematic, location, title, and cue features.
The thematic feature was first identified by Luhn [9]; the tfidf (term frequency, inverse document frequency) [12] method is currently the most widely
used approach. The system first calculates the tfidf score of each term in the document, and the thematic weight of a sentence is calculated as the
sum of the tfidf scores of its constituent words.
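The thematic weight described above can be illustrated with a minimal Python sketch. The paper does not state which tfidf variant is used, so the formula here (raw term frequency times the logarithm of the inverse document frequency over a small background corpus) is an assumption, and the names `tokenize`, `tfidf_scores`, and `thematic_weights` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_scores(document, corpus):
    """Return a tfidf score for every term of `document`.

    `corpus` is a list of background documents (strings). The variant used,
    tf * log(N / df), is an assumption; the paper does not specify one.
    """
    tf = Counter(tokenize(document))
    corpus_terms = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(corpus) + 1  # background documents plus the document itself
    scores = {}
    for term, freq in tf.items():
        df = 1 + sum(1 for terms in corpus_terms if term in terms)
        scores[term] = freq * math.log(n_docs / df)
    return scores


def thematic_weights(sentences, corpus):
    """Thematic weight of a sentence = sum of the tfidf scores of its words."""
    scores = tfidf_scores(" ".join(sentences), corpus)
    return [sum(scores.get(term, 0.0) for term in tokenize(s)) for s in sentences]
```

A sentence made of terms that are rare in the background corpus receives a higher thematic weight than one made of common terms.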
The significance of a sentence is also indicated by its location, based on the hypothesis that topic sentences tend to occur at the beginning or the end of
documents or paragraphs [2]. Therefore, the location weight of a sentence can be calculated by a simple function of its ordinal position in the
document.
The title feature is based on the hypothesis that the author conceives the title as circumscribing the subject matter of the document [2]. A
dictionary of heading keywords with tfidf weights is first constructed automatically from the heading sentences of the document. The heading weight
of a sentence is calculated as the sum of the heading weights of its constituent words.
The cue feature was proposed by Edmundson [2] based on the hypothesis that the probable relevance of a sentence is affected by the presence of
pragmatic words. A pre-stored dictionary of cue phrases with cue weights is used to calculate the cue weight. The cue weight of a sentence is
calculated as the sum of the cue weights of its constituent words.
Typical summarization systems obtain the sentence weight by computing the weighted sum of the weights of all the features [2][8]. The sentences
whose sentence weight is higher than a threshold value are selected as part of the summary. It has been shown that the weighting of the different features does
not have any substantial effect on the average precision [8]. In our system, the maximal weight of each feature is normalized to one.
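The combination step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the feature coefficients, and the threshold value are hypothetical, and only the normalization of each feature's maximum to one and the weighted-sum-plus-threshold selection come from the text.

```python
def normalize(values):
    """Scale a list of feature weights so that its maximum becomes one."""
    peak = max(values, default=0.0)
    return [v / peak if peak > 0 else 0.0 for v in values]


def sentence_weights(features, coefficients):
    """Weighted sum of the normalized feature weights of each sentence.

    `features` maps a feature name (e.g. "thematic") to one raw weight per
    sentence; `coefficients` maps the same names to hypothetical weightings.
    """
    normalized = {name: normalize(vals) for name, vals in features.items()}
    n_sentences = len(next(iter(normalized.values())))
    return [
        sum(coefficients[name] * normalized[name][i] for name in normalized)
        for i in range(n_sentences)
    ]


def select(weights, threshold):
    """Indices of the sentences whose weight exceeds the threshold."""
    return [i for i, w in enumerate(weights) if w > threshold]
```

With equal coefficients, the sentence weight is simply the sum of the four normalized feature weights.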
Traditional summarization models treat the source document as a sequence of sentences. However, many studies [3][5] of the human abstraction
process have shown that human abstractors extract topic sentences according to the document structure, from the top level to the lower levels, until
they have extracted sufficient information. In addition, it is believed that document summarization on handheld devices must make use of a
“tree view” [1] and “hierarchical display”, for which a flat sequence of sentences is not suitable. The fractal summarization model is based on
fractal theory [10]. In fractal summarization, the important information is captured from the source text by exploring the hierarchical structure and
salient features of the document. A condensed version of the document that is informatively close to the original is produced iteratively using the
contractive transformation of fractal theory. As in fractal geometry, a large document has a hierarchical structure with several levels:
chapters, sections, subsections, paragraphs, sentences, terms, words, and characters. At the lower abstraction levels of a document, more specific
information can be obtained. Although a document is not a true mathematical fractal object, since it cannot be viewed at infinitely many
abstraction levels, we may consider a document as a prefractal [4]. The lowest abstraction level in our consideration is the term. The fractal
summarization model applies a technique similar to fractal image compression [7]; it generates the summary by a simple recursive deterministic
algorithm based on the iterated representation of a document. In fractal summarization, the user specifies the compression ratio of the summarization.
The default compression ratio is 4%, because a summary with a high compression ratio can achieve reasonably high precision [13] and saves
network bandwidth. The system uses the compression ratio to calculate the total number of sentences to be extracted as the summary, and the
source document is segmented into range blocks of text according to the document structure (Figure 1). The system calculates the sentence weights
as in traditional summarization and allocates the sentence quota to each range block in proportion to the sum of the sentence weights in that block.
Each range block is then iteratively partitioned into child blocks, and the quota is propagated down the summarization tree according to the sum of
sentence weights in the child blocks, until a contractive mapping is found that transforms the text block into fewer than five sentences by traditional
summarization methods; it has been shown that the optimal length of a summary produced by extracting a fixed number of sentences is three to five sentences [6].
The summaries generated by fractal summarization retain a tree structure, which is suitable for hierarchical display on handheld devices.
Experiments have shown that fractal summarization outperforms traditional summarization [14].
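The quota propagation described in this section can be sketched as a short recursive routine in Python. This is a minimal illustration, not the authors' code: the `Block` class and the function names are hypothetical, the largest-remainder rounding used to turn proportional shares into whole sentences is an assumption (the paper does not say how fractional quotas are rounded), and a block is summarized directly once its quota drops below `leaf_limit` (five in the text).

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A range block: either a leaf with sentences or a node with children."""
    name: str
    sentences: list = field(default_factory=list)  # (text, weight) pairs
    children: list = field(default_factory=list)   # child Block objects


def all_sentences(block):
    """All (text, weight) pairs under a block, in document order."""
    if not block.children:
        return list(block.sentences)
    return [s for child in block.children for s in all_sentences(child)]


def total_weight(block):
    return sum(weight for _, weight in all_sentences(block))


def split_quota(quota, weights):
    """Split an integer quota in proportion to the weights.

    Uses largest-remainder rounding, which is an assumption of this sketch.
    """
    total = sum(weights)
    if total == 0:
        return [0] * len(weights)
    exact = [quota * w / total for w in weights]
    shares = [int(x) for x in exact]
    leftover = quota - sum(shares)
    order = sorted(range(len(weights)), key=lambda i: exact[i] - shares[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def fractal_summary(block, quota, leaf_limit=5):
    """Propagate the quota down the tree and extract the top sentences."""
    if quota <= 0:
        return []
    if quota < leaf_limit or not block.children:
        # Small enough: extract the highest-weighted sentences directly.
        sentences = all_sentences(block)
        top = sorted(range(len(sentences)), key=lambda i: sentences[i][1], reverse=True)[:quota]
        return [sentences[i][0] for i in sorted(top)]  # keep document order
    shares = split_quota(quota, [total_weight(c) for c in block.children])
    return [s for child, share in zip(block.children, shares)
            for s in fractal_summary(child, share, leaf_limit)]
```

Calling `split_quota(40, [0.25, 0.25, 0.2, 0.1, 0.15, 0.05])`, with Commentary and Earnings both at 0.25, gives the subcategory quotas 10, 10, 8, 4, 6, and 2 shown in Figure 1.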
3. FRACTAL SUMMARIZATION OF FINANCIAL NEWS AT YAHOO! NEWS
The fractal summarization model summarizes documents based on their hierarchical structure. In addition to large text documents, many other
information sources also exhibit a hierarchical structure, such as web sites and newspapers. The model is applied here to summarize the financial news
downloaded from Yahoo! News. Because a large volume of news articles is generated every day, categorization is required for easy searching and
browsing of news. For example, there are twenty-one categories in Yahoo! News, each of which is subdivided into subcategories. Each
subcategory contains around ten news articles, each news article may contain more than one section, each section contains a few paragraphs, and each
paragraph contains a few sentences. As we are interested in financial news, we focus on the Yahoo! News ‘Business’ category only.
Figure 1 illustrates the fractal summarization of the Yahoo! News ‘Business’ category. Fractal summarization
generates a brief skeleton of the summary at the first stage, and the details of the summary at different levels of the news tree
are generated on demand. The system first shows a card containing the 6 subcategories of the ‘Business’
category (Figure 2a), which gives the user a general idea of how the news articles are organized; the user can then select a
subcategory to obtain more details. Such interactive summarization reduces the computation load compared
with generating the entire summary in one batch, as traditional automatic summarization does, which makes it well suited to
m-services.
Given a card of a summary node, there may be too many sentences or child nodes to display on the
small screen of a handheld device. In our system, the size of an object depends on its significance.
The 3-scale font mode available in WML is utilized. The prototype system, running on the Nokia Handset Simulator, is shown
in Figure 2. As shown in Figure 2a, 3 subcategories of the Business category are displayed in a large font, which means
that they are more important; the rest are in a normal or small font according to their importance. When the user
clicks the anchor link of a subcategory, the WAP gateway delivers a card that depends on the quota allocated. If a large
quota is allocated to the subcategory, the system shows another card containing an index of news articles. However,
if the quota is less than 5 sentences, the system shows a card with the summary of all news articles in the subcategory
(Figure 2b). On the summary page, when the user clicks the anchor link ‘More’ at the end of a sentence, the system
generates the summary of the corresponding news article with a compression ratio of 20%, because it has been shown that
an extract of 20% of the sentences can be as informative as the full text of the source document [11]. Alternatively, the
user can click the anchor link ‘Full’ to view the full text of the news article.
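The card selection and font scaling described above can be sketched in Python as follows. The rule that a quota below 5 sentences yields a summary card comes from the text; everything else is an assumption made for illustration: the function names, the link targets, and in particular the thresholds that map a subcategory's weight to a large, normal, or small font, since the paper does not give the exact rule. The `<big>` and `<small>` elements are standard WML 1.x text-emphasis tags.

```python
from xml.sax.saxutils import escape, quoteattr


def card_kind(quota, leaf_limit=5):
    """Summary card for small quotas, index card otherwise."""
    return "summary" if quota < leaf_limit else "index"


def font_tag(weight, max_weight):
    """Map a weight to a WML emphasis tag (None means the normal font).

    The 0.75 and 0.5 cut-offs are hypothetical; the paper only says that
    more important items are shown in a larger font.
    """
    ratio = weight / max_weight if max_weight > 0 else 0.0
    if ratio >= 0.75:
        return "big"
    if ratio >= 0.5:
        return None
    return "small"


def index_card(title, children):
    """Build a WML card listing (name, weight, href) children as links."""
    max_weight = max((weight for _, weight, _ in children), default=0.0)
    lines = [f'<card id="index" title={quoteattr(title)}>', "<p>"]
    for name, weight, href in children:
        link = f"<a href={quoteattr(href)}>{escape(name)}</a>"
        tag = font_tag(weight, max_weight)
        if tag is not None:
            link = f"<{tag}>{link}</{tag}>"
        lines.append(link + "<br/>")
    lines += ["</p>", "</card>"]
    return "\n".join(lines)
```

With the subcategory weights of Figure 1 and these cut-offs, three subcategories (weights 0.25, 0.25, and 0.2) fall in the large-font band, which matches the three large entries reported for Figure 2a.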
Yahoo! News Business (Weight: 1, Quota: 40)
  Commentary (Weight: 0.25, Quota: 10)
    Free Flight (Weight: 0.075, Quota: 3)
    News...
      Paragraphs...
  Earnings (Weight: 0.25, Quota: 10)
    News...
      Paragraphs...
  Economy (Weight: 0.2, Quota: 8)
    News...
  Industries (Weight: 0.1, Quota: 4)
  Personal Finance (Weight: 0.15, Quota: 6)
    News...
  Stock Market (Weight: 0.05, Quota: 2)
Figure 1. Fractal Summarization of Yahoo! News-‘Business’ Category
(a) Subcategories (b) Summary of News
Figure 2. Screen of WAP Summarization System
4. REFERENCES
[1] Buyukkokten O. et al., 2001. “Accordion Summarization for End-Game Browsing on PDAs and Cellular Phones”. Human-Computer
Interaction Conf. 2001 (CHI 2001). Washington.
[2] Edmundson H. P., 1969. “New Methods in Automatic Extracting”. Journal of the ACM, 16(2) 264-285.
[3] Endres-Niggemeyer B. et al., 1995. “How to Implement a Naturalistic Model of Abstracting”. Info. Pro. & Man. 31(5) 631-674.
[4] Feder J., 1988. Fractals. Plenum, New York.
[5] Glaser B. G. et al., 1967. “The discovery of grounded theory; strategies for qualitative research”. Aldine de Gruyter, New York.
[6] Goldstein J. et al., 1999. “Summarizing text documents: Sentence selection and evaluation metrics”. In Proc. of SIGIR, 121-128.
[7] Jacquin A., 1993. “Fractal image coding: A review”. In Proc. of the IEEE, 81(10) 1451-1465.
[8] Lam-Adesina M. et al., 2001. “Applying summarization Techniques for Term Selection in Relevance Feedback”, In Proc. of SIGIR 2001,
1-9.
[9] Luhn H. P., 1958. “The Automatic Creation of Literature Abstracts”. IBM Journal of R & D, 159-165.
[10] Mandelbrot B., 1983. The fractal geometry of nature, New York: W.H. Freeman.
[11] Morris G. et al., 1992. “The effect and limitation of automated text condensing on reading comprehension performance”. Info. Sys.
Research, 17-35.
[12] Salton G. et al., 1988. “Term-Weighting Approaches in Automatic Text Retrieval”, Info. Pro. & Man., 24, 513-523.
[13] Teufel S. et al., 1998. “Sentence Extraction and rhetorical classification for flexible abstracts”, AAAI Spring Sym. on Intel. Text
Summarization, Stanford.
[14] Yang C. C., and Wang, F. L., 2002. “Document Summarization on Handheld Device” In Proc. of Workshop on e-Business WEB2002,
Barcelona.