Fig. 2. Document encoder architecture. We insert a [CLS] token before each sentence and use interval segment embeddings to distinguish different sentences.
Source publication
Sentence selection and summary generation are the two main steps in producing informative and readable summaries. However, most previous works treat them as two separate subtasks. In this paper, we propose a novel extractive-and-abstractive hybrid framework for the single-document summarization task by jointly learning to select sentences and rewrite summar...
Context in source publication
Context 1
... require it to output representations at both the sentence level and the word level. To get the representation for each sentence, we adopt modifications to the input sequence and embeddings of BERT similar to [12]: we insert a [CLS] token before each sentence and use interval segment embeddings to distinguish multiple sentences within a document. As illustrated in Fig. 2, the vector T_i, which corresponds to the i-th [CLS] symbol from the top BERT layer, is used as the representation for sentence s_i, while the vector h_i is the representation for each word. ...
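A minimal sketch of this idea, not the authors' code: using the HuggingFace transformers library (an assumption; the paper does not specify an implementation), we can prepend a [CLS] token to every sentence, alternate the segment ids to mimic interval segment embeddings, and read out the hidden state at each [CLS] position as the sentence vector T_i.

```python
# Sketch (assumed HuggingFace API, bert-base-uncased): one [CLS] per sentence,
# alternating "interval" segment ids, and [CLS] hidden states as sentence vectors.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

sentences = [
    "The cat sat on the mat.",
    "It looked very comfortable.",
    "Then it fell asleep.",
]

input_ids, segment_ids = [], []
for i, sent in enumerate(sentences):
    tokens = ["[CLS]"] + tokenizer.tokenize(sent) + ["[SEP]"]
    ids = tokenizer.convert_tokens_to_ids(tokens)
    input_ids.extend(ids)
    segment_ids.extend([i % 2] * len(ids))  # interval segment embeddings: 0,1,0,1,...

input_ids = torch.tensor([input_ids])
segment_ids = torch.tensor([segment_ids])

with torch.no_grad():
    outputs = model(input_ids=input_ids, token_type_ids=segment_ids)

hidden = outputs.last_hidden_state[0]                      # word-level vectors h_i
cls_positions = (input_ids[0] == tokenizer.cls_token_id).nonzero(as_tuple=True)[0]
sentence_vectors = hidden[cls_positions]                   # sentence-level vectors T_i
print(sentence_vectors.shape)                              # (num_sentences, hidden_size)
```

Sequence lengths, model checkpoint, and the 0/1 segment alternation are illustrative choices here; the paper only specifies the [CLS]-per-sentence and interval-segment-embedding scheme.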
Similar publications
The potential of synthetic data to replace real data creates a huge demand for synthetic data in data-hungry AI. This potential is even greater when synthetic data is used for training along with a small number of real images from domains other than the test domain. We find that this potential varies depending on (i) the number of cross-domain real...
Most existing medication recommendation models make predictions using only structured data such as medical codes, leaving the large amount of remaining unstructured or semi-structured data underutilized. To use these data more effectively, we propose a method of enhancing medication recommendation with Large Language Model (LLM) text...
This article deals with the derivatives Stimulation and Stimulierung from the perspective of synonymy. On the one hand, the lexicographic representation of their meaning in selected monolingual dictionaries is compared, and on the other hand, the derivatives are analysed and quantified from the point of view of their occurrence and co-occurrence in...
Existing pre-trained models have achieved state-of-the-art performance on various text classification tasks. These models have proven to be useful in learning universal language representations. However, the semantic discrepancy between similar texts cannot be effectively distinguished by advanced pre-trained models, which has a great influence on...
Large scale pretrained language models have demonstrated state-of-the-art performance in language understanding tasks. Their application has recently expanded into multimodality learning, leading to improved representations combining vision and language. However, progress in adapting language models towards conditional Natural Language Generation (...
Citations
... Another study focused on abstractive and extractive single-document summarization on the CNN/Daily Mail dataset using BERT as the pre-trained encoder. Evaluation with ROUGE F1 yielded a ROUGE-1 score of 41.76, a ROUGE-2 score of 19.31, and a ROUGE-L score of 38.86 [13]. Other research combined summarization models based on BERT and OpenAI GPT-2, providing abstractive, comprehensive keyword-based information from a collection of scientific articles in the COVID-19 Open Research Dataset Challenge. ...
Reviewing court decision documents for references when handling similar cases can be time-consuming. From this perspective, we need a system that can summarize court decision documents to enable adequate information extraction. This study used 50 court decision documents taken from the official website of the Supreme Court of the Republic of Indonesia, covering narcotics and psychotropics cases. The court decision documents were divided into two types: documents with the identity of the defendant and documents without the defendant's identity. We used BERT, specifically the IndoBERT model, to summarize the court decision documents. This study uses four IndoBERT models: IndoBERT-Base-Phase 1, IndoBERT-Lite-Base-Phase 1, IndoBERT-Large-Phase 1, and IndoBERT-Lite-Large-Phase 1. It also uses three summarization ratios (20%, 30%, and 40%) and evaluates with ROUGE-1, ROUGE-2, and ROUGE-3. The results show that the pre-trained IndoBERT model performed better at summarizing court decision documents, with or without the defendant's identity, at a 40% summarization ratio. The highest ROUGE score produced by IndoBERT came from the IndoBERT-Lite-Base-Phase 1 model, with an R-1 value of 1.00 for documents with the defendant's identity and 0.970 for documents without the defendant's identity at the 40% ratio. Future research is expected to use other types of BERT models, such as IndoBERT Phase-2 and LegalBERT.
... However, it contains food reviews only up to 2012. The dataset was finalised after searching various dataset repositories and was judged the best fit for short-text summarisation [23]. A link to the dataset is provided so that anyone who needs it can use it for short-text summarisation in future work. ...
... The proposed model uses BERT as a pre-trained model for word representations in tokens, and an LSTM and a transformer are then used to process the results [23]. ...
Retraction: [Abdulwahid Al Abdulwahid, Software solution for text summarisation using machine learning based Bidirectional Encoder Representations from Transformers algorithm, IET Software 2023 (https://doi.org/10.1049/sfw2.12098)].
The above article from IET Software, published online on 2 February 2023 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement between the Editor‐in‐Chief, Hana Chockler, the Institution of Engineering and Technology (the IET) and John Wiley and Sons Ltd. This article was published as part of a Guest Edited special issue. Following an investigation, the IET and the journal have determined that the article was not reviewed in line with the journal’s peer review standards and there is evidence that the peer review process of the special issue underwent systematic manipulation. Accordingly, we cannot vouch for the integrity or reliability of the content. As such we have taken the decision to retract the article. The authors have been informed of the decision to retract.
... It paraphrases the extractive sentences using an abstractive model, removing irrelevant information and normalizing expressions. Various rewriting systems have been developed, including sentence compression [9], syntax simplification [12], and paraphrasing [10], [15], [16], [17]. Human evaluation shows that rewriting methods can improve the readability and conciseness of extractive summaries [15], [17]. ...
... It is worth noting that contextualized rewriting, as expressed in Eq. 3, differs from previous non-contextualized rewriters [15], [16], [17], which do not use contextual information but directly calculate P(Y_j | E_j). ...
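The paper's Eq. 3 is not reproduced on this page; the following is only a rough reconstruction of the contrast being described, with the document D, the j-th extracted sentence E_j, the j-th summary sentence Y_j, and the group-tag notation all assumed rather than quoted.

```latex
% Non-contextualized rewriting: each summary sentence is conditioned only on
% its extracted sentence.
\[
P_{\text{non-ctx}}(Y_j) \;=\; P\bigl(Y_j \mid E_j\bigr)
\]
% Contextualized rewriting (sketch): conditioning on the whole document D,
% with group tags aligning the j-th extracted sentence to the j-th summary sentence.
\[
P_{\text{ctx}}(Y_j) \;=\; P\bigl(Y_j \mid D,\ \text{group-tag}(E_j)\bigr)
\]
```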
... To generate the oracle extraction, we match each sentence in the human summary against each document sentence, choosing the document sentence with the best matching score as the oracle extraction. Specifically, following Wei et al. [16], we use the average recall of ROUGE-1/2/L as the scoring function. Unlike existing work [5], which aims to find a set of sentences that maximizes ROUGE matching with the whole summary, we find the best match for each summary sentence. ...
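A hedged sketch of this per-sentence oracle matching, not the authors' implementation: for each human-summary sentence, pick the document sentence with the highest average ROUGE-1/2/L recall. The use of the `rouge-score` package and the toy sentences below are assumptions for illustration.

```python
# Per-sentence oracle extraction sketch (pip install rouge-score).
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

def avg_recall(summary_sentence: str, doc_sentence: str) -> float:
    # Average recall of ROUGE-1/2/L, treating the summary sentence as the reference.
    scores = scorer.score(summary_sentence, doc_sentence)
    return sum(s.recall for s in scores.values()) / len(scores)

def oracle_extraction(doc_sentences, summary_sentences):
    """For each summary sentence, return the index of its best-matching document sentence."""
    oracle = []
    for summ_sent in summary_sentences:
        best_idx = max(range(len(doc_sentences)),
                       key=lambda i: avg_recall(summ_sent, doc_sentences[i]))
        oracle.append(best_idx)
    return oracle

doc = ["The storm hit the coast on Monday.",
       "Thousands of homes lost power.",
       "Officials expect repairs to take a week."]
summary = ["A storm knocked out power to thousands of homes.",
           "Repairs may take a week."]
print(oracle_extraction(doc, summary))   # e.g. [1, 2]
```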
The rewriting method for text summarization combines the advantages of extractive and abstractive approaches, improving the conciseness and readability of extractive summaries. Existing rewriting systems take extractive sentences as their only input and rewrite each sentence independently, which may lose critical background knowledge and break the cross-sentence coherence of the summary. To this end, we propose contextualized rewriting, which consumes the entire document and maintains summary coherence by representing extractive sentences as part of the document encoding and introducing group-tags to align the extractive sentences with the summary. We further propose a general framework for rewriting with an external extractor and a joint internal extractor, representing sentence selection as a special token prediction. We demonstrate the framework's effectiveness by implementing three rewriter instances on various pre-trained models. Experiments show that contextualized rewriting significantly outperforms previous non-contextualized rewriting, achieving strong improvements in ROUGE scores over multiple extractors. Empirical results further suggest that joint modeling of sentence selection and rewriting can largely enhance performance.
... Convolutional Neural Networks (CNN) (Cao et al. 2016), Recurrent Neural Networks (RNN) (Nallapati et al. 2017;Al-Sabahi et al. 2018;Zhou et al. 2018;Chen et al. 2018) or a combination of both (Cheng and Lapata 2016) have been used for extractive summarization. Transformer-based methods have taken the field to new heights (Wei et al. 2019;Zhong et al. 2019;Liu 2019;Liu and Lapata 2019). Hidasi et al. (2016) discuss parallel RNNs to extract features from data of different modalities in recommender systems. ...
Recurrent Neural Networks (RNN) and their variants like Gated Recurrent Units (GRUs) have been the de-facto method in Natural Language Processing (NLP) for solving a range of NLP problems, including extractive text summarization. However, for certain sequential data with multiple temporal dependencies like the human text data, using a single RNN over the whole sequence might prove to be inadequate. Transformer models that use multiheaded attention have shown that human text contains multiple dependencies. Supporting networks like attention layers are needed to augment the RNNs to capture the numerous dependencies in text. In this work, we propose a novel combination of RNNs, called Parallel RNNs (PRNN), where small and narrow RNN units work on a sequence, in parallel and independent of each other, for the task of extractive text summarization. These PRNNs, without the need for any attention layers, capture various dependencies present in the sentence and document sequences. Our model achieved a 10% gain in ROUGE-2 score over the single RNN model on the popular CNN/Dailymail dataset. The boost in performance indicates that such an ensemble arrangement of RNNs improves the performance compared to the standard single RNNs, which alludes to the fact that constituent units of the PRNN learn various input sequence dependencies. Hence, the sequence is represented better using the combined representation from the constituent RNNs.
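A minimal PyTorch sketch of the parallel-RNN idea described above: several small, narrow recurrent units read the same sequence independently of one another, and their final states are concatenated into a combined representation. All sizes, names, and the choice of GRU cells are illustrative assumptions, not details taken from the paper.

```python
# Parallel RNN encoder sketch: independent narrow GRUs over the same sequence,
# final hidden states concatenated (no attention layers).
import torch
import torch.nn as nn

class ParallelRNNEncoder(nn.Module):
    def __init__(self, input_size: int, unit_hidden: int, num_units: int):
        super().__init__()
        # Small, narrow RNN units that run independently of one another.
        self.units = nn.ModuleList(
            nn.GRU(input_size, unit_hidden, batch_first=True) for _ in range(num_units)
        )

    def forward(self, x):                       # x: (batch, seq_len, input_size)
        finals = []
        for gru in self.units:
            _, h_n = gru(x)                     # h_n: (1, batch, unit_hidden)
            finals.append(h_n.squeeze(0))
        return torch.cat(finals, dim=-1)        # (batch, num_units * unit_hidden)

encoder = ParallelRNNEncoder(input_size=128, unit_hidden=32, num_units=4)
dummy = torch.randn(2, 20, 128)                 # batch of 2 sequences, 20 steps each
print(encoder(dummy).shape)                     # torch.Size([2, 128])
```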
The surge in text data has driven extensive research into developing diverse automatic summarization approaches to effectively handle vast textual information. There are several reviews on this topic, yet no large‐scale analysis based on quantitative approaches has been conducted. To provide a comprehensive overview of the field, this study conducted a bibliometric analysis of 3108 papers published from 2010 to 2022, focusing on automatic summarization research regarding topics and trends, top sources, countries/regions, institutions, researchers, and scientific collaborations. We have identified the following trends. First, the number of papers has experienced 65% growth, with the majority being published in computer science conferences. Second, Asian countries and institutions, notably China and India, actively engage in this field and demonstrate a strong inclination toward inter‐regional international collaboration, contributing to more than 24% and 20% of the output, respectively. Third, researchers show a high level of interest in multihead and attention mechanisms, graph‐based semantic analysis, and topic modeling and clustering techniques, with each topic having a prevalence of over 10%. Finally, scholars have been increasingly interested in self‐supervised and zero/few‐shot learning, multihead and attention mechanisms, and temporal analysis and event detection. This study is valuable when it comes to enhancing scholars' and practitioners' understanding of the current hotspots and future directions in automatic summarization.
This article is categorized under: Algorithmic Development > Text Mining
Nowadays, we face a huge amount of data, which makes the task of information analysis quite complex. In this context, automatic text summarization has achieved great success, as it can extract an efficient short version of documents covering the most important information. In this paper, we propose a new extractive approach to automatic text summarization based on deep learning techniques. This extractive approach can be easily applied to any document regardless of its language. Furthermore, by selecting sentences from the document, we guarantee the grammatical and linguistic correctness of the summaries. Experiments were conducted in order to evaluate the performance of the proposed approach.