Chapter

Abstract

Punctuation restoration is the process of adding punctuation symbols to raw text. It is typically used as a post-processing task for Automatic Speech Recognition (ASR) systems. In this paper we present an approach to punctuation restoration for texts in the Slovene language. The system is trained using bi-directional Recurrent Neural Networks fed with word embeddings only. The evaluation results show that our approach is capable of restoring punctuation with high recall and precision. The F1 score is particularly high for commas and periods, which are considered the most important punctuation symbols for understanding ASR-based transcripts.
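For illustration, below is a minimal sketch of the kind of model the abstract describes: a bi-directional recurrent network over word embeddings that emits one punctuation decision per word. All sizes, the label set, and the choice of GRU cells are assumptions made for the sketch; the paper's exact configuration is not reproduced here.

```python
# Minimal sketch of a bi-directional RNN punctuation restorer.
# Vocabulary size, embedding dimension, sequence length, label set,
# and cell type are illustrative assumptions, not the paper's values.
from tensorflow.keras import layers, models

VOCAB_SIZE = 50_000                  # word vocabulary (assumption)
EMBED_DIM = 200                      # word-embedding dimension (assumption)
SEQ_LEN = 50                         # words per training window (assumption)
LABELS = ["<none>", ",", ".", "?"]   # punctuation after each word

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN,)),
    layers.Embedding(VOCAB_SIZE, EMBED_DIM, mask_zero=True),
    layers.Bidirectional(layers.GRU(256, return_sequences=True)),
    # one punctuation decision per input word
    layers.TimeDistributed(layers.Dense(len(LABELS), activation="softmax")),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```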


... The second category treats punctuation recovery as a sequence labeling task, in which each word is assigned a label corresponding to the punctuation mark that follows it. The literature suggests that Conditional Random Fields (CRFs), which are feature-rich models, are well suited to this task, achieving F1 scores of around 55% on English datasets [11]. The third category uses neural networks, specifically deep neural networks (DNNs), to predict punctuation in a text. ...
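To make the sequence-labeling framing concrete, the sketch below converts punctuated text into per-word labels; the label names and the set of marks are illustrative, not taken from [11].

```python
# Illustrative conversion of punctuated text into a sequence-labeling
# dataset: each word receives the punctuation mark that follows it
# (or "O" for none). Label names and mark set are assumptions.
import re

PUNCT = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}

def to_labeled_sequence(text):
    tokens = re.findall(r"\w+|[,.?]", text.lower())
    words, labels = [], []
    for tok in tokens:
        if tok in PUNCT and words:
            labels[-1] = PUNCT[tok]   # attach the mark to the preceding word
        else:
            words.append(tok)
            labels.append("O")
    return list(zip(words, labels))

print(to_labeled_sequence("Hello, how are you? I am fine."))
# [('hello', 'COMMA'), ('how', 'O'), ('are', 'O'), ('you', 'QUESTION'),
#  ('i', 'O'), ('am', 'O'), ('fine', 'PERIOD')]
```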
... Compared with punctuation prediction accuracy reported for other languages, the differences are small, for example comma (-0.1) and period (+0.2) relative to [11]; in [22], the SCUT AGEC automatic grammar error correction model is unable to correct certain punctuation errors, which the researchers attribute to the inadequate size of the training set and the quality of the examples used. ...
... While the reported results do not provide as much insight into the methods employed as results for other languages do, notably higher accuracy can be attained for Arabic punctuation prediction than for other languages [11]. The relatively large difference can be attributed to various reasons, but we think it indicates that Arabic grammar and vocabulary are comparatively rich (i.e. ...
Conference Paper
Determining the context of speech in the Arabic language helps greatly in clarifying the meanings of words and thus in accurately determining the content of a sentence. Punctuation marks implicitly define an important part of this context. In this paper, we discuss the characteristics of the Arabic language and the problems associated with missing punctuation, and we achieve good accuracy in punctuation prediction compared to other languages. In this work, we focus on five main punctuation marks, including commas, question marks, exclamation marks, and periods. Prediction of question and exclamation marks performs best, achieving 94% accuracy, compared with the remaining punctuation marks. We also provide a new corpus for Arabic punctuation, the first corpus in this field.
... Several researchers have focused on punctuation prediction for various languages, including Slovenian [4], Chinese [5], Portuguese [6], Arabic [7], and others. Additionally, some researchers have explored the development of generalized models for this purpose. ...
Article
Full-text available
In the absence of explicit punctuation, the semantic and contextual nature of the Arabic language poses a unique challenge, necessitating the reintroduction of punctuation marks to elucidate sentence structure and meaning. We investigate the impact of sentence length on punctuation prediction in Arabic language processing, leveraging deep neural networks (DNNs), specifically bi-directional long short-term memory (Bi-LSTM) models. Our study goes beyond restoration, aiming to accurately predict punctuation marks in unprocessed text. The investigation focuses on five primary punctuation marks (. ? , : and !), contributing to a more comprehensive understanding of predicting diverse punctuation marks in Arabic texts, and we achieve 85% accuracy. This research not only advances our understanding of Arabic language processing but also serves as a broader exploration of the relationship between sentence length and punctuation prediction.
Article
Full-text available
Representing words as numerical vectors based on the contexts in which they appear has become the de facto method of analyzing text with machine learning. In this paper, we provide a guide for training these representations on clinical text data, using a survey of relevant research. Specifically, we discuss different types of word representations, clinical text corpora, available pre-trained clinical word vector embeddings, intrinsic and extrinsic evaluation, applications, and limitations of these approaches. This work can be used as a blueprint for clinicians and healthcare workers who may want to incorporate clinical text features in their own models and applications.
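As a minimal illustration of training such representations on domain text, the following uses the gensim Word2Vec API (version 4.0 or later); the two-sentence corpus is a stand-in, not clinical data.

```python
# Minimal sketch of training word vectors on domain text with gensim
# (>= 4.0 API). The tiny corpus here is a placeholder, not real
# clinical data, and the hyperparameters are illustrative.
from gensim.models import Word2Vec

corpus = [
    ["patient", "denies", "chest", "pain"],
    ["patient", "reports", "chest", "discomfort"],
]

model = Word2Vec(corpus, vector_size=100, window=5, min_count=1, sg=1)
vec = model.wv["patient"]              # 100-dimensional embedding
print(model.wv.most_similar("chest"))  # context-based neighbours
```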
Conference Paper
Full-text available
This paper explores the possibility of using a multiplicative gate to build two recurrent neural network structures. These two structures are called Deep Simple Gated Unit (DSGU) and Simple Gated Unit (SGU), which are structures for learning long-term dependencies. Compared to traditional Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), both structures require fewer parameters and less computation time in sequence classification tasks. Unlike GRU and LSTM, which require more than one gate to control information flow in the network, SGU and DSGU use only one multiplicative gate to control the flow of information. We show that this difference can accelerate the learning speed in tasks that require long dependency information. We also show that DSGU is more numerically stable than SGU. In addition, we propose a standard way of representing the inner structure of RNNs, called the RNN Conventional Graph (RCG), which helps to analyze the relationship between input units and hidden units of an RNN.
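As a rough illustration of the single-gate idea, here is a schematic recurrent step in which one multiplicative gate controls the information flow. The exact SGU/DSGU update equations are defined in the paper; this sketch (essentially a minimal gated update) is an assumption, not their formulation.

```python
# Schematic single-gate recurrent cell in the spirit of SGU: one
# multiplicative gate mixes a candidate state with the previous state.
# Illustrative only; not the exact SGU equations from the paper.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def single_gate_step(x, h, Wx, Wh, Ux, Uh):
    gate = sigmoid(x @ Ux + h @ Uh)        # the single multiplicative gate
    cand = np.tanh(x @ Wx + h @ Wh)        # candidate state
    return gate * cand + (1.0 - gate) * h  # gated update

rng = np.random.default_rng(0)
d_in, d_h = 8, 16
params = [rng.normal(scale=0.1, size=s)
          for s in [(d_in, d_h), (d_h, d_h), (d_in, d_h), (d_h, d_h)]]
h = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):       # run over a length-5 sequence
    h = single_gate_step(x, h, *params)
```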
Conference Paper
Full-text available
The output of automatic speech recognition systems is generally an unpunctuated stream of words which is hard to process for both humans and machines. We present a two-stage recurrent neural network based model using long short-term memory units to restore punctuation in speech transcripts. In the first stage, textual features are learned on a large text corpus. The second stage combines textual features with pause durations and adapts the model to speech domain. Our approach reduces the number of punctuation errors by up to 16.9% when compared to a decision tree that combines hidden-event language model posteriors with inter-word pause information, having largest improvements in period restoration.
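A hypothetical sketch of the second-stage idea follows: pause durations are appended to the word representations before the recurrent layer. Layer types and sizes are assumptions for the sketch, not the paper's architecture.

```python
# Hypothetical two-input model: word ids plus per-word pause durations.
# All dimensions are illustrative; the paper's exact architecture,
# training stages, and sizes are not reproduced here.
from tensorflow.keras import layers, models

SEQ_LEN, VOCAB_SIZE, EMBED_DIM, N_LABELS = 50, 50_000, 200, 4

words = layers.Input(shape=(SEQ_LEN,), name="word_ids")
pauses = layers.Input(shape=(SEQ_LEN, 1), name="pause_durations")

emb = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(words)
x = layers.Concatenate()([emb, pauses])      # textual + pause features
x = layers.Bidirectional(layers.LSTM(256, return_sequences=True))(x)
out = layers.TimeDistributed(layers.Dense(N_LABELS, activation="softmax"))(x)

model = models.Model([words, pauses], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```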
Conference Paper
Full-text available
Semantic analysis of multimodal video aims to index segments of interest at a conceptual level. In reaching this goal, it requires an analysis of several information streams. At some point in the analysis these streams need to be fused. In this paper, we consider two classes of fusion schemes, namely early fusion and late fusion. The former fuses modalities in feature space, the latter fuses modalities in semantic space. We show by experiment on 184 hours of broadcast video data and for 20 semantic concepts, that late fusion tends to give slightly better performance for most concepts. However, for those concepts where early fusion performs better the difference is more significant.
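A toy contrast between the two schemes: early fusion trains one classifier on the concatenated feature space, while late fusion trains one classifier per modality and combines their posteriors. The data and the classifier choice are illustrative only.

```python
# Toy early vs. late fusion. Features are random stand-ins for
# visual and audio streams; classifier choice is illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
visual = rng.normal(size=(200, 10))     # stand-in visual features
audio = rng.normal(size=(200, 5))       # stand-in audio features
y = rng.integers(0, 2, size=200)

# Early fusion: one model over the concatenated feature space.
early = LogisticRegression().fit(np.hstack([visual, audio]), y)

# Late fusion: per-modality models, averaged posteriors.
m_v = LogisticRegression().fit(visual, y)
m_a = LogisticRegression().fit(audio, y)
late_scores = (m_v.predict_proba(visual) + m_a.predict_proba(audio)) / 2
late_pred = late_scores.argmax(axis=1)
```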
Conference Paper
Full-text available
We study the problem of detecting linguistic events at interword boundaries, such as sentence boundaries and disfluency locations, in speech transcribed by an automatic recognizer. Recovering such events is crucial to facilitate speech understanding and other natural language processing tasks. Our approach is based on a combination of prosodic cues modeled by decision trees, and word-based event N-gram language models. Several model combination approaches are investigated. The techniques are evaluated on conversational speech from the Switchboard corpus. Model combination is shown to give a significant win over individual knowledge sources.
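One common model combination scheme is linear interpolation of posteriors, sketched below with toy numbers; the paper evaluates several combination approaches, and the weight here is arbitrary.

```python
# Minimal illustration of posterior interpolation between two
# knowledge sources. All numbers are toy values, and the weight
# is an arbitrary assumption.
import numpy as np

p_prosody = np.array([0.2, 0.8])   # P(boundary | prosodic cues), toy values
p_lm = np.array([0.6, 0.4])        # P(boundary | event N-gram LM), toy values

lam = 0.5                           # interpolation weight (assumption)
p_combined = lam * p_prosody + (1 - lam) * p_lm
print(p_combined)                   # [0.4 0.6]
```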
Article
An attentional mechanism has lately been used to improve neural machine translation (NMT) by selectively focusing on parts of the source sentence during translation. However, there has been little work exploring useful architectures for attention-based NMT. This paper examines two simple and effective classes of attentional mechanism: a global approach which always attends to all source words and a local one that only looks at a subset of source words at a time. We demonstrate the effectiveness of both approaches over the WMT translation tasks between English and German in both directions. With local attention, we achieve a significant gain of 5.0 BLEU points over non-attentional systems which already incorporate known techniques such as dropout. Our ensemble model using different attention architectures has established a new state-of-the-art result in the WMT'15 English to German translation task with 25.9 BLEU points, an improvement of 1.0 BLEU points over the existing best system backed by NMT and an n-gram reranker.
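A schematic of the global variant with dot-product scoring (one of the score functions the paper studies): every source state is scored against the current target state, softmax-normalized, and summed into a context vector.

```python
# Global attention with dot-product scoring: score all source states
# against the current decoder state, softmax, weighted sum.
# Dimensions are illustrative.
import numpy as np

def global_attention(decoder_state, encoder_states):
    scores = encoder_states @ decoder_state          # dot-product scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                         # softmax over source
    context = weights @ encoder_states               # weighted source summary
    return context, weights

enc = np.random.default_rng(0).normal(size=(7, 32))  # 7 source positions
dec = np.random.default_rng(1).normal(size=32)       # current target state
context, attn = global_attention(dec, enc)
```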
Dictionary of modern Slovene: problems and solutions (Book series Prevodoslovje in uporabno jezikoslovje), 1st edn
  • N Logar
Postavljanje vejic v slovenščini s pomočjo strojnega učenja in izboljšanega korpusa Šolar [Comma placement in Slovene using machine learning and the improved Šolar corpus]
  • A Krajnc
  • M Robnik-Šikonja
Punctuation prediction for unsegmented transcript based on word vector
  • X Che