Accuracy for the different models over an incremental set of features, ranked by their importance computed from the different ML strategies. The set of features is gradually incremented from 1 to 153 features.

Source publication
Article
The analysis of discourse and the study of what characterizes it in terms of communicative objectives is essential to most tasks of Natural Language Processing. Consequently, research on textual genres as expressions of such objectives presents an opportunity to enhance both automatic techniques and resources. To conduct an investigation of this ki...

Context in source publication

Context 1
... A = {a_m^0, ..., a_m^n} is calculated, with a_m^j being the accuracy of model m computed with the first j features of F. Since features are included in order of importance, accuracy growth is expected to be fastest at the start of the set. We compute this process for the 16 feature selection methods combined with the BMs resulting from Section V-A. Fig. 3 shows accuracy measures for these combinations. Notably, accuracy increased significantly when the first features of the ranked set were input; thereafter, at a certain point, it stabilized. For a given feature method, seven A sets have been computed and it is possible to calculate a new set of mean ...
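The incremental-accuracy procedure described above can be sketched as follows. The dataset, classifier, and importance ranking here are illustrative stand-ins (scikit-learn's RandomForestClassifier on synthetic data), not the paper's actual 153 genre features or its BMs:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the ranked feature set (the paper uses 153 features).
X, y = make_classification(n_samples=300, n_features=10,
                           n_informative=5, random_state=0)

# Rank features by importance from a fitted baseline model.
rf = RandomForestClassifier(n_estimators=30, random_state=0).fit(X, y)
order = np.argsort(rf.feature_importances_)[::-1]

# A = {a_m^1, ..., a_m^n}: cross-validated accuracy of model m
# using only the first j most important features.
A = []
for j in range(1, X.shape[1] + 1):
    clf = RandomForestClassifier(n_estimators=30, random_state=0)
    A.append(cross_val_score(clf, X[:, order[:j]], y, cv=3).mean())
```

Plotting A against j reproduces the curve shape described above: a steep rise while the most informative features are added, then a plateau.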

Similar publications

Article
Topic modeling is a probabilistic graphical model for discovering latent topics in text corpora by using multinomial distributions of topics over words. Topic labeling is used to assign meaningful labels for the discovered topics. In this paper, we present a new topic labeling method that uses automatic term recognition to discover and assign relev...

Citations

... On the other hand, textbooks with disorganized or fragmented themes may hinder students' ability to connect concepts and grasp the material effectively. The findings from semantic analysis techniques provide actionable insights for authors and publishers [9]. High semantic similarity and accurate named entity recognition suggest that textbooks are well-aligned with educational objectives and present relevant information effectively. ...
Research
This study explores innovations in semantic analysis of textbooks using Natural Language Processing (NLP) to enhance the evaluation and development of educational materials. Semantic analysis focuses on understanding the meaning and context within texts, providing insights into how effectively textbooks convey concepts and support learning objectives. By employing advanced NLP techniques such as semantic similarity measurement, named entity recognition, and context-aware topic modeling, this research offers a detailed examination of textbook content. Semantic similarity measurement reveals the alignment of textbook content with key educational themes and concepts, while named entity recognition identifies and categorizes important entities such as historical figures, scientific terms, and key concepts. Context-aware topic modeling uncovers the underlying themes and structures of textbooks, providing a comprehensive view of how information is organized and presented. The results demonstrate how these semantic analysis techniques can identify gaps in content coverage, assess the coherence and relevance of instructional material, and improve the alignment of textbooks with curriculum standards. The study also highlights the potential of semantic analysis to enhance the development of more effective educational resources by providing actionable insights for authors and publishers. Ultimately, this research showcases the transformative potential of NLP in advancing the evaluation and creation of textbooks, contributing to improved educational outcomes through more precise and contextually relevant content analysis.
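A minimal sketch of the semantic similarity measurement described above, assuming a simple TF-IDF/cosine formulation rather than whatever embedding model the study actually used; the passage and themes are invented examples:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A textbook passage scored against two hypothetical curriculum themes.
passage = "Photosynthesis converts light energy into chemical energy in plants."
themes = [
    "plant biology energy conversion photosynthesis light",
    "european history medieval trade routes",
]

# Fit one vocabulary over all texts, then compare vectors by cosine.
vec = TfidfVectorizer().fit([passage] + themes)
sims = cosine_similarity(vec.transform([passage]), vec.transform(themes))[0]
```

A high score for the first theme and a near-zero score for the second would indicate the alignment (or lack of it) the study measures.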
... Textbooks that convey encouragement and positivity are likely to contribute to better student engagement and a more constructive approach to learning. The evaluation of content alignment with educational standards using NLP tools has highlighted the effectiveness of automated content checks in identifying gaps and ensuring that textbooks meet curricular requirements [9]. ...
Research
This study explores the use of Natural Language Processing (NLP) to uncover pedagogical patterns within educational textbooks, aiming to enhance the understanding of instructional content and its effectiveness. By applying advanced NLP techniques, including readability assessment, thematic analysis, sentiment analysis, and content alignment checks, the research provides a detailed examination of textbook content. Readability assessments reveal how well textbooks cater to different student levels by analyzing sentence structure, vocabulary, and overall complexity. Thematic analysis, utilizing topic modeling, uncovers the primary educational themes and patterns within textbooks, assessing their alignment with curricular objectives. Sentiment analysis evaluates the emotional tone of the text, offering insights into how the tone may influence student engagement and motivation. Content alignment checks compare textbook material against educational standards, identifying gaps and ensuring that the content supports key learning outcomes. The study also includes a comparative analysis of textbooks from various publishers and educational levels, highlighting variations in pedagogical approaches and content quality. The results provide valuable feedback for educators, authors, and publishers, aiming to improve the design and effectiveness of educational materials. This research demonstrates the potential of NLP to reveal pedagogical patterns and enhance the development of textbooks that are both engaging and aligned with educational goals, ultimately contributing to better learning experiences for students.
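A readability assessment along these lines can be approximated with the Flesch Reading Ease formula; the vowel-group syllable counter below is a rough heuristic, not the study's actual tooling:

```python
import re

def syllables(word: str) -> int:
    # Rough heuristic: count contiguous vowel groups, minimum one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syl = sum(syllables(w) for w in words)
    # Higher scores mean easier text; ~90+ is very easy, <30 is difficult.
    return (206.835 - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syl / len(words)))

easy = flesch_reading_ease("The cat sat. The dog ran. It was fun.")
hard = flesch_reading_ease(
    "Comprehensive institutional methodologies necessitate "
    "multidimensional pedagogical reconceptualization.")
```

Comparing scores across textbook passages gives the kind of level-appropriateness signal the study derives from sentence structure, vocabulary, and complexity.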
... This preference for reviews may be due to the motivation of the reviewer, who voluntarily writes these comments to share experiences and help others, either by criticizing or praising the product. In any case, the user normally tries to include as much information as possible, both in the evaluation part (e.g., "the laptop is incredibly light"), and in the product description (e.g., "the room was not insulated") [62]. Therefore, users of online reviews will generally make use of this feature, depending on the amount of information they want to include in the narration of their experience and the evaluation of the product. ...
Article
We present a data-driven approach to discover and extract patterns in textual genres with the aim of identifying whether there is an interesting variation of linguistic features among different narrative genres depending on their respective communicative purposes. We want to achieve this goal by performing a multilevel discourse analysis according to (1) the type of feature studied (shallow, syntactic, semantic, and discourse-related); (2) the texts at a document level; and (3) the textual genres of news, reviews, and children’s tales. To accomplish this, several corpora from the three textual genres were gathered from different sources to ensure a heterogeneous representation, paying attention to the presence and frequency of a series of features extracted with computational tools. This deep analysis aims to obtain more detailed knowledge of the different linguistic phenomena that directly shape each of the genres included in the study, showing the particularities that make them recognizable as individual genres while still placing them within the narrative typology. The findings suggest that this type of multilevel linguistic analysis could be of great help for areas of research within natural language processing such as computational narratology, as it allows a better understanding of the fundamental features that define each genre and its communicative purpose. Likewise, this approach could also boost the creation of more consistent automatic story generation tools in areas of language generation.
Chapter
In this paper, we investigate whether there is a standardized writing composition for articles in Wikipedia and, if so, what it entails. By employing a Neural Gas approximation to the topology of our dataset, we generate a graph that represents various prevalent textual compositions adopted by the texts in our dataset. Subsequently, we examine significantly attractive regions within our graph by tracking the evolution of articles over time. Our observations reveal the coexistence of different stable compositions and the emergence and disappearance of certain unstable compositions over time.