Azuraliza Abu Bakar’s research while affiliated with National University of Malaysia and other places


Publications (60)


Co-authorship Prediction Method Based on Degree of Gravity and Article Keywords Similarity
  • Article

March 2025 · 5 Reads

Physica A Statistical Mechanics and its Applications

Azuraliza Abu Bakar



Model structure with sentiment analysis and churn forecasting model
SentScore calculation and consolidation process
Time series results with different datasets. a) Time series for consolidated BloonsTD. b) Time series for consolidated ONI. c) Time series for consolidated SMC. d) Time series for consolidated Stellaris
Actual vs. forecasted result of the four games. a) Actual vs. forecasted result for BloonsTD. b) Actual vs. forecasted result for ONI. c) Actual vs. forecasted result for SMC. d) Actual vs. forecasted result for Stellaris
Forecasting results with and without Sentiment Analysis of the four selected games. a) CF x SA for BloonsTD. b) CF for BloonsTD. c) CF x SA for ONI. d) CF for ONI. e) CF x SA for SMC. f) CF for SMC. g) CF x SA for Stellaris. h) CF for Stellaris
Enhancing churn forecasting with sentiment analysis of steam reviews
  • Article
  • Publisher preview available

August 2024 · 102 Reads · 2 Citations

Social Network Analysis and Mining

Customer churn prediction is crucial for businesses seeking to retain their customer base. In this study, we present an enhanced approach for churn forecasting by integrating sentiment analysis of Steam reviews into the churn forecasting model, leveraging the vast Steam database. Steam, a prominent digital distribution platform for video games, boasts a large user dataset. Our dataset comprises 12,000 user reviews across four game types. Our approach involves extracting sentiment polarity to generate a sentiment score, which is then embedded in the time series data used for churn modelling. We developed a churn forecasting (CF) model using Vector Autoregression, incorporating the sentiment analysis (SA) predictive model with a Support Vector Machine using grid search and hyperopt, achieving an accuracy of 89%, precision of 84%, recall of 85%, and an F1 score of 84%. Experimental results demonstrate that our approach outperforms traditional churn prediction models, significantly enhancing churn forecasting accuracy. The CF model with SA yielded the lowest Mean Absolute Percentage Error (MAPE) of 10.37% among the four models developed for each game type, indicating its efficacy. These findings underscore the value of integrating sentiment analysis of user reviews to gain valuable insights for businesses aiming to reduce churn and enhance customer satisfaction. In information systems, this study emphasizes understanding game industry stakeholders’ demands, including shareholders and customers through machine learning development research. The game industry reacts proactively to stakeholder complaints by creating an inclusive and responsive organizational culture using sentiment analysis. Ultimately, the data analytics methodology gives insights into advanced consumer behaviour and information systems supporting long-term success and producing strategic business outcomes.
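Two quantities named in the abstract, a consolidated sentiment score per time step and the Mean Absolute Percentage Error, can be illustrated with a minimal sketch. The function names, the mean-based consolidation rule, and the (period, polarity) input layout are assumptions made for illustration; they are not the authors' SentScore procedure or data.

```python
from collections import defaultdict


def consolidate_sentiment(reviews):
    """Average review polarities per period.

    `reviews` is a list of (period, polarity) pairs with polarity +1 or -1.
    Averaging is an assumed consolidation rule, not the paper's definition.
    """
    buckets = defaultdict(list)
    for period, polarity in reviews:
        buckets[period].append(polarity)
    return {p: sum(v) / len(v) for p, v in sorted(buckets.items())}


def mape(actual, forecast):
    """Mean Absolute Percentage Error in percent (actual values must be non-zero)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)
```

The per-period scores could then be joined to a player-count series as an extra variable before fitting a multivariate forecasting model; `mape` compares the resulting forecast against the held-out actual values.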


Missing data imputation using correlation coefficient and min-max normalization weighting

July 2024 · 24 Reads · 3 Citations

Intelligent Data Analysis

Missing data is one of the challenges researchers encounter when attempting to draw information from data. The first step in addressing it is to prepare the data for processing. Much effort has been made in this area; removing instances with missing data is a popular method, but it has drawbacks, including bias, which negatively affects the results. How missing values are handled depends on several factors, including data types, missing rates, and missing mechanisms. The mechanisms cover missing data patterns as well as missing at random (MAR), missing completely at random (MCAR), and missing not at random (MNAR). Alternatives to deletion include numerous imputation techniques, which fall into categories such as statistical and machine learning methods. One strategy to improve a model's output is to weight the feature values so that classification or regression approaches perform better. This research developed a new imputation technique called correlation coefficient min-max weighted imputation (CCMMWI). It combines the correlation coefficient and min-max normalization techniques to balance the feature values. The proposed technique seeks to increase the contribution of features by considering how those features relate to the desired functionality. To assess the findings, we compared against several established techniques: the statistical techniques mean and EM imputation, and the machine learning imputation techniques k-NNI and MICE. The evaluation also used the imputation techniques CBRL, CBRC, and ExtraImpute. We use datasets of various sizes, missing rates, and random patterns. Finally, we compare the imputed datasets with the original data using the root mean squared error (RMSE), mean absolute error (MAE), and R². According to the findings, the proposed CCMMWI performs better than most other solutions in practically all missing-rate scenarios.
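The building blocks named in the abstract (min-max normalization, a correlation coefficient used as a feature weight, and RMSE/MAE for evaluation) can be sketched as follows. This is not the CCMMWI algorithm itself: the abstract does not give its formula, so the weighting rule in `weight_feature` (normalized value times absolute Pearson correlation with the target) is an assumption chosen only to show how the pieces fit together.

```python
import math


def min_max(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def weight_feature(feature, target):
    """Assumed rule: min-max normalize, then scale by |corr(feature, target)|."""
    w = abs(pearson(feature, target))
    return [w * v for v in min_max(feature)]


def rmse(actual, imputed):
    """Root mean squared error between original and imputed values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(actual, imputed)) / len(actual))


def mae(actual, imputed):
    """Mean absolute error between original and imputed values."""
    return sum(abs(a - b) for a, b in zip(actual, imputed)) / len(actual)
```

Under this assumed rule, a feature that is strongly correlated with the target keeps most of its normalized range, while a weakly correlated feature is shrunk toward zero and so contributes less to any distance-based imputation step.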


Dynamical analysis of a novel memristor-type chaotic map

June 2024 · 69 Reads · 1 Citation

Xiong Yu · Azuraliza Abu Bakar · Kunshuai Li · [...] · Haiwei Sang

As a unique nonlinear component, the discrete memristor has a simple structure and can yield excellent chaotic performance when used to construct chaotic systems. This characteristic has made the discrete memristor a hot topic in the field of chaos. This paper introduces a cosine hyperchaotic map. Numerical simulations reveal its rich dynamical behaviors. The chaotic map exhibits diverse chaotic control models, including partial amplitude control, total amplitude control, initial boosting, and parameter-offset boosting, with dynamical distribution diagrams plotted for amplitude control to quantify the range of amplitude modulation. Additionally, a localized boosting free region is identified, which exhibits extreme sensitivity to initial values. Dual offset parameters are introduced to control this localized boosting free region, enhancing the flexibility of the system. Finally, the map is implemented on STM32 to validate the numerical simulation results.


Figure 2. Strategies to improve the quality of CDMO
Strategies for improving the quality of community detection based on modularity optimization

June 2024 · 46 Reads

IAES International Journal of Artificial Intelligence (IJ-AI)

Community detection is a field of interest in social networks. Many new methods have emerged for community detection; however, the modularity optimization method is the most prominent. Community detection based on modularity optimization (CDMO) has fundamental problems in the form of solution degeneration and resolution limits. Of the two problems, the resolution limit is of greater concern because it affects the quality of the resulting communities. During the last decade, many studies have attempted to address these problems, but so far only partially; none has thoroughly discussed efforts to improve the quality of CDMO. In this paper, we investigate works on handling the resolution limit and improving the quality of CDMO, along with their strengths and limitations. We derive six categories of strategies to improve the quality of CDMO, namely developing multi-resolution modularity, creating local modularity, creating modularity density, creating new quality metrics as an alternative to modularity, involving node attributes in community detection, and extending the single-objective function into a multi-objective function. These strategies can be used as a guide in developing community detection methods. By considering network size, network type, and community distribution, we can choose the appropriate strategy for improving the quality of community detection.
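For readers unfamiliar with the quantity being optimized, the sketch below computes modularity for a given partition of an undirected, unweighted graph, with an optional resolution parameter as used in multi-resolution variants. It uses the widely used form Q = Σ_c [L_c/m − γ(d_c/2m)²], where L_c is the number of edges inside community c, d_c is the total degree of its nodes, and m is the number of edges. The function name and input layout are assumptions for illustration; the paper surveys strategies and does not prescribe this code.

```python
from collections import defaultdict


def modularity(edges, community, gamma=1.0):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs without duplicates or self-loops
    community: dict mapping each node to its community label
    gamma: resolution parameter; gamma=1.0 gives standard modularity
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph must contain at least one edge")
    internal = defaultdict(int)    # number of edges inside each community
    degree_sum = defaultdict(int)  # total degree of the nodes in each community
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - gamma * (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )
```

With `gamma` above 1 the penalty term grows, which favours partitions into smaller communities; this is one way multi-resolution approaches try to work around the resolution limit.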


Incorporating syntax information into attention mechanism vector for improved aspect-based opinion mining

May 2024 · 43 Reads · 3 Citations

Neural Computing and Applications

In Aspect-based Sentiment Analysis (ABSA), accurately determining the sentiment polarity of specific aspects within text requires a nuanced understanding of linguistic elements, including syntax. Traditional ABSA approaches, particularly those leveraging attention mechanisms, have shown effectiveness but often fall short in integrating crucial syntax information. Moreover, while some methods employ Graph Neural Networks (GNNs) to extract syntax information, they face significant limitations, such as information loss due to pooling operations. Addressing these challenges, our study proposes a novel ABSA framework that bypasses the constraints of GNNs by directly incorporating syntax-aware insights into the analysis process. Our approach, the Syntax-Informed Attention Mechanism Vector (SIAMV), integrates syntactic distances obtained from dependency trees and part-of-speech (POS) tags into the attention vectors, ensuring a deeper focus on linguistically relevant elements. This not only substantially enhances ABSA accuracy by enriching the attention mechanism but also maintains the integrity of sequential information, a task managed by adopting Long Short-Term Memory (LSTM) networks. The LSTM’s inputs, consisting of syntactic distance, POS tags, and the sentence itself, are processed to generate a syntax vector. This vector is then combined with the attention vector, offering a robust model that adeptly captures the nuances of language. Moreover, the sequential processing capability of LSTM ensures minimal information loss across the text by preserving the context and dependencies inherent in the sentence structure, unlike traditional pooling methods. Our experimental findings demonstrate that this innovative combination of SIAMV and LSTM significantly outperforms existing GNN-based ABSA models in accuracy, thereby setting a new standard for sentiment analysis research. 
By overcoming the traditional reliance on GNNs and their pooling-induced information loss, our method presents a comprehensive model that adeptly captures and analyzes sentiment at the aspect level, marking a significant advancement in the field of ABSA. The syntax distance code required to replicate the experiment is available at https://github.com/Makera86/Syntax-Distance.git.
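The notion of syntactic distance from a dependency tree can be made concrete with a short sketch: the distance between the aspect term and another token is taken here as the number of dependency edges on the path between them. This is an illustration under that assumption, not the code from the repository above; the function name and the head-index input format are hypothetical, and the example parse is written by hand rather than produced by a parser.

```python
from collections import deque


def syntactic_distances(heads, aspect_index):
    """Number of dependency edges between the aspect token and every token.

    heads[i] is the index of token i's head, or -1 for the root.
    The tree is treated as undirected and searched breadth-first.
    """
    n = len(heads)
    neighbours = [[] for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            neighbours[child].append(head)
            neighbours[head].append(child)
    dist = [-1] * n
    dist[aspect_index] = 0
    queue = deque([aspect_index])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if dist[nxt] == -1:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Hand-written parse of "The battery life is great" (an assumption, not parser output):
# The -> life, battery -> life, life -> great, is -> great, great = root
tokens = ["The", "battery", "life", "is", "great"]
heads = [2, 2, 4, 4, -1]
print(dict(zip(tokens, syntactic_distances(heads, aspect_index=2))))
# {'The': 1, 'battery': 1, 'life': 0, 'is': 2, 'great': 1}
```

In this toy parse the opinion word "great" is one edge from the aspect "life", while the copula "is" is two edges away, which is the kind of signal a syntax-aware attention vector could use to favour opinion words.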



Generating Attribute Similarity Graphs: A User Behavior-Based Approach from Real-Time Microblogging Data on Platform X

March 2024 · 52 Reads

Social network analysis is a powerful tool for understanding various phenomena, but it requires data with explicit connections among users. However, such data is hard to obtain in real-time, especially from platforms like X, commonly known as Twitter, where users share topic-related content rather than personal connections. Therefore, this paper tackles a new problem of building a social network graph in real-time where explicit connections are unavailable. Our methodology is centred around the concept of user similarity as the fundamental basis for establishing connections, suggesting that users with similar characteristics are more likely to form connections. To implement this concept, we extracted easily accessible attributes from the Twitter platform and proposed a novel graph model based on similarity. We also introduce an Attribute-Weighted Euclidean Distance (AWED) to calculate user similarities. We compare the proposed graph with synthetic graphs based on network properties, online social network characteristics, and predictive analysis. The results suggest that the AWED graph provides a more precise representation of the dynamic connections that exist in real-world online social networks, surpassing the inherent constraints of synthetic graphs. We demonstrate that the proposed method of graph construction is simple, flexible, and effective for network analysis tasks.
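The idea of linking users by attribute similarity can be sketched as follows. The abstract does not state the AWED formula, so this example assumes the standard weighted Euclidean form sqrt(Σ_k w_k (x_k − y_k)²); the attribute vectors, weights, threshold, and function names are all hypothetical.

```python
import math
from itertools import combinations


def awed(x, y, weights):
    """Weighted Euclidean distance: sqrt(sum_k w_k * (x_k - y_k)^2)."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def similarity_graph(users, weights, threshold):
    """Connect every pair of users whose distance is at most `threshold`.

    users: dict mapping user id -> attribute vector (assumed already scaled)
    Returns a list of (user_a, user_b) edges in sorted order.
    """
    edges = []
    for a, b in combinations(sorted(users), 2):
        if awed(users[a], users[b], weights) <= threshold:
            edges.append((a, b))
    return edges
```

A smaller threshold yields a sparser graph; in practice the attribute weights and the threshold would need to be tuned against the network properties the paper evaluates.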


Citations (48)


... Deep learning offers advantages like avoiding manual feature definition and using complex network structures to extract and generalize data features, resulting in higher document classification accuracy. Convolutional neural networks (CNN) can extract local features and transfer them to global features by pooling layers from text sequences [5], [6]. Recurrent neural network (RNN) and long short-term memory (LSTM) models are suitable for processing sequential data because they can remember the dependency between tokens [7], [8]. ...

Reference:

Chinese paper classification based on pre-trained language model and hybrid deep learning method
Enhancing churn forecasting with sentiment analysis of steam reviews

Social Network Analysis and Mining

... In chaotic systems, memristors can effectively replicate nonlinear behaviors and enable the control of chaos complexity [22,23]; in nonlinear circuits, memristors can be utilized to achieve complex dynamics and memory functions, thereby offering new avenues for circuit design and optimization [24,25]; within neural networks, memristors can emulate the changes in membrane potential and synaptic transmission processes of neurons, and facilitate the learning of neural networks [26,27]. Furthermore, memristors are versatile components that can be integrated into both continuous and discrete systems, which can enhance the adaptability of system modeling and control [28][29][30]. Recently, Xu et al. [31] designed an implementable Hodgkin-Huxley circuit with two N-type locally active memristors that can generate periodic and chaotic firing activities. ...

Dynamical analysis of a novel memristor-type chaotic map

... utilize the co-attention mechanism to achieve interactive learning of syntactic and semantic information between aspects and context. Aziz et al. [44] incorporate syntactic distance and part-of-speech tags information into the attention mechanism vector to improve the concentration on emotion words. ...

Incorporating syntax information into attention mechanism vector for improved aspect-based opinion mining

Neural Computing and Applications

... Research by Zhang et al. (2024); Aziz et al. (2024) has shown that different types of dependency relationships in a sentence have an indirect but significant impact on sentiment analysis. Consequently, Yuan et al. (2023) adopted a method of analyzing each edge and dynamically assigning weights, which effectively captures the differences in dependency relationships but significantly increases model complexity and computational time. ...

CoreNLP dependency parsing and pattern identification for enhanced opinion mining in aspect-based sentiment analysis
  • Citing Article
  • April 2024

Journal of King Saud University - Computer and Information Sciences

... Machine learning (ML) has numerous exciting applications across a wide range of fields, including social media (Mahmood et al. 2020), politics (Padmanabhan and Barfar 2021), health (Khalid et al. 2023), marketing (Azmi et al. 2024), and education (Ab Malik et al. 2023). As ML and artificial intelligence (AI) become increasingly commonplace in organizations over the next several years, interest in these fields is expected to grow significantly. ...

Mining Association Rules to Determine the Over-spending Behavior Among Low Income Households in Malaysia

... We conclude this overview of approaches for the detection of power users in social networks by pointing out that there are a multitude of approaches to this topic in the literature and that it would be impossible to consider all of them in this section. For instance, the approaches described in [43][44][45][46][47][48][49][50] are only recent approaches that detect power users, taking into account information other than that mentioned in the approaches described above. In this section, we have focused on structural approaches for the detection of power users, especially those based on centrality measures, as they are the most similar to our approach. ...

Measuring User Influence in Real-Time on Twitter Using Behavioural Features
  • Citing Article
  • March 2024

Physica A Statistical Mechanics and its Applications

... Two events are considered matched (≃) if they have the same attribute values for HC and ET, and if the event timestamps (T) are within the same window of a predefined interval (Tint, e.g., 15 minutes), that is, |Ti - Tj| ≤ Tint. A suspect (s) with the SID of the matched events is then assigned to a bot group (B). To measure the similarity of event tuples, the Jaccard similarity coefficient [34] is employed. The Jaccard index ranges from 0 to 1, with 1 indicating an exact match. ...

Detecting Community Through User Similarity Analysis on Twitter
  • Citing Conference Paper
  • January 2024

... Initially, the WOAEDL-ABRC technique applied min-max normalization to normalize the input data. Min-max normalization is a data preprocessing method generally used to scale features to a fixed range, typically between 0 and 1 [24]. When used for botnet recognition, it ensures that every network traffic feature is on an even scale, making it simpler to identify anomalies indicative of malicious activity. ...

A Novel Approach for Data Feature Weighting Using Correlation Coefficients and Min–Max Normalization

... This step was performed to remove the non-contributing/noise variables. Furthermore, we used a point-biserial correlation test between the dependent variable (categorical) and the independent variables (continuous) [46,47]. Running these tests before the binary logistic regression model ensures that the results of our logistic model are accurate. ...

Impact of Missing Data on Correlation Coefficient Values: Deletion and Imputation Methods for Data Preparation

Malaysian Journal of Fundamental and Applied Sciences