Guoyin Wang

Chongqing University of Posts and Telecommunications · College of Computer Science and Technology

About

145
Publications
14,445
Reads
2,494
Citations

Publications

Publications (145)
Article
Pawlak rough set (PRS) and neighborhood rough set (NRS) are the two most common rough set theoretical models. Although the PRS can use equivalence classes to represent knowledge, it is unable to process continuous data. On the other hand, NRSs, which can process continuous data, lose the ability to use equivalence classes to represent know...
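The contrast the abstract draws can be sketched minimally (illustrative code, not from the paper; function names are my own): a PRS partitions discrete data into equivalence classes, while an NRS replaces the equivalence class with a distance-based neighborhood granule for continuous data.

```python
from collections import defaultdict

def equivalence_classes(samples, attrs):
    """Pawlak-style indiscernibility: group sample indices whose values
    agree on every attribute in attrs (works only for discrete values)."""
    classes = defaultdict(list)
    for i, x in enumerate(samples):
        classes[tuple(x[a] for a in attrs)].append(i)
    return list(classes.values())

def neighborhood(points, i, delta):
    """NRS-style granule: indices of points within delta of point i,
    the neighborhood that replaces the equivalence class on continuous data."""
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    return [j for j, p in enumerate(points) if dist(points[i], p) <= delta]
```

Note that the neighborhood depends on a radius `delta` rather than exact value agreement, which is why the equivalence-class view of knowledge is lost.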
Article
Full-text available
Joint relational triple extraction treats entity recognition and relation extraction as a joint task to extract relational triples, and this is a critical task in information extraction and knowledge graph construction. However, most existing joint models still fall short in terms of extracting overlapping triples. Moreover, these models ignore the...
Article
Feature selection method with rough sets based on incremental learning has the major advantage of the higher efficiency in a dynamic information system, which has attracted extensive research. However, the incremental approximation feature selection with an accelerator (IAFSA) remains ambiguous for a dynamic information system with fuzzy decisions...
Article
As one of the rapidly developing methodologies for dealing with complex problems in line with human cognition, granular computing has made significant achievements in knowledge discovery. Neighborhood classifier, as a typical description of granular computing, is an effective method for the classification of continuous data. However, in the phase o...
Article
Granular computing, a new paradigm for solving large-scale and complex problems, has made significant progresses in knowledge discovery. Granular ball computing (GBC) is a novel granular computing method, which can rapidly generate scalable and robust information granules, that is, granular balls. However, a comprehensive index for measuring the pe...
Article
The traditional spectral clustering algorithm is time-consuming and resource-consuming when applied to large-scale data, resulting in a poor clustering effect or even an inability to cluster. To solve this problem, this paper proposes a spectral clustering algorithm based on granular balls (GBSC). The algorithm changes the construction method of...
Article
Density peaks clustering algorithm (DP) has difficulty in clustering large-scale data, because it requires the distance matrix to compute the density and $\delta$-distance for each object, which has $O(n^2)$ time complexity. Granular ball (GB) is a coarse-grained representation of data. It is based on the fact that an object and its local neig...
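The $O(n^2)$ bottleneck the abstract refers to comes from the two per-object quantities DP needs. A minimal sketch of those quantities (illustrative only; `density_peaks_stats` and its cutoff parameter `d_c` are my naming, not the paper's):

```python
import math

def density_peaks_stats(points, d_c):
    """Core DP quantities: local density rho_i (number of points within the
    cutoff d_c) and delta_i (distance to the nearest point of higher density).
    Both need the full pairwise distance matrix, hence O(n^2) time."""
    n = len(points)
    d = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [d[i][j] for j in range(n) if rho[j] > rho[i]]
        # the globally densest point gets its maximum distance by convention
        delta.append(min(higher) if higher else max(d[i]))
    return rho, delta
```

Replacing each raw point with a granular ball shrinks `n` to the number of balls, which is how a coarse-grained representation attacks this cost.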
Article
Full-text available
Gaussian mixture model (GMM) is widely used in many domains, e.g. data mining. The unsupervised learning of the finite mixture (ULFM) model based on the minimum message length (MML) criterion for mixtures enables adaptive model selection and parameter estimates. However, some datasets have a hierarchical structure. If the MML criterion does not con...
Article
Full-text available
At present, image completion models are often used to handle images in public datasets and are not competent for tasks in practical scenarios such as USV scenes. On one hand, the practical missing regions are often located at the boundaries, which presents a challenge for the model to extract image features. On the other hand, real images are often...
Article
Machine understanding and thinking require prior knowledge consisting of explicit and implicit knowledge. Current knowledge bases contain various kinds of explicit knowledge but not implicit knowledge. As part of implicit knowledge, the typical characteristics of the things referred to by a concept can be obtained by concept cognition for knowledge grap...
Article
Unmanned Aerial Vehicles (UAVs) play an important role in the Internet of Things (IoT), and form the paradigm of the Internet of UAVs, due to their characteristics of flexibility, mobility and low costs. However, resource constraints such as dynamic wireless channels, limited battery capacities and computation resources of UAVs make traditional me...
Preprint
Full-text available
Granular ball computing (GBC), as an efficient, robust, and scalable learning method, has become a popular research topic of granular computing. GBC includes two stages: granular ball generation (GBG) and multi-granularity learning based on the granular ball (GB). However, the stability and efficiency of existing GBG methods need to be further impr...
Article
Full-text available
From the perspective of human cognition, three-way decision (3WD) explores thinking, problem solving, and information processing in three paradigms. Rough fuzzy sets (RFS) are constructed to handle fuzzy concepts by extending the classical rough sets. In three-way decision with rough fuzzy sets (3WDRFS), current works are mainly concerned with calc...
Preprint
Human cognition has a "large-scale first" cognitive mechanism and therefore possesses adaptive multi-granularity description capabilities. This results in computational characteristics such as efficiency, robustness, and interpretability. Although most existing artificial intelligence learning methods have certain multi-granularity features, they do...
Article
Hierarchical quotient space structure (HQSS), as a typical description of granular computing (GrC), focuses on hierarchically granulating fuzzy data and mining hidden knowledge. The key step of constructing HQSS is to transform the fuzzy similarity relation into fuzzy equivalence relation. However, on one hand, the transformation process has high t...
Preprint
In recent years, the problem of fuzzy clustering has received wide attention. The membership iteration of existing methods is mostly considered globally, which causes considerable problems in noisy environments, and iterative calculations for clusters with a large number of different sample sizes are neither accurate nor efficient. In this paper, starting f...
Preprint
Most existing clustering methods are based on a single granularity of information, such as the distance and density of each data point. This finest-grained approach is usually inefficient and susceptible to noise. Therefore, we propose a clustering algorithm that combines multi-granularity Granular-Ball and minimum spanning tree (MST). We...
Article
Due to its simplicity, K-means has become a widely used clustering method. However, its clustering result is seriously affected by the initial centers, and its allocation strategy makes it hard to identify manifold clusters. Many improved K-means variants have been proposed to accelerate it and improve the quality of the initial cluster centers, but few researchers pay...
Preprint
In some specific scenarios, a face sketch is used to identify a person. However, drawing a complete face sketch often requires skill and takes time, which hinders its widespread applicability in practice. In this study, we propose a new task named sketch-less face image retrieval (SLFIR), in which retrieval is carried out at each stroke and a...
Chapter
User alignment aims to identify accounts of one natural person across networks. Nevertheless, different social purposes in multiple networks and randomness of following friends form the diverse local structures of the same person, leading to a high degree of non-isomorphism across networks. The edges resulting in non-isomorphism are harmful to lear...
Article
Full-text available
Community structure can be used to analyze and understand the structural functions in a network, reveal its implicit information, and predict its dynamic development pattern. Existing community detection algorithms are very sensitive to the sparsity of network, and they have difficulty in obtaining stable community detection results. To address the...
Article
Granular computing (GrC) is an efficient way to reveal descriptions of data in line with human cognition and plays a critical role in knowledge discovery. Information granules (IGs), the basic computing unit of GrC, are key components of knowledge representation and processing. Rough sets are one of the classical GrC models and generate IGs based o...
Article
Full-text available
Feature selection achieves dimensionality reduction by selecting some effective features from the original feature set. However, in the process of feature selection, most conventional methods do not accurately describe various correlations between features and the dynamic changes of the relation, leading to an incomplete definition of the evaluatio...
Preprint
Full-text available
The centrality and diversity of the labeled data are very influential to the performance of semi-supervised learning (SSL). Most existing SSL models select the labeled data randomly and equally allocate the labeling quota among the classes, leading to considerable unstableness and degeneration of performance. Active learning has been proposed to...
Preprint
Full-text available
The centrality and diversity of the labeled data are very influential to the performance of semi-supervised learning (SSL). Most existing SSL models select the labeled data randomly and equally allocate the labeling quota among the classes, leading to considerable unstableness and degeneration of performance. Active learning has been proposed to ad...
Article
Full-text available
Heterogeneous network representation learning shows its superior capacity in complex network analysis. It aims to embed nodes into a low-dimensional space and pursues a meaningful vector representation for each node. At present, the research of heterogeneous networks mainly focuses on the fusion of network structure information, semantic informatio...
Preprint
Full-text available
GBSVM (Granular-ball Support Vector Machine) is an important attempt to use the coarse granularity of a granular-ball as the input to construct a classifier instead of a data point. It is the first classifier in the history of machine learning whose input contains no points, i.e., $x_i$. However, on the one hand, its dual model is not derived, and...
Article
Full-text available
Granular-ball computing (GBC) is an efficient, robust, and scalable learning method for granular computing. The granular ball (GB) generation method is based on GB computing. This article proposes a method for accelerating GB generation using division to replace $k$ -means. It can significantly improve the efficiency of GB generation while ensuri...
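The generation step the abstract describes can be illustrated with a toy splitting routine (my own simplification, not the paper's exact division procedure): an impure ball is recursively split in two until every ball's label purity reaches a threshold.

```python
from collections import Counter

def centroid(pts):
    return tuple(sum(p[d] for p in pts) / len(pts) for d in range(len(pts[0])))

def purity(labels):
    """Fraction of the ball occupied by its majority label."""
    return Counter(labels).most_common(1)[0][1] / len(labels)

def generate_balls(points, labels, min_purity=1.0):
    """Recursively split an impure ball in two; stop when every ball's
    label purity reaches min_purity. The split assigns each point to the
    nearer of the two most frequent classes' centroids (a simplified
    stand-in for the division step; not the paper's exact procedure)."""
    if purity(labels) >= min_purity:
        return [(points, labels)]
    (c1, _), (c2, _) = Counter(labels).most_common(2)
    m1 = centroid([p for p, l in zip(points, labels) if l == c1])
    m2 = centroid([p for p, l in zip(points, labels) if l == c2])
    d = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    left = [(p, l) for p, l in zip(points, labels) if d(p, m1) <= d(p, m2)]
    right = [(p, l) for p, l in zip(points, labels) if d(p, m1) > d(p, m2)]
    if not left or not right:  # split made no progress; keep the ball as-is
        return [(points, labels)]
    balls = []
    for part in (left, right):
        ps, ls = zip(*part)
        balls += generate_balls(list(ps), list(ls), min_purity)
    return balls
```

The point of replacing $k$-means with a cheaper division step, as the abstract notes, is that this splitting is performed many times, so its per-split cost dominates GB generation.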
Article
Full-text available
Learning label noise is gaining increasing attention from a variety of disciplines, particularly in supervised machine learning for classification tasks. The k nearest neighbors (kNN) classifier is often used as a natural way to edit the training sets due to its sensitivity to label noise. However, the kNN-based editor may remove too many instances...
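A classic instance of the kNN-based editing the abstract discusses is Wilson-style edited nearest neighbours, sketched below (illustrative; the paper's editor differs precisely because this baseline can remove too many instances):

```python
from collections import Counter

def knn_edit(X, y, k=3):
    """Wilson-style editing: drop every instance whose label disagrees with
    the majority label of its k nearest neighbours. Sensitive to label noise,
    but aggressive: borderline clean instances are also removed."""
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    keep = []
    for i, xi in enumerate(X):
        nbrs = sorted((j for j in range(len(X)) if j != i),
                      key=lambda j: dist(xi, X[j]))[:k]
        majority = Counter(y[j] for j in nbrs).most_common(1)[0][0]
        if majority == y[i]:
            keep.append(i)
    return keep
```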
Article
Aspect sentiment triplet extraction (ASTE) is a popular subtask related to aspect-based sentiment analysis (ABSA). It extracts aspects and their associated opinion expressions and sentiment polarities from comment sentences. Previous studies have proposed a multitask learning framework that jointly extracts aspect and opinion terms and treats the s...
Article
Full-text available
Nowadays, attributed multiplex heterogeneous network (AMHN) representation learning has shown superiority in many network analysis tasks due to its ability to preserve both the structure of the network and the semantics of the nodes. However, few people consider the correlation between content attributes within each node. No personalized analysis m...
Article
The interpretability of convolutional neural networks (CNNs) is attracting increasing attention. Class activation maps (CAM) intuitively explain the classification mechanisms of CNNs by highlighting important areas. However, as coarse-grained explanations, classical CAM methods are incapable of explaining the classification mechanism in detail. Ins...
Article
Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a specific photo from a given query sketch. However, its widespread applicability is limited because it is difficult for most people to draw a complete sketch, and the drawing process is often time consuming. In this study, we aim to retrieve the target photo fr...
Article
Full-text available
In recent years, convolutional neural networks (CNNs) have been applied successfully in many fields. However, these deep neural models are still considered as “black box” for most tasks. One of the fundamental issues underlying this problem is understanding which features are most influential in image recognition tasks and how CNNs process these fe...
Article
As a significant expansion of classical rough sets, local rough sets (LRS) is an effective model for processing large-scale datasets with finite labels. However, the process of establishing a category of monotonic uncertainty measure with strong distinguishing ability for LRS remains ambiguous. To construct this model, both the monotonicity of loc...
Preprint
Edge detection, a basic task in the field of computer vision, is an important preprocessing operation for the recognition and understanding of a visual scene. In conventional models, the edge image generated is ambiguous, and the edge lines are also very thick, which typically necessitates the use of non-maximum suppression (NMS) and morphological...
Article
Full-text available
Soft clustering can be regarded as a cognitive computing method that seeks to deal with the clustering with fuzzy boundary. As a classical soft clustering algorithm, rough k-means (RKM) has yielded various extensions. However, some challenges remain in existing RKM extensions. On the one hand, the user-defined cutoff threshold is subjective and can...
Article
Full-text available
Mapping the vertices of network onto a tree helps to reveal the hierarchical community structures. The leading tree is a granular computing (GrC) model for efficient hierarchical clustering and it requires two elements: the distance between granules, and the density calculated in Euclidean space. For the non-Euclidean network data, the vertices nee...
Preprint
Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo in a given query sketch. However, its widespread applicability is limited by the fact that it is difficult to draw a complete sketch for most people, and the drawing process often takes time. In this study, we aim to retrieve the target photo...
Preprint
Full-text available
Granular-ball computing is an efficient, robust, and scalable learning method for granular computing. The basis of granular-ball computing is the granular-ball generation method. This paper proposes a method for accelerating the granular-ball generation using the division to replace $k$-means. It can greatly improve the efficiency of granular-ball...
Article
Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo for a given query sketch. However, its widespread applicability is limited by the fact that it is difficult to draw a complete sketch, and the drawing process often takes more time than the text/tag method. On-the-fly FG-SBIR was proposed to a...
Article
This paper presents a strong data-mining method based on a rough set, which can simultaneously realize feature selection, classification, and knowledge representation. Although a rough set, a popular method for feature selection, has good interpretability, it is not sufficiently efficient and accurate to deal with large-scale datasets with high dim...
Article
Full-text available
Feature selection is an important preprocessing step in data mining and pattern recognition. The neighborhood rough set (NRS) model is a widely-used rough set model for feature selection on continuous data. All currently known NRS models are defined on a distance metric — mostly the Euclidean distance metric — which invalidates the NRS models in sc...
Article
Edge detection, a basic task in the field of computer vision, is an important preprocessing operation for the recognition and understanding of a visual scene. In conventional models, the edge image generated is ambiguous, and the edge lines are also very thick, which typically necessitates the use of non-maximum suppression (NMS) and morphological...
Preprint
Full-text available
This paper presents a strong data-mining method based on a rough set, which can realize feature selection, classification, and knowledge representation at the same time. A rough set has good interpretability and is a popular method for feature selection. However, low efficiency and low accuracy are its main drawbacks, limiting its applicability. In...
Article
A multi-granularity knowledge space is a computational model that simulates human thinking and solves complex problems. However, as the amount of data increases, the multi-granularity knowledge space will have a larger number of layers, which will reduce its problem-solving ability. Therefore, we define a knowledge space distance measurement and pr...
Article
We present a model, called relative probability density (RPD), to detect label noise by utilizing the contrasting characteristics in different classes. RPD has a natural ratio structure so that a powerful measurement, the Kullback–Leibler Importance Estimation Procedure (KLIEP), can be directly applied for its calculation instead of calculating the...
Preprint
In recent years, convolutional neural networks (CNNs) have been applied successfully in many fields. However, such deep neural models are still regarded as black box in most tasks. One of the fundamental issues underlying this problem is understanding which features are most influential in image recognition tasks and how they are processed by CNNs....
Article
In recent years, convolutional neural networks (CNNs) have been successfully applied in the field of image processing, and have been deployed to a variety of artificial intelligence systems. However, such neural models are still considered to be “black box” for most tasks. Two of the fundamental issues underlying this problem are as follows: 1. What ty...
Article
Full-text available
This article presents a general sampling method, called granular-ball sampling (GBS), for classification problems by introducing the idea of granular computing. The GBS method uses some adaptively generated hyperballs to cover the data space, and the points on the hyperballs constitute the sampled data. GBS is the first sampling method that not onl...
Article
Naive Bayes classifier (NBC) is a classical binary generative classifier that has been extensively researched and developed for use in various applications owing to its simplicity and high efficiency. However, in practice, the distinct advantages of the NBC are often challenged by the conditional independence assumption among attributes and the zer...
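The zero-probability problem the abstract mentions arises when an attribute value never co-occurs with a class in training, zeroing the whole product of likelihoods. A minimal sketch of an NBC with Laplace (add-alpha) smoothing, the standard remedy (illustrative code; `train_nbc` is my naming):

```python
from collections import Counter, defaultdict
import math

def train_nbc(X, y, alpha=1.0):
    """Discrete naive Bayes with Laplace smoothing: every (class, value)
    count gets +alpha, so unseen values never yield a zero likelihood."""
    classes = Counter(y)
    n = len(y)
    counts = defaultdict(Counter)          # (class, attr index) -> value counts
    values = [set() for _ in X[0]]         # observed values per attribute
    for xi, yi in zip(X, y):
        for a, v in enumerate(xi):
            counts[(yi, a)][v] += 1
            values[a].add(v)

    def predict(x):
        best, best_lp = None, -math.inf
        for c, nc in classes.items():
            lp = math.log(nc / n)          # log prior
            for a, v in enumerate(x):
                lp += math.log((counts[(c, a)][v] + alpha) /
                               (nc + alpha * len(values[a])))
            if lp > best_lp:
                best, best_lp = c, lp
        return best

    return predict
```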
Article
The Synthetic Minority Oversampling Technique (SMOTE) is a prevalent method for imbalanced classification. The plain SMOTE is intrinsically flawed in that it generates new samples blindly, thus being susceptible to label noise. Many variants of the SMOTE focus on balancing the number of classes and avoiding the introduction or removal of label-nois...
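The blind generation the abstract criticizes is visible in plain SMOTE itself, sketched below (illustrative only): each synthetic sample interpolates between a minority point and a random one of its k nearest minority neighbours, with no check against surrounding majority points or noisy labels.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Plain SMOTE: each synthetic sample is a random interpolation between
    a minority point and one of its k nearest minority neighbours. The
    interpolation never inspects nearby majority points, which is why a
    mislabeled minority point spreads noise into the synthetic samples."""
    rng = random.Random(seed)
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        nbrs = sorted((p for p in minority if p != x),
                      key=lambda p: dist(x, p))[:k]
        z = rng.choice(nbrs)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(xi + lam * (zi - xi) for xi, zi in zip(x, z)))
    return synthetic
```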
Article
This paper presents an efficient graph semisupervised learning (GSSL) method that meets the criterion of optimization without iterations. Most existing GSSL methods require iterative optimization to achieve a preset objective. Additionally, existing GSSL methods must learn from scratch for unseen data because graph structures are specifically b...
Article
Recommender systems are an effective tool to resolve information overload by enabling the selection of subsets of items from a universal set based on user preferences. The operation of most recommender systems depends on predicted ratings, which may introduce a degree of uncertainty into the recommendation process. However, systems e...
Article
Full-text available
This article presents a simple sampling method for classification, which is very easy to implement, by introducing the idea of random space division, called "random space division sampling" (RSDS). It can extract the boundary points as the sampled result by efficiently distinguishing the label-noise points, inner points, and boundary points....
Article
Full-text available
Multi-scale decision system (MDS) is an effective tool to describe hierarchical data in machine learning. Optimal scale combination (OSC) selection and attribute reduction are two key issues related to knowledge discovery in MDSs. However, searching for all OSCs may result in a combinatorial explosion, and the existing approaches typically incur ex...
Article
Full-text available
Mitigating label noise is a crucial problem in classification. Noise filtering is an effective method of dealing with label noise which does not need to estimate the noise rate or rely on any loss function. However, most filtering methods focus mainly on binary classification, leaving the more difficult counterpart problem of multiclass classificat...
Article
Imbalanced classification is an important task in supervised learning, and Synthetic Minority Over-sampling Technique (SMOTE) is the most common method to address it. However, the performance of SMOTE deteriorates in the presence of label noise. Current generalizations of SMOTE try to tackle this problem by either selecting some samples in minority...
Article
Granular computing is an efficient and scalable computing method. Most existing granular computing-based classifiers treat the granules as a preliminary feature-processing step, without revising the mathematical model or improving the main performance of the classifiers themselves. So far, only a few methods, such as G-svm and WLMSVM, h...
Article
Full-text available
Online social networks play increasingly important roles in modern society in terms of rapid and large-scale information spread. Many efforts have been made to understand these phenomena in the computer science community and other related fields, ranging from popular topic detection to information diffusion modeling. In this article, a...
Article
Full-text available
Existing noise-detection methods require classifiers, distance measurements, or the overall data distribution, and the 'curse of dimensionality' and other restrictions make them insufficiently effective on complex data, e.g., data with different attribute weights, high dimensionality, feature noise, nonlinearity, etc. This is also the main reason...
Article
Full-text available
Graph based semi-supervised learning (GSSL) has an intuitive representation and can be improved by exploiting matrix calculation. However, it has to perform iterative optimization to achieve a preset objective, which usually leads to low efficiency. Another inconvenience of GSSL is that when new data come, the graph construction and the opti...
Article
Full-text available
Constructing information granules (IGs) has been of significant interest to the discipline of granular computing. The principle of justifiable granularity has been proposed to guide the design of IGs, opening an avenue of pursuits of building IGs carried out on a basis of well-defined and intuitively appealing principles. However, how to improve th...
Article
Detecting clusters of arbitrary shape and constantly delivering results for newly arrived items are two critical challenges in the study of data stream clustering. However, existing clustering methods cannot deal with these two problems simultaneously. In this paper, we employ the density peaks based clustering (DPClust) algorithm to con...
Article
Full-text available
Approximation computation is a critical step in rough sets theory used in knowledge discovery and other related tasks. In practical applications, an information system often evolves over time by the variation of attributes or objects. Effectively computing approximations is vital in data mining. Dominance-based rough set approach can handle informa...
Article
Confidential-dominance-relation-based rough set is a model for processing incomplete ordered information, in which the computation of approximations is a core issue. In real-life applications, the attribute set changes dynamically. According to the variation of the attribute set, the confidential dominance and dominated classes are first calculated. Then...