Bing-Kun Bao's research while affiliated with Nanjing University of Posts and Telecommunications and other places

Publications (18)

Article
Full-text available
Cross-modality person re-identification aims at matching the RGB images of a specific person in variable appearances with his/her images in another modality like infrared modality, sketch modality, etc. It is challenging due to domain gap and intra-class variations. Existing Re-id models usually employ symmetric or identical feature extractors in t...
Article
Full-text available
Low-rank and sparse decomposition (LRSD) has gained considerable attention due to its success in computer vision and many other fields. However, traditional LRSD methods approximate the rank function with low accuracy. To deal with this problem, the truncated γ norm is used to approximate the rank function...
Chapter
Giving machines the ability to perceive human emotions and recognize our emotional states is one of the important goals of human-computer interaction. In the past decades, facial expression recognition (FER) has been a research hotspot in the field of computer vision. However, the existing facial expression datasets ge...
Article
Low-rank representation (LRR) is a very competitive technique in various real-world applications owing to its powerful capability for discovering the latent structure of noisy or corrupted data sets. However, traditional low-rank models treat each data point and feature equally so that noisy data cannot be detected and suppressed effectively and have obvious...
Preprint
Full-text available
Recently, many super-resolution algorithms have been proposed to recover high resolution images to improve visualization and help better analyze images. Among them, total variation regularization (TV) methods have been proven to have a good effect in retaining image edge information. However, these TV methods do not consider the temporal correlatio...
Article
In current graph embedding methods, low-dimensional projections are obtained by preserving either the global or the local geometrical structure of the data. In this paper, the PCA (Principal Component Analysis) idea of minimizing least-squares reconstruction errors is regularized with graph embedding, to unify various local manifo...
Article
Full-text available
In this paper, we try to solve the personalized travel recommendation problem by exploiting the multi-modal data available from real-world social media, and a probabilistic graph model called the Sentiment-aware Multi-modal Topic Model (SMTM) is proposed to mine the latent semantics of the multi-modal data on online travel websites. Distingui...
Article
Super-resolution of facial images, a.k.a. face hallucination, has been intensively studied in the past decades due to the increasingly emerging analysis demands in video surveillance, e.g., face detection, verification, identification. However, the actual performance of most previous hallucination approaches will drop dramatically when a very low-r...
Article
Dimensionality reduction in high dimensional multi-view datasets is an important research topic. It can keep essential features to improve performance in subsequent tasks such as classification and clustering. This paper proposes a generalized framework, which extends the PCA idea of minimizing least squares reconstruction errors, to include data d...
Article
Automatic visual scene recognition has attracted increasing attention for developing multimedia systems as it provides rich information beyond object recognition and action recognition. Each scene image often contains or is characterized by a certain set of the same essential objects and relations, for example, scene images of “wedding” usually have brideg...
Article
Full-text available
Manifold alignment is very prevalent in machine learning for extracting common latent space from multiple datasets. These algorithms generally aim to achieve higher alignment accuracies by preserving the original structure while ensuring closeness between manifolds. This paper proposes a novel semi-supervised manifold alignment method that combines...
Article
This paper aims to propose a candidate solution to the challenging task of single-image blind super-resolution (SR), via extensively exploring the potentials of learning-based SR schemes in the literature. The task is formulated into an energy functional to be minimized with respect to both an intermediate super-resolved image and a nonparametric b...

Citations

... Dimensionality reduction (DR) plays an important role in some fields such as machine learning and pattern recognition [1,2]. Many effective methods have been proposed over the past few decades. ...
... Multimodal sentiment analysis aims to identify users' sentiment polarities, as well as their attitudes towards topics or events, from different forms of data. As the core field of social media analysis, sentiment analysis has not only received extensive attention from academia [1,2] but also has broad commercial application prospects, such as personalized advertising [3], opinion mining [4], and decision making [5]. ...
... Another technique to ameliorate the blurry effect of the pixel-wise MSE loss is to use GAN models [6] as proposed by authors in [5,8,9,25-27]. Although GAN-oriented face hallucination networks generate perceptually more convincing images, they suffer from two aspects. ...
... Cheng [27] proposed a scene-oriented Semantic Description of Object (SDO) approach in order to increase the inter-class distance and reduce the intra-class differences in scene recognition tasks. In our previous work [28], we proposed exploiting the relations between the entire image and the manually configured objects in an image with the ancillary information from the scene graph to recognize an image scene. In [29], the deep visually sensitive features obtained by feeding the pre-trained CNNs with the scene images enhanced by detected saliency are proved to be effective for scene recognition. ...
... In response to the poor diversity of the data generated by the original GAN, D. Berthelot et al. designed BEGAN based on the idea of equilibrium, which raised GAN-based data augmentation to a new level [14]. The images generated by BEGAN are of high quality and effectively improve the prediction performance of the model [15,16]. ...
... These methods project data from two different but correlated manifolds to a subspace, simultaneously preserving the local structures and ensuring their closeness. A subgroup of this technique includes semisupervised methods [31]-[34] that utilize several sample-wise correspondences known in advance between the manifolds while learning a new subspace. In contrast, the second subgroup, which we focus on in this article, contains unsupervised manifold alignment methods that do not require correspondences to be predetermined. ...
... Thanks to improved information collection technology, the data obtained by researchers in many fields are high-dimensional data with multiple perspectives. The work in [22] shows that a common projection subspace obtained from multi-view raw data can capture more comprehensive structures and intrinsic feature information. Canonical correlation analysis (CCA) [23], as the most classic multi-view learning method, has an excellent performance in various fields. ...
... The most dominant advantage is scalability because AIaaS providers can elastically provision and release hardware resources available to the platform and thus scale horizontally in accordance with the user-defined configurations and requirements if the consumption of computing resources for the defined AI model has increased (Boag et al. 2018; Elshawi et al. 2018; Pandl et al. 2021). The scalability of the cloud, combined with the number of available hardware resources, results in a large amount of processing power provisioned by the cloud and enables the AIaaS to respond to extensive requests with scalable and responsive utilization of CPUs and GPUs (Bao et al. 2018). Since AI algorithms are based on the knowledge inferred from a substantial quantity of data, the processing is performed by allocating significant computational resources that require the cloud's capability (Rouhani et al. 2018). ...