Liang Wang

Chinese Academy of Sciences, Beijing, China

Publications (98) · 115.66 Total impact

  • Zifeng Wu · Yongzhen Huang · Liang Wang ·
    ABSTRACT: This paper proposes to learn features from sets of labeled raw images. With this method, the problem of overfitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small number of training data, i.e., 420 labeled albums with about 30 000 photos. This method can effectively deal with sets of images, no matter if the sets bear temporal structures. A typical approach to sequential image analysis usually leverages motions between adjacent frames, while the proposed method focuses on capturing the co-occurrences and frequencies of features. Nevertheless, our method outperforms previous best performers in terms of album classification, and achieves comparable or even better performances in terms of gait based human identification. These results demonstrate its effectiveness and good adaptivity to different kinds of set data.
    IEEE Transactions on Multimedia 11/2015; 17(11):1-1. DOI:10.1109/TMM.2015.2477681 · 2.30 Impact Factor
  • Qiyue Yin · Shu Wu · Ran He · Liang Wang ·
    ABSTRACT: Multi-view clustering, which aims to cluster datasets with multiple sources of information, has a wide range of applications in the communities of data mining and pattern recognition. Generally, it makes use of the complementary information embedded in multiple views to improve clustering performance. Recent methods usually find a low-dimensional embedding of multi-view data, but often ignore some useful prior information that can be utilized to better discover the latent group structure of multi-view data. To alleviate this problem, a novel pairwise sparse subspace representation model for multi-view clustering is proposed in this paper. The objective function of our model mainly includes two parts. The first part aims to harness prior information to achieve a sparse representation of each high-dimensional data point with respect to other data points in the same view. The second part aims to maximize the correlation between the representations of different views. An alternating minimization method is provided as an efficient solution for the proposed multi-view clustering algorithm. A detailed theoretical analysis is also conducted to guarantee the convergence of the proposed method. Moreover, we show that the must-link and cannot-link constraints can be naturally integrated into the proposed model to obtain a link constrained multi-view clustering model. Extensive experiments on five real world datasets demonstrate that the proposed model performs better than several state-of-the-art multi-view clustering methods.
    Neurocomputing 05/2015; 156. DOI:10.1016/j.neucom.2015.01.017 · 2.08 Impact Factor
  • ABSTRACT: Glycosylation can have a multifaceted impact on the properties and functions of peptides and plays a critical role in interacting with or binding to the target molecules. Herein, based on the previously reported method for macrocyclic glycopeptide synthesis, two series of tyrocidine A glycosylated derivatives (1a-f and 2a-f) were synthesized and evaluated for their antibacterial activities to further study the structure and activity relationships (SAR). Biological studies showed that the synthetic glycosylated derivatives had good antibacterial activities towards methicillin-resistant Staphylococcus aureus and vancomycin-resistant Enterococcus. SAR studies based on various glycans and linkages were used to enhance the biochemical profile, resulting in the identification of several potent antibiotics, such as 1f, with a greatly improved therapeutic index compared with tyrocidine A. Copyright © 2015 European Peptide Society and John Wiley & Sons, Ltd.
    Journal of Peptide Science 04/2015; 21(7). DOI:10.1002/psc.2774 · 1.55 Impact Factor
  • Fang Zhao · Yongzhen Huang · Liang Wang · Tieniu Tan ·
    ABSTRACT: With the rapid growth of web images, hashing has received increasing interest in large scale image retrieval. Research efforts have been devoted to learning compact binary codes that preserve semantic similarity based on labels. However, most of these hashing methods are designed to handle simple binary similarity. The complex multilevel semantic structure of images associated with multiple labels has not yet been well explored. Here we propose a deep semantic ranking based method for learning hash functions that preserve multilevel semantic similarity between multi-label images. In our approach, a deep convolutional neural network is incorporated into hash functions to jointly learn feature representations and mappings from them to hash codes, which avoids the limitation of semantic representation power of hand-crafted features. Meanwhile, a ranking list that encodes the multilevel similarity information is employed to guide the learning of such deep hash functions. An effective scheme based on surrogate loss is used to solve the intractable optimization problem of multivariate ranking measures involved in the learning procedure. Experimental results show the superiority of our proposed approach over several state-of-the-art hashing methods in terms of ranking evaluation metrics when tested on multi-label image datasets.
  • ABSTRACT: We report on early results from a pilot program searching for metal-poor stars with LAMOST and follow-up high-resolution observation acquired with the MIKE spectrograph attached to the Magellan II telescope. We performed detailed abundance analysis for eight objects with iron abundances [Fe/H] < -2.0, including five extremely metal-poor (EMP; [Fe/H] < -3.0) stars with two having [Fe/H] < -3.5. Among these objects, three are newly discovered EMP stars, one of which is confirmed for the first time with high-resolution spectral observations. Three program stars are regarded as carbon-enhanced metal-poor (CEMP) stars, including two stars with no enhancement in their neutron-capture elements, which thus possibly belong to the class of CEMP-no stars; one of these objects also exhibits significant enhancement in nitrogen, and is thus a potential carbon and nitrogen-enhanced metal-poor star. The [X/Fe] ratios of the sample stars generally agree with those reported in the literature for other metal-poor stars in the same [Fe/H] range. We also compared the abundance patterns of individual program stars with the average abundance pattern of metal-poor stars, and find only one chemically peculiar object with abundances of at least two elements (other than C and N) showing deviations larger than 0.5 dex. The distribution of [Sr/Ba] versus [Ba/H] agrees that an additional nucleosynthesis mechanism is needed aside from a single r-process. Two program stars with extremely low abundances of Sr and Ba support the prospect that both main and weak r-process may have operated during the early phase of Galactic chemical evolution. The distribution of [C/N] shows that there are two groups of carbon-normal giants with different degrees of mixing. However, it is difficult to explain the observed behavior of the [C/N] of the nitrogen-enhanced unevolved stars based on current data.
    The Astrophysical Journal 01/2015; 798(2). DOI:10.1088/0004-637X/798/2/110 · 5.99 Impact Factor
  • Ran He · Man Zhang · Liang Wang · Ye Ji · Qiyue Yin ·
    ABSTRACT: In multimedia applications, the text and image components in a web document form a pairwise constraint that potentially indicates the same semantic concept. This paper studies cross-modal learning via the pairwise constraint, and aims to find the common structure hidden in different modalities. We first propose a compound regularization framework to deal with the pairwise constraint, which can be used as a general platform for developing cross-modal algorithms. For unsupervised learning, we propose a cross-modal subspace clustering method to learn a common structure for different modalities. For supervised learning, to reduce the semantic gap and the outliers in pairwise constraints, we propose a cross-modal matching method based on compound ℓ2,1 regularization along with an iteratively reweighted algorithm to find the global optimum. Extensive experiments demonstrate the benefits of joint text and image modeling with semantically induced pairwise constraints, and show that the proposed cross-modal methods can further reduce the semantic gap between different modalities and improve the clustering/retrieval accuracy.
    IEEE Transactions on Image Processing 11/2014; 24(12). DOI:10.1109/TIP.2015.2466106 · 3.63 Impact Factor
  • Muhammad Rauf · Yongzhen Huang · Liang Wang ·
    ABSTRACT: The Bag-of-words framework is probably one of the best models used in image classification. In this model, coding plays a very important role in the classification process. There are many coding methods that have been proposed to encode images in different ways. The relationship between different codewords is studied, but the relationship among descriptors is not fully discovered. In this work, we aim to draw a relationship between descriptors, and propose a new method that can be used with other coding methods to improve the performance. The basic idea behind this is encoding the descriptor not only with its nearest codewords but also with the codewords of its nearest neighboring descriptors. Experiments on several benchmark datasets show that even using this simple relationship between the descriptors helps to improve coding methods.
    CCPR; 11/2014
  • Jingyu Liu · Yongzhen Huang · Liang Wang · Shu Wu ·
    ABSTRACT: Feature coding and pooling are two critical stages in the widely used Bag-of-Features (BOF) framework in image classification. After coding, each local feature formulates its representation by the visual codewords. However, the two-dimensional feature-code layout is transformed to a one-dimensional codeword representation after pooling. The property for each local feature is ignored and the whole representation is tightly coupled. To resolve this problem, we propose a hierarchical feature coding approach which regards each feature-code representation as a high level feature. Codeword learning, coding and pooling are also applied to these new features, and thus a high level representation of the image is obtained. Experiments on different datasets validate our analysis and demonstrate that the new representation is more discriminative than that in the previous BOF framework. Moreover, we show that various kinds of traditional feature coding algorithms can be easily embedded into our framework to achieve better performance.
    Neurocomputing 11/2014; 144:509–515. DOI:10.1016/j.neucom.2014.04.022 · 2.08 Impact Factor
  • Feng Liu · Yongzhen Huang · Liang Wang · Wankou Yang · Changyin Sun ·
    ABSTRACT: Spatial information is an important cue for visual object analysis. Various studies in this field have been conducted. However, they are either too rigid or too fragile to efficiently utilize such information. In this paper, we propose to model the distribution of objects' local appearance patterns by using their co-occurrence at different spatial locations. In order to represent such a distribution, we propose a flexible framework called spatial feature co-pooling, with which the relations between patterns are discovered. As the final representation resulting from our framework is of high dimensionality, we propose a semi-greedy (SG) grafting algorithm to select the most discriminative features. Experimental results on the CIFAR 10, UIUC Sports and VOC 2007 datasets show that our method is effective and comparable with the state-of-the-art algorithms.
    Neurocomputing 09/2014; 139:415–422. DOI:10.1016/j.neucom.2014.02.015 · 2.08 Impact Factor
  • ABSTRACT: We present a study of the spectral line shape associated with a High Resolution Spectrograph on the 2.16 m telescope at the Xinglong Observing Station of National Astronomical Observatories, Chinese Academy of Sciences. This measurement is based on modeling the instrumental line shape obtained by unresolved modes from a Yb-fiber mode-locked laser frequency comb. With the current repetition rate of 250 MHz and 26 GHz mode spacing on the spectrograph, we find the absolute variation of the line center, 0.0597 pixel in the direction of the CCDs, and 0.00275 pixel (~3 m s−1) for relative variation in successive exposures on a short timescale. A novel double-Gaussian model is presented to improve the quality of the fit by a factor of 2.47 in a typical single exposure. We also use analysis with raw moments and central moments to characterize the change in line shape across the detector. A trend in charge transfer efficiency can be found on the E2V 4096 × 4096 CCD that provides a correction for wavelength calibration aiming to reach a level of precision for radial velocity below 1 m s−1.
    Research in Astronomy and Astrophysics 08/2014; 14(8):1037. DOI:10.1088/1674-4527/14/8/014 · 1.64 Impact Factor
  • ABSTRACT: We present a novel unsupervised fall detection system that employs the collected acoustic signals (footstep sound signals) from an elderly person's normal activities to construct a data description model to distinguish falls from non-falls. The measured acoustic signals are initially processed with a source separation (SS) technique to remove the possible interferences from other background sound sources. Mel-frequency cepstral coefficient (MFCC) features are next extracted from the processed signals and used to construct a data description model based on a one class support vector machine (OCSVM) method, which is finally applied to distinguish fall from non-fall sounds. Experiments on a recorded dataset confirm that our proposed fall detection system can achieve better performance, especially with high level of interference from other sound sources, as compared with existing single microphone based methods.
    Signal Processing 08/2014; 110. DOI:10.1016/j.sigpro.2014.08.021 · 2.21 Impact Factor
  • Zhen Zhou · Li Zhong · Liang Wang ·
    ABSTRACT: Clustering methods are widely deployed in the fields of data mining and pattern recognition. Many of them require the number of clusters as the input, which may not be practical when it is totally unknown. Several existing visual methods for cluster tendency assessment can be used to estimate the number of clusters by displaying the pairwise dissimilarity matrix as an intensity image where objects are reordered to reveal the hidden data structure as dark blocks along the diagonal. A major limitation of the existing methods is that they are not capable of highlighting cluster structure with complex clusters. To address this problem, this paper proposes an effective approach by using Markov Random Fields, which updates each object with its local information dynamically and maximizes the global probability measure. The proposed method can be used to determine the cluster tendency and partition data simultaneously. Experimental results on synthetic and real-world datasets demonstrate the effectiveness of the proposed method.
    Neurocomputing 07/2014; 136:49–55. DOI:10.1016/j.neucom.2014.01.032 · 2.08 Impact Factor
  • ABSTRACT: With the rapid developments in the exoplanet field, more and more terrestrial exoplanets are being detected. Characterising their atmospheres using transit observations will become a key datum in the quest for detecting an Earth-like exoplanet. The atmospheric transmission spectrum of our Earth will be an ideal template for comparison with future exo-Earth candidates. By observing a lunar eclipse, which offers a similar configuration to that of an exoplanet transit, we have obtained a high resolution and high signal-to-noise ratio transmission spectrum of the Earth's atmosphere. This observation was performed with the High Resolution Spectrograph at Xinglong Station, China during the total lunar eclipse in December 2011. We compare the observed transmission spectrum with our atmospheric model, and determine the characteristics of the various atmospheric species in detail. In the transmission spectrum, O2, O3, O2-O2, NO2 and H2O are detected, and their column densities are measured and compared with the satellite data. The visible Chappuis band of ozone produces the most prominent absorption feature, which suggests that ozone is a promising molecule for future exo-Earth characterization. The individual O2 lines are resolved and O2 isotopes are clearly detected. Our new observations do not confirm the absorption features of Ca II or Na I which have been reported in previous lunar eclipse observations. However, features in these and some other strong Fraunhofer line positions do occur in the observed spectrum. We propose that these are due to a Raman-scattered component in the forward-scattered sunlight appearing in the lunar umbral spectrum. Water vapour absorption is found to be rather weak in our spectrum because the atmosphere we probed is relatively dry, which prompts us to discuss the detectability of water vapour in Earth-like exoplanet atmospheres.
    International Journal of Astrobiology 05/2014; 14(02). DOI:10.1017/S1473550414000172 · 1.26 Impact Factor
  • Yongzhen Huang · Zifeng Wu · Liang Wang · Chunfeng Song ·
    ABSTRACT: Global spatial structure is an important factor for visual object recognition but has not attracted sufficient attention in recent studies. Especially, the problems of features' ambiguity and sensitivity to location change in the image space are not yet well solved. In this paper, we propose multiple spatial pooling (MSP) to address these problems. MSP models global spatial structure with multiple Gaussian distributions and then pools features according to the relations between features and Gaussian distributions. Such a process is further generalized into a unified framework, which formulates multiple pooling using matrix operation with structured sparsity. Experiments in terms of scene classification and object categorization demonstrate that MSP can enhance traditional algorithms with small extra computational cost.
    Neurocomputing 04/2014; 129:225–231. DOI:10.1016/j.neucom.2013.09.037 · 2.08 Impact Factor
  • ABSTRACT: Human gait is an important biometric feature, which can be used to identify a person remotely. However, view change can cause significant difficulties for gait recognition because it will alter available visual features for matching substantially. Moreover, it is observed that different parts of gait will be affected differently by view change. By exploring relations between two gaits from two different views, it is also observed that a part of gait in one view is more related to a typical part than any other parts of gait in another view. A new method proposed in this paper considers such variance of correlations between gaits across views that is not explicitly analyzed in the other existing methods. In our method, a novel motion co-clustering is carried out to partition the most related parts of gaits from different views into the same group. In this way, relationships between gaits from different views will be more precisely described based on multiple groups of the motion co-clustering instead of a single correlation descriptor. Inside each group, a linear correlation between gait information across views is further maximized through canonical correlation analysis (CCA). Consequently, gait information in one view can be projected onto another view through a linear approximation under the trained CCA subspaces. In the end, a similarity between gaits originally recorded from different views can be measured under the approximately same view. Comprehensive experiments based on widely adopted gait databases have shown that our method outperforms the state-of-the-art.
    IEEE Transactions on Image Processing 02/2014; 23(2):696-709. DOI:10.1109/TIP.2013.2294552 · 3.63 Impact Factor
  • Peng Zhang · Liang Wang · Wei Huang · Lei Xie · Guang Chen ·
    ABSTRACT: Robust, accurate and efficient pedestrian tracking is a critical task in intelligent visual surveillance systems and robotic vision applications. Unfortunately, tracking in realistic scenarios is not easy and may fail due to the challenges of partial occlusion, viewpoint changes and cluttered background. To overcome these challenges, different tracking strategies have been proposed to model the tracking process as a first-order temporal Markov chain. It is well known that target appearance modeling approaches play an essential role in this process, but most of these approaches only represent the object characteristics at the pixel/texture level instead of investigating the latent information at the semantic understanding level. Therefore, the optimum state obtained from texture similarity may not be the same as that observed by the human visual system. To resolve this limitation, in this paper we propose a multiple pedestrian tracking algorithm based on couple-states analysis: the hidden state is used to obtain the estimated observations during the Markov chain transition process, and the latent state is used to find the semantic information about each observation. By maximizing the likelihood probability of the couple states for each estimation, the optimum state of the target can be found more accurately, and error accumulation can also be effectively decreased during tracking. The performance of the proposed tracker has been verified on different benchmark surveillance video databases; the results show that it is able to track multiple pedestrians more accurately in a variety of challenging scenarios when compared with other state-of-the-art multiple pedestrian tracking approaches.
    Soft Computing 01/2014; 19(1):85-97. DOI:10.1007/s00500-014-1375-9 · 1.27 Impact Factor
  • Chunfeng Song · Yongzhen Huang · Feng Liu · Zhenyu Wang · Liang Wang ·
    ABSTRACT: For unsupervised problems like clustering, linear or non-linear data transformations are widely used techniques. Generally, they are beneficial to data representation. However, if data have a complicated structure, these techniques would be unsatisfying for clustering. In this paper, we propose a new clustering method based on the deep auto-encoder network, which can learn a highly non-linear mapping function. Via simultaneously considering data reconstruction and compactness, our method can obtain stable and effective clustering. Experimental results on four databases demonstrate that the proposed model can achieve promising performance in terms of normalized mutual information, cluster purity and accuracy.
    Intelligent Data Analysis 01/2014; 18(6):S65-S76. DOI:10.3233/IDA-140709 · 0.61 Impact Factor
  • ABSTRACT: Multiparty-multilevel digital rights management of audio requires blind detection of multiple watermarks. The proposed audio watermarking method offers copyright protection based on analysis filterbank decomposition, psychoacoustic model and empirical mode decomposition (EMD). The novel blind audio watermarking algorithm embeds the watermark bits in the final residue of the subbands in the transform domain. The watermarking system performance is optimized by selecting appropriate segment length for applying EMD process and by selecting the number of subbands for watermark embedding. Experimental results show that the proposed scheme is robust against various common signal processing manipulations while multiple watermark messages can be embedded.
    Multimedia Tools and Applications 01/2014; 74(15). DOI:10.1007/s11042-014-1905-6 · 1.35 Impact Factor
  • Kaiye Wang · Ran He · Wei Wang · Liang Wang · Tieniu Tan ·
    ABSTRACT: Cross-modal matching has recently drawn much attention due to the widespread existence of multimodal data. It aims to match data from different modalities, and generally involves two basic problems: the measure of relevance and coupled feature selection. Most previous works mainly focus on solving the first problem. In this paper, we propose a novel coupled linear regression framework to deal with both problems. Our method learns two projection matrices to map multimodal data into a common feature space, in which cross-modal data matching can be performed. In the learning procedure, the ℓ2,1-norm penalties are imposed on the two projection matrices separately, which leads to the simultaneous selection of relevant and discriminative features from the coupled feature spaces. A trace norm is further imposed on the projected data as a low-rank constraint, which enhances the relevance of different modal data with connections. We also present an iterative algorithm based on half-quadratic minimization to solve the proposed regularized linear regression problem. The experimental results on two challenging cross-modal datasets demonstrate that the proposed method outperforms the state-of-the-art approaches.
    Proceedings of the 2013 IEEE International Conference on Computer Vision; 12/2013
  • ABSTRACT: In this paper, we propose a novel computer vision-based fall detection system for monitoring an elderly person in a home care, assistive living application. Initially, a single camera covering the full view of the room environment is used for the video recording of an elderly person's daily activities for a certain time period. The recorded video is then manually segmented into short video clips containing normal postures, which are used to compose the normal dataset. We use the codebook background subtraction technique to extract the human body silhouettes from the video clips in the normal dataset and information from ellipse fitting and shape description, together with position information, is used to provide features to describe the extracted posture silhouettes. The features are collected and an online one class support vector machine (OCSVM) method is applied to find the region in feature space to distinguish normal daily postures and abnormal postures such as falls. The resultant OCSVM model can also be updated by using the online scheme to adapt to new emerging normal postures and certain rules are added to reduce false alarm rate and thereby improve fall detection performance. From the comprehensive experimental evaluations on datasets for 12 people, we confirm that our proposed person-specific fall detection system can achieve excellent fall detection performance with 100% fall detection rate and only 3% false detection rate with the optimally tuned parameters. This work is a semi-unsupervised fall detection system from a system perspective because although an unsupervised-type algorithm (OCSVM) is applied, human intervention is needed for segmenting and selecting video clips containing normal postures. As such, our research represents a step toward a complete unsupervised fall detection system.
    11/2013; 17(6):1002-14. DOI:10.1109/JBHI.2013.2274479

Publication Stats

995 Citations
115.66 Total Impact Points


  • 2010-2015
    • Chinese Academy of Sciences
      • National Pattern Recognition Laboratory
      Beijing, China
    • Nanyang Technological University
      Singapore
  • 2007-2013
    • University of Melbourne
      • Department of Electrical and Electronic Engineering
      Melbourne, Victoria, Australia
  • 2010-2011
    • University of Bath
      • Department of Computer Science
      Bath, England, United Kingdom
    • Yantai University
      • School of Pharmacy
      Yantai, Shandong, China
  • 2006-2008
    • Monash University (Australia)
      • Department of Electrical and Computer Systems Engineering, Clayton
      Melbourne, Victoria, Australia