Jinglu Hu

Waseda University, Tokyo, Tokyo-to, Japan

Publications (161) · 23.64 Total impact

  • Article: Feature subset selection: a correlation‐based SVM filter approach
    ABSTRACT: The central criterion of feature selection is that good feature sets contain features that are highly correlated with the output, yet uncorrelated with each other. Based on this criterion, we address the problem of feature selection through correlation-based feature clustering and support vector machine (SVM) based feature ranking. Correlation-based clustering is proposed to group features into some clusters based on the correlation between two features. As a result, a feature is highly correlated to any other feature in the same cluster but uncorrelated to the features in other clusters. From each cluster, we select a feature as the delegate based on its influence quantities on the output. The influence quantities are measured by the feature sensitivity in the SVM. The proposed approach can identify relevant features and eliminate redundancy among them effectively. The effectiveness of the proposed approach is demonstrated through comparisons with other methods using real-world data with different dimensions. © 2011 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
    IEEJ Transactions on Electrical and Electronic Engineering 02/2011; 6(2):173 - 179. · 0.36 Impact Factor
  • Conference Proceeding: Local linear multi-SVM method for gene function classification
    Benhui Chen, Feiran Sun, Jinglu Hu
    ABSTRACT: This paper proposes a local linear multi-SVM method based on composite kernel for solving classification tasks in gene function prediction. The proposed method realizes a nonlinear separating boundary by estimating a series of piecewise linear boundaries. Firstly, according to the distribution information of training data, a guided partitioning approach composed of separating boundary detection and clustering technique is used to obtain local subsets, and each subset is utilized to capture prior knowledge of corresponding local linear boundary. Secondly, a composite kernel is introduced to realize the local linear multi-SVM model. Instead of building multiple local SVM models separately, the prior knowledge of local subsets is used to construct a composite kernel, then the local linear multi-SVM model is realized by using the composite kernel exactly in the same way as a single SVM model. Experimental results on benchmark datasets demonstrate that the proposed method improves the classification performance efficiently.
    Nature and Biologically Inspired Computing (NaBIC), 2010 Second World Congress on; 01/2011
  • Conference Proceeding: Hierarchical Multi-label Classification incorporating prior information for gene function prediction
    Benhui Chen, Jinglu Hu
    ABSTRACT: This paper proposes an improved Hierarchical Multi-label Classification (HMC) method for gene function prediction. The HMC task is transformed into a series of binary SVM classification tasks. By introducing the hierarchy constraint into the learning procedures, two measures that incorporate prior information are implemented to improve HMC performance. First, for imbalanced functional classes, a hierarchical SMOTE is proposed as an over-sampling preprocessing step to improve SVM learning performance. Second, an improved True Path Rule consistency approach is introduced to combine the results of the binary probabilistic SVM classifications. It improves the classification results and guarantees that the hierarchy constraint of the classes is satisfied.
    Intelligent Systems Design and Applications (ISDA), 2010 10th International Conference on; 01/2011
  • Article: Feature subset selection: a correlation‐based SVM filter approach
    Boyang Li, Qiangwei Wang, Jinglu Hu
    01/2011; 6:173-179.
  • Conference Proceeding: Combining binary-SVM and pairwise label constraints for multi-label classification
    Weifeng Gu, Benhui Chen, Jinglu Hu
    ABSTRACT: Multi-label classification is an extension of the traditional classification problem in which each instance is associated with a set of labels. Recent research has shown that the ranking approach is an effective way to solve this problem. In multi-label datasets, classes are often related to each other, and implicit constraint rules exist among the labels. We therefore present a novel multi-label ranking algorithm that uses pairwise constraint rules mined from the training set to enhance the existing method. In this method, the one-against-all decomposition technique is first used to divide a multi-label problem into binary sub-problems. A rank list is generated by combining the probabilistic outputs of each binary Support Vector Machine (SVM) classifier. Label constraint rules are learned by minimizing the ranking loss. Experimental evaluation on well-known multi-label benchmark datasets shows that our method effectively improves classification accuracy compared with some existing methods.
    Systems Man and Cybernetics (SMC), 2010 IEEE International Conference on; 11/2010
  • Conference Proceeding: eSBH: An Accurate Constructive Heuristic Algorithm for DNA Sequencing by Hybridization
    Yang Chen, Jinglu Hu
    ABSTRACT: Sequencing by hybridization is a promising cost-effective technology for high-throughput DNA sequencing via microarray chips. However, because of spectrum errors arising from experimental conditions, fast and accurate reconstruction of the original sequences has become a challenging problem. In the last decade, a variety of analyses and designs have been tried to overcome this problem, with different strategies making different tradeoffs between speed and accuracy. Motivated by the idea that the errors could be identified by analyzing the interrelation of spectrum elements, this paper presents a new constructive heuristic algorithm, featuring an accurate reconstruction guided by a set of well-defined criteria and rules. Experiments on benchmark instance sets demonstrate that the proposed method can reconstruct long DNA sequences more accurately than current approaches in the literature.
    BioInformatics and BioEngineering (BIBE), 2010 IEEE International Conference on; 07/2010
  • Conference Proceeding: An improved multi-label classification based on label ranking and delicate boundary SVM.
    Benhui Chen, Weifeng Gu, Jinglu Hu
    International Joint Conference on Neural Networks, IJCNN 2010, Barcelona, Spain, 18-23 July, 2010; 01/2010
  • Conference Proceeding: An adaptive niching EDA based on clustering analysis.
    Benhui Chen, Jinglu Hu
    Proceedings of the IEEE Congress on Evolutionary Computation, CEC 2010, Barcelona, Spain, 18-23 July 2010; 01/2010
  • Conference Proceeding: A novel frequency band selection method for Common Spatial Pattern in Motor Imagery based Brain Computer Interface.
    Gufei Sun, Jinglu Hu, Gengfeng Wu
    International Joint Conference on Neural Networks, IJCNN 2010, Barcelona, Spain, 18-23 July, 2010; 01/2010
  • Article: An Adaptive Niching EDA with Balance Searching Based on Clustering Analysis.
    Benhui Chen, Jinglu Hu
    IEICE Transactions. 01/2010; 93-A:1792-1799.
  • Article: An improved multi-label classification method and its application to functional genomics.
    Benhui Chen, Weifeng Gu, Jinglu Hu
    ABSTRACT: In this paper, a multi-label classification method based on label ranking and a delicate boundary Support Vector Machine (SVM) is proposed for functional genomics applications. First, an improved probabilistic SVM with a delicate decision boundary is used as the scoring approach to obtain a proper label rank. Second, an instance-dependent thresholding strategy is proposed to decide the classification results. A d-fold validation approach is utilised to determine a set of target thresholds for all training samples as teachers; an appropriate instance-dependent threshold for each testing instance is then obtained by applying a k-Nearest Neighbours (KNN) strategy to this teacher threshold set.
    International Journal of Computational Biology and Drug Design 01/2010; 3(2):133-45.
  • Chapter: Protein Structure Prediction Based on HP Model Using an Improved Hybrid EDA
    Benhui Chen, Jinglu Hu
    ABSTRACT: Protein structure prediction (PSP) is one of the most important problems in computational biology. This chapter introduces a novel hybrid Estimation of Distribution Algorithm (EDA) to solve the PSP problem on the HP model. First, a composite fitness function containing information about the folding structure core (H-Core) is introduced to replace the traditional fitness function of the HP model. The new fitness function is expected to select better individuals for the probabilistic model of the EDA. Second, local search with guided operators is used to refine the solutions found, improving the efficiency of the EDA. Third, an improved backtracking-based repairing method is introduced to repair invalid individuals sampled by the probabilistic model of the EDA. It can significantly reduce the number of backtracking search operations and the computational cost for long protein sequences. Experimental results demonstrate that the new method outperforms the basic EDA method. At the same time, it is very competitive with other existing algorithms for the PSP problem on lattice HP models.
    12/2009: pages 193-214;
  • Conference Proceeding: Interesting rules mining with deductive method
    ABSTRACT: In this paper, we propose a novel deductive rule-mining method that finds the association rules a given user actually needs. Unlike most existing methods, which mine frequent itemsets inductively from candidate two-itemsets up to candidate (n-1)-itemsets and then generate a huge number of rough rules from them, the proposed method avoids producing the large numbers of frequent itemsets that are contained in longer frequent itemsets. It interacts with users by letting them pick the items they are interested in, from which the final interesting association rules are deduced. Moreover, it can respond dynamically at any time when users want to check whether the frequent itemsets they are interested in have been found. Several dynamic response strategies are proposed. These dynamic response algorithms can find most long frequent itemsets early on, so users can find the rules they are interested in within a short time with high probability. The method can therefore also be applied to online data mining.
    ICCAS-SICE, 2009; 09/2009
  • Conference Proceeding: An improved backtracking method for EDAs based protein folding
    Benhui Chen, Long Li, Jinglu Hu
    ABSTRACT: Many evolutionary algorithm (EA) based methods have been proposed to solve the protein structure prediction (PSP) problem in the HP lattice model. One common difficulty of those methods is the existence of invalid individuals produced by geometrical constraints in the conformation of the protein (i.e. self-avoidance in the chain). A backtracking method is often used to repair the invalid individuals of genetic search in those methods. However, the basic backtracking method has a disadvantage: the computational cost of repairing is very heavy for long sequence instances. This paper proposes an improved backtracking-based repairing method for long sequence protein folding. A detection procedure is added to the backtracking method to avoid entering invalid closed areas when selecting directions for the residues. Experimental results show that the proposed method can significantly reduce the number of backtracking search operations and the computational cost for long protein sequences.
    ICCAS-SICE, 2009; 09/2009
  • Conference Proceeding: A fast SVM training method for very large datasets
    Boyang Li, Qiangwei Wang, Jinglu Hu
    ABSTRACT: In a standard support vector machine (SVM), the training process has O(n³) time and O(n²) space complexity, where n is the size of the training dataset. Thus, it is computationally infeasible for very large datasets. Reducing the size of the training dataset is a natural way to solve this problem. SVM classifiers depend only on the support vectors (SVs) that lie close to the separation boundary. Therefore, we need to retain the samples that are likely to be SVs. In this paper, we propose a method based on the edge detection technique to detect these samples. To preserve the overall distribution properties, we also use a clustering algorithm such as K-means to calculate the centroids of clusters. The samples selected by the edge detector and the centroids of the clusters are used to reconstruct the training dataset. The smaller reconstructed training dataset makes the training process much faster without degrading classification accuracy.
    Neural Networks, 2009. IJCNN 2009. International Joint Conference on; 07/2009
  • Conference Proceeding: A novel EDAs based method for HP model protein folding
    Benhui Chen, Long Li, Jinglu Hu
    ABSTRACT: The protein structure prediction (PSP) problem is one of the most important problems in computational biology. This paper proposes a novel method based on Estimation of Distribution Algorithms (EDAs) to solve the PSP problem on the HP model. First, a composite fitness function containing information about folding structure core formation is introduced to replace the traditional fitness function of the HP model. It helps to select better individuals for the probabilistic model of the EDA. In addition, a set of guided operators is used to increase the diversity of the population and the likelihood of escaping from local optima. Second, an improved backtracking repairing algorithm is proposed to repair invalid individuals sampled by the probabilistic model of the EDA for long sequence protein instances. A feasibility detection procedure is added to avoid entering invalid closed areas when selecting directions for the residues. Thus, it can significantly reduce the number of backtracking operations and the computational cost for long protein sequences. Experimental results demonstrate that the proposed method outperforms the basic EDA method. At the same time, it is very competitive with other existing algorithms for the PSP problem on lattice HP models.
    Evolutionary Computation, 2009. CEC '09. IEEE Congress on; 06/2009
  • Conference Proceeding: A Novel Clustering based Niching EDA for Protein Folding.
    Benhui Chen, Jinglu Hu
    World Congress on Nature & Biologically Inspired Computing, NaBIC 2009, 9-11 December 2009, Coimbatore, India; 01/2009
  • Article: Network Administrator Assistance System Based on Fuzzy C-means Analysis.
    JACIII. 01/2009; 13:91-96.
  • Article: Human Resource Selection Based on Performance Classification Using Weighted Support Vector Machine.
    Qiangwei Wang, Boyang Li, Jinglu Hu
    JACIII. 01/2009; 13:407-415.
  • Article: Study of multi-branch structure of Universal Learning Networks.
    Appl. Soft Comput. 01/2009; 9:393-403.

Institutions

  • 2003–2011
    • Waseda University
      • Graduate School of Information, Production and Systems
      Tokyo, Tokyo-to, Japan
  • 2007
    • University of Ulster
      • School of Computing and Mathematics
      Belfast, NIR, United Kingdom
  • 2004
    • Chongqing University
      Chongqing, Chongqing Shi, China
  • 1999–2003
    • Kyushu University
      • Department of Electrical Engineering
      Fukuoka-shi, Fukuoka-ken, Japan