Bo Jin

Georgia State University, Atlanta, Georgia, United States

Publications (18) · 7.23 Total impact

  • ABSTRACT: Predicting protein subcellular locations may help us understand protein functions and analyse protein interactions with other molecules. Many machine learning and computational techniques have been used to predict protein subcellular locations. In this paper, we propose a new hybrid classification system called SVM-ANFIS based on Support Vector Machines and Adaptive Neuro Fuzzy Inference System for protein subcellular location prediction. The experimental results show that the new system can not only achieve high total accuracies but also improve local accuracies in protein subcellular location prediction.
    International Journal of Computational Intelligence in Bioinformatics and Systems Biology 01/2009; 1(1).
  • ABSTRACT: To deal with different membership functions of the same linguistic term, a new interval reasoning method using new granular sets is proposed based on Yin Yang methodology. To perform interval-valued granular reasoning efficiently and to optimize interval membership functions effectively from training data, a granular neural network (GNN) with a new high-speed evolutionary interval learning algorithm is designed. Simulation results in nonlinear function approximation and bioinformatics show that the GNN with this learning algorithm is able to extract interval-valued granular rules effectively and efficiently from training data.
    IEEE Transactions on Fuzzy Systems 05/2008; · 5.48 Impact Factor
  • 04/2008: pages 229-239; ISBN: 9780470397428
  • 05/2007: pages 45-56; ISBN: 9780470124642
  • ABSTRACT: With the growing interest in biological and chemical data prediction, more powerful and flexible kernels need to be designed so that the prior knowledge and relationships within data can be expressed effectively in kernel functions. In this paper, Granular Kernel Trees (GKTs) are proposed and parallel Genetic Algorithms (GAs) are used to optimise the parameters of GKTs. In applications, SVMs with the new kernel trees are employed for drug activity comparisons. The experimental results show that GKTs and evolutionary GKTs can achieve better performance than traditional RBF kernels in terms of prediction accuracy.
    International Journal of Data Mining and Bioinformatics 02/2007; 1(3):270-85. · 0.39 Impact Factor
  • ABSTRACT: In this paper, we present a genetic fuzzy feature transformation method that helps support vector machines (SVMs) classify data more accurately. The given data are first transformed into a high-dimensional feature space by a fuzzy system; SVMs then map the data into a higher-dimensional feature space and construct the hyperplane that makes the final decision. Genetic algorithms are used to optimize the fuzzy feature transformation so that the newly generated features help SVMs classify biomedical data more accurately under uncertainty. The experimental results show that the new genetic fuzzy SVMs have better generalization abilities than the traditional SVMs in terms of prediction accuracy.
    Inf. Sci. 01/2007; 177:476-489.
  • ABSTRACT: To perform interval-valued granular reasoning efficiently and to optimize interval membership functions effectively from training data, a new Genetic Granular Neural Network (GGNN) is designed. Simulation results show that the GGNN is able to extract useful fuzzy knowledge effectively and efficiently from training data and to achieve high training accuracy.
    Advances in Neural Networks - ISNN 2007, 4th International Symposium on Neural Networks, ISNN 2007, Nanjing, China, June 3-7, 2007, Proceedings, Part II; 01/2007
  • Bo Jin, Yan-Qing Zhang
    ABSTRACT: Not Available
    Granular Computing, 2006 IEEE International Conference on; 06/2006
  • Bo Jin, Yan-Qing Zhang
    ABSTRACT: With the growing interest in biological and chemical data prediction, increasingly complicated kernels are being designed to integrate data structures and relationships. We previously proposed evolutionary granular kernel trees (EGKTs) for drug activity comparisons [1]. In EGKTs, feature granules and tree structures are predefined based on the possible substituent locations. In this paper, we present a new system that evolves the structures of granular kernel trees (GKTs) when we lack the knowledge to predefine them. The new granular kernel tree structure evolving system is used for cyclooxygenase-2 inhibitor activity comparison. Experimental results show that the new system can achieve better performance than SVMs with traditional RBF kernels in terms of prediction accuracy.
    Transactions on Computational Systems Biology - TCSB. 01/2006;
  • Bo Jin, Yan-Qing Zhang
    ABSTRACT: Because the training time and space complexities of SVMs depend mainly on the size of the training set, SVMs are not suitable for classifying large data sets with several million examples. To solve this problem, in this paper we propose a new algorithm called minimum enclosing ball (MEB) based SVM (MEB-SVM). In MEB-SVM, the boundary of each class data set is first measured by several MEBs, and then an SVM is trained on the data located on the two class boundaries. Experiments on the KDDCUP-99 intrusion detection data set with about five million examples, the Ringnorm artificial data set with one hundred million examples, and the NDC data set with two million examples show that the new algorithm has competitive performance in terms of running time, testing accuracy and number of support vectors.
    Fuzzy Systems, 2006 IEEE International Conference on; 01/2006
  • Bo Jin, Yan-Qing Zhang
    ABSTRACT: How to design powerful and flexible kernels to improve the system performance is an important topic in kernel based classification. In this paper, we present a new granular kernel method to improve the performance of Support Vector Machines (SVMs). In the system, genetic algorithms (GAs) are used to generate feature granules and optimize them together with fusions and parameters of granular kernels. The new granular kernel method is used for cyclooxygenase-2 inhibitor activity comparison. Experimental results show that the new method can achieve better performance than SVMs with traditional RBF kernels in terms of prediction accuracy.
    Advances in Neural Networks - ISNN 2006, Third International Symposium on Neural Networks, Chengdu, China, May 28 - June 1, 2006, Proceedings, Part I; 01/2006
  • ABSTRACT: Kernel methods, specifically support vector machines (SVMs), have been widely used in many fields for data classification and pattern recognition. The performance of SVMs is mainly affected by kernel functions. With the growing interest in biological and chemical data prediction, such as structure-property based molecule comparison, protein structure prediction and long DNA sequence comparison, more powerful and flexible kernels need to be designed to express effectively the prior knowledge and relationships within each data item. In this paper, the granular kernel concept is presented and related properties are described in detail. A hierarchical kernel design method is proposed to construct granular kernel trees (GKTs). For a particular problem, genetic algorithms (GAs) are used to find the optimum parameter settings of GKTs. In applications, SVMs with the new kernel trees are employed for the comparison of drug activities. The experimental results show that SVMs with GKTs and evolutionary GKTs can achieve better performance than SVMs with traditional RBF kernels in terms of prediction accuracy.
    Computational Intelligence in Bioinformatics and Computational Biology, 2005. CIBCB '05. Proceedings of the 2005 IEEE Symposium on; 12/2005
  • Bo Jin, Yan-Qing Zhang
    ABSTRACT: In support vector machine (SVM) learning, the data to be classified are fed directly to the algorithms without modification. In many real-world applications, however, objects cannot be represented accurately by their original feature vectors, because the original features might contain noise, imprecise descriptions, or unrelated information, which hinder SVMs from learning useful knowledge from the raw data. To address this problem, in this paper we present an evolutionary feature weights optimization method, which is used to transform the raw data into a "better" feature space to improve SVM classification accuracy.
    Fuzzy Information Processing Society, 2005. NAFIPS 2005. Annual Meeting of the North American; 07/2005
  • ABSTRACT: In many life science applications, due to the requirement of real-time data processing and/or the very large size of the available dataset, a highly efficient classifier is usually preferable, or even necessary, provided that effectiveness does not deteriorate too much. That means a more desirable classifier in this context should run faster but still retain high accuracy. In this paper, we show a simple but fast method for modeling a linear granular support vector machine (GSVM) by splitting the feature space into two smaller subspaces and then building an SVM for each of them. The hyperplane used to halve the feature space is searched for by applying the extended statistical margin maximization principle along the direction orthogonal to the first principal component. One public medical dataset is used to compare the resulting linear GSVM to one optimized SVM with the radial basis function (RBF) kernel in the whole feature space. The experimental results show that finding the splitting hyperplane is not a trivial task and that the linear GSVM is even a little better than the optimized RBF-SVM in terms of testing accuracy; it is also more robust against noise, more stable with respect to model parameters, and runs much faster. In general, GSVM provides an interesting new mechanism, which is competitive with kernel mapping methods, to address complex classification problems effectively and efficiently.
    Fuzzy Systems, 2005. FUZZ '05. The 14th IEEE International Conference on; 06/2005
  • ABSTRACT: In this paper, we present a fuzzy hybrid kernel that combines several conventional kernels by using the TSK model. The major technical merit is a more reliable kernel built by fusing different kernels. A support vector machine (SVM) with the fuzzy hybrid kernel is employed for protein subcellular localization classification. Experimental results indicate that the SVM with the new fuzzy hybrid kernel is better than SVMs with conventional kernels.
    Fuzzy Systems, 2005. FUZZ '05. The 14th IEEE International Conference on; 06/2005
  • ABSTRACT: Protein homology prediction between protein sequences is one of the critical problems in computational biology. Such a complex classification problem is common in medical or biological information processing applications. How to build a model with superior generalization capability from training samples is an essential issue for mining knowledge to accurately predict/classify unseen new samples and to effectively support human experts in making correct decisions. A new learning model called granular support vector machines (GSVM) is proposed based on our previous work. GSVM systematically and formally combines the principles from statistical learning theory and granular computing theory and thus provides an interesting new mechanism to address complex classification problems. It works by building a sequence of information granules and then building support vector machines (SVMs) in some of these information granules on demand. A good granulation method to find suitable granules is crucial for modeling a GSVM with good performance. In this paper, we also propose an association rules-based granulation method. The granules induced by association rules with high enough confidence and significant support are left as they are because of their high "purity" and significant effect on simplifying the classification task. For every other granule, an SVM is modeled to discriminate the corresponding data. In this way, a complex classification problem is divided into multiple smaller problems so that the learning task is simplified. The proposed algorithm, here named GSVM-AR, is compared with SVM on the KDDCUP04 protein homology prediction data. The experimental results show that finding the splitting hyperplane is not a trivial task (we should be careful to select the association rules to avoid overfitting) and that GSVM-AR does show significant improvement compared to building one single SVM in the whole feature space. Another advantage is that GSVM-AR is easy to implement. More importantly and more interestingly, GSVM provides a new mechanism to address complex classification problems.
    Artificial Intelligence in Medicine 01/2005; 35(1-2):121-34. · 1.36 Impact Factor
  • ABSTRACT: The Support Vector Machine (SVM) as a learning system has been widely employed for pattern recognition and data classification tasks such as biological data classification. Choosing appropriate parameters is essential for an SVM to achieve high global performance. In this paper, we propose a new binary multi-SVM voting system without difficult parameter selection for protein subcellular localization prediction. Extensive experimental results demonstrate that the multi-SVM voting system can achieve higher average prediction accuracies for protein subcellular localization prediction than the traditional single-SVM system.
    Computational Science and Its Applications - ICCSA 2005, International Conference, Singapore, May 9-12, 2005, Proceedings, Part III; 01/2005
  • ABSTRACT: We propose a new learning model called granular support vector machines for data classification problems. Granular support vector machines systematically and formally combine the principles from statistical learning theory and granular computing theory. The model works by building a sequence of information granules and then building a support vector machine in each information granule. In this paper, we also give a simple but efficient implementation method for modeling a granular support vector machine by building just two information granules in a top-down way (that is, halving the whole feature space). The hyperplane used to halve the feature space is selected by extending the statistical margin maximization principle. The experimental results on three medical binary classification problems show that finding the splitting hyperplane is not a trivial task. For some datasets and some kernel functions, granular support vector machines with two information granules achieve some improvement in testing accuracy, but for other datasets, building one single support vector machine in the whole feature space performs a little better. How to obtain the optimal information granules is still an open problem. The important point is that the granular support vector machines proposed in this work provide an interesting new mechanism to address complex classification problems, which are common in medical or biological information processing applications.
    Computational Intelligence in Bioinformatics and Computational Biology, 2004. CIBCB '04. Proceedings of the 2004 IEEE Symposium on; 11/2004
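
Several of the abstracts above describe granular kernels: kernels evaluated on groups of features (granules) and then fused into one kernel. As a minimal sketch of that general idea, and not the authors' implementation, the Python below evaluates a one-level kernel tree with an RBF kernel per granule and a product or averaged-sum fusion at the root. The function names, the granule index lists, and the gamma values are hypothetical choices made for illustration; the abstracts do not specify them. Sums and products of valid kernels are themselves valid kernels, which is why these two fusions are used here.

```python
import math


def rbf(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2) on one feature granule."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def granular_kernel(x, y, granules, gammas, fusion="product"):
    """Evaluate a one-level kernel tree (illustrative, hypothetical API).

    granules: list of feature-index lists, one list per granule.
    gammas:   one RBF width parameter per granule.
    fusion:   "product" or "sum" (the sum is averaged over the granules).
    """
    values = [
        rbf([x[i] for i in idx], [y[i] for i in idx], gamma)
        for idx, gamma in zip(granules, gammas)
    ]
    if fusion == "product":
        return math.prod(values)
    if fusion == "sum":
        return sum(values) / len(values)
    raise ValueError(f"unknown fusion: {fusion!r}")


# Example: four features split into two granules of two features each.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 5.0]
granules = [[0, 1], [2, 3]]   # assumed grouping, for illustration only
gammas = [0.5, 0.1]           # assumed widths, fixed by hand

k_product = granular_kernel(x, y, granules, gammas, fusion="product")
k_sum = granular_kernel(x, y, granules, gammas, fusion="sum")
```

In the abstracts, parameters of this kind are tuned by genetic algorithms; here they are fixed by hand to keep the sketch short.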