Yen-Chu Hsu

National Taiwan Ocean University, Keelung, Taiwan

Publications (5) · 3.23 Total impact

  • Dataset: Table S3
    ABSTRACT: Results of inter-dataset training and testing of the proposed method for the identification of DS-related homologs. Only DS-related homologs were used as positive data in this experiment, in which common homologs and non-homologs were both regarded as negative data. Performance measures listed in this table include AUC, MCC, sensitivity and specificity. (0.06 MB PDF)
    Preview · Dataset · Oct 2010
  • Dataset: Table S4
    ABSTRACT: Results of the structural alignments and hinge loop determinations for DSCO pairs in Datasets L and M. The 1,093 DSCO pairs successfully identified by the proposed method are listed here, each with the ranges of hinge loops determined by Eisenberg's and our methods, several structural similarity measures, the DS score defined in this work, and the virtual superimposition computed by our method. Structural superimpositions shown in this table were drawn using Jmol. (9.67 MB PDF)
    Preview · Dataset · Oct 2010
  • Dataset: Dataset S1
    ABSTRACT: PDB entry list for Dataset L. A list of the PDB entries of the protein pairs constituting sub-datasets Lds, Lch and Lnh. (0.13 MB XLS)
    Preview · Dataset · Oct 2010
  • Dataset: Table S1
    ABSTRACT: DS-detecting performance of DynDom assessed based on Eisenberg's DS dataset. Among the 39 query proteins, 12 are detected to possess hinge loops by DynDom [35]. The locations and ranges of hinge loops determined by DynDom are compared to those reported by Eisenberg et al. in [2]. (0.11 MB PDF)
    Preview · Dataset · Oct 2010
  • Dataset: Figure S3
    ABSTRACT: Performances of several protein structure/sequence comparison methods for the detection of global structural similarities between DS-related homologs with various sequence identities. An experiment that determines the simultaneous alignment qualities of the hinge loops, main domains and swapped domains for several protein structure/sequence comparison methods. (0.53 MB PDF)
    Preview · Dataset · Oct 2010
  • Dataset: Figure S4
    ABSTRACT: Examples of the A⋅D profile and related hinge loop detection procedure. (a) Crystallins with PDB identifiers 4gcrA and 1blbA, a quasi-domain swapping case [2]. (b) Crystallins with PDB identifiers 4gcrA and 2a5mA, a pair of common global homologs. (c) Acetyltransferases with PDB identifiers 1s60A and 1b6bA, a pair of quasi-domain swapping homologs with a small C-terminal-swapped “domain”. (1.56 MB PDF)
    Preview · Dataset · Oct 2010
  • Dataset: Table S5
    ABSTRACT: Structure-based sequence alignments for DSCO pairs in Datasets L and M performed by several protein structural comparison methods. The structure-based sequence alignments performed by TM-align [31], SARST [34] and the proposed DS-detecting method as well as the sequence alignments performed by BLAST [42] for the 1,093 DSCO pairs shown in Table S4 are listed here. (9.84 MB PDF)
    Preview · Dataset · Oct 2010
  • Dataset: Table S2
    ABSTRACT: Sensitivity and specificity of various alignment methods and structural similarity measures for the identification of common structural homologs and/or DS-related homologs. Sensitivity and specificity values of all alignment methods were determined based on S-div [41], except those of BLAST, which were determined based on a normalized sequence similarity score calculated according to Formula 8 in [34]. (0.07 MB XLS)
    Preview · Dataset · Oct 2010
  • ABSTRACT: This work presents a novel detection method for three-dimensional domain swapping (DS), a mechanism for forming protein quaternary structures that can be visualized as if monomers had "opened" their "closed" structures and exchanged the opened portion to form intertwined oligomers. Since the first report of DS in the mid-1990s, an increasing number of identified cases has led to the postulation that DS might occur in a protein with an unconstrained terminus under appropriate conditions. DS may play important roles in the molecular evolution and functional regulation of proteins and the formation of depositions in Alzheimer's and prion diseases. Moreover, it is promising for designing auto-assembling biomaterials. Despite the increasing interest in DS, related bioinformatics methods are rarely available. Owing to a dramatic conformational difference between the monomeric/closed and oligomeric/open forms, conventional structural comparison methods are inadequate for detecting DS. Hence, there is also a lack of comprehensive datasets for studying DS. Based on angle-distance (A-D) image transformations of secondary structural elements (SSEs), specific patterns within A-D images can be recognized and classified for structural similarities. In this work, a matching algorithm to extract corresponding SSE pairs from A-D images and a novel DS score have been designed and demonstrated to be applicable to the detection of DS relationships. The Matthews correlation coefficient (MCC) and sensitivity of the proposed DS-detecting method were higher than 0.81 even when the sequence identities of the proteins examined were lower than 10%. On average, the alignment percentage and root-mean-square distance (RMSD) computed by the proposed method were 90% and 1.8 Å for a set of 1,211 DS-related pairs of proteins. The performance of structural alignments remains high and stable for DS-related homologs with sequence identities below 10%. In addition, the quality of its hinge loop determination is comparable to that of manual inspection. This method has been implemented as a web-based tool, which takes two protein structures as input and determines the existence and/or type of DS relationship between them according to the A-D image-based structural alignments and the DS score. The proposed method is expected to trigger large-scale studies of this interesting structural phenomenon and facilitate related applications.
    Full-text · Article · Oct 2010 · PLoS ONE
  • Dataset: Figure S1
    ABSTRACT: The number of DS-related homologs, common homologs and non-homologs remaining in the test set from the experiments presented in Fig. 2 as the alignment ratio cutoff decreases. The alignment cutoff applied in this study is designed to remove globally superimposable homologous protein pairs from the testing datasets. Since many common homologous pairs are globally superimposable, as this cutoff is lowered the number of common homologs decreases much more rapidly than the number of DS-related homologs, which are only partially superimposable. Meanwhile, the number of non-homologous pairs remains nearly unchanged. Interestingly, relative to the number of all homologs, including DS-related and common ones, the number of DS-related homologs remaining in the dataset increases as the alignment ratio cutoff becomes lower within the tested range. (0.46 MB PDF)
    Preview · Dataset · Oct 2010
  • Dataset: Dataset S2
    ABSTRACT: PDB entry list for Dataset M. A list of the PDB entries of the protein pairs constituting sub-datasets Mds, Mch and Mnh. (0.24 MB XLS)
    Preview · Dataset · Oct 2010
  • Dataset: Table S6
    ABSTRACT: Number of SSEs in the swapped domains. Here an SSE means an α-helix or a β-strand. The number of SSEs that a swapped domain contains roughly reflects the size of the domain. The ranges of SSEs were extracted from the PDB files according to the HELIX and SHEET records. (0.07 MB PDF)
    Preview · Dataset · Oct 2010
  • Dataset: Figure S2
    ABSTRACT: Stability evaluations of the discriminatory model of the proposed method by k-fold cross-validations. The stability of the discriminatory model applied in the proposed DS-scoring scheme was evaluated based on two datasets. (a) Evaluations based on Dataset L. (b) Evaluations based on Dataset M. (1.34 MB PDF)
    Preview · Dataset · Oct 2010
  • Source

    Full-text · Conference Paper · Jan 2009
  • ABSTRACT: Protein data are growing explosively in both volume and diversity, yet many protein structures remain unresolved and their functions remain to be identified. Conventional sequence alignment tools are insufficient for remote homology detection, while current structural alignment tools encounter difficulties with proteins of unresolved structure. Here, we aimed to overcome the combination of two major obstacles to detecting remote homologous proteins: proteins with unresolved structure, and proteins of low sequence identity but high structural similarity. We proposed a novel method for improving the performance of protein matching, especially for mining remote homologous proteins. In this study, existing secondary structure prediction techniques were applied to provide the locations of secondary structure elements of proteins. The proposed LESS (Length Encoded Secondary Structure) profile was then constructed for segment-based similarity comparison in parallel computing. Compared with a conventional residue-based sequence alignment tool, detection of remote protein homologies through the LESS profile is favourable in terms of speed and tolerance of high sequence diversity, and its accuracy and performance can compensate for the deficiencies of traditional primary sequence alignment methodology. This method may further support biologists in protein folding, evolution, and function prediction.
    No preview · Conference Paper · Jan 2009
  • ABSTRACT: With the advancement of biological techniques, research in the fields of marine evolution, ecology, and aquaculture is growing explosively in both volume and diversity. Tens of thousands of genomic sequences are available for important marine species. However, most of the corresponding structures and functions remain unresolved and unknown. To discover the biological characteristics of genomic sequences of a marine species, an efficient and effective method for detecting distantly related proteins based on experimentally known functions from model species becomes an important strategy. In this study, Ensembl and NCBI genetic databases were employed to build a primitive database of selected marine species. The system contained an abundance of useful DNA, RNA and protein information, and was named the Marine Species Genome Database (MSGD). To identify remotely related proteins, we proposed a novel LESS (length encoded secondary structure) profile to improve information retrieval, especially for identifying protein sequences without resolved structures and with low sequence identity. The matching algorithms applied several existing secondary structure prediction techniques and a feasible encoding mechanism with respect to the length distribution of secondary structures. Owing to the conservation of secondary structures of proteins in evolution, the proposed system demonstrated its suitability for similarity comparison of distantly related proteins, and MSGD retrieved several important protein sequences that well-known residue-based matching methods failed to identify.
    No preview · Conference Paper · Jan 2009
  • Tien-Yu Chen · Yen-Chu Hsu · Hsin-Wei Wang · Tun-Wen Pai

    No preview · Conference Paper · Jan 2008
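
Several of the dataset abstracts listed above report sensitivity, specificity and the Matthews correlation coefficient (MCC) for the identification of DS-related homologs. As a minimal sketch of how these measures follow from confusion-matrix counts (this is not the authors' code; the function name and the example counts are hypothetical and chosen only for illustration):

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and MCC from confusion-matrix counts.

    tp/fn count positive pairs (e.g. DS-related homologs) that were
    detected/missed; tn/fp count negative pairs (e.g. common homologs and
    non-homologs) that were rejected/wrongly accepted.
    """
    # Sensitivity: fraction of true positives among all actual positives.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives among all actual negatives.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC: correlation between predicted and actual labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


if __name__ == "__main__":
    # Hypothetical counts, not taken from any of the listed datasets.
    print(classification_metrics(tp=90, fp=10, tn=90, fn=10))
```

Unlike sensitivity or specificity alone, MCC uses all four counts, which is one reason it is often reported when the positive and negative classes differ greatly in size.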