search GenBank: interactive orchestration and ad-hoc choreography of Web services in the exploration of the biomedical resources of the National Center For Biotechnology Information

BMC Bioinformatics (Impact Factor: 2.58). 03/2013; 14(1):73. DOI: 10.1186/1471-2105-14-73
Source: PubMed


Due to the growing number of biomedical entries in the data repositories of the National Center for Biotechnology Information (NCBI), third-party software developers find it difficult to collect, manage and process all of these entries in one place without significant investment in hardware and software infrastructure and in its maintenance and administration. Web services allow the development of software applications that integrate, in one place, the functionality and processing logic of distributed software components, without integrating the components themselves or the resources to which they have access. This is achieved by appropriate orchestration or choreography of available Web services and their shared functions. Following the successful application of Web services in the business sector, this technology can now be used to build composite software tools oriented towards biomedical data processing.

We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. The dedicated search GenBank system uses NCBI Web services and the Entrez Programming Utilities (eUtils) package to provide extended searching capabilities in NCBI data repositories. In search GenBank, users can follow one of three exploration paths: simple data searching based on a user-specified query, advanced data searching based on a user-specified query, and advanced data exploration with the use of macros. search GenBank orchestrates calls to the particular tools, available through the NCBI Web service, that provide the requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using the available links. By building macros in the advanced data exploration mode, on the other hand, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases.
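The idea of a macro as a chain of eUtils calls can be illustrated with a minimal sketch. It assumes the public eUtils base URL and the standard `esearch` and `elink` endpoints with their `db`, `dbfrom`, `term` and `id` parameters; the functions `macro_plan` and `step_url` and the list-of-steps representation are hypothetical and are not the macro format used by search GenBank. The sketch only builds request URLs and does not contact NCBI.

```python
from urllib.parse import urlencode

# Public base address of the Entrez Programming Utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def macro_plan(term, databases):
    """Describe a macro as an ordered list of eUtils calls.

    Hypothetical representation: the first database is searched with
    esearch, and each following database is reached from the previous
    one with elink.
    """
    first, *rest = databases
    plan = [("esearch", {"db": first, "term": term})]
    source = first
    for target in rest:
        plan.append(("elink", {"dbfrom": source, "db": target}))
        source = target
    return plan


def step_url(tool, params, ids=()):
    """Build the request URL for one step; ids come from the previous step."""
    query = dict(params)
    if ids:
        query["id"] = ",".join(ids)
    return f"{EUTILS_BASE}/{tool}.fcgi?{urlencode(query)}"


if __name__ == "__main__":
    # Search nucleotide records, then follow links to proteins and to PubMed.
    for tool, params in macro_plan("insulin", ["nuccore", "protein", "pubmed"]):
        print(step_url(tool, params))
```

In an actual run, the identifiers returned by each call would be passed as `ids` to the next step, so that the chain discovers related records without user interaction.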

search GenBank extends the standard capabilities of the NCBI Entrez search engine in querying biomedical databases. The ability to create and save macros in search GenBank is a unique feature with great potential. This potential will grow further as the networks of relationships between data stored in particular databases become denser. search GenBank is available for public use at

Available from: Dariusz Mrozek,
  • ABSTRACT: MicroRNAs constitute an important class of noncoding, single-stranded, ~22 nucleotide long RNA molecules encoded by endogenous genes. They play an important role in regulating gene transcription and the regulation of normal development. MicroRNAs can be associated with disease; however, only a few microRNA-disease associations have been confirmed by traditional experimental approaches. We introduce two methods to predict microRNA-disease association. The first method, KATZ, focuses on integrating the social network analysis method with machine learning and is based on networks derived from known microRNA-disease associations, disease-disease associations, and microRNA-microRNA associations. The other method, CATAPULT, is a supervised machine learning method. We applied the two methods to 242 known microRNA-disease associations and evaluated their performance using leave-one-out cross-validation and 3-fold cross-validation. Experiments proved that our methods outperformed the state-of-the-art methods.
    08/2015; 2015(10):810514. DOI:10.1155/2015/810514
  • ABSTRACT: Background: Leptin, a 16 kDa peptide hormone synthesized and secreted specifically from white adipose cells, protects neurons against amyloid β-induced toxicity by increasing Apolipoprotein E (APO E)-dependent uptake of β amyloid into the cells, thereby protecting individuals from developing Alzheimer's disease (AD). The APO E ε4 allele is a known genetic risk factor for AD, accelerating its onset. It is estimated that the lifetime risk of developing AD increases to 29% for carriers with one ε4 allele and 9% for those with no ε4 allele. Objectives: To determine the levels of serum leptin, cholesterol, low density lipoprotein (LDL-C), and high density lipoprotein (HDL-C) in diagnosed cases of AD and their association with cognitive decline and Apolipoprotein E (APO E) genotypes in AD. Materials and methods: Serum levels of leptin, cholesterol, LDL-C, and HDL-C, along with APO E polymorphism, were studied in 39 subjects with probable AD and 42 cognitively normal individuals. Results: The AD group showed significantly lower levels of leptin (P = 0.00) as compared to the control group. However, there was no significant difference in cholesterol, triglycerides, LDL-C, and HDL-C levels between the AD and control groups. The frequency of the ε4 allele in AD (38.5%) was found to be significantly higher than in control (10.3%). The ε3 allele was more frequent than the ε4 allele in both the AD and control groups.
    Annals of Indian Academy of Neurology 10/2015; DOI:10.4103/0972-2327.157255 · 0.60 Impact Factor