Research interests

  • Reasonable platform independence

Publications

  • The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications.

    Toshiaki Katayama, Mark D Wilkinson, Rutger Vos, Takeshi Kawashima, Shuichi Kawashima, Mitsuteru Nakao, Yasunori Yamamoto, Hong-Woo Chun, Atsuko Yamaguchi, Shin Kawano, [......], Martin Senger, Jessica Severin, Yasumasa Shigemoto, Hideaki Sugawara, James Taylor, Oswaldo Trelles, Chisato Yamasaki, Riu Yamashita, Noriyuki Satoh, Toshihisa Takagi

    Journal of biomedical semantics. 08/2011; 2:4.

    ABSTRACT: The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency.
    We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.
  • Multifunctional crop trait ontology for breeders' data: field book, annotation, data discovery and semantic enrichment of the literature.

    Rosemary Shrestha, Elizabeth Arnaud, Ramil Mauleon, Martin Senger, Guy F Davenport, David Hancock, Norman Morrison, Richard Bruskiewich, Graham McLaren

    AoB plants. 01/2010; 2010:plq008.

    Agricultural crop databases maintained in gene banks of the Consultative Group on International Agricultural Research (CGIAR) are valuable sources of information for breeders. These databases provide comparative phenotypic and genotypic information that can help elucidate functional aspects of plant and agricultural biology. To facilitate data sharing within and between these databases and the retrieval of information, the crop ontology (CO) database was designed to provide controlled vocabulary sets for several economically important plant species. Existing public ontologies and equivalent catalogues of concepts covering the range of crop science information and descriptors for crops and crop-related traits were collected from breeders, physiologists, agronomists, and researchers in the CGIAR consortium. For each crop, relationships between terms were identified and crop-specific trait ontologies were constructed following the Open Biomedical Ontologies (OBO) format standard using the OBO-Edit tool. All terms within an ontology were assigned a globally unique CO term identifier. The CO currently comprises crop-specific traits for chickpea (Cicer arietinum), maize (Zea mays), potato (Solanum tuberosum), rice (Oryza sativa), sorghum (Sorghum spp.) and wheat (Triticum spp.). Several plant-structure and anatomy-related terms for banana (Musa spp.), wheat and maize are also included. In addition, multi-crop passport terms are included as controlled vocabularies for sharing information on germplasm.
    Two web-based online resources were built to make these COs available to the scientific community: the 'CO Lookup Service' for browsing the CO; and the 'Crops Terminizer', an ontology text mark-up tool. The controlled vocabularies of the CO are being used to curate several CGIAR centres' agronomic databases. The use of ontology terms to describe agronomic phenotypes and the accurate mapping of these descriptions into databases will be important steps in comparative phenotypic and genotypic studies across species and gene-discovery experiments.
  • The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium*.

    Toshiaki Katayama, Kazuharu Arakawa, Mitsuteru Nakao, Keiichiro Ono, Kiyoko F Aoki-Kinoshita, Yasunori Yamamoto, Atsuko Yamaguchi, Shuichi Kawashima, Hong-Woo Chun, Jan Aerts, [......], Daron M Standley, Hideaki Sugawara, Toshiyuki Tashiro, Oswaldo Trelles, Rutger A Vos, Mark D Wilkinson, William York, Christian M Zmasek, Kiyoshi Asai, Toshihisa Takagi

    Journal of biomedical semantics. 01/2010; 1(1):8.

    Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands for efficient systems without the need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues arisen from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed.
    Consequently, we improved interoperability of web services in several fields, however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies.
  • The phenotype and genotype experiment object model (PaGE-OM): a robust data structure for information related to DNA variation.
    Impact points: 6.89

    Anthony J Brookes, Heikki Lehvaslaiho, Juha Muilu, Yasumasa Shigemoto, Takashige Oroguchi, Takeshi Tomiki, Atsuhiro Mukaiyama, Akihiko Konagaya, Toshio Kojima, Ituro Inoue, [......], Gudmundur A Thorisson, Debasis Dash, Haseena Rajeevan, Matthew W Darlison, Mark Woon, David Fredman, Albert V Smith, Martin Senger, Kimitoshi Naito, Hideaki Sugawara

    Human mutation. 07/2009; 30(6):968-77.

    Torrents of genotype-phenotype data are being generated, all of which must be captured, processed, integrated, and exploited. To do this optimally requires the use of standard and interoperable "object models," providing a description of how to partition the total spectrum of information being dealt with into elemental "objects" (such as "alleles," "genotypes," "phenotype values," "methods") with precisely stated logical interrelationships (such as "A objects are made up from one or more B objects"). We herein propose the Phenotype and Genotype Experiment Object Model (PaGE-OM; www.pageom.org), which has been tested and implemented in conjunction with several major databases, and approved as a standard by the Object Management Group (OMG). PaGE-OM is open-source, ready for use by the wider community, and can be further developed as needs arise. It will help to improve information management, assist data integration, and simplify the task of informatics resource design and construction for genotype and phenotype data projects.
  • Interoperability with Moby 1.0--it's better than sharing your toothbrush!
    Impact points: 7.33

    Mark D Wilkinson, Martin Senger, Edward Kawas, Richard Bruskiewich, Jerome Gouzy, Celine Noirot, Philippe Bardou, Ambrose Ng, Dirk Haase, Enrique de Andres Saiz, [......], Antonio J Pérez, Jose Aldana, M Mar Rojano, Raul Fernandez-Santa Cruz, Ismael Navas, Gary Schiltz, Andrew Farmer, Damian Gessler, Heiko Schoof, Andreas Groscurth

    Briefings in bioinformatics. 06/2008; 9(3):220-31.

    The BioMoby project was initiated in 2001 from within the model organism database community. It aimed to standardize methodologies to facilitate information exchange and access to analytical resources, using a consensus driven approach. Six years later, the BioMoby development community is pleased to announce the release of the 1.0 version of the interoperability framework, registry Application Programming Interface and supporting Perl and Java code-bases. Together, these provide interoperable access to over 1400 bioinformatics resources worldwide through the BioMoby platform, and this number continues to grow. Here we highlight and discuss the features of BioMoby that make it distinct from other Semantic Web Service and interoperability initiatives, and that have been instrumental to its deployment and use by a wide community of bioinformatics service providers. The standard, client software, and supporting code libraries are all freely available at http://www.biomoby.org/.
  • The generation challenge programme platform: semantic standards and workbench for crop science.

    Richard Bruskiewich, Martin Senger, Guy Davenport, Manuel Ruiz, Mathieu Rouard, Tom Hazekamp, Masaru Takeya, Koji Doi, Kouji Satoh, Marcos Costa, [......], Mark Wilkinson, Benjamin Good, James Wagner, Jane Morris, David Marshall, Anthony Collins, Shoshi Kikuchi, Thomas Metz, Graham McLaren, Theo van Hintum

    International journal of plant genomics. 02/2008; 2008:369601.

    The Generation Challenge programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to facilitate biodiversity analysis, comparative analysis of crop genomic data, and plant breeding decision making.
  • The Generation Challenge Programme Platform: Semantic Standards and Workbench for Crop Science

    Bruskiewich Richard, Senger Martin, Davenport Guy, Ruiz Manuel, Rouard Mathieu, Hazekamp Tom, Takeya Masaru, Doi Koji, Satoh Kouji, Costa Marcos, [......], Wilkinson Mark, Good Benjamin, Wagner James, Morris Jane, Marshall David, Collins Anthony, Kikuchi Shoshi, Metz Thomas, McLaren Graham, Theo van Hintum

    International Journal of Plant Genomics. 01/2008;

    The Generation Challenge programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to facilitate biodiversity analysis, comparative analysis of crop genomic data, and plant breeding decision making.
  • BioMoby extensions to the Taverna workflow management and enactment software.
    Impact points: 3.43

    Edward Kawas, Martin Senger, Mark D Wilkinson

    BMC bioinformatics. 02/2006; 7:523.

    BACKGROUND: As biology becomes an increasingly computational science, it is critical that we develop software tools that support not only bioinformaticians, but also bench biologists in their exploration of the vast and complex data-sets that continue to build from international genomic, proteomic, and systems-biology projects. The BioMoby interoperability system was created with the goal of facilitating the movement of data from one Web-based resource to another to fulfill the requirements of non-expert bioinformaticians. In parallel with the development of BioMoby, the European myGrid project was designing Taverna, a bioinformatics workflow design and enactment tool. Here we describe the marriage of these two projects in the form of a Taverna plug-in that provides access to many of BioMoby's features through the Taverna interface. RESULTS: The exposed BioMoby functionality aids in the design of "sensible" BioMoby workflows, aids in pipelining BioMoby and non-BioMoby-based resources, and ensures that end-users need only a minimal understanding of both BioMoby, and the Taverna interface itself. Users are guided through the construction of syntactically and semantically correct workflows through plug-in calls to the Moby Central registry. Moby Central provides a menu of only those BioMoby services capable of operating on the data-type(s) that exist at any given position in the workflow.
    Moreover, the plug-in automatically and correctly connects a selected service into the workflow such that users are not required to understand the nature of the inputs or outputs for any service, leaving them to focus on the biological meaning of the workflow they are constructing, rather than the technical details of how the services will interoperate. CONCLUSION: With the availability of the BioMoby plug-in to Taverna, we believe that BioMoby-based Web Services are now significantly more useful and accessible to bench scientists than are more traditional Web Services.
  • Generation Challenge Programme (GCP): standards for crop data.
    Impact points: 2.29

    Richard Bruskiewich, Guy Davenport, Tom Hazekamp, Thomas Metz, Manuel Ruiz, Reinhard Simon, Masaru Takeya, Jennifer Lee, Martin Senger, Graham McLaren, Theo van Hintum

    Omics : a journal of integrative biology. 02/2006; 10(2):215-9.

    The Generation Challenge Programme (GCP) is an international research consortium striving to apply molecular biological advances to crop improvement for developing countries. Central to its activities is the creation of a next generation global crop information platform and network to share genetic resources, genomics, and crop improvement information. This system is being designed based on a comprehensive scientific domain object model and associated shared ontology. This model covers germplasm, genotype, phenotype, functional genomics, and geographical information data types needed in GCP research. This paper provides an overview of this modeling effort.
  • SOAP-based services provided by the European Bioinformatics Institute.
    Impact points: 7.48

    S Pillai, V. Silventoinen, K Kallio, M Senger, S Sobhany, J Tate, S Velankar, A Golovin, K Henrick, P Rice, P Stoehr, R Lopez

    Nucleic acids research. 08/2005; 33(Web Server issue):W25-8.

    SOAP (Simple Object Access Protocol) (http://www.w3.org/TR/soap) based Web Services technology (http://www.w3.org/ws) has gained much attention as an open standard enabling interoperability among applications across heterogeneous architectures and different networks. The European Bioinformatics Institute (EBI) is using this technology to provide robust data retrieval and data analysis mechanisms to the scientific community and to enhance utilization of the biological resources it already provides [N. Harte, V. Silventoinen, E. Quevillon, S. Robinson, K. Kallio, X. Fustero, P. Patel, P. Jokinen and R. Lopez (2004) Nucleic Acids Res., 32, 3-9]. These services are available free to all users from http://www.ebi.ac.uk/Tools/webservices.
  • Taverna: a tool for the composition and enactment of bioinformatics workflows.
    Impact points: 4.93

    Tom Oinn, Matthew Addis, Justin Ferris, Darren Marvin, Martin Senger, Mark Greenwood, Tim Carver, Kevin Glover, Matthew R Pocock, Anil Wipat, Peter Li

    Bioinformatics (Oxford, England). 12/2004; 20(17):3045-54.

    MOTIVATION: In silico experiments in bioinformatics involve the co-ordinated use of computational tools and information repositories. A growing number of these resources are being made available with programmatic access in the form of Web services. Bioinformatics scientists will need to orchestrate these Web services in workflows as part of their analyses. RESULTS: The Taverna project has developed a tool for the composition and enactment of bioinformatics workflows for the life sciences community. The tool includes a workbench application which provides a graphical user interface for the composition of workflows. These workflows are written in a new language called the simple conceptual unified flow language (Scufl), whereby each step within a workflow represents one atomic task. Two examples are used to illustrate the ease by which in silico experiments can be represented as Scufl workflows using the workbench application.
  • Bioinformatics

    R D Stevens, H. J. Tipney, C J Wroe, T. M. Oinn, M. Senger, P W Lord, C A Goble, A Brass, M. Tassabehji

    11/2004;

    Motivation: In silico experiments necessitate the virtual organization of people, data, tools and machines. The scientific process also necessitates an awareness of the experience base, both of personal data as well as the wider context of work. The management of all these data and the co-ordination of resources to manage such virtual organizations and the data surrounding them needs significant computational infrastructure support.
  • Exploring Williams-Beuren syndrome using myGrid.
    Impact points: 4.93

    R D Stevens, H. J. Tipney, C J Wroe, T. M. Oinn, M. Senger, P W Lord, C A Goble, A Brass, M. Tassabehji

    Bioinformatics (Oxford, England). 09/2004; 20 Suppl 1:i303-10.

    MOTIVATION: In silico experiments necessitate the virtual organization of people, data, tools and machines. The scientific process also necessitates an awareness of the experience base, both of personal data as well as the wider context of work. The management of all these data and the co-ordination of resources to manage such virtual organizations and the data surrounding them needs significant computational infrastructure support. RESULTS: In this paper, we show that (my)Grid, middleware for the Semantic Grid, enables biologists to perform and manage in silico experiments, then explore and exploit the results of their experiments. We demonstrate (my)Grid in the context of a series of bioinformatics experiments focused on a 1.5 Mb region on chromosome 7 which is deleted in Williams-Beuren syndrome (WBS). Due to the highly repetitive nature of sequence flanking/in the WBS critical region (WBSCR), sequencing of the region is incomplete leaving documented gaps in the released sequence. (my)Grid was used in a series of experiments to find newly sequenced human genomic DNA clones that extended into these 'gap' regions in order to produce a complete and accurate map of the WBSCR. Once placed in this region, these DNA sequences were analysed with a battery of prediction tools in order to locate putative genes and regulatory elements possibly implicated in the disorder. Finally, any genes discovered were submitted to a range of standard bioinformatics tools for their characterization.
    We report how (my)Grid has been used to create workflows for these in silico experiments, run those workflows regularly and notify the biologist when new DNA and genes are discovered. The (my)Grid services collect and co-ordinate data inputs and outputs for the experiment, as well as much provenance information about the performance of experiments on WBS. AVAILABILITY: The (my)Grid software is available via http://www.mygrid.org.uk
  • Exploring Williams-Beuren syndrome using

    Robert D. Stevens, Hannah J. Tipney, Chris Wroe, Thomas M. Oinn, Martin Senger, Phillip W. Lord, Carole A. Goble, Andy Brass, M. Tassabehji

    Proceedings Twelfth International Conference on Intelligent Systems for Molecular Biology/Third European Conference on Computational Biology 2004, Glasgow, UK, July 31-August 4, 2004; 01/2004

  • On the Use of Agents in a BioInformatics Grid

    Luc Moreau, Simon Miles, Carole Goble, Mark Greenwood, Vijay Dialani, Matthew Addis, Nedim Alpdemir, Rich Cawley, David De Roure, Justin Ferris, [......], V Radenkovic, Angus Roberts, Alan Robinson, Tom Rodden, Martin Senger, Nick Sharman, Robert Stevens, Brian Warboys, Anil Wipat, Chris Wroe

    06/2003;

    MyGrid is an e-Science Grid project that aims to help biologists and bioinformaticians to perform workflow-based in silico experiments, and help them to automate the management of such workflows through personalisation, notification of change and publication of experiments. In this paper, we describe the architecture of myGrid and how it will be used by the scientist. We then show how myGrid can benefit from agents technologies. We have identified three key uses of agent technologies in myGrid: user agents, able to customize and personalise data, agent communication languages offering a generic and portable communication medium, and negotiation allowing multiple distributed entities to reach service level agreements.
  • On the use of agents in a BioInformatics grid

    L Moreau, S. Miles, C. Goble, M. Greenwood, V. Dialani, M Addis, N. Alpdemir, R. Cawley, D. De Roure, J. Ferris, [......], M.V. Radenkovic, A. Roberts, A. Robinson, T. Rodden, M. Senger, N. Sharman, R. Stevens, B. Warboys, A. Wipat, C. Wroe

    Cluster Computing and the Grid, 2003. Proceedings. CCGrid 2003. 3rd IEEE/ACM International Symposium on; 06/2003

    MyGrid is an e-Science Grid project that aims to help biologists and bioinformaticians to perform workflow-based in silico experiments, and help them to automate the management of such workflows through personalisation, notification of change and publication of experiments. In this paper, we describe the architecture of myGrid and how it will be used by the scientist. We then show how myGrid can benefit from agents technologies. We have identified three key uses of agent technologies in myGrid: user agents, able to customize and personalise data, agent communication languages offering a generic and portable communication medium, and negotiation allowing multiple distributed entities to reach service level agreements.
