Description
Journal of Biomedical Discovery and Collaboration is an open access, peer-reviewed, online journal soon to be launched by BioMed Central. Journal of Biomedical Discovery and Collaboration will encompass all aspects of scientific information management and studies of scientific practice, with a particular emphasis on biomedical laboratory investigations. Currently, many scattered disciplines study aspects of scientific practice, including informatics, computer science, sociology, cognitive psychology, scientometrics, rhetoric, and history and philosophy of science. The journal will connect these disparate perspectives with each other, and with contemporary scientific practice. Journal of Biomedical Discovery and Collaboration will emphasize original research, but will also consider the following article types: software articles, case studies, discovery notes, discovery diaries, reviews, commentaries, and debate articles. It will publish scholarly studies of scientific practice, information needs, tool development, bibliometrics, and data representation methods, amongst others.
Website
Other titles
JBDC
ISSN
1747-5333
OCLC
65636895
Material type
Document, Periodical, Internet resource
Document type
Internet Resource, Computer File, Journal / Magazine / Newspaper
Publications in this journal
Authors: Barbara Mirel, Felix Eichinger, Benjamin J Keller, Matthias Kretzler
Journal of biomedical discovery and collaboration. 6:1-33.
Background: Bioinformatics visualization tools are often not robust enough to support biomedical specialists’ complex exploratory analyses. Tools need to accommodate the workflows that scientists actually perform for specific translational research questions. To understand and model one of these workflows, we conducted a case-based, cognitive task analysis of a biomedical specialist’s exploratory workflow for the question: What functional interactions among gene products of high throughput expression data suggest previously unknown mechanisms of a disease? Results: From our cognitive task analysis, four complementary representations of the targeted workflow were developed. They include: usage scenarios, flow diagrams, a cognitive task taxonomy, and a mapping between cognitive tasks and user-centered visualization requirements. The representations capture the flows of cognitive tasks that led a biomedical specialist to inferences critical to hypothesizing. We created representations at levels of detail that could strategically guide visualization development, and we confirmed this by making a trial prototype based on user requirements for a small portion of the workflow. Conclusions: Our results imply that visualizations should make available to scientific users “bundles of features” consonant with the compositional cognitive tasks purposefully enacted at specific points in the workflow. We also highlight certain aspects of visualizations that: (a) need more built-in flexibility; (b) are critical for negotiating meaning; and (c) are necessary for essential metacognitive support.
Authors: Wiley Souba
Journal of biomedical discovery and collaboration. 6:53-69.
Discovery, as a public attribution, and discovering, the act of conducting research, are experiences that entail "languaging" the unknown. This distinguishing property of language - its ability to bring forth, out of the unspoken realm, new knowledge, original ideas, and novel thinking - is essential to the discovery process. In sharing their ideas and views, scientists create co-negotiated linguistic distinctions that prompt the revision of established mental maps and the adoption of new ones. While scientific mastery entails command of the conversational domain unique to a specific discipline, there is an emerging conversational domain that must be mastered that goes beyond the language unique to any particular specialty. Mastery of this new conversational domain gives researchers access to their hidden mental maps that limit their ways of thinking about and doing science. The most effective scientists use language to recontextualize their approach to problem-solving, which triggers new insights (previously unavailable) that result in new discoveries. While language is not a replacement for intuition and other means of knowing, when we try to understand what's outside of language we have to use language to do so.
Authors: Don R Swanson
Journal of biomedical discovery and collaboration. 6:34-47.
It is possible to find in the medical literature many articles that have been neglected or ignored, in some cases for many years, but which are worth bringing to light because they report unusual findings that may be of current scientific interest. Resurrecting previously published but neglected hypotheses that have merit might be overlooked because it would seem to lack the novelty of "discovery" -- but the potential value of so doing is hardly arguable. Finding neglected hypotheses may be not only of great practical value, but also affords the opportunity to study the structure of such hypotheses in the hope of illuminating the more general problem of hypothesis generation.
Authors: George Hripcsak, Charles Knirsch, Li Zhou, Adam Wilcox, Genevieve Melton
Journal of biomedical discovery and collaboration. 6:48-52.
Large-scale electronic health record research introduces biases compared to traditional manually curated retrospective research. We used data from a community-acquired pneumonia study for which we had a gold standard to illustrate such biases. The challenges include data inaccuracy, incompleteness, and complexity, and they can produce distorted results. We found that a naïve approach approximated the gold standard, but errors on a minority of cases shifted mortality substantially. Manual review revealed errors in both selecting and characterizing the cohort, and narrowing the cohort improved the result. Nevertheless, a significantly narrowed cohort might contain its own biases that would be difficult to estimate.
Authors: Gareth A Palidwor, Miguel A Andrade-Navarro
Journal of biomedical discovery and collaboration. 5:1-6.
The MEDLINE database of medical literature is routinely used by researchers and doctors to find articles pertaining to their area of interest. Insight into historical changes in research areas may be gained by chronological analysis of the 18 million records currently in the database; however, such analysis is generally complex and time consuming. The authors' MLTrends web application graphs term usage in MEDLINE over time, allowing the determination of emergence dates for biomedical terms and historical variations in term usage intensity. MLTrends may be used at: http://www.ogic.ca/mltrends.
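The kind of chronological term-usage analysis described in this abstract can be illustrated with a minimal sketch. This is not the MLTrends implementation: the function names `term_usage_by_year` and `emergence_year`, the simple substring matching, and the toy records are all assumptions made purely for illustration.

```python
from collections import Counter

def term_usage_by_year(records, term):
    """Return {year: fraction of that year's records whose text mentions term}.

    `records` is an iterable of (year, text) pairs; this is a hypothetical
    input format, not the MEDLINE record format.
    """
    totals = Counter()
    hits = Counter()
    needle = term.lower()
    for year, text in records:
        totals[year] += 1
        if needle in text.lower():
            hits[year] += 1
    # Sort years so the resulting dict iterates chronologically.
    return {year: hits[year] / totals[year] for year in sorted(totals)}

def emergence_year(usage):
    """Return the first year with non-zero usage, or None if never used."""
    for year, fraction in usage.items():
        if fraction > 0:
            return year
    return None

# Made-up records, for illustration only.
records = [
    (1990, "Protein folding kinetics"),
    (1990, "Enzyme assays in yeast"),
    (1995, "Microarray analysis of gene expression"),
    (1995, "Protein structure prediction"),
    (2000, "Microarray profiling of tumours"),
    (2000, "Microarray data normalization"),
]
usage = term_usage_by_year(records, "microarray")
first_seen = emergence_year(usage)
```

Normalizing by the number of records per year, rather than using raw counts, keeps the trend from simply mirroring the growth of the database.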
Authors: Heather Piwowar, Wendy Chapman
Journal of biomedical discovery and collaboration. 5:7-20.
Background: The ability to locate publicly available gene expression microarray datasets effectively and efficiently facilitates the reuse of these potentially valuable resources. Centralized biomedical databases allow users to query dataset metadata descriptions, but these annotations are often too sparse and diverse to allow complex and accurate queries. In this study we examined the ability of PubMed article identifiers to locate publicly available gene expression microarray datasets, and investigated whether the retrieved datasets were representative of publicly available datasets found through statements of data sharing in the associated research articles. Results: In a recent article, Ochsner and colleagues identified 397 studies that had generated gene expression microarray data. Their search of the full text of each publication for statements of data sharing revealed 203 publicly available datasets, including 179 in the Gene Expression Omnibus (GEO) or ArrayExpress databases. Our scripted search of GEO and ArrayExpress for PubMed identifiers of the same 397 studies returned 160 datasets, including six not found by the original search for data sharing statements. As a proportion of datasets found by either method, the search for data sharing statements identified 91.4% of the 209 publicly available datasets, compared to only 76.6% found by our search carried out using PubMed identifiers. Searching GEO or ArrayExpress alone retrieved 63.2% and 46.9% of all available datasets, respectively. There was no difference in the type of datasets found by PubMed identifier searches in terms of research theme or the technology used. However, the studies identified were more likely to have larger sample sizes, were more frequently cited, and published in higher impact journals.
Conclusions: Searching database entries using PubMed identifiers can identify the majority of publicly available datasets, but caution is required when this method is used to collect data for policy evaluation since studies in low impact journals are disproportionately excluded. We urge authors of all datasets to complete the citation fields for their dataset submissions once publication details are known, thereby ensuring their work has maximum visibility and can contribute to subsequent studies.
Authors: Siddhartha Reddy Jonnalagadda, Philip Topham
Journal of biomedical discovery and collaboration. 5:50-75.
Background: Today, there are more than 18 million articles related to biomedical research indexed in MEDLINE, and information derived from them could be used effectively to save the great amount of time and resources spent by government agencies in understanding the scientific landscape, including key opinion leaders and centers of excellence. Associating biomedical articles with organization names could significantly benefit the pharmaceutical marketing industry, health care funding agencies and public health officials and be useful for other scientists in normalizing author names, automatically creating citations, indexing articles and identifying potential resources or collaborators. A large amount of extracted information helps in disambiguating organization names using machine-learning algorithms. Results: We propose NEMO, a system for extracting organization names in the affiliation and normalizing them to a canonical organization name. Our parsing process involves multi-layered rule matching with multiple dictionaries. The system achieves more than 98% f-score in extracting organization names. Our normalization process involves clustering based on local sequence alignment metrics and local learning based on finding connected components. A high precision was also observed in normalization. Conclusion: NEMO is the missing link in associating each biomedical paper and its authors to an organization name in its canonical form and the geopolitical location of the organization.
This research could potentially help in analyzing large social networks of organizations for landscaping a particular topic, improving performance of author disambiguation, adding weak links in the co-author network of authors, augmenting NLM's MARS system for correcting errors in OCR output of affiliation field, and automatically indexing the PubMed citations with the normalized organization name and country. Our system is available as a graphical user interface for download along with this paper.
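The general idea of normalizing names by grouping similar strings into connected components can be sketched as follows. This is not NEMO's method: Python's `difflib.SequenceMatcher` ratio is swapped in here as a stand-in for the local sequence alignment metrics the abstract mentions, and the 0.75 threshold, the function names, and the sample affiliations are assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher

def similar(a, b, threshold=0.75):
    """Case-insensitive string similarity test (stand-in metric, see lead-in)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def cluster_names(names, threshold=0.75):
    """Group names into connected components of the 'similar' graph.

    Two names share a cluster if a chain of pairwise-similar names links them.
    Uses a small union-find structure; clusters are returned in order of
    first appearance.
    """
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similar(names[i], names[j], threshold):
                parent[find(j)] = find(i)

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return list(groups.values())

# Hypothetical affiliation strings, for illustration only.
names = ["Stanford University", "Stanford Univ", "Mayo Clinic", "Mayo Clinic."]
clusters = cluster_names(names)
```

Connected components make the grouping transitive: a variant only needs to resemble one existing member of a cluster, not all of them, which suits names that drift through successive abbreviations.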
Authors: Trevor Cohen, G Kerr Whitfield, Roger W Schvaneveldt, Kavitha Mukund, Thomas Rindflesch
Journal of biomedical discovery and collaboration. 5:21-49.
Background. EpiphaNet is an interactive knowledge discovery system which enables researchers to explore visually sets of relations extracted from MEDLINE using a combination of language processing techniques. In this paper, we discuss the theoretical and methodological foundations of the system, and evaluate the utility of the models that underlie it for literature-based discovery. In addition, we present a summary of results drawn from a qualitative analysis of over six hours of interaction with the system by basic medical scientists. Results: The system is able to simulate open and closed discovery, and is shown to generate associations that are both surprising and interesting within the area of expertise of the researchers concerned. Conclusions: EpiphaNet provides an interactive visual representation of associations between concepts, which is derived from distributional statistics drawn from across the spectrum of biomedical citations in MEDLINE. This tool is available online, providing biomedical scientists with the opportunity to identify and explore associations of interest to them.
Authors: Barbara Mirel
Journal of biomedical discovery and collaboration. 4(1):2.
ABSTRACT: BACKGROUND: Current usability studies of bioinformatics tools suggest that tools for exploratory analysis support some tasks related to finding relationships of interest but not the deep causal insights necessary for formulating plausible and credible hypotheses. To better understand design requirements for gaining these causal insights in systems biology analyses a longitudinal field study of 15 biomedical researchers was conducted. Researchers interacted with the same protein-protein interaction tools to discover possible disease mechanisms for further experimentation. RESULTS: Findings reveal patterns in scientists' exploratory and explanatory analysis and reveal that tools positively supported a number of well-structured query and analysis tasks. But for several of scientists' more complex, higher order ways of knowing and reasoning the tools did not offer adequate support. Results show that for a better fit with scientists' cognition for exploratory analysis systems biology tools need to better match scientists' processes for validating, for making a transition from classification to model-based reasoning, and for engaging in causal mental modelling. CONCLUSIONS: As the next great frontier in bioinformatics usability, tool designs for exploratory systems biology analysis need to move beyond the successes already achieved in supporting formulaic query and analysis tasks and now reduce current mismatches with several of scientists' higher order analytical practices. The implications of results for tool designs are discussed.
Authors: Neil Smalheiser, Sandra L De Groote, Mary M Case
Journal of biomedical discovery and collaboration. 4:6.
Authors: Matthijs den Besten, Arthur J Thomas, Ralph Schroeder
Journal of biomedical discovery and collaboration. 4:5.
Background: It is often said that the life sciences are transforming into an information science. As laboratory experiments are starting to yield ever increasing amounts of data and the capacity to deal with those data is catching up, an increasing share of scientific activity is seen to be taking place outside the laboratories, sifting through the data and modelling "in silico" the processes observed "in vitro." The transformation of the life sciences and similar developments in other disciplines have inspired a variety of initiatives around the world to create technical infrastructure to support the new scientific practices that are emerging. The e-Science programme in the United Kingdom and the NSF Office for Cyberinfrastructure are examples of these. In Switzerland there have been no such national initiatives. Yet, this has not prevented scientists from exploring the development of similar types of computing infrastructures. In 2004, a group of researchers in Switzerland established a project, SwissBioGrid, to explore whether Grid computing technologies could be successfully deployed within the life sciences. This paper presents their experiences as a case study of how the life sciences are currently operating as an information science and presents the lessons learned about how existing institutional and technical arrangements facilitate or impede this operation. Results: SwissBioGrid gave rise to two pilot projects: one for proteomics data analysis and the other for high-throughput molecular docking ("virtual screening") to find new drugs for neglected diseases (specifically, for dengue fever).
The proteomics project was an example of a data management problem, applying many different analysis algorithms to Terabyte-sized datasets from mass spectrometry, involving comparisons with many different reference databases; the virtual screening project was more a purely computational problem, modelling the interactions of millions of small molecules with a limited number of protein targets on the coat of the dengue virus. Both present interesting lessons about how scientific practices are changing when they tackle the problems of large-scale data analysis and data management by means of creating a novel technical infrastructure. Conclusions: In the experience of SwissBioGrid, data intensive discovery has a lot to gain from close collaboration with industry and harnessing distributed computing power. Yet the diversity in life science research implies only a limited role for generic infrastructure; and the transience of support means that researchers need to integrate their efforts with others if they want to sustain the benefits of their success, which are otherwise lost.
Authors: Hong Yu, Shashank Agarwal, Mark Johnston, Aaron Cohen
Journal of biomedical discovery and collaboration. 4(1):1.
ABSTRACT: BACKGROUND: Biomedical scientists need to access figures to validate research facts and to formulate or to test novel research hypotheses. However, figures are difficult to comprehend without associated text (e.g., figure legend and other reference text). We are developing automated systems to extract the relevant explanatory information along with figures extracted from full text articles. Such systems could be very useful in improving figure retrieval and in reducing the workload of biomedical scientists, who otherwise have to retrieve and read the entire full-text journal article to determine which figures are relevant to their research. As a crucial step, we studied the importance of associated text in biomedical figure comprehension. METHODS: Twenty subjects evaluated three figure-text combinations: figure+legend, figure+legend+title+abstract, and figure+full-text. Using a Likert scale, each subject scored each figure+text according to the extent to which the subject thought he/she understood the meaning of the figure and the confidence in providing the assigned score. Additionally, each subject entered a free text summary for each figure-text. We identified missing information using indicator words present within the text summaries. Both the Likert scores and the missing information were statistically analyzed for differences among the figure-text types. We also evaluated the quality of text summaries with the text-summarization evaluation method the ROUGE score. RESULTS: Using figure+legend category as a baseline, the comprehension and confidence scores entered by biomedical scientists increased 27% and 7% when title+abstract were added, and 40% and 11% when full-text was available.
Figure comprehension on the basis of missing information analysis increased 25-128% when title+abstract were added, and 49-169% when full-text was available. The ROUGE score also followed the same trend, increasing 25%-30% when title+abstract were added, and 33-155% when full-text was available. The differences in figure comprehension at different levels of associated text were in most cases statistically significant. CONCLUSION: We conclude that the texts that appear in full-text biomedical articles are useful for understanding the meaning of a figure, and an effective figure-mining system needs to unlock the information beyond figure legend. Our work provides important guidance to the figure mining systems that extract information only from figure and figure legend.
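For readers unfamiliar with ROUGE, the following minimal sketch shows one common variant, ROUGE-1 recall, which measures the share of reference words that also appear in a candidate summary. The abstract does not say which ROUGE variant or tooling the authors used, so the variant choice, the whitespace tokenization, the function name `rouge1_recall`, and the example sentences are all assumptions made for illustration.

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams that are matched by the candidate.

    Tokenization is plain lowercased whitespace splitting, a simplification;
    repeated words are matched at most as often as they occur in the candidate.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / sum(ref.values())

# Made-up sentences, for illustration only.
reference = "the figure shows protein expression in treated cells"
summary = "the figure shows expression in cells"
score = rouge1_recall(summary, reference)
```

Here six of the eight reference words appear in the summary, so the recall is 6/8; a summary that omits more of the reference content scores lower.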
Authors: Alan Gross
Journal of biomedical discovery and collaboration. 4:3.
This paper documents the cognitive strategies that led to Faraday's first significant scientific discovery. For Faraday, discovery is essentially a matter of seeing as, of substituting, for the eye all possess, the eye of analysis all scientists must develop. In the process of making his first significant discovery, Faraday learns to dismiss the magnetic attractions and repulsions he and others had observed; by means of systematic variations in his experimental set-up, he learns to see these motions as circular: it is the first indication that an electro-magnetic field exists. In communicating his discoveries, Faraday, of course, takes into consideration his various audiences' varying needs and their differences in scientific competence; but whatever his audience, Faraday learns to convey what it feels like to do science, to shift from seeing to seeing as, from sight to insight.
Authors: Gregory F Welch, Diane H Sonnenwald, Henry Fuchs, Bruce Cairns, Ketan Mayer-Patel, Hanna M. Söderholm, Ruigang Yang, Andrei State, Herman Towles, Adrian Ilie, Manoj K Ampalam, Srinivas Krishnan, Vincent Noel, Michael Noland, James E. Manning
Journal of biomedical discovery and collaboration. 4:4.
Two-dimensional (2D) videoconferencing has been explored widely in the past 15-20 years to support collaboration in healthcare. Two issues that arise in most evaluations of 2D videoconferencing in telemedicine are the difficulty obtaining optimal camera views and poor depth perception. To address these problems, we are exploring the use of a small array of cameras to reconstruct dynamic three-dimensional (3D) views of a remote environment and of events taking place within. The 3D views could be sent across wired or wireless networks to remote healthcare professionals equipped with fixed displays or with mobile devices such as personal digital assistants (PDAs). The remote professionals' viewpoints could be specified manually or automatically (continuously) via user head or PDA tracking, giving the remote viewers head-slaved or hand-slaved virtual cameras for monoscopic or stereoscopic viewing of the dynamic reconstructions. We call this idea remote 3D medical collaboration. In this article we motivate and explain the vision for 3D medical collaboration technology; we describe the relevant computer vision, computer graphics, display, and networking research; we present a proof-of-concept prototype system; and we present evaluation results supporting the general hypothesis that 3D remote medical collaboration technology could offer benefits over conventional 2D videoconferencing in emergency healthcare.
Authors: Tim Lenoir, Patrick Herron
Journal of biomedical discovery and collaboration. 4:8.
Background: Over the last decade China has emerged as a major producer of scientific publications, currently ranking second behind the US. During that time Chinese strategic policy initiatives have placed indigenous innovation at the heart of its economy while focusing internal R&D investments and the attraction of foreign investment in nanotechnology as one of their four top areas. China's scientific research publication and nanotechnology research publication production has reached a rank of second in the world, behind only the US. Despite these impressive gains, some scholars argue that the quality of Chinese nanotech research is inferior to US research quality due to lower overall times cited rates, suggesting that the US is still the world leader. We combine citation analysis, text mining, mapping, and data visualization to gauge the development and application of nanotechnology in China, particularly in biopharmananotechnology, and to measure the impact of Chinese policy on nanotechnology research production. Results: Our text mining-based methods provide results that counter existing claims about Chinese nanotechnology research quality. Due in large part to its strategic innovation policy, China's output of nanotechnology publications is on pace to surpass US production in or around 2012. A closer look at Chinese nanotechnology research literature reveals a large increase in research activity in China's biopharmananotechnology research since the implementation in January, 2006 of China's Medium & Long Term Scientific and Technological Development Plan Guidelines for the period 2006-2020 ("MLP").
Conclusions: Since the implementation of the MLP, China has enjoyed a great deal of success producing bionano research findings while attracting a great deal of foreign investment from pharmaceutical corporations setting up advanced drug discovery operations. Given the combination of current scientific production growth as well as economic growth, a relatively low scientific capacity, and the ability of its policy to enhance such trends, China is in some sense already the new world leader in nanotechnology. Further, the Chinese national innovation system may be the new standard by which other national S&T policies should be measured.
Authors: Gary Merrill
Journal of biomedical discovery and collaboration. 4:7.
This paper advances a detailed exploration of the complex relationships among terms, concepts, and synonymy in the UMLS Metathesaurus, and proposes the study and understanding of the Metathesaurus from a model-theoretic perspective. Initial sections provide the background and motivation for such an approach, and a careful informal treatment of these notions is offered as a context and basis for the formal analysis. What emerges from this is a set of puzzles and confusions in the Metathesaurus and its literature pertaining to synonymy and its relation to terms and concepts. A model theory for a segment of the Metathesaurus is then constructed, and its adequacy relative to the informal treatment is demonstrated. Finally, it is shown how this approach clarifies and addresses the puzzles educed from the informal discussion, and how the model-theoretic perspective may be employed to evaluate some fundamental criticisms of the Metathesaurus.
Authors: William A Baumgartner, K Bretonnel Cohen, Lawrence Hunter
Journal of biomedical discovery and collaboration. 3:1.
ABSTRACT: BACKGROUND: Improved evaluation methodologies have been identified as a necessary prerequisite to the improvement of text mining theory and practice. This paper presents a publicly available framework that facilitates thorough, structured, and large-scale evaluations of text mining technologies. The extensibility of this framework and its ability to uncover system-wide characteristics by analyzing component parts as well as its usefulness for facilitating third-party application integration are demonstrated through examples in the biomedical domain. RESULTS: Our evaluation framework was assembled using the Unstructured Information Management Architecture. It was used to analyze a set of gene mention identification systems involving 225 combinations of system, evaluation corpus, and correctness measure. Interactions between all three were found to affect the relative rankings of the systems. A second experiment evaluated gene normalization system performance using as input 4,097 combinations of gene mention systems and gene mention system-combining strategies. Gene mention system recall is shown to affect gene normalization system performance much more than does gene mention system precision, and high gene normalization performance is shown to be achievable with remarkably low levels of gene mention system precision. CONCLUSION: The software presented in this paper demonstrates the potential for novel discovery resulting from the structured evaluation of biomedical language processing systems, as well as the usefulness of such an evaluation framework for promoting collaboration between developers of biomedical language processing technologies. The code base is available as part of the BioNLP UIMA Component Repository on SourceForge.net.
Authors: Neil R Smalheiser, Wei Zhou, Vetle I Torvik
Journal of biomedical discovery and collaboration. 3:2.
ABSTRACT: BACKGROUND: PubMed is designed to provide rapid, comprehensive retrieval of papers that discuss a given topic. However, because PubMed does not organize the search output further, it is difficult for users to grasp an overview of the retrieved literature according to non-topical dimensions, to drill-down to find individual articles relevant to a particular individual's need, or to browse the collection. RESULTS: In this paper, we present Anne O'Tate, a web-based tool that processes articles retrieved from PubMed and displays multiple aspects of the articles to the user, according to pre-defined categories such as the "most important" words found in titles or abstracts; topics; journals; authors; publication years; and affiliations. Clicking on a given item opens a new window that displays all papers that contain that item. One can navigate by drilling down through the categories progressively, e.g., one can first restrict the articles according to author name and then restrict that subset by affiliation. Alternatively, one can expand small sets of articles to display the most closely related articles. We also implemented a novel cluster-by-topic method that generates a concise set of topics covering most of the retrieved articles. CONCLUSION: Anne O'Tate is an integrated, generic tool for summarization, drill-down and browsing of PubMed search results that accommodates a wide range of biomedical users and needs. It can be accessed at 4. Peer review and editorial matters for this article were handled by Aaron Cohen.
Authors: Belinda Linden
Journal of biomedical discovery and collaboration. 3:3.
ABSTRACT: BACKGROUND: The term blue skies research implies a freedom to carry out flexible, curiosity-driven research that leads to outcomes not envisaged at the outset. This research often challenges accepted thinking and introduces new fields of study. Science policy in the UK has given growing support for short-term goal-oriented scientific research projects, with pressure being applied on researchers to demonstrate the future application of their work. These policies carry the risk of restricting freedom, curbing research direction, and stifling rather than stimulating the creativity needed for scientific discovery. METHODS: This study tracks the tortuous routes that led to three major discoveries in cardiology. It then investigates the constraints in current research, and opportunities that may be lost with existing funding processes, by interviewing selected scientists and fund providers for their views on curiosity-driven research and the freedom needed to allow science to flourish. The transcripts were analysed using a grounded theory approach to gather recurrent themes from the interviews. RESULTS: The results from these interviews suggest that scientists often cannot predict the future applications of research. Constraints such as lack of scientific freedom, and a narrow focus on relevance and accountability were believed to stifle the discovery process. Although it was acknowledged that some research projects do need a clear and measurable framework, the interviewees saw a need for inquisitive, blue skies research to be managed in a different way. They provided examples of situations where money allocated to 'safe' funding was used for more innovative research.
CONCLUSION: This sample of key UK scientists and grant providers acknowledge the importance of basic blue skies research. Yet the current evaluation process often requires that scientists predict their likely findings and estimate short-term impact, which does not permit freedom of research direction. There is a vital need for prominent scientists and for universities to help the media, the public, and policy makers to understand the importance of innovative thought along with the need for scientists to have the freedom to challenge accepted thinking. Encouraging an avenue for blue skies research could have immense influence over future scientific discoveries.
Authors: Kristina M Hettne, Marissa de Mos, Anke G J de Bruijn, Marc Weeber, Scott Boyer, Erik M van Mulligen, Montserrat Cases, Jordi Mestres, Johan van der Lei
Journal of biomedical discovery and collaboration. 2:2.
BACKGROUND: Collaborative efforts of physicians and basic scientists are often necessary in the investigation of complex disorders. Difficulties can arise, however, when large amounts of information need to be reviewed. Advanced information retrieval can be beneficial in combining and reviewing data obtained from the various scientific fields. In this paper, a team of investigators with varying backgrounds has applied advanced information retrieval methods, in the form of text mining and entity relationship tools, to review the current literature, with the intention to generate new insights into the molecular mechanisms underlying a complex disorder. As an example of such a disorder the Complex Regional Pain Syndrome (CRPS) was chosen. CRPS is a painful and debilitating syndrome with a complex etiology that is still unraveled for a considerable part, resulting in suboptimal diagnosis and treatment. RESULTS: A text mining based approach combined with a simple network analysis identified Nuclear Factor kappa B (NFkappaB) as a possible central mediator in both the initiation and progression of CRPS. CONCLUSION: The result shows the added value of a multidisciplinary approach combined with information retrieval in hypothesis discovery in biomedical research. The new hypothesis, which was derived in silico, provides a framework for further mechanistic studies into the underlying molecular mechanisms of CRPS and requires evaluation in clinical and epidemiological studies.
Authors: Rajan P Kulkarni
Journal of biomedical discovery and collaboration. 2:3.
ABSTRACT: Nanotechnology research has lately been of intense interest because of its perceived potential for many diverse fields of science. Nanotechnology's tools have found application in diverse fields, from biology to device physics. By the 1990s, there was a concerted effort in the United States to develop a national initiative to promote such research. The success of this effort led to a significant influx of resources and interest in nanotechnology and nanobiotechnology and to the establishment of centralized research programs and facilities. Further government initiatives (at federal, state, and local levels) have firmly cemented these disciplines as 'big science,' with efforts increasingly concentrated at select laboratories and centers. In many respects, these trends mirror certain changes in academic science over the past twenty years, with a greater emphasis on applied science and research that can be more directly utilized for commercial applications. We also compare the National Nanotechnology Initiative and its successors to the Human Genome Project, another large-scale, government funded initiative. These precedents made shifts in nanotechnology easier for researchers to accept, as they followed trends already established within most fields of science. Finally, these trends are examined in the design of technologies for detection and treatment of cancer, through the Alliance for Nanotechnology in Cancer initiative of the National Cancer Institute. Federal funding of these nanotechnology initiatives has allowed for expansion into diverse fields and the impetus for expanding the scope of research of several fields, especially biomedicine, though the ultimate utility and impact of all these efforts remains to be seen.
Authors: P. Bryan Heidorn, Carole L. Palmer, Dan Wright
Journal of biomedical discovery and collaboration. 2:1.
Data management and integration are complicated and ongoing problems that will require commitment of resources and expertise from the various biological science communities. Primary components of successful cross-scale integration are smooth information management and migration from one context to another. We call for a broadening of the definition of bioinformatics and bioinformatics training to span biological disciplines and biological scales. Training programs are needed that educate a new kind of informatics professional, Biological Information Specialists, to work in collaboration with various discipline-specific research personnel. Biological Information Specialists are an extension of the informationist movement that began within library and information science (LIS) over 30 years ago as a professional position to fill a gap in clinical medicine. These professionals will help advance science by improving access to scientific information and by freeing scientists who are not interested in data management to concentrate on their science.
Authors: Helen L Johnson, William A Baumgartner, Martin Krallinger, K Bretonnel Cohen, Lawrence Hunter
Journal of biomedical discovery and collaboration. 2:4.
ABSTRACT: BACKGROUND: Most biomedical corpora have not been used outside of the lab that created them, despite the fact that the availability of the gold-standard evaluation data that they provide is one of the rate-limiting factors for the progress of biomedical text mining. Data suggest that one major factor affecting the use of a corpus outside of its home laboratory is the format in which it is distributed. This paper tests the hypothesis that corpus refactoring - changing the format of a corpus without altering its semantics - is a feasible goal, namely that it can be accomplished with a semi-automatable process and in a time-efficient way. We used simple text processing methods and limited human validation to convert the Protein Design Group corpus into two new formats: WordFreak and embedded XML. We tracked the total time expended and the success rates of the automated steps. RESULTS: The refactored corpus is available for download at the BioNLP SourceForge website http://bionlp.sourceforge.net. The total time expended was just over three person-weeks, consisting of about 102 hours of programming time (much of which is one-time development cost) and 20 hours of manual validation of automatic outputs. Additionally, the steps required to refactor any corpus are presented. CONCLUSION: We conclude that refactoring of publicly available corpora is a technically and economically feasible method for increasing the usage of data already available for evaluating biomedical language processing systems.
Authors: Celeste Michelle Condit, L Bruce Railsback
Journal of biomedical discovery and collaboration. 2:5.
ABSTRACT: BACKGROUND: Biological organisms and their components are better conceived within categories based on similarity rather than on identity. Biologists routinely operate with similarity-based concepts such as "model organism" and "motif." There has been little exploration of the characteristics of the similarity-based categories that exist in biology. This study uses the case of the discovery and classification of zinc finger proteins to explore how biological categories based in similarity are represented. RESULTS: The existence of a category of "zinc finger proteins" was based in 1) a lumpy gradient of similarity, 2) a link between function and structure, 3) establishment of a range of appearance across systems and organisms, and 4) an evolutionary locus as a historically based common-ground. CONCLUSION: More systematic application of the idea of similarity-based categorization might eliminate the assumption that biological characteristics can only contribute to narrow categorization of humans. It also raises possibilities for refining data-driven exploration efforts.
Authors: Francisco M Couto, Mário J Silva, Vivian Lee, Emily Dimmer, Evelyn Camon, Rolf Apweiler, Harald Kirsch, Dietrich Rebholz-Schuhmann
Journal of biomedical discovery and collaboration. 1:19.
BACKGROUND: Annotation of proteins with gene ontology (GO) terms is ongoing work and a complex task. Manual GO annotation is precise and precious, but it is time-consuming. Therefore, instead of curated annotations most of the proteins come with uncurated annotations, which have been generated automatically. Text-mining systems that use literature for automatic annotation have been proposed but they do not satisfy the high quality expectations of curators. RESULTS: In this paper we describe an approach that links uncurated annotations to text extracted from literature. The selection of the text is based on the similarity of the text to the term from the uncurated annotation. Besides substantiating the uncurated annotations, the extracted texts also lead to novel annotations. In addition, the approach uses the GO hierarchy to achieve high precision. Our approach is integrated into GOAnnotator, a tool that assists the curation process for GO annotation of UniProt proteins. CONCLUSION: The GO curators assessed GOAnnotator with a set of 66 distinct UniProt/SwissProt proteins with uncurated annotations. GOAnnotator provided correct evidence text at 93% precision. This high precision results from using the GO hierarchy to only select GO terms similar to GO terms from uncurated annotations in GOA. Our approach is the first one to achieve high precision, which is crucial for the efficient support of GO curators. GOAnnotator was implemented as a web tool that is freely available at http://xldb.di.fc.ul.pt/rebil/tools/goa/.