
caTIES: a grid based system for coding and retrieval of surgical pathology reports and tissue specimens in support of translational research.

Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15232, USA.
Journal of the American Medical Informatics Association (Impact Factor: 3.93). 05/2010; 17(3):253-64. DOI: 10.1136/jamia.2009.002295
Source: PubMed

ABSTRACT: The authors report on the development of the Cancer Tissue Information Extraction System (caTIES), an application that supports collaborative tissue banking and text mining by leveraging existing natural language processing methods and algorithms, grid communication and security frameworks, and query visualization methods. The system fills an important need for text-derived clinical data in translational research such as tissue banking and clinical trials. The design of caTIES addresses three critical issues for informatics support of translational research: (1) federation of research data sources derived from clinical systems; (2) expressive graphical interfaces for concept-based text mining; and (3) a regulatory and security model for supporting multi-center collaborative research. Implementation of the system at several Cancer Centers across the country is creating a potential network of caTIES repositories that could provide millions of de-identified clinical reports to users. The system provides an end-to-end application of medical natural language processing to support multi-institutional translational research programs.

  • ABSTRACT: This paper reviews the research literature on text mining (TM) to determine (1) which cancer domains have been the subject of TM efforts, (2) which knowledge resources can support TM of cancer-related information, and (3) to what extent systems that rely on knowledge and computational methods can convert text data into useful clinical information. These questions were used to assess the current state of the art in this strand of TM and to suggest future directions in TM development to support cancer research.
    International Journal of Medical Informatics 09/2014; 83(9). DOI:10.1016/j.ijmedinf.2014.06.009 · 2.72 Impact Factor
  • ABSTRACT: Epilepsy is a common, serious neurological disorder with a complex set of possible phenotypes, ranging from pathologic abnormalities to variations in the electroencephalogram. This paper presents a system called Phenotype Extraction in Epilepsy (PEEP) for extracting complex epilepsy phenotypes and their correlated anatomical locations from clinical discharge summaries, a primary data source for this purpose. PEEP generates candidate phenotype and anatomical location pairs by embedding a named entity recognition method, based on the Epilepsy and Seizure Ontology, into the National Library of Medicine's MetaMap program. These candidate pairs are further processed using a correlation algorithm. The derived phenotypes and correlated locations have been used for cohort identification with an integrated ontology-driven visual query interface. To evaluate the performance of PEEP, 400 de-identified discharge summaries were used for development and an additional 262 were used as test data. PEEP achieved a micro-averaged precision of 0.924, recall of 0.931, and F1-measure of 0.927 for extracting epilepsy phenotypes. For the extraction of correlated phenotypes and anatomical locations, it achieved a micro-averaged F1-measure of 0.856 (precision: 0.852, recall: 0.859). The evaluation demonstrates that PEEP is an effective approach to extracting complex epilepsy phenotypes for cohort identification.
    Journal of Biomedical Informatics 06/2014; 51. DOI:10.1016/j.jbi.2014.06.006 · 2.48 Impact Factor
  • ABSTRACT: The Clinical Document Architecture (CDA) is a widely adopted international standard for clinical document design. A CDA document with coded entries increases semantic interoperability during clinical document exchange. This study developed a CDA entry generation pipeline for converting free-text documents into CDA documents with entry-level coding. The results showed that CDA entry coding could be accomplished using natural language processing.
    2014 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI); 06/2014