Article

Identification of relations between risk factors and their pathologies or health conditions by mining scientific literature.

UFR SMBH Léonard de Vinci, Université Paris 13, 93017 Bobigny, France.
Studies in Health Technology and Informatics 01/2010; 160(Pt 2):964-8.
Source: PubMed

ABSTRACT The discovery and prevention of risk factors is an active research field in the biomedical domain. Although abundant information on risk factors exists in bibliographical databases and on various websites, it can be difficult to access. Methods from Natural Language Processing and Information Extraction can make this information easier to access. Specifically, we present a procedure for analyzing large volumes of scientific literature and detecting linguistically marked associations between pathologies and risk factors. This approach allowed us to extract over 22,000 risk factors and their associated pathologies. Our evaluations show that (1) over 88% of the extracted risk factors for coronary heart disease are correct, (2) the associated pathologies, when they could be compared to MeSH indexing, are correct in about 70% of cases, and (3) existing terminologies seldom record links between risk factors and their pathologies.
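
As a rough illustration of what detecting a "linguistically marked association" can look like, the sketch below matches one lexical pattern of the form "X is a risk factor for Y". The abstract does not specify the patterns or tools actually used, so the pattern itself and the names `PATTERNS` and `extract_relations` are assumptions made for this example only.

```python
import re

# Hypothetical lexical pattern (an assumption for illustration; the abstract
# does not list the markers the authors relied on):
#   "<factor> is/are (a|an) (adjective) risk factor(s) for/of <pathology>"
PATTERNS = [
    re.compile(
        r"(?P<factor>[A-Za-z][A-Za-z \-]*?)\s+(?:is|are)\s+(?:an?\s+)?"
        r"(?:[a-z]+\s+)?risk factors?\s+(?:for|of)\s+"
        r"(?P<pathology>[A-Za-z][A-Za-z \-]*)",
        re.IGNORECASE,
    ),
]


def extract_relations(sentence):
    """Return (risk factor, pathology) pairs found in one sentence."""
    relations = []
    for pattern in PATTERNS:
        for match in pattern.finditer(sentence):
            factor = match.group("factor").strip().lower()
            pathology = match.group("pathology").strip().lower()
            relations.append((factor, pathology))
    return relations


if __name__ == "__main__":
    example = "Smoking is a major risk factor for coronary heart disease."
    print(extract_relations(example))
```

A single surface pattern like this misses most phrasings and does not split coordinated factors such as "hypertension and diabetes"; it is meant only to make the idea of a lexical marker concrete, not to approximate the reported results.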

Keywords

active research field
associations
bibliographical databases
biomedical domain
massive amounts
Natural Language Processing
pathologies
risk factors
Risk factors discovery
scientific literature
websites