Identifying cancer hazards is the first step towards cancer prevention. The IARC Monographs Programme, which has evaluated nearly 1000 agents for carcinogenic potential since 1971, typically selects agents for hazard identification on the basis of public nominations, expert advice, published data on carcinogenicity, and public health importance.
Here we present a novel and complementary strategy for identifying agents for hazard evaluation using chemoinformatics, database integration and automated text mining.
To inform selection among a broad range of pesticides nominated for evaluation, we identified and screened nearly 6000 relevant chemical structures and then systematically compiled information on 980 pesticides. From these data we created chemical similarity network maps that allowed clusters to be visualized by chemical similarity, pesticide class, and publicly available information concerning cancer epidemiology, cancer bioassays, and carcinogenic mechanisms. For the IARC Monograph meetings that took place in March and June 2015, this approach supported high-priority evaluation of glyphosate, malathion, parathion, tetrachlorvinphos, diazinon, DDT, lindane, and 2,4-D.
This systematic approach, accounting for chemical similarity and overlaying multiple data sources, can be used by risk assessors as well as researchers to systematize, inform and increase efficiency in selecting and prioritizing agents for hazard identification, risk assessment, regulation or further investigation. This approach could be extended to an array of outcomes and agents, including occupational carcinogens, drugs, and foods.
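
As a rough illustration of the chemical similarity network idea described above, the following minimal sketch links agents whose structural feature sets exceed a similarity threshold, groups them into clusters, and overlays evidence tags on each cluster. It rests on several assumptions that are not taken from the text: Tanimoto similarity as the metric, a threshold of 0.5, connected components as the clustering rule, and entirely hypothetical agent names, feature sets, and evidence tags. In practice the feature sets would come from a chemoinformatics toolkit's structural fingerprints.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_edges(features, threshold):
    """Return (agent_x, agent_y, similarity) for every pair at or above threshold."""
    edges = []
    for x, y in combinations(sorted(features), 2):
        score = tanimoto(features[x], features[y])
        if score >= threshold:
            edges.append((x, y, score))
    return edges


def clusters(nodes, edges):
    """Group agents into connected components of the similarity network."""
    adjacency = {n: set() for n in nodes}
    for x, y, _ in edges:
        adjacency[x].add(y)
        adjacency[y].add(x)
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from the start node
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


# Hypothetical toy data: integers stand in for fingerprint bits that are "on".
features = {
    "agent_A": {1, 2, 3, 4},
    "agent_B": {1, 2, 3, 5},
    "agent_C": {7, 8, 9},
    "agent_D": {7, 8, 10},
    "agent_E": {20, 21},
}
# Hypothetical evidence overlay: which data streams exist for each agent.
evidence = {
    "agent_A": {"bioassay"},
    "agent_C": {"epidemiology", "mechanistic"},
}

edges = similarity_edges(features, threshold=0.5)  # threshold is an assumption
for members in clusters(features, edges):
    overlay = {m: sorted(evidence.get(m, ())) for m in members}
    print(members, overlay)
```

In this toy setting, an agent with no data of its own that clusters with a data-rich agent would stand out as a candidate for closer review, which is the kind of prioritization signal the overlay is meant to surface.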