Literature-based compound profiling: application to toxicogenomics

Raoul Frijters, Stefan Verhoeven, Wynand Alkema, René van Schaik, Jan Polman

Research output: Contribution to journal › Article › Academic › peer-review


Introduction: To reduce continuously increasing costs in drug development, adverse effects of drugs need to be detected as early as possible in the process. In recent years, compound-induced gene expression profiling methodologies have been developed to assess compound toxicity, including Gene Ontology term and pathway over-representation analyses. The objective of this study was to introduce an additional approach, in which literature information is used for compound profiling to evaluate compound toxicity and mode of toxicity.

Methods: Gene annotations were built by text mining in Medline abstracts for retrieval of co-publications between genes, pathology terms, biological processes and pathways. This literature information was used to generate compound-specific keyword fingerprints, representing over-represented keywords calculated in a set of regulated genes after compound administration. To see whether keyword fingerprints can be used for assessment of compound toxicity, we analyzed microarray data sets of rat liver treated with 11 hepatotoxicants.

Results: Analysis of keyword fingerprints of two genotoxic carcinogens, two nongenotoxic carcinogens, two peroxisome proliferators and two randomly generated gene sets showed that each compound produced a specific keyword fingerprint that correlated with the experimentally observed histopathological events induced by the individual compounds. By contrast, the random sets produced a flat, nonspecific keyword profile, indicating that the fingerprints induced by the compounds reflect biological events rather than random noise. A more detailed analysis of the keyword profiles of diethylhexylphthalate, dimethylnitrosamine and methapyrilene (MPy) showed that the differences in the keyword fingerprints of these three compounds are based upon known distinct modes of action. Visualization of MPy-linked keywords and MPy-induced genes in a literature network enabled us to construct a mode of toxicity proposal for MPy, which is in agreement with known effects of MPy in the literature.

Conclusion: Compound keyword fingerprinting based on information retrieved from literature is a powerful approach for compound profiling, allowing evaluation of compound toxicity and analysis of the mode of action.

© 2007 Future Medicine Ltd.
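The keyword over-representation step described in the Methods can be illustrated with a minimal sketch. The abstract does not state which statistic the authors used, so the one-sided hypergeometric test here is an assumption, as are the function names, the significance cutoff and the toy gene and keyword annotations. No multiple-testing correction is applied.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K carry the keyword."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def keyword_fingerprint(gene_keywords, regulated, alpha=0.05):
    """Return {keyword: p-value} for keywords over-represented in `regulated`.

    gene_keywords maps every background gene to its set of literature keywords.
    The hypergeometric test is an assumed choice, not the paper's stated method.
    """
    background = set(gene_keywords)
    regulated = set(regulated) & background
    N, n = len(background), len(regulated)

    # Count, per keyword, how many background and regulated genes carry it.
    in_background, in_regulated = {}, {}
    for gene, keywords in gene_keywords.items():
        for kw in keywords:
            in_background[kw] = in_background.get(kw, 0) + 1
            if gene in regulated:
                in_regulated[kw] = in_regulated.get(kw, 0) + 1

    fingerprint = {}
    for kw, k in in_regulated.items():
        p = hypergeom_sf(k, N, in_background[kw], n)
        if p <= alpha:
            fingerprint[kw] = p
    return fingerprint


# Hypothetical annotations: 10 background genes, 3 of them regulated.
annotations = {f"g{i}": set() for i in range(1, 11)}
for g in ("g1", "g2", "g3"):
    annotations[g].add("necrosis")
for g in ("g1", "g4", "g5", "g6", "g7", "g8"):
    annotations[g].add("lipid metabolism")

fingerprint = keyword_fingerprint(annotations, regulated=["g1", "g2", "g3"])
print(fingerprint)
```

In this toy case only "necrosis" passes the cutoff, because all three genes carrying it are in the regulated set, while "lipid metabolism" is spread across the background. A randomly drawn gene set would typically yield few or no keywords, which matches the flat profile the abstract reports for random sets.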
Original language: English
Pages (from-to): 1521-1534
Number of pages: 14
Issue number: 11
Publication status: Published - 1 Nov 2007


  • algorithms
  • animals
  • carcinogens/toxicity
  • databases, bibliographic
  • databases, genetic
  • gene expression profiling
  • liver/drug effects
  • mutagens/toxicity
  • natural language processing
  • peroxisome proliferators/toxicity
  • rats
  • toxicogenetics/methods
  • vocabulary, controlled

