DeepRiPP integrates multiomics data to automate discovery of novel ribosomally synthesized natural products

Nishanth J. Merwin, Walaa K. Mousa, Chris A. Dejong, Michael A. Skinnider, Michael J. Cannon, Haoxin Li, Keshav Dial, Mathusan Gunabalasingam, Chad Johnston, Nathan A. Magarvey

Research output: Contribution to journal, Article, peer-reviewed

85 Scopus citations

Abstract

Microbial natural products represent a rich resource of evolved chemistry that forms the basis for the majority of pharmacotherapeutics. Ribosomally synthesized and posttranslationally modified peptides (RiPPs) are a particularly interesting class of natural products noted for their unique mode of biosynthesis and biological activities. Analyses of sequenced microbial genomes have revealed an enormous number of biosynthetic loci encoding RiPPs but whose products remain cryptic. In parallel, analyses of bacterial metabolomes typically assign chemical structures to only a minority of detected metabolites. Aligning these 2 disparate sources of data could provide a comprehensive strategy for natural product discovery. Here we present DeepRiPP, an integrated genomic and metabolomic platform that employs machine learning to automate the selective discovery and isolation of novel RiPPs. DeepRiPP includes 3 modules. The first, NLPPrecursor, identifies RiPPs independent of genomic context and neighboring biosynthetic genes. The second module, BARLEY, prioritizes loci that encode novel compounds, while the third, CLAMS, automates the isolation of their corresponding products from complex bacterial extracts. DeepRiPP pinpoints target metabolites using large-scale comparative metabolomics analysis across a database of 10,498 extracts generated from 463 strains. We apply the DeepRiPP platform to expand the landscape of novel RiPPs encoded within sequenced genomes and to discover 3 novel RiPPs, whose structures are exactly as predicted by our platform. By building on advances in machine learning technologies, DeepRiPP integrates genomic and metabolomic data to guide the isolation of novel RiPPs in an automated manner.

Original language: English (US)
Pages (from-to): 371-380
Number of pages: 10
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 117
Issue number: 1
State: Published - Jan 7 2020
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • General

Keywords

  • Genome mining
  • Machine learning
  • Metabolomics
  • Natural products
  • RiPPs
