DeFine: deep convolutional neural networks accurately quantify intensities of transcription factor-DNA binding and facilitate evaluation of functional non-coding variants

Meng Wang, Cheng Tai, Weinan E, Liping Wei

Research output: Contribution to journal › Article › peer-review

79 Scopus citations

Abstract

The complex system of gene expression is regulated by the cell type-specific binding of transcription factors (TFs) to regulatory elements. Identifying variants that disrupt TF binding and lead to human diseases remains a great challenge. To address this, we implement sequence-based deep learning models that accurately predict the TF binding intensities for given DNA sequences. In addition to accurately classifying sequences as TF-bound or unbound, our models are capable of accurately predicting real-valued TF binding intensities by leveraging large-scale TF ChIP-seq data. The change in TF binding intensity between the altered sequence and the reference sequence reflects the degree of functional impact of the variant. This enables us to develop the tool DeFine (Deep learning based Functional impact of non-coding variants evaluator, http://define.cbi.pku.edu.cn) with improved performance for assessing the functional impact of non-coding variants, including SNPs and indels. DeFine accurately identifies the causal functional non-coding variants from disease-associated variants in GWAS. DeFine is an effective and easy-to-use tool that facilitates systematic prioritization of functional non-coding variants.
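
The scoring idea described in the abstract (compare the predicted binding intensity of the altered sequence with that of the reference sequence) can be illustrated with a minimal Python sketch. This is not the DeFine implementation: `toy_intensity` is a hypothetical stand-in for a trained model (a single motif filter with max pooling), and the helper names `one_hot`, `apply_variant`, and `variant_effect`, the example motif, and the example sequence are all assumptions made for illustration.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq: str) -> np.ndarray:
    """Encode a DNA string as a (length, 4) one-hot matrix; other symbols (e.g. N) stay all-zero."""
    mat = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            mat[i, j] = 1.0
    return mat

def apply_variant(ref_seq: str, pos: int, ref: str, alt: str) -> str:
    """Replace the allele `ref` at 0-based `pos` with `alt`; covers SNPs and indels."""
    if ref_seq[pos:pos + len(ref)].upper() != ref.upper():
        raise ValueError("reference allele does not match the sequence")
    return ref_seq[:pos] + alt + ref_seq[pos + len(ref):]

def toy_intensity(seq: str, motif: np.ndarray) -> float:
    """Hypothetical model stand-in: best motif match score over all sequence windows."""
    x = one_hot(seq)
    width = motif.shape[0]
    if len(seq) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(float(np.sum(x[i:i + width] * motif))
               for i in range(len(seq) - width + 1))

def variant_effect(ref_seq: str, pos: int, ref: str, alt: str, predict) -> float:
    """Predicted intensity of the altered sequence minus that of the reference sequence."""
    alt_seq = apply_variant(ref_seq, pos, ref, alt)
    return predict(alt_seq) - predict(ref_seq)

# Illustrative usage with an assumed motif and sequence (not data from the paper).
motif = one_hot("TGACTCA")
predict = lambda s: toy_intensity(s, motif)
ref_seq = "AAATGACTCAAAA"

print(variant_effect(ref_seq, 5, "A", "G", predict))   # SNP inside the motif
print(variant_effect(ref_seq, 5, "AC", "A", predict))  # 1-bp deletion inside the motif
```

A negative value means the variant lowers the predicted binding intensity. In practice `predict` would wrap a trained sequence model rather than this toy filter.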

Original language: English (US)
Pages (from-to): E69
Journal: Nucleic acids research
Volume: 46
Issue number: 11
State: Published - Dec 1 2018

All Science Journal Classification (ASJC) codes

  • Genetics
