A compact mathematical programming formulation for DNA motif finding

Carl Kingsford, Elena Zaslavsky, Mona Singh

Research output: Chapter in Book/Report/Conference proceedingConference contribution

10 Scopus citations

Abstract

In the motif finding problem one seeks a set of mutually similar subsequences within a collection of biological sequences. This is an important and widely-studied problem, as such shared motifs in DNA often correspond to regulatory elements. We study a combinatorial framework where the goal is to find subsequences of a given length such that the sum of their pairwise distances is minimized. We describe a novel integer linear program for the problem, which uses the fact that; distances between subsequences come from a limited set of possibilities. We show how to tighten its linear programming relaxation by adding an exponential set of constraints and give an efficient separation algorithm that can find violated constraints, thereby showing that the tightened linear program can still be solved in polynomial time. We apply our approach to find optimal solutions for the motif finding problem and show that it is effective in practice in uncovering known transcription factor binding sites.

Original languageEnglish (US)
Title of host publicationCombinatorial Pattern Matching - 17th Annual Symposium, CPM 2006, Proceedings
PublisherSpringer Verlag
Pages233-245
Number of pages13
ISBN (Print)3540354557, 9783540354550
DOIs
StatePublished - Jan 1 2006
Event17th Annual Symposium on Combinatorial Pattern Matching, CPM 2006 - Barcelona, Spain
Duration: Jul 5 2006Jul 7 2006

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume4009 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other17th Annual Symposium on Combinatorial Pattern Matching, CPM 2006
CountrySpain
CityBarcelona
Period7/5/067/7/06

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)

Fingerprint Dive into the research topics of 'A compact mathematical programming formulation for DNA motif finding'. Together they form a unique fingerprint.

  • Cite this

    Kingsford, C., Zaslavsky, E., & Singh, M. (2006). A compact mathematical programming formulation for DNA motif finding. In Combinatorial Pattern Matching - 17th Annual Symposium, CPM 2006, Proceedings (pp. 233-245). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4009 LNCS). Springer Verlag. https://doi.org/10.1007/11780441_22