Darwin-WGA: A co-processor provides increased sensitivity in whole genome alignments with high speedup

Yatish Turakhia, Sneha D. Goenka, Gill Bejerano, William J. Dally

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

28 Scopus citations

Abstract

Whole genome alignment (WGA) is an indispensable tool in comparative genomics to study how different lifeforms have been shaped by evolution at the molecular level. Existing software whole genome aligners require several CPU weeks to compare a pair of mammalian genomes and still miss several biologically meaningful, high-scoring alignment regions. These aligners are based on the seed-filter-and-extend paradigm with an ungapped filtering stage. Ungapped filtering is responsible for the low sensitivity of these aligners but is used because it is 200× faster than performing gapped alignment, using dynamic programming, in software. In this paper, we show that both performance and sensitivity can be greatly improved by using a hardware accelerator for WGA. Using the genomes of two roundworms (C. elegans and C. briggsae) and four fruit flies (D. melanogaster, D. simulans, D. yakuba, and D. pseudoobscura), we show that replacing ungapped filtering with gapped filtering increases the number of matching base-pairs in alignments by up to 3×. Our accelerator, Darwin-WGA, is the first hardware accelerator for whole genome alignment and accelerates the gapped filtering stage. Darwin-WGA also employs GACT-X, a novel algorithm used in the extension stage to align arbitrarily long genome sequences using a small on-chip memory, which provides better quality alignments with a 2× improvement in memory and speed over the previously published GACT algorithm. Implemented on an FPGA, Darwin-WGA provides up to 24× improvement (performance/$) in WGA over iso-sensitive software. An ASIC implementation of the proposed architecture on TSMC 40 nm technology consumes around 43 W of power with an area of 36 mm². It achieves up to 10× performance/watt improvement on whole genome alignments over state-of-the-art software at higher sensitivity, and up to 1,500× performance/watt improvement compared to iso-sensitive software.
Darwin-WGA is released under the open-source MIT license and is available from https://github.com/gsneha26/Darwin-WGA.
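
The sensitivity gap between ungapped and gapped filtering described in the abstract can be illustrated with a minimal Python sketch. This is not the Darwin-WGA implementation: the function names, the toy sequences, the scoring values (+1 match, -1 mismatch, -1 gap) and the threshold of 8 are all assumptions chosen for illustration, and a plain Smith-Waterman score stands in for whatever gapped filter a real aligner would use.

```python
# Illustrative only: toy scoring and sequences, not Darwin-WGA's actual parameters.

def ungapped_score(a, b, match=1, mismatch=-1):
    """Best score of a contiguous run along the main diagonal (no indels allowed)."""
    best = run = 0
    for x, y in zip(a, b):
        run = max(0, run + (match if x == y else mismatch))
        best = max(best, run)
    return best

def gapped_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

# Hypothetical seed neighbourhood: QRY equals REF with one base deleted.
REF = "ACGTACGTTGCA"
QRY = "ACGTACTTGCA"
THRESHOLD = 8  # assumed filter cutoff

ungapped = ungapped_score(REF, QRY)
gapped = gapped_score(REF, QRY)
print("ungapped:", ungapped, "passes filter:", ungapped >= THRESHOLD)
print("gapped:  ", gapped, "passes filter:", gapped >= THRESHOLD)
```

With these toy inputs the single deletion shifts the diagonal, so the ungapped score stalls below the cutoff while the gapped score bridges the indel and passes. The price is the quadratic dynamic-programming loop, which is the cost the abstract cites as the reason software aligners fall back to ungapped filtering.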

Original language: English (US)
Title of host publication: Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 359-372
Number of pages: 14
ISBN (Electronic): 9781728114446
DOIs
State: Published - Mar 26 2019
Externally published: Yes
Event: 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019 - Washington, United States
Duration: Feb 16 2019 → Feb 20 2019

Publication series

Name: Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019

Conference

Conference: 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019
Country/Territory: United States
City: Washington
Period: 2/16/19 → 2/20/19

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Keywords

  • Co-processor
  • Comparative Genomics
  • Gapped Filtering
  • Whole Genome Alignment
