Synthesis of Heterogeneous Distributed Architectures for Memory-intensive Applications

Chao Huang, Srivaths Ravi, Anand Raghunathan, Niraj K. Jha

Research output: Contribution to journal, Conference article, peer-review

9 Scopus citations


Memory-intensive applications present unique challenges to an ASIC designer in terms of the choice of memory organization, memory size requirements, bandwidth, and access latencies. The high potential of single-chip distributed logic-memory architectures in addressing many of these issues has been recognized in general-purpose computing, and more recently in ASIC design. However, such architectures will be adopted widely by designers only when general techniques and tools for efficient high-level synthesis (HLS) of multi-partitioned ASICs become available. The techniques presented in this paper are motivated by the fact that many memory-intensive applications exhibit irregular array data access patterns (for example, due to conditionals in loop nests). Synthesis should, therefore, be capable of determining a partitioned architecture in which array data and computations may have to be heterogeneously distributed to achieve the best performance speedup. Furthermore, the synthesis methodology should not be restricted by the nature of array index functions (affine or otherwise) in a behavior. Therefore, our methodology employs simulation to provide information about the access patterns of array data references in a behavior, which is used by the rest of our analysis. We use a combination of clustering and min-cut style partitioning techniques to partition the behavior into sub-behaviors while considering factors including data access locality, balanced workloads, and inter-partition communication. Finally, we employ an iterative improvement strategy to determine the best way of distributing array data into physical memory in each partition. Our experiments with several benchmark applications show that the proposed techniques can yield partitioned architectures that achieve up to 2.2X performance speedup over conventional HLS solutions, and up to 1.6X performance speedup over the best feasible homogeneous partitioning solution.
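As a rough illustration of the idea of using simulated access counts to place array data near the computations that use it, the sketch below may help. It is not the paper's algorithm: it replaces the clustering, min-cut partitioning, and iterative improvement steps with a single greedy majority-vote placement, and it assumes the operations have already been assigned to partitions. All names (`profile`, `place_blocks`, `remote_accesses`, the trace format, and the block labels) are hypothetical.

```python
from collections import Counter, defaultdict

def profile(trace):
    """Count accesses per (operation, array block) pair.

    `trace` is an assumed format: a list of (operation, block) pairs
    as they might be recorded during a simulation run.
    """
    return Counter(trace)

def place_blocks(op_part, affinity, num_parts):
    """Greedily put each block in the partition that accesses it most.

    This is a simplification for illustration, not the iterative
    improvement strategy described in the abstract.
    """
    votes = defaultdict(lambda: [0] * num_parts)
    for (op, blk), count in affinity.items():
        votes[blk][op_part[op]] += count
    return {blk: max(range(num_parts), key=v.__getitem__)
            for blk, v in votes.items()}

def remote_accesses(op_part, blk_part, affinity):
    """Count accesses whose operation and block sit in different partitions."""
    return sum(count for (op, blk), count in affinity.items()
               if op_part[op] != blk_part[blk])

# Hypothetical trace with an irregular access pattern.
trace = ([("op0", "A[0:4]")] * 6 + [("op0", "A[4:8]")] * 1 +
         [("op1", "A[4:8]")] * 5 + [("op1", "A[8:12]")] * 4 +
         [("op1", "A[0:4]")] * 1)
op_part = {"op0": 0, "op1": 1}   # assumed, fixed operation placement

affinity = profile(trace)
blk_part = place_blocks(op_part, affinity, num_parts=2)
remote = remote_accesses(op_part, blk_part, affinity)
print(blk_part, remote)
```

In this toy case the placement is heterogeneous: partition 0 holds one block and partition 1 holds two, because the profiled access counts, not an even split of the array, drive the distribution.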

Original language: English (US)
Pages (from-to): 46-53
Number of pages: 8
Journal: IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers
State: Published - 2003
Event: IEEE/ACM International Conference on Computer Aided Design ICCAD 2003: IEEE/ACM Digest of Technical Papers - San Jose, CA, United States
Duration: Nov 9 2003 to Nov 13 2003

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design

