
Performance and portability studies with OpenACC accelerated version of GTC-P

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Accelerator-based heterogeneous computing is of paramount importance to High Performance Computing. The increasing complexity of cluster architectures calls for more generic, high-level programming models. OpenACC is a directive-based parallel programming model that provides performance on, and portability across, a wide variety of platforms, including GPUs, multicore CPUs, and many-core processors. GTC-P is a discovery-science-capable, real-world application code based on the Particle-In-Cell (PIC) algorithm, which is well established in the HPC area. Several native versions of GTC-P have been developed for TOP500 supercomputers with different architectures, including Titan and Mira. Motivated by the state-of-the-art portability of OpenACC, we implemented the first OpenACC version of GTC-P and evaluated its performance portability across NVIDIA GPUs, Intel x86 CPUs, and OpenPOWER CPUs. In this paper, we also propose two key optimization methods for OpenACC implementations of the PIC algorithm on multicore CPUs and GPUs: removing atomic operations and taking advantage of shared memory. OpenACC shows both impressive productivity and performance in terms of portability and scalability. The OpenACC version achieves more than 90% of the performance of the native versions with only about 300 LOC.
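The abstract names "removing atomic operations" as one of the two optimizations but does not say how it is done. The following is a minimal sketch of one common way to do this in a PIC charge-deposition (scatter) step, not the authors' implementation: each gang accumulates into its own private copy of the grid, and the copies are summed afterwards. The function names, the 1D periodic grid with linear weighting, and the `scratch`/`nrep` parameters are all assumptions made for illustration. The pragmas are standard OpenACC; a compiler without OpenACC support ignores them and runs the loops serially.

```c
#include <stddef.h>

/* Hypothetical 1D deposition, linear weighting, periodic grid.
   Assumes 0 <= x[i] < ngrid (positions in units of cell width).
   Both functions accumulate into rho, so the caller zeroes it first. */

/* Baseline: particles may hit the same cell, so each update is atomic. */
void deposit_atomic(const double *x, size_t np, double *rho, size_t ngrid)
{
    #pragma acc parallel loop copyin(x[0:np]) copy(rho[0:ngrid])
    for (size_t i = 0; i < np; ++i) {
        size_t cell = (size_t)x[i];
        size_t next = (cell + 1) % ngrid;
        double w = x[i] - (double)cell;
        #pragma acc atomic update
        rho[cell] += 1.0 - w;
        #pragma acc atomic update
        rho[next] += w;
    }
}

/* Atomic-free variant (an assumption, not the paper's code):
   nrep private grid copies in scratch (size nrep * ngrid), one per gang. */
void deposit_replicated(const double *x, size_t np, double *rho,
                        size_t ngrid, double *scratch, size_t nrep)
{
    #pragma acc data copyin(x[0:np]) copy(rho[0:ngrid]) \
                     create(scratch[0:nrep * ngrid])
    {
        /* Phase 1: each replica handles a contiguous chunk of particles
           and writes only to its own slice of scratch. */
        #pragma acc parallel loop gang
        for (size_t r = 0; r < nrep; ++r) {
            #pragma acc loop seq
            for (size_t g = 0; g < ngrid; ++g)
                scratch[r * ngrid + g] = 0.0;

            size_t lo = r * np / nrep;
            size_t hi = (r + 1) * np / nrep;
            #pragma acc loop seq
            for (size_t i = lo; i < hi; ++i) {
                size_t cell = (size_t)x[i];
                size_t next = (cell + 1) % ngrid;
                double w = x[i] - (double)cell;
                scratch[r * ngrid + cell] += 1.0 - w;
                scratch[r * ngrid + next] += w;
            }
        }

        /* Phase 2: sum the private copies; each grid cell has one writer. */
        #pragma acc parallel loop
        for (size_t g = 0; g < ngrid; ++g) {
            double sum = 0.0;
            #pragma acc loop seq
            for (size_t r = 0; r < nrep; ++r)
                sum += scratch[r * ngrid + g];
            rho[g] += sum;
        }
    }
}
```

The replicated variant trades memory (`nrep` grid copies) and an extra reduction pass for the absence of atomics, so whether it pays off depends on grid size, particle count, and the target architecture.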

Original language: English (US)
Title of host publication: Proceedings - 17th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2016
Editors: Hong Shen, Hong Shen, Yingpeng Sang, Hui Tian
Publisher: IEEE Computer Society
Pages: 13-18
Number of pages: 6
ISBN (Electronic): 9781509050819
State: Published - Jul 2 2016
Event: 17th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2016 - Guangzhou, China
Duration: Dec 16 2016 to Dec 18 2016

Publication series

Name: Parallel and Distributed Computing, Applications and Technologies, PDCAT Proceedings
Volume: 0

Conference

Conference: 17th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2016
Country/Territory: China
City: Guangzhou
Period: 12/16/16 to 12/18/16

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Theoretical Computer Science
  • Computer Science Applications

Keywords

  • CUDA
  • GPU
  • GTC-P
  • Gyrokinetic PIC code
  • OpenACC
  • OpenPOWER
