On the Programmability and Performance of Heterogeneous Platforms

By: K. Krommydas, T. R. W. Scogland, and W. Feng

In: International Conference on Parallel and Distributed Systems, 2013

Posted: 01 Oct 2013

Tagged: GPU

In work that can be considered a continuation of the architecture-specific optimization analysis of GEM, Konstantinos Krommydas and I evaluate programmability and performance tradeoffs across three architectures: an Intel CPU, an Intel Xeon Phi, and an NVIDIA Kepler GPU. Some of the results were surprising, not least that the fully optimized GPU core code ended up more readable than the highly optimized CPU code.


General-purpose computing on an ever-broadening array of parallel devices has led to an increasingly complex and multi-dimensional landscape with respect to programmability and performance optimization. The growing diversity of parallel architectures presents many challenges to the domain scientist, including device selection, programming model, and level of investment in optimization. All of these choices influence the balance between programmability and performance.

In this paper, we characterize the performance achievable across a range of optimizations, along with their programmability, for multi- and many-core platforms – specifically, an Intel Sandy Bridge CPU, Intel Xeon Phi co-processor, and NVIDIA Kepler K20 GPU – in the context of an n-body, molecular-modeling application called GEM. Our systematic approach to optimization delivers implementations with speed-ups of 194.98×, 885.18×, and 1020.88× on the CPU, Xeon Phi, and GPU, respectively, over the naïve serial version. Beyond the speed-ups, we characterize the incremental optimization of the code from naïve serial to fully hand-tuned on each platform through four distinct phases of increasing complexity to expose the strengths and weaknesses of the programming models offered by each platform.

author = {Krommydas, Konstantinos and Scogland, Thomas R W and Feng, Wu-chun},
title = {{On the Programmability and Performance of Heterogeneous Platforms}},
booktitle = {International Conference on Parallel and Distributed Systems},
year = {2013},
address = {Gungam},
month = oct,
rating = {0},
date-added = {2014-01-12T17:00:05GMT},
date-modified = {2014-01-12T17:07:10GMT}
