Abstract
Solving large-scale problems in a variety of scientific and engineering fields requires efficient hierarchical methods that exploit parallelism. In this paper we present optimizations that enhance the performance of parallel N-body simulations (NBS) using the Barnes-Hut approximation on a 60-core Many Integrated Core (MIC) accelerator. We focus on two sources of performance degradation in NBS: (1) semi-static parallelism, which leads to dynamic load imbalance, and (2) the processing of very large data sets that exceed the cache capacity. The first proposed optimization balances the load dynamically by using the load measured in one iteration as an estimate of the load in the next, which helps distribute the work evenly in that next iteration. The second proposed optimization subdivides the data into well-adjusted chunks to enhance data reuse in shared caches. The proposed optimizations are tested on a 60-core MIC accelerator. Evaluation results show that the optimized NBS achieves a speedup of up to 33% from dynamic load balancing and 260% from enhanced reuse of cached data.
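The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes that the load of each body is a per-body cost (for example, the number of interactions counted in the previous iteration), that bodies are assigned to workers as contiguous index ranges, and that the chunk size is a tuning parameter chosen to fit the shared cache. The function names `balance_by_previous_cost` and `chunk_ranges` are hypothetical and do not come from the paper.

```python
from bisect import bisect_left
from itertools import accumulate


def balance_by_previous_cost(prev_cost, n_workers):
    """Split bodies 0..n-1 into n_workers contiguous ranges of roughly
    equal total cost, using the cost measured in the previous iteration
    as the estimate for the next one (assumed cost model)."""
    prefix = list(accumulate(prev_cost))      # cumulative cost per body
    total = prefix[-1] if prefix else 0
    bounds = [0]
    for w in range(1, n_workers):
        target = total * w / n_workers        # ideal cumulative cost at cut w
        # Cut just after the first body whose cumulative cost reaches target.
        cut = bisect_left(prefix, target) + 1
        # Keep cuts monotonic and inside the array.
        bounds.append(max(bounds[-1], min(cut, len(prev_cost))))
    bounds.append(len(prev_cost))
    return [(bounds[i], bounds[i + 1]) for i in range(n_workers)]


def chunk_ranges(n_items, chunk_size):
    """Split n_items into consecutive half-open ranges of at most
    chunk_size items; chunk_size is an assumed tuning parameter."""
    return [(s, min(s + chunk_size, n_items))
            for s in range(0, n_items, chunk_size)]


# One expensive body followed by five cheap ones, shared by two workers.
prev_cost = [5, 1, 1, 1, 1, 1]
ranges = balance_by_previous_cost(prev_cost, 2)
loads = [sum(prev_cost[a:b]) for a, b in ranges]
```

In this example an equal-count split would give the two workers costs of 7 and 3, whereas the cost-based split gives each worker a cost of 5. Body-level granularity limits how even the split can be, so the result is approximate in general.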
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 7th IEEE International Advanced Computing Conference, IACC 2017 |
| Editors | Yarlagadda PadmaSai, Deepak Garg |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 783-788 |
| Number of pages | 6 |
| ISBN (Electronic) | 9781509015603 |
| DOIs | |
| State | Published - 12 Jul 2017 |
Publication series
| Name | Proceedings - 7th IEEE International Advanced Computing Conference, IACC 2017 |
|---|---|
Bibliographical note
Publisher Copyright: © 2017 IEEE.
Keywords
- Cache Reuse
- Load Balancing
- Many Integrated Core (MIC)
- N-Body Simulations
- Parallel Programming
ASJC Scopus subject areas
- Computer Networks and Communications
- Computer Science Applications
- Hardware and Architecture
- Electrical and Electronic Engineering