Power Efficient Sharing-Aware GPU Data Management

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

18 Scopus citations

Abstract

The power consumed by memory system in GPUs is a significant fraction of the total chip power. As thread level parallelism increases, GPUs are likely to stress cache and memory bandwidth even more, thereby exacerbating power consumption. We observe that neighboring concurrent thread arrays (CTAs) within GPU applications share considerable amount of data. However, the default GPU scheduling policy spreads these CTAs to different streaming multiprocessor cores (SM) in a round-robin fashion. Since each SM has a private L1 cache, the shared data among CTAs are replicated across L1 caches of different SMs. Data replication reduces the effective L1 cache size which in turn increases the data movement and power consumption. The goal of this paper is to reduce data movement and increase effective cache space in GPUs. We propose a sharing-aware CTA scheduler that attempts to assign CTAs with data sharing to the same SM to reduce redundant storage of data in private L1 caches across SMs. We further enhance the scheduler with a sharing-aware cache allocation and replacement policy. The sharing-aware cache management approach dynamically classifies private and shared data. Private blocks are given higher priority to stay longer in L1 cache, and shared blocks are given higher priority to stay longer in L2 cache. Essentially, this approach increases the lifetime of shared blocks and private blocks in different cache levels. The experimental results show that the proposed scheme reduces the off-chip traffic by 19\% which translates to an average DRAM power reduction of 10% and performance improvement of 7%.

Original languageEnglish
Title of host publicationProceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium, IPDPS 2017
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages698-707
Number of pages10
ISBN (Electronic)9781538639146
DOIs
StatePublished - 30 Jun 2017
Externally publishedYes

Publication series

NameProceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium, IPDPS 2017

Bibliographical note

Publisher Copyright:
© 2017 IEEE.

Keywords

  • Data Sharing
  • GPU Cache Management
  • Thread Block Scheduling

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications
  • Hardware and Architecture

Fingerprint

Dive into the research topics of 'Power Efficient Sharing-Aware GPU Data Management'. Together they form a unique fingerprint.

Cite this