Experimental analysis of SMP scalability in the presence of coherence traffic and snoop filtering

Mayez A. Al-Mouhamed*, Khaled A. Daud

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Commodity multi-core SMPs may generate an enormous amount of coherence traffic. However, the impact of coherence traffic and snoop filtering on parallel program scalability has not attracted sufficient attention. We experimentally analyze the shared data access patterns of four typical applications with different memory layouts. An OpenMP-optimized execution model is derived for each application, with emphasis on data dependencies and the implied coherence messages. Using an 8-core SMP, we present the speedups obtained as the number of cores and the problem size vary. We discuss potential limits on scalability due to the application or the SMP. To assess coherence behavior and its impact on the scalability of parallel programs, we present a synthetic benchmark that alternates ownership of a data block between two cores on the same or different processors. We find that coherence overheads, including snoop filtering, are responsible for a significant limitation on parallel program scalability. For 8-core SMPs, speedup can be reduced by factors of 2.5 and 5 for row-major and column-major access patterns, respectively, compared with the use of private data. A truly parallel coherence protocol implementation is needed to provide a truly scalable shared-memory model.
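As a rough illustration of why the access pattern matters for coherence traffic, the sketch below counts how many cache lines of a row-major array would be written by more than one core under two work partitions. It is not the authors' benchmark and measures no timing. The 64-byte line size, 8-byte elements, and the names `shared_lines`, `by_row_blocks`, and `by_cyclic_columns` are assumptions introduced here for illustration only.

```python
from collections import defaultdict

LINE_BYTES = 64  # assumed cache-line size
ELEM_BYTES = 8   # assumed element size (one double)


def shared_lines(n, owner):
    """Count cache lines of an n x n row-major array written by more than one core.

    owner(i, j) returns the id of the core that writes element (i, j).
    """
    writers = defaultdict(set)
    for i in range(n):
        for j in range(n):
            # Index of the cache line holding element (i, j) in row-major storage.
            line = (i * n + j) * ELEM_BYTES // LINE_BYTES
            writers[line].add(owner(i, j))
    return sum(1 for cores_on_line in writers.values() if len(cores_on_line) > 1)


def by_row_blocks(n, cores):
    """Hypothetical partition: each core writes a contiguous block of rows."""
    rows_per_core = n // cores
    return lambda i, j: i // rows_per_core


def by_cyclic_columns(n, cores):
    """Hypothetical partition: column j is written by core j mod cores."""
    return lambda i, j: j % cores


if __name__ == "__main__":
    n, cores = 64, 8
    print("row blocks:    ", shared_lines(n, by_row_blocks(n, cores)))
    print("cyclic columns:", shared_lines(n, by_cyclic_columns(n, cores)))
```

With these assumed parameters, the row-block partition leaves no line with more than one writer, while the cyclic-column partition puts eight writers on each of the 512 lines. Every such line would have to change ownership between cores on each write, which is the kind of traffic the abstract identifies as limiting speedup.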

Original language: English
Title of host publication: Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012
Pages: 81-88
Number of pages: 8
State: Published - 2012

Publication series

Name: Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012

Keywords

  • HPC
  • distributed-memory
  • parallel programming
  • performance evaluation and speedup
  • shared-memory system

ASJC Scopus subject areas

  • Software
