Abstract
At the time of writing, High Performance Computing (HPC) systems do not provide adequate quality of service (QoS) controls or reliability features, owing to two limitations. First, standard middleware libraries such as Message Passing Interface (MPI) and Parallel Virtual Machine (PVM) give applications no means to specify service quality for computation and communication. Second, modern high-speed interconnects such as InfiniBand, Myrinet, and Quadrics are optimized for performance rather than for fault tolerance and QoS control. The Data-Centric Publish-Subscribe (DCPS) model, the core of Data Distribution Service (DDS) systems, defines standards that enable applications running on heterogeneous platforms to control various QoS policies in a net-centric system. In this paper, we present our novel model for incorporating DDS QoS and reliability controls into HPC systems.
| Original language | English |
|---|---|
| Title of host publication | 2014 IEEE/ACIS 15th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2014 - Proceedings |
| Editors | Satoshi Takahashi, Ju Yeon Jo |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| ISBN (Electronic) | 9781479956043 |
| State | Published - 2014 |
Publication series
| Name | 2014 IEEE/ACIS 15th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2014 - Proceedings |
|---|---|
Bibliographical note
Publisher Copyright: © 2014 IEEE.
Keywords
- HPC
- MPI
- Middleware
- QoS
ASJC Scopus subject areas
- Computer Science Applications
- Computer Networks and Communications
- Artificial Intelligence