Abstract
Domain-specific accelerators for signal processing, image processing, and machine learning are increasingly being implemented on SRAM-based field-programmable gate arrays (FPGAs). Owing to the inherent error tolerance of such applications, approximate arithmetic, and in particular the design of approximate multipliers, has become an important research problem. Truncation of the lower bits is a widely used approximation approach; however, analyzing and limiting the effects of carry propagation caused by this approximation has not yet been explored in detail. In this article, an optimized carry-aware approximate radix-4 Booth multiplier design is presented that leverages the built-in slice look-up tables (LUTs) and carry-chain resources in a novel configuration. The proposed multiplier simplifies the computation of the upper and lower bits and provides significant benefits compared to the latest state-of-the-art designs: FPGA resource usage (LUT savings of 38.5%-42.9%), power-delay product (PDP savings of 49.4%-53%), a combined performance metric (LUTs × critical path delay (CPD) × PDP, savings of 68.9%-73.1%), and accuracy (a 70% improvement in mean relative error distance). The proposed designs are therefore an attractive choice for implementing multiplication on FPGA-based accelerators.
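To make the terms in the abstract concrete, the following Python sketch models a generic radix-4 Booth multiplier in which the lower bits of each partial product are truncated, and measures the resulting mean relative error distance (MRED). It is a behavioral illustration of plain truncation only and is not the carry-aware design proposed in the article. The function names, the per-partial-product truncation scheme, and the exhaustive MRED loop are all assumptions made for this example.

```python
def booth_radix4_digits(y: int, n: int) -> list[int]:
    """Recode an n-bit two's-complement multiplier into radix-4 Booth digits.

    Each digit lies in {-2, -1, 0, 1, 2}; n must be even.
    (Hypothetical helper for illustration.)
    """
    assert n % 2 == 0, "radix-4 recoding here assumes an even bit width"
    y &= (1 << n) - 1           # keep the n-bit two's-complement pattern
    digits = []
    prev = 0                    # implicit bit y[-1] = 0
    for i in range(0, n, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_multiply(x: int, y: int, n: int, truncate_bits: int = 0) -> int:
    """Multiply signed n-bit x and y via radix-4 Booth partial products.

    With truncate_bits > 0, the lowest truncate_bits bits of every shifted
    partial product are cleared before accumulation (an assumed, simple
    truncation scheme), so carries out of the low columns are lost.
    """
    mask = ~((1 << truncate_bits) - 1)
    total = 0
    for i, d in enumerate(booth_radix4_digits(y, n)):
        partial = (d * x) << (2 * i)    # digit i has weight 4**i
        total += partial & mask
    return total


def mean_relative_error_distance(n: int, truncate_bits: int) -> float:
    """Average |approx - exact| / |exact| over all signed n-bit operand pairs
    whose exact product is non-zero."""
    lo, hi = -(1 << (n - 1)), 1 << (n - 1)
    total, count = 0.0, 0
    for x in range(lo, hi):
        for y in range(lo, hi):
            exact = x * y
            if exact == 0:
                continue
            approx = booth_multiply(x, y, n, truncate_bits)
            total += abs(approx - exact) / abs(exact)
            count += 1
    return total / count


if __name__ == "__main__":
    for t in (0, 2, 4, 6):
        print(f"truncate_bits={t}: MRED={mean_relative_error_distance(8, t):.4f}")
```

With `truncate_bits=0` the model reproduces the exact product, so any non-zero MRED is attributable to the discarded low-order bits and the carries they would have produced. That lost carry information is the effect the abstract refers to when it speaks of analyzing and limiting carry propagation.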
| Original language | English |
|---|---|
| Article number | 76 |
| Journal | ACM Transactions on Embedded Computing Systems |
| Volume | 22 |
| Issue number | 4 |
| State | Published - 3 Aug 2023 |
Bibliographical note
Publisher Copyright: © 2023 Association for Computing Machinery.
Keywords
- Neural Network
ASJC Scopus subject areas
- Software
- Hardware and Architecture
Title
Toward Optimal Softcore Carry-aware Approximate Multipliers on Xilinx FPGAs