
Recovery time and fault tolerance improvement for circuits mapped on SRAM-based FPGAs

  • Anees Ullah*
  • Luca Sterpone
  • *Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

6 Scopus citations

Abstract

The rapid adoption of FPGA-based systems in space and avionics demands dependability rules from the design to the layout phases to protect against radiation effects. Triple Modular Redundancy (TMR) is a widely used fault tolerance methodology to protect circuits implemented on SRAM-based FPGAs against radiation-induced Single Event Upsets (SEUs). The accumulation of SEUs in the configuration memory can cause the TMR replicas to fail, requiring a periodic write-back (scrubbing) of the configuration bit-stream. Both the system downtime due to scrubbing and the probability of simultaneous failures of two TMR domains increase with growing device densities. We propose a methodology to reduce the recovery time of TMR circuits while increasing their resilience to Cross-Domain Errors (CDEs). Our methodology consists of an automated tool-flow for fine-grain error detection, error flag convergence and non-overlapping domain placement. The fine-grain error detection logic identifies the faulty domain using gate-level functions, while the error flag convergence logic reduces the overwhelming number of flag signals. The non-overlapping placement enables selective domain reconfiguration and greatly reduces the number of CDEs. Our results demonstrate a clear reduction of the recovery time, owing to fast error detection and selective partial reconfiguration of faulty domains. Moreover, the methodology drastically reduces CDEs in Look-Up Tables (LUTs) and routing resources. The improvements in recovery time and fault tolerance are achieved at an area overhead of a single LUT per majority voter in TMR circuits.
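As a rough illustration of the ideas named in the abstract, the sketch below models a TMR majority voter that also reports which domain disagreed with the voted value, and then merges the flags of many voters into one flag per domain. It is a behavioural Python model written for this page, not the authors' tool-flow or their gate-level logic; the function names `vote_with_flags` and `converge_flags` and the OR-based merging are assumptions made for the example.

```python
from typing import List, Tuple

Flags = Tuple[int, int, int]


def vote_with_flags(a: int, b: int, c: int) -> Tuple[int, Flags]:
    """Bitwise 2-out-of-3 majority vote plus one disagreement flag per domain.

    Hypothetical helper: a flag is 1 when that domain's output differs
    from the voted result, which points at the faulty replica.
    """
    voted = (a & b) | (b & c) | (a & c)
    flags = (int(a != voted), int(b != voted), int(c != voted))
    return voted, flags


def converge_flags(per_voter_flags: List[Flags]) -> Flags:
    """OR-reduce the flags of many voters into a single flag per domain.

    Assumption for this sketch: a domain is treated as faulty if any
    voter observed it disagreeing with the majority.
    """
    merged = [0, 0, 0]
    for flags in per_voter_flags:
        for domain, flag in enumerate(flags):
            merged[domain] |= flag
    return (merged[0], merged[1], merged[2])


if __name__ == "__main__":
    # Three voters; the third domain disagrees at the second voter only.
    observations = [(1, 1, 1), (0, 0, 1), (1, 1, 1)]
    all_flags = [vote_with_flags(*obs)[1] for obs in observations]
    print(converge_flags(all_flags))
```

In this model, a converged flag for exactly one domain would be the trigger for reconfiguring only that domain, whereas flags raised for two domains at once correspond to the cross-domain case that a single voter cannot mask.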

Original language: English
Pages (from-to): 425-442
Number of pages: 18
Journal: Journal of Electronic Testing: Theory and Applications (JETTA)
Volume: 30
Issue number: 4
State: Published - Aug 2014

Keywords

  • Cross-Domain Errors (CDEs)
  • Partial and Dynamic Reconfiguration
  • Single Event Upsets (SEUs)
  • Triple Modular Redundancy (TMR)

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
