Fault Tolerance in RRAM-based AI Accelerator with Guided Randomized Activation

Soyed Tuhin Ahmed, Eduardo Ortega, Ryan Dempsey, T. Patrick Xiao, Ben Feinberg, Christopher H. Bennett, Matthew J. Marinella, Krishnendu Chakrabarty
IEEE International Test Conference (ITC), 2025

Abstract

Resistive Random Access Memory (RRAM)-based analog in-memory computing (IMC) AI accelerators offer significant advantages over digital accelerators, including lower power consumption, reduced data movement, and higher computational efficiency. However, deploying them in safety-critical and edge applications is challenging because of hardware non-idealities, such as programming error, conductance drift, and read noise, which degrade the inference accuracy of the implemented neural networks (NNs). Existing methods, including noise injection during training and activation function modifications, provide limited fault tolerance in realistic scenarios with non-idealities. We propose a fault-tolerant activation function with architectural optimization that enhances robustness against hardware-induced variations while requiring minimal changes to the hardware and the NN architecture. During training, the proposed activation function features a stochastic negative region, which inherently injects noise into the negative region of the activation. During inference, the proposed activation function operates deterministically, ensuring compatibility with existing hardware while maintaining computational efficiency. Extensive evaluations on benchmark datasets demonstrate that the proposed approach improves inference accuracy by up to 60% under varying noise levels, outperforming conventional activation functions as well as existing fault-tolerant activation functions. By enhancing fault tolerance to hardware-induced errors, the proposed method enables reliable and energy-efficient RRAM-based analog IMC.
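
The train/inference split described in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional form, so everything specific here is an assumption made for illustration: a leaky-ReLU-style negative region, a slope drawn uniformly from a hypothetical range [lower, upper] during training, and the midpoint of that range used at inference. The function names and parameters are hypothetical, and the "guided" randomization used in the paper may differ substantially.

```python
import random


def randomized_negative_activation(x, training, lower=0.1, upper=0.3, rng=random):
    """Leaky-ReLU-style activation with a stochastic negative region.

    Assumption (not from the paper): the negative slope is drawn uniformly
    from [lower, upper] during training and fixed to the midpoint of that
    range at inference.
    """
    if x >= 0.0:
        return x  # the positive region is deterministic in both modes
    if training:
        slope = rng.uniform(lower, upper)  # fresh random slope on every call
    else:
        slope = 0.5 * (lower + upper)  # fixed slope, so inference is deterministic
    return slope * x


def activate_all(values, training, **kwargs):
    """Apply the activation element-wise to a list of pre-activations."""
    return [randomized_negative_activation(v, training, **kwargs) for v in values]


if __name__ == "__main__":
    pre_activations = [-1.0, -0.5, 0.0, 2.0]
    # Training mode: negative inputs are scaled by a random slope.
    print(activate_all(pre_activations, training=True, rng=random.Random(0)))
    # Inference mode: the same inputs always produce the same outputs.
    print(activate_all(pre_activations, training=False))
```

In this sketch, only negative pre-activations see randomness during training, while inference reduces to an ordinary leaky ReLU, which is one way a deterministic inference path could stay compatible with existing hardware.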

BibTeX

@inproceedings{ahmed2025faulttolerance,
  author    = {Soyed Tuhin Ahmed and Eduardo Ortega and Ryan Dempsey and T. Patrick Xiao and Ben Feinberg and Christopher H. Bennett and Matthew J. Marinella and Krishnendu Chakrabarty},
  title     = {{Fault Tolerance in RRAM-based AI Accelerator with Guided Randomized Activation}},
  booktitle = {IEEE International Test Conference (ITC)},
  year      = {2025},
  month     = sep,
  address   = {Anaheim, CA, USA},
  doi       = {10.1109/ITC58126.2025.00039}
}
