MoF-LoRA: Mixture of Low-Rank Fault-Tolerant Experts for RRAM-based In-Memory Computing

Soyed Tuhin Ahmed, Eduardo Ortega, T. Patrick Xiao, Ben Feinberg, Christopher H. Bennett, Matthew J. Marinella, Krishnendu Chakrabarty
IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), 2026

Abstract

Resistive random-access memory (RRAM)-based in-memory computing (R-IMC) architectures achieve energy efficiency by avoiding data movement for matrix-vector computation. However, R-IMC suffers from reliability challenges, such as programming errors, conductance drift, and stuck-at faults, which introduce noise and degrade performance. Several approaches have been proposed in the literature to enhance reliability, but they address only a few specific types of non-idealities and cannot be applied to large models, pre-trained models, or R-IMC architectures without considerable computation and memory overhead. We propose a design-time solution that uses a mixture of fault-tolerant experts, adapting pre-trained models via low-rank adaptation (LoRA). We train non-ideality-specific experts that are merged with the base weights before mapping, ensuring reliable R-IMC inference at no added cost. Across a range of vision and language tasks with transformer models, our approach improves inference accuracy by up to 76% under a mixture of non-idealities, enabling reliable mapping and inference of vision-language models with no additional runtime cost.
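The zero-runtime-cost property comes from folding the adapters into the base weights before the matrix is programmed onto the crossbar. A minimal sketch of that merge step is shown below; this is an illustration of standard LoRA merging, not the authors' implementation, and the function name, dimensions, and the alpha/rank scaling convention are assumptions following common LoRA practice.

```python
import numpy as np

def merge_lora(W, A, B, alpha=16, rank=8):
    """Fold a low-rank adapter update (B @ A) into the base weights.

    The merged matrix has the same shape as W, so it can be mapped to
    the RRAM crossbar directly and inference incurs no extra cost.
    """
    return W + (alpha / rank) * (B @ A)

# Toy example: a 64x64 base layer with a rank-8 adapter.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
A = rng.standard_normal((8, 64)) * 0.01  # down-projection
B = np.zeros((64, 8))                    # up-projection, zero-initialized as in LoRA

W_merged = merge_lora(W, A, B)
assert W_merged.shape == W.shape
assert np.allclose(W_merged, W)  # zero-init B means the merge starts as identity
```

Because the adapter adds only rank * (d_in + d_out) trainable parameters per layer, a separate expert can be trained per non-ideality type without retraining the full model.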

BibTeX

@article{ahmed2026moflora,
  author    = {Soyed Tuhin Ahmed and Eduardo Ortega and T. Patrick Xiao and Ben Feinberg and Christopher H. Bennett and Matthew J. Marinella and Krishnendu Chakrabarty},
  title     = {{MoF-LoRA: Mixture of Low-Rank Fault-Tolerant Experts for RRAM-based In-Memory Computing}},
  journal   = {IEEE Journal on Emerging and Selected Topics in Circuits and Systems},
  year      = {2026},
  pages     = {1--1},
  doi       = {10.1109/JETCAS.2026.3655242}
}
