Design of Analog-AI Hardware Accelerators for Transformer-based Language Models (Invited)
G. W. Burr, H. Tsai, W. Simon, I. Boybat, S. Ambrogio, C.-E. Ho, and 28 more authors
In 2023 International Electron Devices Meeting (IEDM), 2023
Analog Non-Volatile Memory-based accelerators offer high-throughput and energy-efficient Multiply-Accumulate operations for the large Fully-Connected layers that dominate Transformer-based Large Language Models. We describe architectural, wafer-scale testing, chip-demo, and hardware-aware training efforts towards such accelerators, and quantify the unique raw-throughput and latency benefits of Fully- (rather than Partially-)Weight-Stationary systems.
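As a rough illustration of the analog Multiply-Accumulate operation the abstract refers to, the sketch below models a single crossbar tile in plain Python. It is not taken from the paper. It assumes each signed weight is stored as a differential pair of non-negative conductances and that inputs are applied as row voltages, so each column current sums voltage times conductance. The names `encode_weights` and `crossbar_mac`, the `g_max` parameter, and the example values are hypothetical, and the model ignores noise, drift, and converter quantization.

```python
# Toy model of an analog crossbar multiply-accumulate (illustrative only).
# Assumption: each signed weight is a pair of non-negative conductances
# (g_plus, g_minus); inputs are row voltages; each column current is the
# sum of voltage * conductance (Ohm's law plus Kirchhoff's current law).

def encode_weights(weights, g_max=1.0):
    """Map signed weights to (g_plus, g_minus) pairs scaled into [0, g_max]."""
    w_abs_max = max(abs(w) for row in weights for w in row) or 1.0
    scale = g_max / w_abs_max
    g_plus = [[max(w, 0.0) * scale for w in row] for row in weights]
    g_minus = [[max(-w, 0.0) * scale for w in row] for row in weights]
    return g_plus, g_minus, scale

def crossbar_mac(voltages, g_plus, g_minus, scale):
    """Return one value per column: sum_i v_i * (g_plus - g_minus), rescaled."""
    n_cols = len(g_plus[0])
    out = []
    for j in range(n_cols):
        current = sum(
            v * (g_plus[i][j] - g_minus[i][j]) for i, v in enumerate(voltages)
        )
        out.append(current / scale)  # undo the conductance scaling
    return out

# Hypothetical 3-input, 2-output fully-connected layer.
weights = [[0.5, -1.0], [2.0, 0.25], [-0.75, 1.5]]  # rows = inputs, cols = outputs
x = [1.0, 2.0, -1.0]
g_plus, g_minus, scale = encode_weights(weights)
y = crossbar_mac(x, g_plus, g_minus, scale)
print(y)
```

In this idealized model the weights stay in place in the array while only the input vector changes between calls, which is the sense in which such a tile is weight-stationary.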