Neural Network Inference in-Memory: Input-Conditioned Quantisation for effective-number-of-bits Improvement

Date: 12th Feb 2024

Time: 09:00 AM

Venue: Malviya (ESB-234)

PAST EVENT

Details

Compute-in-memory (CIM) ADC columns perform multiply-accumulate (MAC) operations in the analog domain. Pitch-matching the analog-to-digital converters (ADCs) to dense memory pitches limits their area and achievable precision. Consequently, the CIM literature focuses on increasing the MAC effective number of bits (ENOB) by quantising with a smaller step size (to reduce error variance), clipping the long tail of the MAC distribution in the process. This work proposes an alternative: quantising the input-conditioned MAC distribution (ICMD), that is, the MAC distribution resulting from a given input. A one-shot ICMD location technique is detailed, which enables the proposed input-conditioned MAC quantisation (ICQ) scheme. Various deep neural networks (DNNs) are implemented using a state-of-the-art clipped quantisation technique, the Optimal Clipping Criterion (OCC), and compared against ICQ. Results show that DNNs employing ICQ improve in accuracy by 1.6-39.4% over DNNs that do not employ ICQ, at iso-ADC precision. DNN layers implemented using ICQ saw an improvement in MAC ENOB of up to 4.3 bits over the precision of the employed quantiser, which is up to 2.2 bits more than the corresponding OCC result.
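The intuition behind quantising a per-input distribution rather than a single clipped range can be illustrated with a toy NumPy sketch. This is not the speaker's method: the random binary inputs, the Gaussian weights, the names `uniform_quantise` and `toy_comparison`, the window half-width `k`, and the use of the empirical per-input mean and standard deviation in place of the one-shot ICMD location technique are all assumptions made purely for illustration. The fixed window here is also a simple k-sigma clip, not the OCC.

```python
import numpy as np


def uniform_quantise(values, lo, hi, bits):
    """Map values to 2**bits uniformly spaced levels on [lo, hi], clipping outside."""
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    codes = np.clip(np.round((values - lo) / step), 0, levels - 1)
    return lo + codes * step


def toy_comparison(bits=4, n_rows=256, n_cols=64, n_inputs=200, k=3.0, seed=0):
    """Return (mse_fixed, mse_conditioned) for a hypothetical MAC array."""
    rng = np.random.default_rng(seed)
    # Assumed weights with a non-zero mean, so the MAC location depends on the input.
    weights = rng.normal(0.5, 1.0, size=(n_rows, n_cols))
    # Assumed binary inputs whose density varies from one input vector to the next.
    density = rng.uniform(0.1, 0.9, size=(n_inputs, 1))
    inputs = (rng.random((n_inputs, n_rows)) < density).astype(float)
    mac = inputs @ weights  # shape (n_inputs, n_cols): one MAC value per column

    # Fixed window: one clipped range placed on the overall MAC distribution.
    g_mean, g_std = mac.mean(), mac.std()
    fixed = uniform_quantise(mac, g_mean - k * g_std, g_mean + k * g_std, bits)

    # Input-conditioned window: the same quantiser, re-placed for every input
    # on the distribution of MAC values that this input produces.
    c_mean = mac.mean(axis=1, keepdims=True)
    c_std = mac.std(axis=1, keepdims=True)
    cond = uniform_quantise(mac, c_mean - k * c_std, c_mean + k * c_std, bits)

    mse_fixed = float(np.mean((mac - fixed) ** 2))
    mse_cond = float(np.mean((mac - cond) ** 2))
    return mse_fixed, mse_cond


if __name__ == "__main__":
    mse_fixed, mse_cond = toy_comparison()
    # Each extra bit halves the step size and so quarters the error variance,
    # hence the factor of 0.5 when converting an error-power ratio to bits.
    gain_bits = 0.5 * np.log2(mse_fixed / mse_cond)
    print(f"fixed-window MSE:        {mse_fixed:.3f}")
    print(f"input-conditioned MSE:   {mse_cond:.3f}")
    print(f"equivalent gain in bits: {gain_bits:.2f}")
```

In this toy setting the per-input window is narrower than the overall one, so the same number of quantiser levels yields a smaller step size and a lower error variance. The figures it prints describe only the synthetic data above and are not comparable to the results reported in the talk.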

Speakers

Mr. Ashwin Balagopal (EE17D200)

Electrical Engineering