arxiv:2404.03828

Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

Published on Apr 4, 2024
Authors:

Abstract

We introduce an Outlier-Efficient Modern Hopfield Model (termed OutEffHop) and use it to address the outlier-induced challenge of quantizing gigantic transformer-based models. Our main contribution is a novel associative memory model facilitating outlier-efficient associative memory retrievals. Interestingly, this memory model manifests a model-based interpretation of an outlier-efficient attention mechanism (Softmax_1): it is an approximation of the memory retrieval process of OutEffHop. Methodologically, this allows us to debut novel outlier-efficient Hopfield layers, a powerful attention alternative with superior post-quantization performance. Theoretically, the Outlier-Efficient Modern Hopfield Model retains and improves the desirable properties of the standard modern Hopfield models, including fixed-point convergence and exponential storage capacity. Empirically, we demonstrate the proposed model's efficacy across large-scale transformer-based and Hopfield-based models (including BERT, OPT, ViT, and STanHop-Net), benchmarking against state-of-the-art methods including Clipped_Softmax and Gated_Attention. Notably, OutEffHop achieves on average ~22+% reductions in both average kurtosis and maximum infinity norm of model outputs across four models.
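Since the abstract frames Softmax_1 as the attention mechanism whose retrieval dynamics OutEffHop approximates, a minimal sketch of that mechanism may help: softmax_1 divides by 1 + Σ exp(·), i.e. it behaves like standard softmax with an extra implicit zero logit, so a head can place probability mass on "nothing" instead of being forced onto a few outlier-producing tokens. The PyTorch framing and the function names (`softmax_1`, `outlier_efficient_attention`) below are illustrative assumptions, not the paper's released implementation.

```python
# Minimal sketch (assumed PyTorch framing, not the authors' released code):
# softmax_1 attention, the outlier-efficient mechanism the abstract describes
# as an approximation of OutEffHop's memory retrieval.
import torch
import torch.nn.functional as F


def softmax_1(logits: torch.Tensor) -> torch.Tensor:
    """softmax_1(x)_i = exp(x_i) / (1 + sum_j exp(x_j)), over the last dim."""
    # Append an implicit zero logit: exp(0) = 1 supplies the "+1" in the
    # denominator; then drop that padding column from the output.
    padded = F.pad(logits, (0, 1), value=0.0)
    return F.softmax(padded, dim=-1)[..., :-1]


def outlier_efficient_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Scaled dot-product attention with softmax_1 in place of softmax."""
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    return softmax_1(scores) @ v


# Usage: drop-in replacement for a standard attention head.
q = k = v = torch.randn(2, 8, 16)   # (batch, seq_len, head_dim)
out = outlier_efficient_attention(q, k, v)
print(out.shape)                    # torch.Size([2, 8, 16])
```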

Models citing this paper 11

Datasets citing this paper 0

Spaces citing this paper 0

Collections including this paper 1