@macadeliccc on Hugging Face: "Reducing perplexity in LLM's through layer selective rank reduction…"

Join the conversation

Join the community of Machine Learners and AI enthusiasts.

macadeliccc

posted an update Feb 9

Post

Reducing perplexity in LLM's through layer selective rank reduction

Layer-Selective Rank Reduction (LASER) is a denoising method that improves reasoning by the strategic removal of higher-order components from weight matrices in the multi-layer perceptron (MLP) layers without the need for additional parameters or training data. This process leverages singular value decomposition to identify and eliminate these components. This simple, yet effective, method has shown to improve question-answering performance by up to 27.4 percentage points.

LaserRMT implements this through a process by calculating signal to noise ratio (SNR) for each layer and selectively reducing the rank of these layers.The SNR method meticulously computes the SNR by leveraging singular value decomposition (SVD) to separate the signal (higher-order components) from the noise (lower-order components) within the weight matrices of the model's layers. The SNR calculation is what determines which layers would benefit from rank reduction without compromising the models integrity.

If a layer is identified that could benefit from rank reduction, then the layer will enter an incremental process where the weight matrices are reduced and reconstructed by retaining only the singular values that surpass the threshold. In the case of laserRMT, the threshold is calculated by Marchenko-Pastur Law.

@staticmethod
    def marchenko_pastur_threshold(sigma, n, m):
        beta = n / m if n < m else m / n
        threshold = sigma * np.sqrt((1 + np.sqrt(beta))**2)
        return thr

The two primary benefits of applying this method are reducing computational overhead of large language models and simultaneously improving output quality.

Credit to @ehartford @fernandofernandes @DavidGF for laserRMT

Resources:
☄️ AutoLaser: https://colab.research.google.com/drive/11j0e-w6BfvqeFN1gUrpOqdW0vcKqfVqP?usp=sharing
laserRMT: https://github.com/cognitivecomputations/laserRMT
The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction (2312.13558)

Undi95

Feb 10

I tried with my Borealis model, got an error :

Traceback (most recent call last):
File "/content/laserRMT/rmt_laser.py", line 199, in
loop_check, min_loss = modifier.search_optimal_layer_modification(layer_types=['mlp.down_proj', 'mlp.up_proj', 'self_attn.q_proj', 'self_attn.k_proj', 'self_attn.v_proj', 'self_attn.o_proj'],
File "/content/laserRMT/rmt_laser.py", line 132, in search_optimal_layer_modification
initial_perplexity = self.calculate_model_perplexity()
File "/content/laserRMT/rmt_laser.py", line 101, in calculate_model_perplexity
input_tok = gptq_data_utils.get_test_tokens(dataset, seed=0, seqlen=seqlen, model=model_str)
File "/content/laserRMT/lib/utils/gptq_data_utils.py", line 196, in get_test_tokens
return get_c4_new(train_samples, seed, seqlen, model)[1].input_ids
File "/content/laserRMT/lib/utils/gptq_data_utils.py", line 134, in get_c4_new
traindata = load_dataset(
File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 2129, in load_dataset
builder_instance = load_dataset_builder(
File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 1852, in load_dataset_builder
builder_instance: DatasetBuilder = builder_cls(
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 373, in init
self.config, self.config_id = self._create_builder_config(
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 539, in _create_builder_config
raise ValueError(
ValueError: BuilderConfig 'allenai--c4' not found. Available: ['en', 'en.noblocklist', 'en.noclean', 'realnewslike', 'multilingual', 'af', 'am', 'ar', 'az', 'be', 'bg', 'bg-Latn', 'bn', 'ca', 'ceb', 'co', 'cs', 'cy', 'da', 'de', 'el', 'el-Latn', 'en-multi', 'eo', 'es', 'et', 'eu', 'fa', 'fi', 'fil', 'fr', 'fy', 'ga', 'gd', 'gl', 'gu', 'ha', 'haw', 'hi', 'hi-Latn', 'hmn', 'ht', 'hu', 'hy', 'id', 'ig', 'is', 'it', 'iw', 'ja', 'ja-Latn', 'jv', 'ka', 'kk', 'km', 'kn', 'ko', 'ku', 'ky', 'la', 'lb', 'lo', 'lt', 'lv', 'mg', 'mi', 'mk', 'ml', 'mn', 'mr', 'ms', 'mt', 'my', 'ne', 'nl', 'no', 'ny', 'pa', 'pl', 'ps', 'pt', 'ro', 'ru', 'ru-Latn', 'sd', 'si', 'sk', 'sl', 'sm', 'sn', 'so', 'sq', 'sr', 'st', 'su', 'sv', 'sw', 'ta', 'te', 'tg', 'th', 'tr', 'uk', 'und', 'ur', 'uz', 'vi', 'xh', 'yi', 'yo', 'zh', 'zh-Latn', 'zu']

This error is also present in the notebook you shared (first cell of Laser).
Pls fix?
I have another try with the second Laser SNR in the background on a 7b model, running on Runpod, will edit result.

macadeliccc

Feb 10

•

edited Feb 10

I believe that this is related to the particular laser method. rmt_laser_snr and laser_snr_math are functional. I will look into the errors and see if its related to my notebook.

KnutJaegersberg

Feb 10

Want this to run on CPU

NeuralNovel

Feb 11

•

edited Feb 11

Does ☄️ AutoLaser only work with colab pro/A100?

macadeliccc

Feb 11

No a 7b model should work on V100 as well. If you want to test a larger model then it’s likely you will need the A100

In this post