Update README.md
README.md
@@ -19,8 +19,7 @@ base_model: Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-2.0
 We then recovered the performance loss induced by the pruning process by fine-tuning (from 0.2642 MMLU-Pro 0-shot to 0.3120); this step is called healing the pruned model.

 ### Upcoming Work:
-- More healing through SFT/DPO/TPO to see if we can get closer to the meta-llama/Meta-Llama-3.1-8B performance (which has an MMLU-Pro 0-shot of 0.3659). **(In Progress)**
-- Evaluate on benchmarks other than MMLU-PRO 0-shot (unfortunately [lighteval](https://github.com/huggingface/lighteval) is broken right now: [issue #191](https://github.com/huggingface/nanotron/issues/191), [issue #213](https://github.com/huggingface/nanotron/issues/213)).
+- More healing through SFT/DPO/TPO to see if we can get closer to the meta-llama/Meta-Llama-3.1-8B performance (which has an MMLU-Pro 0-shot of 0.3659 vs 0.3120 for our model). **(In Progress)**
 - Compare the same exact process when applied to meta-llama/LLama-3.1-70B.

 ### Training Details:
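
For reference, the "healing" step mentioned in the hunk above is a parameter-efficient fine-tune of the pruned checkpoint. Below is a minimal sketch of what such a LoRA healing run could look like; the training corpus (wikitext-2), LoRA hyperparameters, target modules, and output path are illustrative assumptions, not the exact recipe used for this model.

```python
# Minimal sketch of a LoRA "healing" fine-tune on the pruned checkpoint.
# Dataset, hyperparameters, and LoRA target modules are illustrative placeholders.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-2.0"  # pruned model from this repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
))

# Hypothetical healing corpus; the model card does not state which data was used.
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
raw = raw.filter(lambda ex: len(ex["text"].strip()) > 0)
train = raw.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
                batched=True, remove_columns=raw.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama-3.1-8b-pruned-healed",  # hypothetical output path
        per_device_train_batch_size=1, gradient_accumulation_steps=16,
        learning_rate=1e-4, num_train_epochs=1, logging_steps=50, bf16=True,
    ),
    train_dataset=train,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM objective
)
trainer.train()
```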
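
The MMLU-Pro 0-shot numbers quoted in the hunk (0.2642 pruned, 0.3120 healed, 0.3659 for Meta-Llama-3.1-8B) could be reproduced with something like the sketch below, assuming a recent lm-evaluation-harness release that ships an `mmlu_pro` task; the task name and batch size are assumptions, and this is not necessarily the harness used for the reported scores (the card refers to lighteval).

```python
# Sketch of a 0-shot MMLU-Pro evaluation of the pruned/healed checkpoint with
# lm-evaluation-harness. Assumes a release that includes an "mmlu_pro" task; this
# may differ from the evaluation setup actually used for the reported scores.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-2.0,dtype=bfloat16",
    tasks=["mmlu_pro"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])  # per-task (and aggregate) accuracy numbers
```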