Na0s committed
Commit 95b2396 (parent: f1557e0)

Update README.md

Files changed (1):
  1. README.md +1 -2
README.md CHANGED
@@ -19,8 +19,7 @@ base_model: Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-2.0
  - We then recovered the performance loss induced by the pruning process by fine-tuning (from 0.2642 MMLU-Pro 0-shot to 0.3120), this step is called healing the pruned model.
 
  ### Upcoming Work:
- - More healing through SFT/DPO/TPO to see if we can get closer to the meta-llama/Meta-Llama-3.1-8B performance (which has an MMLU-Pro 0-shot of 0.3659). **(In Progress)**
- - Evaluate on benchmarks other than MMLU-PRO 0-shot (Unfortunately [lighteval](https://github.com/huggingface/lighteval) is broken right now [issue #191](https://github.com/huggingface/nanotron/issues/191), [issue #213](https://github.com/huggingface/nanotron/issues/213)).
+ - More healing through SFT/DPO/TPO to see if we can get closer to the meta-llama/Meta-Llama-3.1-8B performance (which has an MMLU-Pro 0-shot of 0.3659 vs 0.3120 for our model). **(In Progress)**
  - Compare the same exact process when applied to meta-llama/LLama-3.1-70B.
 
  ### Training Details:
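
The "healing" step referenced in the diff above is the fine-tune that recovers the accuracy lost to layer pruning. The model name (…_LoRA-PEFT-2.0) suggests this was done with LoRA/PEFT, so the sketch below assumes that setup; the pruned checkpoint path, the healing corpus (wikitext-2 here), and all hyperparameters are illustrative placeholders, not the recipe actually used for this card.

```python
# Minimal sketch of "healing" a pruned model via LoRA fine-tuning.
# Every path, dataset, and hyperparameter below is an assumption for illustration.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

pruned_id = "path/to/pruned-llama-3.1-8b"  # placeholder for the layer-pruned checkpoint

tokenizer = AutoTokenizer.from_pretrained(pruned_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(pruned_id, torch_dtype=torch.bfloat16)

# Attach LoRA adapters so only a small number of parameters are updated during healing.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # illustrative choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)

# Any general text corpus can serve as healing data; wikitext-2 is a stand-in here
# (the card does not state which dataset was used).
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True,
    remove_columns=raw.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="healed-model",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-4,
        bf16=True,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("healed-model")
```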
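
The MMLU-Pro 0-shot figures quoted in the diff (0.2642 pruned, 0.3120 healed, 0.3659 for meta-llama/Meta-Llama-3.1-8B) can be checked with any harness that exposes an MMLU-Pro task. Since the diff notes that lighteval was broken at the time, the sketch below uses EleutherAI's lm-evaluation-harness instead; the `mmlu_pro` task name and this harness choice are assumptions, not necessarily how the card's numbers were produced.

```python
# Sketch of a 0-shot MMLU-Pro evaluation with lm-evaluation-harness (assumed setup).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-2.0,dtype=bfloat16",
    tasks=["mmlu_pro"],   # assumed task name for MMLU-Pro in the harness
    num_fewshot=0,
    batch_size=8,
)

# Aggregate metrics for the MMLU-Pro task group.
print(results["results"]["mmlu_pro"])
```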