|
---
license: llama3
library_name: transformers
tags:
- nsfw
- not-for-all-audiences
- llama-3
- text-generation-inference
- moe
- mergekit
- merge
model-index:
- name: Llama-Salad-4x8B-V3
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 66.54
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 31.93
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 8.53
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 7.05
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 6.45
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 27.98
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HiroseKoichi/Llama-Salad-4x8B-V3
      name: Open LLM Leaderboard
---
|
|
|
# Llama-Salad-4x8B-V3 |
|
Changes in V3:
- Uses `L3-8B-Stheno-v3.2` as the base model instead of `Meta-Llama-3-8B-Instruct`
- Removed `opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5` and added `Einstein-v6.1-Llama3-8B`
- Swapped `Llama-3-Soliloquy-8B-v2` for `L3-8B-Stheno-v3.2`
|
|
|
I was clearly wrong when I said V2 would be difficult to improve on, because V3 is significantly better in just about every aspect. Stheno-v3.2 fixed all of the issues present in Stheno-v3.1, making it my favorite roleplay model and the best base model for llama-3 MoE merges. |
|
|
|
The one thing I do want to improve on is finding a better conversational model than Meta-Llama-3-8B-Instruct; it's good for that use case, but I'm sure there's a better one out there. I tried using llama-3-cat-8b-instruct-v1, but it absolutely tanked the model's situational awareness and kept making blatantly contradictory statements. |
|
|
|
# Quantization Formats |
|
**GGUF**
- Static: https://huggingface.co/mradermacher/Llama-Salad-4x8B-V3-GGUF
- Imatrix: https://huggingface.co/mradermacher/Llama-Salad-4x8B-V3-i1-GGUF
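For llama.cpp-based inference, one of the quants can be fetched and loaded with `llama-cpp-python`. A minimal sketch, assuming a Q4_K_M file named `Llama-Salad-4x8B-V3.Q4_K_M.gguf` exists in the static repo (verify the exact filename against the repository's file list):

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# The filename is an assumption -- check the repo for the actual quant names.
model_path = hf_hub_download(
    repo_id="mradermacher/Llama-Salad-4x8B-V3-GGUF",
    filename="Llama-Salad-4x8B-V3.Q4_K_M.gguf",
)

# n_ctx set to the model's 8K context size (see Details below).
llm = Llama(model_path=model_path, n_ctx=8192)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me a one-line scene prompt."}],
    max_tokens=64,
)
print(result["choices"][0]["message"]["content"])
```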
|
|
|
# Details |
|
- **License**: [llama3](https://llama.meta.com/llama3/license/)
- **Instruct Format**: [llama-3](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)
- **Context Size**: 8K
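Loading the model follows the standard `transformers` chat-template flow, which emits the llama-3 instruct format automatically. A minimal sketch (the model ID comes from the leaderboard link below; sampling settings are illustrative, not tuned recommendations):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HiroseKoichi/Llama-Salad-4x8B-V3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge's dtype
    device_map="auto",
)

# apply_chat_template produces the llama-3 instruct format listed above.
messages = [{"role": "user", "content": "Describe a rainy harbor town in three sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```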
|
|
|
## Models Used |
|
- [L3-8B-Stheno-v3.2](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2)
- [Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
- [Llama-3-8B-Synthia-v3.5](https://huggingface.co/migtissera/Llama-3-8B-Synthia-v3.5)
- [Einstein-v6.1-Llama3-8B](https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B)
|
|
|
## Merge Config |
|
```yaml
base_model: Sao10K/L3-8B-Stheno-v3.2
gate_mode: hidden
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: NousResearch/Meta-Llama-3-8B-Instruct
    positive_prompts:
      - "chat"
      - "conversation"
  - source_model: Weyaxi/Einstein-v6.1-Llama3-8B
    positive_prompts:
      - "science"
      - "physics"
      - "chemistry"
      - "biology"
      - "math"
      - "step-by-step"
      - "logical reasoning"
      - "multilingual"
      - "translation"
      - "language translation"
      - "foreign language"
    negative_prompts:
      - "programming language"
  - source_model: migtissera/Llama-3-8B-Synthia-v3.5
    positive_prompts:
      - "summarize"
      - "paraphrase"
      - "list"
      - "explain"
      - "define"
      - "analyze"
      - "rephrase"
      - "elaborate"
      - "programming language"
      - "JavaScript"
      - "Python programming language"
      - "Rust programming language"
      - "C++ programming language"
      - "GO programming language"
      - "Ruby programming language"
      - "Haskell programming language"
      - "SQL query language"
      - "CSS markup styling language"
      - "code"
  - source_model: Sao10K/L3-8B-Stheno-v3.2
    positive_prompts:
      - "characters"
      - "scene"
      - "roleplay"
      - "erotic roleplay"
      - "sexual fetish"
      - "NSFW"
      - "creative writing"
      - "storytelling"
      - "narration"
      - "narrative setting"
      - "narrative plot"
      - "narrative exposition"
      - "narrative theme"
      - "narrative climax"
```
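For context on how this config behaves at inference: with `gate_mode: hidden`, mergekit-moe derives each expert's router weights from the base model's hidden-state representations of that expert's positive (and negative) prompts, and each token is then routed to the `experts_per_token: 2` best-matching experts. The sketch below is a conceptual illustration of that top-k routing step, not mergekit's actual implementation:

```python
import numpy as np

def route_token(hidden_state, gate_vectors, top_k=2):
    """Pick the top_k experts for one token.

    gate_vectors: (num_experts, hidden_dim) array; conceptually, each row is
    derived from an expert's positive/negative prompt embeddings, which is
    how the hidden gate mode seeds the router.
    """
    logits = gate_vectors @ hidden_state   # similarity of token to each expert
    top = np.argsort(logits)[-top_k:]      # the experts_per_token winners
    w = np.exp(logits[top] - logits[top].max())
    return top, w / w.sum()                # normalized mixing weights

# Toy example: 4 experts over a 16-dimensional hidden state.
rng = np.random.default_rng(0)
gates = rng.normal(size=(4, 16))
token_state = rng.normal(size=16)
print(route_token(token_state, gates))
```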
|
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) |
|
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_HiroseKoichi__Llama-Salad-4x8B-V3).
|
|
|
| Metric              | Value |
|---------------------|------:|
| Avg.                | 24.75 |
| IFEval (0-Shot)     | 66.54 |
| BBH (3-Shot)        | 31.93 |
| MATH Lvl 5 (4-Shot) |  8.53 |
| GPQA (0-shot)       |  7.05 |
| MuSR (0-shot)       |  6.45 |
| MMLU-PRO (5-shot)   | 27.98 |
|
|
|
|