metadata
license: apache-2.0
base_model: argilla/zephyr-7b-spin-iter2-v0
tags:
- generated_from_trainer
model-index:
- name: zephyr-7b-spin-iter3-v0
results: []
datasets:
- argilla/10k_prompts_SPIN_iter3_zephyr_top
- argilla/10k_prompts_SPIN_iter2_zephyr_top
- DIBT/10k_prompts_ranked
zephyr-7b-spin-iter3-v0
A model matching the results of SPIN with very little data (30x less), carefully curated by the amazing Data Is Better Together community
This model is a fine-tuned version of argilla/zephyr-7b-spin-iter2-v0 on the argilla/10k_prompts_SPIN_iter3_zephyr_top and the argilla/10k_prompts_SPIN_iter2_zephyr_top dataset.
Check this repo for full reproducible code using the original SPIN implementation and distilabel.
If you want to contribute to high quality datasets like this, contribute to the DIBT prompt collective initiative.
MT-Bench results
Model | 1st Turn Score | 2nd Turn Score | Average Score | SPIN paper Score |
---|---|---|---|---|
zephyr-7b-sft-full | 6.6625 | 6.0250 | 6.34375 | 5.94 |
zephyr-7b-spin-iter0-v0 | 6.64375 | 6.1750 | 6.409375 | 6.46 |
zephyr-7b-spin-iter1-v0 | 6.90625 | 6.3000 | 6.603125 | 6.65 |
zephyr-7b-spin-iter2-v0 | 7.1375 | 6.3125 | 6.725000 | 6.78 |
zephyr-7b-spin-iter3-v0 | 7.09375 | 6.4500 | 6.771875 | - |
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-07
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 2.0
Training results
Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
---|---|---|---|---|---|---|---|---|---|---|---|
0.2928 | 0.49 | 25 | 0.3951 | -2.6212 | -20.3268 | 0.9062 | 17.7056 | -700.5638 | -278.0876 | -2.8098 | -2.8090 |
0.1487 | 0.97 | 50 | 0.1319 | -2.9077 | -29.1459 | 0.9375 | 26.2382 | -702.3276 | -278.1449 | -2.8218 | -2.8066 |
0.006 | 1.46 | 75 | 0.1269 | -2.6037 | -29.1519 | 0.9583 | 26.5482 | -702.3289 | -278.0841 | -2.8175 | -2.8037 |
0.0086 | 1.94 | 100 | 0.1099 | -2.9181 | -29.6970 | 0.9271 | 26.7789 | -702.4378 | -278.1470 | -2.8177 | -2.8051 |
Framework versions
- Transformers 4.37.0
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2