jikaixuan/zephyr-ds


jikaixuan committed on Jan 13

Commit e639da6 • 1 Parent(s): cf31064

Model save

Files changed (2)

README.md +10 -10
training_args.bin +2 -2

README.md CHANGED

@@ -15,15 +15,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6400
-- Rewards/chosen: 0.0301
-- Rewards/rejected: -0.0273
-- Rewards/accuracies: 0.6370
-- Rewards/margins: 0.0574
-- Logps/rejected: -253.2124
-- Logps/chosen: -269.2556
-- Logits/rejected: -2.4963
-- Logits/chosen: -2.4945
+- Loss: 0.6409
+- Rewards/chosen: 0.0197
+- Rewards/rejected: -0.0229
+- Rewards/accuracies: 0.6130
+- Rewards/margins: 0.0426
+- Logps/rejected: -253.1684
+- Logps/chosen: -269.3594
+- Logits/rejected: -2.4973
+- Logits/chosen: -2.4954
 ## Model description
@@ -60,7 +60,7 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.6442        | 1.0   | 955  | 0.6400          | 0.0301         | -0.0273          | 0.6370             | 0.0574          | -253.2124      | -269.2556    | -2.4963         | -2.4945       |
+| 0.6468        | 1.0   | 955  | 0.6409          | 0.0197         | -0.0229          | 0.6130             | 0.0426          | -253.1684      | -269.3594    | -2.4973         | -2.4954       |
 ### Framework versions
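
The README.md change swaps one set of evaluation numbers for another. The following minimal Python sketch compares the two sets and checks that the reported `Rewards/margins` equals `Rewards/chosen` minus `Rewards/rejected`. That relation is an assumption about how the trainer computes the margin; it is consistent with the figures in this diff but is not stated in it. The dictionary layout and the function names `margin` and `deltas` are illustrative only.

```python
# Evaluation metrics copied from the removed (-) and added (+) lines of README.md.
before = {
    "loss": 0.6400,
    "rewards/chosen": 0.0301,
    "rewards/rejected": -0.0273,
    "rewards/accuracies": 0.6370,
    "rewards/margins": 0.0574,
}
after = {
    "loss": 0.6409,
    "rewards/chosen": 0.0197,
    "rewards/rejected": -0.0229,
    "rewards/accuracies": 0.6130,
    "rewards/margins": 0.0426,
}


def margin(metrics):
    """Assumed relation: margin = chosen reward - rejected reward."""
    return round(metrics["rewards/chosen"] - metrics["rewards/rejected"], 4)


def deltas(old, new):
    """Per-metric change (new - old), rounded to the precision used in the card."""
    return {key: round(new[key] - old[key], 4) for key in old}


for label, metrics in (("before", before), ("after", after)):
    print(label, "reported margin:", metrics["rewards/margins"],
          "recomputed:", margin(metrics))

for key, change in deltas(before, after).items():
    print(f"{key}: {change:+.4f}")
```

Under that assumption both recomputed margins match the reported values. The deltas show that the new run has a slightly higher validation loss and a lower reward accuracy and margin than the previous one.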

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:55735e718abcb162ead37e5dd69bbbcafb5bd7d79744ef90d96d7c4c7c4f2969
-size 4728
+oid sha256:6dab51201b4225ab9086b23809d9065b793c1c64bd39c30b7fcefee8ce4762f6
+size 4792
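
The training_args.bin diff shows only Git LFS pointer files, not the binary itself. Each pointer is a short text file of `key value` lines (`version`, `oid`, `size`). The following Python sketch parses the two pointers from this diff and reports what changed. The helper name `parse_lfs_pointer` is hypothetical and not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # each line is "<key> <value>"
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")  # e.g. "sha256:<hex>"
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


# Pointer contents copied from the two sides of the diff.
old_pointer = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:55735e718abcb162ead37e5dd69bbbcafb5bd7d79744ef90d96d7c4c7c4f2969
size 4728
""")
new_pointer = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:6dab51201b4225ab9086b23809d9065b793c1c64bd39c30b7fcefee8ce4762f6
size 4792
""")

print("content changed:", old_pointer["oid"] != new_pointer["oid"])
print("size change (bytes):", new_pointer["size"] - old_pointer["size"])
```

The differing `oid` values confirm that the stored file content changed, and the `size` fields show the new training_args.bin is 64 bytes larger than the old one.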