fblgit committed
Commit 0608844
1 Parent(s): ae3a42f

Update README.md

Files changed (1)
  1. README.md +13 -8
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
 - generated_from_trainer
 base_model: 152334H/miqu-1-70b-sf
 model-index:
-- name: qlora-out
+- name: Senku-70B-Full
   results: []
 license: cc0-1.0
 datasets:
@@ -13,16 +13,20 @@ datasets:
 
 # ShinojiResearch/Senku-70B-Full
 
-## Model Details
+[<img src="https://cdna.artstation.com/p/assets/images/images/034/109/324/large/bella-factor-senku-ishigami.jpg?1611427638" width="420">](Senku-70B-Full)
+## UPDATE: **85.09** EQ-Bench with the ChatML template
+* EQ-Bench: (Mistral) *84.89* -> **85.09** (ChatML)
+* GSM8k: (Mistral) *77.18* -> **71.04** (ChatML)
+* Hellaswag: (Mistral) 87.67 -> ??
+
 A fine-tune of miqu-1-70b-sf, a dequantization of miqudev's leaked Mistral-70B (allegedly an early Mistral Medium). My diffs are available under CC-0 in the Senku-70B repo; this "Full" repo is the merge with the leaked model, so you can use the other repository to save bandwidth.
 
-EQ-Bench: 84.89
-GSM8k: 77.18 (71.04 when using ChatML)
-Hellaswag: 87.67
-
-Edit: Upon further testing a score of 85.09 was achieved using ChatML instead of Mistral's prompt.
+**Update**: Upon further testing, a score of **85.09** was achieved using ChatML instead of Mistral's prompt format.
 
+## Prompt Template
+
 I recommend using the ChatML format instead of Mistral's; I will run more benchmarks. ChatML also fixes the bug where the Miqu dequant fails to emit a stop token.
+```
 <|im_start|>system
 Provide some context and/or instructions to the model.
 <|im_end|>
@@ -30,11 +34,12 @@ Provide some context and/or instructions to the model.
 The user’s message goes here
 <|im_end|>
 <|im_start|>assistant <|im_end|>
+```
 
-Credit to https://twitter.com/hu_yifei for providing GSM & Hellaswag. It is the first open weight model to dethrone GPT-4 on EQ bench,
+## Kudos
+Credit to https://twitter.com/hu_yifei for providing the GSM8k & Hellaswag runs. This is the first open-weight model to dethrone GPT-4 on EQ-Bench.
 
 ## Base Model Details
-
 This model is a fine-tuned version of [152334H/miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf) on the SlimOrca dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.3110
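
To make the template in the diff above concrete, here is a minimal sketch of assembling a single-turn ChatML prompt in Python. The `build_chatml_prompt` helper and the message strings are illustrative, not part of the model card; the assistant turn is left open so the model generates the reply:

```python
# Minimal sketch of the ChatML template from this card.
# build_chatml_prompt is a hypothetical helper; the message strings are placeholders.
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a single-turn ChatML prompt, leaving the assistant turn open."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "Provide some context and/or instructions to the model.",
    "The user's message goes here",
)
print(prompt)
```

Generation should then stop at `<|im_end|>`, which is the stop-token behavior the card's ChatML fix refers to.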
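
And a hedged sketch of loading the merged model with Hugging Face `transformers`. The repo id comes from the card's title; the dtype and `device_map` settings are assumptions (a 70B model needs several GPUs or offloading), so check the repository files before running:

```python
# Hedged sketch: load the merged model and generate from a ChatML prompt.
# Assumptions: float16 weights and device_map="auto"; confirm against the repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ShinojiResearch/Senku-70B-Full"  # repo id from the card title
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption; confirm the stored dtype
    device_map="auto",          # assumption; requires multiple GPUs or offloading
)

prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nWhat is 2 + 2?<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```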