fblgit committed on
Commit dfaae00
1 Parent(s): fe4ea44

Update README.md

Files changed (1):
  1. README.md +53 -14
README.md CHANGED
@@ -1,7 +1,9 @@
  ---
  language:
  - en
- license: mit
  library_name: transformers
  base_model:
  - Qwen/Qwen2.5-32B-Instruct
@@ -23,7 +25,8 @@ model-index:
  value: 45.03
  name: strict accuracy
  source:
- url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -38,7 +41,8 @@ model-index:
  value: 58.07
  name: normalized accuracy
  source:
- url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -53,7 +57,8 @@ model-index:
  value: 39.43
  name: exact match
  source:
- url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -68,7 +73,8 @@ model-index:
  value: 20.13
  name: acc_norm
  source:
- url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -83,7 +89,8 @@ model-index:
  value: 24.5
  name: acc_norm
  source:
- url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  - task:
  type: text-generation
@@ -100,7 +107,8 @@ model-index:
  value: 54.57
  name: accuracy
  source:
- url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  ---
 
@@ -109,12 +117,17 @@ This model is an experimental version of our latest innovation: `MGS`. Its up to
  We didn't apply our known `UNA` algorithm to the forward pass, but the two are entirely compatible: they operate on different parts of the neural network and in different ways, though both can be seen as regularization techniques.
  ![TheBeagle-v2-MGS](https://huggingface.co/fblgit/TheBeagle-v2beta-32B-MGS/resolve/main/TheBeagle-v2-MGS.png)

- `.. In the Loving Memory of my LoLa, coming back to your heart ..`

  ## MGS
  MGS stands for... Many-Geeks-Searching... and that's it. Hint: `1+1 is 2, and 1+1 is not 3`

- We still believe on 1-Epoch should be enough, so we just did 1 Epoch only as usual.

  ## Dataset
  We used the first decent (corpora & size) dataset on the hub: `Magpie-Align/Magpie-Pro-300K-Filtered`
@@ -126,10 +139,10 @@ It achieves the following results on the evaluation set:

  [All versions available](https://huggingface.co/fblgit/TheBeagle-v2beta-MGS-GGUF/tree/main)

- Quantz by bartowski:
- https://huggingface.co/bartowski/TheBeagle-v2beta-32B-MGS-GGUF

-
  ## Training
  [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 
@@ -176,7 +189,7 @@ The following hyperparameters were used during training:
  | 0.715 | 0.9629 | 798 | 0.5378 |


- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
  Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__TheBeagle-v2beta-32B-MGS)

  | Metric |Value|
@@ -192,4 +205,30 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
  ## Thanks
  - Qwen Team for their outstanding model
  - MagPie Team for contributing plenty of datasets
- - Cybertron Cloud Compute
  ---
  language:
  - en
+ license: other
+ license_name: qwen
+ license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
  library_name: transformers
  base_model:
  - Qwen/Qwen2.5-32B-Instruct
 
  value: 45.03
  name: strict accuracy
  source:
+ url: >-
+ https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  - task:
  type: text-generation
 
  value: 58.07
  name: normalized accuracy
  source:
+ url: >-
+ https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  - task:
  type: text-generation
 
  value: 39.43
  name: exact match
  source:
+ url: >-
+ https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  - task:
  type: text-generation
 
  value: 20.13
  name: acc_norm
  source:
+ url: >-
+ https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  - task:
  type: text-generation
 
  value: 24.5
  name: acc_norm
  source:
+ url: >-
+ https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  - task:
  type: text-generation
 
  value: 54.57
  name: accuracy
  source:
+ url: >-
+ https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS
  name: Open LLM Leaderboard
  ---
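The `url: >-` lines above switch the long leaderboard URLs to YAML's folded block scalar form: the source stays readable, but the value parses back to a single-line string. A real YAML loader (e.g. PyYAML's `yaml.safe_load`) does this folding; the helper below is a hypothetical, simplified sketch of only the basic case, for illustration:

```python
def fold_block_scalar(continuation_lines):
    """Simplified YAML '>-' folding: join the indented continuation
    lines with single spaces ('>') and drop the trailing newline
    (the '-' chomping indicator). Not a real YAML parser."""
    return " ".join(line.strip() for line in continuation_lines)

# The frontmatter fragment from this card folds back to one line:
url = fold_block_scalar([
    "  https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS",
])
print(url)  # same single-line URL the old `url:` form carried
```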


  We didn't apply our known `UNA` algorithm to the forward pass, but the two are entirely compatible: they operate on different parts of the neural network and in different ways, though both can be seen as regularization techniques.
  ![TheBeagle-v2-MGS](https://huggingface.co/fblgit/TheBeagle-v2beta-32B-MGS/resolve/main/TheBeagle-v2-MGS.png)

+ ## CHANGELOG
+ **UPDATE**: 26/Oct
+ * Updated `tokenizer_config.json` (from the base_model)
+ * Regenerated quants (being uploaded)
+ * Re-submitted the Leaderboard evaluation; MATH & IFEval show relevant updates
+ * Aligned LICENSE with `Qwen` terms

  ## MGS
  MGS stands for... Many-Geeks-Searching... and that's it. Hint: `1+1 is 2, and 1+1 is not 3`

+ We still believe 1 epoch should be enough, so we did just 1 epoch.

  ## Dataset
  We used the first decent (corpora & size) dataset on the hub: `Magpie-Align/Magpie-Pro-300K-Filtered`
 

  [All versions available](https://huggingface.co/fblgit/TheBeagle-v2beta-MGS-GGUF/tree/main)

+ ## Licensing terms
+
+ **On top of the Qwen LICENSE, we add an extra term for derivatives: include "Beagle" or "MGS" in the model name. This will help us better track the study. Thank you.**

  ## Training
  [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 
 
  | 0.715 | 0.9629 | 798 | 0.5378 |


+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) (without chat template)
  Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__TheBeagle-v2beta-32B-MGS)

  | Metric |Value|
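As a quick sanity check on the metrics table, the leaderboard's headline average can be recomputed from the six benchmark scores declared in this card's frontmatter; to our understanding it is the plain mean of those normalized scores:

```python
# The six benchmark scores from this card's model-index frontmatter.
scores = [45.03, 58.07, 39.43, 20.13, 24.5, 54.57]

# The Open LLM Leaderboard headline number is, to our understanding,
# the plain mean of these normalized scores.
average = round(sum(scores) / len(scores), 2)
print(average)  # 40.29
```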
 
  ## Thanks
  - Qwen Team for their outstanding model
  - MagPie Team for contributing plenty of datasets
+ - Cybertron Cloud Compute
+
+ # Citations
+ ```
+ @misc{thebeagle-v2,
+   title={TheBeagle v2: MGS},
+   author={Xavier Murias},
+   year={2024},
+   publisher={HuggingFace},
+   journal={HuggingFace repository},
+   howpublished={\url{https://huggingface.co/fblgit/TheBeagle-v2beta-32B-MGS}},
+ }
+ @misc{qwen2.5,
+   title={Qwen2.5: A Party of Foundation Models},
+   url={https://qwenlm.github.io/blog/qwen2.5/},
+   author={Qwen Team},
+   month={September},
+   year={2024}
+ }
+ @article{qwen2,
+   title={Qwen2 Technical Report},
+   author={An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan},
+   journal={arXiv preprint arXiv:2407.10671},
+   year={2024}
+ }
+ ```