---
language:
- en
- zh
tags:
- qwen
- llama
- llama-2
license: gpl-3.0
---

NEW VERSIONS: [https://huggingface.co/CausalLM/14B](https://huggingface.co/CausalLM/14B)


This is a LLaMAfied replica of [Qwen/Qwen-7B-Chat](https://huggingface.co/Qwen/Qwen-7B-Chat) (the original version from before 25.09.2023), recalibrated to fit the standard LLaMA/LLaMA-2 model structure.

You can use `LlamaForCausalLM` for model inference, exactly as with LLaMA/LLaMA-2 models (with a `GPT2Tokenizer` converted from the original tiktoken vocabulary by [vonjack](https://huggingface.co/vonjack)).
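
For example, a minimal inference sketch with Hugging Face `transformers` might look like the following. The `model_path` placeholder, the system prompt, and the CUDA device are illustrative assumptions, not part of this repository:

```python
# Minimal inference sketch; assumes `transformers` and `torch` are installed.
# `model_path` is a placeholder for this repository's id or a local checkout.
import torch
from transformers import AutoTokenizer, LlamaForCausalLM

model_path = "path/to/this/model"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
model = model.to("cuda")  # assumes a CUDA device; use "cpu" otherwise

# Build a prompt in the ChatML format this model expects (see PROMPT FORMAT below).
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nHello!<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```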

The model has been edited to be white-labelled, meaning it will no longer identify itself as Qwen.

So far, the model has undergone numerical weight alignment and preliminary reinforcement learning to stay consistent with the original model. Some errors and outdated knowledge have been addressed through model-editing methods. The model otherwise remains completely equivalent to the original version, without any dedicated supervised fine-tuning on downstream tasks or other extensive conversation datasets.

PROMPT FORMAT: [chatml](https://github.com/openai/openai-python/blob/main/chatml.md)
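
For reference, a ChatML conversation is laid out as below; the system message and the user turn are illustrative placeholders:

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{your question}<|im_end|>
<|im_start|>assistant
```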

CURRENT MMLU: 53.48

CURRENT CEval (val): 54.13

| Benchmark   | STEM  | Humanities | Social Science | Other | Hard  | Average |
|-------------|-------|------------|----------------|-------|-------|---------|
| MMLU        | 46.40 | 47.61      | 61.78          | 61.31 | n/a   | 53.48   |
| CEval (val) | 45.28 | 58.76      | 66.19          | 54.62 | 28.64 | 54.13   |

Issue: Compared to the original Qwen-7B-Chat, which scores 53.90 on MMLU and 54.18 on CEval (val), our scores dropped slightly (-0.42 on MMLU, -0.05 on CEval (val)) due to insufficient realignment.

