Commit 56da4d9
committed by root
1 Parent(s): b47f672

add new data
- README.md +6 -4
- data/hotpotqa.e5_mistral_retriever_chunkbysents1200/test.json +3 -0
- data/longbook_choice_eng.e5_mistral_retriever_chunkbysents1200/test.json +3 -0
- data/longbook_choice_eng_gpt4_same/test.json +3 -0
- data/longbook_qa_eng.e5_mistral_retriever_chunkbysents1200/test.json +3 -0
- data/longbook_qa_eng_gpt4_same/test.json +3 -0
- data/longbook_sum_eng_gpt4_same/test.json +3 -0
- data/longdialogue_qa_eng_gpt4_same/test.json +3 -0
- data/multifieldqa_en.e5_mistral_retriever_chunkbysents1200/test.json +3 -0
- data/musique.e5_mistral_retriever_chunkbysents1200/test.json +3 -0
- data/qasper.e5_mistral_retriever_chunkbysents1200/test.json +3 -0
- data/qmsum.e5_mistral_retriever_chunkbysents1200/test.json +3 -0
- data/quality.e5_mistral_retriever_chunkbysents1200/test.json +3 -0
README.md
CHANGED
@@ -16,10 +16,10 @@ tags:
 We introduce Llama3-ChatQA-2, which bridges the gap between open-source LLMs and leading proprietary models (e.g., GPT-4-Turbo) in long-context understanding and retrieval-augmented generation (RAG) capabilities. Llama3-ChatQA-2 is developed using an improved training recipe from the [ChatQA-1.5 paper](https://arxiv.org/pdf/2401.10225), and it is built on top of the [Llama-3 base model](https://huggingface.co/meta-llama/Meta-Llama-3-70B). Specifically, we continued training the Llama-3 base models to extend the context window from 8K to 128K tokens, along with a three-stage instruction tuning process to enhance the model's instruction-following, RAG performance, and long-context understanding capabilities. Llama3-ChatQA-2 has two variants: Llama3-ChatQA-2-8B and Llama3-ChatQA-2-70B. Both models were originally trained using [Megatron-LM](https://github.com/NVIDIA/Megatron-LM), and we converted the checkpoints to Hugging Face format. **For more information about ChatQA 2, check the [website](https://chatqa2-project.github.io/)!**
 
 ## Other Resources
-[Llama3-ChatQA-2-
+[Llama3-ChatQA-2-8B](https://huggingface.co/nvidia/Llama3-ChatQA-2-8B)  [Evaluation Data](https://huggingface.co/nvidia/Llama3-ChatQA-2-70B/tree/main/data)  [Training Data](https://huggingface.co/datasets/nvidia/ChatQA2-Long-SFT-data)  [Retriever](https://huggingface.co/intfloat/e5-mistral-7b-instruct)  [Website](https://chatqa2-project.github.io/)  [Paper](https://arxiv.org/abs/2407.14482)
 
 ## Overview of Benchmark Results
-Results in [ChatRAG Bench](https://huggingface.co/datasets/nvidia/ChatRAG-Bench) are as follows:
+<!-- Results in [ChatRAG Bench](https://huggingface.co/datasets/nvidia/ChatRAG-Bench) are as follows: -->
 
 
 ![Example Image](overview.png)
@@ -116,7 +116,9 @@ print(tokenizer.decode(response, skip_special_tokens=True))
 ```
 
 ## Command to run generation
+```
 python evaluate_cqa_vllm_chatqa2.py --model-folder ${model_path} --eval-dataset ${dataset_name} --start-idx 0 --end-idx ${num_samples} --max-tokens ${max_tokens} --sample-input-file ${dataset_path}
+```
 
 See all_command.sh for detailed configurations.
 
@@ -135,5 +137,5 @@ Peng Xu ([email protected]), Wei Ping ([email protected])
 
 
 ## License
-The use of this model is governed by the [META LLAMA 3 COMMUNITY LICENSE AGREEMENT](https://llama.meta.com/llama3/license/)
-
+The model is released under a non-commercial license, and its use is also governed by the [META LLAMA 3 COMMUNITY LICENSE AGREEMENT](https://llama.meta.com/llama3/license/)
+
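The generation command in the hunk above uses shell-style placeholders. A minimal sketch of wiring those placeholders to concrete values, invoked from Python (the model folder, dataset name, sample count, and token budget below are illustrative assumptions, not values taken from the repo):

```python
import subprocess

# All values below are hypothetical stand-ins for the ${...} placeholders.
model_path = "nvidia/Llama3-ChatQA-2-70B"   # assumed model folder
dataset_name = "hotpotqa"                   # assumed eval dataset name
dataset_path = "data/hotpotqa.e5_mistral_retriever_chunkbysents1200/test.json"
num_samples = 100                           # assumed number of eval samples
max_tokens = 64                             # assumed generation budget

# Mirrors the documented command line flag-for-flag.
subprocess.run(
    [
        "python", "evaluate_cqa_vllm_chatqa2.py",
        "--model-folder", model_path,
        "--eval-dataset", dataset_name,
        "--start-idx", "0",
        "--end-idx", str(num_samples),
        "--max-tokens", str(max_tokens),
        "--sample-input-file", dataset_path,
    ],
    check=True,  # raise if the evaluation script exits non-zero
)
```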
data/hotpotqa.e5_mistral_retriever_chunkbysents1200/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8bdbe87695af0ffc9aafef2abbfd3753b9c49286a298265ab7d31cb53263b526
+size 23071838
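Each test.json in this commit is stored with Git LFS, so the diff records only a three-line pointer (spec version, SHA-256 object id, and byte size) rather than the JSON payload. A minimal sketch of reading those fields from a checkout where the pointer has not yet been smudged into the real file (`parse_lfs_pointer` is a hypothetical helper, not part of this repo's tooling):

```python
from pathlib import Path

def parse_lfs_pointer(path: str) -> dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer file into fields."""
    fields: dict[str, str] = {}
    for line in Path(path).read_text().splitlines():
        key, _, value = line.partition(" ")
        if key and value:
            fields[key] = value
    return fields

pointer = parse_lfs_pointer("data/hotpotqa.e5_mistral_retriever_chunkbysents1200/test.json")
print(pointer["version"])  # https://git-lfs.github.com/spec/v1
print(pointer["oid"])      # sha256:8bdbe876...
print(pointer["size"])     # 23071838 (bytes)
```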
data/longbook_choice_eng.e5_mistral_retriever_chunkbysents1200/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:408446c1f66282c22f5250f7bb34b61c3e138030456c37ff80aae8107a0bfb86
+size 373336874
data/longbook_choice_eng_gpt4_same/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:08c8424d91c27fb8cb00778f6790db51a42b554e51cece592f410dc1199ca47f
+size 302380927
data/longbook_qa_eng.e5_mistral_retriever_chunkbysents1200/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ed337ee7c70c6bf6f64c955ee17d8f59c001a4a2ca5b33056e749163592cbd61
+size 598663025
data/longbook_qa_eng_gpt4_same/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0fab613a11b84bc3812b44b38cca84d9142d63eee2e8d73b1f395ee37b0948ad
+size 476461679
data/longbook_sum_eng_gpt4_same/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4853bca9814332ec28e8920fe73be06daf61a3975fef4a20ee0fe705a1e9c2f8
+size 50591948
data/longdialogue_qa_eng_gpt4_same/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6bcff853f1d8c62c098a2bb176c6d7b0caec848d20543c95ffd965f43e4c7309
+size 83709847
data/multifieldqa_en.e5_mistral_retriever_chunkbysents1200/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9a3607042f7620baced5ee63faa359a275ff4d4561d7b6d93a502bee4e414972
+size 9086582
data/musique.e5_mistral_retriever_chunkbysents1200/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2926144219b2a9df26b8a9ee551af2558ce92223637dbc910ead2fec37cedca7
+size 28347035
data/qasper.e5_mistral_retriever_chunkbysents1200/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:77001fab84bca90813cf07c4d2ba7f5ecf3a545e1b78719a3ea513e6d4b696b7
+size 80736248
data/qmsum.e5_mistral_retriever_chunkbysents1200/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d3c13449783630ee5ab648a485f0ebc8eff1bc810fa99e97d374166564882cbd
+size 32187716
data/quality.e5_mistral_retriever_chunkbysents1200/test.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d8ee44ed63c72a79476b5828c18799a1a4682915d56808861d99f09205a03450
+size 106451846
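Since every file added here is an LFS pointer, one sanity check after downloading is to hash the materialized file and compare against the pointer's oid; Git LFS object ids are plain SHA-256 digests of the file contents. A sketch using the quality/test.json pointer above (the local path assumes the default checkout layout):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Expected digest copied from the quality/test.json pointer in this commit.
expected = "d8ee44ed63c72a79476b5828c18799a1a4682915d56808861d99f09205a03450"
actual = sha256_of("data/quality.e5_mistral_retriever_chunkbysents1200/test.json")
assert actual == expected, "downloaded file does not match the LFS pointer oid"
```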