Wanfq committed
Commit 8851fdc • 1 Parent(s): 5037254

Update README.md

Files changed (1): README.md (+12 -19)
README.md CHANGED
@@ -15,11 +15,11 @@ library_name: transformers
 
 <div id="top" align="center">
 
-**Knowledge Fusion of Large Language Models**
+<p style="font-size: 36px; font-weight: bold;">Knowledge Fusion of Large Language Models</p>
 
 
 <h4> |<a href="https://arxiv.org/abs/2401.10491"> 📑 Paper </a> |
-<a href="https://huggingface.co/Wanfq/FuseLLM-7B"> 🤗 Model </a> |
+<a href="https://huggingface.co/FuseAI"> 🤗 Huggingface Repo </a> |
 <a href="https://github.com/fanqiwan/FuseLLM"> 🐱 Github Repo </a> |
 </h4>
 
@@ -38,8 +38,7 @@ _<sup>†</sup> Sun Yat-sen University,
 
 
 ## News
-- **Jan 23, 2024:** 🔥🔥 We release the code for FuseLLM, including the data construction and model training process!
-- **Jan 22, 2024:** 🔥 We're excited to announce that the FuseLLM-7B, which is the fusion of [Llama-2-7B](https://huggingface.co/meta-llama/Llama-2-7b-hf), [OpenLLaMA-7B](https://huggingface.co/openlm-research/open_llama_7b_v2), and [MPT-7B](https://huggingface.co/mosaicml/mpt-7b), is now available on 🤗 [Huggingface Models](https://huggingface.co/Wanfq/FuseLLM-7B). Happy exploring!
+- **Jan 22, 2024:** 🔥 We release [FuseLLM-7B](https://huggingface.co/Wanfq/FuseLLM-7B), which is the fusion of three open-source foundation LLMs with distinct architectures, including [Llama-2-7B](https://huggingface.co/meta-llama/Llama-2-7b-hf), [OpenLLaMA-7B](https://huggingface.co/openlm-research/open_llama_7b_v2), and [MPT-7B](https://huggingface.co/mosaicml/mpt-7b).
 
 
 ## WIP
@@ -59,7 +58,6 @@ _<sup>†</sup> Sun Yat-sen University,
 - [Training](#training)
 - [Evaluation](#evaluation)
 - [Citation](#citation)
-- [Acknowledgements](#acknowledgments)
 
 ## Overview
 
@@ -134,9 +132,9 @@ pip install -r requirements.txt
 ### Usage
 
 ```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
+from transformers import AutoTokenizer, AutoModel
 tokenizer = AutoTokenizer.from_pretrained("Wanfq/FuseLLM-7B", use_fast=False)
-model = AutoModelForCausalLM.from_pretrained("Wanfq/FuseLLM-7B", torch_dtype="auto")
+model = AutoModel.from_pretrained("Wanfq/FuseLLM-7B", torch_dtype="auto")
 model.cuda()
 inputs = tokenizer("<your text here>", return_tensors="pt").to(model.device)
 tokens = model.generate(
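
The `Usage` snippet in this hunk is cut off at `model.generate(` by the hunk boundary. Below is a minimal, self-contained sketch of what the full call could look like; it assumes `AutoModelForCausalLM` (the bare `AutoModel` class loads the model without the language-modeling head that `generate` needs) and uses illustrative sampling parameters, not values taken from the README:

```python
# Minimal sketch; the generation parameters are illustrative assumptions,
# not the README's own values.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Wanfq/FuseLLM-7B", use_fast=False)
# AutoModelForCausalLM attaches the LM head required by .generate();
# torch_dtype="auto" keeps the checkpoint's stored precision.
model = AutoModelForCausalLM.from_pretrained("Wanfq/FuseLLM-7B", torch_dtype="auto")
model.cuda()

inputs = tokenizer("<your text here>", return_tensors="pt").to(model.device)
tokens = model.generate(
    **inputs,
    max_new_tokens=128,  # illustrative cap on generated length
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.7,     # illustrative sampling temperature
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```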
@@ -351,16 +349,11 @@ The evaluation code we used in our evaluation are list as follows:
 
 If you find this work is relevant with your research or applications, please feel free to cite our work!
 ```
-@misc{wan2024knowledge,
-  title={Knowledge Fusion of Large Language Models},
-  author={Fanqi Wan and Xinting Huang and Deng Cai and Xiaojun Quan and Wei Bi and Shuming Shi},
-  year={2024},
-  eprint={2401.10491},
-  archivePrefix={arXiv},
-  primaryClass={cs.CL}
+@inproceedings{wan2024knowledge,
+  title={Knowledge Fusion of Large Language Models},
+  author={Fanqi Wan and Xinting Huang and Deng Cai and Xiaojun Quan and Wei Bi and Shuming Shi},
+  booktitle={The Twelfth International Conference on Learning Representations},
+  year={2024},
+  url={https://openreview.net/pdf?id=jiDsk12qcz}
 }
-```
-
-## Acknowledgments
-
-This repo benefits from [Stanford-Alpaca](https://github.com/tatsu-lab/stanford_alpaca) and [Explore-Instruct](https://github.com/fanqiwan/Explore-Instruct). Thanks for their wonderful works!
+```