
shunxing1234 committed on
Commit 32a030f
1 Parent(s): 84b3d9a

Update README.md

Files changed (1)
  1. README.md +52 -2
README.md CHANGED
@@ -1,5 +1,55 @@
  ---
  license: other
- license_name: baai-aquila-model-license-agreement
- license_link: LICENSE
  ---
+
+
+ ![Aquila_logo](./log.jpeg)
+
+
+ <h4 align="center">
+     <p>
+         <b>English</b> |
+         <a href="https://huggingface.co/BAAI/AquilaChat2-34B/blob/main/README_zh.md">简体中文</a>
+     </p>
+ </h4>
+
+
+ We open-source our **Aquila2** series, which now includes the base language models **Aquila2-7B** and **Aquila2-34B**, the chat models **AquilaChat2-7B** and **AquilaChat2-34B**, and the long-text chat models **AquilaChat2-7B-16k** and **AquilaChat2-34B-16k**.
+
+ Additional details of the Aquila2 models will be presented in the official technical report. Please stay tuned for updates on our official channels.
+
+ ## Chat Model Performance
+
+ <br>
+ <p align="center">
+     <img src="chat_metrics.jpeg" width="1024"/>
+ </p>
+ <br>
+
+ ## Quick Start: AquilaChat2-34B (Chat Model)
+
+ ### 1. Inference
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ device = torch.device("cuda")
+ model_info = "BAAI/AquilaChat2-34B"
+ tokenizer = AutoTokenizer.from_pretrained(model_info, trust_remote_code=True)
+ # Load in bfloat16: the 34B weights are too large for a single GPU in float32.
+ model = AutoModelForCausalLM.from_pretrained(model_info, trust_remote_code=True,
+                                              torch_dtype=torch.bfloat16)
+ model.eval()
+ model.to(device)
+
+ # Prompt (Chinese): "Please give 10 reasons to visit Beijing."
+ text = "请给出10个要到北京旅游的理由。"
+ tokens = tokenizer.encode_plus(text)['input_ids']
+ tokens = torch.tensor(tokens)[None,].to(device)
+
+ # Ban the leading token of each stop string; each entry in bad_words_ids is a separate one-token sequence.
+ stop_tokens = ["###", "[UNK]", "</s>"]
+ bad_words_ids = [[tokenizer.encode(token)[0]] for token in stop_tokens]
+ with torch.no_grad():
+     out = model.generate(tokens, do_sample=True, max_length=512,
+                          eos_token_id=100007, bad_words_ids=bad_words_ids)[0]
+ out = tokenizer.decode(out.cpu().numpy().tolist())
+ print(out)
+ ```
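+
+ If the model does not fit on a single GPU, a minimal sketch of sharded loading is shown below; it assumes the `accelerate` package is installed, and `device_map="auto"` is a standard `transformers` loading option rather than part of the original quick start. Generation then proceeds as above, except that the explicit `model.to(device)` call is dropped.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ model_info = "BAAI/AquilaChat2-34B"
+ tokenizer = AutoTokenizer.from_pretrained(model_info, trust_remote_code=True)
+ # device_map="auto" shards the layers across all visible GPUs (spilling to CPU RAM if needed).
+ model = AutoModelForCausalLM.from_pretrained(
+     model_info,
+     trust_remote_code=True,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+ model.eval()
+ ```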
+
+
+ ## License
+
+ The Aquila2 series of open-source models is licensed under the [BAAI Aquila Model License Agreement](https://huggingface.co/BAAI/AquilaChat2-7B/blob/main/BAAI-Aquila-Model-License%20-Agreement.pdf).