--- license: mit tags: - text generation - RAG - baichuan2 --- This model is a 7B Chinese version of [Self-RAG](https://huggingface.co/selfrag/selfrag_llama2_7b). It is trained on Baichuan2-7B-Chat with a sample of [belle](https://github.com/LianjiaTech/BELLE) sft data, acompanying with interleaving passages from zhwiki. The reflection tokens are aligned with the original verison (in English), so the usage is the same. Hope you enjoy. ### Usage ``` from transformers import AutoTokenizer, AutoModelForCausalLM from vllm import LLM, SamplingParams model = LLM(YOUR_MODEL_PATH, dtype="half") sampling_params = SamplingParams(temperature=0.0, top_p=1.0, max_tokens=100, skip_special_tokens=False) def format_prompt(input, paragraph=None): prompt = "### Instruction:\n{0}\n\n### Response:\n".format(input) if paragraph is not None: prompt += "[Retrieval]{0}".format(paragraph) return prompt query_1 = "你好" query_2 = "世界最高的山峰是什么?" queries = [query_1, query_2] preds = model.generate([format_prompt(query) for query in queries], sampling_params) for pred in preds: print("Model prediction: {0}".format(pred.outputs[0].text)) # Model prediction: ```