Vicuna-13B-V1.1

Vicuna 13B model weights.

2023.04.16: Obtained the Vicuna weights by merging the LLaMA-13B model with the Vicuna delta weights v1.1, and uploaded them to the huggingface.co model repository: https://huggingface.co/uukuguy/vicuna-13b-v1.1
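As a rough illustration of what merging means here, the sketch below assumes that each delta tensor is an element-wise difference that is added back onto the base tensor of the same name, which is the usual meaning of delta weights. The function name `apply_delta`, the `Weights` type, and the toy values are hypothetical; this is not the script that produced the weights in this repository.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Add each delta tensor to the base tensor with the same name."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must have the same parameter names")
    merged: Weights = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Assumption: merged = base + delta, element by element.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Toy values only; a real 13B checkpoint holds billions of parameters.
toy_base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
toy_delta = {"layer.weight": [0.25, -1.0], "layer.bias": [0.0]}
print(apply_delta(toy_base, toy_delta))
```

A real merge would perform the same addition on framework tensors loaded from the two checkpoints rather than on Python lists.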

# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/uukuguy/vicuna-13b-v1.1

# If you want to clone without the large files (only their pointers),
# prefix your git clone command with the following environment variable:
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/uukuguy/vicuna-13b-v1.1

Model Card

Model details

Model type: Vicuna is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. It is an auto-regressive language model, based on the transformer architecture.

Model date: The Vicuna-13B-V1.1 weights were merged in April 2023.

Organizations developing the model: The Vicuna team with members from UC Berkeley, CMU, Stanford, and UC San Diego.

Paper or resources for more information: https://vicuna.lmsys.org/

License: Apache License 2.0

Where to send questions or comments about the model: https://github.com/uukuguy/Vicuna-LoRA/issues

Intended use

Primary intended uses: The primary use of Vicuna is research on large language models and chatbots.

Primary intended users: The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.

Major updates of weights v1.1

Refactor the tokenization and separator. In Vicuna v1.1, the separator has been changed from "###" to the EOS token "</s>". This change makes it easier to determine the generation stop criteria and enables better compatibility with other libraries.

Fix the supervised fine-tuning loss computation for better model quality.
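The effect of the separator change on stop criteria can be sketched at the string level. This example assumes the EOS token is LLaMA's `</s>`; the helper `truncate_at_separator` and the sample replies are hypothetical. In real inference one would normally stop on the tokenizer's EOS token id rather than search the decoded text.

```python
# Separator strings taken from the release note; everything else is illustrative.
V1_0_SEPARATOR = "###"   # older plain-text separator
V1_1_SEPARATOR = "</s>"  # EOS token used as the separator in v1.1 (assumed LLaMA EOS)


def truncate_at_separator(text: str, separator: str) -> str:
    """Return the text before the first separator, with whitespace stripped."""
    head, _, _ = text.partition(separator)
    return head.strip()


# With "###", a reply that legitimately contains the separator
# (here a Markdown heading marker) is cut short.
reply_v1_0 = "Use a heading like ### Title in Markdown.### Human: thanks"

# With "</s>", ordinary text cannot collide with the separator.
reply_v1_1 = "Use a heading like ### Title in Markdown.</s>"

print(truncate_at_separator(reply_v1_0, V1_0_SEPARATOR))
print(truncate_at_separator(reply_v1_1, V1_1_SEPARATOR))
```

The first call returns a truncated answer because "###" also occurs inside the reply, while the second keeps the full sentence. That ambiguity is one reason a dedicated EOS token is a simpler stop signal.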