{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "B3S1AZp6Kc0R" }, "source": [ "# Installing\n", "\n", "If running on Google Colab you will need a Colab Pro+ subscription.\n", "\n", "Change the runtime type to a high memory or connect your local machine to colab, launch a vm perhaps.\n", "\n", "First, you will need to obtain the LLaMA weights. \n", "\n", "You can sign up for the official weights here: https://huggingface.co/docs/transformers/main/model_doc/llama\n", "\n", "There are alternative models available on huggingface however. This guide will assume you do not have access to the official weights.\n", "\n", "If you do have access to the official weights skip to: Clone the delta weights" ] }, { "cell_type": "markdown", "metadata": { "id": "8_MM5GNkKc0T" }, "source": [ "## Clone the LLaMA weights" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5TioBc7WKc0U" }, "outputs": [], "source": [ "# Setup git lfs\n", "!git lfs install --skip-smudge --force\n", "!git lfs env\n", "!git config filter.lfs.process \"git-lfs filter-process --skip\"\n", "!git config filter.lfs.smudge \"git-lfs smudge --skip -- %f\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "B5Q-x5yBKc0V" }, "outputs": [], "source": [ "# Cloning the 7b parameter model repo\n", "!git lfs clone https://huggingface.co/decapoda-research/llama-7b-hf" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "d0r8HBUyKc0V" }, "outputs": [], "source": [ "# Cloning the 13b parameter model repo\n", "!git lfs clone https://huggingface.co/decapoda-research/llama-13b-hf" ] }, { "cell_type": "markdown", "metadata": { "id": "QeDIT-a4Kc0V" }, "source": [ "## Applying the Vicuna delta weights" ] }, { "cell_type": "markdown", "metadata": { "id": "p7BGGUICKc0W" }, "source": [ "### Install PyTorch with CUDA support\n", "\n", "If you already have this installed in your environment, you can skip this step." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "IfGE0ALrKc0W" }, "outputs": [], "source": [ "# First we need to upgrade setuptools, pip and wheel\n", "!pip install --upgrade setuptools pip wheel" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "RpvAYAD8Kc0W" }, "outputs": [], "source": [ "# For CUDA 11.X:\n", "!pip install nvidia-cuda-runtime-cu11 --index-url https://pypi.ngc.nvidia.com" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "1oEi1ggaKc0X" }, "outputs": [], "source": [ "# For CUDA 12.x\n", "!pip install nvidia-cuda-runtime-cu12 --index-url https://pypi.ngc.nvidia.com" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "fQYYrn4tKc0X" }, "outputs": [], "source": [ "# For PyTorch cu117\n", "!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "G7lnb0sYKc0X" }, "outputs": [], "source": [ "# For PyTorch cu118\n", "!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118" ] }, { "cell_type": "markdown", "metadata": { "id": "oBVyjHuWKc0Y" }, "source": [ "### Running the Fast-Chat apply delta script" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "a4b11Z8lKc0Y" }, "outputs": [], "source": [ "# Install FastChat\n", "!pip install fschat\n", "\n", "# Install the latest main branch of huggingface/transformers\n", "!pip install git+https://github.com/huggingface/transformers" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "3vJTP1m0Kc0Y" }, "outputs": [], "source": [ "import json\n", "\n", "def convert_llama_names(model_name: str) -> None:\n", " \"\"\"Convert LlamaForCausalLM to LlamaForCausalLM and LLaMATokenizer to LlamaTokenizer\"\"\"\n", " with open(f\"{model_name}/config.json\", \"r\", encoding='utf-8') as f:\n", " data = f.read()\n", "\n", " config = json.loads(data)\n", " config[\"architectures\"] = [\"LlamaForCausalLM\"]\n", " with open(f\"{model_name}/config.json\", \"w\", encoding='utf-8') as f:\n", " json.dump(config, f)\n", "\n", "\n", " with open(f\"{model_name}/tokenizer_config.json\", \"r\", encoding='utf-8') as f:\n", " data = f.read()\n", "\n", " config = json.loads(data)\n", " config[\"tokenizer_class\"] = \"LlamaTokenizer\"\n", "\n", " with open(f\"{model_name}/tokenizer_config.json\", \"w\", encoding='utf-8') as f:\n", " json.dump(config, f)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "qyboxxkXKc0Y" }, "outputs": [], "source": [ "!git lfs clone https://huggingface.co/lmsys/vicuna-7b-delta-v1.1" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "9tlvfvapKc0Z" }, "outputs": [], "source": [ "# 7b Model\n", "convert_llama_names(\"llama-7b-hf\")\n", "!python -m fastchat.model.apply_delta --base llama-7b-hf --target vicuna-7b --delta ./vicuna-7b-delta-v1.1" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5VzAdoySKc0Z" }, "outputs": [], "source": [ "!git lfs clone https://huggingface.co/lmsys/vicuna-13b-delta-v1.1" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "jRBc7q5RKc0Z" }, "outputs": [], "source": [ "# 13b\n", "convert_llama_names(\"llama-13b-hf\")\n", "!python -m fastchat.model.apply_delta --base llama-13b-hf --target vicuna-13b --delta ./vicuna-13b-delta-v1.1" ] }, { "cell_type": "markdown", "metadata": { "id": "7n8Ug87RKc0Z" }, "source": [ "# Installing Auto-Vicuna\n", "\n", 
"Note that running this does not work in colab or the notebook, it is for demonstration purposes only." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "gIlfwNlEKc0Z" }, "outputs": [], "source": [ "!pip install auto-vicuna" ] }, { "cell_type": "markdown", "metadata": { "id": "0Z5NlQOXKc0Z" }, "source": [ "# Running Auto-Vicuna" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "sBJiQP7MKc0a" }, "outputs": [], "source": [ "!auto_vicuna --vicuna_weights vicuna-7b" ] }, { "cell_type": "markdown", "metadata": { "id": "oqqFeF4zKc0a" }, "source": [ "You can also create a .env file with \n", "\n", "```\n", "VICUNA_WEIGHTS=vicuna-7b\n", "```\n", "\n", "To avoid passing the weights as an arugment." ] }, { "cell_type": "markdown", "metadata": { "id": "jiW98C1KKc0a" }, "source": [] }, { "cell_type": "markdown", "metadata": { "id": "VE8DhmUrKc0a" }, "source": [ "## Known Issues\n", "\n", "If your model keeps talking about random news articles and suchs the `special_tokens_map.json` and `tokenizer_config.json` need to have to stop tokens populated most likely, you can find them in the repo's root dir." ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" }, "orig_nbformat": 4, "colab": { "provenance": [] } }, "nbformat": 4, "nbformat_minor": 0 }