legolasyiu committed
Commit 4c18874
1 Parent(s): 342e3db

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -95,7 +95,9 @@ Llama-3.1-Storm-8B is a powerful generalist model useful for diverse application
  4. 🚀 Ollama: `ollama run ajindal/llama3.1-storm:8b`
 
 
- ## 💻 How to Use the Model
+ ---
+
+ ## 💻 How to Use EpistemeAI2's FireStorm-Llama-3.1-8B
  The Hugging Face `transformers` library loads the model in `bfloat16` by default. This is the type used by the [Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B) checkpoint, so it's the recommended way to run the model for the best results.
 
  ### Installation
@@ -160,7 +162,7 @@ print(response) # Expected Output: '2 + 2 = 4'
  ```python
  from vllm import LLM, SamplingParams
  from transformers import AutoTokenizer
- model_id = "akjindal53244/Llama-3.1-Storm-8B" # FP8 model: "EpistemeAI2/FireStorm-Llama-3.1-8B"
+ model_id = "EpistemeAI2/FireStorm-Llama-3.1-8B" # FP8 model: "EpistemeAI2/FireStorm-Llama-3.1-8B"
  num_gpus = 1
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  llm = LLM(model=model_id, tensor_parallel_size=num_gpus)
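
For context on the first hunk: the section it retitles documents loading the model with `transformers` in `bfloat16`. A minimal sketch of that loading path, using the repo id this commit switches to; the explicit `torch_dtype`, chat-template call, prompt, and generation settings below are illustrative assumptions, not shown in the diff:

```python
# Minimal sketch (assumptions noted above): load the checkpoint in bfloat16,
# the dtype the README says `transformers` uses by default for this model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EpistemeAI2/FireStorm-Llama-3.1-8B"  # repo id from this commit's + line
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # explicit, matching the checkpoint dtype
    device_map="auto",
)

messages = [{"role": "user", "content": "What is 2+2?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=32)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```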
 
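The second hunk ends mid-snippet at `llm = LLM(...)`. Based on the `print(response) # Expected Output: '2 + 2 = 4'` line visible in the hunk header, the snippet plausibly continues along these lines; the sampling values and prompt are assumptions, not part of this diff:

```python
# Continuation sketch for the vLLM snippet above: `llm`, `tokenizer`, and
# SamplingParams come from the diff; the prompt and sampling values are assumed.
sampling_params = SamplingParams(max_tokens=128, temperature=0.01, top_k=100, top_p=0.95)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"},
]
# Render the chat as a plain prompt string for vLLM.
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

outputs = llm.generate([prompt], sampling_params)
response = outputs[0].outputs[0].text
print(response)  # Expected Output: '2 + 2 = 4'
```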