legolasyiu committed
Commit: 4c18874 • Parent: 342e3db
Update README.md

README.md CHANGED
````diff
@@ -95,7 +95,9 @@ Llama-3.1-Storm-8B is a powerful generalist model useful for diverse application
 4. 🚀 Ollama: `ollama run ajindal/llama3.1-storm:8b`
 
 
-
+---
+
+## 💻 How to Use the Model of EpistemeAI2's FireStorm-Llama-3.1-8B
 The Hugging Face `transformers` library loads the model in `bfloat16` by default. This is the type used by the [Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B) checkpoint, so it’s the recommended way to run to ensure the best results.
 
 ### Installation
@@ -160,7 +162,7 @@ print(response) # Expected Output: '2 + 2 = 4'
 ```python
 from vllm import LLM, SamplingParams
 from transformers import AutoTokenizer
-model_id = "
+model_id = "EpistemeAI2/FireStorm-Llama-3.1-8B"  # FP8 model: "EpistemeAI2/FireStorm-Llama-3.1-8B"
 num_gpus = 1
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 llm = LLM(model=model_id, tensor_parallel_size=num_gpus)
````
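The vLLM hunk above ends right after the `LLM` object is constructed. As a sketch of how generation would typically continue (the `build_prompt` helper below is hypothetical and not part of the README; it stands in for `tokenizer.apply_chat_template`, showing the standard Llama 3.1 chat-header format that FireStorm, as a Llama-3.1 fine-tune, would be expected to use):

```python
# Sketch: continuing the vLLM snippet from the diff above.
# build_prompt is a hypothetical stand-in for tokenizer.apply_chat_template,
# illustrating the Llama 3.1 chat format; assume model_id/llm as in the diff.

def build_prompt(system: str, user: str) -> str:
    """Render a single-turn conversation in the Llama 3.1 prompt format."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_prompt("You are a helpful assistant.", "What is 2 + 2?")

# With the objects from the snippet (requires a GPU and vLLM installed):
# sampling = SamplingParams(max_tokens=128, temperature=0.0)
# outputs = llm.generate([prompt], sampling)
# print(outputs[0].outputs[0].text)
```

In practice `tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)` produces this string from a list of role/content messages, which avoids hand-maintaining the special tokens.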