add low VRAM example
README.md
CHANGED
@@ -313,4 +313,32 @@ def generate(text):
 
 # Now you can simply call the generate function with an English text you want to translate:
 generate("I'm super excited about this Norwegian NORA model! Can it translate these sentences?")
+```
+
+## Example usage on a GPU with ~16GB VRAM (try for yourself [in Google Colab](https://colab.research.google.com/drive/1AQgJ8lN-SNOqkUKj4xpQI5rr0R7V2Xzy?usp=sharing))
+
+Install bitsandbytes and accelerate if you want to load the model in 8-bit:
+
+```bash
+pip install bitsandbytes
+pip install accelerate
+```
+
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+tokenizer = AutoTokenizer.from_pretrained(
+    "norallm/norbloom-7b-scratch"
+)
+
+# This setup needs about 8 GB of VRAM
+# Setting `load_in_8bit=False` -> about 15 GB of VRAM
+# Using `torch.float32` and `load_in_8bit=False` -> about 21 GB of VRAM
+model = AutoModelForCausalLM.from_pretrained(
+    "norallm/norbloom-7b-scratch",
+    device_map='auto',
+    load_in_8bit=True,
+    torch_dtype=torch.bfloat16
+)
 ```
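Once the model is loaded in 8-bit, generation works the same way as for the full-precision model. The following is a minimal sketch of a generation call against the quantized model loaded above; the prompt and decoding settings are illustrative and not part of the original README.

```python
# Illustrative smoke test for the 8-bit model loaded above.
# Prompt and decoding settings are example values, not from the model card.
prompt = "Norway is a country where"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=50,
        do_sample=False
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

To verify how much memory the quantized weights actually occupy, `model.get_memory_footprint()` reports the model's footprint in bytes after loading.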