royleibov committed
Commit 2e0bd6e
1 Parent(s): 5c653a0

Clarify this is a clone and correct the use of ZipNN

Files changed (1)
  1. README.md +28 -19
README.md CHANGED
@@ -11,6 +11,25 @@ language:
 - en
 base_model: ibm/granite-7b-base
 ---
+# Disclaimer and Requirements
+
+This model is a clone of [ibm-granite/granite-7b-instruct](https://huggingface.co/ibm-granite/granite-7b-instruct) compressed using ZipNN. Compressed losslessly to 67% of its original size, ZipNN saved ~5GB in storage and potentially ~30TB in data transfer **monthly**.
+
+## Requirement
+
+In order to use the model, ZipNN is necessary:
+```bash
+pip install zipnn
+```
+
+Then simply add the following at the beginning of the file:
+```python
+from zipnn import zipnn_hf_patch
+
+zipnn_hf_patch()
+```
+And continue as usual. The patch will take care of decompressing the model correctly and safely.
+
 # Model Card for Granite-7b-lab [Paper](https://arxiv.org/abs/2403.01081)
 
 ### Overview
@@ -78,39 +97,29 @@ Importantly, we use a set of hyper-parameters for training that are very differe
 - **Base model:** [ibm/granite-7b-base](https://huggingface.co/ibm/granite-7b-base)
 - **Teacher Model:** [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)
 
-## Usage
-This fork is compressed using ZipNN. To use the model, decompress the model tensors as discribed below and load the local weights.
-
-You need to [clone this repository](https://huggingface.co/royleibov/granite-7b-instruct-ZipNN-Compressed?clone=true) to decompress the model.
-
-Then:
-```bash
-cd granite-7b-instruct-ZipNN-Compressed
-```
-
-First decompress the model weights:
-```bash
-python3 zipnn_decompress_path.py --path .
-```
-
-Now just run the local version of the model.
 ### Use a pipeline as a high-level helper
 ```python
 from transformers import pipeline
+from zipnn import zipnn_hf_patch
+
+zipnn_hf_patch()
 
 messages = [
     {"role": "user", "content": "Who are you?"},
 ]
-pipe = pipeline("text-generation", model="PATH_TO_MODEL") # "." if in directory
+pipe = pipeline("text-generation", model="royleibov/granite-7b-instruct-ZipNN-Compressed")
 pipe(messages)
 ```
 
 ### Load model directly
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
+from zipnn import zipnn_hf_patch
+
+zipnn_hf_patch()
 
-tokenizer = AutoTokenizer.from_pretrained("PATH_TO_MODEL") # "." if in directory
-model = AutoModelForCausalLM.from_pretrained("PATH_TO_MODEL") # "." if in directory
+tokenizer = AutoTokenizer.from_pretrained("royleibov/granite-7b-instruct-ZipNN-Compressed")
+model = AutoModelForCausalLM.from_pretrained("royleibov/granite-7b-instruct-ZipNN-Compressed")
 ```
 
 ## Prompt Template
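
For readers who want to go one step past the snippets added in this commit, below is a minimal generation sketch. It is not part of the diff: the model ID simply reuses the repository name from the README, the chat-template call assumes the tokenizer defines a chat template (as the prompt-template section of the card suggests), and `max_new_tokens=128` is an illustrative value.
```python
# Minimal sketch: load the ZipNN-compressed checkpoint and generate one reply.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from zipnn import zipnn_hf_patch

zipnn_hf_patch()  # run before from_pretrained so the compressed weights are decompressed on load

model_id = "royleibov/granite-7b-instruct-ZipNN-Compressed"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "Who are you?"}]
# Assumes the tokenizer ships a chat template for the system/user/assistant format.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```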