royleibov/granite-7b-instruct-ZipNN-Compressed


royleibov committed on Sep 6

Commit ab5564a • 1 Parent(s): e640cef

Make README conform with ZipNN

Files changed (1)

README.md +17 -0


@@ -78,6 +78,23 @@ Importantly, we use a set of hyper-parameters for training that are very differe
 - **Base model:** [ibm/granite-7b-base](https://huggingface.co/ibm/granite-7b-base)
 - **Teacher Model:** [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)
+## Usage
+This fork is compressed using ZipNN. To use the model, decompress the model tensors as described below and load the local weights.
+You need to [clone this repository](https://huggingface.co/royleibov/Jamba-v0.1-ZipNN-Compressed?clone=true) to decompress the model.
+Then:
+```bash
+cd granite-7b-instruct-ZipNN-Compressed
+```
+First decompress the model weights:
+```bash
+python3 zipnn_decompress_path.py --path .
+```
+Now just run the local version of the model.
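
Note on the added Usage section: it ends at "run the local version" without showing that step. A minimal sketch of it, under stated assumptions, might look like the following. It assumes that compressed files carry a `.znn` suffix and that the decompressed checkout is a standard Transformers model directory; neither is stated in this commit. The helper names `remaining_compressed` and `load_local_model` are hypothetical and are not part of ZipNN or the repository.

```python
from pathlib import Path


def remaining_compressed(model_dir, suffix=".znn"):
    """Return files under model_dir that still carry the compressed suffix.

    The ".znn" suffix is an assumption about how ZipNN names its output.
    """
    return sorted(p for p in Path(model_dir).rglob(f"*{suffix}") if p.is_file())


def load_local_model(model_dir="."):
    """Load tokenizer and model from a decompressed local checkout."""
    leftovers = remaining_compressed(model_dir)
    if leftovers:
        raise RuntimeError(
            f"{len(leftovers)} file(s) still compressed, e.g. {leftovers[0]}; "
            "run the decompression step first"
        )
    # Imported lazily so the check above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)
    return tokenizer, model
```

Calling `load_local_model(".")` from inside the cloned directory would then fail early with a clear message if decompression was skipped, rather than failing later on missing weight files.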
 ## Prompt Template
 ```python