Commit 092838f by mobicham (1 parent: 1bfa8bd)

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -11,7 +11,7 @@ Quantizing small models at extreme low-bits is a challenging task. The purpose o
  We notice that, 1-bit quantization doesn't work well when applied directly on small models such as the Llama2-7B. However, when fine-tuned, the model's ouput significantly improves. In fact, the 1-bit base model outperforms Quip# 2-bit after fine-tuning on ~2.9K samples.
 
  Note that the weights here are unsigned 1-bit (0 or 1), <a href="https://arxiv.org/abs/2402.17764">not ternary like the recent 1.58-bit work </a>. This is a more challenging task since we lose the sign of the weights and only fine-tune a small fraction of the parameters (~94MB worth of weights).
- The dequantization step can be rewriten as a 1-bit matmul which could potential require only additions + a very low-rank matmul which is fast to compute.
+ The dequantization step can be rewriten as a 1-bit matmul which could potentially require only additions + a very low-rank matmul which is fast to compute.
 
  ## Datasets
  The adapter was trained via SFT on random subsets of the following:
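
Note on the changed line: the claim that dequantization can be folded into a 1-bit matmul plus a low-rank term can be checked with a small NumPy sketch. It assumes a simplified quantizer with one scale and one zero-point per output row, i.e. `W = scale * (W_q - zero)` with `W_q` in {0, 1}; the README does not state the actual grouping or parameter layout, and the function names below are hypothetical. Under that assumption, `W @ x` splits into a binary part (sums of selected entries of `x`, so additions only) and a rank-1 correction.

```python
import numpy as np


def dequantize(w_q, scale, zero):
    """Reference path: materialize W = scale * (W_q - zero), one scale/zero per row.

    Assumed layout (not taken from the README): w_q is (out, in) with values in {0, 1},
    scale and zero are (out,).
    """
    return scale[:, None] * (w_q.astype(np.float64) - zero[:, None])


def one_bit_matmul(w_q, scale, zero, x):
    """Compute W @ x without materializing W.

    x is (in, batch). The result is (out, batch).
    """
    # Binary part: for each output row, add up the entries of x where the bit is 1.
    # No multiplications by weights are needed here, only selections and additions.
    selected_sums = np.where(w_q[:, :, None] == 1, x[None, :, :], 0.0).sum(axis=1)

    # Rank-1 correction: (scale * zero) outer (column sums of x).
    column_sums = x.sum(axis=0)
    correction = np.outer(scale * zero, column_sums)

    # Per-row rescale of the binary part, minus the low-rank term.
    return scale[:, None] * selected_sums - correction


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_q = rng.integers(0, 2, size=(8, 16)).astype(np.uint8)
    scale = rng.uniform(0.5, 1.5, size=8)
    zero = rng.uniform(0.0, 1.0, size=8)
    x = rng.standard_normal((16, 4))

    reference = dequantize(w_q, scale, zero) @ x
    fused = one_bit_matmul(w_q, scale, zero, x)
    print("max abs difference:", np.abs(reference - fused).max())
```

With grouped scales and zero-points the correction term would have rank equal to the number of groups rather than 1, which is presumably what "very low-rank" refers to; that reading is an assumption.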