qwp4w3hyb/Meta-Llama-3-8B-Instruct-iMat-GGUF

qwp4w3hyb committed on Apr 29

Commit f1eb834 • 1 Parent(s): 33646ca

Update README.md

Files changed (1)

README.md +2 -2

@@ -17,11 +17,11 @@ license: other
 license_name: llama3
 license_link: LICENSE
 ---
-# Updated beta quants based on new fixed tokenizer, only works with llama.cpp branch gg/bpe-preprocess
+# Updated beta quants based on new fixed tokenizer, only works with in-development branch gg/bpe-preprocess
 # Quant Infos
-- Updated for latest bpe pre-tokenizer fixes
+- Updated for latest bpe pre-tokenizer fixes https://github.com/ggerganov/llama.cpp/pull/6920
 - quants done with an importance matrix for improved quantization loss
 - K & IQ quants in basically all variants from Q6_K down to IQ1_S
 - fixed end token for instruct mode (<|eot_id|>[128009])