Will quant this (and others) when all the llamacpp fixes are in
#1
opened by bartowski
Since there are still pending fixes (https://github.com/ggerganov/llama.cpp/pull/8676), I'll be holding off on most Llama 3.1 quants, just so you know :)
Appreciate the heads up my dude!
I made some experimental quants with the PR pulled in; the KoboldCpp frankenfork supports models quantized with it.
I ran the BABILong 32k qa2 dataset prompts on both the broken quant and the new one: the unfixed quant just repeats tokens, while the new one at least produces sane responses at Q5. I've updated the linked repo to point to those.
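For anyone who wants to do the same side-by-side sanity check, here's a minimal sketch using llama-cpp-python (not the exact setup used above; the GGUF filenames and the prompt placeholder are assumptions): load the old and the re-made quant, feed each the same long-context prompt, and eyeball whether the output degenerates into repeated tokens.

```python
# Rough sanity check: same prompt through the broken quant and the re-made one.
# Filenames below are placeholders, not the actual repo files.
from llama_cpp import Llama

PROMPT = "..."  # paste a BABILong 32k qa2 prompt here

def sample(model_path: str) -> str:
    llm = Llama(model_path=model_path, n_ctx=32768, verbose=False)
    out = llm(PROMPT, max_tokens=128, temperature=0.0)
    return out["choices"][0]["text"]

print("old quant:", sample("old-broken-Q5_K_M.gguf"))   # hypothetical filename
print("new quant:", sample("fixed-Q5_K_M.gguf"))        # hypothetical filename
```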