Problem Model
It's not broken for me; you need the latest llama.cpp.
What he said ^
And if you're still having issues, make sure flash attention is off
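To be concrete about turning it off: in llama.cpp flash attention is opt-in (the `-fa` flag), and if you're testing through the llama-cpp-python binding it's the `flash_attn` constructor argument. A minimal sketch assuming that binding; the model path is a placeholder, not a specific file from this repo:

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# flash_attn defaults to False; it is set explicitly here to rule it out
# while debugging. The model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-2-9b-it-Q8_0.gguf",  # placeholder path
    n_ctx=8192,        # context window used for testing
    flash_attn=False,  # keep flash attention off while debugging
)

out = llm("Hello, how are you?", max_tokens=32)
print(out["choices"][0]["text"])
```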
LM Studio updated llama.cpp yesterday and the problem remains even without flash attention.
Same here. Broken GGUFs.
I'm not seeing any issues either on LM Studio 0.2.27. Can you guys share your hardware and settings?
Tested on Windows with a 3070 and Linux with a 3090.
No issue here, and I even tried it in French for a full 8k-token chat.
Windows 10 with a 3090 Ti, llama.cpp + SillyTavern as a front end; also tried with koboldcpp (Q8 quant).
I tried this https://huggingface.co/legraphista/Gemma-2-9B-It-SPPO-Iter3-IMat-GGUF and it works fine, so I don't know what is wrong with these GGUFs. I am using the latest koboldcpp. For me, it doesn't even load the model; errors occur at load time.
That's very interesting, since that quant from @legraphista (tagged so you can consider updating) was made with a version of llama.cpp that has a broken Gemma 2 implementation, so your experience should only be better with mine. What's the error you get, @AndrewLockhart?
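If it helps with debugging, you could also dump the metadata keys of the working and failing files and diff them. A rough sketch, assuming the `gguf` package from llama.cpp's gguf-py; the filename is a placeholder:

```python
# Sketch using the gguf package from llama.cpp's gguf-py (pip install gguf).
# Prints the metadata keys of a GGUF file so two downloads can be compared.
from gguf import GGUFReader

reader = GGUFReader("gemma-2-9b-it-Q8_0.gguf")  # placeholder path
for name in reader.fields:
    print(name)
```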
Thanks for the tag, @bartowski. I'll hold off on updating until we understand why my broken version works in @AndrewLockhart's setup.
Yeah, good call; we need more details.
I re-downloaded one GGUF from this repo and it works fine; maybe I just had an old version of the file. So everything is fine now.
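In case anyone else hits the same stale-file problem, this is roughly how to force a fresh copy with huggingface_hub instead of trusting the local cache; the repo id and filename below are placeholders, not the actual repo:

```python
# Sketch using huggingface_hub (pip install huggingface_hub).
# force_download=True bypasses any previously cached copy of the file.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="someuser/Gemma-2-9B-It-GGUF",  # placeholder repo id
    filename="gemma-2-9b-it-Q8_0.gguf",     # placeholder filename
    force_download=True,  # re-fetch even if a cached version exists
)
print(path)
```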
Ah good, good, that's the best-case scenario haha
Thanks for letting us know! I'll be re-processing my repo to apply the fixes.