[Bug Report] <0x0A> is output instead of a newline
The literal string <0x0A> (the tokenizer's byte token for a newline) is output every time instead of an actual line break.
This redditor suggests there may be a problem with the tokenizer in the original non-GGUF model that carried over.
Are you using LM Studio? I saw something similar but I bet it has to do with the Preset you have chosen.
It happens using Ollama as well: the model is emitting the token for a newline, but it's being rendered as the literal string rather than a line break.
This could be worked around programmatically if you're using this model in a server environment with its outputs passed to a program; otherwise I'm not sure of another solution.
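For example, here is a minimal post-processing sketch in Python (the helper names and the regex covering other <0xNN> byte tokens are my own illustration, not part of any library):

import re

def fix_newlines(text: str) -> str:
    # The broken tokenizer emits the token's name ("<0x0A>") instead of
    # the byte it encodes, so map it back to a real newline.
    return text.replace("<0x0A>", "\n")

def fix_byte_tokens(text: str) -> str:
    # More general variant: map any literal <0xNN> token back to its
    # character. Fine for ASCII bytes like 0x0A; multi-byte UTF-8
    # sequences would need byte-level reassembly instead.
    return re.sub(r"<0x([0-9A-Fa-f]{2})>",
                  lambda m: chr(int(m.group(1), 16)), text)

print(fix_newlines("Hello<0x0A>World"))  # prints "Hello" and "World" on two lines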
I have the same bug. I tried a few versions of KoboldCpp, their KoboldLite front-end, and SillyTavern with different chat templates, but the bug persists. For me this is the best Mixtral model I've tried, even better than the 8x MoE models. It's really good at staying in character, speech style, etc.
same as above.
I have the same issue. I think this problem is caused by the original model (Mixtral 7Bx2 MoE) missing its tokenizer.model file.
Here is how I fixed it:
- clone the original model and grab the missing tokenizer.model from the base Mixtral repo:
git clone https://huggingface.co/cloudyu/Mixtral_7Bx2_MoE
cd Mixtral_7Bx2_MoE && curl -L -O https://huggingface.co/mistralai/Mixtral-8x7B-v0.1/resolve/main/tokenizer.model
- use llama.cpp to reconvert and requantize the model:
python convert.py ../Mixtral_7Bx2_MoE
./quantize ../Mixtral_7Bx2_MoE/ggml-model-f16.gguf ../Mixtral_7Bx2_MoE/ggml-model-q4_K_M.gguf q4_K_M
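As a quick sanity check after requantizing, you can verify that the byte token decodes correctly. This is a sketch assuming llama-cpp-python is installed; in the standard LLaMA/Mistral vocabulary, token id 13 is the <0x0A> byte token:

from llama_cpp import Llama

# vocab_only skips loading the weights, so this check is fast.
llm = Llama(model_path="../Mixtral_7Bx2_MoE/ggml-model-q4_K_M.gguf",
            vocab_only=True)

# A healthy tokenizer decodes the <0x0A> byte token to an actual
# newline byte rather than the literal string "<0x0A>".
assert llm.detokenize([13]) == b"\n"
print("tokenizer OK")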
I can't load this model with ctransformers.