GGUF adds `<0x0A>` during tokenization due to missing `tokenizer.model`
Hi there @TheBloke, first of all thanks a lot again for investing the time, effort and compute on quantizing our notus-7b-v1 models 🫶🏻
We just wanted to report that it seems the GGUF variants via llama.cpp are not properly encoding the `\n`, probably due to the missing `tokenizer.model` file, so we've ported it from Zephyr (as we share the same tokenizer) and it's already available at https://huggingface.co/argilla/notus-7b-v1/blob/main/tokenizer.model, in case you'd like to re-run the GGUF quantization.
We've tried to quantize it to GGUF with `Q4_K_M`, and it worked fine, see https://huggingface.co/alvarobartt/notus-7b-v1-GGUF. If you are not able to re-run it, we are happy to do so on our compute and then share the different GGUF files with you, thanks in advance 🤗
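For context on the symptom: the `<0x0A>` strings are SentencePiece byte-fallback tokens, where each `<0xNN>` stands for the raw byte with hex value `NN`, so `<0x0A>` is exactly the newline that should have been decoded. A minimal sketch of that mapping (the `decode_byte_fallback` helper is hypothetical, just to illustrate what the broken GGUF was emitting literally):

```python
import re

# SentencePiece byte-fallback tokens look like <0xNN>, where NN is the
# hexadecimal value of a single raw byte; <0x0A> is the newline byte.
BYTE_FALLBACK = re.compile(r"<0x([0-9A-Fa-f]{2})>")

def decode_byte_fallback(text: str) -> str:
    """Replace literal byte-fallback tokens with the bytes they encode."""
    return BYTE_FALLBACK.sub(lambda m: chr(int(m.group(1), 16)), text)

# A correctly-decoding tokenizer never surfaces these tokens as text:
print(decode_byte_fallback("Once upon a time<0x0A>in the Andes"))
```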
OK I will re-do it now
Thanks and sorry for the inconvenience! I quantized some at https://huggingface.co/alvarobartt/notus-7b-v1-GGUF, in case you want to reuse them
GGUFs have been re-made from the updated source repo and now appear fine:
<|system|>
You are a story writing assistant
<|user|>
Write a story about llamas
<|assistant|>
Once upon a time in the high Andes Mountains of South America, there lived a herd of llamas. These majestic creatures were known for their strong and sturdy bodies, long necks, and fluffy brown coats. They spent their days grazing on the lush grassy fields and roaming around the rocky terrain.
Amongst this herd was a young female llama named Luna. She had just turned one year old and was eager to explore the world beyond her home. With the guidance of her mother, Luna learned how to navigate through the mountain paths, recognize different plants for food and water sources.
As Luna grew older, she became more independent and started venturing out further each day. One afternoon, she came across a group of llamas from a nearby village who were traveling back home after trading goods. The leader of the group was an experienced llama named Tariq, who warmly welcomed Luna into their travel party.
Together, they traversed through rugged mountain passes and crossed rushing rivers. Luna learned valuable lessons about trust, respect, and teamwork from her new friends. She also discovered that llamas were not just for transportation, but were skilled in carrying heavy loads and weaving woolen fabrics.
As the days turned into weeks, Luna became more attached to Tariq's herd. However, she knew deep down that it was time for her to return home to her mother and siblings. With a heavy heart, Luna said goodbye to her friends and set off on her journey back home.
On her way back, Luna encountered some unexpected challenges, such as strong winds, treacherous cliffs, and wild animals. But with the determination she had learned from Tariq's herd, Luna overcame these obstacles and found her way back home safely.
From that day on, Luna never forgot the valuable lessons she had learned about friendship, trust, and adventure. And as she grew older and became a mother herself, she passed down these stories and traditions to her own offspring, ensuring that the legacy of llamas would continue for generations to come. [end of text]
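For anyone reproducing the test above: the prompt follows the Zephyr-style chat template that notus-7b-v1 shares with its base tokenizer. A minimal sketch of building it, assuming the plain-newline layout shown in the transcript (the `format_prompt` helper is hypothetical):

```python
def format_prompt(system: str, user: str) -> str:
    # Zephyr-style template as it appears in the transcript above; the
    # final <|assistant|> tag is left open so the model continues from it.
    return (
        f"<|system|>\n{system}\n"
        f"<|user|>\n{user}\n"
        f"<|assistant|>\n"
    )

print(format_prompt("You are a story writing assistant",
                    "Write a story about llamas"))
```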
Apologies for not spotting this and thanks for updating your repo.
I've raised an issue with llama.cpp regarding this `<0x0A>` problem. It's caused by the llama.cpp PR for making GGUFs from `tokenizer.json` when no `tokenizer.model` is provided.
No worries at all @TheBloke, it was on us for not realising! Thanks a ton for fixing it straight away. It's also not super easy to get the `tokenizer.model` unless it's available as part of the base model, because then the default is the fast, Rust-based tokenizer, and there's no snippet to go from fast to slow, while the reverse can easily be done. I think it has some issues attached to it, but it may be worth investigating in llama.cpp whether the Rust-based one from `tokenizer_config.json` can be used instead. Anyway, thanks a ton 🎉
Thanks for figuring this out @alvarobartt and for the quick fix @TheBloke. Just tested the updated notus GGUF and it works great.
Just in case you aren't aware, this issue is also impacting:
https://huggingface.co/TheBloke/OpenHermes-2.5-neural-chat-7B-v3-2-7B-GGUF