This model cannot be used normally

#1 opened by hyunfzen

llama.cpp error:
libc++abi: terminating due to uncaught exception of type std::out_of_range: unordered_map::at: key not found

ah shit yeah I see the same. Investigating

I use it in the web UI, and I got the error shown in the attached screenshot (image.png).

Hope that it is useful.

Same here too...

I'm remaking all the GGUFs using an updated llama.cpp convert.py which I hope will solve the crashing issue in llama.cpp with the 6.7B model. At least it hasn't crashed for me yet.

Regarding the "byte not found in vocab", I think that's an issue in the client used by WebUI, and is not something I can solve. I can only confirm the models work with llama.cpp. After that it requires the developers of the various clients to make sure they're up-to-date with llama.cpp.

I think in that case it's because llama-cpp-python, which textgen-webui uses, has not been updated to support BPE vocab, which was recently worked on in llama.cpp.
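
For anyone who wants to test outside the WebUI in the meantime, loading the GGUF directly with llama-cpp-python looks roughly like this once a build with BPE vocab support is installed (the file path, n_ctx and the prompt below are just placeholders, not anything specific to this repo):

from llama_cpp import Llama

# Placeholder path - point it at wherever you downloaded the GGUF
llm = Llama(
    model_path="./deepseek-coder-6.7b-instruct.Q5_K_M.gguf",
    n_ctx=4096,      # context length; lower it if you run short on RAM
    n_gpu_layers=0,  # 0 = CPU only; raise it if your build has GPU/Metal support
)

out = llm("Write a Python function that reverses a string.", max_tokens=256)
print(out["choices"][0]["text"])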

will be really nice once all these libraries and formats stabilize. moving targets suck.

There was also an issue with the EOS token on all the Instruct models. The source models had it set incorrectly, so the GGUFs would not stop generating.

I've reported that to DeepSeek, they've fixed it, and I am now re-re-re-making the three Instruct model GGUFs, which are in the process of uploading now.
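
Until the fixed uploads land, a client-side stop string can work around the runaway generation from the earlier files. A rough llama-cpp-python sketch - the "<|EOT|>" string is an assumption based on DeepSeek's instruct format, so check the model card for the exact token:

from llama_cpp import Llama

llm = Llama(model_path="./deepseek-coder-6.7b-instruct.Q5_K_M.gguf")  # placeholder path

# stop= cuts generation off at the given strings even if the GGUF's EOS metadata is wrong
out = llm("Write a quicksort function in Python.", max_tokens=512, stop=["<|EOT|>"])
print(out["choices"][0]["text"])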

Thank you for doing this! I'm excited to try deepseek.

I have this model (deepseek-coder-6.7b-instruct.Q5_K_S.gguf) working in TG WebUI on an M1 Mac (Apple Silicon) on CPU.
I don't get the "byte not found in vocab" error, which is very curious.
However, I can't get it working on a Mac M2 Max on CPU - there I do get "byte not found in vocab". Strange.

I did however have to change n_ctx = 16380 on the M1 Mac for it to work.
Not sure why that is, but going to explore.

any update?

It appears to be my version of llama.cpp.
But even when I install that same module version it doesn't work in another environment, so there's something about the way I built this one!!
I copied this module in and then I don't get the errors! Any ideas how else to confirm?

llama-cpp-python 0.2.13 pypi_0 pypi

% pip show llama-cpp-python
Name: llama_cpp_python
Version: 0.2.13
Summary: Python bindings for the llama.cpp library
Home-page:
Author:
Author-email: Andrei Betlen [email protected]
License: MIT
Location: /Users/gm/miniconda3/envs/textgen/lib/python3.11/site-packages
Requires: diskcache, numpy, typing-extensions

This works for me on Mac, but only in CPU mode ("n-gpu-layers=0"):

pip uninstall llama-cpp-python -y
CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python==0.2.13 --no-cache-dir
pip install 'llama-cpp-python[server]==0.2.13'

Thanks for reporting. Looks like llama-cpp-python updated yesterday and now has support for models like this, with BPE vocab.

Sometimes patience is a virtue :)

Sometimes llama.cpp gets broken (even the latest update a few days ago was broken).

Upgraded llama_cpp_python to 0.2.14, and now it is working with CPU in Oobabooga on Linux:

./cmd_linux.sh
pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
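
In case it helps anyone else, a quick sanity check that the upgrade actually landed inside the textgen environment (assuming that env is the one currently active):

import llama_cpp
print(llama_cpp.__version__)  # should print 0.2.14 after the force-reinstall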

That also works for CPU only. At least it did for me. I just did the force upgrade to 0.2.14 (after I stopped my service, started the conda env manually... bla bla bla... my setup is a bit different).

...or at least it made it load; I still haven't tested how well it works. It just loaded and spat out some text, so I know it "worked".

Thanks Mr Bloke for all your hard work!
Just tested with ctransformers:
ERROR: byte not found in vocab: '

Code I used:
from ctransformers import AutoModelForCausalLM
llm = AutoModelForCausalLM.from_pretrained(local_model, model_file="deepseek-coder-6.7b-instruct.Q4_K_M.gguf", model_type="deepseek", gpu_layers=50, local_files_only=True)

any ideas or tips to get it working?

ctransformers is not good at all; for some models it breaks, and for most the outputs are just weird and not acceptable. Try llama_cpp_python instead.
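
For comparison, a rough llama-cpp-python equivalent of that ctransformers call would be something like the sketch below (untested; the path is a placeholder, n_gpu_layers plays the role of gpu_layers, and there is no model_type argument):

from llama_cpp import Llama

llm = Llama(
    model_path="./deepseek-coder-6.7b-instruct.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=50,  # offload layers to the GPU, same idea as ctransformers' gpu_layers
)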

I just downloaded "deepseek-coder-6.7b-instruct.Q5_K_M.gguf" and tested it with the current version of llama.cpp (the C/C++ variant) - and it works like a charm
