Multi-part model
How do I use a multi-part model?
DeepSeek-V2.5-IQ1_M-00001-of-00002.gguf
DeepSeek-V2.5-IQ1_M-00002-of-00002.gguf
I could not get it to run on Colab.
You just have to load the first part; any llama.cpp tool will then look for and load the second one automatically:
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="bartowski/DeepSeek-V2.5-GGUF",
    filename="DeepSeek-V2.5-IQ1_M/DeepSeek-V2.5-IQ1_M-00001-of-00002.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

Should I just run this code? Will it download the two parts by itself and run them without me merging them first?
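If it helps to make the two-part download explicit, here is a minimal sketch (assuming only the standard huggingface_hub API; the paths mirror the repo above) that fetches both shards first and then loads the first one, letting llama.cpp pick up the second from the matching filename:

from huggingface_hub import hf_hub_download
from llama_cpp import Llama

repo_id = "bartowski/DeepSeek-V2.5-GGUF"
parts = [
    "DeepSeek-V2.5-IQ1_M/DeepSeek-V2.5-IQ1_M-00001-of-00002.gguf",
    "DeepSeek-V2.5-IQ1_M/DeepSeek-V2.5-IQ1_M-00002-of-00002.gguf",
]

# hf_hub_download returns the local path of each file; both shards
# end up side by side in the same cache directory.
paths = [hf_hub_download(repo_id=repo_id, filename=p) for p in parts]

# Load the first shard only; llama.cpp discovers -00002-of-00002 next to it.
llm = Llama(model_path=paths[0])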
!./llama-gguf-split --merge DeepSeek-V2.5-IQ1_M-00001-of-00002.gguf DeepSeek-V2.5-IQ1_M-00002-of-00002.gguf DeepSeek-V2.5-IQ1_M.gguf
Every time I merge, the second part is deleted.
I think you're meant to pass only the first part, like this:

./llama-gguf-split --merge DeepSeek-V2.5-IQ1_M-00001-of-00002.gguf DeepSeek-V2.5-IQ1_M.gguf

The --merge mode takes an input shard and an output path, and it finds the remaining -0000N-of-00002 shards from the filename pattern. In your command the second shard sat in the output position, which is why it kept getting overwritten.
I would like to add that the files downloaded with

huggingface-cli download bartowski/DeepSeek-V2.5-GGUF --include "DeepSeek-V2.5-Q8_0/*" --local-dir ./

are symlinks into the Hugging Face cache, and the merge command didn't work with symlinks. I copied the actual files into a directory, and the merge then worked fine.
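If you'd rather skip the manual copy, a minimal sketch (assuming the standard huggingface_hub Python API; recent versions write real files for local_dir downloads by default, and older versions accept local_dir_use_symlinks=False) is:

from huggingface_hub import snapshot_download

# Download the Q8_0 shards as real files into the current directory,
# so llama-gguf-split can merge them without hitting cache symlinks.
snapshot_download(
    repo_id="bartowski/DeepSeek-V2.5-GGUF",
    allow_patterns=["DeepSeek-V2.5-Q8_0/*"],
    local_dir="./",
)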