multi-part model

#2
by goodasdgood

How do I use a multi-part model?

DeepSeek-V2.5-IQ1_M-00001-of-00002.gguf

DeepSeek-V2.5-IQ1_M-00002-of-00002.gguf

It does not run on Colab.

You just have to load the first part; any llama.cpp tool will then look for and load the second automatically:

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="bartowski/DeepSeek-V2.5-GGUF",
    filename="DeepSeek-V2.5-IQ1_M/DeepSeek-V2.5-IQ1_M-00001-of-00002.gguf",
)

llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
)

Should I use this code? Will it download the two parts by itself and run them, without me merging them together?
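
For reference, here is a minimal sketch of doing the download step explicitly with huggingface_hub instead of Llama.from_pretrained, assuming the split filenames above (the parts and paths variables are just illustrative names):

from huggingface_hub import hf_hub_download
from llama_cpp import Llama

repo_id = "bartowski/DeepSeek-V2.5-GGUF"
parts = [
    "DeepSeek-V2.5-IQ1_M/DeepSeek-V2.5-IQ1_M-00001-of-00002.gguf",
    "DeepSeek-V2.5-IQ1_M/DeepSeek-V2.5-IQ1_M-00002-of-00002.gguf",
]

# Download both splits; they land side by side in the same cache snapshot.
paths = [hf_hub_download(repo_id=repo_id, filename=p) for p in parts]

# Point llama.cpp at the first part only; it locates -00002-of-00002 by name.
llm = Llama(model_path=paths[0])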

I tried merging the parts like this:

!./llama-gguf-split --merge DeepSeek-V2.5-IQ1_M-00001-of-00002.gguf DeepSeek-V2.5-IQ1_M-00002-of-00002.gguf DeepSeek-V2.5-IQ1_M.gguf

Every time I merge, the second part is deleted.

I think you're meant to pass only the first part; the second argument is the output file:

./llama-gguf-split --merge DeepSeek-V2.5-IQ1_M-00001-of-00002.gguf DeepSeek-V2.5-IQ1_M.gguf

In your command, the second split's filename was in the output position, so the merge was presumably overwriting it. That would explain why it kept disappearing.
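
For completeness, a small sketch of running the same merge from Python (for example in a Colab cell), assuming the llama-gguf-split binary sits in the current directory:

import subprocess

# Merge the splits; the arguments are the first split (input) and the output file.
subprocess.run(
    [
        "./llama-gguf-split", "--merge",
        "DeepSeek-V2.5-IQ1_M-00001-of-00002.gguf",  # first split; the tool finds part 2 itself
        "DeepSeek-V2.5-IQ1_M.gguf",                 # merged output file
    ],
    check=True,
)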

I would like to add that the files downloaded with

huggingface-cli download bartowski/DeepSeek-V2.5-GGUF --include "DeepSeek-V2.5-Q8_0/*" --local-dir ./

are symlinks into the Hugging Face cache, and the merge command didn't work with symlinks. I copied the actual files into a directory, and the merge then worked fine.
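
A possible workaround, sketched below under the assumption of a reasonably recent huggingface_hub (recent versions write real file copies to local_dir by default; older ones exposed a local_dir_use_symlinks option for the same effect): download the quant directory from Python, then merge in place.

from huggingface_hub import snapshot_download

# Download the whole Q8_0 quant folder as regular files under ./ so that
# llama-gguf-split can read them directly (no cache symlinks, assuming a
# recent huggingface_hub where local_dir downloads are real file copies).
snapshot_download(
    repo_id="bartowski/DeepSeek-V2.5-GGUF",
    allow_patterns=["DeepSeek-V2.5-Q8_0/*"],
    local_dir="./",
)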
