Please add to llama.cpp and ollama
#21 by KeilahElla - opened
As the title says: it would be great to use this with ollama/llama.cpp, which are usually much faster than transformers.
Not sure that claim is super accurate :) `torch.compile` can get you pretty far with transformers.
It would be more convenient to use in Ollama. Requesting Ollama support.
@ArthurZ you are right, but what about when we run it on CPU? Maybe llama.cpp works very well there. What do you think?