Is it possible for you to distill this model to 3B?
First of all, I'm very impressed with this model. Unlike the previous one, it follows translation instructions, and its overall scores across several languages are very impressive. It's the first model that can really be used for translation tasks.
Unfortunately, people like me have weak GPUs (in my case, an M1 Mac with only 8GB). I can't even run this model in a Q4 quant on my machine. The only way to try it out was to run it on my remote server on CPU.
Most developers who release their models underestimate how many people have weak or even terrible GPUs. Meanwhile, the rest of us need these models for everyday use (especially personal, private use).
I know that training another 3B model is a challenge. Yet distilling this excellent 8B model is nowhere near as big an effort as training one from scratch.
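For anyone unfamiliar with what's being requested: the core of knowledge distillation is just training the small "student" to match the temperature-softened output distribution of the large "teacher". The toy sketch below illustrates that loss with plain Python on made-up logits; all names, sizes, and values are illustrative, not anything from this model's actual training setup.

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over temperature-scaled logits.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions.

    The T^2 factor follows the classic distillation recipe and keeps
    gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return temperature ** 2 * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Made-up logits standing in for one vocabulary distribution from each model.
random.seed(0)
teacher_logits = [random.gauss(0, 1) for _ in range(8)]
student_logits = [random.gauss(0, 1) for _ in range(8)]

print(kd_loss(teacher_logits, student_logits))  # positive; 0 only if the two match
```

In a real 8B-to-3B run this loss would be computed on the models' actual logits over a large corpus (often mixed with the usual cross-entropy on the training labels), which is far cheaper than pretraining a 3B model from nothing.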
Could you please distill this 8B model to 3B so that we could run it quantized on a CPU or a weak GPU?
Thank you!
Hey @alexcardo! We're happy you find the model useful, and thanks for the suggestion. At this time we don't have plans to create a distilled version of this model, but we'll let you know if that changes in the future!