Why does this model only have 1.27B params?
#1
by
boy977
- opened
@boy977 Hi, this model is the same as the original Vistral, without any modification. The only changes are that this version is converted for the Apple MLX backend (you need an Apple Silicon Mac, M1/M2/M3, to run it) and quantized to 4 bits. I think the number being shown is a bug on HuggingFace's side that they have not been able to fix yet. Anyway, the model has 7B parameters in total.
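One possible reason for the smaller number, offered as a guess and not as something HuggingFace has confirmed: a parameter counter that simply counts stored tensor elements will under-report a 4-bit checkpoint, because several weights are packed into each stored element. The sketch below is plain arithmetic under assumed settings (4-bit weights packed into 32-bit words, groups of 64 weights, one scale and one bias stored per group). The function name `stored_elements` and its defaults are hypothetical and are not read from this model's files.

```python
def stored_elements(num_weights, bits=4, group_size=64, word_bits=32):
    """Estimate how many tensor elements a packed, quantized checkpoint stores.

    All defaults are assumptions for illustration, not values taken
    from this repository.
    """
    weights_per_word = word_bits // bits                 # 8 weights per 32-bit word at 4 bits
    packed_words = -(-num_weights // weights_per_word)   # ceiling division
    groups = -(-num_weights // group_size)               # ceiling division
    # Each group is assumed to carry one scale and one bias element.
    return packed_words + 2 * groups


estimate = stored_elements(7_000_000_000)
print(f"{estimate / 1e9:.2f}B stored elements")  # prints "1.09B stored elements"
```

Under these assumptions a 7B-weight model stores roughly 1.09B elements. Any layers kept unquantized would push the count higher, so a figure in the low billions is consistent with a 7B model, though the exact displayed number depends on details not given here.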
qnguyen3
changed discussion status to
closed