Must be a quant

#2
by vanakema - opened

This model is the same size as the 4-bit quantization. Are you sure this isn't a 4-bit quantization that's just being "expanded" (dequantized) to bf16 at runtime? The pretrained model mlx-community/Meta-Llama-3.1-70B-bf16 is several times larger than this instruct model.
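A back-of-the-envelope estimate supports the size comparison. The sketch below assumes a round 70B parameter count and an effective ~4.5 bits per weight for a 4-bit quantization (the extra half bit is a rough allowance for group scales, an assumption, not a measured figure); ancillary tensors like embeddings are ignored.

```python
NUM_PARAMS = 70e9  # Llama 3.1 70B, approximate parameter count


def model_size_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate on-disk weight size in GiB for a given precision."""
    return num_params * bits_per_param / 8 / 1024**3


bf16_size = model_size_gib(NUM_PARAMS, 16)   # full bf16 weights
q4_size = model_size_gib(NUM_PARAMS, 4.5)    # 4-bit weights + scale overhead

print(f"bf16: ~{bf16_size:.0f} GiB, 4-bit: ~{q4_size:.0f} GiB")
```

So a genuine bf16 checkpoint should be roughly 3-4x the size of a 4-bit one; a repo labeled bf16 that matches the 4-bit size is suspicious.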