Must be a quant
#2 by vanakema
This model is the same size as the 4-bit quantization. Are you sure it isn't a 4-bit quantization that's just being "expanded" to bf16 at runtime? The pretrained bf16 model mlx-community/Meta-Llama-3.1-70B-bf16 is several times larger than this instruct model.
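The size gap can be sanity-checked with rough arithmetic (a sketch assuming roughly 70B parameters, and ignoring quantization scales, embeddings, and other metadata):

```python
# Back-of-the-envelope expected on-disk sizes for a ~70B-parameter model.
params = 70e9

bf16_bytes = params * 2    # bf16 stores 2 bytes per weight
q4_bytes = params * 0.5    # 4-bit stores ~0.5 bytes per weight (excl. scales)

print(f"bf16:  ~{bf16_bytes / 1e9:.0f} GB")   # on the order of 140 GB
print(f"4-bit: ~{q4_bytes / 1e9:.0f} GB")     # on the order of 35 GB
```

A genuine bf16 checkpoint should therefore be roughly 4x the size of a 4-bit one, so matching file sizes are a strong hint that the weights are quantized.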