How is it used?
When I try to enter the test, an error appears.
@Codigoabierto It isn't a test for inference.
Will they put one up so we can test how it works?
The model has 314B parameters. I don't think so yet, as this is still a raw model. It needs to be quantised, since it's around 228 GB.
If someone creates a 4-bit quantized version of the 314B model, it would be around ~60 GB, which means it would still need at least that much VRAM, right?
Considering that you need about 30-40 GB of VRAM to run inference with a 70B model.
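A quick sanity check on these numbers: a common back-of-envelope estimate for quantized weight size is parameters × bits ÷ 8. This is a rough sketch only (the function name is made up for illustration); it counts weights alone and ignores the KV cache, activations, and runtime overhead, which all add to real VRAM usage, and it treats the model as dense.

```python
def quantized_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Naive weights-only size estimate: params * bits / 8 bits-per-byte, in GB.

    Ignores KV cache, activations, and framework overhead, so actual
    VRAM needed at inference time will be higher.
    """
    return n_params * bits_per_weight / 8 / 1e9

# 314B parameters at 4 bits per weight (weights alone)
print(quantized_size_gb(314e9, 4))  # 157.0 GB

# 70B at 4 bits per weight, for comparison
print(quantized_size_gb(70e9, 4))   # 35.0 GB
```

By this naive dense estimate the 4-bit weights of a 314B model come out well above 60 GB, which fits the reply below that even more VRAM would likely be needed.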
Yes, probably, and likely even more VRAM than that.