Will there be an FP8 version later?
#6 by ZealotTt · opened
This is a very good project, thank you for your efforts! Will there be a distilled FP8 version of the model later on?
The model can already be loaded in FP8 or int8 using quanto, as described in the inference section of README.md. There is also a Q5 quantized community model. I personally won't be releasing GGUF versions, but if someone else wants to convert them, I'd welcome it.
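
For anyone landing here, a minimal sketch of what FP8 loading with quanto can look like. The repo id and the `transformer` attribute are placeholders on my part (some pipelines expose a `unet` instead), so defer to the README's inference section for the exact snippet:

```python
# A minimal sketch, assuming a diffusers-style pipeline.
import torch
from diffusers import DiffusionPipeline
from optimum.quanto import freeze, qfloat8, quantize

pipe = DiffusionPipeline.from_pretrained(
    "org/model-name",            # placeholder repo id, not the real one
    torch_dtype=torch.bfloat16,
)

# Quantize the denoiser weights to FP8 (swap qfloat8 for qint8 to use int8).
quantize(pipe.transformer, weights=qfloat8)
freeze(pipe.transformer)  # materialize the quantized weights

pipe.to("cuda")
```

Quantizing on load this way trades a bit of setup time for roughly halved weight memory, which is usually why people ask for an FP8 checkpoint in the first place.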