fp16 or bf16 version?
#6
by xiangli · opened
Hi, is there a float16 or bfloat16 version? The fp32 model takes too much memory, and the code is customized specifically for fp32, so it is not easy to run inference in fp16 or bf16.
We have adjusted the code to work with bfloat16, although note that I have seen this change the model's output a bit.
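The small output change is expected: bfloat16 keeps fp32's 8-bit exponent but truncates the mantissa from 23 bits to 7, so most weights are rounded slightly. A minimal sketch of that truncation in pure Python (the `to_bf16` helper is illustrative, not part of this repo's code):

```python
import struct

def to_bf16(x: float) -> float:
    # bfloat16 is the top 16 bits of an IEEE-754 float32:
    # zeroing the low 16 mantissa bits emulates the precision loss.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

print(to_bf16(1.0))  # exactly representable: 1.0
print(to_bf16(0.1))  # rounded: 0.099609375
```

Accumulated over billions of weights, these per-value rounding errors are what shifts the model's output slightly.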
What are the VRAM requirements for this model in fp32 as well as bf16? I am already blown away by the 7B but curious to interact with the 72B.
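A rough back-of-the-envelope estimate for the weights alone (4 bytes per parameter in fp32, 2 in bf16; actual usage is higher once activations and the KV cache are included):

```python
def weight_gib(params_billion: float, bytes_per_param: int) -> float:
    # Weight-only memory footprint in GiB; runtime overhead not included.
    return params_billion * 1e9 * bytes_per_param / 2**30

for n in (7, 72):
    for dtype, nbytes in (("fp32", 4), ("bf16", 2)):
        print(f"{n}B {dtype}: ~{weight_gib(n, nbytes):.0f} GiB")
```

This puts the 7B at roughly 26 GiB in fp32 vs 13 GiB in bf16, and the 72B at roughly 268 GiB vs 134 GiB, which is why a bf16 variant matters so much for the larger model.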