fit in 24gb?
#1 by apol - opened
Any plans to release a 2b or 1b version that is functional and can fit into a 4090?
Congrats on the release and bravo!
A 4090 has 24GB of VRAM. Assuming you want to offload the whole model to the GPU, you can fit an IQ2_XS GGUF or a 2.24bpw EXL2 quant of this model into your 4090 with the 8K native context of Llama 3.
If you only want to run full precision models, you can easily fit a Llama 3 8B model into your VRAM. No need to go all the way down to 2B.
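For a rough sanity check on those numbers, here is a back-of-envelope VRAM estimate. It is not from the original reply: it assumes a 70B-parameter Llama-3-style model (80 layers, 8 KV heads, head dim 128), an 8B model with 32 layers, ~2.31 bits per weight for IQ2_XS, and an fp16 KV cache at 8K context.

```python
# Back-of-envelope VRAM estimate: quantized weights + KV cache.
# All model configs below are assumptions for illustration, not confirmed specs.

def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB at a given bits-per-weight."""
    return n_params * bits_per_weight / 8 / 1024**3

def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, bytes_per_elem: int = 2) -> float:
    """KV cache: 2 (K and V) * layers * kv_heads * head_dim * context * bytes."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem / 1024**3

ctx = 8192  # Llama 3 native context

# Assumed 70B config: 80 layers, 8 KV heads, head dim 128
for label, bpw in [("70B IQ2_XS (~2.31 bpw)", 2.31), ("70B EXL2 2.24 bpw", 2.24)]:
    total = weights_gib(70e9, bpw) + kv_cache_gib(80, 8, 128, ctx)
    print(f"{label}: ~{total:.1f} GiB")  # roughly 21 GiB, under 24GB

# Assumed 8B config: 32 layers, 8 KV heads, head dim 128, fp16 weights
total_8b = weights_gib(8e9, 16) + kv_cache_gib(32, 8, 128, ctx)
print(f"8B fp16: ~{total_8b:.1f} GiB")  # roughly 16 GiB, also under 24GB
```

Under those assumptions the 2-bit quants of the large model land around 21 GiB and an fp16 8B around 16 GiB, both leaving some headroom in 24GB for activations and inference overhead.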
thanks!
apol changed discussion status to closed