Not able to use it with TGI
1
#5 opened 3 months ago by Alokgupta96
Does this model only work on CUDA devices with compute capability >= 9.0 or 8.9/ROCm MI300+?
1
#4 opened 3 months ago by jcfasi
How to run fast inference with FP8
1
#2 opened 4 months ago by CCRss