request for model card: benchmarks for each GGUF quant to explore size-vs-quality tradeoff
Thank you for releasing this model as GGUF. Releasing models at common quant depths makes it possible to "right-size" the model for various applications. I think adding benchmark results for each quant level to the model card would further improve the GGUF release.
Currently, Hugging Face makes it easy to see the file size of each quant, which is important for picking one. Benchmark results are also commonly published, but usually only for the full-precision model. The two are rarely combined, so it is hard to compare quant depths on benchmark quality. Per-quant benchmark results would give a better understanding of the size-vs-quality tradeoff, making GGUF quants even more useful.
I understand that adding per-quant benchmarks to the model card is non-trivial and might be a big ask. Speaking for myself, though, I know I would make use of this information.