Independent evaluation results

by yaronr - opened Sep 26

Sep 26

Dear Qwen team,

I'm pleased to share our independent evaluation of the model using our implementation of the MMLU-Pro benchmark.
The results demonstrate impressive performance for the model across multiple categories compared with other models.
I hope you find this useful.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment