Please submit to HF Leaderboard

I recently read a comparison of LLMs, and this one (the Q2_K quant) sits right at the top, next to GPT-4.
Comparison is here: https://www.reddit.com/r/LocalLLaMA/comments/17vcr9d/llm_comparisontest_2x_34b_yi_dolphin_nous/

Impressive as that may sound, it's still a home-made test crafted with a particular purpose in mind, not an all-round benchmark. I'd love to see this model compared to others on the Hugging Face Open LLM Leaderboard: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

Thanks!

I submitted the model for evaluation a few days ago. Checked back now and it isn't in the pending section, and there's no failure report either. Did they remove it?

Well, I added it again.

It's just too good

Update: It's still not appearing in the pending, evaluating, or finished categories. When I try to re-submit, it says it's already submitted. No clue what's happening.

The elites don't want regular people to know about the infinite POWER of frankenmerges!

Hi!
For this kind of problem, it would be best to open an issue on the leaderboard's discussions next time, so we can give you a hand :)
We have an FAQ: if a model stops appearing in the queues, it usually means that evaluating it failed.

I checked for your specific model, and that's indeed what happened (a connection problem while downloading the weights; it happens sometimes).
It's been relaunched, and we'll see if it fits in memory. If it does, you'll get results in a couple of days.

Is it solved? I still cannot find goliath-120b on the leaderboard. Really looking forward to seeing its results.

@clefourrier @HuggingFaceH4 WHERE ARE THE RESULTS?

Hi @silver !

No, this model does not fit in the memory of our GPUs; we'll need to adapt our backend to run multi-node evaluations automatically before we can evaluate it.
The best thing would be to open an issue on the Open LLM Leaderboard page directly, so we can track this and make sure it stays in our backlog.

Btw, as indicated in the FAQ, you can follow the status of your models by looking at their request files.
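For anyone who wants to check programmatically rather than browsing the hub UI, here is a minimal sketch using `huggingface_hub`. It assumes the leaderboard's request files live in a dataset repo named `open-llm-leaderboard/requests` and that each request file is a JSON with a `status` field; both of those details are assumptions, so adjust the repo id and model id to whatever the FAQ points you to.

```python
# Minimal sketch: look up a model's leaderboard request file(s) and print their status.
# Assumptions: the requests live in the "open-llm-leaderboard/requests" dataset repo,
# request files are JSON named after the model id, and they contain a "status" field.
import json

from huggingface_hub import hf_hub_download, list_repo_files

REQUESTS_REPO = "open-llm-leaderboard/requests"  # assumed repo id for request files
MODEL_ID = "alpindale/goliath-120b"              # model whose evaluation status we want

# Find every request file that belongs to this model (there can be one per precision).
matching_files = [
    f
    for f in list_repo_files(REQUESTS_REPO, repo_type="dataset")
    if f.startswith(MODEL_ID) and f.endswith(".json")
]

for filename in matching_files:
    # Download the request file and read its recorded status
    # (e.g. PENDING, RUNNING, FINISHED, FAILED).
    path = hf_hub_download(REQUESTS_REPO, filename, repo_type="dataset")
    with open(path) as fh:
        request = json.load(fh)
    print(filename, "->", request.get("status"))
```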
