open-llm-leaderboard/open_llm_leaderboard · Announcement: Flagging merged models with incorrect metadata

Open LLM Leaderboard org Jan 3

Hi!
As some users removed the merge tag from their model's metadata to appear in the main view of the leaderboard, we are adding a mechanism to automatically flag all the models identified as merges where the metadata is incorrect.

If your model is a merge and you want to remove its flag, you just need to add the following in its model card.

tags:
- merge

The leaderboard is rebuilt every hour, and re-reads this info each time.
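
For readers who want to check their own model card before the next rebuild, here is a minimal sketch of such a check. It assumes the card starts with YAML front matter delimited by `---` lines and that tags are written as a block list, as in the snippet above. The function names are hypothetical and this is not the leaderboard's actual code.

```python
def extract_tags(readme_text: str) -> list[str]:
    """Return the entries of the `tags:` block list in a model card's front matter.

    Assumption: tags use the block form shown above ("tags:" followed by "- item"
    lines). Inline lists such as `tags: [merge]` are not handled by this sketch.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no front matter at all
    tags: list[str] = []
    in_tags = False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of the front matter
        if stripped == "tags:":
            in_tags = True
        elif in_tags and stripped.startswith("- "):
            tags.append(stripped[2:].strip().strip("'\""))
        elif stripped:
            in_tags = False  # any other non-empty line ends the list
    return tags


def is_tagged_merge(readme_text: str) -> bool:
    """True if the model card declares the `merge` tag."""
    return "merge" in extract_tags(readme_text)
```

Running `is_tagged_merge` on the raw text of a README.md would tell you whether the tag is in the place this sketch expects.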

kyujinpy

Jan 3

Hello!

https://huggingface.co/kyujinpy/Sakura-SOLAR-Instruct
I added the merge tag in my readme!

Could it be recovered?
Thanks! :)

clefourrier

Open LLM Leaderboard org Jan 3

Hi, well done!
It's been updated automatically when the leaderboard restarted, see below :)

kyujinpy

Jan 3

Oh, I see..!
Thank you for your reply!

JusticeDike

Jan 3

edited Jan 3

Nice work!
But what are the criteria for "merge"?
If the criterion is that a new model is created using two or more existing models, then it seems fair that the following models should also be tagged with "merge".

https://huggingface.co/DopeorNope/SOLARC-MOE-10.7Bx6
"MOE" is also possible via "merge". (https://github.com/cg123/mergekit/blob/mixtral/moe.md)
https://huggingface.co/DopeorNope/SOLARC-MOE-10.7Bx4
https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0
This model was sliced and merged from Mistral 7B (https://arxiv.org/abs/2312.15166)

Also, should models fine-tuned using merged models be tagged with "merge"?

clefourrier

Open LLM Leaderboard org Jan 4

Hi, good question about edge cases!

Are MOE merges?
We considered that merges are models which combine several other models in a way that does not keep the individual weights of the original models (like fusions).
For this reason, I would not consider MOEs to be merges (if they keep the individual weights separate), but I'm open to discussion on this - If I had to choose, I would probably suggest using a different tag for MOEs
Should the fine-tune of a merged model be tagged with merge?
Imo yes

Mikael110

Jan 4

edited Jan 4

I personally disagree with point one. My definition of a merge matches that of JusticeDike. If two or more separate models are combined to form a new model, in any way, then it's a merge as far as I'm concerned. And I feel like this is the most common understanding of the term. I would also argue that for the sake of consistency everything produced by Mergekit should be considered a merge, regardless of which technique is used. And most (all?) of the current MoE models (other than Mixtral and its finetunes) have been created using Mergekit.

There's also the fact that Mergekit MoEs can (and often do) contain models that are themselves merges. For instance, Mixtral MOE 2x10.7B, which is currently third on the filtered leaderboard, contains two merged models.

I agree with point two. If finetuning a merged model removed the merge label then it would be trivial to cheat the system.

clefourrier changed discussion title from Flagging merged models with incorrect metadata to Announcement: Flagging merged models with incorrect metadata Jan 5

Mihaiii

Jan 6

edited Jan 7

Are frankenmerges considered merged models?

IMO they shouldn't because it just implies duplicating some layers of a model, without the involvement of another model.

I'm asking because I have a pretty decent frankenmerge model myself and I created it using mergekit, but I also could have duplicated the layers myself, with some custom code.

For clarity/context, these are frankenmerges: https://github.com/cg123/mergekit?tab=readme-ov-file#passthrough

adamo1139

Jan 6

I believe my model has been incorrectly flagged as a sneaky, incorrectly tagged merge, while it's actually a fine-tune of the base Yi-34B-200K model on a publicly available instruction dataset.

Eval details:
https://huggingface.co/datasets/open-llm-leaderboard/details_adamo1139__Yi-34B-200K-AEZAKMI-v2
Model page:
https://huggingface.co/adamo1139/Yi-34B-200K-AEZAKMI-v2
Dataset page:
https://huggingface.co/datasets/adamo1139/AEZAKMI_v2
It's not a merge, it's a fine-tune. I fine-tuned it from the base Yi-34B-200K model.
The LoRA adapter that was then merged into the base model to create this model is here:
https://huggingface.co/adamo1139/Yi-34B-200K-AEZAKMI-v2-LoRA
If you doubt my claims, you can merge that adapter file with the base model and re-create my model. SHA256 would probably match assuming you specify the same shard size.
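
A minimal sketch of how such a comparison could be scripted, assuming both the re-created model and the published one are saved locally as `*.safetensors` shards with the same file names. The directory layout, file pattern, and function names are assumptions for illustration. Hashes can only match if the shard size and serialization are identical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large shards do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def shard_hashes(model_dir: str, pattern: str = "*.safetensors") -> dict[str, str]:
    """Map each shard file name in model_dir to its SHA256 hex digest."""
    return {p.name: sha256_of_file(p) for p in sorted(Path(model_dir).glob(pattern))}


def same_weights(dir_a: str, dir_b: str) -> bool:
    """True if both directories hold the same shard names with identical hashes."""
    hashes_a = shard_hashes(dir_a)
    # An empty result means no shards were found, which is not a match.
    return bool(hashes_a) and hashes_a == shard_hashes(dir_b)
```

A mismatch does not by itself prove the models differ, since a different shard size or save format changes the file bytes even when the tensors are equal.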

Please unflag this model.

Thanks for flagging other sneaky merges and hiding them away a little, I think it's a great idea that cleans up the leaderboard from low-effort merges and encourages medium-effort fine-tuning and high-effort training from scratch.

adamo1139

Jan 7

edited Jan 8

@clefourrier I found some other models near the top of the leaderboard that are very likely to be merges, yet they are not tagged as such or flagged. Can you please flag them so that their authors will have to correct tags before re-appearing on a leaderboard? If I do it myself it will probably just open 40 different discussions and it will be a mess to manage.

merge of kyujinpy/Sakura-SOLAR-Instruct (a merge in itself) and jeonsworld/CarbonVillain-en-10.7B-v1 (a merge too)
https://huggingface.co/datasets/open-llm-leaderboard/details_DopeorNope__SOLARC-M-10.7B
https://huggingface.co/DopeorNope/SOLARC-M-10.7B

MoE containing merge kyujinpy/Sakura-SOLAR-Instruct
https://huggingface.co/datasets/open-llm-leaderboard/details_DopeorNope__SOLARC-MOE-10.7Bx6
https://huggingface.co/DopeorNope/SOLARC-MOE-10.7Bx6

MoE containing merge kyujinpy/Sakura-SOLAR-Instruct
https://huggingface.co/datasets/open-llm-leaderboard/details_DopeorNope__SOLARC-MOE-10.7Bx4
https://huggingface.co/DopeorNope/SOLARC-MOE-10.7Bx4

Merge of VAGOsolutions/SauerkrautLM-SOLAR-Instruct and kyujinpy/Sakura-SOLAR-Instruct (a merge in itself)
https://huggingface.co/datasets/open-llm-leaderboard/details_gagan3012__MetaModelv2
https://huggingface.co/gagan3012/MetaModelv2

merge of jeonsworld/CarbonVillain-en-10.7B-v4 and jeonsworld/CarbonVillain-en-10.7B-v2
https://huggingface.co/datasets/open-llm-leaderboard/details_gagan3012__MetaModelv3
https://huggingface.co/gagan3012/MetaModelv3

SEE EDIT BELOW

Not sure what merge method was used for this, but the model card suggests it's a merge

Merges:
Fan in: 0:2
Fan out: -4:
Intermediary layers: 1/1/1/0/1/1/0/1/0/1/1/0/1/1/0 use the On/Off as a way of regularise.

https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__UNA-SOLAR-10.7B-Instruct-v1.0
https://huggingface.co/fblgit/UNA-SOLAR-10.7B-Instruct-v1.0

SEE EDIT BELOW

Fine-tune of UNA-SOLAR-10.7B-Instruct-v1.0 which is likely a merge
https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__UNA-POLAR-10.7B-InstructMath-v2
https://huggingface.co/fblgit/UNA-POLAR-10.7B-InstructMath-v2

according to https://github.com/KyujinHan/Sakura-SOLAR-DPO, this model is based on model that is a merge (kyujinpy/Sakura-SOLAR-Instruct)
https://huggingface.co/datasets/open-llm-leaderboard/details_kyujinpy__Sakura-SOLRCA-Math-Instruct-DPO-v2
https://huggingface.co/kyujinpy/Sakura-SOLRCA-Math-Instruct-DPO-v2

according to https://github.com/KyujinHan/Sakura-SOLAR-DPO, this model is based on model that is a merge (kyujinpy/Sakura-SOLAR-Instruct)
https://huggingface.co/datasets/open-llm-leaderboard/details_kyujinpy__Sakura-SOLAR-Instruct-DPO-v2
https://huggingface.co/kyujinpy/Sakura-SOLAR-Instruct-DPO-v2

according to https://github.com/KyujinHan/Sakura-SOLAR-DPO, this model is based on model that is a merge (kyujinpy/Sakura-SOLAR-Instruct)
https://huggingface.co/datasets/open-llm-leaderboard/details_kyujinpy__Sakura-SOLRCA-Math-Instruct-DPO-v1
https://huggingface.co/kyujinpy/Sakura-SOLRCA-Math-Instruct-DPO-v1

according to https://github.com/KyujinHan/Sakura-SOLAR-DPO, this model is based on model that is a merge (kyujinpy/Sakura-SOLAR-Instruct)
https://huggingface.co/datasets/open-llm-leaderboard/details_kyujinpy__Sakura-SOLRCA-Instruct-DPO
https://huggingface.co/kyujinpy/Sakura-SOLRCA-Instruct-DPO

looks like a merge of https://huggingface.co/fblgit/UNA-SOLAR-10.7B-Instruct-v1.0 and https://huggingface.co/VAGOsolutions/SauerkrautLM-SOLAR-Instruct
https://huggingface.co/fblgit/LUNA-SOLARkrautLM-Instruct

it's based on zyh3826/GML-Mistral-merged-v1 which is a merge of quantum-v0.01 and mistral-7b-dpo-v5
https://huggingface.co/CultriX/MistralTrix-v1

merge of merges (cookinai/CatMacaroni-Slerp and viethq188/LeoScorpius-7B)
https://huggingface.co/datasets/open-llm-leaderboard/details_samir-fama__SamirGPT-v1
https://huggingface.co/samir-fama/SamirGPT-v1

merge of merges (cookinai/CatMacaroni-Slerp and shadowml/Marcoro14-7B-slerp)
https://huggingface.co/datasets/open-llm-leaderboard/details_samir-fama__FernandoGPT-v1
https://huggingface.co/samir-fama/FernandoGPT-v1

it's based on Q-bert/MetaMath-Cybertron-Starling which is a merge of Q-bert/MetaMath-Cybertron and berkeley-nest/Starling-LM-7B-alpha
https://huggingface.co/datasets/open-llm-leaderboard/details_perlthoughts__Marcoroni-8x7B-v3-MoE
https://huggingface.co/perlthoughts/Marcoroni-8x7B-v3-MoE

fine tune over go-bruins, which is based on Q-bert/MetaMath-Cybertron-Starling and therefore a merge of Q-bert/MetaMath-Cybertron and berkeley-nest/Starling-LM-7B-alpha
https://huggingface.co/datasets/open-llm-leaderboard/details_rwitz__go-bruins-v2
https://huggingface.co/rwitz/go-bruins-v2

DPO fine tune over Q-bert/MetaMath-Cybertron-Starling, therefore a merge of Q-bert/MetaMath-Cybertron and berkeley-nest/Starling-LM-7B-alpha
https://huggingface.co/datasets/open-llm-leaderboard/details_rwitz__go-bruins
https://huggingface.co/rwitz/go-bruins

fine tune of kyujinpy/Sakura-SOLAR-Instruct, which in itself is a merge
https://huggingface.co/datasets/open-llm-leaderboard/details_Walmart-the-bag__Solar-10.7B-Cato
https://huggingface.co/Walmart-the-bag/Solar-10.7B-Cato

looks like a merge of mistral base, neural-chat and marcoroni
https://huggingface.co/datasets/open-llm-leaderboard/details_aqweteddy__mistral_tv-neural-marconroni
https://huggingface.co/aqweteddy/mistral_tv-neural-marconroni

it's based on https://huggingface.co/viethq188/LeoScorpius-7B-Chat-DPO which has been already flagged for dataset contamination in https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/474
LeoScorpius-7B that the LeoScorpius-7B-Chat-DPO is based on is a merge of AIDC-ai-business/Marcoroni-7B-v3 and Q-bert/MetaMath-Cybertron-Starling
https://huggingface.co/datasets/open-llm-leaderboard/details_NExtNewChattingAI__shark_tank_ai_7_b
https://huggingface.co/NExtNewChattingAI/shark_tank_ai_7_b

merge of fblgit/una-cybertron-7b-v2-bf16 and meta-math/MetaMath-Mistral-7B
https://huggingface.co/datasets/open-llm-leaderboard/details_Q-bert__MetaMath-Cybertron
https://huggingface.co/Q-bert/MetaMath-Cybertron

merge of teknium/OpenHermes-2.5-Mistral-7B, Intel/neural-chat-7b-v3-3, meta-math/MetaMath-Mistral-7B, and openchat/openchat-3.5-1210
https://huggingface.co/OpenPipe/mistral-ft-optimized-1227
https://huggingface.co/datasets/open-llm-leaderboard/details_OpenPipe__mistral-ft-optimized-1227

merge between Chupacabra 7b v2.04 and dragon-mistral-7b-v0
https://huggingface.co/datasets/open-llm-leaderboard/details_perlthoughts__Falkor-7b
https://huggingface.co/perlthoughts/Falkor-7b

fine-tune of v1olet/v1olet_marcoroni-go-bruins-merge-7B which is in itself a merge of AIDC-ai-business/Marcoroni-7B-v3 and rwitz/go-bruins-v2. There are a few generations of merges in this one.
https://huggingface.co/datasets/open-llm-leaderboard/details_v1olet__v1olet_merged_dpo_7B
https://huggingface.co/v1olet/v1olet_merged_dpo_7B

merge of OpenHermes-2.5-neural-chat-7b-v3-1 and Bruins-V2
https://huggingface.co/datasets/open-llm-leaderboard/details_Ba2han__BruinsV2-OpHermesNeu-11B
https://huggingface.co/Ba2han/BruinsV2-OpHermesNeu-11B

merge of kyujinpy/Sakura-SOLAR-Instruct and Weyaxi/SauerkrautLM-UNA-SOLAR-Instruct - both of which are also merges..
https://huggingface.co/datasets/open-llm-leaderboard/details_DopeorNope__You_can_cry_Snowman-13B
https://huggingface.co/DopeorNope/You_can_cry_Snowman-13B

merge of Q-bert/MetaMath-Cybertron-Starling and maywell/Synatra-7B-v0.3-RP
https://huggingface.co/datasets/open-llm-leaderboard/details_DopeorNope__You_can_cry_Snowman-13B
https://huggingface.co/PistachioAlt/Synatra-MCS-7B-v0.3-RP-Slerp

merge of meta-math/MetaMath-Mistral-7B and fblgit/una-cybertron-7b-v2-bf16
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__MetaMath-una-cybertron-v2-bf16-Ties
https://huggingface.co/Weyaxi/MetaMath-una-cybertron-v2-bf16-Ties

Merge of teknium/OpenHermes-2.5-Mistral-7B and Intel/neural-chat-7b-v3-2 using ties merge.
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__OpenHermes-2.5-neural-chat-7b-v3-2-7B
https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-7b-v3-2-7B

Model merge between Chupacabra, openchat, and dragon-mistral-7b-v0.
https://huggingface.co/datasets/open-llm-leaderboard/details_perlthoughts__Falkor-8x7B-MoE
https://huggingface.co/perlthoughts/Falkor-8x7B-MoE

merge of Chronos-70b-v2 and model 007 at a ratio of 0.3 using the SLERP method
https://huggingface.co/elinas/chronos007-70b
https://huggingface.co/datasets/open-llm-leaderboard/details_elinas__chronos007-70b

merge of meta-math/MetaMath-Mistral-7B and mlabonne/NeuralHermes-2.5-Mistral-7B
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__MetaMath-NeuralHermes-2.5-Mistral-7B-Linear
https://huggingface.co/Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Linear

Merge of meta-math/MetaMath-Mistral-7B and Intel/neural-chat-7b-v3-2 using ties merge.
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__MetaMath-neural-chat-7b-v3-2-Ties
https://huggingface.co/Weyaxi/MetaMath-neural-chat-7b-v3-2-Ties

fine tune of Mistral-7B-Instruct-v0.2 and cookinai/CatMacaroni-Slerp merge
https://huggingface.co/datasets/open-llm-leaderboard/details_diffnamehard__Mistral-CatMacaroni-slerp-uncensored
https://huggingface.co/diffnamehard/Mistral-CatMacaroni-slerp-uncensored-7B

merge of Intel/neural-chat-7b-v3-1 and teknium/OpenHermes-2.5-Mistral-7B
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__neural-chat-7b-v3-1-OpenHermes-2.5-7B
https://huggingface.co/Weyaxi/neural-chat-7b-v3-1-OpenHermes-2.5-7B

merge of meta-math/MetaMath-Mistral-7B and mlabonne/NeuralHermes-2.5-Mistral-7B
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__MetaMath-NeuralHermes-2.5-Mistral-7B-Ties
https://huggingface.co/Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Ties

merge of teknium/OpenHermes-2-Mistral-7B and Open-Orca/Mistral-7B-SlimOrca
https://huggingface.co/datasets/open-llm-leaderboard/details_Walmart-the-bag__Misted-7B
https://huggingface.co/Walmart-the-bag/Misted-7B

merge of garage-bAInd/Platypus2-70B and augtoma/qCammel-70-x
https://huggingface.co/datasets/open-llm-leaderboard/details_garage-bAInd__Camel-Platypus2-70B
https://huggingface.co/garage-bAInd/Camel-Platypus2-70B

merge of HuggingFaceH4/zephyr-7b-alpha and Open-Orca/Mistral-7B-OpenOrca
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__OpenOrca-Zephyr-7B
https://huggingface.co/Weyaxi/OpenOrca-Zephyr-7B

seems to be a merge of Intel/neural-chat-7b-v3-1, migtissera/SynthIA-7B-v1.3, bhenrym14/mistral-7b-platypus-fp16, jondurbin/airoboros-m-7b-3.1.2, teknium/CollectiveCognition-v1.1-Mistral-7B and uukuguy/speechless-mistral-dolphin-orca-platypus-samantha-7b
https://huggingface.co/datasets/open-llm-leaderboard/details_uukuguy__speechless-mistral-7b-dare-0.85
https://huggingface.co/uukuguy/speechless-mistral-7b-dare-0.85

EDIT: I have low confidence that the 2 models below are merges of two different models. They might be frankenmerges where layers of one model are merged with each other. fblgit's description is not clear enough for me.

fblgit/UNA-SOLAR-10.7B-Instruct-v1.0
fblgit/UNA-POLAR-10.7B-InstructMath-v2

rombodawg

Jan 7

@clefourrier I added the merge tag to the rombodawg/Open_Gpt4_8x7B model card, can you please remove the flag and make it visible on the leaderboard?

rombodawg

Jan 7

I just updated it btw, I did it wrong the first time, my bad

rombodawg

Jan 7

Oh I'm an idiot, I'm just now seeing this

Hi, well done!
It's been updated automatically when the leaderboard restarted, see below :)

clefourrier

Open LLM Leaderboard org Jan 8

Hi!
@Mihaiii I agree that changing a model with itself (= a frankenmerge) is not a merge according to our above definition. Maybe I should change the tag to "model fusions" - I'm still going to leave this issue open to get more comments from the community.
@adamo1139 great job! I'll investigate your report during the day and flag all these models manually. I'll need to add a mechanism to unflag your specific model, so it might take a bit more time.
@rombodawg no problem, thanks for updating your metadata!

Weyaxi

Jan 8

Hi @clefourrier

I don't think this model should be flagged because currently MOE models are not being flagged. Do you know why it is flagged?

https://huggingface.co/Weyaxi/Nous-Hermes-2-SUS-Chat-2x34B

beomi

Jan 9

Dear H4 Team,

I'm writing to seek clarification regarding the recent flagging of my works. I understand there's a suggestion to describe these as "merged," but I'd like to explain my perspective.

My efforts primarily focus on expanding the tokenizer vocabulary and continuing pretraining. It's important to note that I haven't uploaded my models to the H4 team's leaderboard, as my model is mainly targeted at Korean language processing, rather than English.

Given this context, I'm curious about the reasons behind the flagging of my models. Also, I believe that describing them as "merged" might not accurately represent the nature of my work. I would greatly appreciate any insights or feedback you can provide on this matter.

Thank you for your time and understanding.

--Junbum Lee

maywell

Jan 9

So, should SOLAR 10.7B be tagged as merge?

Just asking since it’s not flagged.

clefourrier

Open LLM Leaderboard org Jan 9

Hi @Weyaxi and @beomi ,
The filter we are using to flag merges is at the moment completely automatic and a bit aggressive, I'll add specific exceptions this week. We really wanted to encourage people to tag their models correctly, but it clearly led to some false positives.

@maywell SOLAR 10.7B is not a merge, as it's a modification of one single model, not a combination of several different models and architectures. Models called frankenmerges (= complex modifications of the layers of an existing single model to create a new model) are imo not "merges" as we mean above. I'll replace the term merge with the term "fusion" to make clearer what we mean there.

maywell

Jan 9

@clefourrier Understood.

Just wanted to mention that my Synatra 10.7B, which is a finetune of SOLAR 10.7B, was flagged as merged, but it seems to be resolved now.

Thank you always for your amazing work :)

clefourrier

Open LLM Leaderboard org Jan 10

edited Jan 10

Hi all!
I added an MoE checkbox - if a model has "moe" in its tags, it will become selectable/unselectable through a checkbox.
Now all that's left is for people to actually use the metadata better :)

samir-fama

Jan 12

@clefourrier I added the merge tag to:
https://huggingface.co/samir-fama/SamirGPT-v1
https://huggingface.co/samir-fama/FernandoGPT-v1

clefourrier

Open LLM Leaderboard org Jan 12

@samir-fama Thanks a lot for updating your metadata! I removed the flags, and once the leaderboard gets rebuilt, your models should appear properly

CultriX

Jan 13

Hi,

I added the merge tag to my model "CultriX/MistralTrix-v1" a few days ago but it's still flagged (and therefore not appearing on the LLM-LeaderBoard unless you enable flagged models).
Is there anything else that I need to change or do in order to get the model unflagged again? From what I'm reading it should have happened automatically after adding the tag, but please correct me if I am mistaken.

Thank you in advance for your time and efforts!

CultriX

clefourrier

Open LLM Leaderboard org Jan 14

Hi @CultriX ,

Thanks for updating the metadata! There are about 40 models which were reported by users as not detected by our automated system and which I flagged manually - I just removed the flag on your model, it should be good at next leaderboard build.

sbranco

Jan 15

In my opinion this is a very narrow definition of merged (or maybe merged is the wrong terminology here). Mixtral and MoE models created by mergekit are fundamentally different in my opinion, and I would definitely prefer the second to be classified as merges. The same goes for frankenmerges. I understand the desire to not be classified as merges from the creators' point of view, and I understand there are technical differences and nuances, but I think the focus here should be more on what the "consumer" would like to happen.

For me it's never clearer than when I select 'pre-trained' and unselect 'show merges', which I would expect to just show base models, but is now just filled with countless MoEs created through mergekit.

clefourrier

Open LLM Leaderboard org Jan 15

Hi @sbranco ,
Mixtral and MoE models are now tagged as MoE on the leaderboard, which makes it easy to hide them through a specific checkbox. If an MoE is made from merged models, it should also be tagged as merge.
So as a consumer, you need to select "pre-trained" and unselect "show merges" and "show MoE".
However, as these tags are quite recent, not every model has the correct metadata. Feel free to ping us (ideally in another discussion) about models that you think are not tagged properly.
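
To make the effect of the checkboxes concrete, here is a minimal sketch of the filtering logic described above. The `Entry` class, its field names, and `visible_models` are hypothetical and only illustrate the idea. The real leaderboard code may work differently.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One leaderboard row (hypothetical structure for illustration)."""
    name: str
    model_type: str  # e.g. "pre-trained" or "fine-tuned"
    tags: set[str] = field(default_factory=set)


def visible_models(
    entries: list[Entry],
    model_type: str,
    show_merges: bool = False,
    show_moe: bool = False,
) -> list[str]:
    """Return names of entries of the selected type that pass both checkboxes."""
    result = []
    for entry in entries:
        if entry.model_type != model_type:
            continue
        if not show_merges and "merge" in entry.tags:
            continue  # hidden unless "show merges" is ticked
        if not show_moe and "moe" in entry.tags:
            continue  # hidden unless "show MoE" is ticked
        result.append(entry.name)
    return result
```

Note that an MoE built from merged experts carries both tags in this sketch, so it stays hidden unless both boxes are ticked. A model with missing tags passes every filter, which is why correct metadata matters.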

DopeorNope

Jan 15

edited Jan 15

@adamo1139

Why are you evaluating my MOE model without fully understanding it? I have created this model using a MOE methodology by merging different base models. Each of these base models, even though they are merged, has been domain-specifically fine-tuned and combined with the expertise of various experts. I'm puzzled why you would spread information without fully understanding the effort put into this work. I demand a sincere apology.

I didn't just haphazardly create my model; I carefully reviewed each performance and refined the datasets to align with their outstanding capabilities before crafting it with the MOE methodology. Going forward, I would appreciate it if arguments weren’t made without certainty on such matters. It would be beneficial to research thoroughly before raising such issues. I also hope for a sincere apology for any harm caused to me.

@clefourrier
For this reason, I am currently unable to find my model on the leaderboard. It seems that a prompt recovery is necessary. I request the restoration of my models.
DopeorNope/SOLARC-MOE-10.7Bx6
DopeorNope/SOLARC-MOE-10.7Bx4

clefourrier

Open LLM Leaderboard org Jan 15

edited Jan 15

Hi @DopeorNope ,
Your models seem to include merged models in the experts used for your MoEs.
This is completely OK, but you need to edit your metadata as such (add merge in the tags - it's good that you already added moe :) ).
Once it's done, I'll remove the flag on your model.

DopeorNope

Jan 15

edited Jan 15

@clefourrier
I already added moe tags..!

Thank you for your dedication.

clefourrier

Open LLM Leaderboard org Jan 15

Yes, you just need to also add merge as your model includes merges in its experts :)

DopeorNope

Jan 15

edited Jan 15

@clefourrier I didn't just use a merged model as it was. Even if it was a model created through merging, I fine-tuned each one according to its specific domain before combining them. Should this still be categorized under a 'merge' tag?

clefourrier

Open LLM Leaderboard org Jan 15

Yes, we consider that the fine-tune of a merge is also a merge.

DopeorNope

Jan 15

edited Jan 15

@clefourrier If I fine-tuned the model on specific domain data, wouldn't that be considered an instruct-tuned or fine-tuned model?
Even if the base model was merged, if I have trained it anew, it seems somewhat unfair to continuously categorize it simply as 'merged'.
I just wanted to be fair and clear by listing the base models, but now it feels somewhat unjust to me.

clefourrier

Open LLM Leaderboard org Jan 15

Your model will be a fine-tuned model that also contains merges - it would appear with a "fine-tuned" icon while being hide-able by people who do not want to see models containing merges.
Tagging @davanstrien who is our ML librarian and might have a better idea about our classification system :)

In any case, it's great that you were open about which models you used as base! It's really important for the community that models info and lineage are displayed in an open way.

DopeorNope

Jan 15

@clefourrier

Thank you. First of all, I used a model that had been merged, but then I retrained this model using the MOE (Mixture of Experts) approach.

However, I think it's somewhat unfair to classify this as a merged tag. If I had used the model as it was, it would be right to classify it as merged. But since it has been refined and reborn, and used as an expert in the new form, I believe it should be classified as MOE, not merged.

I would appreciate it if this could be reflected and restored as MOE on the leaderboard.

Thank you.

clefourrier

Open LLM Leaderboard org Jan 15

(Just to clarify, it would be classified as both an moe and merge)

DopeorNope

Jan 15

@clefourrier Thank you! Then, I will put it under two tags, so please reflect this. Thank you!

clefourrier

Open LLM Leaderboard org Jan 15

Thanks a lot!
I removed the manual flag on your model, it should be visible again at the next leaderboard restart (within an hour)

rombodawg

Jan 15

@clefourrier What does the huggingface team think about my write-up about Perfected/Higher quality MoE models using mergekit, and their consideration as base models in the same category as Mixtral-base, rather than being flagged as merges and being sectioned off in the leaderboard behind a checkbox?

Such as my model: https://huggingface.co/rombodawg/Everyone-Coder-4x7b-Base

My write up:
https://docs.google.com/document/d/1_vOftBnrk9NRk5h10UqrfJ5CDih9KBKL61yvrZtVWPE/edit?usp=sharing

adamo1139

Jan 15

@DopeorNope

I'm puzzled why you would spread information without fully understanding the effort put into this work. I demand a sincere apology.

Classification of what constitutes a merge has been discussed earlier in the thread and it explains why I flagged your models. I also fully support efforts to organize the leaderboard.
Rankings on this leaderboard have become literally a meme.

@clefourrier

I don't want to annoy people with notifications too much, so here's another batch of models that should be classified as a merge but are not tagged as such, can you please flag them on my behalf?

Turdus' ancestry goes back to merge of AIDC-ai-business/Marcoroni-7B-v3 and EmbeddedLLM/Mistral-7B-Merge-14-v0.1 and possibly involves more merges.
https://huggingface.co/udkai/Turdus
https://huggingface.co/datasets/open-llm-leaderboard/details_udkai__Turdus

Slerp merge of upstage/SOLAR-10.7B-Instruct-v1.0 and bhavinjawade/SOLAR-10B-OrcaDPO-Jawade
https://huggingface.co/kodonho/Solar-OrcaDPO-Solar-Instruct-SLERP
https://huggingface.co/datasets/open-llm-leaderboard/details_kodonho__Solar-OrcaDPO-Solar-Instruct-SLERP

Slerp merge of DopeorNope/SOLARC-M-10.7B and kyujinpy/Sakura-SOLRCA-Math-Instruct-DPO-v2, both of which are also merges.
https://huggingface.co/kodonho/SolarM-SakuraSolar-SLERP
https://huggingface.co/datasets/open-llm-leaderboard/details_kodonho__SolarM-SakuraSolar-SLERP

Merge of upstage/SOLAR-10.7B-Instruct-v1.0 and rishiraj/meow
https://huggingface.co/Yhyu13/LMCocktail-10.7B-v1
https://huggingface.co/datasets/open-llm-leaderboard/details_Yhyu13__LMCocktail-10.7B-v1

Based on merge of AIDC-ai-business/Marcoroni-7B-v3 and EmbeddedLLM/Mistral-7B-Merge-14-v0.1
https://huggingface.co/mlabonne/NeuralMarcoro14-7B
https://huggingface.co/datasets/open-llm-leaderboard/details_mlabonne__NeuralMarcoro14-7B

Neuronovo is based on CultriX/MistralTrix-v1, which in turn is based on zyh3826/GML-Mistral-merged-v1 merge. zyh3826/GML-Mistral-merged-v1 is based on quantumaikr/quantum-v0.01 which I believe was flagged for likely being trained from contaminated model rwitz2/go-bruins-v2.1.1
https://huggingface.co/Neuronovo/neuronovo-7B-v0.2
https://huggingface.co/datasets/open-llm-leaderboard/details_Neuronovo__neuronovo-7B-v0.2

fine-tune of CultriX/MistralTrix-v1 which is definitely merged and may be contaminated as explained above
https://huggingface.co/ryandt/MusingCaterpillar
https://huggingface.co/datasets/open-llm-leaderboard/details_ryandt__MusingCaterpillar

fine-tune of Neuronovo/neuronovo-7B-v0.2 which is a fine-tune of CultriX/MistralTrix-v1 which is definitely merged and may be contaminated as explained above
https://huggingface.co/Neuronovo/neuronovo-7B-v0.3
https://huggingface.co/datasets/open-llm-leaderboard/details_Neuronovo__neuronovo-7B-v0.3

DPO of SanjiWatsuki/Lelantos-7B which is a merge of mostly unspecified models but openaccess-ai-collective/DPOpenHermes-7B-v2 and jan-hq/stealth-v1.2 are mentioned as being used for the merge.
https://huggingface.co/SanjiWatsuki/Lelantos-DPO-7B
https://huggingface.co/datasets/open-llm-leaderboard/details_SanjiWatsuki__Lelantos-DPO-7B

based on mindy-labs/mindy-7b-v2 which in turn is a merge of AIDC-ai-business/Marcoroni-7B-v3 and Weyaxi/Seraph-7B
https://huggingface.co/bardsai/jaskier-7b-dpo
https://huggingface.co/datasets/open-llm-leaderboard/details_bardsai__jaskier-7b-dpo

fine-tune of merge of EmbeddedLLM/Mistral-7B-Merge-14-v0.2 and cookinai/CatMacaroni-Slerp
https://huggingface.co/cookinai/OpenCM-14
https://huggingface.co/datasets/open-llm-leaderboard/details_cookinai__OpenCM-14

based on mindy-labs/mindy-7b-v2 which in turn is a merge of AIDC-ai-business/Marcoroni-7B-v3 and Weyaxi/Seraph-7B
https://huggingface.co/bardsai/jaskier-7b-dpo-v2
https://huggingface.co/datasets/open-llm-leaderboard/details_bardsai__jaskier-7b-dpo-v2

merge of OpenHermes-2.5-neural-chat-v3-3-Slerp, MetaMath-Cybertron-Starling and Marcoroni-7B-v3.
https://huggingface.co/jan-hq/supermario-v2
https://huggingface.co/datasets/open-llm-leaderboard/details_janhq__supermario-v2

merge of Seraph-7B and Marcoroni-7B-v3
https://huggingface.co/jan-hq/supermario-slerp
https://huggingface.co/datasets/open-llm-leaderboard/details_janhq__supermario-slerp

Additionally, I noticed some models that have the tag, but the filter doesn't work for them and they show up even when merges should be filtered out.
Seems like there is some bug in the way the merge tag in the readme.md corresponds to the merged argument in the leaderboard.

I flagged this one before and author added merge tag, but for some reason it still appears with merge=False when sorting through the leaderboard.
https://huggingface.co/samir-fama/SamirGPT-v1
https://huggingface.co/datasets/open-llm-leaderboard/details_samir-fama__SamirGPT-v1

Same issue here. Author added merge tag, looks to be set properly, but it still appears with the filter that shouldn't show merges.
https://huggingface.co/samir-fama/FernandoGPT-v1
https://huggingface.co/datasets/open-llm-leaderboard/details_samir-fama__FernandoGPT-v1

This one too is flagged as a merge but leaderboard shows it even when merges are filtered out
https://huggingface.co/Toten5/Marcoroni-neural-chat-7B-v1
https://huggingface.co/datasets/open-llm-leaderboard/details_Toten5__Marcoroni-v3-neural-chat-v3-3-Slerp

Still appears as merged=False even though it's tagged as merge
https://huggingface.co/argilla/distilabeled-Marcoro14-7B-slerp
https://huggingface.co/datasets/open-llm-leaderboard/details_argilla__distilabeled-Marcoro14-7B-slerp

Still appears as merged=False even though it's tagged as merge
https://huggingface.co/argilla/distilabeled-Marcoro14-7B-slerp-full
https://huggingface.co/datasets/open-llm-leaderboard/details_argilla__distilabeled-Marcoro14-7B-slerp-full

Still appears as merged=False even though it's tagged as merge. Because Garrulus was used, it's also contaminated on Winogrande, so it should stay flagged even if the author marks it as a merge.
https://huggingface.co/abideen/NexoNimbus-7B
https://huggingface.co/datasets/open-llm-leaderboard/details_abideen__NexoNimbus-7B

Still appears as merged=False even though it's tagged as merge
https://huggingface.co/CultriX/MistralTrix-v1
https://huggingface.co/datasets/open-llm-leaderboard/details_CultriX__MistralTrix-v1

DopeorNope

Jan 15

@adamo1139
Based on your message, I realize the value of your long-standing efforts, and it seems I took things too seriously.

I should have provided a more detailed explanation, but it seems I was only focused on explaining the origins of my model transparently and fairly.

I feel grateful for your efforts and realize that your actions are not diminishing my efforts. In fact, I feel sorry for being a bit angry earlier.

I support your efforts.

Best,

clefourrier

Open LLM Leaderboard org Jan 16

@adamo1139 Thanks a lot, this is a super cool and useful contribution!
What do you think of creating a specific discussion for your reports, so that this discussion can stay manageable in length?

clefourrier

Open LLM Leaderboard org Jan 16

@rombodawg I'll take a look at your report, but our policy is the following atm: an MoE including merges or the fine-tune of a merge should still be tagged as merge.
Quite an important part of the community uses the leaderboard to find new models to experiment with, and often does not want to use merges (it's harder to trace their data/license or their lineage, especially for merges of merges, and harder to do precise experiments on them, since perf can change a bit randomly, etc.)
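
The policy above can be read as a lineage rule: a model needs the merge tag if it is itself a merge, or if anything it was built from (its base model or its MoE experts) needs the tag. Here is a minimal Python sketch of that reading. The `LINEAGE` table, the `needs_merge_tag` function and the `org/...` model names are all hypothetical, made up for illustration, and are not taken from the leaderboard code.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical lineage table (an assumption for this sketch):
# model id -> (was this step a weight merge?, models it was built from)
LINEAGE: Dict[str, Tuple[bool, List[str]]] = {
    "org/base-a": (False, []),
    "org/base-b": (False, []),
    "org/merged-ab": (True, ["org/base-a", "org/base-b"]),
    "org/finetune-of-merge": (False, ["org/merged-ab"]),
    "org/moe-with-merge": (False, ["org/base-a", "org/merged-ab"]),
    "org/plain-finetune": (False, ["org/base-a"]),
}


def needs_merge_tag(model: str, seen: Optional[Set[str]] = None) -> bool:
    """Return True if the model or any of its ancestors is a merge."""
    seen = set() if seen is None else seen
    # Unknown models and already-visited models contribute nothing new.
    if model in seen or model not in LINEAGE:
        return False
    seen.add(model)
    is_merge, parents = LINEAGE[model]
    return is_merge or any(needs_merge_tag(parent, seen) for parent in parents)


for name in LINEAGE:
    print(name, needs_merge_tag(name))
```

Under this reading, a fine-tune of a merge and an MoE with a merged expert both inherit the tag, while a plain fine-tune of a non-merged base does not.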

rombodawg

Jan 16

Oh I understand, it's more about the lineage of the model, and not so much about the quality, that tags it as a merge. I just hate how merges are hidden from the default view of the leaderboard, is all, I guess. I don't know if that can be changed.

clefourrier

Open LLM Leaderboard org Jan 16

It's exactly about that!
As it's been a very requested feature to have the default view hide merges, we'll keep it that way, but you're just a checkbox away from seeing your model :)

rombodawg

Jan 16

OK wow, I didn't know that was such a wanted feature. I'll have to remember that.

adamo1139

Jan 16

•

edited Jan 16

@clefourrier

Sure, I will make a new discussion with all of the info that people whose models were flagged as incorrectly tagged merges should know. You can expect it later today (~12h from now)

Edit: Done, #540

cloudyu

Jan 17

•

edited Jan 17

https://huggingface.co/cloudyu/Mixtral_7Bx2_MoE_DPO is a DPO model, why is it flagged?

MoE mixture followed by DPO is clearly a new way to improve the performance of LLMs, I don't know why you guys dislike it.

clefourrier

Open LLM Leaderboard org Jan 17

•

edited Jan 17

Hi @cloudyu ,
It's just a problem of metadata: it should say it's an moe and that it contains merges, by using the moe and merge tags. Once that's done, I'll remove the associated flag. (Users want to be able to hide merges/moe from the main view without trouble)

aigeek0x0

Jan 17

@clefourrier I have added the tags as suggested. Can you please remove the flag from this model: AIGeekLabs/radiantloom-mixtral-8x7b-fusion? Thanks.

TomGrc

Jan 17

I added the MoE tag to:
TomGrc/FusionNet_7Bx2_MoE_14B

Kquant03

Jan 17

All of my models are flagged yet I have them tagged as merges... :(

clefourrier

Open LLM Leaderboard org Jan 18

Hi @aigeek0x0 , @TomGrc and @Kquant03 ,
You all have the same issue, which is that our dynamic info updater is not being launched - I'm investigating why atm, thanks a lot for your patience

Kquant03

Jan 18

thank you! It's not too big of a deal I'm mostly doing it for research but it's also nice to get a bit of traction through the leaderboard, as users tend to give feedback on how to improve the model

lodrick-the-lafted

Jan 19

•

edited Jan 19

I added merge tag to my model, could you unflag please? (lodrick-the-lafted/Winged-Lagomorph-2x13B)

facat

Jan 19

Hello, our model https://huggingface.co/SUSTech/SUS-Chat-34B is fine-tuned from yi-34b directly, so it seems it should not be flagged as merged. Could you please fix this?

adamo1139

Jan 19

Are we supposed to tag models that are MoE but are not made out of merged models as merges now?
Why was mistralai/Mixtral-8x7B-Instruct-v0.1 flagged?

Kquant03

Jan 19

•

edited Jan 19

It should look like this or similar, this is how my Prokaryote-8x7B model got unflagged...the most important part is the license, language, and tags...with MoE you need both merge and MoE (unless it's not a merge).
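
As a rough illustration of the kind of model card header described above (license, language, tags), here is a minimal Python sketch that renders such a header. The `build_front_matter` helper is hypothetical, and the `apache-2.0` license and `en` language values are placeholder assumptions, not the actual values from any model mentioned here. Only the `moe` and `merge` tag names come from this discussion.

```python
from typing import List


def build_front_matter(license_id: str, languages: List[str],
                       is_moe: bool, contains_merge: bool) -> str:
    """Render a YAML metadata header for a README.md model card (sketch)."""
    tags = []
    if is_moe:
        tags.append("moe")
    if contains_merge:
        tags.append("merge")

    lines = ["---", f"license: {license_id}", "language:"]
    lines += [f"- {lang}" for lang in languages]
    if tags:
        lines.append("tags:")
        lines += [f"- {tag}" for tag in tags]
    lines.append("---")
    return "\n".join(lines)


# Placeholder values (assumptions): an MoE whose experts include merges.
print(build_front_matter("apache-2.0", ["en"], is_moe=True, contains_merge=True))
```

An MoE that contains no merged models would pass `contains_merge=False` and end up with only the `moe` tag.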

adamo1139

Jan 19

•

edited Jan 20

@clefourrier If MoE models with an incorrect tag are also targeted by this discussion, I think this information should be added to the OP. Otherwise it's unclear for people whose MoE model contains no merges yet gets flagged with a link to this discussion.

@Kquant03 Your model Prokaryote-8x7B contains merged models, so I can understand why it was flagged, but I don't get how SUS-Chat-34B or Mixtral Instruct got flagged in a discussion primarily about merges, as neither of them is a merge.

Edit: Again, my fine-tune adamo1139/Yi-34B-200K-AEZAKMI-RAW-1701, which is not a merge at all, got flagged, seemingly automatically. Whatever is in the script that flags those models is horribly inaccurate.
Please instruct me what I should change in my model card to get it un-flagged because I don't get it. I think it's just a mistake.

Karko

Jan 20

•

edited Jan 20

Karko/Proctora has been flagged as a merge, but this is a MoE

with a base => OpenPipe/mistral-ft-optimized-1227
and two experts : SanjiWatsuki/Kunoichi-7B & samir-fama/SamirGPT-v1

I added the moe tag. If this is not enough please unflag.

EDIT: never mind, I guess the base models were merges. I added the merge tag as well.

adamo1139

Jan 20

•

edited Jan 20

@Karko all of the models you used for your MoE are merges, therefore resulting MoE should be flagged both as a merge and also as a MoE.

Edit: See FAQ In #540

Karko

Jan 20

Right. I realized that and edited my post. Sorry for the bother...

Karko

Jan 21

BTW I edited the model card as asked, adding the moe and merge tags. Can you please unflag it?

robinsmits

Jan 21

•

edited Jan 22

@clefourrier My model has been flagged as a 'merge' ... it is however an adapter model. Did something go wrong? Or am I misinterpreting the term 'merge'?
Also it only shows up under Private/deleted models.

open-llm-leaderboard/details_robinsmits__Mistral-Instruct-7B-v0.2-ChatAlpaca

Please let me know.

NovoCode

Jan 22

•

edited Jan 22

My model Valor-7B-v0.1 only shows up on the leaderboard under deleted/private, but it's not deleted or made private. Can you please help me find out why?

clefourrier

Open LLM Leaderboard org Jan 22

Hi all!
We've been iterating a lot in the last 2 weeks on which metadata is crucial to display or not, and some models were aggressively flagged, sorry for that. If your model card or name mentions merging (using the word merge for example), our filter assumes your model is a merge. There is no way to do this less aggressively at the moment.
I'll add a whitelist of models very soon.

@adamo1139 re the Mixtral model flags, yes it was a matter of metadata, but they changed it so the model should appear again.
@Karko and @lodrick-the-lafted your models should be unflagged, thanks a lot for updating the metadata!
@robinsmits The problem with your model is that it's not correctly tagged as an moe (not a merge) - every metadata problem is redirecting to this conversation at the moment, sorry about that.

@robinsmits and @NovoCode I'm investigating why your models are currently hidden.
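
To make the behaviour described above concrete (any mention of merging in the card or name is treated as a merge), here is a minimal Python sketch of such a keyword filter. It is not the leaderboard's actual code: the `looks_like_merge` function, the regular expression and the `whitelist` parameter are assumptions for illustration, and the model ids are made up.

```python
import re
from typing import FrozenSet

# Deliberately broad pattern (an assumption): matches "merge", "merged",
# "merges", "merging", including inside names such as "foo_merged_dpo".
MERGE_PATTERN = re.compile(r"merg(e|ing)", re.IGNORECASE)


def looks_like_merge(model_name: str, card_text: str,
                     whitelist: FrozenSet[str] = frozenset()) -> bool:
    """Guess whether a model is a merge from its name and card text alone."""
    if model_name in whitelist:
        # Manually reviewed models bypass the keyword check.
        return False
    return bool(MERGE_PATTERN.search(model_name) or MERGE_PATTERN.search(card_text))


print(looks_like_merge("org/model-merged", ""))
print(looks_like_merge("org/plain", "A fine-tune of a base model."))
# A card that only says it is NOT a merge still trips the filter.
print(looks_like_merge("org/plain", "This model is not a merge."))
print(looks_like_merge("org/plain", "This model is not a merge.",
                       whitelist=frozenset({"org/plain"})))
```

The third call shows why a filter like this over-flags, and the fourth shows how a whitelist entry would override it.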

robinsmits

Jan 22

@clefourrier

Please note that the base model is: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2
This is not a MoE as far as I'm aware... at least it does not contain the MoE tag.

clefourrier

Open LLM Leaderboard org Jan 22

My bad, I read too fast (thought it was mixtral based)! Checking your model then!

clefourrier

Open LLM Leaderboard org Jan 22

Hi @robinsmits and @NovoCode !
Found the problem for both your models, our check for "is this model on the hub" was failing for adapters. I'm fixing it, it should be good by tonight.

robinsmits

Jan 22

Thanks @clefourrier Keep up the good work! Much appreciated!

Kquant03

Jan 22

So, all of my models except hippolyta are private/deleted? I have them selected as MoE and merge as they're supposed to...

clefourrier

Open LLM Leaderboard org Jan 22

Hi @Kquant03 , I'm not sure what you mean? They are displayed when "Hide deleted" and "Hide flagged" are selected (maybe you are used to the old name of the checkboxes? We changed them this afternoon for homogeneity)

Kquant03

Jan 22

You guys are really nice for letting us show up on the leaderboards same as the rest. It makes me feel better about my 7 terabytes of transfer recorded in my pc's data settings lol.

I never thought I'd see myself on the board without having moe or merge selected...this made my day. Thank you!

Cookize

Jan 24

Hello, our model https://huggingface.co/SUSTech/SUS-Chat-34B is finetuned from yi-34b directly and seems should not be flagged as merged, could you please fix this?

Cookize

Jan 24

Hello, our model https://huggingface.co/SUSTech/SUS-Chat-34B is finetuned from yi-34b directly and seems should not be flagged as merged, could you please fix this?

Sorry, I just found it. Ignore the above.

clefourrier

Open LLM Leaderboard org Jan 24

Going to close this discussion, as it's become quite long, and I assume it's going to send notifs to anyone who took part in it ^^.

If you think there is a problem with the tagging/flagging or your model, please open a specific discussion for it!

clefourrier changed discussion status to closed Jan 24

Walmart-the-bag

Apr 17

@clefourrier I found some other models near the top of the leaderboard that are very likely to be merges, yet they are not tagged as such or flagged. Can you please flag them so that their authors will have to correct tags before re-appearing on a leaderboard? If I do it myself it will probably just open 40 different discussions and it will be a mess to manage.

merge of kyujinpy/Sakura-SOLAR-Instruct (a merge in itself) and jeonsworld/CarbonVillain-en-10.7B-v1 (a merge too)
https://huggingface.co/datasets/open-llm-leaderboard/details_DopeorNope__SOLARC-M-10.7B
https://huggingface.co/DopeorNope/SOLARC-M-10.7B

MoE containing merge kyujinpy/Sakura-SOLAR-Instruct
https://huggingface.co/datasets/open-llm-leaderboard/details_DopeorNope__SOLARC-MOE-10.7Bx6
https://huggingface.co/DopeorNope/SOLARC-MOE-10.7Bx6

MoE containing merge kyujinpy/Sakura-SOLAR-Instruct
https://huggingface.co/datasets/open-llm-leaderboard/details_DopeorNope__SOLARC-MOE-10.7Bx4
https://huggingface.co/DopeorNope/SOLARC-MOE-10.7Bx4

Merge of VAGOsolutions/SauerkrautLM-SOLAR-Instruct and kyujinpy/Sakura-SOLAR-Instruct (a merge in itself)
https://huggingface.co/datasets/open-llm-leaderboard/details_gagan3012__MetaModelv2
https://huggingface.co/gagan3012/MetaModelv2

merge of jeonsworld/CarbonVillain-en-10.7B-v4 and jeonsworld/CarbonVillain-en-10.7B-v2
https://huggingface.co/datasets/open-llm-leaderboard/details_gagan3012__MetaModelv3
https://huggingface.co/gagan3012/MetaModelv3

SEE EDIT BELOW

Not sure what merge method was used for this but model card suggest it's a merge

Merges:
Fan in: 0:2
Fan out: -4:
Intermediary layers: 1/1/1/0/1/1/0/1/0/1/1/0/1/1/0 use the On/Off as a way of regularise.

https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__UNA-SOLAR-10.7B-Instruct-v1.0
https://huggingface.co/fblgit/UNA-SOLAR-10.7B-Instruct-v1.0

SEE EDIT BELOW

Fine-tune of UNA-SOLAR-10.7B-Instruct-v1.0 which is likely a merge
https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__UNA-POLAR-10.7B-InstructMath-v2
https://huggingface.co/fblgit/UNA-POLAR-10.7B-InstructMath-v2

according to https://github.com/KyujinHan/Sakura-SOLAR-DPO, this model is based on model that is a merge (kyujinpy/Sakura-SOLAR-Instruct)
https://huggingface.co/datasets/open-llm-leaderboard/details_kyujinpy__Sakura-SOLRCA-Math-Instruct-DPO-v2
https://huggingface.co/kyujinpy/Sakura-SOLRCA-Math-Instruct-DPO-v2

according to https://github.com/KyujinHan/Sakura-SOLAR-DPO, this model is based on model that is a merge (kyujinpy/Sakura-SOLAR-Instruct)
https://huggingface.co/datasets/open-llm-leaderboard/details_kyujinpy__Sakura-SOLAR-Instruct-DPO-v2
https://huggingface.co/kyujinpy/Sakura-SOLAR-Instruct-DPO-v2

according to https://github.com/KyujinHan/Sakura-SOLAR-DPO, this model is based on model that is a merge (kyujinpy/Sakura-SOLAR-Instruct)
https://huggingface.co/datasets/open-llm-leaderboard/details_kyujinpy__Sakura-SOLRCA-Math-Instruct-DPO-v1
https://huggingface.co/kyujinpy/Sakura-SOLRCA-Math-Instruct-DPO-v1

according to https://github.com/KyujinHan/Sakura-SOLAR-DPO, this model is based on model that is a merge (kyujinpy/Sakura-SOLAR-Instruct)
https://huggingface.co/datasets/open-llm-leaderboard/details_kyujinpy__Sakura-SOLRCA-Instruct-DPO
https://huggingface.co/kyujinpy/Sakura-SOLRCA-Instruct-DPO

looks like a merge of https://huggingface.co/fblgit/UNA-SOLAR-10.7B-Instruct-v1.0 and https://huggingface.co/VAGOsolutions/SauerkrautLM-SOLAR-Instruct
https://huggingface.co/fblgit/LUNA-SOLARkrautLM-Instruct

it's based on zyh3826/GML-Mistral-merged-v1 which is a merge of quantum-v0.01 and mistral-7b-dpo-v5
https://huggingface.co/CultriX/MistralTrix-v1

merge of merges (cookinai/CatMacaroni-Slerp and viethq188/LeoScorpius-7B)
https://huggingface.co/datasets/open-llm-leaderboard/details_samir-fama__SamirGPT-v1
https://huggingface.co/samir-fama/SamirGPT-v1

merge of merges (cookinai/CatMacaroni-Slerp and shadowml/Marcoro14-7B-slerp)
https://huggingface.co/datasets/open-llm-leaderboard/details_samir-fama__FernandoGPT-v1
https://huggingface.co/samir-fama/FernandoGPT-v1

it's based on Q-bert/MetaMath-Cybertron-Starling which is a merge of Q-bert/MetaMath-Cybertron and berkeley-nest/Starling-LM-7B-alpha
https://huggingface.co/datasets/open-llm-leaderboard/details_perlthoughts__Marcoroni-8x7B-v3-MoE
https://huggingface.co/perlthoughts/Marcoroni-8x7B-v3-MoE

fine tune over go-bruins, which is based on Q-bert/MetaMath-Cybertron-Starling and therefore a merge of Q-bert/MetaMath-Cybertron and berkeley-nest/Starling-LM-7B-alpha
https://huggingface.co/datasets/open-llm-leaderboard/details_rwitz__go-bruins-v2
https://huggingface.co/rwitz/go-bruins-v2

DPO fine tune over Q-bert/MetaMath-Cybertron-Starling, therefore a merge of Q-bert/MetaMath-Cybertron and berkeley-nest/Starling-LM-7B-alpha
https://huggingface.co/datasets/open-llm-leaderboard/details_rwitz__go-bruins
https://huggingface.co/rwitz/go-bruins

fine tune of kyujinpy/Sakura-SOLAR-Instruct, which in itself is a merge
https://huggingface.co/datasets/open-llm-leaderboard/details_Walmart-the-bag__Solar-10.7B-Cato
https://huggingface.co/Walmart-the-bag/Solar-10.7B-Cato

looks like a merge of mistral base, neural-chat and marcoroni
https://huggingface.co/datasets/open-llm-leaderboard/details_aqweteddy__mistral_tv-neural-marconroni
https://huggingface.co/aqweteddy/mistral_tv-neural-marconroni

it's based on https://huggingface.co/viethq188/LeoScorpius-7B-Chat-DPO which has been already flagged for dataset contamination in https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/474
LeoScorpius-7B that the LeoScorpius-7B-Chat-DPO is based on is a merge of AIDC-ai-business/Marcoroni-7B-v3 and Q-bert/MetaMath-Cybertron-Starling
https://huggingface.co/datasets/open-llm-leaderboard/details_NExtNewChattingAI__shark_tank_ai_7_b
https://huggingface.co/NExtNewChattingAI/shark_tank_ai_7_b

merge of fblgit/una-cybertron-7b-v2-bf16 and meta-math/MetaMath-Mistral-7B
https://huggingface.co/datasets/open-llm-leaderboard/details_Q-bert__MetaMath-Cybertron
https://huggingface.co/Q-bert/MetaMath-Cybertron

merge of teknium/OpenHermes-2.5-Mistral-7B, Intel/neural-chat-7b-v3-3, meta-math/MetaMath-Mistral-7B, and openchat/openchat-3.5-1210
https://huggingface.co/OpenPipe/mistral-ft-optimized-1227
https://huggingface.co/datasets/open-llm-leaderboard/details_OpenPipe__mistral-ft-optimized-1227

merge between Chupacabra 7b v2.04 and dragon-mistral-7b-v0
https://huggingface.co/datasets/open-llm-leaderboard/details_perlthoughts__Falkor-7b
https://huggingface.co/perlthoughts/Falkor-7b

fine-tune of v1olet/v1olet_marcoroni-go-bruins-merge-7B which is in itself a merge of AIDC-ai-business/Marcoroni-7B-v3 and rwitz/go-bruins-v2. There are few generations of merges in this one.
https://huggingface.co/datasets/open-llm-leaderboard/details_v1olet__v1olet_merged_dpo_7B
https://huggingface.co/v1olet/v1olet_merged_dpo_7B

merge of OpenHermes-2.5-neural-chat-7b-v3-1 and Bruins-V2
https://huggingface.co/datasets/open-llm-leaderboard/details_Ba2han__BruinsV2-OpHermesNeu-11B
https://huggingface.co/Ba2han/BruinsV2-OpHermesNeu-11B

merge of kyujinpy/Sakura-SOLAR-Instruct and Weyaxi/SauerkrautLM-UNA-SOLAR-Instruct - both of which are also merges..
https://huggingface.co/datasets/open-llm-leaderboard/details_DopeorNope__You_can_cry_Snowman-13B
https://huggingface.co/DopeorNope/You_can_cry_Snowman-13B

merge of Q-bert/MetaMath-Cybertron-Starling and maywell/Synatra-7B-v0.3-RP
https://huggingface.co/datasets/open-llm-leaderboard/details_DopeorNope__You_can_cry_Snowman-13B
https://huggingface.co/PistachioAlt/Synatra-MCS-7B-v0.3-RP-Slerp

merge of meta-math/MetaMath-Mistral-7B and fblgit/una-cybertron-7b-v2-bf16
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__MetaMath-una-cybertron-v2-bf16-Ties
https://huggingface.co/Weyaxi/MetaMath-una-cybertron-v2-bf16-Ties

Merge of teknium/OpenHermes-2.5-Mistral-7B and Intel/neural-chat-7b-v3-2 using ties merge.
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__OpenHermes-2.5-neural-chat-7b-v3-2-7B
https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-7b-v3-2-7B

Model merge between Chupacabra, openchat, and dragon-mistral-7b-v0.
https://huggingface.co/datasets/open-llm-leaderboard/details_perlthoughts__Falkor-8x7B-MoE
https://huggingface.co/perlthoughts/Falkor-8x7B-MoE

merge of Chronos-70b-v2 and model 007 at a ratio of 0.3 using the SLERP method
https://huggingface.co/elinas/chronos007-70b
https://huggingface.co/datasets/open-llm-leaderboard/details_elinas__chronos007-70b

merge of meta-math/MetaMath-Mistral-7B and mlabonne/NeuralHermes-2.5-Mistral-7B
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__MetaMath-NeuralHermes-2.5-Mistral-7B-Linear
https://huggingface.co/Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Linear

Merge of meta-math/MetaMath-Mistral-7B and Intel/neural-chat-7b-v3-2 using ties merge.
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__MetaMath-neural-chat-7b-v3-2-Ties
https://huggingface.co/Weyaxi/MetaMath-neural-chat-7b-v3-2-Ties

fine tune of Mistral-7B-Instruct-v0.2 and cookinai/CatMacaroni-Slerp merge
https://huggingface.co/datasets/open-llm-leaderboard/details_diffnamehard__Mistral-CatMacaroni-slerp-uncensored
https://huggingface.co/diffnamehard/Mistral-CatMacaroni-slerp-uncensored-7B

merge of Intel/neural-chat-7b-v3-1 and teknium/OpenHermes-2.5-Mistral-7B
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__neural-chat-7b-v3-1-OpenHermes-2.5-7B
https://huggingface.co/Weyaxi/neural-chat-7b-v3-1-OpenHermes-2.5-7B

merge of meta-math/MetaMath-Mistral-7B and mlabonne/NeuralHermes-2.5-Mistral-7B
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__MetaMath-NeuralHermes-2.5-Mistral-7B-Ties
https://huggingface.co/Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Ties

merge of teknium/OpenHermes-2-Mistral-7B and Open-Orca/Mistral-7B-SlimOrca
https://huggingface.co/datasets/open-llm-leaderboard/details_Walmart-the-bag__Misted-7B
https://huggingface.co/Walmart-the-bag/Misted-7B

merge of garage-bAInd/Platypus2-70B and augtoma/qCammel-70-x
https://huggingface.co/datasets/open-llm-leaderboard/details_garage-bAInd__Camel-Platypus2-70B
https://huggingface.co/garage-bAInd/Camel-Platypus2-70B

merge of HuggingFaceH4/zephyr-7b-alpha and Open-Orca/Mistral-7B-OpenOrca
https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__OpenOrca-Zephyr-7B
https://huggingface.co/Weyaxi/OpenOrca-Zephyr-7B

seems to be a merge of Intel/neural-chat-7b-v3-1, migtissera/SynthIA-7B-v1.3, bhenrym14/mistral-7b-platypus-fp16, jondurbin/airoboros-m-7b-3.1.2, teknium/CollectiveCognition-v1.1-Mistral-7B and uukuguy/speechless-mistral-dolphin-orca-platypus-samantha-7b
https://huggingface.co/datasets/open-llm-leaderboard/details_uukuguy__speechless-mistral-7b-dare-0.85
https://huggingface.co/uukuguy/speechless-mistral-7b-dare-0.85

EDIT: I have low confidence that 2 models below are merges of two various models. They might be frankenmerges where layers of one model are merged with each other. fblgit's description is not clear enough for me.

fblgit/UNA-SOLAR-10.7B-Instruct-v1.0
fblgit/UNA-POLAR-10.7B-InstructMath-v2

I have never had 'merge' tag in misted-7b. That is a false flag.

Walmart-the-bag

Apr 17

All commits are here.

https://huggingface.co/Walmart-the-bag/Misted-7B/commits/main

clefourrier

Open LLM Leaderboard org Apr 18

@Walmart-the-bag This is precisely why your model was flagged: according to your model card, your model is a merge

base_model: teknium/OpenHermes-2-Mistral-7B
models:
      - model: teknium/OpenHermes-2-Mistral-7B
      - model: Open-Orca/Mistral-7B-SlimOrca
merge_method: slerp

yet it is not indicated in the metadata of your model, which should include "merge" as a tag.
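
The mismatch described above (a merge configuration in the card body, but no "merge" tag in the metadata) can be checked mechanically. Here is a minimal Python sketch, assuming the metadata is a leading `---` block with a simple `tags:` list. The `front_matter_tags` and `has_untagged_merge` helpers are hypothetical, the `license` line in the examples is a placeholder assumption, and a real checker would use a proper YAML parser.

```python
from typing import List


def front_matter_tags(card_text: str) -> List[str]:
    """Return tags from a leading '---' block (simple '- tag' lists only)."""
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    tags, in_tags = [], False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # end of the metadata block
            break
        if stripped == "tags:":
            in_tags = True
        elif in_tags and stripped.startswith("- "):
            tags.append(stripped[2:].strip())
        elif stripped:  # any other key ends the tags list
            in_tags = False
    return tags


def has_untagged_merge(card_text: str) -> bool:
    """True if the card shows a merge config but lacks the 'merge' tag."""
    return "merge_method:" in card_text and "merge" not in front_matter_tags(card_text)


BODY = "base_model: teknium/OpenHermes-2-Mistral-7B\nmerge_method: slerp\n"
UNTAGGED = "---\nlicense: apache-2.0\n---\n" + BODY  # license is a placeholder
TAGGED = "---\nlicense: apache-2.0\ntags:\n- merge\n---\n" + BODY

print(has_untagged_merge(UNTAGGED))
print(has_untagged_merge(TAGGED))
```

The first card would be reported as an untagged merge and the second would not.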

picAIso

May 31

hi, can someone please explain why picAIso/TARS-8B was flagged despite having the merge tag attached?

clefourrier

Open LLM Leaderboard org Jun 3

Hi @picAIso !
If it's about this tag and the model was submitted before having the tag attached, it was flagged then, but it should be updated automatically.

hyokwan

Jun 24

https://huggingface.co/hyokwan/hkcode-solar-youtube-merged
I added the merge tag in my readme!

Could it be recovered?
Thanks! :)

clefourrier

Open LLM Leaderboard org Jun 24

Hi! It should be updated automatically, feel free to ping us again if it's not OK in a couple days