mistral-goliath-12b ?
I heard Goliath 120B is at GPT-4 level on some benchmarks. Is it possible to use the same merge techniques to generate a merge of 2 Mistral models? It would be interesting to see if the same capabilities are amplified as well. Maybe a merge of 3 models would be even stronger :)
There is no publicly available 70B Mistral yet though.
The 7B models are quite strong. Maybe merging multiple small models would improve the overall result. I don't know... newbie here. But your Goliath experiment seems to indicate a valid path.
That's not how it works. Stacking three copies of the same model would just duplicate a lot of layers, which would eventually give you a garbage model unless it is fine-tuned further.
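For anyone curious what a Goliath-style layer-stacking ("passthrough") merge of two different Mistral 7B fine-tunes might look like, here is a rough Python sketch using transformers. The model names and layer ranges below are made up purely for illustration, not a tested recipe; Goliath itself was reportedly built with mergekit, which does this slicing from a config file.

```python
# Minimal sketch of a passthrough/frankenmerge: interleave decoder layers
# from two Mistral-7B fine-tunes into one taller model.
# Model names and slice ranges are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, MistralConfig, MistralForCausalLM

MODEL_A = "org/mistral-7b-finetune-a"  # hypothetical donor A
MODEL_B = "org/mistral-7b-finetune-b"  # hypothetical donor B

# Layer slices taken from each donor, in stacking order.
# Goliath interleaved overlapping ranges of its two donors; these ranges are invented.
SLICES = [
    (MODEL_A, 0, 16),
    (MODEL_B, 8, 24),
    (MODEL_A, 16, 32),
    (MODEL_B, 24, 32),
]

donors = {
    name: AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)
    for name in {MODEL_A, MODEL_B}
}

# Build an empty Mistral model with enough decoder layers for the stacked slices.
total_layers = sum(end - start for _, start, end in SLICES)
config = MistralConfig.from_pretrained(MODEL_A)
config.num_hidden_layers = total_layers
merged = MistralForCausalLM(config).to(torch.bfloat16)

# Embeddings, final norm and LM head are copied from one donor.
merged.model.embed_tokens.load_state_dict(donors[MODEL_A].model.embed_tokens.state_dict())
merged.model.norm.load_state_dict(donors[MODEL_A].model.norm.state_dict())
merged.lm_head.load_state_dict(donors[MODEL_A].lm_head.state_dict())

# Copy decoder layers slice by slice into the taller stack.
target = 0
for name, start, end in SLICES:
    for i in range(start, end):
        merged.model.layers[target].load_state_dict(donors[name].model.layers[i].state_dict())
        target += 1

merged.save_pretrained("mistral-frankenmerge")
```

Note that the sketch uses two *different* fine-tunes, which is what Goliath did; duplicating the same weights, as discussed above, mostly just adds redundant layers.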