# BETTER THAN GOLIATH?!
I merged a Euryale LoRA that I made into Xwin, then merged the result with itself in a Goliath-style merge using mergekit. The resulting model performs better than Goliath on my tests (note: performance on tests is not necessarily performance in practice). Test it, have fun with it. This is a sister model of Premerge-EX-EX-123B.
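For the curious, a Goliath-style merge is a mergekit passthrough merge that stacks overlapping layer ranges of the source model(s). Below is a minimal config sketch, assuming the Euryale LoRA has already been baked into Xwin at a hypothetical local path; the layer ranges are illustrative and not the exact recipe used for this model:

```yaml
# Illustrative Goliath-style passthrough self-merge (not the exact recipe used here).
# "./xwin-plus-euryale-lora" is a hypothetical path to Xwin with the Euryale LoRA merged in.
merge_method: passthrough
dtype: float16
slices:
  - sources:
      - model: ./xwin-plus-euryale-lora
        layer_range: [0, 20]
  - sources:
      - model: ./xwin-plus-euryale-lora
        layer_range: [10, 30]
  - sources:
      - model: ./xwin-plus-euryale-lora
        layer_range: [20, 40]
  # ...continue with overlapping ranges up to layer 80 (Llama-2-70B has 80 layers)
```

A config like this is run with `mergekit-yaml config.yaml ./output-dir` (flags omitted).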
## Prompt format
Alpaca.
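For reference, the standard Alpaca template (without an input field) looks like this:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```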
## Ideas behind it
Since the creation of Goliath, I've been wondering if it was possible to make something even better. I tried linear, passthrough, SLERP, and TIES merges, but I could not recreate the greatness of Goliath, at least not in a way that I liked in practical use. I knew LoRAs existed, but I didn't know how well they performed. I created a model named Gembo by merging a shitton of LoRAs together, and surprisingly it worked! In fact, it worked so well that it was the best model on my benchmarks until now. When I found a tool named LoRD, which can extract a LoRA from any model, I knew I could do something even better.
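For context, this kind of LoRA extraction boils down to taking a low-rank approximation of the difference between the fine-tuned weights and the base weights, typically via SVD. A rough PyTorch sketch of the principle (this is not LoRD's actual code):

```python
import torch

def extract_lora(base_w: torch.Tensor, tuned_w: torch.Tensor, rank: int = 32):
    """Approximate (tuned_w - base_w) with a rank-`rank` factorization, LoRA-style.

    Returns (lora_A, lora_B) such that base_w + lora_B @ lora_A is roughly tuned_w.
    """
    delta = (tuned_w - base_w).float()
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    U, S, Vh = U[:, :rank], S[:rank], Vh[:rank, :]
    lora_B = U * S.sqrt()                # (out_features, rank)
    lora_A = S.sqrt().unsqueeze(1) * Vh  # (rank, in_features)
    return lora_A, lora_B
```

Do that for every targeted projection matrix, package the factors in peft's adapter format, and you get a LoRA that approximates the fine-tune's delta over the base.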
I extracted a LoRA from Euryale, then one from Xwin, and began testing. Merging the Euryale LoRA into Xwin, and the other way around, created better models that outperformed their parents:
Name | Quant | Size | B | C | D | S | P | total | BCD | SP |
---|---|---|---|---|---|---|---|---|---|---|
Sao10K/Euryale-1.3-L2-70B | Q6_K | 70B | 0 | 2 | 0 | 3 | 5 | 10 | 2 | 8 |
Sao10K/Euryale-1.3-L2-70B+xwin-lora | Q6_K | 70B | 2 | 2 | 1 | 5.5 | 5.5 | 16 | 5 | 11 |
Xwin-LM/Xwin-LM-70B-V0.1 | Q6_K | 70B | 0 | 1 | 2 | 5.5 | 5.25 | 13.75 | 3 | 10.75 |
Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora | Q6_K | 70B | 3 | 2 | 2 | 6 | 5 | 18 | 7 | 11 |
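Applying an extracted LoRA to a base model like this is a standard peft merge; here is a minimal sketch with hypothetical paths (not the actual artifacts):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model and bake the extracted LoRA into its weights.
base = AutoModelForCausalLM.from_pretrained(
    "Xwin-LM/Xwin-LM-70B-V0.1", torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base, "./euryale-lora")  # hypothetical LoRA path
merged = model.merge_and_unload()  # fold the LoRA delta into the base weights
merged.save_pretrained("./xwin-plus-euryale-lora")
```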
The results seemed promising, so I continued testing, merging the models in a Goliath-like way in different orders (EX = Euryale + Xwin LoRA; XE = Xwin + Euryale LoRA). The results were even more surprising:
Name | Quant | Size | B | C | D | S | P | total | BCD | SP |
---|---|---|---|---|---|---|---|---|---|---|
alpindale/goliath-120b | Q6_K | 120B | 3 | 2 | 1 | 6 | 6 | 18 | 6 | 12 |
ChuckMcSneed/Premerge-EX-EX-123B | Q6_K | 123B | 2 | 2 | 1.5 | 7.25 | 6 | 18.75 | 5.5 | 13.25 |
ChuckMcSneed/Premerge-EX-XE-123B | Q6_K | 123B | 2 | 2 | 2 | 5.75 | 6 | 17.75 | 6 | 11.75 |
ChuckMcSneed/Premerge-XE-EX-123B | Q6_K | 123B | 2 | 2 | 2.5 | 6.75 | 5.5 | 18.75 | 6.5 | 12.25 |
ChuckMcSneed/Premerge-XE-XE-123B (this model) | Q6_K | 123B | 3 | 2 | 2.5 | 7.25 | 5.25 | 20 | 7.5 | 12.5 |
Sao10K/Euryale-1.3-L2-70B+xwin-lora | Q6_K | 70B | 2 | 2 | 1 | 5.5 | 5.5 | 16 | 5 | 11 |
Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora | Q6_K | 70B | 3 | 2 | 2 | 6 | 5 | 18 | 7 | 11 |
Contrary to my expectations, merging two different models was suboptimal in this case. The self-merge of Euryale + Xwin LoRA beat all of the other merges on the SP tests (creative writing), making it the highest-scoring model on those tests that I've tested so far, and the self-merge of Xwin + Euryale LoRA (this model) had the highest score overall.
## What it means
Potentially, in the future, we can get better models through controlled merging of LoRAs.
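As a sketch of what controlled LoRA merging could look like, here is one way to combine adapters with explicit weights using peft (adapter names, paths, and weights below are made up; this assumes the adapters are in standard peft format and a reasonably recent peft version):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Xwin-LM/Xwin-LM-70B-V0.1", torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base, "./euryale-lora", adapter_name="euryale")  # hypothetical path
model.load_adapter("./another-lora", adapter_name="other")                         # hypothetical path

# Combine the adapters with chosen weights instead of merging whole 70B checkpoints.
model.add_weighted_adapter(
    adapters=["euryale", "other"],
    weights=[0.7, 0.3],
    adapter_name="controlled-mix",
    combination_type="linear",
)
model.set_adapter("controlled-mix")
```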
## Benchmarks
### NeoEvalPlusN
Name | Quant | Size | B | C | D | S | P | total | BCD | SP |
---|---|---|---|---|---|---|---|---|---|---|
alpindale/goliath-120b | Q6_K | 120B | 3 | 2 | 1 | 6 | 6 | 18 | 6 | 12 |
ChuckMcSneed/Premerge-EX-EX-123B | Q6_K | 123B | 2 | 2 | 1.5 | 7.25 | 6 | 18.75 | 5.5 | 13.25 |
ChuckMcSneed/Premerge-EX-XE-123B | Q6_K | 123B | 2 | 2 | 2 | 5.75 | 6 | 17.75 | 6 | 11.75 |
ChuckMcSneed/Premerge-XE-EX-123B | Q6_K | 123B | 2 | 2 | 2.5 | 6.75 | 5.5 | 18.75 | 6.5 | 12.25 |
ChuckMcSneed/Premerge-XE-XE-123B (this model) | Q6_K | 123B | 3 | 2 | 2.5 | 7.25 | 5.25 | 20 | 7.5 | 12.5 |
Sao10K/Euryale-1.3-L2-70B | Q6_K | 70B | 0 | 2 | 0 | 3 | 5 | 10 | 2 | 8 |
Sao10K/Euryale-1.3-L2-70B+xwin-lora | Q6_K | 70B | 2 | 2 | 1 | 5.5 | 5.5 | 16 | 5 | 11 |
Xwin-LM/Xwin-LM-70B-V0.1 | Q6_K | 70B | 0 | 1 | 2 | 5.5 | 5.25 | 13.75 | 3 | 10.75 |
Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora | Q6_K | 70B | 3 | 2 | 2 | 6 | 5 | 18 | 7 | 11 |