[Model Discussion]

#1
by Darkknight535 - opened

Hey, thanks for the 8B model.

The issue with the 8B model was that it lost coherence at that scale. That's why I first expanded it into 12B variants and then merged them. This approach lets the merge keep the full layer stack of one model (all of the 8B) while stacking the remaining layers from another model on top of it. I tried using the 8B model myself, but ultimately decided to remove it.
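For anyone curious, this kind of layer-stacking upscale is commonly done with a mergekit `passthrough` config. The sketch below is just an illustration of the idea, not the actual recipe used here: the model names and layer ranges are placeholders, assuming 32-layer 8B bases.

```yaml
# Hypothetical mergekit passthrough config: keep all 32 layers of one
# 8B base, then stack the upper layers of a second model on top to
# reach a ~12B parameter count. Model names and ranges are examples only.
slices:
  - sources:
      - model: base-model-8b        # placeholder name
        layer_range: [0, 32]        # full layer stack of the first model
  - sources:
      - model: donor-model-8b       # placeholder name
        layer_range: [8, 32]        # remaining layers added on top
merge_method: passthrough
dtype: bfloat16
```

The `passthrough` method copies layers verbatim rather than averaging weights, which is what allows one model's full depth to be preserved while extra layers are appended.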

Cheers!
