Discussion: Moistral meter?
quantization_options = [
"IQ3_M", "IQ3_S", "IQ3_XS", "IQ3_XXS", "Q4_K_M", "Q4_K_S", "IQ4_XS", "Q5_K_M", "Q5_K_S", "Q6_K", "Q8_0"
]
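For reference, that list is the kind of thing a batch-quantization script loops over. A minimal sketch of such a loop, assuming llama.cpp is built locally and the model has already been converted to an f16 GGUF (file names here are placeholders, not the actual repo files):

```python
import subprocess

quantization_options = [
    "IQ3_M", "IQ3_S", "IQ3_XS", "IQ3_XXS", "Q4_K_M", "Q4_K_S",
    "IQ4_XS", "Q5_K_M", "Q5_K_S", "Q6_K", "Q8_0",
]

base = "moistral-11b-v2"  # placeholder file stem
for quant in quantization_options:
    # llama.cpp usage: ./quantize <input-f16.gguf> <output.gguf> <type>
    # (the IQ types benefit a lot from an --imatrix; see further down the thread)
    subprocess.run(
        ["./quantize", f"{base}-f16.gguf", f"{base}-{quant}.gguf", quant],
        check=True,
    )
```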
@Nitral-AI @jeiku @Epiculous @localfultonextractor @Virt-io
Just curious if this looks interesting and moist enough for y'all?
Check it out here...
https://huggingface.co/TheDrummer/Moistral-11B-v2
Rebalanced the genres and writing perspectives:
Included more romance, "family", fantasy, "diversity", science fiction, and many more that I can't make euphemisms for!
The classic favorites.
11B is the largest size I can say feels comfortable in 8GB VRAM, and even that means sacrificing down to IQ3-imat quants. Not a size bracket I've had much of a taste of.
But some people swear by running a bigger model at a lower quant, since the extra parameters can outweigh the quantization loss?
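To put rough numbers on the 8GB claim (approximate llama.cpp bits-per-weight figures; actual GGUF sizes vary a bit, and the KV cache still needs headroom on top):

```python
# Back-of-the-envelope weight-file sizes for an 11B model at a few quant levels.
PARAMS = 11e9
approx_bpw = {"IQ3_XXS": 3.06, "IQ3_M": 3.66, "Q4_K_M": 4.85, "Q5_K_M": 5.69}

for name, bpw in approx_bpw.items():
    gib = PARAMS * bpw / 8 / 2**30
    print(f"{name}: ~{gib:.1f} GiB")
# IQ3_M lands around ~4.7 GiB, leaving room for context in 8 GB;
# Q4_K_M at ~6.2 GiB is already a squeeze.
```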
@TheDrummer
Heya mate, just curious if you'd have any thoughts about how bad you reckon it'd do with the I/Q3 quants (if you had to resort to them)?
I know it's always an option, but I'm just wondering if it would be too degraded to do the Moistful experience justice.
Honestly, I have not tried anything lower than Q4 for my 11B models.
If it's any help, Moistral is based on Fimbulvetr v2, which is based on Solar 10.7B, which is a continued pretraining (or merge?) of Mistral 7B v2 Instruct.
At Q4, things are alright. I don't see any obvious perplexity issues. There are some logic issues but I don't know if that's a quant thing.
If you all want a more level-headed model, the dried % models might suit you.
This was never an option.
Unless the moist is way too moist; we'll see then, thanks. It might also hold up better against being dumbed down by the quant.
I am expecting IQ3 quants to perform decently at least if they use calibration data. Time to inject the copium.
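For reference, the calibration step the copium rests on looks roughly like this with llama.cpp's imatrix tool (file names are placeholders):

```python
import subprocess

# 1) Collect an importance matrix from a representative text sample.
subprocess.run([
    "./imatrix",
    "-m", "moistral-11b-v2-f16.gguf",  # placeholder f16 conversion
    "-f", "calibration.txt",           # placeholder calibration corpus
    "-o", "moistral-11b-v2.imatrix",
], check=True)

# 2) Quantize with the imatrix applied, so the low-bit IQ3 formats keep
#    extra precision where the calibration data says it matters most.
subprocess.run([
    "./quantize", "--imatrix", "moistral-11b-v2.imatrix",
    "moistral-11b-v2-f16.gguf", "moistral-11b-v2-IQ3_M.gguf", "IQ3_M",
], check=True)
```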
Absolutely. I'm sitting through the IQ3s at the moment - for testing.
Can you upload a Q4_K_M quant so I can test it?
I've never had much luck on quants lower than 4 bit.
I'll still test the 3-bit; it's just that whenever I use a 3-bit quant, the reasoning feels like it drops drastically.
Yeah, it's tough... I dunno, it doesn't feel any better than Erosumika or Eris Prime Vision V4 32K β. I'm running temperature at 0.95 and it isn't standing out as any better. There's also the context size: the 7Bs I mentioned share that goodness of Mistral 0.2, so they actually handle long context a lot better, and Eris is vision-compatible on top. A lot of that is likely the IQ3 quant, which is just too degraded to be worth it, but I also don't like the model Moistral is based on that much, even at Q4, so probably some of that as well.
For 12GB VRAM folks this will be very nice.
Notably...
It is obviously VERY compliant, immediately jumping into whatever NSFW kink you have, so it's an option to keep handy if you need that.
Maybe I'll get more out of it as I spend more time playing with it.
I saw @TheDrummer mention a possible V2.2, so that's an option.
Yeah, having issues with hallucination.
I have a character card of a merchant's daughter, and she kept saying she was a princess trapped in a castle or screaming for help. (I only said hello.)
Ah, yeah, that's rough. I wasn't having it that bad, but it was dumb and bland due to the extreme quant, making mistakes like that too, just smaller ones, like having a poor sense of space.
The problem is I also don't like the model it's based on, so take me with a grain of salt.
@Nitral-AI @jeiku @Epiculous @localfultonextractor @Virt-io @ABX-AI
Just a quick notice that I'll be gone, or not around too often, for a little while; I need to sort out some pressing matters at the moment.
I might pop in a few times a week to see what's cooking. Expect quant-requests to take a bit longer than usual, but I should be able to run them. ABX can fill in if they are available and I take too long, haha. Cheers mateys. Message me on discord if something massive happens.
Take care bud, no worries! Life > AI (for now)
I can't promise anything about doing requests atm (I'm keeping my repo entirely to my own merges for self-archiving purposes, though I may change that for some outstanding models), but I'm gonna miss you. Do what you need to, and hopefully you can recharge for the llama3 and 1.5-bit era :D The wait has been too long already.
Take care, space cowboy! We'll miss ya'! Until next time!
Will miss ya bud, hope everything is dealt with easy enough!
I'll still be checking in. :3
Take care friend, hope whatever you got going on goes smooth.