nan in blk.27.attn_output.weight

#6
by mradermacher - opened

just fyi, llama says blk.27.attn_output.weight contains nans, so likely the model is broken.

Interesting. It talks just fine when I use it. I did encounter issues however when I evaluated it.

it's not completely uncommon to have a model with some nans that still talks. for example, with luck it might only affect some vocabulary. the bigger issue is that the resulting quants won't load if nan checking is on (it's off by default though).
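For anyone who wants to check a checkpoint themselves, here is a minimal sketch of a per-tensor nan scan. It is not llama's actual check: the helper names `count_nans_f32` and `find_nan_tensors` are made up for illustration, and it assumes each tensor is available as a raw buffer of little-endian float32 values. The toy data below stands in for real weights.

```python
import math
import struct


def count_nans_f32(raw):
    """Count NaN values in a buffer of little-endian float32 values."""
    if len(raw) % 4 != 0:
        raise ValueError("buffer length is not a multiple of 4 bytes")
    count = 0
    # iter_unpack yields one 1-tuple per float32 in the buffer.
    for (value,) in struct.iter_unpack("<f", raw):
        if math.isnan(value):
            count += 1
    return count


def find_nan_tensors(tensors):
    """Return {tensor_name: nan_count} for every tensor containing a NaN."""
    report = {}
    for name, raw in tensors.items():
        n = count_nans_f32(raw)
        if n:
            report[name] = n
    return report


# Hypothetical toy values, not taken from the real model.
nan = float("nan")
tensors = {
    "blk.27.attn_output.weight": struct.pack("<4f", 0.1, nan, -0.3, nan),
    "blk.27.attn_q.weight": struct.pack("<4f", 0.5, 0.25, -1.0, 2.0),
}
print(find_nan_tensors(tensors))
```

A scan like this only tells you which tensors are affected and how badly. It says nothing about whether the nans actually matter at inference time, which fits the point above that a model with a few nans can still talk.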

Anyway, the real test is if it pleases the users. nobody cares about issues if the result is to their liking :)

Right. I'll retrain it when compute clears up. Thank you for catching this!!
