Congrats!

#1
by aaronday3 - opened

It's great to see someone else being inspired by my work. You might want to reduce the learning rate a bit, maybe to 4e-6, depending on what your eval loss looks like.

I would strongly suggest turning on eval loss, even with something like 1% of your dataset held out; without it, you're flying blind.
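A minimal sketch of what that 1% holdout might look like (the function name, seed, and split logic here are my own illustration, not anything from this thread or a specific library):

```python
import random

def split_eval(dataset, eval_frac=0.01, seed=42):
    """Hold out a small fraction of a dataset for computing eval loss.

    `dataset` is any list of training examples. A fixed seed keeps the
    split reproducible across runs, so eval loss curves stay comparable.
    """
    rng = random.Random(seed)
    indices = list(range(len(dataset)))
    rng.shuffle(indices)
    n_eval = max(1, int(len(dataset) * eval_frac))  # keep at least one example
    eval_idx = set(indices[:n_eval])
    train = [ex for i, ex in enumerate(dataset) if i not in eval_idx]
    eval_set = [ex for i, ex in enumerate(dataset) if i in eval_idx]
    return train, eval_set
```

Most training frameworks will accept the two lists (or datasets built from them) directly as train/eval splits; the key point is just that the eval examples never appear in training.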

Feel free to ask me for help; you can find me on my Discord server and on Kobold. Also curious what exactly is in your dataset, and which human conversations you are referring to :P

Yeah, I'm probably going to show up on the Discord once I feel like I'm not a complete idiot. But seriously, it was the write-up for Celeste that made me think I could do this. Huge thanks for that!

Since training v5, I've managed to get eval loss working and stats logging to wandb. You're not wrong, it makes a big difference, and now I'm able to do some real comparison testing of learning rates and rank/alpha.
