RYS loss

by dnhkng - opened Aug 18

Aug 18

How were the loss curves on the RYS models? Did you see a faster initial drop than with calme-2.1 on the base Qwen2-72B model?

MaziyarPanahi

Owner Aug 19

The loss curve looks identical to that of the calme-2.1-qwen2-72b model. The differences are negligible in this case, considering the RLHF. The model is too large for me to evaluate locally on those benchmarks, so I don't know how much improvement to expect.

dnhkng

Aug 20

Is the training data public? I see a calme legal, is that it?

MaziyarPanahi

Owner Aug 20

The dataset used for RLHF is listed in the README.

dnhkng

Aug 20

edited Aug 20

Ah cool, thanks!

Did you hand-generate these? 1k pairs seems within the limit of hand-tailored DPO pairs, barely 😅

I've not tried fine-tuning in anger before. When I'm back from holiday in September, let me know if you are interested in meeting. I'm always interested to meet other AI enthusiasts!

dnhkng

Sep 4

Wow, I'm really shocked this didn't take first place.

I assumed this would be top of the leaderboard when the results came in. Very surprising outcome, considering the procedures used.

Maybe we should do some collaboration?

MaziyarPanahi

Owner Sep 4

To be honest, this one scored higher locally. The seed, the batch, etc. cause some small precision changes. It's fine for me; I was trying to top my previous model without making it worse. So mission accomplished.
