What datasets were these trained on?
Can you add the datasets this model was trained on to the model card so people know? I'd like to know what kind of model this is in general.
+1 … what is SMAUG?
SMAUG is the dragon from The Hobbit, I believe. If you've read Tolkien. ))
We've now updated the model card to include the datasets this model was trained on. It will still have many of the qualities of Meta-Llama, but in this finetune we have tried to improve its reasoning, math, and coding skills in particular.
More information on the exact technique/data will be released later on. For now, see the previous Smaug paper: https://arxiv.org/abs/2402.13228.
Hello, the DPOP method proposed in the Smaug paper is based on preference datasets. However, the datasets provided in the model card are SFT datasets. I was wondering how to convert the provided SFT datasets into preference datasets. Maybe by sampling from Llama-3-8B-Instruct and using a reward model to score the samples?
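One plausible recipe along those lines: for each SFT prompt, sample several candidate answers from Llama-3-8B-Instruct, score each candidate with an off-the-shelf reward model, and keep the best and worst as the chosen/rejected pair. Here is a minimal sketch of that idea; the specific reward model, sampling parameters, and pairing rule are illustrative assumptions, not the method the Smaug authors actually used.

```python
# Hypothetical sketch: building DPO-style preference pairs from SFT prompts
# by sampling candidate answers and ranking them with a reward model.
# The reward model and all hyperparameters below are assumptions for
# illustration, not anything confirmed by the Smaug authors.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

policy_name = "meta-llama/Meta-Llama-3-8B-Instruct"
reward_name = "OpenAssistant/reward-model-deberta-v3-large-v2"  # example RM

policy_tok = AutoTokenizer.from_pretrained(policy_name)
policy = AutoModelForCausalLM.from_pretrained(
    policy_name, torch_dtype=torch.bfloat16, device_map="auto"
)
reward_tok = AutoTokenizer.from_pretrained(reward_name)
reward = AutoModelForSequenceClassification.from_pretrained(
    reward_name, device_map="auto"
)


def sample_completions(prompt: str, n: int = 4, max_new_tokens: int = 256) -> list[str]:
    """Draw n diverse completions from the instruct model for one SFT prompt."""
    messages = [{"role": "user", "content": prompt}]
    input_ids = policy_tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(policy.device)
    outputs = policy.generate(
        input_ids,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
        num_return_sequences=n,
        max_new_tokens=max_new_tokens,
    )
    # Strip the prompt tokens and decode only the generated continuations.
    prompt_len = input_ids.shape[-1]
    return [
        policy_tok.decode(o[prompt_len:], skip_special_tokens=True) for o in outputs
    ]


@torch.no_grad()
def reward_score(prompt: str, completion: str) -> float:
    """Score a (prompt, completion) pair with the reward model."""
    inputs = reward_tok(prompt, completion, return_tensors="pt", truncation=True).to(
        reward.device
    )
    return reward(**inputs).logits[0].item()


def to_preference_pair(prompt: str) -> dict:
    """Best-scoring sample becomes 'chosen', worst becomes 'rejected'."""
    candidates = sample_completions(prompt)
    ranked = sorted(candidates, key=lambda c: reward_score(prompt, c))
    return {"prompt": prompt, "chosen": ranked[-1], "rejected": ranked[0]}
```

A common variant is to keep the original SFT reference answer as "chosen" and only sample the "rejected" side, which skips the reward model entirely when the SFT labels are trusted.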