Model and training documentation

by ftvalentini - opened Oct 16

Oct 16

Are there any details or documentation on how these sentence embeddings are extracted from CroissantLLM, and on how the model was fine-tuned, if it was?
Thanks in advance!

manu

Owner Oct 16

Hey!
It's not documented, but basically you take the CroissantLLM hidden state for the EOS token (or a weighted average of the token hidden states) and train contrastively with Sentence Transformers on the dataset that is listed!
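
For readers who want to see what those two pooling options look like, here is a minimal sketch. It is not the code used for this model. It assumes right padding, that the tokenizer appends EOS as the last real token, and (for the weighted average) position weights that grow linearly, which is only one possible choice since the weighting is not specified above. The function names `eos_pool` and `weighted_mean_pool` are hypothetical, and plain Python lists stand in for the tensors you would use in practice.

```python
from typing import List, Optional

Vector = List[float]


def eos_pool(hidden_states: List[Vector], attention_mask: List[int]) -> Vector:
    """Return the hidden state of the last non-padding token.

    Assumption: right padding, with EOS appended by the tokenizer,
    so the last attended position is the EOS token.
    """
    last = max(i for i, m in enumerate(attention_mask) if m == 1)
    return list(hidden_states[last])


def weighted_mean_pool(
    hidden_states: List[Vector],
    attention_mask: List[int],
    weights: Optional[List[float]] = None,
) -> Vector:
    """Weighted average of the hidden states over non-padding tokens.

    Assumption: if no weights are given, token i gets weight i + 1,
    so later tokens count more. Padding tokens always get weight 0.
    """
    if weights is None:
        weights = [float(i + 1) for i in range(len(hidden_states))]
    masked = [w * m for w, m in zip(weights, attention_mask)]
    total = sum(masked)
    dim = len(hidden_states[0])
    return [
        sum(w * h[d] for w, h in zip(masked, hidden_states)) / total
        for d in range(dim)
    ]


# Toy sequence: 3 real tokens followed by 1 padding token, hidden size 2.
hidden = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

eos_vec = eos_pool(hidden, mask)             # hidden state of the third token
mean_vec = weighted_mean_pool(hidden, mask)  # padding row is ignored
```

Either vector would then serve as the sentence embedding that the contrastive objective is trained on.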

ftvalentini

Oct 16

Great, thanks! And do you happen to know how that dataset was built?
