qgallouedec/online-dpo-qwen2-2


qgallouedec (HF staff) committed 27 days ago

Commit 0849991 • 1 Parent(s): 40676ce

End of training

Files changed (1)

README.md +1 -1

@@ -10,6 +10,6 @@ tags:
 licence: license
 ---
-# Model Card for Model name
+# Model Card for online-dpo-qwen2-2
 This model is a fine-tuned version of [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) on the https://huggingface.co/datasets/trl-lib/ultrafeedback-prompt dataset.