weixinchen/GRATH-gradtruth


weixinchen committed on Jul 17

Commit 724b142 • 1 Parent(s): 21ae3ae

Update README.md

Files changed (1)

README.md +5 -0


@@ -1,6 +1,11 @@
 ---
 library_name: peft
 ---
 ## Training procedure

 ---
 library_name: peft
 ---
+This is a gradually self-truthified model (with one iteration) proposed in the paper [GRATH: Gradual Self-Truthifying for Large Language Models](https://arxiv.org/abs/2401.12292).
+Note: DPO is applied to this model twice. The reference model for DPO is set to the current base model.
 ## Training procedure