stanfordnlp/SteamSHP-flan-t5-large

Text2Text Generation

preference model

kawine committed on Oct 10, 2023

Commit ea35567 • 1 Parent(s): 4a34927

Update README.md

Files changed (1)

README.md +2 -0

@@ -21,6 +21,8 @@ tags:
 <!-- Provide a quick summary of what the model is/does. -->
+**If you mention this dataset in a paper, please cite the paper:** [Understanding Dataset Difficulty with V-Usable Information (ICML 2022)](https://proceedings.mlr.press/v162/ethayarajh22a.html).
 SteamSHP-Large is a preference model trained to predict -- given some context and two possible responses -- which response humans will find more helpful.
 It can be used for NLG evaluation or as a reward model for RLHF.