Which split was used to report results on the OpenAI Moderation Dataset?
It seems the model was trained/fine-tuned on the OpenAI Moderation evaluation dataset: https://huggingface.co/datasets/mmathys/openai-moderation-api-evaluation
There are some validation metric scores mentioned here: https://huggingface.co/KoalaAI/Text-Moderation#validation-metrics
Could you provide the split you used for it? I am not able to replicate your scores on the entire https://huggingface.co/datasets/mmathys/openai-moderation-api-evaluation dataset.
@KoalaAI could you please comment on it?
Hi! Sorry for the delayed response; I don't get notifications from this org.
This dataset was used as a base and was modified to fit the requirements of AutoTrain. AutoTrain has since been axed, so I'm not sure I still have the modified training data split.
I still have the script used to modify the training data, but the split was made randomly by AutoTrain during training.
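For anyone who wants a repeatable evaluation in the meantime: the original random split may not be recoverable, so the sketch below will not reproduce the reported validation metrics. It only shows one way to make your own split deterministic with a fixed seed. The function name `seeded_split`, the 20% validation fraction, the seed, and the row count of 1000 are all assumptions for illustration, not values taken from AutoTrain or from this model's training run.

```python
import random


def seeded_split(n_rows, valid_fraction=0.2, seed=42):
    """Return (train_indices, valid_indices) for a reproducible random split.

    Hypothetical helper: the fraction and seed are illustrative defaults,
    not the settings AutoTrain used.
    """
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)
    n_valid = int(n_rows * valid_fraction)
    valid_indices = sorted(indices[:n_valid])
    train_indices = sorted(indices[n_valid:])
    return train_indices, valid_indices


# Placeholder row count; replace with the real number of rows in your copy of the dataset.
train_idx, valid_idx = seeded_split(1000)
print(len(train_idx), len(valid_idx))
```

Sharing the seed and fraction alongside any scores lets others evaluate on exactly the same rows, which is what is missing for the metrics discussed above.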