What split was used to report results on the OpenAI Moderation Dataset?

by avanigupta - opened Apr 17

Apr 17

It seems the model was trained/fine-tuned on the OpenAI Moderation evaluation dataset: https://huggingface.co/datasets/mmathys/openai-moderation-api-evaluation

There are some validation metric scores mentioned here: https://huggingface.co/KoalaAI/Text-Moderation#validation-metrics

Could you share the split you used for these scores? I am not able to replicate them on the entire https://huggingface.co/datasets/mmathys/openai-moderation-api-evaluation dataset.

@KoalaAI could you please comment on it?

DarwinAnim8or

Koala AI org May 8

Hi! Sorry for the delayed response; I don't get notifications from this org.

This dataset was used as a base and was modified to fit the requirements of AutoTrain, which has since been axed, so I'm not sure I still have the variant with the training data split.
I still have the script used to modify the training data, but the split was made randomly by AutoTrain (AT) during training.
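
For anyone trying to get comparable numbers: since the original split was random and not saved, the closest option is probably to hold out your own seeded validation subset rather than scoring on the entire dataset. Below is a minimal sketch of that idea in plain Python. The function name `random_split`, the 20% validation fraction, the seed, and the placeholder rows are all assumptions for illustration; they are not the values or the code AutoTrain used, so scores will still differ somewhat from the model card.

```python
import random


def random_split(records, valid_fraction=0.2, seed=0):
    """Shuffle a copy of `records` and return (train, valid).

    Hypothetical helper: the fraction and seed are illustrative
    assumptions, not the settings AutoTrain used.
    """
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be between 0 and 1")
    shuffled = list(records)  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_valid = int(round(len(shuffled) * valid_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]


# Placeholder rows standing in for the modified moderation dataset.
rows = [{"id": i, "text": f"example {i}"} for i in range(100)]
train, valid = random_split(rows, valid_fraction=0.2, seed=42)
print(len(train), len(valid))
```

Fixing the seed makes the held-out subset reproducible, so results can at least be compared between runs even though the original split cannot be recovered.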
