gpt-Youtube / README.md
---
datasets:
  - breadlicker45/youtube-comments-180k
pipeline_tag: text-generation
---

This model was trained on 180K YouTube comments.

It was trained for 100k steps.
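Since the card tags the model for text generation, a minimal sketch of sampling from it with the `transformers` pipeline might look like the following. The repo id `breadlicker45/gpt-Youtube` is an assumption pieced together from the page title and the dataset owner, and `generate_comment` is a hypothetical helper, not something this repo ships.

```python
# Assumed repo id: inferred from the page title and dataset owner, not stated in the card.
MODEL_ID = "breadlicker45/gpt-Youtube"


def generate_comment(prompt: str, max_new_tokens: int = 40) -> str:
    """Sample one continuation of `prompt` from the model (hypothetical helper)."""
    # Imported lazily so this module loads even when transformers is not installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    # The pipeline returns a list of dicts; the text includes the original prompt.
    return outputs[0]["generated_text"]
```

Calling `generate_comment("This video is")` would download the weights on first use. Because sampling is enabled, the output differs from run to run.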

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 24.86 |
| ARC (25-shot) | 23.29 |
| HellaSwag (10-shot) | 26.34 |
| MMLU (5-shot) | 23.54 |
| TruthfulQA (0-shot) | 48.63 |
| Winogrande (5-shot) | 48.93 |
| GSM8K (5-shot) | 0.0 |
| DROP (3-shot) | 3.32 |
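The Avg. figure is consistent with an unweighted mean of the seven benchmark scores listed above. A quick sketch to check it, assuming that is how the average was computed (the `scores` dictionary is just the values copied from the results):

```python
from statistics import mean

# Scores copied from the results above, one entry per benchmark.
scores = {
    "ARC (25-shot)": 23.29,
    "HellaSwag (10-shot)": 26.34,
    "MMLU (5-shot)": 23.54,
    "TruthfulQA (0-shot)": 48.63,
    "Winogrande (5-shot)": 48.93,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 3.32,
}

# Unweighted mean over all seven benchmarks, rounded to two decimals.
average = round(mean(scores.values()), 2)
print(average)
```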