metadata
license: apache-2.0
datasets:
  - Open-Orca/SlimOrca
  - jondurbin/airoboros-3.1
  - riddle_sense
language:
  - en
library_name: transformers

Built with Axolotl

SlimOrcaBoros

A Mistral 7B model fine-tuned on SlimOrca, Airoboros 3.1, and RiddleSense.

Training

Trained for 4 epochs; the released checkpoint is from epoch 3.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 54.1 |
| ARC (25-shot) | 63.65 |
| HellaSwag (10-shot) | 83.7 |
| MMLU (5-shot) | 63.46 |
| TruthfulQA (0-shot) | 55.81 |
| Winogrande (5-shot) | 77.03 |
| GSM8K (5-shot) | 23.43 |
| DROP (3-shot) | 11.62 |
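
As a quick sanity check, the reported average appears to be the unweighted mean of the seven benchmark scores listed above. The sketch below recomputes it under that assumption. The `leaderboard_average` helper is a hypothetical name used only for illustration and is not part of any leaderboard tooling.

```python
# Scores copied from the results listed above.
scores = {
    "ARC (25-shot)": 63.65,
    "HellaSwag (10-shot)": 83.7,
    "MMLU (5-shot)": 63.46,
    "TruthfulQA (0-shot)": 55.81,
    "Winogrande (5-shot)": 77.03,
    "GSM8K (5-shot)": 23.43,
    "DROP (3-shot)": 11.62,
}


def leaderboard_average(values):
    """Return the unweighted mean of the scores (assumed averaging rule)."""
    values = list(values)
    return sum(values) / len(values)


avg = leaderboard_average(scores.values())
print(f"Avg. {avg:.1f}")
```

Rounded to one decimal place, the result matches the reported `Avg.` value, which suggests that all seven metrics, including GSM8K and DROP, are weighted equally.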