Adding Evaluation Results

#3
Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -68,4 +68,17 @@ GOAT-7B-Community model weights are available under LLAMA-2 license. Note that t
 
 ### Risks and Biases
 
- GOAT-7B-Community model can produce factually incorrect output and should not be relied on to deliver factually accurate information. The model was trained on various private and public datasets. Therefore, the GOAT-7B-Community model could possibly generate wrong, biased, or otherwise offensive outputs.
+ GOAT-7B-Community model can produce factually incorrect output and should not be relied on to deliver factually accurate information. The model was trained on various private and public datasets. Therefore, the GOAT-7B-Community model could possibly generate wrong, biased, or otherwise offensive outputs.
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_GOAT-AI__GOAT-7B-Community)
+
+ | Metric                | Value |
+ |-----------------------|-------|
+ | Avg.                  | 42.74 |
+ | ARC (25-shot)         | 48.81 |
+ | HellaSwag (10-shot)   | 74.63 |
+ | MMLU (5-shot)         | 49.58 |
+ | TruthfulQA (0-shot)   | 42.48 |
+ | Winogrande (5-shot)   | 72.3  |
+ | GSM8K (5-shot)        | 4.47  |
+ | DROP (3-shot)         | 6.91  |
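As a sanity check on the table above, the leaderboard's "Avg." appears to be the plain arithmetic mean of the seven individual benchmark scores, which matches the reported 42.74; a minimal sketch with the values copied from the table:

```python
# Benchmark scores copied from the evaluation table above.
scores = {
    "ARC (25-shot)": 48.81,
    "HellaSwag (10-shot)": 74.63,
    "MMLU (5-shot)": 49.58,
    "TruthfulQA (0-shot)": 42.48,
    "Winogrande (5-shot)": 72.3,
    "GSM8K (5-shot)": 4.47,
    "DROP (3-shot)": 6.91,
}

# Arithmetic mean over all benchmarks, rounded to two decimals
# as on the leaderboard.
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 42.74
```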