uukuguy committed on
Commit
43dea8e
1 Parent(s): 74cc608

Update README.md

Files changed (1)
  1. README.md +28 -7
README.md CHANGED
@@ -23,7 +23,7 @@ model-index:
  metrics:
  - name: pass@1
  type: pass@1
- value: 50.0
+ value: 51.21951219512195
  verified: false
  ---
 
@@ -51,7 +51,14 @@ Total 201,981 samples.
 
  | Metric | Value |
  | --- | --- |
- | humaneval-python | 50.0|
+ | humaneval-python | 51.21951219512195|
+
+ ## Big Code Evaluation
+
+ | | Humaneval | Java | Javascript | CPP | Php | Rust | Swift | R | Lua | D | Racket | Julia |
+ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
+ | pass@1 | 0.4260 | 0.3165 | 0.4241 | 0.3467 | 0.3548 | 0.2454 | 0.0000 | 0.1735 | 0.2942 | 0.1087 | 0.0000 | 0.3081 |
+ | pass@10 | 0.5784 | 0.4506 | 0.5891 | 0.4845 | 0.4997 | 0.3858 | 0.0000 | 0.2516 | 0.4126 | 0.2018 | 0.0000 | 0.4427 |
 
  [Big Code Models Leaderboard](https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard)
 
@@ -69,14 +76,28 @@ CodeLlama-13B: 35.07
 
  ## lm-evaluation-harness
 
+ ```json
+ {'ARC (acc_norm)': 0.6109215017064846,
+ 'HellaSwag (acc_norm)': 0.8358892650866361,
+ 'MMLU (acc)': 0.6325456394049195,
+ 'TruthfulQA (mc2)': 0.4746745250371087,
+ 'Winoground (acc)': 0.7829518547750592,
+ 'GSM8K (acc)': 0.467778620166793,
+ 'DROP (f1)': 0.49585675335570545,
+ 'Open LLM Score': 0.61437428571428571}
+ ```
+
  [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+
  | Metric | Value |
  | --- | --- |
- | ARC |59.64 |
- | HellaSwag |82.25 |
- | MMLU | 61.33 |
- | TruthfulQA | 48.45 |
- | Average | 62.92 |
+ | ARC |60.58 |
+ | HellaSwag |83.47 |
+ | MMLU | 62.98 |
+ | TruthfulQA | 47.9 |
+ | Winoground | 78.69 |
+ | GSM8K | 19.18 |
+ | Average | 58.85 |
 
  ## Parameters
 
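A note on the `Open LLM Score` value in the JSON block this commit adds: the README does not say how it was computed. The sketch below is a minimal check, assuming the score is a simple mean of the seven listed metrics. The scores are copied from the diff; the helper `mean_score` and the choice of five decimal places are illustrative assumptions, not something taken from the author's evaluation scripts.

```python
import math

# Scores copied from the JSON block added to README.md in this commit.
harness_scores = {
    "ARC (acc_norm)": 0.6109215017064846,
    "HellaSwag (acc_norm)": 0.8358892650866361,
    "MMLU (acc)": 0.6325456394049195,
    "TruthfulQA (mc2)": 0.4746745250371087,
    "Winoground (acc)": 0.7829518547750592,
    "GSM8K (acc)": 0.467778620166793,
    "DROP (f1)": 0.49585675335570545,
}
reported_open_llm_score = 0.61437428571428571


def mean_score(scores, ndigits=None):
    """Average the scores, optionally rounding each one to ndigits first.

    Hypothetical helper for this check only.
    """
    values = list(scores.values())
    if ndigits is not None:
        values = [round(v, ndigits) for v in values]
    return sum(values) / len(values)


# Mean of the full-precision values.
plain_mean = mean_score(harness_scores)
# Mean after rounding each metric to five decimals (an assumed procedure).
rounded_mean = mean_score(harness_scores, ndigits=5)

print(f"plain mean:              {plain_mean:.10f}")
print(f"mean of 5-decimal values: {rounded_mean:.10f}")
print(f"reported:                 {reported_open_llm_score:.10f}")
print("rounded mean matches reported:",
      math.isclose(rounded_mean, reported_open_llm_score, abs_tol=1e-9))
```

Under these assumptions, the mean of the values rounded to five decimals agrees with the reported score, while the full-precision mean differs from it in the seventh decimal place. This is consistent with the score having been averaged from rounded per-task results, but it does not confirm how the author actually computed it.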