Update README.md
README.md
CHANGED
@@ -10,6 +10,11 @@ base_model:
This reward model is finetuned from the [Ray2333/GRM-llama3-8B-sftreg](https://huggingface.co/Ray2333/GRM-llama3-8B-sftreg) using the [Skywork preference dataset](https://huggingface.co/datasets/Skywork/Skywork-Reward-Preference-80K-v0.1).
+# Evaluation
+We evaluated this reward model on [reward-bench](https://huggingface.co/spaces/allenai/reward-bench), where it achieves an average score of 91.6.
+
+{'Chat': 0.9553072625698324, 'Chat Hard': 0.8618421052631579, 'Safety': 0.9116798876798876, 'Reasoning': 0.9361529437442025}
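As a quick sanity check, the reported average can be reproduced from the four category scores above. This is a minimal sketch that assumes the 91.6 figure is the unweighted mean of the four categories; the README does not state the averaging method, and the variable names here are illustrative only.

```python
from statistics import mean

# Category scores copied verbatim from the README.
scores = {
    "Chat": 0.9553072625698324,
    "Chat Hard": 0.8618421052631579,
    "Safety": 0.9116798876798876,
    "Reasoning": 0.9361529437442025,
}

# Assumption: the headline number is a plain (unweighted) mean.
average = mean(scores.values())

print(f"{average * 100:.1f}")  # prints 91.6
```

Under that assumption the mean is about 0.9162, which matches the stated 91.6 after rounding to one decimal place.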
+
## Usage
```