Ray2333 committed on
Commit 2d6becc
1 Parent(s): cf0d660

Update README.md

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -10,6 +10,11 @@ base_model:
  This reward model is finetuned from the [Ray2333/GRM-llama3-8B-sftreg](https://huggingface.co/Ray2333/GRM-llama3-8B-sftreg) using the [Skywork preference dataset](https://huggingface.co/datasets/Skywork/Skywork-Reward-Preference-80K-v0.1).
 
+ # Evaluation
+ We evaluated this reward model on [reward-bench](https://huggingface.co/spaces/allenai/reward-bench), where it achieves an average score of 91.6.
+
+ {'Chat': 0.9553072625698324, 'Chat Hard': 0.8618421052631579, 'Safety': 0.9116798876798876, 'Reasoning': 0.9361529437442025}
+
  ## Usage
 
  ```