Large gap in downstream task results between the before-cooldown checkpoint and the final checkpoint?
#9, opened by siqi-zz
We evaluated both the checkpoint taken before cooldown and the final checkpoint, and found a large gap in downstream task results: the final checkpoint scores significantly higher. Were any special strategies used during the cooldown phase?
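| checkpoint | arc (25-shot) | hellaswag (10-shot) | mmlu (5-shot) | truthfulqa | winogrande (5-shot) | gsm (5-shot) |
|---|---|---|---|---|---|---|
| final | 43.52 | 70.3 | 39.8 | 36.61 | 64.17 | 17.29 |
| before cooldown | 38.4 | 67.59 | 30 | 34.9 | 61.96 | 7.35 |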
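To make the gap concrete, here is a minimal sketch that computes the per-task difference between the two rows of scores above. It assumes the first row is the final checkpoint and the second is the before-cooldown checkpoint (the rows are not labeled in the post, but this matches the statement that the final checkpoint scores higher). The variable and function names are only illustrative.

```python
# Scores copied from the two rows above, in column order.
tasks = ["arc", "hellaswag", "mmlu", "truthfulqa", "winogrande", "gsm"]
final_ckpt = [43.52, 70.3, 39.8, 36.61, 64.17, 17.29]   # assumed: final checkpoint
before_cooldown = [38.4, 67.59, 30.0, 34.9, 61.96, 7.35]  # assumed: before cooldown


def deltas(final, before):
    """Return the per-task score difference (final minus before), rounded to 2 decimals."""
    return [round(f - b, 2) for f, b in zip(final, before)]


gains = dict(zip(tasks, deltas(final_ckpt, before_cooldown)))

# Print tasks sorted by gain, largest first.
for task, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{task:<12} {gain:+.2f}")
```

Under that row assignment, every task improves, and the largest gains are on gsm and mmlu.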