Stick To Your Role! Leaderboard

{{ table_html|safe }}
Evaluate a custom model

To evaluate a custom model you can use our open-source code. If a model is in the Hugging Face transformers format (saved either locally or on the hub), it can be added simply by creating a config file. The model can then be evaluated like any other model. To do so, follow the instructions in the README.md file.
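Before writing the config file, it can help to confirm that the model actually loads in the transformers format. The snippet below is a minimal sketch that is independent of the leaderboard code; the model identifier is a placeholder for your local path or hub id.

```python
# Sanity check (not part of the leaderboard code): verify the model loads
# with Hugging Face transformers before adding a leaderboard config file.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder: local path or hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Quick generation smoke test
inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```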

Submit a custom model to the Stick To Your Role! Leaderboard

If you wish, your model can be added to the Stick To Your Role! Leaderboard as an unofficial submission. A separate list of models containing both official and unofficial submissions will be created. The procedure is as follows:

  1. Add and evaluate your model - Add your model as a config file as described above. This procedure should result in 9 JSON files at paths of the form: `Leaderboard/results/stability_leaderboard/<your_model_name>/chunk_0_<timestamp>/results.json`
  2. Submit the config file - Create a pull request to our repository from a branch named "unofficial_model/<your_model_name>". The pull request should ideally only add the config file in `./models/leaderboard_configs`. If additional changes are needed, they should ideally be constrained to a new model class (see huggingfacemodel.py for reference).
  3. Submit the model results - Submit the `results.json` files as a ZIP using the form below (see the sketch after this list). We will integrate the model's results on our side, and rerank models with yours included.
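The following is a hedged helper sketch for step 3: it checks that the expected results.json files exist and packs the model's results directory into a ZIP for the form. The paths and the expected file count follow the description above; `<your_model_name>` is a placeholder, and you may need to adjust the glob if your local layout differs.

```python
# Collect the results.json files produced in step 1 and zip them for upload.
from pathlib import Path
import shutil

model_name = "<your_model_name>"  # placeholder
results_dir = Path("Leaderboard/results/stability_leaderboard") / model_name

# Each evaluation chunk writes one results.json; 9 are expected in total.
results_files = sorted(results_dir.glob("chunk_*/results.json"))
print(f"Found {len(results_files)} results.json files (expected 9).")

# Create <your_model_name>_results.zip containing the whole results directory.
archive = shutil.make_archive(f"{model_name}_results", "zip", root_dir=results_dir)
print(f"Wrote {archive}")
```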
Please upload a ZIP file containing the results directory.

If you found this project useful, please cite our related paper:

```
@inproceedings{kovavc2024stick,
  title={Stick to your Role! Stability of Personal Values Expressed in Large Language Models},
  author={Kova{\v{c}}, Grgur and Portelas, R{\'e}my and Sawayama, Masataka and Dominey, Peter Ford and Oudeyer, Pierre-Yves},
  booktitle={Proceedings of the Annual Meeting of the Cognitive Science Society},
  volume={46},
  year={2024}
}
```