Dataset link opens in new tab
- templates/about.html +2 -2
- templates/index.html +1 -1
templates/about.html
CHANGED
@@ -216,7 +216,7 @@
 We adopt the <b>Schwartz Theory of Basic Personal Values</b>, which defines 10 values: Self-Direction, Stimulation, Hedonism, Achievement, Power, Security, Conformity, Tradition, Benevolence, and Universalism.
 To evaluate their expression we use the associated questionnaires: <b>PVQ-40</b>, and <b>SVS</b>.
 </p>
-<p>You can browse the questionnaires, population, and contexts used on our <a href="https://huggingface.co/datasets/flowers-team/StickToYourRole">🤗 StickToYourRole dataset</a>. </p>
+<p>You can browse the questionnaires, population, and contexts used on our <a target="_blank" href="https://huggingface.co/datasets/flowers-team/StickToYourRole">🤗 StickToYourRole dataset</a>. </p>
 <p>
 The Stick to Your Role! leaderboard aims to provide an up-to-date comparison of recent LLMs based on their ability to coherently simulate popultions.
 It, in tandem with other minimal-context benchmarks, should enable you to choose the best-suited model for your usecase!
@@ -258,7 +258,7 @@
 <li> <b> chess </b>: "1. e4" is given as the initial message to all personas, but for each persona the Interlocutor model is instructed to simulate a different persona (instead of a human user) </li>
 <li> <b> grammar </b>: like chess, but "Can you check this sentence for grammar? \n Whilst Jane was waiting to meet hers friend their nose started bleeding." is given as the initial message.
 </ul>
-<p>You can browse the simulated population, questionnaires, and contexts used on our <a href="https://huggingface.co/datasets/flowers-team/StickToYourRole">🤗 StickToYourRole dataset</a>.</p>
+<p>You can browse the simulated population, questionnaires, and contexts used on our <a target="_blank" href="https://huggingface.co/datasets/flowers-team/StickToYourRole">🤗 StickToYourRole dataset</a>.</p>
 </div>
 <div class="section" id="validation">
 <div class="section-title">Validation</div>
templates/index.html
CHANGED
@@ -252,7 +252,7 @@
 The Stick to You Role! leaderboard focuses on the <b>stability of simulated personal values during role-playing</b>.
 We study the <b>coherence of a simulated population</b>.
 In contrast to evaluating each simulated persona separately, we evaluate personas relative to each other, i.e. as a population.
-You can browse the simulated population, questionnaires, and contexts used on our <a href="https://huggingface.co/datasets/flowers-team/StickToYourRole">🤗 StickToYourRole dataset</a>.
+You can browse the simulated population, questionnaires, and contexts used on our <a target="_blank" href="https://huggingface.co/datasets/flowers-team/StickToYourRole">🤗 StickToYourRole dataset</a>.
 </p>
 <div class="table-responsive main-table">
 <!-- Render the table HTML here -->
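The change in this commit (adding target="_blank" to the dataset links) could be checked mechanically. The following is a minimal sketch, not part of the commit: the class name ExternalLinkChecker and the function links_missing_blank_target are hypothetical, and it assumes that "external" means any href starting with http:// or https://. It uses only Python's standard html.parser module.

```python
from html.parser import HTMLParser


class ExternalLinkChecker(HTMLParser):
    """Collect external <a> tags that do not open in a new tab (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.missing_target = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href") or ""
        # Assumption: a link is "external" if it is an absolute http(s) URL.
        is_external = href.startswith(("http://", "https://"))
        if is_external and attr_map.get("target") != "_blank":
            self.missing_target.append(href)


def links_missing_blank_target(html_text):
    """Return the hrefs of external links that lack target="_blank"."""
    checker = ExternalLinkChecker()
    checker.feed(html_text)
    checker.close()
    return checker.missing_target


# The link as it appears before and after this commit.
before = '<a href="https://huggingface.co/datasets/flowers-team/StickToYourRole">dataset</a>'
after = '<a target="_blank" href="https://huggingface.co/datasets/flowers-team/StickToYourRole">dataset</a>'

print(links_missing_blank_target(before))
print(links_missing_blank_target(after))
```

Running such a check over templates/about.html and templates/index.html would flag any remaining external links that still open in the same tab.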