Paper link as new tab
templates/index.html  CHANGED  +4 -4
@@ -242,7 +242,7 @@
 LLM-exhibited behavior always depends on the context (prompt).
 While some context-dependence is desired (e.g. following instructions),
 some is undesired (e.g. drastically changing the simulated value expression based on the interlocutor).
-As proposed in our <a href="https://arxiv.org/abs/2402.14846">paper</a>,
+As proposed in our <a target="_blank" href="https://arxiv.org/abs/2402.14846">paper</a>,
 undesired context-dependence should be seen as a <b>property of LLMs</b> - a dimension of LLM comparison (alongside others such as model size, speed, or expressed knowledge).
 This leaderboard aims to provide such a comparison and extends our paper with a more focused and elaborate experimental setup.
 Standard benchmarks present <b>many</b> questions from the <b>same minimal contexts</b> (e.g. multiple choice questions),
@@ -259,12 +259,12 @@
 {{ main_table_html|safe }}
 </div>
 <div class="image-container">
-<a href="{{ url_for('static', filename='models_data/cardinal.svg') }}" target="_blank">
-<img src="{{ url_for('static', filename='models_data/cardinal.svg') }}" alt="Cardinal">
-</a>
 <a href="{{ url_for('static', filename='models_data/ordinal.svg') }}" target="_blank">
 <img src="{{ url_for('static', filename='models_data/ordinal.svg') }}" alt="Ordinal">
 </a>
+<a href="{{ url_for('static', filename='models_data/cardinal.svg') }}" target="_blank">
+<img src="{{ url_for('static', filename='models_data/cardinal.svg') }}" alt="Cardinal">
+</a>
 </div>
 <p>
 We leverage Schwartz's theory of <a href="https://www.sciencedirect.com/science/article/abs/pii/S0065260108602816">Basic Personal Values</a>,