grg committed on
Commit
0467a8f
1 Parent(s): 99de6d0

Paper link as new tab

  1. templates/index.html +4 -4
templates/index.html CHANGED
@@ -242,7 +242,7 @@
  LLM-exhibited behavior always depends on the context (prompt).
  While some context-dependence is desired (e.g. following instructions),
  some is undesired (e.g. drastically changing the simulated value expression based on the interlocutor).
- As proposed in our <a href="https://arxiv.org/abs/2402.14846">paper</a>,
+ As proposed in our <a target="_blank" href="https://arxiv.org/abs/2402.14846">paper</a>,
  undesired context-dependence should be seen as a <b>property of LLMs</b> - a dimension of LLM comparison (alongside others such as model size speed or expressed knowledge).
  This leaderboard aims to provide such a comparison and extends our paper with a more focused and elaborate experimental setup.
  Standard benchmarks present <b>many</b> questions from the <b>same minimal contexts</b> (e.g. multiple choice questions),
@@ -259,12 +259,12 @@
  {{ main_table_html|safe }}
  </div>
  <div class="image-container">
- <a href="{{ url_for('static', filename='models_data/cardinal.svg') }}" target="_blank">
- <img src="{{ url_for('static', filename='models_data/cardinal.svg') }}" alt="Cardinal">
- </a>
  <a href="{{ url_for('static', filename='models_data/ordinal.svg') }}" target="_blank">
  <img src="{{ url_for('static', filename='models_data/ordinal.svg') }}" alt="Ordinal">
  </a>
+ <a href="{{ url_for('static', filename='models_data/cardinal.svg') }}" target="_blank">
+ <img src="{{ url_for('static', filename='models_data/cardinal.svg') }}" alt="Cardinal">
+ </a>
  </div>
  <p>
  We leverage Schwartz's theory of <a href="https://www.sciencedirect.com/science/article/abs/pii/S0065260108602816">Basic Personal Values</a>,