Spaces: flowers-team/StickToYourRoleLeaderboard

grg committed on Sep 24

Commit 0467a8f • 1 Parent(s): 99de6d0

Paper link as new tab

Files changed (1)

templates/index.html +4 -4

@@ -242,7 +242,7 @@
             LLM-exhibited behavior always depends on the context (prompt).
             While some context-dependence is desired (e.g. following instructions),
             some is undesired (e.g. drastically changing the simulated value expression based on the interlocutor).
-            As proposed in our <a href="https://arxiv.org/abs/2402.14846">paper</a>,
+            As proposed in our <a target="_blank" href="https://arxiv.org/abs/2402.14846">paper</a>,
             undesired context-dependence should be seen as a <b>property of LLMs</b> - a dimension of LLM comparison (alongside others such as model size speed or expressed knowledge).
             This leaderboard aims to provide such a comparison and extends our paper with a more focused and elaborate experimental setup.
             Standard benchmarks present <b>many</b> questions from the <b>same minimal contexts</b> (e.g. multiple choice questions),
@@ -259,12 +259,12 @@
             {{ main_table_html|safe }}
         </div>
         <div class="image-container">
-            <a href="{{ url_for('static', filename='models_data/cardinal.svg') }}" target="_blank">
-                <img src="{{ url_for('static', filename='models_data/cardinal.svg') }}" alt="Cardinal">
-            </a>
             <a href="{{ url_for('static', filename='models_data/ordinal.svg') }}" target="_blank">
                 <img src="{{ url_for('static', filename='models_data/ordinal.svg') }}" alt="Ordinal">
             </a>
+            <a href="{{ url_for('static', filename='models_data/cardinal.svg') }}" target="_blank">
+                <img src="{{ url_for('static', filename='models_data/cardinal.svg') }}" alt="Cardinal">
+            </a>
         </div>
         <p>
             We leverage Schwartz's theory of <a href="https://www.sciencedirect.com/science/article/abs/pii/S0065260108602816">Basic Personal Values</a>,