Updating bibtex
- static/figures/cardinal.svg +48 -48
- static/figures/ordinal.svg +48 -48
- templates/about.html +9 -4
- templates/index.html +5 -4
- templates/new_model.html +7 -6
static/figures/cardinal.svg
CHANGED
static/figures/ordinal.svg
CHANGED
templates/about.html
CHANGED
@@ -8,6 +8,8 @@
|
|
8 |
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/5.1.3/css/bootstrap.min.css">
|
9 |
<!-- Include DataTables CSS -->
|
10 |
<link rel="stylesheet" href="https://cdn.datatables.net/1.11.5/css/dataTables.bootstrap5.min.css">
|
|
|
|
|
11 |
<!-- Custom CSS for additional styling -->
|
12 |
<style>
|
13 |
body {
|
@@ -307,8 +309,10 @@ their expression of that value).
|
|
307 |
To rank models, we aggregate the rank-order and validity metrics in two ways :
|
308 |
</p>
|
309 |
<ul>
|
310 |
-
<li> <b> Cardinal - Score (↑) </b> - the score is averaged over all metrics (with descending metrics inverted), context pairs (for stability) and contexts (for validity metrics)
|
311 |
-
|
|
|
|
|
312 |
</ul>
|
313 |
<p>
|
314 |
Following this <a href="https://arxiv.org/abs/2405.01719">paper</a> and associated <a href="https://github.com/socialfoundations/benchbench">benchbench</a> library,
|
@@ -354,10 +358,11 @@ their expression of that value).
|
|
354 |
<div class="citation-section">
|
355 |
<p>If you found this project useful, please cite our related paper:</p>
|
356 |
<div class="citation-box" id="citation-text">
|
357 |
-
@
|
358 |
title={Stick to your Role! Stability of Personal Values Expressed in Large Language Models},
|
359 |
author={Kova{\v{c}}, Grgur and Portelas, R{\'e}my and Sawayama, Masataka and Dominey, Peter Ford and Oudeyer, Pierre-Yves},
|
360 |
-
|
|
|
361 |
year={2024}
|
362 |
}
|
363 |
</div>
|
|
|
8 |
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/5.1.3/css/bootstrap.min.css">
|
9 |
<!-- Include DataTables CSS -->
|
10 |
<link rel="stylesheet" href="https://cdn.datatables.net/1.11.5/css/dataTables.bootstrap5.min.css">
|
11 |
+
<!-- Include mathjax -->
|
12 |
+
<script type="text/javascript" async src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/3.1.2/es5/tex-mml-chtml.js"></script>
|
13 |
<!-- Custom CSS for additional styling -->
|
14 |
<style>
|
15 |
body {
|
|
|
309 |
To rank models, we aggregate the rank-order and validity metrics in two ways :
|
310 |
</p>
|
311 |
<ul>
|
312 |
+
<li> <b> Cardinal - Score (↑) </b> - the score is averaged over all metrics (with descending metrics inverted), context pairs (for stability) and contexts (for validity metrics),
|
313 |
+
i.e. the average of \( \binom{n\_context\_chunks}{2} + n\_validity\_metrics*n\_context\_chunks \) values.</li>
|
314 |
+
<li> <b> Ordinal - Win rate (↑) </b> - for each metric, each context pair (for stability) and each context (for validity metrics) is considered as a game between two models, the win rate of a model is the percentage of won games against all models,
|
315 |
+
i.e. the average of \( (n\_models-1) * ( \binom{n\_context\_chunks}{2} + n\_validity\_metrics*n\_context\_chunks) \).</li>
|
316 |
</ul>
|
317 |
<p>
|
318 |
Following this <a href="https://arxiv.org/abs/2405.01719">paper</a> and associated <a href="https://github.com/socialfoundations/benchbench">benchbench</a> library,
|
|
|
358 |
<div class="citation-section">
|
359 |
<p>If you found this project useful, please cite our related paper:</p>
|
360 |
<div class="citation-box" id="citation-text">
|
361 |
+
@inproceedings{kovavc2024stick,
|
362 |
title={Stick to your Role! Stability of Personal Values Expressed in Large Language Models},
|
363 |
author={Kova{\v{c}}, Grgur and Portelas, R{\'e}my and Sawayama, Masataka and Dominey, Peter Ford and Oudeyer, Pierre-Yves},
|
364 |
+
booktitle={Proceedings of the Annual Meeting of the Cognitive Science Society},
|
365 |
+
volume={46},
|
366 |
year={2024}
|
367 |
}
|
368 |
</div>
|
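The counts added to the Cardinal and Ordinal descriptions in templates/about.html above can be checked with a short sketch. This is only an illustration under stated assumptions: the function names (`n_cardinal_values`, `n_ordinal_games`, `win_rate`) are hypothetical and not taken from the project, scores are assumed to be "higher is better" after descending metrics are inverted, and ties are assumed to count as non-wins, which the diff does not specify.

```python
from math import comb


def n_cardinal_values(n_context_chunks: int, n_validity_metrics: int) -> int:
    """Number of values averaged for the cardinal score, per the formula in the diff:
    C(n_context_chunks, 2) + n_validity_metrics * n_context_chunks."""
    stability_terms = comb(n_context_chunks, 2)            # one per context pair
    validity_terms = n_validity_metrics * n_context_chunks  # one per metric and context
    return stability_terms + validity_terms


def n_ordinal_games(n_models: int, n_context_chunks: int, n_validity_metrics: int) -> int:
    """Number of games one model plays: each value is compared against every other model."""
    return (n_models - 1) * n_cardinal_values(n_context_chunks, n_validity_metrics)


def win_rate(model_values: list[float], opponents_values: list[list[float]]) -> float:
    """Fraction of games won by a model (assumption: higher is better, ties are not wins).

    model_values: the model's value for each metric/context (or context pair) entry.
    opponents_values: one list per opponent, aligned entry by entry with model_values.
    """
    games = [(m, o) for opp in opponents_values for m, o in zip(model_values, opp)]
    wins = sum(1 for m, o in games if m > o)
    return wins / len(games)


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    print(n_cardinal_values(n_context_chunks=4, n_validity_metrics=2))
    print(n_ordinal_games(n_models=3, n_context_chunks=4, n_validity_metrics=2))
    print(win_rate([0.5, 0.7], [[0.4, 0.8], [0.6, 0.6]]))
```

With 4 context chunks and 2 validity metrics, the cardinal score averages 6 + 8 = 14 values, and with 3 models each model plays 2 * 14 = 28 games.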
templates/index.html
CHANGED
@@ -110,7 +110,7 @@
|
|
110 |
border-radius: 8px;
|
111 |
padding: 5px;
|
112 |
margin-top: 5px;
|
113 |
-
font-size:
|
114 |
text-align: left;
|
115 |
font-family: 'Courier New', Courier, monospace;
|
116 |
white-space: pre;
|
@@ -224,7 +224,7 @@
|
|
224 |
and the associated PVQ-40 and SVS questionnaires (available <a href="https://www.researchgate.net/publication/354384463_A_Repository_of_Schwartz_Value_Scales_with_Instructions_and_an_Introduction">here</a>).
|
225 |
</p>
|
226 |
<p>
|
227 |
-
Using the <a href="https://pubmed.ncbi.nlm.nih.gov/31402448/">methodology from psychology</a>, we focus on population-level (interpersonal) value stability, i.e. <b>Rank-Order stability
|
228 |
Rank-Order stability refers to the extent the order of different personas (in terms of expression of some value) remains the same along different contexts.
|
229 |
Refer <a href="{{ url_for('about', _anchor='rank_order_stability') }}">here</a> or to our <a href="https://arxiv.org/abs/2402.14846">paper</a> for more details.
|
230 |
</p>
|
@@ -256,10 +256,11 @@
|
|
256 |
Refer <a href="{{ url_for('about', _anchor='paper') }}">here</a> for details.
|
257 |
</p>
|
258 |
<div class="citation-box" id="citation-text">
|
259 |
-
@
|
260 |
title={Stick to your Role! Stability of Personal Values Expressed in Large Language Models},
|
261 |
author={Kova{\v{c}}, Grgur and Portelas, R{\'e}my and Sawayama, Masataka and Dominey, Peter Ford and Oudeyer, Pierre-Yves},
|
262 |
-
|
|
|
263 |
year={2024}
|
264 |
}
|
265 |
</div>
|
|
|
110 |
border-radius: 8px;
|
111 |
padding: 5px;
|
112 |
margin-top: 5px;
|
113 |
+
font-size: 13px;
|
114 |
text-align: left;
|
115 |
font-family: 'Courier New', Courier, monospace;
|
116 |
white-space: pre;
|
|
|
224 |
and the associated PVQ-40 and SVS questionnaires (available <a href="https://www.researchgate.net/publication/354384463_A_Repository_of_Schwartz_Value_Scales_with_Instructions_and_an_Introduction">here</a>).
|
225 |
</p>
|
226 |
<p>
|
227 |
+
Using the <a href="https://pubmed.ncbi.nlm.nih.gov/31402448/">methodology from psychology</a>, we focus on population-level (interpersonal) value stability, i.e. <b>Rank-Order stability (RO stability)</b>.
|
228 |
Rank-Order stability refers to the extent the order of different personas (in terms of expression of some value) remains the same along different contexts.
|
229 |
Refer <a href="{{ url_for('about', _anchor='rank_order_stability') }}">here</a> or to our <a href="https://arxiv.org/abs/2402.14846">paper</a> for more details.
|
230 |
</p>
|
|
|
256 |
Refer <a href="{{ url_for('about', _anchor='paper') }}">here</a> for details.
|
257 |
</p>
|
258 |
<div class="citation-box" id="citation-text">
|
259 |
+
@inproceedings{kovavc2024stick,
|
260 |
title={Stick to your Role! Stability of Personal Values Expressed in Large Language Models},
|
261 |
author={Kova{\v{c}}, Grgur and Portelas, R{\'e}my and Sawayama, Masataka and Dominey, Peter Ford and Oudeyer, Pierre-Yves},
|
262 |
+
booktitle={Proceedings of the Annual Meeting of the Cognitive Science Society},
|
263 |
+
volume={46},
|
264 |
year={2024}
|
265 |
}
|
266 |
</div>
|
templates/new_model.html
CHANGED
@@ -271,12 +271,13 @@
|
|
271 |
<div class="citation-section">
|
272 |
<p>If you found this project useful, please cite our related paper:</p>
|
273 |
<div class="citation-box" id="citation-text">
|
274 |
-
|
275 |
-
|
276 |
-
|
277 |
-
|
278 |
-
|
279 |
-
|
|
|
280 |
</div>
|
281 |
</div>
|
282 |
</div>
|
|
|
271 |
<div class="citation-section">
|
272 |
<p>If you found this project useful, please cite our related paper:</p>
|
273 |
<div class="citation-box" id="citation-text">
|
274 |
+
@inproceedings{kovavc2024stick,
|
275 |
+
title={Stick to your Role! Stability of Personal Values Expressed in Large Language Models},
|
276 |
+
author={Kova{\v{c}}, Grgur and Portelas, R{\'e}my and Sawayama, Masataka and Dominey, Peter Ford and Oudeyer, Pierre-Yves},
|
277 |
+
booktitle={Proceedings of the Annual Meeting of the Cognitive Science Society},
|
278 |
+
volume={46},
|
279 |
+
year={2024}
|
280 |
+
}
|
281 |
</div>
|
282 |
</div>
|
283 |
</div>
|