---
library_name: transformers
tags:
- transformers
- peft
- arxiv:2406.08391
license: llama2
base_model: meta-llama/Llama-2-13b-chat-hf
datasets:
- calibration-tuning/Llama-2-13b-chat-hf-20k-choice
---

# Model Card

**Llama 13B Chat CT-Choice** is a fine-tuned [Llama 13B Chat](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf) model that provides well-calibrated confidence estimates for multiple-choice question answering.

The model is fine-tuned (calibration-tuned) on a [dataset](https://huggingface.co/datasets/calibration-tuning/Llama-2-13b-chat-hf-20k-choice) of *multiple-choice* generations from `meta-llama/Llama-2-13b-chat-hf`, labeled for correctness. At inference time, the predicted probability of correctness serves as the model's confidence in its answer.

For full details, please see our [paper](https://arxiv.org/abs/2406.08391) and supporting [code](https://github.com/activatedgeek/calibration-tuning).

**Other Models**: We also release a broader collection of [Multiple-Choice CT Models](https://huggingface.co/collections/calibration-tuning/multiple-choice-ct-models-66043dedebf973d639090821).

## Usage

This adapter is meant to be used on top of generations from the `meta-llama/Llama-2-13b-chat-hf` base model.

The confidence estimation pipeline follows these steps:

1. Load the base model and the PEFT adapter.
2. Disable the adapter and generate an answer.
3. Enable the adapter and estimate confidence.
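
The steps above can be sketched as follows, assuming `model` is a `peft.PeftModel` wrapping the base chat model and `tokenizer` is its tokenizer. This is a simplified, illustrative adaptation: the uncertainty-query wording and the `(i)`/`(ii)` readout below are assumptions, and the exact prompt format is defined in the supporting repository.

```python
import torch

# Illustrative uncertainty query; the exact wording used for calibration-tuned
# models is defined in the supporting repository, not reproduced here.
UNCERTAINTY_QUERY = (
    "\nIs the proposed answer correct?\n"
    "Choices:\n(i): no\n(ii): yes\n"
    "Answer:"
)

@torch.inference_mode()
def answer_with_confidence(model, tokenizer, prompt, max_new_tokens=32):
    """Steps 2-3: base-model answer, then adapter-scored confidence."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Step 2: disable the adapter so the answer comes from the base model.
    with model.disable_adapter():
        out = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    answer = tokenizer.decode(
        out[0, inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )

    # Step 3: adapter enabled again (its default state) -> read the probability
    # that the answer is correct from the next-token distribution.
    query = tokenizer(
        prompt + answer + UNCERTAINTY_QUERY, return_tensors="pt"
    ).to(model.device)
    logits = model(**query).logits[0, -1]
    # Taking the last sub-token of each choice label is a simplification.
    no_id = tokenizer(" i", add_special_tokens=False).input_ids[-1]
    yes_id = tokenizer(" ii", add_special_tokens=False).input_ids[-1]
    confidence = torch.softmax(logits[[no_id, yes_id]], dim=-1)[1].item()
    return answer, confidence
```

Because `disable_adapter()` is used as a context manager, the adapter is automatically re-enabled after the answer is generated, which matches the recommendation below to use the adapter only for confidence estimation.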

All standard guidelines for the base model's generation apply.

For a complete example, see [play.py](https://github.com/activatedgeek/calibration-tuning/blob/main/experiments/play.py) in the supporting code repository.

**NOTE**: Using the adapter for generation may hurt downstream task accuracy and confidence estimates. We recommend using the adapter *only* to estimate confidence.

## License

The model is released under the original model's Llama 2 Community License Agreement.