
This is a Llama 3.1 fine-tune trained with the RL algorithm and benchmark data proposed in the paper "Deal or no deal (or who knows)", published in ACL Findings 2024. Models from this paper are designed to forecast the outcome of an unfolding conversation by estimating the probability that the outcome will occur. For instance, these models can estimate the probability that a deal will be reached before the end of a negotiation.

The "Direct Forecaster" (the model in this repo) is trained with RL to output the probability in its sampled tokens. In the paper, this model seemed to handle out-of-distribution data the best. Based on our experiments, we expect lower, non-zero temperatures to be best for sampling.

The "Implicit Forecaster" (available here) is trained with SFT to output the estimated probability using the logit for the token " Yes". In the paper, this model performed best overall. Temperature should be left at the default value (i.e., 1).
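As a rough illustration of reading a probability off a single token's logit, the sketch below applies a softmax to a toy logit vector and returns the mass on one index. The three-token vocabulary, the `yes_index` argument, and the choice to normalize over the whole vector are assumptions made for illustration; the paper and the GitHub repo define the procedure actually used.

```python
import math


def yes_probability(logits, yes_index, temperature=1.0):
    """Softmax probability of the token at `yes_index`.

    `logits` is a plain list of floats standing in for next-token logits.
    Normalizing over the full vector is an assumption, not the paper's
    documented procedure.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    return exps[yes_index] / sum(exps)


# Hypothetical 3-token vocabulary where index 0 plays the role of " Yes".
toy_logits = [2.0, 0.5, -1.0]
p_yes = yes_probability(toy_logits, yes_index=0)
```

With a real model, the logit vector would come from the final position of the prompt and `yes_index` from the tokenizer's ID for " Yes".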

Here's a comparison of these models with some previous runs of GPT-4 (no fine-tuning). We use data priors and temperature scaling for both models (see paper for details).

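| model | alg | instances | Brier Score |
|---|---|---|---|
| Llama-3.1-8B-Instruct DF | RL interp | awry | 0.255467 |
| Llama-3.1-8B-Instruct DF | RL interp | casino | 0.216955 |
| Llama-3.1-8B-Instruct DF | RL interp | cmv | 0.261726 |
| Llama-3.1-8B-Instruct DF | RL interp | deals | 0.174899 |
| Llama-3.1-8B-Instruct DF | RL interp | deleted | 0.255129 |
| Llama-3.1-8B-Instruct DF | RL interp | donations | 0.251880 |
| Llama-3.1-8B-Instruct DF | RL interp | supreme | 0.231955 |
| Llama-3.1-8B-Instruct IF | SFT | awry | 0.220083 |
| Llama-3.1-8B-Instruct IF | SFT | casino | 0.196558 |
| Llama-3.1-8B-Instruct IF | SFT | cmv | 0.207542 |
| Llama-3.1-8B-Instruct IF | SFT | deals | 0.118853 |
| Llama-3.1-8B-Instruct IF | SFT | deleted | 0.114553 |
| Llama-3.1-8B-Instruct IF | SFT | donations | 0.238121 |
| Llama-3.1-8B-Instruct IF | SFT | supreme | 0.223060 |
| OpenAI GPT 4 | None | awry | 0.247775 |
| OpenAI GPT 4 | None | casino | 0.204828 |
| OpenAI GPT 4 | None | cmv | 0.230229 |
| OpenAI GPT 4 | None | deals | 0.132760 |
| OpenAI GPT 4 | None | deleted | 0.169750 |
| OpenAI GPT 4 | None | donations | 0.262453 |
| OpenAI GPT 4 | None | supreme | 0.230321 |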
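For readers unfamiliar with the metric, the Brier score is the mean squared difference between a forecast probability and the binary outcome, so lower is better and a constant forecast of 0.5 scores 0.25. The sketch below uses made-up forecasts and outcomes purely for illustration; it is not the evaluation code from the paper.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Illustrative values only: an uninformative forecaster always predicts 0.5.
baseline = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
# A perfectly confident and correct forecaster.
perfect = brier_score([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0])
```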

Note that for the best performance, certain prompt-engineering and post-processing procedures should be used (details are in the paper).
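One of the post-processing steps mentioned on this card is temperature scaling. A minimal sketch of the common form of that technique follows: convert the probability to a logit, divide by a temperature, and map it back through a sigmoid. The function name, the clamping constant `eps`, and the example temperature are assumptions; the paper's exact procedure, including how data priors are applied, is not reproduced here.

```python
import math


def temperature_scale(p, temperature, eps=1e-6):
    """Rescale a probability by dividing its logit by `temperature`.

    Temperatures above 1 pull the probability toward 0.5; temperatures
    below 1 push it toward 0 or 1. `eps` clamps p away from 0 and 1 so
    the logit stays finite.
    """
    p = min(max(p, eps), 1.0 - eps)
    logit = math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-logit / temperature))


# Hypothetical temperature of 2.0; in practice it would be fit on held-out data.
softened = temperature_scale(0.9, temperature=2.0)
```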

The GitHub repo (here) is also available if you wish to train new models with similar training algorithms. The repo also contains plenty of examples of how to use these models for inference and how to load them from a local directory.

For any questions, please feel free to reach out!

Some quantization details are given below:


library_name: peft

Training procedure

The following bitsandbytes quantization config was used during training:

  • quant_method: QuantizationMethod.BITS_AND_BYTES
  • _load_in_8bit: False
  • _load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float16
  • bnb_4bit_quant_storage: uint8
  • load_in_4bit: True
  • load_in_8bit: False
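The settings listed above map onto keyword arguments of `BitsAndBytesConfig` in `transformers`. The sketch below collects them in a plain dictionary and builds the config lazily, so the values can be inspected without `torch` or `transformers` installed. The names `QUANT_KWARGS` and `build_quant_config` are hypothetical helpers, and `bnb_4bit_quant_storage` is left at its library default because the argument is not available in every `transformers` release.

```python
# Values copied from the quantization config listed on this card.
QUANT_KWARGS = {
    "load_in_4bit": True,
    "load_in_8bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": False,
}


def build_quant_config():
    """Build a BitsAndBytesConfig; requires torch and transformers."""
    import torch
    from transformers import BitsAndBytesConfig

    # float16 matches the bnb_4bit_compute_dtype listed on the card.
    return BitsAndBytesConfig(bnb_4bit_compute_dtype=torch.float16, **QUANT_KWARGS)
```

The resulting object would typically be passed as `quantization_config` when loading the base model, before the PEFT adapter is attached; the GitHub repo has the authors' own loading examples.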

Framework versions

  • PEFT 0.5.0
