Quantization made by Richard Erkhov.

Excalibur-7b-DPO - GGUF

Model creator: https://huggingface.co/InferenceIllusionist/
Original model: https://huggingface.co/InferenceIllusionist/Excalibur-7b-DPO/

Name	Quant method	Size
Excalibur-7b-DPO.Q2_K.gguf	Q2_K	2.53GB
Excalibur-7b-DPO.IQ3_XS.gguf	IQ3_XS	2.81GB
Excalibur-7b-DPO.IQ3_S.gguf	IQ3_S	2.96GB
Excalibur-7b-DPO.Q3_K_S.gguf	Q3_K_S	2.95GB
Excalibur-7b-DPO.IQ3_M.gguf	IQ3_M	3.06GB
Excalibur-7b-DPO.Q3_K.gguf	Q3_K	3.28GB
Excalibur-7b-DPO.Q3_K_M.gguf	Q3_K_M	3.28GB
Excalibur-7b-DPO.Q3_K_L.gguf	Q3_K_L	3.56GB
Excalibur-7b-DPO.IQ4_XS.gguf	IQ4_XS	3.67GB
Excalibur-7b-DPO.Q4_0.gguf	Q4_0	3.83GB
Excalibur-7b-DPO.IQ4_NL.gguf	IQ4_NL	3.87GB
Excalibur-7b-DPO.Q4_K_S.gguf	Q4_K_S	3.86GB
Excalibur-7b-DPO.Q4_K.gguf	Q4_K	4.07GB
Excalibur-7b-DPO.Q4_K_M.gguf	Q4_K_M	4.07GB
Excalibur-7b-DPO.Q4_1.gguf	Q4_1	4.24GB
Excalibur-7b-DPO.Q5_0.gguf	Q5_0	4.65GB
Excalibur-7b-DPO.Q5_K_S.gguf	Q5_K_S	4.65GB
Excalibur-7b-DPO.Q5_K.gguf	Q5_K	4.78GB
Excalibur-7b-DPO.Q5_K_M.gguf	Q5_K_M	4.78GB
Excalibur-7b-DPO.Q5_1.gguf	Q5_1	5.07GB
Excalibur-7b-DPO.Q6_K.gguf	Q6_K	5.53GB
Excalibur-7b-DPO.Q8_0.gguf	Q8_0	7.17GB

Original model description:

license: apache-2.0 library_name: transformers tags: - finetune - dpo - chatml base_model: - InferenceIllusionist/Excalibur-7b datasets: - Intel/orca_dpo_pairs model-index: - name: Excalibur-7b-DPO results: - task: type: text-generation name: Text Generation dataset: name: AI2 Reasoning Challenge (25-Shot) type: ai2_arc config: ARC-Challenge split: test args: num_few_shot: 25 metrics: - type: acc_norm value: 70.9 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Excalibur-7b-DPO name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: HellaSwag (10-Shot) type: hellaswag split: validation args: num_few_shot: 10 metrics: - type: acc_norm value: 87.93 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Excalibur-7b-DPO name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MMLU (5-Shot) type: cais/mmlu config: all split: test args: num_few_shot: 5 metrics: - type: acc value: 65.46 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Excalibur-7b-DPO name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: TruthfulQA (0-shot) type: truthful_qa config: multiple_choice split: validation args: num_few_shot: 0 metrics: - type: mc2 value: 70.82 source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Excalibur-7b-DPO name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: Winogrande (5-shot) type: winogrande config: winogrande_xl split: validation args: num_few_shot: 5 metrics: - type: acc value: 82.48 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Excalibur-7b-DPO name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: GSM8k (5-shot) type: gsm8k config: main split: test args: num_few_shot: 5 metrics: - type: acc value: 65.43 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Excalibur-7b-DPO name: Open LLM Leaderboard

Excalibur-7b-DPO

An initial foray into the world of fine-tuning. The goal of this release was to amplify the quality of the original model's responses, in particular for vision use cases*

Weighted (Importance Matrix) Quants available here

Static (Legacy) quants available here

Notes & Methodology

Excalibur-7b fine-tuned with Direct Preference Optimization (DPO) using Intel/orca_dpo_pairs
This is a quick experiment to determine the impact of DPO finetuning on the Excelsior-7b base model
Ran for a little over an hour on a single A100
Fine-tuning succeeded in making model conversational and more well-rounded
Benchmark scores increased in the following categories versus base Excelsior-7b:
- ARC: 69.71 -> 70.9
- HellaSwag: 87.56 -> 87.93
- TruthfulQA: 67.24 -> 70.82
- Average: 73.6 -> 73.84
Precision: bfloat16

Sample Question - Vision

*Requires additional mmproj file. You have two options for vision functionality (available inside this repo):

Select the gguf file of your choice in Koboldcpp as usual, then make sure to choose the mmproj file above in the LLaVA mmproj field of the model submenu:

Prompt Format

For best results please use ChatML for the prompt format. Alpaca may also work.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	73.84
AI2 Reasoning Challenge (25-Shot)	70.90
HellaSwag (10-Shot)	87.93
MMLU (5-Shot)	65.46
TruthfulQA (0-shot)	70.82
Winogrande (5-shot)	82.48
GSM8k (5-shot)	65.43