Collection: Experimental and negative results
Models that didn't always quite work out, but may still be of interest.
This is a rank-32 LoRA adapter extracted from a language model using mergekit.
This LoRA is derived from the refusal ablation vector also computed in Llama-3-Instruct-abliteration-OVA-8B, and is around 2 orders of magnitude smaller than the full bf16 weights of the base model.
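As a rough check on the size comparison, the sketch below counts the parameters of a rank-32 adapter. It rests on assumptions not stated in this card: the commonly cited Llama-3-8B layer shapes, a base size of about 8.03 billion parameters, and an adapter that covers only the seven linear projections in each decoder layer (embeddings and the output head are ignored).

```python
import math

# Assumed Llama-3-8B shapes; these values are not stated in the card.
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024          # 8 key/value heads x 128 head dimension
NUM_LAYERS = 32
BASE_PARAMS = 8.03e9   # approximate parameter count of the base model
RANK = 32

# (in_features, out_features) of each linear projection in one decoder layer.
LAYER_PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank, projections=LAYER_PROJECTIONS, num_layers=NUM_LAYERS):
    """Count LoRA parameters: each projection gets A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in projections.values())
    return per_layer * num_layers


adapter_params = lora_param_count(RANK)      # 83,886,080 under these assumptions
ratio = BASE_PARAMS / adapter_params         # how many times smaller the adapter is
orders_of_magnitude = math.log10(ratio)

print(f"adapter parameters: {adapter_params:,}")
print(f"base / adapter ratio: {ratio:.1f}")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

Because both the adapter and the base weights are stored in bf16, the parameter ratio is also the approximate ratio of file sizes. Including the embedding and output-head matrices in the adapter would raise the count slightly without changing the order of magnitude.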
This LoRA adapter was extracted from failspy/Meta-Llama-3-8B-Instruct-abliterated-v3 and uses meta-llama/Meta-Llama-3-8B-Instruct as a base.
The following command was used to extract this LoRA adapter:
mergekit-extract-lora meta-llama/Meta-Llama-3-8B-Instruct failspy/Meta-Llama-3-8B-Instruct-abliterated-v3 OUTPUT_PATH --rank=32 --model_name=Llama-3-8B-Instruct-counter-refusal
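A minimal sketch of applying the extracted adapter follows, assuming the output directory is a standard PEFT adapter (an `adapter_config.json` plus weights) and that `transformers`, `peft`, and `torch` are installed. The helper name `load_with_adapter` is hypothetical, and `OUTPUT_PATH` is the same placeholder used in the command above, not a real location.

```python
# Base model named in this card; the adapter path is a placeholder.
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_PATH = "OUTPUT_PATH"


def load_with_adapter(adapter_path=ADAPTER_PATH, base_model_id=BASE_MODEL_ID):
    """Load the base model in bf16 and attach the LoRA adapter on top of it.

    Hypothetical helper for illustration; imports are deferred so that the
    heavy dependencies are only needed when the function is actually called.
    """
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_id, torch_dtype=torch.bfloat16
    )
    # Wrap the base model so the low-rank deltas are applied at inference time.
    model = PeftModel.from_pretrained(base_model, adapter_path)
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_with_adapter()
    print(type(model).__name__)
```

Loading the adapter at runtime keeps the base weights untouched, so the same base checkpoint can be used with or without the adapter.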