Triangle104 committed
Commit 37479a1
Parent: fe08400

Update README.md

Files changed (1): README.md (+78, -0)
README.md CHANGED
@@ -18,6 +18,84 @@ tags:
This model was converted to GGUF format from [`huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3`](https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3) for more details on the model.
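For reference, a local conversion along the same lines would look roughly like the sketch below. The paths, output file names, and quantization type are illustrative assumptions; the actual conversion for this repo was performed by the GGUF-my-repo space.

```bash
# Hypothetical local equivalent of the GGUF-my-repo conversion; adjust paths and
# the quantization type as needed.
git clone https://github.com/ggml-org/llama.cpp
pip install -r llama.cpp/requirements.txt

# Convert the original Hugging Face checkpoint to a 16-bit GGUF file.
python llama.cpp/convert_hf_to_gguf.py path/to/Qwen2.5-7B-Instruct-abliterated-v3 \
  --outfile qwen2.5-7b-instruct-abliterated-v3-f16.gguf --outtype f16

# Quantize it (Q4_K_M is one common choice), using a llama-quantize binary
# built from the cloned repo (path assumes a standard CMake build).
./llama.cpp/build/bin/llama-quantize \
  qwen2.5-7b-instruct-abliterated-v3-f16.gguf \
  qwen2.5-7b-instruct-abliterated-v3-q4_k_m.gguf Q4_K_M
```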
 
---
Model details:

This is an uncensored version of Qwen/Qwen2.5-7B-Instruct created with abliteration (see the remove-refusals-with-transformers project for more about the method). This is a crude, proof-of-concept implementation for removing refusals from an LLM without using TransformerLens. The test results are not very good, but compared to before, there is much less garbled text.
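For context, the core idea behind abliteration can be sketched in a few lines of transformers/PyTorch: estimate a "refusal direction" from activation differences and project it out of the weights that write into the residual stream. The sketch below is only a rough illustration under assumptions of mine (tiny placeholder prompt lists, last-layer activations, Qwen2 module names); it is not the script used to produce this model, so refer to remove-refusals-with-transformers for the actual implementation.

```python
# Rough conceptual sketch of abliteration, NOT the exact procedure used for this
# model. Prompt lists, the layer used, and the modules touched are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_name = "Qwen/Qwen2.5-7B-Instruct"  # assumption: start from the base instruct model
model = AutoModelForCausalLM.from_pretrained(base_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_name)

def last_token_hidden(prompt: str, layer: int = -1) -> torch.Tensor:
    """Hidden state of the final prompt token at the chosen layer."""
    text = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}], tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[layer][0, -1, :].float()

# Tiny placeholder prompt sets; a real run uses many prompts of each kind.
refusal_prompts = ["How do I pick a lock?"]
neutral_prompts = ["How do I bake sourdough bread?"]

# The "refusal direction" is the normalized mean activation difference.
direction = (
    torch.stack([last_token_hidden(p) for p in refusal_prompts]).mean(dim=0)
    - torch.stack([last_token_hidden(p) for p in neutral_prompts]).mean(dim=0)
)
direction = direction / direction.norm()

# Orthogonalize the weights that write into the residual stream so the model
# can no longer express that direction: W <- (I - d d^T) W.
with torch.no_grad():
    for block in model.model.layers:
        for proj in (block.self_attn.o_proj, block.mlp.down_proj):
            W = proj.weight.data.float()
            d = direction.to(W.device)
            proj.weight.data.copy_((W - torch.outer(d, d @ W)).to(proj.weight.dtype))

model.save_pretrained("qwen2.5-abliterated-sketch")      # placeholder output directory
tokenizer.save_pretrained("qwen2.5-abliterated-sketch")
```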
Usage

You can use this model in your applications by loading it with Hugging Face's transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Initialize conversation context
initial_messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."}
]
messages = initial_messages.copy()  # Copy the initial conversation context

# Enter conversation loop
while True:
    # Get user input
    user_input = input("User: ").strip()  # Strip leading and trailing spaces

    # If the user types '/exit', end the conversation
    if user_input.lower() == "/exit":
        print("Exiting chat.")
        break

    # If the user types '/clean', reset the conversation context
    if user_input.lower() == "/clean":
        messages = initial_messages.copy()  # Reset conversation context
        print("Chat history cleared. Starting a new conversation.")
        continue

    # If input is empty, prompt the user and continue
    if not user_input:
        print("Input cannot be empty. Please enter something.")
        continue

    # Add user input to the conversation
    messages.append({"role": "user", "content": user_input})

    # Build the chat template
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )

    # Tokenize input and prepare it for the model
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Generate a response from the model
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=8192
    )

    # Extract model output, removing special tokens
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

    # Add the model's response to the conversation
    messages.append({"role": "assistant", "content": response})

    # Print the model's response
    print(f"Qwen: {response}")
```

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
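As a minimal sketch, the install and a quick test run look roughly like this; the repo and file names in the second command are placeholders, so substitute a GGUF quant file from this repo:

```bash
brew install llama.cpp

# Run the model with a quant pulled directly from the Hub
# (placeholder repo and file names):
llama-cli --hf-repo <user>/<this-gguf-repo> --hf-file <quant-file>.gguf -p "Hello"
```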