Triangle104 committed on
Commit
2f67385
1 Parent(s): 6e1bf2c

Update README.md

Files changed (1)
  1. README.md +82 -0
README.md CHANGED
@@ -18,6 +18,88 @@ tags:
  This model was converted to GGUF format from [`huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2`](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2) for more details on the model.
 
+ ---
+ Model details:
+ -
+ This is an uncensored version of Qwen2.5-14B-Instruct created with abliteration (see this article to learn more about it).
+
+ Special thanks to @FailSpy for the original code and technique. Please follow him if you're interested in abliterated models.
+
+ Important note: this version is an improvement over the previous one, Qwen2.5-14B-Instruct-abliterated.
+ ## Usage
+
+ You can use this model in your applications by loading it with Hugging Face's transformers library:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the model and tokenizer
+ model_name = "huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2"
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # Initialize conversation context
+ initial_messages = [
+     {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."}
+ ]
+ messages = initial_messages.copy()  # Copy the initial conversation context
+
+ # Enter conversation loop
+ while True:
+     # Get user input
+     user_input = input("User: ").strip()  # Strip leading and trailing whitespace
+
+     # If the user types '/exit', end the conversation
+     if user_input.lower() == "/exit":
+         print("Exiting chat.")
+         break
+
+     # If the user types '/clean', reset the conversation context
+     if user_input.lower() == "/clean":
+         messages = initial_messages.copy()  # Reset conversation context
+         print("Chat history cleared. Starting a new conversation.")
+         continue
+
+     # If input is empty, prompt the user and continue
+     if not user_input:
+         print("Input cannot be empty. Please enter something.")
+         continue
+
+     # Add user input to the conversation
+     messages.append({"role": "user", "content": user_input})
+
+     # Render the conversation with the model's chat template
+     text = tokenizer.apply_chat_template(
+         messages,
+         tokenize=False,
+         add_generation_prompt=True
+     )
+
+     # Tokenize input and move it to the model's device
+     model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+     # Generate a response from the model
+     generated_ids = model.generate(
+         **model_inputs,
+         max_new_tokens=8192
+     )
+
+     # Drop the prompt tokens so only the newly generated tokens remain
+     generated_ids = [
+         output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+     ]
+     # Decode the new tokens, skipping special tokens
+     response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+
+     # Add the model's response to the conversation
+     messages.append({"role": "assistant", "content": response})
+
+     # Print the model's response
+     print(f"Qwen: {response}")
+ ```
+
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)
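
Note on the added Usage snippet: for decoder-only models, `model.generate` returns each prompt followed by the newly generated tokens, which is why the chat loop slices every output by the length of its input before decoding. The sketch below shows that slicing step on plain Python lists so it can be run without downloading the model. The helper name `strip_prompt_tokens` and the token IDs are hypothetical and chosen only for illustration.

```python
def strip_prompt_tokens(input_ids, output_ids):
    """Return only the newly generated tokens for each sequence in a batch.

    Assumes each output sequence starts with its full, untruncated prompt.
    """
    return [out[len(inp):] for inp, out in zip(input_ids, output_ids)]


# Hypothetical token IDs, for illustration only.
prompt_batch = [[101, 7592, 2088]]
generated_batch = [[101, 7592, 2088, 2023, 2003, 102]]

new_tokens = strip_prompt_tokens(prompt_batch, generated_batch)
print(new_tokens)  # [[2023, 2003, 102]]
```

Without this step the decoded response would repeat the system prompt and the whole conversation so far, and that text would then be appended to the history as the assistant's reply.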