Why does it say there is no quantize_config.json file when the repo has one?
Traceback (most recent call last):
  File "/workspace/inference_llama2.py", line 25, in <module>
    model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
  File "/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/auto.py", line 63, in from_quantized
    return GPTQ_CAUSAL_LM_MODEL_MAP[model_type].from_quantized(
  File "/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/_base.py", line 501, in from_quantized
    quantize_config = BaseQuantizeConfig.from_pretrained(save_dir)
  File "/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/_base.py", line 51, in from_pretrained
    with open(join(save_dir, "quantize_config.json"), "r", encoding="utf-8") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'TheBloke/Llama-2-13B-chat-GPTQ/quantize_config.json'
Not sure. There's definitely a quantize_config.json in the repo. Show your full code.
from transformers import AutoTokenizer, pipeline, logging
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
model_name_or_path = "TheBloke/Llama-2-13B-chat-GPTQ"
model_basename = "gptq_model-4bit-128g"
use_triton = False
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, revision=None, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
        model_basename=model_basename,
        use_safetensors=True,
        trust_remote_code=True,
        device="cuda:0",
        use_triton=use_triton,
        quantize_config=None)
Remove the quantize_config=None
No, quantize_config=None is fine. It might not be needed, since if you remove it, it will just be set to None anyway. But it's definitely not causing any problems.
I just tested this code and it works fine:
from transformers import AutoTokenizer, pipeline, logging
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
model_name_or_path = "TheBloke/Llama-2-13B-chat-GPTQ"
model_basename = "gptq_model-4bit-128g"
use_triton = False
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
        model_basename=model_basename,
        use_safetensors=True,
        device="cuda:0",
        quantize_config=None)
prompt = "Tell me about AI"
prompt_template=f'''[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>
{prompt}[/INST]'''
print("\n\n*** Generate:")
input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)
print(tokenizer.decode(output[0]))
Output:
*** Generate:
<s> [INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>
Tell me about AI[/INST] Hello! I'd be happy to help answer your questions about AI. Before we begin, I want to make sure that we have a safe and respectful conversation. I'm just an AI myself, and I strive to provide accurate and helpful information, while avoiding any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. I believe in treating all individuals with dignity and respect, regardless of their background or identity.
Now, to answer your question, AI stands for "Artificial Intelligence," and it refers to the use of technology to create intelligent machines that can perform tasks that typically require human intelligence. AI has been around for several decades, and it has been used in a wide range of applications, from simple tasks like data entry to complex tasks like self-driving cars.
There are many different types of AI, including:
1. Narrow or weak AI: This type of AI is designed to perform a specific task, such as facial recognition or language translation.
2. General or strong AI: This type of AI is designed to perform any intellectual task that a human can, such as reasoning, problem-solving, and learning.
3. Superintelligence: This type of AI is significantly more intelligent than the best human minds, and is capable of solving complex problems that are beyond human ability.
AI has many potential benefits, such as:
1. Increased productivity: AI can automate repetitive tasks, freeing up time for more creative and strategic work.
2. Improved decision-making: AI can analyze large amounts of data and provide insights that humans might miss.
3. Enhanced safety: AI can be used to monitor and control critical systems, such as power grids and transportation networks.
4. Improved healthcare: AI can help doctors and researchers analyze medical data and develop new treatments for diseases.
However, AI also raises important ethical and societal questions, such as:
1. Bias: AI systems can perpetuate biases and discrimination if they are trained on biased data.
2. Privacy: AI systems can collect and analyze large amounts of personal data, which raises concerns about privacy and surveillance.
@TheBloke the code you provided above also gives the same error.
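One thing worth checking, since the same code works on one machine and fails on another: the traceback shows the loader calling open(join(save_dir, "quantize_config.json")), and the path in the error is relative, so this install appears to treat the repo id as a folder on disk rather than a Hub repo. The sketch below only mirrors that join to show what the loader would find from the current working directory. The helper names check_quantize_config and download_repo_locally are made up for illustration. snapshot_download is a real huggingface_hub function, but whether passing its returned folder to from_quantized resolves this particular error is an assumption that has not been tested here.

```python
import os


def check_quantize_config(model_name_or_path: str) -> dict:
    """Report what open(join(path, "quantize_config.json")) would find on disk."""
    # Same path construction as the open() call shown in the traceback.
    config_path = os.path.join(model_name_or_path, "quantize_config.json")
    return {
        "config_path": config_path,
        # Relative paths resolve against the current working directory.
        "absolute_path": os.path.abspath(config_path),
        "is_local_dir": os.path.isdir(model_name_or_path),
        "config_exists": os.path.isfile(config_path),
    }


def download_repo_locally(repo_id: str) -> str:
    """Possible workaround (assumption): fetch the Hub repo and return its local folder."""
    # Imported lazily so the check above runs without huggingface_hub or network access.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)
```

Running check_quantize_config("TheBloke/Llama-2-13B-chat-GPTQ") from the directory where the script is launched shows the exact path being opened. If config_exists is False, either the installed auto_gptq is not downloading from the Hub, or there is a partial local folder with the same name shadowing the repo id. In both cases, comparing auto_gptq versions between the working and failing machines is a reasonable next step.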