This model requires A LOT of resources... But how much? Trying to build a chatbot
Hello!
I'm new to this model, like many of you! I'm trying to create a chatbot on my local machine, but I'm facing a very big problem: while the model loads into memory, it consumes 100% of the resources. I have 64GB RAM, a very good CPU and also a very good GPU, but I cannot run the model because it takes a very long time to load.
What am I doing wrong? Here is my Python code:
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "togethercomputer/GPT-NeoXT-Chat-Base-20B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Define a function to generate a response from the model given an input message
def generate_response(input_message):
    # Encode the input message using the tokenizer
    input_ids = tokenizer.encode(input_message, return_tensors="pt")
    # Generate a response from the model
    output = model.generate(input_ids, max_length=1024, do_sample=True, top_p=0.9, top_k=50)
    # Decode the output tokens and return the response as a string
    response = tokenizer.decode(output[0], skip_special_tokens=True)
    return response

# Define a loop to take user input and generate responses
while True:
    # Take user input
    user_input = input("You: ")
    # Generate a response from the model
    response = generate_response(user_input)
    # Print the response
    print("Bot:", response)
You should load the weights in 8-bit or bfloat16. That should cut the resource consumption quite a bit. Make sure to pip install accelerate.
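To put rough numbers on that advice, here is a back-of-the-envelope sketch of the memory needed just to hold the weights. It assumes a round 20 billion parameters (taken from the "20B" in the model name) and ignores activations, the generation cache and framework overhead, so real usage will be higher. The helper name `weight_memory_gb` is made up for this illustration.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(n_params, dtype):
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: exactly 20e9 parameters, based on the "20B" in the model name.
N_PARAMS = 20e9

for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: ~{weight_memory_gb(N_PARAMS, dtype):.0f} GB")
```

Since `from_pretrained` loads in float32 unless told otherwise, the code in the first post would need roughly 80 GB for the weights alone, which is more than the 64GB of RAM mentioned. bfloat16 halves that and 8-bit halves it again.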
@banalyst Do you have some code I could use to try out 8-bit inference?
This should work:
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", load_in_8bit=True)
And make sure you have accelerate and bitsandbytes installed. :)
I get this error on Colab: "topk_cpu" not implemented for 'Half'
I was able to fix this with do_sample=False, but no response comes back even after 2 minutes on a premium Colab. Screenshot attached.
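A likely cause of the "topk_cpu" error (an assumption, not confirmed in this thread) is that the input tensors stay on the CPU while the model sits on the GPU, so the top-k sampling step ends up running on half-precision CPU tensors. The sketch below moves the inputs to the model's device and also bounds the reply with `max_new_tokens` instead of `max_length=1024`, which may help with the long wait. The function signature and the default of 128 new tokens are illustrative choices.

```python
def generate_response(model, tokenizer, prompt, max_new_tokens=128):
    """Generate a reply with the inputs placed on the same device as the model."""
    # Tokenize, then move the tensors to the model's device. Leaving them on
    # the CPU is a plausible cause of CPU-side half-precision errors.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # limits only the newly generated tokens
        do_sample=True,
        top_p=0.9,
        top_k=50,
    )

    # The output starts with the prompt tokens; slice them off so that only
    # the model's reply is decoded.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)
```

It would be called as `generate_response(model, tokenizer, user_input)` with the `model` and `tokenizer` objects loaded earlier in the thread.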
Thanks!
Now I am facing this:
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: WARNING! libcuda.so not found! Do you have a CUDA driver installed? If you are on a cluster, make sure you are on a CUDA machine!
CUDA SETUP: Loading binary C:\Users\Felipe\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
argument of type 'WindowsPath' is not iterable
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: WARNING! libcuda.so not found! Do you have a CUDA driver installed? If you are on a cluster, make sure you are on a CUDA machine!
CUDA SETUP: Loading binary C:\Users\Felipe\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
argument of type 'WindowsPath' is not iterable
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: WARNING! libcuda.so not found! Do you have a CUDA driver installed? If you are on a cluster, make sure you are on a CUDA machine!
CUDA SETUP: Loading binary C:\Users\Felipe\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
argument of type 'WindowsPath' is not iterable
CUDA SETUP: Problem: The main issue seems to be that the main CUDA library was not detected.
CUDA SETUP: Solution 1): Your paths are probably not up-to-date. You can update them via: sudo ldconfig.
CUDA SETUP: Solution 2): If you do not have sudo rights, you can do the following:
CUDA SETUP: Solution 2a): Find the cuda library via: find / -name libcuda.so 2>/dev/null
CUDA SETUP: Solution 2b): Once the library is found add it to the LD_LIBRARY_PATH: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:FOUND_PATH_FROM_2a
CUDA SETUP: Solution 2c): For a permanent solution add the export from 2b into your .bashrc file, located at ~/.bashrc
I don't know what to do. I just pip-installed transformers, my PC is running Windows 10, and I have CUDA 10.1 installed... Can anyone help me?
Hi, it seems that libcuda.so is not on any of the environment paths. Have you tried the solutions at the end of the message? I.e.,
CUDA SETUP: Solution 1): Your paths are probably not up-to-date. You can update them via: sudo ldconfig.
CUDA SETUP: Solution 2): If you do not have sudo rights, you can do the following:
CUDA SETUP: Solution 2a): Find the cuda library via: find / -name libcuda.so 2>/dev/null
CUDA SETUP: Solution 2b): Once the library is found add it to the LD_LIBRARY_PATH: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:FOUND_PATH_FROM_2a
CUDA SETUP: Solution 2c): For a permanent solution add the export from 2b into your .bashrc file, located at ~/.bashrc
Did it give any further information after you tried them?
From the paths (C:\Users...) it looks like Felipe is running this on a Windows PC, and as far as I know bitsandbytes isn't supported on Windows (yet?).
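If that is the case, one way to avoid the bitsandbytes failure is to check the platform before asking for 8-bit loading. The following is only a sketch: the helper `choose_loading_mode` and its return values are made up for illustration, and the rule "no 8-bit on Windows" simply encodes the claim above rather than anything verified against the library.

```python
import platform


def choose_loading_mode(system=None, cuda_available=False):
    """Pick a loading mode based on the OS and on whether a CUDA GPU is usable.

    Returns "8bit" only when a CUDA GPU is available and the OS is not Windows
    (assumption from this thread: bitsandbytes does not support Windows).
    Otherwise falls back to "bfloat16", the other option suggested earlier.
    """
    system = system or platform.system()
    if cuda_available and system != "Windows":
        return "8bit"
    return "bfloat16"


# In a real script, cuda_available would come from torch.cuda.is_available().
print(choose_loading_mode("Windows", cuda_available=True))
print(choose_loading_mode("Linux", cuda_available=True))
```

Here "8bit" would map to the `load_in_8bit=True` call shown earlier in the thread, and "bfloat16" to passing `torch_dtype=torch.bfloat16` to `from_pretrained`. Note that the bfloat16 fallback needs roughly twice the memory of 8-bit.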